Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion cpp/nd/string_array_holder.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,19 @@ class string_stream_array_holder
std::vector<int64_t> offsets_;
std::vector<int64_t> range_offsets_;
std::vector<const void*> holders_;

// Zero-copy buffer cache: raw pointers to buffer data and offsets.
// SAFETY: These pointers remain valid as long as holders_ contains the chunk arrays.
// This eliminates shared_ptr atomic reference counting in get_range_data() hot path.
std::vector<const uint8_t*> buffer_cache_;
std::vector<const uint32_t*> offsets_cache_;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Issue: Uninitialized cache vectors could cause undefined behavior

The new buffer_cache_ and offsets_cache_ members are declared but never initialized or populated in this header file. Since initialize_single_range() signature changed but no implementation exists in this repo, any code using this class will have empty cache vectors, leading to out-of-bounds access if the get_range_data() implementation (which must exist somewhere) tries to use these caches.

Recommendation: Either include the implementation that populates these caches, or mark this as a breaking change that requires the corresponding indra implementation to be synced first. The PR description mentions this mirrors indra/cpp/nd/string_array_holder.hpp - verify that all necessary implementation code is also synced to avoid runtime crashes.


const void* dynamic_std_holder_ = nullptr;
const void* dynamic_icm_holder_ = nullptr;
bool is_valid_ = true;

void initialize(const nd::array& arr);
void initialize_single_range(const auto& range_adapter);
void initialize_single_range(const auto& range_adapter, const nd::array& source_arr);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API Breaking Change: Missing parameter will break existing callers

The signature change from initialize_single_range(const auto& range_adapter) to initialize_single_range(const auto& range_adapter, const nd::array& source_arr) adds a required parameter. Any existing code calling this function will fail to compile.

Recommendation: Verify that no callers exist in deeplake or dependent codebases, or provide a migration path (e.g., overload for backward compatibility or update all call sites in this PR).

void initialize_complex(const nd::array& arr);
bool try_initialize_range_arrays(const auto& vstacked, const nd::array& fallback);
void clear_range_data();
Expand Down
Loading