Skip to content

Conversation

@Alex-PLACET
Copy link
Collaborator

No description provided.

@codecov
Copy link

codecov bot commented Feb 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.67%. Comparing base (e7ed35d) to head (a136355).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
+ Coverage   88.30%   88.67%   +0.36%     
==========================================
  Files         112      112              
  Lines        8857     9048     +191     
==========================================
+ Hits         7821     8023     +202     
+ Misses       1036     1025      -11     
Flag Coverage Δ
unittests 88.67% <100.00%> (+0.36%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Alex-PLACET Alex-PLACET self-assigned this Feb 4, 2026
@Alex-PLACET Alex-PLACET marked this pull request as ready for review February 4, 2026 16:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces copy-tracking instrumentation across several array/buffer types and updates the test suite to assert expected copy/move semantics when SPARROW_TRACK_COPIES is enabled (via the TRACK_COPIES CMake option).

Changes:

  • Added copy_tracker::reset/count assertions to many copy/move-related tests (guarded by #ifdef SPARROW_TRACK_COPIES).
  • Instrumented copy constructors / copy paths in multiple core types (arrays, Arrow C-interface helpers, buffers) to call copy_tracker::increase(...).
  • Enabled TRACK_COPIES=ON in CI workflows so copy-tracking assertions run in CI.

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
test/test_variable_size_binary_view_array.cpp Adds copy-tracking assertions for copy construction.
test/test_union_array.cpp Adds copy-tracking assertions for dense/sparse union array copies.
test/test_timestamp_array.cpp Adds copy/move tracking assertions and refactors move test to compare moved result.
test/test_struct_array.cpp Adds copy-tracking assertions for struct_array copies.
test/test_string_array.cpp Adds tracking for constructor/copy/move semantics of string/variable-size binary arrays.
test/test_run_end_encoded_array.cpp Adds copy-tracking assertions for run_end_encoded_array copies.
test/test_record_batch.cpp Switches to type-based copy-tracker key for record_batch tests.
test/test_primitive_array.cpp Adds copy-tracking assertions around ArrowArray/buffer copies and a few constructors.
test/test_null_array.cpp Adds copy-tracking assertions for null_array copies.
test/test_list_array.cpp Adds copy-tracking assertions for list/list_view/fixed_sized_list array copies.
test/test_fixed_width_binary_array.cpp Adds copy/move tracking assertions for fixed_width_binary_array.
test/test_dictionary_encoded_array.cpp Adds copy-tracking assertions for dictionary_encoded_array copies.
test/test_decimal_array.cpp Adds tracking to constructors and introduces new copy/move test cases.
src/union_array.cpp Adds keys and copy tracking increments for dense/sparse union array copies.
src/struct_array.cpp Adds key and copy tracking increment for struct_array copy ctor.
src/run_end_encoded_array.cpp Adds key and copy tracking increment for run_end_encoded_array copy ctor.
src/record_batch.cpp Minor include cleanup (keeps copy tracker include).
src/null_array.cpp Adds key and copy tracking increment for null_array copy ctor.
src/map_array.cpp Adds key and copy tracking increment for map_array copy ctor.
src/arrow_interface/arrow_schema.cpp Increments copy-tracker when deep-copying ArrowSchema.
src/arrow_interface/arrow_array.cpp Increments copy-tracker when deep-copying ArrowArray.
include/sparrow/variable_size_binary_view_array.hpp Adds copy-tracker integration + explicit copy ctor to track copies.
include/sparrow/variable_size_binary_array.hpp Adds copy-tracker integration + explicit copy ctor to track copies.
include/sparrow/union_array.hpp Declares sparse_union_array copy ctor/assignment for tracking.
include/sparrow/u8_buffer.hpp Adds copy-tracker key + increments on u8_buffer copy ctor.
include/sparrow/timestamp_array.hpp Adds copy-tracker key overload + increments on timestamp_array copy ctor.
include/sparrow/primitive_array.hpp Removes redundant <cstdint> include (now pulled elsewhere).
include/sparrow/null_array.hpp Declares null_array copy ctor/assignment for tracking.
include/sparrow/list_array.hpp Adds copy-tracker keys + tracks list/list_view/fixed_sized_list copies.
include/sparrow/layout/primitive_array_impl.hpp Adds copy-tracker key overload + increments on primitive_array_impl copy ctor.
include/sparrow/fixed_width_binary_array.hpp Adds key + explicit copy ctor to track fixed_width_binary array copies.
include/sparrow/dictionary_encoded_array.hpp Adds copy-tracker key overload + increments on dictionary_encoded_array copy ctor.
include/sparrow/decimal_array.hpp Adds copy-tracker key overload + increments on decimal_array copy ctor.
include/sparrow/buffer/buffer_view.hpp Adds copy tracking to buffer_view copy ctor.
include/sparrow/buffer/buffer.hpp Adds copy-tracker key + increments on buffer copy ctors.
include/sparrow/arrow_interface/arrow_schema.hpp Adds key for ArrowSchema to support schema copy tracking.
include/sparrow/arrow_interface/arrow_array.hpp Adds key for ArrowArray to support array copy tracking.
.github/workflows/qemu.yaml Enables -DTRACK_COPIES=ON in CI.
.github/workflows/osx.yml Enables -DTRACK_COPIES=ON in CI.
.github/workflows/manylinux.yml Enables -DTRACK_COPIES=ON in CI.
.github/workflows/linux.yml Enables -DTRACK_COPIES=ON in CI.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +28 to +34
namespace copy_tracker
{
template <typename T>
std::string key_u8_buffer()
{
return "u8_buffer<" + std::string(typeid(T).name()) + ">";
}
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key_u8_buffer() uses typeid(T).name() but this header doesn't include <typeinfo>. Please include <typeinfo> explicitly (or remove the RTTI dependency) to avoid build fragility across toolchains/stdlibs.

Copilot uses AI. Check for mistakes.
Comment on lines +136 to +138
#ifdef SPARROW_TRACK_COPIES
CHECK_EQ(copy_tracker::count("variable_size_binary_array"), 0);
#endif
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check hard-codes the key string ("variable_size_binary_array") even though the test already resets via copy_tracker::key<layout_type>(). Using the same key<layout_type>() for count() avoids drift if the key string ever changes and keeps the test consistent.

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +135
layout_type arr{
std::move(data_buffer),
std::move(offset_buffer),
where_nulls,
"name",
metadata_sample_opt
};
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

layout_type arr{...}; is only used for its side effects (construction) and then never referenced. If the test build treats -Wunused-variable as an error, this will fail. Consider using CHECK_NOTHROW(...) again, or explicitly mark it used (e.g., (void)arr;).

Copilot uses AI. Check for mistakes.
Comment on lines 137 to 138
{
copy_tracker::increase(copy_tracker::key_buffer_view<T>());
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffer_view is intended to be a cheap, non-owning view, but the new copy ctor calls copy_tracker::increase(copy_tracker::key_buffer_view<T>()), which constructs a std::string (allocations) and can throw. This changes the copy from trivially/noexcept copyable to potentially-throwing and adds overhead even when copy tracking is disabled (the key string is still built). Consider making this tracking compile-time gated (e.g., #ifdef SPARROW_TRACK_COPIES / if constexpr (copy_tracker::is_enabled())) and/or using a non-allocating key type (e.g., std::string_view / const char*) so copying a view stays noexcept and cheap.

Suggested change
{
copy_tracker::increase(copy_tracker::key_buffer_view<T>());
{
#if defined(SPARROW_TRACK_COPIES)
copy_tracker::increase(copy_tracker::key_buffer_view<T>());
#endif

Copilot uses AI. Check for mistakes.
Comment on lines 39 to 45
namespace copy_tracker
{
template <typename T>
std::string key_buffer()
{
return "buffer<" + std::string(typeid(T).name()) + ">";
}
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key_buffer() uses typeid(T).name() but this header doesn't include <typeinfo>. Relying on transitive includes for std::type_info/typeid is fragile and can break with different standard library implementations. Add an explicit #include <typeinfo> (or avoid typeid here).

Copilot uses AI. Check for mistakes.
Alex-PLACET and others added 2 commits February 9, 2026 09:05
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Collaborator

@JohanMabille JohanMabille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also track copies in T& T::operator=(const T&).

this->create_storage(rhs.size());
get_data().p_end = copy_initialize(rhs.begin(), rhs.end(), get_data().p_begin, get_allocator());
}
copy_tracker::increase(copy_tracker::key_buffer<T>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also inscrease copies number in operator=.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

: p_data(other.p_data)
, m_size(other.m_size)
{
copy_tracker::increase(copy_tracker::key_buffer_view<T>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also increase copies number in operator=

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

namespace copy_tracker
{
template <typename T>
std::string key_buffer()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about keeping the generic name key and constrain the template, like you did for primitive_array_impl ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

namespace copy_tracker
{
template <class T>
std::string key_buffer_view()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about keeping the generic name key and constrain the template like you did for primitive_array_impl ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

@JohanMabille JohanMabille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're missing a last copy tracker in dictionary_encoded_array copy assign operator.

@Alex-PLACET Alex-PLACET merged commit 0401405 into man-group:main Feb 9, 2026
157 of 158 checks passed
@Alex-PLACET Alex-PLACET deleted the add_copy_trackers_in_tests branch February 9, 2026 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants