Skip to content

Conversation

@iceboundrock
Copy link

@iceboundrock iceboundrock commented Jan 29, 2026

I tested it locally

cargo clippy --all-targets --all-features
    Blocking waiting for file lock on build directory
   Compiling pyo3-build-config v0.27.2
   Compiling vortex-cuda v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-cuda)
   Compiling pyo3-ffi v0.27.2
   Compiling pyo3-macros-backend v0.27.2
   Compiling pyo3 v0.27.2
   Compiling pyo3-macros v0.27.2
    Checking vortex-error v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-error)
    Checking pyo3-bytes v0.5.0
    Checking pyo3-log v0.13.2
    Checking vortex-buffer v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-buffer)
    Checking vortex-utils v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-utils)
    Checking vortex-session v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-session)
    Checking vortex-metrics v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-metrics)
    Checking vortex-flatbuffers v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-flatbuffers)
    Checking vortex-mask v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-mask)
    Checking vortex-io v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-io)
    Checking vortex-dtype v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-dtype)
    Checking vortex-vector v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-vector)
    Checking vortex-scalar v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-scalar)
    Checking vortex-compute v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-compute)
    Checking vortex-array v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-array)
    Checking vortex-fastlanes v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/fastlanes)
    Checking vortex-zigzag v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/zigzag)
    Checking vortex-decimal-byte-parts v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/decimal-byte-parts)
    Checking vortex-runend v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/runend)
    Checking vortex-zstd v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/zstd)
    Checking vortex-fsst v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/fsst)
    Checking vortex-datetime-parts v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/datetime-parts)
    Checking vortex-sparse v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/sparse)
    Checking vortex-pco v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/pco)
    Checking vortex-bytebool v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/bytebool)
    Checking vortex-ipc v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-ipc)
    Checking vortex-sequence v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/sequence)
    Checking vortex-alp v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/alp)
    Checking vortex-btrblocks v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-btrblocks)
    Checking vortex-layout v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-layout)
    Checking vortex-scan v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-scan)
    Checking vortex-file v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-file)
    Checking vortex v0.1.0 (/Users/ruoshi/code/github/vortex/vortex)
    Checking vortex-bench v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-bench)
    Checking vortex-datafusion v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-datafusion)
    Checking vortex-fuzz v0.1.0 (/Users/ruoshi/code/github/vortex/fuzz)
    Checking vortex-duckdb v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-duckdb)
    Checking vortex-ffi v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-ffi)
    Checking vortex-cxx v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-cxx)
    Checking vortex-python v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-python)
    Checking vortex-jni v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-jni)
    Checking vortex-tui v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-tui)
    Checking lance-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/lance-bench)
    Checking duckdb-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/duckdb-bench)
    Checking datafusion-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/datafusion-bench)
    Checking compress-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/compress-bench)
    Checking random-access-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/random-access-bench)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 38.28s
cargo test -p vortex-duckdb      
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.54s
     Running unittests src/lib.rs (target/debug/deps/vortex_duckdb-59d33a3f980a2476)

running 178 tests
test convert::dtype::tests::test_binary_type ... ok
test convert::dtype::tests::test_primitive_types::case_01 ... ok
test convert::dtype::tests::test_bool_type ... ok
test convert::dtype::tests::test_null_type ... ok
test convert::dtype::tests::test_primitive_types::case_02 ... ok
test convert::dtype::tests::test_list_type ... ok
test convert::dtype::tests::test_decimal_type ... ok
test convert::dtype::tests::test_date_extension_type ... ok
test convert::dtype::tests::test_f16_unsupported ... ok
test convert::dtype::tests::test_primitive_types::case_03 ... ok
test convert::dtype::tests::test_primitive_types::case_04 ... ok
test convert::dtype::tests::test_primitive_types::case_05 ... ok
test convert::dtype::tests::test_empty_struct ... ok
test convert::dtype::tests::test_primitive_types::case_06 ... ok
test convert::dtype::tests::test_primitive_types::case_07 ... ok
test convert::dtype::tests::test_primitive_types::case_08 ... ok
test convert::dtype::tests::test_primitive_types::case_09 ... ok
test convert::dtype::tests::test_primitive_types::case_10 ... ok
test convert::dtype::tests::test_string_type ... ok
test convert::dtype::tests::test_temporal_extension_invalid_time_units ... ok
test convert::dtype::tests::test_struct_with_invalid_field_name ... ok
test convert::dtype::tests::test_struct_type ... ok
test convert::dtype::tests::test_time_extension_type ... ok
test convert::dtype::tests::test_unsupported_extension_type ... ok
test convert::dtype::tests::test_timestamp_with_timezone ... ok
test convert::dtype::tests::test_timestamp_extension_types ... ok
test convert::scalar::tests::test_timestamp_roundtrip ... ok
test convert::scalar::tests::test_scalar_round_trip ... ok
test convert::vector::tests::test_integer_vector_conversion ... ok
test convert::vector::tests::test_timestamp_extreme_values ... ok
test convert::vector::tests::test_empty_struct ... ok
test convert::vector::tests::test_fixed_sized_list ... ok
test convert::vector::tests::test_struct ... ok
test convert::vector::tests::test_timestamp_single_value ... ok
test convert::vector::tests::test_timestamp_seconds_vector_conversion ... ok
test convert::vector::tests::test_timestamp_milliseconds_vector_conversion ... ok
test convert::vector::tests::test_boolean_vector_conversion ... ok
test convert::vector::tests::test_list ... ok
test convert::vector::tests::test_timestamp_vector_conversion ... ok
test convert::vector::tests::test_list_out_of_order ... ok
test convert::vector::tests::test_timestamp_with_nulls_conversion ... ok
test convert::vector::tests::test_list_with_trailing_null ... ok
test convert::vector::tests::test_list_null_garbage_data ... ok
test convert::vector::tests::test_vector_with_nulls ... ok
test duckdb::config::tests::test_config_count ... ok
test duckdb::config::tests::test_config_get_flag ... ok
test duckdb::config::tests::test_config_invalid_key ... ok
test duckdb::config::tests::test_config_creation ... ok
test duckdb::config::tests::test_config_invalid_value ... ok
test duckdb::config::tests::test_config_get_all ... ok
test duckdb::config::tests::test_config_set_and_get ... ok
test duckdb::config::tests::test_config_list_available_options ... ok
test duckdb::config::tests::test_config_update_value ... ok
test duckdb::connection::tests::test_query_and_get_row_count_create_table ... ok
test duckdb::connection::tests::test_execute_success ... ok
test duckdb::connection::tests::test_query_and_get_row_count_select ... ok
test duckdb::config::tests::test_config_persistence_through_database ... ok
test duckdb::connection::tests::test_execute_with_null_bytes ... ok
test duckdb::connection::tests::test_query_and_get_row_count_insert ... ok
test duckdb::connection::tests::test_object_cache_get_nonexistent ... ok
test duckdb::connection::tests::test_connection_creation ... ok
test duckdb::connection::tests::test_query_and_get_row_count_update ... ok
test duckdb::logical_type::tests::test_clone_array_logical_type ... ok
test duckdb::logical_type::tests::test_clone_decimal_logical_type ... ok
test duckdb::logical_type::tests::test_clone_list_logical_type ... ok
test duckdb::logical_type::tests::test_clone_logical_type ... ok
test duckdb::logical_type::tests::test_clone_map_logical_type ... ok
test duckdb::logical_type::tests::test_clone_struct_logical_type ... ok
test duckdb::logical_type::tests::test_clone_union_logical_type ... ok
test duckdb::value::tests::test_huge_int_from_parts ... ok
test duckdb::vector::tests::test_create_validity_all_nulls ... ok
test duckdb::vector::tests::test_create_validity_all_valid ... ok
test duckdb::vector::tests::test_create_validity_empty ... ok
test duckdb::vector::tests::test_create_validity_single_element ... ok
test duckdb::vector::tests::test_create_validity_single_element_valid ... ok
test duckdb::vector::tests::test_create_validity_with_nulls ... ok
test duckdb::vector::tests::test_dictionary ... ok
test duckdb::vector::tests::test_row_is_null_all_nulls ... ok
test duckdb::connection::tests::test_null_handling ... ok
test duckdb::vector::tests::test_row_is_null_all_valid ... ok
test duckdb::vector::tests::test_row_is_null_byte_boundaries ... ok
test duckdb::vector::tests::test_row_is_null_single_element ... ok
test duckdb::vector::tests::test_row_is_null_single_element_valid ... ok
test duckdb::vector::tests::test_row_is_null_with_nulls ... ok
test duckdb::connection::tests::test_object_cache_put_get ... ok
test duckdb::connection::tests::test_query_and_get_row_count_delete ... ok
test duckdb::connection::tests::test_query_multiple_columns ... ok
test duckdb::connection::tests::test_query_column_types ... ok
test duckdb::connection::tests::test_type_conversion ... ok
test duckdb::database::tests::test_file_database_with_config ... ok
test duckdb::database::tests::test_database_with_config ... ok
test duckdb::connection::tests::test_query_single_value ... ok
test duckdb::connection::tests::test_query_multiple_rows ... ok
test duckdb::file_system::tests::test_writer_roundtrip_local ... ok
test e2e_test::object_cache_test::test_table_function_with_object_cache ... ok
test e2e_test::vortex_scan_test::test_issue_5927_not_in_does_not_panic ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_floats ... ok
test e2e_test::vortex_scan_test::test_vortex_multi_column ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_list_of_ints ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_constant ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_booleans ... ok
test e2e_test::vortex_scan_test::test_scan_function_registration ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers_between ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_fixed_size_list_utf8 ... ok
test exporter::all_invalid::tests::all_null_array ... ok
test exporter::bool::tests::test_bool ... ok
test exporter::bool::tests::test_bool_all_invalid ... ok
test exporter::bool::tests::test_bool_long ... ok
test exporter::bool::tests::test_bool_nullable ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter_subset ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter_with_nulls ... ok
test exporter::dict::tests::test_constant_dict ... ok
test exporter::dict::tests::test_constant_dict_null ... ok
test exporter::dict::tests::test_export_empty_dict ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::dict::tests::test_nullable_dict ... ok
test exporter::fixed_size_list::tests::test_export_all_null_lists ... ok
test exporter::fixed_size_list::tests::test_export_alternating_null_pattern ... ok
test exporter::fixed_size_list::tests::test_export_empty_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_fixed_size_list_with_nulls ... ok
test exporter::fixed_size_list::tests::test_export_list_size_one ... ok
test exporter::fixed_size_list::tests::test_export_nested_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_nested_fixed_size_list_with_nulls ... ok
test exporter::fixed_size_list::tests::test_export_non_empty_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_partial_range ... ok
test exporter::list::tests::test_export_empty_list ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::list::tests::test_export_non_empty_list_of_strings ... ok
test exporter::list_view::tests::test_export_empty_list ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::list_view::tests::test_export_non_empty_list_of_strings ... ok
test exporter::list_view::tests::test_export_non_empty_list_with_preceding_and_trailing_garbage ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers_in_list ... ok
test exporter::primitive::tests::test_primitive_exporter ... ok
test exporter::sequence::tests::test_sequence ... ok
test exporter::struct_::tests::struct_export_non_flat_vectors ... ok
test exporter::struct_::tests::test_struct_exporter ... ok
test exporter::struct_::tests::test_struct_exporter_with_nulls ... ok
test exporter::temporal::tests::test_timestamp_s ... ok
test exporter::temporal::tests::test_timestamp_time_us ... ok
test exporter::primitive::tests::test_long_primitive_exporter ... ok
test exporter::temporal::tests::test_timestamp_us ... ok
test exporter::tests::test_copy_from_slice_64_to_64 ... ok
test exporter::tests::test_copy_from_slice_248_to_128_middle_non_empty ... ok
test exporter::tests::test_copy_from_slice_64_to_empty ... ok
test exporter::tests::test_copy_from_slice_80_to_0 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_1 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_2 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_3 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_4 ... ok
test exporter::tests::test_copy_from_slice_empty_to_empty ... ok
test exporter::tests::test_set_validity_all_false ... ok
test exporter::tests::test_set_validity_all_true ... ok
test exporter::tests::test_set_validity_values_64bit_alignment ... ok
test exporter::tests::test_set_validity_values_all_false ... ok
test exporter::tests::test_set_validity_values_all_true ... ok
test exporter::tests::test_set_validity_values_mixed ... ok
test exporter::tests::test_set_validity_values_with_offset ... ok
test exporter::tests::test_set_validity_values_with_offset_and_smaller_len ... ok
test exporter::varbinview::tests::all_invalid_varbinview ... ok
test exporter::varbinview::tests::all_invalid_varbinview_section ... ok
test exporter::varbinview::tests::partial_invalid_varbinview_section ... ok
test e2e_test::vortex_scan_test::test_write_file ... ok
test utils::glob::tests::test_expand_local_disk_glob_no_matches ... ok
test utils::glob::tests::test_expand_local_disk_glob_relative_path ... ok
test utils::glob::tests::test_expand_local_disk_glob_single_file ... ok
test utils::glob::tests::test_expand_local_disk_glob_subdirectories ... ok
test utils::glob::tests::test_expand_local_disk_glob_wildcard ... ok
test e2e_test::vortex_scan_test::test_write_timestamps ... ok
test tests::test_vortex_max_threads_option_registration ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_list_of_utf8 ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_nested_fixed_size_list_utf8 ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_strings ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_multiple_files ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_strings_contains ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_ultra_deep_nesting ... ok
test e2e_test::vortex_scan_test::test_vortex_encodings_roundtrip ... ok
test utils::glob::tests::test_duckdb_glob_http_hook ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_over_http ... ok

test result: ok. 175 passed; 0 failed; 3 ignored; 0 measured; 0 filtered out; finished in 0.30s

   Doc-tests vortex_duckdb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

"http" | "https" | "s3" => {
let reader = open_duckdb_reader(client_ctx, &url);

// Fallback to the legacy object_store path for s3 if DuckDB fs isn't configured.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will that happen? I would really love to get rid of object_store here, it has some underlying dependencies on tokio that can cause some weird issues

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/HTTPS/S3 now route solely through DuckDB FS; legacy object_store fallback removed, and failures now bubble as errors if httpfs ext of DuckDB is not initialized.

@joseph-isaacs joseph-isaacs added the action/benchmark Trigger full benchmarks to run on this PR label Jan 29, 2026
@codspeed-hq
Copy link

codspeed-hq bot commented Jan 29, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing Lychee-Technology:develop (a131078) with develop (469e31f)

Summary

✅ 1138 untouched benchmarks
⏩ 1384 skipped benchmarks1

Footnotes

  1. 1384 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@0ax1 0ax1 self-requested a review January 29, 2026 10:40
@0ax1 0ax1 added action/benchmark-sql Trigger SQL benchmarks to run on this PR changelog/feature A new feature and removed action/benchmark Trigger full benchmarks to run on this PR labels Jan 29, 2026
@joseph-isaacs joseph-isaacs added action/benchmark-sql Trigger SQL benchmarks to run on this PR and removed action/benchmark-sql Trigger SQL benchmarks to run on this PR labels Jan 29, 2026
Copy link
Contributor

@0ax1 0ax1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to look into this. There's a couple of issues around thread safety / locking, error handling and memory leaks.

If you used AI to assist in writing the code for your PR please mention this in your PR description (see: https://github.com/vortex-data/vortex/blob/develop/CONTRIBUTING.md).

if (!error_out) {
return;
}
*error_out = duckdb_vx_error_create(message.data(), message.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this error get freed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ allocates errors via duckdb_vx_error_create; Rust side consistently frees with duckdb_vx_error_free in fs_error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest moving

static void SetError(duckdb_vx_error *error_out, const std::string &message) {
    if (!error_out) {
        return;
    }
    *error_out = duckdb_vx_error_create(message.data(), message.size());
}

static duckdb_state HandleException(std::exception_ptr ex, duckdb_vx_error *error_out) {
    if (!ex) {
        SetError(error_out, "Unknown error");
        return DuckDBError;
    }

    try {
        std::rethrow_exception(ex);
    } catch (const Exception &caught) {
        SetError(error_out, caught.what());
    } catch (const std::exception &caught) {
        SetError(error_out, caught.what());
    } catch (...) {
        SetError(error_out, "Unknown error");
    }
    return DuckDBError;
}

to error.cpp as they're not specific to file system logic.

pub struct VortexCopyFunction;

#[derive(Clone, Copy)]
struct SendableClientCtx(usize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we encode the pointer as a usize?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced usize casting with NonNull<duckdb_vx_client_context_> (SendableClientCtx) and pass the raw pointer via as_ptr(). This avoids truncation/aliasing issues from integer casts and preserves provenance; all FFI touchpoints now use the typed wrapper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use duckdb_vx_client_context? We typedef pointers to make them typesafe: typedef struct duckdb_vx_client_context_ *duckdb_vx_client_context;.

@joseph-isaacs joseph-isaacs added action/benchmark-sql Trigger SQL benchmarks to run on this PR and removed action/benchmark-sql Trigger SQL benchmarks to run on this PR labels Feb 2, 2026
pub struct VortexCopyFunction;

#[derive(Clone, Copy)]
struct SendableClientCtx(NonNull<cpp::duckdb_vx_client_context_>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's 2 definitions of SendableClientCtx and there's an existing ClientContext definition in vortex-duckdb/src/duckdb/client_context.rs. Implement the SendableClientCtx functionality on the existing ClientContext.

Copy link
Contributor

@0ax1 0ax1 Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you add send + sync to ClientContext in client_context.rs add an explanation to the SAFETY comments why it is thread safe https://github.com/duckdb/duckdb/blob/e5fb0a7eab6bc6fd7f26bd5b8cd74c3441f3895b/src/include/duckdb/main/client_context.hpp#L67.

// SPDX-FileCopyrightText: Copyright the Vortex contributors

use futures::StreamExt;
#[cfg(test)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of glob.rs is unused at this point. We can drop the whole file.

const DEFAULT_CONCURRENCY: usize = 64;

lifetime_wrapper!(FsFileHandle, cpp::duckdb_vx_file_handle, cpp::duckdb_vx_fs_close, [owned, ref]);
unsafe impl Send for FsFileHandle {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked up in the meantime whether DuckDB file handles are thread safe. There's no synchronization happening in FileHandle: https://github.com/duckdb/duckdb/blob/e5fb0a7eab6bc6fd7f26bd5b8cd74c3441f3895b/src/include/duckdb/common/file_system.hpp#L71

It is though possible to use parallel reads/writes. The requirement is that all operations are position based Read(buffer, size, location) Write(buffer, size, location) and not cursor based and therefore stateful.

The write therefore needs to change away from the Cursor based approach.

        wrapper->handle->Seek(offset);
        wrapper->handle->Write(const_cast<uint8_t *>(buffer), len);

FILE_FLAGS_PARALLEL_ACCESS: https://github.com/duckdb/duckdb/blob/e5fb0a7eab6bc6fd7f26bd5b8cd74c3441f3895b/src/include/duckdb/common/file_open_flags.hpp#L117.

Please double check all operations on FSFileHandle whether they statefully change the handle. And make sure to use FILE_FLAGS_PARALLEL_ACCESS.

}

/// A VortexReadAt implementation backed by DuckDB's filesystem (e.g., httpfs/s3).
pub struct DuckDbFsReadAt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming this to DuckDbFsReader?

}
}

// SAFETY: DuckDB file handles are thread-safe for reads; writes are done via DuckDbFsWriter.
Copy link
Contributor

@0ax1 0ax1 Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to adjust these SAFETY comments to reflect under which conditions the handle is safe to use across threads. Please double check my previous statements in ref with the DuckDB source code: https://github.com/duckdb/duckdb/blob/e5fb0a7eab6bc6fd7f26bd5b8cd74c3441f3895b/src/include/duckdb/common/file_system.hpp#L71.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

action/benchmark-sql Trigger SQL benchmarks to run on this PR changelog/feature A new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants