This patch introduces a new `compress` target for rublk, which uses the RocksDB key-value store as a backing device to provide a compressed userspace block device.

The implementation includes several key features and optimizations:

- **RocksDB Backend:** Leverages RocksDB for storing block data, with the key as the block address (u64) and the value as the 512-byte block content.
- **Configurable Compression:** The compression algorithm can be selected via a command-line argument, with LZ4 as the default.
- **Optimized Reads/Writes:** I/O is optimized by using batch operations (`multi_get` for reads and `WriteBatch` for writes) to reduce overhead.
- **Configuration:** The device size and RocksDB data directory are configurable via command-line arguments. Device configuration is persisted in a JSON file for recovery.
- **Cache Settings:** The device is configured with the `UBLK_ATTR_VOLATILE_CACHE` attribute, and RocksDB is tuned with a block cache and bloom filters to improve read performance.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
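The key/value layout described above can be sketched with a sorted in-memory map standing in for RocksDB. This is only an illustration: `BlockStore`, `block_key`, the big-endian key encoding, and the zero-fill for unwritten blocks are assumptions made for the sketch, not details taken from the patch.

```rust
use std::collections::BTreeMap;

/// Size of one stored value in the layout described above.
const SECTOR_SIZE: usize = 512;

/// Encode a block address as a fixed-width key.
/// Assumption: big-endian, so byte-wise key order equals numeric order.
fn block_key(addr: u64) -> [u8; 8] {
    addr.to_be_bytes()
}

/// In-memory stand-in for the key-value store (not the RocksDB API).
struct BlockStore {
    map: BTreeMap<[u8; 8], Vec<u8>>,
}

impl BlockStore {
    fn new() -> Self {
        BlockStore { map: BTreeMap::new() }
    }

    /// Batched write: one entry per 512-byte block, starting at `start`.
    fn write(&mut self, start: u64, data: &[u8]) {
        assert!(data.len() % SECTOR_SIZE == 0);
        for (i, chunk) in data.chunks(SECTOR_SIZE).enumerate() {
            self.map.insert(block_key(start + i as u64), chunk.to_vec());
        }
    }

    /// Batched read. Assumption: blocks never written read back as zeroes.
    fn read(&self, start: u64, nr_blocks: usize) -> Vec<u8> {
        let mut out = Vec::with_capacity(nr_blocks * SECTOR_SIZE);
        for i in 0..nr_blocks {
            match self.map.get(&block_key(start + i as u64)) {
                Some(v) => out.extend_from_slice(v),
                None => out.extend_from_slice(&[0u8; SECTOR_SIZE]),
            }
        }
        out
    }
}
```

In the real target the loops would presumably map onto a single `WriteBatch` and a single `multi_get` call, which is where the batching benefit comes from.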
Offload the blocking `db.flush()` operation to a background thread to prevent it from stalling the main ublk I/O handler.

This is achieved by:

- Spawning a dedicated thread to perform the RocksDB flush.
- Using an `eventfd` to notify the main thread when the flush operation is complete.
- Having the main I/O handler poll the `eventfd` and complete the queued flush commands only after the background flush has finished.

This change makes the flush operation asynchronous from the perspective of the ublk I/O loop, improving responsiveness.

Also, `WriteOptions` are now used to disable WAL and sync for normal writes, as flushes are handled explicitly.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Remove the extra 'ublk_compress' subdirectory and store the RocksDB database files directly in the directory provided by the --dir command-line argument.

This simplifies the directory structure and makes the behavior more intuitive for users.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
548875f to 7b37638
Implement support for the `UBLK_IO_OP_DISCARD` command in the compress
target.
This is achieved by:
- Advertising discard support to the ublk core by setting the
`UBLK_PARAM_TYPE_DISCARD` parameter type and configuring the
discard granularity and limits.
- Using the optimized `db.delete_range_cf()` RocksDB method to delete
all keys within the specified sector range in a single, efficient
operation. This avoids the overhead of iterating over a potentially
very large range and creating a large `WriteBatch`.
- Opening the RocksDB instance with an explicit column family ("default")
to get the required handle for the `delete_range_cf` operation.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
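The effect of a range delete on a sorted key space can be sketched with a `BTreeMap` standing in for the database. The `Store` alias and the `discard` helper are hypothetical names for this illustration; the real target calls `db.delete_range_cf()` instead, and this sketch does not model RocksDB internals.

```rust
use std::collections::BTreeMap;

/// Stand-in for a sorted key-value store keyed by sector number.
type Store = BTreeMap<u64, Vec<u8>>;

/// Drop every key in [start_sector, start_sector + nr_sectors) without
/// visiting keys one at a time in the caller.
/// Returns how many keys were removed.
fn discard(store: &mut Store, start_sector: u64, nr_sectors: u64) -> usize {
    let end = start_sector.saturating_add(nr_sectors);
    // `removed` now holds every key >= start_sector.
    let mut removed = store.split_off(&start_sector);
    // Keys >= end are outside the discarded range, so put them back.
    let mut kept_tail = removed.split_off(&end);
    store.append(&mut kept_tail);
    removed.len()
}
```

The point of the sketch is that the operation is expressed as one half-open range rather than as a per-key batch, which is the property the commit message relies on.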
09da1b8 to f276799
Add a suite of integration tests for the `compress` target to validate its functionality, including basic I/O, recovery, and use as a standard block device.

The following tests have been added:

- `test_ublk_add_del_compress`: Verifies basic device creation, I/O operations (read/write), and deletion.
- `test_ublk_compress_recover`: Tests the device recovery mechanism after a simulated failure.
- `test_ublk_format_mount_compress`: Validates the device by formatting it with an ext4 filesystem, mounting it, and performing file I/O.

To support this, a reusable test harness `__test_ublk_add_del_compress` has been created, and the test device size has been set to 8G.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
To improve performance and avoid blocking the main ublk I/O loop, this patch offloads blocking RocksDB read operations to a dedicated worker thread.

This is achieved by:

- Spawning a new thread to handle all `db.multi_get()` calls.
- Using a channel to send read jobs from the main I/O handler to the worker thread. The buffer address is safely passed by casting it to a u64.
- Using a second channel for the worker to send completion data (or errors) back to the main thread.
- Integrating with the `ublk` event loop by using an `eventfd` to notify the main thread when a read completion is ready. The `io_handler` now polls this `eventfd` and processes the completion queue.

This change makes read operations asynchronous from the perspective of the ublk I/O loop, significantly improving responsiveness for concurrent workloads.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
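The job/completion channel pattern can be sketched with the standard library alone. `ReadJob`, `ReadDone`, and `spawn_read_worker` are hypothetical names, the `fill` byte stands in for the blocking `db.multi_get()` lookup, and the `eventfd` notification is left as a comment because it needs a platform crate; none of this is the patch's actual code.

```rust
use std::sync::mpsc;
use std::thread;

/// A read job; the buffer pointer travels as a plain u64 so the job is `Send`.
struct ReadJob {
    tag: u16,
    buf_addr: u64,
    len: usize,
}

/// Completion sent back to the I/O loop: bytes read, or an errno-style code.
struct ReadDone {
    tag: u16,
    result: Result<usize, i32>,
}

/// Spawn the worker; `fill` stands in for the data a database lookup returns.
fn spawn_read_worker(
    fill: u8,
) -> (mpsc::Sender<ReadJob>, mpsc::Receiver<ReadDone>, thread::JoinHandle<()>) {
    let (job_tx, job_rx) = mpsc::channel::<ReadJob>();
    let (done_tx, done_rx) = mpsc::channel::<ReadDone>();
    let handle = thread::spawn(move || {
        // The loop ends when every job sender has been dropped.
        for job in job_rx {
            // SAFETY: the submitter must keep the buffer alive and untouched
            // until it has received the completion for this tag.
            let buf = unsafe {
                std::slice::from_raw_parts_mut(job.buf_addr as *mut u8, job.len)
            };
            buf.fill(fill);
            // A real target would also signal its eventfd at this point.
            let done = ReadDone { tag: job.tag, result: Ok(job.len) };
            if done_tx.send(done).is_err() {
                break;
            }
        }
    });
    (job_tx, done_rx, handle)
}
```

Casting the pointer to `u64` only satisfies the `Send` bound; the safety of the write still depends on the submitter not reusing the buffer until the matching completion arrives.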
Refactor the flush command handling to align with the asynchronous offloading pattern used by the read command.

This change introduces a dedicated job/completion channel and a background thread specifically for flush operations. This approach removes the more complex shared `pending_tags` mutex and simplifies the `io_handler` logic.

Now, both `READ` and `FLUSH` commands are handled symmetrically, each with its own worker thread and eventfd for notification, making the asynchronous flow more consistent and easier to maintain.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
9f94908 to 47a09cd
Ensure that the logical and physical block sizes for a `compress` target device are immutable after first creation.

This is achieved by:

- Adding `logical_block_size` and `physical_block_size` fields to the `ublk_compress.json` configuration file.
- On first run, the block sizes are determined from command-line arguments or defaults and saved to the JSON file.
- On subsequent runs for an existing device, these values are read from the JSON file, and any block size arguments from the command line are ignored.

This guarantees that the device geometry remains consistent across restarts, preventing potential data corruption or filesystem errors.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
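The precedence rule (saved configuration first, then command line, then defaults) can be sketched as a small pure function. `Geometry`, `resolve_geometry`, and the default values of 512 and 4096 are assumptions for illustration; JSON loading and saving is omitted.

```rust
/// Device geometry as it would be persisted in the configuration file.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Geometry {
    logical_block_size: u32,
    physical_block_size: u32,
}

/// Assumed defaults, used only on first creation when the CLI gives no value.
const DEFAULT_LBS: u32 = 512;
const DEFAULT_PBS: u32 = 4096;

/// `saved` is `Some` when an existing device's configuration was found.
fn resolve_geometry(
    saved: Option<Geometry>,
    cli_lbs: Option<u32>,
    cli_pbs: Option<u32>,
) -> Geometry {
    match saved {
        // Existing device: saved values win and CLI arguments are ignored.
        Some(g) => g,
        // First run: take CLI values, falling back to the defaults.
        None => Geometry {
            logical_block_size: cli_lbs.unwrap_or(DEFAULT_LBS),
            physical_block_size: cli_pbs.unwrap_or(DEFAULT_PBS),
        },
    }
}
```

Keeping the decision in one function makes it easy to check that no code path lets a command-line value override a persisted one.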
Change the storage format for the compress target to improve performance and better align with typical I/O patterns.

Instead of storing a fixed 512-byte sector per key, each key now stores one full logical block. The key is still the 512-byte sector offset of the start of the logical block.

This change modifies the read, write, and discard handlers to calculate I/O operations in terms of logical blocks rather than sectors, reducing the number of required database operations for any given I/O request.

The RocksDB block size has also been adjusted to be a multiple of the logical block size for better cache alignment.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
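As a rough illustration of the new key arithmetic, the following sketch lists the keys touched by one request. `block_keys` is a hypothetical helper, and it assumes requests are aligned to the logical block size, which the block layer normally guarantees.

```rust
/// log2 of the 512-byte sector size.
const SECTOR_SHIFT: u32 = 9;

/// Keys (512-byte sector offsets of logical block starts) covered by a
/// request of `nr_sectors` sectors beginning at `start_sector`.
fn block_keys(start_sector: u64, nr_sectors: u32, logical_block_size: u32) -> Vec<u64> {
    let sectors_per_block = (logical_block_size >> SECTOR_SHIFT) as u64;
    assert!(sectors_per_block > 0, "logical block size must be >= 512");
    // Assumption: the request is aligned to the logical block size.
    assert!(start_sector % sectors_per_block == 0);
    assert!(nr_sectors as u64 % sectors_per_block == 0);
    let nr_blocks = nr_sectors as u64 / sectors_per_block;
    (0..nr_blocks)
        .map(|i| start_sector + i * sectors_per_block)
        .collect()
}
```

For an 8 KiB request at sector 16, a 4096-byte logical block needs two keys (16 and 24), whereas the per-sector format needed sixteen.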
Wire up the ublk read-only flag with the RocksDB backend to ensure consistent behavior.

This is achieved by:

- Checking the read-only flag from the device parameters when adding a new `compress` target.
- If the device is read-only, the RocksDB database is now opened in secondary (read-only) mode using `DB::open_cf_as_secondary`.
- The I/O handler now rejects any `WRITE` or `DISCARD` commands with an `EACCES` error if the device is in read-only mode, providing a fast-path failure without attempting to modify the database.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
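The fast-path rejection can be sketched as a check that runs before any database access. `IoOp` and `check_access` are hypothetical names for this illustration, and returning a negative errno is an assumed convention; `EACCES` is 13 on Linux.

```rust
/// Linux errno value for "permission denied".
const EACCES: i32 = 13;

/// Simplified stand-in for the ublk I/O operation codes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IoOp {
    Read,
    Write,
    Flush,
    Discard,
}

/// Reject mutating operations on a read-only device before touching the
/// database. Assumption: errors are reported as negative errno values.
fn check_access(op: IoOp, read_only: bool) -> Result<(), i32> {
    match op {
        IoOp::Write | IoOp::Discard if read_only => Err(-EACCES),
        _ => Ok(()),
    }
}
```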
cargo:warning=/usr/include/c++/13/cstdint:38:10: fatal error: bits/c++config.h: No such file or directory
cargo:warning=   38 | #include <bits/c++config.h>
cargo:warning=      |          ^~~~~~~~~~~~~~~~~~

Also install rustc 1.85.0; otherwise some packages may fail to build.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
The test CI takes too long and is not reliable, so remove it.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Add RocksDB-based compression target.