Merged
123 changes: 123 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,123 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

libublk-rs is a Rust library for building Linux ublk (userspace block) target devices. It provides a high-level API for creating custom block devices that run in userspace while interfacing with the Linux kernel's ublk driver. The library uses io_uring for high-performance asynchronous I/O operations.

## Build Commands

- `cargo build` - Build the library
- `cargo build --features=fat_complete` - Build with fat completion feature
- `cargo test` - Run tests
- `cargo run --example null help` - Run the null target example with help
- `cargo run --example loop help` - Run the loop target example with help
- `cargo run --example ramdisk` - Run the ramdisk example

## Core Architecture

### Main Components

1. **Control Layer (`src/ctrl.rs`)**:
   - `UblkCtrl` and `UblkCtrlBuilder` - Device creation and management
   - Handles device lifecycle (add, start, stop, delete)
   - CPU affinity management for queues
   - Uses `/dev/ublk-control` for kernel communication

2. **I/O Layer (`src/io.rs`)**:
   - `UblkDev` - Device representation
   - `UblkQueue` - Per-queue I/O handling
   - `UblkIOCtx` - I/O context management
   - Raw SQE (Submission Queue Entry) manipulation via `RawSqe`

3. **Async Support (`src/uring_async.rs`)**:
   - `UblkUringOpFuture` - io_uring integration
   - `ublk_wait_and_handle_ios()` - Main event loop driver

4. **System Bindings (`src/sys.rs`, `src/bindings.rs`)**:
   - Low-level kernel interface definitions
   - Generated from C headers via build.rs

5. **Helpers (`src/helpers.rs`)**:
   - `IoBuf` - I/O buffer management utilities

### Key Patterns

- **Async/Await Model**: The library is built around async/await with io_uring for high-performance I/O
- **Queue-per-Core**: Each device has multiple queues (typically one per CPU core)
- **Zero-Copy**: Uses memory mapping and buffer registration for efficient data transfer
- **RAII**: Device cleanup happens automatically when `UblkCtrl` is dropped

### Build System

The project uses a custom `build.rs` that:
- Generates Rust bindings from `ublk_cmd.h` using bindgen
- Adds serde serialization support to generated structs
- Handles kernel version compatibility issues (Fix753 workaround)

### Features

- `fat_complete` - Enables batch completion and zoned append operations
- Default build includes basic functionality

### Device Flags

- `UBLK_DEV_F_MLOCK_IO_BUFFER` - Locks I/O buffer pages in memory to prevent swapping
  - Requires `CAP_IPC_LOCK` capability
  - Incompatible with `UBLK_F_USER_COPY`, `UBLK_F_AUTO_BUF_REG`, and `UBLK_F_SUPPORT_ZERO_COPY`
  - Use when predictable I/O latency is critical and swapping must be avoided
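
The incompatibility rule above amounts to a single bitmask test. A minimal sketch, assuming illustrative bit positions for the three `UBLK_F_*` flags (the real values come from the generated `libublk::sys` bindings) and a hypothetical helper name `mlock_compatible` that is not part of the library:

```rust
// Illustrative bit positions only; in the real crate these constants are
// generated by bindgen and exposed through `libublk::sys`.
const UBLK_F_SUPPORT_ZERO_COPY: u64 = 1 << 0;
const UBLK_F_USER_COPY: u64 = 1 << 7;
const UBLK_F_AUTO_BUF_REG: u64 = 1 << 11;

/// Hypothetical helper: true when `ctrl_flags` contains none of the
/// features that cannot be combined with `UBLK_DEV_F_MLOCK_IO_BUFFER`.
pub fn mlock_compatible(ctrl_flags: u64) -> bool {
    const INCOMPATIBLE: u64 =
        UBLK_F_SUPPORT_ZERO_COPY | UBLK_F_USER_COPY | UBLK_F_AUTO_BUF_REG;
    (ctrl_flags & INCOMPATIBLE) == 0
}
```

A caller would run this check only when the mlock device flag is set and reject the configuration otherwise.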

### Examples Structure

All examples follow the pattern:
1. Create `UblkCtrl` with `UblkCtrlBuilder`
2. Define target initialization function
3. Define per-queue I/O handling function
4. Call `ctrl.run_target()` with these functions
5. Handle graceful shutdown (Ctrl+C)

The examples demonstrate different target types:
- `null.rs` - Null device (discards writes, returns zeros)
- `loop.rs` - Loop device (file-backed)
- `ramdisk.rs` - RAM-based storage

### Dependencies

Key external dependencies:
- `io-uring` - Linux io_uring interface
- `smol` - Async runtime used in examples
- `serde` - Serialization for device parameters
- `bindgen` - C header binding generation (build-time)

## Development Notes

### Testing Requirements

- Tests require Linux kernel 6.0+ with CONFIG_BLK_DEV_UBLK enabled
- Some tests may require root privileges for device creation
- CI runs on both stable and nightly Rust toolchains
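
A test harness may want to skip early on kernels that are too old. The following is a sketch only: `parse_kernel_release` and `kernel_supports_ublk` are hypothetical helpers, not part of libublk-rs, and they check the version number alone, not whether `CONFIG_BLK_DEV_UBLK` is enabled.

```rust
/// Hypothetical helper: extract (major, minor) from a kernel release
/// string such as the one printed by `uname -r`.
pub fn parse_kernel_release(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.trim().split('.');
    let major = parts.next()?.parse().ok()?;
    // The minor field can carry a suffix (e.g. "0-rc1"), so keep only
    // its leading digits before parsing.
    let digits: String = parts
        .next()?
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let minor = digits.parse().ok()?;
    Some((major, minor))
}

/// Hypothetical helper: true when the release is at least 6.0.
pub fn kernel_supports_ublk(release: &str) -> bool {
    matches!(parse_kernel_release(release), Some(version) if version >= (6, 0))
}
```

On Linux the release string can be read from `/proc/sys/kernel/osrelease` and passed to `kernel_supports_ublk`.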

### Memory Locking (mlock) Support

When using `UBLK_DEV_F_MLOCK_IO_BUFFER`, the application requires the `CAP_IPC_LOCK` capability:

```bash
# Grant capability to your ublk executable
sudo setcap cap_ipc_lock=eip /path/to/your/ublk_executable

# Or run with elevated privileges
sudo ./your_ublk_executable

# Check current capabilities
getcap /path/to/your/ublk_executable
```

This feature locks I/O buffer pages in physical memory to prevent them from being swapped to disk, ensuring consistent I/O performance but increasing memory pressure.
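
The locking behaviour described here can be sketched without any external crate. This is an illustration under stated assumptions, not the library's code: `LockedBuf` is a hypothetical stand-in for `IoBuf<u8>`, and `mlock`/`munlock` are declared by hand here, whereas the PR calls them through the `libc` crate.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ffi::c_void;
use std::os::raw::c_int;

// Hand-written declarations of the C library functions, so the sketch
// builds with std only.
extern "C" {
    fn mlock(addr: *const c_void, len: usize) -> c_int;
    fn munlock(addr: *const c_void, len: usize) -> c_int;
}

/// Hypothetical page-aligned buffer that is locked on a best-effort basis.
pub struct LockedBuf {
    ptr: *mut u8,
    layout: Layout,
    locked: bool,
}

impl LockedBuf {
    pub fn new(size: usize) -> Self {
        assert!(size != 0, "zero-sized buffers are not supported");
        let layout = Layout::from_size_align(size, 4096).expect("invalid layout");
        let ptr = unsafe { alloc_zeroed(layout) };
        assert!(!ptr.is_null(), "allocation failed");
        // mlock can fail (missing CAP_IPC_LOCK, RLIMIT_MEMLOCK exceeded),
        // so the outcome is recorded instead of treated as fatal.
        let locked = unsafe { mlock(ptr as *const c_void, size) } == 0;
        LockedBuf { ptr, layout, locked }
    }

    /// Whether the pages were actually locked in memory.
    pub fn is_locked(&self) -> bool {
        self.locked
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for LockedBuf {
    fn drop(&mut self) {
        // Unlock first (only if locking succeeded), then free the memory
        // with the same layout it was allocated with.
        if self.locked {
            unsafe { munlock(self.ptr as *const c_void, self.layout.size()) };
        }
        unsafe { dealloc(self.ptr, self.layout) };
    }
}
```

Because locking is best-effort, callers that depend on it (for example when the device backs swap) should check `is_locked()` rather than assume success.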

### Unprivileged Mode Support

The library supports unprivileged device creation via the `UBLK_F_UNPRIVILEGED_DEV` flag, which requires:
- Proper udev rules installation
- `ublk_chown.sh` script in `/usr/local/sbin/`
- `ublk_user_id` binary installation
23 changes: 22 additions & 1 deletion examples/loop.rs
@@ -30,6 +30,7 @@ bitflags! {
const ASYNC = 0b00000001;
const FOREGROUND = 0b00000010;
const ONESHOT = 0b00001000;
const MLOCK = 0b00010000;
}
}

@@ -266,14 +267,20 @@ fn __loop_add(
direct_io: 1,
back_file_path: backing_file.clone(),
};
let dev_flags = UblkFlags::UBLK_DEV_F_ADD_DEV
| if lo_flags.intersects(LoFlags::MLOCK) {
UblkFlags::UBLK_DEV_F_MLOCK_IO_BUFFER
} else {
UblkFlags::empty()
};
let ctrl = libublk::ctrl::UblkCtrlBuilder::default()
.name("example_loop")
.id(id)
.ctrl_flags(ctrl_flags)
.nr_queues(nr_queues.try_into().unwrap())
.depth(depth)
.io_buf_bytes(buf_sz)
- .dev_flags(UblkFlags::UBLK_DEV_F_ADD_DEV)
+ .dev_flags(dev_flags)
.build()
.unwrap();
let tgt_init = |dev: &mut UblkDev| lo_init_tgt(dev, &lo);
@@ -333,6 +340,10 @@ fn loop_add(
}

fn main() {
env_logger::builder()
.format_target(false)
.format_timestamp(None)
.init();
let matches = Command::new("ublk-loop-example")
.subcommand_required(true)
.arg_required_else_help(true)
@@ -405,6 +416,13 @@ fn main() {
.long("oneshot")
.action(ArgAction::SetTrue)
.help("create, dump and remove device automatically"),
)
.arg(
Arg::new("mlock_io_buffer")
.long("mlock-io-buffer")
.short('m')
.action(ArgAction::SetTrue)
.help("enable UBLK_DEV_F_MLOCK_IO_BUFFER to lock IO buffers in memory"),
),
)
.subcommand(
@@ -454,6 +472,9 @@ fn main() {
if add_matches.get_flag("oneshot") {
lo_flags |= LoFlags::ONESHOT;
};
if add_matches.get_flag("mlock_io_buffer") {
lo_flags |= LoFlags::MLOCK;
}
let ctrl_flags: u64 = if add_matches.get_flag("unprivileged") {
libublk::sys::UBLK_F_UNPRIVILEGED_DEV as u64
} else {
11 changes: 11 additions & 0 deletions src/ctrl.rs
@@ -1074,6 +1074,17 @@ impl UblkCtrl {
return Err(UblkError::InvalidVal);
}

// Check mlock feature compatibility
if dev_flags.intersects(UblkFlags::UBLK_DEV_F_MLOCK_IO_BUFFER) {
// mlock feature is incompatible with certain other features
if (flags & sys::UBLK_F_USER_COPY as u64) != 0
|| (flags & sys::UBLK_F_AUTO_BUF_REG as u64) != 0
|| (flags & sys::UBLK_F_SUPPORT_ZERO_COPY as u64) != 0
{
return Err(UblkError::InvalidVal);
}
}

if id < 0 && id != -1 {
return Err(UblkError::InvalidVal);
}
43 changes: 42 additions & 1 deletion src/helpers.rs
@@ -9,6 +9,7 @@ pub fn type_of_this<T>(_: &T) -> String {
pub struct IoBuf<T> {
ptr: *mut T,
size: usize,
mlocked: bool,
}

// Users of IoBuf has to deal with Send & Sync
@@ -22,7 +23,40 @@ impl<T> IoBuf<T> {

assert!(size != 0);

- IoBuf { ptr, size }
+ IoBuf {
+ ptr,
+ size,
+ mlocked: false,
+ }
}

pub fn new_with_mlock(size: usize) -> Self {
let layout = std::alloc::Layout::from_size_align(size, 4096).unwrap();
let ptr = unsafe { std::alloc::alloc(layout) } as *mut T;

assert!(size != 0);

let mut buf = IoBuf {
ptr,
size,
mlocked: false,
};

// Attempt to mlock the buffer
let mlock_result = unsafe { libc::mlock(ptr as *const libc::c_void, size) };

if mlock_result == 0 {
buf.mlocked = true;
}
// Note: We don't fail if mlock fails, as it might be due to permissions
// or system limits. The caller can check with is_mlocked().

buf
}

/// Check if the buffer is currently locked in memory
pub fn is_mlocked(&self) -> bool {
self.mlocked
}

/// how many elements in this buffer
@@ -82,6 +116,13 @@ impl<T> DerefMut for IoBuf<T> {
/// Free buffer with same alloc layout
impl<T> Drop for IoBuf<T> {
fn drop(&mut self) {
// munlock the buffer if it was mlocked
if self.mlocked {
unsafe {
libc::munlock(self.ptr as *const libc::c_void, self.size);
}
}

let layout = std::alloc::Layout::from_size_align(self.size, 4096).unwrap();
unsafe { std::alloc::dealloc(self.ptr as *mut u8, layout) };
}
43 changes: 35 additions & 8 deletions src/io.rs
@@ -332,8 +332,14 @@ impl UblkDev {
let bytes = self.dev_info.max_io_buf_bytes as usize;
let mut bvec = Vec::with_capacity(depth as usize);

let use_mlock = self.flags.intersects(UblkFlags::UBLK_DEV_F_MLOCK_IO_BUFFER);

for _ in 0..depth {
- bvec.push(IoBuf::<u8>::new(bytes));
+ if use_mlock {
+ bvec.push(IoBuf::<u8>::new_with_mlock(bytes));
+ } else {
+ bvec.push(IoBuf::<u8>::new(bytes));
+ }
}

bvec
@@ -1021,10 +1027,13 @@ impl UblkQueue<'_> {
/// COMMIT_AND_FETCH_REQ command is used for both committing io command
/// result and fetching new incoming IO.
pub fn submit_fetch_commands_with_auto_buf_reg(
- self,
- buf_reg_data_list: &[sys::ublk_auto_buf_reg]
+ self,
+ buf_reg_data_list: &[sys::ublk_auto_buf_reg],
  ) -> Self {
- assert!(self.support_auto_buf_zc(), "Auto buffer registration not supported");
+ assert!(
+ self.support_auto_buf_zc(),
+ "Auto buffer registration not supported"
+ );
assert!(
buf_reg_data_list.len() >= self.q_depth as usize,
"Buffer registration data list too short"
@@ -1034,7 +1043,7 @@ impl UblkQueue<'_> {
let buf_reg_data = &buf_reg_data_list[i as usize];
let auto_buf_addr = bindings::ublk_auto_buf_reg_to_sqe_addr(buf_reg_data);
let data = UblkIOCtx::build_user_data(i as u16, sys::UBLK_U_IO_FETCH_REQ, 0, false);

self.__queue_io_cmd(
&mut self.q_ring.borrow_mut(),
i as u16,
@@ -1151,13 +1160,23 @@ impl UblkQueue<'_> {
assert!(self.support_comp_batch());
for item in ios {
let tag = item.0;
- self.commit_and_queue_io_cmd_with_auto_buf_reg(r, tag, buf_reg_data, item.1);
+ self.commit_and_queue_io_cmd_with_auto_buf_reg(
+ r,
+ tag,
+ buf_reg_data,
+ item.1,
+ );
}
}
UblkFatRes::ZonedAppendRes((res, lba)) => {
let mut buf_reg_data_for_zoned = *buf_reg_data;
buf_reg_data_for_zoned.index = (lba & 0xffff) as u16;
- self.commit_and_queue_io_cmd_with_auto_buf_reg(r, tag, &buf_reg_data_for_zoned, res);
+ self.commit_and_queue_io_cmd_with_auto_buf_reg(
+ r,
+ tag,
+ &buf_reg_data_for_zoned,
+ res,
+ );
}
},
_ => {}
@@ -1271,7 +1290,15 @@ impl UblkQueue<'_> {
let mut state = self.state.borrow_mut();
let empty = self.q_ring.borrow_mut().submission().is_empty();

- if empty && state.get_nr_cmd_inflight() == self.q_depth && !state.is_idle() {
+ // don't enter idle if mlock buffers is enabled
+ if !self
+ .dev
+ .flags
+ .intersects(UblkFlags::UBLK_DEV_F_MLOCK_IO_BUFFER)
+ && empty
+ && state.get_nr_cmd_inflight() == self.q_depth
+ && !state.is_idle()
+ {
log::debug!(
"dev {} queue {} becomes idle",
self.dev.dev_info.dev_id,
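
The idle condition changed in the last `src/io.rs` hunk can be read as a pure predicate. The sketch below restates it under that assumption; `QueueSnapshot` and `should_enter_idle` are hypothetical names for illustration and do not exist in the library.

```rust
/// Hypothetical snapshot of the values the idle check inspects.
#[derive(Clone, Copy, Debug)]
pub struct QueueSnapshot {
    /// `UBLK_DEV_F_MLOCK_IO_BUFFER` is set on the device.
    pub mlock_io_buffer: bool,
    /// The io_uring submission queue has no pending entries.
    pub sq_empty: bool,
    /// Number of io commands currently in flight.
    pub inflight_cmds: u32,
    /// Queue depth.
    pub depth: u32,
    /// The queue is already marked idle.
    pub idle: bool,
}

/// A queue may become idle only when mlock is off, nothing is waiting to
/// be submitted, every tag has a command in flight, and the queue is not
/// idle already.
pub fn should_enter_idle(q: QueueSnapshot) -> bool {
    !q.mlock_io_buffer && q.sq_empty && q.inflight_cmds == q.depth && !q.idle
}
```

With the mlock flag set the predicate is always false, which matches the new comment in the hunk: a queue with locked buffers never enters the idle state.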
5 changes: 5 additions & 0 deletions src/lib.rs
@@ -35,6 +35,11 @@ bitflags! {
/// from queue's affinity instead of setting all CPUs
const UBLK_DEV_F_SINGLE_CPU_AFFINITY = 0b00010000;

/// enable mlock for io buffers: lock user IO buffer pages in memory
/// to prevent swapping. Requires CAP_IPC_LOCK capability.
/// It is required for ublk to be used as swap disk
const UBLK_DEV_F_MLOCK_IO_BUFFER = 0b00100000;

const UBLK_DEV_F_INTERNAL_0 = 1_u32 << 31;
const UBLK_DEV_F_INTERNAL_1 = 1_u32 << 30;
}
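
For reference, the way `examples/loop.rs` combines the new flag with the existing ones can be sketched with plain integers instead of the `bitflags` crate. The values of `UBLK_DEV_F_MLOCK_IO_BUFFER` and the two internal flags are taken from the hunk above; the value of `UBLK_DEV_F_ADD_DEV` is an assumption made for illustration, and both helper functions are hypothetical.

```rust
// Assumed value for illustration; the real definition lives in src/lib.rs.
const UBLK_DEV_F_ADD_DEV: u32 = 0b0000_0010;
// Values below are copied from the hunk above.
const UBLK_DEV_F_MLOCK_IO_BUFFER: u32 = 0b0010_0000;
const UBLK_DEV_F_INTERNAL_0: u32 = 1 << 31;
const UBLK_DEV_F_INTERNAL_1: u32 = 1 << 30;

/// Hypothetical helper mirroring the example: always request device
/// creation, and add the mlock flag only when asked for.
pub fn dev_flags_for(mlock: bool) -> u32 {
    UBLK_DEV_F_ADD_DEV | if mlock { UBLK_DEV_F_MLOCK_IO_BUFFER } else { 0 }
}

/// Hypothetical helper: true when `flags` does not touch the bits
/// reserved for internal use.
pub fn is_public_only(flags: u32) -> bool {
    flags & (UBLK_DEV_F_INTERNAL_0 | UBLK_DEV_F_INTERNAL_1) == 0
}
```

The new flag occupies bit 5, so it stays clear of the two internal bits at the top of the 32-bit word.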