[Feature]: RocksDB → EloqStore Data Migration Tool #285

@MalikHou

Feature Request: RocksDB → EloqStore Data Migration Tool (Requirements Only)

Summary

Request an officially supported tool to migrate data from RocksDB to EloqStore, enabling both full dataset migration and (optionally) incremental migration for low-downtime cutover.

Motivation / Why This Matters

RocksDB is widely used for embedded or service-managed KV workloads. When migrating to EloqStore (for consolidation, cost, operability, or scaling needs), teams require a standardized migration path. Without an official tool, migrations rely on ad-hoc scripts that are error-prone, hard to validate, and difficult to operate at production scale.

Scope

In Scope

  • Migrate data from RocksDB instances used in production services.
  • Support common deployment realities:
    • large datasets across multiple local disks
    • multiple column families

Out of Scope (Initial Version)

  • Migration of non-RocksDB engines (e.g., LevelDB) unless explicitly added later.
  • Bi-directional sync or long-term replication.

Requirements

R1. Source Compatibility & Access Modes

  • Support migrating from:
    • a live RocksDB database (accessed as a secondary instance; no service interruption if possible)
    • an offline RocksDB database directory (no concurrent writes)
  • Support major RocksDB configurations commonly used in production:
    • single CF and multiple CF
    • custom comparators (must be detected and documented; migration may require constraints)
  • Provide clear compatibility documentation (supported RocksDB versions and file formats).

R2. Data Coverage

  • Migrate all key-value pairs for selected scope:
    • entire DB
    • selected column families
    • selected key ranges / prefixes (optional but desired)
  • Preserve relevant metadata where applicable/available:
    • sequence/version semantics if required by the target use case (must be clearly defined)
    • TTL behavior if the source uses TTL/expiration semantics (must be supported if present)
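
Selecting a prefix usually means turning it into a half-open key range that an iterator can scan. The sketch below shows one way to derive the exclusive upper bound for a prefix. It assumes the source uses bytewise key ordering (RocksDB's default comparator); `prefix_upper_bound` is a hypothetical helper name, not part of RocksDB or EloqStore.

```python
from typing import Optional


def prefix_upper_bound(prefix: bytes) -> Optional[bytes]:
    """Return the smallest key greater than every key starting with `prefix`.

    Assumes bytewise ordering. Returns None when no finite bound exists
    (empty prefix, or a prefix made only of 0xff bytes), meaning
    "scan to the end of the keyspace".
    """
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return None
    # Increment the last remaining byte to get the exclusive bound.
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def in_prefix_range(key: bytes, prefix: bytes) -> bool:
    """Check range membership using [prefix, upper_bound)."""
    upper = prefix_upper_bound(prefix)
    return key >= prefix and (upper is None or key < upper)
```

With a custom comparator this derivation does not hold, which is one reason R1 asks for comparators to be detected up front.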

R3. Mapping / Target Representation

  • Provide a deterministic and documented mapping from RocksDB data model to EloqStore:
    • namespace/table mapping
    • column family mapping (if EloqStore supports analogous concepts)
    • key/value encoding preservation (raw bytes) by default
  • Mapping must be configurable to support different schemas and table layouts.
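
As an illustration of what a deterministic, configurable mapping might look like, the sketch below maps each column family to a target table and an optional key prefix while leaving key and value bytes untouched. `MappingRule`, `map_record`, and the notion of a target "table" are assumptions made for this example; the real EloqStore data model may differ.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical rule: where one RocksDB column family lands in the target."""
    column_family: str
    target_table: str
    key_prefix: bytes = b""  # optional namespace prepended to every key


def map_record(
    rules: Dict[str, MappingRule], column_family: str, key: bytes, value: bytes
) -> Tuple[str, bytes, bytes]:
    """Map one source record to (table, key, value); same input, same output."""
    try:
        rule = rules[column_family]
    except KeyError:
        # Fail loudly rather than silently dropping an unmapped column family.
        raise ValueError(f"no mapping rule for column family {column_family!r}")
    # Key and value bytes are preserved as-is; only the prefix is added.
    return rule.target_table, rule.key_prefix + key, value
```

Folding several column families into one table with distinct prefixes, as `key_prefix` allows, is one option if the target has no analogous concept.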

R4. Scalability & Performance Controls

  • Support parallel migration to handle large datasets efficiently:
    • configurable parallelism
    • configurable per-worker resource limits
  • Provide throttling controls to protect both:
    • the source host (disk IO/CPU contention)
    • the target EloqStore cluster (write throughput and backpressure)
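
One common way to implement such throttling is a token bucket, where each worker must acquire tokens (bytes or operations) before reading or writing. The following is a minimal single-threaded sketch, not a proposed design; `TokenBucket` and its parameters are hypothetical, and a real tool would need thread safety and separate buckets for source reads and target writes.

```python
import time


class TokenBucket:
    """Minimal token bucket: `rate` tokens per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full so the first burst is free
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def wait_time(self, cost):
        """Seconds until `cost` tokens are available (0.0 if available now)."""
        self._refill()
        if cost <= self.tokens:
            return 0.0
        return (cost - self.tokens) / self.rate

    def consume(self, cost, sleep=time.sleep):
        """Block until `cost` tokens are available, then spend them."""
        if cost > self.capacity:
            raise ValueError("cost exceeds bucket capacity")
        wait = self.wait_time(cost)
        if wait > 0:
            sleep(wait)
        self._refill()
        self.tokens -= cost
```

A worker would call something like `bucket.consume(len(batch_bytes))` before each write, so lowering `rate` slows the migration without changing the parallelism setting.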

R5. Reliability, Correctness, and Resumability

  • Support checkpointing and resumability:
    • safe restart without reprocessing completed ranges unnecessarily
  • Ensure idempotent writes or an equivalent mechanism that prevents duplication/corruption on retries.
  • Provide deterministic ordering guarantees where required for correctness (e.g., per-range processing).
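
To make the checkpoint/resume and idempotency requirements concrete, here is a small sketch that records completed ranges in a file and skips them on restart. Everything in it is an assumption for illustration: the `Checkpoint` class, the JSON file format, and the `read_range` / `write_batch` callables stand in for whatever the real tool would use. It relies on target writes being upserts, so replaying a range that was written but not yet checkpointed is harmless.

```python
import json
import os


class Checkpoint:
    """Persists the set of completed range IDs to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            with open(path) as f:
                self.done = set(json.load(f))

    def mark_done(self, range_id):
        self.done.add(range_id)
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(self.done), f)
            f.flush()
            os.fsync(f.fileno())
        # Atomic rename: a crash leaves either the old or the new file.
        os.replace(tmp, self.path)


def migrate(ranges, read_range, write_batch, checkpoint):
    """Process ranges in a fixed order, skipping those already completed."""
    for range_id in sorted(ranges):
        if range_id in checkpoint.done:
            continue
        # Writes are assumed to be upserts, so a retry after a crash
        # between write_batch and mark_done does not duplicate data.
        write_batch(read_range(range_id))
        checkpoint.mark_done(range_id)
```

Marking a range done only after its write succeeds gives at-least-once processing per range; the idempotent write is what turns that into a correct result.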

R6. Incremental Migration (Optional but Desired)

  • Support a mode to capture and apply changes that occur during/after the baseline migration to enable low-downtime cutover.
  • Provide a clear operational model for:
    • when incremental mode starts
    • how consistency is assessed before switching traffic

R7. Validation and Verification

  • Provide validation capabilities to increase confidence in correctness:
    • at minimum: sampling-based key/value verification
    • optional: full verification mode for smaller datasets or critical tables
  • Generate a migration report including:
    • keys migrated, bytes migrated
    • per-column-family stats
    • validation results and discrepancy samples
    • failures/skips with actionable error reasons
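
Sampling-based verification can be done in a single pass over the source keys using reservoir sampling, which avoids holding the full key set in memory. The sketch below is one possible shape for it; `sample_verify`, the `source_get` / `target_get` callables, and the report fields are hypothetical names chosen for this example.

```python
import random


def reservoir_sample(iterable, k, rng):
    """Uniformly sample up to k items from an iterable of unknown length."""
    sample = []
    for i, item in enumerate(iterable):
        if i < k:
            sample.append(item)
        else:
            # Replace an existing element with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


def sample_verify(source_keys, source_get, target_get, sample_size, seed=0):
    """Compare source and target values for a random sample of keys."""
    rng = random.Random(seed)  # fixed seed makes a run reproducible
    sample = reservoir_sample(source_keys, sample_size, rng)
    mismatches = [k for k in sample if source_get(k) != target_get(k)]
    return {"checked": len(sample), "mismatches": mismatches}
```

Setting `sample_size` at or above the key count degenerates into the full verification mode, since every key ends up in the sample.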

R8. Observability & Operability

  • Emit structured logs suitable for centralized log systems.
  • Expose metrics for progress and health:
    • throughput, error rate, retry rate
    • progress per CF / key-range
  • Provide a dry-run / plan mode to:
    • validate access to RocksDB
    • enumerate CFs and basic stats
    • estimate migration scope without writing
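
For the structured-log requirement, one line of JSON per event is a format most centralized log systems can ingest directly. The sketch below uses Python's standard `logging` module; the `JsonFormatter` class and the `fields` attribute are conventions invented for this example, not an agreed log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured context is passed via `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name="migrate"):
    """Build a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `logger.info("range done", extra={"fields": {"cf": "default", "keys": 1000}})` would then carry the per-CF progress fields alongside the message.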

R9. Security & Compliance

  • Do not leak secrets in logs.
  • Support secure credential handling for EloqStore endpoints.
  • Support encryption in transit where required (TLS).
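
One way to guard against secrets reaching the logs is to redact them centrally in a logging filter rather than relying on every call site. The sketch below shows the idea with Python's standard `logging` module; the list of sensitive parameter names and the `key=value` pattern are assumptions, and a real tool would need to match whatever credential formats EloqStore endpoints actually use.

```python
import logging
import re

# Hypothetical set of parameter names treated as sensitive.
SECRET_PATTERN = re.compile(r"(?i)\b(password|token|secret|api_key)=([^\s&]+)")


def redact(text):
    """Replace the value of any sensitive key=value pair with ***."""
    return SECRET_PATTERN.sub(lambda m: m.group(1) + "=***", text)


class RedactingFilter(logging.Filter):
    """Logging filter that redacts secrets from the final message."""

    def filter(self, record):
        # Format first so secrets passed as %-style args are also covered.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it
```

Attaching the filter to the handler, for example `handler.addFilter(RedactingFilter())`, applies it to every record that handler emits.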

R10. User Experience

  • Provide a CLI with clear help text and examples.
  • Provide documentation/runbooks for:
    • prerequisites (permissions, disk space, expected runtime)
    • recommended throttling settings
    • cutover checklist
    • rollback expectations (even if rollback is manual)

Acceptance Criteria

  • A single, supported tool (CLI and/or service) that can migrate production RocksDB datasets to EloqStore with:
    • checkpoint/resume
    • throttling and parallelism controls
    • validation reporting
    • production-grade logging/metrics
  • Documentation describing:
    • supported RocksDB versions and configurations
    • handling of column families and key ranges
    • expected operational steps for safe cutover
