Low-latency, multi-threaded data processing system designed to interface directly with low-level hardware

kennethkenn/Data-Engine

High-Performance Data Engine (HPDE)

The High-Performance Data Engine (HPDE) is a low-latency, multi-threaded data processing system designed to interface directly with low-level hardware. It is built with a focus on achieving sub-microsecond end-to-end latency, lock-free concurrency, deterministic memory behavior, and minimal OS interference.

Build Instructions

Prerequisites

  • C++20-compatible compiler (clang++ preferred, or g++)
  • CMake (version 3.16 or later)

Steps

  1. Clone the repository:

    git clone <repository-url>
    cd Data-Engine
  2. Create a build directory and navigate to it:

    mkdir build && cd build
  3. Configure the project with CMake:

    cmake -S .. -B . -DCMAKE_BUILD_TYPE=Release
  4. Build the project:

    cmake --build . --config Release
  5. Run the benchmark:

    ./hpde_benchmark --iterations 200000 --work 128 --workers 2 --no_drop

Project Structure

Data-Engine/
|-- src/
|   |-- core.cpp
|   |-- core.h
|   |-- lock_free_queue.h
|   |-- custom_allocator.cpp
|   |-- custom_allocator.h
|   |-- assembly.S
|   |-- assembly.h
|   |-- assembly.asm
|   `-- benchmark.cpp
|-- python/
|   `-- run_benchmark.py
|-- CMakeLists.txt
`-- README.md

Features

  • C++ Core Engine:
    • Lock-free data structures
    • Custom memory allocator with preallocated pool
    • Fixed worker pool with per-worker queues
    • CPU affinity for worker threads (Windows/Linux)
  • x86-64 Assembly:
    • Memory barriers (lfence, sfence, mfence)
    • Atomic primitive (cmpxchg)
    • Timestamp counter (rdtsc)
  • Python Control Layer:
    • Benchmark orchestration script
    • Configuration via CLI flags
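
As an illustration of the "lock-free data structures" and "per-worker queues" items above, here is a minimal sketch of a single-producer/single-consumer ring buffer. It is not the contents of lock_free_queue.h, which this README does not show; the name SpscQueue, its methods, and the 64-byte cache-line alignment are assumptions made for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical SPSC ring buffer (illustration only, not the repository's queue).
// Exactly one thread may call try_push and exactly one thread may call try_pop.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side. Returns false instead of blocking when the queue is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) {
            return false;  // full
        }
        slots_[head & (Capacity - 1)] = value;
        // Release publishes the slot write before the new head becomes visible.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns std::nullopt when the queue is empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) {
            return std::nullopt;  // empty
        }
        T value = slots_[tail & (Capacity - 1)];
        // Release hands the slot back to the producer after the read completes.
        tail_.store(tail + 1, std::memory_order_release);
        return value;
    }

private:
    // Separate cache lines (assumed 64 bytes) keep the producer and consumer
    // indices from false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
    alignas(64) std::array<T, Capacity> slots_{};
};
```

Storage is a fixed-size array, so pushing and popping never allocate, which fits the preallocated-pool approach listed above. Giving each worker its own queue of this kind would mean a single consumer per queue and no compare-and-swap loop on the hot path.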

Performance Constraints

  • Sub-microsecond end-to-end latency
  • Lock-free concurrency
  • Deterministic memory behavior
  • Minimal OS interference

Testing & Validation

  • Use microbenchmarks pinned to isolated cores.
  • Measure p50, p99, and p99.9 latencies.
  • Use hardware counters to track cache misses and false sharing.
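
The percentile figures mentioned above can be computed from recorded latency samples with the nearest-rank method. The sketch below is one possible approach, not necessarily how hpde_benchmark computes its statistics; the function name percentile_per_mille and the per-mille argument (500 for p50, 990 for p99, 999 for p99.9) are assumptions chosen to keep the rank arithmetic in integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only).
// Nearest-rank percentile: returns the smallest sample such that at least
// per_mille / 1000 of all samples are less than or equal to it.
inline std::uint64_t percentile_per_mille(std::vector<std::uint64_t> samples,
                                          std::size_t per_mille) {
    assert(!samples.empty());
    assert(per_mille >= 1 && per_mille <= 1000);
    std::sort(samples.begin(), samples.end());
    // rank = ceil(per_mille * n / 1000), computed with integer arithmetic
    // to avoid floating-point rounding at boundaries such as p99.9.
    const std::size_t n = samples.size();
    const std::size_t rank = (per_mille * n + 999) / 1000;
    return samples[rank - 1];
}
```

For example, with samples holding latencies in nanoseconds, percentile_per_mille(samples, 999) yields the p99.9 value. The vector is taken by value, so the caller's sample order is left untouched.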

Benchmark CLI

hpde_benchmark [--iterations N] [--duration_ms N] [--work N] [--warmup N] [--workers N] [--no_drop] [--csv]

Outputs latency statistics (avg, p50, p99, p99.9, max) plus submitted/processed/dropped counts and ops/sec. Use --csv for header + single-row CSV output.

Deployment Guidelines

  • Disable hyper-threading (if latency sensitive).
  • Disable CPU power-saving states.
  • Lock memory (mlockall).
  • Run with real-time scheduling if permitted.
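
On Linux, the last two guidelines can also be applied from inside the process at startup. The following is a minimal sketch under that assumption, not code taken from the repository; the helper names lock_all_memory and enable_realtime_fifo are hypothetical. Both calls typically need elevated privileges (for example CAP_IPC_LOCK and CAP_SYS_NICE, or suitable rlimits), so the helpers report the errno value instead of aborting.

```cpp
#include <cassert>
#include <cerrno>
#include <sched.h>
#include <sys/mman.h>

// Hypothetical helpers (Linux only, illustration only).

// Locks all current and future pages of the process into RAM so that page
// faults do not add latency later. Returns 0 on success, otherwise errno.
inline int lock_all_memory() {
    return mlockall(MCL_CURRENT | MCL_FUTURE) == 0 ? 0 : errno;
}

// Switches the calling thread to the SCHED_FIFO real-time policy at the given
// priority. Returns 0 on success, otherwise errno (commonly EPERM when the
// process lacks permission, or EINVAL for an out-of-range priority).
inline int enable_realtime_fifo(int priority) {
    sched_param param{};
    param.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &param) == 0 ? 0 : errno;
}
```

A caller might pass sched_get_priority_min(SCHED_FIFO) as a conservative priority, log a non-zero return value, and continue without real-time scheduling, which matches the "if permitted" wording above.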
