features_semantic_cache

category: "🔍 Search & Vector" version: "v1.3.0" status: "✅" date: "22.12.2025"

🎯 Semantic Query Cache

A cache for similar vector queries, matched by semantic similarity.

📋 Table of Contents

📋 Overview
✨ Features
🚀 Quick Start
📖 Detailed Documentation
💡 Best Practices
🔧 Troubleshooting
📚 See Also
📝 Changelog

📋 Overview

Status: ✅ Fully implemented

At a Glance

The Semantic Query Cache is an LRU-based cache for query results that supports both exact string matching and semantic similarity matching. By caching prompt-response pairs, it reduces LLM costs by 40-60%, depending on the workload.

Key Features

1. Multi-Level Lookup Strategy

Query → Exact Match → Semantic Match (KNN) → Cache Miss

Exact Match: Fast O(1) lookup by query string
Semantic Match: KNN search in vector space (configurable threshold)
Fallback: Execute the query on a cache miss
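
The lookup order above can be sketched as follows. This is a minimal illustration rather than ThemisDB's implementation: the `TinyCache` type, its members, the 0.85 threshold, and the linear cosine scan (standing in for the KNN index) are all assumptions made for this example.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified cache used only to illustrate the lookup order.
struct TinyCache {
    struct Entry {
        std::vector<float> embedding;  // assumed to be L2-normalized
        std::string result;
    };

    std::unordered_map<std::string, Entry> entries;  // keyed by query string
    float threshold = 0.85f;                         // assumed similarity threshold

    // For unit-length vectors the dot product equals the cosine similarity.
    static float dot(const std::vector<float>& a, const std::vector<float>& b) {
        float sum = 0.0f;
        for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) sum += a[i] * b[i];
        return sum;
    }

    void put(const std::string& query, std::vector<float> embedding, std::string result) {
        entries[query] = Entry{std::move(embedding), std::move(result)};
    }

    std::optional<std::string> get(const std::string& query,
                                   const std::vector<float>& embedding) const {
        // Level 1: exact match on the query string (average O(1)).
        if (auto it = entries.find(query); it != entries.end()) return it->second.result;

        // Level 2: best entry at or above the threshold. A linear scan stands in
        // for the KNN index here.
        const Entry* best = nullptr;
        float best_score = threshold;
        for (const auto& [key, entry] : entries) {
            const float score = dot(embedding, entry.embedding);
            if (score >= best_score) {
                best_score = score;
                best = &entry;
            }
        }
        if (best != nullptr) return best->result;

        // Level 3: cache miss, so the caller executes the query itself.
        return std::nullopt;
    }
};
```

A caller would try `get()` first and only execute the query, followed by `put()`, when the result is `std::nullopt`. A higher threshold trades hit rate for a lower risk of returning a result for a query that is only superficially similar.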

2. Intelligent Eviction

LRU Eviction: Removes the least recently used entries
TTL Expiration: Automatically removes expired entries
Manual Eviction: evictLRU() for explicit cleanup
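
One common way to combine these three eviction paths is a doubly linked list ordered by recency plus a hash map into that list. The sketch below assumes this layout. `LruTtlIndex`, `touch()`, and `removeExpired()` are hypothetical names; only `evictLRU()` is taken from the text above, and its real signature may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical bookkeeping structure; names and layout are illustrative only.
class LruTtlIndex {
public:
    explicit LruTtlIndex(std::size_t capacity) : capacity_(capacity) {}

    // Insert or refresh a key, then evict least recently used keys while over capacity.
    void touch(const std::string& key, std::int64_t expires_at_ms) {
        if (auto it = pos_.find(key); it != pos_.end()) order_.erase(it->second);
        order_.push_front(Item{key, expires_at_ms});
        pos_[key] = order_.begin();
        while (order_.size() > capacity_) evictLRU();
    }

    // Manual eviction: drop the least recently used key (back of the list).
    bool evictLRU() {
        if (order_.empty()) return false;
        pos_.erase(order_.back().key);
        order_.pop_back();
        return true;
    }

    // TTL expiration: drop every key whose deadline has passed.
    std::size_t removeExpired(std::int64_t now_ms) {
        std::size_t removed = 0;
        for (auto it = order_.begin(); it != order_.end();) {
            if (now_ms > it->expires_at_ms) {
                pos_.erase(it->key);
                it = order_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    bool contains(const std::string& key) const { return pos_.count(key) != 0; }
    std::size_t size() const { return order_.size(); }

private:
    struct Item {
        std::string key;
        std::int64_t expires_at_ms;
    };
    std::size_t capacity_;
    std::list<Item> order_;  // front = most recently used
    std::unordered_map<std::string, std::list<Item>::iterator> pos_;
};
```

With this layout, refreshing a key and evicting the oldest one are both constant-time operations, while the expiry sweep is linear in the number of entries.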

3. Query Embedding

Feature-based embedding with:

Tokenization: Extracts tokens from the query text
Bigrams: Capture the query structure
Keywords: Identifies important terms (WHERE, JOIN, etc.)
Feature Hashing: Maps features onto a 128-dimensional vector
L2 Normalization: Unit-length vectors for cosine similarity
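
The five steps above can be sketched as a small feature-hashing pipeline. The text specifies only the 128 dimensions. The tokenizer rules, the keyword list, the feature weights, and the use of `std::hash` are assumptions chosen for illustration, so the resulting vectors will not match those produced by ThemisDB.

```cpp
#include <array>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

constexpr std::size_t kDim = 128;  // dimension stated in the text
using Embedding = std::array<float, kDim>;

// Split on characters that are neither alphanumeric nor '_' and upper-case each token.
inline std::vector<std::string> tokenize(const std::string& query) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : query) {
        if (std::isalnum(c) || c == '_') {
            current.push_back(static_cast<char>(std::toupper(c)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Feature hashing: each feature adds its weight to one of kDim buckets.
inline void addFeature(Embedding& v, const std::string& feature, float weight) {
    v[std::hash<std::string>{}(feature) % kDim] += weight;
}

inline Embedding embedQuery(const std::string& query) {
    // Assumed keyword list and weights; the real ones are not given in the text.
    static const std::unordered_set<std::string> keywords = {
        "SELECT", "FROM", "WHERE", "JOIN", "FILTER", "SORT", "LIMIT"};

    Embedding v{};  // zero-initialized
    const std::vector<std::string> tokens = tokenize(query);
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        addFeature(v, "tok:" + tokens[i], 1.0f);
        if (keywords.count(tokens[i]) != 0) addFeature(v, "kw:" + tokens[i], 2.0f);
        if (i + 1 < tokens.size()) addFeature(v, "bi:" + tokens[i] + " " + tokens[i + 1], 1.0f);
    }

    // L2 normalization so that the dot product equals the cosine similarity.
    float norm_sq = 0.0f;
    for (float x : v) norm_sq += x * x;
    if (norm_sq > 0.0f) {
        const float inv = 1.0f / std::sqrt(norm_sq);
        for (float& x : v) x *= inv;
    }
    return v;
}

inline float cosine(const Embedding& a, const Embedding& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < kDim; ++i) sum += a[i] * b[i];
    return sum;
}
```

Because every vector has unit length, `cosine()` is a plain dot product. Two queries that differ only in letter case map to the same vector in this sketch, since tokens are upper-cased before hashing.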

4. Thread-Safe Operations

Concurrent Reads: Multiple threads can call get() at the same time
Concurrent Writes: Thread-safe put() protected by a mutex
Deadlock-Free: Careful lock ordering prevents deadlocks
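
A reader/writer lock is one way to get concurrent reads together with exclusive writes. The sketch below assumes a single `std::shared_mutex`, which avoids lock-ordering problems because only one lock is ever held. The text mentions only "a mutex", so the actual locking scheme may differ, and `ConcurrentMap` is a hypothetical name.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper; ThemisDB's actual locking scheme may differ.
class ConcurrentMap {
public:
    // Readers take a shared lock, so several get() calls can run in parallel.
    std::optional<std::string> get(const std::string& key) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) {
            misses_.fetch_add(1, std::memory_order_relaxed);
            return std::nullopt;
        }
        hits_.fetch_add(1, std::memory_order_relaxed);
        return it->second;
    }

    // Writers take an exclusive lock for the duration of the update.
    void put(const std::string& key, std::string value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        data_[key] = std::move(value);
    }

    std::uint64_t hits() const { return hits_.load(std::memory_order_relaxed); }
    std::uint64_t misses() const { return misses_.load(std::memory_order_relaxed); }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::string, std::string> data_;
    // Counters are atomic so that readers can update them under a shared lock.
    mutable std::atomic<std::uint64_t> hits_{0};
    mutable std::atomic<std::uint64_t> misses_{0};
};
```

If `get()` also had to update LRU recency, it would modify shared state and would need either the exclusive lock or a separate, consistently ordered second lock. That is where a fixed lock order becomes important.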

Implementation

Files

Header: include/cache/semantic_cache.h
Implementation: src/cache/semantic_cache.cpp
HTTP Handler: src/server/http_server.cpp

Architecture

class SemanticCache {
    // Key: SHA256(prompt + JSON.stringify(params))
    // Value: {response, metadata, timestamp_ms, ttl_seconds}
    
    bool put(prompt, params, response, metadata, ttl_seconds);
    std::optional<CacheEntry> query(prompt, params);
    Stats getStats();
    uint64_t clearExpired();
    bool clear();
};

Storage

RocksDB Column Family: Default CF
Key Format: SHA256 hash (32 bytes, stored as a 64-character hex string)
Value Format: JSON {response, metadata, timestamp_ms, ttl_seconds}

TTL Mechanics

Storage: timestamp_ms (creation time) + ttl_seconds
Lookup: isExpired() checks current_time_ms > timestamp_ms + ttl_seconds * 1000
Cleanup: clearExpired() removes expired entries via a WriteBatch
No-Expiry: ttl_seconds = -1 → the entry never expires
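
The expiry check described above comes down to a few lines. In this sketch the field names follow the value format, while the `EntryTimes` struct, the `nowMs()` helper, and the free-function form of `isExpired()` are assumptions for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Field names follow the stored value format; the struct itself is illustrative.
struct EntryTimes {
    std::int64_t timestamp_ms;  // creation time in milliseconds since the epoch
    std::int64_t ttl_seconds;   // -1 means "never expires"
};

// Current wall-clock time in milliseconds since the epoch.
inline std::int64_t nowMs() {
    using namespace std::chrono;
    return duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
}

inline bool isExpired(const EntryTimes& e, std::int64_t current_ms) {
    if (e.ttl_seconds == -1) return false;  // no-expiry sentinel
    // Convert the TTL to milliseconds before comparing with the timestamp.
    return current_ms > e.timestamp_ms + e.ttl_seconds * 1000;
}
```

Passing the current time as a parameter keeps the check deterministic in tests. Production code would call `isExpired(entry, nowMs())`.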

HTTP API

POST /cache/put

Request:

{
  "prompt": "What is the capital of France?",
  "parameters": {"model": "gpt-4", "temperature": 0.7},
  "response": "The capital of France is Paris.",
  "metadata": {"tokens": 15, "cost_usd": 0.001},
  "ttl_seconds": 3600
}

Response:

{
  "success": true,
  "message": "Response cached successfully"
}

POST /cache/query

Request:

{
  "prompt": "What is the capital of France?",
  "parameters": {"model": "gpt-4", "temperature": 0.7}
}

Response (Hit):

{
  "found": true,
  "response": "The capital of France is Paris.",
  "metadata": {"tokens": 15, "cost_usd": 0.001}
}

GET /cache/stats

Response:

{
  "hit_count": 42,
  "miss_count": 8,
  "hit_rate": 0.84,
  "avg_latency_ms": 1.2,
  "total_entries": 100,
  "total_size_bytes": 524288
}
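
The `hit_rate` in this example is consistent with hits divided by total lookups: 42 / (42 + 8) = 0.84. A minimal sketch of that calculation follows, assuming this is how the field is derived. `CacheCounters` and `hitRate()` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical counter pair mirroring the hit_count and miss_count fields.
struct CacheCounters {
    std::uint64_t hit_count = 0;
    std::uint64_t miss_count = 0;
};

// Fraction of lookups served from the cache; 0.0 when nothing was looked up yet.
inline double hitRate(const CacheCounters& c) {
    const std::uint64_t total = c.hit_count + c.miss_count;
    if (total == 0) return 0.0;
    return static_cast<double>(c.hit_count) / static_cast<double>(total);
}
```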

Performance

Benchmarks (Release Mode)

Operation	Time	Notes
put()	~3ms	Insert + compute embedding
get() exact	~1ms	Fast RocksDB lookup
get() similarity	~5ms	KNN search (HNSW)
remove()	~2ms	Delete + update LRU
evictLRU()	~20ms	For 100 entries

Performance Targets

Metric	Target	Status
Cache hit rate	>40%	✅ 81.82% achieved
Lookup latency	<5ms	✅ 0.058ms measured
TTL accuracy	±1s	✅ Millisecond precision
Cost reduction	40-60%	✅ Workload-dependent

Test Coverage

14/14 tests passed:

✅ PutAndGetExactMatch
✅ CacheMiss
✅ SimilarityMatch
✅ DissimilarQueryMiss
✅ LRUEviction
✅ TTLExpiration
✅ ManualEviction
✅ RemoveEntry
✅ ClearCache
✅ HitRateCalculation
✅ ConfigUpdate
✅ EmptyInputRejection
✅ HitCountTracking
✅ ConcurrentAccess

Summary

The Semantic Cache is production-ready and offers:

✅ Exact prompt+parameter matching via SHA256
✅ Flexible TTL control (per entry)
✅ Comprehensive metrics (hit rate, latency, size)
✅ HTTP API for CRUD operations
✅ Thread-safe implementation
✅ Graceful expiry handling

Code: 700+ lines (header + implementation + tests)

ThemisDB v1.3.4 | GitHub | Documentation | Discussions | License

Last synced: January 02, 2026 | Commit: 6add659


Full documentation: https://makr-code.github.io/ThemisDB/
