A production-ready Python server for the Cortex memory system with FastAPI, OpenAPI documentation, authentication, and gRPC support.

Features:
- FastAPI Server: High-performance async HTTP API
- OpenAPI Documentation: Interactive API docs at /docs
- Authentication: API key and JWT token authentication
- gRPC Support: High-performance RPC interface
- Smart Collections: Automatic memory organization
- Temporal Search: Time-aware memory retrieval
- Semantic Search: AI-powered memory search
- Production Ready: Docker, rate limiting, monitoring
- Metrics: Prometheus metrics endpoint
- Multi-User Support: Isolated memory spaces

Prerequisites:
- Python 3.11+
- Docker and Docker Compose
- OpenAI API key

Quick start (Docker Compose):
- Clone the repository:

      git clone https://github.com/yourusername/cortex-server.git
      cd cortex-server

- Copy the environment configuration:

      cp server/.env.example server/.env

- Edit server/.env and add your OpenAI API key, API keys, and JWT secret:

      OPENAI_API_KEY=your-openai-api-key-here
      API_KEYS=your-api-key-1,your-api-key-2
      SECRET_KEY=your-secret-key-for-jwt

- Start all services:

      docker-compose up -d

  This starts:
  - Cortex API Server (port 8080)
  - gRPC Server (port 50051)
  - ChromaDB (port 8003)
  - Redis (port 6379)
  - PostgreSQL (port 5432)
  - Nginx proxy (port 80)

- Check health:

      curl http://localhost:8080/health

- View API documentation: open http://localhost:8080/docs in your browser.
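
Containers can take a few seconds to accept connections after `docker-compose up -d`. The following is a minimal sketch of a readiness poll against the `/health` endpoint. It assumes the endpoint answers with HTTP 200 once the server is up; the response body is not documented here, so only the status code is checked. The function name `wait_for_health` and its parameters are hypothetical and not part of the server.

```python
import time
import urllib.request


def wait_for_health(url="http://localhost:8080/health", attempts=30, delay=2.0,
                    fetch=None, sleep=time.sleep):
    """Poll `url` until it answers with HTTP 200. Return True on success."""

    def default_fetch(target):
        # urlopen raises URLError/HTTPError (both OSError subclasses) on failure.
        with urllib.request.urlopen(target, timeout=5) as response:
            return response.status

    fetch = fetch or default_fetch
    for _ in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            pass  # Server is not accepting connections yet; retry.
        sleep(delay)
    return False


# Example usage (requires the stack to be running):
#     if not wait_for_health():
#         raise SystemExit("Cortex server did not become healthy in time")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.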

Local development:
- Install dependencies:

      cd server
      pip install -r requirements.txt

- Start ChromaDB:

      docker run -p 8003:8000 chromadb/chroma:latest

- Start the server:

      python run_server.py

All API endpoints require authentication. Use one of:
- API Key (Direct):

      curl -H "Authorization: Bearer your-api-key" http://localhost:8080/api/v1/memory

- JWT Token:

      # Generate token
      curl -X POST http://localhost:8080/auth/token \
        -H "Content-Type: application/json" \
        -d '{"api_key": "your-api-key", "expires_in": 1440}'

      # Use token
      curl -H "Authorization: Bearer jwt-token" http://localhost:8080/api/v1/memory

Store a memory:

    curl -X POST http://localhost:8080/api/v1/memory \
      -H "Authorization: Bearer your-api-key" \
      -H "Content-Type: application/json" \
      -d '{
        "content": "User prefers TypeScript over JavaScript",
        "context": "programming preferences",
        "tags": ["typescript", "preferences"],
        "user_id": "user_123"
      }'

Search memories:

    curl -X POST http://localhost:8080/api/v1/memory/search \
      -H "Authorization: Bearer your-api-key" \
      -H "Content-Type: application/json" \
      -d '{
        "query": "programming preferences",
        "limit": 5,
        "memory_source": "all",
        "temporal_weight": 0.3,
        "user_id": "user_123"
      }'

Temporal search:

    curl -X POST http://localhost:8080/api/v1/memory/search \
      -H "Authorization: Bearer your-api-key" \
      -H "Content-Type: application/json" \
      -d '{
        "query": "recent discussions",
        "date_range": "last week",
        "user_id": "user_123"
      }'

gRPC client (Python):

    import grpc
    from app.generated import cortex_pb2, cortex_pb2_grpc

    # Connect to server
    channel = grpc.insecure_channel('localhost:50051')
    stub = cortex_pb2_grpc.MemoryServiceStub(channel)

    # Store memory
    request = cortex_pb2.StoreMemoryRequest(
        content="Test memory",
        context="testing",
        tags=["test"],
        user_id="user_123"
    )
    response = stub.StoreMemory(request)
    print(f"Stored memory with ID: {response.id}")

    # Search memories
    search_request = cortex_pb2.SearchMemoryRequest(
        query="test",
        limit=5,
        user_id="user_123"
    )
    search_response = stub.SearchMemories(search_request)
    for memory in search_response.memories:
        print(f"Found: {memory.content} (score: {memory.score})")

API endpoints:

Authentication:
- POST /auth/token - Generate JWT token from API key

Memory operations:
- POST /api/v1/memory - Store new memory
- POST /api/v1/memory/search - Search memories
- GET /api/v1/memory/{id} - Get memory by ID
- PUT /api/v1/memory - Update memory
- DELETE /api/v1/memory - Delete memory
- POST /api/v1/memory/clear - Clear memories

System:
- GET /health - Health check
- GET /api/v1/stats - System statistics
- GET /metrics - Prometheus metrics
- GET /docs - Interactive API documentation
- GET /openapi.json - OpenAPI schema
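
The HTTP endpoints can also be called from Python. The following is a rough sketch using only the standard library. It mirrors the curl request bodies shown in this README; the helper names (`build_request`, `token_request`, `store_request`, `search_request`, `send`) are hypothetical, and the response format is assumed to be JSON because the response schemas are not listed here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # Assumed local deployment


def build_request(method, path, payload=None, token=None):
    """Build a JSON request (optionally with a Bearer token) without sending it."""
    headers = {}
    data = None
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def token_request(api_key, expires_in=1440):
    """POST /auth/token: exchange an API key for a JWT."""
    return build_request("POST", "/auth/token",
                         {"api_key": api_key, "expires_in": expires_in})


def store_request(token, content, user_id, context="", tags=()):
    """POST /api/v1/memory: store a new memory."""
    payload = {"content": content, "context": context,
               "tags": list(tags), "user_id": user_id}
    return build_request("POST", "/api/v1/memory", payload, token)


def search_request(token, query, user_id, limit=5):
    """POST /api/v1/memory/search: search memories for one user."""
    payload = {"query": query, "limit": limit, "user_id": user_id}
    return build_request("POST", "/api/v1/memory/search", payload, token)


def send(request):
    """Send a request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a running server and a valid API key):
#     result = send(search_request("your-api-key", "programming preferences", "user_123"))
```

Separating request construction from `send` keeps the payload logic easy to inspect and test without network access.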

Configuration:

| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key for embeddings | Required |
| `API_KEYS` | Comma-separated API keys | Required |
| `SECRET_KEY` | JWT signing secret | Required |
| `HOST` | Server host | 0.0.0.0 |
| `PORT` | Server port | 8080 |
| `WORKERS` | Number of workers | 4 |
| `REDIS_URL` | Redis connection URL | redis://localhost:6379/0 |
| `CHROMA_URI` | ChromaDB URL | http://localhost:8003 |
| `RATE_LIMIT_PER_MINUTE` | API rate limit | 100 |
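
To illustrate how these variables and defaults fit together, here is a minimal sketch of a settings loader. It is not the server's actual configuration code, which is not shown in this README; `Settings` and `load_settings` are hypothetical names, and the defaults are copied from the table above.

```python
import os
from dataclasses import dataclass

REQUIRED = ("OPENAI_API_KEY", "API_KEYS", "SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    api_keys: tuple
    secret_key: str
    host: str
    port: int
    workers: int
    redis_url: str
    chroma_uri: str
    rate_limit_per_minute: int


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), applying table defaults."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        # API_KEYS is comma-separated; strip whitespace and drop empty entries.
        api_keys=tuple(k.strip() for k in env["API_KEYS"].split(",") if k.strip()),
        secret_key=env["SECRET_KEY"],
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8080")),
        workers=int(env.get("WORKERS", "4")),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        chroma_uri=env.get("CHROMA_URI", "http://localhost:8003"),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "100")),
    )
```

Failing fast on missing required variables surfaces a misconfigured `.env` file at startup rather than on the first request.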

For production deployments:
- SSL/TLS: Configure Nginx with SSL certificates
- Secrets: Supply secrets through environment variables or a secrets manager
- Monitoring: Enable Prometheus metrics and set up Grafana
- Scaling: Adjust the worker count based on load
- Backup: Back up ChromaDB and PostgreSQL regularly

Run tests:

    cd server
    pytest tests/test_api.py -v

Architecture:

    ┌─────────────────┐     ┌─────────────────┐
    │   HTTP Client   │     │   gRPC Client   │
    └────────┬────────┘     └────────┬────────┘
             │                       │
        ┌────▼───────────────────────▼────┐
        │      Nginx (Reverse Proxy)      │
        └────┬───────────────────────┬────┘
             │                       │
        ┌────▼────┐             ┌────▼────┐
        │ FastAPI │             │  gRPC   │
        │ (HTTP)  │             │ Server  │
        └────┬────┘             └────┬────┘
             │                       │
        ┌────▼───────────────────────▼────┐
        │      Cortex Service Layer       │
        └────────────────┬────────────────┘
                         │
        ┌────────────────▼────────────────┐
        │      Cortex Memory System       │
        │  ┌──────────┐    ┌──────────┐   │
        │  │   STM    │    │   LTM    │   │
        │  └──────────┘    └─────┬────┘   │
        │                        │        │
        │  ┌─────────────────────▼─────┐  │
        │  │    ChromaDB (Vectors)     │  │
        │  └───────────────────────────┘  │
        └──────┬───────────────────┬──────┘
               │                   │
        ┌──────▼─────┐       ┌─────▼──────┐
        │   Redis    │       │ PostgreSQL │
        │  (Cache)   │       │ (API Keys) │
        └────────────┘       └────────────┘

MIT License - See LICENSE file for details

To contribute:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request

For issues and questions, please open an issue on GitHub.