feat: Add warmup functionality to reduce search latency #195
- Add enable_warmup parameter to HNSW and DiskANN embedding servers
- Implement warmup() method on LeannSearcher for manual pre-warming
- Auto-warmup option during LeannSearcher initialization (enable_warmup=True)
- Pre-load embedding model at server startup to avoid cold-start latency
- Add comprehensive tests for warmup functionality

Fixes yichuan-w#177 (search recompute latency)
Fixes yichuan-w#159 (warmup strategy)
|
Thanks, this has been a known issue for a long time, and we will look into it! cc @andylizf. Also, can you fix the lint error here?
- Remove unused imports (tempfile, Path, MagicMock)
- Fix import order (stdlib before third-party)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
The macOS-13 CI jobs show as 'cancelled' rather than 'failed' - this appears to be a GitHub Actions runner issue, not a code problem. All other builds (macos-14, macos-15, ubuntu) passed successfully. Could you please re-run the cancelled macOS-13 jobs? |
|
Sure, I will do that later. Sorry for the late response; I was on vacation. And thanks again for your contribution!
|
No worries, thanks for re-running the CI! Let me know if there's anything else that needs to be addressed. |
|
Thanks for implementing this warmup feature! The functionality looks good and solves a real latency problem. A few suggestions for potential future improvements (not blocking for this PR):

1. Extract common warmup logic

The warmup code in [...]

```python
# leann/warmup.py
def warmup_embedding_model(model_name: str, embedding_mode: str, provider_options=None) -> float:
    """Pre-load embedding model by computing a dummy embedding."""
    ...
```

2. Clarify [...]
|
|
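As a rough illustration of the shared-helper idea raised in the review, here is a minimal sketch. It is not the project's code: `warmup_embedding_fn`, `EmbedFn`, and `LazyToyModel` are hypothetical names introduced only for this example, and the "model" is a toy that merely records when it was first used. The one idea taken from the thread is that warmup means computing a single dummy embedding and reporting how long it took.

```python
import time
from typing import Callable, List, Sequence

# Hypothetical type alias: any callable that maps a batch of texts to vectors.
EmbedFn = Callable[[Sequence[str]], List[List[float]]]


def warmup_embedding_fn(embed: EmbedFn, dummy_text: str = "warmup") -> float:
    """Run one throwaway embedding so lazy loading happens now.

    Returns the elapsed time in seconds.
    """
    start = time.perf_counter()
    vectors = embed([dummy_text])
    if len(vectors) != 1:
        raise RuntimeError("warmup embedding returned an unexpected batch size")
    return time.perf_counter() - start


class LazyToyModel:
    """Stand-in for a real embedding model that loads weights on first use."""

    def __init__(self) -> None:
        self.loaded = False
        self.load_count = 0

    def __call__(self, texts: Sequence[str]) -> List[List[float]]:
        if not self.loaded:
            # A real model would load weights here; the toy only counts loads.
            self.load_count += 1
            self.loaded = True
        # Toy embedding: a one-dimensional vector holding the text length.
        return [[float(len(t))] for t in texts]
```

With a helper of this shape, each embedding server could call the same function at startup instead of carrying its own copy of the warmup loop, assuming both servers expose a comparable embedding callable.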
What do you think? @yichuan-w |
|
Yeah, I think it is good to merge if @majiayu000 can resolve the conflicts and get the PR checks passing. By the way @andylizf, what's the relationship between this and #176
, yichuan-w#196, yichuan-w#199

- Fix unused variable `zmq_port` in warmup() method (F841)
- Remove unused imports in test_cli_list_performance.py (F401)
- Sort imports in test files (I001)
- Fix test_warmup.py: correct attribute name `_warmup_enabled` → `_warmup`
- Add skipif for DiskANN test when backend is not installed
- Apply ruff format to all modified files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary

- Add `enable_warmup` parameter to HNSW and DiskANN embedding servers to pre-load model at startup
- Implement `warmup()` method on `LeannSearcher` for manual pre-warming before first search
- Auto-warmup option during `LeannSearcher` initialization (`enable_warmup=True`)

Test Plan

- `tests/test_warmup.py`

Fixes #177
Fixes #159
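
The two entry points listed in the summary (warming up at construction time versus calling `warmup()` by hand) can be sketched with a toy class. This is an assumption-laden illustration, not the real `LeannSearcher`: `ToySearcher`, `toy_embed`, the `documents` argument, and the `is_warm` attribute are all hypothetical. Only the `enable_warmup` flag and the `warmup()` method name come from the PR description.

```python
import time
from typing import Callable, List, Sequence

EmbedFn = Callable[[Sequence[str]], List[List[float]]]


def toy_embed(texts: Sequence[str]) -> List[List[float]]:
    """Toy embedding (hypothetical): a 1-D vector holding the text length."""
    return [[float(len(t))] for t in texts]


class ToySearcher:
    """Hypothetical stand-in that mimics the warmup options described above."""

    def __init__(self, embed: EmbedFn, documents: Sequence[str],
                 enable_warmup: bool = False) -> None:
        self._embed = embed
        self._documents = list(documents)
        self.is_warm = False
        if enable_warmup:
            # Auto-warmup: pay the cold-start cost during construction.
            self.warmup()

    def warmup(self) -> float:
        """Manual pre-warming: run one dummy embedding, return seconds taken."""
        start = time.perf_counter()
        self._embed(["warmup"])
        self.is_warm = True
        return time.perf_counter() - start

    def search(self, query: str, top_k: int = 1) -> List[str]:
        """Return the top_k documents closest to the query in embedding space."""
        query_vec = self._embed([query])[0]
        doc_vecs = self._embed(self._documents)
        self.is_warm = True  # the first search also warms the embedding path

        def squared_distance(vec: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(query_vec, vec))

        ranked = sorted(zip(self._documents, doc_vecs),
                        key=lambda pair: squared_distance(pair[1]))
        return [doc for doc, _ in ranked[:top_k]]
```

In this sketch, `ToySearcher(toy_embed, docs, enable_warmup=True)` is warm as soon as it is constructed, while the default leaves the cost to either an explicit `warmup()` call or the first `search()`.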