Description
Dear LM Studio Team,
Thank you for building such a powerful and user-friendly platform for running local LLMs. LM Studio has become an essential tool for many developers working with private, offline language models.
I’d like to propose a new feature that would significantly enhance LM Studio’s capabilities in Retrieval-Augmented Generation (RAG) pipelines: a built-in document reranking endpoint.
Currently, LM Studio supports document retrieval and embedding generation, but there is no native way to rerank retrieved documents by relevance to a given query before they are passed to the LLM. Reranking is a critical step in high-quality RAG systems: it improves precision by reordering candidate documents with a more sophisticated cross-encoder or specialized reranker model (e.g., BAAI/bge-reranker or Cohere's rerankers).
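For context, this is roughly what reranking looks like today when done client-side. A minimal sketch, assuming the sentence-transformers library and the BAAI/bge-reranker-base model purely as an illustration (any cross-encoder reranker works the same way):

```python
from sentence_transformers import CrossEncoder

# Illustrative only: score each (query, document) pair with a cross-encoder
# and sort the candidates by relevance. This is the work a native /rerank
# endpoint could do server-side against whatever reranker model is loaded.
query = "How do I run a local RAG pipeline?"
documents = [
    "LM Studio exposes an OpenAI-compatible local server.",
    "Reranking reorders retrieved passages by relevance to the query.",
    "Embeddings map text to dense vectors for similarity search.",
]

reranker = CrossEncoder("BAAI/bge-reranker-base")
scores = reranker.predict([(query, doc) for doc in documents])
ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)

for doc, score in ranked:
    print(f"{score:.4f}  {doc}")
```

Today, users have to maintain a separate Python environment (or an external service) just for this step; a built-in endpoint would fold it into the same local server that already handles chat and embeddings.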
Having a dedicated /rerank or /rank HTTP endpoint—similar to the existing /embed endpoint—would allow users to:
Send a query and a list of candidate documents (or passages),
Receive a ranked list of documents with relevance scores,
Seamlessly integrate reranking into local RAG workflows without relying on external services (a rough request/response sketch follows below).
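To make the shape of the request concrete, here is a purely hypothetical sketch of what calling such an endpoint could look like against the local server (default localhost:1234). The path /v1/rerank, the field names (query, documents, top_n), and the response layout are assumptions borrowed from common hosted rerank APIs, not an existing LM Studio API; the exact schema is of course up to the team:

```python
import requests

# Hypothetical request/response shape for the proposed endpoint.
# None of this exists in LM Studio today; the path and field names are
# illustrative and follow common rerank API conventions.
response = requests.post(
    "http://localhost:1234/v1/rerank",
    json={
        "model": "bge-reranker-base",
        "query": "How do I enable GPU offload?",
        "documents": [
            "GPU offload moves model layers onto the GPU.",
            "The embeddings endpoint returns dense vectors.",
            "You can set the number of offloaded layers per model.",
        ],
        "top_n": 2,
    },
    timeout=30,
)

for result in response.json()["results"]:
    print(result["index"], result["relevance_score"])
```

Mirroring the conventions already used by hosted rerankers would also make it trivial to swap a cloud reranker for a local one in existing RAG code.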
This feature would greatly improve the out-of-the-box RAG performance for LM Studio users and align the platform more closely with production-grade local AI stacks.
Thank you for considering this request. I believe it would be a valuable addition to LM Studio’s growing set of local inference capabilities.
Best regards,