Add Qwen3 Reranker model #3958
Conversation
Pull request overview
This PR adds support for three Qwen3 Reranker models (0.6B, 4B, and 8B variants) to the MTEB framework. These are Qwen3-based rerankers that can be used for relevance scoring tasks.
Changes:
- Added a `Qwen3RerankerWrapper` class to load and run Qwen3 reranker models using causal language modeling with yes/no token probability scoring (see the sketch below)
- Added three `ModelMeta` configurations for Qwen3-Reranker-0.6B, Qwen3-Reranker-4B, and Qwen3-Reranker-8B
- Imported `ScoringFunction` from the model_meta module to support metadata configuration
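To make the scoring setup concrete, here is a minimal sketch of yes/no token probability scoring with a causal LM. The model name follows the PR; the function itself is illustrative, not the wrapper's exact code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch of the PR's scoring approach, not its exact code.
model_name = "Qwen/Qwen3-Reranker-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

token_yes = tokenizer.convert_tokens_to_ids("yes")
token_no = tokenizer.convert_tokens_to_ids("no")

def relevance_scores(prompts: list[str]) -> torch.Tensor:
    """Return P("yes") for each fully formatted query/document prompt."""
    batch = tokenizer(prompts, padding=True, return_tensors="pt")
    with torch.no_grad():
        # Logits for the next token after each prompt (left padding keeps
        # the last position aligned with the end of the real text).
        logits = model(**batch).logits[:, -1, :]
    # Relevance = softmax over just the "yes"/"no" logits.
    yes_no = logits[:, [token_yes, token_no]]
    return torch.softmax(yes_no, dim=-1)[:, 0]
```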
```python
similarity_fn_name=ScoringFunction.COSINE,
use_instructions=True,
training_datasets=qwen3_reranker_training_data,
adapted_from=None,
```

```suggestion
adapted_from="Qwen/Qwen3-4B",
```
```python
torch_dtype=torch.float32,
attn_implementation: str | None = None,
batch_size: int = 32,
max_length: int = 8192,
```

This shouldn't be passed to model initialization:

```diff
- torch_dtype=torch.float32,
- attn_implementation: str | None = None,
- batch_size: int = 32,
- max_length: int = 8192,
```
We can keep `attn_implementation` in init, right? Will remove the rest.
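A sketch of that split, keeping `attn_implementation` as a load-time argument and making `batch_size`/`max_length` call-time parameters. The class and method names are illustrative assumptions, not the PR's exact code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

class Qwen3RerankerSketch:
    """Illustrative sketch, not the PR's wrapper."""

    def __init__(self, model_name: str, attn_implementation: str | None = None):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
        self.model = AutoModelForCausalLM.from_pretrained(
            model_name,
            attn_implementation=attn_implementation,  # e.g. "flash_attention_2"
        ).eval()

    def predict(
        self, prompts: list[str], batch_size: int = 32, max_length: int = 8192
    ) -> torch.Tensor:
        """Score prompts in batches; batching/truncation are runtime choices."""
        all_logits = []
        for i in range(0, len(prompts), batch_size):
            batch = self.tokenizer(
                prompts[i : i + batch_size],
                truncation=True,
                max_length=max_length,
                padding=True,
                return_tensors="pt",
            )
            with torch.no_grad():
                all_logits.append(self.model(**batch).logits[:, -1, :])
        return torch.cat(all_logits)
```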
```python
self.token_false_id = self.tokenizer.convert_tokens_to_ids("no")
self.token_true_id = self.tokenizer.convert_tokens_to_ids("yes")

self.prefix = '<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
```
I don't think that instruction should be hardcoded
This is given on their HF page.
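For context, this is how that prefix typically enters the per-pair prompt. The suffix and default instruction below are my reading of the Qwen3-Reranker HF model card, so treat them as assumptions rather than the PR's exact strings.

```python
# Prefix as in the PR; suffix and default instruction assumed from the model card.
PREFIX = (
    '<|im_start|>system\nJudge whether the Document meets the requirements '
    'based on the Query and the Instruct provided. Note that the answer can '
    'only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
)
SUFFIX = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)

def format_pair(query: str, document: str, instruction: str | None = None) -> str:
    """Build one scoring prompt for a query/document pair."""
    instruction = instruction or DEFAULT_INSTRUCTION
    return (
        f"{PREFIX}<Instruct>: {instruction}\n"
        f"<Query>: {query}\n<Document>: {document}{SUFFIX}"
    )
```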
```python
queries = [text for batch in inputs1 for text in batch["query"]]
instructions = None
if "instruction" in inputs2.dataset.features:
    instructions = [text for batch in inputs1 for text in batch["instruction"]]
```
Can you get the task-specific prompt? The instruction from the batch will only be present for instruction retrieval/reranking tasks.
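One hedged way to implement that fallback, assuming the wrapper can see the task metadata and that `task_metadata.prompt` (an assumption, not a confirmed mteb attribute) carries a task-level instruction:

```python
def resolve_instructions(inputs1, inputs2, queries, task_metadata):
    """Prefer per-example instructions; otherwise fall back to a task prompt."""
    if "instruction" in inputs2.dataset.features:
        return [text for batch in inputs1 for text in batch["instruction"]]
    prompt = getattr(task_metadata, "prompt", None)  # assumed attribute
    if isinstance(prompt, str) and prompt:
        return [prompt] * len(queries)
    return None
```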
By the way, you can get the implementation from their repo: https://github.com/QwenLM/Qwen3-Embedding/blob/main/evaluation/qwen3_reranker_model.py

It's almost the same, but it uses vLLM. Should I use it?
I don't think you need to change to vLLM. I think it's better to use Transformers or Sentence Transformers, but I think this model is not compatible with Sentence Transformers.

Their script on GitHub is using vLLM, and yes, it's not compatible with Sentence Transformers, so I think we can keep the current Transformers implementation.
@Samoed I was able to run this code perfectly. Just one doubt: for evaluation, they have given these in their model card:

So, how can I do evaluation on the retrieval subset only?
We have an example in the docs: https://embeddings-benchmark.github.io/mteb/usage/selecting_tasks/#filtering-benchmark-tasks
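For concreteness, the filtering pattern from that page applied to this case; the benchmark name is illustrative, and any benchmark with retrieval tasks works the same way.

```python
import mteb

# Keep only the retrieval-type tasks of a benchmark.
benchmark = mteb.get_benchmark("MTEB(eng, v2)")
retrieval_tasks = [t for t in benchmark.tasks if t.metadata.type == "Retrieval"]
```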
Getting these when trying to run the above code; is it again because we allow only retrieval and not reranking?
Yes, that's right. I think you can try to evaluate with their script on some reranking tasks and after that check your implementation.

Their script uses retrieval results in reranking; see the "Evaluate reranking models" section in their README.md.
I think you can still run reranking tasks.
I am not able to run their code; I'm getting an error, I think because of a conflict in dependencies.

@Samoed Could you try running it if possible? I tried it again, but was not able to run it fully.
closes #3718
Added 3 models:
- Qwen/Qwen3-Reranker-0.6B
- Qwen/Qwen3-Reranker-4B
- Qwen/Qwen3-Reranker-8B

Tested loading with `mteb.get_model(model_name, revision)` and `mteb.get_model_meta(model_name, revision)`.
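As a quick smoke test that the new entries resolve; the model names come from this PR, and pinning `revision` is optional here.

```python
import mteb

model_name = "Qwen/Qwen3-Reranker-0.6B"
meta = mteb.get_model_meta(model_name)  # metadata entry added in this PR
model = mteb.get_model(model_name)      # downloads weights and builds the wrapper
print(meta.name)
```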