Commit 98e6857

update readme
1 parent e5a1ec5 commit 98e6857

2 files changed: 6 additions, 5 deletions

README.md

Lines changed: 6 additions & 2 deletions
@@ -14,15 +14,19 @@ The speculative sampling is proposed by Google and Deepmind independently. So I

 ## Usage
 ### Inference
-In the sample, I use [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) as the target model, [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) as the approximation model.
+You need prepare a pair of models using the same embedding and vocabulary. The approximation model should be smaller than the target model. Here are some
+tested model pairs.

-Tested Model Pairs
+<center>

 | Approx Model | Target Model |
 |--------------|--------------|
 | [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) | [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) |
 | [TinyLlama-1.1B](https://huggingface.co/PY007/TinyLlama-1.1B-step-50K-105b) | llama-7b |

+</center>
+
+In the sample, I use [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) as the target model, [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) as the approximation model.

 ```bash
 python main.py \
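
Note on the README change: the added text requires the approximation and target models to share the same vocabulary. A minimal sketch of how such a pair could be checked is given here; the helper names `vocab_mismatches` and `is_compatible_pair` are hypothetical and not part of this repository, and the check only compares token-to-id mappings passed in as plain dictionaries.

```python
from typing import List, Mapping


def vocab_mismatches(approx_vocab: Mapping[str, int],
                     target_vocab: Mapping[str, int]) -> List[str]:
    """Return the tokens that are missing from one vocabulary
    or mapped to different ids in the two vocabularies."""
    tokens = set(approx_vocab) | set(target_vocab)
    return sorted(t for t in tokens
                  if approx_vocab.get(t) != target_vocab.get(t))


def is_compatible_pair(approx_vocab: Mapping[str, int],
                       target_vocab: Mapping[str, int]) -> bool:
    """True when both models would see identical token ids."""
    return not vocab_mismatches(approx_vocab, target_vocab)
```

With Hugging Face tokenizers, a mapping of this shape should be obtainable through `tokenizer.get_vocab()`; whether the repository performs any such check is not shown in this commit.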

main.py

Lines changed: 0 additions & 3 deletions
@@ -104,7 +104,4 @@ def generate(input_text, approx_model_name, target_model_name, num_tokens=40, ra
 if __name__ == "__main__":
     args = parse_arguments()

-    args.approx_model_name = MODELZOO["llama1b"]
-    args.target_model_name = MODELZOO["llama7b"]
-
     generate(args.input, args.approx_model_name, args.target_model_name, random_seed = args.seed, verbose=args.verbose)
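
Note on the main.py change: removing the hardcoded `MODELZOO` overrides means the model names now come from whatever `parse_arguments()` returns. The body of `parse_arguments` is not part of this diff, so the following is only a sketch of what such a parser might look like; the flag names are inferred from the attributes used in the call to `generate` (`args.input`, `args.approx_model_name`, `args.target_model_name`, `args.seed`, `args.verbose`), and the defaults are assumptions.

```python
import argparse


def parse_arguments(argv=None):
    """Hypothetical parser; the real one in main.py is not shown in this commit."""
    parser = argparse.ArgumentParser(description="speculative sampling demo (illustrative)")
    parser.add_argument("--input", type=str, default="Any text")  # assumed default
    parser.add_argument("--approx_model_name", type=str, required=True)
    parser.add_argument("--target_model_name", type=str, required=True)
    parser.add_argument("--seed", type=int, default=None)
    parser.add_argument("--verbose", action="store_true")
    # argv=None makes argparse read sys.argv[1:], as a script normally would
    return parser.parse_args(argv)
```

Under this sketch, the models chosen on the command line are the ones passed to `generate`, which is the behavior the deleted lines were preventing.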
