update readme

feifeibear · feifeibear · commit 98e6857be334 · 2023-09-21T06:38:52.000Z
diff --git a/README.md b/README.md
@@ -14,15 +14,19 @@ The speculative sampling is proposed by Google and Deepmind independently. So I
 
 ## Usage
 ### Inference
-In the sample, I use [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) as the target model, [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) as the approximation model.
+You need to prepare a pair of models that use the same embedding and vocabulary. The approximation model should be smaller than the target model. Here are some
+tested model pairs.
 
-Tested Model Pairs
+<center>
 
 | Approx Model | Target Model |
 |--------------|--------------|
 | [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) | [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) |
 | [TinyLlama-1.1B](https://huggingface.co/PY007/TinyLlama-1.1B-step-50K-105b) | llama-7b |
 
+</center>
+
+In the sample, I use [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1/tree/main) as the target model and [bloom-560m](https://huggingface.co/bigscience/bloom-560m/tree/main) as the approximation model.
 
 ```bash
 python main.py \
diff --git a/main.py b/main.py
@@ -104,7 +104,4 @@ def generate(input_text, approx_model_name, target_model_name, num_tokens=40, ra
 if __name__ == "__main__":
     args = parse_arguments()
     
-    args.approx_model_name = MODELZOO["llama1b"]
-    args.target_model_name = MODELZOO["llama7b"]
-    
     generate(args.input, args.approx_model_name, args.target_model_name, random_seed = args.seed, verbose=args.verbose)
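
Note on the README requirement that both models share a vocabulary: one way to sanity-check a candidate pair before running `main.py` is to compare the two token-to-id mappings. The sketch below is not part of this commit; `vocab_mismatches` and `is_compatible_pair` are hypothetical helper names, and the toy dictionaries stand in for real vocabularies. Assuming the models are loaded through Hugging Face `transformers`, the mappings could be obtained with `tokenizer.get_vocab()`. The check covers only the vocabulary, not the embedding.

```python
def vocab_mismatches(approx_vocab, target_vocab):
    """Return {token: (approx_id, target_id)} for every token whose id differs
    between the two vocabularies or that appears in only one of them."""
    mismatches = {}
    for token in approx_vocab.keys() | target_vocab.keys():
        approx_id = approx_vocab.get(token)   # None if the token is missing
        target_id = target_vocab.get(token)
        if approx_id != target_id:
            mismatches[token] = (approx_id, target_id)
    return mismatches


def is_compatible_pair(approx_vocab, target_vocab):
    """True when both vocabularies map exactly the same tokens to the same ids."""
    return not vocab_mismatches(approx_vocab, target_vocab)


if __name__ == "__main__":
    # Toy vocabularies (hypothetical); real ones would come from the tokenizers.
    approx = {"hello": 0, "world": 1}
    target = {"hello": 0, "world": 2, "!": 3}
    print(is_compatible_pair(approx, target))
    print(vocab_mismatches(approx, target))
```

Returning the mismatching tokens, not just a boolean, makes it easier to see whether two tokenizers differ only in a few special tokens or are unrelated.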