change doc files, add qwen3 moe doc and pd_disaggregation doc #15
base: main
doc/PD_Disaggregation.md
Outdated
| -v /usr/local/sbin:/usr/local/sbin \
| -v /etc/hccn.conf:/etc/hccn.conf \
| -v /home/ckpt:/home/ckpt \
| sgl_mindspore:v0.10 \
This sgl_mindspore Docker image does not appear to be declared anywhere. Better not to mention a Docker image until we provide the Dockerfile.
doc/PD_Disaggregation.md
Outdated
| ```
| ## MemFabric Adaptor install
| *Notice: Prebuilt wheel package is based on aarch64, please leave an issue [here at sglang](https://github.com/sgl-project/sglang/issues) to let us know the requests for amd64 build.*
Don't ask users to file issues in another repo. Just use this repo.
doc/PD_Disaggregation.md
Outdated
| ## PD Disaggregation Examples
| ### Running Qwen3-8B
| Running Qwen3-8B with PD disaggregation on 2 x Atlas 800I A2. Model weights could be found [here](https://modelers.cn/models/MindSpore-Lab/Qwen3-8B).
When running with tp4, it is better to use a larger model as the example, such as Qwen3-32B.
| ## Send Request Example
| ```shell
| curl http://0.0.0.0:8000/generate \
0.0.0.0 is usually not used as a target IP, and the behavior when connecting to it is not well defined. Change the example to 127.0.0.1.
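A minimal sketch of what the suggested change could look like, assuming the server runs on the same host and listens on port 8000 as in the quoted snippet. The rest of the original curl command is not visible in the diff, so the JSON payload below is an assumption modeled on SGLang's native `/generate` request format (`text` plus `sampling_params`), and the variable names are illustrative only.

```shell
# Assumption: the server was started bound to 0.0.0.0:8000 (all interfaces),
# so a client on the same machine reaches it through the loopback address.
HOST=127.0.0.1
PORT=8000
URL="http://${HOST}:${PORT}/generate"

# Assumed payload; adjust the prompt and sampling parameters to the deployment.
PAYLOAD='{"text": "Hello, my name is", "sampling_params": {"max_new_tokens": 16, "temperature": 0}}'

# --max-time keeps the call from hanging when no server is listening.
curl --silent --show-error --max-time 10 \
  -H "Content-Type: application/json" \
  -d "${PAYLOAD}" \
  "${URL}" || echo "request to ${URL} failed" >&2
```

Keeping the host in a variable makes it easy to swap in the node's real IP when the request is sent from another machine, which is the usual case for a two-node PD disaggregation setup.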
update docs for qwen3 moe and pd_disaggregation