@zzzzzzzxh
Contributor

Update docs for Qwen3 MoE and PD disaggregation

-v /usr/local/sbin:/usr/local/sbin \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /home/ckpt:/home/ckpt \
sgl_mindspore:v0.10 \
Collaborator

This sgl_mindspore Docker image does not appear to be declared anywhere. Better not to mention a Docker image until we provide the Dockerfile.


```
## MemFabric Adaptor install
*Notice: Prebuilt wheel package is based on aarch64, please leave an issue [here at sglang](https://github.com/sgl-project/sglang/issues) to let us know the requests for amd64 build.*
Collaborator

Don't ask users to file issues in another repo. Point them to this repo instead.


## PD Disaggregation Examples
### Running Qwen3-8B
Running Qwen3-8B with PD disaggregation on 2 x Atlas 800I A2. Model weights could be found [here](https://modelers.cn/models/MindSpore-Lab/Qwen3-8B).
Collaborator

Since this example runs with tp4, a larger model such as Qwen3-32B would make a better example.


## Send Request Example
```shell
curl http://0.0.0.0:8000/generate \
Collaborator

0.0.0.0 is normally a bind address, not a connection target, and connecting to it is not well-defined behavior. Please change the example to 127.0.0.1.
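
For illustration, a loopback version of the request could look like the sketch below. The port `8000` and the `/generate` path come from the snippet under review. The JSON fields (`text`, `sampling_params`) follow SGLang's native generate API, and the prompt and sampling values are placeholder assumptions, not the values from the PR.

```shell
# Assumed values: HOST is the loopback address suggested in this review;
# PORT and the /generate path are taken from the snippet under review.
HOST=127.0.0.1
PORT=8000
URL="http://${HOST}:${PORT}/generate"

# Hypothetical payload: the prompt and sampling values are placeholders.
PAYLOAD='{"text": "Hello, my name is", "sampling_params": {"max_new_tokens": 16, "temperature": 0}}'

# Send the request; print a readable hint if no server is listening.
curl -s --max-time 5 "$URL" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  || echo "request failed: is a server listening on ${URL}?"
```

Keeping the host in a variable makes it easy to swap in the real node IP when the request is sent from a different machine.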
