Changes from all commits
107 commits
3f86837
update
zhulinJulia24 Jan 5, 2026
566347a
update
WillowsZhu Jan 5, 2026
cf6b616
update
WillowsZhu Jan 5, 2026
30de5d1
update
zhulinJulia24 Jan 6, 2026
008fcf2
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 12, 2026
ac67846
update
zhulinJulia24 Jan 12, 2026
a713434
fix quantization
zhulinJulia24 Jan 12, 2026
01600c0
update
zhulinJulia24 Jan 12, 2026
053eb9a
update
zhulinJulia24 Jan 12, 2026
e9c34a9
update
zhulinJulia24 Jan 12, 2026
9c49e17
update
zhulinJulia24 Jan 12, 2026
b28334c
updaet
zhulinJulia24 Jan 12, 2026
f886566
updaste
zhulinJulia24 Jan 13, 2026
b8881f0
update
zhulinJulia24 Jan 13, 2026
c738007
update
zhulinJulia24 Jan 13, 2026
f9fdf60
update
zhulinJulia24 Jan 13, 2026
089f775
update
zhulinJulia24 Jan 13, 2026
2233197
update
zhulinJulia24 Jan 13, 2026
896a039
update
zhulinJulia24 Jan 14, 2026
4e2c256
updsate
zhulinJulia24 Jan 14, 2026
d77db4f
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 14, 2026
331f65f
update
zhulinJulia24 Jan 15, 2026
4a574f5
update
zhulinJulia24 Jan 15, 2026
533acab
updaste
zhulinJulia24 Jan 15, 2026
246f7e8
update
zhulinJulia24 Jan 15, 2026
234b3ff
update
zhulinJulia24 Jan 15, 2026
2a84752
updaste
zhulinJulia24 Jan 15, 2026
9e7bd3c
update gpus
zhulinJulia24 Jan 15, 2026
8ee06a2
update
zhulinJulia24 Jan 15, 2026
e95c683
update
zhulinJulia24 Jan 15, 2026
fcd8b7c
updaste
zhulinJulia24 Jan 15, 2026
afa1829
update
WillowsZhu Jan 15, 2026
cf42e17
update
zhulinJulia24 Jan 16, 2026
ff4c4ae
update
zhulinJulia24 Jan 16, 2026
d909d99
update
zhulinJulia24 Jan 16, 2026
ceb6ec1
update
zhulinJulia24 Jan 16, 2026
1763e57
update
zhulinJulia24 Jan 16, 2026
51b8e32
update
zhulinJulia24 Jan 16, 2026
1b47878
update
zhulinJulia24 Jan 16, 2026
3781977
update
zhulinJulia24 Jan 16, 2026
083f1db
update
zhulinJulia24 Jan 16, 2026
8fae59e
update
zhulinJulia24 Jan 16, 2026
72861b8
Merge branch 'main' into refactor_all_configs
zhulinJulia24 Jan 19, 2026
e9a9690
update
zhulinJulia24 Jan 19, 2026
980621f
merge main
zhulinJulia24 Jan 19, 2026
20d5b53
update
zhulinJulia24 Jan 19, 2026
dd03163
update
zhulinJulia24 Jan 19, 2026
4712636
update
zhulinJulia24 Jan 19, 2026
4cea370
update
zhulinJulia24 Jan 19, 2026
9939ff4
update
zhulinJulia24 Jan 19, 2026
2e197d3
update
zhulinJulia24 Jan 19, 2026
0054608
update
zhulinJulia24 Jan 19, 2026
808d8f0
updaste
zhulinJulia24 Jan 19, 2026
bee3710
update
zhulinJulia24 Jan 20, 2026
8308f01
update
zhulinJulia24 Jan 20, 2026
1643f27
update
zhulinJulia24 Jan 20, 2026
923c3cb
update
zhulinJulia24 Jan 20, 2026
dd520af
update
zhulinJulia24 Jan 20, 2026
30fa489
update
zhulinJulia24 Jan 20, 2026
101c9d3
fix model path
zhulinJulia24 Jan 20, 2026
0361fe7
update
zhulinJulia24 Jan 21, 2026
f984093
update
zhulinJulia24 Jan 21, 2026
6efc392
update
zhulinJulia24 Jan 21, 2026
36d3fe5
update
zhulinJulia24 Jan 21, 2026
a89f5a5
update
zhulinJulia24 Jan 21, 2026
016e9a3
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 21, 2026
b984a0e
update
zhulinJulia24 Jan 21, 2026
8826d83
update
zhulinJulia24 Jan 21, 2026
077590a
merge main
zhulinJulia24 Jan 21, 2026
6ed5c52
fix default port
zhulinJulia24 Jan 21, 2026
fcf7e32
fix default port
zhulinJulia24 Jan 21, 2026
e1f3992
update
zhulinJulia24 Jan 21, 2026
73c3388
update
zhulinJulia24 Jan 21, 2026
91bbfcf
update
zhulinJulia24 Jan 22, 2026
6d8bf46
update max worker for oc
zhulinJulia24 Jan 22, 2026
2ecb60e
update
zhulinJulia24 Jan 23, 2026
08d5a32
fix default value
zhulinJulia24 Jan 23, 2026
6de2cf5
update
zhulinJulia24 Jan 23, 2026
ff7b1db
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 23, 2026
ea2d145
update
zhulinJulia24 Jan 26, 2026
2eb25d7
merge
zhulinJulia24 Jan 26, 2026
a804471
update
zhulinJulia24 Jan 26, 2026
99d95b9
add timestap in log
zhulinJulia24 Jan 26, 2026
64478ab
fix cv2 not in docker issue
zhulinJulia24 Jan 26, 2026
6b535ca
fix error
zhulinJulia24 Jan 26, 2026
a237c53
update
zhulinJulia24 Jan 27, 2026
0a06dc3
update
zhulinJulia24 Jan 27, 2026
78ec35c
update name
zhulinJulia24 Jan 27, 2026
794306b
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 27, 2026
df5d448
update
zhulinJulia24 Jan 28, 2026
e5dc921
merge main
zhulinJulia24 Jan 28, 2026
eac686f
update testcase
zhulinJulia24 Jan 28, 2026
a7a7ffe
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 28, 2026
ac229c1
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 29, 2026
add1300
update
zhulinJulia24 Jan 29, 2026
ca727fc
merge main
zhulinJulia24 Jan 29, 2026
b3e8abd
update
zhulinJulia24 Jan 29, 2026
978ee90
fix fail case and name typo
zhulinJulia24 Jan 29, 2026
bec9f8d
update
zhulinJulia24 Jan 30, 2026
b49c960
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Jan 31, 2026
1e5c5c9
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Feb 2, 2026
907a3f5
update benchmark name
zhulinJulia24 Feb 2, 2026
b8b4fe6
change numprompts
zhulinJulia24 Feb 2, 2026
84c0938
Merge branch 'InternLM:main' into refactor_all_configs
zhulinJulia24 Feb 3, 2026
15524e4
updaste
zhulinJulia24 Feb 3, 2026
a813a36
update fail case
zhulinJulia24 Feb 4, 2026
5cb1421
update
zhulinJulia24 Feb 5, 2026
24 changes: 8 additions & 16 deletions .github/workflows/api_eval.yml
@@ -38,10 +38,10 @@ env:
   HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
-  REPORT_DIR: /nvme/qa_test_models/evaluation-reports/allure_report/${{ github.run_id }}
+  REPORT_DIR: /nvme/qa_test_models/evaluation_report/allure_report/${{ inputs.repo_ref }}_${{ github.run_id }}
   COV_PARAM: --cov /opt/py3/lib/python3.10/site-packages/lmdeploy
   FAIL_CONFIG: '--lf'
-  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
+  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ inputs.repo_ref }}_${{ github.run_id }}
   OFFLINE_CODE_PATH: /nvme/qa_test_models/offline_pkg/lmdeploy
   OFFLINE_REQUIREMENTS: /nvme/qa_test_models/offline_pkg/requirements.txt
   DEEPSEEK_VL: /nvme/qa_test_models/offline_pkg/DeepSeek-VL
@@ -50,6 +50,7 @@ env:
   HF_DATASETS_CACHE: /nvme/qa_test_models/hf_datasets
   HF_HUB_OFFLINE: 1
   HF_EVALUATE_OFFLINE: 1
+  RUN_ID: ${{ inputs.repo_ref }}_${{ github.run_id }}
 
 jobs:
   linux-build:
@@ -146,30 +147,20 @@ jobs:
   test_evaluation:
     needs: download_pkgs
     if: ${{ !cancelled() }}
-    runs-on: [self-hosted, test-140]
-    timeout-minutes: 2400
+    runs-on: [self-hosted, linux-a100]
+    timeout-minutes: 7200
     strategy:
       fail-fast: false
       matrix:
         backend: ${{ fromJSON(inputs.backend || '["turbomind", "pytorch"]')}}
         gpu_num: ['gpu_num_1', 'gpu_num_2', 'gpu_num_4', 'gpu_num_8']
-        include:
-          - n: 8
-            gpu_num: gpu_num_1
-          - n: 4
-            gpu_num: gpu_num_2
-          - n: 2
-            gpu_num: gpu_num_4
-          - n: 1
-            gpu_num: gpu_num_8
     container:
       image: openmmlab/lmdeploy:latest-cu12.8
       options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip -e NVIDIA_DISABLE_REQUIRE=1 --pull never"
       volumes:
         - /nvme/github-actions/pip-cache:/root/.cache/pip
         - /nvme/github-actions/packages:/root/packages
         - /nvme/github-actions/resources:/root/resources
-        - /nvme/qa_test_models/evaluation-reports:/root/evaluation-reports
         - /nvme/qa_test_models:/nvme/qa_test_models
         - /nvme/huggingface_hub:/nvme/huggingface_hub
         - /mnt/121:/mnt/121
@@ -208,11 +199,12 @@ jobs:
           ln -s /mnt/104/opencompass-data/data ./data
           ln -s /nvme/qa_test_models/resource/nltk_data /usr/share/nltk_data
           execution_mode="${{ github.event.inputs.execution_mode || 'both' }}"
+          ulimit -n 65535
           if [ "$execution_mode" = "both" ] || [ "$execution_mode" = "infer" ]; then
-            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.gpu_num}} and ${{matrix.backend}} and infer" -n ${{matrix.n}} --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.gpu_num}} and ${{matrix.backend}} and infer" --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
           fi
           if [ "$execution_mode" = "both" ] || [ "$execution_mode" = "eval" ]; then
-            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.gpu_num}} and ${{matrix.backend}} and eval" -n 4 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.gpu_num}} and ${{matrix.backend}} and eval" -n 4 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
           fi
           exit $overall_exit
       - name: Clear workspace
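Note on the dropped matrix `include`: the old mapping paired each `gpu_num_*` marker with an xdist worker count `n` so that workers × GPUs-per-case filled one 8-GPU node; with `-n ${{matrix.n}}` removed from the infer invocation, each matrix cell now runs its infer cases sequentially. A minimal sketch of what one expanded cell executes after this change (hypothetical `backend`/`gpu_num` values; the marker names are assumed to be registered in autotest's pytest configuration):

```yaml
# Sketch only: the (backend=turbomind, gpu_num=gpu_num_2) matrix cell.
- name: Run infer cases for one matrix cell
  run: |
    ulimit -n 65535   # raise the open-file limit, as the new step does
    pytest autotest/evaluate/test_api_evaluate.py \
      -m "gpu_num_2 and turbomind and infer" \
      --alluredir=${{ env.REPORT_DIR }} || overall_exit=$?
    exit ${overall_exit:-0}
```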
18 changes: 9 additions & 9 deletions .github/workflows/api_eval_h800.yml
@@ -39,10 +39,10 @@ env:
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
   OUTPUT_FOLDER: cuda12.8_dist_${{ github.run_id }}
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
-  REPORT_DIR: /nvme/qa_test_models/evaluation-reports/allure_report/${{ github.run_id }}
+  REPORT_DIR: /nvme/qa_test_models/evaluation_report/allure_report/${{ inputs.repo_ref }}_${{ github.run_id }}
   COV_PARAM: --cov /opt/py3/lib/python3.10/site-packages/lmdeploy
   FAIL_CONFIG: '--lf'
-  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
+  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ inputs.repo_ref }}_${{ github.run_id }}
   OFFLINE_CODE_PATH: /nvme/qa_test_models/offline_pkg/lmdeploy
   OFFLINE_REQUIREMENTS: /nvme/qa_test_models/offline_pkg/requirements.txt
   DEEPSEEK_VL: /nvme/qa_test_models/offline_pkg/DeepSeek-VL
@@ -51,6 +51,8 @@ env:
   HF_DATASETS_CACHE: /nvme/qa_test_models/hf_datasets
   HF_HUB_OFFLINE: 1
   HF_EVALUATE_OFFLINE: 1
+  RUN_ID: ${{ inputs.repo_ref }}_${{ github.run_id }}
+  TEST_ENV: h800
 
 jobs:
   linux-build:
@@ -105,7 +107,6 @@ jobs:
         - /nvme/github-actions/packages:/root/packages
         - /nvme/github-actions/resources:/root/resources
         - /nvme/github-actions/opencompass-data:/root/opencompass-data
-        - /nvme/qa_test_models/evaluation-reports:/root/evaluation-reports
         - /nvme/qa_test_models:/nvme/qa_test_models
         - /nvme1/qa_test_models:/nvme1/qa_test_models
         - /nvme2/share:/nvme2/share
@@ -133,7 +134,6 @@ jobs:
         run: |
           python3 -m pip install lmdeploy-*.whl --no-deps
           python3 -m pip install -r requirements/test.txt
-          mv autotest/config-h800.yaml autotest/config.yaml
       - name: Install opencompass
         run: |
           python3 -m pip install opencompass
@@ -152,13 +152,13 @@
           ln -s /nvme/qa_test_models/resource/nltk_data /usr/share/nltk_data
           execution_mode="${{ github.event.inputs.execution_mode || 'both' }}"
           if [ "$execution_mode" = "both" ] || [ "$execution_mode" = "infer" ]; then
-            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_1 and ${{matrix.backend}} and infer" -n 8 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
-            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_2 and ${{matrix.backend}} and infer" -n 4 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
-            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_4 and ${{matrix.backend}} and infer" -n 2 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
-            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_8 and ${{matrix.backend}} and infer" -n 1 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_1 and ${{matrix.backend}} and infer" -n 8 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_2 and ${{matrix.backend}} and infer" -n 4 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_4 and ${{matrix.backend}} and infer" -n 2 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "gpu_num_8 and ${{matrix.backend}} and infer" -n 1 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
           fi
           if [ "$execution_mode" = "both" ] || [ "$execution_mode" = "eval" ]; then
-            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.backend}} and eval" -n 4 --run_id ${{ github.event.inputs.run_id || github.run_id }} --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
+            pytest autotest/evaluate/test_api_evaluate.py -m "${{matrix.backend}} and eval" -n 4 --alluredir=${{env.REPORT_DIR}} || overall_exit=$?
           fi
           exit $overall_exit
       - name: Clear workspace
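The tiered infer calls above keep total GPU demand roughly constant on an 8-GPU H800 node: cases tagged `gpu_num_k` each occupy k GPUs, and the xdist worker count is chosen as 8/k. An equivalent loop form of the same tiering (a sketch only; the workflow deliberately spells out the four calls):

```yaml
- name: Run infer cases tiered by GPU count (loop sketch)
  run: |
    overall_exit=0
    # workers * GPUs-per-case = 8, so each tier fills the node without oversubscribing
    for pair in "gpu_num_1 8" "gpu_num_2 4" "gpu_num_4 2" "gpu_num_8 1"; do
      set -- $pair
      pytest autotest/evaluate/test_api_evaluate.py \
        -m "$1 and ${{ matrix.backend }} and infer" -n $2 \
        --alluredir=${{ env.REPORT_DIR }} || overall_exit=$?
    done
    exit $overall_exit
```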
51 changes: 21 additions & 30 deletions .github/workflows/benchmark.yml
@@ -17,7 +17,12 @@ on:
       required: true
       description: 'Set benchmark type. Default is "["longtext", "throughput", "api_server", "prefixcache"]"'
       type: string
-      default: "['apiserver', 'throughput', 'longtext', 'prefixcache']"
+      default: "['apiserver', 'mllm_apiserver', 'throughput', 'longtext', 'prefixcache']"
+    backend:
+      required: true
+      description: 'Set backend filter. Default is "["turbomind", "pytorch"]"'
+      type: string
+      default: "['turbomind', 'pytorch']"
     offline_mode:
       required: true
       description: 'Whether start a offline mode, if true, you should prepare code and whl package by yourself'
@@ -28,11 +33,12 @@ env:
   HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
   OUTPUT_FOLDER: cuda12.8_dist_${{ github.run_id }}
-  REPORT_DIR: /nvme/qa_test_models/benchmark-reports/${{ github.run_id }}
-  ALLURE_REPORT_DIR: /nvme/qa_test_models/benchmark-reports/allure_report/${{ github.run_id }}
-  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
+  REPORT_DIR: /nvme/qa_test_models/benchmark_report/${{ inputs.repo_ref }}_${{ github.run_id }}
+  ALLURE_REPORT_DIR: /nvme/qa_test_models/benchmark_report/allure_report/${{ inputs.repo_ref }}_${{ github.run_id }}
+  TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ inputs.repo_ref }}_${{ github.run_id }}
   OFFLINE_CODE_PATH: /nvme/qa_test_models/offline_pkg/lmdeploy
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
+  RUN_ID: ${{ inputs.repo_ref }}_${{ github.run_id }}
 
 jobs:
   linux-build:
@@ -172,9 +178,18 @@ jobs:
         run: |
           python3 -m pip list
           lmdeploy check_env
-      - name: Run other benchmark
+      - name: Run other benchmark - all
+        if: contains(fromJson(github.event.inputs.backend), 'turbomind') && contains(fromJson(github.event.inputs.backend), 'pytorch')
         run: |
+          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n ${{matrix.n}} -m '${{matrix.gpu_num}} and not pr_test and not function' --alluredir=${{env.ALLURE_REPORT_DIR}}
+      - name: Run other benchmark - turbomind
+        if: contains(fromJson(github.event.inputs.backend), 'turbomind') && !contains(fromJson(github.event.inputs.backend), 'pytorch')
+        run: |
-          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n ${{matrix.n}} --run_id ${{ github.run_id }} -m '${{matrix.gpu_num}} and not pr_test' --alluredir=${{env.ALLURE_REPORT_DIR}}
+          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n ${{matrix.n}} -m '${{matrix.gpu_num}} and not pr_test and not function and turbomind' --alluredir=${{env.ALLURE_REPORT_DIR}}
+      - name: Run other benchmark - pytorch
+        if: contains(fromJson(github.event.inputs.backend), 'pytorch') && !contains(fromJson(github.event.inputs.backend), 'turbomind')
+        run: |
+          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n ${{matrix.n}} -m '${{matrix.gpu_num}} and not pr_test and not function and pytorch' --alluredir=${{env.ALLURE_REPORT_DIR}}
       - name: Clear workfile
         if: always()
         run: |
@@ -185,27 +200,3 @@ jobs:
           rm -rf $workdir
           mkdir $workdir
           chmod -R 777 $workdir
-
-
-  get_result_overview:
-    if: always() && !cancelled()
-    needs: [benchmark]
-    timeout-minutes: 5
-    runs-on: [self-hosted, linux-a100]
-    container:
-      image: openmmlab/lmdeploy:latest-cu12.8
-      options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip -e NVIDIA_DISABLE_REQUIRE=1 --pull never"
-      volumes:
-        - /nvme/qa_test_models:/nvme/qa_test_models
-        - /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro
-    steps:
-      - name: Clone repository
-        uses: actions/checkout@v2
-        with:
-          repository: ${{ github.event.inputs.repo_org || 'InternLM/lmdeploy' }}
-          ref: ${{github.event.inputs.repo_ref || 'main'}}
-      - name: Get overview
-        run: |
-          echo "status=done" >> ${{env.REPORT_DIR}}/status.txt
-          pip install pandas fire mmengine
-          python3 .github/scripts/action_tools.py generate_benchmark_report $REPORT_DIR
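The three `Run other benchmark` variants are gated by mutually exclusive expressions: `fromJson` parses the JSON-style `backend` input into an array and `contains` tests membership, so exactly one step executes per job. A minimal illustration of how one guard evaluates (hypothetical step, assuming the input arrives as "['turbomind']"):

```yaml
- name: Gate on backend input (illustration only)
  # With backend == "['turbomind']":
  #   contains(fromJson(github.event.inputs.backend), 'turbomind') -> true
  #   contains(fromJson(github.event.inputs.backend), 'pytorch')   -> false
  # so only this turbomind-only guard is satisfied.
  if: contains(fromJson(github.event.inputs.backend), 'turbomind') && !contains(fromJson(github.event.inputs.backend), 'pytorch')
  run: echo "turbomind-only benchmark path selected"
```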