# Benchmark

Please install the lmdeploy precompiled package, clone the repository for the benchmark scripts, and download the test dataset:

```shell
pip install lmdeploy

# clone the repo to get the benchmark scripts
git clone --depth=1 https://github.com/InternLM/lmdeploy
cd lmdeploy

# switch to the tag corresponding to the installed version:
git fetch --tags
# check the installed lmdeploy version
pip show lmdeploy | grep Version
# then check out the corresponding tag (replace <version> with the version string)
git checkout <version>

# download the test dataset
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
```
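The test dataset is a JSON list of multi-turn conversations. Below is a minimal sketch for inspecting it; the `conversations`/`from`/`value` field names follow the common ShareGPT schema and are an assumption about this particular file:

```python
import json

# Load the ShareGPT dataset downloaded above.
with open("ShareGPT_V3_unfiltered_cleaned_split.json") as f:
    data = json.load(f)

print(f"{len(data)} conversations")

# NOTE: field names below follow the common ShareGPT schema
# ("conversations", "from", "value"); verify against the actual file.
for turn in data[0].get("conversations", [])[:2]:
    print(turn["from"], "->", turn["value"][:80])
```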

## Benchmark offline pipeline API

```shell
python3 benchmark/profile_pipeline_api.py ShareGPT_V3_unfiltered_cleaned_split.json meta-llama/Meta-Llama-3-8B-Instruct
```

For a comprehensive list of available arguments, please execute `python3 benchmark/profile_pipeline_api.py -h`.
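The script drives LMDeploy's offline `pipeline` API, feeding it prompts sampled from the ShareGPT dataset. For orientation, here is a minimal sketch of that API; the prompts and the generation setting are illustrative:

```python
from lmdeploy import GenerationConfig, pipeline

# Build an offline inference pipeline for the benchmarked model.
pipe = pipeline("meta-llama/Meta-Llama-3-8B-Instruct")

# A batch of prompts goes in; one response per prompt comes out.
responses = pipe(
    ["Hi, please introduce yourself.", "What is the capital of France?"],
    gen_config=GenerationConfig(max_new_tokens=128),  # illustrative setting
)
for r in responses:
    print(r.text)
```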

## Benchmark offline engine API

```shell
python3 benchmark/profile_throughput.py ShareGPT_V3_unfiltered_cleaned_split.json meta-llama/Meta-Llama-3-8B-Instruct
```

Detailed argument specifications can be retrieved by running `python3 benchmark/profile_throughput.py -h`.
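`profile_throughput.py` exercises the inference engine underneath the pipeline, so engine-level settings such as batch size and KV-cache memory ratio dominate the results. These can be set through a backend config; in the sketch below the specific values are assumptions, not recommendations:

```python
from lmdeploy import TurbomindEngineConfig, pipeline

# Illustrative engine settings; tune them for your own hardware.
backend_config = TurbomindEngineConfig(
    max_batch_size=128,         # max concurrent sequences the engine batches
    cache_max_entry_count=0.8,  # fraction of free GPU memory for the KV cache
)
pipe = pipeline(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    backend_config=backend_config,
)
```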

## Benchmark online serving

Launch the server first (refer to the serving guide for details). A minimal launch command is shown below; the model and port are illustrative:
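```shell
lmdeploy serve api_server meta-llama/Meta-Llama-3-8B-Instruct --server-port 23333
```

Then run the benchmark against it: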

```shell
python3 benchmark/profile_restful_api.py --backend lmdeploy --num-prompts 5000 --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json
```

For the detailed argument specification of `profile_restful_api.py`, please run `python3 benchmark/profile_restful_api.py -h`.
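To verify that the server is reachable before benchmarking, a quick check like the sketch below can help; it assumes the illustrative port 23333 from the launch command above, the OpenAI-compatible `/v1/models` route exposed by the api_server, and that the third-party `requests` package is installed:

```python
import requests

# Assumes the server was started on localhost:23333 as above.
resp = requests.get("http://localhost:23333/v1/models", timeout=5)
resp.raise_for_status()
print(resp.json())  # lists the served model(s)
```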