Steps to create a Hugging Face online demo

Create a Space

First, register for a Hugging Face account. After successful registration, click on your profile picture in the upper right corner and select “New Space” to create one. Follow the Hugging Face guide to choose the necessary configurations, and you will have a blank demo space ready.

A demo for LMDeploy

Replace the content of app.py in your space with the following code:

from lmdeploy.serve.gradio.turbomind_coupled import run_local
from lmdeploy.messages import TurbomindEngineConfig

# Serve one request at a time and keep the k/v cache small (0.05 = 5% of GPU memory)
backend_config = TurbomindEngineConfig(max_batch_size=1, cache_max_entry_count=0.05)
model_path = 'internlm/internlm2-chat-7b'
# server_name="huggingface-space" marks the launch as running inside a Space
run_local(model_path, backend_config=backend_config, server_name="huggingface-space")
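
If the Space runs on ZeroGPU (see the FAQs below), a variant of app.py that uses the PyTorch backend might look like the following sketch. It assumes that `PytorchEngineConfig` can be imported from `lmdeploy.messages` and that `run_local` accepts a `backend` argument, as in recent LMDeploy releases; check both against the version you install. The `launch_with_pytorch_backend` wrapper and the `MODEL_PATH` constant are illustrative names, not part of LMDeploy.

```python
MODEL_PATH = 'internlm/internlm2-chat-7b'


def launch_with_pytorch_backend(model_path: str = MODEL_PATH) -> None:
    """Start the Gradio demo with the PyTorch engine instead of TurboMind."""
    # Imported inside the function so that lmdeploy is only needed at launch time.
    # Assumption: both names exist with these signatures in your lmdeploy version.
    from lmdeploy.messages import PytorchEngineConfig
    from lmdeploy.serve.gradio.turbomind_coupled import run_local

    # Same conservative limits as the TurboMind example above.
    backend_config = PytorchEngineConfig(max_batch_size=1,
                                         cache_max_entry_count=0.05)
    run_local(model_path,
              backend='pytorch',
              backend_config=backend_config,
              server_name="huggingface-space")
```

To use it, call `launch_with_pytorch_backend()` as the last line of app.py.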

Create a requirements.txt file with the following content:

lmdeploy

FAQs

  • ZeroGPU compatibility issue. ZeroGPU suits PyTorch-style inference and is a poor fit for the TurboMind engine. Either switch to the PyTorch backend or select standard GPU hardware for the Space.

  • Gradio version issue. Gradio versions above 4.0.0 are currently not supported. You can install an older version at the top of app.py, before lmdeploy is imported, for example:

    import os
    os.system("pip uninstall -y gradio")
    os.system("pip install gradio==3.43.0")
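
The snippet above reinstalls Gradio every time the Space starts. A slightly more defensive variant, sketched below, reinstalls only when the installed Gradio is missing or too new. The helper names `needs_downgrade` and `ensure_supported_gradio` are hypothetical, and treating every 4.x release (including 4.0.0 itself) as unsupported is an assumption based on the FAQ wording.

```python
import subprocess
import sys
from importlib import metadata

PINNED_GRADIO = "3.43.0"  # the version used in the FAQ snippet above


def needs_downgrade(installed_version: str) -> bool:
    """Return True when the Gradio major version is 4 or higher (assumed unsupported)."""
    major = int(installed_version.split(".")[0])
    return major >= 4


def ensure_supported_gradio() -> None:
    """Install the pinned Gradio release if Gradio is missing or too new."""
    try:
        installed = metadata.version("gradio")
    except metadata.PackageNotFoundError:
        installed = None
    if installed is None or needs_downgrade(installed):
        # Use the current interpreter's pip so the right environment is changed;
        # pip replaces any existing Gradio installation with the pinned one.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", f"gradio=={PINNED_GRADIO}"])
```

Call `ensure_supported_gradio()` at the very top of app.py, before any `lmdeploy` import, so that the pinned version is the one that gets loaded.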