Anthropic-Compatible Endpoints#

LMDeploy provides a lightweight Anthropic-compatible surface for easier integration with Anthropic-style clients and gateways.

Supported Endpoints#

POST /v1/messages
POST /v1/messages/count_tokens
GET /anthropic/v1/models

Required Headers#

For Anthropic POST endpoints, include:

content-type: application/json
anthropic-version: 2023-06-01 (or another accepted version string)

Notes and Current Limits#

POST /v1/messages supports text output, reasoning output (thinking blocks), and tool-use output (tool_use blocks).
Tool use requires launching API server with a configured tool parser (--tool-call-parser ...).
Reasoning block extraction depends on parser configuration (same parser stack used by OpenAI-compatible chat endpoint).
count_tokens is tokenizer/chat-template based and is intended for practical estimation.

Example: `/v1/messages`#

curl http://{server_ip}:{server_port}/v1/messages \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "internlm-chat-7b",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Hello from Anthropic client"}]
  }'

Example: `/v1/messages` with tools#

curl http://{server_ip}:{server_port}/v1/messages \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "internlm-chat-7b",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Find lmdeploy docs"}],
    "tools": [{
      "name": "search",
      "description": "Search docs",
      "input_schema": {
        "type": "object",
        "properties": {
          "query": {"type": "string"}
        },
        "required": ["query"]
      }
    }],
    "tool_choice": {"type": "auto"}
  }'

Streaming Events (SSE)#

When stream=true, the endpoint returns text/event-stream events such as:

message_start
content_block_start
content_block_delta (text_delta, thinking_delta, input_json_delta)
content_block_stop
message_delta
message_stop

Example: `/v1/messages/count_tokens`#

curl http://{server_ip}:{server_port}/v1/messages/count_tokens \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "internlm-chat-7b",
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Count these tokens"}]
  }'

Anthropic-Compatible Endpoints

Contents