# Hugging Face (Inference)
Hugging Face Inference Providers offer OpenAI-compatible chat completions through a single router API. You get access to many models (DeepSeek, Llama, and more) with one token. OpenClaw uses the OpenAI-compatible endpoint (chat completions only); for text-to-image, embeddings, or speech, use the HF inference clients directly.

- Provider: `huggingface`
- Auth: `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` (fine-grained token with **Make calls to Inference Providers**)
- API: OpenAI-compatible (`https://router.huggingface.co/v1`)
- Billing: single HF token; pricing follows provider rates, with a free tier.
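Since the router is OpenAI-compatible, a minimal chat-completions request body looks like the sketch below; it would be POSTed to `https://router.huggingface.co/v1/chat/completions` with an `Authorization: Bearer <token>` header (the model id and prompt here are placeholders):

```json
{
  "model": "deepseek-ai/DeepSeek-R1",
  "messages": [
    { "role": "user", "content": "Say hello in one sentence." }
  ]
}
```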
## Quick start

- Create a fine-grained token at Hugging Face → Settings → Tokens with the **Make calls to Inference Providers** permission.
- Run onboarding, choose Hugging Face in the provider dropdown, and enter your API key when prompted.
- In the **Default Hugging Face model** dropdown, pick the model you want (the list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown). Your choice is saved as the default model.
- You can also set or change the default model later in config; a minimal sketch follows this list.
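The following is a minimal sketch, assuming a JSON config file with a `model.primary` key (that key is referenced later on this page; the exact file location and surrounding shape may differ in your setup):

```json
{
  "model": {
    "primary": "huggingface/deepseek-ai/DeepSeek-R1"
  }
}
```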
### Non-interactive example

Onboarding can also be scripted (no prompts), for example to set `huggingface/deepseek-ai/DeepSeek-R1` as the default model.
### Environment note

If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` is available to that process (for example, in `~/.openclaw/.env` or via `env.shellEnv`).
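For example, one line in `~/.openclaw/.env` is enough (the token value is a placeholder):

```
HUGGINGFACE_HUB_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
```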
## Model discovery and onboarding dropdown

OpenClaw discovers models by calling the Inference endpoint directly: `GET https://router.huggingface.co/v1/models` (send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth). The response is OpenAI-style: `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`.
When you configure a Hugging Face API key (via onboarding, HUGGINGFACE_HUB_TOKEN, or HF_TOKEN), OpenClaw uses this GET to discover available chat-completion models. During interactive onboarding, after you enter your token you see a Default Hugging Face model dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls GET https://router.huggingface.co/v1/models to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used.
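To make the merge concrete, here is a purely illustrative shape for one hydrated catalog entry; the metadata field names and values below are assumptions for illustration, not OpenClaw's actual schema:

```jsonc
{
  "id": "Qwen/Qwen3-8B",        // from GET /v1/models
  "name": "Qwen3 8B",           // display name hydrated from the API (or derived from the id)
  "contextWindow": 32768,       // illustrative value; merged in from the built-in catalog
  "cost": { "input": 0, "output": 0 } // illustrative; merged in from the built-in catalog
}
```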
## Model names and editable options

- **Name from API:** the model display name is hydrated from `GET /v1/models` when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` → "DeepSeek R1").
- **Override display name:** you can set a custom label per model in config so it appears the way you want in the CLI and UI; see the sketch after this list.
- **Provider / policy selection:** append a suffix to the model id to choose how the router picks the backend:
  - `:fastest` routes to the highest-throughput backend (the router picks; provider choice is locked, so there is no interactive backend picker).
  - `:cheapest` routes to the lowest cost per output token (the router picks; provider choice is locked).
  - `:provider` forces a specific backend (e.g. `:sambanova`, `:together`).
  Use the suffix in `models.providers.huggingface.models` or set `model.primary` with it (see the sketch after this list). You can also set your default order in Inference Provider settings (no suffix = use that order).
- **Config merge:** existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged, so any custom `name`, `alias`, or model options you set there are preserved.
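A sketch of both overrides together, assuming a JSON config file: the `models.providers.huggingface.models` path and the `name` field are named above, while the `id` key, the array shape, and the surrounding file layout are assumptions for illustration:

```json
{
  "models": {
    "providers": {
      "huggingface": {
        "models": [
          {
            "id": "deepseek-ai/DeepSeek-R1",
            "name": "DeepSeek R1 (HF router)"
          }
        ]
      }
    }
  },
  "model": {
    "primary": "huggingface/deepseek-ai/DeepSeek-R1:fastest"
  }
}
```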
## Model IDs and configuration examples

Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list below comes from `GET https://router.huggingface.co/v1/models`; your catalog may include more.

Example IDs (from the inference endpoint):
| Model | Ref (prefix with huggingface/) |
|---|---|
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 |
| DeepSeek V3.2 | deepseek-ai/DeepSeek-V3.2 |
| Qwen3 8B | Qwen/Qwen3-8B |
| Qwen2.5 7B Instruct | Qwen/Qwen2.5-7B-Instruct |
| Qwen3 32B | Qwen/Qwen3-32B |
| Llama 3.3 70B Instruct | meta-llama/Llama-3.3-70B-Instruct |
| Llama 3.1 8B Instruct | meta-llama/Llama-3.1-8B-Instruct |
| GPT-OSS 120B | openai/gpt-oss-120b |
| GLM 4.7 | zai-org/GLM-4.7 |
| Kimi K2.5 | moonshotai/Kimi-K2.5 |
Provider selection: append `:fastest`, `:cheapest`, or a specific provider suffix (e.g. `:together`, `:sambanova`) to the model id. Set your default provider order in Inference Provider settings; see Inference Providers and `GET https://router.huggingface.co/v1/models` for the full list.