Hugging Face Blog生成式AI

將 hf CLI 設計為與 Hub 協作的代理優化方式

2026年6月4日 00:00

重點摘要

hf 是 Hugging Face Hub 的官方命令列介面。您在 Python SDK 上能對 Hub 做的任何操作,現在都能在終端機中完成:下載與上傳模型、資料集和 Spaces;建立與管理儲存庫、分支、標籤及 Pull Request;在 HF 基礎架構上執行 Jobs;管理 Buckets、Collections、Webhooks 以及 Inference Endpoints。hf CLI 多年來主要為我們的使用者打造,但如今也越來越多被編碼代理(如 Claude Code、Codex、Cursor 等)所採用。因此我們重新設計它,讓它能同時滿足兩類使用者的需求。這篇部落格文章總結了我們所做的工作。

站內 AI 整理稿

Back to Articles Designing the hf CLI as an agent-optimized way to work with the Hub Published June 4, 2026 Update on GitHub Upvote 4 Célina Hanouti celinah Follow Lucain Pouget Wauplin Follow hf is the official command-line entrypoint to the Hugging Face Hub. Anything you can do on the Hub from the Python SDK, you can do from your terminal: download and upload models, datasets and Spaces; create and manage repos, branches, tags and pull requests; run Jobs on HF infrastructure; manage Buckets, Collections, webhooks and Inference Endpoints. The hf CLI has been primarily built for our users over the years. But it's now increasingly used by coding agents: Claude Code, Codex, Cursor and more. So we rebuilt it to make it work for both audiences at once. This blog post summarizes what we did, and how we benchmarked it. We found that on complex, multi-step tasks the no-CLI baseline (an agent hand-rolling curl or the Python SDK) uses up to 6× as many tokens as the hf CLI. AI agent traffic on the Hub We started tracking agent usage of the Hub in April 2026. The hf CLI (and the huggingface_hub Python SDK it's built on) detects when a coding agent is driving it by reading the environment variables agents set: CLAUDECODE/CLAUDE_CODE for Claude Code, CODEX_SANDBOX for Codex, plus Cursor, Gemini, Pi, and the universal AI_AGENT. That single signal does two jobs: it shapes the CLI's output (more on that below) and it tags each Hub request with an agent/<name> user-agent, so we can attribute traffic to the agent driving it. The two largest by distinct users are Claude Code and Codex, well ahead of everything else, and they're the two agents we benchmark later in this article. The bars count distinct users per agent; request volume is the sub-label. Claude Code alone is ~40k users and nearly 49M requests, with Codex close behind. These are early numbers (we only began attributing agent traffic in April 2026), but the scale is already significant, and we expect it to keep growing as coding agents become a standard way to work with the Hub. Built for humans and agents Humans and coding agents expect different outputs for the same hf commands. A human wants rich terminal output: ANSI color, padded tables truncated to fit the screen, a green ✅ on success, ✔ for booleans, progress bars, prose hints. An agent wants the inverse: no ANSI, nothing truncated, every value in full since an agent can handle far denser output than a human, kept compact and structured to stay light on tokens. It also can't answer a CLI prompt and will happily re-run a command after a timeout. The rest of this section is how hf gives each side what it needs. We introduced agent-mode output in hf v1.9.0 and have been migrating the rest of the CLI to it gradually in the following releases. One command, multiple renderings When hf auto-detects agent use (via the environment variables mentioned above), it renders the same command differently. It optimizes output format for humans or agents without passing a flag: # human (default in a terminal): aligned table, truncated to fit, with a hint > hf models ls --author Qwen --sort downloads --limit 3 ID CREATED_AT DOWNLOADS LIBRARY_NAME LIKES PIPELINE_TAG PRIVATE TAGS ------------------------ ---------- --------- ------------ ----- --------------- ------- ------------------------- Qwen/Qwen3-0.6B 2025-04-27 21156913 transformers 1285 text-generation transformers, safetens... Qwen/Qwen2.5-1.5B-Ins... 2024-09-17 15143953 transformers 725 text-generation transformers, safetens... Qwen/Qwen3-4B 2025-04-27 14808352 transformers 625 text-generation transformers, safetens... Hint: Use `--no-truncate` or `--format json` to display full values. # agent (auto-detected): TSV, full ids + ISO timestamps + every tag, nothing truncated $ hf models ls --author Qwen --sort downloads --limit 3 id created_at downloads library_name likes pipeline_tag private tags Qwen/Qwen3-0.6B 2025-04-27T03:40:08+00:00 21156913 transformers 1285 text-generation False ['transformers', 'safetensors', 'qwen3', 'text-generation', 'conversational', 'arxiv:2505.09388', 'base_model:Qwen/Qwen3-0.6B-Base', 'base_model:finetune:Qwen/Qwen3-0.6B-Base', 'license:apache-2.0', 'text-generation-inference', 'endpoints_compatible', 'deploy:azure', 'region:us'] Qwen/Qwen2.5-1.5B-Instruct 2024-09-17T14:10:29+00:00 15143953 transformers 725 text-generation False['transformers', 'safetensors', 'qwen2', 'text-generation', 'chat', 'conversational', 'en', 'arxiv:2407.10671', 'base_model:Qwen/Qwen2.5-1.5B', 'base_model:finetune:Qwen/Qwen2.5-1.5B', 'license:apache-2.0', 'text-generation-inference', 'endpoints_compatible', 'deploy:azure', 'region:us'] Qwen/Qwen3-4B 2025-04-27T03:41:29+00:00 14808352 transformers 625 text-generation False ['transformers', 'safetensors', 'text-generation', 'arxiv:2309.00071', 'arxiv:2505.09388', 'base_model:Qwen/Qwen3-4B-Base', 'base_model:finetune:Qwen/Qwen3-4B-Base', 'license:apache-2.0', 'endpoints_compatible', 'deploy:azure', 'region:us'] A human gets an aligned table, truncated to fit the terminal, plus a hint on how to see more, with color cues for status (a green ✓ on success, red on error). An agent gets the complete record as TSV: full repo ids, full ISO timestamps, every tag, no ANSI codes, nothing truncated, clean to parse and light on tokens. In practice, we've implemented logging methods like .table(...), .result(...), .json(), etc., which take raw data as input and handle the formatting. In addition to human and agent modes, we've introduced --json and --quiet options to make it easier to pipe commands together. The default mode is automatically chosen based on context, but users can always force the format of their choice with --format human | agent | json | quiet. Next-command hints CLI commands rarely run in isolation: one step usually implies the next (git add, then git commit). Many hf commands now end with a hint: the exact next command to run, pre-filled with the IDs you just used, so a user or agent can chain straight to the next step instead of working it out from scratch. Start a Job in the background and it points you to its logs; create a Space and it points you to its boot status: $ hf jobs run --detach python:3.12 python train.py ✓ Job started id: 6f3a1c2e9b url: https://huggingface.co/jobs/celinah/6f3a1c2e9b Hint: Use `hf jobs logs 6f3a1c2e9b` to fetch the logs. For a human that's a convenience. For an agent it's a rail: the next action is named, parameterized with the right ids, and ready to run, so it takes fewer steps working out what to do. Errors behave the same way, naming the fix instead of just failing: Error: Not logged in. Run `hf auth login` first. Hints, warnings and errors all go to stderr while data goes to stdout, so none of this guidance pollutes the output the agent is parsing. Non-blocking and safe to retry hf never sits on an interactive prompt waiting for a key an agent can't press. A destructive command still asks a human to confirm, but in agent mode it fails fast with the fix in the message (Use --yes to skip confirmation.), and -y/--yes skips it. And because agents retry on timeouts and lost context, operations are built to be safe to repeat: hf repos create --exist-ok is a no-op if the repo already exists, and re-running an upload re-commits cleanly. Separately, the commands that move real data take a --dry-run that shows exactly what they'll transfer before they run, which proves handy for humans and agents alike, since neither has to commit to a long download or blind sync: # agent mode: a destructive command without --yes refuses, with the fix in the message $ hf repos delete my-org/old-model Error: You are about to permanently delete model 'my-org/old-model'. Proceed? Use --yes to skip confirmation. # commands that move data take --dry-run to preview the transfer first $ hf download deepseek-ai/DeepSeek-V4-Pro config.json --dry-run [dry-run] Will download 1 files (out of 1) totalling 1.8K. file size config.json 1.8K Discoverable, p

Related

相關文章

鈦媒體生成式AI

Edge AI Daily 早報(6月19日)

AI Engineer World's Fair 2026規模再創新高,標誌AI工程從幕後走向舞臺中央。行業面臨結構性調整:楊立昆警示OpenAI年虧210億美元揭示商業模式脆弱性,Transformer之父轉投OpenAI反映人才爭奪白熱化。Anthropic多線佈局——語音支持七種語言、加入碳清除聯盟、落子首爾辦事處,展現生態擴張野心。監管壓力加劇,意大利依據DMA調查蘋果iCloud,巴西開放iOS側載佣金降至5%,蘋果圍牆花園持續崩塌。

2 小時前
智東西生成式AI

谷歌時隔6年再發智能音箱,Gemini上桌,售價不到700元

智東西 編譯 | 劉煜 編輯 | 陳駿達 智東西6月18日消息,谷歌昨日宣佈,其首款搭載居家版Gemini語音助手的智能音箱(Google Home Speaker)已開啟預售,將於當地時間6月25日正式上市,售價為99.99美元(約合人民幣677.03元)。在此之前,谷歌已有6年沒有推出過獨立智能音箱產品。 谷歌這款智能音箱外觀近似球形,風格類似亞馬遜新一代Echo音箱與蘋果舊款音箱HomePod Mini。 ▲谷歌智能音箱(圖源:谷歌官網) 使用音箱時,用戶只需通過口令“Hey Google”或“OK Google”喚醒Gemini,就可以繼續下達相應指令。這與谷歌舊款音箱、智能顯示屏等喚醒語音助手的方式相同。此外,用戶只要按照日常說話習慣下達命令,Gemini便能理解用戶意圖,相比之前大大提升溝通效率。 一、加強短時對話記憶,會員可與Gemini不限次數對話 谷歌此次推出的全新音箱升級諸多功能。其中,音箱搭載的Gemini語音助手擁有10款全新擬人化語音音色,用戶可以根據喜好自行選擇聲線。音箱還可支持用戶一次性下達多條語音指令,即使指令未能說對、說完整,用戶中途改口Gemini也能識別。 Gemini還具備多鏈路推理能力,落地到實際生活場景中比較實用。例如,用戶問:“我支持的足球隊下場比賽天氣如何?”Gemini收到指令後,會自動查詢賽事時間、舉辦地點,同時匹配相應時段天氣,再給出答覆。 同時,Gemini加強了短時對話記憶,能承接上下文實現連續對話功能。即使用戶連續追問、甚至串聯多項任務、不重複交代前置條件,該語音助手也能實現來回連貫交流。 ▲谷歌Gemini對話場景(圖源:谷歌官網) 不僅如此,Gemini搭配的連續對話功能,能讓應答後的音箱麥克風保持短暫收音,用戶無需重複喊“OK Google”就能繼續提問。該功能現已全面支持所有Gemini原生適配的語言,包括

22 小時前

微軟,考慮接入DeepSeek

這篇消息聚焦「微軟,考慮接入DeepSeek」。原始導語提到:Copilot Cowork轉為按量計費。 從 AI 情報角度來看,這類內容值得關注其背後的技術進展、產品落地、產業競爭與後續市場影響。

23 小時前