Hugging Face Blog | AI application scenarios

Shipping huggingface_hub every week with AI, open tools, and a human in the loop

June 23, 2026, 00:00


Published June 23, 2026
By Lucain Pouget (Wauplin) and Célina Hanouti (celinah)

huggingface_hub is the Python client at the base of the Hugging Face ecosystem. transformers, datasets, diffusers, sentence-transformers and dozens of other libraries depend on it to talk to the Hub. Every week we don't ship a new release is a week of fixes and features stuck on main.

For a long time we released every 4 to 6 weeks. We now release every week from a single GitHub Actions workflow. We built it using open-source tools and open-weights models and kept a human in the loop at the one place where judgment matters.

Nothing in this post requires a vendor contract, a closed model, or infrastructure you can't run yourself. That was a design goal from the start since we wanted a workflow other maintainers could pick up and adapt. By the end of this post, you'll have everything you need to build your own.

Where we started

The old process was partly automated, mostly manual.

Already in CI:

- Publishing to PyPI once a tag was pushed.
- Opening test branches in downstream libraries with the release candidate pinned.

Still manual, every single time:

- Creating the release branch, bumping the version in __init__.py, committing, tagging, pushing.
- Watching the downstream CI runs and triaging failures.
- Reading through every PR merged since the last release and writing release notes by hand: grouped by theme, with context, in a voice that didn't read like a git log dump.
- Cutting the stable release after the RC period.
- Drafting an internal Slack announcement and social posts.
- Opening the post-release PR to bump main to the next dev0.

Writing good notes for a new version was the heavy part, aggregating tens of PRs on different topics. Nothing technically hard but a few hours of focused attention.
Add the announcements on top and a minor release was easily a half-day of work spread over several days.

Two kinds of work

So we decided to streamline the whole thing. Looking at that list, the work splits in two.

Some steps are purely mechanical and can be automated: bumping the version, committing, tagging, pushing, opening downstream test branches, opening the post-release PR. Nobody needs to think about those. They just have to happen in the right order, every time, which is what a CI workflow is good at.

The rest is different. Writing release notes, deciding what to highlight, phrasing an announcement for a human audience: that's brain work. It's the kind of judgment that kept the release manual for years. This is where AI comes in, turning a blank page into a solid first draft in seconds. It's also where we have to be careful because a draft that looks confident and is subtly wrong is worse than no draft at all.

The design principle: open parts, reusable by anyone

When we decided to fix this, we set one constraint up front: every moving part had to be something any maintainer could run themselves. No closed model behind an API we couldn't swap, no proprietary release platform, no secret sauce. Here's the entire stack:

Part                                                 | What it does
GitHub Actions                                       | Orchestrates the whole release
OpenCode                                             | Agent runtime that drives the model
An open-weights model (currently GLM-5.2 from Z.ai)  | Drafts the release notes and Slack announcement
HF Inference Providers                               | Serves the model
PyPI Trusted Publishing                              | Publishes the package

The second principle: the model drafts, a human decides. Language models are good at turning thirty terse PR titles into readable release notes. They are not good at being trusted blindly. So the workflow is human-supervised: the model does the first pass, a deterministic script checks its work, and a human reviews and edits before anything ships (more on that below).

A tour of the pipeline

The full workflow is a single file, .github/workflows/release.yml, triggered by hand from the Actions UI. It takes exactly one input:

    on:
      workflow_dispatch:
        inputs:
          release_type:
            type: choice
            options:
              - minor-prerelease  # cut an RC from main
              - minor-release     # promote the RC to final
              - patch-release     # bugfix on an existing release branch

From there, the jobs run roughly in this order:

1. Prepare. Compute the next version, create or reuse the release branch, bump __version__, commit, tag, push.
2. Publish to PyPI. Build and upload huggingface_hub. In parallel, build and upload the hf CLI as its own PyPI package.
3. Release notes. Diff the commit range since the last tag, pull PR metadata from the GitHub API, and have the model draft a structured changelog (here's a recent one). Saved as a draft GitHub release.
4. Downstream test branches. For RCs, open a branch in transformers, datasets, diffusers, sentence-transformers with the RC pinned, so their CI tells us fast if we broke something.
5. Slack announcement. Read the notes and produce an internal announcement in our team voice.
6. Archive notes. Upload both the raw AI draft and the human-edited version to a Hugging Face Bucket, side by side.
7. Post-release bump. After a stable release, open a PR on main bumping to the next dev0.
8. Comment on shipped PRs. Leave a "this shipped in vX.Y.Z" comment on every PR in the release.
9. Sync CLI docs. Open a PR to our skills repo with the regenerated hf CLI skill docs.
10. Report to Slack. Every step posts its status as a thread reply; a final job updates the root message with ✅ or ❌.

The remaining manual steps are reviewing and publishing the draft release notes, and reviewing and posting an internal Slack message. Those two steps are where we want a human in the loop.

Trust but verify: the human-in-the-loop core

Here's the failure mode everyone worries about with AI-generated release notes: the model quietly drops a PR or invents one that isn't in this release.
A changelog that's almost right is worse than no changelog because nobody re-checks it.

We don't trust the generated release notes to be complete on the first try; we verify them deterministically. Before the model runs, a Python script retrieves all PRs that belong to the release and stores them as ground truth.

    import re

    # Deterministic: extract PR numbers from squash-merge commits in the range.
    PR_NUMBER_PATTERN = re.compile(r"\(#(\d+)\)$")
    pr_numbers = [
        int(m.group(1))
        for commit in commits_since_last_tag
        if (m := PR_NUMBER_PATTERN.search(commit.title))
    ]
    save_manifest(pr_numbers)  # the source of truth

Then the model drafts the notes from them. Once done, we check its output against the initial list of PRs:

    expected = set(load_manifest())    # what should be there
    found = extract_pr_refs(notes_md)  # what the model wrote (#1234 -> 1234)
    missing = expected - found         # silently dropped
    extra = found - expected           # belongs to a different release

If anything is missing or extra, we don't fail and we don't ship a wrong file. We hand the discrepancy back to the agent and ask it to fix exactly those PRs:

    for _ in range(MAX_ITERATIONS):
        missing, extra = validate(notes)
        if not missing and not extra:
            break  # matches the manifest exactly
        run_agent_fix(missing_prs=missing, extra_prs=extra)

This is the pattern that makes the whole thing trustworthy: a non-deterministic model wrapped in deterministic guardrails. The model is great at writing prose and unreliable at being exhaustive. So we let it write and let code enforce the consistency.

Grounding the model so it doesn't make things up

Completeness is one half. Accuracy is the other. A model summarizing a PR from its title alone will cheerfully invent a code example that doesn't match the real API. To prevent that, when we fetch PR metadata we also pull the actual documentation diffs from each PR: the unified diff of any .md file under docs/ that the PR touched.
    def fetch_doc_diffs(pr):
        """Return the unified diffs of the .md files under docs/ that the PR touched."""
        return [
            {"filename": f.filename, "status": f.status, "patch": f.patch}
            for f in pr.get_files()
            if f.filename.startswith("docs/")
            and f.filename.endswith(".md")
            and f.patch  # skip files that come back without a patch
        ]

That diff goes into the model's context so when it writes "here's the new CLI…
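
As a rough illustration of how such diffs might be placed in the model's context, the sketch below renders one PR's title and documentation patches into a plain-text block and caps the size of each patch. The function name `build_pr_context`, the `max_patch_chars` parameter, the output layout, and the sample PR are all hypothetical and not taken from the huggingface_hub workflow; only the `filename`, `status`, and `patch` keys mirror what `fetch_doc_diffs` returns.

```python
def build_pr_context(number, title, doc_diffs, max_patch_chars=2000):
    """Render one PR as plain text for the model's prompt (hypothetical layout)."""
    lines = [f"PR #{number}: {title}"]
    if not doc_diffs:
        # Tell the model explicitly that it has no documentation to quote from.
        lines.append("(no documentation changes; do not invent usage examples)")
    for diff in doc_diffs:
        patch = diff["patch"]
        if len(patch) > max_patch_chars:
            # Cap very large patches so one PR cannot crowd out the others.
            patch = patch[:max_patch_chars] + "\n[patch truncated]"
        lines.append(f"--- {diff['filename']} ({diff['status']})")
        lines.append(patch)
    return "\n".join(lines)


# Made-up PR number, title, and file, for illustration only.
example = build_pr_context(
    1234,
    "Add a hypothetical CLI flag",
    [
        {
            "filename": "docs/example.md",
            "status": "modified",
            "patch": "@@ -1 +1,2 @@\n intro\n+new line",
        }
    ],
)
print(example)
```

Keeping the rendering in ordinary code means the text the model sees can be logged and inspected alongside its draft, which fits the article's approach of wrapping the model in deterministic steps.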
