Hugging Face Blog | Generative AI

We got local models to triage the OpenClaw repo for FREE!*

June 22, 2026


By Onur Solmaz (osolmaz), ben burtenshaw (burtenshaw), and shaun smith (evalstate)

*Free as in beer, excluding the cost of electricity, and assuming you already own the hardware.

June 2026 will go down as the moment people realized that closed models can be taken away. With the removal of Anthropic's latest flagship model, Claude Fable 5, fresh in memory, one can see why it is more important than ever to own your AI stack and be able to run models locally, especially if you are building your business on top of AI.

In that light, we wanted to share how we use local models like Gemma and Qwen in an agent harness to run classification tasks[1]. This approach is different from using a model like BERT for classification. A local model in an agent harness like Pi can be used in tandem with structured outputs to assign labels. We chose this approach because we already had local models and the harness on hand, and we are convinced that similar setups will become more popular as local models improve in capability.[2]

Our starting point was open source contributions in the OpenClaw repo. OpenClaw gets hundreds of issues and PRs every day, which need to be triaged, prioritized and routed to maintainers.

I, Onur, am working to make local models work well with OpenClaw. Being a maintainer of this specific vertical, I need to react quickly to any P0 issues. With SOTA closed models like GPT-5, Opus, or Sonnet, this is a pretty straightforward task. But I happen to sit on 128 GB of unified memory, namely an NVIDIA GB10. So I took on the task: can I build a real-time notification system that filters and notifies me for only the issues that I am responsible for... with local open-weight models?

This tiny box, a.k.a. DGX Spark, can run gemma-4-26b-a4b with high concurrency and generate hundreds of tokens per second.

If I set up my OpenClaw main agent, running on a $200/mo ChatGPT Pro plan, to trigger a job on every new issue or PR, that would use up my quota. I might instead set it to run every 2 hours, or every 6 hours. This would batch issues over longer periods, so we would be trading real-time notifications for delayed processing. If I ran this on a local model, on hardware I already have up and running, I would not only get near-instantaneous notifications, I would also be able to do it for free (or rather, for the cost of electricity).

Categorizing issues and PRs

We came up with a finite set of labels representing the categories of issues we need to triage, and then used a local model to classify each issue into one of those categories, like local_models, self_hosted_inference, acp, agent_runtime, codex, ui_tui and so on.[3]

But how do we classify pull requests? A simple single request to a Chat Completions endpoint with a tool JSON schema, with the topics as an enum? Kind of. But this is 2026, not 2023, and we have AGENTS. We can do better!

For the local model choices, we tested gemma-4-26b-a4b and qwen3.6-35b-a3b. With performance optimizations, both can generate hundreds of tokens per second locally.

We use an agent harness to drive the classification run. For this, we bundle pi as a harness that can call local model endpoints. By default, the agent receives the PR title, body and a truncated excerpt of the PR diff in the first prompt. It can then either use the bash tool to perform read-only operations on the OpenClaw repo (in case it needs to look at the codebase), or use the final_json tool to submit the final classification result.

You wouldn't want to give full bash access to a local model running in this high-throughput setting, because a prompt-injected issue or PR could otherwise steer the model into doing something unrelated to classification.
For that reason, we use reposhell instead of bash: a restricted bash-like shell that only allows read-only operations (ls, find, cat, grep, etc.) on the OpenClaw repo. The model thinks it is using bash, but any operation that is not allowed is rejected:

    reposhell bound cwd=/repo/openclaw repos=openclaw
    type help for allowed commands; exit or quit to leave
    reposhell /repo/openclaw> help
    allowed: pwd, ls, find, rg, grep, sed -n, cat, head, tail, wc -l, git status --short, git show --name-only, git grep, git ls-files
    search: rg -n -i "lm studio" or grep -R -n -i "lm studio" .
    files: rg --files -g "*.ts" or git ls-files src
    examples: rg -n reposhell README.md | sed is not allowed; use one simple command at a time
    reposhell /repo/openclaw> head README.md
    # 🦞 OpenClaw — Personal AI Assistant

    <p align="center">
      <picture>
        <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/openclaw/openclaw/main/docs/assets/openclaw-logo-text-dark.svg">
        <img src="https://raw.githubusercontent.com/openclaw/openclaw/main/docs/assets/openclaw-logo-text.svg" alt="OpenClaw" width="500">
      </picture>
    </p>

    <p align="center">
    reposhell /repo/openclaw> curl localhost
    reposhell policy denied command: unsupported command "curl"
    exit_code=2
    reposhell /repo/openclaw>

Here is a concrete example where this mattered. In one saved session example, qwen3.6-35b-a3b was classifying openclaw/openclaw#84621, titled "Fix Kimi tool-call rewriting stop reason handling". The thinking block shows the model initially considering coding_agent_integrations because the changed path extensions/kimi-coding made it look plausible. The model used reposhell to inspect the local repo with simple read-only commands like ls extensions, ls extensions/kimi-coding, and cat extensions/kimi-coding/package.json. That package metadata showed the extension was actually @openclaw/kimi-provider, an OpenClaw Kimi provider plugin.
So the model corrected the final labels to inference_api and tool_calling, and explicitly excluded coding_agent_integrations.

We mentioned earlier that we bundle a specific pi configuration that can only perform read-only operations and return classification output. We call it localpager-agent, named after localpager, the main project here. Each PR and issue generates a prompt, which is then passed to the CLI as shown below, alongside other args:

    localpager-agent \
      --model "<model-id>" \
      --base-url "<openai-compatible-base-url>" \
      --session-dir "<session-output-dir>" \
      --final-schema "<runtime-schema.json>" \
      --tools bash,final_json \
      --reposhell-socket "<reposhell.sock>" \
      --reposhell-default-repo "<repo-id>" \
      --reposhell-visible-repos "<repo-id>[,<repo-id>...]" \
      -p "$(cat <rendered-prompt.md>)"

Processing incoming PRs and issues

So what orchestrates everything between the incoming PR/issue and the final notification on Discord?

This is what the final filtered Discord notification looks like: a PR about the desired vertical gets routed to me.

The orchestration around this is very simple; only the classification step involves an LLM:

We use openclaw/gitcrawl to act as a local mirror for the repo. Whenever there is a new PR or issue, each item is normalized into the same shape and written into localpager's own SQLite database.

If the item is new, localpager creates a classification job for it.

A worker then claims jobs from that queue. It builds a GitHub context object containing the issue or PR title, body, labels, author, state, and optionally comments, changed files, and selected diff excerpts. That means the local model does not need to browse GitHub or open the URL itself most of the time. It is handed all the relevant context.

The context object is rendered into a prompt and passed to localpager-agent as described in the previous section. The agent can think and use reposhell, but must eventually output a classification result in the defined schema.

The output is stored back in localpager's SQLite database, and relayed to Discord based on the notification policy configured by the user (for example: notify me for these topics, but not these other ones).

Below is a figure showing the ov
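The idea from "Categorizing issues and PRs" of constraining the topics with an enum, and of requiring the agent to finish with a result "in the defined schema", can be pictured with a small sketch. This is not localpager's actual schema: the field names `labels` and `reason`, the helper names, and the choice to allow several labels per item are assumptions made for illustration, and the label list only contains names quoted in the article.

```python
import json

# Hypothetical label set: only names quoted in the article; the real list is longer.
LABELS = [
    "local_models", "self_hosted_inference", "acp", "agent_runtime",
    "codex", "ui_tui", "inference_api", "tool_calling",
    "coding_agent_integrations",
]


def build_final_schema(labels):
    """Build a JSON-Schema-style dict whose label items are restricted to an enum."""
    return {
        "type": "object",
        "properties": {
            "labels": {
                "type": "array",
                "items": {"type": "string", "enum": list(labels)},
                "minItems": 1,
                "uniqueItems": True,
            },
            "reason": {"type": "string"},  # assumed free-text justification
        },
        "required": ["labels"],
        "additionalProperties": False,
    }


def validate_final_json(raw, schema):
    """Parse the model's final output and check it by hand against the schema.

    Returns the list of labels, or raises ValueError describing the problem.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    extra = set(data) - set(schema["properties"])
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    labels = data.get("labels")
    if not isinstance(labels, list) or not labels:
        raise ValueError("'labels' must be a non-empty list")
    allowed = set(schema["properties"]["labels"]["items"]["enum"])
    unknown = [x for x in labels if not isinstance(x, str) or x not in allowed]
    if unknown:
        raise ValueError(f"labels outside the enum: {unknown}")
    if len(set(labels)) != len(labels):
        raise ValueError("duplicate labels")
    return labels
```

A hand-written check keeps the sketch dependency-free; a full JSON Schema validator would do the same job more thoroughly. Rejecting anything outside the enum is what lets the rest of the pipeline treat the model's answer as plain data.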
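The reposhell policy quoted earlier (an allowlist of simple read-only commands, no pipes, exit code 2 on denial) can be approximated in a few lines. The following is a minimal sketch of that kind of check, not reposhell's real implementation: the function name, the set of rejected shell metacharacters, and the extra `find` flag filter are assumptions. Only the allowed command list and the wording of the "unsupported command" message come from the session shown in the article.

```python
import shlex

# Allowed command prefixes, copied from the `help` output quoted in the article.
ALLOWED_PREFIXES = [
    ("pwd",), ("ls",), ("find",), ("rg",), ("grep",), ("sed", "-n"),
    ("cat",), ("head",), ("tail",), ("wc", "-l"),
    ("git", "status", "--short"), ("git", "show", "--name-only"),
    ("git", "grep"), ("git", "ls-files"),
]

# Assumption: reject characters that chain, substitute, or redirect commands.
FORBIDDEN_CHARS = set("|;&<>`$")

# Assumption: `find` flags that write or execute are refused (not exhaustive).
FIND_DENIED = {"-delete", "-exec", "-execdir", "-ok", "-okdir"}


def check_command(line):
    """Return (exit_code, message); exit_code 0 means the command may run."""
    if any(ch in FORBIDDEN_CHARS for ch in line):
        return 2, "policy denied command: use one simple command at a time"
    try:
        argv = shlex.split(line)
    except ValueError as exc:  # e.g. unbalanced quotes
        return 2, f"policy denied command: {exc}"
    if not argv:
        return 2, "policy denied command: empty command"
    if argv[0] == "find" and FIND_DENIED & set(argv):
        return 2, "policy denied command: find action not allowed"
    for prefix in ALLOWED_PREFIXES:
        if tuple(argv[: len(prefix)]) == prefix:
            return 0, "ok"
    return 2, f'policy denied command: unsupported command "{argv[0]}"'
```

For example, `check_command("head README.md")` is accepted, while `check_command("curl localhost")` is refused with exit code 2. The metacharacter filter is deliberately conservative, so a quoted pattern containing `|` is also refused. A prefix allowlist alone does not confine paths to the repository either; a real restricted shell would need to handle that separately.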
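The orchestration steps under "Processing incoming PRs and issues" (store each normalized item in SQLite, create a job only for new items, let a worker claim jobs, store the result, then apply a notification policy) can be sketched with Python's built-in sqlite3 module. The table layout, column names, function names, and the policy semantics (notify when any label is subscribed and none is muted) are assumptions for illustration; the article does not describe localpager's real schema.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the two hypothetical tables."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id INTEGER PRIMARY KEY,
            repo TEXT NOT NULL,
            number INTEGER NOT NULL,
            kind TEXT NOT NULL,
            title TEXT NOT NULL,
            body TEXT NOT NULL,
            UNIQUE (repo, number)
        );
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY,
            item_id INTEGER NOT NULL REFERENCES items(id),
            status TEXT NOT NULL DEFAULT 'queued',
            result TEXT
        );
    """)
    return db


def ingest(db, repo, number, kind, title, body):
    """Store a normalized item; enqueue a job only if the item was not seen before."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT OR IGNORE INTO items (repo, number, kind, title, body) "
            "VALUES (?, ?, ?, ?, ?)",
            (repo, number, kind, title, body),
        )
        if cur.rowcount == 0:  # row already existed, nothing inserted
            return False
        db.execute("INSERT INTO jobs (item_id) VALUES (?)", (cur.lastrowid,))
        return True


def claim_job(db):
    """Claim the oldest queued job; returns (job_id, title, body) or None."""
    with db:
        row = db.execute(
            "SELECT jobs.id, items.title, items.body FROM jobs "
            "JOIN items ON items.id = jobs.item_id "
            "WHERE jobs.status = 'queued' ORDER BY jobs.id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard makes the claim fail if another worker got there first.
        cur = db.execute(
            "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'queued'",
            (row[0],),
        )
        return row if cur.rowcount == 1 else None


def complete_job(db, job_id, labels):
    """Store the classification result as JSON and mark the job done."""
    with db:
        db.execute(
            "UPDATE jobs SET status = 'done', result = ? WHERE id = ?",
            (json.dumps(labels), job_id),
        )


def should_notify(labels, notify_on, mute=frozenset()):
    """Assumed policy: at least one subscribed label and no muted label."""
    found = set(labels)
    return bool(found & set(notify_on)) and not (found & set(mute))
```

A usage pass might call `ingest(...)` for each mirrored item, loop on `claim_job(db)` in the worker, run the agent on the returned title and body, then call `complete_job(...)` and `should_notify(...)` before posting to Discord. Keeping the queue in the same SQLite file as the items matches the article's point that only the classification step involves an LLM.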
