Hugging Face Blog · AI Agent

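The Open Source Community is backing OpenEnv for Agentic RL

June 8, 2026 00:00

Key points

The open source community is backing OpenEnv, a tool for creating agentic execution environments such as terminals, browsers, or anything else an agent can interact with. Today, the authors announce that OpenEnv is becoming even more open and will be coordinated by a multi-organization committee.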

AI-compiled article (in-site)

By ben burtenshaw, Joseph Spisak, Lysandre, Davide Testuggine, will brown, Charles Frye, Chris Wing, Daniel (Unsloth), Andrew Zhou, Michael Han, Hamid Shojanazeri, Sanyam Bhutani, Zach Wentz, Emre Guven, Lewis Tunstall, Sergio Paniego

OpenEnv is a tool for creating agentic execution environments such as terminals, browsers, or anything else an agent can interact with. Today, we're excited to announce that OpenEnv is becoming even more open, to make the future of agent training open source.

Starting today, OpenEnv will be coordinated by a committee that so far includes Meta-PyTorch, Reflection, Unsloth, Modal, Prime Intellect, Nvidia, Mercor, Fleet AI, and Hugging Face. OpenEnv now lives at huggingface/OpenEnv.

The OpenEnv project is supported and adopted by some of the leading organizations in the AI ecosystem, including PyTorch Foundation, vLLM, SkyRL (UCB), Lightning AI, Axolotl AI, Stanford Scaling Intelligence Lab, Mithril, OpenMined, Scaler AI Labs, Scale AI, Patronus AI, Surge AI, Halluminate, Turing, Scorecard, and Snorkel AI.

Why we need OpenEnv to train open source agents

Agent harnesses like Claude Code, Codex, OpenClaw, and Hermes keep improving. One reason is that models like GPT-5.5 and Opus 4.8 are trained to use their respective harnesses. We want those gains with open source models too: training local models that use harnesses effectively, and saving compute by specializing models for specific tasks.

Why we need to be (even) more open

Frontier labs train models and harnesses that, for the most part, fit together hand in glove.
The model is trained to use the harness and optimised for its characteristics. Models can generalise beyond these harnesses to some extent, but nothing beats the efficiency of training.

In the open, this isn't the case. Developers use any harness, any model, any inference engine, on whatever use case they value. This is fundamental to the community, but it's also a challenge that requires infrastructure and tooling to tackle.

That's where OpenEnv comes in. It's a library that interfaces between harness, environment, and trainer, and works with any model. For this to stick, it will need to be owned by all the major stakeholders.

A protocol layer, not a reward framework

Alongside the governance change, we're tightening what OpenEnv is. In recent releases, OpenEnv has become an interoperability layer for RL environments. Its job is to standardize how environments are published, deployed, and consumed by agents. It will not dictate how rewards are defined or how training loops work. Reward definition, scoring rubrics, and trainer-specific logic belong in the libraries that specialize in them. OpenEnv is the common socket they can all plug into.

In practice this means:

One interface, many environments. All environments expose the familiar Gymnasium-style API (reset(), step(), state()) and run on a client/server architecture. A trainer that speaks OpenEnv can drive any compliant environment without bespoke code.

Familiar protocols and canonical packaging. Environments are served over standard protocols like HTTP and WebSocket and packaged with Docker. MCP is a first-class citizen, so OpenEnv environments are instantly compatible with MCP servers, and the same environment behaves consistently in both simulation (train/eval) and production modes.

Interop across env libraries. You can define and consume environments across different ecosystems (verifiers, harbor, and others) and on the infrastructure and hub of your choice.
OpenEnv is the deployment and interface layer underneath them, rather than a competitor to them.

What's next

Over the coming months we will focus on the things that turn OpenEnv from a fast-growing project into a dependable standard:

Tasksets via datasets: wiring environment tasks to Hugging Face datasets so environments and benchmarks compose cleanly (RFC 006).
External rewards: letting rewards be defined in whichever library you already use, with OpenEnv as the deployment layer (RFC 007).
Continued harness integration: first-class support for agentic harnesses.
End-to-end examples: full training and evaluation walkthroughs in TRL, Unsloth, and beyond.
Auto-validation: measuring environment quality and contribution to model learning. This will give the community a scalable way to evaluate their environments and drive up quality (think hackathons!) (RFC 008).

Get involved

OpenEnv is community-centric by design, and it's still early: expect rough edges, and help us smooth them. Check out the code and RFCs: github.com/huggingface/OpenEnv

Thanks to everyone who helped make this transition happen. Let's build the common substrate for open-source agentic RL together.
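To make the "one interface, many environments" point from the article more concrete, here is a minimal in-process sketch of an environment exposing reset(), step(), and state(), plus a generic loop that can drive any object with those three methods. This is not OpenEnv's actual API: the names StepResult, Env, CountdownEnv, and run_episode, and all method signatures, are assumptions made up for illustration, and the client/server transport (HTTP, WebSocket, Docker) is left out entirely.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


@dataclass
class StepResult:
    """Hypothetical container for what one interaction returns."""
    observation: Any
    reward: float
    done: bool


class Env(Protocol):
    """The three calls named in the article; these signatures are assumed."""
    def reset(self) -> StepResult: ...
    def step(self, action: Any) -> StepResult: ...
    def state(self) -> dict: ...


class CountdownEnv:
    """Toy environment (illustrative only): ends after `budget` steps."""

    def __init__(self, budget: int = 3) -> None:
        self.budget = budget
        self.steps_taken = 0

    def reset(self) -> StepResult:
        self.steps_taken = 0
        return StepResult(observation=self.budget, reward=0.0, done=False)

    def step(self, action: Any) -> StepResult:
        self.steps_taken += 1
        remaining = self.budget - self.steps_taken
        # Reward is a fixed placeholder: per the article, reward logic
        # belongs in other libraries, not in the interface layer.
        return StepResult(observation=remaining, reward=0.0, done=remaining <= 0)

    def state(self) -> dict:
        return {"steps_taken": self.steps_taken, "budget": self.budget}


def run_episode(env: Env, policy: Callable[[Any], Any], max_steps: int = 100) -> int:
    """Drive any object exposing reset/step/state; return the steps taken."""
    result = env.reset()
    steps = 0
    while not result.done and steps < max_steps:
        result = env.step(policy(result.observation))
        steps += 1
    return steps


env = CountdownEnv(budget=3)
steps = run_episode(env, policy=lambda obs: "noop")
final_state = env.state()
```

Because run_episode depends only on the three methods, swapping CountdownEnv for another class with the same shape requires no change to the loop, which is the property the article attributes to a trainer that "speaks OpenEnv".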

Related articles

Hugging Face Blog · AI Agent

MosaicLeaks: Can your research agent keep a secret?

Published June 18, 2026. By Alexander Gurung and Rafael Pardinas (ServiceNow). TL;DR: Deep research agents increasingly combine private local documents with external tools like web retrieval, creating a privacy risk: an agent's external queries may leak sensitive information. MosaicLeaks proposes a new deep-research task with multi-hop questions that interleave public and private information. Across the models we tested, agents frequently leaked private information, and training only for task performance made it worse. We propose a mosaic-leakage-aware RL training method, Privacy-Aware Deep Research (PA-DR), which raises strict chain success (the share of chains

17 hours ago
量子位 (QbitAI) · AI Agent

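A Tencent veteran plus a post-2000 newcomer from a major tech firm: 碼上飛 wants to do more than AI coding

This item covers "A Tencent veteran plus a post-2000 newcomer from a major tech firm: 碼上飛 wants to do more than AI coding". The original lede notes that the product is already integrated with Huawei's HarmonyOS ecosystem. From an AI-intelligence perspective, items like this are worth watching for the technical progress, product rollout, industry competition, and later market impact behind them.

18 hours ago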

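An AI experiment at a 21-year-old enterprise services company: let the Agent run the process first

This item covers "An AI experiment at a 21-year-old enterprise services company: let the Agent run the process first". The original lede notes that after integrating Tencent Cloud WorkBuddy, 司盟企服 hands high-frequency delivery workflows such as overseas email management, audit bookkeeping, and order review to an Agent to run through first. From an AI-intelligence perspective, items like this are worth watching for the technical progress, product rollout, industry competition, and later market impact behind them.

19 hours ago
TechWeb · AI Agent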

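曹操出行 (Caocao Mobility) announces a full AI transformation and an organizational upgrade toward an AI-native company

At the 2026 International Automotive and Supply Chain Expo, 曹操出行 announced the launch of a full AI transformation and released its RoboX strategy, which aims to build a world-leading physical-AI mobility technology platform. At the same time, the company formally launched an organizational upgrade to accelerate its move toward becoming an AI-native company. To drive the transformation, in the first half of this year the company sharpened its strategic focus, continued to optimize its business structure, and proactively scaled back non-core businesses.

21 hours ago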