NVIDIA BioNeMo Agent Toolkit Turns Biomolecular Models Into Callable Skills for AI Agents in Drug Discovery

June 29, 2026, 19:06

Key takeaways

AI-compiled summary prepared by this site

AI scientists are becoming a new interface for scientific computing. These agents read papers, write code, generate hypotheses, call APIs, and inspect files. But science is not software engineering. No test suite turns green when a hypothesis is correct. Discovery stays iterative, uncertain, and grounded in the physical world.

That gap is what NVIDIA is targeting. NVIDIA published a hands-on walkthrough for its BioNeMo Agent Toolkit. The argument is direct: a general coding agent pointed at biology will not produce new medicines. In biomolecular research, an agent’s ceiling is set by the tools it can use reliably, correctly, and efficiently.

TL;DR

- BioNeMo Agent Toolkit packages NVIDIA biomolecular models as documented, callable agent skills.
- Skills span protein folding, docking, generative chemistry, genomics, and protein design.
- NVIDIA reports task completion rising from 57.1% to 100% with skills.
- Agents averaged 2x more passing assertions per 1,000 tokens.
- Hosted NIM endpoints suit quick access; local NIM suits repeated iteration.

What is BioNeMo Agent Toolkit

The BioNeMo Agent Toolkit is an open-source repository of ‘skills’ for AI agents. Each skill turns an NVIDIA biomolecular model into a tool an agent can call. The toolkit packages protein folding, molecular docking, generative chemistry, genomics analysis, protein design, and biomarker discovery.

NVIDIA frames the platform in two parts. The first is an accelerated tool layer. NVIDIA NIM (NVIDIA Inference Microservices) and BioNeMo open models deliver core capabilities as callable services. These are accelerated by libraries such as cuEquivariance for structure models and Parabricks for genomics.

The second part is agent-ready interfaces. BioNeMo Skills package each capability so an agent can use it. A skill documents the model’s purpose, required inputs, optional parameters, expected artifacts, and failure modes. Model Context Protocol (MCP) server wrappers expose open models not yet packaged as NIM. Together, this lets an agent discover, select, invoke, and interpret biomolecular models on its own.

The repository groups skills into nim-skills, open-models-skills, and library-skills. A workflows folder holds multi-step meta-skills. One example is generative_protein_binder_design, which chains RFdiffusion → ProteinMPNN → OpenFold3.

How a BioNeMo Skill Works

Every skill is a directory with a SKILL.md file. It holds YAML frontmatter plus instructions, optional references, and optional scripts. An agent reads it like documentation, then acts on it.

The prompt pattern stays the same across models. NVIDIA’s post uses OpenFold3, and the same shape applies to other NIMs for biology. These include Boltz-2, DiffDock, GenMol, ProteinMPNN, MSA Search, RFdiffusion, and Evo 2. You name the skill, the input, and the endpoint.

    # Hosted NIM endpoint
    Use the OpenFold3 BioNeMo Skill to fold MKTVRQERLKSIVR with the NVIDIA API endpoint at https://build.nvidia.com/openfold3

    # Local NIM deployment
    Use the OpenFold3 BioNeMo Skill to fold MKTVRQERLKSIVR with the local NIM endpoint at http://localhost:8000

Installation pulls skills through the open-source skills CLI:

    # Browse and pick a skill interactively
    npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit

    # Or install one skill for a specific agent
    npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit --skill boltz2-nim --agent claude-code

Deployment is a choice, not a default. Use hosted NIM endpoints for fast access without managing infrastructure. Move selected models local when you need lower warm latency, data locality, or repeated iteration.
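The prompt pattern and the hosted-versus-local choice described above can be sketched in a few lines of Python. This is only an illustration, not part of the toolkit: the `Endpoint` type and the `choose_endpoint` and `build_skill_prompt` helpers are hypothetical names introduced here, and the two URLs are simply the examples quoted in the walkthrough.

```python
from dataclasses import dataclass

# The two example endpoints quoted in the walkthrough.
HOSTED_OPENFOLD3 = "https://build.nvidia.com/openfold3"
LOCAL_NIM = "http://localhost:8000"


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical helper type: where the agent is told to send work."""
    label: str  # phrase used in the prompt, e.g. "NVIDIA API endpoint"
    url: str


def choose_endpoint(hosted_url: str, prefer_local: bool,
                    local_url: str = LOCAL_NIM) -> Endpoint:
    """Pick local NIM for repeated iteration or data locality, else hosted."""
    if prefer_local:
        return Endpoint("local NIM endpoint", local_url)
    return Endpoint("NVIDIA API endpoint", hosted_url)


def build_skill_prompt(skill: str, task: str, endpoint: Endpoint) -> str:
    """Fill in the pattern: name the skill, the input, and the endpoint."""
    return (f"Use the {skill} BioNeMo Skill to {task} "
            f"with the {endpoint.label} at {endpoint.url}")


if __name__ == "__main__":
    task = "fold MKTVRQERLKSIVR"
    for prefer_local in (False, True):
        endpoint = choose_endpoint(HOSTED_OPENFOLD3, prefer_local)
        print(build_skill_prompt("OpenFold3", task, endpoint))
```

Running the script prints the same two prompts shown in the article, one per deployment mode. Keeping the endpoint as a separate value means the rest of the prompt does not change when a model moves from hosted to local.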
Benchmark

NVIDIA measured whether skills actually improve an agent’s loop. All reported metrics came from Codex CLI running GPT-5.5 fast. The team compared the same agent with and without each skill.

Task completion was the first metric. Without skills, the agent completed 57.1% of required tasks on average. With access to NIM skills, completion reached 100%.

Efficiency was the second metric. NVIDIA counted passing assertions, the individual steps that compose a task. With skills, an agent produced 2x more passing assertions per 1,000 tokens. That gain held across all ten NIM skills tested.

Use Cases With Examples

- Protein structure prediction: An agent folds a peptide sequence with Boltz-2 or OpenFold3. It returns a CIF file for downstream inspection.
- Multiple sequence alignment: An agent generates an MSA with MMseqs2 through the MSA Search skill. The artifact is an A3M file.
- Generative chemistry: An agent generates candidate molecules with GenMol. Outputs arrive as SDF or SMILES for filtering.
- Protein binder design: The generative_protein_binder_design workflow chains three models. RFdiffusion builds a backbone, ProteinMPNN designs the sequence, and OpenFold3 validates the fold.

Each loop follows the same shape: the agent selects a model, prepares inputs, runs it, inspects outputs, and explains results with caveats.

How It Compares: Agent With vs Without Skills

| Dimension | General agent (no skills) | Agent + BioNeMo Skills |
| --- | --- | --- |
| Task completion | 57.1% average | 100% average |
| Token efficiency | Baseline | 2x passing assertions per 1k tokens |
| Model selection | Guesses tool, format, and inputs | Reads purpose, inputs, and artifacts |
| Deployment | Manual setup from source | Hosted or local NIM, documented |
| Failure handling | Unknown failure modes | Documented failure modes per skill |
| Workflows | Isolated single calls | Multi-step meta-skills (binder design) |

Getting Started

The prerequisites are minimal. You need an agent runtime such as Claude or Codex. You need an NVIDIA API key for hosted BioNeMo NIM endpoints. A GPU node is optional, for local NIM deployment.

Point the agent at the repository first. Let it enumerate the available capabilities before it acts. Then hand it a single skill to operate one model.

NVIDIA flags two cautions. The build.nvidia.com endpoints are for small-scale development and testing only. They are not production-grade inference. NVIDIA also stresses validation: check low-confidence structures and filter generated molecules before trusting them.
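The "enumerate first, then act" step can be approximated by hand against a local checkout of the repository. The sketch below is an illustration under stated assumptions, not the toolkit's own tooling: it assumes each skill directory contains a SKILL.md whose frontmatter is delimited by `---` lines, and that the frontmatter uses flat `name` and `description` keys (the article only says the file holds YAML frontmatter, so those key names are a guess). The functions `read_frontmatter` and `list_skills` are hypothetical helpers.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Read flat `key: value` pairs from the leading '---' block of a file.

    Deliberately minimal: handles top-level scalar keys only, not full YAML.
    """
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if line.startswith((" ", "\t", "#")):
            continue  # skip nested values and comments
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip().strip("\"'")
    return meta


def list_skills(repo_root: Path) -> list[dict[str, str]]:
    """Find every SKILL.md under repo_root and summarize its frontmatter."""
    skills = []
    for skill_md in sorted(repo_root.rglob("SKILL.md")):
        meta = read_frontmatter(skill_md)
        skills.append({
            "directory": skill_md.parent.relative_to(repo_root).as_posix(),
            # Assumed keys; fall back to the directory name if absent.
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
        })
    return skills
```

Calling `list_skills(Path("path/to/checkout"))` returns one entry per skill directory, which is roughly the inventory an agent would build before being handed a single skill. A real YAML parser would be the safer choice if the frontmatter turns out to use nested or multi-line values.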

Original source: MarkTechPost AI


