Meta AI Releases Brain2Qwerty v2: A Non-Invasive MEG Brain-to-Text Pipeline Decoding Typed Sentences at 61% Word Accuracy

June 30, 2026 08:13

Key summary (compiled by the site's AI)

Meta AI just introduced Brain2Qwerty v2. It decodes natural sentences from non-invasive brain recordings in real time. The system reads magnetoencephalography (MEG) signals while a person types. It reconstructs what they typed, with no implant and no surgery. This is the follow-up to Brain2Qwerty v1, released in February 2025. Meta is also releasing the full training code for both versions. The pipeline combines a convolutional encoder, a transformer, and a character-level language model.

TL;DR

- Brain2Qwerty v2 decodes typed sentences from non-invasive MEG signals, with no implant or surgery.
- It reaches 61% average word accuracy (39% WER), up from 8% for prior non-invasive methods.
- The best participant hit 78% word accuracy, with over half of sentences at one word error or less.
- The pipeline pairs a convolutional encoder, transformer, and character-level language model, plus fine-tuned LLMs.
- Accuracy scales log-linearly with data; training code for v1 and v2 is released under CC BY-NC 4.0.

What is Brain2Qwerty v2?

Brain2Qwerty v2 is a brain-to-text decoder. It maps raw brain activity to characters, then to words and sentences. Meta trained it on approximately 22,000 sentences from nine volunteer participants. Each participant was recorded for 10 hours while actively typing.

Recordings come from a MEG device. MEG measures the magnetic fields produced by neuronal activity, sampled at high temporal resolution. The model leverages character, word and sentence-level representations. That layered design lets it correct local errors using broader context.

Importantly, this is research, not a product. The decoder is not a consumer device, and it was tested on a small group of volunteers. The data was collected with Spain’s BCBL (Basque Center on Cognition, Brain and Language). It belongs to that research center.

How the Decoding Pipeline Works

Earlier non-invasive systems relied on hand-crafted pipelines to detect neural events.
Brain2Qwerty v2 replaces that step with end-to-end deep learning. Per Meta’s repository, the model combines three components: a convolutional encoder, a transformer, and a character-level language model.

The convolutional encoder reads raw MEG signals. It learns features directly from the data instead of using engineered event detectors. The transformer models longer-range structure across the signal. The character-level language model then constrains the output toward plausible text.

Meta’s research team describes three ways AI enables the result. Each maps to a concrete engineering decision teams will recognize.

- Deep learning replaces hand-crafted event detection.
- Large language models are fine-tuned to extract semantic representations.
- AI agents iteratively refined the decoding pipeline through automated code development. Final training configurations were still selected manually by developers.

Fine-tuning large language models on neural data adds semantic context. That context bridges noisy brain recordings and coherent language output. In practice, the language model rejects character sequences that form no real words. It pushes the decoder toward sentences a human would plausibly type.

Here is an illustrative sketch of the published architecture. It mirrors the described components and is not Meta’s exact training code.

    import torch
    import torch.nn as nn


    class Brain2QwertySketch(nn.Module):
        """Illustrative: convolutional encoder -> transformer -> char-level head.

        Reflects the components Meta describes, not the official implementation.
        """

        def __init__(self, n_meg_channels=306, d_model=256, n_chars=40):
            super().__init__()
            # 1) Convolutional encoder over raw MEG channels x time
            self.encoder = nn.Sequential(
                nn.Conv1d(n_meg_channels, d_model, kernel_size=7, padding=3),
                nn.GELU(),
                nn.Conv1d(d_model, d_model, kernel_size=5, padding=2),
                nn.GELU(),
            )
            # 2) Transformer models temporal structure
            layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=6)
            # 3) Character-level head; a language model refines this downstream
            self.char_head = nn.Linear(d_model, n_chars)

        def forward(self, meg):
            # meg: (batch, channels, time)
            x = self.encoder(meg)      # (batch, d_model, time)
            x = x.transpose(1, 2)      # (batch, time, d_model)
            x = self.transformer(x)    # contextualized features
            return self.char_head(x)   # (batch, time, n_chars)

To work with Meta’s real code, clone the repository and inspect both versions:

    git clone https://github.com/facebookresearch/brain2qwerty
    # brain2qwerty_v1/ and brain2qwerty_v2/ hold the training code

The Accuracy Numbers

Brain2Qwerty v2 achieves an average word accuracy rate of 61%. That corresponds to a word error rate (WER) of 39%. For the best participant, the model reaches 78% word accuracy. For that participant, over half of sentences had one word error or less.

The prior baseline matters here. Meta reports that other non-invasive methods reached only 8% word accuracy.

Accuracy also improves log-linearly with data volume. More recording hours predictably raise accuracy in the reported range. That scaling behavior is the key claim for builders. It suggests the gap with surgical implants could narrow through data alone.
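The two headline figures above (61% word accuracy, 39% WER) are complements of each other. As a minimal sketch of how such a metric is commonly computed, the function below uses the standard word-level edit distance definition of WER. The function name and the example sentences are illustrative assumptions, and the source does not say how Meta tokenizes text or averages across sentences.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and exact string match per word.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences: one substituted word out of five.
wer = word_error_rate("the quick brown fox jumps", "the quick brown dog jumps")
word_accuracy = 1.0 - wer
```

With this convention, word accuracy is simply one minus WER, which matches the way the article pairs 61% with 39%. Note that WER defined this way can exceed 100% when the hypothesis contains many insertions.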
| Metric | Brain2Qwerty v2 | Prior non-invasive methods |
| --- | --- | --- |
| Average word accuracy | 61% | 8% |
| Average word error rate (WER) | 39% | - |
| Best participant word accuracy | 78% | - |
| Recording method | MEG, non-invasive | Non-invasive |
| Scaling behavior | Log-linear with data | - |

These numbers come from volunteers in a controlled setting. They are not clinical results for patients with brain injuries.

v1 vs v2: What Changed

Brain2Qwerty v1 and v2 report different metrics, so compare them carefully. v1 was measured at character level, v2 at word level.

| Aspect | Brain2Qwerty v1 (Feb 2025) | Brain2Qwerty v2 (Jun 2026) |
| --- | --- | --- |
| Devices | MEG and EEG | MEG |
| Participants | 35 healthy volunteers | 9 volunteers |
| Data | Typed sentences | ~22,000 sentences, 10 hours each |
| Reported result | Up to 80% of characters (MEG) | 61% average word accuracy |
| Representation level | Character-level | Character, word and sentence-level |
| Real-time decoding | Not emphasized | Real-time sentence decoding |

v1 also showed that MEG decoding performed at least twice as well as the EEG system. EEG signals are noisier, which limits accuracy.

Use Cases With Examples

The primary motivation is restoring communication. Millions of people have brain lesions that prevent them from speaking or moving. Invasive methods like stereotactic electroencephalography and electrocorticography can already feed signals from a neuroprosthesis to an AI decoder. But they require neurosurgery and are hard to scale.

A non-invasive decoder could widen access. A patient could potentially type sentences without an implant, using only external recordings.

For researchers, the released code supports reproducible neuroscience. A lab could retrain the pipeline on its own MEG dataset.

For AI engineers, the project is a template for biosignal decoding. The convolutional-encoder-plus-transformer pattern transfers to other biosignal tasks.

For data scientists, the log-linear scaling result is a planning tool. It frames how much new recording data may lift accuracy.
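To make the planning use of log-linear scaling concrete, here is a minimal sketch that fits accuracy = a + b * log10(hours) by ordinary least squares. The data points are synthetic placeholders chosen for illustration, not Meta's measurements, and the function names are hypothetical. The article does not publish the fitted coefficients, so any real forecast would need the actual per-hour results.

```python
import math


def fit_log_linear(hours, accuracy):
    """Least-squares fit of accuracy = a + b * log10(hours)."""
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracy) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy))
    b = sxy / sxx            # accuracy gained per tenfold increase in data
    a = mean_y - b * mean_x  # fitted accuracy at 1 hour of data
    return a, b


def predict(a, b, hours):
    """Evaluate the fitted line at a given number of recording hours."""
    return a + b * math.log10(hours)


# SYNTHETIC placeholder points (not from the paper): accuracy rises by
# 0.10 with every doubling of recording hours.
hours = [1, 2, 4, 8]
accuracy = [0.10, 0.20, 0.30, 0.40]

a, b = fit_log_linear(hours, accuracy)
projected = predict(a, b, 16)  # one more doubling of data
```

Under a log-linear trend, each doubling of data buys a fixed accuracy increment, so gains per added hour shrink as the dataset grows. The article only claims this trend within the reported range, so projections far beyond it should be treated as speculative.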
Strengths and Limitations

Strengths:

- Reaches 61% average word accuracy from non-invasive MEG, up from an 8% prior baseline.
- Uses end-to-end deep learning instead of hand-crafted event detection.
- Accuracy scales log-linearly with data, giving a clear path to improvement.
- Full training code for v1 and v2 is publicly released under CC BY-NC 4.0.
- Architecture reuses standard components: convolutional encoder, transformer, character-level language model.

Limitations:

- MEG requires a magnetically shielded room and a still subject, limiting practical use.
- Results come from volunteer participants, not pat

Original source: MarkTechPost AI
