A New Study from Harvard and Perplexity Finds AI Agents Perform 26 Minutes of Autonomous Work per Session vs 33 Seconds for Search

Key Summary
A new study from Perplexity and Harvard offers field evidence on what AI agents do to knowledge work. It draws on production data from two Perplexity products: Search and Computer. The setup is a natural comparison. Search is a conversational answer engine. Computer is an agent that plans and executes tasks end to end. The same users touch both products, so the team can hold the task roughly constant.

What the Study Actually Measures

The study covers a 90-day window, February 27 through May 27, 2026. Computer launched two days before that window opened. The core method matches near-identical query pairs across the two products. The research team found 10,000 session pairs with cosine similarity above 0.99. Each pair is effectively the same task attempted both ways. Computer pairs are gated to sessions that invoke an execution tool. These 'do' tools include code execution, browser actions, file writes, and connector calls. That gate ensures every Computer session does real autonomous work.

Adoption rose over the window. Cumulative Computer queries reached 84× their first-week total. A matched analysis found Computer adoption also raised users' daily Search queries by 1.05. The positive effect points to complementarity, not substitution.

Source: https://research.perplexity.ai/articles/how-ai-agents-reshape-knowledge-work

The Cost-Structure Framework

The research grounds its data in a simple task-based model. Each task has a step count, and longer tasks carry weakly higher value. Agents change the cost structure. They charge a higher fixed cost per task, for delegation and review. But they charge a lower marginal cost per step, since the system executes. This produces a breakeven step count. Below it, the conversational mode is cheaper. Above it, the agent mode wins. Short lookups stay manual; long workflows move to the agent.

Autonomy: 26 Minutes vs 33 Seconds

The first autonomy measure is execution time.
Computer runs 26 minutes of machine work per session. Search runs 33 seconds. That is a 48× gap. Medians show the same pattern: 9 minutes versus 14 seconds. The gap varies by domain. Local tasks show 75×; Science shows 26×, since plain answers often suffice.

Higher autonomy did not lower quality here. The research team scored next-turn dissatisfaction from what users do next. Computer’s meaningful dissatisfaction rate was 1.3%, against 2.9% for Search (55% reduction). Follow-up turns also shift toward review and extension on Computer, though the changes are small.

Connector usage rose more clearly. Computer invoked at least one connector in 7.9% of sessions, versus 1.8% for Search. Computer chains external tools that Search users would otherwise run by hand.

Efficiency: Where the Savings Come From

The efficiency section estimates a Search + Human counterfactual. A human with Search alone takes 269 minutes per matched task. Computer + Human takes 36 minutes. That is 87% less time and 94% less cost overall. Cost savings exceed time savings because domain wages amplify the effect. Computer’s model cost runs $4–10 per task; Search runs about $0.05.

The marginal numbers support the framework. Computer + Human costs $0.16 per step, versus $2.05 for Search + Human. Matched Computer sessions also ran longer prompts, 652 versus 448 characters at the median. That supports the higher fixed-cost assumption for agents. Breakeven analysis says a professional must finish all manual steps in under 20 minutes to match Computer.

The research team cross-checked with an independent LLM estimate and user interviews. The LLM method found 84% time and 93% cost savings. Interviewees reported speedups from 5× to 300×.

Horizontal and Vertical Expansion

Scope is where this research extends past prior work. Autonomy does not just speed up tasks. It changes which tasks users attempt.

Horizontally, Computer queries cross occupational lines more often.
Cross-occupation share averaged 59% on Computer, versus 50% on Search. Management and Entrepreneurship showed the largest gap, at 19 points.

Vertically, Computer queries are more demanding. On Bloom’s Revised Taxonomy, 76% required higher-order cognition, versus 55% for Search. Create-level work was 50% of Computer queries, against 26% for Search.

Computer tasks also span more knowledge domains. Each Computer query touched 2.40 O*NET Knowledge domains on average, versus 1.74 for Search. A Computer query was nearly three times as likely to need three or more domains. Composability climbs as the O*NET hierarchy gets finer. At the Task Statement level, Computer engaged 60% more activities. About 23% of Computer queries hit a Task Statement that the same users never sent to Search.

Comparison Table: Search vs Computer

| Dimension | Perplexity Search | Perplexity Computer |
| --- | --- | --- |
| Mode in the framework | Conversational answer engine | Agent orchestrator |
| Machine time per session | 33 seconds (median 14s) | 26 minutes (median 9m) |
| Queries per session | 2.8 | 5.3 |
| Meaningful (mid+high) dissatisfaction | 2.9% | 1.3% |
| Sessions with a connector call | 1.8% | 7.9% |
| Counterfactual task time | 269 min (Search + Human) | 36 min (Computer + Human) |
| Cost per step | $2.05 | $0.16 |
| Model cost per task | ~$0.05 | $4–10 |
| Cross-occupation query share | 50% | 59% |
| Higher-order Bloom cognition | 55% | 76% |
| O*NET Knowledge domains per query | 1.74 | 2.40 |

Key Takeaways

- Computer runs 26 minutes of autonomous work per session versus 33 seconds for Search, a 48× gap.
- On matched tasks, Computer + Human cuts estimated time 87% and cost 94% versus Search + Human.
- Computer’s meaningful dissatisfaction rate is 1.3% versus 2.9% for Search, a 55% reduction.
- Computer queries cross occupations more (59% vs 50%) and demand more higher-order cognition (76% vs 55%).
- About 23% of Computer queries hit a Task Statement the same users never sent to Search.
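
The cost-structure framework described in the article (a higher fixed cost but a lower per-step cost for the agent) implies a breakeven step count that is easy to work through by hand. The Python sketch below is only an illustration of that linear model, not the study's own estimation code. The per-step costs ($0.16 and $2.05) are the article's figures; the two fixed costs, along with the `Mode` class and `breakeven_steps` function, are hypothetical and chosen purely to make the arithmetic concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    """Linear cost model for one way of doing a task (illustrative only)."""
    name: str
    fixed_cost: float     # dollars per task: delegation, review, setup
    marginal_cost: float  # dollars per step

    def total_cost(self, steps: int) -> float:
        # Total cost grows linearly with the number of steps in the task.
        return self.fixed_cost + self.marginal_cost * steps


def breakeven_steps(agent: Mode, conversational: Mode) -> float:
    """Step count at which both modes cost the same.

    Solves F_a + c_a * n = F_c + c_c * n for n.
    """
    gap = conversational.marginal_cost - agent.marginal_cost
    if gap <= 0:
        raise ValueError("agent must have the lower marginal cost")
    return (agent.fixed_cost - conversational.fixed_cost) / gap


# Per-step costs come from the article; both fixed costs are HYPOTHETICAL.
search = Mode("Search + Human", fixed_cost=0.50, marginal_cost=2.05)
computer = Mode("Computer + Human", fixed_cost=10.00, marginal_cost=0.16)

n_star = breakeven_steps(computer, search)
for steps in (1, 5, 6, 20):
    cheaper = min((search, computer), key=lambda m: m.total_cost(steps))
    print(f"{steps:>2} steps -> {cheaper.name}")
print(f"breakeven at about {n_star:.1f} steps")
```

Under these assumed fixed costs the crossover falls between five and six steps: short lookups stay cheaper in the conversational mode, and longer workflows favor the agent. Different fixed-cost assumptions move the breakeven point but not the shape of the result.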
Related Articles
NetEase Youdao Pivots Fully to AI, Unveils Full-Scenario Agent Matrix at the Book Fair
MosaicLeaks: Can your research agent keep a secret?
Published June 18, 2026. By Alexander Gurung (agurung) and Rafael Pardinas (rafapi-snow), ServiceNow.

TL;DR: Deep research agents increasingly combine private local documents with external tools like web retrieval, creating a privacy risk: an agent's external queries may leak sensitive information. MosaicLeaks proposes a new deep-research task with multi-hop questions that interleave public and private information. Across the models we tested, agents frequently leaked private information, and training only for task performance made it worse. We propose a mosaic-leakage-aware RL training method, Privacy-Aware Deep Research (PA-DR), which raises strict chain success (the share of chains

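A Tencent Veteran Plus a Post-2000 Big-Tech Newcomer: 碼上飛 (Mashangfei) Wants to Do More Than AI Coding
This item focuses on "A Tencent Veteran Plus a Post-2000 Big-Tech Newcomer: 碼上飛 Wants to Do More Than AI Coding". The original lede notes that the product has already been integrated into Huawei's HarmonyOS ecosystem. From an AI-intelligence perspective, items like this are worth watching for the underlying technical progress, product rollout, industry competition, and subsequent market impact.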

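Agents Ignite a Cloud-Drive War: Tencent, Baidu, and Alibaba Converge, and This Time the Fight Is Not About Download Speed
This item focuses on "Agents Ignite a Cloud-Drive War: Tencent, Baidu, and Alibaba Converge, and This Time the Fight Is Not About Download Speed". The original lede notes that cloud drives have become the new infrastructure for agents. From an AI-intelligence perspective, items like this are worth watching for the underlying technical progress, product rollout, industry competition, and subsequent market impact.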

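A 21-Year-Old Enterprise-Services Firm's AI Experiment: Let the Agent Run the Process First
This item focuses on "A 21-Year-Old Enterprise-Services Firm's AI Experiment: Let the Agent Run the Process First". The original lede notes that, after integrating Tencent Cloud WorkBuddy, 司盟企服 hands high-frequency delivery processes such as overseas mail management, audit and bookkeeping, and order review to an agent for a first pass. From an AI-intelligence perspective, items like this are worth watching for the underlying technical progress, product rollout, industry competition, and subsequent market impact.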
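Caocao Mobility Announces a Full AI Transformation and an Organizational Upgrade Toward Becoming an AI-Native Company
At the 2026 International Automotive and Supply Chain Expo, Caocao Mobility announced a full AI transformation and released its RoboX strategy, which aims to build a world-leading physical-AI mobility technology platform. At the same time, the company formally launched an organizational upgrade to accelerate its move toward becoming an AI-native company. To drive the transformation, in the first half of this year the company sharpened its strategic focus, continued to optimize its business structure, and proactively scaled back non-core businesses.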