MarkTechPost AI | AI Agent

Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent

June 1, 2026, 16:53

Site AI-compiled text

Hermes Agent already remembers across sessions. The open-source agent from Nous Research ships with curated memory files and full-text session search. A new community project argues that this built-in memory is too shallow for serious work. The library, named 'Memory OS', has been released under an MIT license by developer ClaudioDrews. It stacks six memory layers onto Hermes, adding a vector database, structured facts, and an auto-curated knowledge wiki. The project is new, but it shows promise, and its architecture illustrates how agent memory can be layered.

Memory OS

Memory OS is not a Hermes plugin you toggle on. It is a layered system that sits beside Hermes Agent's own memory. Hermes already provides workspace files and a session database. Memory OS keeps those two layers and adds four more above them, for six in total. The full stack runs locally using Docker, Qdrant, Redis, and Python 3.11+. It works with any LLM provider Hermes supports, including OpenRouter, OpenAI, Anthropic, and Ollama. The README frames it as a "memory operating system," not a single feature.

The Six Layers, From Files to Vectors

Layer 1 is Workspace. It holds MEMORY.md, USER.md, and CREATIVE.md, which are injected into the system prompt on each turn.
Layer 2 is Sessions. It uses state.db, a SQLite database with FTS5 full-text search across conversation history.
Layer 3 is Structured Facts. It stores durable facts in memory_store.db, using SQLite, HRR, FTS5, and trust scoring. A feedback loop adjusts the trust scores over time, and entity resolution runs alongside it.
Layer 4 is Fabric, a heavily forked version of the Icarus Plugin. The fork adds LLM-powered session extraction on top of the upstream esaradev/icarus-plugin. It handles cross-session recall through 16 tools, including fabric_recall, fabric_write, and fabric_brief.
Layer 5 is the Vector Database, built on Qdrant. It uses 4096-dimensional dense vectors compared by cosine similarity, plus BM25 sparse search, a keyword-style ranking method.
Layer 6 is an LLM Wiki, an auto-curated vault of concepts, entities, and comparisons. The wiki is continuously ingested back into Qdrant through a process called wiki-continuous-ingest.

How the Retrieval Flow Works

The flow is organized around when memory is read and when it is written. On pre_llm_call, Memory OS runs what it calls surgical recall. It pulls from four sources at once: Fabric, Qdrant, Sessions, and Facts. Each source is gated by a relevance threshold before anything reaches the model. Per-session deduplication stops the same context from appearing twice. A social-closer filter skips trivial messages, such as a plain "thanks." On post_llm_call and on_session_end, the system extracts and captures new learnings automatically. The stated goal is token efficiency rather than stuffing the context window.

The Fallback Cascade and Cleanup

Layer 5's retrieval uses a four-level fallback. It tries hybrid search first, then dense vectors, then lexical search, then SQLite. If one method fails or returns nothing, the next takes over. This design keeps recall working even when the vector database struggles. Memory OS also runs a weekly decay scanner to age out stale entries. Semantic dedup merges near-identical memories when their cosine similarity exceeds 0.92. These housekeeping steps aim to stop memory from bloating over months of use.

Local-First, And Deliberately So

Memory OS positions itself against cloud memory services like mem0, Zep, and Letta. Its pitch is that memory infrastructure should run on your own machine. The memory data stays local, with no memory subscription. LLM calls still go to whichever provider you choose. Hermes itself already supports eight external memory providers, including mem0 and Honcho. Memory OS is not one of those official providers. It is a separate, community-built stack layered on Hermes directly. For teams with data-residency rules, a local memory store can matter.
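The four-level fallback described above (hybrid, then dense, then lexical, then SQLite) can be sketched as a simple cascade. This is a minimal illustration under stated assumptions, not Memory OS's actual code: `cascade_recall`, the four stub retrievers, and the sample query are all hypothetical names invented for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A retriever takes a query string and returns a list of text hits.
Retriever = Callable[[str], List[str]]


def cascade_recall(
    query: str, levels: Sequence[Tuple[str, Retriever]]
) -> Tuple[Optional[str], List[str]]:
    """Try each level in order; return the first one that yields hits."""
    for name, retrieve in levels:
        try:
            hits = retrieve(query)
        except Exception:
            # A failing backend (for example an unreachable vector
            # database) falls through to the next level.
            continue
        if hits:
            return name, hits
    # Every level failed or returned nothing.
    return None, []


# Hypothetical stand-ins for the four levels named in the article.
def hybrid_search(query: str) -> List[str]:
    raise ConnectionError("vector database unavailable")


def dense_search(query: str) -> List[str]:
    return []  # reachable, but nothing relevant found


def lexical_search(query: str) -> List[str]:
    return [f"lexical hit for {query!r}"]


def sqlite_search(query: str) -> List[str]:
    return [f"sqlite hit for {query!r}"]


LEVELS = [
    ("hybrid", hybrid_search),
    ("dense", dense_search),
    ("lexical", lexical_search),
    ("sqlite", sqlite_search),
]

level, hits = cascade_recall("deployment notes", LEVELS)
print(level, hits)
```

In this sketch the hybrid level raises, the dense level returns nothing, and the lexical level answers, so the SQLite level is never reached. Treating "raised an error" and "returned nothing" the same way matches the behavior the article describes.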
In a post dated May 31, 2026, Claudio Drews (@ClaudioDrews25) announced that he had open-sourced Memory OS, describing it as "a complete hierarchical persistent memory architecture for the Hermes Agent" with six fully local layers. The features listed in the post are:
• Structured facts + trust scoring with feedback loop
• Hybrid vector search (Qdrant + BM25)
• Self-curating LLM Wiki
• Semantic…

Strengths and Limitations

Strengths:
• Clear layered design separating files, sessions, facts, vectors, and a wiki
• Fully local infrastructure with no cloud memory subscription
• Provider-agnostic, matching Hermes Agent's own flexibility
• Token-efficient retrieval by design, via gated sources and per-session deduplication

Limitations:
• Brand new, with few commits
• A forked Icarus Plugin that the author says is not upstream-compatible
• Heavier setup: Docker, Qdrant, Redis, and an ARQ Worker are all required
• No published benchmarks on recall quality, latency, or token savings

Key Takeaways

• Memory OS is a community-built, MIT-licensed project that builds a six-layer memory stack on top of Hermes Agent.
• It combines workspace files, FTS5 session search, trust-scored facts, a forked Icarus fabric, Qdrant vectors, and an auto-curated LLM wiki.
• Retrieval runs on pre_llm_call with gated, deduplicated recall from four sources; capture runs on post_llm_call and on_session_end.
• Memory infrastructure is fully local and provider-agnostic, but LLM calls still go to your chosen provider.
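The gated, deduplicated recall on pre_llm_call could look roughly like the sketch below. It is an illustration only: the article gives no threshold values or list of "social closers," so `THRESHOLDS`, `SOCIAL_CLOSERS`, the `SessionRecall` class, and the sample candidates are all assumptions made for the example, not Memory OS's real API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Assumed values: the article does not publish thresholds or a closer list.
SOCIAL_CLOSERS = {"thanks", "thank you", "ok", "great", "bye"}
THRESHOLDS: Dict[str, float] = {
    "fabric": 0.5,
    "qdrant": 0.6,
    "sessions": 0.5,
    "facts": 0.7,
}


@dataclass
class SessionRecall:
    """Tracks what has already been injected during one session."""

    seen: Set[str] = field(default_factory=set)

    def select(
        self, message: str, candidates: Dict[str, List[Tuple[str, float]]]
    ) -> List[str]:
        # Social-closer filter: skip recall for trivial messages.
        if message.strip().lower().rstrip("!.") in SOCIAL_CLOSERS:
            return []
        selected: List[str] = []
        for source, hits in candidates.items():
            # Unknown sources get a threshold of 1.0, so they rarely pass.
            threshold = THRESHOLDS.get(source, 1.0)
            for text, score in hits:
                if score < threshold:
                    continue  # gated out by the relevance threshold
                if text in self.seen:
                    continue  # per-session dedup: already shown once
                self.seen.add(text)
                selected.append(text)
        return selected


recall = SessionRecall()
candidates = {
    "fabric": [("User prefers Postgres", 0.8)],
    "qdrant": [("User prefers Postgres", 0.9), ("Unrelated note", 0.2)],
    "facts": [("Project deadline is Friday", 0.75)],
}
first = recall.select("Which database should I use?", candidates)
second = recall.select("And the deadline?", candidates)
closer = SessionRecall().select("Thanks!", candidates)
print(first, second, closer)
```

Here the first call returns the two distinct memories that clear their thresholds, the second call returns nothing because both were already injected in this session, and the "Thanks!" message skips recall entirely.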
