
Meet EverOS: An Open Source Markdown-First Agent Memory Runtime With Hybrid BM25 + Vector Retrieval and Self-Evolving Skills

EverMind has launched EverOS, an open-source memory runtime for AI agents. It ships under an Apache 2.0 license. It targets a problem agent builders hit early: large language models are stateless. The conversation ends, and the context is gone.

EverOS proposes a different substrate. Instead of locking memory inside a vector database, it writes memory as plain Markdown files. Those files become the source of truth that agents read, edit, and search across sessions.

TL;DR

  • EverOS stores agent memory as editable Markdown, indexed by SQLite and LanceDB.
  • Hybrid retrieval blends BM25, vector search, and scalar filtering in a single query.
  • Cases distill into reusable Skills, giving agents procedural, self-evolving memory.
  • Benchmark scores are strong but EverMind-reported; verify on your own workload.
  • It is open source under Apache 2.0, with cloud and self-hosted parity.

What is EverOS?

EverOS is a Python library and a local-first memory runtime. It runs as a server with a CLI and a FastAPI HTTP API, async-first throughout. You drop it into an existing agent loop rather than rebuilding your stack.

The design separates two memory tracks. User-side memory holds Profiles, Episodes, Facts, and Foresights. Agent-side memory holds Cases and Skills. Keeping them separate is uncommon; most libraries center on chat history alone.

Every record lands as a .md file. You can open, edit, grep, and Git-version it, or view it in Obsidian. EverAlgo, a separate stateless library, handles the extraction algorithms. EverOS orchestrates and persists the results.
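
To make the "memory as a file" idea concrete, here is a minimal sketch of writing one record to a Markdown file and reading it back. The directory layout, the front-matter header, and the names write_record and read_record are assumptions made for illustration; the source does not describe the actual file format EverOS uses.

```python
import tempfile
from pathlib import Path


def write_record(root: Path, kind: str, record_id: str, meta: dict, body: str) -> Path:
    """Write one memory record as a Markdown file with a small front-matter header."""
    path = root / kind / f"{record_id}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    header = "\n".join(f"{key}: {value}" for key, value in meta.items())
    path.write_text(f"---\n{header}\n---\n\n{body}\n", encoding="utf-8")
    return path


def read_record(path: Path) -> tuple[dict, str]:
    """Parse the file back into (metadata, body)."""
    text = path.read_text(encoding="utf-8")
    _, header, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in header.strip().splitlines())
    return meta, body.strip()


# Usage in a throwaway directory (hypothetical layout: <root>/<kind>/<id>.md).
root = Path(tempfile.mkdtemp())
record_path = write_record(
    root, "episodes", "ep-001",
    {"user_id": "alice", "session_id": "demo-001"},
    "Alice loves climbing in Yosemite every spring.",
)
meta, body = read_record(record_path)
```

Because the record is an ordinary text file, the same content stays reachable by grep, Git, or any Markdown editor.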

The endpoint stack is OpenAI-protocol compatible. It connects to OpenAI, OpenRouter, vLLM, Ollama, or DeepInfra by changing a base URL. That keeps integration close to a single configuration change.

The runtime is local-first by default. Data never has to leave your environment, and every layer is inspectable. A managed EverOS Cloud option exists for teams that prefer not to self-host. Both share the same SDK, retrieval engine, and memory format.
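
The sketch below shows what "changing a base URL" means for an OpenAI-compatible chat endpoint. It is not EverOS configuration code. The base URLs are the providers' commonly documented ones at the time of writing (the local ports are the usual defaults), and the model name is a placeholder; check each provider's documentation before relying on them.

```python
import json
import urllib.request

# Commonly documented OpenAI-compatible base URLs (assumed, verify before use).
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "deepinfra": "https://api.deepinfra.com/v1/openai",
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}


def build_chat_request(provider: str, model: str, prompt: str,
                       api_key: str = "not-needed") -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the chosen provider."""
    url = f"{BASE_URLS[provider]}/chat/completions"
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Only the provider key changes; the request body keeps the same shape.
local_request = build_chat_request("ollama", "my-model", "hello")
hosted_request = build_chat_request("openrouter", "my-model", "hello", api_key="sk-...")
```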

The Architecture — Markdown, SQLite, and LanceDB

EverOS uses a three-piece storage stack. Markdown is the source of truth. SQLite manages state and queues. LanceDB manages vectors, BM25, and scalar filters.

This is deliberately lighter than a typical production memory setup. There is no required MongoDB, Elasticsearch, Milvus, Redis, or Kafka. For solo developers and small teams, that lowers operational cost.
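
As an illustration of why SQLite is enough for "state and queues" at this scale, here is a minimal job queue built on the standard sqlite3 module. The table name, columns, and function names are hypothetical; the source does not describe the schema EverOS uses.

```python
import sqlite3


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) jobs table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " md_path TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, md_path: str) -> int:
    """Queue a Markdown file for extraction and return the job id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO jobs (md_path) VALUES (?)", (md_path,))
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection):
    """Mark the oldest pending job as running and return (id, md_path), or None."""
    with conn:
        row = conn.execute(
            "SELECT id, md_path FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    return row


def complete(conn: sqlite3.Connection, job_id: int) -> None:
    """Mark a job as done."""
    with conn:
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

A single file on disk replaces a separate queue service, which is the operational saving the stack is aiming for.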

Retrieval is hybrid. A single LanceDB query combines BM25 keyword matching, dense vector search, and scalar filtering. EverMind markets this multimodal retrieval path as mRAG.

A cascade index sync keeps files and indexes aligned. Editing a .md file triggers a file-watcher that re-syncs the index. Memory stays inspectable without going stale.

Retrieval is also orthogonal across identifiers. You can scope a search by user_id, agent_id, app_id, project_id, and session_id. That scoping is important in multi-agent and multi-user deployments where data isolation is required.
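
The following pure-Python sketch shows the three ingredients working together: a scalar filter narrows the candidates, BM25 scores keyword overlap, cosine similarity scores the vectors, and a weighted sum blends the two. It is a teaching stand-in, not LanceDB's or EverOS's implementation; the weight alpha, the max-normalization of BM25, and the toy two-dimensional vectors are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_vec, rows, where, alpha=0.5, top_k=5):
    """Filter rows by scalar fields, then blend normalized BM25 and cosine scores."""
    kept = [r for r in rows if all(r.get(k) == v for k, v in where.items())]
    if not kept:
        return []
    keyword = bm25_scores(query, [r["text"] for r in kept])
    top = max(keyword) or 1.0  # avoid dividing by zero when no keyword matches
    ranked = [
        (alpha * (kw / top) + (1 - alpha) * cosine(query_vec, r["vector"]), r["text"])
        for kw, r in zip(keyword, kept)
    ]
    return sorted(ranked, reverse=True)[:top_k]


rows = [
    {"user_id": "alice", "text": "Alice loves climbing in Yosemite every spring", "vector": [0.9, 0.1]},
    {"user_id": "alice", "text": "Alice prefers tea over coffee", "vector": [0.1, 0.9]},
    {"user_id": "bob", "text": "Bob goes climbing in Yosemite too", "vector": [0.9, 0.1]},
]
results = hybrid_search("climbing yosemite", [1.0, 0.0], rows, where={"user_id": "alice"})
```

Here the filter removes Bob's record before any scoring happens, and the Yosemite record ranks first because it wins on both the keyword and the vector signal.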
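
A small sketch of how such scoping might be represented: each identifier is optional, and only the ones that are set become filter conditions. The Scope class and its methods are hypothetical; only the five identifier names come from the text above. The rendered string is a generic SQL-style predicate, not a documented EverOS query format.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """The five identifiers named in the article; unset ones do not constrain."""
    user_id: Optional[str] = None
    agent_id: Optional[str] = None
    app_id: Optional[str] = None
    project_id: Optional[str] = None
    session_id: Optional[str] = None

    def to_filter(self) -> str:
        """Render the set identifiers as a SQL-style predicate string."""
        parts = []
        for name, value in asdict(self).items():
            if value is not None:
                escaped = value.replace("'", "''")  # escape single quotes
                parts.append(f"{name} = '{escaped}'")
        return " AND ".join(parts)

    def matches(self, record: dict) -> bool:
        """Check a record against every identifier that is set."""
        return all(
            record.get(name) == value
            for name, value in asdict(self).items()
            if value is not None
        )


scope = Scope(user_id="alice", app_id="default")
predicate = scope.to_filter()
```

Because the identifiers are independent, the same record can be reached by a per-user search, a per-project search, or both combined.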

How Memory Self-Evolves — Cases Become Skills

A distinctive feature is procedural memory. EverOS records every completed agent task as a Case. Repeated successful patterns are distilled offline into reusable Skills.

This is the ‘self-evolving’ claim, stated plainly. Skills are shared across an agent team, with no manual curation and no hardcoding. The goal is agents that improve with use instead of restarting each session.

Version 1.1.0 added more lifecycle tooling. It introduced Knowledge APIs for source-backed Markdown pages with taxonomy and topic search. It also added Reflection, an offline process that merges episode clusters and refines profiles and skills between sessions.

The memory model is simple. Episodic memory answers ‘what happened.’ Profile memory answers ‘who is this user.’ Procedural memory answers ‘how is this task done.’
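
The source does not describe how distillation works internally, so the sketch below swaps in the simplest possible stand-in: count how often the same step sequence succeeded for a task type, and promote it to a Skill once it has enough support. The Case fields, the min_support threshold, and the function name are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Case:
    """One completed agent task (hypothetical fields)."""
    task_type: str
    steps: tuple
    success: bool


def distill_skills(cases, min_support=2):
    """Promote step sequences that succeeded at least min_support times."""
    counts = defaultdict(int)
    for case in cases:
        if case.success:
            counts[(case.task_type, case.steps)] += 1
    skills = {}
    # Visit the most frequent patterns first so each task type keeps its best one.
    for (task_type, steps), n in sorted(counts.items(), key=lambda kv: -kv[1]):
        if n >= min_support and task_type not in skills:
            skills[task_type] = {"steps": list(steps), "support": n}
    return skills


cases = [
    Case("fix_bug", ("reproduce", "write_test", "patch"), True),
    Case("fix_bug", ("reproduce", "write_test", "patch"), True),
    Case("fix_bug", ("patch",), False),
    Case("add_docs", ("draft", "review"), True),
]
skills = distill_skills(cases)
```

The failed attempt and the one-off success are ignored, which mirrors the idea that only repeated successful patterns become shared procedure.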

Benchmark

The EverMind team reports 93.05% on LoCoMo, 83.00% on LongMemEval, and 93.04% on HaluMem. It also cites sub-500ms p95 retrieval latency. LoCoMo and LongMemEval measure long-term conversational memory; HaluMem targets memory hallucination. These numbers come from EverMind posts.

The table below compares EverOS against common alternatives on concrete design dimensions:
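
Since the latency figure is vendor-reported, it is worth measuring p95 on your own queries. The sketch below times any search callable and computes a nearest-rank percentile; the function names are hypothetical, and search_fn stands for whatever client call you use.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def time_calls(search_fn, queries):
    """Return the wall-clock latency of each call in milliseconds."""
    latencies_ms = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return latencies_ms
```

Calling percentile(time_calls(my_search, my_queries), 95) on a few hundred representative queries gives a number that can be compared with the sub-500ms claim.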

| Dimension | EverOS | Naive RAG | Full context window | Other memory libraries |
| --- | --- | --- | --- | --- |
| Source of truth | Plain Markdown .md files | Vector DB records | Prompt only | API or database state |
| Local stack | Markdown + SQLite + LanceDB | Vector DB + app code | None | Often managed services |
| Retrieval | Hybrid BM25 + vector + scalar | Dense vector only | None (no retrieval) | Varies |
| Procedural memory | Cases distilled into Skills | None | None | Rare |
| Multimodal ingest | PDF, image, Office, URL in a single call | Manual pipeline | Via context only | Partial |
| LoCoMo accuracy | 93.05% (EverMind-reported) | | N/A (context limit) | Varies |
| License | Apache 2.0 | Varies | N/A | Varies / proprietary |

Use Cases, With Real Examples

The library links to working integrations. They show what persistent memory enables in real products.

Hive Orchestrator is a browser-native hive-mind for CLI coding agents. Claude Code, Codex, Gemini, and OpenCode collaborate as real PTY processes through a shared team protocol.

Reunite uses semantic memory for public-value search. Parents describe what they remember, children describe what they recall, and the system surfaces connections.

Other examples span healthcare and hardware. They include an Alzheimer’s memory assistant and an AI wearable. The wearable listens to everyday life and converts it into memory. A study buddy with self-evolving memory is also among the examples. The wider ecosystem adds a Claude Code plugin and an MCP-based memory layer for coding assistants.

A Five-Minute Code Walkthrough

Installation uses standard Python tooling. EverOS requires Python 3.12 or newer. The local demo needs no API keys.

# Requires Python 3.12+
uv pip install everos        # or: pip install everos
everos demo                  # local educational visualizer, no keys
everos init                  # paste OpenRouter + DeepInfra keys into .env
everos server start          # starts the FastAPI server
curl http://127.0.0.1:8000/health   # -> {"status":"ok"}

Adding and searching memory are plain HTTP calls. The example below stores a fact, forces extraction, then retrieves it.

# 1) Add a short conversation
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"demo-001","app_id":"default","project_id":"default",
       "messages":[{"sender_id":"alice","role":"user","timestamp":1750000000000,
                    "content":"I love climbing in Yosemite every spring."}]}'

# 2) Flush to force extraction (local demo)
curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"demo-001","app_id":"default","project_id":"default"}'

# 3) Search it back
curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
  -H 'Content-Type: application/json' \
  -d '{"user_id":"alice","app_id":"default","project_id":"default",
       "query":"Where do I like to climb?","top_k":5}'

Multimodal ingestion is an optional extra. Installing everos[multimodal] adds parsing for images, PDFs, and audio. Office documents additionally require LibreOffice, which converts files to PDF before parsing.
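
The three curl calls from the walkthrough can also be issued from an agent loop in Python. The sketch below mirrors them using only the standard library. It assumes the endpoints live under /api/v1/memory and that the server answers with JSON; the helper names build_post and run_demo are hypothetical, and the response shape is not specified in the source.

```python
import json
import urllib.request

# Assumed base path, taken from the curl walkthrough.
BASE = "http://127.0.0.1:8000/api/v1/memory"
SCOPE = {"app_id": "default", "project_id": "default"}


def build_post(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one memory endpoint."""
    return urllib.request.Request(
        f"{BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


add_req = build_post("add", {
    "session_id": "demo-001", **SCOPE,
    "messages": [{
        "sender_id": "alice", "role": "user", "timestamp": 1750000000000,
        "content": "I love climbing in Yosemite every spring.",
    }],
})
flush_req = build_post("flush", {"session_id": "demo-001", **SCOPE})
search_req = build_post("search", {
    "user_id": "alice", **SCOPE,
    "query": "Where do I like to climb?", "top_k": 5,
})


def run_demo(timeout: float = 10.0) -> list:
    """Send add, flush, and search in order; needs a running local server."""
    results = []
    for request in (add_req, flush_req, search_req):
        with urllib.request.urlopen(request, timeout=timeout) as response:
            results.append(json.loads(response.read().decode("utf-8")))
    return results
```

Nothing is sent until run_demo() is called, so the requests can be inspected or logged first.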

Try It: Interactive Memory Demo

The embedded demo below simulates the EverOS loop in your browser. Add a snippet, watch it get extracted and tagged, then search it back through hybrid retrieval. It is illustrative and does not connect to a live server.