A Step-by-Step Coding Tutorial to Implement GBrain: The Self-Wiring Memory Layer Built by Y Combinator’s Garry Tan for AI Agents

Your AI agent is smart but forgetful. Every new session starts from zero: no memory of who you met, what you read, what you decided last Tuesday. GBrain is an open-source fix for that. Built by Garry Tan (President and CEO of Y Combinator) to power his own OpenClaw and Hermes deployments, it’s a markdown-first, Postgres-backed knowledge layer that ingests meetings, emails, tweets, and notes, then auto-wires a typed knowledge graph on top, with zero LLM calls for the graph extraction. The production brain behind Garry’s actual agents currently holds 146,646 pages, 24,585 people, 5,339 companies, and 66 autonomous cron jobs. On its own benchmark (BrainBench, a 240-page rich-prose corpus), GBrain hits P@5 49.1% and R@5 97.9%, a +31.4-point P@5 lead over the same codebase with the graph layer disabled.

This is a hands-on tutorial. You’ll install GBrain locally, import a small notes folder, run a real search, watch the knowledge graph wire itself, and connect it to Claude Code through MCP. About 20 minutes start to finish. All terminal outputs below were captured from a live install of GBrain v0.38.2.0. The repository (MIT-licensed) lives at github.com/garrytan/gbrain.

What you’re building

By the end of the tutorial, you’ll have:

  • A local ~/.gbrain/brain.pglite database: embedded Postgres 17 (via WASM) with pgvector, zero server config.
  • A small “brain repo” of markdown notes about people, companies, and concepts.
  • A working hybrid-search CLI that combines vector + BM25 keyword + Reciprocal Rank Fusion (RRF), with a ZeroEntropy reranker on top by default.
  • A typed knowledge graph (works_at, founded, invested_in, attended, advises, mentions) auto-extracted from your notes.
  • An MCP server exposing 74 tools so Claude Code, Cursor, and Windsurf can read and write to the brain directly.

Prerequisites

  • macOS or Linux (Windows users: use WSL2).
  • A code editor.
  • Bun ≥ 1.3.10 (the runtime GBrain ships on; the repo’s package.json declares this as the minimum engine). We’ll install it in Step 1.
  • An embedding API key from one of: ZeroEntropy (default), OpenAI, or Voyage. Without one, you can still install and run keyword search, but gbrain query (hybrid + vector) will return no results.
  • Optional: an Anthropic API key for multi-query expansion during search.
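
Before you start, it can help to check which embedding key your shell actually exports. The snippet below is a minimal sketch, not part of GBrain: the environment variable names come from this tutorial, while the function name and the precedence order (ZeroEntropy first because it is the stated default) are assumptions made for illustration.

```python
import os

# Env var names are taken from the tutorial; the precedence order is an assumption.
PROVIDER_KEYS = [
    ("zeroentropy", "ZEROENTROPY_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("voyage", "VOYAGE_API_KEY"),
]


def detect_embedding_provider(env=None):
    """Return the first provider whose API key is set and non-empty, else None."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_KEYS:
        if env.get(var, "").strip():
            return provider
    return None


provider = detect_embedding_provider()
if provider is None:
    print("No embedding key found: keyword search only.")
else:
    print(f"Embedding key found for: {provider}")
```

If this reports no key, you can still follow Steps 1 through 5 and the keyword half of Step 6.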

Step 1 — Install Bun and GBrain

GBrain is written in TypeScript and runs on Bun. Install Bun first:

curl -fsSL https://bun.sh/install | bash
exec $SHELL                 # reload shell so `bun` is on PATH
bun --version

Now install GBrain. As of v0.38, the canonical install path is a single global Bun install:

bun install -g github:garrytan/gbrain
gbrain --version
# gbrain 0.38.2.0

Step 2 — Initialize your brain

gbrain init --pglite provisions a local PGLite database in ~/.gbrain/. PGLite is full Postgres compiled to WASM: no server, no Docker, ready in roughly two seconds.

For this tutorial we’ll defer the embedding provider so you can follow along without an API key right away. We’ll wire it up in Step 6 when we run hybrid search:

gbrain init --pglite --no-embedding

(If you’d rather configure embeddings now, set one of OPENAI_API_KEY, ZEROENTROPY_API_KEY, or VOYAGE_API_KEY in your environment before running plain gbrain init --pglite.)

Real output captured from a fresh install (truncated for brevity; there are 81 migrations from schema v1 → v85):

Setting up local brain with PGLite (no server needed)...
  Schema version 1 → 85 (81 migration(s) pending)
  [2] slugify_existing_pages...
  [2] ✓ slugify_existing_pages
  [3] unique_chunk_index...
  [3] ✓ unique_chunk_index
  ...
  Brain ready at /home/you/.gbrain/brain.pglite
  0 pages. Engine: PGLite (local Postgres).

You now have an empty brain. Confirm:

gbrain stats
# Pages:     0
# Chunks:    0
# Embedded:  0
# Links:     0
# Tags:      0
# Timeline:  0

Step 3 — Create a tiny brain repo

The brain repo is just a directory of markdown files. Each file follows GBrain’s compiled truth + timeline pattern: a current best-understanding section on top, an append-only evidence trail below.

Important: wikilinks must use the full slug path (e.g., [[people/alice-chen]], not just [[alice-chen]]) for the graph extractor to resolve them. This is a real gotcha: I tested both forms, and the short form silently produces zero links.

mkdir -p ~/my-brain/people ~/my-brain/companies ~/my-brain/concepts
cd ~/my-brain

Create a person page:

cat > people/alice-chen.md <<'EOF'
---
type: person
title: Alice Chen
tags: [founder, ai-infra]
---

Founder and CEO of [[companies/acme-ai]]. Previously staff engineer at
Google Brain. Focus area: inference optimization for small language models.

---

- 2024-03-12: Met at AI Engineer Summit. Discussed sparse MoE routing.
- 2024-09-04: Announced $12M seed led by Sequoia.
- 2025-01-18: Shipped open-source inference router on GitHub.
EOF
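
To make the compiled truth + timeline layout concrete, here is a rough sketch of how a page like the one above could be split into its three parts. It is an illustration only, not GBrain’s parser: the function name is hypothetical, front matter values are kept as raw strings rather than parsed as YAML, and the sketch assumes the timeline is whatever follows a third `---` line.

```python
import re

# One timeline bullet: "- YYYY-MM-DD: free text"
TIMELINE_ENTRY = re.compile(r"^- (\d{4}-\d{2}-\d{2}): (.+)$")

PAGE = """---
type: person
title: Alice Chen
tags: [founder, ai-infra]
---

Founder and CEO of [[companies/acme-ai]]. Previously staff engineer at
Google Brain. Focus area: inference optimization for small language models.

---

- 2024-03-12: Met at AI Engineer Summit. Discussed sparse MoE routing.
- 2024-09-04: Announced $12M seed led by Sequoia.
- 2025-01-18: Shipped open-source inference router on GitHub.
"""


def split_page(text):
    """Split a page into front matter, compiled truth, and timeline entries."""
    lines = text.strip().splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(fences) < 2 or fences[0] != 0:
        raise ValueError("expected front matter delimited by two '---' lines")

    front = {}
    for line in lines[1:fences[1]]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        front[key.strip()] = value.strip()  # raw string, not parsed YAML

    # An optional third '---' separates compiled truth from the timeline.
    if len(fences) >= 3:
        truth_lines = lines[fences[1] + 1:fences[2]]
        timeline_lines = lines[fences[2] + 1:]
    else:
        truth_lines, timeline_lines = lines[fences[1] + 1:], []

    timeline = []
    for line in timeline_lines:
        match = TIMELINE_ENTRY.match(line.strip())
        if match:
            timeline.append((match.group(1), match.group(2)))

    return {
        "front_matter": front,
        "compiled_truth": "\n".join(truth_lines).strip(),
        "timeline": timeline,
    }


page = split_page(PAGE)
print(page["front_matter"]["title"], "has", len(page["timeline"]), "timeline entries")
```

A page with no timeline section (only two `---` lines) simply comes back with an empty timeline list.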

A company page:

cat > companies/acme-ai.md <<'EOF'
---
type: company
title: Acme AI
tags: [startup, inference]
---

YC W24 inference-optimization startup. Founded by [[people/alice-chen]].
Building latency-aware routing for sub-7B models.

---

- 2024-09-04: $12M seed, led by Sequoia.
- 2025-01-18: Open-sourced their inference router.
EOF

And a concept page:

cat > concepts/inference-optimization.md <<'EOF'
---
type: concept
title: Inference Optimization
tags: [ml-systems]
---

Techniques to reduce latency and cost when serving language models:
quantization, speculative decoding, KV-cache reuse, and request batching.
EOF
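
Because short-form wikilinks fail silently, a quick pre-import check can save some head-scratching. The sketch below is a standalone helper, not a GBrain command: it flags any wikilink target without a folder prefix, and it assumes the only supported link syntax is `[[slug]]`, optionally followed by `|alias` or `#anchor`.

```python
import re
from pathlib import Path

# Capture the target of [[target]], stopping at an alias (|) or anchor (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_short_wikilinks(text):
    """Return wikilink targets that lack a folder prefix such as 'people/'."""
    targets = (match.group(1).strip() for match in WIKILINK.finditer(text))
    return [target for target in targets if "/" not in target]


def check_repo(root):
    """Map each markdown file under root to its short-form wikilinks, if any."""
    problems = {}
    for path in sorted(Path(root).expanduser().rglob("*.md")):
        short = find_short_wikilinks(path.read_text(encoding="utf-8"))
        if short:
            problems[str(path)] = short
    return problems
```

Calling check_repo("~/my-brain") on the three pages above should return an empty dict, since every link uses the full slug path.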

Step 4 — Import the repo

gbrain import is idempotent (content-hash deduplicated). We’ll pass --no-embed so this step is deterministic for readers who don’t have an embedding key set yet; embeddings get backfilled in Step 6. Real output:

gbrain import ~/my-brain/ --no-embed
[gbrain phase] import.collect_files begin dir=/home/you/my-brain/ technique=markdown
[gbrain phase] import.collect_files completed 2ms files=3
Found 3 markdown files
[import.files] 3/3 (100%) imported=3 skipped=0 errors=0

Import complete (0.3s):
  3 pages imported
  0 pages skipped (0 unchanged, 0 errors)
  3 chunks created
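
Idempotent import by content hash is easy to picture with a toy version. The following is a sketch of the general technique only, not GBrain’s implementation: the in-memory index, the function names, and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def content_hash(text):
    """Hash page content; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def import_pages(pages, index):
    """Import slug -> text pages, skipping any whose hash is already in index.

    index maps slug -> stored hash and is updated in place.
    """
    counts = {"imported": 0, "skipped": 0}
    for slug, text in pages.items():
        digest = content_hash(text)
        if index.get(slug) == digest:
            counts["skipped"] += 1  # unchanged since the last import
            continue
        index[slug] = digest
        counts["imported"] += 1
    return counts


index = {}
pages = {
    "people/alice-chen": "Founder and CEO of [[companies/acme-ai]].",
    "companies/acme-ai": "Founded by [[people/alice-chen]].",
}
print(import_pages(pages, index))  # first run imports both pages
print(import_pages(pages, index))  # second run skips both pages
```

Running the same import twice changes nothing the second time, which is why re-importing the whole folder after a small edit is cheap.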

Confirm:

gbrain list
# companies/acme-ai                company  2026-05-22  Acme AI
# concepts/inference-optimization  concept  2026-05-22  Inference Optimization
# people/alice-chen                person   2026-05-22  Alice Chen

Step 5 — Wire the knowledge graph

For a first-time import, run the link extractor explicitly to backfill the graph from your wikilinks. This is pure regex + typed inference, with zero LLM calls.

gbrain extract links --source db

Real output:

[extract.links_db] 3/3 (100%) completed
Links: created 2 from 3 pages (db source)
Done: 2 links, 0 timeline entries from 3 pages

Two typed edges were inferred from the wikilinks: alice-chen --works_at--> acme-ai (from “Founder and CEO of …”) and acme-ai --founded--> alice-chen (from “Founded by …”). The inference cascade fires in order: FOUNDED → INVESTED → ADVISES → WORKS_AT → MENTIONS. No model in the loop.
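
To see how an ordered regex cascade can type an edge without a model, here is a minimal sketch. The cascade order follows the tutorial, but everything else is hypothetical: the trigger patterns, the 60-character context window, and the function names are stand-ins, and GBrain’s real rules are certainly richer.

```python
import re

# Capture the target of [[target]], stopping at an alias (|) or anchor (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

# Checked in order; the first match wins. Patterns are illustrative assumptions.
RULES = [
    ("founded", re.compile(r"\bfounded\b", re.I)),
    ("invested_in", re.compile(r"\b(invested in|investor in)\b", re.I)),
    ("advises", re.compile(r"\b(advises|advisor to)\b", re.I)),
    ("works_at", re.compile(r"\b(ceo|cto|engineer at|works at)\b", re.I)),
]


def infer_link_type(context):
    """Return the first matching link type, falling back to 'mentions'."""
    for link_type, pattern in RULES:
        if pattern.search(context):
            return link_type
    return "mentions"


def extract_links(from_slug, body, window=60):
    """Yield (from_slug, link_type, to_slug) for each wikilink in body."""
    edges = []
    for match in WIKILINK.finditer(body):
        # Only the text just before the link is used as evidence.
        context = body[max(0, match.start() - window):match.start()]
        edges.append((from_slug, infer_link_type(context), match.group(1).strip()))
    return edges


print(extract_links("people/alice-chen", "Founder and CEO of [[companies/acme-ai]]."))
print(extract_links("companies/acme-ai", "YC W24 startup. Founded by [[people/alice-chen]]."))
```

With these toy patterns, “Founder and CEO of” falls through to the works_at rule because “Founder” does not match the whole word “founded”, while “Founded by” is caught by the first rule.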

Inspect the graph directly:

gbrain graph-query people/alice-chen --depth 1
# [depth 0] people/alice-chen
#   --works_at-> companies/acme-ai (depth 1)
gbrain backlinks companies/acme-ai
# [
#   {
#     "from_slug": "people/alice-chen",
#     "to_slug": "companies/acme-ai",
#     "link_type": "works_at",
#     "context": "Founder and CEO of [[companies/acme-ai]]...",
#     "link_source": "markdown",
#     ...
#   }
# ]

This is the difference between vector search and structured retrieval. “Who works at Acme AI?” is now a one-hop typed-edge traversal, not a similarity score. That structural channel is what drives the +31.4-point P@5 lift over the graph-disabled variant on BrainBench.
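
A one-hop typed-edge lookup is little more than a dictionary read, which is the point. The sketch below uses the two edges from this tutorial; the in-memory index and function names are assumptions for illustration and say nothing about how GBrain stores links in Postgres.

```python
from collections import defaultdict

# The two edges extracted earlier in this tutorial: (from, link_type, to).
EDGES = [
    ("people/alice-chen", "works_at", "companies/acme-ai"),
    ("companies/acme-ai", "founded", "people/alice-chen"),
]


def build_backlinks(edges):
    """Index edges by (target slug, link type) for constant-time lookups."""
    index = defaultdict(list)
    for src, link_type, dst in edges:
        index[(dst, link_type)].append(src)
    return index


def who_works_at(company_slug, backlinks):
    """Answer 'who works at X?' by following works_at edges backwards."""
    return sorted(backlinks.get((company_slug, "works_at"), []))


backlinks = build_backlinks(EDGES)
print(who_works_at("companies/acme-ai", backlinks))
```

No embedding or ranking is involved: the answer is either in the edge table or it is not.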

Step 6 — Run a search

GBrain ships two search verbs. gbrain search is keyword-only (BM25 on Postgres tsvector) and works with out embeddings:

gbrain search "inference"
# [0.3648] companies/acme-ai -- YC W24 inference-optimization startup...
# [0.3648] people/alice-chen -- Founder and CEO of [[companies/acme-ai]]...

gbrain query is the full hybrid pipeline: vector (HNSW on pgvector) + BM25 + Reciprocal Rank Fusion + optional multi-query expansion (Anthropic Haiku) + an optional ZeroEntropy reranker. It needs embeddings, which we deferred in Step 2, so wire them up now:

# Set one of: ZEROENTROPY_API_KEY (default), OPENAI_API_KEY, or VOYAGE_API_KEY
export OPENAI_API_KEY=sk-...
gbrain config set embedding_model openai:text-embedding-3-large
gbrain embed --all          # one-time backfill against your embedding provider
gbrain query "who works on small-model inference?"

Three search modes ship out of the box (conservative, balanced, tokenmax), bundling the cost/quality knobs into one config key. Default is balanced with the ZeroEntropy reranker on. RRF formula: score = sum(1 / (60 + rank)).
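
The RRF formula is short enough to try by hand. Here is a minimal sketch that fuses a vector ranking and a keyword ranking with score = sum(1 / (60 + rank)). The document IDs are placeholders, and treating rank as 1-based is an assumption, since the formula above does not say where counting starts.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list of (doc_id, score).

    Each list contributes 1 / (k + rank) per document, with rank starting at 1
    (an assumption). Ties are broken alphabetically for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["a", "b", "c"]   # placeholder IDs, best match first
keyword_hits = ["b", "c", "d"]

for doc_id, score in reciprocal_rank_fusion([vector_hits, keyword_hits]):
    print(f"{doc_id}  {score:.6f}")
```

Documents that appear in both lists ("b" and "c") outrank documents that top only one list, which is the behavior you want from a fusion step: agreement between channels beats a single strong vote.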

Step 7 — Connect to Claude Code via MCP

The brain is more useful when an AI agent can read and write to it directly. GBrain exposes 74 tools over the Model Context Protocol via stdio. The canonical setup is one command (not a hand-edited JSON file):

claude mcp add gbrain -- gbrain serve

Verify the install:

claude mcp list
# gbrain  stdio  gbrain serve

Now ask Claude Code something like “search the brain for inference optimization” and it’ll route through the search tool and return your indexed results. The actual MCP tool names are plain snake_case: get_page, put_page, delete_page, list_pages, search, query, add_link, get_backlinks, add_tag, and 65 more.

Cursor and Windsurf use the standard MCP JSON config in their respective settings UIs. The server spec is the same:

{
  "mcpServers": {
    "gbrain": { "command": "gbrain", "args": ["serve"] }
  }
}
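
If you already have other MCP servers configured, pasting the snippet over your config would drop them. The sketch below merges the gbrain entry into an existing config instead. It is a convenience helper written for this tutorial, not something GBrain ships, and the config file path is deliberately left to the caller because it differs per editor.

```python
import json
from pathlib import Path

# Server spec from the tutorial.
GBRAIN_SERVER = {"command": "gbrain", "args": ["serve"]}


def add_gbrain_server(config):
    """Return a copy of config with a gbrain entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["gbrain"] = dict(GBRAIN_SERVER)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path):
    """Merge the gbrain entry into the JSON file at path, creating it if missing.

    The path is editor-specific; look it up in your editor's MCP settings.
    """
    path = Path(path).expanduser()
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_gbrain_server(config), indent=2) + "\n", encoding="utf-8")


existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(add_gbrain_server(existing), indent=2))
```

Existing servers are preserved and the input dict is left unmodified, so the helper is safe to run more than once.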

Claude Desktop uses claude_desktop_config.json for local stdio MCP servers with the same JSON spec. Remote HTTP MCP servers must be added through Settings → Integrations with a bearer token. See docs/mcp/CLAUDE_DESKTOP.md in the repo for the GUI walkthrough.

If you want remote access from any machine, swap stdio for HTTP:

gbrain serve --http --port 8787
# Bearer auth, default-deny CORS, two-bucket rate limit, per-request audit log.
# Postgres-only by design (PGLite is local-only).

Step 8 — Let the brain run itself

GBrain ships an autopilot loop. As of v0.36.4, one command computes a dependency-ordered remediation plan, submits each step as a Minion job, re-checks the brain’s health score between steps, and refuses to spend past your cost cap:

gbrain doctor --remediate --yes --target-score 90 --max-usd 5
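
The control loop behind that command can be pictured as a budgeted walk over a dependency graph. The sketch below is a toy model under loud assumptions: the step names, costs, and score gains are invented, adding a fixed gain stands in for re-checking the real health score, and nothing here reflects GBrain’s actual planner or job queue.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: name -> (prerequisites, estimated USD cost, assumed score gain).
STEPS = {
    "sync": ((), 0.0, 5),
    "extract": (("sync",), 0.0, 10),
    "embed": (("sync",), 2.0, 15),
    "consolidate": (("extract", "embed"), 4.0, 10),
}


def remediate(score, target_score, max_usd, steps=STEPS):
    """Run steps in dependency order until the target score or budget is hit.

    Returns (steps_run, final_score, usd_spent).
    """
    graph = {name: set(deps) for name, (deps, _, _) in steps.items()}
    spent, ran = 0.0, []
    for name in TopologicalSorter(graph).static_order():
        if score >= target_score:
            break  # healthy enough, stop early
        _, cost, gain = steps[name]
        if spent + cost > max_usd:
            break  # refuse to spend past the cap
        spent += cost
        score += gain  # stand-in for re-checking the health score
        ran.append(name)
    return ran, score, spent


ran, score, spent = remediate(score=60, target_score=90, max_usd=5)
print(f"ran {len(ran)} steps, score {score}, spent ${spent:.2f}")
```

The two stop conditions mirror the --target-score and --max-usd flags: the loop ends as soon as either the score is good enough or the next step would exceed the budget.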

Or run it as a daemon:

gbrain autopilot --install        # cron-driven, 5-minute tick

Healthy brains sleep for 60 minutes between ticks. Unhealthy ones get the full overnight cycle: sync, extract, embed, consolidate, synthesize. Three phases (synthesize, patterns, consolidate) are protected so an MCP-connected agent can’t silently burn API credits.

For ad-hoc background work, the Minions queue takes shell jobs and LLM subagent jobs side by side:

gbrain jobs submit sync --params '{}' --follow
gbrain jobs stats
gbrain jobs work --queue default

One PGLite caveat: gbrain jobs supervisor (the auto-restarting worker daemon) is Postgres-only. PGLite’s exclusive file lock blocks the separate worker process, and the CLI rejects with a clear error if config.engine === 'pglite'. If you’re on PGLite, stick with inline --follow jobs for the tutorial, or run gbrain migrate --to supabase before standing up a persistent worker.

Routing rule: deterministic work (pull tweets, parse JSON, write a page) goes to Minions; judgment work (triage an inbox, assess priority) goes to LLM sub-agents.

What just happened, in one diagram

markdown files  ──>  PGLite + pgvector  <──>  43 skills
(your repo,          (hybrid retrieval +      (HOW to use the brain;
 source of truth)     typed graph)             RESOLVER.md routes intent)
       ▲                                              │
       └──────────────  agent reads/writes  ──────────┘

The markdown repo is the system of record. GBrain is the retrieval + graph layer over it. The agent reads and writes through both, and humans can always open any .md file and edit it directly; gbrain sync picks up the change.

Where to go next

  • One-line capture (new in v0.38): gbrain capture "the thought I want to remember" lands directly in inbox/YYYY-MM-DD-<hash>. Also accepts --file, --stdin, and webhook ingestion via gbrain serve --http /ingest.
  • Migrate to Supabase when your brain outgrows local (PGLite is fine up to ~50K pages): gbrain migrate --to supabase.
  • Ingest real data with one of the recipes: voice (Twilio + OpenAI Realtime), email + calendar, 16 embedding providers, credential gateway.
  • Run the benchmarks in the sibling repo gbrain-evals: BrainBench (synthetic) and gbrain eval longmemeval (the public LongMemEval benchmark).
  • Author your own skills. A skill is a fat markdown file that encodes a workflow: triggers, checks, quality gate. gbrain check-resolvable validates the skill tree for reachability / MECE / DRY.

The deeper bet behind GBrain is that a thin harness with fat skills beats thin skills behind a fat agent. The runtime stays small; the intelligence lives in markdown files the agent reads at decision time. Each commit you make to your brain repo is permanent context your agent inherits the next time it wakes up. The longer you run it, the smarter it gets.

Key Takeaways

  • GBrain (v0.38.2.0) gives AI agents a persistent, markdown-first memory layer, built by Garry Tan to power his own OpenClaw/Hermes deployments holding 146,646 pages and 24,585 people.
  • Install runs locally in ~30 minutes on PGLite (Postgres 17 compiled to WASM, zero server) and scales to Supabase or self-hosted Postgres when needed.
  • Every wikilink is parsed by a regex inference cascade (FOUNDED → INVESTED → ADVISES → WORKS_AT) that writes typed graph edges with zero LLM calls.
  • Hybrid search (vector + BM25 + RRF + ZeroEntropy reranker) hits P@5 49.1% / R@5 97.9% on BrainBench, a +31.4-point P@5 lift over the graph-disabled variant.
  • Exposes 74 tools over MCP: wire it into Claude Code with a single claude mcp add gbrain -- gbrain serve and your agent can read/write the brain directly.


The post A Step-by-Step Coding Tutorial to Implement GBrain: The Self-Wiring Memory Layer Built by Y Combinator’s Garry Tan for AI Agents appeared first on MarkTechPost.
