Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context
Foundation fashions maintain getting stronger, but they nonetheless stall on the identical factor: context. A mannequin can write code or analyze a dataset, however solely with the fitting inner data. That data contains desk schemas, metric definitions, runbooks, be a part of paths and it lives scattered throughout catalogs, wikis, and some senior engineers’ heads.
Google Cloud launched the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki sample into a conveyable, interoperable format. It is a vendor-neutral, agent- and human-friendly customary for the context trendy AI techniques want.
Open Knowledge Format (OKF)
OKF is a format, not a service or a platform. OKF v0.1 represents data as a listing of markdown information with YAML frontmatter. A small set of agreed-upon conventions lets wikis written by one producer be consumed by a special agent with out translation.
That is the entire concept. There isn’t any compression scheme, no new runtime, and no required SDK. A bundle of OKF paperwork is simply markdown, simply information, and simply YAML frontmatter. It renders on GitHub, ships as a tarball, and mounts on any filesystem.
If you may have used Obsidian, Notion, or Hugo, the form will really feel acquainted. OKF solely formalizes the conventions wanted to make these patterns interoperable.
The Fragmented Context Problem
In most organizations, mannequin context is overwhelmingly inner data. Today it sits in incompatible silos: metadata catalogs with their very own APIs, wikis, shared drives, code feedback, and docstrings.
Ask an agent ‘How do I compute weekly energetic customers from our occasion stream?’ It should assemble that reply from scattered, mutually incompatible surfaces. Every vendor affords its personal catalog, SDK, and knowledge-graph schema. None of the data is moveable throughout merchandise or organizations.
The result’s duplicated effort. Every agent builder solves the identical context-assembly downside from scratch. Every catalog vendor reinvents the identical knowledge fashions.
Andrej Karpathy articulated the underlying idea in his April 2026 LLM Wiki gist. His level: LLMs don’t get bored, don’t forget to replace cross-references, and might edit many information in a single cross. The bookkeeping that makes people abandon private wikis is precisely what LLMs deal with properly.
The identical sample retains reappearing underneath completely different names. Examples embrace Obsidian vaults wired to coding brokers, the AGENTS.md and CLAUDE.md conference information, and ‘metadata as code’ repos. Each occasion is bespoke, so none of them interoperate. OKF standardizes that interoperability layer so brokers can do the heavy lifting.
How OKF Works: The Design in One Screen
An OKF bundle is a listing of markdown information representing ideas — tables, datasets, metrics, playbooks, runbooks, or APIs. Each idea is one file, and the file path is its id.
gross sales/
├── index.md
├── datasets/
│ ├── index.md
│ └── orders_db.md
├── tables/
│ ├── index.md
│ ├── orders.md
│ └── prospects.md
└── metrics/
├── index.md
└── weekly_active_users.md
Each idea carries a small YAML front-matter block, then a markdown physique for all the pieces else.
---
sort: BigQuery Table
title: Orders
description: One row per accomplished buyer order.
useful resource: https://console.cloud.google.com/bigquery?p=acme&d=gross sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
| Column | Type | Description |
|---------------|--------|------------------------------------------|
| `order_id` | STRING | Globally distinctive order identifier. |
| `customer_id` | STRING | FK to [customers](/tables/prospects.md). |
The reserved structured fields are sort, title, description, useful resource, tags, and timestamp. Concepts hyperlink to one another with regular markdown hyperlinks. Those hyperlinks flip the listing right into a graph that’s richer than file-system father or mother/baby relationships. Bundles can optionally embrace index.md information for progressive disclosure and log.md information for change historical past.
Three Principles Behind the Design
- Minimally opinionated: OKF requires precisely one area on each idea:
sort. Everything else is left to the producer. The spec defines the interoperability floor, not the content material mannequin. - Producer/client independence: A human-written bundle could be learn by an agent. A pipeline-generated bundle could be browsed in a visualizer. The format is the contract; tooling at every finish is swappable.
- Format, not platform: OKF is tied to no cloud, database, mannequin supplier, or agent framework. It won’t ever require a proprietary account to learn, write, or serve.
Use Cases, With Examples
- Data group metadata-as-code: Export BigQuery desk and metric definitions as a bundle. Commit it subsequent to the SQL it describes, and evaluate adjustments via pull requests.
- Incident runbooks for brokers: Store every runbook as an idea. An on-call agent reads
index.md, follows cross-links, and resolves the be a part of path it wants. - Cross-org data trade: A vendor ships a catalog export as OKF. Your agent consumes it immediately, with no integration work.
- Developer-team wiki: Replace a stale Notion or Obsidian area with versioned markdown that an agent retains present.
How OKF Compares
| Approach | Storage | Schema required | Portable | SDK/registry | Agent-readable |
|---|---|---|---|---|---|
| OKF v0.1 | Markdown + YAML information | Only sort |
Yes | No | Yes, no translation |
| Notion | Proprietary DB | Per-workspace | Export-only | API wanted | Via API |
| Obsidian vault | Markdown information | None enforced | Yes | No | Bespoke conventions |
| Metadata catalog | Vendor retailer | Vendor schema | Export-only | Vendor SDK | Vendor-specific |
| RAG index | Vector retailer | Embedding mannequin | No | Yes | Chunks, not ideas |
The distinction from RAG is beneficial for builders. RAG re-derives data at question time from uncooked chunks. An OKF bundle shops curated, cross-linked ideas that an agent reads and updates immediately.
A Minimal OKF Consumer
OKF is parseable with customary instruments. This reads a bundle and builds its hyperlink graph.
import pathlib, re, yaml
def load_bundle(root):
ideas, hyperlinks = {}, []
for path in pathlib.Path(root).rglob("*.md"):
textual content = path.read_text()
meta = {}
if textual content.startswith("---"):
_, fm, physique = textual content.cut up("---", 2)
meta = yaml.safe_load(fm) or {}
else:
physique = textual content
ideas[str(path)] = meta # sort, title, tags, and so on.
for goal in set(re.findall(r"]((/[^)]+.md))", physique)):
hyperlinks.append((str(path), goal)) # markdown cross-links
return ideas, hyperlinks
ideas, graph = load_bundle("gross sales/")
No backend or set up is required to learn or serve a bundle. The identical information dwell in model management beside the code they describe.
Key Takeaways
- Google’s Open Knowledge Format (OKF) v0.1 formalizes the LLM-wiki sample into a conveyable, vendor-neutral spec.
- A bundle is only a listing of markdown information with YAML frontmatter—no SDK, runtime, or registry.
- Every idea requires just one area,
sort; cross-links between information type the data graph. - Google shipped reference instruments: a BigQuery enrichment agent, a static HTML visualizer, and three pattern bundles.
- Unlike RAG, OKF shops curated, version-controlled ideas that brokers learn and replace immediately.
Check out the Technical details here. Also, be at liberty to observe us on Twitter and don’t neglect to hitch our 150k+ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.
Need to accomplice with us for selling your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar and so on.? Connect with us
The publish Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context appeared first on MarkTechPost.
