
LLM Wiki — Karpathy's knowledge-compilation pattern

What it is

The LLM Wiki is a knowledge management pattern introduced by AI researcher Andrej Karpathy in a GitHub Gist published April 4, 2026 (5,000+ stars and 5,000+ forks within weeks). The proposal is an alternative to retrieval-augmented generation for personal and team knowledge work. The core idea is simple: instead of asking an LLM to retrieve and synthesise knowledge anew on every query, perform that synthesis once — at ingest time — and store the result as a persistent, interlinked collection of Markdown files that the LLM maintains and a human reads. Karpathy calls the result "a persistent, compounding artifact": the wiki grows richer with every source added and every question asked, rather than resetting to zero at each session boundary.

The paradigm shift: ingest-time vs query-time compilation

The intellectual core of the LLM Wiki pattern is the distinction between two moments at which an LLM can reason about a knowledge base:

Query-time assembly (RAG): A new source is indexed as raw chunks. When a question arrives, the model retrieves relevant chunks and synthesises an answer from scratch. Nothing is built up between sessions.

Ingest-time compilation (LLM Wiki): A new source is read; its key information is extracted and integrated into a persistent wiki — updating entity pages, revising concept summaries, noting contradictions, strengthening cross-references. By the time a question arrives, the synthesis already exists as a compiled page.

The fundamental trade-off with RAG is freshness versus coherence: RAG is fresher; the LLM Wiki is more coherent. For slowly changing domains (research, competitive analysis, personal self-tracking), coherence wins clearly.

This compile-once philosophy parallels what Pinecone is doing with Nexus at the infrastructure layer. The difference is the layer of abstraction: Nexus compiles at the retrieval-infrastructure level, as a commercial platform with KnowQL; the LLM Wiki compiles at the knowledge-representation level, as plain Markdown files. See pinecone-nexus for the infrastructure-level counterpart.

| Dimension | Traditional RAG | LLM Wiki | Pinecone Nexus |
|---|---|---|---|
| When synthesis happens | Query time | Ingest time | Compile time (Build Loop) |
| Output format | Ephemeral answer | Persistent Markdown | Typed, structured artifact |
| Maintained by | Nobody (resets) | LLM (persistent) | Pinecone Build Loop |
| Scale sweet spot | Millions of docs | Hundreds–thousands | Enterprise data sources |
| Freshness | High (instant index) | Depends on ingest cadence | Depends on context rebuild |
| Coherence | Variable (re-derived) | High (pre-compiled) | Very high (deterministic) |

Three-layer architecture

Karpathy defines three layers with strict ownership rules:

Layer 1 — Raw sources (raw/) — Immutable. Human writes; LLM reads only. Original source documents: articles, papers, PDFs, images, transcripts, data files. Never modified by the LLM. Source of truth.

Layer 2 — The wiki (wiki/) — LLM-owned, human-readable. Contains: wiki/index.md (master catalog), wiki/log.md (append-only history), wiki/hot.md (~500-word hot cache loaded at session start), plus sources/, entities/, concepts/, comparisons/, syntheses/ subdirectories.

The compounding loop: queries that yield valuable analyses can be filed back as new comparison or synthesis pages. External sources enrich the wiki at ingest; the reader's own explorations enrich it at query time.

Layer 3 — The schema (CLAUDE.md / AGENTS.md) — Co-evolved operational contract. Tells the LLM how the wiki is structured and what workflows to follow. Without it, each session starts from zero. File naming: CLAUDE.md for Claude Code, AGENTS.md for Codex (cross-vendor Linux Foundation standard), OPENCODE.md for OpenCode/Pi.
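
Put together, the three layers give a repository shaped roughly like this (the directory names come from the layer descriptions above; the example source filenames are illustrative):

project/
├── CLAUDE.md                  # Layer 3: schema / operational contract
├── raw/                       # Layer 1: immutable, human-owned
│   ├── articles/clipped-article.md
│   └── papers/example-paper.pdf
└── wiki/                      # Layer 2: LLM-owned, human-readable
    ├── index.md               # master catalog
    ├── log.md                 # append-only history
    ├── hot.md                 # ~500-word hot cache, read at session start
    ├── sources/
    ├── entities/
    ├── concepts/
    ├── comparisons/
    └── syntheses/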

Karpathy's IDE metaphor: "Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase."

L1 vs L2 memory: the routing test

The ScrapingArt implementation guide (v1.1) adds a practical extension: a distinction between L1 memory (auto-loaded at every session start) and L2 memory (the wiki, on-demand):

  • L1 (auto-loaded): Hard constraints the LLM must never operate without. Stored in .claude/memory/ (Claude Code), ~/.agents/memory/ (Codex). Kept small.
  • L2 (on-demand): The wiki. Loaded when relevant. Can be as large as needed.

The routing test — the "Dangerous or Embarrassing Test": if the LLM making a mistake without this knowledge would be dangerous or embarrassing, put it in L1. If merely inconvenient, put it in L2.
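
Applied to a hypothetical health-tracking wiki, the test routes knowledge like this (the file names and rules are illustrative, not from the guide):

.claude/memory/constraints.md (L1: a mistake here is dangerous or embarrassing)
- Never state a medication dosage without citing a raw/ source.
- The user's employer is confidential; keep it out of wiki pages.

wiki/concepts/sleep-consolidation.md (L2: a miss here is merely inconvenient)
- Full literature notes on sleep and memory, loaded only when a query touches them.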

Three core operations

Ingest

Trigger: drop a file in raw/ and say "ingest raw/path/to/file.md."

Karpathy: "A single source might touch 10–15 wiki pages."

What the LLM does: read source → discuss 3–5 key takeaways → create wiki/sources/summary-{slug}.md → update wiki/index.md → update all relevant concept/entity pages → flag contradictions with > [!contradiction] → append to wiki/log.md.
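
A sketch of the compiled source page such an ingest might produce. The [[wiki-link]] and > [!contradiction] syntax come from the pattern itself; the frontmatter fields, paper, and takeaways are illustrative:

---
title: "Summary: Example Sleep Paper"
source: raw/papers/example-sleep-paper.pdf
ingested: 2026-05-13
---
## Key takeaways
1. Sleep before learning matters as much as sleep after ([[sleep-consolidation]]).
2. Effect sizes shrink when sleep is measured objectively rather than self-reported.
3. The effect compounds with consistent schedules across weeks ([[self-tracking]]).

> [!contradiction] [[smith-2024-summary]] reports the opposite direction for takeaway 2; both claims logged in wiki/log.md.

Related: [[memory]], [[self-tracking]]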

Query

Trigger: you ask a question.

What the LLM does: read wiki/index.md → read relevant pages → synthesise answer with [[wiki-link]] citations → offer to file valuable analyses as new pages.
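
A hypothetical tail of such an answer, showing the citation style and the filing offer:

…the effect replicates in three of the four ingested studies ([[sleep-consolidation]], [[smith-2024-summary]]), with the dissent noted in [[summary-example-sleep-paper]].

This comparison isn't filed anywhere yet. Want me to save it as wiki/comparisons/sleep-effect-size.md?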

Lint

Trigger: periodic health check (weekly recommended).

Produces: contradictions between pages, orphan pages (zero inbound links), concepts mentioned 3+ times without a page, stale claims, 3–5 suggested investigations.
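
The gist does not fix a report format; a plausible sketch of one lint pass's output (all page names hypothetical):

## Lint report, 2026-05-13
- Contradiction: [[sleep-consolidation]] vs [[smith-2024-summary]] on effect direction
- Orphan (0 inbound links): wiki/entities/example-lab.md
- Missing page: "context rot" mentioned 4 times across 3 pages, no concept page yet
- Stale claim: [[pricing-snapshot]] untouched since 2026-02-01 despite two newer sources
- Suggested investigations: reconcile the effect-direction dispute; ingest the 2026 replication; merge duplicate entity pages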

CLAUDE.md schema template

# CLAUDE.md — Master Schema
## Domain
[REPLACE WITH YOUR TOPIC]
## Project Structure
- `raw/`           — immutable source documents. NEVER modify.
- `wiki/`          — LLM-generated wiki. You own this layer entirely.
- `wiki/index.md`  — master catalog. Update on EVERY ingest.
- `wiki/log.md`    — append-only activity log. Never delete entries.
- `wiki/hot.md`    — session hot cache (~500 words). Read silently at session start.
## Safety Rules
- NEVER write to raw/. Hard constraint.
- NEVER delete wiki pages. Mark as deprecated in frontmatter.
- Cross-reference all new pages to at least 2 existing pages.

Implementation across agent platforms

Claude Code — uses CLAUDE.md. Reads files, edits Markdown across directories, runs shell commands, maintains persistent L1 in .claude/memory/. Custom skills in .claude/skills/ wrap ingest/query/lint as slash commands.
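
A minimal sketch of wrapping the ingest operation as a skill, assuming Claude Code's SKILL.md convention (YAML frontmatter with name and description); the body text is illustrative, not from the gist:

.claude/skills/ingest/SKILL.md:

---
name: ingest
description: Compile a raw source document into the wiki per CLAUDE.md
---
Given a path under raw/, read the source, surface 3–5 key takeaways for
discussion, create wiki/sources/summary-{slug}.md, update wiki/index.md and
every affected entity/concept page, flag contradictions with
> [!contradiction], and append an entry to wiki/log.md.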

OpenAI Codex — uses AGENTS.md. Same three-layer architecture; L1 lives in ~/.agents/memory/. A nested AGENTS.md at a sub-path takes precedence over the root file.

GitHub Copilot agent mode — uses .github/copilot-instructions.md as schema. Copilot agent mode in VS Code can read/write files across the repo.

Obsidian as browsing surface — Karpathy describes keeping Obsidian open beside the agent, following links in real time, checking the Graph view (node-link visualisation of cross-references). Obsidian's Web Clipper captures web articles into raw/articles/.

Scaling considerations

The 10-Source Test: In production testing, a 90%-finished wiki performed 17% worse than a complete one, because the gaps were answered with high confidence rather than flagged. Do not rely on the wiki for important decisions until it reaches meaningful coverage.

The single-writer rule: Multiple concurrent agent sessions writing to the same wiki file can cause silent data corruption. Default to sequential single-agent writes.

Limitations and criticisms

  • Scale wall. Once the wiki exceeds moderate size, retrieval, ranking, and reranking all return as problems. Pinecone Nexus addresses this at the infrastructure layer.
  • Silent corruption risk. The same process reads and writes; without provenance tracking and audit logs, corruption is hard to detect.
  • Maintenance discipline required. Freshness depends entirely on ingest cadence.
  • No canonical standard. Parsing, normalisation, chunking, citation schema are all left to the implementer.
  • "Wiki" is contested. A Markdown corpus maintained by one agent, without collaborative editing or formal governance, is not a wiki in the traditional sense.

For enterprise use, the minimum requirements are claim-level provenance, access control, snapshotting, rollback, approval queues, and a clean separation between authoritative sources and AI-derived interpretation.

Use cases

From Karpathy's gist: personal tracking (goals, health, self-improvement), deep research (papers and articles compiled over months), book companions (character and theme pages as you read), business/team wikis (fed by Slack threads and meeting transcripts), competitive analysis, due diligence, hobby deep-dives.

Karpathy's scale observation: the compounding effect becomes clearly visible at roughly 100 articles and 400,000 words.

Sources

  1. Andrej Karpathy — LLM Wiki gist (April 4, 2026, 5,000+ stars)
  2. AI Critique — Karpathy's LLM Wiki and the future of enterprise knowledge (May 8, 2026)
  3. ScrapingArt/Karpathy-LLM-Wiki-Stack v1.1 — complete technical blueprint (GitHub, April 2026)

Changelog

  • 2026-05-13 — Page created from Karpathy's gist (Type A primary, 5,000+ stars) + AI Critique analysis (Type C) + ScrapingArt implementation guide (Type C); 3 sources; confidence 65 (moderate). Topic: ingest-time knowledge compilation via LLM-maintained Markdown wiki.