LLM Wiki — Karpathy's knowledge-compilation pattern¶
What it is¶
The LLM Wiki is a knowledge management pattern introduced by AI researcher Andrej Karpathy in a GitHub Gist published April 4, 2026 (5,000+ stars and 5,000+ forks within weeks). The proposal is an alternative to retrieval-augmented generation (RAG) for personal and team knowledge work. The core idea is simple: instead of asking an LLM to retrieve and synthesise knowledge anew on every query, perform that synthesis once — at ingest time — and store the result as a persistent, interlinked collection of Markdown files that the LLM maintains and a human reads. Karpathy calls the result "a persistent, compounding artifact": the wiki grows richer with every source added and every question asked, rather than resetting to zero at each session boundary.
The paradigm shift: ingest-time vs query-time compilation¶
The intellectual core of the LLM Wiki pattern is the distinction between two moments at which an LLM can reason about a knowledge base:
Query-time assembly (RAG): A new source is indexed as raw chunks. When a question arrives, the model retrieves relevant chunks and synthesises an answer from scratch. Nothing is built up between sessions.
Ingest-time compilation (LLM Wiki): A new source is read; its key information is extracted and integrated into a persistent wiki — updating entity pages, revising concept summaries, noting contradictions, strengthening cross-references. By the time a question arrives, the synthesis already exists as a compiled page.
The fundamental trade-off versus RAG is freshness versus coherence: RAG answers from a continuously fresh index but re-derives its synthesis on every query, while the LLM Wiki serves a pre-compiled synthesis that is only as current as its last ingest. For slowly-changing domains (research, competitive analysis, personal self-tracking), coherence clearly wins.
This compile-once philosophy maps to what Pinecone is doing with Nexus at the infrastructure layer. The difference is the layer of abstraction: Nexus compiles at the retrieval-infrastructure level with a commercial platform and KnowQL; the LLM Wiki compiles at the knowledge-representation level with Markdown files. See pinecone-nexus for the infrastructure-level equivalent.
| Dimension | Traditional RAG | LLM Wiki | Pinecone Nexus |
|---|---|---|---|
| When synthesis happens | Query time | Ingest time | Compile time (Build Loop) |
| Output format | Ephemeral answer | Persistent Markdown | Typed, structured artifact |
| Maintained by | Nobody (resets) | LLM (persistent) | Pinecone Build Loop |
| Scale sweet spot | Millions of docs | Hundreds–thousands | Enterprise data sources |
| Freshness | High (instant index) | Depends on ingest cadence | Depends on context rebuild |
| Coherence | Variable (re-derived) | High (pre-compiled) | Very high (deterministic) |
Three-layer architecture¶
Karpathy defines three layers with strict ownership rules:
Layer 1 — Raw sources (raw/) — Immutable. Human writes; LLM reads only. Original source documents: articles, papers, PDFs, images, transcripts, data files. Never modified by the LLM. Source of truth.
Layer 2 — The wiki (wiki/) — LLM-owned, human-readable. Contains: wiki/index.md (master catalog), wiki/log.md (append-only history), wiki/hot.md (~500-word hot cache loaded at session start), plus sources/, entities/, concepts/, comparisons/, syntheses/ subdirectories.
The compounding loop: queries that yield valuable analyses can be filed back as new comparison or synthesis pages. External sources enrich the wiki at ingest; the reader's own explorations enrich it at query time.
Layer 3 — The schema (CLAUDE.md / AGENTS.md) — Co-evolved operational contract. Tells the LLM how the wiki is structured and what workflows to follow. Without it, each session starts from zero. File naming: CLAUDE.md for Claude Code, AGENTS.md for Codex (cross-vendor Linux Foundation standard), OPENCODE.md for OpenCode/Pi.
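For concreteness, here is a minimal Python sketch that scaffolds this three-layer layout on disk. Directory and file names follow the conventions above; the stub contents and the `scaffold` helper itself are illustrative assumptions, not part of the pattern:

```python
from pathlib import Path

# Layer 1 (raw/), Layer 2 (wiki/ and its subdirectories), Layer 3 (CLAUDE.md).
DIRS = [
    "raw",
    "wiki/sources", "wiki/entities", "wiki/concepts",
    "wiki/comparisons", "wiki/syntheses",
]
STUBS = {
    "wiki/index.md": "# Index\n\nMaster catalog. Updated on every ingest.\n",
    "wiki/log.md": "# Log\n\nAppend-only activity history.\n",
    "wiki/hot.md": "# Hot cache\n\n~500-word summary read at session start.\n",
    "CLAUDE.md": "# CLAUDE.md — Master Schema\n",
}

def scaffold(root: str = ".") -> None:
    base = Path(root)
    for d in DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    for name, text in STUBS.items():
        path = base / name
        if not path.exists():  # never clobber an existing wiki
            path.write_text(text, encoding="utf-8")

if __name__ == "__main__":
    scaffold()
```

Run once at project start; from then on the human writes only to raw/ and the LLM owns everything under wiki/.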
Karpathy's IDE metaphor: "Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase."
L1 vs L2 memory: the routing test¶
The ScrapingArt implementation guide (v1.1) adds a practical extension: a distinction between L1 memory (auto-loaded at every session start) and L2 memory (the wiki, on-demand):
- L1 (auto-loaded): Hard constraints the LLM must never operate without. Stored in `.claude/memory/` (Claude Code) or `~/.agents/memory/` (Codex). Kept small.
- L2 (on-demand): The wiki. Loaded when relevant. Can be as large as needed.
The routing test — the "Dangerous or Embarrassing Test": if a mistake the LLM could make without this knowledge would be dangerous or embarrassing, put it in L1; if merely inconvenient, put it in L2.
Three core operations¶
Ingest¶
Trigger: drop a file in raw/ and say "ingest raw/path/to/file.md".
What the LLM does: read source → discuss 3–5 key takeaways → create wiki/sources/summary-{slug}.md → update wiki/index.md → update all relevant concept/entity pages → flag contradictions with > [!contradiction] → append to wiki/log.md.
Karpathy: "A single source might touch 10–15 wiki pages."
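The synthesis in these steps is the LLM's job, but the surrounding bookkeeping is mechanical. A hedged sketch of that bookkeeping half, assuming the file conventions above (the `slugify` rule and the function signature are illustrative, not from the gist):

```python
import re
from datetime import date
from pathlib import Path

def slugify(title: str) -> str:
    # Lowercase, replace non-alphanumerics with hyphens (assumed convention).
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def file_ingest(source_path: str, title: str, summary_md: str) -> Path:
    """File an LLM-written summary and append to the activity log.

    `summary_md` is produced by the LLM (takeaways, [[wiki-links]],
    contradictions flagged with `> [!contradiction]`); this only files it.
    Updating index.md and the touched concept/entity pages stays with the LLM.
    """
    wiki = Path("wiki")
    page = wiki / "sources" / f"summary-{slugify(title)}.md"
    page.write_text(
        f"# {title}\n\nSource: `{source_path}`\n\n{summary_md}\n",
        encoding="utf-8",
    )
    with (wiki / "log.md").open("a", encoding="utf-8") as log:  # append-only
        log.write(f"- {date.today().isoformat()}: ingested `{source_path}` "
                  f"-> [[{page.stem}]]\n")
    return page
```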
Query¶
Trigger: you ask a question.
What the LLM does: read wiki/index.md → read relevant pages → synthesise answer with [[wiki-link]] citations → offer to file valuable analyses as new pages.
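The deterministic part of a query is assembling context from the index and candidate pages before the model synthesises. A rough sketch, with naive keyword overlap standing in for the LLM's own page selection (every name and threshold here is an assumption):

```python
from pathlib import Path

def gather_context(question: str, wiki_dir: str = "wiki", limit: int = 5) -> str:
    """Return the index plus the top-scoring pages for a question."""
    wiki = Path(wiki_dir)
    parts = [(wiki / "index.md").read_text(encoding="utf-8")]
    words = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    scored = []
    for page in wiki.rglob("*.md"):
        if page.name in ("index.md", "log.md", "hot.md"):
            continue
        text = page.read_text(encoding="utf-8")
        score = sum(text.lower().count(w) for w in words)
        if score > 0:
            scored.append((score, page, text))
    # Highest keyword overlap first; feed the winners to the model as context.
    for _, page, text in sorted(scored, key=lambda t: t[0], reverse=True)[:limit]:
        parts.append(f"\n---\n<!-- {page} -->\n{text}")
    return "\n".join(parts)
```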
Lint¶
Trigger: periodic health check (weekly recommended).
Produces: contradictions between pages, orphan pages (zero inbound links), concepts mentioned 3+ times without a page, stale claims, 3–5 suggested investigations.
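Two of these checks are mechanical enough to script without an LLM. A sketch that walks the `[[wiki-link]]` graph to find orphans and missing pages, using inbound link count as a proxy for "mentions" (the regex and bookkeeping details are assumptions):

```python
import re
from collections import Counter
from pathlib import Path

# Target of a [[wiki-link]], before any |alias or #anchor suffix.
LINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki_dir: str = "wiki") -> None:
    pages = {p.stem: p for p in Path(wiki_dir).rglob("*.md")}
    inbound: Counter = Counter()   # links whose target page exists
    missing: Counter = Counter()   # links whose target page does not exist
    for page in pages.values():
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            bucket = inbound if target.strip() in pages else missing
            bucket[target.strip()] += 1
    for stem, path in pages.items():
        if inbound[stem] == 0 and stem not in ("index", "log", "hot"):
            print(f"orphan page (zero inbound links): {path}")
    for target, n in missing.items():
        if n >= 3:  # mentioned 3+ times without a page of its own
            print(f"missing page: [[{target}]] linked {n} times")
```

Contradiction detection and staleness checks still need the LLM's judgment; the lint pass above just narrows where it should look.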
CLAUDE.md schema template¶
```markdown
# CLAUDE.md — Master Schema

## Domain
[REPLACE WITH YOUR TOPIC]

## Project Structure
- `raw/` — immutable source documents. NEVER modify.
- `wiki/` — LLM-generated wiki. You own this layer entirely.
  - `wiki/index.md` — master catalog. Update on EVERY ingest.
  - `wiki/log.md` — append-only activity log. Never delete entries.
  - `wiki/hot.md` — session hot cache (~500 words). Read silently at session start.

## Safety Rules
- NEVER write to raw/. Hard constraint.
- NEVER delete wiki pages. Mark as deprecated in frontmatter.
- Cross-reference all new pages to at least 2 existing pages.
```
Implementation across agent platforms¶
Claude Code — uses CLAUDE.md. Reads files, edits Markdown across directories, runs shell commands, maintains persistent L1 in .claude/memory/. Custom skills in .claude/skills/ wrap ingest/query/lint as slash commands.
OpenAI Codex — uses AGENTS.md. Same three-layer architecture; L1 in ~/.agents/memory/. Nested AGENTS.md at sub-paths takes precedence.
GitHub Copilot agent mode — uses .github/copilot-instructions.md as schema. Copilot agent mode in VS Code can read/write files across the repo.
Obsidian as browsing surface — Karpathy describes keeping Obsidian open beside the agent, following links in real time, checking the Graph view (node-link visualisation of cross-references). Obsidian's Web Clipper captures web articles into raw/articles/.
Scaling considerations¶
The 10-Source Test: do not rely on the wiki for important decisions until it reaches meaningful coverage. In production testing, a 90%-finished wiki performed 17% worse than a complete one, because its gaps were masked by high-confidence answers.
The single-writer rule: Multiple concurrent agent sessions writing to the same wiki file can cause silent data corruption. Default to sequential single-agent writes.
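One way to enforce the single-writer rule at the filesystem level is an advisory lock that each session must hold while writing. A minimal POSIX-only sketch (the lock-file location and helper name are assumptions):

```python
import fcntl
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def wiki_write_lock(wiki_dir: str = "wiki"):
    """Block until this process is the wiki's sole writer (POSIX only)."""
    lock_path = Path(wiki_dir) / ".write-lock"
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # exclusive; concurrent writers wait
        try:
            yield
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)

# Usage: every agent write goes inside the lock.
# with wiki_write_lock():
#     with Path("wiki/log.md").open("a") as log:
#         log.write("- entry\n")
```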
Limitations and criticisms¶
- Scale wall. Once the wiki exceeds moderate size, retrieval, ranking, and reranking all return as problems. Pinecone Nexus addresses this at the infrastructure layer.
- Silent corruption risk. The same process reads and writes; without provenance tracking and audit logs, corruption is hard to detect.
- Maintenance discipline required. Freshness depends entirely on ingest cadence.
- No canonical standard. Parsing, normalisation, chunking, citation schema are all left to the implementer.
- "Wiki" is contested. A Markdown corpus maintained by one agent, without collaborative editing or formal governance, is not a wiki in the traditional sense.
For enterprise use, minimum requirements: claim-level provenance, access control, snapshotting, rollback, approval queues, and separation between authoritative sources and AI-derived interpretation.
Use cases¶
From Karpathy's gist: personal tracking (goals, health, self-improvement), deep research (papers and articles compiled over months), book companions (character and theme pages as you read), business/team wikis (fed by Slack threads and meeting transcripts), competitive analysis, due diligence, hobby deep-dives.
Karpathy's scale observation: the compounding effect becomes clearly visible at roughly 100 articles and 400,000 words.
Sources¶
- Andrej Karpathy — LLM Wiki gist (April 4, 2026, 5,000+ stars)
- AI Critique — Karpathy's LLM Wiki and the future of enterprise knowledge (May 8, 2026)
- ScrapingArt/Karpathy-LLM-Wiki-Stack v1.1 — complete technical blueprint (GitHub, April 2026)
Changelog¶
- 2026-05-13 — Page created from Karpathy's gist (Type A primary, 5,000+ stars) + AI Critique analysis (Type C) + ScrapingArt implementation guide (Type C); 3 sources; confidence 65 (moderate). Topic: ingest-time knowledge compilation via LLM-maintained Markdown wiki.