
Agent Skills authoring — overview

This is a vendor-neutral, deeply-worked authoring guide for Agent Skills: file-based, on-demand capabilities that the major AI coding agents — Anthropic Claude, GitHub Copilot, and OpenAI Codex — all now read from the same SKILL.md format. The guide is written so that someone with no prior Skill experience can plan, build, test, and distribute a portable Skill that runs across all three platforms with only narrow vendor-specific tweaks.

The hub page covers the conceptual ground every author needs before opening a text editor: what a Skill is, why it exists, how the three-level disclosure model keeps cost near zero at idle, and how Skills relate to neighbouring extension surfaces such as MCP, sub-agents, hooks, and project-context files (CLAUDE.md / AGENTS.md / copilot-instructions.md). The four sub-pages then go deep on each phase of the authoring lifecycle:

  • agent-skills-planning — picking a use case, defining success criteria, gathering technical requirements, and writing the frontmatter.
  • agent-skills-testing — testing triggering, behaviour, and performance; iterating from concrete failure signals.
  • agent-skills-distribution — packaging, sharing, marketplaces, plugin wrappers, and org-level deployment.
  • agent-skills-vendor-extensions — vendor-specific optional frontmatter, location precedence, and sidecar files (Anthropic allowed-tools; VS Code argument-hint/user-invocable/disable-model-invocation/context; Codex agents/openai.yaml).

The intent is for the hub to be enough to decide whether you need a Skill; the sub-pages are enough to build one well.

1. What an Agent Skill is

An Agent Skill is a directory that gives an agent the procedural knowledge and tools it needs to do a specific job, loaded into the agent's context only when the agent decides the Skill applies. The minimal shape is:

my-skill/
├── SKILL.md          # required: metadata + instructions
├── scripts/          # optional: executable helpers (Python, Bash, Node, …)
├── references/       # optional: detailed docs loaded on demand
└── assets/           # optional: templates, images, fonts, fixtures

SKILL.md is YAML frontmatter followed by a Markdown body. The frontmatter contains the trigger metadata that decides whether the Skill is loaded at all; the body is the actual instructions the agent reads when the Skill is triggered. The directory name must match the name field in frontmatter — that one-to-one mapping is how every host runtime locates the Skill on disk.

A canonical minimal example:

---
name: pdf-processing
description: Extracts text and tables from PDF files, fills PDF forms, and merges PDFs. Use when the user mentions PDFs, forms, or document extraction.
---

# PDF processing

When asked to extract a PDF, run `scripts/extract.py <path>`. For form filling, see `references/FORMS.md`.

## Steps
1. Validate the PDF is readable.
2. Run the extractor script.
3. Return a structured summary.

That is a complete, runnable Skill. Everything else — scripts, references, assets — is optional and is fetched only when the Skill needs it. Skills can ship hundreds of files of additional context without paying for any of it at session start because of the three-level disclosure model described below.
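The `scripts/extract.py` helper that the example cites could be as small as the sketch below. This is a hypothetical illustration, not a vendor-supplied script: it mirrors the three numbered steps (validate, extract, return a structured summary), with a byte-length "extraction" standing in for a real PDF library.

```python
# Hypothetical sketch of scripts/extract.py for the pdf-processing Skill.
import sys
from pathlib import Path


def looks_like_pdf(data: bytes) -> bool:
    """Step 1: a readable PDF starts with a %PDF-<version> header."""
    return data.startswith(b"%PDF-")


def summarize(path: str) -> dict:
    """Steps 2-3: 'extract' and return a structured summary.

    A real extractor would call a PDF library here; this sketch only
    reports the file size so the example stays self-contained.
    """
    raw = Path(path).read_bytes()
    if not looks_like_pdf(raw):
        raise ValueError(f"{path} is not a readable PDF")
    return {"path": path, "bytes": len(raw)}


if __name__ == "__main__":
    print(summarize(sys.argv[1]))
```

Note that the agent runs this script and reads only its printed output, so the implementation's length never touches the context window.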

2. Why Skills exist — the problem they solve

Two problems sat unsolved in the LLM coding-agent space before Skills landed:

  1. Procedural knowledge — agents are good at general programming and bad at your team's specific repeatable workflow (how you write a release note, how you triage a ticket, how you format a PR description, which migration script to call for which table). Re-pasting that procedure into every chat was wasteful and inconsistent.
  2. Context cost — stuffing every useful instruction into a global system prompt or always-loaded CLAUDE.md / AGENTS.md blew the context window apart. Most procedures only apply to a fraction of conversations.

Skills answer both at once: write the procedure down in SKILL.md, install it once, and let the model auto-load it only when the task fits. From the author's side it feels like writing a wiki page; from the agent's side it feels like discovering a new specialist tool just in time. The format scales to hundreds of Skills per project because — and this is the load-bearing trick — only the frontmatter is in the window at idle.

3. The three-level disclosure model

What makes Skills cheap is progressive disclosure: the agent never sees more of a Skill than it needs.

| Level | What is loaded | When | Approx. cost |
|---|---|---|---|
| 1 | YAML frontmatter only (name + description) for every installed Skill | At session start, held in the system prompt | ~100 tokens per Skill |
| 2 | Full SKILL.md Markdown body | When the model decides the Skill applies to the current message | Whatever the body weighs |
| 3 | Files under scripts/, references/, assets/ | On explicit read, or executed without entering the context at all | 0 if the script is merely run |

Practical implications for authoring:

  • Level 1 is precious. The description is your one and only chance to make the model pick the Skill. Vague descriptions chronically under-trigger; over-promising descriptions over-trigger and waste tokens. The planning sub-page covers description writing in detail.
  • Level 2 is where you put procedural instructions: numbered steps, decision criteria, what to do in edge cases. Keep it under a few thousand words and push reference detail down to level 3.
  • Level 3 is the cheat code. A reference Markdown file with 50 pages of compliance rules costs zero tokens until the model opens it. A Python script that does 1,000 lines of heavy lifting costs zero tokens — the model invokes it and reads the output, not the source. Authors who internalize this stop trying to compress prose and start splitting their Skill into a thin "what to do" body and fat "how exactly" reference files.
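The level-1 economics can be made concrete with a sketch of the index-building step a host runtime might perform: parse only the frontmatter of each SKILL.md and keep the body on disk. The naive `key: value` parsing below is an assumption for illustration; real hosts use a proper YAML parser.

```python
# Sketch: extract only the level-1 trigger metadata from a SKILL.md string,
# leaving the body (level 2) and any referenced files (level 3) unloaded.


def level1_entry(skill_md: str) -> dict:
    """Return just the trigger metadata (name + description)."""
    parts = skill_md.split("---\n")
    # parts[0] is the empty text before the first delimiter,
    # parts[1] is the frontmatter, everything after is the body.
    meta = {}
    for line in parts[1].splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return {k: meta[k] for k in ("name", "description") if k in meta}
```

With a hundred Skills installed, only a hundred such entries sit in the system prompt; every body stays at zero cost until triggered.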

OpenAI Codex documents a session-start budget of roughly 2% of the model's context window or 8,000 characters, whichever is smaller — past that ceiling some Skills' level-1 metadata is not visible to the model and they will never auto-trigger. Anthropic and GitHub do not document a numeric cap but the same pressure applies: keep description fields short, imperative, and distinguishable from each other.
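The Codex ceiling is simple arithmetic, sketched below. The function name is an assumption for illustration; the 2% / 8,000-character figures come from the paragraph above.

```python
# Sketch of the documented Codex session-start budget for Skill metadata:
# 2% of the context window (in characters) or 8,000 characters,
# whichever is smaller.


def skills_metadata_budget(context_window_chars: int) -> int:
    return min(int(context_window_chars * 0.02), 8_000)
```

For any window larger than 400,000 characters, the 8,000-character cap is the binding constraint, which is why trimming each description matters more than the size of the model you run on.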

4. When to author a Skill — and when to use something else

Skills are one of several customization surfaces. Pick the wrong surface and you fight the format forever. The decision matrix below is the single most useful piece of authoring advice in this guide.

| If you want to… | Author it as | Surface |
|---|---|---|
| Teach the agent a repeatable workflow / procedure | Agent Skill (SKILL.md) | Cross-vendor open spec |
| Set project conventions every agent should always know | AGENTS.md (cross-vendor) or CLAUDE.md / copilot-instructions.md (vendor-specific) | Always-loaded project context |
| Add a new tool / data source the agent can call | MCP server | Cross-vendor JSON-RPC protocol |
| Provide a one-shot reusable prompt the user types | Prompt File (Copilot .prompt.md) or Claude slash command | Explicit slash command |
| Spin up a focused autonomous worker with a different system prompt | Sub-agent (Claude) or Custom Agent (Copilot, @-mention) | Explicit @-mention |
| Block or audit dangerous tool calls before they run | Hook | Lifecycle event |
| Distribute a bundle of the above to a team | Plugin (Claude .claude-plugin/plugin.json) or repo commit (Copilot, Codex) | Distribution wrapper |

A useful one-line model: Skills supply the what to do; MCP supplies the how to connect. Skills are recipes; MCP is the pantry. They compose — a Skill routinely declares MCP server dependencies (Codex via agents/openai.yaml's dependencies.mcp_servers, Claude via plugin .mcp.json references) — and they do not compete.

Three negative tests worth memorizing:

  • "Should the agent know this all the time?" → Yes means AGENTS.md / CLAUDE.md / copilot-instructions.md, not a Skill.
  • "Is this a tool/API the agent talks to?" → That is MCP, not a Skill. A Skill uses the tool; it doesn't expose it.
  • "Does the user type a slash to invoke it?" → That is a Prompt File or slash command, not a Skill. Skills auto-trigger from description matching.

5. The three categories of Skill use case

Almost every well-scoped Skill falls into one of three categories. Knowing which one you are writing changes the structure of the body and what goes into scripts/ vs references/.

5.1 Document / asset creation

The agent must produce a specific kind of file — a slide deck, a Word document, a PDF, an Excel workbook, a Confluence page, a release-note Markdown file. The Skill body lists the steps; scripts/ holds the converter / generator helper (often the Skill installs a templating library and runs it); assets/ holds templates and fixtures (logos, fonts, blank decks). Anthropic's built-in docx, pptx, xlsx, and pdf Skills are textbook examples.

5.2 Workflow automation

The agent must follow a procedure that crosses multiple tools or systems — onboarding a new hire, triaging a bug report, executing a release, running a postmortem. Skill body lists numbered steps and decision criteria; scripts/ automates the rote operations (calling APIs, formatting output, posting messages); references/ carries policy detail (escalation matrix, who owns what).

5.3 MCP enhancement

The Skill exists to make a generic MCP server feel like a domain-specific tool. Without the Skill, the agent has access to, say, a Jira MCP server but has to invent its own workflow for "create a bug ticket for this incident." With the Skill, the agent knows exactly which fields to fill, which queue to drop the ticket in, and which labels to apply. The Skill is the playbook; MCP is the keyboard.

A Skill that falls into none of these three categories is worth re-checking against the surface decision matrix above — there is a good chance it should be a Prompt File, a sub-agent, or AGENTS.md content instead.

6. The cross-vendor footprint

The same SKILL.md runs on all three major platforms. Vendor differences sit in optional frontmatter extensions, location precedence, and a single Codex sidecar file. The vendor-extensions sub-page covers all of this in depth; the summary below is enough to know what portability looks like.

| Platform | Reads SKILL.md from | Vendor-specific optional fields |
|---|---|---|
| Anthropic Claude (Claude Code, Agent SDK, Claude apps, plugins) | .claude/skills/, ~/.claude/skills/, plugin bundles | allowed-tools (experimental, CLI-only) |
| GitHub Copilot in VS Code (GA), JetBrains (preview), GitHub.com (GA), Copilot CLI (GA) | .github/skills/, .claude/skills/, .agents/skills/ (project); ~/.copilot/skills/, ~/.claude/skills/, ~/.agents/skills/ (personal) | argument-hint, user-invocable, disable-model-invocation, context (inline/fork) |
| OpenAI Codex (CLI, IDE, Cloud) | $CWD/.agents/skills/ walking up to repo root; ~/.agents/skills/; /etc/codex/skills/; system-bundled | Optional sidecar agents/openai.yaml with interface, policy.allow_implicit_invocation, dependencies.mcp_servers |
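The Codex lookup order can be sketched as a directory walk. This is an illustration of the documented precedence, not Codex source; the function name and repo-root parameter are assumptions.

```python
# Sketch: collect candidate skill directories in Codex-style precedence order,
# walking from the working directory up to the repo root, then appending the
# user-level and system-level locations.
from pathlib import Path


def codex_skill_dirs(cwd: Path, repo_root: Path) -> list[Path]:
    dirs = []
    current = cwd.resolve()
    root = repo_root.resolve()
    while True:
        dirs.append(current / ".agents" / "skills")
        # Stop at the repo root (or the filesystem root as a safety net).
        if current == root or current == current.parent:
            break
        current = current.parent
    dirs.append(Path.home() / ".agents" / "skills")
    dirs.append(Path("/etc/codex/skills"))
    return dirs
```

A Skill deeper in the tree therefore shadows one nearer the root, and project-level Skills shadow personal and system ones.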

Three points of practical portability advice:

  1. Pin the portable subset. Use only the two required fields (name, description) plus the open-spec optional fields (license, compatibility, metadata). Treat every other field as a vendor extension and keep them in a clearly labelled section at the bottom of SKILL.md or in the Codex sidecar.
  2. Use .agents/skills/ for cross-vendor projects. Both Copilot and Codex read from .agents/skills/; Claude does not natively but can be pointed at it via a small symlink or per-project setting. The vendor-neutral path is the right default for a Skill you intend to share publicly.
  3. Mind the security-field divergence. Tool-allowlisting is the one area where each vendor invented its own knob: Anthropic allowed-tools (allowlist), Copilot disable-model-invocation (lock to user invocation), Codex policy.allow_implicit_invocation (opt-in to auto-trigger). These do not interoperate; the planning sub-page explains how to write a Skill that degrades gracefully across all three.
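The "pin the portable subset" advice can be enforced mechanically. Below is a sketch of a frontmatter linter that flags non-portable fields; the field lists are taken from this guide's summary, and `vendor_extensions` is a hypothetical helper, not part of any vendor tooling.

```python
# Sketch: flag frontmatter keys outside the open-spec portable subset so that
# every vendor extension in a SKILL.md is a conscious, documented choice.

PORTABLE_FIELDS = {"name", "description", "license", "compatibility", "metadata"}


def vendor_extensions(frontmatter: dict) -> set:
    """Return the frontmatter keys that are vendor extensions."""
    return set(frontmatter) - PORTABLE_FIELDS
```

Running such a check in CI keeps a shared Skill honest: anything the linter flags belongs in the clearly labelled vendor section or the Codex sidecar.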

7. Governance — why this convergence is durable

Skills' cross-vendor support is not a polite gesture between competitors. It rests on an open spec (agentskills.io) and a foundation: the Agentic AI Foundation (AAIF), a Linux Foundation directed fund co-founded by Anthropic, Block, and OpenAI on December 9, 2025. By the Linux Foundation's February 24, 2026 press release the AAIF had 146 members and 8 platinum sponsors — AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, and OpenAI — with David Nalley (AWS) as chair. AAIF stewards the Model Context Protocol (donated by Anthropic, Dec 2025), the AGENTS.md spec (donated by OpenAI), and goose (the reference open-source agent from Block).

The Agent Skills specification itself, donated to the community at agentskills.io on December 18, 2025, is on the same trajectory: open spec, multiple-vendor implementations, neutral governance. This is why "write a SKILL.md once and it runs everywhere" is a strategically durable bet rather than a vendor-of-the-month feature.

8. The author's mental model — one paragraph

A Skill is a directory whose name matches the name field in its frontmatter, with a SKILL.md whose frontmatter is the trigger and whose body is the procedure. The body cites scripts and reference files that cost nothing until they are opened. The description is the only thing the model sees at session start, so it must be specific, imperative, and distinguishable. The same Skill runs on Claude, Copilot, and Codex; vendor-specific knobs go into a small set of optional fields or a Codex sidecar. Skills supply what to do; MCP supplies how to connect; AGENTS.md supplies what this project is. Author the right surface for the question.

9. Reading order

If you are about to author a Skill, read in this order:

  1. agent-skills-planning — pick a use case, define success, gather requirements, write the frontmatter.
  2. agent-skills-vendor-extensions — only if you care about non-Anthropic surfaces.
  3. agent-skills-testing — once you have a draft, this is how you prove it works.
  4. agent-skills-distribution — once it works, this is how you ship it.

If you are deciding whether to author a Skill rather than something else, the decision matrix in §4 above is the only thing you need.

Changelog

  • 2026-05-11 — Hub page created from the verbatim Anthropic authoring guide PDF and cross-vendor deep-research synthesis. Confidence 90 (HIGH); cross-vendor claims validated against vendor primary docs.