Coding agents: side-by-side comparison

By May 2026 the agentic-coding category has settled into a stable quartet of platform-scale offerings: Anthropic's Claude Code (with the underlying Claude Agent SDK), Microsoft's GitHub Copilot family of agent surfaces, OpenAI's Codex stack, and Google's Antigravity platform. Each one ships a multi-surface, multi-trigger experience that can read a repository, edit files, run shell commands, drive a browser, and open a pull request without continuous human steering. Each one wraps a frontier reasoning model inside a sandbox, a permissions model, a configuration file convention, and an MCP-based tool extension surface. They look superficially similar and they all read parts of each other's conventions — yet at the level of execution model, security posture, model lineup, pricing, and target workflow the differences are substantial enough to drive real procurement decisions.

This page is the dedicated cross-vendor reference. It is not a benchmark: there are no SWE-Bench numbers, no head-to-head trial transcripts, and no claim about which agent "wins". It is a feature-level comparison built from the four vendor-specific sub-pages (claude-agents, github-copilot-agents, codex-agents, antigravity-agents) and a reading of each vendor's primary documentation. Every section opens with prose context, presents a comparison table across the four vendors, and then interprets what the table actually means in practice. The scope is the state of the four products in May 2026; the volatile sections (model lineup, pricing) carry that snapshot caveat explicitly.

1. Surfaces and execution environments

Each vendor ships a different shape of product even before any feature comparison. Claude Code began life as a terminal CLI and grew an IDE plug-in, a web app, and a mobile experience around the same harness. GitHub Copilot grew agent surfaces inside the editors it already lived in, then added a GitHub-Actions-hosted cloud surface for async work. Codex covered every surface from launch — CLI, IDE extension, cloud VMs, mobile, Slack — but treats them as front-ends onto a single hosted agent. Antigravity is the outlier: a dedicated agentic IDE (a VS Code fork with Windsurf lineage) where the editor itself is built around a multi-agent dashboard.

Surface	Claude	GitHub Copilot	Codex	Antigravity
Hero IDE surface	VS Code & JetBrains extensions	VS Code, Visual Studio, JetBrains, Eclipse, Xcode	VS Code, Cursor, Windsurf, JetBrains	Antigravity IDE (forked VS Code)
Terminal CLI	`claude` (Node)	None first-party	`codex` (Rust)	Embedded terminal only
Async cloud agent	Claude Code on the web	Cloud agent on GitHub Actions	Codex Cloud (chatgpt.com/codex)	Manager Surface (local fleet)
Web UI	`claude.ai/code`	github.com Agents tab + Spark	chatgpt.com/codex	None — desktop IDE only
Mobile	iOS app	None first-party	ChatGPT iOS; Android remote-control preview	None
Dedicated agentic IDE	No (extension model)	No (extension model)	No (extension model)	Yes — the product is the IDE
Controlled browser	Via MCP only	Via MCP only	Headless in cloud sandbox	First-class Browser Surface

The pattern that drops out is a deliberate split between two architectures. Claude, Copilot and Codex are all harness-plus-surfaces products: one agent loop, packaged into many wrappers. Antigravity is a surface-first product: the IDE is the unit of distribution and the harness is something you only see through it. That matters for tooling decisions because the harness-plus-surfaces design preserves the developer's existing editor, while Antigravity's design asks the developer to switch IDE entirely. The presence of a true CLI on Claude and Codex (and the lack of one on Copilot and Antigravity) also shapes which agent can be wired into CI/CD or driven from a non-interactive script: `codex exec` and Claude Code's headless mode are the natural automation entry points, while Copilot's `gh aw` Agentic Workflows CLI is the closest analogue on the GitHub side. Antigravity has no scriptable equivalent at the time of writing.
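
The automation entry points named above can be wrapped from a script. The following is a minimal sketch, not a vendor-endorsed integration: `codex exec` is taken from the text, while the `claude -p` form is an assumption about how Claude Code's headless mode is invoked, and the helper names `headless_argv` and `run_headless` are hypothetical.

```python
import shutil
import subprocess

# Assumed non-interactive invocations. `codex exec` is named in the text above;
# `claude -p` is an ASSUMPTION about how Claude Code's headless mode is reached.
HEADLESS_PREFIX = {
    "codex": ["codex", "exec"],
    "claude": ["claude", "-p"],
}


def headless_argv(agent: str, prompt: str) -> list[str]:
    """Build the argv for a one-shot, non-interactive agent run."""
    if agent not in HEADLESS_PREFIX:
        # Per this section, Copilot and Antigravity have no direct equivalent.
        raise ValueError(f"no scriptable headless entry point for {agent!r}")
    return [*HEADLESS_PREFIX[agent], prompt]


def run_headless(agent: str, prompt: str, cwd: str = ".") -> subprocess.CompletedProcess:
    """Run the agent if its binary is on PATH; raise FileNotFoundError otherwise."""
    argv = headless_argv(agent, prompt)
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} is not installed")
    # check=False: the caller inspects returncode instead of catching exceptions.
    return subprocess.run(argv, cwd=cwd, capture_output=True, text=True, check=False)
```

Keeping argv construction separate from execution means a CI job can log or dry-run the exact command before any agent is allowed to touch the working tree.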

2. Triggers and assignment

How you start a task on each platform is as telling as where it runs. The legacy pattern — open chat, type a request — is universal, but each vendor has built its own first-class trigger surface beyond that baseline. Claude leans on slash commands and a chat-anywhere model. Copilot leans on GitHub's native primitives: assign an issue, comment on a PR, or click an Agents tab on github.com. Codex pushes the breadth of triggers further than anyone else, with mobile, Slack and PR-comment triggers in addition to the IDE and CLI surfaces. Antigravity collapses the trigger question down to the Manager Surface, where new agents are spawned from inside the IDE.

Trigger	Claude	GitHub Copilot	Codex	Antigravity
IDE chat panel	Yes (VS Code / JetBrains)	Yes (Agent mode)	Yes (sidebar)	Yes (Editor View)
CLI command	`claude` interactive + headless	`gh aw` workflows only	`codex` + `codex exec`	No
Issue assignment	Via MCP / GitHub plugin	Native — assign issue to Copilot	Native via Copilot third-party slot or `@codex` PR comment	Via MCP / GitHub bridge
PR comment	Via plugin / hook	Native iteration on agent PRs	`@codex review`, `@codex implement`, `@codex address comments`	Via MCP only
Browser web UI	claude.ai/code	github.com Agents tab + Spark	chatgpt.com/codex	None
Slack	Via MCP	Via Extensions	`/codex` slash command (Business/Enterprise)	Via MCP
Mobile	iOS Claude app	None first-party	ChatGPT iOS swipe-to-review	None
Delegate sync → async	Web hand-off via `/web-setup`	"Delegate to coding agent" from VS Code	"Delegate to Codex Cloud" + `codex pull`	Cannot leave the IDE

The strategic differences become visible if you watch where each vendor invested first. GitHub Copilot's deepest trigger integration is issue assignment because that is the GitHub-native flow Microsoft owns end-to-end. Codex's deepest investment is bi-directional cloud/local hand-off — a local session can promote itself to a managed VM and the result can be pulled back into the working tree with one command — because OpenAI does not own a developer-platform substrate and has to make portability a feature. Claude's trigger story is editor-and-web-centric and converges on its hosted sandbox. Antigravity's trigger story is the most constrained: there is essentially one place to start work, the Manager Surface, but inside that place you can fan out to as many parallel agents as you want.

3. Model lineup

Only two of the four vendors, Claude and Codex, are single-vendor at the model layer. Antigravity is the clearest exception: the model picker exposes Gemini 3 Pro and Flash alongside Claude Sonnet 4.5/4.6, Claude Opus 4.6, and OpenAI's GPT-OSS 120B open-weights model, with calls routed through Google-managed endpoints (Google Antigravity (Wikipedia), antigravity.google). Copilot is multi-model in a different way: its first-party agents run on Microsoft-routed OpenAI and Anthropic models, and its third-party agents framework lets a maintainer assign an issue to a Codex or Claude session running entirely outside Microsoft's substrate (about-third-party-agents).

Aspect	Claude	GitHub Copilot	Codex	Antigravity
Default model (May 2026)	Sonnet 4.6 (CLI), Opus 4.7 on hard tasks (web)	Auto (mixed; picker exposes GPT-5.4 / Claude 4.7)	GPT-5.3-Codex (rolling out 5.4-Codex)	Gemini 3 Pro
Available models	Opus 4.6/4.7, Sonnet 4.5/4.6, Haiku 4.5	GPT-5.2 → 5.4, Claude Sonnet 4.5/4.6, Claude Opus 4.5–4.7, Auto	codex-1, codex-mini, GPT-5/5.2/5.3/5.4-Codex, -Spark, -nano	Gemini 3 Pro/Flash, Claude Sonnet 4.5/4.6, Claude Opus 4.6, GPT-OSS 120B
Cross-vendor multi-model	No	Yes (via picker + third-party agents)	No	Yes (native picker)
Per-sub-agent model	Yes (sub-agent frontmatter `model:`)	Yes (custom agent + picker)	Per-CLI-profile only	Per-agent in Manager
BYO inference / cloud routing	Bedrock, Vertex, Foundry envs	Microsoft-routed only	AWS Bedrock (Apr 2026); API keys	Google-routed only
Open-weights option	No	No	No	GPT-OSS 120B
Reasoning-effort knob	Implicit (Opus vs Sonnet)	Implicit	Explicit `low\|medium\|high\|xhigh` via `/model`	Implicit

Two practical consequences flow from this. First, an organisation that wants to standardise on a single agent UX but stay model-neutral has only two real options today: Antigravity, where the picker is native, or Copilot, where the third-party agents framework reaches Codex and Claude through GitHub's substrate. Picking Claude Code or Codex straight is a bet on Anthropic or OpenAI specifically. Second, the reasoning-effort knob is a Codex-specific feature: `/model` switches both the model and an explicit effort level (`low`/`medium`/`high`/`xhigh`), which is the most direct cost-versus-quality dial in the field. Claude approximates the same control through its Haiku/Sonnet/Opus tiering on sub-agents. Copilot and Antigravity expose effort indirectly via model choice.

4. Sub-agents, Skills, hooks, and custom agents

Every vendor has converged on the idea that a single monolithic agent persona is not enough — but the pattern each picked for sub-delegation diverges substantially. Claude's design is the most decomposed: separate primitives for sub-agents, Skills, hooks, MCP servers, and plugins, each with its own filesystem convention. Copilot has consolidated everything under a "custom agents" umbrella plus prompt files. Codex offers profiles plus a rolling "skills" feature. Antigravity treats the Manager Surface as the orchestration primitive and uses .antigravity/ for tool, rule, and prompt definitions.

Mechanism	Claude	GitHub Copilot	Codex	Antigravity
Sub-agent / persona file	`.claude/agents/<name>.md` (YAML frontmatter)	`.github/agents/<name>.agent.md` or `~/.github/agents/`	None as files — agents spawned at runtime	Agents created from Manager Surface
Reusable capability bundle	Skills — `.claude/skills/<name>/SKILL.md`	Skills — `.github/skills/`, `.claude/skills/`, `.agents/skills/`	"Skills" rolling out Q1 2026 (recipe form)	`.antigravity/` rules + tool defs
Lifecycle hooks	Native — `SessionStart`, `PreToolUse`, `PostToolUse`, `Stop`, `SubagentStop`, etc.	No first-class hook system; `copilot-setup-steps.yml` for env prep	None	None
Slash commands	`/agents`, `/hooks`, `/mcp`, custom from `.claude/commands/`	`/command` from prompt files	`/model`, `/review`, `/status`, `/skills`	Built-in only
Plugin / distribution format	`.claude-plugin/plugin.json` + marketplaces	Copilot Extensions marketplace	None (no plugin format)	None (no plugin format)
Per-component tool allowlist	Yes — `tools:` in frontmatter	Yes — `tools:` in custom-agent frontmatter	Per-MCP-server level	Per-MCP-server level
Cross-vendor skill format	SKILL.md (Anthropic-led open spec)	Reads `.claude/skills/` too	Reads SKILL.md in some configs	Limited

A concrete Claude sub-agent definition shows the depth of the per-file controls:

---
name: code-reviewer
description: Expert code review specialist. Use proactively after any code change.
tools: Read, Grep, Glob
model: sonnet
---

You are a code review specialist. Read every changed file in the
current branch, flag security and style issues, and return findings
grouped by severity.

Copilot's equivalent uses the same Markdown-plus-frontmatter shape, intentionally similar enough that a team with a Claude reviewer can port to Copilot without re-authoring from scratch:

---
name: Reviewer
description: Reviews diffs for security and style issues, never edits code.
target: vscode
tools: [read_file, list_dir, grep_search]
---

You are a focused code reviewer. Read the diff, flag security and
style issues against the repo's copilot-instructions.md, and produce
a single Markdown report. Never call apply_edit.
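
Both definitions above share the same frontmatter-plus-body shape, which is what makes porting between them mechanical. As a rough illustration, the sketch below splits such a file into its fields and its prompt body and normalises the two `tools:` spellings shown. It handles only flat `key: value` lines and is not how either vendor parses these files; the function names `split_frontmatter` and `tool_list` are hypothetical.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown agent file into (frontmatter fields, prompt body).

    Handles only flat `key: value` lines, which covers the two examples
    above; a real loader would use a proper YAML parser.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()  # no frontmatter: the whole file is the body
    end = next((i for i, ln in enumerate(lines) if i > 0 and ln.strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter opened with '---' but never closed")
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")  # split on the first colon only
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def tool_list(fields: dict[str, str]) -> list[str]:
    """Normalise `tools:` whether written as `A, B` or as `[a, b]`."""
    raw = fields.get("tools", "").strip().strip("[]")
    return [t.strip() for t in raw.split(",") if t.strip()]


CLAUDE_AGENT = """---
name: code-reviewer
description: Expert code review specialist. Use proactively after any code change.
tools: Read, Grep, Glob
model: sonnet
---

You are a code review specialist.
"""

fields, body = split_frontmatter(CLAUDE_AGENT)
print(fields["name"], tool_list(fields))
```

The tool identifiers themselves differ between the two examples (`Read` versus `read_file`), so a port still needs a hand-maintained name mapping; this sketch deliberately does not guess one.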

Codex does not have a per-sub-agent file convention — orchestration of multiple Codex agents happens via the experimental Git-worktree multi-agent mode, with per-CLI-profile model and sandbox overrides. Antigravity uses Manager-spawned agents as the unit; agent-to-agent handoff happens via artifacts rather than via a saved persona file. The decomposed-primitives versus consolidated-agent split is the single biggest authoring-experience difference between Claude and the other three.

5. AGENTS.md and instruction-file conventions

The mid-2025 fragmentation around "where do I write project rules" has compressed into a partial convergence by 2026. The AGENTS.md file, originally a Codex-specific convention, was published as an open specification at agents.md and adopted by Cursor, Aider, Windsurf, Devin, Junie, Factory, Amp, GitHub Copilot, Antigravity, and Claude Code wrappers, with the spec now governed under the Linux Foundation's Agentic AI Foundation. Claude Code itself preserves its earlier CLAUDE.md convention but increasingly reads AGENTS.md too. Copilot's canonical file is .github/copilot-instructions.md though it also reads AGENTS.md. Antigravity reads AGENTS.md as its primary knowledge-base entry point.

Convention	Claude	GitHub Copilot	Codex	Antigravity
Primary repo file	`CLAUDE.md`	`.github/copilot-instructions.md`	`AGENTS.md`	`AGENTS.md`
Reads `AGENTS.md`?	Yes (via plugins / wrappers)	Yes	Yes — canonical	Yes — canonical
Global / user-level fallback	`~/.claude/CLAUDE.md`	`%USERPROFILE%/.github/` agents	`~/.codex/AGENTS.md`	Knowledge base in `.antigravity/`
Nested per-directory files	Yes	Yes	Yes — walks Git root → cwd, concatenated	Implicit via folder context
Size limits / truncation	Implicit context budget	Implicit context budget	`project_doc_max_bytes` (32 KiB default) — silent truncation	Implicit
`.override.md` pattern	No first-class	No	Yes — `AGENTS.override.md` replaces sibling	No
Production reference	anthropics/claude-cookbook	github/.github	openai/openai monorepo — 88 nested `AGENTS.md`	Google internal samples

The practical takeaway is that AGENTS.md is now the safest portable bet for a team that wants its conventions to follow it across vendors. A repo that maintains one good AGENTS.md will be understood by Codex, Antigravity, and Copilot natively, by Claude through its wrappers, and by Cursor / Aider / Windsurf / Devin directly. The catch flagged in the codex sub-page applies to everyone: cross-tool consumption of the file is inconsistent — some vendors honour "Don't touch" directives strictly, others treat them as soft hints; some load all nested files, others only the repo root. The file format is portable; the enforcement is not.
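
The Codex column of the table describes a concrete loading rule: walk from the Git root down to the working directory, let `AGENTS.override.md` replace its sibling `AGENTS.md`, concatenate, and truncate at `project_doc_max_bytes`. The sketch below approximates that rule to show why silent truncation can drop the deepest (most specific) file. It is an illustration under stated assumptions, not Codex's implementation: the join separator and the choice to truncate the combined blob rather than each file are guesses, and the function names are hypothetical.

```python
from pathlib import Path

PROJECT_DOC_MAX_BYTES = 32 * 1024  # default named in the table above


def instruction_files(git_root: Path, cwd: Path) -> list[Path]:
    """Return instruction files from the Git root down to cwd, in load order."""
    git_root, cwd = git_root.resolve(), cwd.resolve()
    rel = cwd.relative_to(git_root)  # raises ValueError if cwd is outside the repo
    dirs = [git_root]
    for part in rel.parts:
        dirs.append(dirs[-1] / part)
    found: list[Path] = []
    for d in dirs:
        override, base = d / "AGENTS.override.md", d / "AGENTS.md"
        if override.is_file():
            found.append(override)  # the override replaces its sibling entirely
        elif base.is_file():
            found.append(base)
    return found


def load_instructions(git_root: Path, cwd: Path,
                      max_bytes: int = PROJECT_DOC_MAX_BYTES) -> bytes:
    """Concatenate the files and cut the result at max_bytes without warning."""
    # ASSUMPTION: a blank-line separator and truncation of the combined blob.
    blob = b"\n\n".join(p.read_bytes() for p in instruction_files(git_root, cwd))
    return blob[:max_bytes]
```

Because root-level files load first, a large root `AGENTS.md` is what pushes nested rules past the byte limit under this model; keeping the root file short is the cheap mitigation.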

6. MCP support

The Model Context Protocol has become the cross-vendor default for plugging tools and data sources into an agent, and every vendor in this comparison supports it. The differences live in configuration-file location, supported transports, and — most importantly — the security posture around server registration and approval.

MCP aspect	Claude	GitHub Copilot	Codex	Antigravity
Config file	`.mcp.json` (project), `~/.claude.json` (user), `.claude/settings.local.json` (local)	`.vscode/mcp.json` (workspace), user `mcp.json`; cloud uses repo/org config	`~/.codex/config.toml` (CLI/IDE/desktop); `.codex/config.toml` (project)	`~/.gemini/antigravity/mcp_config.json`
Transports	STDIO, HTTP, SSE; in-process via `createSdkMcpServer()`	STDIO, HTTP	STDIO, Streamable HTTP (with `codex mcp login` OAuth)	STDIO, HTTP/SSE, Google-auth HTTP
First-encounter trust prompt	Yes — listed verbatim post-TrustFall patch	Yes — per-server	Yes — surface dependent	Yes — but historically permissive
Tool namespacing	`mcp__<server>__<tool>`	Tool-list UI per session	Per-server prefixed	Per-server, surfaced in artifacts
Auto-approve / always-allow	Yes — explicit promotion per tool	Yes — per session toggle	Mode-dependent (approval policy axis)	Yes — per-tool toggle
Per-component allowlist	Per sub-agent / per Skill	Per custom agent	Per CLI profile	Per Manager-spawned agent
Notable security events	TrustFall (May 2026): MCP+hooks bundled into one trust prompt; mitigated	Cloud-agent sandbox firewall enforces default-deny egress	Issue #18243: macOS sandbox-through-MCP fragility	`find_by_name` sandbox escape (Jan 2026); `webhook.site` exfiltration (Nov 2025)

Two observations matter for procurement. First, the security incidents are not symmetric. Claude's TrustFall and Antigravity's find_by_name are architecturally different — TrustFall was a one-click RCE on the trust-prompt UX that affected Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI alike; Antigravity's were preview-stage hardening gaps in tool wrappers and outbound allowlists. Both vendors fixed the immediate findings, but the pattern of disclosure differs: Anthropic's response was a documented mitigation set and a hardened trust prompt; Google maintains a public Bug Hunters disclosure page for known Antigravity issues, which is a healthier audit posture but also a louder signal of preview-grade risk. Second, MCP STDIO servers are universally trusted code — they run inside whatever sandbox the agent opened, with the agent's privileges, and with no signing or verification by default. The ~200,000 publicly indexed STDIO servers by late 2025 are a supply-chain problem every vendor inherits. Codex's docs are the most explicit about this risk; the others soft-pedal it.
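
The `mcp__<server>__<tool>` namespacing in the Claude column is what makes per-component allowlists practical: a tool name carries its server of origin. A minimal sketch of building, parsing, and checking such names follows. The allowlist semantics here (exact tool names, plus a bare `mcp__<server>` entry meaning "every tool on that server") are an assumption made for illustration, and all function names are hypothetical.

```python
from typing import Optional

PREFIX = "mcp__"


def qualified_tool_name(server: str, tool: str) -> str:
    """Build the namespaced form shown in the table: mcp__<server>__<tool>."""
    return f"{PREFIX}{server}__{tool}"


def parse_tool_name(name: str) -> Optional[tuple[str, str]]:
    """Return (server, tool), or None for names that are not MCP-namespaced.

    Splits on the first double underscore after the prefix, so a server
    name that itself contains '__' would be parsed ambiguously.
    """
    if not name.startswith(PREFIX):
        return None
    server, sep, tool = name[len(PREFIX):].partition("__")
    if not sep or not server or not tool:
        return None
    return server, tool


def is_allowed(name: str, allowlist: set[str]) -> bool:
    """Default-deny check against exact names or whole-server entries."""
    if name in allowlist:
        return True
    parsed = parse_tool_name(name)
    # ASSUMPTION: a bare `mcp__<server>` entry admits every tool on that server.
    return parsed is not None and f"{PREFIX}{parsed[0]}" in allowlist
```

A reviewer persona could then be limited to `{"Read", "Grep", "mcp__github__get_pull_request"}`, which admits one GitHub tool without admitting the rest of that server (the server and tool names in this example are illustrative only).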

7. Sandbox and security model

Every vendor has had to answer the same question: how do you give the agent enough power to be useful without giving it enough power to destroy a developer's machine, exfiltrate the codebase, or push a malicious commit? The four answers differ along three axes: filesystem scope, network egress posture, and command-execution gating.

Security aspect	Claude	GitHub Copilot	Codex	Antigravity
Default filesystem scope	Working directory + temp; web sandbox stricter	Workspace in agent mode; ephemeral Actions runner in cloud	`workspace-write` — cwd only	Workspace + `.antigravity/`; Browser Surface isolated
Default network egress	Allowed locally; sandbox proxy on web	Allowed in agent mode; default-deny firewall in cloud agent	Blocked by default in `workspace-write` (Linux Landlock + seccomp; macOS Seatbelt)	Allowed but with allowlist defaults
Command approval default	Per-tool prompts	Per-command prompt with auto-approve toggle	`untrusted` / `on-failure` / `on-request` / `never` (`--full-auto`)	Per-command in agent mode
Fully-autonomous flag	`bypassPermissions` permission mode	"Auto-approve trusted" toggle	`--dangerously-bypass-approvals-and-sandbox`	Manager mode + per-agent override
Cloud-side isolation	Anthropic-managed VM with filesystem allowlist + Unix-socket network proxy (Oct 2025 sandbox)	Ephemeral GitHub Actions runner with egress firewall	Ephemeral VM, internet off by default, opt-in allowlist	Local-only — no cloud sandbox
OS-level isolation	Sandboxing runtime (filesystem allowlist + network proxy) on web	None local; runner-level on cloud	Apple Seatbelt (macOS), Landlock + seccomp-bpf (Linux); Windows weaker	OS-level isolation depends on platform patches (iterative hardening)
Notable abuse / incident	TrustFall RCE (May 2026); state-sponsored abuse report (Nov 2025)	Firewall added precisely to mitigate prompt-injection exfiltration	macOS MCP issue #18243	`find_by_name` sandbox escape; `webhook.site` exfiltration chain

The strongest local sandbox is Codex's: Landlock plus seccomp on Linux, Seatbelt on macOS, network egress off by default in the standard mode, and a deliberately verbose five-word flag (`--dangerously-bypass-approvals-and-sandbox`) for full-access mode. The strongest cloud sandbox is debatable: Copilot's cloud agent runs on a hardened GitHub Actions runner with a default-deny egress firewall and scoped agent tokens that are deliberately separate from CI/CD secrets, while Claude Code on the web runs on Anthropic's October-2025 sandboxing runtime with an explicit filesystem allowlist and a Unix-socket network proxy. Codex Cloud sits between them: a fresh ephemeral VM per task, network off by default, with a user-maintained allowlist. Antigravity's cloud story is essentially "there is no cloud": every agent runs locally in the IDE process, which simplifies the trust story but loses the bounded blast radius that an ephemeral VM provides.
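
"Default-deny egress with an opt-in allowlist" is the common thread across the cloud sandboxes described above, and the logic is small enough to show. The sketch below is a generic illustration of that policy, not any vendor's firewall: the function name `egress_allowed` is hypothetical, and treating subdomains of an allowlisted domain as permitted is a design assumption.

```python
from urllib.parse import urlsplit


def egress_allowed(url: str, allowlist: frozenset[str] = frozenset()) -> bool:
    """Default-deny: permit a URL only if its host is allowlisted.

    An empty allowlist blocks everything, mirroring "internet off by default".
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False  # unparseable or host-less URLs are denied
    # ASSUMPTION: subdomains of an allowlisted domain are also permitted.
    # The leading dot prevents suffix tricks such as "pypi.org.evil.example".
    return any(host == d or host.endswith("." + d) for d in allowlist)


PACKAGE_HOSTS = frozenset({"pypi.org", "pythonhosted.org"})

print(egress_allowed("https://webhook.site/abc"))                   # no allowlist
print(egress_allowed("https://pypi.org/simple/", PACKAGE_HOSTS))    # allowlisted
print(egress_allowed("https://pypi.org.evil.example/", PACKAGE_HOSTS))
```

Under this policy an injected instruction to post data to an arbitrary collection endpoint fails at the network layer, even if the agent itself has been persuaded to attempt it.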

What the table cannot show is the cadence of security work, which has been faster on Anthropic and Google than on the other two — not because their products are worse, but because the disclosure surface around their newer agents has been more visible.

8. Pricing model

The pricing space splits along two axes: flat subscription versus metered consumption, and bundled into a broader platform versus standalone. Claude and Codex both bundle agent access into general-purpose subscriptions (Claude Pro/Max, ChatGPT Plus/Pro/Business) and a metered API tier. Copilot uses a hybrid: a per-seat subscription plus a per-action premium-request meter, with a planned migration to fully usage-based billing on June 1, 2026. Antigravity is free during its public preview, with no committed paid SKU yet.

Pricing aspect	Claude	GitHub Copilot	Codex	Antigravity
Model	Subscription + API per-token	Subscription + premium-request meter (→ usage-based June 2026)	Subscription + API per-token	Free preview
Entry paid tier	Pro $20/mo	Pro $10/mo	Go $8/mo, Plus $20/mo	$0
Top consumer tier	Max 20x $200/mo	Pro+ $39/mo (1,500 premium req/mo)	Pro from $100/mo	$0
Free tier	Yes — limited Sonnet messages	Yes — 50 premium req/mo	Yes — capped weekly, `-nano` model	All access
Per-token API	Bedrock / Vertex / Foundry	Microsoft-routed only	$1.75/MTok input ($0.175 cached), $14/MTok output (5.3-Codex)	None
Premium-request metering	No	Yes — extra at $0.04/req	Quota-based; usage-based pilot rolling out	None
Enterprise SKU	Team $30/seat; Enterprise custom	Business $19/seat; Enterprise $39/seat	Business / Edu / Enterprise per-seat	Not yet
Per-action billable surface	Tokens	Premium requests + Actions minutes	Tokens or quota	None

Three contrasts are worth surfacing. First, Copilot Pro at $10/month undercuts ChatGPT Plus and Claude Pro at $20 (only Codex's $8/month Go tier and Antigravity's free preview are cheaper), but Copilot's premium-request meter caps Pro at 300 requests/month, where a long agent session can burn through dozens. Second, Codex Pro, from $100/month, is priced well above the entry tiers (only Claude's Max 20x at $200/month is listed higher) but ships uncapped CLI usage and 5×-20× Plus-quota multipliers; for a developer doing many hours of agent work per day, its effective per-hour cost can beat the other plans. Third, Antigravity's free preview is genuinely free including third-party Claude and GPT-OSS inference, an unusual posture that almost certainly does not survive into the paid SKU.
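
The two metering styles can be compared with simple arithmetic. The sketch below uses only figures quoted in this section (Copilot Pro at $10/month with 300 included premium requests and $0.04 per extra request; the 5.3-Codex API at $1.75, $0.175 cached, and $14 per million tokens). It assumes overage is billed linearly per request and that cached and uncached input tokens are counted separately; both function names are hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def copilot_pro_monthly(premium_requests: int) -> Decimal:
    """Monthly USD cost: $10 base, 300 requests included, $0.04 per extra."""
    base, included, overage = Decimal("10"), 300, Decimal("0.04")
    extra = max(0, premium_requests - included)
    return base + extra * overage


def codex_api_cost(input_tokens: int, cached_input_tokens: int,
                   output_tokens: int) -> Decimal:
    """USD cost at the 5.3-Codex per-million-token rates in the table."""
    return (
        Decimal(input_tokens) * Decimal("1.75")
        + Decimal(cached_input_tokens) * Decimal("0.175")
        + Decimal(output_tokens) * Decimal("14")
    ) / MILLION


print(copilot_pro_monthly(500))                      # 200 requests over the cap
print(codex_api_cost(2_000_000, 1_000_000, 500_000))
```

Under these assumptions, Copilot Pro reaches the $20 price point of Claude Pro and ChatGPT Plus at 550 premium requests in a month, which is the break-even worth checking against a team's actual agent usage.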

9. Strengths and weaknesses

Each agent has a centre of gravity. Claude's is composability and configurability. Copilot's is GitHub-native end-to-end integration. Codex's is multi-surface breadth and bi-directional cloud/local hand-off. Antigravity's is the fleet-of-agents Manager Surface and the first-class controlled browser.

Read as paragraphs rather than columns: Claude Code is the agent for teams that want Unix-shaped primitives — sub-agents, Skills, hooks, MCP, plugins — that compose, sit in version control, and survive switching to a competitor (because much of the configuration is portable via SKILL.md and .claude/agents/). Its weak spots are the security history (TrustFall, the November 2025 state-sponsored abuse report), the still-pre-1.0 Agent SDK, and the absence of a native cross-vendor model picker.

GitHub Copilot's strength is that the agent is already where the work happens: in the issue, on the PR, in the editor, in Visual Studio, in the CI pipeline. Its third-party agents framework also makes it the easiest place to try Claude or Codex without changing seat licenses. The weak spots are the premium-request economics (which the June 2026 billing migration may either ease or worsen), the lack of a hook system comparable to Claude's, and macOS not being supported as a cloud-agent runner.

Codex has the deepest multi-surface story — CLI, IDE, Cloud, mobile, Slack, PR comments, third-party slot on github.com — and the most explicit reasoning-effort control. Its local sandbox is the most rigorous of the four. The weak spots are model availability lag on the API (Spark and the newest snapshots reach ChatGPT plans weeks before API customers), Windows second-class status, and the cross-vendor AGENTS.md inconsistency it helped create.

Antigravity is unique in two ways: the Manager Surface makes parallel autonomous work routine rather than exotic, and the Browser Surface puts end-to-end validation inside the agent's reach without MCP plumbing. The multi-model picker is also genuinely cross-vendor. The weak spots are the preview status, the security incidents in the first six months, the lack of an enterprise SKU, and the requirement to switch IDE entirely to adopt it.

Vendor	Strengths	Weaknesses
Claude	Most composable primitives (sub-agents + Skills + hooks + plugins); SKILL.md is an open spec; portable across IDE/CLI/web/mobile; deep MCP scopes	TrustFall + state-sponsored abuse history; pre-1.0 Agent SDK; no native multi-vendor models; no `AGENTS.md` as primary file
GitHub Copilot	Native GitHub-flow integration; low-cost paid entry ($10/month); third-party agents bring Claude/Codex in; cloud-agent firewall is solid	No hook system; macOS not a runner; premium-request economics tight on lower tiers; usage-based billing transition adds uncertainty
Codex	Most surfaces; bi-directional cloud↔local; rigorous local sandbox; explicit reasoning-effort knob; AGENTS.md is canonical	API model availability lag; Windows weaker; experimental multi-agent worktree mode; MCP supply-chain risk most explicit but inherited
Antigravity	Manager Surface for fleets of agents; first-class Browser Surface; multi-vendor model picker; free preview	Preview-stage product; security incidents in first 6 months; no enterprise SKU; no CLI; requires IDE switch

10. How to pick

There is no globally correct answer. The decision space collapses cleanly along five axes: what IDE and ecosystem the team already lives in, which model the team trusts, how much sandbox rigour matters, whether work is sync or async, and what org-policy constraints apply.

If the team already lives on github.com, lean Copilot. Issue assignment, PR-comment iteration, and the Actions-runner sandbox are native. The third-party agents framework keeps Claude and Codex one toggle away.
If composability and version-controlled agent configuration matter, lean Claude. The combination of sub-agents, Skills, hooks, and plugins is the most decomposed in the field, and SKILL.md is the only convention here published as an open standard with cross-vendor adoption.
If multi-vendor model neutrality matters, lean Antigravity (native picker including Gemini, Claude, and GPT-OSS) or Copilot (third-party agents framework). Avoid Claude and Codex if a single-vendor model lock is unacceptable.
If sandbox rigour on a local developer machine matters most, lean Codex. Landlock + seccomp + Seatbelt + default-deny network egress is the strongest local posture documented.
If the work is mostly asynchronous "fire and forget", lean Codex Cloud or Copilot's cloud agent. Both run in ephemeral, isolated environments: a fresh VM per task for Codex Cloud, a GitHub Actions runner behind an egress firewall for Copilot. Antigravity is the wrong fit because its agents are local. Claude Code on the web works, but orgs that require zero data retention (ZDR) cannot use it.
If long-horizon parallel autonomous work is the day job, lean Antigravity. The Manager Surface is the only first-class fleet view in this comparison.
If end-to-end browser validation is part of every task, lean Antigravity. The Browser Surface is unique.
If the org has strict data-locality requirements, prefer Claude (Bedrock / Vertex / Foundry routing) or self-hosted Codex CLI with API keys and no Cloud surfaces. Antigravity and Copilot are both vendor-routed.
If budget is the primary constraint, start free with Antigravity, or pay $8/month for Codex Go, or $10/month for Copilot Pro. Claude Pro at $20 is the most expensive entry tier.
If a single agent must work from inside a JetBrains IDE, Claude, Copilot, and Codex all ship a JetBrains experience (Antigravity, being its own IDE, does not); Codex's January 2026 JetBrains integration and Claude's longer-standing JetBrains extension are the most mature.

Most teams will not pick exactly one. The realistic 2026 pattern is two agents in active use: typically a synchronous in-editor agent (Claude Code, Copilot agent mode, or Codex IDE) paired with an asynchronous cloud agent (Copilot cloud, Codex Cloud, or Claude Code on the web). Antigravity tends to be evaluated as a third surface for specific workflows — multi-agent prototyping, browser-heavy validation — rather than as a replacement for either of the other two.

The single observation that keeps re-emerging across all four sub-pages: agent quality is more strongly correlated with codebase hygiene than with vendor choice. A well-tested codebase with clear specs and a maintained AGENTS.md will see good results from any of these four. A neglected one will see poor results from all of them.

Sources

Changelog

2026-05-11 — Comparison page created from four vendor sub-pages with cross-vendor analysis (confidence 80)
2026-05-11 — Backlink: added related_pages entries for the new agent-skills-* authoring sub-pages (auto-sweep, Phase 3.5).