Coding agents — side-by-side comparison

By May 2026 the agentic-coding category has settled into a stable quartet of platform-scale offerings: Anthropic's Claude Code (with the underlying Claude Agent SDK), Microsoft's GitHub Copilot family of agent surfaces, OpenAI's Codex stack, and Google's Antigravity platform. Each one ships a multi-surface, multi-trigger experience that can read a repository, edit files, run shell commands, drive a browser, and open a pull request without continuous human steering. Each one wraps a frontier reasoning model inside a sandbox, a permissions model, a configuration file convention, and an MCP-based tool extension surface. They look superficially similar and they all read parts of each other's conventions — yet at the level of execution model, security posture, model lineup, pricing, and target workflow the differences are substantial enough to drive real procurement decisions.

This page is the dedicated cross-vendor reference. It is not a benchmark: there are no SWE-Bench numbers, no head-to-head trial transcripts, and no claim about which agent "wins". It is a feature-level comparison built from the four vendor-specific sub-pages (claude-agents, github-copilot-agents, codex-agents, antigravity-agents) and a reading of each vendor's primary documentation. Every section opens with prose context, presents a comparison table across the four vendors, and then interprets what the table actually means in practice. The scope is the state of the four products in May 2026; the volatile sections (model lineup, pricing) carry that snapshot caveat explicitly.

1. Surfaces and execution environments

Each vendor ships a different shape of product even before any feature comparison. Claude Code began life as a terminal CLI and grew an IDE plug-in, a web app, and a mobile experience around the same harness. GitHub Copilot grew agent surfaces inside the editors it already lived in, then added a GitHub-Actions-hosted cloud surface for async work. Codex covered every surface from launch — CLI, IDE extension, cloud VMs, mobile, Slack — but treats them as front-ends onto a single hosted agent. Antigravity is the outlier: a dedicated agentic IDE (a VS Code fork with Windsurf lineage) where the editor itself is built around a multi-agent dashboard.

| Surface | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Hero IDE surface | VS Code & JetBrains extensions | VS Code, Visual Studio, JetBrains, Eclipse, Xcode | VS Code, Cursor, Windsurf, JetBrains | Antigravity IDE (forked VS Code) |
| Terminal CLI | claude (Node) | None first-party | codex (Rust) | Embedded terminal only |
| Async cloud agent | Claude Code on the web | Cloud agent on GitHub Actions | Codex Cloud (chatgpt.com/codex) | Manager Surface (local fleet) |
| Web UI | claude.ai/code | github.com Agents tab + Spark | chatgpt.com/codex | None (desktop IDE only) |
| Mobile | iOS app | None first-party | ChatGPT iOS; Android remote-control preview | None |
| Dedicated agentic IDE | No (extension model) | No (extension model) | No (extension model) | Yes (the product is the IDE) |
| Controlled browser | Via MCP only | Via MCP only | Headless in cloud sandbox | First-class Browser Surface |

The pattern that drops out is a deliberate split between two architectures. Claude, Copilot and Codex are all harness-plus-surfaces products: one agent loop, packaged into many wrappers. Antigravity is a surface-first product: the IDE is the unit of distribution and the harness is something you only see through it. That matters for tooling decisions because the harness-plus-surfaces design preserves the developer's existing editor, while Antigravity's design asks the developer to switch IDE entirely. The presence of a true CLI on Claude and Codex (and the lack of one on Copilot and Antigravity) also shapes which agent can be wired into CI/CD or driven from a non-interactive script: Codex's codex exec and Claude Code's headless mode are the natural automation entry points, while Copilot's gh aw Agentic Workflows CLI is the closest analogue on the GitHub side. Antigravity has no scriptable equivalent at the time of writing.
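As a rough illustration of the automation entry points named above, the sketch below builds a one-shot, non-interactive invocation for the two agents that ship a real CLI. The `codex exec` subcommand comes from the text; the `-p` flag for Claude Code's headless mode is an assumption to verify against the installed version, and the function names `build_command` and `run_agent` are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(agent: str, prompt: str) -> List[str]:
    """Return the argv for a one-shot, non-interactive run of the given agent."""
    if agent == "codex":
        # `codex exec` is the non-interactive entry point named in the text.
        return ["codex", "exec", prompt]
    if agent == "claude":
        # Assumption: headless mode is reached with the -p (print) flag.
        return ["claude", "-p", prompt]
    # Copilot and Antigravity have no comparable first-party CLI in this comparison.
    raise ValueError(f"no scriptable CLI known for agent: {agent}")


def run_agent(agent: str, prompt: str, timeout_s: int = 600) -> Optional[str]:
    """Run the agent once and return its stdout, or None if the CLI is not installed."""
    argv = build_command(agent, prompt)
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_s, check=True
    )
    return result.stdout
```

In a CI job, a wrapper like this would typically run inside an already-sandboxed runner, so that the agent's own approval and sandbox settings are a second layer rather than the only one.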

2. Triggers and assignment

How you start a task on each platform is as telling as where it runs. The legacy pattern — open chat, type a request — is universal, but each vendor has built its own first-class trigger surface beyond that baseline. Claude leans on slash commands and a chat-anywhere model. Copilot leans on GitHub's native primitives: assign an issue, comment on a PR, or click an Agents tab on github.com. Codex pushes the breadth of triggers further than anyone else, with mobile, Slack and PR-comment triggers in addition to the IDE and CLI surfaces. Antigravity collapses the trigger question down to the Manager Surface, where new agents are spawned from inside the IDE.

| Trigger | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| IDE chat panel | Yes (VS Code / JetBrains) | Yes (Agent mode) | Yes (sidebar) | Yes (Editor View) |
| CLI command | claude interactive + headless | gh aw workflows only | codex + codex exec | No |
| Issue assignment | Via MCP / GitHub plugin | Native: assign issue to Copilot | Native via Copilot third-party slot or @codex PR comment | Via MCP / GitHub bridge |
| PR comment | Via plugin / hook | Native iteration on agent PRs | @codex review, @codex implement, @codex address comments | Via MCP only |
| Browser web UI | claude.ai/code | github.com Agents tab + Spark | chatgpt.com/codex | None |
| Slack | Via MCP | Via Extensions | /codex slash command (Business/Enterprise) | Via MCP |
| Mobile | iOS Claude app | None first-party | ChatGPT iOS swipe-to-review | None |
| Delegate sync → async | Web hand-off via /web-setup | "Delegate to coding agent" from VS Code | "Delegate to Codex Cloud" + codex pull | Cannot leave the IDE |

The strategic differences become visible if you watch where each vendor invested first. GitHub Copilot's deepest trigger integration is issue assignment because that is the GitHub-native flow Microsoft owns end-to-end. Codex's deepest investment is bi-directional cloud/local hand-off — a local session can promote itself to a managed VM and the result can be pulled back into the working tree with one command — because OpenAI does not own a developer-platform substrate and has to make portability a feature. Claude's trigger story is editor-and-web-centric and converges on its hosted sandbox. Antigravity's trigger story is the most constrained: there is essentially one place to start work, the Manager Surface, but inside that place you can fan out to as many parallel agents as you want.

3. Model lineup

Claude and Codex ship only their own vendor's models; the other two do not. Antigravity's model picker exposes Gemini 3 Pro and Flash alongside Claude Sonnet 4.5/4.6, Claude Opus 4.6, and OpenAI's GPT-OSS 120B open-weights model, with calls routed through Google-managed endpoints (Google Antigravity (Wikipedia), antigravity.google). Copilot's first-party agents run on Microsoft-routed OpenAI and Anthropic models, and its third-party agents framework additionally lets a maintainer assign an issue to a Codex or Claude session running entirely outside Microsoft's substrate (about-third-party-agents).

| Aspect | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Default model (May 2026) | Sonnet 4.6 (CLI), Opus 4.7 on hard tasks (web) | Auto (mixed; picker exposes GPT-5.4 / Claude 4.7) | GPT-5.3-Codex (rolling out 5.4-Codex) | Gemini 3 Pro |
| Available models | Opus 4.6/4.7, Sonnet 4.5/4.6, Haiku 4.5 | GPT-5.2 → 5.4, Claude Sonnet 4.5/4.6, Claude Opus 4.5–4.7, Auto | codex-1, codex-mini, GPT-5/5.2/5.3/5.4-Codex, -Spark, -nano | Gemini 3 Pro/Flash, Claude Sonnet 4.5/4.6, Claude Opus 4.6, GPT-OSS 120B |
| Cross-vendor multi-model | No | Yes (via picker + third-party agents) | No | Yes (native picker) |
| Per-sub-agent model | Yes (sub-agent frontmatter model:) | Yes (custom agent + picker) | Per-CLI-profile only | Per-agent in Manager |
| BYO inference / cloud routing | Bedrock, Vertex, Foundry envs | Microsoft-routed only | AWS Bedrock (Apr 2026); API keys | Google-routed only |
| Open-weights option | No | No | No | GPT-OSS 120B |
| Reasoning-effort knob | Implicit (Opus vs Sonnet) | Implicit | Explicit low/medium/high/xhigh via /model | Implicit |

Two practical consequences flow from this. First, an organisation that wants to standardise on a single agent UX but stay model-neutral has only two real options today: Antigravity, where the picker is native, or Copilot, where the third-party agents framework reaches Codex and Claude through GitHub's substrate. Picking Claude Code or Codex straight is a bet on Anthropic or OpenAI specifically. Second, the reasoning-effort knob is a Codex-specific feature: /model switches both the model and an explicit effort level (low/medium/high/xhigh), which is the most direct cost-versus-quality dial in the field. Claude approximates the same control through its Haiku/Sonnet/Opus tiering on sub-agents. Copilot and Antigravity expose effort indirectly via model choice.

4. Sub-agents, Skills, hooks, and custom agents

Every vendor has converged on the idea that a single monolithic agent persona is not enough — but the pattern each picked for sub-delegation diverges substantially. Claude's design is the most decomposed: separate primitives for sub-agents, Skills, hooks, MCP servers, and plugins, each with its own filesystem convention. Copilot has consolidated everything under a "custom agents" umbrella plus prompt files. Codex offers profiles plus a rolling "skills" feature. Antigravity treats the Manager Surface as the orchestration primitive and uses .antigravity/ for tool, rule, and prompt definitions.

| Mechanism | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Sub-agent / persona file | .claude/agents/<name>.md (YAML frontmatter) | .github/agents/<name>.agent.md or ~/.github/agents/ | None as files; agents spawned at runtime | Agents created from Manager Surface |
| Reusable capability bundle | Skills: .claude/skills/<name>/SKILL.md | Skills: .github/skills/, .claude/skills/, .agents/skills/ | "Skills" rolling out Q1 2026 (recipe form) | .antigravity/ rules + tool defs |
| Lifecycle hooks | Native: SessionStart, PreToolUse, PostToolUse, Stop, SubagentStop, etc. | No first-class hook system; copilot-setup-steps.yml for env prep | None | None |
| Slash commands | /agents, /hooks, /mcp, custom from .claude/commands/ | /command from prompt files | /model, /review, /status, /skills | Built-in only |
| Plugin / distribution format | .claude-plugin/plugin.json + marketplaces | Copilot Extensions marketplace | None (no plugin format) | None (no plugin format) |
| Per-component tool allowlist | Yes: tools: in frontmatter | Yes: tools: in custom-agent frontmatter | Per-MCP-server level | Per-MCP-server level |
| Cross-vendor skill format | SKILL.md (Anthropic-led open spec) | Reads .claude/skills/ too | Reads SKILL.md in some configs | Limited |

A concrete Claude sub-agent definition shows the depth of the per-file controls:

---
name: code-reviewer
description: Expert code review specialist. Use proactively after any code change.
tools: Read, Grep, Glob
model: sonnet
---

You are a code review specialist. Read every changed file in the
current branch, flag security and style issues, and return findings
grouped by severity.

Copilot's equivalent uses the same Markdown-plus-frontmatter shape, intentionally similar enough that a team with a Claude reviewer can port to Copilot without re-authoring from scratch:

---
name: Reviewer
description: Reviews diffs for security and style issues, never edits code.
target: vscode
tools: [read_file, list_dir, grep_search]
---

You are a focused code reviewer. Read the diff, flag security and
style issues against the repo's copilot-instructions.md, and produce
a single Markdown report. Never call apply_edit.

Codex does not have a per-sub-agent file convention — orchestration of multiple Codex agents happens via the experimental Git-worktree multi-agent mode, with per-CLI-profile model and sandbox overrides. Antigravity uses Manager-spawned agents as the unit; agent-to-agent handoff happens via artifacts rather than via a saved persona file. The decomposed-primitives versus consolidated-agent split is the single biggest authoring-experience difference between Claude and the other three.
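To make the porting claim concrete, here is a minimal sketch that converts the Claude reviewer definition shown earlier into the Copilot frontmatter shape. It handles only flat `key: value` frontmatter and uses no YAML library. The tool-name mapping in `TOOL_MAP` is hypothetical and exists purely for illustration; real tool names should be taken from each vendor's documentation. The `model:` key is dropped because the Copilot sample carries no equivalent field.

```python
from typing import Dict, Tuple

# Hypothetical Claude -> Copilot tool-name mapping, for illustration only.
TOOL_MAP: Dict[str, str] = {
    "Read": "read_file",
    "Grep": "grep_search",
    "Glob": "list_dir",
}

CLAUDE_AGENT = """---
name: code-reviewer
description: Expert code review specialist. Use proactively after any code change.
tools: Read, Grep, Glob
model: sonnet
---

You are a code review specialist.
"""


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split an agent file into flat `key: value` frontmatter and the prompt body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with a frontmatter delimiter")
    end = lines.index("---", 1)  # closing delimiter; raises ValueError if absent
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")  # split at the first colon only
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def port_to_copilot(claude_text: str, target: str = "vscode") -> str:
    """Re-emit a Claude sub-agent file in the Copilot custom-agent shape."""
    meta, body = split_frontmatter(claude_text)
    claude_tools = [t.strip() for t in meta.get("tools", "").split(",") if t.strip()]
    unknown = [t for t in claude_tools if t not in TOOL_MAP]
    if unknown:
        raise ValueError(f"no mapping for tools: {unknown}")
    copilot_tools = ", ".join(TOOL_MAP[t] for t in claude_tools)
    header = [
        "---",
        f"name: {meta['name']}",
        f"description: {meta['description']}",
        f"target: {target}",
        f"tools: [{copilot_tools}]",
        "---",
    ]
    return "\n".join(header) + "\n\n" + body + "\n"
```

Failing loudly on an unmapped tool is a deliberate choice here: silently dropping a tool would change what the ported agent is allowed to do.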

5. AGENTS.md and instruction-file conventions

The mid-2025 fragmentation around "where do I write project rules" has compressed into a partial convergence by 2026. The AGENTS.md file, originally a Codex-specific convention, was published as an open specification at agents.md and adopted by Cursor, Aider, Windsurf, Devin, Junie, Factory, Amp, GitHub Copilot, Antigravity, and Claude Code wrappers, with the spec now governed under the Linux Foundation's Agentic AI Foundation. Claude Code itself preserves its earlier CLAUDE.md convention but increasingly reads AGENTS.md too. Copilot's canonical file is .github/copilot-instructions.md though it also reads AGENTS.md. Antigravity reads AGENTS.md as its primary knowledge-base entry point.

| Convention | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Primary repo file | CLAUDE.md | .github/copilot-instructions.md | AGENTS.md | AGENTS.md |
| Reads AGENTS.md? | Yes (via plugins / wrappers) | Yes | Yes (canonical) | Yes (canonical) |
| Global / user-level fallback | ~/.claude/CLAUDE.md | %USERPROFILE%/.github/agents | ~/.codex/AGENTS.md | Knowledge base in .antigravity/ |
| Nested per-directory files | Yes | Yes | Yes: walks Git root → cwd, concatenated | Implicit via folder context |
| Size limits / truncation | Implicit context budget | Implicit context budget | project_doc_max_bytes (32 KiB default), silent truncation | Implicit |
| .override.md pattern | No first-class | No | Yes: AGENTS.override.md replaces sibling | No |
| Production reference | anthropics/claude-cookbook | github/.github | openai/openai monorepo (88 nested AGENTS.md) | Google internal samples |

The practical takeaway is that AGENTS.md is now the safest portable bet for a team that wants its conventions to follow it across vendors. A repo that maintains one good AGENTS.md will be understood by Codex, Antigravity, and Copilot natively, by Claude through its wrappers, and by Cursor / Aider / Windsurf / Devin directly. The catch flagged in the codex sub-page applies to everyone: cross-tool consumption of the file is inconsistent — some vendors honour "Don't touch" directives strictly, others treat them as soft hints; some load all nested files, others only the repo root. The file format is portable; the enforcement is not.
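The Codex column of the table describes a small algorithm: walk from the Git root down to the working directory, let an AGENTS.override.md replace its sibling AGENTS.md, concatenate, and truncate silently at a byte budget. The sketch below models that description under stated assumptions; it is not Codex's implementation, and the function names, the blank-line separator, and the byte-level truncation are all illustrative choices.

```python
from pathlib import Path
from typing import List

PROJECT_DOC_MAX_BYTES = 32 * 1024  # default budget cited in the table


def find_git_root(start: Path) -> Path:
    """Return the nearest ancestor containing .git, or `start` if there is none."""
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return start


def instruction_files(cwd: Path) -> List[Path]:
    """List at most one instruction file per directory, ordered Git root -> cwd."""
    cwd = cwd.resolve()
    root = find_git_root(cwd)
    upward = [cwd, *cwd.parents]                     # cwd first, filesystem root last
    chain = upward[: upward.index(root) + 1][::-1]   # keep root..cwd, root first
    found: List[Path] = []
    for directory in chain:
        override = directory / "AGENTS.override.md"
        default = directory / "AGENTS.md"
        if override.is_file():
            found.append(override)  # the override replaces its sibling
        elif default.is_file():
            found.append(default)
    return found


def load_instructions(cwd: Path, max_bytes: int = PROJECT_DOC_MAX_BYTES) -> str:
    """Concatenate the instruction files and truncate silently at max_bytes."""
    joined = "\n\n".join(p.read_text(encoding="utf-8") for p in instruction_files(cwd))
    # Truncate on bytes, then drop any partial trailing character.
    return joined.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
```

Because truncation is silent, rules placed deep in a nested file are the first to disappear when the budget is exceeded, which is one reason to keep the root file short.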

6. MCP support

The Model Context Protocol has become the cross-vendor default for plugging tools and data sources into an agent, and every vendor in this comparison supports it. The differences live in configuration-file location, supported transports, and — most importantly — the security posture around server registration and approval.

| MCP aspect | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Config file | .mcp.json (project), ~/.claude.json (user), .claude/settings.local.json (local) | .vscode/mcp.json (workspace), user mcp.json; cloud uses repo/org config | ~/.codex/config.toml (CLI/IDE/desktop); .codex/config.toml (project) | ~/.gemini/antigravity/mcp_config.json |
| Transports | STDIO, HTTP, SSE; in-process via createSdkMcpServer() | STDIO, HTTP | STDIO, Streamable HTTP (with codex mcp login OAuth) | STDIO, HTTP/SSE, Google-auth HTTP |
| First-encounter trust prompt | Yes: listed verbatim post-TrustFall patch | Yes: per-server | Yes: surface dependent | Yes, but historically permissive |
| Tool namespacing | mcp__<server>__<tool> | Tool-list UI per session | Per-server prefixed | Per-server, surfaced in artifacts |
| Auto-approve / always-allow | Yes: explicit promotion per tool | Yes: per-session toggle | Mode-dependent (approval policy axis) | Yes: per-tool toggle |
| Per-component allowlist | Per sub-agent / per Skill | Per custom agent | Per CLI profile | Per Manager-spawned agent |
| Notable security events | TrustFall (May 2026): MCP+hooks bundled into one trust prompt; mitigated | Cloud-agent firewall enforces default-deny egress | Issue #18243: macOS sandbox-through-MCP fragility | find_by_name sandbox escape (Jan 2026); webhook.site exfiltration (Nov 2025) |

Two observations matter for procurement. First, the security incidents are not symmetric. Claude's TrustFall and Antigravity's find_by_name are architecturally different: TrustFall was a one-click RCE on the trust-prompt UX that affected Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI alike, whereas Antigravity's incidents were preview-stage hardening gaps in tool wrappers and outbound allowlists. Both vendors fixed the immediate findings, but the pattern of disclosure differs. Anthropic's response was a documented mitigation set and a hardened trust prompt; Google maintains a public Bug Hunters disclosure page for known Antigravity issues, which is a healthier audit posture but also a louder signal of preview-grade risk. Second, MCP STDIO servers are implicitly trusted code on every platform: they run inside whatever sandbox the agent opened, with the agent's privileges, and with no signing or verification by default. The roughly 200,000 STDIO servers publicly indexed by late 2025 are a supply-chain problem every vendor inherits. Codex's docs are the most explicit about this risk; the others soft-pedal it.
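The `mcp__<server>__<tool>` naming pattern in the table is what makes per-component allowlists workable: a qualified name carries its server, so a policy can be written per server or per tool. The sketch below shows one way such names could be built, parsed, and checked. The allowlist semantics (an entry is either a full tool name or a bare `mcp__<server>` prefix) are an assumption made for illustration, not any vendor's documented matching rule.

```python
from typing import Iterable, Tuple

PREFIX = "mcp"
SEP = "__"


def qualify(server: str, tool: str) -> str:
    """Build a namespaced tool name of the form mcp__<server>__<tool>."""
    return f"{PREFIX}{SEP}{server}{SEP}{tool}"


def parse(qualified: str) -> Tuple[str, str]:
    """Split a namespaced tool name back into (server, tool)."""
    parts = qualified.split(SEP, 2)  # the tool part may itself contain "__"
    if len(parts) != 3 or parts[0] != PREFIX or not parts[1] or not parts[2]:
        raise ValueError(f"not a namespaced MCP tool name: {qualified!r}")
    return parts[1], parts[2]


def is_allowed(qualified: str, allowlist: Iterable[str]) -> bool:
    """Default-deny check: allow exact tool names or whole servers.

    Assumed semantics: an entry such as "mcp__github" admits every tool
    on that server; any other entry must match the full name exactly.
    """
    server, _tool = parse(qualified)
    entries = set(allowlist)
    return qualified in entries or f"{PREFIX}{SEP}{server}" in entries
```

A reviewer persona, for example, might be given `{"mcp__github__get_pull_request"}` rather than the whole `mcp__github` server, which keeps write-capable tools out of reach.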

7. Sandbox and security model

Every vendor has had to answer the same question: how do you give the agent enough power to be useful without giving it enough power to destroy a developer's machine, exfiltrate the codebase, or push a malicious commit? The four answers differ along three axes: filesystem scope, network egress posture, and command-execution gating.

| Security aspect | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Default filesystem scope | Working directory + temp; web sandbox stricter | Workspace in agent mode; ephemeral Actions runner in cloud | workspace-write (cwd only) | Workspace + .antigravity/; Browser Surface isolated |
| Default network egress | Allowed locally; sandbox proxy on web | Allowed in agent mode; default-deny firewall in cloud agent | Blocked by default in workspace-write (Linux Landlock + seccomp; macOS Seatbelt) | Allowed but with allowlist defaults |
| Command approval default | Per-tool prompts | Per-command prompt with auto-approve toggle | untrusted / on-failure / on-request / never (--full-auto) | Per-command in agent mode |
| Fully-autonomous flag | bypassPermissions permission mode | "Auto-approve trusted" toggle | --dangerously-bypass-approvals-and-sandbox | Manager mode + per-agent override |
| Cloud-side isolation | Anthropic-managed VM with filesystem allowlist + Unix-socket network proxy (Oct 2025 sandbox) | Ephemeral GitHub Actions runner with egress firewall | Ephemeral VM, internet off by default, opt-in allowlist | Local-only (no cloud sandbox) |
| OS-level isolation | Sandboxing runtime (filesystem allowlist + network proxy) on web | None local; runner-level on cloud | Apple Seatbelt (macOS), Landlock + seccomp-bpf (Linux); Windows weaker | OS-level isolation depends on platform patches (iterative hardening) |
| Notable abuse / incident | TrustFall RCE (May 2026); state-sponsored abuse report (Nov 2025) | Firewall added precisely to mitigate prompt-injection exfiltration | macOS MCP issue #18243 | find_by_name sandbox escape; webhook.site exfiltration chain |

The strongest local sandbox is Codex's: Landlock plus seccomp on Linux, Seatbelt on macOS, network egress off by default in the standard mode, and an intentionally verbose flag (--dangerously-bypass-approvals-and-sandbox) for full-access mode. The strongest cloud sandbox is debatable. Copilot's cloud agent runs on a hardened GitHub Actions runner with a default-deny egress firewall and scoped agent tokens that are deliberately separate from CI/CD secrets, while Claude Code on the web runs on Anthropic's October-2025 sandboxing runtime with an explicit filesystem allowlist and a Unix-socket network proxy. Codex Cloud sits between them: a fresh ephemeral VM per task, network off by default, with a user-maintained allowlist. Antigravity's cloud story is essentially "there is no cloud": every agent runs locally in the IDE process, which simplifies the trust story but loses the bounded blast radius that an ephemeral VM provides.
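The "default-deny egress with an opt-in allowlist" posture that recurs in this section reduces to a simple rule: a request leaves the sandbox only if its host matches an explicit entry. The sketch below illustrates that rule in isolation. It is not any vendor's firewall; the function names and the `*.` wildcard convention are assumptions chosen for the example.

```python
from typing import Iterable
from urllib.parse import urlsplit


def host_allowed(host: str, allowlist: Iterable[str]) -> bool:
    """Return True only if host matches an allowlist entry (default deny)."""
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # "*.example.com" matches subdomains, not the bare domain itself.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


def egress_permitted(url: str, allowlist: Iterable[str]) -> bool:
    """Decide whether an outbound request to `url` may leave the sandbox."""
    host = urlsplit(url).hostname  # None when the URL has no network location
    return host is not None and host_allowed(host, allowlist)
```

With an allowlist of `["pypi.org", "*.github.com"]`, package installs and GitHub API calls pass while a request to a paste or webhook service is refused, which is the exfiltration path the incidents in the table relied on.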

What the table cannot show is the cadence of security work, which has been faster at Anthropic and Google than at the other two. That is not because their products are worse, but because the disclosure surface around their newer agents has been more visible.

8. Pricing model

The pricing space splits along two axes: flat subscription versus metered consumption, and bundled into a broader platform versus standalone. Claude and Codex both bundle agent access into general-purpose subscriptions (Claude Pro/Max, ChatGPT Plus/Pro/Business) and a metered API tier. Copilot uses a hybrid: a per-seat subscription plus a per-action premium-request meter, with a planned migration to fully usage-based billing on June 1, 2026. Antigravity is free during its public preview, with no committed paid SKU yet.

| Pricing aspect | Claude | GitHub Copilot | Codex | Antigravity |
| --- | --- | --- | --- | --- |
| Model | Subscription + API per-token | Subscription + premium-request meter (→ usage-based June 2026) | Subscription + API per-token | Free preview |
| Entry paid tier | Pro $20/mo | Pro $10/mo | Go $8/mo, Plus $20/mo | $0 |
| Top consumer tier | Max 20x $200/mo | Pro+ $39/mo (1,500 premium req/mo) | Pro from $100/mo | $0 |
| Free tier | Yes: limited Sonnet messages | Yes: 50 premium req/mo | Yes: capped weekly, -nano model | All access |
| Per-token API | Bedrock / Vertex / Foundry | Microsoft-routed only | $1.75/MTok input ($0.175 cached), $14/MTok output (5.3-Codex) | None |
| Premium-request metering | No | Yes: extra at $0.04/req | Quota-based; usage-based pilot rolling out | None |
| Enterprise SKU | Team $30/seat; Enterprise custom | Business $19/seat; Enterprise $39/seat | Business / Edu / Enterprise per-seat | Not yet |
| Per-action billable surface | Tokens | Premium requests + Actions minutes | Tokens or quota | None |

Three contrasts are worth surfacing. First, Copilot Pro at $10/month undercuts ChatGPT Plus and Claude Pro at $20, and only Codex Go at $8 is a cheaper paid entry. The catch is that Copilot's premium-request meter caps Pro at 300 requests/month, and a long agent session can burn through dozens. Second, Codex Pro starts at $100/month, which is steep next to Copilot Pro+ at $39 though still half of Claude Max 20x at $200; it ships uncapped CLI usage and 5×-20× Plus-quota multipliers, so for a developer doing many hours of agent work per day the effective per-hour cost can beat the other plans. Third, Antigravity's free preview is genuinely free, including third-party Claude and GPT-OSS inference, an unusual posture that is unlikely to survive into a paid SKU.
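The metering figures above lend themselves to a quick back-of-the-envelope check. The sketch below uses only numbers quoted in this section (Copilot Pro at $10 with 300 premium requests, Pro+ at $39, overage at $0.04 per request, and the GPT-5.3-Codex per-token rates) and assumes that overage applies only beyond the included allotment and that cached input tokens are billed separately from uncached ones. Function names are hypothetical, and amounts are kept in integer cents where possible to avoid rounding noise.

```python
def plan_cost_cents(base_cents: int, included: int, used: int, overage_cents: int = 4) -> int:
    """Monthly cost of a metered plan: base fee plus overage beyond the allotment."""
    return base_cents + max(0, used - included) * overage_cents


def breakeven_requests(low_base: int, low_included: int, high_base: int, overage_cents: int = 4) -> int:
    """Smallest monthly usage at which the cheaper plan costs as much as the dearer base fee."""
    gap = high_base - low_base
    extra = -(-gap // overage_cents)  # ceiling division on integers
    return low_included + extra


def token_cost_usd(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """API cost at $1.75/MTok input, $0.175/MTok cached input, $14/MTok output."""
    return (input_tokens * 1.75 + cached_tokens * 0.175 + output_tokens * 14.0) / 1_000_000


# Copilot Pro ($10, 300 requests) versus Pro+ ($39): where does upgrading pay off?
PRO_VS_PRO_PLUS = breakeven_requests(low_base=1000, low_included=300, high_base=3900)
```

Under these assumptions the break-even sits at 1,025 premium requests per month: below that, Pro plus overage is cheaper; above it, Pro+ is.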

9. Strengths and weaknesses

Each agent has a centre of gravity. Claude's is composability and configurability. Copilot's is GitHub-native end-to-end integration. Codex's is multi-surface breadth and bi-directional cloud/local hand-off. Antigravity's is the fleet-of-agents Manager Surface and the first-class controlled browser.

Read as paragraphs rather than columns: Claude Code is the agent for teams that want Unix-shaped primitives — sub-agents, Skills, hooks, MCP, plugins — that compose, sit in version control, and survive switching to a competitor (because much of the configuration is portable via SKILL.md and .claude/agents/). Its weak spots are the security history (TrustFall, the November 2025 state-sponsored abuse report), the still-pre-1.0 Agent SDK, and the absence of a native cross-vendor model picker.

GitHub Copilot's strength is that the agent is already where the work happens: in the issue, on the PR, in the editor, in Visual Studio, in the CI pipeline. Its third-party agents framework also makes it the easiest place to try Claude or Codex without changing seat licenses. The weak spots are the premium-request economics (which the June 2026 billing migration may either ease or worsen), the lack of a hook system comparable to Claude's, and macOS not being supported as a cloud-agent runner.

Codex has the deepest multi-surface story — CLI, IDE, Cloud, mobile, Slack, PR comments, third-party slot on github.com — and the most explicit reasoning-effort control. Its local sandbox is the most rigorous of the four. The weak spots are model availability lag on the API (Spark and the newest snapshots reach ChatGPT plans weeks before API customers), Windows second-class status, and the cross-vendor AGENTS.md inconsistency it helped create.

Antigravity is unique in two ways: the Manager Surface makes parallel autonomous work routine rather than exotic, and the Browser Surface puts end-to-end validation inside the agent's reach without MCP plumbing. The multi-model picker is also genuinely cross-vendor. The weak spots are the preview status, the security incidents in the first six months, the lack of an enterprise SKU, and the requirement to switch IDE entirely to adopt it.

| Vendor | Strengths | Weaknesses |
| --- | --- | --- |
| Claude | Most composable primitives (sub-agents + Skills + hooks + plugins); SKILL.md is an open spec; portable across IDE/CLI/web/mobile; deep MCP scopes | TrustFall + state-sponsored abuse history; pre-1.0 Agent SDK; no native multi-vendor models; no AGENTS.md as primary file |
| GitHub Copilot | Native GitHub-flow integration; low-cost paid entry ($10/mo); third-party agents bring Claude/Codex in; cloud-agent firewall is solid | No hook system; macOS not a runner; premium-request economics tight on lower tiers; usage-based billing transition adds uncertainty |
| Codex | Most surfaces; bi-directional cloud↔local; rigorous local sandbox; explicit reasoning-effort knob; AGENTS.md is canonical | API model availability lag; Windows weaker; experimental multi-agent worktree mode; MCP supply-chain risk most explicit but inherited |
| Antigravity | Manager Surface for fleets of agents; first-class Browser Surface; multi-vendor model picker; free preview | Preview-stage product; security incidents in first 6 months; no enterprise SKU; no CLI; requires IDE switch |

10. How to pick

There is no globally correct answer. The decision space collapses cleanly along five axes: what IDE and ecosystem the team already lives in, which model the team trusts, how much sandbox rigour matters, whether work is sync or async, and what org-policy constraints apply.

  • If the team already lives on github.com, lean Copilot. Issue assignment, PR-comment iteration, and the Actions-runner sandbox are native. The third-party agents framework keeps Claude and Codex one toggle away.
  • If composability and version-controlled agent configuration matter, lean Claude. The combination of sub-agents, Skills, hooks, and plugins is the most decomposed in the field, and SKILL.md is an open specification that Copilot, and Codex in some configurations, also read.
  • If multi-vendor model neutrality matters, lean Antigravity (native picker including Gemini, Claude, and GPT-OSS) or Copilot (third-party agents framework). Avoid Claude and Codex if a single-vendor model lock is unacceptable.
  • If sandbox rigour on a local developer machine matters most, lean Codex. Landlock + seccomp + Seatbelt + default-deny network egress is the strongest local posture documented.
  • If the work is mostly asynchronous "fire and forget", lean Codex Cloud or Copilot's cloud agent. Both run in ephemeral, hardened environments (a fresh VM per task for Codex, a GitHub Actions runner for Copilot). Antigravity is the wrong fit because its agents are local. Claude Code on the web works, but organisations that require zero data retention (ZDR) cannot use it.
  • If long-horizon parallel autonomous work is the day job, lean Antigravity. The Manager Surface is the only first-class fleet view in this comparison.
  • If end-to-end browser validation is part of every task, lean Antigravity. The Browser Surface is unique.
  • If the org has strict data-locality requirements, prefer Claude (Bedrock / Vertex / Foundry routing) or self-hosted Codex CLI with API keys and no Cloud surfaces. Antigravity and Copilot are both vendor-routed.
  • If budget is the primary constraint, start free with Antigravity, or pay $8/month for Codex Go, or $10/month for Copilot Pro. Claude Pro at $20 is the most expensive entry tier.
  • If a single agent must work from inside a JetBrains IDE, all four ship a JetBrains experience, but Codex's January 2026 JetBrains integration and Claude's longer-standing JetBrains extension are the most mature.
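The rules in the list above can be read as a lookup from priority to recommended vendors. The sketch below encodes them that way and ranks vendors by how many of a team's stated priorities point at them. The priority keys and the vote-counting heuristic are illustrative assumptions; the tally ignores how heavily each priority weighs and is no substitute for the reasoning in each bullet.

```python
from collections import Counter
from typing import Dict, List

# One entry per bullet above; keys are hypothetical shorthand labels.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "github_native": ["GitHub Copilot"],
    "composability": ["Claude"],
    "model_neutrality": ["Antigravity", "GitHub Copilot"],
    "local_sandbox": ["Codex"],
    "async_cloud": ["Codex", "GitHub Copilot"],
    "parallel_fleet": ["Antigravity"],
    "browser_validation": ["Antigravity"],
    "data_locality": ["Claude", "Codex"],
    "budget": ["Antigravity", "Codex", "GitHub Copilot"],
    "jetbrains": ["Codex", "Claude"],
}


def shortlist(priorities: List[str]) -> List[str]:
    """Rank vendors by the number of given priorities that recommend them."""
    votes: Counter = Counter()
    for priority in priorities:
        if priority not in RECOMMENDATIONS:
            raise KeyError(f"unknown priority: {priority}")
        votes.update(RECOMMENDATIONS[priority])
    # Most votes first; ties broken alphabetically for a stable order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [vendor for vendor, _count in ranked]
```

A team that lists `github_native` and `async_cloud` gets Copilot first and Codex second, which matches the two-agent pattern the next paragraph describes.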

Most teams will not pick exactly one. The realistic 2026 pattern is two agents in active use: typically a synchronous in-editor agent (Claude Code, Copilot agent mode, or Codex IDE) paired with an asynchronous cloud agent (Copilot cloud, Codex Cloud, or Claude Code on the web). Antigravity tends to be evaluated as a third surface for specific workflows — multi-agent prototyping, browser-heavy validation — rather than as a replacement for either of the other two.

The single observation that keeps re-emerging across all four sub-pages: agent quality is more strongly correlated with codebase hygiene than with vendor choice. A well-tested codebase with clear specs and a maintained AGENTS.md will see good results from any of these four. A neglected one will see poor results from all of them.

Sources

  1. Claude Code documentation — Anthropic primary docs
  2. Claude Code trust prompt can trigger one-click RCE — The Register (May 2026)
  3. Agent Skills Specification — agentskills.io
  4. Introducing Copilot Agent Mode — VS Code blog
  5. Assigning and completing issues with Copilot's coding agent — GitHub blog
  6. Customize the Copilot agent environment — GitHub docs
  7. Third-party agents in Copilot — GitHub docs
  8. Plans for GitHub Copilot — GitHub docs
  9. Custom agents configuration — GitHub docs
  10. MCP servers in VS Code — Visual Studio Code docs
  11. GitHub Copilot in Visual Studio — April 2026 update
  12. Introducing Codex — OpenAI (May 2025)
  13. Introducing GPT-5.3-Codex — OpenAI (Feb 2026)
  14. Codex CLI documentation — OpenAI Developers
  15. Codex MCP guide — OpenAI Developers
  16. AGENTS.md guide — OpenAI Developers
  17. Codex pricing — OpenAI Developers
  18. AGENTS.md open specification — agents.md
  19. openai/codex — GitHub
  20. Build with Google Antigravity — Google Developers Blog
  21. Introducing Google Antigravity — antigravity.google
  22. Antigravity MCP Documentation — antigravity.google
  23. Antigravity Known Issues — Google Bug Hunters
  24. Security Keeps Google Antigravity Grounded — Embrace The Red
  25. Vendor sub-pages: claude-agents, github-copilot-agents, codex-agents, antigravity-agents

Changelog

  • 2026-05-11 — Comparison page created from four vendor sub-pages with cross-vendor analysis (confidence 80)