Coding agents: side-by-side comparison
By May 2026 the agentic-coding category has settled into a stable quartet of platform-scale offerings: Anthropic's Claude Code (with the underlying Claude Agent SDK), Microsoft's GitHub Copilot family of agent surfaces, OpenAI's Codex stack, and Google's Antigravity platform. Each one ships a multi-surface, multi-trigger experience that can read a repository, edit files, run shell commands, drive a browser, and open a pull request without continuous human steering. Each one wraps a frontier reasoning model inside a sandbox, a permissions model, a configuration file convention, and an MCP-based tool extension surface. They look superficially similar and they all read parts of each other's conventions — yet at the level of execution model, security posture, model lineup, pricing, and target workflow the differences are substantial enough to drive real procurement decisions.
This page is the dedicated cross-vendor reference. It is not a benchmark: there are no SWE-Bench numbers, no head-to-head trial transcripts, and no claim about which agent "wins". It is a feature-level comparison built from the four vendor-specific sub-pages (claude-agents, github-copilot-agents, codex-agents, antigravity-agents) and a reading of each vendor's primary documentation. Every section opens with prose context, presents a comparison table across the four vendors, and then interprets what the table actually means in practice. The scope is the state of the four products in May 2026; the volatile sections (model lineup, pricing) carry that snapshot caveat explicitly.
1. Surfaces and execution environments
Each vendor ships a different shape of product even before any feature comparison. Claude Code began life as a terminal CLI and grew an IDE plug-in, a web app, and a mobile experience around the same harness. GitHub Copilot grew agent surfaces inside the editors it already lived in, then added a GitHub-Actions-hosted cloud surface for async work. Codex covered every surface from launch — CLI, IDE extension, cloud VMs, mobile, Slack — but treats them as front-ends onto a single hosted agent. Antigravity is the outlier: a dedicated agentic IDE (a VS Code fork with Windsurf lineage) where the editor itself is built around a multi-agent dashboard.
| Surface | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Hero IDE surface | VS Code & JetBrains extensions | VS Code, Visual Studio, JetBrains, Eclipse, Xcode | VS Code, Cursor, Windsurf, JetBrains | Antigravity IDE (forked VS Code) |
| Terminal CLI | claude (Node) | None first-party | codex (Rust) | Embedded terminal only |
| Async cloud agent | Claude Code on the web | Cloud agent on GitHub Actions | Codex Cloud (chatgpt.com/codex) | Manager Surface (local fleet) |
| Web UI | claude.ai/code | github.com Agents tab + Spark | chatgpt.com/codex | None (desktop IDE only) |
| Mobile | iOS app | None first-party | ChatGPT iOS; Android remote-control preview | None |
| Dedicated agentic IDE | No (extension model) | No (extension model) | No (extension model) | Yes — the product is the IDE |
| Controlled browser | Via MCP only | Via MCP only | Headless in cloud sandbox | First-class Browser Surface |
The pattern that drops out is a deliberate split between two architectures. Claude, Copilot and Codex are all harness-plus-surfaces products: one agent loop, packaged into many wrappers. Antigravity is a surface-first product: the IDE is the unit of distribution and the harness is something you only see through it. That matters for tooling decisions because the harness-plus-surfaces design preserves the developer's existing editor, while Antigravity's design asks the developer to switch IDE entirely. The presence of a true CLI on Claude and Codex (and the lack of one on Copilot and Antigravity) also shapes which agent can be wired into CI/CD or driven from a non-interactive script: Codex's codex exec and Claude Code's headless mode are the natural automation entry points, while Copilot's gh aw Agentic Workflows CLI is the closest analogue on the GitHub side. Antigravity has no scriptable equivalent at the time of writing.
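As a rough illustration of the automation entry points named above, the sketch below builds non-interactive command lines for the two agents that ship a true CLI. `codex exec` comes from the text; the `-p` print flag used for Claude Code's headless mode is an assumption about the current CLI, and `build_headless_command` is a hypothetical helper. Nothing here launches a process.

```python
from typing import List

# Argument prefixes for non-interactive runs. "codex exec" is named in the
# text above; "claude -p" (print mode) is an assumption about the Claude CLI.
HEADLESS_PREFIX = {
    "codex": ["codex", "exec"],
    "claude": ["claude", "-p"],
}


def build_headless_command(agent: str, prompt: str) -> List[str]:
    """Return an argv list suitable for subprocess.run(), without running it."""
    if agent not in HEADLESS_PREFIX:
        # Per the table, Copilot and Antigravity have no first-party scriptable CLI.
        raise ValueError(f"no headless CLI entry point known for {agent!r}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return HEADLESS_PREFIX[agent] + [prompt]
```

A CI job could pass the returned list to `subprocess.run(..., check=True)`; keeping the prompt as a single argv element avoids shell quoting problems.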
2. Triggers and assignment
How you start a task on each platform is as telling as where it runs. The legacy pattern — open chat, type a request — is universal, but each vendor has built its own first-class trigger surface beyond that baseline. Claude leans on slash commands and a chat-anywhere model. Copilot leans on GitHub's native primitives: assign an issue, comment on a PR, or click an Agents tab on github.com. Codex pushes the breadth of triggers further than anyone else, with mobile, Slack and PR-comment triggers in addition to the IDE and CLI surfaces. Antigravity collapses the trigger question down to the Manager Surface, where new agents are spawned from inside the IDE.
| Trigger | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| IDE chat panel | Yes (VS Code / JetBrains) | Yes (Agent mode) | Yes (sidebar) | Yes (Editor View) |
| CLI command | claude interactive + headless | gh aw workflows only | codex + codex exec | No |
| Issue assignment | Via MCP / GitHub plugin | Native: assign issue to Copilot | Native via Copilot third-party slot or @codex PR comment | Via MCP / GitHub bridge |
| PR comment | Via plugin / hook | Native iteration on agent PRs | @codex review, @codex implement, @codex address comments | Via MCP only |
| Browser web UI | claude.ai/code | github.com Agents tab + Spark | chatgpt.com/codex | None |
| Slack | Via MCP | Via Extensions | /codex slash command (Business/Enterprise) | Via MCP |
| Mobile | iOS Claude app | None first-party | ChatGPT iOS swipe-to-review | None |
| Delegate sync → async | Web hand-off via /web-setup | "Delegate to coding agent" from VS Code | "Delegate to Codex Cloud" + codex pull | Cannot leave the IDE |
The strategic differences become visible if you watch where each vendor invested first. GitHub Copilot's deepest trigger integration is issue assignment because that is the GitHub-native flow Microsoft owns end-to-end. Codex's deepest investment is bi-directional cloud/local hand-off — a local session can promote itself to a managed VM and the result can be pulled back into the working tree with one command — because OpenAI does not own a developer-platform substrate and has to make portability a feature. Claude's trigger story is editor-and-web-centric and converges on its hosted sandbox. Antigravity's trigger story is the most constrained: there is essentially one place to start work, the Manager Surface, but inside that place you can fan out to as many parallel agents as you want.
3. Model lineup
Only two of the four vendors, Anthropic and OpenAI, ship exclusively their own models in the default flow. Antigravity is the clearest exception: the model picker exposes Gemini 3 Pro and Flash alongside Claude Sonnet 4.5/4.6, Claude Opus 4.6, and OpenAI's GPT-OSS 120B open-weights model, with calls routed through Google-managed endpoints (Google Antigravity (Wikipedia), antigravity.google). Copilot is the other: its first-party agents run on Microsoft-routed OpenAI and Anthropic models, and its third-party agents framework lets a maintainer assign an issue to a Codex or Claude session running entirely outside Microsoft's substrate (about-third-party-agents). Claude and Codex remain single-vendor at the model layer.
| Aspect | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Default model (May 2026) | Sonnet 4.6 (CLI), Opus 4.7 on hard tasks (web) | Auto (mixed; picker exposes GPT-5.4 / Claude 4.7) | GPT-5.3-Codex (rolling out 5.4-Codex) | Gemini 3 Pro |
| Available models | Opus 4.6/4.7, Sonnet 4.5/4.6, Haiku 4.5 | GPT-5.2 → 5.4, Claude Sonnet 4.5/4.6, Claude Opus 4.5–4.7, Auto | codex-1, codex-mini, GPT-5/5.2/5.3/5.4-Codex, -Spark, -nano | Gemini 3 Pro/Flash, Claude Sonnet 4.5/4.6, Claude Opus 4.6, GPT-OSS 120B |
| Cross-vendor multi-model | No | Yes (via picker + third-party agents) | No | Yes (native picker) |
| Per-sub-agent model | Yes (sub-agent frontmatter model:) | Yes (custom agent + picker) | Per-CLI-profile only | Per-agent in Manager |
| BYO inference / cloud routing | Bedrock, Vertex, Foundry envs | Microsoft-routed only | AWS Bedrock (Apr 2026); API keys | Google-routed only |
| Open-weights option | No | No | No | GPT-OSS 120B |
| Reasoning-effort knob | Implicit (Opus vs Sonnet) | Implicit | Explicit low / medium / high / xhigh via /model | Implicit |
Two practical consequences flow from this. First, an organisation that wants to standardise on a single agent UX but stay model-neutral has only two real options today: Antigravity, where the picker is native, or Copilot, where the third-party agents framework reaches Codex and Claude through GitHub's substrate. Picking Claude Code or Codex straight is a bet on Anthropic or OpenAI specifically. Second, the reasoning-effort knob is a Codex-specific feature: /model switches both the model and an explicit effort level (low/medium/high/xhigh), which is the most direct cost-versus-quality dial in the field. Claude approximates the same control through its Haiku/Sonnet/Opus tiering on sub-agents. Copilot and Antigravity expose effort indirectly via model choice.
4. Sub-agents, Skills, hooks, and custom agents
Every vendor has converged on the idea that a single monolithic agent persona is not enough — but the pattern each picked for sub-delegation diverges substantially. Claude's design is the most decomposed: separate primitives for sub-agents, Skills, hooks, MCP servers, and plugins, each with its own filesystem convention. Copilot has consolidated everything under a "custom agents" umbrella plus prompt files. Codex offers profiles plus a rolling "skills" feature. Antigravity treats the Manager Surface as the orchestration primitive and uses .antigravity/ for tool, rule, and prompt definitions.
| Mechanism | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Sub-agent / persona file | `.claude/agents/<name>.md` (YAML frontmatter) | `.github/agents/<name>.agent.md` or ~/.github/agents/ | None as files; agents spawned at runtime | Agents created from Manager Surface |
| Reusable capability bundle | Skills: `.claude/skills/<name>/SKILL.md` | Skills: .github/skills/, .claude/skills/, .agents/skills/ | "Skills" rolling out Q1 2026 (recipe form) | .antigravity/ rules + tool defs |
| Lifecycle hooks | Native: SessionStart, PreToolUse, PostToolUse, Stop, SubagentStop, etc. | No first-class hook system; copilot-setup-steps.yml for env prep | None | None |
| Slash commands | /agents, /hooks, /mcp, custom from .claude/commands/ | /command from prompt files | /model, /review, /status, /skills | Built-in only |
| Plugin / distribution format | .claude-plugin/plugin.json + marketplaces | Copilot Extensions marketplace | None (no plugin format) | None (no plugin format) |
| Per-component tool allowlist | Yes (tools: in frontmatter) | Yes (tools: in custom-agent frontmatter) | Per-MCP-server level | Per-MCP-server level |
| Cross-vendor skill format | SKILL.md (Anthropic-led open spec) | Reads .claude/skills/ too | Reads SKILL.md in some configs | Limited |
A concrete Claude sub-agent definition shows the depth of the per-file controls:
---
name: code-reviewer
description: Expert code review specialist. Use proactively after any code change.
tools: Read, Grep, Glob
model: sonnet
---
You are a code review specialist. Read every changed file in the
current branch, flag security and style issues, and return findings
grouped by severity.
Copilot's equivalent uses the same Markdown-plus-frontmatter shape, intentionally similar enough that a team with a Claude reviewer can port to Copilot without re-authoring from scratch:
---
name: Reviewer
description: Reviews diffs for security and style issues, never edits code.
target: vscode
tools: [read_file, list_dir, grep_search]
---
You are a focused code reviewer. Read the diff, flag security and
style issues against the repo's copilot-instructions.md, and produce
a single Markdown report. Never call apply_edit.
Codex does not have a per-sub-agent file convention — orchestration of multiple Codex agents happens via the experimental Git-worktree multi-agent mode, with per-CLI-profile model and sandbox overrides. Antigravity uses Manager-spawned agents as the unit; agent-to-agent handoff happens via artifacts rather than via a saved persona file. The decomposed-primitives versus consolidated-agent split is the single biggest authoring-experience difference between Claude and the other three.
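Because both persona formats above share the Markdown-plus-frontmatter shape, a single loader can read either. The following is a minimal sketch under the assumption that the frontmatter contains only flat `key: value` lines, as in the two examples; `split_frontmatter` and `parse_tools` are hypothetical helper names, and a production loader would use a real YAML parser.

```python
from typing import Dict, List, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown agent file into (frontmatter fields, prompt body).

    Handles only flat `key: value` lines, which is all the examples above use.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter opened with '---' but never closed")
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def parse_tools(raw: str) -> List[str]:
    """Accept both `Read, Grep, Glob` and `[read_file, list_dir]` spellings."""
    return [t.strip() for t in raw.strip("[] ").split(",") if t.strip()]
```

The tool names themselves are not interchangeable between vendors (`Read` versus `read_file`), so porting a persona still needs a hand-written mapping; the sketch only shows that the file shape is shared.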
5. AGENTS.md and instruction-file conventions
The mid-2025 fragmentation around "where do I write project rules" has compressed into a partial convergence by 2026. The AGENTS.md file, originally a Codex-specific convention, was published as an open specification at agents.md and adopted by Cursor, Aider, Windsurf, Devin, Junie, Factory, Amp, GitHub Copilot, Antigravity, and Claude Code wrappers, with the spec now governed under the Linux Foundation's Agentic AI Foundation. Claude Code itself preserves its earlier CLAUDE.md convention but increasingly reads AGENTS.md too. Copilot's canonical file is .github/copilot-instructions.md though it also reads AGENTS.md. Antigravity reads AGENTS.md as its primary knowledge-base entry point.
| Convention | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Primary repo file | CLAUDE.md | .github/copilot-instructions.md | AGENTS.md | AGENTS.md |
| Reads AGENTS.md? | Yes (via plugins / wrappers) | Yes | Yes (canonical) | Yes (canonical) |
| Global / user-level fallback | ~/.claude/CLAUDE.md | %USERPROFILE%/.github/ agents | ~/.codex/AGENTS.md | Knowledge base in .antigravity/ |
| Nested per-directory files | Yes | Yes | Yes: walks Git root → cwd, concatenated | Implicit via folder context |
| Size limits / truncation | Implicit context budget | Implicit context budget | project_doc_max_bytes (32 KiB default), silent truncation | Implicit |
| .override.md pattern | No first-class | No | Yes: AGENTS.override.md replaces sibling | No |
| Production reference | anthropics/claude-cookbook | github/.github | openai/openai monorepo (88 nested AGENTS.md) | Google internal samples |
The practical takeaway is that AGENTS.md is now the safest portable bet for a team that wants its conventions to follow it across vendors. A repo that maintains one good AGENTS.md will be understood by Codex, Antigravity, and Copilot natively, by Claude through its wrappers, and by Cursor / Aider / Windsurf / Devin directly. The catch flagged in the codex sub-page applies to everyone: cross-tool consumption of the file is inconsistent — some vendors honour "Don't touch" directives strictly, others treat them as soft hints; some load all nested files, others only the repo root. The file format is portable; the enforcement is not.
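The Codex column of the table describes three behaviours that are easy to misread: a walk from the Git root down to the working directory, an override file that replaces its sibling, and a silent size cut. The sketch below illustrates those rules as stated in the table; it is not Codex's implementation, the function names are hypothetical, and applying the byte limit to the combined text (rather than per file) is an assumption.

```python
from pathlib import Path
from typing import List

MAX_BYTES = 32 * 1024  # mirrors the 32 KiB project_doc_max_bytes default cited above


def collect_instruction_files(git_root: Path, cwd: Path) -> List[Path]:
    """List instruction files from git_root down to cwd, outermost first.

    In each directory AGENTS.override.md, if present, replaces its sibling
    AGENTS.md, as the table describes.
    """
    git_root, cwd = git_root.resolve(), cwd.resolve()
    rel = cwd.relative_to(git_root)  # raises ValueError if cwd is outside the repo
    directories = [git_root]
    for part in rel.parts:
        directories.append(directories[-1] / part)
    found: List[Path] = []
    for directory in directories:
        for name in ("AGENTS.override.md", "AGENTS.md"):
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
                break  # override wins; skip the sibling
    return found


def load_instructions(git_root: Path, cwd: Path, max_bytes: int = MAX_BYTES) -> bytes:
    """Concatenate the files and cut the result at max_bytes without warning."""
    files = collect_instruction_files(git_root, cwd)
    blob = b"\n\n".join(p.read_bytes() for p in files)
    return blob[:max_bytes]
```

Because the cut is silent, a team relying on deeply nested files may want a CI check that fails when the combined size approaches the limit.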
6. MCP support
The Model Context Protocol has become the cross-vendor default for plugging tools and data sources into an agent, and every vendor in this comparison supports it. The differences live in configuration-file location, supported transports, and — most importantly — the security posture around server registration and approval.
| MCP aspect | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Config file | .mcp.json (project), ~/.claude.json (user), .claude/settings.local.json (local) | .vscode/mcp.json (workspace), user mcp.json; cloud uses repo/org config | ~/.codex/config.toml (CLI/IDE/desktop); .codex/config.toml (project) | ~/.gemini/antigravity/mcp_config.json |
| Transports | STDIO, HTTP, SSE; in-process via createSdkMcpServer() | STDIO, HTTP | STDIO, Streamable HTTP (with codex mcp login OAuth) | STDIO, HTTP/SSE, Google-auth HTTP |
| First-encounter trust prompt | Yes, listed verbatim post-TrustFall patch | Yes, per-server | Yes, surface dependent | Yes, but historically permissive |
| Tool namespacing | `mcp__<server>__<tool>` | Tool-list UI per session | Per-server prefixed | Per-server, surfaced in artifacts |
| Auto-approve / always-allow | Yes, explicit promotion per tool | Yes, per session toggle | Mode-dependent (approval policy axis) | Yes, per-tool toggle |
| Per-component allowlist | Per sub-agent / per Skill | Per custom agent | Per CLI profile | Per Manager-spawned agent |
| Notable security events | TrustFall (May 2026): MCP+hooks bundled into one trust prompt; mitigated | Default-deny egress firewall in the cloud-agent sandbox | Issue #18243: macOS sandbox-through-MCP fragility | find_by_name sandbox escape (Jan 2026); webhook.site exfiltration (Nov 2025) |
Two observations matter for procurement. First, the security incidents are not symmetric. Claude's TrustFall and Antigravity's find_by_name are architecturally different: TrustFall was a one-click RCE on the trust-prompt UX that affected Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI alike, whereas Antigravity's incidents were preview-stage hardening gaps in tool wrappers and outbound allowlists. Both vendors fixed the immediate findings, but the pattern of disclosure differs: Anthropic's response was a documented mitigation set and a hardened trust prompt; Google maintains a public Bug Hunters disclosure page for known Antigravity issues, which is a healthier audit posture but also a louder signal of preview-grade risk. Second, MCP STDIO servers are universally trusted code: they run inside whatever sandbox the agent opened, with the agent's privileges, and with no signing or verification by default. The roughly 200,000 STDIO servers publicly indexed by late 2025 are a supply-chain problem every vendor inherits. Codex's docs are the most explicit about this risk; the others soft-pedal it.
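The `mcp__<server>__<tool>` namespacing in the Claude column is what makes per-tool allowlisting practical, since every tool carries its server in its name. The sketch below illustrates that naming shape together with a default-deny check. Treating a bare `mcp__<server>` entry as "allow the whole server" is an assumption of this sketch, and all three function names are hypothetical.

```python
from typing import Iterable, Tuple

PREFIX = "mcp"
SEP = "__"


def qualify(server: str, tool: str) -> str:
    """Build a namespaced tool name in the mcp__<server>__<tool> shape."""
    if SEP in server:
        raise ValueError("server name must not contain the separator")
    return SEP.join((PREFIX, server, tool))


def parse(qualified: str) -> Tuple[str, str]:
    """Split a namespaced name back into (server, tool)."""
    prefix, sep, rest = qualified.partition(SEP)
    server, sep2, tool = rest.partition(SEP)
    if prefix != PREFIX or not sep or not sep2 or not server or not tool:
        raise ValueError(f"not an MCP tool name: {qualified!r}")
    return server, tool


def is_allowed(qualified: str, allowlist: Iterable[str]) -> bool:
    """Default-deny check: a tool passes only if listed exactly, or if its
    whole server is listed as mcp__<server> (an assumption of this sketch)."""
    server, _ = parse(qualified)
    allowed = set(allowlist)
    return qualified in allowed or f"{PREFIX}{SEP}{server}" in allowed
```

An allowlist of this kind limits which tools the agent may call; it does nothing about what a STDIO server process does once it is running, which is the supply-chain concern raised above.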
7. Sandbox and security model
Every vendor has had to answer the same question: how do you give the agent enough power to be useful without giving it enough power to destroy a developer's machine, exfiltrate the codebase, or push a malicious commit? The four answers differ along three axes: filesystem scope, network egress posture, and command-execution gating.
| Security aspect | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Default filesystem scope | Working directory + temp; web sandbox stricter | Workspace in agent mode; ephemeral Actions runner in cloud | workspace-write (cwd only) | Workspace + .antigravity/; Browser Surface isolated |
| Default network egress | Allowed locally; sandbox proxy on web | Allowed in agent mode; default-deny firewall in cloud agent | Blocked by default in workspace-write (Linux Landlock + seccomp; macOS Seatbelt) | Allowed but with allowlist defaults |
| Command approval default | Per-tool prompts | Per-command prompt with auto-approve toggle | untrusted / on-failure / on-request / never (--full-auto) | Per-command in agent mode |
| Fully-autonomous flag | bypassPermissions permission mode | "Auto-approve trusted" toggle | --dangerously-bypass-approvals-and-sandbox | Manager mode + per-agent override |
| Cloud-side isolation | Anthropic-managed VM with filesystem allowlist + Unix-socket network proxy (Oct 2025 sandbox) | Ephemeral GitHub Actions runner with egress firewall | Ephemeral VM, internet off by default, opt-in allowlist | Local-only; no cloud sandbox |
| OS-level isolation | Sandboxing runtime (filesystem allowlist + network proxy) on web | None local; runner-level on cloud | Apple Seatbelt (macOS), Landlock + seccomp-bpf (Linux); Windows weaker | OS-level isolation depends on platform patches (iterative hardening) |
| Notable abuse / incident | TrustFall RCE (May 2026); state-sponsored abuse report (Nov 2025) | Firewall added precisely to mitigate prompt-injection exfiltration | macOS MCP issue #18243 | find_by_name sandbox escape; webhook.site exfiltration chain |
The strongest local sandbox is Codex's: Landlock plus seccomp on Linux, Seatbelt on macOS, network egress off by default in the standard mode, and a deliberately verbose flag (--dangerously-bypass-approvals-and-sandbox) for full-access mode. The strongest cloud sandbox is debatable: Copilot's cloud agent runs on a hardened GitHub Actions runner with a default-deny egress firewall and scoped agent tokens that are deliberately separate from CI/CD secrets, while Claude Code on the web runs on Anthropic's October-2025 sandboxing runtime with explicit filesystem allowlist and Unix-socket network proxy. Codex Cloud sits between them, with a fresh ephemeral VM per task, network off by default, and a user-maintained allowlist. Antigravity's cloud story is essentially "there is no cloud": every agent runs locally in the IDE process, which simplifies the trust story but loses the bounded blast radius that an ephemeral VM provides.
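The "network off by default, opt-in allowlist" posture described for the cloud sandboxes reduces to a default-deny host check. The sketch below shows that policy in isolation; real sandboxes enforce it at the OS or proxy layer, and `egress_allowed` with its subdomain-matching rule is a hypothetical illustration rather than any vendor's behaviour.

```python
from typing import Iterable


def egress_allowed(host: str, allowlist: Iterable[str] = ()) -> bool:
    """Default-deny: with an empty allowlist every host is refused.

    An entry matches the host itself or any subdomain of it. Matching on a
    leading dot prevents "evilexample.com" from passing as "example.com".
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower().strip().rstrip(".")
        if entry and (host == entry or host.endswith("." + entry)):
            return True
    return False
```

Under this policy an exfiltration attempt to an unlisted host such as webhook.site fails closed, which is the property the default-deny firewalls in the table are meant to provide.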
What the table cannot show is the cadence of security work, which has been faster on Anthropic and Google than on the other two — not because their products are worse, but because the disclosure surface around their newer agents has been more visible.
8. Pricing model
The pricing space splits along two axes: flat subscription versus metered consumption, and bundled into a broader platform versus standalone. Claude and Codex both bundle agent access into general-purpose subscriptions (Claude Pro/Max, ChatGPT Plus/Pro/Business) and a metered API tier. Copilot uses a hybrid: a per-seat subscription plus a per-action premium-request meter, with a planned migration to fully usage-based billing on June 1, 2026. Antigravity is free during its public preview, with no committed paid SKU yet.
| Pricing aspect | Claude | GitHub Copilot | Codex | Antigravity |
|---|---|---|---|---|
| Model | Subscription + API per-token | Subscription + premium-request meter (→ usage-based June 2026) | Subscription + API per-token | Free preview |
| Entry paid tier | Pro $20/mo | Pro $10/mo | Go $8/mo, Plus $20/mo | $0 |
| Top consumer tier | Max 20x $200/mo | Pro+ $39/mo (1,500 premium req/mo) | Pro from $100/mo | $0 |
| Free tier | Yes (limited Sonnet messages) | Yes (50 premium req/mo) | Yes (capped weekly, -nano model) | All access |
| Per-token API | Bedrock / Vertex / Foundry | Microsoft-routed only | $1.75/MTok input ($0.175 cached), $14/MTok output (5.3-Codex) | None |
| Premium-request metering | No | Yes — extra at $0.04/req | Quota-based; usage-based pilot rolling out | None |
| Enterprise SKU | Team $30/seat; Enterprise custom | Business $19/seat; Enterprise $39/seat | Business / Edu / Enterprise per-seat | Not yet |
| Per-action billable surface | Tokens | Premium requests + Actions minutes | Tokens or quota | None |
Three contrasts are worth surfacing. First, Copilot Pro at $10/month undercuts ChatGPT Plus and Claude Pro at $20 (only Codex's $8 Go tier is cheaper), but Copilot's premium-request meter caps Pro at 300 requests/month, and a long agent session can burn through dozens. Second, Codex Pro, from $100/month, costs far more than the entry tiers (though less than Claude's Max 20x at $200) but ships uncapped CLI usage and 5× to 20× Plus-quota multipliers; for a developer doing many hours of agent work per day, Pro converts to per-hour terms that beat the other plans. Third, Antigravity's free preview is genuinely free including third-party Claude and GPT-OSS inference, an unusual posture that almost certainly does not survive into the paid SKU.
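The metered figures in the table can be turned into rough monthly estimates. The sketch below uses only numbers quoted above ($10 Copilot Pro with 300 included requests and $0.04 per extra request; $1.75, $0.175 cached, and $14 per million tokens for GPT-5.3-Codex). It assumes overage can be bought on the Pro tier at the table's rate, and both function names are hypothetical.

```python
def copilot_pro_monthly_usd(premium_requests: int) -> float:
    """Copilot Pro: $10 base, 300 included requests, $0.04 per extra request.

    Assumes overage is purchasable on Pro at the table's $0.04 rate.
    Works in integer cents to avoid floating-point drift.
    """
    base_cents, included, overage_cents = 1000, 300, 4
    extra = max(0, premium_requests - included)
    return (base_cents + extra * overage_cents) / 100


def codex_api_usd(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """GPT-5.3-Codex API cost at $1.75 / $0.175 (cached) / $14 per million tokens.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = uncached * 1.75 + cached_tokens * 0.175 + output_tokens * 14.0
    return total / 1_000_000
```

For example, 800 premium requests in a month come to $10 plus 500 × $0.04, or $30. A Codex API task with 2M input tokens (1.5M of them cached) and 100k output tokens comes to roughly $2.54, with output tokens accounting for more than half of that.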
9. Strengths and weaknesses
Each agent has a centre of gravity. Claude's is composability and configurability. Copilot's is GitHub-native end-to-end integration. Codex's is multi-surface breadth and bi-directional cloud/local hand-off. Antigravity's is the fleet-of-agents Manager Surface and the first-class controlled browser.
Read as paragraphs rather than columns: Claude Code is the agent for teams that want Unix-shaped primitives — sub-agents, Skills, hooks, MCP, plugins — that compose, sit in version control, and survive switching to a competitor (because much of the configuration is portable via SKILL.md and .claude/agents/). Its weak spots are the security history (TrustFall, the November 2025 state-sponsored abuse report), the still-pre-1.0 Agent SDK, and the absence of a native cross-vendor model picker.
GitHub Copilot's strength is that the agent is already where the work happens: in the issue, on the PR, in the editor, in Visual Studio, in the CI pipeline. Its third-party agents framework also makes it the easiest place to try Claude or Codex without changing seat licenses. The weak spots are the premium-request economics (which the June 2026 billing migration may either ease or worsen), the lack of a hook system comparable to Claude's, and macOS not being supported as a cloud-agent runner.
Codex has the deepest multi-surface story — CLI, IDE, Cloud, mobile, Slack, PR comments, third-party slot on github.com — and the most explicit reasoning-effort control. Its local sandbox is the most rigorous of the four. The weak spots are model availability lag on the API (Spark and the newest snapshots reach ChatGPT plans weeks before API customers), Windows second-class status, and the cross-vendor AGENTS.md inconsistency it helped create.
Antigravity is unique in two ways: the Manager Surface makes parallel autonomous work routine rather than exotic, and the Browser Surface puts end-to-end validation inside the agent's reach without MCP plumbing. The multi-model picker is also genuinely cross-vendor. The weak spots are the preview status, the security incidents in the first six months, the lack of an enterprise SKU, and the requirement to switch IDE entirely to adopt it.
| Vendor | Strengths | Weaknesses |
|---|---|---|
| Claude | Most composable primitives (sub-agents + Skills + hooks + plugins); SKILL.md is an open spec; portable across IDE/CLI/web/mobile; deep MCP scopes | TrustFall + state-sponsored abuse history; pre-1.0 Agent SDK; no native multi-vendor models; no AGENTS.md as primary file |
| GitHub Copilot | Native GitHub-flow integration; cheapest paid entry; third-party agents bring Claude/Codex in; cloud-agent firewall is solid | No hook system; macOS not a runner; premium-request economics tight on lower tiers; usage-based billing transition adds uncertainty |
| Codex | Most surfaces; bi-directional cloud↔local; rigorous local sandbox; explicit reasoning-effort knob; AGENTS.md is canonical | API model availability lag; Windows weaker; experimental multi-agent worktree mode; MCP supply-chain risk most explicit but inherited |
| Antigravity | Manager Surface for fleets of agents; first-class Browser Surface; multi-vendor model picker; free preview | Preview-stage product; security incidents in first 6 months; no enterprise SKU; no CLI; requires IDE switch |
10. How to pick
There is no globally correct answer. The decision space collapses cleanly along five axes: what IDE and ecosystem the team already lives in, which model the team trusts, how much sandbox rigour matters, whether work is sync or async, and what org-policy constraints apply.
- If the team already lives on github.com, lean Copilot. Issue assignment, PR-comment iteration, and the Actions-runner sandbox are native. The third-party agents framework keeps Claude and Codex one toggle away.
- If composability and version-controlled agent configuration matter, lean Claude. The combination of sub-agents, Skills, hooks, and plugins is the most decomposed in the field, and SKILL.md is an Anthropic-led open spec that Copilot and, in some configurations, Codex already read.
- If multi-vendor model neutrality matters, lean Antigravity (native picker including Gemini, Claude, and GPT-OSS) or Copilot (third-party agents framework). Avoid Claude and Codex if a single-vendor model lock is unacceptable.
- If sandbox rigour on a local developer machine matters most, lean Codex. Landlock + seccomp + Seatbelt + default-deny network egress is the strongest local posture documented.
- If the work is mostly asynchronous "fire and forget", lean Codex Cloud or Copilot's cloud agent. Both run on ephemeral hardened VMs with proper isolation. Antigravity is the wrong fit because its agents are local. Claude Code on the web works, but organisations that require zero data retention (ZDR) cannot use it.
- If long-horizon parallel autonomous work is the day job, lean Antigravity. The Manager Surface is the only first-class fleet view in this comparison.
- If end-to-end browser validation is part of every task, lean Antigravity. The Browser Surface is unique.
- If the org has strict data-locality requirements, prefer Claude (Bedrock / Vertex / Foundry routing) or self-hosted Codex CLI with API keys and no Cloud surfaces. Antigravity and Copilot are both vendor-routed.
- If budget is the primary constraint, start free with Antigravity, or pay $8/month for Codex Go, or $10/month for Copilot Pro. Claude Pro at $20 is the most expensive entry tier.
- If a single agent must work from inside a JetBrains IDE, Claude, Copilot, and Codex all ship a JetBrains experience (Antigravity, as its own IDE, does not); Codex's January 2026 JetBrains integration and Claude's longer-standing JetBrains extension are the most mature.
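The bullets above amount to a lookup from priority to vendor, which can be sketched as a small scoring table. The priority labels and the `shortlist` helper are hypothetical names invented for this sketch, only a subset of the bullets is encoded, and equal weighting of priorities is an assumption; the list is a starting point for discussion, not a recommendation engine.

```python
from typing import Dict, List

# Each priority maps to the vendors the bullets above lean toward.
# The keys are hypothetical labels invented for this sketch.
LEANS: Dict[str, List[str]] = {
    "github_native": ["GitHub Copilot"],
    "composability": ["Claude"],
    "model_neutrality": ["Antigravity", "GitHub Copilot"],
    "local_sandbox": ["Codex"],
    "async_cloud": ["Codex", "GitHub Copilot"],
    "parallel_fleet": ["Antigravity"],
    "browser_validation": ["Antigravity"],
    "data_locality": ["Claude", "Codex"],
}


def shortlist(priorities: List[str]) -> List[str]:
    """Rank vendors by how many of the given priorities lean toward them."""
    scores: Dict[str, int] = {}
    for priority in priorities:
        for vendor in LEANS[priority]:  # KeyError on an unknown priority
            scores[vendor] = scores.get(vendor, 0) + 1
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda v: (-scores[v], v))
```

A team that cares about asynchronous cloud work and local sandbox rigour would call `shortlist(["async_cloud", "local_sandbox"])` and see Codex ranked ahead of GitHub Copilot.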
Most teams will not pick exactly one. The realistic 2026 pattern is two agents in active use: typically a synchronous in-editor agent (Claude Code, Copilot agent mode, or Codex IDE) paired with an asynchronous cloud agent (Copilot cloud, Codex Cloud, or Claude Code on the web). Antigravity tends to be evaluated as a third surface for specific workflows — multi-agent prototyping, browser-heavy validation — rather than as a replacement for either of the other two.
The single observation that keeps re-emerging across all four sub-pages: agent quality is more strongly correlated with codebase hygiene than with vendor choice. A well-tested codebase with clear specs and a maintained AGENTS.md will see good results from any of these four. A neglected one will see poor results from all of them.
Sources
- Claude Code documentation — Anthropic primary docs
- Claude Code trust prompt can trigger one-click RCE — The Register (May 2026)
- Agent Skills Specification — agentskills.io
- Introducing Copilot Agent Mode — VS Code blog
- Assigning and completing issues with Copilot's coding agent — GitHub blog
- Customize the Copilot agent environment — GitHub docs
- Third-party agents in Copilot — GitHub docs
- Plans for GitHub Copilot — GitHub docs
- Custom agents configuration — GitHub docs
- MCP servers in VS Code — Visual Studio Code docs
- GitHub Copilot in Visual Studio — April 2026 update
- Introducing Codex — OpenAI (May 2025)
- Introducing GPT-5.3-Codex — OpenAI (Feb 2026)
- Codex CLI documentation — OpenAI Developers
- Codex MCP guide — OpenAI Developers
- AGENTS.md guide — OpenAI Developers
- Codex pricing — OpenAI Developers
- AGENTS.md open specification — agents.md
- openai/codex — GitHub
- Build with Google Antigravity — Google Developers Blog
- Introducing Google Antigravity — antigravity.google
- Antigravity MCP Documentation — antigravity.google
- Antigravity Known Issues — Google Bug Hunters
- Security Keeps Google Antigravity Grounded — Embrace The Red
- Vendor sub-pages: claude-agents, github-copilot-agents, codex-agents, antigravity-agents
Changelog
- 2026-05-11 — Comparison page created from four vendor sub-pages with cross-vendor analysis (confidence 80)