Review AI-generated agent PRs by capability, not by line count.
AI coding agents can add tools, widen scopes, edit policies, and change CI faster than humans can review the capability boundary. Shipgate turns that PR into a deterministic merge verdict.
Problem
Claude Code, Codex, Cursor, and similar tools can produce large PRs that look like ordinary implementation work while changing what an AI agent can do in the real world. Code review catches bugs; Shipgate asks whether the PR expanded agent capability and whether that capability has release evidence.
What Shipgate verifies
- Agent tools and actions from MCP exports, OpenAPI specs, SDK code, workflows, and plugins.
- Permission scopes, approval policies, confirmation policies, idempotency evidence, and prohibited actions.
- Prompts and policy files that constrain tool use.
- Trust-root files such as shipgate.yaml, AGENTS.md, skills, policy packs, baselines, waivers, suppressions, and Shipgate CI.
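The question "did this PR expand agent capability?" can be pictured as a diff between the tools and scopes on the base branch and those on the PR head. The sketch below is only an illustration of that idea, not Shipgate's implementation: the function name, the dictionary layout, and the scope strings such as `refunds:write` are hypothetical.

```python
def added_capabilities(
    base: dict[str, set[str]], head: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return tools that are new in head, or that gained scopes, with the new scopes.

    Both arguments map a tool name to its granted scopes (hypothetical layout).
    """
    added: dict[str, set[str]] = {}
    for tool, scopes in head.items():
        new_scopes = scopes - base.get(tool, set())
        # A brand-new tool counts as a capability change even with no scopes.
        if tool not in base or new_scopes:
            added[tool] = new_scopes
    return added


base = {"stripe.get_payment": {"payments:read"}}
head = {
    "stripe.get_payment": {"payments:read"},
    "stripe.create_refund": {"refunds:write"},
}
print(added_capabilities(base, head))
```

A diff of this kind ignores how many lines changed, which is the point of reviewing by capability rather than by line count.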
Example 1: Codex adds refund support
+ stripe.create_refund
POST /refunds
amount, currency, payment_id

Shipgate reports a capability change and blocks merge when the new money-moving action lacks approval and idempotency evidence. The PR comment names the action, explains why it blocks release, and lists what must be added before merge.
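The blocking rule in this example can be sketched as a small check. This is a minimal illustration under stated assumptions, not Shipgate's actual rule engine: the `Action` class, its fields, and `missing_evidence` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical description of one agent action found in a PR."""

    name: str
    moves_money: bool
    has_approval_policy: bool = False
    has_idempotency_evidence: bool = False


def missing_evidence(action: Action) -> list[str]:
    """List the release evidence a money-moving action still lacks."""
    if not action.moves_money:
        return []
    missing = []
    if not action.has_approval_policy:
        missing.append("approval policy")
    if not action.has_idempotency_evidence:
        missing.append("idempotency evidence")
    return missing


refund = Action(name="stripe.create_refund", moves_money=True)
gaps = missing_evidence(refund)
verdict = "block" if gaps else "pass"
print(f"{refund.name}: {verdict}; missing: {', '.join(gaps) or 'nothing'}")
```

The returned list doubles as the "what must be added before merge" part of a PR comment.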
Example 2: Claude edits the release gate
~ shipgate.yaml
checks:
  ignore:
    - SHIP-POLICY-APPROVAL-MISSING

Shipgate treats release-policy weakening as a trust-root change. A coding agent may propose a fix, but it cannot approve the policy change that makes its own PR pass.
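One way to picture the trust-root rule is a path check over the files a PR touches. The sketch below is an assumption-laden illustration, not Shipgate's logic: the pattern list, the `.github/workflows/*` entry, and both function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical trust-root patterns, loosely based on the files named in this document.
TRUST_ROOT_PATTERNS = ["shipgate.yaml", "AGENTS.md", ".github/workflows/*"]


def trust_root_changes(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that match a trust-root pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in TRUST_ROOT_PATTERNS)
    ]


def needs_human_approval(changed_paths: list[str], approved_by_human: bool) -> bool:
    """A PR that touches a trust root stays blocked until a human approves it."""
    return bool(trust_root_changes(changed_paths)) and not approved_by_human


changed = ["src/refunds.py", "shipgate.yaml"]
print(trust_root_changes(changed))
print(needs_human_approval(changed, approved_by_human=False))
```

The approval flag is deliberately an input rather than something the function derives, so the agent that wrote the PR has no way to set it through the diff itself.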
Commands
agents-shipgate verify --preview --json
agents-shipgate verify --workspace . --config shipgate.yaml --base origin/main --head HEAD --ci-mode advisory --format json
agents-shipgate init --workspace . --write --ci --agent-instructions=all
The verifier writes agents-shipgate-reports/verifier.json,
agents-shipgate-reports/report.json, and agents-shipgate-reports/pr-comment.md.
CI can post the PR comment in advisory mode, then move to strict
behavior after the team accepts the baseline and trust-root policy.
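The difference between advisory and strict behavior can be sketched as a mapping from verdict to CI exit code. This is an assumption about how such modes typically work, not documented behavior of the `agents-shipgate` CLI: the verdict names `pass` and `block`, the exit codes, and the function itself are hypothetical.

```python
def ci_exit_code(verdict: str, ci_mode: str) -> int:
    """Map a merge verdict to a process exit code for the CI job (hypothetical)."""
    if verdict not in {"pass", "block"}:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if ci_mode == "advisory":
        # Assumed: advisory mode reports findings but never fails the job.
        return 0
    if ci_mode == "strict":
        # Assumed: strict mode fails the job when the verdict is a block.
        return 1 if verdict == "block" else 0
    raise ValueError(f"unknown CI mode: {ci_mode!r}")


for mode in ("advisory", "strict"):
    print(mode, ci_exit_code("block", mode))
```

Under these assumptions, a team can read the PR comments for a while without any merge being stopped, then change only the mode once the baseline is accepted.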
What coding agents may fix
Coding agents may add missing manifest paths when inferable, add report-output ignores, install AGENTS/Claude/Cursor instructions, draft the GitHub Action, and apply high-confidence mechanical patches. These changes are reversible and do not assert business policy.
What humans must approve
Humans must approve runtime approval policy, confirmation policy, idempotency claims, broad scopes, prohibited-action exceptions, suppressions, waivers, and edits to the release gate itself. Shipgate should surface those decisions; it should not let the generating agent self-approve them.
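The split between the two lists above can be sketched as a lookup that decides who may sign off on a change. The category labels and the `required_approver` function are hypothetical names for this illustration, and routing unknown categories to a human is a design assumption rather than documented Shipgate behavior.

```python
# Hypothetical labels for the agent-fixable changes listed above.
AGENT_FIXABLE = {
    "manifest_path",
    "report_output_ignore",
    "agent_instructions",
    "github_action_draft",
    "mechanical_patch",
}

# Hypothetical labels for the decisions reserved for humans.
HUMAN_ONLY = {
    "approval_policy",
    "confirmation_policy",
    "idempotency_claim",
    "broad_scope",
    "prohibited_action_exception",
    "suppression",
    "waiver",
    "release_gate_edit",
}


def required_approver(change_kind: str) -> str:
    """Return "agent" only for known reversible fixes; everything else needs a human."""
    if change_kind in AGENT_FIXABLE:
        return "agent"
    # HUMAN_ONLY kinds and any unrecognized kind both fall through to a human.
    return "human"


for kind in ("manifest_path", "waiver", "something_new"):
    print(kind, "->", required_approver(kind))
```

Defaulting to "human" keeps a newly introduced change category from being self-approved by the agent before anyone has classified it.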