CI gate for AI-generated PRs: review by capability

Problem

Claude Code, Codex, Cursor, and similar tools can produce large PRs that look like ordinary implementation work while changing what an AI agent can do in the real world. Code review catches bugs; Shipgate asks whether the PR expanded agent capability and whether that capability has release evidence.

What Shipgate verifies

Agent tools and actions from MCP exports, OpenAPI specs, SDK code, workflows, and plugins.
Permission scopes, approval policies, confirmation policies, idempotency evidence, and prohibited actions.
Prompts and policy files that constrain tool use.
Trust-root files such as shipgate.yaml, AGENTS.md, skills, policy packs, baselines, waivers, suppressions, and Shipgate CI.

Example 1: Codex adds refund support

+ stripe.create_refund
 POST /refunds
 amount, currency, payment_id

Shipgate reports a capability change and blocks merge when the new money-moving action lacks approval and idempotency evidence. The PR comment names the action, explains why it blocks release, and lists what must be added before merge.
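
The blocking rule above can be sketched in a few lines of Python. This is an illustration only, not Shipgate's implementation: the NewAction shape, its field names, and the choice to block when either piece of evidence is missing are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NewAction:
    """A tool or action added by a PR (hypothetical shape, not Shipgate's schema)."""
    name: str
    moves_money: bool
    has_approval_policy: bool = False
    has_idempotency_evidence: bool = False


def missing_evidence(action: NewAction) -> list[str]:
    """Return the evidence a money-moving action still needs before merge."""
    if not action.moves_money:
        return []
    missing = []
    if not action.has_approval_policy:
        missing.append("approval policy")
    if not action.has_idempotency_evidence:
        missing.append("idempotency evidence")
    return missing


def blocks_merge(action: NewAction) -> bool:
    """Assumption: any missing evidence is enough to block the merge."""
    return bool(missing_evidence(action))


refund = NewAction(name="stripe.create_refund", moves_money=True)
print(refund.name, "blocked:", blocks_merge(refund), "missing:", missing_evidence(refund))
```

The list returned by missing_evidence corresponds to the "what must be added before merge" part of the PR comment.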

Example 2: Claude edits the release gate

~ shipgate.yaml
 checks:
   ignore:
     - SHIP-POLICY-APPROVAL-MISSING

Shipgate treats release-policy weakening as a trust-root change. A coding agent may propose a fix, but it cannot approve the policy change that makes its own PR pass.
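
One way to picture the trust-root check is a path filter over the files a PR changes. The sketch below is hypothetical: the glob patterns are guesses at where such files might live, and the real trust-root set is whatever the repository's Shipgate configuration defines.

```python
from fnmatch import fnmatch

# Assumed path patterns, for illustration only. Only shipgate.yaml and
# AGENTS.md are file names taken from the text; the directories are guesses.
TRUST_ROOT_PATTERNS = [
    "shipgate.yaml",
    "AGENTS.md",
    "skills/*",
    "policies/*",
    "baselines/*",
    "waivers/*",
    "suppressions/*",
    ".github/workflows/shipgate*.yml",
]


def trust_root_edits(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that match a trust-root pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in TRUST_ROOT_PATTERNS)
    ]


def requires_human_approval(changed_paths: list[str]) -> bool:
    """A PR that touches any trust-root file cannot be approved by its own author agent."""
    return bool(trust_root_edits(changed_paths))


print(trust_root_edits(["src/refunds.py", "shipgate.yaml"]))
```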

Commands

agents-shipgate verify --preview --json
agents-shipgate verify --workspace . --config shipgate.yaml --base origin/main --head HEAD --ci-mode advisory --format json
agents-shipgate init --workspace . --write --ci --agent-instructions=default --json

The verifier writes agents-shipgate-reports/verifier.json, agents-shipgate-reports/report.json, and agents-shipgate-reports/pr-comment.md. CI can post the PR comment in advisory mode, then move to strict behavior after the team accepts the baseline and trust-root policy.
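
Before a CI step posts the comment, it may be worth confirming that all three report files were actually written. A minimal sketch, assuming the default output directory named above; the helper name missing_reports is made up for this example.

```python
from pathlib import Path

# File names taken from the paragraph above.
EXPECTED_REPORTS = ("verifier.json", "report.json", "pr-comment.md")


def missing_reports(report_dir: str = "agents-shipgate-reports") -> list[str]:
    """Return the expected report files that are absent from report_dir."""
    root = Path(report_dir)
    return [name for name in EXPECTED_REPORTS if not (root / name).is_file()]


if __name__ == "__main__":
    absent = missing_reports()
    if absent:
        print("missing reports:", ", ".join(absent))
    else:
        print("all reports present")
```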

How the coding agent reads the verdict

In the published release, the verifier exposes one imperative block, agent_controller in agents-shipgate-reports/verifier.json, so an autonomous agent can act without guessing. The block is part of the agent-native merge contract and cannot contradict the gate, because completion_allowed is locked to can_merge_without_human:

If completion_allowed is true, the capability change is complete and the PR can merge.
Else if must_stop is true, surface stop_reason (for Example 2, self_approval_prohibited) to a human, and never edit a path listed in forbidden_file_edits to make the gate pass.
Otherwise apply the mechanical fix_task, re-run its verification_command, and read the fresh verdict.
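
The three rules above amount to a small decision function. The sketch below uses the field names from the text, but the JSON layout is an assumption: it treats agent_controller as a flat object and fix_task as an object that carries its own verification_command. The returned action labels are invented for the example.

```python
def next_step(controller: dict) -> dict:
    """Map an agent_controller block to the single action the agent should take."""
    if controller.get("completion_allowed") is True:
        return {"action": "merge"}
    if controller.get("must_stop") is True:
        return {
            "action": "escalate_to_human",
            "reason": controller.get("stop_reason"),
            # Paths the agent must leave untouched while a human decides.
            "do_not_edit": list(controller.get("forbidden_file_edits", [])),
        }
    # Assumption: fix_task is an object that includes verification_command.
    fix_task = controller.get("fix_task") or {}
    return {
        "action": "apply_fix_and_reverify",
        "fix_task": fix_task,
        "verification_command": fix_task.get("verification_command"),
    }


# Hypothetical controller block for Example 2.
example_2 = {
    "completion_allowed": False,
    "must_stop": True,
    "stop_reason": "self_approval_prohibited",
    "forbidden_file_edits": ["shipgate.yaml"],
}
print(next_step(example_2)["action"])  # escalate_to_human
```

Checking completion_allowed first and must_stop second mirrors the order of the rules, so the fix path is only reached when the gate has neither passed nor demanded a stop.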

What coding agents may fix

Coding agents may add missing manifest paths when inferable, add report-output ignores, install AGENTS/Claude/Cursor instructions, draft the GitHub Action, and apply high-confidence mechanical patches. These changes are reversible and do not assert business policy.

What humans must approve

Humans must approve runtime approval policy, confirmation policy, idempotency claims, broad scopes, prohibited-action exceptions, suppressions, waivers, and edits to the release gate itself. Shipgate should surface those decisions; it should not let the generating agent self-approve them.
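
The split between agent-fixable changes and human-approved decisions can be sketched as a lookup that fails closed. The category keys below are hypothetical labels for the items listed in the two sections above; Shipgate's own identifiers may differ.

```python
# Hypothetical labels for the changes a coding agent may make on its own.
AGENT_FIXABLE = {
    "manifest_path",
    "report_output_ignore",
    "agent_instructions",
    "github_action_draft",
    "mechanical_patch",
}

# Hypothetical labels for the decisions that need a human.
HUMAN_ONLY = {
    "runtime_approval_policy",
    "confirmation_policy",
    "idempotency_claim",
    "broad_scope",
    "prohibited_action_exception",
    "suppression",
    "waiver",
    "release_gate_edit",
}

# No change kind should appear on both lists.
assert AGENT_FIXABLE.isdisjoint(HUMAN_ONLY)


def who_decides(change_kind: str) -> str:
    """Return "agent" only for known mechanical fixes; everything else goes to a human."""
    # Unknown kinds fail closed: they are routed to a human, like HUMAN_ONLY kinds.
    return "agent" if change_kind in AGENT_FIXABLE else "human"


print(who_decides("report_output_ignore"), who_decides("waiver"), who_decides("something_new"))
```

Routing unknown kinds to a human is a design choice for this sketch: a change type the gate has never seen is treated as a policy decision until someone classifies it.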

Gate AI-generated agent PRs by capability, not by line count.