Adding a release gate to an Anthropic Claude tool-use agent
Anthropic's Messages API tool surface lives in a JSON tools array plus a system prompt. agents-shipgate scans both and produces release-readiness findings on every PR.
If your agent uses Claude’s Messages API tool-use, the tool surface is
the JSON tools array your application sends with every request — plus
the system prompt, plus whatever policy YAML your team maintains
alongside. Each is a release artifact. Each is worth gating in CI.
This post wires agents-shipgate into a Claude tool-use project end-to-end.
What agents-shipgate reads from a Claude project
Three artifact kinds, declared under a top-level anthropic: block in
the manifest:
prompt_files— markdown system prompts. Concatenated and used for prompt/tool scope-mismatch detection.tools— JSON files containing the Anthropic Messages API tool-use shape ({name, description, input_schema, ...}).policy_rules— YAML/JSON files declaring approval lists, confirmation lists, idempotency lists, and tool output schemas.
Server-side Anthropic-managed tools (web_search_*, code_execution_*)
are skipped with a warning — they have no user-controlled schema.
Client-side tools (bash_*, text_editor_*, computer_*, memory_*)
are inventoried, with explicit risk hints attached so the
framework-agnostic checks fire correctly.
1. Install + manifest
pipx install agents-shipgate
Create shipgate.yaml:
version: "0.1"
project:
name: support-refund-agent
agent:
name: refund-agent
declared_purpose:
- issue customer refunds
- look up help articles
environment:
target: production_like
anthropic:
prompt_files:
- prompts/system.md
tools:
- path: tools/anthropic-tools.json
policy_rules:
- path: policies/anthropic-policy.yaml
2. The tool surface
tools/anthropic-tools.json:
[
{
"name": "create_refund",
"description": "Create a refund for a customer payment.",
"input_schema": {
"type": "object",
"additionalProperties": false,
"properties": {
"payment_id": {"type": "string"},
"amount": {"type": "number", "maximum": 500},
"reason": {"type": "string", "enum": ["duplicate", "fraud", "requested_by_customer"]}
},
"required": ["payment_id", "amount"]
},
"cache_control": {"type": "ephemeral"}
},
{
"name": "get_help_article",
"description": "Fetch a help-center article by slug.",
"input_schema": {
"type": "object",
"additionalProperties": false,
"properties": {"slug": {"type": "string"}},
"required": ["slug"]
}
}
]
A few Anthropic-specific things to get right:
- Tool names must match the documented regex
^[a-zA-Z0-9_-]{1,64}$. Dots in names produce a warning. - Use
input_schema, notparameters. The latter is the OpenAI shape; agents-shipgate warns if it seesparametershere. - No
functionwrapper. Anthropic tools are flat. Afunctionwrapper is treated as malformed. cache_controlis captured verbatim into annotations and surfaces in the report. No semantic check yet.
3. The policy file
policies/anthropic-policy.yaml:
approval_required:
- create_refund
idempotency_tools:
- create_refund
tool_output_schemas:
create_refund:
success_fields: [refund_id, amount_cents, status]
failure_fields: [error_code, message]
These rules supplement the manifest-level policies block. agents-shipgate
merges them at scan time, so approval/confirmation/idempotency findings
respect both sources.
4. First scan
agents-shipgate scan -c shipgate.yaml
Without an approval policy on create_refund:
Status: release_blockers_detected
Critical: 1 · High: 6 · Medium: 2
Top findings:
1. create_refund lacks a declared approval policy [SHIP-POLICY-APPROVAL-MISSING]
2. create_refund lacks idempotency evidence [SHIP-SIDEFX-IDEMPOTENCY-MISSING]
3. create_refund lacks declared auth scopes [SHIP-AUTH-MISSING-SCOPE]
Adding create_refund to approval_required in
anthropic-policy.yaml resolves the critical. Adding to
idempotency_tools resolves the high. Declaring scopes in the manifest
or per-tool resolves the auth-scope finding.
5. CI integration
.github/workflows/shipgate.yml:
name: Agents Shipgate
on:
pull_request:
permissions:
contents: read
pull-requests: write
jobs:
agents-shipgate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: ThreeMoonsLab/[email protected]
with:
config: shipgate.yaml
ci_mode: advisory
pr_comment: "true"
Advisory by default. Strict mode with a baseline once existing findings are triaged. See the GitHub Action quickstart for the full path.
What this catches that other guards don’t
- A teammate adds
cancel_subscriptionto the tools array; CI fails because no approval policy exists. bash_20250124shows up in the JSON; the scanner inventories it as a client tool withcode_execution,destructive,writerisk tags and demands an approval policy.- The system prompt says “look up help articles” but the surface
includes
create_refund; the prompt/surface scope-mismatch check fires. - A
cache_control: { type: "ephemeral" }annotation gets added; it’s preserved in the report so reviewers can audit prompt-cache scope.
For why static checks belong in the release-gate slot, see Your AI agent has a tool surface. It needs a release gate..