Agents Shipgate is an open-source CLI + GitHub Action for reviewing what tools an agent can call before it gets production-like tools or higher permissions.
It scans shipgate.yaml, local MCP exports, OpenAPI specs, and optional OpenAI Agents SDK static metadata.
Once an agent can refund, email, cancel, update tickets, or call internal APIs, tool changes become release risks. Shipgate makes those risks visible before promotion.
MCP exports, OpenAPI specs, and function tools change faster than most release reviews can track.
Approval, confirmation, idempotency, and scope expectations often live in prompts, docs, or tribal knowledge.
Traces are useful, but they appear after behavior exists. Shipgate runs before promotion.
Shipgate turns an agent's tool surface into a reviewable release artifact: inventory, schemas, scopes, declared policies, findings, and recommended next actions.
Use shipgate.yaml to declare agent purpose, tool sources, policies, overrides, and CI behavior.
Inspect local MCP exports and OpenAPI specs for broad schemas, write actions, missing bounds, and risky operations.
Optionally enrich reports with OpenAI Agents SDK static metadata without importing user code by default.
Flag wildcard tools, missing approval policies, broad scopes, free-form action fields, and idempotency gaps.
Run locally or add advisory checks to pull requests and release workflows.
Produce human-readable release review reports and machine-readable JSON artifacts.
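To make "write actions" concrete, here is a minimal sketch of how a static pass over an OpenAPI document might list mutating operations. It is an illustration, not Shipgate's implementation: the function name `find_write_operations`, the choice of HTTP methods treated as writes, and the example spec are all assumptions.

```python
# Illustrative sketch only; names and rules here are assumptions, not Shipgate's API.
WRITE_METHODS = {"post", "put", "patch", "delete"}


def find_write_operations(spec: dict) -> list[dict]:
    """Return one entry per OpenAPI operation that uses a mutating HTTP method."""
    hits = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method.lower() not in WRITE_METHODS or not isinstance(operation, dict):
                continue
            hits.append({
                "operation_id": operation.get("operationId", f"{method.upper()} {path}"),
                "method": method.upper(),
                "path": path,
            })
    return hits


# Hypothetical spec: one read operation and one write operation.
example_spec = {
    "openapi": "3.0.3",
    "paths": {
        "/orders/{id}": {"get": {"operationId": "getOrder"}},
        "/orders/{id}/refund": {"post": {"operationId": "refundOrder"}},
    },
}

write_ops = find_write_operations(example_spec)
print(write_ops)
```

Only the spec file is read; nothing is called over the network, which matches the static posture described on this page.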
Define the agent, declared purpose, prohibited actions, tool sources, and policy expectations in shipgate.yaml.
Load local MCP exports, OpenAPI specs, and optional SDK static metadata.
Build a unified tool inventory with schemas, scopes, annotations, source references, and risk hints.
Run deterministic static checks for wildcard tools, broad schemas, approval gaps, idempotency evidence, scope mismatch, and more.
Generate Markdown and JSON reports for local review, CI artifacts, or PR comments.
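The load, inventory, and check steps above can be sketched roughly as follows. This assumes a local MCP export shaped like an MCP tools list (a `tools` array whose entries carry `name` and `inputSchema`); the `Tool` class, `load_mcp_tools`, `check_schema`, and the file path are hypothetical names used for illustration, and the two rules shown are simplified stand-ins for real checks.

```python
from dataclasses import dataclass, field

# Field names the page calls out as risky when left free-form.
FREE_FORM_NAMES = {"updates", "command", "action", "body"}


@dataclass
class Tool:
    """One row of a unified tool inventory (hypothetical shape)."""
    name: str
    source: str
    input_schema: dict = field(default_factory=dict)


def load_mcp_tools(export: dict, source: str) -> list[Tool]:
    """Build inventory rows from an export assumed to look like {"tools": [...]}."""
    return [
        Tool(name=t["name"], source=source, input_schema=t.get("inputSchema", {}))
        for t in export.get("tools", [])
    ]


def check_schema(tool: Tool) -> list[str]:
    """Deterministic checks: free-form action fields and unbounded numbers."""
    problems = []
    for prop, spec in tool.input_schema.get("properties", {}).items():
        kind = spec.get("type")
        if prop in FREE_FORM_NAMES and kind in ("string", "object") and "enum" not in spec:
            problems.append(f"{tool.name}.{prop}: free-form field")
        if kind in ("integer", "number") and "maximum" not in spec:
            problems.append(f"{tool.name}.{prop}: missing maximum bound")
    return problems


# Hypothetical export with one refund tool.
mcp_export = {"tools": [{
    "name": "issue_refund",
    "inputSchema": {"type": "object", "properties": {
        "amount": {"type": "number"},
        "action": {"type": "string"},
        "currency": {"type": "string", "enum": ["USD", "EUR"]},
    }},
}]}

inventory = load_mcp_tools(mcp_export, source="mcp/tools.json")
problems = [p for tool in inventory for p in check_schema(tool)]
print(problems)
```

Because the checks are pure functions over parsed files, the same input always yields the same list, which is what makes the result usable as a review artifact.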
Findings include severity, evidence, source reference, confidence, and a recommended next action — built for engineering and platform review, not for dashboards.
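As a rough illustration of what such a finding could look like in both report formats, the sketch below models the listed fields as a record and renders it as JSON and as a Markdown table. The `Finding` class, its field names, the check id, and the rendering helpers are assumptions for illustration; the actual report schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record carrying the fields named above."""
    check_id: str
    severity: str
    evidence: str
    source_ref: str
    confidence: str
    next_action: str


def to_json(findings: list[Finding]) -> str:
    """Machine-readable artifact: stable key order for clean diffs in CI."""
    return json.dumps([asdict(f) for f in findings], indent=2, sort_keys=True)


def to_markdown(findings: list[Finding]) -> str:
    """Human-readable table suitable for a PR comment."""
    lines = [
        "| Severity | Check | Source | Next action |",
        "| --- | --- | --- | --- |",
    ]
    for f in findings:
        lines.append(f"| {f.severity} | {f.check_id} | {f.source_ref} | {f.next_action} |")
    return "\n".join(lines)


# Hypothetical finding for a refund tool without an upper bound on the amount.
example = Finding(
    check_id="missing-maximum-bound",
    severity="high",
    evidence="issue_refund.amount has no maximum",
    source_ref="mcp/tools.json#issue_refund",
    confidence="high",
    next_action="Add a maximum to amount or require approval above a threshold",
)

print(to_markdown([example]))
print(to_json([example]))
```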
An agent with no tools can still be wrong. An agent with tools can take action. Tool-use readiness focuses on the release risks that appear when tools, permissions, schemas, policies, and side effects enter the system.
Know when tools require broad scopes or service accounts, or lack approval policies.
Catch broad free-form fields like updates, command, action, and body before they become model-controlled actions.
Flag missing maximum bounds, confirmation flows, and idempotency evidence for write actions.
Turn a tool surface into a report that release owners, platform teams, and security reviewers can discuss.
Run the same check locally, in PRs, or before promotion to production-like environments.
CI is advisory by default. When your team is ready, strict mode fails the run, but only on unsuppressed critical findings.
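The advisory-versus-strict behavior can be pictured as a small exit-code decision. The sketch below is an assumption about how such a gate might work, not Shipgate's actual logic: `exit_code`, the `Finding` record, the check ids, and the suppression mapping (check id to reason) are hypothetical. It also rejects suppressions that have no reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Minimal hypothetical finding: just what the gate needs."""
    check_id: str
    severity: str


def exit_code(findings: list[Finding], suppressions: dict[str, str], strict: bool = False) -> int:
    """Return 0 to pass the CI step, 1 to fail it.

    suppressions maps a check id to the written reason for suppressing it.
    """
    for check_id, reason in suppressions.items():
        if not reason.strip():
            raise ValueError(f"suppression for {check_id} needs a reason")
    if not strict:
        return 0  # advisory mode: report findings, never block
    blocking = [
        f for f in findings
        if f.severity == "critical" and f.check_id not in suppressions
    ]
    return 1 if blocking else 0


findings = [Finding("wildcard-tool", "critical"), Finding("broad-scope", "medium")]

advisory = exit_code(findings, {})
strict_blocked = exit_code(findings, {}, strict=True)
strict_suppressed = exit_code(
    findings, {"wildcard-tool": "Sandbox-only agent, reviewed by platform team"}, strict=True
)
print(advisory, strict_blocked, strict_suppressed)
```

Keeping the default at exit code 0 lets a team read findings in pull requests for a while before deciding which ones should block a release.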
Clone the repo and run Shipgate against the bundled example agent. You'll see real findings on a real tool surface — no setup, no credentials.
If your team ships an agent with MCP or OpenAPI tools, we'd like to run Shipgate against your repo and walk you through the findings. False positives, missing checks, and rough edges go straight onto the roadmap.
Agents Shipgate is static by default and open-source. It does not execute agent code, run tools, call LLMs, connect to MCP servers, make network calls, or collect telemetry.
Designed to be safe to run on internal repositories before you connect any hosted service. Open-source core. Transparent checks. Suppressions require reasons.
Inspect the checks. Run the fixture. Open an issue with a false positive.
Evals test behavior. Observability records runtime. Gateways enforce access. Shipgate reviews the tool surface before release.
Agents Shipgate starts with static release checks for tool-using agents. The longer-term vision is infrastructure for agent lifecycle health: release history, policy drift, trace-based evidence, approval workflows, re-review triggers, and governance signals across teams and agents.
CLI + GitHub Action.
Reports, history, baselines, exceptions.
Traces, tool routing, handoffs, guardrails.
Lifecycle health, policy drift, re-authorization, governance.
We are not building a closed governance suite. The lab works in the open and partners with teams shipping real agents.