
Every static check the scanner runs.

This catalog lists 40 static checks across 11 categories; use it as your AI agent release-readiness checklist. Each category corresponds to a class of release risk for tool-using agents (MCP, OpenAPI, OpenAI Agents SDK, Anthropic, LangChain, CrewAI, Google ADK, Codex plugins, n8n). The list is vendored from docs/checks.json and refreshed on each agents-shipgate release.

Use this catalog as your...

  • MCP security checklist — review wildcard sources, missing approval policies, idempotency gaps, and broad scopes before deploying an MCP server.
  • AI agent release checklist — match every PR's tool-surface change against the categories below before approving merge.
  • Framework-agnostic tool review — the same checks apply to OpenAI Agents SDK, Anthropic Messages API, LangChain / LangGraph, CrewAI, Google ADK, Codex plugins, and n8n workflows.

All checks run statically: no model invocation, no MCP connection, no scanner network calls, no scanner telemetry by default. Run agents-shipgate scan -c shipgate.yaml in CI; see the quickstart.

adk

  • SHIP-ADK-DYNAMIC-TOOLSET-NOT-ENUMERABLE high

    Google ADK toolset cannot be statically enumerated.

    Why: Release review needs an explicit tool inventory; ADK MCP/OpenAPI toolsets may resolve tools dynamically at runtime.

  • SHIP-ADK-EVAL-COVERAGE-MISSING medium

    Google ADK eval coverage is not declared.

    Why: ADK releases should include response and tool-trajectory eval evidence before promotion.

  • SHIP-ADK-FUNCTION-TOOL-METADATA-MISSING medium

    Google ADK function tool lacks static metadata.

    Why: Static review depends on descriptions and parameter schemas because user ADK code is not imported.

  • SHIP-ADK-GUARDRAIL-EVIDENCE-MISSING high

    High-risk Google ADK tools lack static guardrail evidence.

    Why: Callbacks and plugins are the static ADK surface where release reviewers can see guardrail intent.

  • SHIP-ADK-LONGRUNNING-CONTRACT-MISSING high

    Google ADK long-running tool lacks an operation contract.

    Why: Long-running tools need explicit status and operation-id semantics for safe continuation.

  • SHIP-ADK-MCP-TOOLSET-UNFILTERED high

    Google ADK McpToolset lacks a static tool filter.

    Why: Unfiltered MCP toolsets can expose more tools than reviewers expect.

api

  • SHIP-API-FUNCTION-SCHEMA-STRICTNESS high

    OpenAI API function schema is not strict enough for reliable tool calls.

    Why: Strict schemas reduce ambiguous tool arguments and downstream side-effect risk.

  • SHIP-API-PROMPT-TOOL-SCOPE-MISMATCH high

    Prompt scope contradicts enabled OpenAI API tools.

    Why: Prompt instructions should match the actual write/high-risk tool surface.

  • SHIP-API-RETRY-POLICY-MISSING medium

    OpenAI API high-risk flow lacks retry policy metadata.

    Why: Retries need explicit policy metadata so reviewers can reason about duplicate side effects.

  • SHIP-API-RETRY-WITHOUT-IDEMPOTENCY high

    OpenAI API write tool may be retried without idempotency evidence.

    Why: Retries against non-idempotent writes can duplicate financial, destructive, or external side effects.

  • SHIP-API-STRUCTURED-OUTPUT-READINESS medium

    OpenAI API structured output schema is missing or under-specified.

    Why: Downstream release decisions need explicit, structured success/refusal/review modeling.

  • SHIP-API-TEST-CASES-MISSING medium

    OpenAI API high-risk flow lacks test case metadata.

    Why: High-risk tool-call flows should have release evidence before promotion.

  • SHIP-API-TIMEOUT-MISSING medium

    OpenAI API high-risk flow lacks timeout metadata.

    Why: Timeouts define failure behavior and reduce ambiguous tool-call continuation.

  • SHIP-API-TOOL-OUTPUT-SCHEMA-MISSING medium

    OpenAI API high-risk tool lacks success/failure output modeling.

    Why: Tool output schemas help release reviewers reason about downstream failure handling.

  • SHIP-API-TRACE-APPROVAL-MISSING medium

    OpenAI API trace sample shows a policy-controlled tool without approval.

    Why: Trace samples should demonstrate approval behavior for tools that require approval.

  • SHIP-API-TRACE-CONFIRMATION-MISSING medium

    OpenAI API trace sample shows a policy-controlled tool without confirmation.

    Why: Trace samples should demonstrate explicit confirmation for tools that require confirmation.
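To make SHIP-API-FUNCTION-SCHEMA-STRICTNESS concrete, here is a minimal sketch of what a strictness review might look for. It is not the scanner's implementation: the helper name `strictness_problems` and both tool definitions are hypothetical. It assumes the conditions OpenAI documents for strict function schemas (`strict: true`, `additionalProperties: false`, and every property listed in `required`), and it inspects only the top-level object, not nested ones.

```python
from typing import Any


def strictness_problems(tool: dict[str, Any]) -> list[str]:
    """Return reasons a function-tool definition is not strict (empty if none)."""
    problems: list[str] = []
    if tool.get("strict") is not True:
        problems.append("strict is not true")
    params = tool.get("parameters") or {}
    if params.get("additionalProperties") is not False:
        problems.append("additionalProperties is not false")
    # Strict mode expects every declared property to appear in "required".
    properties = params.get("properties", {})
    missing = sorted(set(properties) - set(params.get("required", [])))
    if missing:
        problems.append("properties not listed in required: " + ", ".join(missing))
    return problems


# Hypothetical tool definitions, for illustration only.
loose_refund = {
    "name": "issue_refund",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount": {"type": "number"},
        },
        "required": ["order_id"],
    },
}
strict_refund = {
    "name": "issue_refund",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount": {"type": "number"},
        },
        "required": ["order_id", "amount"],
        "additionalProperties": False,
    },
}

print(strictness_problems(loose_refund))   # three findings
print(strictness_problems(strict_refund))  # no findings
```

The loose definition lets the model omit `amount` or add extra keys, which is the kind of ambiguous argument the check's rationale warns about.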

auth

  • SHIP-AUTH-MANIFEST-BROAD-SCOPE high

    Manifest declares broad permission scopes.

    Why: Broad manifest scopes weaken least-privilege review.

  • SHIP-AUTH-MISSING-SCOPE high

    Scope-requiring tool lacks declared auth scopes.

    Why: Reviewers cannot assess least privilege without scope metadata.

  • SHIP-AUTH-SCOPE-COVERAGE-MISSING high

    Tool-required scopes are not covered by manifest permissions.scopes.

    Why: The manifest should describe the actual permissions needed by the release.

  • SHIP-AUTH-TOOL-BROAD-SCOPE high

    Tool declares broad auth scopes.

    Why: Tool-level broad scopes may grant more power than the operation needs.
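The scope-coverage idea behind SHIP-AUTH-SCOPE-COVERAGE-MISSING can be pictured as a set comparison between the manifest's permissions.scopes and the scopes each tool requires. The sketch below is illustrative only: the function `scope_gaps`, the tool names, and the scope strings are hypothetical, and the scanner's real data model may differ. The same comparison run in the other direction yields scopes that are declared but unused.

```python
def scope_gaps(
    manifest_scopes: set[str],
    tool_scopes: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Compare manifest scopes with the union of scopes the tools require."""
    required = set().union(*tool_scopes.values())
    return {
        # Needed by some tool but absent from the manifest.
        "uncovered": required - manifest_scopes,
        # Declared in the manifest but needed by no loaded tool.
        "unused": manifest_scopes - required,
    }


# Hypothetical manifest and tool metadata, for illustration only.
manifest_scopes = {"orders:read", "orders:write", "billing:admin"}
tool_scopes = {
    "get_order": {"orders:read"},
    "refund_order": {"orders:write", "payments:refund"},
}

gaps = scope_gaps(manifest_scopes, tool_scopes)
print(gaps["uncovered"])
print(gaps["unused"])
```

Here `payments:refund` is required by a tool but missing from the manifest, while `billing:admin` is declared but never used.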

documentation

  • SHIP-DOC-MISSING-DESCRIPTION medium

    Tool description is missing or too short.

    Why: Poor tool descriptions increase wrong-tool and reviewer misunderstanding risk.

inventory

  • SHIP-INVENTORY-LOW-CONFIDENCE-PRODUCTION-SURFACE high

    Production target includes low-confidence tool extraction.

    Why: Production promotion should not depend primarily on best-effort SDK inference.

  • SHIP-INVENTORY-NOT-ENUMERABLE high

    Tool surface cannot be enumerated from declared inputs.

    Why: A release gate must fail closed when it cannot see the agent's tools.

  • SHIP-INVENTORY-TOOL-SURFACE-TOO-LARGE medium

    Tool surface exceeds the MVP review threshold.

    Why: Large tool surfaces are harder to reason about during promotion.

  • SHIP-INVENTORY-WILDCARD-TOOLS high

    Wildcard or all-tools exposure is declared.

    Why: Wildcard tools make review and least-privilege reasoning impossible.

manifest

  • SHIP-MANIFEST-HIGH-RISK-OWNER-MISSING high

    Production high-risk tool has no declared owner.

    Why: High-risk production tools need an accountable owning team for review and remediation.

  • SHIP-MANIFEST-STALE-POLICY medium

    A policy references a missing tool.

    Why: Approval, confirmation, and idempotency policies should map to the actual release surface.

  • SHIP-MANIFEST-STALE-RISK-OVERRIDE medium

    A risk override references a missing tool.

    Why: Risk overrides should not outlive the tool they describe.

  • SHIP-MANIFEST-STALE-SUPPRESSION medium

    A suppression references a missing check or tool.

    Why: Stale suppressions hide intent and make release review harder to audit.

  • SHIP-MANIFEST-UNUSED-SCOPE medium

    Manifest declares permission scopes unused by loaded tools.

    Why: Unused permissions weaken least-privilege review and often indicate stale config.

policy

  • SHIP-POLICY-APPROVAL-MISSING critical

    High-risk tool lacks a declared approval policy.

    Why: High-risk actions need explicit approval before promotion.

  • SHIP-POLICY-CONFIRMATION-MISSING high

    Destructive/external/customer-communication tool lacks a confirmation policy.

    Why: Destructive and external actions should require explicit confirmation.

schema

  • SHIP-SCHEMA-BROAD-FREE-TEXT high

    Action-like tool accepts broad free-form input.

    Why: Broad action/body/update fields increase blast radius for write tools.

  • SHIP-SCHEMA-FREEFORM-OUTPUT medium

    Tool returns free-form string output.

    Why: Free-form tool output may carry prompt injection into later model context.

  • SHIP-SCHEMA-MISSING-BOUNDS high

    Risky numeric parameter lacks a maximum bound.

    Why: Unbounded counts or financial amounts weaken blast-radius control.
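As a rough illustration of SHIP-SCHEMA-MISSING-BOUNDS, the sketch below walks a JSON Schema `properties` map and reports numeric parameters that have a risky-sounding name but no `maximum` or `exclusiveMaximum`. The name hints, the function `unbounded_numeric_params`, and the sample schema are assumptions made for this example; how the scanner actually decides that a parameter is "risky" is not described in this catalog.

```python
from typing import Any

# Hypothetical name fragments that suggest a count or a financial amount.
RISKY_NAME_HINTS = ("amount", "count", "limit", "quantity")


def unbounded_numeric_params(parameters: dict[str, Any]) -> list[str]:
    """Return risky-looking numeric parameters that declare no upper bound."""
    flagged = []
    for name, spec in parameters.get("properties", {}).items():
        is_numeric = spec.get("type") in ("integer", "number")
        looks_risky = any(hint in name.lower() for hint in RISKY_NAME_HINTS)
        has_upper_bound = "maximum" in spec or "exclusiveMaximum" in spec
        if is_numeric and looks_risky and not has_upper_bound:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical parameter schema for a refund tool, for illustration only.
refund_parameters = {
    "type": "object",
    "properties": {
        "amount_cents": {"type": "integer", "minimum": 1},
        "quantity": {"type": "integer", "minimum": 1, "maximum": 10},
        "note": {"type": "string"},
    },
}

print(unbounded_numeric_params(refund_parameters))
```

Only `amount_cents` is reported: `quantity` is capped at 10, and `note` is not numeric.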

scope

  • SHIP-SCOPE-PROHIBITED-TOOL-PRESENT high

    Tool appears to overlap with a manifest prohibited action.

    Why: Prohibited actions should not be contradicted by attached tool capabilities.

  • SHIP-SCOPE-TOOL-OUTSIDE-PURPOSE high

    Write-capable tool contradicts a read-only declared purpose.

    Why: Declared purpose should constrain the attached tool surface.

security

  • SHIP-DOC-INJECTION-RISK medium

    Tool description contains instruction-override-like language.

    Why: Tool metadata can be placed into model context and should not contain prompt-like directives.

  • SHIP-DOC-SECRET-IN-DESCRIPTION medium

    Tool description contains a secret-like value.

    Why: Credentials in tool metadata can leak into reports, prompts, or logs.
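A secret-like value in a tool description can often be spotted with simple pattern matching. The following is a minimal sketch under stated assumptions: the two patterns, their labels, and the function `secret_like_hits` are hypothetical examples, not the scanner's rule set, and a real detector would need many more patterns plus tuning against false positives.

```python
import re

# Hypothetical patterns, for illustration only.
SECRET_PATTERNS = {
    # Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A credential-like name assigned an inline value of 8 or more characters.
    "key_value_assignment": re.compile(
        r"(?i)\b(api[_-]?key|token|secret|password)\s*[:=]\s*\S{8,}"
    ),
}


def secret_like_hits(description: str) -> list[str]:
    """Return the labels of all patterns that match the description."""
    return [
        label
        for label, pattern in SECRET_PATTERNS.items()
        if pattern.search(description)
    ]


print(secret_like_hits("Looks up an order by ID."))
print(secret_like_hits("Fetch invoices. Auth with api_key=abcd1234efgh5678."))
print(secret_like_hits("Uses AKIAIOSFODNN7EXAMPLE for S3."))
```

The first description produces no hits; the second and third each match one pattern. Because descriptions may be copied into prompts, reports, and logs, any hit is worth removing from the metadata and rotating.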

side_effects

  • SHIP-SIDEFX-IDEMPOTENCY-MISSING high

    Risky write tool lacks idempotency evidence; severity rises to critical when the tool is known to be retried.

    Why: Retries against non-idempotent writes can duplicate financial or external side effects.
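The duplicate-side-effect risk is easiest to see with a small example. The sketch below shows one common form of idempotency evidence, an idempotency key that the write path deduplicates on. Everything here is hypothetical (the `RefundService` class, its in-memory store, and the key format); a production service would persist keys durably and expire them.

```python
from typing import Any


class RefundService:
    """Toy write tool that deduplicates requests by idempotency key."""

    def __init__(self) -> None:
        self.refunds_issued = 0
        self._results: dict[str, dict[str, Any]] = {}

    def refund(self, order_id: str, amount_cents: int, idempotency_key: str) -> dict[str, Any]:
        # A repeated key returns the stored result instead of refunding again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.refunds_issued += 1  # stands in for the real side effect
        result = {
            "order_id": order_id,
            "amount_cents": amount_cents,
            "refund_number": self.refunds_issued,
        }
        self._results[idempotency_key] = result
        return result


service = RefundService()
first = service.refund("order-123", 2500, idempotency_key="refund-order-123-attempt")
# Simulated retry after a timeout: same arguments, same key.
retry = service.refund("order-123", 2500, idempotency_key="refund-order-123-attempt")

print(service.refunds_issued)
print(first == retry)
```

With the key in place, the retry returns the original result and only one refund is issued. Without it, each retry would trigger the side effect again.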
