Implementation checklist

Agent-first MCP server checklist

Use this checklist when evaluating, adopting, building, or reviewing a Model Context Protocol server that AI agents will connect to. MCP makes tools easier to plug into agents; it does not automatically make those tools safe, inspectable, verifiable, or operationally useful.

MCP servers, Tool design, Agent operations
Best use: review every MCP tool as if an unattended agent will call it from a bounded cron job. For each item, write the exact tool name, JSON schema field, permission scope, receipt, error code, or recovery command that satisfies it.

The core test

An agent-first MCP server is not just a wrapper around an API. It is an operating surface that helps the agent decide whether to act, act with the smallest safe authority, and report verifiable evidence afterwards.

Fragile MCP server
Exposes a generic run_command or call_api tool with broad credentials, prose errors, hidden side effects, and no durable IDs.
Agent-first MCP server
Exposes narrow tools such as create_draft_invoice, preview_deploy, and verify_ticket_state with typed inputs, scoped auth, idempotency keys, and receipts.

Before the agent calls a tool

Discoverable

Tool names, descriptions, and schemas make intent obvious

  • Tool names describe business operations, not transport mechanics: prefer create_calendar_hold over post_json.
  • Descriptions state side effects, required authority, expected latency, retry safety, and whether human approval is needed.
  • Input schemas include units, enum values, formats, default behavior, constraints, and realistic examples.
  • Required fields distinguish user intent from operational metadata such as idempotency keys, dry-run flags, or request IDs.
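
As a rough illustration of the items above, the sketch below describes one tool as a plain Python dictionary shaped like an MCP tool definition (name, description, inputSchema). Only the tool names create_calendar_hold and post_json come from this checklist. Every schema field, enum value, latency figure, and the helpers lint_tool and split_fields are assumptions made for illustration, not part of any SDK.

```python
# Hypothetical descriptor for the create_calendar_hold tool named above.
# All field names and values inside it are illustrative assumptions.
CREATE_CALENDAR_HOLD = {
    "name": "create_calendar_hold",
    "description": (
        "Creates a tentative hold on one calendar. Side effect: writes one event. "
        "Requires calendar write scope. Typical latency under 2 seconds. "
        "Safe to retry with the same idempotency_key. No human approval needed."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "calendar_id": {"type": "string", "examples": ["team-ops"]},
            "start": {"type": "string", "format": "date-time"},
            "duration_minutes": {"type": "integer", "minimum": 5, "maximum": 480, "default": 30},
            "visibility": {"type": "string", "enum": ["private", "busy_only"], "default": "busy_only"},
            "dry_run": {"type": "boolean", "default": False},
            "idempotency_key": {"type": "string", "minLength": 8},
        },
        "required": ["calendar_id", "start", "idempotency_key"],
        "additionalProperties": False,
    },
}

# Assumed conventions for this sketch only.
OPERATIONAL_FIELDS = {"dry_run", "idempotency_key", "request_id"}
TRANSPORT_WORDS = {"post", "put", "json", "http", "api"}


def lint_tool(tool):
    """Return a list of human-readable problems found in a tool descriptor."""
    problems = []
    if set(tool["name"].split("_")) & TRANSPORT_WORDS:
        problems.append("name describes transport, not a business operation")
    description = tool["description"].lower()
    for topic in ("side effect", "retry", "approval"):
        if topic not in description:
            problems.append(f"description does not mention {topic}")
    for field, spec in tool["inputSchema"]["properties"].items():
        is_numeric = spec.get("type") in ("integer", "number")
        if is_numeric and "minimum" not in spec and "maximum" not in spec:
            problems.append(f"{field} has no numeric constraint")
    return problems


def split_fields(tool):
    """Separate user-intent fields from operational metadata fields."""
    properties = tool["inputSchema"]["properties"]
    intent = sorted(f for f in properties if f not in OPERATIONAL_FIELDS)
    operational = sorted(f for f in properties if f in OPERATIONAL_FIELDS)
    return intent, operational
```

A review script could run lint_tool over every tool a server lists and treat any non-empty result as a checklist failure. The keyword checks are deliberately crude and would need tuning for a real server.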

Scoped

Authority is visible and narrow

  • The server can run with credentials scoped to one workspace, project, environment, mailbox, repository, or account segment.
  • Read-only tools exist for inspection before mutation: list_permissions, get_status, preview_change, and estimate_cost.
  • Dangerous tools require explicit, operation-specific confirmation values rather than broad “yes to all” parameters.
  • The server exposes its effective actor, tenant, scopes, and rate limits without printing secrets.
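
One way to picture the scoping items is a read-only "who am I" view plus a destructive tool that refuses to run without a confirmation value tied to one operation on one resource. The following is a minimal sketch under assumptions: the Actor class, the scope string projects:delete, the error codes, and the delete_project tool are all hypothetical, and nothing is actually deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    actor_id: str
    tenant: str
    scopes: tuple
    rate_limit_per_minute: int
    api_token: str  # secret: must never appear in tool output


def describe_actor(actor):
    """Read-only view of effective authority, with the secret left out."""
    return {
        "actor": actor.actor_id,
        "tenant": actor.tenant,
        "scopes": list(actor.scopes),
        "rate_limit_per_minute": actor.rate_limit_per_minute,
    }


def expected_confirmation(operation, resource_id):
    """Confirmation string that is only valid for one operation on one resource."""
    return f"{operation}:{resource_id}"


def delete_project(actor, project_id, confirm):
    """Hypothetical dangerous tool; this sketch only simulates the deletion."""
    if "projects:delete" not in actor.scopes:
        return {"ok": False, "error": {"code": "SCOPE_MISSING",
                                       "required_scope": "projects:delete"}}
    if confirm != expected_confirmation("delete_project", project_id):
        return {"ok": False, "error": {"code": "CONFIRMATION_MISMATCH",
                                       "expected_format": "delete_project:<project_id>"}}
    return {"ok": True, "deleted": project_id}
```

Because the confirmation embeds the resource ID, a value approved for one project cannot be replayed against another, which is the difference from a broad "yes to all" flag.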

When the agent acts

Bounded

Every mutation has a planning path

  • Consequential tools support dry_run, preview, or plan modes that return the resources that would change.
  • Long-running work returns a job ID immediately and offers a separate status tool rather than relying on a single hanging call.
  • Inputs support idempotency keys so a retry cannot duplicate invoices, tickets, messages, deployments, or purchases.
  • Tools define timeouts and maximum result sizes so agents do not burn context on unbounded payloads.
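
The dry-run, idempotency, and result-size items can be sketched with a small in-memory tool. This is an illustration under assumptions, not a reference implementation: the InvoiceTool class, its storage, the ID format, and the MAX_ITEMS cap are hypothetical, and only the tool name create_draft_invoice comes from this checklist. Job IDs for long-running work are left out to keep the sketch short.

```python
import uuid


class InvoiceTool:
    MAX_ITEMS = 50  # assumed hard cap on list results

    def __init__(self):
        self._by_key = {}   # idempotency_key -> result of the first call
        self.invoices = {}  # invoice_id -> stored draft

    def create_draft_invoice(self, customer_id, amount_cents, idempotency_key, dry_run=False):
        plan = {"would_create": {"customer_id": customer_id, "amount_cents": amount_cents}}
        if dry_run:
            # Planning path: report what would change, write nothing.
            return {"ok": True, "dry_run": True, "plan": plan}
        if idempotency_key in self._by_key:
            # Retry path: return the earlier result instead of creating a duplicate.
            # A stricter server might also reject a reused key with different inputs.
            return dict(self._by_key[idempotency_key], replayed=True)
        invoice_id = f"inv_{uuid.uuid4().hex[:12]}"
        self.invoices[invoice_id] = plan["would_create"]
        result = {"ok": True, "dry_run": False, "invoice_id": invoice_id, "replayed": False}
        self._by_key[idempotency_key] = result
        return result

    def list_invoices(self, limit=20):
        # Clamp the caller's limit so a single call cannot return an unbounded payload.
        limit = min(limit, self.MAX_ITEMS)
        ids = sorted(self.invoices)
        return {"items": ids[:limit], "truncated": len(ids) > limit, "total": len(ids)}
```

Calling create_draft_invoice twice with the same idempotency key yields the same invoice_id, with replayed set to true on the second call, so an agent that times out can retry without creating a second invoice.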

Verifiable

Outputs include receipts

  • Successful mutations return durable IDs, URLs, versions, timestamps, changed fields, audit event IDs, or log references.
  • Accepted asynchronous work is clearly different from completed work.
  • Verification tools can re-fetch the result by ID and return the current state in structured form.
  • Human-facing summaries cite the same evidence fields that an agent can store in its final report.

Example call and receipt for a completed mutation:

{
  "tool": "create_release_note",
  "input": {
    "repo": "example/app",
    "version": "1.8.0",
    "source_prs": [123, 124],
    "dry_run": false,
    "idempotency_key": "release-note-1.8.0-2026-05-11"
  },
  "output": {
    "ok": true,
    "receipt": {
      "document_id": "doc_01J...",
      "url": "https://docs.example.com/releases/1.8.0",
      "version": 3,
      "created_at": "2026-05-11T09:00:00Z",
      "audit_event_id": "evt_01J..."
    }
  }
}
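
To show how a receipt like the one above might be checked later, the sketch below separates "accepted" from "completed" and re-fetches state by ID. It assumes in-memory stores and hypothetical tools (start_publish, finish_publish, get_job_status, verify_document). Only the receipt fields document_id and version are taken from the example.

```python
JOBS = {}       # job_id -> job record (assumed in-memory store)
DOCUMENTS = {}  # document_id -> current stored state


def start_publish(job_id, document_id):
    """Accept asynchronous work: nothing has been written yet."""
    JOBS[job_id] = {"status": "accepted", "document_id": document_id}
    return {"ok": True, "status": "accepted", "job_id": job_id}


def finish_publish(job_id, version):
    """Simulate the background worker completing the job."""
    job = JOBS[job_id]
    DOCUMENTS[job["document_id"]] = {"version": version}
    job["status"] = "completed"


def get_job_status(job_id):
    """Separate status tool, so callers never wait on one hanging call."""
    job = JOBS.get(job_id)
    if job is None:
        return {"ok": False, "error": {"code": "JOB_NOT_FOUND"}}
    return {"ok": True, "job_id": job_id, "status": job["status"]}


def verify_document(receipt):
    """Re-fetch by ID and compare the current state with the receipt."""
    current = DOCUMENTS.get(receipt["document_id"])
    if current is None:
        return {"verified": False, "reason": "NOT_FOUND"}
    if current["version"] != receipt["version"]:
        return {"verified": False, "reason": "VERSION_MISMATCH",
                "current_version": current["version"]}
    return {"verified": True, "current_version": current["version"]}
```

An agent's final report can then cite the verify_document result rather than the original tool response, so the evidence reflects the state at verification time.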

After something goes wrong

Recoverable

Errors teach the next safe action

  • Error payloads include stable codes, retry safety, affected resources, and recommended next tool calls.
  • Partial success is represented explicitly with completed, pending, failed, and rollbackable components.
  • Timeouts can be resolved by querying status with a job ID, resource ID, or idempotency key.
  • Rollback, cancel, restore, or human-handoff tools exist for workflows where failure has operational consequences.
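
A structured error is easier to judge next to the code that consumes it. The sketch below is one possible shape, assuming hypothetical field names (retry_safe, affected_resources, next_tools) and a hypothetical agent-side helper next_action. None of these names are defined by MCP itself.

```python
def error_payload(code, retry_safe, affected, next_tools):
    """Build an error that tells the caller what is safe to do next."""
    return {"ok": False, "error": {"code": code, "retry_safe": retry_safe,
                                   "affected_resources": affected,
                                   "next_tools": next_tools}}


def summarize_batch(results):
    """Group per-item outcomes so partial success is explicit."""
    summary = {"completed": [], "pending": [], "failed": [], "rollbackable": []}
    for item in results:
        summary[item["state"]].append(item["id"])
        if item["state"] == "completed" and item.get("can_rollback"):
            summary["rollbackable"].append(item["id"])
    # The batch only counts as fully successful when nothing failed or is pending.
    summary["ok"] = not summary["failed"] and not summary["pending"]
    return summary


def next_action(response):
    """Agent-side decision driven by structured fields, not by parsing prose."""
    if response.get("ok"):
        return "done"
    error = response["error"]
    if error["retry_safe"]:
        return "retry"
    if error["next_tools"]:
        return f"call:{error['next_tools'][0]}"
    return "handoff_to_human"
```

The point of the design is that an unattended agent never has to guess from an error message whether a retry could duplicate work.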

Operable

The server is safe to run in production agent loops

  • Logs include request IDs, actor, tool name, scope, and outcome while redacting tokens, message contents, and secrets by default.
  • Rate limits, quotas, health checks, and provider outages are visible through MCP tools or operator endpoints.
  • Configuration can be validated without starting destructive workflows.
  • There is a documented shutdown, credential-rotation, and emergency-disable path for the server or individual tools.
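
The logging item can be sketched as a redact-by-default step applied before anything is written. This is a minimal illustration: the set of sensitive key names, the placeholder string, and the log_record helper are assumptions, and a real deployment would likely need a broader list.

```python
import json

# Assumed list of keys whose values are never logged.
REDACTED_KEYS = {"token", "authorization", "password", "secret", "message_body"}


def redact(value):
    """Recursively replace the values of sensitive keys with a placeholder."""
    if isinstance(value, dict):
        return {key: "[REDACTED]" if key.lower() in REDACTED_KEYS else redact(item)
                for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def log_record(request_id, actor, tool, scope, outcome, arguments):
    """Return one JSON log line with identifying fields and redacted arguments."""
    return json.dumps({
        "request_id": request_id,
        "actor": actor,
        "tool": tool,
        "scope": scope,
        "outcome": outcome,
        "arguments": redact(arguments),
    }, sort_keys=True)
```

Redacting by key name keeps the request ID, actor, tool name, scope, and outcome searchable while secrets and message contents stay out of the log.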

Fast scoring rule

Ready for agent workflows
The MCP server exposes narrow typed tools, scoped credentials, dry-runs, idempotency, receipts, verification tools, and recoverable errors.
Useful but fragile
Agents can call it, but humans must still interpret broad tools, unclear side effects, dashboard-only state, or uncertain failures.
Not agent-first yet
The server mostly gives agents a remote-control tunnel with broad authority and no durable evidence.
Most useful fix
Replace the most dangerous generic tool with one narrow operation that has a dry-run, idempotency key, and receipt schema.
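
The scoring rule can be written down as a small function. This is only one reading of the three tiers. The property names are hypothetical labels for the items listed under "Ready for agent workflows", and treating "broad authority and no durable evidence" as "neither narrow tools nor scoped credentials, and no receipts" is an assumption.

```python
# Hypothetical labels for the seven properties in the "Ready" tier.
READY_PROPERTIES = (
    "narrow_typed_tools", "scoped_credentials", "dry_runs", "idempotency",
    "receipts", "verification_tools", "recoverable_errors",
)


def score_server(properties):
    """Map a dict of property -> bool onto the three tiers of the scoring rule."""
    present = {name for name in READY_PROPERTIES if properties.get(name)}
    if len(present) == len(READY_PROPERTIES):
        return "Ready for agent workflows"
    broad_authority = not ({"narrow_typed_tools", "scoped_credentials"} & present)
    no_evidence = "receipts" not in present
    if broad_authority and no_evidence:
        return "Not agent-first yet"
    return "Useful but fragile"
```

For example, a server with scoped credentials and receipts but no dry-runs would land in "Useful but fragile" under these assumptions.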

Next: compare the MCP server with the API and CLI checklists. The same agent-first promises should hold at every interface layer.