Agent-first CLI checklist

Use this checklist when evaluating, adopting, or designing a command-line tool that autonomous AI agents will run. The test is whether an agent can discover the interface, call it non-interactively, detect success, cite evidence for the result, and recover when the terminal lies or hangs.

CLI design · Agent reliability · Engineering checklist
Best use: add this to a CLI design review before agents depend on the tool. For each section, write the exact flag, exit code, output field, log path, or rollback command that satisfies it.

Start with the agent’s failure modes

Human-friendly CLIs often assume someone is watching the screen. Agents need stronger contracts because they run tools inside bounded contexts, sometimes without a browser, TTY, or long-term memory of the terminal scrollback.

Fragile for agents
deploy opens an interactive prompt, streams coloured prose for ten minutes, exits 0, and tells the user to check a dashboard.
Agent-first
deploy --json --non-interactive --wait --timeout 600 returns a deployment ID, URL, final state, changed version, and next recovery command.
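
From the agent's side, the contrast above can be sketched as a small runner that never offers a TTY or stdin, enforces its own timeout, and only trusts JSON on stdout. This is a minimal sketch, not a definitive implementation: `ToolResult` and `run_tool` are hypothetical names, and the only real API used is Python's standard `subprocess.run`.

```python
from __future__ import annotations

import json
import subprocess
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    exit_code: Optional[int]   # None when the process was killed on timeout
    payload: Optional[Any]     # parsed stdout JSON, or None if stdout was not JSON
    stderr: str                # human-readable logs
    timed_out: bool


def run_tool(argv: list[str], timeout_s: float) -> ToolResult:
    """Run a CLI with no stdin and a hard timeout, then parse stdout as JSON."""
    try:
        proc = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # a stray prompt reads EOF instead of hanging
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # Captured output on timeout may be bytes or None, so normalise it.
        stderr = exc.stderr or ""
        if isinstance(stderr, bytes):
            stderr = stderr.decode(errors="replace")
        return ToolResult(None, None, stderr, True)

    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None  # prose on stdout is treated as "no structured result"
    return ToolResult(proc.returncode, payload, proc.stderr, False)
```

A runner like this can only report success confidently when the tool honours the contracts in the rest of this checklist: a zero exit code that means the final state was reached, and a parseable payload on stdout.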

Before the agent runs anything

Discoverable

Machine-readable help and versioning

  • --help works without network access, authentication, a pager, or a TTY, and never relies on colour alone to convey meaning.
  • A --json-help flag, schema, OpenAPI link, or man page exposes commands, flags, defaults, environment variables, and examples.
  • --version prints a stable semantic version plus plugin/provider versions that affect behaviour.
  • Deprecated flags fail with actionable messages rather than silently changing an agent workflow.
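
One way to keep human help and machine-readable help from drifting apart is to generate both from a single flag table. The sketch below assumes a hypothetical `FLAGS` table and a placeholder `VERSION`; the flag names come from this checklist, but the JSON layout is an illustration, not a standard.

```python
import argparse
import json

VERSION = "1.4.2"  # placeholder semantic version for illustration

# Hypothetical flag table: the single source for both the parser and --json-help.
FLAGS = [
    {"name": "--env", "default": "staging", "help": "Target environment."},
    {"name": "--timeout", "default": 600, "help": "Seconds to wait for a final state."},
]


def build_parser() -> argparse.ArgumentParser:
    """Build the human-facing parser from the flag table."""
    parser = argparse.ArgumentParser(prog="aft-example")
    for flag in FLAGS:
        parser.add_argument(
            flag["name"],
            default=flag["default"],
            type=type(flag["default"]),  # str or int, inferred from the default
            help=flag["help"],
        )
    parser.add_argument("--json-help", action="store_true",
                        help="Print flags and defaults as JSON and exit.")
    return parser


def json_help() -> str:
    """Render the same flag table as JSON; needs no network, auth, or TTY."""
    doc = {"command": "aft-example", "version": VERSION, "flags": FLAGS}
    return json.dumps(doc, indent=2, sort_keys=True)
```

Because `json_help()` reads only in-process data, an agent can call it before it has credentials or network access.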
Bounded

Explicit authority and operating scope

  • Every command has a visible target scope: workspace, project, environment, account, region, file path, or resource ID.
  • Dangerous commands require a named flag such as --confirm-delete RESOURCE_ID, not a generic -y hidden in docs.
  • Read-only inspection commands exist for planning: status, diff, plan, quota, whoami, and permissions.
  • Credentials can be scoped and rotated independently for agent use; the CLI never asks an agent to paste broad human credentials into a prompt.
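
The named-confirmation rule can be sketched as a check that the confirmation value repeats the exact resource being deleted. The `delete` subcommand and the `authorize_delete` helper are hypothetical; only the `--confirm-delete RESOURCE_ID` flag shape comes from the checklist.

```python
import argparse


def build_delete_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical delete subcommand with a named confirmation."""
    parser = argparse.ArgumentParser(prog="aft-example delete")
    parser.add_argument("resource_id", help="Resource to delete.")
    parser.add_argument("--confirm-delete", metavar="RESOURCE_ID", default=None,
                        help="Must repeat the resource ID exactly.")
    return parser


def authorize_delete(args: argparse.Namespace) -> bool:
    """Allow the deletion only when the confirmation names this exact resource."""
    return args.confirm_delete is not None and args.confirm_delete == args.resource_id
```

Unlike a generic -y, a confirmation copied from an earlier command cannot authorize deleting a different resource.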

When the agent executes the command

Scriptable

Non-interactive by default for automation

  • Every workflow agents need supports --non-interactive or works without stdin prompts.
  • Commands that may wait support --timeout, --wait, --poll-interval, and --no-watch modes.
  • Progress indicators never replace final structured output; spinners, colours, and cursor control are disabled when stdout is not a TTY.
  • Config search paths are inspectable, overridable, and printed in diagnostics without exposing secret values.
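
A minimal sketch of the prompting rule, assuming hypothetical helpers `is_tty` and `resolve_value`: the tool prompts only when a real terminal is attached and --non-interactive was not passed, and otherwise fails fast with a message naming the missing flag.

```python
import sys


class PromptRequired(Exception):
    """Raised when a value is missing and prompting is not allowed."""


def is_tty(stream) -> bool:
    """True only when the stream is attached to a real terminal."""
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def resolve_value(name, provided, non_interactive, stdin=None):
    """Return a flag value, prompting only when a human can actually answer."""
    if provided is not None:
        return provided
    stdin = sys.stdin if stdin is None else stdin
    if non_interactive or not is_tty(stdin):
        # Fail fast with an actionable message instead of blocking on input.
        raise PromptRequired(f"missing --{name}; refusing to prompt without a TTY")
    print(f"{name}: ", end="", file=sys.stderr, flush=True)  # prompt goes to stderr
    return stdin.readline().strip()
```

The same `is_tty` check can be applied to stdout to decide whether spinners, colours, and cursor control are enabled.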
Verifiable

Exit codes and structured output are contracts

  • 0 means the requested final state was reached, not merely that a job was accepted.
  • Distinct non-zero exit codes distinguish validation errors, auth failures, rate limits, timeouts, partial success, and unsafe retries.
  • --json returns parseable JSON on stdout and puts human logs on stderr.
  • Mutations return durable receipts: IDs, URLs, versions, checksums, diffs, audit event IDs, or log paths that the agent can cite later.
$ aft-example deploy --env prod --dry-run --json
{
  "ok": true,
  "mode": "dry_run",
  "planned_changes": [
    {"resource": "caddyfile", "action": "reload", "risk": "low"},
    {"resource": "site", "action": "sync", "files_changed": 3}
  ],
  "requires_confirmation": "deploy-prod-2026-05-10T09:04:12Z"
}

$ aft-example deploy --env prod \
  --confirm deploy-prod-2026-05-10T09:04:12Z \
  --wait --timeout 600 --json
{
  "ok": true,
  "deployment_id": "dep_01HX...",
  "url": "https://agentfirsttools.com/",
  "version": "git:abc1234",
  "receipt": "/deployments/dep_01HX...",
  "verified_at": "2026-05-10T09:05:48Z"
}
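
The exit-code and output contract can be sketched on the tool side as an enumeration plus a single emit function. The specific numbers below are illustrative assumptions, not an established convention; what matters is that each failure class has its own stable code and that JSON goes to stdout while human logs go to stderr.

```python
import enum
import json
import sys


class Exit(enum.IntEnum):
    """Hypothetical exit codes; the numbering is an assumption for illustration."""
    OK = 0            # requested final state was reached
    VALIDATION = 2
    AUTH = 3
    RATE_LIMITED = 4
    TIMEOUT = 5
    PARTIAL = 6       # some changes applied, some not
    UNSAFE_RETRY = 7  # repeating the command could duplicate a mutation


def emit(result: dict, code: Exit, stdout=None, stderr=None) -> int:
    """Write one JSON document to stdout and the human log line to stderr."""
    stdout = sys.stdout if stdout is None else stdout
    stderr = sys.stderr if stderr is None else stderr
    print(f"finished with {code.name}", file=stderr)
    document = {**result, "ok": code is Exit.OK, "exit_code": int(code)}
    json.dump(document, stdout)
    stdout.write("\n")
    return int(code)
```

A command's entry point would then end with something like `sys.exit(emit(result, Exit.PARTIAL))`, so the process status and the "ok" field can never disagree.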

After something goes wrong

Recoverable

Failures must leave a next command

  • Timeouts and interrupted connections return enough state to query later by job ID, idempotency key, or resource version.
  • Errors include stable codes, short human messages, retry safety, and the precise command to inspect or resume.
  • Partial success is a first-class state, not buried in prose after exit 0.
  • Rollback, cancel, restore, or human-handoff commands exist for consequential workflows.
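
A structured error that carries its own next command might look like the sketch below. The field names, the `deploy_timeout` code, and the `aft-example status --job` command are all hypothetical; they illustrate the shape of the contract rather than any existing tool.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class CliError:
    code: str                     # stable, machine-matchable identifier
    message: str                  # short human-readable summary
    retry_safe: bool              # whether re-running the same command is safe
    next_command: str             # exact command to inspect or resume
    job_id: Optional[str] = None  # state handle for querying later


def timeout_error(job_id: str) -> CliError:
    """Build the error for a wait that expired while the job kept running."""
    return CliError(
        code="deploy_timeout",
        message="Deployment did not reach a final state before the timeout.",
        retry_safe=False,
        next_command=f"aft-example status --job {job_id} --json",
        job_id=job_id,
    )


def to_json(error: CliError) -> str:
    return json.dumps(asdict(error), sort_keys=True)


def plan_recovery(error_doc: dict, original_command: str) -> str:
    """Agent side: re-run only when the tool explicitly says a retry is safe."""
    if error_doc.get("retry_safe") is True:
        return original_command
    return error_doc["next_command"]
```

With this shape the agent never has to guess from prose whether a timed-out mutation is safe to repeat.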
Observable

Logs and audit trails work outside the terminal

  • The CLI can print the location of local logs, remote job logs, and audit events without requiring dashboard navigation.
  • Logs avoid secrets by default, and a support-bundle or export mode produces output that is safe to attach to an agent report.
  • Commands include request IDs or trace IDs so an agent can correlate CLI output with API events.
  • Rate limits, quotas, and provider outages are visible as machine-readable state.
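
Secret-safe, correlatable logging can be sketched as a redaction pass plus a request ID on every record. The `SECRET_KEYS` set and the record layout are assumptions for illustration; a real tool would need a list matched to its own configuration and API fields.

```python
import json
import uuid

# Hypothetical set of field names whose values must never reach a log.
SECRET_KEYS = {"token", "password", "api_key", "authorization"}


def redact(value):
    """Recursively replace values stored under secret-looking keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SECRET_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def log_event(event: str, fields: dict, request_id: str = "") -> str:
    """Return one JSON log line carrying a request ID for correlation."""
    record = dict(redact(fields))
    record["event"] = event
    record["request_id"] = request_id or uuid.uuid4().hex
    return json.dumps(record, sort_keys=True)
```

An agent can then quote the request ID in its report, and the attached log lines are safe to share by default.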

Fast scoring rule

Ready for agent workflows
The CLI is discoverable, non-interactive, scoped, structured, and idempotent, and it returns receipts the agent can verify.
Useful but fragile
Agents can run it, but humans still need to interpret prompts, dashboard states, prose errors, or uncertain timeouts.
Not agent-first yet
The CLI assumes a human operator at the keyboard and cannot safely support unattended agent loops.
Most useful fix
Add --json, --non-interactive, --dry-run, stable exit codes, and a receipt field to the most important mutation command.
Next: compare this with the API checklist. CLIs often wrap APIs, so the safest agent-first implementation exposes the same receipts, scopes, idempotency, and recovery states at both layers.