The six criteria
What agent-first is not
Agent-first does not mean removing humans, hiding complexity, or granting a model unlimited authority. It means designing the boundary between agent and tool so that autonomy is useful, observable, and reversible.
- It is not “we have a chatbot”. A chatbot can be a front door, but agents need callable operations and verifiable state.
- It is not “we have an API”. APIs without docs, examples, scoped credentials, idempotency, and clear errors still force brittle guesswork.
- It is not “let the agent use the web app”. Browser control is useful, but it should be a fallback rather than the primary integration surface.
A quick design checklist
- Can an agent list available actions and required inputs without reading a private Notion page or guessing from the UI?
- Can it run a dry-run or preview before making a consequential change?
- Can credentials be scoped narrowly enough that a compromised agent cannot damage the whole account?
- Does every write action return a durable handle: URL, ID, commit, diff, job ID, receipt, or audit event?
- Can the agent ask “what happened?” after a timeout, crash, or network failure and get a truthful answer?
- Are error messages written for recovery, with codes and suggested next actions, not just for human display?
- Can a human review, approve, undo, or take over at natural checkpoints?
Example: a deployment tool
A merely automated deployment button says “deploy latest main” and returns a spinner. An agent-first deployment tool exposes:
deploy planto preview files, services, migrations, and expected downtime.deploy run --ref <sha>with an idempotency key and scoped environment credentials.deploy status <job_id>with structured states rather than only streaming logs.deploy rollback <release_id>and a clear policy for irreversible steps.- Links to logs, health checks, diffs, and audit events that the agent can cite back to a human.
Why this matters commercially
Teams are already buying AI coding assistants, workflow agents, and automation platforms. The bottleneck is often not model capability; it is that the surrounding tools were designed for humans clicking screens, not agents operating safely across systems. Agent-first design turns one-off demos into reliable work.