Humans write the contract. Agents implement. Machines verify.
An AI-native BDD/spec tool that shifts code review from reading diffs to defining intent.
Traditional code review asks humans to judge a 500-line diff. agent-spec moves the review point: humans define a 50-line contract, and machines verify the code against it.
Traditional: Issue → Branch → Code → PR → Read Diff (80%) → Approve
agent-spec: Contract (60%) → Agent Codes → Explain → Approve
Human time shifts from "reading code" to "defining intent" — a higher-value activity. Quality assurance shifts from "human judgment" to "machine verification".
agent-spec ships project-local Skills that teach AI agents the contract-driven workflow. One install command — works with Claude Code, Codex, Cursor, Aider, and any agent that reads workspace conventions.
The default integration path. Teaches the Agent the seven-step workflow: read the Contract, implement within Boundaries, run lifecycle for verification, retry on failure, generate explain for review. CLI commands are the primary interface.
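The implement, verify, retry portion of that workflow can be sketched as a small loop. This is only an illustration under stated assumptions: `run_with_retries`, its closure parameters, and the idea of passing failure messages back as feedback are hypothetical stand-ins, not agent-spec's actual interface. In practice the verify step would be the lifecycle command rather than a closure.

```rust
/// Hypothetical sketch: run implement/verify rounds until verification
/// reports no failures or the attempt budget is used up.
///
/// `implement` receives the failures from the previous round (empty on the
/// first round). `verify` returns the current list of failures.
/// Returns Ok(attempt_number) on success, Err(last_failures) otherwise.
/// `max_attempts` is expected to be at least 1.
pub fn run_with_retries<I, V>(
    max_attempts: u32,
    mut implement: I,
    mut verify: V,
) -> Result<u32, Vec<String>>
where
    I: FnMut(&[String]),
    V: FnMut() -> Vec<String>,
{
    let mut feedback: Vec<String> = Vec::new();
    for attempt in 1..=max_attempts {
        // The agent edits code, guided by the previous round's failures.
        implement(&feedback);
        // Verification is the only thing that decides success.
        feedback = verify();
        if feedback.is_empty() {
            return Ok(attempt);
        }
    }
    Err(feedback)
}
```

The point of the shape is that the loop exits only on an empty failure list from verification, never on the agent's own claim that it is done.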
The spec writing path. Teaches the Agent how to draft and revise Task Contracts in the DSL: the four-element structure, bilingual keywords, test selectors, step tables, and the "exception paths ≥ happy paths" principle.
A Contract is not a vague Issue. It's a precise specification with four parts that constrain the Agent's behavior and define deterministic acceptance criteria.
A focused statement of purpose. Not a feature list — a clear direction that gives the Agent context.
Already-decided choices that narrow the Agent's decision space. The Agent follows these without questioning.
Path globs are mechanically enforced by the BoundariesVerifier. Natural language prohibitions are checked by lint.
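To make "mechanically enforced" concrete, here is a minimal sketch of a path-glob boundary check. It assumes a simplified glob syntax (`*` matches within one path segment, `**` matches any number of segments) and assumes the globs form an allow-list. The function names `glob_match` and `out_of_bounds` are hypothetical; the real BoundariesVerifier may use different syntax and semantics.

```rust
/// Match a single path segment against a pattern where `*` matches any
/// run of characters (including none).
fn wildcard(p: &[char], t: &[char]) -> bool {
    match p.split_first() {
        None => t.is_empty(),
        Some((&'*', rest)) => (0..=t.len()).any(|skip| wildcard(rest, &t[skip..])),
        Some((c, rest)) => match t.split_first() {
            Some((head, tail)) => c == head && wildcard(rest, tail),
            None => false,
        },
    }
}

fn match_segment(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    wildcard(&p, &t)
}

/// Match whole paths segment by segment; `**` consumes zero or more segments.
fn match_segments(pattern: &[&str], path: &[&str]) -> bool {
    let (seg, rest) = match pattern.split_first() {
        Some(parts) => parts,
        None => return path.is_empty(),
    };
    if *seg == "**" {
        return (0..=path.len()).any(|skip| match_segments(rest, &path[skip..]));
    }
    match path.split_first() {
        Some((head, tail)) => match_segment(seg, head) && match_segments(rest, tail),
        None => false,
    }
}

/// Hypothetical helper: does `path` match the simplified glob `pattern`?
pub fn glob_match(pattern: &str, path: &str) -> bool {
    let p: Vec<&str> = pattern.split('/').collect();
    let s: Vec<&str> = path.split('/').collect();
    match_segments(&p, &s)
}

/// Hypothetical helper: changed paths that match none of the allowed globs.
pub fn out_of_bounds<'a>(allowed: &[&str], changed: &[&'a str]) -> Vec<&'a str> {
    changed
        .iter()
        .copied()
        .filter(|path| !allowed.iter().any(|glob| glob_match(glob, path)))
        .collect()
}
```

Because the check is pure string matching over the list of changed files, it needs no model call and gives the same answer on every run.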
BDD scenarios with explicit test bindings. Key rule: exception paths ≥ happy paths.
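The two rules in this part (every scenario is bound to a test, and exception paths are at least as numerous as happy paths) lend themselves to a simple lint. The sketch below is illustrative only: the `Scenario` struct, the `PathKind` tag, and `lint_scenarios` are hypothetical names, and how agent-spec actually classifies a scenario as a happy or exception path is not specified here.

```rust
/// Hypothetical classification of a scenario.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathKind {
    Happy,
    Exception,
}

/// Hypothetical in-memory form of one BDD scenario.
pub struct Scenario {
    pub name: &'static str,
    pub kind: PathKind,
    /// Name of the test bound to this scenario, if any.
    pub test_selector: Option<&'static str>,
}

/// Return one message per problem found; an empty list means the
/// scenarios satisfy both rules.
pub fn lint_scenarios(scenarios: &[Scenario]) -> Vec<String> {
    let mut problems = Vec::new();

    // Rule 1: every scenario needs an explicit test binding.
    for s in scenarios {
        if s.test_selector.is_none() {
            problems.push(format!("scenario '{}' has no test binding", s.name));
        }
    }

    // Rule 2: exception paths >= happy paths.
    let happy = scenarios.iter().filter(|s| s.kind == PathKind::Happy).count();
    let exception = scenarios
        .iter()
        .filter(|s| s.kind == PathKind::Exception)
        .count();
    if exception < happy {
        problems.push(format!(
            "{} exception path(s) for {} happy path(s); expected at least as many",
            exception, happy
        ));
    }

    problems
}
```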
Human writes intent. Agent implements code. Machine verifies correctness. Each step has a clear owner and a specific agent-spec command.
Deterministic layers run first — zero token cost, no false negatives. AI layers handle the residual — probabilistic, with structured evidence.
Key rule: skip ≠ pass. All four verdicts are semantically distinct.
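One way to keep the verdicts distinct in code is to model them as an enum and accept a run only when every check has explicitly passed. The verdict names below (`Pass`, `Fail`, `Skip`, `Uncertain`) are an assumption, since the text does not list them, and `tally` and `accepted` are hypothetical helpers rather than agent-spec's API.

```rust
/// Assumed names for the four verdicts; the source does not enumerate them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
    Skip,
    Uncertain,
}

/// Per-verdict counts, kept separate so nothing is folded into "pass".
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Tally {
    pub pass: usize,
    pub fail: usize,
    pub skip: usize,
    pub uncertain: usize,
}

pub fn tally(verdicts: &[Verdict]) -> Tally {
    let mut t = Tally::default();
    for v in verdicts {
        match v {
            Verdict::Pass => t.pass += 1,
            Verdict::Fail => t.fail += 1,
            Verdict::Skip => t.skip += 1,
            Verdict::Uncertain => t.uncertain += 1,
        }
    }
    t
}

/// A run is accepted only if at least one check ran and all of them passed.
/// A skipped or uncertain check blocks acceptance just as a failure does.
pub fn accepted(verdicts: &[Verdict]) -> bool {
    let t = tally(verdicts);
    t.pass > 0 && t.fail == 0 && t.skip == 0 && t.uncertain == 0
}
```

Under this sketch, a run of `[Pass, Skip]` is reported as one pass and one skip and is not accepted. A skipped scenario therefore cannot turn into a silent approval.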
Mechanical verifiers handle deterministic checks. For the rest, agent-spec supports two modes: the calling Agent does the review, or an injected backend does it.
The calling Agent (Claude Code, Codex…) performs AI verification itself. agent-spec emits structured requests; the Agent returns structured decisions.
AiRequest[] · AiDecision[] · uncertain findings
An independent AI backend is injected via the Rust API. Ideal for orchestrator systems (Symphony-like) that use a different model for verification.
AiBackend · backend.analyze() · AiDecision
Both modes share the same data structures: AiRequest and AiDecision. agent-spec stays provider-agnostic.
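The two modes can be pictured as two consumers of the same pair of types. The sketch below borrows the names `AiRequest`, `AiDecision`, `AiBackend`, and `analyze` from the text, but every field, signature, and helper is an assumption made for illustration. The real Rust API may look quite different. `KeywordBackend` is a toy stand-in for a model-backed implementation.

```rust
/// Assumed shape: what needs judging, plus the evidence gathered for it.
#[derive(Debug, Clone, PartialEq)]
pub struct AiRequest {
    pub scenario: String,
    pub evidence: String,
}

/// Assumed shape: a structured answer tied back to the scenario.
#[derive(Debug, Clone, PartialEq)]
pub struct AiDecision {
    pub scenario: String,
    pub accepted: bool,
    pub reasoning: String,
}

/// Assumed trait signature for an injected backend.
pub trait AiBackend {
    fn analyze(&self, request: &AiRequest) -> AiDecision;
}

/// Injected-backend mode: the library calls the backend for each request.
pub fn verify_with_backend(backend: &dyn AiBackend, requests: &[AiRequest]) -> Vec<AiDecision> {
    requests.iter().map(|r| backend.analyze(r)).collect()
}

/// Caller mode: requests are emitted to the calling Agent, which returns
/// decisions later. This helper lists requests that are still unanswered.
pub fn unresolved<'a>(requests: &'a [AiRequest], decisions: &[AiDecision]) -> Vec<&'a AiRequest> {
    requests
        .iter()
        .filter(|r| !decisions.iter().any(|d| d.scenario == r.scenario))
        .collect()
}

/// Toy backend for demonstration only: rejects evidence containing "TODO".
pub struct KeywordBackend;

impl AiBackend for KeywordBackend {
    fn analyze(&self, request: &AiRequest) -> AiDecision {
        let accepted = !request.evidence.contains("TODO");
        AiDecision {
            scenario: request.scenario.clone(),
            accepted,
            reasoning: if accepted {
                "no unfinished markers found".to_string()
            } else {
                "evidence contains a TODO marker".to_string()
            },
        }
    }
}
```

Keeping the provider behind a trait is what allows the same request and decision types to serve both a calling Agent and a separately hosted model.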
Constraints and decisions inherit downward. Organization rules flow through project conventions into every task contract. Write once, enforce everywhere.
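Downward inheritance can be sketched as a merge over layers ordered from broadest to narrowest. The `Rules` struct and `resolve` function are hypothetical, and the merge policy shown (append in order, drop exact duplicates, never remove an inherited rule) is an assumption about what "inherit downward" means, not a description of agent-spec's resolver.

```rust
/// Hypothetical container for one layer's rules
/// (organization, project, or task).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Rules {
    pub constraints: Vec<String>,
    pub decisions: Vec<String>,
}

/// Append items not already present, preserving first-seen order.
fn push_unique(target: &mut Vec<String>, items: &[String]) {
    for item in items {
        if !target.contains(item) {
            target.push(item.clone());
        }
    }
}

/// Merge layers given from broadest to narrowest. Lower layers can add
/// rules but cannot drop anything inherited from above.
pub fn resolve(layers: &[&Rules]) -> Rules {
    let mut merged = Rules::default();
    for layer in layers {
        push_unique(&mut merged.constraints, &layer.constraints);
        push_unique(&mut merged.decisions, &layer.decisions);
    }
    merged
}
```

A task contract resolved with `resolve(&[&org, &project, &task])` then carries the organization's and project's rules without restating them.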