agent-spec

Agent Integration

Skills for Every AI Agent

agent-spec ships project-local Skills that teach AI agents the contract-driven workflow. A single install command covers Claude Code, Codex, Cursor, Aider, and any other agent that reads workspace conventions.

$ npx skills add ZhangHanDong/agent-spec

Installs agent-spec skills into your project's agent configuration files

agent-spec-tool-first

workflow

The default integration path. Teaches the Agent its part of the seven-step workflow: read the Contract, implement within Boundaries, run lifecycle for verification, retry on failure, and run explain to produce the review summary. CLI commands are the primary interface.

agent-spec-authoring

authoring

The spec-writing path. Teaches the Agent how to draft and revise Task Contracts in the DSL: the four-element structure, bilingual keywords, test selectors, step tables, and the "exception paths ≥ happy paths" principle.

Claude Code Codex CLI Cursor Aider AGENTS.md .cursorrules

Task Contract

Four Elements of a Contract

A Contract is not a vague Issue. It's a precise specification with four parts that constrain the Agent's behavior and define deterministic acceptance criteria.

## Intent — What and Why

A focused statement of purpose. Not a feature list — a clear direction that gives the Agent context.

## Intent

Add a user registration endpoint to the existing auth module.
New users register with email + password; a verification email is sent
on success. This is the first step of the user system — login and
password reset will be built on top of it later.

## 意图

为现有的认证模块添加用户注册 endpoint。新用户通过邮箱+密码注册，
注册成功后发送验证邮件。这是用户体系的第一步，后续会在此基础上
添加登录和密码重置。

## 意図

既存の認証モジュールにユーザー登録エンドポイントを追加する。
新規ユーザーはメールアドレスとパスワードで登録し、成功時に確認
メールを送信する。これはユーザーシステムの第一歩であり、今後
ログインとパスワードリセットをこの基盤の上に構築する。

## Decisions — Fixed Technical Choices

Choices that have already been made, which narrow the Agent's decision space. The Agent follows them without reopening the question.

## Decisions

- Route: POST /api/v1/auth/register
- Password hash: bcrypt, cost factor = 12
- Verification token: crypto.randomUUID(), stored in DB, 24h expiry
- Email: use existing EmailService, do not create a new one

## 已定决策

- 路由: POST /api/v1/auth/register
- 密码哈希: bcrypt, cost factor = 12
- 验证 Token: crypto.randomUUID(), 存数据库, 24h 过期
- 邮件: 使用现有 EmailService，不新建

## 決定事項

- ルーティング: POST /api/v1/auth/register
- パスワードハッシュ: bcrypt, コストファクター = 12
- 検証トークン: crypto.randomUUID(), DB保存, 24時間有効
- メール: 既存のEmailServiceを使用、新規作成しない

## Boundaries — What to Touch, What Not to Touch

Path globs are mechanically enforced by the BoundariesVerifier. Natural language prohibitions are checked by lint.

## Boundaries

### Allowed Changes
- crates/api/src/auth/**
- crates/api/tests/auth/**

### Forbidden
- Do not add new dependencies
- Do not modify the existing login endpoint

## 边界

### 允许修改
- crates/api/src/auth/**
- crates/api/tests/auth/**

### 禁止做
- 不要添加新的依赖
- 不要修改现有的登录 endpoint

## 境界

### 変更許可
- crates/api/src/auth/**
- crates/api/tests/auth/**

### 禁止事項
- 新しい依存関係を追加しない
- 既存のログインエンドポイントを変更しない
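
As a rough illustration of what mechanical path-glob enforcement involves, the sketch below checks a change set against the Allowed Changes globs from the example contract. It is not the BoundariesVerifier implementation: the function name `out_of_bounds` and the changed file names are hypothetical, and it assumes Python's `fnmatch` semantics, in which `*` also matches `/`, so `**` here simply means "anything under this prefix".

```python
from fnmatch import fnmatchcase

# Globs copied from the "Allowed Changes" list in the example contract.
ALLOWED = ["crates/api/src/auth/**", "crates/api/tests/auth/**"]

def out_of_bounds(changed_paths, allowed_globs=ALLOWED):
    """Return the changed paths that match none of the allowed globs."""
    return [
        path for path in changed_paths
        if not any(fnmatchcase(path, glob) for glob in allowed_globs)
    ]

# Hypothetical change set: two files inside the boundary, one outside.
changed = [
    "crates/api/src/auth/register.rs",
    "crates/api/tests/auth/register_test.rs",
    "crates/api/src/billing/invoice.rs",
]
violations = out_of_bounds(changed)
print(violations)
```

Because the check is pure string matching, it costs nothing to run on every attempt and gives the same answer every time, which is the property the Boundaries section relies on.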

## Completion Criteria — Deterministic Pass/Fail

BDD scenarios with explicit test bindings. Key rule: exception paths ≥ happy paths.

## Completion Criteria

Scenario: Successful registration
  Test: test_register_returns_201
  Given no user with email "alice@example.com" exists
  When client submits the registration request
  Then response status should be 201

Scenario: Duplicate email rejected ← exception path
  Test: test_register_rejects_duplicate
  Given a user with email "alice@example.com" already exists
  When client submits the same email for registration
  Then response status should be 409

## 完成条件

场景: 注册成功
  测试: test_register_returns_201
  假设不存在邮箱为 "alice@example.com" 的用户
  当客户端提交注册请求
  那么响应状态码为 201

场景: 重复邮箱被拒绝 ← 异常路径
  测试: test_register_rejects_duplicate
  假设已存在邮箱为 "alice@example.com" 的用户
  当客户端提交相同邮箱的注册请求
  那么响应状态码为 409

## 完了条件

シナリオ: 登録成功
  テスト: test_register_returns_201
  前提メール "alice@example.com" のユーザーが存在しない
  もしクライアントが登録リクエストを送信する
  ならばレスポンスステータスは 201 である

シナリオ: 重複メール拒否 ← 例外パス
  テスト: test_register_rejects_duplicate
  前提メール "alice@example.com" のユーザーが既に存在する
  もしクライアントが同じメールで登録リクエストを送信する
  ならばレスポンスステータスは 409 である
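
To make the idea of explicit test bindings concrete, here is a minimal sketch that reads the English-keyword form of the criteria above and maps each scenario to its bound test. It is not agent-spec's parser; `parse_bindings` and `satisfies_path_rule` are hypothetical names, and the set of exception scenarios is supplied by hand because the "← exception path" marker above may be a page annotation rather than DSL syntax.

```python
def parse_bindings(text):
    """Map each 'Scenario:' name to the test named on its 'Test:' line."""
    bindings = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Scenario:"):
            current = line[len("Scenario:"):].strip()
            bindings[current] = None  # unbound until a Test: line appears
        elif line.startswith("Test:") and current is not None:
            bindings[current] = line[len("Test:"):].strip()
    return bindings

def satisfies_path_rule(bindings, exception_scenarios):
    """Check 'exception paths >= happy paths' using hand-labelled exceptions."""
    exceptions = sum(1 for name in bindings if name in exception_scenarios)
    happy = len(bindings) - exceptions
    return exceptions >= happy

CRITERIA = """\
Scenario: Successful registration
  Test: test_register_returns_201
  Given no user with email "alice@example.com" exists
  When client submits the registration request
  Then response status should be 201

Scenario: Duplicate email rejected
  Test: test_register_rejects_duplicate
  Given a user with email "alice@example.com" already exists
  When client submits the same email for registration
  Then response status should be 409
"""

bindings = parse_bindings(CRITERIA)
unbound = [name for name, test in bindings.items() if test is None]
rule_ok = satisfies_path_rule(bindings, {"Duplicate email rejected"})
print(bindings, unbound, rule_ok)
```

A scenario that ends up in `unbound` has no deterministic pass/fail signal, which is the situation the test bindings are meant to rule out.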

Workflow

Seven Steps, Three Actors

Human writes intent. Agent implements code. Machine verifies correctness. Each step has a clear owner and a specific agent-spec command.

STEP 01

Write Contract HUMAN

Define Intent, Decisions, Boundaries, and Completion Criteria. Exception scenarios ≥ happy path scenarios.

agent-spec init --level task --name "User Registration"

STEP 02

Quality Gate MACHINE

Checks Contract quality before the Contract is handed to the Agent. Catches vague verbs, unquantified constraints, and sycophancy bias.

agent-spec lint specs/task.spec --min-score 0.7

STEP 03

Agent Implements AGENT

Agent reads the Contract and codes within its constraints. Decisions are fixed, boundaries are enforced, criteria are the stop condition.

agent-spec contract specs/task.spec

STEP 04

Lifecycle Verification MACHINE

Four-layer verification pipeline: lint → structural → boundaries → tests. Agent retries on failure — no human needed.

agent-spec lifecycle specs/task.spec --code . --format json
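
The retry-on-failure loop in this step might be sketched as follows. `run_lifecycle` and `revise` are hypothetical callables standing in for the lifecycle command and for the Agent's next edit, and the attempt limit is an assumption, since this page does not state one.

```python
def implement_until_verified(run_lifecycle, revise, max_attempts=3):
    """Alternate verification and revision until a run passes or attempts run out.

    run_lifecycle() -> (passed: bool, report: str)   # hypothetical verifier hook
    revise(report)  -> None                          # hypothetical Agent edit hook
    Returns the attempt number that passed, or None if none did.
    """
    for attempt in range(1, max_attempts + 1):
        passed, report = run_lifecycle()
        if passed:
            return attempt
        if attempt < max_attempts:
            revise(report)  # feed the failure report back to the Agent
    return None  # attempts exhausted without a passing run

# Simulated session: two failing runs, then a passing one.
outcomes = iter([
    (False, "test_register_rejects_duplicate failed"),
    (False, "boundary violation"),
    (True, "all scenarios pass"),
])
seen_reports = []
attempts = implement_until_verified(lambda: next(outcomes), seen_reports.append)
print(attempts, seen_reports)
```

In a real setup `run_lifecycle` would wrap the lifecycle command shown above and interpret its JSON output; the report schema is not documented here, so the sketch leaves it as an opaque string.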

STEP 05

Guard Gate MACHINE

Pre-commit or CI check. All specs in the repo are verified against the current change set.

agent-spec guard --spec-dir specs --code .

STEP 06

Contract Acceptance HUMAN

Reviewer reads a Contract-level summary — not a code diff. Two questions: Is the Contract correct? Did all verifications pass?

agent-spec explain specs/task.spec --format markdown

STEP 07

Stamp & Archive MACHINE

Record Contract-to-Commit traceability via Git trailers. Every commit traces back to intent.

agent-spec stamp specs/task.spec --dry-run
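
For readers unfamiliar with Git trailers, they are `Key: value` lines in the final paragraph of a commit message. The sketch below appends one in that shape. The trailer keys `Spec` and `Spec-Verified` are hypothetical, since the keys agent-spec actually writes are not shown on this page, and the trailer-block detection is a deliberately crude heuristic.

```python
def looks_like_trailer(line):
    """True for lines shaped like 'Key: value' where the key has no spaces."""
    key, sep, _ = line.partition(": ")
    return bool(sep) and bool(key) and " " not in key

def add_trailer(message, key, value):
    """Append a 'Key: value' trailer, starting a new final paragraph if needed."""
    body = message.rstrip("\n")
    last_paragraph = body.split("\n\n")[-1]
    # Reuse the last paragraph only if it already consists solely of trailers.
    in_trailer_block = "\n\n" in body and all(
        looks_like_trailer(line) for line in last_paragraph.splitlines()
    )
    separator = "\n" if in_trailer_block else "\n\n"
    return f"{body}{separator}{key}: {value}\n"

message = "Add user registration endpoint\n\nImplements POST /api/v1/auth/register."
stamped = add_trailer(message, "Spec", "specs/task.spec")       # hypothetical key
stamped = add_trailer(stamped, "Spec-Verified", "lifecycle")    # hypothetical key
print(stamped)
```

Because trailers live in the commit message itself, the link from commit back to contract can be read with ordinary Git tooling and needs no separate database.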

Verification

Four-Layer Verification Pyramid

Deterministic layers run first: zero token cost, no false negatives. The AI layer handles the residual: probabilistic, with structured evidence.

L4 · AI Verifier: probabilistic · ~$0.01-0.05 · uncertain verdict

L3 · Test Verifier: deterministic · 0 tokens · runs bound tests

L2 · Boundaries Verifier: deterministic · 0 tokens · path glob matching

L1 · Structural Verifier: deterministic · 0 tokens · pattern matching on Must-Not

← cheaper, faster, deterministic | richer, costly, probabilistic →

✅ pass: Verified by a deterministic or AI verifier

❌ fail: Verification found a concrete violation

⏭️ skip: No verifier covered this scenario

❓ uncertain: AI reviewed but needs human judgment

Key rule: skip ≠ pass — all four verdicts are semantically distinct.
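
A small sketch of why the distinction matters when verdicts are aggregated. The `Verdict` names mirror the four verdicts above; the `gate` policy and its return strings are assumptions for illustration, not agent-spec's actual acceptance logic.

```python
from enum import Enum

class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"
    UNCERTAIN = "uncertain"

def gate(verdicts):
    """Accept only when every scenario is PASS; skip and uncertain never count as pass."""
    values = list(verdicts.values())
    if any(v is Verdict.FAIL for v in values):
        return "rejected"
    if values and all(v is Verdict.PASS for v in values):
        return "accepted"
    return "needs-review"  # at least one skip/uncertain, or nothing was verified

report = {
    "Successful registration": Verdict.PASS,
    "Duplicate email rejected": Verdict.SKIP,
}
print(gate(report))
```

Collapsing `SKIP` into `PASS` would let an unverified scenario through silently, which is exactly what the key rule forbids.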

AI Verification

Two Modes of AI Verification

Mechanical verifiers handle deterministic checks. For the rest, agent-spec supports two modes: the calling Agent does the review, or an injected backend does it.

Caller Mode

--ai-mode caller

The calling Agent (Claude Code, Codex…) performs AI verification itself. agent-spec emits structured requests; the Agent returns structured decisions.

Agent → lifecycle

agent-spec → runs L1–L3 mechanical

agent-spec → emits AiRequest[]

Agent → analyzes code, returns AiDecision[]

agent-spec → merges into final report

Human → reviews uncertain findings
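
The caller-mode exchange above could be modelled roughly as follows. The type names `AiRequest` and `AiDecision` come from this page, but every field name, the `merge` function, and the rule that mechanical verdicts take precedence are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class AiRequest:       # field names are illustrative, not the real schema
    scenario: str
    question: str

@dataclass
class AiDecision:      # field names are illustrative, not the real schema
    scenario: str
    verdict: str       # "pass", "fail", or "uncertain"
    evidence: str

def merge(mechanical, decisions):
    """Fill scenarios the mechanical layers skipped with the Agent's decisions."""
    report = dict(mechanical)
    for decision in decisions:
        if report.get(decision.scenario) == "skip":  # never override L1-L3
            report[decision.scenario] = decision.verdict
    return report

mechanical = {"Successful registration": "pass", "Email wording is clear": "skip"}
requests = [AiRequest(name, "Does the code satisfy this scenario?")
            for name, verdict in mechanical.items() if verdict == "skip"]
# In caller mode the calling Agent would produce these; here they are hard-coded.
decisions = [AiDecision("Email wording is clear", "uncertain", "template text not found")]
final = merge(mechanical, decisions)
print([r.scenario for r in requests], final)
```

Anything still marked `uncertain` in `final` is what the last step of the flow hands to the human reviewer.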

Backend Mode

Rust API: AiBackend trait

An independent AI backend is injected via the Rust API. Ideal for orchestrator systems (Symphony-like) using a different model for verification.

Orchestrator → injects AiBackend

agent-spec → runs L1–L3 mechanical

agent-spec → calls backend.analyze()

AI Backend → returns AiDecision

agent-spec → complete report, no human loop

Both modes share the same data structures: AiRequest and AiDecision. agent-spec stays provider-agnostic.
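
Backend mode amounts to dependency injection, which the sketch below approximates in Python using a `Protocol` in place of a Rust trait. Only the names `AiBackend`, `analyze`, `AiRequest`, and `AiDecision` appear on this page; the signatures, fields, and the `StubBackend` class are hypothetical and do not reflect the real Rust API.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class AiRequest:       # illustrative fields only
    scenario: str

@dataclass
class AiDecision:      # illustrative fields only
    scenario: str
    verdict: str

class AiBackend(Protocol):
    """Python stand-in for the trait: anything with a matching analyze() qualifies."""
    def analyze(self, request: AiRequest) -> AiDecision: ...

class StubBackend:
    """Toy backend that never commits; a real one would call a model provider."""
    def analyze(self, request: AiRequest) -> AiDecision:
        return AiDecision(request.scenario, "uncertain")

def verify_with_backend(requests, backend: AiBackend):
    """Hand each request to whichever backend the orchestrator injected."""
    return [backend.analyze(request) for request in requests]

decisions = verify_with_backend([AiRequest("Email wording is clear")], StubBackend())
print(decisions)
```

Since `verify_with_backend` depends only on the `analyze` method, swapping providers means swapping the injected object, which is one way to keep the verification core provider-agnostic.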

Review Point Displacement

❌ Traditional Flow

✓ agent-spec Flow