gad:verify-phase

Checks whether a completed phase actually achieved its goals — not just that tasks were marked done.

Trace marker (when running under eval hooks): write verify-phase to .planning/.trace-active-skill at start, clear at end. See skills/create-skill/SKILL.md → "Trace marker contract".

When to use

After all tasks in a phase are marked done
Before marking the phase done in ROADMAP.xml
During eval runs as the verification step
When the user asks "did this phase actually work?"

Two modes

Automated mode (default for eval agents)

Runs checks programmatically. No user input needed. Produces VERIFICATION.md with pass/fail.

Interactive mode (for human review)

Presents each criterion and asks the user to confirm. Same as verify-work but structured around the phase's goals, not SUMMARY.md deliverables.

Step 1 — Gather verification criteria

Read the phase's sources to build the checklist:

# Phase goal from ROADMAP.xml
gad phases --projectid <id> | grep <phase-id>

# Phase tasks (all should be done)
gad tasks --projectid <id> --phase <phase-id>

# Success criteria from REQUIREMENTS.xml (if any match this phase)
cat .planning/REQUIREMENTS.xml

# Build/verify commands from AGENTS.md
cat AGENTS.md | grep -A5 "Build\|Verify"

Criteria categories

Category	What to check	How
Tasks complete	All tasks in the phase are status=done	Read TASK-REGISTRY.xml
Build passes	Code compiles, no errors	Run build command
Tests pass	If tests exist, they pass	Run test command
Deliverables exist	Files/features the phase promised exist	Check file paths
State is current	STATE.xml next-action references the NEXT phase, not this one	Read STATE.xml
Decisions captured	If architectural choices were made, they're in DECISIONS.xml	Read DECISIONS.xml
Conventions documented	If first implementation phase, CONVENTIONS.md exists	Check file

Step 2 — Run checks

For each criterion:

Check: [what we're verifying]
Command: [command to run, or file to read]
Expected: [what a pass looks like]
Result: PASS | FAIL | SKIP
Evidence: [output or file content proving the result]

Automated checks

# 1. All tasks done
OPEN=$(gad tasks --projectid <id> --phase <phase-id> --status planned 2>/dev/null | wc -l)
# PASS if OPEN == 0

# 2. Build passes
cd <project-dir> && npm run build 2>&1
# PASS if exit code 0

# 3. Type check passes (if TypeScript)
npx tsc --noEmit 2>&1
# PASS if exit code 0

# 4. Deliverables exist
# For each file mentioned in task goals, check it exists
ls <expected-file-path>

# 5. STATE.xml current
grep "next-action" .planning/STATE.xml
# PASS if it mentions next phase, not current

# 6. CONVENTIONS.md (greenfield only)
test -f .planning/CONVENTIONS.md
# PASS if exists (first implementation phase only)

Step 3 — Write VERIFICATION.md

Create .planning/phases/<phase-dir>/VERIFICATION.md:

# Phase [X]: [Name] — Verification

**Verified:** [date]
**Result:** PASS | FAIL | PARTIAL

## Checks

| # | Category | Check | Result | Evidence |
|---|----------|-------|--------|----------|
| 1 | Tasks | All tasks status=done | PASS | 6/6 done |
| 2 | Build | npm run build exits 0 | PASS | 313KB bundle |
| 3 | TypeCheck | tsc --noEmit exits 0 | PASS | 0 errors |
| 4 | Deliverables | game/src/main.ts exists | PASS | 45 lines |
| 5 | State | STATE.xml points to next phase | PASS | "Phase 02..." |
| 6 | Conventions | CONVENTIONS.md exists | PASS | created |

## Summary

[X]/[Y] checks passed.
[If FAIL: list what failed and what needs fixing]

Step 4 — Report result

✓ Phase [X] verified: [PASS|FAIL|PARTIAL]
  [X]/[Y] checks passed
  [If FAIL: specific failures listed]

If FAIL: do NOT mark the phase done in ROADMAP.xml. Fix the failures first. If PASS: safe to mark the phase done.

Integration with eval trace

When running in an eval, verification results feed into the trace:

VERIFICATION.md existence → /gad:verify-work skill trigger = true in trace reconstruct
Pass/fail ratio → planning_quality score component

Definition of done for this skill

VERIFICATION.md produced with per-criterion pass/fail
Build command executed (not just "I think it builds")
All task statuses verified against TASK-REGISTRY.xml
STATE.xml currency checked
Result clearly reported as PASS/FAIL/PARTIAL