Parse from $ARGUMENTS:
--project <name> (required)
--baseline <sha> (optional, defaults to HEAD)
--agent <model> (optional, defaults to the configured default model)
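
The flag handling above could be sketched as a small shell function. This is only an illustration: the function name `parse_eval_args` is hypothetical, and the variable names `PROJECT`, `BASELINE`, and `AGENT_MODEL` are taken from the later steps. Leaving `AGENT_MODEL` empty to mean "use the configured default model" is an assumption.

```shell
# Hypothetical helper: parse --project/--baseline/--agent into shell variables.
parse_eval_args() {
  PROJECT=""
  BASELINE="HEAD"      # default baseline when --baseline is omitted
  AGENT_MODEL=""       # assumption: empty means "use the configured default model"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --project|--baseline|--agent)
        # Every flag takes exactly one value.
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 2
        fi
        case "$1" in
          --project)  PROJECT="$2" ;;
          --baseline) BASELINE="$2" ;;
          --agent)    AGENT_MODEL="$2" ;;
        esac
        shift 2
        ;;
      *)
        echo "Unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
  # --project is the only required flag.
  if [ -z "$PROJECT" ]; then
    echo "--project <name> is required" >&2
    return 2
  fi
}
```

It would be called as `parse_eval_args $ARGUMENTS` (unquoted, so the string splits into words), which means values containing spaces are not supported in this sketch.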

Validate project exists:
    ls "evals/$PROJECT" 2>/dev/null || echo "NOT_FOUND"

If not found:
    Eval project '<name>' not found. Run `gad eval list` to see available projects.

Read project REQUIREMENTS.md:
    cat "evals/$PROJECT/REQUIREMENTS.md"

Compute next run number: List the existing evals/<name>/v*/ directories and take the highest N. Next = max + 1 (1 if no runs exist yet). Store it in RUN_NUM.

Create git worktree:
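    BASELINE="${BASELINE:-HEAD}"
    WORKTREE_PATH=$(mktemp -d "/tmp/gad-eval-XXXXXX")
    # --detach keeps this working even if $BASELINE names a branch that is already checked out
    git worktree add --detach "$WORKTREE_PATH" "$BASELINE"

Scaffold output directory: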
    mkdir -p "evals/$PROJECT/v$RUN_NUM"
    cat > "evals/$PROJECT/v$RUN_NUM/RUN.md" << EOF
    # Eval Run v$RUN_NUM
    project: $PROJECT
    baseline: $BASELINE
    started: $(date -u +%Y-%m-%dT%H:%M:%SZ)
    agent: $AGENT_MODEL
    status: running
    EOF

Execute agent (stub): In the worktree, run the eval agent:
    cd "$WORKTREE_PATH"
    # Stub: agent invocation goes here
    # Full implementation in gad-eval plan
    echo "STUB: agent run for $PROJECT" > "$WORKTREE_PATH/eval-output.txt"

Collect results: Copy relevant output files from the worktree to evals/<name>/v<N>/:
    cp "$WORKTREE_PATH/eval-output.txt" "evals/$PROJECT/v$RUN_NUM/"

Remove worktree:
    git worktree remove --force "$WORKTREE_PATH"

Update RUN.md with result: Set status: completed and ended: <timestamp>.

Summary:
    ✓ Eval run complete
    Project: $PROJECT
    Run: v$RUN_NUM
    Baseline: $BASELINE
    Output: evals/$PROJECT/v$RUN_NUM/
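
Two of the steps above are given only in prose: computing the next run number and updating RUN.md when the run finishes. A minimal sketch of both follows. The function names `next_run_num` and `finalize_run` are hypothetical, and the sketch assumes run directories are named `v<N>` without leading zeros and that RUN.md contains the line `status: running` exactly as scaffolded.

```shell
# Hypothetical helper: print the next run number for a project directory.
# $1: project directory, e.g. evals/<name>
# Runs in a subshell (parentheses) so its variables do not leak to the caller.
next_run_num() (
  max=0
  for dir in "$1"/v*/; do
    [ -d "$dir" ] || continue        # glob matched nothing
    n="${dir%/}"                     # drop trailing slash
    n="${n##*/v}"                    # keep only what follows the final "/v"
    case "$n" in
      ''|*[!0-9]*) continue ;;       # skip non-numeric suffixes such as "vnotes"
    esac
    if [ "$n" -gt "$max" ]; then
      max="$n"
    fi
  done
  echo $((max + 1))
)

# Hypothetical helper: mark a run as completed and record the end time.
# $1: path to RUN.md
finalize_run() {
  # Rewrite via a temp file to avoid the GNU/BSD differences of `sed -i`.
  sed 's/^status: running$/status: completed/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
  echo "ended: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$1"
}
```

In the flow above these would be used as `RUN_NUM=$(next_run_num "evals/$PROJECT")` before scaffolding, and `finalize_run "evals/$PROJECT/v$RUN_NUM/RUN.md"` after the worktree is removed. Taking the maximum rather than the count keeps numbering stable if an earlier run directory has been deleted.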