Local DB

Every number, with a receipt. Show me where this came from.

Research credibility lives or dies on whether you can trace a number back to its inputs. This page indexes every chart and stat on the site: where each number comes from, how it's derived, and whether its source is deterministic (computed at prebuild), self-reported (the agent wrote it into TRACE.json), human-rated (submitted via the rubric CLI), or authored (hand-curated content).

Per GAD-D-69 (programmatic-eval priority), every new metric must answer "can this be collected programmatically?" before "how do we score it?". The push is to move self-report sources toward deterministic ones; the gaps are tracked in .planning/docs/GAPS.md.

Deterministic

9

Human-rated

2

Authored

4

Self-report

1

Trust levels explained

deterministic
Computed by code from raw inputs at prebuild. Same inputs always produce the same number. Highest trust.
human
A human submitted this via the rubric review CLI. Trustworthy but not scalable.
authored
Hand-curated content (glossary, decisions, requirements). Trust is editorial.
self-report
The agent put this number into TRACE.json itself. Lowest trust.
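The four trust levels above form a natural ordering, and UI code can use it to sort or badge metrics. A minimal sketch, assuming a union type and a rank table (the `TRUST_RANK` values and `byTrust` helper are illustrative, not the site's actual code):

```typescript
// The four trust levels from this page, ordered most to least trustworthy.
type TrustLevel = "deterministic" | "human" | "authored" | "self-report";

const TRUST_RANK: Record<TrustLevel, number> = {
  deterministic: 0, // computed from raw inputs at prebuild
  human: 1,         // submitted via the rubric review CLI
  authored: 2,      // hand-curated content; trust is editorial
  "self-report": 3, // the agent wrote the number into TRACE.json
};

// Sort metrics so the most trustworthy sources come first.
function byTrust<T extends { trust: TrustLevel }>(metrics: T[]): T[] {
  return [...metrics].sort((a, b) => TRUST_RANK[a.trust] - TRUST_RANK[b.trust]);
}
```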

Hero

3
Playable runs
deterministic
/

Source

PLAYABLE_INDEX (lib/eval-data.generated.ts)

Formula

Object.keys(PLAYABLE_INDEX).length

Set at prebuild from auditPlayable() — counts directories under apps/portfolio/public/evals/<project>/<version>/.
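A hedged sketch of what that prebuild audit plausibly does: walk the evals directory two levels deep and index each <project>/<version> directory as one playable run. The function name matches `auditPlayable()` above, but its real signature and return shape are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Index playable runs by walking <root>/<project>/<version>/ directories.
// Sketch only: the real auditPlayable() may filter or validate further.
function auditPlayable(root: string): Record<string, true> {
  const index: Record<string, true> = {};
  for (const project of fs.readdirSync(root)) {
    const projectDir = path.join(root, project);
    if (!fs.statSync(projectDir).isDirectory()) continue;
    for (const version of fs.readdirSync(projectDir)) {
      if (fs.statSync(path.join(projectDir, version)).isDirectory()) {
        index[`${project}/${version}`] = true;
      }
    }
  }
  return index;
}
```

The hero stat is then just `Object.keys(PLAYABLE_INDEX).length`, as the formula states.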

Runs scored
deterministic
/

Source

EVAL_RUNS

Formula

EVAL_RUNS.filter(r => r.scores.composite != null).length
Decisions logged
deterministic
/

Source

ALL_DECISIONS

Formula

ALL_DECISIONS.length (parseAllDecisions() over .planning/DECISIONS.xml)

Per-run card

2
Composite score

Source

TRACE.json scores.composite

Formula

Σ_dimensions (weight * dimension_score), capped by gate failures

Composite is currently agent-self-reported in TRACE.json. Programmatic alternative tracked under GAPS.md G1 (deferred until UI stabilizes per gad-99).
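The formula above can be sketched as a weighted sum with a gate ceiling. Field names (`weight`, `score`, `passed`, `cap`) are assumptions about the TRACE.json schema, not its real shape:

```typescript
interface Dimension { weight: number; score: number } // score in 0.0–1.0
interface Gate { passed: boolean; cap: number }       // cap applied on failure

// Σ_dimensions (weight * dimension_score), capped by the lowest failed gate.
function composite(dimensions: Dimension[], gates: Gate[]): number {
  const weighted = dimensions.reduce((sum, d) => sum + d.weight * d.score, 0);
  const ceiling = gates
    .filter((g) => !g.passed)
    .reduce((min, g) => Math.min(min, g.cap), Infinity);
  return Math.min(weighted, ceiling);
}
```

Treating a failed gate as a ceiling (rather than a subtraction) matches the "capped by gate failures" wording: a run cannot score above its worst failed gate no matter how strong the other dimensions are.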

Human review aggregate

Source

TRACE.json human_review (rubric form)

Formula

Σ_dimensions (weight * score) per project's human_review_rubric

Submitted via `gad eval review --rubric '{...}'`. Per-dimension scoring per gad-61 / decision gad-70.

Roadmap

1
Pressure rating per round
authored
/roadmap

Source

pressureForRound() and constants in app/roadmap/roadmap-shared.ts

Formula

f(requirement complexity, ambiguity, constraint density, iteration budget, failure cost) — currently authored

Will become programmatic when the pressure-score-formula open question resolves. See gad-75.
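Since the formula is still an open question, any concrete version is a placeholder. A purely hypothetical sketch of `pressureForRound()` that combines the five named inputs (every field name and weight here is invented, not the authored constants in roadmap-shared.ts):

```typescript
interface PressureInputs {
  complexity: number;        // requirement complexity, 0–1
  ambiguity: number;         // 0–1
  constraintDensity: number; // 0–1
  iterationBudget: number;   // remaining iterations; higher budget, less pressure
  failureCost: number;       // 0–1
}

// Placeholder: an unweighted average, with iteration budget inverted so a
// tighter budget raises pressure. The real formula is an open question.
function pressureForRound(p: PressureInputs): number {
  const budgetPressure = 1 / (1 + p.iterationBudget);
  return (p.complexity + p.ambiguity + p.constraintDensity + budgetPressure + p.failureCost) / 5;
}
```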

Emergent

1
Skill inheritance effectiveness

Source

TRACE.json human_review.dimensions.skill_inheritance_effectiveness

Formula

Human-rated 0.0–1.0 on whether the run productively inherited + evolved + authored skills

This is the test signal for the compound-skills hypothesis. The hygiene component (file-mutation events + CHANGELOG validity) is queued as GAPS G11 and is automatable.

Per-run page

3
Tool-use mix

Source

TRACE.json derived.tool_use_mix

Formula

Counts of tool_use events per tool name from the trace stream

Reference pattern for all new programmatic metrics — see GAPS.md G4.
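As the reference pattern for programmatic metrics, the derivation is a plain fold over the trace stream. A minimal sketch, assuming each event carries a `type` and (for tool calls) a `tool` name; the real event schema may differ:

```typescript
interface TraceEvent { type: string; tool?: string }

// Count tool_use events per tool name from the trace stream.
function toolUseMix(events: TraceEvent[]): Record<string, number> {
  const mix: Record<string, number> = {};
  for (const e of events) {
    if (e.type !== "tool_use" || !e.tool) continue;
    mix[e.tool] = (mix[e.tool] ?? 0) + 1;
  }
  return mix;
}
```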

Plan-adherence delta

Source

TRACE.json derived.plan_adherence_delta

Formula

(tasks_committed - tasks_planned) / tasks_planned
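The formula spelled out as code: a positive delta means the run committed more tasks than it planned, a negative delta means it under-delivered. The zero-plan guard is an assumption, since the ratio is undefined without a plan:

```typescript
// (tasks_committed - tasks_planned) / tasks_planned
function planAdherenceDelta(tasksPlanned: number, tasksCommitted: number): number {
  if (tasksPlanned === 0) return 0; // guard: no plan, no meaningful delta (assumption)
  return (tasksCommitted - tasksPlanned) / tasksPlanned;
}
```
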
Commit count + per-task discipline

Source

TRACE.json gitAnalysis (git log over the run's worktree)

Formula

Counts of commits, batch vs per-task, ratio of task-id-prefixed commits to total
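The discipline ratio reduces to: of all commit messages in the run's worktree log, what share start with a task id? A hedged sketch; the id pattern below (e.g. "T123:" or "GAD-45:") is a guess at the convention, not gitAnalysis's actual regex:

```typescript
// Share of commits whose message starts with a task-id prefix.
function taskPrefixedRatio(messages: string[]): number {
  if (messages.length === 0) return 0;
  const prefixed = messages.filter((m) => /^[A-Z]+-?\d+[:\s]/.test(m)).length;
  return prefixed / messages.length;
}
```
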

/decisions

1
Total decisions (171)
deterministic
/decisions

Source

ALL_DECISIONS

Formula

parseAllDecisions() walks .planning/DECISIONS.xml

/planning (tasks tab)

1
Total tasks (229)
deterministic
/planning?tab=tasks

Source

ALL_TASKS

Formula

parseAllTasks() walks .planning/TASK-REGISTRY.xml

/planning (phases tab)

1
Total phases (49)

Source

ALL_PHASES

Formula

parseAllPhases() walks .planning/ROADMAP.xml

/glossary

1
Glossary terms (27)
authored
/glossary

Source

GLOSSARY

Formula

data/glossary.json terms[]

/questions

1
Open questions (16)
authored
/questions

Source

OPEN_QUESTIONS

Formula

data/open-questions.json questions[]

/planning (bugs tab)

1
Tracked bugs (4)

Source

BUGS

Formula

data/bugs.json bugs[]