gad-explainer-video/v1
Composite score
0.000
Dimension scores
Where the composite came from
Each dimension is scored 0.0 – 1.0 and combined using the weights in evals/gad-explainer-video/gad.json. Human review dominates on purpose: process metrics alone can't rescue a broken run.
| Dimension | Score | Bar |
|---|---|---|
| requirement_coverage | 0.000 | |
| implementation_quality | 0.000 | |
| video_polish | 0.000 | |
| pedagogical_clarity | 0.000 | |
| workflow_quality | 0.000 | |
| time_efficiency | 0.000 | |
| human_review | 0.000 | |
Composite formula
How 0.000 was calculated
The composite score is a weighted sum of the dimensions above. Weights come from evals/gad-explainer-video/gad.json. Contribution = score × weight; dimensions are sorted by contribution so you can see what actually moved the needle.
| Dimension | Weight | Score | Contribution |
|---|---|---|---|
| requirement_coverage | 0.20 | 0.000 | 0.0000 (0%) |
| implementation_quality | 0.15 | 0.000 | 0.0000 (0%) |
| video_polish | 0.20 | 0.000 | 0.0000 (0%) |
| pedagogical_clarity | 0.15 | 0.000 | 0.0000 (0%) |
| workflow_quality | 0.10 | 0.000 | 0.0000 (0%) |
| time_efficiency | 0.05 | 0.000 | 0.0000 (0%) |
| human_review | 0.15 | 0.000 | 0.0000 (0%) |
| Weighted sum | 1.00 | | 0.0000 |
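The calculation behind the table can be sketched in a few lines of Python. This is a minimal sketch, not the report generator's actual code: the weight values mirror the table above (which in turn mirror evals/gad-explainer-video/gad.json), and the variable names are assumptions.

```python
# Weights as listed in the contribution table (assumed to match gad.json).
weights = {
    "requirement_coverage": 0.20,
    "implementation_quality": 0.15,
    "video_polish": 0.20,
    "pedagogical_clarity": 0.15,
    "workflow_quality": 0.10,
    "time_efficiency": 0.05,
    "human_review": 0.15,
}

# This run's per-dimension scores: every dimension scored 0.0.
scores = {dim: 0.0 for dim in weights}

# Contribution = score × weight; the composite is their sum.
contributions = {dim: scores[dim] * weights[dim] for dim in weights}
composite = sum(contributions.values())

# The table sorts dimensions by contribution, highest first.
ranked = sorted(contributions, key=contributions.get, reverse=True)

# Weights must sum to 1.00 so the composite stays on the 0.0 – 1.0 scale.
assert abs(sum(weights.values()) - 1.0) < 1e-9

print(f"Composite: {composite:.3f}")  # → Composite: 0.000
```

With every score at 0.0, every contribution is 0.0000 (0%) and the weighted sum collapses to 0.000, which is why the ranking carries no signal for this run.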
Skill accuracy breakdown
Did the agent invoke the right skills at the right moments?
Skill accuracy data isn't relevant for this run (no expected trigger set).
Process metrics