Content-driven hypothesis
planned — no runs yet

Content-driven

Give the agent a content pack and see what it builds.

The content-driven hypothesis (GAD-D-66) is a planned eval track that gives the agent the usual requirements plus a pre-authored content pack (spells, runes, items, NPCs, dialogue trees, encounter tables) extracted from a prior successful run. The research question: does the agent produce a more fleshed-out game when the authored canon exists to build on? User framing: "analogous to making a movie based on a book. Derivative, but not all processes are, much like a forger might not use the exact same brush."

This track is explicitly distinct from freedom and CSH. Content-pack runs and greenfield runs do not share a rubric — they answer different questions. Comparing them on the same score would confound the compound-skills measurement.

Not yet tested

No runs have been produced against this hypothesis, and the eval project does not exist yet. Dependencies: (1) a content-extraction CLI (GAD-D-66) that pulls an authored canon out of a preserved run, (2) a new eval flavor, escape-the-dungeon-inherited-content, that consumes it, (3) a distinct rubric that scores "derivative coherence".

What the content-driven track would measure

Scope expansion
Given the same token budget, does the agent produce a bigger game when the content pack already exists? More rooms, more encounters, deeper mechanics — measured against a greenfield run of the same agent with the same budget.
Integration coherence
Does the agent weave the pre-authored content into a unified game, or does the inherited canon feel bolted on? Narrative consistency, tonal consistency, mechanical consistency — all human-reviewed.
What the agent adds
Percentage of the final game that is the agent's own authorship vs the inherited canon. Too low → the agent didn't add anything. Too high → the content pack was ignored. A healthy ratio is somewhere in between.
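The authorship ratio described above can be sketched in a few lines. This is only an illustration: the track does not yet define what counts as a content unit or where "too low" and "too high" begin, so the `AuthorshipReport` type, the `classify_share` helper, and the 0.2 / 0.8 cutoffs are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorshipReport:
    """Counts of content units in the final game (unit definition is assumed)."""
    inherited: int       # units taken from the content pack
    agent_authored: int  # units the agent wrote itself

    @property
    def agent_share(self) -> float:
        total = self.inherited + self.agent_authored
        if total == 0:
            raise ValueError("game contains no counted content units")
        return self.agent_authored / total


def classify_share(share: float, low: float = 0.2, high: float = 0.8) -> str:
    """Bucket the agent's share; `low` and `high` are placeholder cutoffs."""
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("expected 0 <= low < high <= 1")
    if share < low:
        return "agent added little"
    if share > high:
        return "content pack largely ignored"
    return "in between"


report = AuthorshipReport(inherited=30, agent_authored=20)
print(f"{report.agent_share:.2f}", classify_share(report.agent_share))
```

Keeping the cutoffs as parameters leaves room for the rubric, once written, to decide what a healthy ratio actually is.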

The derivative-work framing

"This is a content-driven hypothesis, like starting out with some content first — much like making a game or movie based on a book or story. It's derivative, not all the processes are, much like a forger might not use the exact same brush."

The value of derivative work is real — adaptations regularly outperform originals on reach and often on quality when the adaptation is genuinely creative. The content-driven track asks whether that effect shows up in agent-authored games: given authored canon to build on, does the agent produce something better than it would from scratch, or does the canon constrain the creativity that would otherwise emerge?

This is why content-pack runs must not be scored against the same rubric as greenfield runs. A movie adapted from a book is not a worse movie because it didn't have to invent the plot; it's a different kind of movie with different success criteria. The rubric for this track will score derivative coherence, integration, and scope expansion, not originality.
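One way to enforce the separation of rubrics in tooling is to have any score comparison refuse runs from different tracks. The sketch below assumes each scored run carries a track label; the `ScoredRun` type, the track names, and `score_delta` are hypothetical and not part of any existing GAD tooling. It covers rubric scores only, not the size comparison against a greenfield baseline used for scope expansion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredRun:
    run_id: str
    track: str    # e.g. "content-driven" or "greenfield" (labels are assumed)
    score: float  # rubric score under that track's own rubric


def score_delta(a: ScoredRun, b: ScoredRun) -> float:
    """Return a.score - b.score, refusing cross-track comparisons."""
    if a.track != b.track:
        raise ValueError(
            f"{a.run_id} ({a.track}) and {b.run_id} ({b.track}) "
            "were scored with different rubrics"
        )
    return a.score - b.score


first = ScoredRun("run-a", "content-driven", 0.5)
second = ScoredRun("run-b", "content-driven", 0.25)
print(score_delta(first, second))
```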

Current status

Dependencies
  • Content extraction CLI: a new subcommand (gad eval extract-content) that walks a preserved eval run and emits a portable content pack JSON.
  • New eval flavor: escape-the-dungeon-inherited-content with a gad.json content_pack field pointing at the source pack.
  • Rubric construction: dimensions for derivative coherence, integration, and scope expansion — explicitly distinct from the freedom/CSH rubric.
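Since neither the extraction CLI nor the new eval flavor exists yet, the shape of the content pack is still open. As a rough sketch, a consumer might resolve the `content_pack` field in `gad.json` and load the pack as below. The category key spellings, the relative-path convention, and the `load_content_pack` helper are assumptions made for illustration, not the planned format.

```python
import json
import tempfile
from pathlib import Path

# Categories named in the track description; the key spellings are assumed.
PACK_CATEGORIES = (
    "spells", "runes", "items", "npcs", "dialogue_trees", "encounter_tables",
)


def load_content_pack(eval_dir: Path) -> dict[str, list]:
    """Read gad.json in eval_dir and load the pack its content_pack field names."""
    config = json.loads((eval_dir / "gad.json").read_text(encoding="utf-8"))
    if "content_pack" not in config:
        raise KeyError("gad.json has no content_pack field")
    # Assumption: the field holds a path relative to the eval directory.
    pack_path = (eval_dir / config["content_pack"]).resolve()
    pack = json.loads(pack_path.read_text(encoding="utf-8"))
    # Missing categories default to empty lists so callers can iterate safely.
    return {name: list(pack.get(name, [])) for name in PACK_CATEGORIES}


# Usage with throwaway files standing in for a real eval project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pack.json").write_text(
        json.dumps({"spells": ["ember"], "npcs": ["warden"]}), encoding="utf-8"
    )
    (root / "gad.json").write_text(
        json.dumps({"content_pack": "pack.json"}), encoding="utf-8"
    )
    pack = load_content_pack(root)
    print({name: len(entries) for name, entries in pack.items()})
```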
Round planning
Content-driven runs will enter the rounds framework as a new track. They do not require their own requirements version — they inherit greenfield v5 requirements plus the content pack as an input. Per the rounds framework (GAD-D-72), a new hypothesis can start a new round against any existing requirements version. Round 6 is the current placeholder for the first content-driven run.