Standards
Two canonical references. Cited on every skill page.
GAD is a research framework, not a new skill format. The way we author skills, evaluate them, and load them into agents follows two open references that already exist. This page is the single canonical citation point — every other skill-related page on the site links here rather than repeating the standards inline.
Anchor decisions: GAD-D-70 (Anthropic guide as canonical reference), GAD-D-80 (agentskills.io adoption + skills/ convention), GAD-D-81 (skill-count policy derived from both sources).
The two references
Anthropic's canonical document on authoring skills for Claude. Covers SKILL.md format, frontmatter rules, three testing layers (triggering / functional / performance comparison), three skill categories (doc-creation / workflow-automation / mcp-enhancement), and iteration signals (under-triggering, over-triggering, execution issues). Source of truth for how Claude Code loads and activates skills.
agentskills.io. The cross-client open format for skills. Specifies the SKILL.md file structure, the skills/ discovery convention for cross-client interoperability, progressive-disclosure three-tier loading, name collision handling, trust gating, and a full per-skill evaluation methodology. Agents built against this standard should find each other's skills regardless of runtime.
Progressive disclosure — three tiers
Per the agentskills.io client implementation guidance, every compliant agent follows the same three-tier load strategy. This is what keeps skill libraries scalable: you don't pay the token cost of every installed skill upfront, only for the ones actually used in a conversation.
- Tier 1: name + description only. Loaded at session start. Token cost: ~50-100 tokens per skill; 27 skills in GAD ≈ ~2,000 tokens for the full catalog.
- Tier 2: full SKILL.md body. Loaded when the agent decides a skill is relevant to the current task. Token cost: <5,000 tokens recommended per skill.
- Tier 3: scripts, references, and assets bundled in the skill directory. Loaded on demand when the instructions reference them. Token cost: varies; you pay only for the files actually read.
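The tier economics above can be sketched in a few lines. Everything below is illustrative (the `Skill` shape, the 75-token estimate, the placeholder names); neither standard mandates a data model:

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    description: str                                # tier 1: always loaded
    body: str                                       # tier 2: loaded on activation
    resources: dict = field(default_factory=dict)   # tier 3: loaded on demand

def session_start_tokens(skills, per_skill=75):
    # Tier 1: only name + description for every installed skill.
    return per_skill * len(skills)

def activate(skill, tokens_used):
    # Tier 2: pay for the full SKILL.md body only when the skill is used.
    return tokens_used + len(skill.body.split())

catalog = [Skill(f"skill-{i}", "…", "word " * 400) for i in range(27)]
print(session_start_tokens(catalog))   # 27 skills × 75 tokens = 2025
```

Activating one skill then adds only that skill's body cost, which is the whole point: a 27-skill catalog stays cheap until something actually triggers.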
Discovery convention: skills/
The agentskills.io cross-client interoperability convention is the canonical location GAD is migrating toward (GAD-D-80). Skills installed at these paths become visible to any compliant client (Claude Code, Codex, Cursor, Windsurf, Augment, and others) without per-runtime copies.
| Scope | Path | Purpose |
|---|---|---|
| Project | <project>/.<client>/skills/ | Client-native location |
| Project | <project>/skills/ | Cross-client interoperability |
| User | ~/.<client>/skills/ | Client-native location |
| User | ~/skills/ | Cross-client interoperability |
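A minimal resolver for the four paths in the table might look like this. The precedence ordering (project before user, client-native before cross-client) is an assumption for the sketch, not something the table mandates:

```python
from pathlib import Path

def candidate_skill_dirs(project_root: str, client: str) -> list[Path]:
    """Return skill directories in an assumed precedence order."""
    project = Path(project_root)
    home = Path.home()
    return [
        project / f".{client}" / "skills",   # project, client-native
        project / "skills",                  # project, cross-client
        home / f".{client}" / "skills",      # user, client-native
        home / "skills",                     # user, cross-client
    ]

for d in candidate_skill_dirs("/work/my-app", "claude"):
    print(d)
```

A real client would walk these in order and union the discovered skills, applying the collision rules described below the table.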
GAD now treats vendor/get-anything-done/skills/ as the authored source of truth. Runtime-native layouts such as .claude/, .codex/, and generated commands/ are transpiled outputs, not canonical repo content. gad install reads the canonical skills first and emits the runtime-specific shape each client needs. The full findability writeup is at .planning/docs/SKILL-FINDABILITY-2026-04-09.md. Standardizing authored skills under the repo-root skills/ convention started in task 22-46 and is now the active framework contract.
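As a rough illustration of that transpile step (the real gad install may differ; the canonical path is from this page, but the client list and copy-everything behavior are assumptions):

```python
import shutil
from pathlib import Path

CANONICAL = Path("vendor/get-anything-done/skills")
CLIENTS = [".claude", ".codex"]   # hypothetical target runtimes

def install(repo_root: Path) -> None:
    """Copy each canonical skill directory into every client-native layout."""
    source = repo_root / CANONICAL
    for client in CLIENTS:
        target = repo_root / client / "skills"
        target.mkdir(parents=True, exist_ok=True)
        for skill_dir in source.iterdir():
            if skill_dir.is_dir():
                shutil.copytree(skill_dir, target / skill_dir.name,
                                dirs_exist_ok=True)
```

The key property, whatever the real implementation does: the runtime layouts are derived and regenerable, so only skills/ is ever edited by hand.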
Name collision handling
Per the standard, when two skills share the same name field: project-level skills override user-level skills. Within the same scope, first-found or last-found is acceptable but the choice must be consistent and collisions must be logged so the user knows a skill was shadowed. GAD adds a stronger requirement per GAD-D-81: skills must answer "what can this do that no other skill can?" — if the answer is unclear, they are merge candidates rather than collisions. A skill-collision detection scan is queued as task 22-49 to catch overlapping trigger descriptions before they manifest as ambiguous routing at runtime.
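The collision rule above can be sketched as follows, assuming a hypothetical registry shape. Project scope shadows user scope, first-found wins consistently within a scope, and every shadowed name is logged:

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(message)s")

def resolve(project_skills, user_skills):
    """Map each skill name to the scope whose copy wins."""
    resolved = {}
    # Project scope is walked first, so its skills shadow user-level ones.
    for scope, skills in (("project", project_skills), ("user", user_skills)):
        for name in skills:
            if name in resolved:
                # Consistent first-found policy; the loss is logged, not silent.
                logging.warning("skill %r (%s scope) shadowed by %s-level copy",
                                name, scope, resolved[name])
            else:
                resolved[name] = scope
    return resolved

print(resolve(["deploy", "review"], ["review", "lint"]))
```

Here the project-level "review" shadows the user-level one, and "lint" survives because nothing at project scope claims the name.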
Per-skill evaluation methodology
Directly from agentskills.io/skill-creation/evaluating-skills. Distinct from GAD's eval-project rubric (which scores a whole build like escape-the-dungeon). This methodology scores individual skills.
timing.json lets you see the cost of the skill, not just the benefit. benchmark.json records the delta (with_skill − without_skill) per metric. The loop is: grade → review → propose improvements → rerun → grade.

GAD adoption status: not yet. The per-skill methodology is queued under task 22-50 plus the triumvirate audit. Once built, per-skill evaluation is the direct answer to the programmatic-eval gap G2 in .planning/docs/GAPS.md (skill-trigger coverage) and the per-skill effectiveness half of G11 (skill-inheritance hygiene).
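The delta computation itself is simple per-metric arithmetic. The JSON shape below is an assumed example for illustration, not the literal benchmark.json schema:

```python
import json

raw = json.loads("""
{
  "with_skill":    {"accuracy": 0.92, "tokens": 4100, "seconds": 38.0},
  "without_skill": {"accuracy": 0.71, "tokens": 5600, "seconds": 55.0}
}
""")

# delta = with_skill − without_skill, per metric.
delta = {metric: raw["with_skill"][metric] - raw["without_skill"][metric]
         for metric in raw["with_skill"]}
print(delta)   # positive accuracy delta plus negative cost deltas: skill helps
```

Reading both sides of the ledger is the point of pairing timing.json with benchmark.json: a skill that improves accuracy while inflating tokens and wall-clock time may still be a net loss.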
Three testing layers (Anthropic guide)
From the Anthropic skills guide. Complementary to the agentskills.io with_skill vs without_skill methodology — this taxonomy distinguishes what you're testing.