Vercel’s evaluation found that embedding a compressed docs index directly in AGENTS.md outperformed on-demand skills in agent-focused coding tests. The comparison targeted Next.js 16 APIs that are absent from current model training data, and the full write-up is available on the Vercel blog: https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals.
The problem at hand
AI coding agents frequently rely on pre-training knowledge that can be outdated for fast-moving frameworks. Next.js 16 introduces APIs such as 'use cache', connection(), and forbidden() that may not be present in model training sets. When agents lack version-matched docs, generated code can be incorrect or drift to older idioms. The evaluation focused on feeding agents version-accurate documentation so they can perform correct edits and implementations.
Two approaches compared
- Skills: an open standard for packaging prompts, tools, and docs into an on-demand retrieval mechanism (agentskills.io). The expectation is that agents detect when framework knowledge is needed, invoke a skill, and read targeted documentation.
- AGENTS.md: a project-root markdown file that supplies persistent context to agents. Content placed in AGENTS.md is available every turn (Claude Code uses CLAUDE.md similarly). The evaluation used a Next.js docs skill and a compressed docs index injected into AGENTS.md pointing to a .next-docs/ directory.
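For illustration, the injected section of AGENTS.md might look something like the following. The content shown is hypothetical; the codemod's actual output format may differ.

```markdown
<!-- Hypothetical AGENTS.md excerpt, not the codemod's actual output -->
## Next.js 16 docs index

Version-matched docs for this project live under `.next-docs/`.
Before using an unfamiliar API, read the file listed for it:

caching|'use cache',cacheLife,cacheTag|.next-docs/caching.md
rendering|connection,after|.next-docs/rendering.md
auth|forbidden,unauthorized|.next-docs/auth.md
```

Because this text sits in the project root, it is part of every turn's context; the agent never has to decide whether to look for docs, only which file to open.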
What the evals measured
The hardened eval suite removed leakage and focused on behavior. Tests targeted Next.js 16-specific features that models are unlikely to already know, including:
- connection() for dynamic rendering
- 'use cache' directive
- cacheLife(), cacheTag()
- forbidden(), unauthorized()
- async cookies() and headers()
- after(), updateTag(), refresh()
- proxy.ts API proxying
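To make the test surface concrete, here is an illustrative fragment (not runnable outside a Next.js 16 app) using two of the listed APIs; the module paths follow the Next.js docs, and getRole() is a hypothetical auth helper, not part of the framework.

```tsx
import { connection } from 'next/server'
import { forbidden } from 'next/navigation'

export default async function Page() {
  await connection() // opt this render into dynamic rendering (Next.js 16)
  const role = await getRole() // hypothetical auth helper for this sketch
  if (role !== 'admin') forbidden() // render the forbidden UI
  return <p>Admin dashboard</p>
}
```

APIs like these are exactly the kind a model trained before the release would get wrong or replace with older idioms, which is why the eval targets them.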
Each configuration was judged across Build, Lint, and Test assertions with retries to reduce model variance.
Outcomes
Final pass rates across configurations:
- Baseline (no docs): 53%
- Skill (default): 53%
- Skill with explicit instructions: 79%
- AGENTS.md docs index: 100%
On the Build/Lint/Test breakdown, AGENTS.md hit 100% across all three, while skills—unless guided by fragile, explicit prompts—either weren’t invoked reliably (skill triggered in only 44% of cases by default) or produced poorer results when wording forced premature doc-first behavior.
Why passive context performed better
The working explanation centers on three factors:
- No decision point. With AGENTS.md there is no need for the agent to decide whether to consult docs; the context is present every turn.
- Consistent availability. Skills require invocation and can load asynchronously; passive context is part of the system prompt.
- No sequencing fragility. Skills create an ordering problem (explore the project first vs. read docs first) that subtle wording changes can flip. Persistent context removes that brittleness.
Managing context size
Concern about context bloat was addressed by compressing the docs index. A full injection (~40KB) was reduced to 8KB (≈80% smaller) using a pipe-delimited, minified index that maps doc sections to files. The index does not put all doc content into the prompt; it points agents to retrievable files under .next-docs/ so agents can fetch specific files when needed.
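A minimal sketch of how such an index can work on the retrieval side: map each API name to the doc file that covers it, so the agent reads only the file it needs. The pipe-delimited format here is an assumption for illustration; the codemod's actual index layout may differ.

```typescript
// Hypothetical compressed index: topic|api names|doc file.
const index = `caching|'use cache',cacheLife,cacheTag|.next-docs/caching.md
rendering|connection,after|.next-docs/rendering.md
auth|forbidden,unauthorized|.next-docs/auth.md`;

// Resolve an API name to the doc file that covers it, if any.
function resolveDoc(api: string, indexText: string): string | undefined {
  for (const line of indexText.split("\n")) {
    const [, apis, file] = line.split("|");
    if (apis.split(",").includes(api)) return file;
  }
  return undefined;
}

console.log(resolveDoc("cacheTag", index)); // → .next-docs/caching.md
```

The index itself stays small enough to live in the prompt every turn, while the full doc content stays on disk until a specific file is fetched.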
Example setup command included in the evaluation:
```
npx @next/codemod@canary agents-md
```
This codemod:
- Detects the Next.js version
- Downloads matching docs to .next-docs/
- Injects the compressed index into AGENTS.md
The codemod and related PR are available in the Next.js repo: https://github.com/vercel/next.js/pull/88961.
Practical implications for framework authors
- For broad, general framework knowledge, passive context via AGENTS.md currently provides more reliable results than skills.
- Skills remain useful for focused, action-specific workflows that users intentionally trigger (for example, automated upgrade or migration tasks).
- Compressing an index and structuring docs for retrieval can achieve version-accurate guidance without exhausting the context window.
- Evals should target APIs outside model pre-training to reveal the real benefits of versioned documentation.
Further details and the full evaluation are on the original Vercel post: https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals