In agentic projects, code is temporary. The spec is durable.
I know that sounds backwards. We used to treat code as the source of truth. With agents, code gets reshaped constantly. If your spec is weak, every run drifts.
What survives a reset
When I run /clear, the model loses chat history. It does not lose my repo files. So I put important decisions in spec docs, not in chat messages.
A good spec carries intent across sessions. It keeps quality stable even when context poisoning starts showing up.
What I put in the spec
- Goal: one outcome, written in plain language.
- Constraints: files to avoid, APIs to keep, failure conditions.
- Checks: exact commands that must pass before done.
- Done criteria: what "shipped" means in this repo.
That is enough for reliable execution. Anything extra is usually noise.
Prompt-driven work does not scale
Prompt-only flows feel fast on day one. By day ten nobody remembers why a rule exists. Teams re-explain the same thing in every session and burn tokens doing it.
Spec-driven flow fixes that. The model reads stable docs, not random chat leftovers. That is core context engineering.
Write for execution, not for slides
Most PRDs are written for humans in meetings. Agent-readable specs are different: direct language, short sections, concrete commands, zero hand-wavy text.
If an agent can run your task from spec and pass checks, your spec is good. If not, rewrite it before touching code.