Alex Sandruk 4 min read

Decision records for AI agents

How lightweight decision records keep agent work recoverable without turning every chat into a source of truth.

Decision record card showing decision, evidence, rejected alternatives, and revisit trigger.

Agentic teams generate a lot of text. Most of it should not become project memory.

That is the first rule. Without it, every chat summary, task recap, and agent handoff starts to look like a source of truth. The project becomes softer, not clearer. Old assumptions get repeated. Temporary workarounds look permanent. A later agent optimizes around a sentence that was never actually decided.

The answer is not more documentation for its own sake. The useful layer is a lightweight decision record: a small, durable note that says what was chosen, why it was chosen, what evidence mattered, and when the decision should be revisited.

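For agentic work, this is less about classic architecture process and more about entropy control: keeping stale assumptions and one-off workarounds from hardening into rules that nobody actually chose.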

Why Prompts Are Not Enough

A prompt tells the current agent what to do. A decision record tells the next agent why the system is shaped this way.

That distinction becomes important when the same patterns repeat across repositories:

  • when an agent must run a verification loop
  • where task state lives
  • which tracker is the source of truth and which surfaces are projections
  • what counts as done
  • when a browser or operator smoke test is required
  • which GitOps rules are safe and which are forbidden
  • which patterns worked once but should not be copied blindly

If those rules live only in a chat or in one operator’s memory, the next project imports the same uncertainty again. The agent may become too cautious, too confident, or simply wrong about which layer it should change.

A decision record gives that repeated pattern a handle.

The Minimum Useful Record

A useful record is small. It does not preserve the whole conversation. It captures the decision boundary.

At minimum it should answer:

  • What did we decide?
  • Why did we decide it now?
  • What alternatives did we reject?
  • What evidence or constraint mattered?
  • Where is the implementation or proof?
  • What would make us revisit it?

The last question is the one many records miss. Without a revisit trigger, decisions become dogma. With a revisit trigger, the record stays operational: “this is true while these assumptions hold.”

That is especially useful for agents because they are good at carrying old language forward. If the record says why the rule exists and when it expires, a future agent has a better chance of applying it correctly.
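One way to make the six questions concrete is a small typed record. The sketch below is only an illustration under assumptions: the class name, field names, and the `gaps` helper are hypothetical, and the example values are invented placeholders rather than a real project's history.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """Hypothetical shape for a lightweight decision record."""

    decision: str  # What did we decide?
    why_now: str  # Why did we decide it now?
    rejected_alternatives: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)  # evidence or constraint
    proof: list[str] = field(default_factory=list)  # where the implementation lives
    revisit_when: str = ""  # what would make us revisit it

    def gaps(self) -> list[str]:
        """Return the names of the questions this record leaves unanswered."""
        answers = {
            "decision": self.decision,
            "why_now": self.why_now,
            "rejected_alternatives": self.rejected_alternatives,
            "evidence": self.evidence,
            "proof": self.proof,
            "revisit_when": self.revisit_when,
        }
        # Empty strings and empty lists both count as unanswered.
        return [name for name, value in answers.items() if not value]


# Placeholder values for illustration only.
record = DecisionRecord(
    decision="Task JSON is the execution record; the board is a projection.",
    why_now="<reason the question came up>",
    rejected_alternatives=["Treat the board as canonical"],
    evidence=["<constraint or observation>"],
    proof=["<path/to/implementation>"],
)
print(record.gaps())  # ['revisit_when']
```

A record that reports `revisit_when` as a gap is exactly the case described above: a decision with no stated expiry condition.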

Chatter, Evidence, And The Handle

Conversations are evidence. They are not automatically the decision.

A transcript may show how the team explored options. A command log may show what was verified. A task note may show the acceptance criteria. Those artifacts are valuable, but they are too large and too noisy to become the thing every future agent reads first.

The decision record is the handle that points back to the evidence.

That handle should be boring and specific: a path, a commit, a test receipt, a design note, a task artifact, or a short explanation of the constraint. It should not ask a later human to reread an entire agent session just to understand whether a rule is real.

This matters because generated summaries can sound authoritative even when they compress away the uncertainty. A decision record should resist that by staying close to the evidence and by naming what was not decided.
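The "boring and specific" handle can be sketched as a tiny validated type. This is a minimal sketch, not a prescribed schema: the kind identifiers mirror the list in the text, but their spelling, the `EvidenceHandle` class, and the `summarize` helper are assumptions made for illustration.

```python
from dataclasses import dataclass

# Kinds mirror the list in the text; the identifier spellings are hypothetical.
HANDLE_KINDS = {
    "path",
    "commit",
    "test_receipt",
    "design_note",
    "task_artifact",
    "constraint",
}


@dataclass(frozen=True)
class EvidenceHandle:
    kind: str
    ref: str

    def __post_init__(self) -> None:
        # A whole transcript is not an accepted kind: the handle points at
        # evidence, it does not ask the reader to replay a session.
        if self.kind not in HANDLE_KINDS:
            raise ValueError(f"unsupported handle kind: {self.kind!r}")
        if not self.ref.strip():
            raise ValueError("a handle must point at something specific")


def summarize(decision: str, handles: list[EvidenceHandle], not_decided: list[str]) -> str:
    """Render the decision next to its evidence and its explicit open questions."""
    lines = [f"Decision: {decision}"]
    lines += [f"Evidence ({h.kind}): {h.ref}" for h in handles]
    lines += [f"Not decided: {item}" for item in not_decided]
    return "\n".join(lines)


# Placeholder references; a real record would carry actual paths and hashes.
print(
    summarize(
        "Task JSON is the execution record.",
        [EvidenceHandle("commit", "<commit-sha>"), EvidenceHandle("test_receipt", "<path/to/receipt>")],
        ["Whether the board should be regenerated on every task update"],
    )
)
```

Listing the undecided items next to the evidence is one way to keep a summary from sounding more settled than it is.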

Records For Agent Harnesses

In agent-heavy workflows, some of the most important decisions are not product architecture decisions. They are harness decisions.

Examples:

  • the agent must close user-visible changes with browser or operator evidence
  • a peer checkout may fast-forward only when clean, never reset automatically
  • task JSON is the execution record, while a board is only a projection
  • secrets, raw transcripts, and private runtime state cannot be pasted into public artifacts
  • a repeated verification failure must escalate the hypothesis instead of repeating blind patches

These are small rules, but they shape the quality of the whole delivery loop. If they remain implicit, every new agent has to rediscover them. If they are recorded, the system can continue with less operator friction.
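Two of the harness rules above are simple enough to express as pure functions, which makes them easy to test and hard to misread. The sketch below is an assumption-laden illustration: the function names, the enum, and the `patch_budget` threshold of two are hypothetical, and a real harness would derive the boolean inputs from its own version control and verification tooling.

```python
from enum import Enum


class CheckoutAction(Enum):
    FAST_FORWARD = "fast-forward"
    LEAVE_ALONE = "leave-alone"
    # Deliberately no RESET member: the rule forbids automatic resets.


def peer_checkout_action(worktree_clean: bool, fast_forward_possible: bool) -> CheckoutAction:
    """Fast-forward only when the checkout is clean; otherwise do nothing."""
    if worktree_clean and fast_forward_possible:
        return CheckoutAction.FAST_FORWARD
    return CheckoutAction.LEAVE_ALONE


def after_verification_failure(consecutive_failures: int, patch_budget: int = 2) -> str:
    """Escalate the hypothesis once the same check keeps failing.

    patch_budget is an assumed threshold, not a value from the text.
    """
    if consecutive_failures >= patch_budget:
        return "escalate-hypothesis"
    return "patch-and-reverify"


print(peer_checkout_action(worktree_clean=False, fast_forward_possible=True).value)  # leave-alone
print(after_verification_failure(consecutive_failures=3))  # escalate-hypothesis
```

Leaving the forbidden action out of the enum means a later agent cannot select it by accident; changing that would require editing the rule itself, which is the point at which a decision record should be consulted.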

The Goal

The goal is not bureaucracy. The goal is recoverability.

A new agent should be able to re-enter the work without rereading every transcript. A human should be able to challenge a decision without arguing against a wall of generated prose. A team should be able to tell the difference between a durable rule, a temporary workaround, and an unverified claim.

Decision records do not replace judgment. They make judgment easier to inspect later.

In an agentic SDLC, that is a practical reliability layer. Prompts create local behavior. Decision records preserve the reasons the behavior exists.
