How we ship with AI in the loop — without letting AI run the show

Building with AI assistants is fast, plausible, and dangerous in a specific way. The first draft becomes the answer; the second-best plan, which was probably best, never gets argued for. Here's the deliberate-friction methodology the studio uses to keep AI-led builds honest.

The risk in AI-assisted development isn't that the model fails. It's that it succeeds at a slightly-wrong target, and nobody in the loop has the context to push back.

Anyone who has shipped with a modern AI assistant in the loop knows the feeling. You describe the task. The model produces a plan. The plan is plausible. You read it, nothing is obviously wrong, and the model is two steps into implementation by the time you'd have formed a counter-argument. The plan ships. It works. You move on.

That sequence is the failure mode this post is about. We call it first-draft inertia, and it's the most consistent risk we've watched our own work fall into when a capable AI assistant is doing the heavy lifting.

It's not that the first draft is wrong. The first draft is usually fine. The problem is that "fine" sets the bar for the rest of the build. The second-best plan — which, on reflection, was actually best — never gets argued for. The architectural choice that would have made the next six months easier never gets named.

This is harder to see than classic AI failure modes. A model that hallucinates is loud; you catch it. A model that confidently picks a plausible second-best path is quiet, and the cost shows up months later in the work that didn't happen because the foundation didn't support it.

/ The frame

The model is competent. The first plan is plausible. Implementing it is faster than challenging it. Nobody in the loop has the context to push back. By the time the wrong-shape decision is visible, it's already a foundation, not a draft.

The fix: simulated peer review by fictional archetypes

The deliberate friction we add to every AI-assisted build is a layer of simulated peer review. Before any first-draft plan from the assistant becomes code, it gets read by a small council of fictional archetypes — each carrying a specific kind of expertise, each with a different prior, each instructed to push back from their angle.

The council isn't real people. It isn't an external review board. It's a structured prompt-and-response loop where the same AI infrastructure that produced the plan is asked to argue against it from explicitly different positions — positions we've defined in advance and reuse across projects.

The shape varies by project, but the pattern is consistent. Three to five archetypes. Each gets the plan and is asked, in their own voice, what's wrong with it. The reviews are written down. The original plan is revised against the reviews, explicitly rejected and redone, or kept with a written defence of why the objections don't apply.
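To make that loop concrete, here is a minimal sketch in Python. It is not the studio's tooling: Archetype, run_council, and the ask_model callable (a stand-in for whichever assistant API you use) are hypothetical names, and the prompts are illustrative.

# Minimal sketch of the council loop, assuming a hypothetical
# ask_model(system_prompt, user_prompt) callable that wraps your assistant API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Archetype:
    name: str    # e.g. "critical reviewer"
    prior: str   # the narrow prior this voice reads the plan under

def run_council(plan: str, council: list[Archetype],
                ask_model: Callable[[str, str], str]) -> dict[str, str]:
    # Collect one written review per archetype. The reviews are the artefact:
    # the plan is then revised against them, rejected outright, or kept with
    # a written defence of why the objections don't apply.
    reviews: dict[str, str] = {}
    for voice in council:
        system = (
            f"You are the {voice.name} on a design-review panel. "
            f"Your prior: {voice.prior} "
            "Argue against the plan from that angle only. Do not be polite."
        )
        reviews[voice.name] = ask_model(system, "Review this plan:\n\n" + plan)
    return reviews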

The archetypes (categories, not names)

/ Archetype 1

The critical reviewer

A voice whose entire job is to find what's wrong with the plan. Not "is this good," but "where does this break first." Reads the proposal as if assigned to defeat it. Trained on priors of scepticism, edge cases, and load-bearing assumptions.

/ Archetype 2

The testing-focused voice

Asks how this gets verified. What's the failure mode, what evidence proves the build worked, what's the audit trail when something goes wrong. Trained on priors of determinism, reproducibility, and "appears to work" versus "demonstrably works."

/ Archetype 3

The craft voice

Reads for taste — product taste, engineering taste, the user's taste. Trained on priors of "this works but it isn't right," sensibility about which corners are load-bearing, and the difference between a system that ships and a system somebody wants to use.

On a typical project we run three. On a higher-stakes spec we'll run five, adding a domain-specific voice (ethics, market, operational burden) when the decision has those dimensions. The point is that the archetypes have to disagree with each other. If a plan survives the critical voice, the testing voice, and the craft voice, it has survived three differently shaped objections. That's the signal.
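Continuing the sketch above, the default council and a higher-stakes extension might look like the following. The prior strings paraphrase the archetype descriptions; they are not the studio's actual prompts, and the extra voices are hypothetical examples.

# The three default voices, paraphrased from the archetype descriptions above.
COUNCIL = [
    Archetype("critical reviewer",
              "Find where this plan breaks first. Read it as if assigned to defeat it."),
    Archetype("testing voice",
              "Ask how this gets verified: failure modes, evidence, audit trail. "
              "Separate 'appears to work' from 'demonstrably works'."),
    Archetype("craft voice",
              "Read for taste. Does this work but feel wrong? Which corners are load-bearing?"),
]

# On a higher-stakes spec, add one or two domain-specific voices.
HIGH_STAKES_COUNCIL = COUNCIL + [
    Archetype("ethics voice",
              "Ask who bears the cost when this is wrong, and whether they agreed to bear it."),
    Archetype("operations voice",
              "Ask who carries the pager for this, and what it costs to run at 3am."),
]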

Why this works where self-critique doesn't

The obvious objection: just ask the assistant to critique its own plan. We've tried. It fails for two specific reasons.

Self-critique produces sycophancy, not review. Asked to find problems with its own plan, an assistant tends to find polite problems — "consider edge case X," "you may want to verify Y." It will not, in our experience, say "the entire architecture is shaped wrong, here's the alternative." It has no incentive to. The plan it produced is, by construction, the plan it thought was best.

Multiple instances of the same voice produce conformity. Asking the same model to review the same plan five times gets you five versions of the same review. The agreement looks like signal. It isn't — it's a tight cluster centred on "looks good to me."

What breaks both failure modes is deliberate persona separation. Each archetype is asked to read the plan under a specific, narrow prior — one cares above all about determinism and verification; another cares above all about the simplest thing that could ship and survive contact with users. Those priors don't average out. They produce different reviews, identifying different failure modes. The disagreement between archetypes is what makes the panel useful.

/ The mechanism

It is not the count of reviewers that matters. It is the spread of priors. A council of three voices with genuinely different priors produces more challenge than ten instances of the same voice. The whole point is to manufacture the kind of disagreement that exists naturally in a real engineering room and is missing by default in an AI-assisted build.

What it produces: different decisions, not better grammar

The output of running this methodology is not a more polished plan. It's a different plan. That distinction matters.

A polish pass — the kind self-critique produces — tightens wording, catches a missed edge case, moves a section earlier. Helpful, sure. But the plan that ships is structurally the same plan.

A council pass produces a different shape. We've watched specs flip from "do the easy version first, harden later" to "the harden-later version is the one that ships, do that now" because a testing voice argued the weak point was load-bearing for trust. We've watched architectures move from a single-process design to a verifier-and-orchestrator split because a critical voice noted that "the same component decides what to do and decides whether it worked" was a structural smell, not a polish item. We've watched product surfaces narrow — from six categories to three — because a craft voice argued the breadth was protecting bad decisions from scrutiny.

None of those changes are corrections. They're different decisions, made because somebody in the room (real or fictional) was holding a prior the original plan didn't account for. That's the artefact. Not a better-written first draft. A second draft that wouldn't have existed.

What it costs — and why we pay it

The honest report: this methodology is slower. It adds a beat between every plan and the implementation that follows. On a small task it's an unjustifiable beat — we don't run a council to rename a function. On anything we'd describe as a decision — an architectural choice, a product surface, a refusal contract — the beat is mandatory.

It's also genuinely uncomfortable. Watching the testing voice tear apart a plan you've already half-implemented in your head is not a good feeling, and the temptation is to skip the council because the plan is obviously fine. The plans that look obviously fine are exactly the ones that benefit most from it — their obviousness is the inertia we're trying to break.

One more cost worth naming: the methodology only works if you take the reviews seriously. If the council's job is to push back and the studio's job is to nod and ship the original anyway, the friction is theatre. The discipline is to hold the plan against the reviews and accept that sometimes the right answer is to throw the plan out and write a different one.

Methodology over model

The broader position underneath all of this: AI assistants are part of the dev practice. They are not the practice itself.

The model in your loop will get faster, cheaper, and different next quarter. Whatever you're building on top of it will outlive several model generations. What protects the work from those changes isn't the model choice; it's the methodology around the model. If the methodology is "trust the first plan and ship it," every model upgrade is a coin flip on whether the next first plan is better-shaped than the last. If the methodology produces decisions that survive challenge, the model becomes a tool that gets better over time without changing the shape of how the work happens.

This is what we mean when we say the studio's voice is consistent across projects: refusal as a feature in our retrieval systems, never-auto-apply in our agent products, disclosure-by-default in our content tools, AI-cannot-mark-its-own-work in our test harnesses. Those are the same shape of decision — AI assistants that know their lane — and the council methodology is how we get to them. Not because the assistant proposed them on the first pass. Because somebody in the council asked the right hostile question.

This is what works for us. Mileage will vary. But if you're shipping with AI in the loop and the first draft is usually the one that ships, it's worth asking what the second draft would have said.


We share our internal practices openly because the studio learns more from publishing them than from keeping them. If you're shipping with AI assistants and want a second opinion on your methodology — or a council pass on a spec you're not yet sure about — that's a conversation our 30-min calls are for. Book one, or write to help@digicrafter.ai.