AI Speeds Up Execution, Not the System Around It

AI made one part of software development dramatically faster, and it happened to be the part everyone already pays attention to. A developer can now come back the same day with a working implementation, a migration plan, a test scaffold, and three pull requests. That feels like a win, and in isolation it is one. The trouble starts when you remember that writing the code was rarely the thing holding the team back.

Most teams are not constrained only by execution. They are constrained by planning and validation, and AI does very little for those two by default. So you end up speeding up the middle of the system while the front and the back stay exactly where they were. The result is not a faster team. It is a team that produces more change than it can clarify, review, deploy, and own.

The Delivery System Has Three Stages

It helps to stop thinking about "writing code" and start thinking about the whole path that turns an idea into something running in production. That path has three stages, and they sit in a loop that feeds the next round of work.

Each stage is real work, each stage has an owner, and each stage can be a bottleneck. The model looks simple until AI makes one stage much faster than the other two, and then the seams start to show.

Figure: The delivery loop. AI mostly accelerates one box.
Business intent → Planning (intent into ready work) → Execution (ready work into change) → Validation (change into safe production) → Production (what you learn feeds the next plan).

Planning: Turning Intent Into Ready Work

Planning is not backlog grooming, and it is not a ceremony you run on Tuesday. It is where the team turns intent into work that can safely move through the system. Good planning answers a specific set of questions before anyone writes a line of code: what are we actually trying to accomplish, what is the smallest useful slice, where are the risky parts, who owns the decision, what must not change, what needs to be tested, what does rollback look like, and what does "done" actually mean.
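One way to make those questions concrete is to treat them as a checklist that a piece of work must satisfy before it counts as ready. The sketch below is only an illustration under that assumption: the `WorkItem` class, its field names, and the `unanswered` and `is_ready` helpers are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkItem:
    """Hypothetical record of the planning answers for one slice of work."""
    goal: str = ""                # what are we actually trying to accomplish
    smallest_slice: str = ""      # the smallest useful slice
    risky_parts: str = ""         # where the risky parts are
    decision_owner: str = ""      # who owns the decision
    must_not_change: str = ""     # what must not change
    test_plan: str = ""           # what needs to be tested
    rollback_plan: str = ""       # what rollback looks like
    definition_of_done: str = ""  # what "done" actually means


def unanswered(item: WorkItem) -> list[str]:
    """Return the names of the planning questions that are still blank."""
    return [f.name for f in fields(item) if not getattr(item, f.name).strip()]


def is_ready(item: WorkItem) -> bool:
    """Work is ready only when every planning question has an answer."""
    return not unanswered(item)


# Made-up example: intent is clear, but most of the boundary is undecided.
item = WorkItem(
    goal="Let users export invoices as CSV",
    smallest_slice="Export for a single invoice",
    decision_owner="billing lead",
)
print(unanswered(item))
print(is_ready(item))  # False
```

The point of a check like this is not the code itself but the forcing function: a blank field is a decision nobody has made yet, and it is cheaper to notice that before execution starts than after.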

In the old model, weak planning was often hidden by slow execution. If it took two weeks to build something, leadership had two weeks to clarify requirements, notice the gaps, and change direction before much was wasted. The slowness was painful, but it bought time to think.

AI removes that buffer. When a developer returns the same day with a finished implementation, nobody had the two weeks, and it turns out nobody decided the boundary of the change either. So the work moves fast and lands in the wrong place, or it lands in three places at once. Planning becomes the bottleneck precisely because execution stopped being one. AI does not forgive vague work. It makes vague work move faster, which means it reaches production faster too.

Execution: The Part Everyone Notices

Execution is where AI's effect is most obvious, and it is the first thing people point to: code generation, test generation, refactoring, documentation, migration scripts, debugging help, API clients, local tooling, quick prototypes. This is the individual contributor getting an amplifier, and the amplifier is genuinely good.

Faster execution is not the problem. Faster execution is great. The problem is that most teams treat faster code production as the whole win, and it is only one third of the system. If execution gets faster while validation stays the same, the work does not arrive in production faster. It backs up. It piles into pull request review, into the test pipeline, into deployment approval, and into the quiet question of whether anyone is confident enough to ship it. The queue just moves downstream to the stage nobody invested in.

Validation: Where the Constraint Actually Lives

Validation is more than "does the code compile." It is the whole safety system: pull request review, automated tests, security scans, architecture review, dependency checks, feature flag review, environment validation, deployment readiness, rollback confidence, production verification, and watching the system after release to see whether it actually behaves. This is where AI-assisted execution does the most damage, because a developer can now produce far more change, and every change still has to be understood, reviewed, tested, deployed, and owned by a human.

Pull request review is where this hurts first. Reviewers are no longer just reviewing code. They are reviewing code that may have been generated faster than the author can fully explain, which quietly changes what review is for. The old question was whether the code is acceptable. The better question now is whether the team understands the change well enough to own it in production. Those are not the same bar, and the second one is much harder to clear when the volume of change keeps climbing.

The IC-Plus-AI Model: Faster Locally, Slower Globally

The most common way teams adopt AI is the simplest one: give every individual contributor an AI tool and let execution accelerate. Planning and validation are left alone. What you get is a team that looks more productive up close and becomes less stable as a whole.

Two queues form. The first is a queue of work waiting to be clarified, because leadership cannot shape intent as fast as developers can now consume it. The second is a queue of finished work waiting to be reviewed, validated, and deployed, because the safety system runs at its old speed. The team is busier than ever and shipping no faster, which is a confusing place to be.
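The queueing argument can be checked with a toy simulation. The sketch below is a deliberately crude model, not a measurement: the `simulate` function, the stage rates, and the daily intake are all invented for illustration, and each stage simply finishes up to a fixed number of items per day.

```python
def simulate(days: int, intent_per_day: int, plan_rate: int,
             exec_rate: int, validate_rate: int) -> dict[str, int]:
    """Step a three-stage pipeline one day at a time.

    Each rate is how many items a stage can finish per day.
    All numbers passed in below are made up for illustration.
    """
    unclear = 0   # intent waiting to be shaped into ready work
    ready = 0     # ready work waiting for execution
    finished = 0  # finished change waiting for validation
    shipped = 0   # change that reached production

    for _ in range(days):
        unclear += intent_per_day

        planned = min(unclear, plan_rate)
        unclear -= planned
        ready += planned

        built = min(ready, exec_rate)
        ready -= built
        finished += built

        validated = min(finished, validate_rate)
        finished -= validated
        shipped += validated

    return {"unclear": unclear, "ready": ready,
            "finished": finished, "shipped": shipped}


# Same planning and validation capacity; only execution gets faster.
before = simulate(days=20, intent_per_day=5, plan_rate=4, exec_rate=3, validate_rate=3)
after = simulate(days=20, intent_per_day=5, plan_rate=4, exec_rate=10, validate_rate=3)

print(before)  # {'unclear': 20, 'ready': 20, 'finished': 0, 'shipped': 60}
print(after)   # {'unclear': 20, 'ready': 0, 'finished': 20, 'shipped': 60}
```

In this toy run, shipped work is identical in both cases. What changes is where the waiting happens: the buffer of ready work disappears, so developers are waiting on planning, and a pile of finished change builds up in front of validation.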

The honest read on this is uncomfortable. When the backlog runs dry, it is not because the team lacks typing speed. It is because leadership has not built a system for turning intent into ready work, and AI makes that gap impossible to ignore by removing the one thing that used to paper over it.

Figure: Execution accelerates; planning and validation do not.
Planning (leadership scrambles to feed work) → Execution (ICs + AI move much faster) → Validation (review, test, deploy back up).

Leadership Plans With AI

The natural next step is not to put one person in charge of the whole flow. It is to point AI at the stage that actually broke: planning. If the IC-plus-AI model starved because leadership could not shape intent as fast as developers now consume it, the fix is to give leadership the same amplifier the developers got. Leadership uses AI to build and shape the backlog, turning intent into smaller, clearer, better-sequenced slices, surfacing the risky parts, drafting the boundary of a change, and naming what must not break and what "done" means.

This is a real and obvious move, and it resolves the front of the system. The planning queue stops being the constraint, because the people who own intent can finally produce ready work at something close to the speed execution runs. You keep the fast execution you already had, and you actually feed it.

What it does not touch is the back. Faster planning and faster execution pour into the same validation stage running at its old speed. Pull request review, the test pipeline, deployment approval, the quiet question of whether anyone is confident enough to ship: none of that got faster. You have fixed two of three stages and moved the entire backlog onto the third. The constraint did not disappear. It relocated to validation, which is where it was always going to end up.

Figure: The front is fixed; the constraint moves to the back.
Planning (leadership + AI shapes the backlog) → Execution (ICs + AI move fast) → Validation (review, test, deploy still back up).

The Evidence Model: Humans Decide, Faster

With the front fixed, the constraint sits squarely in validation, and the last move, the evidence model, is about that stage. What matters is what the model does and does not mean. It does not mean AI validates the work or ships it. A human still owns every release decision. What changes is what that human is handed when they sit down to make it.

Today, approving a change is mostly assembly before judgment. A reviewer or lead pulls up the diff, works out what it touches, checks whether tests ran and what they covered, hunts for whether there is a feature flag, asks the author what rollback looks like, and eyeballs whether anything risky moved. The judgment is fast once the facts are in hand. The gathering is what eats the day. The evidence model moves that gathering off the human and onto the system.

The word that makes this work is evidence. For AI-assisted validation to be trustworthy, the system has to produce real evidence, not vibes and not "the agent said it passed." So a change arrives with the dossier already attached: a plain-language summary of what the diff does and which files and services it touches, test results showing what ran and what is not covered, static analysis and security output, feature flag state, a diff of the deploy config against what is in production now, a stated rollback plan, and after release, the production health signal tied back to that change. The human reads that and makes one call: is this risk acceptable to ship.

That role, approving risk against clear evidence, is a far better use of human judgment than acting as a status-collection service, and it is reachable without a leap of faith. You adopt it one evidence source at a time, starting with diff summaries and test coverage and adding flag and rollback state later, and the human never stops approving the risk. The realistic end state is not an autonomous pipeline. It is that your senior people decide in a fraction of the time, so validation stops being the bottleneck.
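As a rough sketch of what a dossier could look like as data, the example below models each evidence source as an optional field, so that sources can be wired up one at a time. Everything named here is an assumption made for illustration: `EvidenceDossier`, its field names, `missing_evidence`, and `record_decision` are hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EvidenceDossier:
    """Hypothetical bundle of evidence attached to one change.

    A field left as None means that evidence source is not wired up yet
    (or, for post-release health, that the change has not shipped).
    """
    diff_summary: Optional[str] = None         # what the diff does, files and services touched
    test_results: Optional[str] = None         # what ran and what is not covered
    static_analysis: Optional[str] = None      # static analysis and security output
    feature_flag_state: Optional[str] = None   # current flag state for the change
    deploy_config_diff: Optional[str] = None   # deploy config vs. what is in production now
    rollback_plan: Optional[str] = None        # stated rollback plan
    post_release_health: Optional[str] = None  # health signal tied back to the change


def missing_evidence(dossier: EvidenceDossier) -> list[str]:
    """List the evidence sources the human will not see for this change."""
    return [f.name for f in fields(dossier) if getattr(dossier, f.name) is None]


def record_decision(dossier: EvidenceDossier, approver: str, approved: bool) -> dict:
    """Record a human's call. Nothing in this sketch approves on its own."""
    if not approver.strip():
        raise ValueError("a named human must own the release decision")
    return {
        "approver": approver,
        "approved": approved,
        "missing_evidence": missing_evidence(dossier),
    }


# Early adoption: only diff summaries and test coverage are automated so far.
dossier = EvidenceDossier(
    diff_summary="Adds CSV export to the invoice service; touches 3 files.",
    test_results="All tests passed; export error path is not covered.",
)
decision = record_decision(dossier, approver="release lead", approved=False)
print(decision["missing_evidence"])
```

Keeping the gaps visible is the design choice that matters here: the approver sees not only the evidence that exists but also which sources are still missing, and the decision record always carries a named human.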

Figure: AI assembles the dossier; humans own the decision.
Planning (leadership + AI shapes the backlog) → Execution (ICs + AI move fast) → Validation (AI prepares the evidence; humans approve the risk).

Improve the Loop, Not the Middle

AI-assisted development breaks when teams use it to accelerate execution without redesigning planning and validation. That is the practical failure mode, and it is everywhere right now. The first wave of adoption makes individual developers faster, which is useful, but it also exposes the parts of the team that were already fragile. Planning has to produce clearer, smaller, better-shaped work. Execution has to produce changes that are understandable, not just fast. Validation has to move from manual inspection to evidence-based confidence. Skip those and the team does not get faster; it just builds a bigger pile of work waiting for someone to understand it.

The system was never "developer writes code." The system is intent becoming a plan, a plan becoming change, change becoming safe production, and production teaching you what to do next. If AI only improves the "change" step, every other step inherits the pressure. The team does not need more code than it can understand, review, deploy, and support. It needs a better system for turning intent into safe production change, and that is the work AI has made urgent rather than optional.

Frequently asked questions

What are the three stages of the delivery system?

Planning, execution, and validation, arranged in a loop with production. Planning turns intent into ready work, execution turns ready work into change, and validation turns change into safe production. Production then teaches you something and the loop runs again. AI mostly accelerates execution, which is why the other two stages become the bottleneck.

Why does faster AI execution not make the team faster?

Because execution was rarely the only constraint. If planning cannot shape work fast enough and validation cannot review, test, and deploy fast enough, accelerating execution just creates two queues: unclear work waiting to be specified and finished work waiting to be validated. The team looks more productive locally while shipping no faster, and sometimes less safely.

How does AI change pull request review?

Reviewers are now reviewing code that may have been generated faster than the author can fully explain. The useful question shifts from whether the code is acceptable to whether the team understands the change well enough to own it in production. That is a higher bar, and it gets harder as the volume of AI-assisted change climbs.

What is the evidence model for validation?

It does not mean AI validates or ships the work. A human still owns every release decision. It means the system assembles the dossier for each change: a plain-language diff summary, test results and coverage, static analysis, feature flag state, a diff of deploy config against production, a rollback plan, and post-release health. The human reads that and makes one call (is this risk acceptable to ship?) instead of spending the day gathering status by hand.
