In a workshop with a public-sector team this spring, we did something that would have sounded like cheating not long ago. The room needed a first set of assumptions to react to: what do we believe about this scenario, and which risks matter? We generated that set on the spot from public reviews, then validated and expanded it with the team and users.
Same with prototypes. The kind of artefact our team used to go away to build, we now create during the coffee break: rough but real, so we can all see the same thing, then critique and improve it together.
What changed is the order of things. The first version stopped being homework and became the thing we can do together in real time, while everyone who can contribute is still in the room.
AI is very good at producing a first version: a draft, a summary, a prototype, a list of options. The first version is the floor. The ceiling is what skilled people turn it into.
Human expertise decides what matters, what holds in context, what should be trusted, and what deserves attention.
AI reduces the blank page. It produces a first draft, summary, prototype, journey or list of options.
Fig. 1 — The floor is where the machine reliably gets us. The ceiling is set by people. Everything interesting happens in the rise between them.
The floor is rising, and it is rising for everyone at once. The best-known field experiment on this put GPT-4 in the hands of 758 consultants at a top firm: the bottom half of performers improved by 43 percent, the top half by 17. The pattern repeats in coding, in customer service, in writing. AI compresses the field by handing the median playbook to everyone.
But decent is not the same as good, or great. A first pancake is still a pancake; it is rarely the one you serve when people are coming over. The first one tells you the pan is warm and the batter works. The craft is in adjusting the heat, changing the thickness, knowing when to flip, and deciding what good looks like.
And here the evidence turns uncomfortable, because the floor feels like the ceiling from the inside. In the HBR workslop survey, four in ten employees reported receiving AI-assisted work that looked finished and wasn't. The trap is mistaking the floor for the finish.
Judgement is knowing what matters. What to ignore. What is technically correct but practically useless. What sounds good but will not survive contact with the organisation.
Fig. 2 — None of this is polish. It is the knowledge that decides whether a first version is worth anything at all.
This is where expert work moves. The value sits less in producing every artefact by hand and more in directing the work: setting the brief, choosing the right context, recognising weak output, editing with taste, validating with users, and connecting the artefact to the wider system.
Fig. 3 — Same artefact, two altitudes. The machine reaches the floor on the left; the human reaches the ceiling on the right.
The consequence: when every competitor's floor rises at the same time, the floor stops being an advantage anywhere. Baseline competence is becoming a utility, bought rather than built, and not something anyone can own. What cannot be bought is the organisation's definition of good: the standards, the people who know when the blueprint is lying, the judgement that turns a plausible draft into something a customer trusts. Capability planning has to follow. The organisations getting this right treat AI as a capability story and plan for judgement density the way they used to plan for headcount, asking of every team which decisions need someone who can raise work above the floor, and who that is, by name.
Which collides with how judgement has always been built. In many professions, junior people learned by making first versions. They wrote the first summary, prepared the first deck, built the first comparison, mapped the first journey, drafted the first proposal.
Much of it was imperfect. That was the point. The imperfection created feedback. The feedback created judgement and experience.
Fig. 4 — Judgement used to be a by-product of making the first version. Remove the first rung and the loop still has to close somehow.
People who study apprenticeship show us what is at stake: seniors are made by doing the job alongside someone who knows more. Remove the doing, and the becoming goes with it.
This is why I changed my own practice. When the first version is generated in the room, the judging happens out loud: why this assumption is lazy, where the prototype flatters the process, what the data would have to show before anyone believes the business case. A junior sitting in that room is watching judgement being articulated, sentence by sentence. The old apprenticeship hid this inside two weeks of solo homework. The new one can put it on the screen. Organisations that generate first versions together, and argue about them together, are running a teaching hospital without paying extra for it.
The same logic shapes careers. When execution is cheap, your value is the distance you can move work above the floor, and whether anyone can see you doing it. Keep a record of the rise: what you changed in the first version before it shipped, and why. That list is your judgement made visible. It persuades more than any tool you can name, and it is exactly what to hand the junior.
Humans and AI move through the work together. AI drafts, compares, suggests, searches and simulates. Humans frame, challenge, interpret, validate and decide.
Fig. 5 — Not a clean handover but a constant movement.
If the value moves up, so does the work. Four shifts follow.
Fig. 6 — Each shift moves the question from counting output to building capability.
The first of these, commissioning, is where the floor's quality gets set before anything is generated; the brief deserves the same care as the review.
AI makes the first version cheaper. That is good. The last part is where someone decides whether the work has meaning, whether it fits the organisation, whether it deserves attention, and how to use it.
Sources: Dell'Acqua et al., Navigating the Jagged Technological Frontier; Mollick, Centaurs and Cyborgs on the Jagged Frontier; METR, the impact of AI on experienced developers; HBR, AI-generated workslop; Stanford Digital Economy Lab, Canaries in the Coal Mine?; Taste is the new bottleneck.