Bernhard Götzendorfer

The Scarce Skill Is Not Typing, It Is Verification

The bottleneck in AI-assisted development has moved from writing code to judging it. Why verification is the skill that matters now -- from daily practice, not theory.

A hand with a magnifying glass inspecting flowing code on a dark background, sepia-amber ink sketch

TL;DR

In AI-assisted development, the bottleneck has moved. Producing code got cheap -- agents write, test, and review it. The scarce resource is verification: the ability to judge whether what was produced actually holds up. Whoever signs off owns it, and responsibility cannot be delegated to a model. This article explains why verification is the skill that matters now -- from daily practice, not theory.

A Panel That Sharpened the Question

On June 16 I sat on the panel "Architecture & Development in the New" at the Wien Software Architecture Meetup, hosted in the Accenture building on Schottenring in Vienna. Around me: two voices from Accenture, one from SQUER, one from fab4minds. I was the only independent solo practitioner in the room. The others spoke from an enterprise and consulting perspective. I speak from the workbench.

One question ran through the whole evening: what really changes when machines write most of the code? My answer was short, and it polarized: speed is not the problem. The bottleneck has moved. Producing code used to cost time. Now judging it costs time. The scarce skill is no longer typing, it is verification.

I have worked this way every day since late 2024. For me this shift is not a forecast. It is my working reality.

What Actually Changed

When people ask me what AI changed most about my daily work, the honest answer is: I barely write code anymore, I orchestrate. Several agent sessions run in parallel, and each of them eventually wants a decision from me. What used to be typing is now deciding.

The gain is real. Speed and breadth have grown enormously. In my setup, agents cover roles that classically take five to ten people: implementation, tests, reviews, documentation. I know this works from practice, not theory: more than 250 prototypes have come out of it since late 2024.

But there is a flip side, and it is rarely named. Human attention is the new scarcest resource. At the end of the day it is not my hand that is tired, it is my judgment. Context switches cost. And this is exactly where the decisive point sits:

My agents work in parallel. When I hesitate, I become the bottleneck.

The bottleneck is me. Not because I type too slowly, but because I am the final authority that judges whether the output holds. More agents produce more output. More output needs more verification. And verification cannot be parallelized at will as long as a human signs off at the end.

Why Code Got Cheap and Trust Did Not

It is worth cleanly separating the two things that often get mixed up here: production and verification.

Production is the generation of code, tests, configuration, documentation. Agents do this impressively well today. They are fast, they are persistent, and they never clock out.

Verification is the judgment of whether what was produced is correct, secure, and sensible. Whether the tests check the right thing. Whether the architecture will get expensive in three months. Whether the shortcut the agent took is a problem or not.

Code got cheap. Trust did not. An agent that asserts something plausible but simply wrong costs me more than an agent that delivers nothing. Because the false, plausible output is the most dangerous one: it looks like a solution.

A concrete example from my daily work. I keep a hard rule: when a review agent claims a bug, nothing gets fixed right away. First, a test is written that proves the bug. Sounds pedantic. But surprisingly often, while writing that test, it turns out the bug does not exist at all. The agent asserted something plausible that was not true. Had I fixed it immediately, I would have broken working code -- on the say-so of a machine.

That is verification in its purest form. No longer fighting with syntax, but with claims.
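The test-first rule for claimed bugs can be sketched in a few lines. This is a minimal illustration, not the tooling described here: the function `apply_discount`, the helper `claim_reproduces`, and the review agent's claim are all hypothetical. The reproduction asserts the buggy behaviour, so it only succeeds if the bug is real.

```python
from typing import Callable


# Hypothetical function under review; it stands in for real project code.
def apply_discount(total_cents: int, percent: int) -> int:
    """Return the discounted total; the percentage is clamped to 0..100."""
    percent = max(0, min(percent, 100))
    return total_cents * (100 - percent) // 100


def claim_reproduces(reproduction: Callable[[], None]) -> bool:
    """Run a reproduction; True means the claimed bug was demonstrated."""
    try:
        reproduction()
    except AssertionError:
        return False
    return True


def repro_negative_total() -> None:
    # Hypothetical claim from a review agent:
    # "a discount above 100% yields a negative total".
    # The assertion describes the buggy behaviour, not the desired one.
    assert apply_discount(1000, 150) < 0


if claim_reproduces(repro_negative_total):
    print("Claim reproduced: keep the test, then fix the code.")
else:
    print("Claim not reproduced: nothing gets fixed.")
```

In this sketch the clamp already prevents the negative total, so the reproduction fails and the working code stays untouched. If the claim had held, the same test would remain in the suite as a regression test once the fix is in.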

Verification as an Architecture Problem, Not a Chore

This is where the most common objection comes in: "And what if the human can no longer check it because there is too much?" The question is fair. When five agents work in parallel, I cannot read every line. That would be the surest path back into the bottleneck.

The answer is not more discipline. It is architecture. The human does not check every line, the human checks the checking system. Verification has to be built into the structure, not bolted on as process. Four mechanisms carry that for me:

  • Separation of powers in review. After each work wave, several reviewer roles run in parallel, all read-only: one for security, one for test gaps, one for architecture, one that checks whether the plan was actually implemented. Whoever reviews never implements, otherwise they end up reviewing their own work. The interesting part: each role finds a different class of error. No single reviewer would have caught them all.
  • Scope enforcement. Each agent may only write to its assigned files. Since I started enforcing that, I have had no merge conflicts between parallel agents. Not because the agents are well-behaved, but because the structure makes conflicts impossible.
  • Gates instead of meetings. At the end of each unit of work, automatic checks run: typecheck, tests, schema validation, a comparison between docs and reality. A gate that does not block is decoration.
  • Errors become rules. Every error found is captured as a lesson and fed back, in machine-readable form, into future sessions. The system learns faster than I can forget.
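Two of the mechanisms above, scope enforcement and blocking gates, lend themselves to a small sketch. Everything named here is an assumption for illustration: the agent names, the path patterns, and the helpers `out_of_scope`, `scope_gate`, and `run_gates`. A real setup would wire the typecheck, tests, and schema validation in as further gates.

```python
from fnmatch import fnmatchcase
from typing import Callable

# Hypothetical agent names and path patterns; a real setup would load
# these from its own configuration.
SCOPES: dict[str, list[str]] = {
    "implementer-api": ["src/api/*"],
    "implementer-ui": ["src/ui/*"],
    "doc-writer": ["docs/*"],
}


def out_of_scope(agent: str, changed_files: list[str]) -> list[str]:
    """Return the changed files that fall outside the agent's patterns."""
    patterns = SCOPES.get(agent, [])  # unknown agents may write nothing
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


def scope_gate(agent: str, changed_files: list[str]) -> bool:
    """Pass only if every changed file is inside the agent's scope."""
    return not out_of_scope(agent, changed_files)


def run_gates(gates: dict[str, Callable[[], bool]]) -> list[str]:
    """Run every gate and return the names of the ones that failed."""
    return [name for name, check in gates.items() if not check()]


changed = ["src/api/users.py", "src/ui/button.tsx"]
failed = run_gates({
    "scope": lambda: scope_gate("implementer-api", changed),
    "tests": lambda: True,  # placeholder for a real test run
})
# A real runner would exit non-zero here so the unit of work is blocked.
print("blocked by:", failed) if failed else print("all gates passed")
```

The design choice worth noting is the default: an agent without an entry in `SCOPES` may write nothing, so a forgotten configuration fails closed instead of open.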

That is what governance means to me: not control after the fact, but structures in which the error cannot arise in the first place. If you want to go deeper into how to keep AI agents consistent and checkable across many sessions, the details are in Harness Design for AI Coding Agents.

Whoever Signs Off, Owns It

There is one part of verification that no architecture takes off your hands: the sign-off. And this is exactly the core that polarized most on the panel.

Responsibility cannot be delegated to a model. Whoever signs off is liable. To me an agent is like a compiler: a powerful tool, but no one sues the compiler. When an agent decides something, a human still answers for it. For a company that writes no code at all, exactly the same holds: when an AI produces an invoice, a quote, or a customer email and reports no error, that is no proof of correctness -- someone has to sign off on the substance.

Accountability does not get automated.

My practical model has three parts. First: every sign-off is a human act and it is logged. Every artifact carries a provenance marker -- which agent produced it, which human released it. Second: when something goes wrong, my first question is not "who did it" but "which gate failed and why did it not catch the error". Third: the consequence of an error is a sharper gate, not just someone to blame.
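The first part of this model, a logged human sign-off with a provenance marker, could look roughly like the following. This is a sketch under assumptions: the `Provenance` fields, the content hash, and the in-memory log are illustrative choices, not the format used in the setup described here.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from hashlib import sha256
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    artifact: str                      # e.g. a file path or document id
    content_sha256: str                # ties the marker to one exact version
    produced_by: str                   # which agent produced the artifact
    released_by: Optional[str] = None  # which human released it
    released_at: Optional[str] = None  # UTC timestamp of the sign-off


def record_production(artifact: str, content: bytes, agent: str) -> Provenance:
    """Create the marker at the moment an agent produces an artifact."""
    return Provenance(artifact, sha256(content).hexdigest(), agent)


def sign_off(marker: Provenance, human: str, content: bytes,
             log: list[Provenance]) -> Provenance:
    """Release an artifact in a human's name and append the act to the log."""
    if not human.strip():
        raise ValueError("a sign-off needs a named human")
    if sha256(content).hexdigest() != marker.content_sha256:
        raise ValueError("content changed since production; review it again")
    released = replace(
        marker,
        released_by=human.strip(),
        released_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(released)
    return released


def is_released(marker: Provenance) -> bool:
    """An artifact counts as released only once a human is recorded."""
    return marker.released_by is not None
```

The hash check means a sign-off applies to exactly the version that was reviewed. If the content changes afterwards, the release has to be repeated.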

To keep this from being theory, I draw a clear line between what agents may decide and what I decide. My criterion is reversibility. Anything that can be rolled back cleanly -- code in a branch, tests, reviews, refactorings -- agents may decide. Anything that reaches the outside world or is not reversible -- deploy, data deletion, communication -- I decide. No exceptions.

And this line does not live in a policy document, it lives in code: in permissions, hooks, and gates. An agent cannot talk it out of the way. A line that exists only in a PDF does not exist in practice.
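As an illustration of a line that lives in code, the reversibility criterion can be written as an allowlist. The action names and the functions `who_decides` and `authorize` are hypothetical; a real setup would enforce the same rule in its permissions and hooks.

```python
from enum import Enum


class Decider(Enum):
    AGENT = "agent"
    HUMAN = "human"


# Hypothetical action names. Only actions that can be rolled back cleanly
# are listed; deploys, data deletion, and outbound communication are not.
REVERSIBLE_ACTIONS = frozenset({
    "edit_branch",
    "write_test",
    "run_review",
    "refactor",
})


def who_decides(action: str) -> Decider:
    """Agents decide reversible actions; everything else goes to the human."""
    if action in REVERSIBLE_ACTIONS:
        return Decider.AGENT
    return Decider.HUMAN  # unknown actions default to the human


def authorize(action: str, approved_by_human: bool = False) -> bool:
    """Allow an action if an agent may decide it or a human approved it."""
    return who_decides(action) is Decider.AGENT or approved_by_human


print(authorize("refactor"))                        # True
print(authorize("deploy"))                          # False
print(authorize("deploy", approved_by_human=True))  # True
```

Because the list names what agents may do, not what they may not, a newly introduced action type lands on the human side until someone deliberately classifies it as reversible.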

The honest framing matters: in my work, owner and approver are the same person, because I work solo. In an organization the same principle means ownership must not diffuse. If "the AI" is the owner, then no one is the owner.

The New Skill: Disagreeing With Good Reasons

If verification is the scarce skill, then what you need to learn shifts too. The "struggle time" that used to come from writing code does not disappear. It moves: from writing code to falsifying it.

The valuable skill is no longer typing boilerplate fast. The valuable skill is disagreeing with an agent for good reasons. Checking claims, falsifying hypotheses, building judgment. That is hard, so it is real struggle time -- just in a different place.

This is exactly what I mean when I say verification is the scarce skill. It is not about distrust of the technology. It is about the sober realization that an agent you cannot disagree with is no longer a tool, it is a supervisor. AI is a tool -- powerful, but not magic. The place where the human stays irreplaceable is the judgment about the result.

There is a related topic I am deliberately only touching here: regulated organizations with sign-off processes and guardrails have a harder time with this shift than a solo builder does. That is a separate topic for a separate article. Here I stay with the one thesis: verification is the skill that matters -- whether solo or in a team.

Conclusion

The shift of the bottleneck from production to verification is the most important change in AI-assisted development. Whoever ignores it keeps optimizing the wrong place.

The key points in summary:

  1. Code got cheap, trust did not. The most dangerous output is the plausibly wrong one.
  2. Verification is an architecture problem. Separation of powers in review, scope limits, automatic gates -- the human checks the checking system, not every line.
  3. Whoever signs off, owns it. Accountability does not get automated. Draw the line along reversibility, and write it into code, not into a PDF.
  4. The new skill is reasoned disagreement. Falsify claims instead of accepting them blindly.

I have put together how to build this verification loop yourself in the AI Builder Guide at agenticbuilders.at/guide. If you would like to talk about a concrete use case in your own company, you can reach me via the contact page. I deliberately work with only one or two clients at a time and take the time accordingly.