Coding agents have inverted the constraints

An industrial funnel

Writing the software used to be the slow part. It was always the easy part (most smart people knew this), but depending on your skill level, it was typically the slow part. At least in my opinion.

Picture a double-sided funnel: on the left, requirements go in; in the middle, the software gets written; on the right is product validation. The tightest point of that funnel probably used to be the writing of the software. Requirements gathering and ideation could typically be done faster than the software could be written. New features and bug fixes could be validated faster than new code could be shipped, assuming a reasonably balanced team makeup (a few engineers to one QA person, perhaps). This worked great for a really long time, and it explains why software engineers were in demand and could command high salaries.

image.png

In the two or three years since GPT-3.5, through GPT-4o, Claude 3 Opus and 3.5 Sonnet, o1/o3, Gemini 2.5 Pro, and finally this past November’s release of Claude Opus 4.5 (and the 4.6/5.3 releases that followed this initial draft), the constraints have undoubtedly flipped. I can now pump out code for new features and bug fixes faster than my QA person can validate it (largely manually; put a pin in this). I can work through my ideas list faster than I can reasonably think through a proper architecture for each idea.

image.png
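If it helps to see the funnel as arithmetic, here is a toy sketch. It assumes each stage has a fixed weekly capacity and that the whole pipeline ships only as fast as its slowest stage. The stage names and every number are made up for illustration, not measurements from my team.

```python
# Toy model of the funnel: each stage has a weekly capacity, and the
# pipeline as a whole ships only as fast as its slowest stage.
# All numbers below are invented for illustration.

def bottleneck(stages: dict[str, float]) -> tuple[str, float]:
    """Return the name and capacity of the slowest stage."""
    name = min(stages, key=stages.get)
    return name, stages[name]

# Items per week each stage can handle (hypothetical).
before_agents = {"requirements": 10, "coding": 4, "validation": 8}
after_agents = {"requirements": 10, "coding": 40, "validation": 8}

print(bottleneck(before_agents))  # ('coding', 4)
print(bottleneck(after_agents))   # ('validation', 8)
```

In this made-up example, coding gets ten times faster but the pipeline only goes from 4 to 8 items a week, because validation is now the narrow point.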

This would be great if my ideas weren’t garbage or if we could solve QA verification at scale. But that remains the final frontier, at least in the development space.

Let’s talk about both of these.

QA/Validation

It’s complicated because it’s ambiguous. Of course your QA team is validating the happy (and maybe some unhappy) paths of your changes/app. But you know what else they’re probably doing? Poking and prodding. Seeing if stuff passes the “smell test”. They’re checking odd flows that might not be directly related to your change set but could be tangential to it. And that’s just the surface-level stuff. Beyond that, they’re probably checking that data and changes propagate throughout the system. That’s also a hard and ambiguous thing to test in a world of microservices or sprawling apps/APIs. After that, they’re probably remembering “that old bug/behavior” that was oddly similar to the error they just saw, and relaying to engineering that these two symptoms are likely related. Why is all this hard? Because it would be hard to describe in English for your organization, to say nothing of how you’d memorialize it in code.

It’s maybe easy to describe this in a broad sense, but if you were to try to standardize it into SOPs/documentation for your organization, you’d say “see you in six months”, if not longer. Maybe there’s a world where vision models get good enough to recognize UIs quickly and accurately (right now it’s slow and expensive), or context windows become so big that downloading the entire DOM of your React app, with more layers than your mom’s bean dip, is not prohibitive, and the model can reasonably tell what’s going on and truly understand it. Maybe the WebMCP spec that Google just shipped behind an experimental flag in Chrome will work out and revolutionize how we interact with web UIs (I haven’t read the spec, so I’m speculating at best) and none of this will matter. I kind of doubt it though. Either way, this space has yet to see the revolutionary changes we’ve seen in code shipping, and we’ll need investment here if we want the pace of change to continue the way it has.

Ideation

Ah yes, the masses are proclaiming it’s the “year of the idea guy”, just like it’s the year of the Linux desktop. And to some extent, it is. But in an efficient market, the good ideas have already been built by an incumbent. And new ideas are shipped to market faster than ever before (that doesn’t mean you can’t do it, though). That said, I think the new prerequisite for an actual product moat is novelty, or true domain expertise transformed into, wait for it, novelty.

So yeah, you can ship a web/mobile app faster than ever before, but so can everyone else. And your personal backlog of side projects can be knocked out by a Ralph Wiggum loop overnight with a decent prompt and $10 of Opus 4.6 credits. But were those ideas ever any good in terms of product-market fit or actual TAM? If they’re anything like mine, probably not.

Now let’s say you’ve got an actual, good idea. Good for you! It probably takes a non-trivial amount of time and effort to refine that idea into an actual product specification, or into code designed to actually do what you want. Even my boss (our CEO), one of the best storytellers I’ve ever met, can describe things beautifully for a human audience, but it still takes architecting, building, and iterating to turn his critical business insights into proper shareholder value. See again, novelty.

Maybe an LLM will one day start to ship entire enterprise-grade systems, knowing where the gotchas would be or when to choose one architecture pattern over another. I’m sure they’ve read Designing Data-Intensive Applications. And yet at the same time, there’s something distinctly human about that intuition and the ability to sift through the noise of what you’re trying to build and come out with a coherent picture that can be described to a 5th grader. BUT EVEN STILL, IF YOU WANT A MACHINE TO BUILD IT, YOU STILL HAVE TO DESCRIBE IT. And before that you’ve gotta decide what to describe. Both are non-trivial tasks that will likely still need a human brain for a good while.

Marc Andreessen said not long ago that “VCs will be the only job that’s left because they’re the ones with taste” (or something to that effect), and I hate to agree with such an out-of-touch comment, but the point he’s making holds up better than I thought when I first saw it. It’s the human taste, curation and vision that will remain the last chasm for LLMs - perhaps forever. How do you describe taste? How do you recognize an inherently pleasant-to-use experience vs a bad one? Some people are good at this. Jobs, Ive, Huang and Musk (though those last two are more engineers than product visionaries, I suppose), that Rubin guy with the earphones vibing (or so he says), Chesky, the list goes on, but not for very long. The point I’m trying to make is that those people were exceptional and had long careers learning and developing mastery over their respective crafts. Mastery, obsession, curation, taste… all sides of the same coin, hard to describe, and they take a lifetime of in-the-trenches work to develop.

And until a model can capture that human edge, this will remain a bottleneck as well. I don’t know whether I’d rather be proven wrong, or be proven right in this case. Either way it’s exciting.


I’m not really a writer, so be gentle. But I’ve been wanting to articulate this sentiment for some time now in a longer-form piece. Wishing you happy shipping in the year ahead!

https://media.brianvia.com/uploads/49a4d19b-bbff-4cbf-9bd6-b45780e44b08.jpg