This essay is part of an ongoing series exploring what AI reveals about human thinking. Previously: Most People Stop at Step One, Writing as Proof of Human.
The adage says ideas are cheap and execution is everything. AI promised to make execution cheap. With that hurdle removed, we should now have plentiful ideas plentifully executing. We should be awash in dreams come true. But when I actually built things with AI, and banked real money on the results, reality told a different story.
The old adage isn’t quite dead yet.
Note of disclosure: This essay was written with the assistance of Claude Opus 4.6. Thoughts, arguments, experiences, and the framework presented are mine. Claude helped me organize, challenge, and refine the writing. The irony of using AI to write about the limits of AI is not lost on me. See Writing as Proof of Human for my thoughts on that particular contradiction.
The Quick Win Story
A few weeks ago, I had an idea. Build a research platform to bid on and win a watch at auction. So I fired up Codex and did it.
In a single afternoon’s work.
I am not a developer. I’m a Salesforce architect: yes, that Salesforce, but I don’t write code for a living. I don’t compile much more than boxes in Lucid, meeting notes, and unread emails. Hell, I barely know the difference between a Merkle tree and a Douglas fir.
And so, the promise of AI — “got an idea? tell it to [Claude, Codex, etc] and watch it come true!” — is directed squarely at me.
So, case closed, yeah? Idea + AI for the win!
It’s a wonderful success story, and seductive. Look what I built in just a few hours. All I had to do was think and type…in Good Old Fashioned American™.
Except it’s tragically wrong. Let’s take a deeper look at what’s really going on.
The idea, “I want to bid on [something] at an auction and not lose my shirt,” is common. Lots of people browse auction sites (like eBay) and wish they understood what they’re looking at and whether they’re getting a good deal. The idea, on its own, is worth approximately nothing.
Had I simply prompted for “an app to help me win a watch at auction at a really good price”, I would not, dear reader, have won a watch at auction at a really good price. I would instead be writing a vastly different, far more upset story with requisite sad violin accompaniment. Instead, the builds that I’ve completed and pressure tested forced a realization: the “ideas vs. execution” frame is fundamentally wrong. It’s not that ideas flipped to being more valuable than execution with AI. It’s that both words — “idea” and “execution” — hide a spectrum that matters more than the individual words can properly encompass.
See the Matrix
Let me try to build the vocabulary I think we’re missing.
When someone says “I have an idea,” they could mean wildly different things. “I want to build a cool website design that people will buy” is an idea. “I want to build a visually interactive explainer of how polarized sunglasses work” is also an idea. While both are ideas, they exist on different levels of concept granularity.
In simpler terms, the difference is clarity. How precisely can you specify what the end state looks like? Can you describe the inputs, the outputs, the success criteria, the failure modes? Can you tell someone — or something — exactly what “done” looks like? Or, to be more specific to our examples: “wtf is cool, man?”
I think of this as the idea clarity axis. On one end: “I’d know it if I saw it, but I can’t articulate it yet.” On the other: “I can define a spec so precise that a stranger could evaluate whether the output meets it.”
For you project managers sitting up front rolling your eyes, preach louder for the readers in the back.
There’s a second axis that I think matters just as much, and it’s the one most people miss: output verifiability. Once something has been produced — built, written, designed, decided — how easily can you tell whether it’s right? What does “right” even mean to you?
Some outputs are trivially verifiable. Code compiles or it doesn’t. A calculation matches the expected result or it doesn’t. A recommendation is backed by trusted data or it isn’t. You can check. You can test. You can point at the thing and say “this is correct” or “this is wrong and here’s where you went wrong” and be confident in the judgment.
Other outputs are close to impossible to verify in the moment. Is this UI pleasant to use? Is this the right strategy for entering a new market? Does this essay land with the audience I want to attract? You won’t know for months or years. There’s no test suite for good judgment.
These two axes — idea clarity and output verifiability — create a matrix. And where you sit in that matrix determines almost everything about how useful AI is to you.
The Four Quadrant Model

Low idea clarity × Low output verifiability: The Wilderness
This is where AI is not just unhelpful but actively harmful. You don’t know what you want, and you can’t tell whether you got it. This is the quadrant of architecture, strategy, early-stage product direction, and, in my hot take, most AI-generated relationship advice.
Minas Karamanis, an astrophysicist, wrote a devastating essay about what happens when PhD students operate in this quadrant. His character “Bob” uses AI to produce a publishable paper that’s technically correct yet completely hollow. Bob shipped a product but didn’t learn a trade. The paper looked identical to his colleague Alice’s. But it wasn’t the work of a true scientist: take away the AI, and Bob is still a first-year student who has spent an expensive year but hasn’t started his education yet.
Low idea clarity × High output verifiability: The “Solve World Hunger” Prompt
This is the quadrant of ambitious vagueness. You can’t quite specify what you want, but boy, you’ll know a good result when you see one. It’s what we’re sold with the promises of AI transforming humanity as we know it. The promises are plausible — AI finding cures for diseases with virtual protein modeling combinations. But much like self-driving cars, it’s always just around the corner.
The examples aren’t always so dramatic. They’re local to our daily lives as well.
“Use AI to improve our customer experience.” “Clean up our bad data and find unique business insights.” “Optimize our processes.” These are real things I’ve heard at work from colleagues and clients. They are ideas with high output verifiability in theory — you could measure customer satisfaction, you could benchmark process speed, you could evaluate data completeness and hygiene. And the idea itself is so palpable that AI will give you a solution. Fast. Confidently. But it will be wrong in a way you won’t notice for months, because you didn’t explicitly define what right looks like. Your CEO just figured they’d know it when they saw it in the Q3 earnings reports.
This is also where corporate mandates to “use AI daily” fail. No specified outcomes. No verifiable targets. No clarity on what problems to solve. The stumbled rollout.
But hey, the company is “all in on Claude.”
High idea clarity × Low output verifiability: The Taste Trap
This is where things get dangerous in a subtle way. You know what you want — you can describe it, you can spec it out, you can even get AI to produce something that looks right. But you can’t verify whether it’s actually right until you live with it.
In my day job as a technical architect, I live in this quadrant constantly. A stakeholder says “we need to improve the sales cycle.” I can alchemize existing pain points into requirements. I can recommend a scalable architecture. I can even get AI to draft a solution design document that looks professional, phased, and comprehensive. But whether that architecture is right — whether it will hold up under the specific pressures of this organization, its current ecosystem of incumbent technologies, this client’s change tolerance, and this fiscal year’s budget — that’s not something I can verify with a test suite. That takes judgment. That takes knowing what happened the last three times someone attempted quote to cash on these assumptions. That takes the kind of pattern recognition that only comes from having been wrong before and remembering why.
An even more salient example: AI meeting transcripts. There is a sweetly unique form of terror that accompanies explaining to a client that despite the transcript saying you definitely could build a feature in three weeks, it was indeed wrong, and that you absolutely could NOT build it in three weeks. I think we should coin a new term for it, a portmanteau of gaslighting and AI. “AI-lighting.”
High idea clarity × High output verifiability: The Sweet Spot
This is where AI is a genuine force multiplier. You know exactly what you want. You can check whether you got it. The machine executes, you validate, and the loop is fast and tight.
My watch auction tool lived here, once I got past the initial ideation. I had eight numbered data points. I had named sources. I had a domain-specific bullshit detector built from experience. When the AI produced output, I could evaluate it immediately and trace back the point of failure, the miscalculations.
Lalit Maganti’s parser rule generation lived here too. In his writeup of building SyntaQLite, he describes the moment when AI worked best: generating the 400+ grammar rules for SQLite’s parser. He knew exactly what each rule should produce. He could review AI output within a minute or two. The loop was so fast he called it “successful beyond my wildest dreams.” But this was only possible because he’d spent years working with SQLite’s internals. The clarity came from deep expertise, not from a good prompt.
There is a projection distance here as well. Aim small, miss small. The more contained the ask or the shorter the timeline for evaluation, the higher the confidence and quality of clarity.
Think of it like aiming for a long-distance target: archery, riflery, or aviation, take your pick. At a short distance, your inherent variables and variances are fewer and have minimal ability to compound: ready, aim, fire. You can be off by a degree or two and still hit your target. At longer distances, variables compound with uncertainty. Little inaccuracies add up: wind, bearing, velocity assumptions, humidity, even heartbeat. The larger the ask “distance,” the greater the variables to factor in, and the lower the certainty in your end accuracy. In other words, short incremental asks with frequent corrections, the guiding principle of agile software delivery, win in this quadrant. The principles apply outside of tech as well.
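The aiming analogy can be put in rough numbers. As a simplified sketch, assume the only error is a fixed angular error in your aim, and ignore wind, drop, and everything else. The symbols d (distance to the target) and θ (aiming error) are introduced here purely for illustration. Under that assumption, the miss grows in proportion to distance:

```latex
\[
\text{miss} = d \tan\theta
\]
\[
\theta = 1^\circ:\quad
d = 10\ \text{m} \;\Rightarrow\; \text{miss} \approx 0.17\ \text{m},
\qquad
d = 1000\ \text{m} \;\Rightarrow\; \text{miss} \approx 17.5\ \text{m}
\]
```

Same one-degree error, a hundred times the miss. Frequent corrections help because each one resets d to something short.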
Over the last six months I built an extension to the concrete patio in my backyard. I, like a nerd, used ChatGPT to plan and verify I wasn’t going to destroy my house and was doing a good job. Rather than ask the AI to “give me a step-by-step plan to build a backyard patio extension,” I broke it down into small chunks. First, I asked for a general explainer of how a professional hardscaper would go about the project. I asked what tools they use and what the main categories of their plan were. From there, I built an outline and filled in individual steps. I asked for tool substitutions and different options at each step of the way. I asked how I could tell if I did a good job, and how I could tell if things were off. I asked what was worth paying extra for, and what I could skimp on.
Don’t worry, I still made plenty of mistakes. The extension is about 1 cm lower than the original patio. There’s a spot where some water pools after a heavy rain. But it looks gorgeous, I’m proud of myself, and it’s held up through an entire Indiana winter without a single crack or separation. My father-in-law even thinks it looks “pretty good.” All of which was possible because with each incremental step I had margin for error and opportunity to mitigate errors and re-align direction. And that came from having idea clarity and output verifiability.
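For readers who think better in code, the four quadrants above can be sketched as a tiny lookup. This is only an illustration under loud assumptions: the 0-to-1 scores, the 0.5 cutoff, and the names `Task`, `quadrant`, and `HIGH_THRESHOLD` are all hypothetical, and nothing in the framework says clarity or verifiability can actually be reduced to a number.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a score at or above this counts as "high".
# The essay defines no numeric scale; this exists only for illustration.
HIGH_THRESHOLD = 0.5

# Keys are (idea clarity is high?, output verifiability is high?).
QUADRANTS = {
    (False, False): "The Wilderness",
    (False, True): 'The "Solve World Hunger" Prompt',
    (True, False): "The Taste Trap",
    (True, True): "The Sweet Spot",
}


@dataclass(frozen=True)
class Task:
    name: str
    idea_clarity: float          # 0.0 = "I'd know it if I saw it", 1.0 = precise spec
    output_verifiability: float  # 0.0 = can't tell for months, 1.0 = checkable right now


def quadrant(task: Task) -> str:
    """Return the quadrant name for a task's position on the two axes."""
    key = (
        task.idea_clarity >= HIGH_THRESHOLD,
        task.output_verifiability >= HIGH_THRESHOLD,
    )
    return QUADRANTS[key]


if __name__ == "__main__":
    # The scores below are made-up guesses, not measurements.
    examples = [
        Task("Watch auction research tool", 0.9, 0.9),
        Task("Solution design for a sales cycle overhaul", 0.8, 0.2),
        Task("Use AI to improve our customer experience", 0.2, 0.8),
        Task("Early-stage product direction", 0.2, 0.2),
    ]
    for task in examples:
        print(f"{task.name}: {quadrant(task)}")
```

The useful part is not the lookup but the two questions you have to answer honestly before calling it: how clearly can you specify "done," and how quickly can you check it?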
Understand The Truth
So, did the relationship between ideas and execution flip?
No. The question was wrong. It was always wrong, because “ideas” and “execution” were each hiding a spectrum, and what we meant by those words was never precise enough to be useful.
What actually happened is this: AI collapsed the execution cost in one specific corner of the matrix, the high-clarity, high-verifiability quadrant. The part where you know what you want and can check whether you got it. In that quadrant, execution went from expensive to nearly free.
But the rest of the matrix? The part where you need taste, judgment, experience, the ability to specify clearly, the ability to verify meaningfully? That didn’t get cheaper. It didn’t even change. And it turns out that this — the ability to move yourself into the upper-right quadrant — was always the valuable part of “execution.”
And that led me to a realization: It turns out the matrix isn’t really about AI at all.
It’s about you.
Where you sit on both axes is a function of your clarity, your judgment, your ability to articulate what you want and evaluate what you got. AI doesn’t move you on the matrix. You move yourself: through experience, and through doing the hard, slow work that builds pattern recognition and the specific accumulated knowledge that lets you tell what’s “right” from what’s “off.”
AI amplifies wherever you already are. If you’re in the sweet spot — high clarity, high verifiability — AI makes you extraordinary. If you’re in the wilderness — low clarity, low verifiability — AI makes you confidently wrong.
Codie Sanchez recently wrote a piece framing this as “Daycare vs. Department Chair” employees. AI replaces the daycare. It amplifies the department chair. It does not — cannot — move you from one to the other. That movement is the work. It’s long nights studying, failed certification attempts, hard-earned promotions, stressful projects: all the experience that taught you something you didn’t know you were learning, the gut-check moments where you caught the wrong answer because you’d seen the right one enough times to feel the difference. You can’t shortcut that with an LLM.
Archimedes, the world’s first AI expert, said it best: “Give me a firm place to stand and a lever long enough, and I will move the Earth.”
Clarify your idea to a firm place and verifiable output. AI is a lever long enough to move the Earth.