
Stop Debugging the Model. Edit the Sentence.

Precocity Blog

AI demos hide the hard part. The real work isn't smarter models — it's clearer language.

Mar 23, 2026

Written by Ted Ingram, VP Client Success, Data & AI

When AI demos fail in production, teams debug the model. They should be editing the sentence. Drawing on a background in journalism and data, this post explores why natural language is harder to get right than code — and why the most valuable AI work happens before a single prompt is written. The organizations that make it out of the Dunning-Kruger valley are the ones who learned to ask clearer questions.

If you, like millions of others, watched the Super Bowl for the commercials, you saw a Microsoft ad for AI. Microsoft paid to show the world how easy AI has become. NFL recruiters. Player stats. Copilot in Excel. A few prompts, and the best linebacker is obvious. Every time it aired, people laughed. Not the reaction Microsoft was looking for. (Then again, who sells the hard part?)

They laughed because anyone who has ever made a real personnel decision knows that question has 15 years of film study, injury reports, locker room culture, and salary cap math behind it. The ad showed the words. It didn't show the work underneath the words.

That gap has a name. Psychologists call it the Dunning-Kruger effect: the cognitive bias where limited experience produces outsized confidence. Right now, AI is producing Dunning-Kruger confidence at industrial scale. The demo works. The room gets excited. The budget gets proposed. And nobody talks about what the demo didn't show you.

The Computer Does What You Tell It

I learned to program in BASIC. The first real lesson wasn't syntax. It was humility. The computer was going to do exactly what I told it to do. Not what I meant. Not what I intended. What I said.

That has always been true. What changed is the language.

For 40-something years, people adapted to computers. You learned their language: BASIC. SQL. Python. The instruction and the outcome had a one-to-one relationship. You typed a thing; it did that thing. Deterministic, the engineers call it. It wasn't always what the person typing wanted, but it was what they told the computer to do.

Large language models flipped that. Now the machine interprets human language, which means the quality of what comes out depends on the precision and context of what went in. The computer still does exactly what you tell it. We just tell it in English now. And English, it turns out, is a lot harder to get right than Python.

The Valley Nobody Shows You

This is where the Dunning-Kruger valley lives. The demo worked. Production doesn't. The query breaks when someone asks it differently than the person you designed it for. The assumptions baked in for one persona don't hold for another. The answer comes back technically correct and completely useless. There is a special frustration reserved for answers that are true and wrong at the same time. (As Gilfoyle says in Silicon Valley: "The reward function was a little underspecified.")

Teams go looking for an engineering fix. The model needs to be smarter. The architecture needs to change. The data pipeline needs work. Sometimes those things are true. Often, the problem is 30 words upstream, in the markdown prompt that wasn't precise enough to encode what the user actually needed to know.

They're debugging the model when they should be editing the sentence.

I've Done This Work Before

Before data and AI, I worked in journalism. As an editor, I fixed grammar. Not because I enjoyed the arguments (and there are arguments; the AP-style-versus-common-usage debate has the same energy as the newsroom rumble scenes in Anchorman, and someone always ends up with a trident) but because grammar standards exist to ease understanding. Every rule is in service of one thing: making sure the meaning the writer intended is the meaning the reader receives.

The rest of the job was figuring out why the words on the page didn't match the conversation that happened before, and sometimes after, a word was written. A reporter files a story with every fact correct and every sentence grammatically clean, and it's still the wrong story. The words, as strung together, don't tell the story or explain what's happening in a way a reader can absorb. In person, the reporter could make their point and convey their meaning through voice and body language. On the page, that full meaning has to be translated into print.

That's an editing problem.

Building natural language query systems feels exactly like that. "How are my top stores doing?" is a clean sentence and an almost entirely ambiguous question. Top by what metric? Doing relative to what baseline? My stores — her district, her region, the 12 locations she personally sweats over first thing Monday morning? The system made assumptions. In the demo, they happened to be right. At scale, for users you never interviewed, they won't be.
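One way to see what that interview has to produce (a sketch, with every field name invented for illustration): each ambiguous word in the question becomes an explicit, reviewable choice instead of a silent assumption.

```python
from dataclasses import dataclass

# A hypothetical query spec: "top", "my", and "doing" each get pinned down
# per persona, so the assumptions are written down instead of guessed.
@dataclass
class QuerySpec:
    metric: str    # "top" by what? revenue, margin, foot traffic?
    scope: str     # "my" stores: a district, a region, a personal watchlist?
    baseline: str  # "doing" relative to what? last year, plan, peer stores?
    period: str    # over which time window?

# Two users, the same clean sentence, two different questions.
district_manager = QuerySpec(metric="revenue", scope="district",
                             baseline="same period last year",
                             period="month to date")
merchant = QuerySpec(metric="sell-through", scope="category",
                     baseline="plan", period="last 4 weeks")

assert district_manager != merchant  # identical words, different meaning
```

The demo only ever runs one of these specs. Production runs all of them, whether or not anyone wrote them down.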

The real work, the work that never shows up in the demo, is the conversation that has to happen before a word is written. Who is asking? Why are they asking? What decision are they about to make? How will they interpret the answer? And yes, whether the structure of the response matches the order in which a human being actually processes information. A reporter who buries the lede writes a bad story. A prompt that buries the lede produces an answer the system never finds. Or worse: sometimes it finds the lede, and you don't know whether this is one of those times.

This isn't an engineering problem. It's a language problem. I've been solving language problems for 30 years. The medium changed. The problem didn't.

The Climb Out Looks Like Editing

The organizations that make it through the valley stopped asking how to make the AI smarter. They started asking how to make the question clearer.

They mapped who was asking. They documented the assumptions baked into every query. They built diagnostic frameworks for when an answer comes back wrong — not to debug the model but to find the place in the language chain where meaning broke down. That work looks boring from the outside. It looks like whiteboard sessions, markdown files, and careful arguments about what a single data column actually means to a district manager versus a merchant. (It also occasionally looks like a 20-minute argument about the Oxford comma. Those are sometimes the most important 20 minutes of the project.)
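One way such a diagnostic framework can be made concrete (a sketch; the stage names are invented, not a real methodology): walk the language chain in order and report the first stage where meaning broke down, instead of jumping straight to the model.

```python
# Hypothetical stages of the language chain, upstream to downstream.
STAGES = [
    "user intent captured",        # did anyone interview the asker?
    "question disambiguated",      # were "top", "my", "doing" pinned down?
    "prompt encodes assumptions",  # does the prompt state them explicitly?
    "answer matches intent",       # did the reader get the intended meaning?
]

def first_break(checks: dict) -> str:
    """Return the earliest failed stage, or 'chain held' if none failed."""
    for stage in STAGES:
        if not checks.get(stage, False):
            return stage
    return "chain held"

# Example: the prompt and the answer looked fine in isolation; the break
# was upstream, where nobody asked who the user actually was.
print(first_break({
    "user intent captured": False,
    "question disambiguated": True,
    "prompt encodes assumptions": True,
    "answer matches intent": False,
}))  # -> user intent captured
```

The point of the ordering is editorial, not technical: a failure at a late stage is only worth debugging once every earlier stage has held.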

Language is not the soft side of AI work. It is the work. The demo shows you the published story. It doesn't show you the conversation that had to happen before a word was written. How you find the place where meaning broke down turns out to be its own story.

If any of this sounds familiar, we should talk.

More Articles

Ted Ingram

Mar 23, 2026

Stop Debugging the Model. Edit the Sentence.

AI demos hide the hard part. The real work isn't smarter models — it's clearer language.

Tim Doll

Mar 10, 2026

You Built a Birdhouse

Vibe coding is real — but it builds birdhouses, not high-rises.

Tim Doll

Mar 2, 2026

Claude, or Clod?

AI is powerful enough to boost your output and subtle enough to slowly replace your thinking if you let it.

Rajiv Kalapala

Feb 10, 2026

AI Adoption That Actually Sticks: Lessons From Enterprise Enablement

Sustainable AI adoption is built on foundations, not speed.