In his very public standoff with the Pentagon recently, Anthropic CEO Dario Amodei warned that AI should never be used to kill without humans involved. The technology is capable, he said. What it isn’t capable of is handling the unexpected, the messy reality that no algorithm can plan for. That lesson is true in war and in almost every corner of work and life.
A few weeks ago, AI seemed unstoppable. Now, nearly every organization I speak with is struggling with reliability, usability, and measurable impact. The reason is simple. These models excel in controlled conditions, but they falter in the real world. That gap, what we call the “execution frontier,” is where humans still matter most.
My own engineers put it plainly. AI is strong at both ideation and well-scoped execution. The middle, where ideas and well-scoped tasks are plugged into existing systems and made to deliver reliably, still requires human work. It requires context, judgment, domain expertise, and constant recalibration. Anthropic’s recent labor report shows the same thing. The jobs least exposed to AI are the ones that demand physical presence and judgment in unpredictable situations. Think cooks and lifeguards: people who have to show up when no system could anticipate what will happen.
WHERE HUMANS SHOULD BE IN THE LOOP
This is exactly where most companies quietly struggle. Outside of coding or content creation, very few tasks can run without a human in the loop. The runway for reliable AI in the messy, real world is long. But the problem is not that AI isn’t improving. The problem is how companies are designing the infrastructure around it.
Over the past 12 months, our models have expanded automation coverage by more than 60%. Humans are still required on nearly every task. AI is not replacing humans. It is compressing what each human does per task, making every intervention smaller, more precise, and more impactful. This dynamic alone moved our business from deeply negative gross margins to positive ones, all while improving results and satisfaction for customers and workers alike. Each handoff has become a learning opportunity. Each task teaches the system where the next handoff should be. The loop evolves.
Here is the critical insight most people miss: The frontier is not a fixed line that AI is marching toward. As AI improves, the work that requires human judgment does not disappear. It migrates. The shape of human work changes, and the infrastructure must change with it. If you build your system for today’s handoff point, it will be obsolete six months from now.
THE HUMAN/AI BALANCE
So the real question isn’t how we get AI to do this without humans. The question is how we build a loop between AI and humans that gets better every single time the frontier shifts. That’s what we’ve spent years building at Duckbill: the infrastructure layer that hands the baton between AI and the real world. It took engineers and operators and millions of data points to get reliable outcomes on real phone calls, real transactions, real physical-world tasks. AI orchestrates; humans execute when the work leaves the screen.
And then the interesting part happens: Every rep teaches the system where the next handoff point should be. The baton changes hands, and then it changes hands differently the next time, and differently again after that. That creates reliability that compounds over time. The reliability isn’t in the models; it’s in the loop.
The race everyone thinks they are running is about building more capable models. The race that actually matters is building the loop that evolves. We’re early. The gap is real. And it won’t be closed by better models. It will be closed by the humans who can build the loop around them.
Meghan Joyce is CEO and founder of Duckbill.
