Are AI agents actually slowing us down?

👋  Hi, this is Gergely with a subscriber-only issue of the Pragmatic Engineer Newsletter. In every issue, I cover challenges at Big Tech and startups through the lens of engineering managers and senior engineers.
As more software engineers use AI agents daily, there’s also more sloppy software, outages, quality issues, and even a slowdown in shipping velocity. What’s happening, and how do we solve it?
Gergely Orosz
Mar 17 ∙ Preview 

When it comes to AI agents and AI tooling, most of the discussion focuses on the upside: efficiency boosts, faster iteration, and shipping more code.
Last week, we took an inside look into how Uber is adopting AI, internally. The rideshare giant has built close to a dozen internal systems to deal with code generated by AI agents. However, when quantifying the impact of AI, the focus was on how much output has increased, and how devs who use more AI also generate more pull requests; these are the “power user” devs who generate 52% more PRs than devs who use AI less. There was no mention of product quality – at all!
And there are signs that product quality is dropping overall. Today, we dig into this under-discussed topic, covering:
1. Anthropic: degraded flagship website. An annoying UX issue irritated paying Claude customers – and no one at Anthropic noticed. The company moves very fast and generates 80%+ of production code with Claude, but quality and user experience seem to be taking a backseat.
2. Amazon: AI-agent reliance triggers SEVs. Amazon’s retail org has seen a jump in outages caused by its own AI agents. Now, senior sign-off is needed for junior engineers’ AI-assisted changes.
3. Big Tech: “use AI or you’re unproductive.” Companies like Meta and Uber are tracking AI token usage in performance reviews, putting pressure on engineers to use it heavily – irrespective of the tools’ quality impact.
4. OpenCode: more time spent cleaning up. Dax Raad, OpenCode’s creator, warns that AI agents are lowering the bar for what ships, discouraging refactoring, and not speeding teams up.
5. Startups: founders see LLMs slowing down long-term velocity. Sentry’s CTO and others observe that while AI removes the barrier to getting started, it also produces bloated, hard-to-maintain code that slows long-term development.
6. Research: AI agents underperform claims. Some studies show AI coding tools produce short-lived velocity gains followed by significant tech debt increases.
7. How do we solve it? Engineers with strong architectural sense become more critical than ever. Proposed solutions include formal validation methods, and perhaps reviving some old-school QA ideas.
1. Anthropic: degraded flagship website
This article’s genesis was last week, when I’d finally had enough of a persistent UX bug on Claude’s flagship website: the prompt I typed in regularly got lost. Below is a video of me typing “How can I…” – and “losing” the first two words when the page loaded:
The sequence of events is pretty straightforward:
The page starts to render and the textbox is displayed
The user starts to type their prompt, but the page has not finished loading subscription data
The subscription information loads around a second later
The textbox is reset and the typing is lost
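For readers who want to see this class of bug in code, here is a minimal sketch of the sequence above. To be clear, this is not Claude.ai's code, which I have not seen: the `PromptBox` type, the handler names, and the `"pro"` plan value are all made up for illustration. The sketch only assumes that late-arriving subscription data causes the textbox state to be rebuilt from scratch, which is one common way this kind of reset happens.

```typescript
// Hypothetical model of a prompt box. None of these names come from Claude.ai.
interface PromptBox {
  text: string;        // what the user has typed so far
  plan: string | null; // null until subscription data has loaded
}

// Buggy pattern (assumed): when the late data arrives, the state is rebuilt
// from scratch, so anything typed in the meantime is thrown away.
function onSubscriptionLoadedBuggy(_box: PromptBox, plan: string): PromptBox {
  return { text: "", plan };
}

// Safer pattern: merge the late data into the existing state and leave
// user-owned fields (the typed text) untouched.
function onSubscriptionLoadedSafe(box: PromptBox, plan: string): PromptBox {
  return { ...box, plan };
}

// Appends typed characters to the box.
function typeChars(box: PromptBox, chars: string): PromptBox {
  return { ...box, text: box.text + chars };
}

// Replays the four steps: render, type, data arrives, keep typing.
function simulate(
  onLoaded: (box: PromptBox, plan: string) => PromptBox
): string {
  let box: PromptBox = { text: "", plan: null }; // textbox is displayed
  box = typeChars(box, "How can ");              // user starts typing
  box = onLoaded(box, "pro");                    // subscription data loads
  box = typeChars(box, "I fix this?");           // user keeps typing
  return box.text;
}

console.log(simulate(onSubscriptionLoadedBuggy)); // prints "I fix this?"
console.log(simulate(onSubscriptionLoadedSafe));  // prints "How can I fix this?"
```

The buggy handler loses the first two words, just like in the video. The safe handler keeps them because it treats the typed text as state owned by the user, not something to be re-derived whenever new data shows up. Another option is to hold off on showing the textbox until all the data it depends on has loaded, which trades a slightly slower first render for a single, stable one.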
This is a pretty basic bug you might expect in a prototype, except that this is the landing page of Claude.ai, and it’s a bug that impacts every paying customer – easily millions – every day. Even worse, the bug happens every time you visit the site.
Somehow, nobody at Anthropic tested the site to catch a plainly obvious bug which impacted 100% of paying customers. At the same time, no company uses AI coding tools more than Anthropic: around 80% of the company’s code is now generated by Claude Code, so we can assume a good part of the website is also created that way.
My complaint about Anthropic’s website being broken went a bit viral and got the attention of the developer team:
Product manager Robert Bye confirms the bug will be fixed. Source: Robert Bye
To their credit, three days later the bug was gone. There’s no longer a “double load” of the textbox: it takes a bit longer to load but only does so once.
Still, it makes me wonder how much longer this issue would’ve continued had nobody complained. Also, how many more bugs are present on the Claude website that nobody highlighted on social media? How many more features could be shipped in a state that is subpar for production-grade software with millions of paying customers?
Anthropic seems to be prioritizing moving very fast over doing so with high quality. There is no denying that the company is moving at incredible speed and running laps around competitors. A good example is how they built Claude Cowork in just 10 days. Claude Cowork handled Microsoft Word and Excel documents surprisingly well, to the point that, as I understand it, it set off a “code red” inside Microsoft’s Office division.
Microsoft responded as fast as it could, but it still took 2-3 months to launch its (cloned) response, Copilot Cowork, which arrived earlier this month, with full access still to follow.
In the case of Anthropic, moving fast with okay quality seems to make good business sense: they build a better product than what already exists, so it matters little if it’s a bit rough around the edges. They can fix quality issues post-launch and still be months ahead of the competition.
2. Amazon: reliance on AI agents causes SEVs