For Solo Chiefs: creatives, solopreneurs, and lone leaders orchestrating AI, humans, and chaos with no one to save their ass.

Building versus Buying AI Agents: Split Your Stack!

Outsourcing versus do-it-yourself is a control choice for agentic workflows.

Your agents can borrow compute, but you cannot borrow accountability. When deciding between outsourcing and do-it-yourself, rent the plumbing but keep the memory, logs, and control plane under your roof.

When should you outsource the infrastructure for agentic workflows, and when should you build it all yourself? I felt conflicted.

For the past year, I've outsourced the hosting of every technology that sits underneath my AI agents (Fibery, Make, Google Drive, and the rest). I want everything running in the cloud, and I'm happy to pay for that. I don't care that things are "free" when you host them on your own machine. My time is not free. A handful of monthly subscriptions beats hours of fighting with server uptime, internet connectivity, error logging, automated backups, and cybersecurity.

Plenty of other solo operators prefer the opposite, building and maintaining their own setup with Claude Code, OpenClaw, Obsidian, n8n, local markdown files, and so on. Good for them.

But when it comes to the agentic tech stack that runs my autonomous AI agents (memory management, prompt versioning, context management, and the like), I want to build that myself. The last thing I want is to hand Anthropic, Perplexity, or OpenAI the keys to the intelligence of my business.

Not surprisingly, others do the exact opposite. They happily delegate their whole agentic architecture to Claude Cowork, Perplexity Computer, Manus, Notion AI, or whatever launches next week.

Which left me wondering if I'm being stubborn, stupid, or both.

So the real question is this: for agentic technologies and the infrastructure underneath them, what are the criteria for deciding what to do in-house and what to outsource? Where do you draw the line between outsourcing and do-it-yourself? Am I wrong to outsource the plumbing while insisting on doing all the agentic work myself?

I decided to ask the AIs. Their answer might surprise you.

A note on my AI research approach: After framing a deep research question, I give the same question to five LLMs, each playing a different role. Perplexity is the research analyst, focused on documented evidence. Gemini is the structural analyst, digging into why something is happening and what makes it resistant to change. ChatGPT is the practical strategist, answering what to do about it. Claude is the contextual strategist, looking at the question through the lens of my target audience. Finally, Grok plays the contrarian: it maps out the mainstream consensus and then takes it apart. The result is five deep research documents, each taking a different perspective on the same Research Question. It's like having a team of rather opinionated researchers trying to formulate one answer together.

Then I feed all five documents into Gemini, which turns the Research Question into a Research Map, showing where the LLMs agree, where they contradict each other, and where one of them coughed up a unique insight that the others somehow overlooked. That whole map goes to Claude, who then decides the best way to write about it and turns it into a narrative structure with an Article Brief ready for the ghostwriter. Finally, the Article Brief and the five original research documents go to ChatGPT, who spins it all into a cohesive story.
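In rough Python, the whole relay looks something like this. It's a minimal sketch, not my actual implementation: the `ask` helper is a hypothetical stand-in for the five vendor SDKs, and the role prompts are compressed to one line each.

```python
# Stand-in for the real vendor SDK calls; it returns a canned string so the
# sketch runs end to end. In the real workflow, each role hits a different
# model's API with a much longer role prompt.
def ask(model: str, role: str, prompt: str) -> str:
    return f"[{model} ({role[:30]}...) answers: {prompt[:40]}...]"

ROLES = {
    "Perplexity": "You are the research analyst. Stick to documented evidence.",
    "Gemini": "You are the structural analyst. Explain why this is happening.",
    "ChatGPT": "You are the practical strategist. Answer what to do about it.",
    "Claude": "You are the contextual strategist. Write for the target audience.",
    "Grok": "You are the contrarian. Map the mainstream consensus, then take it apart.",
}

def research_pipeline(question: str) -> str:
    # Step 1: the same Research Question goes to five LLMs, each in a role.
    docs = {name: ask(name, role, question) for name, role in ROLES.items()}

    # Step 2: Gemini merges the five documents into a Research Map of
    # agreements, contradictions, and overlooked insights.
    research_map = ask("Gemini", "Turn these documents into a Research Map.",
                       "\n\n".join(docs.values()))

    # Step 3: Claude turns the map into a narrative structure (Article Brief).
    brief = ask("Claude", "Turn this Research Map into an Article Brief.",
                research_map)

    # Step 4: ChatGPT writes the story from the brief plus the originals.
    return ask("ChatGPT", "Write a cohesive story from this brief.",
               brief + "\n\n" + "\n\n".join(docs.values()))
```

The point of the sketch is the shape: one question fans out to five opinionated drafts, then funnels back through map, brief, and story.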
And yes, I have automated this workflow. What you read below is the result (lightly edited by me for style, readability, formatting, and proper URLs).

Stop outsourcing the part of your AI agent stack that matters most

The fastest way to weaken an AI strategy is to hand your agents' memory to a vendor before you've asked how you'll inspect it, export it, or explain it when regulators come calling.

The current rush into managed agent platforms has a familiar ring. Everyone sees Anthropic, Google, OpenAI, and Perplexity shipping new agent features every few weeks, and suddenly the metric becomes feature velocity or tokenmaxxing. Such metrics flatter vendors. They don't protect buyers.

A better metric is blast radius and recoverability. When your agent stack sits inside one managed platform, failures are correlated. When that platform has an outage, changes retention rules, retires an API, or quietly shifts product behavior, all customers discover it together. The Builder.ai collapse in 2025 showed an extreme case: vendor failure can become customer failure very quickly. And when a managed platform hides the internals of memory, logging, or context assembly, diagnosis becomes guesswork. Simon Willison's reporting on Anthropic's 2025 debugging problems captured the awkward part: privacy controls were strong enough that engineers struggled to reproduce customer issues inside their own system.

That is the FOBO trap. Fear of being obsoleted pushes teams to outsource the one layer they most need to understand.

The consensus is clearer than the market noise

Once you split the stack in two, the research gets surprisingly consistent.
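Before getting to that research, it helps to see what the split means in practice: rent the model call, own everything around it. Here is a minimal sketch under those assumptions; the `call_model` stub stands in for whichever vendor API you rent this month, and the file names are illustrative, not a prescribed layout.

```python
import json
import sqlite3
import time
from pathlib import Path

# Owned layer: memory and the audit trail live in files under my control.
DB = sqlite3.connect("agent_memory.db")
DB.execute("CREATE TABLE IF NOT EXISTS memory (ts REAL, role TEXT, content TEXT)")
AUDIT = Path("agent_audit.jsonl")

def call_model(prompt: str) -> str:
    # Rented layer: swap in any vendor's API here. It's a stub so the
    # sketch runs; nothing below depends on which vendor it is.
    return f"[model response to: {prompt[:40]}...]"

def run_agent(user_msg: str) -> str:
    # Context assembly is mine: I decide what the model sees, and I can
    # inspect exactly what it saw when something goes wrong.
    rows = DB.execute(
        "SELECT role, content FROM memory ORDER BY ts DESC LIMIT 5"
    ).fetchall()
    context = "\n".join(f"{role}: {content}" for role, content in reversed(rows))

    answer = call_model(f"Context:\n{context}\n\nUser: {user_msg}")

    # Memory and logs are written locally, so a vendor outage, retention
    # change, or retired API never takes the history with it.
    now = time.time()
    DB.executemany(
        "INSERT INTO memory VALUES (?, ?, ?)",
        [(now, "user", user_msg), (now, "agent", answer)],
    )
    DB.commit()
    with AUDIT.open("a") as f:
        f.write(json.dumps({"ts": now, "in": user_msg, "out": answer}) + "\n")
    return answer

if __name__ == "__main__":
    print(run_agent("Should I build or buy my agent stack?"))
```

The design choice is the point: the vendor is replaceable precisely because memory, logs, and context assembly never leave your disk, which is what keeps the blast radius small and recovery in your own hands.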