Welcome back!
Andreessen Horowitz general partner Anjney Midha’s main job is to lead investments in AI startups such as France’s Mistral AI and Germany’s Black Forest Labs. But behind the scenes, he’s also functioned as what he calls the firm’s “compute whisperer,” setting up and managing a group of thousands of Nvidia chips that the firm then rents to compute-hungry startups it backs.

This program, called Oxygen, started with plans for an initial cluster of more than 20,000 chips, we previously reported. Now Midha, 33, and his colleagues have expanded its original single cluster to multiple private clusters, which include groups of chips rented and bought from different cloud providers. Andreessen Horowitz then offers the servers to portfolio companies—alongside cash—in exchange for equity.

The cluster program is one of the ways that Midha says the firm is winning deals in an ultra-competitive market, where investors are frequently pitching startups to take their money just days or weeks after the startup has raised a round. Offering chips—which many rival VC firms have balked at doing—solves one of the main problems AI founders face, he told me.

“The one mistake I’ve seen the best frontier AI researchers make is they always underestimate the amount of compute they need,” he said. “Once that momentum machine starts getting going, you have to keep feeding the beast with more and more compute,” he added.

Midha also promises a significant chunk of his time to founders, becoming a de facto “co-founder” by limiting the number of investments he makes every year. For instance, he led the firm’s seed and Series A investments in Black Forest Labs, an image model maker founded by former employees of Stability AI—the only new startup he led an investment in last year.

Midha says he aims to invest in AI companies regardless of how mature they are, “doubling down” on the fast-growing startups.
And he wants Andreessen Horowitz to get 20% to 25% stakes, what he terms “co-founder-level ownership,” in companies. That’s in line with the strategy of many of the large early-stage VC firms, though some investors have been willing to take smaller stakes to get in on hot AI deals.

“With Anj you get capital, compute, and a co-founder. Those are three things you get from me,” said Midha. “And in exchange, I want meaningful ownership. Otherwise I don’t get out of bed.”

In his two years at Andreessen Horowitz, Midha has invested around $500 million in startups training or evaluating AI models, including putting $200 million into Mistral AI when Andreessen led Mistral’s $415 million Series A round in 2023. The French startup increased its valuation by four times the next year, to $6 billion, according to Andreessen Horowitz adviser Latham & Watkins.

Before Andreessen Horowitz, Midha co-founded a computer vision and augmented reality company that social app Discord bought in 2021. While working as an executive running Discord’s developer products, Midha says he was approached by former OpenAI employees including Dario Amodei who were looking for advice starting their new company, Anthropic. In 2021, Midha advised the team weekly on compute strategy and fundraising and then threw his “life savings” into the startup as one of its earliest investors.

Also while at Discord, Midha said he helped his friend David Holz arrange to use the popular chat app as the user interface for the image model maker Holz was developing, which became Midjourney.

The connections to Amodei and Holz made him realize he had the network and access to AI founders worth investing in, so he launched a venture firm, AM Engine, and began making investments in young AI startups. The firm caught Andreessen Horowitz’s attention, and it hired him as a general partner in 2023.
Midha’s venture firm was acquired as part of the process; it’s unclear if Andreessen Horowitz acquired all the stakes in his portfolio companies or kept the investments independent.

In an interview with The Information, Midha explained why he is concentrating his AI bets, why the VC firm is expanding its compute clusters, and why he thinks the skeptics are wrong about the prospects for companies training their own models.

This interview has been lightly edited for length and clarity.

The Information: Talk to me about how many checks you’re writing a year and the size of your checks.

Midha: Across the six boards I’m on—Black Forest Labs, Mistral, OpenRouter, Sesame AI, Luma AI and LMArena—I put around half a billion dollars to work in 18 months. The speed at which these companies are able to grow, not just in valuation, but in revenue, and then consume even more compute, makes it much more attractive to keep doubling down on these. This is one of the biggest mistakes that venture capital has made in the past. They think of traditional startups as falling into early and growth. They think early-stage is usually under $50 million checks, and growth-stage is if you’re raising more than $100 million. These laws do not apply anymore. Across seed and Series A, we invested around $120 million into Black Forest. Now, we’re usually getting co-founder-level ownership in exchange for that and giving them compute. But that’s the only investment I led all of last year. Just one. I would rather create a category leader with compute and myself as a co-founder, and put $120 million in, than write four checks of like, $20 million each.

The Information: Sure, but the challenge I hear is that investors need meaningful ownership percentages to put that kind of capital to work.

Midha: I get 20% to 25% ownership. These funds would kill for the ownership I get.

The Information: But you said you also get equity in exchange for those GPUs.

Midha: With Anj you get capital, compute, and a co-founder. Those are three things you get from me.
And in exchange, I want meaningful ownership. Otherwise I don’t get out of bed.

The Information: Your approach feels different than a lot of other venture capital firms, or even Andreessen Horowitz’s other partners. What is giving you conviction that this is the right way to approach this moment in AI?

Midha: It’s a “winner takes most” [market]. My experience with Anthropic taught me that there won’t be 10 frontier language models. There may be three in closed source, and maybe one or two in open-source. You need to co-create these companies around the world as scientists and turn them into winners. That’s not actually that new. It’s just that most of venture capital for the last 10 years has turned into comparison shopping, because there’s 20 different companies in SaaS, and you line them all up, side by side, you pick their metrics. You pick the breakout. That’s a great way to make money. But we’re in a different regime right now where a new industry is being founded and [you need] to build iconic winners.

The Information: You helped start Andreessen Horowitz’s compute program Oxygen. Is the frenzy for access to GPUs over, or is the program still in demand? How has the conversation around infrastructure changed for portfolio companies?

Midha: It has gotten even louder. If you added up the amount of time I spend on a given day helping founders, I spend about 80% of my time on compute and recruiting. Those are the two biggest constraints and that has not gone away. There were a lot of pundits who said a year ago that model training will get commoditized, compute will get commoditized, applications are where all the money is and all these frontier labs are going to die. In reality, you had Anthropic go from $1 billion to $5 billion in run rate in eight months, you had Mistral go from zero to $1.6 billion in [total contract value] in one year and each of those teams has put out state-of-the-art technology and they literally don’t have enough compute to serve customer demand.
Once that momentum machine starts getting going, you have to keep feeding the beast with more and more compute. And no matter what, the one mistake I’ve seen the best frontier researchers make is underestimating the amount of compute they need. If you’re not the winner, if you’re not the best state-of-the-art model team, then you don’t need more compute. There’s lots of places for you to go to get small numbers of nodes. If you need 50 to 60 GPUs, there’s no problem. But if you want to run a training run on more than 2,000 H100s or Blackwells, where they are reliable, stable and you don’t have to babysit them, there aren’t many options for you. My job is to be the acceleration infrastructure provider to the world’s best frontier models. My job is to be the compute whisperer who can give them more speed.

The Information: We last reported that Andreessen’s cluster was close to 20,000 chips. In response to the demand that you just described, has that cluster grown in size?

Midha: What has changed in the last year since I set up the program is that back then the idea was that we could get one cluster and then portion it out to multiple people. The demand has grown so much that we just had to procure multiple clusters in multiple places of different types. People think that data centers are this commodity unit that are all interchangeable. [But] no two data centers are born the same, and no two frontier labs’ data center needs are born the same. My job is to provide compute to the founder at whatever shape, scale, speed, size, configuration they need for their company. But the program has scaled well beyond one cluster. I can’t comment on the number of chips, but it’s several thousands. We’ve saved the portfolio companies hundreds of millions of dollars. And what I’m working on now is a way to 10x that.