Andreessen Horowitz general partner and Mistral board member Anjney "Anj" Midha has been closely following DeepSeek's progress. Six months ago, he was impressed by the company's Coder V2, which rivaled OpenAI's GPT-4 Turbo on coding-specific tasks. Since then, DeepSeek has continued to release improved models, including its latest open-source reasoning model, R1, which has sent shockwaves through the tech industry.
R1's performance has been touted as matching industry-leading models at a fraction of the cost, leading some to wonder whether this spells the end of massive investments in GPU chips and data centers. Midha disagrees, arguing that R1's efficiency improvements will let companies do more with the compute they already have rather than reducing their need for it. "Now we can get 10 times more output from the same compute," he said.
Midha argues that Mistral, despite raising only about a billion dollars, remains competitive with rivals OpenAI and Anthropic, which have each raised many times that amount. The key to Mistral's competitiveness, he says, lies in its open-source model, which lets it tap free technical labor from the contributors who use the project. Closed-source rivals, by contrast, have to pay for all of their labor and compute.
Meta's Llama, the largest Western open-source rival to Mistral's models, will also continue to receive significant investment. CEO Mark Zuckerberg has pledged to spend "hundreds of billions of dollars" on AI overall, including $60 billion in capital expenditures in 2025, mostly on data centers.
Midha, who is also involved with AI image generator Black Forest Labs and 3D model maker Luma, and who is an angel investor in AI outfits including Anthropic and ElevenLabs, has another reason to believe AI's hunger for GPUs won't abate anytime soon. As the leader of a16z's Oxygen program, which lets portfolio companies share GPUs, he has seen the scarcity of high-end chips, particularly Nvidia's H100s, firsthand. The program is currently "overbooked," with startups needing GPUs not only to train AI models but also to run their AI products for customers.
That insatiable demand for inference, the compute consumed by serving models in production, is driving the need for more GPUs, and Midha believes DeepSeek's engineering breakthroughs won't change the trajectory of projects like Stargate, OpenAI's $500 billion AI data center partnership with SoftBank and Oracle.
Midha sees a more significant impact from DeepSeek's R1, however: the recognition by nation-states that AI is the next foundational infrastructure, akin to electricity and the internet. He advocates for "infrastructure independence," in which Western nations use Western models that follow Western laws and ethics and abide by NATO agreements, rather than relying on Chinese models with their own set of rules and concerns.
Not everyone shares Midha's concerns about Chinese open-source models, since companies can run them locally in their own data centers. DeepSeek is also already available as a secure cloud service from American providers such as Microsoft's Azure AI Foundry, so developers don't have to use DeepSeek's own cloud service. Former Intel CEO Pat Gelsinger has even said that his startup Gloo is building AI chat services on its own version of DeepSeek R1 rather than on alternatives like Llama or OpenAI's models.
In short, DeepSeek's R1 is a significant breakthrough in AI, but it won't reduce the demand for GPUs. Instead, it will drive more efficient use of compute and further accelerate the buildout of AI infrastructure. As Midha joked, "If you have extra GPUs, please send them to Anj."