DeepSeek, a Chinese AI lab, has taken the tech world by storm as its chatbot app surged to the top of the Apple App Store charts, raising questions about the US's ability to maintain its lead in the AI race and the sustainability of demand for AI chips. The app's sudden popularity has sparked concerns among Wall Street analysts and technologists alike, who are now reevaluating the US's position in the AI landscape.
But where did DeepSeek come from, and how did it rise to international fame so quickly? The company's origins can be traced back to High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. High-Flyer was founded in 2015 by AI enthusiast Liang Wenfeng. In 2023, the fund launched DeepSeek as a lab dedicated to researching AI tools separate from its financial business. The lab later spun off into its own company, also called DeepSeek, with High-Flyer as one of its investors.
DeepSeek's technical team, which reportedly skews young, has been aggressive in recruiting AI researchers with doctorates from top Chinese universities. The company has also hired people without computer science backgrounds to help its technology better understand a wide range of subjects. Despite US export bans on hardware, DeepSeek has managed to build its own data center clusters for model training, using Nvidia H800 chips, a less powerful alternative to the H100 chips available to US companies.
The company's AI models, including DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, were first unveiled in November 2023. However, it wasn't until the release of its next-gen DeepSeek-V2 family of models in spring 2024 that the AI industry started to take notice. DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks and was far cheaper to run than comparable models at the time. This forced DeepSeek's domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.
DeepSeek-V3, launched in December 2024, only added to the company's notoriety. According to DeepSeek's internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models like Meta's Llama and "closed" models that can only be accessed through an API, like OpenAI's GPT-4o. Equally impressive is DeepSeek's R1 "reasoning" model, released in January 2025, which DeepSeek claims performs as well as OpenAI's o1 model on key benchmarks. As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models, but it takes a little longer to arrive at solutions than typical non-reasoning models.
However, there is a downside to DeepSeek's models. As Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their responses "embody core socialist values." In DeepSeek's chatbot app, for example, R1 won't answer questions about Tiananmen Square or Taiwan's autonomy.
DeepSeek's business model is unclear, but the company prices its products and services well below market value and gives some away for free. The company attributes its cost competitiveness to efficiency breakthroughs, although some experts dispute the figures it has supplied. Even so, developers have taken to DeepSeek's models, which are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek's models, developers on Hugging Face have created over 500 "derivative" models of R1 that have racked up 2.5 million downloads combined.
DeepSeek's success has been described as "upending AI" by some and as "over-hyped" by others. The company's success was at least in part responsible for causing Nvidia's stock price to drop by 18% in a single day in late January 2025, and for eliciting a public response from OpenAI CEO Sam Altman. Microsoft announced that DeepSeek is available on Azure AI Foundry, its platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek's impact on Meta's AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a "strategic advantage" for Meta.
However, not everyone is embracing DeepSeek. Some companies are banning its use, and so are some countries and government bodies. New York state, for one, has banned DeepSeek from being used on government devices, and the US government appears to be growing wary of what it perceives as harmful foreign influence. What DeepSeek's future holds is unclear, but improved models are a given.
As the AI landscape continues to evolve, one thing is certain – DeepSeek's sudden rise to fame has sent shockwaves through the industry, forcing companies and governments to reevaluate their positions and strategies in the AI race.