Chinese AI lab DeepSeek has taken the world by surprise, with its chatbot app rising to the top of the Apple App Store and Google Play charts. This sudden surge in popularity has led Wall Street analysts and technologists to question whether the US can maintain its lead in the AI race and whether demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly? The company's origins can be traced back to High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. High-Flyer was founded in 2015 by AI enthusiast Liang Wenfeng. In 2023, it launched DeepSeek as a lab dedicated to researching AI tools, and the lab was later spun off into its own company.
DeepSeek's technical team is notable for its youth, and the company recruits doctoral AI researchers aggressively from top Chinese universities. It also hires individuals without computer science backgrounds to help its technology better understand a wide range of subjects. Despite being affected by US export bans on hardware, DeepSeek has managed to build its own data center clusters for model training, using Nvidia H800 chips as a less powerful alternative to the H100 chip available to US companies.
The company's AI models, including DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, were first unveiled in November 2023. However, it was the release of its next-gen DeepSeek-V2 family of models last spring that caught the attention of the AI industry. DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks and was significantly cheaper to run than comparable models at the time.
The success of DeepSeek-V2 forced domestic competitors, including ByteDance and Alibaba, to cut prices for some of their models and make others completely free. The company's subsequent releases, including DeepSeek-V3 and the R1 "reasoning" model, have only added to its prominence. According to DeepSeek's internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models like Meta's Llama and "closed" models that can only be accessed through an API, like OpenAI's GPT-4o.
The R1 model, released in January, is particularly notable because it effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. It takes longer to arrive at solutions than a typical non-reasoning model, but in exchange it tends to be more reliable in domains such as physics, science, and math.
However, DeepSeek's models come with a downside: they are subject to benchmarking by China's internet regulator to ensure that their responses "embody core socialist values." This means that the company's chatbot app, for example, won't answer questions about Tiananmen Square or Taiwan's autonomy.
DeepSeek's business model is unclear, but the company prices its products and services well below market value, giving some away for free. The company attributes its cost competitiveness to efficiency breakthroughs, although some experts dispute the figures supplied. Despite this, developers have taken to DeepSeek's models, which are available under permissive licenses that allow for commercial use. Over 500 "derivative" models of R1 have been created on Hugging Face, racking up 2.5 million downloads combined.
DeepSeek's success has been variously described as "upending AI" and as "over-hyped." The company's impact was at least partly responsible for an 18% drop in Nvidia's stock price on Monday, and it elicited a public response from OpenAI CEO Sam Altman. Microsoft has also announced that DeepSeek is available on its Azure AI Foundry service, a platform that brings together AI services for enterprises under a single banner.
What DeepSeek's future holds is unclear. Improved models are a given, but the US government appears to be growing wary of what it perceives as harmful foreign influence. One thing is certain, however: DeepSeek's sudden rise to fame has sent shockwaves through the AI industry, forcing companies and governments to reevaluate their strategies and priorities.