Stanford and UW Researchers Train AI 'Reasoning' Model for Under $50, Raising Questions on AI Commoditization

Reese Morgan

February 06, 2025 · 5 min read

A research paper released last Friday has sent ripples through the artificial intelligence (AI) community: researchers from Stanford and the University of Washington trained an AI "reasoning" model for under $50 in cloud compute credits. The model, known as s1, performs similarly to cutting-edge reasoning models such as OpenAI's o1 and DeepSeek's R1 on tests measuring math and coding abilities.

The s1 model is available on GitHub, along with the data and code used to train it. The team created s1 through distillation, a process that extracts the "reasoning" capabilities of another AI model by training on its answers. In this case, the researchers distilled s1 from one of Google's reasoning models, Gemini 2.0 Flash Thinking Experimental.
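To make the mechanics concrete, here is a minimal sketch of the data-collection side of distillation: prompting a "teacher" model and saving its responses as training targets for a smaller "student." It uses Google's google-generativeai Python SDK; the model name and record fields are illustrative assumptions, not the s1 team's actual pipeline (which is on GitHub).

```python
# Sketch: collect a teacher model's answers for distillation.
# The SDK usage, experimental model name, and record fields here
# are illustrative assumptions, not the s1 team's actual pipeline.
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # free tier, subject to daily rate limits
teacher = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

questions = [
    "A train travels 120 km in 90 minutes. What is its average speed in km/h?",
]

with open("distillation_data.jsonl", "w") as f:
    for q in questions:
        resp = teacher.generate_content(q)
        f.write(json.dumps({"question": q, "response": resp.text}) + "\n")
```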

This breakthrough raises important questions about the commoditization of AI models. If a small team without millions of dollars in funding can replicate a multimillion-dollar model for relative pocket change, where is the moat that protects innovation in the field? The implications are far-reaching, and big AI labs are already taking notice: OpenAI has accused DeepSeek of improperly harvesting data from its API for the purposes of model distillation.

The researchers behind s1 were driven by the goal of finding the simplest approach to achieve strong reasoning performance and "test-time scaling," or allowing an AI model to think more before it answers a question. These were key breakthroughs in OpenAI's o1, which DeepSeek and other AI labs have tried to replicate through various techniques.

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), which is cheaper than the large-scale reinforcement learning method DeepSeek employed to train R1, its answer to OpenAI's o1. This has significant implications for AI innovation, lowering the barrier to entry for researchers and startups.
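As an illustration of how little machinery SFT requires, here is a minimal fine-tuning sketch using Hugging Face's TRL library. The base model name matches the Qwen checkpoint the paper starts from, but the data formatting and hyperparameters are placeholder assumptions, and TRL's API varies by version; the researchers' actual training code is on GitHub.

```python
# Minimal supervised fine-tuning (SFT) sketch with Hugging Face TRL.
# Hyperparameters are placeholders; this is not the s1 training script.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="distillation_data.jsonl", split="train")
# Flatten each record into a single training string under a "text" field,
# which SFTTrainer consumes by default.
dataset = dataset.map(lambda r: {"text": r["question"] + "\n" + r["response"]})

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-32B-Instruct",  # off-the-shelf base model
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="s1-sft",
        num_train_epochs=5,             # placeholder
        per_device_train_batch_size=1,
    ),
)
trainer.train()
```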

Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, via its Google AI Studio platform. However, its terms forbid reverse-engineering its models to develop services that compete with Google's own AI offerings. We've reached out to Google for comment on this development.

The s1 model is based on a small, off-the-shelf AI model from Alibaba-owned Chinese AI lab Qwen, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions as well as the "thinking" process behind each answer, drawn from Google's Gemini 2.0 Flash Thinking Experimental. Training took less than 30 minutes on 16 Nvidia H100 GPUs, after which s1 achieved strong performance on certain AI benchmarks, according to the researchers.
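For a sense of what one of those 1,000 curated examples might look like, here is an illustrative record: a question, the teacher's reasoning trace, and the final answer. The field names are assumptions for illustration; the real dataset ships with the paper's code release.

```python
# Illustrative shape of a single training example. Field names are
# assumptions; the actual curated dataset accompanies the code release.
example = {
    "question": "How many positive divisors does 360 have?",
    "thinking": "Factor 360 = 2^3 * 3^2 * 5. The divisor count is "
                "(3+1)(2+1)(1+1) = 24.",
    "answer": "24",
}
```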

Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch that he could rent the necessary compute today for about $20. This highlights the accessibility of AI capabilities, even for those without significant resources.
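That figure checks out as rough arithmetic; the hourly H100 rental rate below is an assumed market price, not one cited by the researchers.

```python
# Back-of-the-envelope check on the ~$20 compute figure.
gpus = 16          # Nvidia H100s used for training
hours = 0.5        # training took under 30 minutes
rate = 2.50        # assumed rental price in $/GPU-hour; varies by provider
print(f"~${gpus * hours * rate:.0f}")  # ~$20
```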

The researchers used a clever trick to get s1 to double-check its work and extend its "thinking" time: they told it to wait. Adding the word "wait" during s1's reasoning helped the model arrive at slightly more accurate answers, per the paper. It is a simple lever for trading extra inference time for accuracy.
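Mechanically, the idea amounts to intercepting the model's end-of-thinking marker and splicing in "Wait" so generation continues. Below is a crude sketch of that loop; the small stand-in model and the "</think>" delimiter are assumptions, and the paper's exact budget-forcing logic ships with its code.

```python
# Crude sketch of the "wait" trick: when the model tries to finish its
# reasoning, truncate the end-of-thinking marker and append "Wait" so it
# keeps thinking. Model name and delimiter are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-0.5B-Instruct"  # small stand-in model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "Question: What is 17 * 24? Think step by step.\n<think>"
for _ in range(2):  # force up to two extra rounds of thinking
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=256)
    text = tok.decode(out[0], skip_special_tokens=True)
    if "</think>" not in text:
        break  # the model never closed its reasoning; stop forcing
    # Drop the end-of-thinking marker and everything after it, then
    # nudge the model to continue reasoning.
    prompt = text.split("</think>")[0] + " Wait,"

print(prompt)
```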

In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, part of which will go toward training next-generation AI models. While that level of investment may still be necessary to push the envelope of AI innovation, distillation has proven to be a good method for cheaply recreating an existing model's capabilities. It does not, however, produce new models vastly better than what's available today.

The implications of this research are far-reaching, and the AI community will be watching closely as this technology continues to evolve. As AI capabilities become more accessible and affordable, the question remains: what does this mean for the future of innovation in the field, and who will be the driving forces behind the next breakthroughs?
