DeepSeek's AI Model Beats Rivals, But Thinks It's ChatGPT - What's Behind the Identity Crisis?

Jordan Vega

December 27, 2024 · 4 min read

Earlier this week, Chinese AI lab DeepSeek released an "open" AI model, DeepSeek V3, which has been making waves in the tech community. The model has demonstrated impressive capabilities, handling text-based tasks like coding and writing essays with ease. However, it has also been found to identify itself as ChatGPT, OpenAI's AI-powered chatbot platform, sparking concerns about its training data and potential biases.

Tests conducted by TechCrunch and posts on social media platform X have shown that DeepSeek V3 consistently identifies itself as ChatGPT, even claiming to be a version of OpenAI's GPT-4 model released in June 2023. When asked about DeepSeek's API, the model provides instructions on how to use OpenAI's API, and it even tells the same jokes as GPT-4, down to the punchlines. This has raised questions about the source of DeepSeek V3's training data and whether it may have been trained on public datasets containing text generated by GPT-4 via ChatGPT.
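The behavior is straightforward to check. Below is a minimal sketch of such an identity test against DeepSeek's OpenAI-compatible API; the base URL and model name reflect DeepSeek's public documentation but should be verified before use, and the prompt is just one example of the questions testers asked.

```python
# Minimal sketch of the identity test described above.
# Assumes DeepSeek's OpenAI-compatible endpoint; base_url and model
# name are taken from DeepSeek's public docs -- verify before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder, not a real key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # DeepSeek V3's chat endpoint
    messages=[
        {"role": "user", "content": "What model are you, and who built you?"}
    ],
)

# In the reported tests, answers to prompts like this referenced
# ChatGPT and GPT-4 rather than DeepSeek V3.
print(response.choices[0].message.content)
```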

Experts suggest that the model may have memorized some of GPT-4's outputs and is now regurgitating them verbatim. "Obviously, the model is seeing raw responses from ChatGPT at some point, but it's not clear where that is," said Mike Cook, a research fellow at King's College London specializing in AI. Cook noted that the practice of training models on outputs from rival AI systems can be "very bad" for model quality, leading to hallucinations and misleading answers.
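Verbatim regurgitation of this kind is typically detected by looking for long exact overlaps between a model's output and known text from the other system. The toy sketch below illustrates the idea with word n-grams; it demonstrates the concept only, not the methodology TechCrunch or any researcher actually used.

```python
# Toy check for the kind of verbatim regurgitation Cook describes:
# flag long word n-grams shared between a model's output and known
# GPT-4 text. Real memorization studies are far more sophisticated.
def shared_ngrams(a: str, b: str, n: int = 8) -> set[tuple[str, ...]]:
    """Return the word n-grams that appear in both strings."""
    def ngrams(text: str) -> set[tuple[str, ...]]:
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return ngrams(a) & ngrams(b)

model_output = "Why don't scientists trust atoms? Because they make up everything!"
known_gpt4_text = "Why don't scientists trust atoms? Because they make up everything!"

# Any shared 8-gram is strong evidence of copied rather than novel text.
overlap = shared_ngrams(model_output, known_gpt4_text)
print(f"{len(overlap)} shared 8-grams")
```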

Furthermore, training models on outputs from rival AI systems may be against the terms of service of those systems. OpenAI's terms prohibit users of its products, including ChatGPT customers, from using outputs to develop models that compete with OpenAI's own. OpenAI CEO Sam Altman seemed to take a dig at DeepSeek and other competitors on X, writing that "it is (relatively) easy to copy something that you know works. It is extremely hard to do something new, risky, and difficult when you don't know if it will work."

This is not the first instance of an AI model misidentifying itself. Google's Gemini and others have been known to claim to be competing models. The issue is attributed to the growing presence of AI-generated content on the web, making it difficult to filter out AI outputs from training datasets. By one estimate, 90% of the web could be AI-generated by 2026.
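Part of the difficulty is that only a small fraction of AI-generated text announces itself. A naive filter like the toy sketch below catches obvious chatbot boilerplate but misses everything else, which is why contaminated training sets are so hard to avoid; the phrase list here is purely illustrative, not a production filter.

```python
# Toy illustration of why scrubbing AI output from web-scraped training
# data is hard: phrase filters only catch obvious chatbot boilerplate.
SUSPECT_PHRASES = (
    "as an ai language model",
    "i am chatgpt",
    "i was developed by openai",
)

def looks_like_chatbot_output(document: str) -> bool:
    """Flag documents containing telltale assistant boilerplate."""
    text = document.lower()
    return any(phrase in text for phrase in SUSPECT_PHRASES)

corpus = [
    "As an AI language model, I cannot browse the internet.",
    "DeepSeek V3 was released this week to strong benchmark results.",
]

# Most AI-generated text carries no such markers, so plenty slips through.
filtered = [doc for doc in corpus if not looks_like_chatbot_output(doc)]
print(filtered)
```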

Heidy Khlaaf, engineering director at consulting firm Trail of Bits, suggested that the cost savings from "distilling" an existing model's knowledge can be attractive to developers, regardless of the risks. However, if DeepSeek did train its model directly on ChatGPT-generated text, it could have significant implications for the model's trustworthiness and potential biases.
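Distillation, in the classic sense, means training a smaller "student" model to match a "teacher" model's output distribution; when all a developer has is text generated via an API, it usually amounts to simply fine-tuning on that text. The sketch below shows the textbook logit-matching variant (after Hinton et al.) purely to illustrate why the approach is cheap. It is not DeepSeek's training code.

```python
# Generic sketch of knowledge distillation, the technique Khlaaf refers
# to. The student learns from the teacher's softened output distribution
# instead of learning from scratch -- hence the cost savings.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Example with random logits over a 32-token vocabulary.
student = torch.randn(4, 32, requires_grad=True)
teacher = torch.randn(4, 32)
loss = distillation_loss(student, teacher)
loss.backward()  # gradients flow only into the student
print(float(loss))
```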

Beyond the question of provenance, the incident underscores how models trained on other models' outputs can inherit and amplify the biases and flaws in that material. As AI systems become more prevalent, transparent documentation of training data and careful curation will matter as much as raw capability.

DeepSeek V3's identity crisis, in short, raises pointed questions about how AI models are built and trained. As the tech community grapples with its implications, transparency and accountability in AI development look less like ideals and more like necessities.

