OpenAI's Advanced Voice Mode with Vision Falls Short on Reliability

Riley King

December 18, 2024 · 3 min read

OpenAI's Advanced Voice Mode with Vision, a feature that feeds real-time video to ChatGPT so it can respond more naturally and intuitively, has proven prone to mistakes and hallucinations. Rather than granting ChatGPT superpowers, as its premise suggested, the feature highlights the bot's biggest issue: reliability.

The feature was first demoed nearly a year ago, and OpenAI president Greg Brockman showcased it on "60 Minutes" earlier this month. Even in that demonstration, ChatGPT made a mistake on a geometry problem, misidentifying the triangle's height. This is not an isolated incident: users have reported similar errors when using the feature.

In one instance, the bot mistakenly identified an ottoman as a couch, and when corrected, responded with a nonchalant "My mistake!" The feature's inability to accurately perceive its surroundings raises questions about its trustworthiness. As one user noted, "What good is 'Her'-like AI if you can't trust it?"

The reliability issues with Advanced Voice Mode with Vision are particularly concerning because the feature is designed to engender trust. Enabling it is also cumbersome: users must unlock their phone, launch ChatGPT, open Advanced Voice Mode, and enable Vision, which makes it all the more jarring when the bot returns inaccurate results.

OpenAI's 12-day "shipmas" event, which promises new product releases every day until December 20, has been overshadowed by the concerns surrounding Advanced Voice Mode with Vision. Meanwhile, other companies like YouTube, Meta, and DeepMind are making strides in AI development, with YouTube giving creators more control over AI model training and Meta's smart glasses receiving AI-powered updates.

In related news, a former OpenAI employee, Suchir Balaji, was found dead in his San Francisco apartment, raising concerns about the company's practices. Additionally, Grammarly has acquired productivity startup Coda, and Cohere has partnered with Palantir, a data analytics firm with close ties to U.S. defense and intelligence agencies.

Research into AI development continues, with Anthropic releasing a system called Clio to understand how customers are employing its AI models. The company claims that Clio is providing "valuable insights" for improving the safety of its AI. Meanwhile, AI startup Pika has released its next-gen video generation model, Pika 2, which can create clips from user-supplied characters, objects, and locations.

The Future of Life Institute (FLI) has released an "AI Safety Index" to evaluate the safety practices of leading AI companies. Meta received an overall F grade in the index, while Anthropic scored a C, highlighting the need for improvement across the industry.

As AI technology continues to advance, concerns about reliability, trust, and safety will only grow more pressing. It remains to be seen whether OpenAI can address the issues plaguing Advanced Voice Mode with Vision and restore trust in its AI capabilities.

Copyright © 2024 Starfolk. All rights reserved.