Skich Launches as Latest Alternative iOS App Store in Europe, Focusing on Mobile Gamers
Skich, a new third-party app store, targets mobile gamers in the EU, promising a personalized experience, despite launching with no games available.
Riley King
Epoch AI, a nonprofit organization developing math benchmarks for artificial intelligence, has faced allegations of impropriety after revealing that it received funding from OpenAI, a prominent AI research organization. The controversy surrounds the lack of transparency regarding OpenAI's involvement in the creation of FrontierMath, a test designed to measure an AI's mathematical skills.
FrontierMath was one of the benchmarks used by OpenAI to demonstrate its upcoming flagship AI, o3. However, many contributors to the benchmark were not informed of OpenAI's involvement until the information was made public on December 20. A contractor for Epoch AI, going by the username "Meemi," expressed concerns on the forum LessWrong, stating that the lack of transparency compromised the integrity of the benchmark.
Meemi argued that Epoch AI should have disclosed OpenAI's funding and provided contractors with transparent information about the potential use of their work in capabilities development. The secrecy surrounding OpenAI's involvement has raised concerns that FrontierMath's reputation as an objective benchmark may be eroded.
In response to the criticism, Tamay Besiroglu, associate director of Epoch AI and one of the organization's co-founders, acknowledged that Epoch AI "made a mistake" in not being more transparent about the partnership. Besiroglu explained that Epoch AI was restricted from disclosing the partnership until around the time o3 was launched, and said the organization should have negotiated harder with OpenAI for transparency.
Besiroglu said that OpenAI has a "verbal agreement" not to use FrontierMath's problem set to train its AI; training on the problems would be akin to teaching to the test. Epoch AI also has a "separate holdout set" for independent verification of FrontierMath benchmark results. However, lead mathematician Elliot Glazer noted that Epoch AI has not yet been able to independently verify OpenAI's FrontierMath o3 results; an independent evaluation is still pending.
The controversy highlights the challenges of developing empirical benchmarks to evaluate AI capabilities while securing necessary resources without creating perceptions of conflicts of interest. The incident raises important questions about the transparency and integrity of AI benchmarking, which is crucial for the development of trustworthy and reliable AI systems.
The saga also underscores the need for clear guidelines and protocols that hold AI benchmarking organizations to standards of transparency, accountability, and independence. As the AI landscape continues to evolve, evaluations of AI capabilities will need robust frameworks that keep them objective and free from conflicts of interest.
Copyright © 2024 Starfolk. All rights reserved.