AI models from OpenAI and Google DeepMind achieved gold medal scores in the 2025 International Mathematical Olympiad, one of the world's oldest and most challenging high school-level math competitions. Both companies entered "informal" systems, which could ingest questions and generate proof-based answers in natural language, without requiring humans to translate the problems into a machine-readable form. Researchers claim these gold medal performances represent breakthroughs for AI reasoning models in non-verifiable domains. However, Google has raised questions about OpenAI's announcement and about the official evaluation of OpenAI's model.
AI models from OpenAI and Google DeepMind have achieved gold medal scores in the 2025 International Mathematical Olympiad (IMO), one of the world's oldest and most challenging high school-level math competitions. The result points to significant advances in AI reasoning models, particularly in non-verifiable domains.
OpenAI's experimental reasoning large language model (LLM) solved five of the six problems on the 2025 IMO, earning 35 out of 42 points. The model was evaluated under the same rules as human contestants: two 4.5-hour exam sessions with no tools or internet access. This performance marks a substantial leap in AI's ability to engage in sustained creative thinking and craft intricate, multi-page proofs [1].
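The arithmetic behind the 35/42 figure can be sketched in a few lines. This is a minimal illustration, assuming each IMO problem is marked out of 7 points (the competition's standard marking, though the article does not state it) and that the model received full marks on the five problems it solved and none on the sixth. The function name `imo_score` is hypothetical and used only for this example.

```python
# Assumption (not stated in the article): each IMO problem is marked out of 7.
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def imo_score(problems_fully_solved: int) -> tuple[int, int]:
    """Return (score, maximum), assuming full marks on every solved
    problem and zero on the rest."""
    if not 0 <= problems_fully_solved <= NUM_PROBLEMS:
        raise ValueError("problems_fully_solved must be between 0 and 6")
    score = problems_fully_solved * POINTS_PER_PROBLEM
    maximum = NUM_PROBLEMS * POINTS_PER_PROBLEM
    return score, maximum


score, maximum = imo_score(5)
print(f"{score}/{maximum}")  # prints "35/42"
```

Under these assumptions, five fully solved problems account for the entire reported score.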
Google DeepMind's model also achieved gold medal-level performance, though the specifics of its evaluation and the number of problems it solved are not detailed here. Google has, in turn, raised questions about OpenAI's announcement and about how OpenAI's results were officially evaluated [2].
The achievements of these AI models highlight the rapid progress in AI capabilities. OpenAI's model, in particular, demonstrates a significant improvement in AI's ability to handle complex, non-verifiable tasks, such as crafting detailed mathematical proofs. However, the questions raised about how these results were announced and evaluated have sparked debate about the true extent of AI's capabilities and the need for standardized evaluation methods [3].
These developments come at a time when tech giants are investing aggressively in AI infrastructure and talent. Meta, for instance, has been recruiting top researchers from OpenAI and other leading AI research firms to bolster its own capabilities. This trend suggests that companies increasingly see AI as central to innovation and competitive advantage [2].
In conclusion, the gold medal performances of AI models from OpenAI and Google DeepMind in the 2025 IMO represent significant milestones in AI's ability to handle complex, non-verifiable tasks. However, transparency and standardized evaluation methods remain critical areas for further discussion and development.
References:
[1] https://www.lesswrong.com/posts/RcBqeJ8GHM2LygQK3/openai-claims-imo-gold-medal
[2] https://www.indiatoday.in/amp/technology/news/story/meta-hires-openai-researchers-jason-wei-hyung-won-chung-to-boost-ai-superintelligence-2757254-2025-07-17
[3] https://timesofindia.indiatimes.com/technology/tech-news/ai-startup-windsurf-sold-to-ai-engineer-devins-maker-cognition-just-days-after-openai-acquisition-fails-and-google-poaches-ceo/articleshow/122508123.cms