Google and OpenAI Tie in Math Olympiad with AI Models
By Ainvest
Monday, July 21, 2025, 8:11 pm ET (1 min read)
AI models from OpenAI and Google DeepMind achieved gold medal scores in the 2025 International Mathematical Olympiad, one of the world's oldest and most challenging high school-level math competitions. Both companies entered "informal" systems, which ingest the questions and generate proof-based answers in natural language, without requiring humans to translate the problems into a machine-readable form. Researchers claim these gold medal performances represent breakthroughs for AI reasoning models in non-verifiable domains. However, Google has raised questions about OpenAI's announcement and the official evaluation of its model's test.
AI models from OpenAI and Google DeepMind have achieved gold medal scores in the 2025 International Mathematical Olympiad (IMO), one of the world's oldest and most challenging high school-level math competitions. The result points to significant advances in AI reasoning models, particularly in non-verifiable domains.
OpenAI's experimental reasoning LLM (large language model) solved five of the six problems on the 2025 IMO, earning 35 of a possible 42 points (each problem is worth 7 points). The model was evaluated under the same rules as human contestants, including two 4.5-hour exam sessions with no tools or internet access. This performance marks a substantial leap in AI's ability to sustain creative thinking and craft intricate, multi-page proofs [1].
Google DeepMind's model also achieved gold medal-level performance, though the specifics of its evaluation and the number of problems it solved have been reported in less detail. Google, for its part, has questioned OpenAI's announcement and how OpenAI's results were evaluated, raising concerns about the transparency and replicability of OpenAI's claims [2].
The achievements of these AI models highlight the rapid progress in AI capabilities. OpenAI's model, in particular, demonstrates a significant improvement in AI's ability to handle complex, non-verifiable tasks, such as crafting detailed mathematical proofs. However, the dispute over how the results were evaluated has sparked debate about the true extent of AI's capabilities and the need for standardized evaluation methods [3].
These developments come at a time when tech giants are investing aggressively in AI infrastructure and talent. Meta, for instance, has been recruiting top researchers from OpenAI and other leading AI research firms to bolster its own capabilities. This trend suggests a growing recognition of AI's importance in driving innovation and competitive advantage [2].
In conclusion, the gold medal performances of AI models from OpenAI and Google DeepMind in the 2025 IMO represent significant milestones in AI's ability to handle complex, non-verifiable tasks. However, the need for transparency and standardized evaluation methods remains a critical area for further discussion and development.
References:
[1] https://www.lesswrong.com/posts/RcBqeJ8GHM2LygQK3/openai-claims-imo-gold-medal
[2] https://www.indiatoday.in/amp/technology/news/story/meta-hires-openai-researchers-jason-wei-hyung-won-chung-to-boost-ai-superintelligence-2757254-2025-07-17
[3] https://timesofindia.indiatimes.com/technology/tech-news/ai-startup-windsurf-sold-to-ai-engineer-devins-maker-cognition-just-days-after-openai-acquisition-fails-and-google-poaches-ceo/articleshow/122508123.cms

