OpenAI's New o1 Model Revolutionizes AI with Enhanced Reasoning, Outshining GPT-4o in Math and Coding Competitions
OpenAI's first large language model with reasoning capabilities, codenamed "Strawberry," has officially launched under the name "OpenAI o1." Although a release had been anticipated within two weeks, the model became available on September 12. OpenAI unveiled two initial versions: o1-preview and o1-mini. They are being rolled out in phases to paying users, free users, and developers, with developers facing higher usage costs.
The o1 model relies on a new training approach that lets it tackle more complex programming, mathematical, and scientific problems. Whereas previous GPT models were trained mainly to mimic patterns in their training data, o1 was trained with reinforcement learning to work through a problem step by step in a "chain of thought," much as a person would, before giving a summarized answer.
OpenAI researchers, including Jerry Tworek, say that o1 was trained using a new optimization algorithm and custom datasets rich in "reasoning data" and scientific literature. The model's reinforcement learning framework teaches it to solve problems by rewarding correct answers and penalizing incorrect ones.
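The idea of rewarding correct answers and penalizing incorrect ones can be illustrated with a toy example. The sketch below is not OpenAI's training method, whose details have not been published; it is a minimal REINFORCE-style update over a handful of candidate answers, and every name in it (`softmax`, `train_policy`, the candidate list, the learning rate, the step count) is a hypothetical choice made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn raw preference scores into a probability distribution."""
    peak = max(prefs)
    exps = [math.exp(p - peak) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(candidates, correct, steps=1000, lr=0.2, seed=0):
    """Learn to prefer the correct answer using only +1 / -1 rewards.

    Toy assumption: the "model" is just one preference score per candidate.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(prefs)
        # Sample an answer according to the current policy.
        choice = rng.choices(range(len(candidates)), weights=probs)[0]
        # Reward a correct answer, penalize an incorrect one.
        reward = 1.0 if candidates[choice] == correct else -1.0
        # REINFORCE-style step: reward times the gradient of log p(choice).
        for i in range(len(prefs)):
            grad = (1.0 if i == choice else 0.0) - probs[i]
            prefs[i] += lr * reward * grad
    return softmax(prefs)


if __name__ == "__main__":
    # Hypothetical candidate answers to "17 * 3"; only 51 is correct.
    answers = [41, 51, 54, 61]
    final_probs = train_policy(answers, correct=51)
    for answer, prob in zip(answers, final_probs):
        print(f"answer {answer}: probability {prob:.3f}")
```

The policy starts out choosing uniformly at random and, driven only by the reward signal, shifts nearly all of its probability onto the correct answer. A real system would update the weights of a neural network rather than a small table of scores, but the reward-driven feedback loop is the same in spirit.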
On performance, o1 shows significant gains over its predecessor, GPT-4o. On a qualifying exam for the International Mathematics Olympiad, it solved 83% of the problems correctly, compared with 13% for GPT-4o. In Codeforces programming competitions, o1 reached the 89th percentile, meaning it outperformed 89% of human competitors.
Despite these advances, the initial versions of o1 come with limitations. They currently lack features such as web browsing and file uploads, as well as broad world knowledge. Usage is also capped: o1-preview is limited to 30 messages per week and o1-mini to 50 messages per week.
OpenAI also reports improved safety behavior. In one of its hardest "jailbreak" tests, which measure how well a model resists attempts to make it generate harmful content, o1-preview scored 84 out of 100 compared with 22 for GPT-4o, suggesting better adherence to safety guidelines.
OpenAI expects o1's enhanced reasoning to be particularly useful for complex problems in fields like science and programming. For instance, medical researchers could use it to annotate cell sequencing data, and physicists could use it to generate the complicated formulas needed in quantum optics.
The o1-preview and o1-mini models are now available to ChatGPT Plus and Team users, with Enterprise and Education users expected to gain access next week. OpenAI plans to extend o1-mini to all free users in the future and aims to have ChatGPT select the appropriate model automatically based on the user's prompt.