OpenAI's "Strawberry" Model Revolutionizes AI with Human-Like Reasoning Capabilities

In the early hours of Friday, Beijing time, the AI field reached a new milestone with the debut of a model capable of general-purpose complex reasoning.
OpenAI announced on its website that it has begun rolling out the o1-preview model to all subscribers. Long anticipated and known internally as "Strawberry," the model marks a new level of AI capability on complex reasoning tasks. For that reason, OpenAI said, it reset the version counter to 1 and gave the model a name distinct from the "GPT-4" series.
What sets the reasoning model apart is that it spends more time thinking before it answers, much as a person works through a problem. Earlier models relied mainly on patterns learned from vast datasets to predict word sequences; by contrast, this approach is intended to demonstrate deeper understanding.
The first versions in the o1 series, o1-preview and o1-mini, are being rolled out gradually to paying users and to developers, who face significant usage costs; free users are to follow later. Built with novel training techniques, o1 is designed to handle intricate programming, mathematics, and science problems and to produce answers faster than a human could. The mini version focuses specifically on programming use cases.
Starting now, ChatGPT Plus and Team subscribers can reach both versions through the model selector in the user interface. Enterprise and Edu users will get access next week, and OpenAI plans to extend o1-mini to all free users eventually. The company also intends to select the model automatically based on the prompt.
OpenAI explained in earlier communications that GPT-4, released in 2023, had intelligence akin to that of a high school student, whereas GPT-5 is meant to raise AI's cognitive abilities to a doctoral level, with the o1 model as a pivotal step along the way. In preliminary tests, o1 solved 83% of the problems in a qualifying exam for the International Mathematical Olympiad, compared with 13% for GPT-4o, and reached the 89th percentile in Codeforces competitive programming contests, versus the 11th percentile for GPT-4o.
Forthcoming updates to the o1 series are expected to perform on par with doctoral students on physics, chemistry, and biology benchmarks. The model surpasses its predecessors by solving complex reasoning problems and by addressing flaws in earlier approaches.
In practice, the o1 model takes a systematic approach to problem-solving: it thinks the problem through and organizes its response before producing an answer. This process considerably improves the accuracy and quality of its output. OpenAI has also enlisted human experts to try out the new model. For instance, quantum physicist Mario Krenn of the Max Planck Institute demonstrated how o1-preview correctly solved intricate quantum physics problems that GPT-4o could not handle.
While o1-preview excels at data analysis, coding, and mathematics, it is not always the best choice for natural language tasks, so the right model depends on the use case. For all its capabilities, o1-preview also lacks many of ChatGPT’s useful functions, such as web search and file and image uploads. Still, in specialized domains, its stronger reasoning marks a significant leap in AI’s potential to solve complex problems.
OpenAI researchers say that o1’s thinking time currently runs to a matter of seconds. The company aims to extend this to hours, days, or even weeks in future versions, which it expects to yield deeper and more impactful results, especially in research-intensive areas such as developing new cancer treatments.
For now, access to the o1 series is limited. ChatGPT Plus and Team users are the first to try it, with weekly message caps on both models. Enterprise and Edu users will gain access next week, and broader availability is planned. OpenAI also plans to add browsing and file and image upload capabilities to the models, and it will continue to develop other models in the GPT series.
On the safety front, OpenAI has introduced a new method that uses the o1 model’s reasoning skills to uphold alignment and security standards. Because the model can understand safety rules and apply them in context, it is expected to comply with them more robustly.