OpenAI's GPT-4.1 Achieves 55% Accuracy on Coding Benchmark, Reduces Costs by 26%

Coin World, Monday, Apr 14, 2025 6:01 pm ET

OpenAI has introduced GPT-4.1, a suite of three new AI models designed to handle context windows of up to one million tokens. This capability allows the models to process entire codebases or small novels in a single operation. The lineup includes the standard GPT-4.1, as well as the Mini and Nano variants, all aimed at developers. The release of GPT-4.1 comes just weeks after the unveiling of GPT-4.5, raising questions about the naming and release strategy of OpenAI's models.

GPT-4.1 demonstrates significant improvements in performance and efficiency. According to OpenAI, the model achieved 55% accuracy on the SWE-bench coding benchmark, a substantial increase from GPT-4o's 33%, while also reducing costs by 26%. The Nano variant, described as the company’s smallest, fastest, and cheapest model, costs just 12 cents per million tokens. Additionally, OpenAI has clarified that processing large documents carries no extra charge: the one-million-token context is included without a pricing bump.
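To put the reported figures in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the constants are the numbers quoted above, the function names are hypothetical, and treating the Nano rate as a single flat per-token price is an assumption, since the article does not say how the rate is split between input and output.

```python
# Figures as reported in the article. Treating the Nano rate as one flat
# per-token price is an assumption made purely for illustration.
NANO_USD_PER_MILLION_TOKENS = 0.12
SWE_BENCH_GPT_4_1 = 55.0  # percent accuracy, as reported
SWE_BENCH_GPT_4O = 33.0   # percent accuracy, as reported


def nano_cost_usd(tokens: int) -> float:
    """Estimated cost in US dollars for `tokens` tokens at the Nano rate."""
    return tokens / 1_000_000 * NANO_USD_PER_MILLION_TOKENS


def benchmark_gain(new: float, old: float) -> tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    return new - old, (new - old) / old * 100


if __name__ == "__main__":
    points, relative = benchmark_gain(SWE_BENCH_GPT_4_1, SWE_BENCH_GPT_4O)
    print(f"Full 1M-token context on Nano: ${nano_cost_usd(1_000_000):.2f}")
    print(f"SWE-bench gain: {points:.0f} points ({relative:.0f}% relative)")
```

Separating the percentage-point gain from the relative gain is deliberate: the two are easy to confuse when a jump from 33% to 55% is described simply as an "increase".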

During a live demonstration, GPT-4.1 showcased its ability to generate a complete web application by analyzing a 450,000-token NASA server log file from 1995. OpenAI claims that the model can handle this task with nearly 100% accuracy, even with a million tokens of context. Michelle, OpenAI's post-training research lead, highlighted the models' enhanced instruction-following capabilities, noting that GPT-4.1 adheres to complex formatting requirements without the usual AI tendency to "creatively interpret" directions.

The release of GPT-4.1 after GPT-4.5 has sparked confusion and curiosity about OpenAI's naming conventions. The company's versioning saga includes models like GPT-4o, which was upgraded with multimodal capabilities, and the reasoning-focused model simply named "o1." The naming continues to evolve with models like o3 and o3-mini-high, each with its own characteristics and capabilities. OpenAI has also announced plans to release o4 soon, further adding to the complexity of its model lineup.

Despite the confusion surrounding the naming, GPT-4.1 is set to replace GPT-4.5, making it the shortest-lived large language model in ChatGPT's history. Kevin, OpenAI's product lead, announced that GPT-4.5 will be deprecated in the API, giving developers a three-month deadline to transition. This move is driven by the need to reclaim GPUs, highlighting the industry-wide silicon shortage that even OpenAI is facing. The new models are already available via the API and in OpenAI’s playground, but they are not yet integrated into the user-friendly ChatGPT UI.

In summary, OpenAI's release of GPT-4.1 marks a significant advancement in AI capabilities, with improved performance, efficiency, and context handling. The model's ability to process large documents and adhere to complex instructions positions it as a powerful tool for developers. However, the naming and release strategy of OpenAI's models continues to be a source of confusion, adding layers of complexity to the company's product lineup. Despite these challenges, GPT-4.1 is poised to become a key player in the AI landscape.
