Google Unveils Ironwood TPU for 200% Efficiency Gain in AI Inference

Market Intel · Wednesday, Apr 9, 2025 10:07 am ET
1 min read

Google has launched its seventh-generation Tensor Processing Unit (TPU) chip, named Ironwood, specifically engineered for AI inference tasks. The new chip is offered in two configurations: a 256-chip cluster and a 9,216-chip cluster. Google plans to make Ironwood available to customers through its cloud platform later this year.

Ironwood represents a significant advancement in Google's TPU line. Each chip carries 192GB of HBM memory and delivers a peak FP8 performance of 4,614 TFLOPs, and clusters scale up to 9,216 chips. The new TPU is designed to be twice as efficient as its predecessor, the TPU v6e (Trillium), making it a powerful platform for AI inference tasks.

Google's announcement highlights the company's dedication to advancing AI technology and providing state-of-the-art solutions for its customers. By introducing a chip specifically tailored for AI inference, Google aims to improve the performance and efficiency of AI applications, enabling faster and more accurate decision-making processes. This move is expected to bolster Google's position in the competitive AI market, where companies are continually pushing the boundaries of AI capabilities.

The development of Ironwood is part of Google's broader strategy to enhance its data center infrastructure with custom-designed chips. Google describes Ironwood as its most powerful and efficient TPU to date, built to handle both large-scale "thinking" models and inference workloads, and frames the chip as a major turning point in AI development and infrastructure.

With expenditure on AI-related technologies rising, Google is intensifying its efforts to develop custom chips for its data centers. In its largest configuration of 9,216 liquid-cooled chips, Ironwood's chip-interconnect technology delivers an aggregate peak of roughly 42.5 exaflops of FP8 compute. This breakthrough in chip design is expected to drive further advances in AI technology and its applications.

In terms of technical specifications, Ironwood's power efficiency is double that of its predecessor, Trillium (the sixth-generation TPU). The single-chip memory capacity has been increased to 192GB, which is six times that of Trillium. However, Google has not disclosed information about the chip's foundry partner.
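As a rough sanity check, the pod-level figures implied by these specifications can be computed directly from the per-chip numbers reported here (the constant names below are illustrative, and conversions use decimal units):

```python
# Back-of-the-envelope check of Ironwood's pod-scale numbers,
# using only the per-chip figures reported in this article.
CHIP_FP8_TFLOPS = 4_614    # peak FP8 performance per chip
POD_CHIPS = 9_216          # largest cluster configuration
HBM_PER_CHIP_GB = 192      # per-chip HBM capacity

pod_exaflops = CHIP_FP8_TFLOPS * POD_CHIPS / 1e6  # TFLOPs -> exaflops
pod_hbm_pb = HBM_PER_CHIP_GB * POD_CHIPS / 1e6    # GB -> petabytes

print(f"Full pod peak FP8: {pod_exaflops:.1f} exaflops")
print(f"Full pod HBM: {pod_hbm_pb:.2f} PB")
```

Multiplying out, a full 9,216-chip pod lands at about 42.5 exaflops of peak FP8 compute and roughly 1.77 PB of aggregate HBM.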

Google's announcement underscores a paradigm shift in AI development, moving from a 'reactive' to a 'proactive' approach. The company is transitioning from models that require human interpretation of real-time data to intelligent systems that can proactively generate insights and solutions. This shift marks the beginning of the 'inference era,' where AI agents will actively acquire and generate data, providing deep insights and solutions rather than raw data.