Musk's New Beast of AI Training Powers Up In Memphis, Aiming To Train 'The World's Most Powerful AI By Every Metric'

AInvest, Tuesday, Jul 23, 2024 10:03 am ET

On Tuesday, Elon Musk announced that his AI startup xAI has officially begun training at its supercluster in Memphis, Tennessee. Musk describes it as the world's most powerful AI training cluster. It is built from 100,000 Nvidia H100 GPUs, which Nvidia has been supplying since last year. Musk said the aim is to train the world's most powerful AI by December this year.

Musk described the system's configuration on social media: the Memphis supercluster runs on a single RDMA (remote direct memory access) fabric, which allows more efficient, lower-latency data transfer between machines and so gives a significant performance advantage for AI model training.

According to Musk, the RDMA fabric avoids placing additional load on the central processing unit (CPU), which is crucial for ultra-large-scale computing tasks. Technical support from Cisco and chips from Nvidia, together with the work of the xAI and X teams, have kept the training cluster plan on track.

If the Memphis supercluster operates at full capacity, it will surpass many of today's top supercomputers in scale and performance, including Frontier (37,888 AMD GPUs), Aurora (60,000 Intel GPUs), and Microsoft Eagle (14,400 Nvidia H100 GPUs).
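
As a rough illustration of the scale comparison above, the GPU counts quoted in this article can be set side by side. This is only a sketch: the dictionary labels and the `count_ratios` helper are illustrative names, not anything from xAI or Nvidia, and a raw GPU count says nothing about per-chip performance, since these systems use different accelerators.

```python
# GPU counts as quoted in the article. These are headline figures,
# not benchmark results, and the chips differ between systems.
GPU_COUNTS = {
    "Memphis supercluster (Nvidia H100)": 100_000,
    "Aurora (Intel)": 60_000,
    "Frontier (AMD)": 37_888,
    "Microsoft Eagle (Nvidia H100)": 14_400,
}


def count_ratios(counts, baseline):
    """Return the baseline's GPU count divided by each other system's count."""
    base = counts[baseline]
    return {name: base / n for name, n in counts.items() if name != baseline}


ratios = count_ratios(GPU_COUNTS, "Memphis supercluster (Nvidia H100)")
for name, ratio in ratios.items():
    print(f"Memphis has {ratio:.2f}x the GPU count of {name}")
```

Only the Microsoft Eagle figure is a like-for-like comparison, because both systems use the H100. The Frontier and Aurora ratios compare different GPU architectures and should be read as a measure of size only.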

xAI's ambitions go beyond technical leadership: its large investment in the supercomputer cluster is also having a significant effect on local infrastructure in Memphis. Musk said xAI will improve the city's public facilities, including building new substations and sewage treatment facilities to support the data center.

In May this year, Musk revealed plans to build a supercomputing facility in Tennessee before the end of 2024, which at the time was referred to as a "Gigafactory of Compute". The early start-up of the Memphis supercluster marks an acceleration of that plan.

However, Musk has a history of delayed projects, such as the long-anticipated fully autonomous cars and the Robotaxi. Many observers therefore remain cautious about the goal of training the world's most powerful AI by every metric by December 2024.

xAI also faces competition. Microsoft and OpenAI are working together on an AI training supercomputer codenamed Stargate, with an expected investment of up to $100 billion. If that project succeeds, the Memphis supercluster's position as the most powerful AI training cluster may be challenged.

In subsequent posts, Musk also said that Tesla will begin low-volume production of Optimus robots for its own use in 2025 and hopes to reach high-volume production for other companies in 2026. This timetable is later than he previously claimed, reflecting how even large technology companies adjust their schedules when faced with real technical challenges.
