Apple's MLX Framework Now Supports NVIDIA GPUs with CUDA Backend

Ainvest · Tuesday, Jul 15, 2025 8:37 pm ET · 1 min read

Apple's machine learning framework, MLX, is gaining support for NVIDIA GPUs through a CUDA backend. This integration will allow developers to run MLX models directly on NVIDIA GPUs, opening up new possibilities for testing, experimentation, and research use cases. The work is still in progress, but core operations such as matrix multiplication and softmax are already supported.
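MLX's Python API deliberately mirrors NumPy, so the core operations the article mentions are easy to sketch. The snippet below uses NumPy as a stand-in (on a machine with MLX installed, `import mlx.core as mx` exposes near-identical `matmul` and `softmax` calls); the array shapes and values are purely illustrative.

```python
import numpy as np  # stand-in for mlx.core, whose API mirrors NumPy


def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)


# Matrix multiplication: (2x3) @ (3x2) -> (2x2)
a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.ones((3, 2), dtype=np.float32)
c = a @ b

# Softmax over the last axis: each row of probabilities sums to 1.
probs = softmax(c)
```

With the in-progress CUDA backend, the same MLX calls would dispatch to an NVIDIA GPU rather than Apple silicon, which is what makes the port useful for testing and research on NVIDIA hardware.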

In a separate development, NVIDIA has introduced the B30 AI GPU, optimized for small to medium AI models and cloud services. The B30 delivers approximately 75% of the performance of the H20 AI GPU, making it a cost-effective option for Chinese tech firms. Demand is significant: in late June, Chinese tech companies placed orders for hundreds of thousands of units, totaling over $1 billion, with deliveries expected in August [1].

The B30 is designed to address two major pain points for China. First, it is positioned as the preferred solution for inference on small and medium-sized models, aligning with the industry's shift toward inference workloads. Second, it serves as a low-cost computing-power pool for cloud services: a pool built from 100 B30 GPUs can support lightweight training of billion-parameter models while cutting procurement costs by 40% and unit power consumption by close to 30% compared to the H20 [1].

The B30's deep compatibility with the CUDA-X ecosystem lets enterprises migrate workloads built on frameworks such as PyTorch without costly technical reconstruction. This compatibility is crucial for maintaining the "stickiness" of the CUDA ecosystem in mainstream model deployment [1].

While domestically made AI chips from the likes of Huawei may slightly surpass the B30 in single-card FP16 compute, the B30 retains an advantage in mainstream model deployment efficiency thanks to its CUDA compatibility [1].

References:
[1] https://www.tweaktown.com/news/106374/nvidias-new-b30-ai-gpu-for-china-expected-to-have-significant-demand-75-as-fast-the-h20/index.html

