NVIDIA TensorRT Boosts Stable Diffusion 3.5 Performance with 40% Less VRAM on RTX GPUs

Thursday, Jun 12, 2025 12:35 pm ET

NVIDIA TensorRT boosts Stable Diffusion 3.5 performance on NVIDIA GeForce RTX and RTX PRO GPUs. NVIDIA collaborated with Stability AI to quantize Stable Diffusion 3.5 Large to FP8; with the software development kit (SDK), the model uses 40% less VRAM and runs twice as fast. TensorRT for RTX, now available to developers as a standalone SDK, also enables just-in-time engine building and seamless AI deployment to over 100 million RTX AI PCs.

NVIDIA's TensorRT has significantly boosted the performance of Stable Diffusion 3.5 on NVIDIA GeForce RTX and RTX PRO GPUs, the result of a recent collaboration with Stability AI. With the software development kit (SDK), the model uses 40% less VRAM and runs twice as fast. TensorRT also enables just-in-time engine building and seamless AI deployment to over 100 million RTX AI PCs, and it is now available to developers as a standalone SDK [1].

The collaboration involved quantizing Stable Diffusion 3.5 Large to FP8, which reduced VRAM consumption by 40%. As a result, five GeForce RTX 50 Series GPU models can now run the model entirely from memory, up from just one before. The optimized models are available on Stability AI’s Hugging Face page [1].
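The link between a narrower number format and lower memory use can be sketched with some rough arithmetic in Python. The parameter count and the share of memory that gets quantized are hypothetical round numbers chosen for illustration, not figures from NVIDIA or Stability AI. The sketch relies only on FP16 storing each value in 2 bytes and FP8 storing each value in 1 byte.

```python
# Rough sketch: how converting part of a model's memory footprint from
# FP16 to FP8 changes the total. All inputs are hypothetical.

BYTES_PER_VALUE = {"fp16": 2, "fp8": 1}


def weight_bytes(num_params: int, dtype: str) -> int:
    """Memory needed to store num_params values in the given format."""
    return num_params * BYTES_PER_VALUE[dtype]


def total_reduction(quantized_share: float) -> float:
    """Fractional drop in total memory when only quantized_share of the
    original FP16 footprint is converted to FP8 (halving that part)."""
    if not 0.0 <= quantized_share <= 1.0:
        raise ValueError("quantized_share must be between 0 and 1")
    saving_per_value = 1 - BYTES_PER_VALUE["fp8"] / BYTES_PER_VALUE["fp16"]
    return quantized_share * saving_per_value


if __name__ == "__main__":
    params = 1_000_000_000  # hypothetical 1B-parameter model
    print(weight_bytes(params, "fp16") / 1e9, "GB in FP16")
    print(weight_bytes(params, "fp8") / 1e9, "GB in FP8")
    # Hypothetical case: 80% of the footprint is quantized.
    print(f"{total_reduction(0.8):.0%} total reduction")
```

Under these assumptions, halving the storage for most of the footprint, but not all of it, yields an overall saving below 50%. The article does not say which parts of the pipeline were quantized, so the 80% share is purely illustrative.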

Additionally, TensorRT for RTX was released as a standalone SDK, making it easier for developers to create optimized AI engines. This new version of TensorRT allows developers to create a generic TensorRT engine that is optimized on device in seconds, streamlining the process and reducing development time [1].

NVIDIA's advancements in AI performance and efficiency are likely to attract more developers and users to its RTX GPUs, potentially driving sales and market share in the competitive GPU market. The collaboration with Stability AI also demonstrates NVIDIA's commitment to fostering innovation in AI and its willingness to work with industry leaders to achieve these goals [1].

References:
[1] https://blogs.nvidia.com/blog/rtx-ai-garage-gtc-paris-tensorrt-rtx-nim-microservices/
