Alibaba Unveils Next-Gen AI Model with Enhanced Efficiency: Qwen3-Next Architecture Offers Training Stability and Improved Inference Throughput.

Thursday, Sep 11, 2025, 6:59 pm ET · 1 min read

Alibaba has unveiled its next-gen AI model, Qwen3-Next, with a hybrid attention mechanism and a highly sparse MoE structure. The Qwen3-Next-80B-A3B-Base model has 80 billion parameters but activates only 3 billion during inference, offering performance comparable to or better than the dense Qwen3-32B model. Training cost is reduced by nearly 90%, and for contexts longer than 32K tokens inference throughput exceeds ten times that of Qwen3-32B.

Alibaba Group Holding Ltd. has launched a new artificial intelligence model, Qwen3-Next, designed to significantly improve efficiency in both training and inference processes. The new model features a hybrid attention mechanism, a highly sparse Mixture-of-Experts (MoE) structure, training-stability-friendly optimizations, and a multi-token prediction mechanism for faster inference[2].
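To give a rough sense of why a multi-token prediction mechanism speeds up decoding, the toy loop below drafts several tokens per step and verifies them in one pass, so accepted tokens share the cost of a single model call. This is purely illustrative: the functions, the counting-mod-10 "model," and the draft heuristic are all invented for the example, not Qwen3-Next's actual mechanism.

```python
def draft(prefix, k):
    """Cheap guesser: propose the next k tokens (here, a fixed +1 pattern)."""
    toks, last = [], prefix[-1]
    for _ in range(k):
        last = (last + 1) % 10
        toks.append(last)
    return toks

def target_next(prefix):
    """Stand-in 'full model': the true sequence counts upward mod 10."""
    return (prefix[-1] + 1) % 10

def decode(n_tokens, k=4):
    out, model_calls = [0], 0
    while len(out) < n_tokens:
        proposal = draft(out, k)
        model_calls += 1                      # one pass verifies all k drafts
        for tok in proposal:
            if tok == target_next(out):
                out.append(tok)               # accept matching draft token
            else:
                out.append(target_next(out))  # fall back to the model's token
                break
    return out[:n_tokens], model_calls

seq, calls = decode(16)
print(seq, calls)  # 16 tokens produced with far fewer than 15 model calls
```

Because every drafted token matches here, 16 tokens cost only 4 verification calls instead of 15 one-token steps; in practice the speedup depends on how often the drafts are accepted.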

The Qwen3-Next-80B-A3B-Base model contains 80 billion parameters but activates only 3 billion during inference. Alibaba claims this base model achieves performance comparable to or slightly better than the dense Qwen3-32B model while using less than 10% of its training cost in GPU hours. For inference with context lengths exceeding 32,000 tokens, the new model delivers more than 10 times higher throughput than previous versions[2].
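The 80B-total / 3B-active split comes from sparse expert routing: each token is sent to only a few of many experts, so most parameters sit idle on any given forward pass. The minimal sketch below (invented sizes and names, not Alibaba's implementation) routes one token vector to the top-k of 64 toy experts and compares total versus active parameter counts:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse MoE layer: many experts, but each token activates only top_k.
n_experts, top_k, d_model = 64, 2, 16

experts = rng.normal(size=(n_experts, d_model, d_model))  # per-expert weights
router = rng.normal(size=(d_model, n_experts))            # gating matrix

def moe_forward(x):
    """Route a single token vector x through its top_k experts."""
    logits = x @ router                       # score every expert
    idx = np.argsort(logits)[-top_k:]         # keep only the top_k experts
    gates = np.exp(logits[idx])
    gates /= gates.sum()                      # softmax over the chosen experts
    # Only top_k expert matrices participate; the other 62 stay inactive.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, idx))

x = rng.normal(size=d_model)
y = moe_forward(x)

total_params = experts.size                   # all expert parameters
active_params = top_k * d_model * d_model     # parameters touched per token
print(f"total: {total_params}, active per token: {active_params}")
```

With 2 of 64 experts active, only 1/32 of the expert parameters are used per token; Qwen3-Next's reported 3B-of-80B activation reflects the same principle at scale.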

Alibaba has also released two post-trained versions: Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. The company reports solving stability and efficiency issues in reinforcement learning training caused by the hybrid attention and high-sparsity MoE architecture. The Instruct version performs comparably to Alibaba's flagship model Qwen3-235B-A22B-Instruct-2507 and shows advantages in tasks requiring ultra-long context of up to 256,000 tokens. The Thinking version excels at complex reasoning tasks, reportedly outperforming higher-cost models like Qwen3-30B-A3B-Thinking-2507 and Qwen3-32B-Thinking[2].

The new model is available on Hugging Face and ModelScope, and users can access the Qwen3-Next service through Alibaba Cloud Model Studio and the NVIDIA API Catalog[2].

This latest development underscores Alibaba's commitment to advancing AI technology and maintaining its competitive edge in the rapidly growing AI cloud market. As the company continues to invest heavily in AI infrastructure, the efficiency gains offered by Qwen3-Next could provide a significant advantage in both domestic and international markets.
