Alibaba Unveils Next-Gen AI Model with Enhanced Efficiency: Qwen3-Next Architecture Offers Training Stability and Improved Inference Throughput.
By Ainvest
Thursday, September 11, 2025, 6:59 pm ET · 1 min read
Alibaba has unveiled its next-generation AI model, Qwen3-Next, built on a hybrid attention mechanism and a highly sparse Mixture-of-Experts (MoE) structure. The Qwen3-Next-80B-A3B-Base model has 80 billion parameters but activates only 3 billion, offering performance comparable to or better than the dense Qwen3-32B model. Training cost is reduced by nearly 90%, and inference throughput exceeds ten times that of Qwen3-32B for context lengths above 32,000 tokens.
Alibaba Group Holding Ltd. has launched a new artificial intelligence model, Qwen3-Next, designed to significantly improve efficiency in both training and inference. The new model features a hybrid attention mechanism, a highly sparse Mixture-of-Experts (MoE) structure, training-stability-friendly optimizations, and a multi-token prediction mechanism for faster inference [2].

The Qwen3-Next-80B-A3B-Base model contains 80 billion parameters but activates only 3 billion during inference. Alibaba claims this base model achieves performance comparable to or slightly better than the dense Qwen3-32B model while using less than 10% of its training cost in GPU hours. For inference with context lengths exceeding 32,000 tokens, the new model delivers more than 10 times higher throughput compared to previous versions [2].
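The efficiency claim hinges on how sparsely the MoE layers are activated: a router selects only a small subset of experts for each token, so per-token compute scales with the roughly 3 billion active parameters rather than the full 80 billion held in memory. The following is a minimal, illustrative sketch of top-k expert routing in PyTorch; it is not Alibaba's implementation, and the expert count, hidden size, and top-k value are placeholder assumptions chosen for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Illustrative top-k Mixture-of-Experts layer (not Qwen3-Next's actual code)."""

    def __init__(self, hidden_size=1024, num_experts=64, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, 4 * hidden_size),
                nn.GELU(),
                nn.Linear(4 * hidden_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        scores = self.router(x)                          # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

layer = SparseMoELayer()
tokens = torch.randn(8, 1024)
print(layer(tokens).shape)  # torch.Size([8, 1024]); only 2 of 64 experts run per token
```

Production systems replace the per-expert Python loop with batched dispatch kernels, but the routing idea is the same: unselected experts contribute no compute, which is what lets an 80-billion-parameter model run with the cost profile of a much smaller dense one.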
Alibaba has also released two post-trained versions: Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. The company reports solving stability and efficiency issues in reinforcement learning training caused by the hybrid attention and high-sparsity MoE architecture. The Instruct version performs comparably to Alibaba’s flagship model Qwen3-235B-A22B-Instruct-2507 and shows advantages in tasks requiring ultra-long context of up to 256,000 tokens. The Thinking version excels at complex reasoning tasks, reportedly outperforming higher-cost models like Qwen3-30B-A3B-Thinking-2507 and Qwen3-32B-Thinking [2].
The new model is available on Hugging Face and ModelScope, and users can access the Qwen3-Next service through Alibaba Cloud Model Studio and NVIDIA API Catalog [2].
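For readers who want to experiment with the open weights, the sketch below loads the Instruct checkpoint with the Hugging Face transformers library. The repository id is assumed to follow Qwen's usual naming (Qwen/Qwen3-Next-80B-A3B-Instruct); verify it on the Hugging Face or ModelScope model card, and note that a transformers release recent enough to include the Qwen3-Next architecture is required. Even though only about 3 billion parameters are active per token, all 80 billion must fit in GPU memory, so multiple accelerators (or a hosted endpoint such as Alibaba Cloud Model Studio) are the practical route.

```python
# Minimal sketch: loading the Instruct checkpoint with Hugging Face transformers.
# The repo id and dtype/device settings are assumptions; check the model card first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the precision stored in the checkpoint
    device_map="auto",    # shard the 80B parameters across available GPUs
)

messages = [{"role": "user", "content": "Summarize the Qwen3-Next architecture in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```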
This latest development underscores Alibaba's commitment to advancing AI technology and maintaining its competitive edge in the rapidly growing AI cloud market. As the company continues to invest heavily in AI infrastructure, the efficiency gains offered by Qwen3-Next could provide a significant advantage in both domestic and international markets.

Editorial disclosure and AI transparency: Ainvest News uses advanced Large Language Model (LLM) technology to synthesize and analyze real-time market data. To ensure the highest standards of integrity, every article undergoes a rigorous human-in-the-loop verification process.
While AI assists with data processing and initial drafting, a professional Ainvest editorial staff member independently reviews, verifies, and approves all content to ensure its accuracy and compliance with the editorial standards of Ainvest Fintech Inc. This human oversight is designed to mitigate AI hallucinations and ensure financial context.
Investment disclaimer: This content is provided for informational purposes only and does not constitute professional investment, legal, or financial advice. Markets carry inherent risks. Users are advised to conduct independent research or consult a certified financial advisor before making any decisions. Ainvest Fintech Inc. disclaims all liability for actions taken based on this information.
