Tencent's Hunyuan Turbo S: Revolutionizing AI with Faster Response Times
Tencent, a leading Chinese technology company, has unveiled its latest large language model, Hunyuan Turbo S, which boasts significantly faster response times without compromising performance on complex reasoning tasks. The new AI model, announced on Tencent's official Weibo and WeChat channels, claims to double word-generation speed and cut first-word latency by 44% compared with its previous models.
The Hunyuan Turbo S model employs a hybrid architecture that combines Mamba and Transformer technologies, which Tencent describes as the first successful integration of the two approaches in a super-large Mixture of Experts (MoE) model. This fusion aims to address fundamental challenges in AI development, using Mamba to handle long sequences efficiently and Transformer attention to capture complex context, potentially lowering both training and inference costs.
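Tencent has not published the internals of Hunyuan Turbo S, so the following is only a conceptual sketch of the general idea of interleaving linear-time sequence mixing with attention and a sparse MoE feed-forward. The layer layout, the top-1 routing, and the use of a GRU as a stand-in for Mamba's selective state-space scan are all assumptions for illustration, not Tencent's design.

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """Stand-in for a Mamba-style block: mixes the sequence with a
    linear-time recurrence instead of quadratic self-attention."""
    def __init__(self, d_model):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)  # placeholder recurrence

    def forward(self, x):
        out, _ = self.rnn(self.norm(x))
        return x + out  # residual connection

class AttentionBlock(nn.Module):
    """Standard Transformer self-attention block for global context."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

class MoEFeedForward(nn.Module):
    """Toy Mixture-of-Experts feed-forward: a router sends each token
    to its single highest-scoring expert (top-1 routing)."""
    def __init__(self, d_model, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        scores = self.router(x)             # (batch, seq, n_experts)
        top = scores.argmax(dim=-1)         # chosen expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (top == i).unsqueeze(-1)  # tokens routed to expert i
            out = out + expert(x) * mask
        return x + out

class HybridLayer(nn.Module):
    """One hybrid layer: recurrent mixing, then attention, then sparse MoE FFN."""
    def __init__(self, d_model):
        super().__init__()
        self.recurrent = RecurrentBlock(d_model)
        self.attention = AttentionBlock(d_model)
        self.moe = MoEFeedForward(d_model)

    def forward(self, x):
        return self.moe(self.attention(self.recurrent(x)))

# Quick shape check on random token embeddings.
model = nn.Sequential(*[HybridLayer(256) for _ in range(2)])
tokens = torch.randn(1, 128, 256)   # (batch, sequence length, hidden size)
print(model(tokens).shape)          # torch.Size([1, 128, 256])
```

The appeal of this kind of hybrid is that the recurrent blocks scale linearly with sequence length while the attention blocks retain precise global lookups, which is the efficiency argument Tencent makes for the design.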
Tencent designed Hunyuan Turbo S to mimic human cognitive processes, providing instant responses like human intuition while maintaining the analytical reasoning capabilities needed for complex problems. Performance benchmarks show the model matching or exceeding top-tier models across various tests. It scored 89.5 on MMLU, slightly above OpenAI's GPT-4, and achieved top scores on the mathematical reasoning benchmarks MATH and AIME 2024. For Chinese-language tasks, it reached 70.8 on Chinese-SimpleQA, outperforming DeepSeek's 68.0. However, it lagged in some areas such as SimpleQA and LiveCodeBench, where GPT-4 and Claude 3.5 performed better.
The release of Hunyuan Turbo S intensifies the ongoing AI competition between Chinese and American tech firms. DeepSeek, a Chinese startup known for its cost-effective, high-performing models, has been putting pressure on Chinese tech giants and American companies such as OpenAI alike. Tencent priced Hunyuan Turbo S competitively at 0.8 yuan (approximately $0.11) per million input tokens and 2 yuan ($0.28) per million output tokens, significantly cheaper than its previous Turbo models.
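As a rough illustration of what those rates imply, here is a small cost sketch. The token counts, request volume, and exchange rate below are assumptions for the example, not figures from Tencent's announcement.

```python
# Announced rates: 0.8 CNY per million input tokens, 2 CNY per million output tokens.
INPUT_CNY_PER_M = 0.8
OUTPUT_CNY_PER_M = 2.0
CNY_TO_USD = 0.14  # rough exchange rate, assumption

def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in CNY, cost in USD) for a single request."""
    cny = (input_tokens / 1_000_000) * INPUT_CNY_PER_M \
        + (output_tokens / 1_000_000) * OUTPUT_CNY_PER_M
    return cny, cny * CNY_TO_USD

# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
per_request_cny, per_request_usd = estimate_cost(2_000, 500)
print(f"per request: {per_request_cny:.4f} CNY (~${per_request_usd:.4f})")
print(f"100,000 requests: {per_request_cny * 100_000:.0f} CNY")  # 260 CNY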
The model is available via API on Tencent Cloud, and the company is offering a free one-week trial. However, the model weights have not been released for public download, and interested developers and businesses must join a waiting list through Tencent Cloud to obtain API access. Tencent has not given a timeline for an open-source release on GitHub.
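For developers who clear the waiting list, usage would follow the familiar pattern of calling a hosted chat model over HTTP. The sketch below is purely illustrative: the endpoint URL, model identifier, payload schema, and auth header are placeholders, not Tencent's documented interface; the actual request format is defined in the Tencent Cloud documentation.

```python
import os
import requests

API_KEY = os.environ["HUNYUAN_API_KEY"]  # assumed environment variable
ENDPOINT = "https://example.invalid/v1/chat"  # placeholder URL, not the real endpoint

payload = {
    "model": "hunyuan-turbo-s",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Summarize the Mamba architecture in one sentence."}
    ],
}

# Send the request and print the raw JSON response.
resp = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```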
The