Mistral Launches Voxtral AI Audio Model Outperforming Whisper by 50%

Generated by AI Agent, Coin World
Wednesday, Jul 16, 2025 6:15 pm ET
Summary

- Mistral launches Voxtral, the first audio-focused LLM family, outperforming Whisper by 50% in multilingual speech processing.

- Available in 24B (Small) and 3B (Mini) versions, it offers scalable transcription at $0.001/minute via API, ideal for edge or production deployments.

- Open-source release democratizes advanced audio tech, enabling customization and fostering industry innovation across sectors.

Mistral, a leading AI company based in France, has announced the release of its new AI audio model family, Voxtral. Designed specifically for businesses, it is considered the first family of large language models (LLMs) focused on audio AI. Voxtral is powered by Mistral Small 3.1, which enables it to understand multiple languages, including English, French, Spanish, Portuguese, Italian, German, Dutch, Hindi, and more.

Voxtral is designed to deliver practical speech intelligence in real-world applications. The AI audio model outperforms Whisper large-v3, which is one of the top open-source audio transcription models. It can transcribe up to 30 minutes of audio and understand up to 40 minutes of audio, making it easy for users to converse and ask relevant questions. Users can also ask it to generate text summaries of the audio file or provide analysis and detailed insights. They can also execute other actions, like running functions through an API call.
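To illustrate what the 30-minute transcription limit means in practice, here is a minimal sketch of how a longer recording might be divided into segments that each fit within it. The function name `split_into_segments` and the idea of splitting at fixed offsets are assumptions made for illustration; the article does not describe how Voxtral or its API handles longer files.

```python
MAX_TRANSCRIBE_MINUTES = 30  # transcription limit quoted in the article


def split_into_segments(total_minutes, max_minutes=MAX_TRANSCRIBE_MINUTES):
    """Return (start, end) minute offsets that cover the recording.

    Hypothetical helper: each segment is at most max_minutes long.
    """
    if total_minutes < 0 or max_minutes <= 0:
        raise ValueError("total_minutes must be >= 0 and max_minutes > 0")
    total = float(total_minutes)
    segments = []
    start = 0.0
    while start < total:
        end = min(start + max_minutes, total)
        segments.append((start, end))
        start = end
    return segments


# A 75-minute recording needs three segments: two full ones and a 15-minute tail.
for start, end in split_into_segments(75):
    print(f"{start:.0f} to {end:.0f} min")
```

Splitting at fixed offsets can cut a sentence in half, so a real pipeline would more likely split on silence or overlap adjacent segments.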

Mistral offers Voxtral’s “speech understanding models” in two variations called Voxtral Small and Voxtral Mini. Both models are capable of interacting with speech-based prompts or a combination of audio and text-based prompts. The more powerful of the two models, Voxtral Small, features 24B parameters—ideal for production-scale deployments. Mistral wrote that “Voxtral Small is competitive with GPT-4o-mini and Gemini 2.5 Flash across all tasks.”

Voxtral Mini is a lighter-weight option with 3B parameters, making it a strong choice for local and edge deployments. Its API version, Voxtral Mini Transcribe, is not only cost-effective but also outperforms OpenAI’s Whisper—at less than half the price. Both Voxtral Small (24B) and Voxtral Mini (3B) are available for download and local hosting from Hugging Face. Developers can also integrate the audio models via a single API call into any application. The pricing starts at $0.001 per minute, making transcription scalable. Mistral stated that Voxtral will be available on Le Chat in the web app or mobile app within the next couple of weeks.
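As a rough illustration of what the quoted starting price implies, the sketch below multiplies audio duration by $0.001 per minute. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes simple linear per-minute billing; actual invoices may differ by tier, rounding, or model.

```python
from decimal import Decimal

PRICE_PER_MINUTE_USD = Decimal("0.001")  # starting API price quoted in the article


def estimate_cost_usd(audio_minutes, price_per_minute=PRICE_PER_MINUTE_USD):
    """Estimate transcription cost in USD, assuming linear per-minute pricing."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    # Convert via str so float inputs do not carry binary rounding noise.
    return Decimal(str(audio_minutes)) * price_per_minute


# 1,000 minutes of audio, and 1,000 hours of audio.
print(estimate_cost_usd(1_000))
print(estimate_cost_usd(1_000 * 60))
```

Under that assumption, 1,000 minutes of audio comes to about one dollar and 1,000 hours to about sixty dollars.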

Mistral is one of the leading artificial intelligence companies in Europe. Founded in 2023, the company has raised over €1 billion (around $1.2 billion) from investors including Andreessen Horowitz and Samsung.

The release of Voxtral marks a pivotal moment in the AI audio market. By making Voxtral open-source, Mistral is democratizing access to advanced audio processing technology. This decision aligns with the company's mission to accelerate the future of AI by providing innovative and accessible solutions. The open-source nature of Voxtral allows businesses to customize the model to fit their specific needs, fostering innovation and collaboration within the industry.

The introduction of Voxtral is expected to have a significant impact on the voice-tech market. By offering a production-ready speech model, Mistral is providing businesses with a reliable tool to enhance their audio processing capabilities. This could lead to improved customer service, more efficient data analysis, and better overall performance in various industries. Mistral's decision to release Voxtral as an open-source model is a strategic move that positions the company as a leader in the AI audio market. By providing a cost-effective and customizable solution, Mistral is attracting businesses that may have previously been deterred by the high costs and limitations of proprietary systems. This move also sets the stage for future innovations, as the open-source community can contribute to the development and improvement of Voxtral.
