Nvidia (NVDA.US) introduces new AI model Fugatto, which can modify and generate new sounds.

Market Intel | Monday, Nov 25, 2024 10:50 pm ET

Nvidia (NVDA.US) has launched a new artificial intelligence (AI) model for generating music and audio, aimed at people who make music, films and video games.

According to Nvidia, the model, called Fugatto (Foundational Generative Audio Transformer Opus 1), can generate or transform music, voices and sounds from any combination of text and audio inputs.

For example, the model can create a music snippet from a text prompt, add or remove instruments in an existing song, change the accent or emotion of a voice, or even produce sounds that have never been heard before.

Rafael Valle, a manager of applied audio research at Nvidia who is also an orchestral conductor and composer, said: "We want to create a model that understands and produces sound like humans."

Nvidia said advertising agencies can use Fugatto to quickly adapt an existing ad for multiple regions by applying different accents and emotions to the voiceovers. Video game developers can also use the AI model to modify prerecorded game assets so they fit the changing action as users play.

Fugatto can make a trumpet bark like a dog or a saxophone meow like a cat. The company added that researchers found that, with fine-tuning and a small amount of singing data, the model could handle tasks it was not pretrained on, such as generating a high-quality singing voice from a text prompt.

Nvidia said the full version of Fugatto uses 2.5 billion parameters and was trained on Nvidia DGX systems with 32 Nvidia H100 Tensor Core GPUs. The model took more than a year to develop.

Fugatto may compete with similar technologies from startups such as Runway and larger companies such as Meta Platforms (META.US), which unveiled an AI model called Movie Gen in October that can create realistic video and audio clips based on user prompts.

OpenAI, the company behind ChatGPT, unveiled Sora in February, a model that can create realistic and imaginative scenes based on text instructions. The company, which is backed by Microsoft (MSFT.US), has not yet released the text-to-video model to the public.
