Nvidia (NVDA.US) introduces new AI model Fugatto, which can modify and generate new sounds.
Nvidia (NVDA.US) has launched a new artificial intelligence (AI) model for generating music and audio, aimed at serving those who make music, films and video games.
According to Nvidia, the model, called Fugatto (Foundational Generative Audio Transformer Opus), can generate or modify music and sound using any combination of text and audio files.
For example, the model can create music snippets from text prompts, add instruments to existing songs or remove them, change the accent or emotion of a voice, or even produce sounds that have never been heard before.
Rafael Valle, Nvidia's manager of applied audio research, who is also an orchestral conductor and composer, said: "We want to create a model that understands and produces sound like humans."
Nvidia said advertising agencies can use Fugatto to quickly adapt existing ads for multiple regions by applying different accents and emotions to the voiceovers. Video game developers can also use the AI model to modify pre-recorded in-game assets so that they match the changing action as users play.
Fugatto can make a trumpet bark like a dog or a saxophone meow like a cat. The company added that, with fine-tuning and a small amount of singing data, researchers found the model could handle tasks it was not trained on, such as generating high-quality singing from text.
Nvidia said the full version of Fugatto uses 2.5 billion parameters and was trained on Nvidia DGX systems with 32 Nvidia H100 Tensor Core GPUs. The model took more than a year to develop.
Fugatto may compete with similar technologies from startups such as Runway and larger companies such as Meta Platforms (META.US), which released an AI model called Movie Gen in October that can create realistic video and audio clips based on user prompts.
OpenAI, the company behind ChatGPT, unveiled Sora in February, a model that can create realistic and imaginative scenes based on text instructions. The company, which is backed by Microsoft (MSFT.US), has not yet released the text-to-video model to the public.