Meet Meta’s AudioCraft, the latest entrant into the budding text-to-audio space.
According to the company's blog post, AudioCraft is actually a suite of three AI models: MusicGen, AudioGen, and EnCodec. MusicGen is a language model that generates music from a text prompt, similar to what you'd type into ChatGPT. AudioGen, meanwhile, is geared toward generating a sound effect, or a sequence of sound effects, from a text description. Finally, the EnCodec decoder enables the generation of music with fewer artifacts.
All three models will be made open-source, a move Meta says reflects its commitment to advancing the broader field of generative audio AI.
Meta’s expansion into the space puts it in competition with Google, which recently released MusicLM, a powerful language model that also generates music from text prompts.
Meta hopes the tools will be used by sound designers and musicians alike to broaden their inspiration and brainstorm more efficiently. The company believes that, in time and with additional controls, the tools could serve as a sort of new-age synthesizer, a creative instrument in their own right.
You can find out more about AudioCraft here.