[Editorial] Stable Audio: music composition from text, thanks to the AI

Sooner or later it had to happen: artificial intelligence is slowly gaining ground in the field of artistic creativity. The Stable Audio service was recently launched by a joint venture between Stability AI and Harmonai, two companies active in the field of AI. The first works on image generation, the second on sound generation, by means of its Dance Diffusion project.

How does it work? It's extremely easy: at a text prompt, the user enters keywords describing the characteristics he wants for the song (for example: love song, female voice, harps, piano chords, slow tempo). After a short while, the Stable Audio web interface generates and plays music composed by artificial intelligence. The results vary, but with a little practice they start to become interesting. The user can also set the rhythm by adding the desired number of bpm, or add an adjective next to the name of an instrument he wants to use. Obviously, one can use multiple instruments and voices at the same time.

Different levels, at different cost, are available. A “free” version limits the music output to 45 seconds and the user to 20 compositions per month; the AI-generated output is for personal use only, not authorized for use by artists. By subscribing to a Pro version for $20 monthly, the user can expand those limits to 90 seconds and 500 compositions and remove the limit on use by artists. Then there is an Enterprise version, price and features to be established depending on the artist's needs.
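To put those limits in perspective, here is a minimal Python sketch that works out the maximum amount of generated audio each tier allows per month. The figures are the ones quoted above; the `Tier` class and its field names are purely illustrative and have nothing to do with Stable Audio's own software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Illustrative container for the limits quoted in the text."""
    name: str
    monthly_price_usd: float
    max_seconds_per_track: int
    tracks_per_month: int

    def max_minutes_per_month(self) -> float:
        # Upper bound: every track uses its full allowed length.
        return self.max_seconds_per_track * self.tracks_per_month / 60


FREE = Tier("Free", 0.0, 45, 20)
PRO = Tier("Pro", 20.0, 90, 500)

for tier in (FREE, PRO):
    print(f"{tier.name}: up to {tier.max_minutes_per_month():.0f} minutes per month")
```

In other words, the free tier tops out at 15 minutes of music per month, while the Pro tier allows up to 750 minutes (12.5 hours).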

It's a little distressing and worrying. Until now, artificial intelligence had mostly limited itself to providing answers to general questions, even complicated ones on different aspects of knowledge, or to producing texts and images. Now it enters the field of pure creativity, and the only input required on our part is a sequence of keywords to establish the type of music we want to compose! Everything is done in CD-quality audio, i.e. 16-bit/44.1 kHz, and this is the real novelty, given that AI-based software that produced sounds on request already existed.

We are at the very beginning of it all, that's obvious, but the road ahead is clear, whether we like it or not. Distinguishing “who composed what” will become complicated, as this system will probably be exploited by artists suffering from a lack of ideas. All of this obviously raises a whole series of questions and reflections, starting from copyright issues and ending with the struggle between man and machine. I imagine, for example, how much easier it could be for a movie director to produce a soundtrack supporting his images. In fact, Hollywood has already started to grapple with such issues: in both of the recently concluded strikes by the screen actors' and writers' guilds, which shut down movie and TV production for months, negotiators described control of and compensation for the use of AI as the major sticking point that delayed a resolution. In the end, the writers won the right to rely on AI, with the studios' permission, and to prevent the studios from forcing AI into the writing process or using the writers' work to “train” AI without their consent; the actors won the right, for themselves and their estates, to approve any studio use of AI-generated versions of their images and to be compensated for such use. Thus, in that context, the issues are settled for the moment, but doubtless they will erupt again as technology marches forward and changes the landscape further. Meanwhile, the U.S. Copyright Office has received nearly 10,000 comments from the public, including singers, voice actors, novelists, and video game artists, as it considers how copyright law should treat the use of published work to train AI systems.

For now, we can have some fun creating our own samples and music, as it's nothing more than an online game, but I fear it could become something to deal with in the very near future, given that the AI “learns” quickly and...mercilessly.

DISCLAIMER. TNT-Audio is a 100% independent magazine that neither accepts advertising from companies nor requires readers to register or pay for subscriptions. If you wish, you can support our independent reviews via a PayPal donation. After publication of reviews, the authors do not retain samples other than on long-term loan for further evaluation or comparison with later-received gear. Hence, all contents are written free of any “editorial” or “advertising” influence, and all reviews in this publication, positive or negative, reflect the independent opinions of their respective authors. TNT-Audio will publish all manufacturer responses, subject to the reviewer's right to reply in turn.

November 2023 editorial
