In November, I quit my job in generative AI to campaign for creators’ right not to have their work used for AI training without permission. I started Fairly Trained, a non-profit that certifies generative AI companies that obtain a license before training models on copyrighted works.
Mostly, I’ve felt good about this decision — but there have been a few times when I’ve questioned it. Like when a big media company, though keen to defend its own rights, told me it couldn’t find a way to stop using unfairly-trained generative AI in other domains. Or whenever demos from the latest models receive unquestioning praise despite how they’re trained. Or, last week, with the publication of a series of articles about AI music company Suno that I think downplay serious questions about the training data it uses.
Suno is an AI music generation company with impressive text-to-song capabilities. I have nothing against Suno, with one exception: various clues, pieced together, suggest its model is likely trained on copyrighted work without rights holders’ consent.
What are these clues? Suno refuses to reveal its training data sources. In an interview with Rolling Stone, one of its investors disclosed that Suno didn’t have deals with the labels “when the company got started” (there is no indication this has changed), that they invested in the company “with the full knowledge that music labels and publishers could sue,” and that the founders’ lack of open hostility to the music industry “doesn’t mean we’re not going to get sued.” And, though I’ve approached the company through two channels about getting certified as Fairly Trained, they’ve so far not taken me up on the offer, in contrast to the 12 other AI music companies we’ve certified for training their platforms fairly.
There is, of course, a chance that Suno licenses its training data, and I genuinely hope I’m wrong. If they correct the record, I’ll be the first to loudly and regularly trumpet the company’s fair training credentials.
But I’d like to see media coverage of companies like Suno give more weight to the question of what training data is being used. This is an existential issue for creators.
Editor’s note: Suno’s founders did not respond to requests for comment from Billboard about their training practices. Sources confirm that the company does not have licensing agreements in place with some of the most prominent music rightsholders, including the three major label groups and the National Music Publishers’ Association.
When coverage limits discussion of Suno’s training data to the fact that the company “decline[s] to reveal details,” without explicitly raising the possibility that Suno uses copyrighted music without permission, readers may not realize how AI music companies could be unfairly exploiting musicians’ work. That possibility should factor into our thoughts about which AI music companies to support.
If Suno is training on copyrighted music without permission, this is likely the technological factor that sets it apart from other AI music products. The Rolling Stone article mentions some of the tough technical problems that Suno is solving — having to do with tokens, the sampling rate of audio and more — but these are problems that other companies have solved. In fact, several competitors have models as capable as Suno’s. The reason you don’t see more models like Suno’s being released to the public is that most AI music companies want to ensure training data is licensed before they release their products.
The context here is important. Some of the biggest generative AI companies in the world are using untold numbers of creators’ work without permission in order to train AI models that compete with those creators. There is, understandably, a big public outcry at this large-scale scraping of copyrighted work from the creative community. This has led to a number of lawsuits, which Rolling Stone mentions.
The fact that generative AI competes with human creators is something AI companies prefer not to talk about. But it’s undeniable. People are already listening to music from companies like Suno in place of Spotify, and generative AI listening will inevitably eat into music industry revenues — and therefore human musicians’ income — if training data isn’t licensed.
Generative AI is a powerful technology that will likely bring a number of benefits. But if we support the exploitation of people’s work for training without permission, we implicitly support the unfair destruction of the creative industries. We must instead support companies that take a fairer approach to training data.
And those companies do exist. There are a number — generally startups — taking a fairer approach, refusing to use copyrighted work without consent. They are licensing, or using public domain data, or commissioning data, or all of the above. In short, they are working hard not to train unethically. At Fairly Trained, we have certified 12 of these companies in AI music. If you want to use AI music and you care about creators’ rights, you have options.
There is a chance Suno has licensed its data. I encourage the company to disclose what it’s training its AI model on. Until we know more, I hope anyone looking to use AI music will opt instead to work with companies that we know take a fair approach to using creators’ work.
To put it simply — and to use some details pulled from Suno’s Rolling Stone interview — it doesn’t matter whether you’re a team of musicians, what you profess to think about IP, or how many pictures of famous composers you have on the walls. If you train on copyrighted work without a license, you’re not on the side of musicians. You’re unfairly exploiting their work to build something that competes with them. You’re taking from them to your gain — and their cost.
Ed Newton-Rex is the CEO of Fairly Trained and a composer. He previously founded Jukedeck, one of the first AI music companies, ran product in Europe for TikTok, and was VP of Audio at Stability AI.