ZBennoui,

So over the last several months, I've been looking at all of these AI-generated music services: Suno, Udio, and now the new one from Eleven. As someone who generally has a pretty positive outlook on machine learning/AI, I think these tools are really interesting and a great way to help people who don't have any musical ability make personalized tracks. However, I take issue with how these systems are trained. It's pretty much been confirmed that Udio is trained on vast amounts of copyrighted material, very likely without consent considering how new the company is. With Suno it's hard for me to tell, but others have theorized that there's copyrighted stuff in there as well.

These companies are telling you that you're allowed to use whatever you generate for commercial purposes, but I fail to see how they have the right to grant that. I wonder if artists are even aware that their songs could potentially be included in these models, and honestly the whole ethos of these companies is disgusting. What gives them the right to scrape massive amounts of copyrighted material that people spent a crazy amount of time on, just to dump it into a model that can generate whatever you want from a simple text prompt?

Let me be clear that I have no problem with the technology itself; I think it's really cool. But the only reason it's able to sound as good as it does is because it's trained on a lot of music they don't have the rights to. Take a look at Stable Audio if you want to see what's possible with just licensed royalty-free tracks. Spoiler: it's nowhere near as good. Some of that could of course be due to the architecture, but more likely it's the data they had access to while training. I wonder what Eleven used to train their models, but considering how clean the results are, I suspect they got custom multi-tracks from whoever they decided to work with.
I'm personally far more excited about what Apple is doing in this space with the new session players in the upcoming Logic updates, and I hope this will be the path forward rather than massive audio generation models trained on unlicensed material.

miki,
@miki@dragonscave.space

@ZBennoui The reasoning here is that actual musicians are also trained on copyrighted content, and yet as long as the content they create isn't too close to an already existing song, they're fine.

Think how bad a modern musician would be if they were only allowed to listen to royalty-free music.

ZBennoui,

@miki Yes, this makes sense from a logical perspective; as a musician/producer myself I totally get the sentiment. The difference with AI/ML that a lot of people who only work in tech don't fully understand is that these models are actually ingesting copyrighted audio files in order to train. While a human musician does something similar, they aren't taking the information they've learned directly from that audio and "remixing" it, for lack of a better term. These models try to replicate how the human brain learns, and technically their output is synthesized, but diffusion models are known to sometimes reproduce bits of their training data, so they can still technically output copyrighted material, even if in a different form. Whether that counts as fair use remains to be seen, but personally I don't see whatever arguments they're going to make holding up in court for very long.

miki,

@ZBennoui We genuinely don't know what the courts are going to decide (and it's probably going to both differ between jurisdictions and be superseded by new laws in some places). It's extremely unlikely for AIs to be outlawed in every single country after those court battles are over. Anybody who says otherwise (regardless of what side they're on) is just trying to push their narrative IMO.

This is genuinely new technology, different from anything we've seen before, and there are no easy answers here.

miki,

@ZBennoui Then there's the politics: if US courts outlaw AI training but China doesn't, the US will have very little choice but to amend its laws and catch up.

ZBennoui,

@miki Yeah I agree. Let me be totally clear that I have zero problem with AI itself, and this audio generation stuff is extremely cool and innovative. Really my only issue is with companies like Udio who think it's OK to train their models on other people's work without consent. I certainly don't think it should be outlawed or anything like that, but companies who get away with scraping vast amounts of copyrighted data should be held accountable in some way. Whether that's monetary or otherwise I have no idea, but a solution will need to be found at some point.

miki,

@ZBennoui I really don't have much of a problem with this, just as I don't have a problem with programmers looking at and learning from my code. People don't seem to be able to grasp the difference between learning from something and straight-up copying it.

ZBennoui,

@miki But that's the thing: in some cases these models are copying what they've seen in the training data. A human is not going to be able to copy the voice of a famous artist accurately in most cases, but these models have been known to do just that. Udio was almost immediately called out for being able to replicate famous artists' voices, albeit with some clever prompting to nudge the model in the right direction. This is why, when you create a model that generates images or music or whatever, you have to be very careful about what data you train with. These companies are going on record saying you're allowed to use the output for whatever you want; you can even make songs and release them on Spotify. If it's going to replicate people's voices accurately, I'd say that counts as impersonation and should not be tolerated. If my voice were included in the dataset and someone made a track with it and made money from it, even unintentionally, I'm obviously not gonna be happy about that.

miki,

@ZBennoui As long as the model only does this when deliberately prompted, the responsibility should fall on the person prompting the model, not the model makers.

If you buy a CPU from Intel and make it execute the instructions of a malicious program you wrote, you're responsible for the harm caused, not Intel.

miki,

@ZBennoui The way I think about this is that you don't punish musicians for being able to perform Beatles songs when forced to do so at gunpoint, so you shouldn't punish models for performing these songs when cleverly prompt-engineered. Instead, you should go after the prompt engineer themselves, just like you'd have an issue with the person holding the gun. Now, when models do this unintentionally, which sometimes (although rarely) happens, that's a far muddier issue.

ZBennoui,

@miki Yeah, I guess that makes sense. Like you said before there aren't any clear answers yet, and I'll be really curious to see how this is dealt with in the coming years.
