IncognitoErgoSum

@IncognitoErgoSum@kbin.social

OC The AI genie is here. What we're deciding now is whether we all have access to it, or whether it's a privilege afforded only to rich people, corporations, and governments.

I know a lot of people want to interpret copyright law so that allowing a machine to learn concepts from a copyrighted work is copyright infringement, but I think what people will need to consider is that all that's going to do is keep AI out of the hands of regular people and place it specifically in the hands of people and...

IncognitoErgoSum, (edited )

I'm not sure what you're getting at with this. It will only be a privilege for these groups if we choose to artificially make it that way. And why would you want to do that?

Do you want to give AI exclusively to the rich? If so, why?

IncognitoErgoSum,

Wow, you have this all planned out, don't you?

If that's what Europe is like, they'll build their data centers somewhere else. Like the corrupt USA. Again, you'll be taking away your access to AI, not theirs.

IncognitoErgoSum,

For something to be a fact, it needs to actually be true. AI is currently accessible to everyone.

IncognitoErgoSum,

Why do you think people will build data centers in Europe when they can build them elsewhere?

IncognitoErgoSum,

Losing their life because an AI has been improperly placed in a decision making position because it was sold as having more capabilities than it actually has.

I would tend to agree with you on this one, although we don't need bad copyright legislation to deal with it, since laws can deal with it more directly. I would personally put in place an organization that requires rigorous proof that AI in those roles is significantly safer than a human, like the FDA does for medication.

As for the average person who has the computer hardware and time to train an AI (bear in mind Google Bard and OpenAI use human contractors to correct misinformation in the answers, in addition to scanning), there is a ton of public domain writing out there.

Corporations would love if regular people were only allowed to train their AIs on things that are 75 years out of date. Creative interpretations of copyright law aren't going to stop billion- and trillion-dollar companies from licensing things to train AI on, either by paying a tiny percentage of their war chests or just ignoring the law altogether the way Meta always does, and getting a customary slap on the wrist. What will end up happening is that Meta, Alphabet, Microsoft, Elon Musk and his companies, government organizations, etc. will all have access to AIs that know current, useful, and relevant things, and the rest of us will not, or we'll have to pay monthly for the privilege of access to a limited version of that knowledge, further enriching those groups.

Furthermore, if they're using people's creativity to make a product, it's just WRONG not to have permission or to not credit them.

Let's talk about Stable Diffusion for a moment. Stable Diffusion models can be compressed down to about 2 gigabytes and still produce art. Stable Diffusion was trained on 5 billion images and finetuned on a subset of 600 million images, which means that the average image contributes about 2 billion bytes ÷ 600 million images, or a little over three bytes, to the final model. With the exception of a few mostly public domain images that appeared in the dataset hundreds of times, Stable Diffusion learned broad concepts from large numbers of images, similarly to how a human artist would learn art concepts. If people need permission to learn a teeny bit of information from each image (3 bytes of information isn't copyrightable, btw), then artists should have to get permission for every single image they put on their mood boards or use for inspiration, because they're taking orders of magnitude more than three bytes of information from each image they use for inspiration on a given work.
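The back-of-envelope arithmetic above (using the comment's own estimates, not measured values) works out like this:

```python
# All figures are the estimates from the argument above, not measurements.
model_size_bytes = 2e9    # ~2 GB compressed Stable Diffusion checkpoint
finetune_images = 600e6   # ~600 million images in the finetuning subset

bytes_per_image = model_size_bytes / finetune_images
print(f"{bytes_per_image:.2f} bytes per image")  # → 3.33 bytes per image
```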

IncognitoErgoSum,

AI is more than just ChatGPT.

When we talk about reinterpreting copyright law in a way that makes AI training essentially illegal for anything useful, it also restricts smaller and potentially more focused networks. They're discovering that smaller networks can perform very well (not at the level of GPT-4, but well enough to be useful) if they're trained in a specific way where reasoning steps are spelled out in the training.

Also, there are used Nvidia cards currently selling on Amazon for under $300 with 24 GB of VRAM and AI performance almost equal to a 3090, which puts mixture-of-experts models like a smaller version of GPT-4 within reach of people who aren't ultra-wealthy.

There's also the fact that there are plenty of companies currently working on hardware that will make AI significantly cheaper and more accessible to home users. Systems like ChatGPT aren't always going to be restricted to giant data centers, unless (as some people really want) laws are passed to prevent that hardware from being sold to regular people.

IncognitoErgoSum,

So clearly we do agree on most of this stuff, but I did want to point out a possibility you may not have considered.

If we're just talking about what you can do, then these laws aren't going to matter because you can just pirate whatever training material you want.

This depends on the penalty and how strictly it's enforced. If it's enforced like normal copyright law, then you're right; your chances of getting in serious trouble just for downloading stuff are essentially nil -- the worst thing that will happen to you is your ISP will three-strikes you and you'll lose internet access. On the other hand, there's a lot of panic surrounding AI, and the government might use that as an excuse to pass laws that would give people prison time for possessing one, and then fund strict enforcement. I hope that doesn't happen, but with rumblings of insane laws that would give people prison time for using a VPN to watch a TV show outside of the country, I'm a bit concerned.

As for the parent comment's motivations, it's hard to say for sure with any particular individual, but I have noticed a pattern among neoliberals where they say things like "well, the rich are already powerful and we can't do anything about it, so why try" or "having universal health care, which the rest of the first world has implemented successfully, is unrealistic, so why try" and so on. It often boils down to giving lip service to progressive social values while steadfastly refusing to do anything that might actually make a difference. It's economic conservatism dressed as progressivism. Even if that's not what they meant (and it would be unwise of me to just assume that), I feel like that general attitude needs to be confronted.

IncognitoErgoSum,

As the technology improves, data centers that run AI will require significantly less cooling. GPUs aren't very power-efficient for doing AI stuff because they have to move a lot of data around from their memory to their processor cores. There are AI-specific cards being worked on that will allow the huge matrix multiplications to happen in place without that movement happening, which will mean drastically lower power and cooling requirements.

Also, these kinds of protestors are the same general group of people who stopped nuclear power from becoming a bigger player back in the 1960s and 70s. If we'd gone nuclear and replaced coal, we almost certainly wouldn't be sitting here at the beginning of what looks to be a major global warming event that's unlike anything we've ever seen before. It wouldn't have completely solved the problem, but it would have bought us time. An AI may be able to help us develop ideas to mitigate global warming, and it seems ridiculous to me to go all luddite and smash the machines over what will be a minuscule overall contribution to it given the possibility that it could help us solve the problem.

But let's be real here; these hypothetical people smashing the machines are doing it because they've bought into AI panic, not because they're afraid of global warming. If they really want to commit acts of ecoterrorism, there are much bigger targets.

IncognitoErgoSum,

If I'm the "parent comment" you're referring to, then that's very much not my motivation.

You're not. I was talking about the thread parent: "Many things in life are a privilege for these groups. AI is no different." I should have been more specific.

At any rate, I personally feel that we have a moral responsibility to make it accessible to as many people as possible.

IncognitoErgoSum,

Except an AI is not taking inspiration, it's compiling information to determine mathematical averages.

The AIs we're talking about are neural networks. They don't do statistics, they don't have databases, and they don't take mathematical averages. They simulate neurons, and their ability to learn concepts is emergent from that, just as it is in the human brain. Nothing about an artificial neuron ever takes an average of anything, reads any database, or does any statistical calculations. If an artificial neural network can be said to be doing those things, then so is the human brain.
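To make that concrete, here's a minimal sketch of a single artificial neuron (not any particular production network, just the basic unit): it's a weighted sum passed through a nonlinearity, and nothing in it stores, queries, or averages training data.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs plus a bias,
    squashed by a sigmoid activation. No database lookup, no averaging --
    just arithmetic on weights that training left behind."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# The weights are whatever training produced, not stored copies of inputs.
print(neuron([0.5, -1.0], [0.8, 0.3], 0.1))  # ≈ 0.55
```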

There is nothing magical about how human neurons work. Researchers are already growing small networks out of animal neurons and using them the same way that we use artificial neural networks.

There are a lot of "how AI works" articles out there that put things in layman's terms (and use phrases like "statistical analysis" and "mathematical averages"), and unfortunately people (including many very smart people) extrapolate from the incorrect information in those articles and end up making bad assumptions about how AI actually works.

A human being is paid for the work they do, an AI program's creator is paid for the work it did. And if that creator used copyrighted work, then he should have to get permission to use it, because he's profiting off this AI program.

If an artist uses a copyrighted work on their mood board or as inspiration, then they should pay for that, because they're making a profit from that copyrighted work. Human beings should, as you said, be paid for the work they do. Right? If an artist goes to art school, they should pay all of the artists whose work they learned from, right? If a teacher teaches children in a class, that teacher should be paid a royalty each time those children make use of the knowledge they were taught, right? (I sense a sidetrack -- yes, teachers are horribly underpaid and we desperately need to fix that, so please don't misconstrue that previous sentence.)

There's a reason we don't copyright facts, styles, and concepts.

Oh, and if you want to talk about something that stores an actual database of scraped data, makes mathematical and statistical inferences, and reproduces things exactly, look no further than Google. It's already been determined in court that what Google does is fair use.

IncognitoErgoSum, (edited )

I'm willing to, but if I take the time to do that, are you going to listen to my answer, or just dismiss everything I say and go back to thinking what you want to think?

Also, a couple of preliminary questions to help me explain things:

What's your level of familiarity with the source material? How much experience do you have writing or modifying code that deals with neural networks? My own familiarity lies mostly with PyTorch. Do you use that or something else? If you don't have any direct familiarity with programming with neural networks, do you have enough familiarity with them to at least know what some of those boxes mean, or do I need to explain them all?

Most importantly, when I say that neural networks like GPT-* use artificial neurons, are you objecting to that statement?

I need to know what it is I'm explaining.

IncognitoErgoSum, (edited )

If what you're going to give me is an oversimplified analogy that puts too much faith in what AI devs are trying to sell and not enough faith in what a human brain is doing, then don't bother because I will dismiss it as a fairy tale.

I'm curious, how do you feel about global warming? Do you pick and choose the scientists you listen to? You know that the people who develop these AIs are computer scientists and researchers, right?

If you're a global warming denier, at least you're consistent. But if out of one side of your mouth you're calling what AI researchers talk about a "fairy tale", and out of the other side of your mouth you're criticizing other people for ignoring science when it suits them, then maybe you need to take time for introspection.

You can stop reading here. The rest of this is for people who are actually curious, and you've clearly made up your mind. Until you've actually learned a bit about how they actually work, though, you have absolutely no business opining about how policies ought to apply to them, because your views are rooted in misconceptions.

In any case, curious folks, I'm sure there are fancy flowcharts around about how data flows through the human brain as well. The human brain is arranged in groups of neurons that feed back into each other, whereas an AI neural network is arranged in more ordered layers. Their structure isn't precisely the same. Notably, an AI (at least, as they are commonly structured right now) doesn't experience "time" per se, because once it's been trained its neural connections don't change anymore. As it turns out, consciousness isn't necessary for learning and reasoning, as the parent comment seems to think.

Human brains and neural networks are similar in the way that I explained in my original comment -- neither of them store a database, neither of them do statistical analysis or take averages, and both learn concepts by making modifications to their neural connections (a human does this all the time, whereas an AI does this only while it's being trained). The actual neural network in the above diagram that OP googled and pasted in here lives in the "feed forward" boxes. That's where the actual reasoning and learning is being done. As this particular diagram is a diagram of the entire system and not a diagram of the layers of the feed-forward network, it's not even the right diagram to be comparing to the human brain (although again, the structures wouldn't match up exactly).
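For the curious, here's a minimal PyTorch sketch of the kind of feed-forward sublayer those "feed forward" boxes refer to (dimensions are made up for illustration; real models use far larger ones):

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Illustrative transformer feed-forward sublayer: two learned linear
    maps with a nonlinearity between them. The learning lives in the
    weight matrices, not in any stored copy of the training data."""
    def __init__(self, d_model=64, d_hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x):
        return self.net(x)

x = torch.randn(1, 10, 64)     # a batch of 10 token embeddings
print(FeedForward()(x).shape)  # torch.Size([1, 10, 64])
```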

IncognitoErgoSum, (edited )

So to clarify, are you making the claim that nothing that's simulated with vector mathematics can have emergent properties? And that AIs like GPT and Stable Diffusion don't contain simulated neurons?

IncognitoErgoSum,

I don't believe that current AIs should have rights. They aren't conscious.

My point was purely that AIs learn concepts and that concepts aren't copyrightable. Encoding concepts into neurons (that is, learning) doesn't require consciousness.

IncognitoErgoSum,

Wow, that's a great way to immediately drain all of the potential out of what could be a really amazing technology, and absolutely prevent any open source competitor from ever coming into existence, so in the best case we'll all be paying Google and OpenAI monthly forever for access to knowledge that ought to be free. What we need are unions and laws that enforce better labor conditions across the board.

IncognitoErgoSum,

Also, working in open source means having a proper understanding of licensing and ownership. Open source doesn't mean "free this and free that" -- in fact, many AI based code assistance tools are actually hurting the open source initiative by not properly respecting the license of the code base it's studying from.

Don't be patronizing. I've been involved in open source for 20+ years, and I know plenty about licensing.

What you're talking about is changing copyright law so that you'll have to license content in order for an AI to learn concepts from that content (in other words, to be able to summarize it, learn facts from it, learn an art style, and so on). This isn't how copyright law currently works, and I hope to god it stays that way.

For example, if you don't own the rights to the original copy of Star Wars, you obviously wouldn't own any rights over the output of an upscaled Star Wars. Same goes for writing or other "transformative" media, and it has been this way for a long time (see: audio sampling)

That's not the same thing as training an AI on Star Wars. If you feed Star Wars into an upscaling AI, the AI is processing each frame and creating an output that's a derivative work of that frame, and the result of that isn't something you would be allowed to release without a license. If you train it on Star Wars, the AI would learn general concepts from Star Wars, and not be able to produce an upscaled version of the movie verbatim (although depending on the AI, it may be able to produce images in the general style of Star Wars or summarize the movie).

An appropriate analogy for what's going on here would be reading a book and then talking about the facts I learned from that book, which is in no way a violation of copyright law. If I started quoting long sections of that book verbatim, I would need a license from the author, but that's not how AI works. It's not learning the sentences those people type verbatim, it's picking up concepts and facts from them. Even if I were to memorize the book from cover to cover, I would be in the clear as long as I didn't actually start reproducing the book in some way. Neural networks are learning machines, not databases. Their purpose isn't to reproduce information verbatim.

If you're still not clear on the difference between training on data and processing it, let me know and I'll try to clarify further.
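The distinction can be sketched in a few lines of PyTorch (the model and data here are toy stand-ins, not any real upscaler): processing runs data through a fixed model and keeps a derivative output; training uses the data once to nudge the weights and then discards it.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)    # toy stand-in for an upscaler or language model
frame = torch.randn(1, 8)  # toy stand-in for a frame or text sample

# Processing: the input goes in, a derivative output comes out.
with torch.no_grad():
    output = model(frame)  # this output is derived directly from `frame`

# Training: the input is used once to adjust weights, then thrown away.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss = nn.functional.mse_loss(model(frame), torch.zeros(1, 8))
loss.backward()
optimizer.step()  # only small weight adjustments remain; `frame` isn't stored
```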

GPT detectors are biased against non-native English writers (www.cell.com)

GPT detectors frequently misclassify non-native English writing as AI generated, raising concerns about fairness and robustness. Addressing the biases in these detectors is crucial to prevent the marginalization of non-native English speakers in evaluative and educational settings and to create a more equitable digital...

IncognitoErgoSum,

Is kbin a place where we just call everyone we don't like "techbros"?

IncognitoErgoSum,

Same. I also like how they don't push comments down the page.

People are going to use it as a disagree button, let them do it publicly. If you don't want other people to know you downvoted something, it's probably because they made a good point that you don't like.

IncognitoErgoSum,

Also, neural network weights are just a bunch of numbers, and I'm pretty sure data can't be copyrighted. And yes, images and sounds and video stored on a computer are numbers too, but those can be played back or viewed by a human in a meaningful way, and as such represent a work.

IncognitoErgoSum,

Unfortunately, the courts and legislatures may craft their opinions and laws, respectively, without knowing how machine learning actually works.

IncognitoErgoSum,

The end result is going to be basically the same regardless. Plenty of people (such as myself) who believe in the huge potential of AI to give creative power to regular people will volunteer our voices. Giving that creative power to everyone is worth far more, in my opinion, than gatekeeping the creation of art.

Unless they're planning on making it illegal for a computer to imitate any human voice, I don't see where making a law against using a voice without consent would make a big substantive difference. Just re-voice the existing lines in Skyrim with new voices to maintain consistency and you're good (there's a Serana mod that already does this, for instance).

IncognitoErgoSum,

Just being "a bunch of numbers" doesn't stop it from being a work, it doesn't stop it from being a derivative work

I suggest reading my entire comment.

A trained AI is not a measurement of the natural world. It is a thing that has been created from the processing of other things -- in the common sense of the word, it is derivative of those works. What remains, IMO, is the question of whether it would be a work, or something else, and whether that something else would be distinct enough from being a work to matter.

It's only a work if your brain is a work. We agree that in a digitized picture, those numbers represent the picture itself and thus constitute a work (which you would have known if you read beyond the first sentence of my comment). The weights that make up a neural network represent encodings into neurons, and as such should be treated the same way as neural encodings in a brain.

IncognitoErgoSum,

But since you seem to love the potential of AI would you be willing to send me an audio file of you pronouncing every possible phonetic sound the human mouth can make?

In theory, absolutely.

In practice, I'm not going to go through that much work just to make a point for a single fediverse comment. I'll be honest, though -- I'm not particularly worried about somebody using my voice to do a bad (or do a racism or whatever). It may happen, and I can live with it; I think the benefits far outweigh the cost, and in my experience, far more people use those sorts of things to do awesome stuff than to be shitty. Earlier today I was considering trying to put together an Open Voice project and collect volunteers to do exactly what you said.

I've already released open source code over the years; people could potentially use that to do things I don't agree with as well, but frankly, as someone who has had work out in the wild available for use by everyone, the panic is vastly overblown.

Your assumption that I felt otherwise is because you're on the opposite end of the spectrum, so self-assured of its value that you're blind to real shortcomings and abusable points.

Just because I feel that the potential benefits far outweigh the costs (as well as the draconian technical restrictions that would be required in order to prevent people from expressing themselves in a bad way), it doesn't follow that I'm somehow blind to the real shortcomings and abusable points of AI. I would appreciate it if you didn't make silly strawman assumptions about whether I've given something due consideration just because you don't like my conclusions.

If you have a solution that wouldn't absolutely kill it (or put a horribly filtered version in the hands of a few massive corporations who charge the rest of us for the privilege of using it while using it themselves however they want), I'm all ears.
