remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "Magic is often used as a metaphor for complex technological processes and systems, and this is why, in the marketing rhetoric of AI systems, magic has been such a powerful metaphor. We are told of its amazing, unending capabilities; its power to both save and ruin the world; and of God-like qualities just around the corner. It is a powerful metaphor that is easy to get swept up in. But a metaphor is all it is. AI is not untethered, immaterial magic. It is structurally reliant on a vast number of people performing a myriad of tasks in not-so-magical working conditions.

Everyone likes to believe in magic. But where AI is concerned, awe should be reserved for the workers performing the tasks behind the curtain. It is only because of them that the systems can do what they do. The least they deserve is basic minimum standards at work.

As Fairwork, we will be continuing our investigation into AI supply chains in the new year with new studies. We will be shifting our attention to business process outsourcing companies in Latin America with further support from the Global Partnership on AI. There is nothing inevitable about poor working conditions in the digital economy. Despite their claims to the contrary, companies have substantial control over the nature of the jobs that they provide. Fairwork’s aim is to hold them to account."

https://futureofwork.fes.de/news-list/e/ai-value-chain

josemurilo, to random
@josemurilo@mato.social avatar

How should the Creative Commons community respond to the use of CC-licensed work in AI training?
"In 2023, the theme of the CC Global Summit was AI and the Commons, focused on supporting better sharing in a world with artificial intelligence."
Three opinion groups resulted from this conversation:
A: Moat Protectors - 16%
B: AI Oversight Maximalists - 36%
C: Equitable Benefit Seekers - 32%
https://creativecommons.org/2024/02/08/what-does-the-cc-community-think-about-regulating-generative-ai/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "The last week saw word leak that Google had agreed to license Reddit's massive corpus of billions of posts and comments to help train its large language models. Now, in a recent Securities and Exchange Commission filing, the popular online forum has revealed that it will bring in $203 million from that and other unspecified AI data licensing contracts over the next three years.

Reddit's Form S-1—published by the SEC late Thursday ahead of the site's planned stock IPO—says the company expects $66.4 million of that data-derived value from LLM companies to come during the 2024 calendar year. Bloomberg previously reported the Google deal to be worth an estimated $60 million a year, suggesting that the three-year deal represents the vast majority of its AI licensing revenue so far."

https://arstechnica.com/ai/2024/02/reddit-has-already-booked-203m-in-revenue-licensing-data-for-ai-training/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "There’s a bit of a creative working ecosystem where there are natural cycles to someone getting their start to then having enough experience to gain a higher level in their career and make a living out of it. All of those doors are now starting to close because of generative AI. With the works that generative AI makes, the metric isn’t whether something is artistic or has quality, because if that were the case then people would win. The metric is the market itself. When you have a team trying to slash costs who say, “I could hire this artist full time, that’s a whole year’s worth of salary, or I could just pay a little subscription fee,” they’re going to go for the model.

Veterans who should be respected for their incredible contributions to our industry have been approached by high-profile production houses being like, “Can you paint over this Midjourney image? Oh, and we’ll pay you half.” That’s happening right now. In film, at least, there’s some good pay a person can make a living off of, and that’s now being lowered. And it’s going so much faster than any of us ever imagined. There’s a lot of angst and depression, even among actual professionals who are like, “I’ve given my whole life to this. It’s a lifetime of work.” And then for some company to say, “That lifetime of work, that dedication, it’s now mine. We’re gonna compete against you, we’re gonna make insane amounts of money off your work, and you don’t have to have a say.” That’s fucking a lot of people up. It’s a really tough time."

https://disconnect.blog/how-artists-are-fighting-generative-ai/?ref=disconnect-newsletter

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "A lot of early AI research was done in an academic setting; the law specifically mentions teaching, scholarship, and research as examples of fair use. As a result, the machine-learning community has traditionally taken a relaxed attitude toward copyright. Early training sets frequently included copyrighted material.

As academic researchers took jobs in the burgeoning commercial AI sector, many assumed they would continue to enjoy wide latitude to train on copyrighted material. Some feel blindsided by copyright holders’ demands for cash.

“We all learn for free,” Daniel Jeffries wrote in his tweet summing up the view of many in the AI community. “We learn from the world around us and so do machines.”

The argument seems to be that if it’s legal for a human being to learn from one copyrighted book, it must also be legal for a large language model to learn from a million copyrighted books—even if the training process requires making copies of the books.

As MP3.com and Texaco learned, this isn't always true. A use that’s fair at a small scale can be unfair when it’s scaled up and commercialized.

But AI advocates like Jeffries are right that sometimes it is true. There are cases where courts have held that bulk technological uses of copyrighted works are fair use. The most important example is almost certainly the Google Books case."

https://www.understandingai.org/p/the-ai-community-needs-to-take-copyright

okpierre, to reddit
@okpierre@mastodon.social avatar

A large AI company is paying about $60 million for access to Reddit (YOUR content) so it can train its AI models

The Fediverse does have open-source alternatives like Lemmy, kbin, and mbin that you can try

#reddit #community #portal #aggregation #socialnetwork #ai #media #aitraining #software #training #fediverse #kbin #mbin #lemmy #activitypub

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "All but one of the generative AI copyright lawsuits are likely years away from being definitively resolved. However, a Thomson Reuters lawsuit against Ross Intelligence for use of Westlaw headnotes as training data for Ross’s generative AI system for analyzing legal issues is scheduled to go to trial in late August 2024. Ross claims it made only fair use of the headnotes. A trial court denied the litigants’ cross-motions for summary judgment, finding that there were triable issues of fact on the infringement and fair use claims.

Thomson Reuters is among the generative AI plaintiffs that is asking for a court order to destroy a generative AI model trained on infringing data. Thus, we may know within the year how receptive courts will be to such remedy requests in generative AI cases. (For what it’s worth, I find Ross’s fair use defense quite persuasive. If Ross prevails, we will know no more about likely remedies in generative AI cases than we know today.)

None of the generative AI copyright complaints has explicitly asked a court to order generative AI developers to obtain a license from a collecting society, such as the Copyright Clearance Center, for permission to use in-copyright works as training data, subject to providing compensation for past and future uses of copyrighted works to train AI models.

The Authors Guild, which is the lead plaintiff in one class-action lawsuit, supports a collective license approach for authorizing use of in-copyright works as training data. Because no existing collecting society has obtained permission from all affected copyright owners to grant such a collective license, a court order of this sort would seem inappropriate."

https://www.lawfaremedia.org/article/how-to-think-about-remedies-in-the-generative-ai-copyright-cases

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "The University of Michigan is selling hours of audio recordings of study groups, office hours, lectures, and more to outside third-parties for tens of thousands of dollars for the purpose of training large language models (LLMs). 404 Media has downloaded a sample of the data, which includes a one hour and 20 minute long audio recording of what appears to be a lecture.

The news highlights how some LLMs may ultimately be trained on data with an unclear level of consent from the source subjects. The University of Michigan did not immediately respond to a request for comment, and neither did Catalyst Research Alliance, which is part of the sale process.

“The University of Michigan has recorded 65 speech events from a wide range of academic settings, including lectures, discussion sections, interviews, office hours, study groups, seminars and student presentations,” a page on Catalyst’s website about the University of Michigan data reads. “Speakers represent broad demographics, including male and female and native and non-native English speakers from a wide variety of academic disciplines.”"

https://www.404media.co/university-of-michigan-sells-recordings-of-study-groups-and-office-hours-to-train-ai/

remixtures, to random Portuguese
@remixtures@tldr.nettime.org avatar

RT @katecrawford
✨Just published: New @ISSUESinST collection of STS scholars addressing the urgent governance challenges of generative AI. This came from our year-long @ENS_ULM working group, where we each focused on one core problem:

https://issues.org/an-ai-society/

remixtures,
@remixtures@tldr.nettime.org avatar

: "Copyright law was developed by eighteenth-century capitalists to intertwine art with commerce. In the twenty-first century, it is being used by technology companies to allow them to exploit all the works of human creativity that are digitized and online. But the destabilization around generative AI is also an opportunity for a more radical reassessment of the social, legal, and cultural frameworks underpinning creative production.
(...)
It may be time to develop concepts of intellectual property with a stronger focus on equity and creativity as opposed to economic incentives for media corporations. We are seeing early prototypes emerge from the recent collective bargaining agreements for writers, actors, and directors, many of whom lack copyrights but are nonetheless at the creative core of filmmaking. The lessons we learn from them could set a powerful precedent for how to pluralize intellectual property. Making a better world will require a deeper philosophical engagement with what it is to create, who has a say in how creations can be used, and who should profit." https://issues.org/generative-ai-copyright-law-crawford-schultz/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

AI TRAINING = FAIR USE

: "The datasets on which GAI systems like ChatGPT and Stable Diffusion rely are more like Google Books than commercial web crawlers that gather data; GAI systems need writing and art in its complete form to train from. And the comparison to Google Books doesn’t stop there. Many GAI systems are built using third-party public datasets like Common Crawl and LAION that are distributed under fair use principles and provide a real public good: archiving and making accessible the aggregated content of the internet for academics, researchers, and anyone else that may want it. These are free, non-commercial datasets collected by nonprofit organizations for use by researchers and the public. Web crawling and scraping also underlie the operation of search engines and archiving projects like the Internet Archive’s popular Wayback Machine.

In other words, the same practices that go into collecting data for GAI training are currently understood to be non-infringing or protected by fair use. Considering how vital these practices are for an open and accessible internet, we should ensure that they stay that way.

As a threshold matter, it is critical to understand that accessing, linking to, or interacting with digital information does not infringe any copyright. Reading a book, looking at a photograph, admiring a painting, or listening to music is not, and never should be, copyright infringement. This is not a “fair use” issue; the ability to use, access, or interact with a creative work is outside a copyright owner’s scope of control. Based on the best explanations of how GAI systems work, training a GAI system is generally analogous to these kinds of uses."

https://publicknowledge.org/generative-ai-is-disruptive-but-more-copyright-isnt-the-answer/
