remixtures, to ArtificialIntelligence Portuguese
@remixtures@tldr.nettime.org avatar

: "Roboticists believe that by using new AI techniques, they will achieve something the field has pined after for decades: more capable robots that can move freely through unfamiliar environments and tackle challenges they’ve never seen before.
(...)
But something is slowing that rocket down: lack of access to the types of data used to train robots so they can interact more smoothly with the physical world. It’s far harder to come by than the data used to train the most advanced AI models like GPT—mostly text, images, and videos scraped off the internet. Simulation programs can help robots learn how to interact with places and objects, but the results still tend to fall prey to what’s known as the “sim-to-real gap,” or failures that arise when robots move from the simulation to the real world.

For now, we still need access to physical, real-world data to train robots. That data is relatively scarce and tends to require a lot more time, effort, and expensive equipment to collect. That scarcity is one of the main things currently holding progress in robotics back."

https://www.technologyreview.com/2024/04/30/1091907/the-robot-race-is-fueling-a-fight-for-training-data/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

#AI #GenerativeAI #StackOverflow #AITraining: "Stack Overflow, a legendary internet forum for programmers and developers, is coming under heavy fire from its users after it announced it was partnering with OpenAI to scrub the site's forum posts to train ChatGPT. Many users are removing or editing their questions and answers to prevent them from being used to train AI — decisions which have been punished with bans from the site's moderators.

Stack Overflow user Ben posted on Mastodon about his experience editing his most successful answers to try to avoid having his work stolen by OpenAI.

@ben on Mastodon posts, "Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers. Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community. So instead I changed my highest-rated answers to a protest message. Within an hour mods had changed the questions back and suspended my account for 7 days."

Ben continues in his thread, "[The moderator crackdown is] just a reminder that anything you post on any of these platforms can and will be used for profit. It's just a matter of time until all your messages on Discord, Twitter etc. are scraped, fed into a model and sold back to you."

https://www.tomshardware.com/tech-industry/artificial-intelligence/stack-overflow-bans-users-en-masse-for-rebelling-against-openai-partnership-users-banned-for-deleting-answers-to-prevent-them-being-used-to-train-chatgpt

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "A lawsuit is alleging Amazon was so desperate to keep up with the competition in generative AI it was willing to breach its own copyright rules.…

The allegation emerges from a complaint [PDF] accusing the tech and retail mega-corp of demoting, and then dismissing, a former high-flying AI scientist after it discovered she was pregnant.

The lawsuit was filed last week in a Los Angeles state court by Dr Viviane Ghaderi, an AI researcher who says she worked successfully in Amazon's Alexa and LLM teams, and achieved a string of promotions, but claims she was later suddenly demoted and fired following her return to work after giving birth. She is alleging discrimination, retaliation, harassment and wrongful termination, among other claims.

Montana MacLachlan, an Amazon spokesperson, said of the suit: "We do not tolerate discrimination, harassment, or retaliation in our workplace. We investigate any reports of such conduct and take appropriate action against anyone found to have violated our policies.""

https://www.msn.com/en-us/news/crime/ex-amazon-exec-claims-she-was-asked-to-break-copyright-law-in-race-to-ai/ar-AA1nrNEG

remixtures, to Bulgaria Portuguese
@remixtures@tldr.nettime.org avatar

#EU #Belgium #France #AI #GenerativeAI #AITraining #DataProtection #GDPR: "As well as the Belgian Data Protection Authority decision I criticised earlier this week, it appears the French DPA has issued similar guidance on the use of personal data to train AI models. My detailed analysis below shows that, in relation to purpose-specific AI systems, it makes no sense: the training of the system cannot be separated from the ultimate purpose of the system. This has a major bearing on the issue of compatibility.

As a matter of principle and law, the creation and training of AI models/profiles for a specific purpose (be that direct marketing or health care) must be based on the legal basis relied on for that ultimate purpose.

The fact that the creation and training of the models/profiles is a “first phase” in a two-phase process (with the deployment of the models/profiles forming the “second phase”) does not alter that.

However, as an exception to this, under the GDPR, the processing can also be authorised by law or by means of an authorisation issued by a DPA under the relevant law (as in France), provided the law or DPA authorisation lays down appropriate safeguards. That is the only qualification I accept to the above principle." https://www.ianbrown.tech/2024/04/16/more-on-french-and-belgian-gdpr-guidance-on-ai-training/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

#AI #GenerativeAI #AITraining: "[A]s the lawsuits and investigations around generative AI and its opaque data practices pile up, there have been small moves to give people more control over what happens to what they post online. Some companies now let individuals and business customers opt out of having their content used in AI training or being sold for training purposes. Here’s what you can—and can’t—do.

Before we get to how you can opt out, it’s worth setting some expectations. Many companies building AI have already scraped the web, so anything you’ve posted is probably already in their systems. Companies are also secretive about what they have actually scraped, purchased, or used to train their systems. “We honestly don't know that much,” says Niloofar Mireshghallah, a researcher who focuses on AI privacy at the University of Washington. “In general, everything is very black-box.”" https://www.wired.com/story/how-to-stop-your-data-from-being-used-to-train-ai/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "Representative Adam Schiff (D-Calif.) introduced new legislation in the U.S. House of Representatives on Tuesday (April 9) which, if passed, would require AI companies to disclose which copyrighted works were used to train their models, or face a financial penalty. Called the Generative AI Copyright Disclosure Act, the new bill would apply to both new models and retroactively to previously released and used generative AI systems.

The bill requires that a full list of copyrighted works in an AI model’s training data set be filed with the Copyright Office no later than 30 days before the model becomes available to consumers. This would also be required when the training data set for an existing model is altered in a significant manner. Financial penalties for non-compliance would be determined on a case-by-case basis by the Copyright Office, based on factors like the company’s history of noncompliance and the company’s size." https://www.billboard.com/business/legal/federal-bill-ai-training-require-disclosure-songs-used-1235651089/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "This paper is a snapshot of an idea that is as underexplored as it is rooted in decades of existing work. The concept of mass digitization of books, including to support text and data mining, of which AI is a subset, is not new. But AI training is newly of the zeitgeist, and its transformative use makes questions about how we digitize, preserve, and make accessible knowledge and cultural heritage salient in a distinct way.

As such, efforts to build a books data commons need not start from scratch; there is much to glean from studying and engaging existing and previous efforts. Those learnings might inform substantive decisions about how to build a books data commons for AI training. For instance, looking at the design decisions of HathiTrust may inform how the technical infrastructure and data management practices for AI training might be designed, as well as how to address challenges to building a comprehensive, diverse, and useful corpus. In addition, learnings might inform the process by which we get to a books data commons — for example, illustrating ways to attend to the interests of those likely to be impacted by the dataset’s development." https://openfuture.pubpub.org/pub/towards-a-book-data-commons-for-ai-training/release/1

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

#AI #GenerativeAI #EU #AIAct #Copyright #AITraining #IP #FairUse: "Scarcely a day goes by without news of exciting breakthroughs in the world of AI. In the face of disruptive waves of technological change and mounting uncertainty, the law cannot help but take on an “experimental” character, with lawmakers and lawyers often caught on the back foot, struggling to keep up with the sweeping winds of change. But whatever the next steps may be, one thing is certain: litigation surrounding generative AI marks an important crossroads, and whichever path we choose is likely to shape the future of the technology. The rising litigation around generative AI is not targeting image by image or specific excerpts of infringing texts produced by AI models. Rather, the whole technique behind the system is hanging in the balance.

Another key takeaway that merits attention relates to the fragmentary landscape of copyright that seems to be unfolding in the wake of the rapid advances in AI technology. Although the emerging European legal framework offers strict rules yet solid ground for AI technology to flourish on the continent, it's worth wondering what will happen if the "Brussels effect" fails to reach the shores on the other side of the Atlantic and the use of copyrighted works for training purposes is found to be transformative fair use in common law jurisdictions, while a relevant portion of these works are opted out of AI models on European soil. That would mark a yawning gap between two copyright regimes, opening a new chapter in this old tale and potentially disadvantaging would-be European generative AI providers." https://copyrightblog.kluweriplaw.com/2024/04/08/the-stubborn-memory-of-generative-ai-overfitting-fair-use-and-the-ai-act/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "The other side: The AI companies have two chief legal arguments.

Many maintain that their broad use of copyrighted material is legal under the doctrine of "fair use," which courts apply using a complex four-part standard.
However, as Giordano notes, "the public status of copyrighted material" is not one of those factors.
A decade ago, the Google Books decision held that Google's use of "text snippets" to catalogue published works was an acceptable fair use, and AI companies often point to Google's win to back their argument.
The second argument is that copyright is not an issue in AI training because AI systems don't copy material: They just "learn" from it the way a human might.
Reality check: AI companies often refuse to say which "publicly available" data they are using, with OpenAI and others describing their sources as a competitive trade secret."

https://www.axios.com/2024/04/05/open-ai-training-data-public-available-meaning

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

: "Models like ChatGPT and Claude are deeply dependent on training data to improve their outputs, and their very existence is actively impeding the creation of the very thing they need to survive. While publishers like Axel Springer have cut deals to license their companies' data to ChatGPT for training purposes, this money isn't flowing to the writers that create the content that OpenAI and Anthropic need to grow their models much further. It's also worth considering that these AI companies may have already trained on this data. The Times sued OpenAI late last year for training itself on "millions" of articles, and I'd bet money that ChatGPT was trained on multiple Axel Springer publications along with anything else it could find publicly available on the web.

This is one of many near-impossible challenges for an AI industry that's yet to prove its necessity. While one could theoretically make bigger, more powerful chips (I'll get to that later), AI companies face a Kafkaesque bind where they can't improve a tool for automating the creation of content without human beings creating more content than they've ever created before. Paying publishers to license their content doesn't actually fix the problem, because it doesn't increase the amount of content that they create, but rather helps line the pockets of executives and shareholders. Ironically, OpenAI's best hope for survival would be to fund as many news outlets as possible and directly incentivize them to do in-depth reporting, rather than proliferating a tech that unquestionably harms the media industry." https://www.wheresyoured.at/bubble-trouble/

emkingma, to ai
@emkingma@mstdn.social avatar

Go on LinkedIn for a bit this morning and I'm greeted with a message and an ad inviting me to screw over my own future and that of others.

No, I'm not going to teach your generative AI model how to f**king write.

#AI #AITraining #GenerativeAI

An ad from Outlier that appeared in my LinkedIn feed, encouraging me to sign up for the role I was messaged about.

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

#AI #GenerativeAI #UK #AITraining #Copyright #IP: "Britain’s biggest publishing houses have written to dozens of technology companies, warning them that they must pay if they want to use content from books, journals and papers to build their artificial intelligence (AI) models.

The Publishers Association said it was of "deep concern" to its members, who include Penguin Random House, HarperCollins and Oxford University Press, that it believes "vast amounts of copyright-protected works" are being fed by tech businesses into their generative AI programs without authorisation.

Among the 50 recipients of the letter, which was sent last week, were Google DeepMind, Meta, the owner of Facebook, and OpenAI, the company behind ChatGPT. Those three companies have been approached for comment." https://www.thetimes.co.uk/article/dont-use-our-books-in-your-ai-programs-publishers-warn-big-tech-8ttq0n6hq

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org avatar

#AI #GenerativeAI #AITraining #GhostWork #Kenya #Uganda: "Magic is often used as a metaphor for complex technological processes and systems, and this is why in the marketing rhetoric of AI systems, magic has been such a powerful metaphor. We are told of its amazing, un-ending capabilities; its power to both save and ruin the world; and of God-like qualities just round the corner. It is a powerful metaphor that is easy to get swept up in. But a metaphor is all it is. AI is not untethered, immaterial magic. It is structurally reliant on a vast number of people providing a myriad of tasks in not-so-magical working conditions.

Everyone likes to believe in magic. But where AI is concerned, awe should be reserved for the workers performing the tasks behind the curtain. It is only because of them that the systems can do what they do. The least they deserve is basic minimum standards at work.

As Fairwork, we will be continuing our investigation into AI supply chains in the new year with new studies. We will be shifting our attention to business process outsourcing companies in Latin America with further support from the Global Partnership on AI. There is nothing inevitable about poor working conditions in the digital economy. Despite their claims to the contrary, companies have substantial control over the nature of the jobs that they provide. Fairwork’s aim is to hold them to account."

https://futureofwork.fes.de/news-list/e/ai-value-chain
