cutterkom, to ArtificialIntelligence German
@cutterkom@mastodon.social

How are the #LAION (-5B) datasets built? And did the LAION organisation/group ever respond to criticism?

For our piece at @br_data on the masses of personally identifiable information (PII), we described the process of collecting the data, more than 5 billion images from the web: https://interaktiv.br.de/ki-trainingsdaten/en/index.html

This makes it clear how #PII and #CSAM content ends up being included and why the automatic filters do not work well enough.

#Stablediffusion

judeswae, to ArtificialIntelligence
@judeswae@toot.thoughtworks.com

Largest Dataset Powering AI Images Removed After Discovery of Child Sexual Abuse Material
https://www.404media.co/laion-datasets-removed-stanford-csam-child-abuse/

Do you know what's in your training dataset? Do you know what the model you're using was trained on?

tante, to random
@tante@tldr.nettime.org

The biggest image data set used in "AI" contains examples of child sexual abuse material
(Title: Largest Dataset Powering AI Images Removed After Discovery of Child Sexual Abuse Material) https://www.404media.co/laion-datasets-removed-stanford-csam-child-abuse/

thomasfricke,

@tante

And it is the basis of basically every image generator

https://laion.ai/

They are volunteers and reacted quite well. This is a glimpse into the kind of abyss there might be in data elsewhere.

Securing Our Digital Future: A CERN for Open Source large-scale AI Research and its Safety - Online petition (14 days left) (www.openpetition.eu)

Join us in our urgent mission to democratize AI research by establishing an international, publicly funded supercomputing facility equipped with 100,000 state-of-the-art AI accelerators to train open source foundation models. This monumental initiative will secure our technological independence, empower global innovation, and...
