cutterkom, German
@cutterkom@mastodon.social

How are the #LAION (-5B) datasets built? And did the LAION organisation ever respond to criticism?

For our piece at @br_data on the masses of personally identifiable information (PII) in these datasets, we described the process of collecting the data, more than 5 billion images from the web: https://interaktiv.br.de/ki-trainingsdaten/en/index.html

It makes clear how #PII and #CSAM end up in the dataset and why the automatic filters do not work well enough.


#Stablediffusion
