cutterkom, German
How are the #LAION (-5B) datasets built? And did the LAION organisation/group ever respond to criticism?
For our piece at @br_data on the masses of personally identifiable information (PII), we described the process of collecting more than 5 billion images from the web: https://interaktiv.br.de/ki-trainingsdaten/en/index.html
This makes it clear how #PII and #CSAM content is included and why the automatic filters do not work well enough.