"AI is going to eat itself: Experiment shows people training bots are using bots"

Many workers on platforms like Amazon Mechanical Turk are using AI language models like GPT-3 to perform their tasks. Because the output of these tasks eventually feeds machine learning models, the use of AI-produced data raises concerns such as reduced output quality and increased bias.

Human Labor & AI Models:

  • AI systems depend heavily on human labor, and many corporations source it through platforms like Amazon Mechanical Turk.
  • Workers on these platforms perform tasks such as data labeling and annotation, transcribing, and describing situations.
  • This data is used to train AI models, allowing them to perform similar tasks on a larger scale.

Experiment by EPFL Researchers:

  • Researchers at the École polytechnique fédérale de Lausanne (EPFL) in Switzerland conducted an experiment involving workers on Amazon Mechanical Turk.
  • The workers were tasked with summarizing abstracts of medical research papers.
  • It was found that a significant portion of the completed work appeared to be generated by AI models, possibly to increase efficiency and income.

Use of AI Detected Through Specific Methodology:

  • The research team developed a methodology to detect if the work was human-generated or AI-generated.
  • They created a classifier and used keystroke data to detect whether workers copied and pasted text from AI systems.
  • The researchers were able to validate their results by cross-checking with the collected keystroke data.
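
To make the cross-check above concrete, here is a minimal sketch of how keystroke logs might be compared against a classifier's flags. This is not the researchers' actual code: the `Submission` fields, the `looks_pasted` heuristic, and the `min_typed_ratio` threshold are all hypothetical, and a paste on its own does not prove the text came from an AI system.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical record of one worker's answer plus logged input activity.
    text: str          # final summary submitted by the worker
    keystrokes: int    # number of character key presses logged
    paste_events: int  # number of paste events logged


def looks_pasted(sub: Submission, min_typed_ratio: float = 0.5) -> bool:
    """Flag a submission if a paste was logged, or if far fewer keys
    were typed than the final text contains (assumed threshold)."""
    if sub.paste_events > 0:
        return True
    if not sub.text:
        return False
    return sub.keystrokes / len(sub.text) < min_typed_ratio


def cross_check(classifier_flags: list[bool], submissions: list[Submission]) -> float:
    """Fraction of classifier-flagged submissions where the keystroke
    evidence also suggests the text was pasted."""
    flagged = [s for f, s in zip(classifier_flags, submissions) if f]
    if not flagged:
        return 0.0
    return sum(looks_pasted(s) for s in flagged) / len(flagged)


typed = Submission(text="hello world", keystrokes=11, paste_events=0)
pasted = Submission(text="hello world", keystrokes=0, paste_events=1)
agreement = cross_check([True, True], [typed, pasted])
```

Here `agreement` is the share of classifier-flagged answers that the keystroke heuristic also flags, which is one simple way to sanity-check a classifier against independent evidence.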

The Drawbacks and Future of Using AI in Crowdsourced Work:

  • Training AI models on data generated by other AI could result in a decrease in quality, more bias, and potential inaccuracies.
  • Responses generated by AI systems are seen as bland and lacking the complexity and creativity of human-generated responses.
  • The researchers suggest that as AI improves, the nature of crowdsourced work may change, and AI could replace some workers.
  • The possibility of collaboration between humans and AI models in generating responses is also suggested.

The Importance of Human Data:

  • Human data is regarded as the gold standard because it is representative of humans, whom AI serves.
  • The researchers emphasize that what they often aim to study in crowdsourced data is the imperfections of human responses.
  • This could imply that measures may be introduced in the future to prevent AI usage on such platforms and ensure that the data collected is human-generated.

Source (The Register)

PS: I run an ML-powered news aggregator that uses AI to summarize the best tech news from 40+ media outlets (TheVerge, TechCrunch…). If you liked this analysis, you’ll love the content you’ll receive from this tool!

sparklingsquirrel,

That's interesting, I haven't thought of that aspect before.

jerrimu,

I have been using GPT to train a RiveScript chat bot.
