bcantrill,
@bcantrill@mastodon.social avatar

In the conversation @ahl and I had with @simon in January, he mentioned work on adversarial attacks on LLMs that proved surprisingly universal. On today's Oxide and Friends, we will be joined by Nicholas Carlini, one of the authors of "Universal and Transferable Adversarial Attacks on Aligned Language Models" to talk not only about this specific work, but about adversarial machine learning in general -- and how it guides thinking on LLMs. Join us, 5p Pacific!

https://discord.gg/dkzxxNQs?event=1221829306112675952
