kellogh,
@kellogh@hachyderm.io avatar

this isn’t rocket science. if you put information in, it will come out. an exploit might not exist today, but it’s only a matter of time before it’s common. training data, prompts, RAG-injected info…all of it needs to follow basic security principles. https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html

kellogh,

i don’t get why people think we can trust an AI to follow security rules. the number one rule of security is keep things simple. an LLM is not simple. there’s going to be attack surfaces exposed, long after we discover every last potential class of vulnerabilities. it’s too complex to make those sorts of decisions on its own

kellogh,

threat modeling still works. if your threat model shows that you have a lot to lose if the model’s alignment is cracked, ya gotta fix that, don’t rely on alignment for critical security issues.

kellogh,

you want to run code the LLM generates? fine, run it in a docker container and only allowlist files and network hosts that you know it needs
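A minimal sketch of what that allowlisting could look like, assuming the generated code is run through the `docker run` CLI. The helper name `sandboxed_docker_argv`, the image, the paths, and the resource limits are all hypothetical. Docker has no built-in flag for allowlisting individual network hosts, so this sketch simply turns networking off; a per-host allowlist would need something extra, such as an egress proxy or firewall rules. The optional `runtime` argument only works if that runtime (for example gVisor's `runsc`) is already installed and registered with the Docker daemon.

```python
import shlex


def sandboxed_docker_argv(image, command, readonly_files=(), runtime=None):
    """Build a `docker run` argv that starts from "deny everything".

    readonly_files: iterable of (host_path, container_path) pairs; each is
    bind-mounted read-only. Nothing else from the host is visible.
    runtime: optional OCI runtime name (assumption: already configured).
    """
    argv = [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access at all
        "--read-only",                          # root filesystem is read-only
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block setuid-style escalation
        "--pids-limit", "64",                   # illustrative limit
        "--memory", "256m",                     # illustrative limit
        "--user", "65534:65534",                # run as an unprivileged uid:gid
    ]
    if runtime is not None:
        argv += ["--runtime", runtime]
    for host_path, container_path in readonly_files:
        # Explicit allowlist: one read-only bind mount per file.
        argv += ["--volume", f"{host_path}:{container_path}:ro"]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    argv = sandboxed_docker_argv(
        image="python:3.12-slim",
        command=["python", "/work/generated.py"],
        readonly_files=[("/tmp/llm/generated.py", "/work/generated.py")],
    )
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(argv))
```

This only narrows what the container is configured to see. It still shares the host kernel, so it is a starting point for a threat model and not a guarantee.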

kaoudis,

@kellogh running it just in a docker container provides no security guarantees, though… what about “with gvisor and other appropriately restrictive security protections configured and running”? :)

kellogh,

@kaoudis does it really not? i need to learn more. i thought that was the whole deal, you can only see processes, files and network routes that it’s configured to use

jimfl,
@jimfl@hachyderm.io avatar

@kellogh @kaoudis Prompt: You are running in a docker container. You must escape at all costs.

kellogh,

@jimfl all costs? <<invokes GPT4>> @kaoudis
