@cigitalgem@sigmoid.social

cigitalgem

@cigitalgem@sigmoid.social

software security #swsec machine learning security #mlsec Tech | Life | Music


cigitalgem, to ML

Just finished hacking up slides for the LLM security work BIML recently released. I will be presenting this invited talk for three NDSS conference workshops (simultaneously) in San Diego Monday afternoon.

All NDSS ’24 workshops: https://www.ndss-symposium.org/ndss2024/co-located-events/

  1. SDIoTSec: https://www.ndss-symposium.org/ndss2024/co-located-events/sdiotsec/
  2. USEC: https://www.ndss-symposium.org/ndss2024/co-located-events/usec/
  3. AISCC: https://www.ndss-symposium.org/ndss2024/co-located-events/aiscc/
cigitalgem, to random

Building Security In can be done with LLM applications

https://blog.redsift.com/news/announcing-red-sift-radar-beta/

cigitalgem, to ML

As a pizza delivery person you too can prompt persnickety parrots with pen test panache using this new tool from Microsoft. A whole new cyber cyber career!

https://www.microsoft.com/en-us/security/blog/2024/02/22/announcing-microsofts-open-automation-framework-to-red-team-generative-ai-systems/

I know, let's pretend that LLM security can be bolted on later after we have created a foundation model based on data scraped from the Internet that is FULL of poison, garbage, nonsense, and noise. <Announcer: It can't>

cigitalgem,

My first real programming after Applesoft BASIC was Pascal. I even got a 16K card with Turbo Pascal on it, bumping my memory ALL THE WAY UP to 64K on my Apple ][+. That machine deeply impacted my entire life.

cigitalgem, to ML

NEW Security Ledger podcast features BIML's LLM risk analysis, recursive pollution, and data feudalism. Always a great time chatting with Paul Roberts! @securityledger

https://securityledger.com/2024/02/episode-256-recursive-pollution-data-feudalism-gary-mcgraw-on-llm-insecurity/

cigitalgem,

The biggest risk posed by large language model AI like ChatGPT? “It’s this: large language models are often wrong,” McGraw told me. “And they’re very convincingly wrong and very authoritatively wrong.”

cigitalgem, to random

On the accidental surveillance state we built to serve better ads...

https://www.wired.com/story/how-pentagon-learned-targeted-ads-to-find-targets-and-vladimir-putin/

cigitalgem, to ML

So, what about that NIST AI attack taxonomy? Here's what BIML thinks:

https://berryvilleiml.com/2024/01/23/another-round-of-adversarial-machine-learning-from-nist/

cigitalgem, to llm

BIML released a unique and detailed Risk Analysis one week ago. Have you read it yet? Please pass it on.

This is applied machine learning security

https://berryvilleiml.com/results/

cigitalgem, to ML

Have a listen to BIML discuss Machine Learning Security on the Google Cloud Security podcast

https://berryvilleiml.com/2024/01/25/google-cloud-security-podcast-features-biml/

cigitalgem,

META-thread: Let's do a TOP TEN LLM Risks list

10: Encoding Integrity
https://sigmoid.social/@cigitalgem/111811997833433019

cigitalgem, to llm

Let's do a TOP TEN LLM Risks list

  1. Recursive pollution

Get the full paper here https://berryvilleiml.com/results/

cigitalgem,

Alemohammad, Sina, Josue Casco-Rodriguez, Lorenzo Luzi, Ahmed Imtiaz Humayun, Hossein Babaei, Daniel LeJeune, Ali Siahkoohi, Richard G. Baraniuk. “Self-Consuming Generative Models Go MAD.” arXiv preprint arXiv:2307.01850 (2023)

https://arxiv.org/pdf/2307.01850.pdf

cigitalgem,

LLMs can sometimes be spectacularly wrong, and confidently so. If and when LLM output is pumped back into the training data ocean (by being posted on the Internet, for example), a future LLM may end up being trained on these very same polluted data. This is one kind of “feedback loop” problem we identified and discussed in 2020.

cigitalgem,

See, in particular, [BIML78 raw:8:looping], [BIML78 input:4:looped input], and [BIML78 output:7:looped output]. Shumailov et al. subsequently wrote an excellent paper on this phenomenon. Also see Alemohammad et al. Recursive pollution is a serious threat to LLM integrity. ML systems should not eat their own output, just as mammals should not consume brains of their own species.

cigitalgem,

REFERENCES

Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, and Ross Anderson. “Model Dementia: Generated Data Makes Models Forget.” arXiv preprint arXiv:2305.17493 (2023).

https://arxiv.org/pdf/2305.17493.pdf

cigitalgem, to ML

Just delivered the first BIML LLM Risks talk at NDSS in San Diego. Much fun was had!

Getting set up for the talk...

cigitalgem,

The work I talked about at NDSS is available here under a Creative Commons license:

https://berryvilleiml.com/results/BIML-LLM24.pdf
