@cigitalgem@sigmoid.social

cigitalgem


software security #swsec machine learning security #mlsec Tech | Life | Music


cigitalgem,

@danielcornell yeah. Pretend security for the win!

cigitalgem, to random

You can't fix an LLM by red teaming. It does exactly what it was designed to do. Autoassociative predictive word generation.

So what do you prove when you do prompt injection? Not a damn thing.

Always ask this. How does someone FIX what comes out of a pen test? If there is no fix, there is no change in security posture.


https://www.washingtonpost.com/technology/2023/08/08/ai-red-team-defcon/?wpisrc=nl_technology202

cigitalgem,

@ojensen you can demonstrate that with one exploit, but you can't "prove" anything. I agree that some people don't get this yet. But the disingenuous press coverage that pretends this will secure AI is hogwash.

cigitalgem, to random

Can you code using predictive statistical patterns? Nope.

https://www.theregister.com/2023/08/07/chatgpt_stack_overflow_ai/

cigitalgem,

"From semi-structured interviews, it is apparent that polite language, articulated and text-book style answers, comprehensiveness, and affiliation in answers make completely wrong answers seem correct,"

cigitalgem, to random

Repeat after me. AI "red teaming" is bullshit. Do real and stop the nonsense.

https://www.washingtonpost.com/technology/2023/08/08/ai-red-team-defcon/?wpisrc=nl_technology202

cigitalgem, (edited) to random

Just made 28 new KIVA micro-loans with recycled loan paybacks. Join Team BIML today!

https://bit.ly/cigitalgem-kiva

cigitalgem,

@virome_girl So am I. I have been doing Kiva for many years and love to watch the loan pile grow and grow.

lauren, to random

Glad to see that the new "Oppenheimer" movie didn't bomb.

cigitalgem,

@lauren you didn't

exteriorpower, to random

Getting ready to head to ICML in Honolulu tomorrow. I haven’t traveled much since 2020. How are hotels doing with HEPA filters in room AC vents these days, and how are people handling it when A/C filtering is not in place?

cigitalgem,

@exteriorpower it is as if COVID does not exist.

cigitalgem,

@exteriorpower I know. Humans are absurd.

simplenomad, to random

After more than 2 decades, my primary care physician retired 2.5 years ago. It took me several months to find a suitable replacement, who after 6 months decided to stop seeing patients and focus on clinical research full time. Another search commenced: I found a doctor, and she's been great for the past year. Yesterday in the mail I received a notice that she's moving out of state.

This is the United States. It is hard to find a doctor who 1) is not in bed with the pharmaceutical companies, 2) is not moving me through a quick-fire assembly line, and 3) actually considers alternate health solutions, i.e., a DO instead of an MD. The search begins again....

cigitalgem, (edited)

@simplenomad I see this looming on the horizon for myself.

cigitalgem,

I forgot to wear a shirt today. But it's just Zoom!

judell, to llm

"This suggests that the speed of fine-tuning LLMs is far exceeding that of peer review publications (OK, that’s not saying too much!) and we are clearly going to see considerable more improvements of these LLMs in the times ahead."

https://erictopol.substack.com/p/medical-ai-is-on-a-tear#%C2%A7large-language-models-are-answering-medical-questions-increasingly-correctly

cigitalgem,

@judell gack!

65dBnoise, to llm

"Over both evolutionary time and every individual’s lived experience, natural language to-and-fro has always been with fellow human beings. As we encounter synthetic language output, it is very difficult not to extend trust in the same way as we would with a human. We argue that systems need to be very carefully designed so as not to abuse this trust."

By @emilymbender and Chirag Shah

https://iai.tv/articles/all-knowing-machines-are-a-fantasy-auid-2334

cigitalgem,

@65dBnoise you trust humans???

exteriorpower, to random

What would you recommend someone read if they are skeptical of the Yudkowsky/AI “doomer” perspective, but curious to learn more and open to having their mind changed by good arguments? I’m especially interested in arguments that might be convincing to a logical, thoughtful, open minded person coming from outside the rationalist/EA/utilitarian worldview.

cigitalgem,

@exteriorpower the top five papers in the annotated bibliography.

yaleman, to random

I'm so glad that my replacement reading glasses with the new prescription that helps me see haven't arrived before I go on an important business trip 😡

cigitalgem,

@yaleman who needs see? Who needs hear?

cigitalgem, to random

"He adds that the main method used to fine-tune models to get them to behave, which involves having human testers provide feedback, may not, in fact, adjust their behavior that much."

Another reason that Red Teaming of the sort DefCon plans to do is a waste of time.

https://www.wired.com/story/ai-adversarial-attacks/

cigitalgem, to random

NEW BIML Bibliography entry

DATA VALIDATION FOR MACHINE LEARNING

Breck, et al.

This basic paper is about validating input data (as opposed to the validation set as linked to the training set).

https://berryvilleiml.com/references/

cigitalgem, to random

NEW BIML Bibliography entry

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

Anthropic

https://arxiv.org/pdf/2209.07858.pdf

Absolute malarkey informed by zero understanding of security, pen testing, and what a real red team does.


https://berryvilleiml.com/references/
