@cigitalgem@sigmoid.social

cigitalgem

software security #swsec machine learning security #mlsec Tech | Life | Music

nitashatiku, to random
@nitashatiku@mastodon.social avatar

My latest dispatch from the AI boom for the Washington Post looks at a billionaire-backed movement to push concerns about "existential risks" around AI from the fringes of tech culture into the mainstream. Recently tech philanthropists have helped fund about 20 student-led “AI Safety” clubs at elite colleges like Stanford, Harvard, MIT, NYU, Columbia as part of an effort to recruit idealistic talent to focus their time on the speculative risk that AI could kill us all.
https://www.washingtonpost.com/technology/2023/07/05/ai-apocalypse-college-students

cigitalgem,
@cigitalgem@sigmoid.social avatar

@nitashatiku excellent work

65dBnoise, to llm
@65dBnoise@mastodon.social avatar

"Over both evolutionary time and every individual’s lived experience, natural language to-and-fro has always been with fellow human beings. As we encounter synthetic language output, it is very difficult not to extend trust in the same way as we would with a human. We argue that systems need to be very carefully designed so as not to abuse this trust."

By @emilymbender and Chirag Shah

https://iai.tv/articles/all-knowing-machines-are-a-fantasy-auid-2334

cigitalgem,
@cigitalgem@sigmoid.social avatar

@65dBnoise you trust humans???

lauren, to random
@lauren@mastodon.laurenweinstein.org avatar

Glad to see that the new "Oppenheimer" movie didn't bomb.

cigitalgem,
@cigitalgem@sigmoid.social avatar

@lauren you didn't

cigitalgem, to random
@cigitalgem@sigmoid.social avatar

You can't fix an LLM by red teaming. It does exactly what it was designed to do. Autoassociative predictive word generation.

So what do you prove when you do prompt injection? Not a damn thing.

Always ask this. How does someone FIX what comes out of a pen test? If there is no fix, there is no change in security posture.


https://www.washingtonpost.com/technology/2023/08/08/ai-red-team-defcon/?wpisrc=nl_technology202

cigitalgem,
@cigitalgem@sigmoid.social avatar

@ojensen you can demonstrate that with one exploit, but you can't "prove" anything. I agree that some people don't get this yet. But the disingenuous press coverage that pretends this will secure AI is hogwash.

simplenomad, to random
@simplenomad@rigor-mortis.nmrc.org avatar

After more than 2 decades, my primary care physician retired 2.5 years ago. It took me several months to find a suitable replacement, who after 6 months decided to stop seeing patients and focus on clinical research full time. Another search commenced. I found a doctor, and she's been great for the past year. Yesterday in the mail I received a notice that she's moving out of state.

This is the United States. It is hard to find a doctor who 1) is not in bed with the pharmaceutical companies, 2) is not moving me through a quick-fire assembly line, and 3) actually considers alternate health solutions, i.e., a DO instead of an MD. The search begins again....

cigitalgem, (edited)
@cigitalgem@sigmoid.social avatar

@simplenomad I see this looming on the horizon for myself.

judell, to llm
@judell@social.coop avatar

"This suggests that the speed of fine-tuning LLMs is far exceeding that of peer review publications (OK, that’s not saying too much!) and we are clearly going to see considerable more improvements of these LLMs in the times ahead."

https://erictopol.substack.com/p/medical-ai-is-on-a-tear#%C2%A7large-language-models-are-answering-medical-questions-increasingly-correctly

cigitalgem,
@cigitalgem@sigmoid.social avatar

@judell gack!

exteriorpower, to random

What would you recommend someone read if they are skeptical of the Yudkowsky/AI “doomer” perspective, but curious to learn more and open to having their mind changed by good arguments? I’m especially interested in arguments that might be convincing to a logical, thoughtful, open minded person coming from outside the rationalist/EA/utilitarian worldview.

cigitalgem,
@cigitalgem@sigmoid.social avatar

@exteriorpower the top five papers in the annotated bibliography.

cigitalgem,
@cigitalgem@sigmoid.social avatar

I forgot to wear a shirt today. But it's just zoom!

cigitalgem, to random
@cigitalgem@sigmoid.social avatar

"He adds that the main method used to fine-tune models to get them to behave, which involves having human testers provide feedback, may not, in fact, adjust their behavior that much."

Another reason that Red Teaming of the sort DefCon plans to do is a waste of time.

https://www.wired.com/story/ai-adversarial-attacks/

cigitalgem, (edited) to random
@cigitalgem@sigmoid.social avatar

Just made 28 new KIVA micro-loans with recycled loan paybacks. Join Team BIML today!

https://bit.ly/cigitalgem-kiva

cigitalgem,
@cigitalgem@sigmoid.social avatar

@virome_girl So am I. I have been doing Kiva for many years and love to watch the loan pile grow and grow.

cigitalgem, to random
@cigitalgem@sigmoid.social avatar

Can you code using predictive statistical patterns? Nope.

https://www.theregister.com/2023/08/07/chatgpt_stack_overflow_ai/

cigitalgem,
@cigitalgem@sigmoid.social avatar

"From semi-structured interviews, it is apparent that polite language, articulated and text-book style answers, comprehensiveness, and affiliation in answers make completely wrong answers seem correct,"

exteriorpower, to random

Getting ready to head to ICML in Honolulu tomorrow. I haven’t traveled much since 2020. How are hotels doing with HEPA filters in room AC vents these days, and how are people handling it when A/C filtering is not in place?

cigitalgem,
@cigitalgem@sigmoid.social avatar

@exteriorpower I know. Humans are absurd.

cigitalgem,
@cigitalgem@sigmoid.social avatar

@exteriorpower it is as if COVID does not exist.

cigitalgem, to random
@cigitalgem@sigmoid.social avatar

This is just complete nonsense and does nothing to enhance .

Frustrating.

"Red Teaming" https://arxiv.org/pdf/2209.07858.pdf

yaleman, to random

I'm so glad that my replacement reading glasses with the new prescription that helps me see haven't arrived before I go on an important business trip 😡

cigitalgem,
@cigitalgem@sigmoid.social avatar

@yaleman who needs see? Who needs hear?
