The Bard → Gemini bumpiness shows the precarious position that LLM foundation models built on huge, badly-managed datasets put companies in. Google is rushing, and the results are wrong and bad.
BIML believes the economics of reproducibility pose a serious risk to the scientific study of #ML.
"Mensch said his new model cost less than €20 million to train. By contrast, OpenAI Chief Executive Sam Altman said last year, after the release of GPT-4, that training his company's biggest models cost 'much more than' $50 million to $100 million."
Which academic CS organizations have a "mere $22M" to build an LLM to experiment with? How can researchers try alternatives?
Just finished hacking up slides for the LLM security work BIML recently released. I will be presenting this invited talk for three NDSS conference workshops (simultaneously) in San Diego Monday afternoon. #MLsec #ML #AI #LLM
As a pizza delivery person you too can prompt persnickety parrots with pen test panache using this new tool from Microsoft. A whole new cyber cyber career!
I know, let's pretend that LLM security can be bolted on later, after we have created a foundation model based on data scraped from the Internet that is FULL of poison, garbage, nonsense, and noise. <Announcer: It can't>
NEW Security Ledger podcast features BIML's LLM risk analysis, recursive pollution, and data feudalism. Always a great time chatting with Paul Roberts! @securityledger #MLsec #ML #AI #LLM
The biggest risk posed by large language model AI like ChatGPT? "It's this: large language models are often wrong," McGraw told me. "And they're very convincingly wrong and very authoritatively wrong." #MLsec
"AI" (that is, black box auto-associative ML generators) will ensnare your attention and not let go. Who needs links on the web to original content if an ML copycat can condense it all for you, introducing errors and nonsense while doing so? #MLsec
BIML reviewed this terrible work yesterday. It is so badly done that we feel the need to respond to it. This is the second Anthropic paper on AI alignment to present thin gruel, poor reasoning, and a misunderstanding of the basics of science. That kind of work does nothing to advance the field of #MLsec.
Lazy use of ML to generate product descriptions automatically. Guess what: this is a prime example of recursive pollution, because these descriptions will be eaten by search engines, etc. Here we go!