Machine Learning

MozillaAI,

Over the past few months at @MozillaAI we engaged with a number of organizations to learn how they are using language models in practice.

We spoke with 35 organizations across sectors like finance, government, startups, and large enterprises.

Our interviewees ranged from engineers to CTOs, capturing a diverse range of perspectives.

Our interview summary notes for the 35 conversations amounted to 18,481 words (approximately 24,600 tokens), almost the length of a novella.

Lobrien,

This is a very nice video on understanding attention in transformers https://www.3blue1brown.com/lessons/attention

kellogh,
@kellogh@hachyderm.io

@Lobrien oh wow, i never could crack it before, i think i vaguely get it now

cigitalgem,
@cigitalgem@sigmoid.social

The open source debate in ( ) is absolutely irrelevant unless all the training data are also made open. Tech reporters are getting lost again because vendors are misleading them.

https://www.nytimes.com/2024/05/29/technology/what-to-know-open-closed-software.html?utm_source=press.coop

kir0ul,
@kir0ul@sigmoid.social

@cigitalgem Indeed, and by the way, there is a current effort from @osi to define what is and some people are pointing out that the data must be released as well:

dalias,
@dalias@hachyderm.io

@cigitalgem And the training data can't be made open because it's all stolen, full of copyright infringement and GDPR violations, ntm CSAM.

The entire tech industry has stepped up from petty disruption lawbreaking to full on mafia level shit.

cigitalgem,
@cigitalgem@sigmoid.social

Fox appoints self to guard chicken house.

"As OpenAI trains its new model, its new Safety and Security committee will work to hone policies and processes for safeguarding the technology, the company said. The committee includes Mr. Altman, as well as OpenAI board members Bret Taylor, Adam D’Angelo and Nicole Seligman. The company said that the new policies could be in place in the late summer or fall."

https://www.nytimes.com/2024/05/28/technology/openai-gpt4-new-model.html?utm_source=press.coop

pinsk,
@pinsk@freeradical.zone

@cigitalgem the implication is there was no safety/security committee or policies until now, which tracks

cigitalgem,
@cigitalgem@sigmoid.social

@pinsk they disbanded it not too long ago actually

cigitalgem,
@cigitalgem@sigmoid.social

When you choose to use a foundation model, you accept the risk management decisions made by the vendor without your input. Wonder what they are? Read this paper from computer.

https://berryvilleiml.com/2024/05/16/how-to-regulate-llms/

cigitalgem,
@cigitalgem@sigmoid.social

I am speaking tonight at the NOVA chapter meeting. Meeting starts at 5:30 in Reston at the Microsoft building.

10, 23, 81 — Stacking up the LLM Risks: Applied Machine Learning Security

https://www.issa-nova.org/may-16-530pm-dr-gary-mcgraw-on-stacking-up-the-llm-risks-applied-machine-learning-security/

cigitalgem,
@cigitalgem@sigmoid.social

Wonder how the government should regulate LLMs? Here's how.

https://berryvilleiml.com/2024/05/16/how-to-regulate-llms/

chikim,
@chikim@mastodon.social

Earlier today, Microsoft released the new WizardLM-2 7b, 8x22b, and 70b models with great benchmark results (of course, they say they are as good as, or almost the same as, GPT-4), but they then removed the weights on Huggingface, the repo on Github, and their whitepaper. Someone on Reddit joked that maybe they released GPT-4 by mistake! lol Quantized weights from other people are still around on Huggingface!

vick21,
@vick21@mastodon.social

@chikim Also, I think we talked about this before, I cannot justify 20 USD per month for either Copilot pro or Chat GPT. They really need to try harder or just lower the price. Make it a Spotify, for example! :)

miki,
@miki@dragonscave.space

@vick21 @chikim Chat GPT Plus isn't worth it, you can just load up on $6 of developer credits and use an alternative interface to GPT-4. I'm a fan of the command-line LLM (https://github.com/simonw/llm), but GUIs do exist. Copilot for VS Code is another matter entirely: I get it for free via the Github Student pack, to which I have access, but I'd probably pay up if I needed to.

cigitalgem,
@cigitalgem@sigmoid.social

Nice to see data lakes released...but what we need are data oceans. This new dataset is off by many orders of magnitude. Humans have a hard time with trillions...
https://huggingface.co/blog/Pclanglais/common-corpus

mempko,
@mempko@fosstodon.org

I don't think the tech nerds out there understand how upsetting generative AI is to artists. Not because it will replace them, but because there will be a generation of soulless creation devoid of humanity.

Also, how many children are looking at the progress and thinking 'what's the point of becoming an artist?'. Or how many school directors are thinking 'what's the point of a fine art budget'.

kellogh,
@kellogh@hachyderm.io

@mempko on the second paragraph, i think you’re a little backwards on what draws children to art. i can say fairly authoritatively that 8yo’s aren’t yet thinking about the finer points of what it takes to become a full time artist 😊

i doubt anyone, even adults, were ever drawn to art because they thought it was easy money. i can’t imagine schools ever invested in art because they believed they were setting students up with high paying jobs

cigitalgem,
@cigitalgem@sigmoid.social

I don't believe we can filter our way out of drinking a polluted ocean of training data. https://www.techtarget.com/searchEnterpriseAI/news/366574580/Microsoft-hires-DeepMind-co-founder-amid-Google-Apple-news

cigitalgem,
@cigitalgem@sigmoid.social

"The issue is that [Google] trained up the [Gemini] foundation model on the polluted ocean and now they're trying to stop the pollution from getting out with a filter, and that doesn't work," he said. "These models were built by drinking a data ocean without cleaning it first. And we have to do better than that." And Microsoft has the same problem, he added.

https://www.techtarget.com/searchEnterpriseAI/news/366574580/Microsoft-hires-DeepMind-co-founder-amid-Google-Apple-news

espadrine,
@espadrine@mastodon.social

@cigitalgem I believe it too.

Some counterargue that training on Nazi content allows it to recognize it so that it can be finetuned not to be a Nazi. But it seems to me that making its output match anti-Nazi speech is more effective than making it match Nazi speech.

Lobrien,

Prompt "engineering" boils my blood. Can you imagine if you were working on a stream prediction system and the quality of the output depended on prepending a stream of magic numbers? You'd disdain anyone claiming that was a sustainable solution for a business. (I mean, I can imagine it, because that's exactly the kind of crap you see in consulting.)

kellogh,
@kellogh@hachyderm.io

@Lobrien what about social engineering?

cigitalgem,
@cigitalgem@sigmoid.social

BIML talk at Indiana University in Bloomington 4/5. Open to the public.

https://spice.luddy.indiana.edu/garymcgrawtalk

cigitalgem,
@cigitalgem@sigmoid.social

Have a look at the USENIX ;login: interview featuring myself and the BIML LLM work.

https://berryvilleiml.com/2024/03/15/rik-farrow-interviews-mcgraw-for-login/

seniorfrosk,
@seniorfrosk@snabelen.no

@cigitalgem From the interview, can we conclude that Cigital was not called Cigital when you joined?

seniorfrosk,
@seniorfrosk@snabelen.no

@cigitalgem Interesting, I did not realize Synopsys was getting out of

metal3d, French
@metal3d@techlover.eu

Here's a nice little article, typed up in a hurry, that might interest you: how I used a local AI to generate fictitious data.

The code is provided at the bottom of the article. And feel free to respond in the comments section!

https://www.metal3d.org/blog/2024/comment-jai-g%C3%A9n%C3%A9r%C3%A9-un-dataset-avec-lia/

cigitalgem,
@cigitalgem@sigmoid.social

WHICH PART OF THIS WILL NOT WORK ARE YOU REPORTERS NOT UNDERSTANDING? Sorry for yelling.

https://www.fastcompany.com/91056543/google-gemini-restricts-global-election-queries

kellogh,
@kellogh@hachyderm.io

peeps — are there clustering algorithm implementations (especially k-means) over a pre-built index? like maybe over an HNSW vector index

the naive O(n^k) is killing me…

kellogh,
@kellogh@hachyderm.io

it seems like the iterator that HNSW gives you could shave off a few of those exponents…

renebekkers, Dutch
@renebekkers@mastodon.social

Last week I attended the 6th Perspectives on Scientific Error Conference at @TUEindhoven
I learned so much! About questionable research practices, methods to detect data fabrication, , artefacts in machine learning...
I'm impressed by the commitment of participants to improve science through error detection & prevention. Thanks to the organizers Noah van Dongen, @lakens @annescheel Felipe Romero and @annaveer

renebekkers,
@renebekkers@mastodon.social

@TUEindhoven @lakens @annescheel @annaveer
at the PSE6 meeting I wondered how often researchers in different disciplines attempt to replicate previous findings. Here's an overview of all studies I could find, with some surprising patterns. https://renebekkers.wordpress.com/2024/03/08/how-often-do-we-replicate-previous-research/

cigitalgem,
@cigitalgem@sigmoid.social

It's the data, dummy.

"The AI company, for example, says it has an advantage of having access to X’s trove of posts."

Musk bought twitter for the data pile.

https://www.wsj.com/tech/ai/elon-musks-x-leans-on-his-ai-startup-9038380d

nohillside,
@nohillside@smnn.ch

@cigitalgem keeping the data for itself may very well be the key reason for closing the APIs a year ago.

SwearyMonkey,
@SwearyMonkey@mastodon.social

@cigitalgem y except you've got the data of a million trolls and stupid shitty arguments, not high quality lol

cigitalgem,
@cigitalgem@sigmoid.social

LLMs are often completely wrong. Alignment does not fix this. In fact, it may exacerbate it.

https://www.npr.org/2024/02/28/1234532775/google-gemini-offended-users-images-race

chikim,
@chikim@mastodon.social

NVIDIA announced a new LLM: Nemotron-4 15B. Trained on 8T tokens. Training took 13 days with 3,072 H100s. The model is not available yet, but here's the paper. https://huggingface.co/papers/2402.16819

Lobrien,

Any good sources on what the outputs of the attention blocks in a transformer represent? I expected that for "The bank of the plane took it around the savings bank on the bank of the river", the vectors corresponding to "bank" would diverge -- "rotation things/money things/rivery things" -- but AFAICT that doesn't clearly happen. Here are the dot prods of the normalized vectors (aka "cosine similarity") against themselves after embedding layer and attention block 5:

Heatmap showing identical vectors for identical word embeddings
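
The comparison described in this post, dot products of length-normalized vectors, can be sketched roughly as follows. This is only a minimal illustration: the three toy vectors are invented stand-ins for the "bank" token vectors, not real model outputs, and the function names `normalize` and `cosine_similarity_matrix` are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_similarity_matrix(vectors):
    """Pairwise dot products of the unit-normalized vectors."""
    unit = [normalize(v) for v in vectors]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]

# Invented 3-dimensional stand-ins for the three "bank" vectors (assumption).
bank_vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]

sims = cosine_similarity_matrix(bank_vectors)
for row in sims:
    print(" ".join(f"{s:.3f}" for s in row))
```

Identical input vectors give a similarity of 1.0, which matches the embedding-layer heatmap for the repeated word. If the representations diverged after an attention block, the off-diagonal entries for the "bank" positions would drop below 1.0.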

kellogh,
@kellogh@hachyderm.io

@Lobrien yeah, afaik an embedding model has more than just attention layers

Lobrien,

@kellogh Yeah, there’s a linear layer after the attention is applied. I more or less expected that to swizzle things up, which is why the continued correlation between “rotation bank”, “money bank”, and “river bank” surprises me. I thought they’d diverge (they don’t in a clear way), but that if I swapped in “embankment of the river”, some vector in the transformer-block output would converge with “river bank”. Haven’t written that code yet.

cigitalgem,
@cigitalgem@sigmoid.social

CarMax uses AI to make instant offers on used cars.

"The company also gives customers near-instant offers for their used cars, a capability that is powered by AI."

https://www.wsj.com/articles/corporate-ai-investment-is-surging-to-nvidias-benefit-5611ffc5?utm_source=press.coop

dalias,
@dalias@hachyderm.io

@cigitalgem Oh shit I need to start buying clunkers. 😈

cigitalgem,
@cigitalgem@sigmoid.social

Just delivered the first BIML LLM Risks talk at NDSS in San Diego. Much fun was had!

Getting set up for the talk...

cigitalgem,
@cigitalgem@sigmoid.social

Here are some pictures of yesterday's NDSS workshop talk. We had to move to a bigger room to fit everyone in.

The BLACK BOX LLM FOUNDATION model picture
Building a real LLM is expensive both computationally and dataset wise
McGraw talks LLM risks

cigitalgem,
@cigitalgem@sigmoid.social

The work I talked about at NDSS is available here under a Creative Commons license:

https://berryvilleiml.com/results/BIML-LLM24.pdf
