jeffjarvis,
@jeffjarvis@mastodon.social avatar

Sigh. LLMs don't hallucinate. They assemble words. They know nothing, understand nothing. They are being misused by tech platforms and misrepresented by media.
When A.I. Chatbots Hallucinate https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucinatation.html?smid=tw-share

simon,
@simon@simonwillison.net avatar

@jeffjarvis An interesting thing about that article is that "When did The New York Times first report on AI?" is a question that's almost the perfect case of something a language model will be unable to answer correctly

Explaining why that is to people is so hard though!

One of the big skills involved in using this tech productively is developing robust instincts in terms of what questions it will likely answer well and what questions it's going to blatantly mess up
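
A concrete way to build those instincts is just to probe the model with that kind of archive-lookup question and check what comes back. A minimal sketch, assuming the official openai Python client (v1+), an API key in the environment, and an illustrative model name:

```python
# Hedged sketch: ask exactly the kind of archive-lookup question an LLM
# tends to get wrong. Whatever date it returns should be treated as
# unverified until checked against the actual NYT archive.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name, not the original poster's setup
    messages=[{
        "role": "user",
        "content": "When did The New York Times first report on AI?",
    }],
)
print(response.choices[0].message.content)
```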

simon,
@simon@simonwillison.net avatar

@jeffjarvis A key message I've been trying to get across to people is that, while this stuff looks trivially easy to use, it really isn't - it takes a great deal of experience and knowledge to use it in a way that avoids the many, many traps and pitfalls

Once you climb that steep and invisible learning curve it's one of the most powerful new tools I've ever encountered - but figuring out how to help guide people there is proving really difficult

22,

@simon @jeffjarvis this is the message of the classic xkcd 1425: “In CS, it can be hard to explain the difference between the easy and the virtually impossible”

As a developer who works daily with very highly skilled and technical people (financial asset traders), I'm continually surprised by how opaque my work is to them. The gentle "I'm not sure how long this will take but…" applies just as often to a request that's a one-line change as to one that needs three teams coordinating.

larsmb,
@larsmb@mastodon.online avatar

@22 @simon @jeffjarvis Looks like someone paid for the research team during the last 9 years then ...

jakob,
@jakob@pxi.social avatar

@simon @jeffjarvis I don't think it takes extraordinary skill - or at least it would not need to, if the mental models of people who drive public discourse were better informed.

Of course you can't understand or explain a thing well if the metaphor you're using to describe it is ill-suited and leads you down unhelpful paths.

https://pxi.social/@jakob/110283919520935594

jakob,
@jakob@pxi.social avatar

@simon @jeffjarvis incidentally, "answering" is already an unhelpful frame.

To answer a question, as an act of human-like communication, an interlocutor needs to access a knowledge model that binds truth values to propositions. An LLM cannot answer a question in that sense.

Some prompts can generate useful outputs, though. That's how I would try to approach defining sensible constraints.

simon,
@simon@simonwillison.net avatar

@jakob @jeffjarvis that's an illustration of the problem: despite lacking a knowledge model, they CAN answer many questions really effectively - they do that all the time

I'm not convinced that trying to explain to people that what they see LLMs do every day isn't technically correct or possible is a useful path

jakob,
@jakob@pxi.social avatar

@simon @jeffjarvis it is epistemically false to claim there is a schema of question and answer that LLMs can partake in. LLMs generate outputs from large-N data pools, stochastic training algorithms, and prompts. That you frame the input prompt as a question is a choice.

People see things all the time where an initial model of what they see is incongruent with the underlying causes. That's where journalists and Benoit Blanc come in. Hopefully only one of them is fictional.
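
To make the "it just continues text" point concrete, here is a minimal sketch using the Hugging Face transformers library with GPT-2 (model choice and prompts are purely illustrative): the model produces a stochastic continuation whether or not the prompt is phrased as a question.

```python
# Hedged sketch: a prompt is just conditioning text; a question is one kind of it.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

for prompt in ["When did The New York Times first report on AI?",
               "The New York Times first reported on AI"]:
    inputs = tokenizer(prompt, return_tensors="pt")
    # do_sample=True makes generation stochastic: same prompt, different outputs.
    output = model.generate(**inputs, do_sample=True, max_new_tokens=20,
                            pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```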

simon,
@simon@simonwillison.net avatar

@jakob @jeffjarvis I understand how they work - the thing I find so interesting about them is how many things they manage to be useful for despite the very clear flaws and limitations in the way they are built

sheldonrampton,

@simon @jakob @jeffjarvis I find the same thing is the case with human beings. Despite the very clear flaws and limitations in the way WE are built, we still often manage to be useful and occasionally even inspired.

jsit,
@jsit@social.coop avatar

@jeffjarvis Calling them “AI” in the first place is a big part of the problem.

profcarroll,
@profcarroll@federate.social avatar

@jeffjarvis Classic both-sides trick from the Grey Lady. Use the bad but industry-standard term, but then include the sentence that "critics" oppose using the term because it's bad and misleading.

robertklaschka,
@robertklaschka@mastodon.online avatar

@jeffjarvis I've come to the conclusion that using words like hallucinate is part of a deliberate effort to make people think of these technologies as similar to human intelligence. Similar to the vast amount of time spent making robots dance so we anthropomorphise them.

Kraemer_HB,
@Kraemer_HB@mastodon.social avatar

@robertklaschka It is in the best interest of the publisher that journalists paint LLMs as fully capable of taking their positions as journalists. And some journalists seem to hallucinate about LLMs, ironically. @jeffjarvis

KatEmm,

@robertklaschka @jeffjarvis it also makes what the AI is doing seem deeper and more interesting... it's having a mental adventure! Instead of more negative anthro language like: it's messing up, it's bullshitting.

_L1vY_,
@_L1vY_@mstdn.social avatar

@KatEmm @robertklaschka @jeffjarvis

Right? It's not a hallucination. The device is generating a statistically likely word from its wide samples of content.
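
A toy sketch of what "statistically likely next word" means, far simpler than a real LLM but in the same spirit (the corpus here is made up): count which words follow which in the training text, then repeatedly sample the next word from those counts.

```python
# Toy next-word predictor: sample the next word by how often it followed
# the previous word in a (tiny, made-up) corpus.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the cat ate the fish".split()

following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

word, output = "the", ["the"]
for _ in range(8):
    word = random.choice(following.get(word, corpus))  # fall back if no successor
    output.append(word)
print(" ".join(output))
```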

StarkRG,

@KatEmm @robertklaschka @jeffjarvis Or, more realistically, "It was trained by a group of idiots."

jeffjarvis,
@jeffjarvis@mastodon.social avatar

@StarkRG @KatEmm @robertklaschka
Yes, namely all of us.

_L1vY_,
@_L1vY_@mstdn.social avatar

@robertklaschka @jeffjarvis
Those dancing robots got people to accept ultraviolent police machinery out on our streets in broad daylight 😤

whynothugo,
@whynothugo@fosstodon.org avatar

@jeffjarvis I also know nothing, understand nothing. I just assemble words into patterns that I’ve observed from other humans. It’s not that I think that AI is super smart, it’s that I think we’re overestimating how smart humans actually are.

morri,

@jeffjarvis one article called them "elaborate text predict"

atatassault,

@jeffjarvis

Yeah, the word hallucinate is terrible. Luke Lafreniere (that's probably wrong, I'm terrible with French names) from Linus Tech Tips called it "LLMs are Confidently Wrong", a.k.a. bullshitting.

costrike,

@jeffjarvis When the NYT Aids and Abets

seth,

@jeffjarvis Spot on

ids1024,
@ids1024@fosstodon.org avatar

@jeffjarvis "Figuring out why chatbots make things up and how to solve the problem has become one of the most pressing issues facing researchers as the tech industry races toward the development of new A.I. systems."

Yeah, I don't know if "why chatbots make things up" is that big of a question. Of course language models will do that. It's really a bigger question why they give an acceptable answer more often than one might expect.

tanepiper,
@tanepiper@tane.codes avatar

@jeffjarvis Yep - I built @StochasticEntropy and asked OpenAI to confirm it's not returning anyone else's responses from my empty prompts - they confirmed it's just stochastic chains

martinvermeer,
@martinvermeer@fediscience.org avatar

@jeffjarvis Not untrue, but only meaningful to the extent that it doesn't also apply to lots of human thinking, or 'thinking'.

grumpasaurus,
@grumpasaurus@fosstodon.org avatar

@jeffjarvis anthropomorphism at its finest

Aradayn,
@Aradayn@mastodon.social avatar

@jeffjarvis
I've seen people use the word "confabulate" as an alternative.

I prefer "fabricate," and I think it's better to apply it to all of their output, not just what we classify as "mistakes."

azizhp,

@jeffjarvis fair enough, but the term was coined by LLM researchers themselves and predates the (unexpectedly) public release of ChatGPT. Like "temperature", the term has a specific and unique meaning in this context.
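
For readers who haven't met "temperature": it's the knob that rescales the model's output distribution before sampling. A minimal sketch with made-up logits for three candidate tokens (none of this comes from the article): low temperature makes the most likely token dominate, high temperature flattens the choice.

```python
# Hedged sketch of sampling temperature; the logits and token strings are invented.
import math

logits = {"1956": 2.0, "1958": 1.0, "2023": 0.1}

def softmax_with_temperature(logits, temperature):
    scaled = {tok: v / temperature for tok, v in logits.items()}
    total = sum(math.exp(v) for v in scaled.values())
    return {tok: math.exp(v) / total for tok, v in scaled.items()}

print(softmax_with_temperature(logits, 0.2))  # sharply peaked: near-deterministic
print(softmax_with_temperature(logits, 2.0))  # much flatter: more random sampling
```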

alec,
@alec@perkins.pub avatar

@jeffjarvis I agree generally, but I like @simon’s position that it’s more productive to focus on straightforward communication of the harms instead of getting caught up in lower priority problem of anthropomorphism. https://simonwillison.net/2023/Apr/7/chatgpt-lies/

ankitpati,
@ankitpati@mastodon.social avatar

@jeffjarvis I understand your sentiment, but I also agree with the label of “hallucination” as commonly applied to LLMs.

We’ve been anthropomorphising technology long before LLMs.

Computer “viruses,” “worms,” and “Trojan horses” come to mind. None of these are “real” in the physical sense, but they convey the concept accurately to someone familiar with the physical objects these words describe.

Jfillian,
@Jfillian@twit.social avatar

@jeffjarvis the busy folk over at snopes.com are going to be overwhelmed.

ravigupta,
@ravigupta@mastodon.social avatar

@jeffjarvis But they are acing medical tests somehow?

simon,
@simon@simonwillison.net avatar

@ravigupta @jeffjarvis Most of those "it got an A on this test!" things are multiple choice questions

Multiple choice questions are the kind of thing a language model can be incredibly effective at - much, much easier than "when did the NY Times first mention AI?"
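
One common way multiple-choice benchmarks are scored shows why: the model never has to produce the fact, only to assign a higher likelihood to one of a handful of supplied strings. A rough sketch with transformers and GPT-2 (the question, options, and scoring shortcut are illustrative, not how any particular exam was run):

```python
# Hedged sketch: pick the multiple-choice option whose text the model finds most likely.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "The capital of France is"
options = ["Paris", "London", "Berlin", "Madrid"]

def option_score(question, option):
    # Mean log-likelihood per token of "question + option" under the model.
    ids = tokenizer(f"{question} {option}", return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return -loss.item()

print(max(options, key=lambda o: option_score(question, o)))
```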

Enema_Cowboy,
@Enema_Cowboy@dotnet.social avatar

@jeffjarvis So they're basically the Sergeant Schultz of technology?

coffe,

@jeffjarvis exactly.

And it's something we absolutely need to understand.

https://fosstodon.org/@coffe/110292695353936488

jeber,
@jeber@mastodon.social avatar

@jeffjarvis
Sometimes we need to not anthropomorphize everything.
