tomw,
@tomw@mastodon.social avatar

Every so often I see a post about how LLMs fail logic puzzles.

And... yes? Of course they do. The only way an LLM could solve a puzzle is if it has seen that puzzle before, or a substantially similar one. (But that might cause it to give the answer to the similar one, not the correct answer.)

Why is this even tested so often or considered surprising? It is, in essence, an autocomplete. It does not understand logic. It has no concept of a correct answer. It gives the most likely completion.

SteveGodwin,
@SteveGodwin@mastodon.social avatar

@tomw not so much a demonstration of how smart computers are but instead a demonstration of how stupid people are.

pdcawley,
@pdcawley@mendeddrum.org avatar

@tomw @wordshaper because LLMs are consistently oversold as AI, and solving logic puzzles or simple arithmetic problems is the sort of thing that a 'real' AI should be able to do.

tek_dmn,
@tek_dmn@mastodon.tekdmn.me avatar

@tomw Because people don't realize what they are. They see it as "AI" and are trying to test the intelligence part of artificial intelligence.

They're not intelligent. They're, yeah, autocorrect (well, predictive text completion) that has enough grasp of language to attempt to construct correct sentences, without blindly just using some Markov chain of "if the user types artificial, they'll type intelligence next".

It's not just that there's no understanding; there's not even a concept of things like fact, fiction, or logic. But either because it's branded AI, or because you can ask a natural-language question and get a natural-language answer, people think it, well, thinks.

It's scary what that implies. All that passes for computer "intelligence" is natural language I/O.
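The "Markov chain" scheme dismissed above can be made concrete. This is a hypothetical toy sketch, not anything from the thread: a bigram predictor that suggests whichever word most often followed the current one in its training text. Real LLMs condition on far more context than one word, but the training objective is the same kind of thing — emit a likely continuation, with no concept of truth.

```python
from collections import Counter, defaultdict

# Toy bigram predictor: count which word follows which in the training
# text, then "predict" by picking the most frequent continuation.
# The training text here is an invented example.
training_text = (
    "artificial intelligence is not intelligence "
    "artificial intelligence is branded as intelligence"
)

follows = defaultdict(Counter)
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def predict(word):
    """Return the continuation seen most often after `word`, or None."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict("artificial"))  # -> intelligence (pure counting, no understanding)
```

The predictor outputs "intelligence" after "artificial" simply because that pair dominates its counts — which is exactly the point being argued: a likely completion, not a correct answer.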

benjamineskola,
@benjamineskola@hachyderm.io avatar

@tomw I agree, but also: given how much they’re hyped as being actually “intelligent”, it’s no surprise that many people assume that they can actually solve this kind of puzzle.

(On the other hand, I also see this from people who really ought to know better.)

tomw,
@tomw@mastodon.social avatar

@benjamineskola It adds to the illusion by being able to recite the answers to the best-known puzzles (similar to the absurdity about it "passing" exams).

samueljohnson,
@samueljohnson@mstdn.social avatar

@tomw What makes me laugh is when it apologises for basic mistakes.

"You're absolutely right. I am sorry for the confusion."

tomw,
@tomw@mastodon.social avatar

@samueljohnson It will also apologise for being correct if you insist that it is a mistake.

eibhear,

@tomw Nearly every single comment I see from those who aren't involved in designing AIs and LLMs over-emphasises the "I" and under-emphasises the "A". I suspect this is at the core of the surprise that LLMs are shite at problem-solving.

tomw,
@tomw@mastodon.social avatar

@eibhear Yes, the term is a problem. There is no more intelligence in "AI" than in, say, a calculator.

BenAveling,

@tomw @eibhear that’s too strong. AI isn’t intelligence, but for some purposes, it’s a good enough substitute.

tomw,
@tomw@mastodon.social avatar

@BenAveling @eibhear So is a calculator.

Azuaron,
@Azuaron@hachyderm.io avatar

@tomw Because the vast majority of people don't understand LLMs, and even some people who definitely know better keep talking about how it's actual intelligence and that proto-Skynet's an existential threat to humanity.

It's good to show such people that LLMs do not think, do not "know" things.

tomw,
@tomw@mastodon.social avatar

@Azuaron Yes, the existential threat stuff is so far from the reality of fancy autocomplete as to be laughable. It is bizarre that anyone takes it seriously and I don't know what the people pushing it hope to achieve.

I suppose it is on some level useful to demonstrate things like "look, I had a vaccine but I am not magnetic", but it does tend to take the proposition more seriously than is deserved.

Azuaron,
@Azuaron@hachyderm.io avatar

@tomw The existential threat people fall into two camps: grifters and griftees. When Musk, the OpenAI CEO, and their employees talk about it, what they're trying to do is convince people that LLMs are SO POWERFUL, they could even DESTROY HUMANITY. This helps them convince businesses to invest in AI because businesses love power, and doesn't that sound like the pinnacle of power?

This is why I see the logic puzzle tests as a net-good: deflating the "power" myth of LLMs reveals the grift.

tomw,
@tomw@mastodon.social avatar

@Azuaron In that sense I suppose it is a relative of "I tried to use cryptocurrency for its supposed purpose as a currency and it did not go well"

HauntedOwlbear,
@HauntedOwlbear@eldritch.cafe avatar

@tomw I was just wondering exactly this.

I imagine people feel that it's a point that needs proving, but it sometimes feels as though this approach takes the notion that LLMs might actually be able to do this more seriously than it deserves.

tomw,
@tomw@mastodon.social avatar

@HauntedOwlbear Yeah, sometimes people list out simple puzzles it can "solve" and more complex ones where it "fails" and I'm like... no. It cannot "solve" anything. At all.
