jonny, (edited )
@jonny@neuromatch.social avatar

Helping someone debug something; they said they'd asked ChatGPT what a series of bit shift operations was doing. They thought it was actually evaluating the code, y'know, like it presents itself as doing. Instead, its example was a) not the code they'd put in, with b) incorrect annotations and c) even more incorrect sample outputs. They'd been doing this all day and had only just started considering that maybe ChatGPT was wrong.

I was like, first of all, never do that again, and explained how ChatGPT wasn't doing anything like what they thought it was doing. We spent two minutes isolating that code, printing out the bit string after each operation, and they immediately understood what was going on.
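
For the curious, the fix amounted to something like this: a minimal sketch in Python, with made-up values and shifts rather than the labmate's actual code, printing the bit string after each operation:

    x = 0b1011_0100
    print(f"{x:08b}  start")

    x = x >> 2           # shift right: drop the two lowest bits
    print(f"{x:08b}  after >> 2")

    x = x << 1           # shift left: append a zero bit on the right
    print(f"{x:08b}  after << 1")

    x = x & 0b0001_1111  # mask: keep only the low five bits
    print(f"{x:08b}  after & 0b0001_1111")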

I fucking hate these LLMs. Empowerment is learning how to figure things out, how to make tools for yourself and how to debug problems. These things are worse than disempowering, teaching people to be dependent on something that teaches them bullshit.

Edit: too many ppl reading this as "this person is bad at programming" - not what I meant. The criticism is of the deceptive presentation of LLMs.

libroraptor,
@libroraptor@mastodon.nz avatar

@jonny so ... how did one become a code debugger, yet somehow come under the impression that ChatGPT isn't an LLM?

There seems to be an interesting story to uncover there. It genuinely intrigues me that a programmer would think that an LLM evaluates code for you, or that it isn't an LLM.

If we could figure out why a programmer thinks like that, maybe we could find a clue that'll help to rescue the rest of us.

jonny,
@jonny@neuromatch.social avatar
libroraptor,
@libroraptor@mastodon.nz avatar

@jonny It is indeed reasonable! But somehow the illusion is outweighing the facts about how LLMs work internally.

I've been concerned that much of the debunking is also targeted not at the LLM, but at the anthropomorphism – I have colleagues who warn about how it "hallucinates" and "fabricates" and "lies", for instance. But it's not capable of any of that in the usual meanings of those words. And thus I worry that their language choices are making the problem worse.

jonny,
@jonny@neuromatch.social avatar

@libroraptor
I asked them about this! They do indeed see it as a tool, and thought it was doing a semantic code analysis: not running it per se, but something like static code analysis. Which, again, I think is reasonable, because their IDE was showing tooltips with the value of the variable, at least for the initial assignment, so why wouldn't the chatbot be able to do that?

libroraptor,
@libroraptor@mastodon.nz avatar

@jonny I think that it's a brilliant tool. (And my colleagues do not like me to say this.)

But what does your programmer think LLMs do?

I offered a different conceptualisation to my colleagues by giving them Markov chains to play with, but they seemed to think even random prose generators were still creative, thinking agents, albeit of a less intelligent form.

I've also been finding that hardly anyone who complains about AI knows what a huge class of things it is. Language is troubling.

bornach,
@bornach@masto.ai avatar
accretionist,
@accretionist@techhub.social avatar

deleted_by_author

thezoq2,
@thezoq2@mastodon.social avatar

@accretionist @jonny isn't it more like "well-formed question in, garbage out" in this case?

KinkyTurtle,

@accretionist @jonny

It's not even garbage in. ChatGPT is a garbage maker.

WAHa_06x36,
@WAHa_06x36@mastodon.social avatar

@accretionist @jonny Don't need garbage in. You can give an LLM perfectly valid, well-structured knowledge, and it will still output garbage.

level98,
@level98@mastodon.social avatar

@accretionist @jonny In this sense, I feel like LLM is short for "politician".

shaperOfDefiance,
@shaperOfDefiance@mas.to avatar

@accretionist @jonny Everyone needs to understand one important thing about LLMs:

garbage.

hllizi,
@hllizi@hespere.de avatar

@accretionist @jonny is a question about the effect of bit shifts garbage?

accretionist,
@accretionist@techhub.social avatar

deleted_by_author

    hllizi,
    @hllizi@hespere.de avatar

    @accretionist @jonny so you could go with the second part only, garbage out.

    peterrenshaw,
    @peterrenshaw@ioc.exchange avatar

    @jonny “They'd been doing this all day and had only just started considering that maybe ChatGPT was wrong”

    Not useful, just very good at convincing users it knows what it's doing. I found this with GPT-3 in November.

    mnl,
    @mnl@hachyderm.io avatar

    @jonny I am disappointed by both the hype and so much of the criticism, with both sides assuming that the way (or not) to use these models is to ask them to solve problems. On the hype side because their reasoning capabilities, especially for traditional logic, are obviously limited; on the critique side because so much gets lost.

    If you center the user and ask "what is going to empower this user?", it's going to be either learning about bit shifts or solving their problem and moving on.

    1/

    mnl,
    @mnl@hachyderm.io avatar

    @jonny Both are fine; there's tons of stuff I don't want to learn and couldn't care less about.

    If it is debugging, the LLM can assist with writing debugging code and asking questions that can help unblock me. It has never seen this specific problem but has seen a lot of discourse about similar issues.

    It can now help by:

    • asking questions
    • quickly generating boilerplate to help me understand.

    https://chat.openai.com/share/9d893974-c927-45f2-a5a0-695ab04c06e4

    imo this is very empowering, and is respectful of coworkers.

    2/

    mnl,
    @mnl@hachyderm.io avatar

    @jonny If it is learning, then the teaching should be fun and tailored to the student, with adapted exercises.

    I am having a lot of conversations like these, with traditional learning materials open on the side, and never have I felt more confident learning about things I felt I never really "grokked" (two-phase locking, Diffie-Hellman, etc.).

    https://chat.openai.com/share/c2373798-effa-4475-9187-2c189fd5f21c

    There is so much to discover in these models that it's frustrating to see the discussion stuck at "useless liars."

    3/3

    jonny,
    @jonny@neuromatch.social avatar

    @mnl yes, I get it re: giving you a bunch of material to work from, and I also understand there are lots of ways to use these models aside from asking them to solve specific problems. The thing I was criticizing here is how the tool presents itself as able to do everything, when it is only capable of doing a very circumscribed set of things that take a lot of trial and error, intuition, etc. to determine, and even then that determination is largely vibes and can shift at any time in unpredictable ways.

    I am broadly critical of the effect that these tools have on information ecosystems, in particular the displacement of social means of organizing information by private, proprietary, subscription-based access to algorithmically generated information, but that wasn't necessarily the type of criticism here. In the same way that usage can be subtle, so too can criticism, and while my phrasing was glib because I was writing in a moment of frustration, I wasn't saying "everything they produce is wrong." I was saying that I can see how quickly dependence on these tools can build up, and then when they fail it can become very difficult to tell.

    I appreciate the time you took to create some examples for this problem. The thing that makes me sad is that while the exercises and basic information in the examples seem mostly fine, it all becomes entirely decontextualized, is presented as 'neutral' and without any source even though it was mined from many such sources, and flattens my role as a learner from someone navigating a world of social information from peers and colleagues into a recipient of platformatized information. A web search finds me half a dozen explainers and sets of exercises, each with their own style and presentation, in online courses, traditional courses, forums, blog posts, etc. I don't want that world of social information to disappear, where people stop writing and creating things to help everyone else understand because everyone merely turns to their magic oracle for all information. I don't want my ability to understand things to be predicated on my ability to pay one or a few companies for the latest version of their AI model. (It's also wrong in a hard-to-determine way in one of the examples, e.g. it gets the evaluation precedence of && and | backwards, and omits the context a human would certainly give about the language dependence of those operators' precedence.)
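
    As an aside on that precedence point, here is a minimal Python illustration of how easy those mistakes are to make; these are generic one-liners, not the ones from the shared chat, and every language defines its own precedence table:

        # In Python, shifts bind tighter than & and |, all of which bind
        # tighter than the boolean operators; arithmetic binds tighter still.
        print(1 << 2 & 3)    # parsed as (1 << 2) & 3  ->  4 & 3  ->  0
        print(1 << (2 & 3))  # forcing the other reading          ->  4
        print(1 << 2 + 3)    # + binds tighter than <<: 1 << 5    ->  32
        print(0 | 1 and 2)   # parsed as (0 | 1) and 2            ->  2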

    So yes, true, my labmate could have used the tool in a different way that would have been helpful, but the tool itself told them it was capable of helping in the way they asked it to. It is true that there are ways for it to be helpful in general, and that's great, but those applications don't exist in a vacuum either. The question of empowerment is a complex one, but if it's possible for a platform to rip something away from me, with no recourse after I've become dependent on it, then it may be empowering at a very local scale but very much not so at a larger one. You're right that there is a great deal of nuance that gets flattened in an "us vs. them, LLM good vs. LLM bad" mentality, but when I said "I fucking hate these LLMs" it was because all the problems I see them causing add up to an overall perception, where a) I am not condemning every possible application, and b) some helpful application doesn't necessarily change that overall perception.

    mnl,
    @mnl@hachyderm.io avatar

    @jonny I think we're coming from the same angle; the way these models are presented and marketed does them, I think, a huge detriment.

    However, I am much more optimistic about the decontextualization part. While I have made a quick sketch in a vacuum, I find these models most useful to create the "glue" between existing works and my brain, allowing me to actually access them in a way that feels personal. This enriches my context.

    1/

    mnl,
    @mnl@hachyderm.io avatar

    @jonny To get back to the "decontextualized" presentation, it is only decontextualized because I yeeted some random questions that I didn't care about.

    If I had been hunting down a bitshift bug myself, these questions would have been highly personal in a way that no other document could be. (I also usually switch to Bing mode to get external references in cases like these, or prompt the model by pasting in an existing article (my code?) that I want to "chat with".)

    evmcl, (edited )

    @jonny I've said it before, but I think I agree with those who say we are getting hung up on the “intelligence” word, which implies agency. I think a better and more accurate term would be “advanced applied statistics.” Or perhaps "really advanced auto-complete."

    So I think we need a browser extension that substitutes “artificial intelligence” with “advanced applied statistics” and “AI” with “AAS”. Granted, that last one will cause a significant number of double-takes.

    jonny,
    @jonny@neuromatch.social avatar

    Many ppl are feeling the need to explain how this person was just using the LLMs wrong, and that they are good as long as you use them with some undefined prior intuition about when they are wrong. I just wanted to let you know that one has been covered, and you are off the hook if that was going to be your comment.

    JMMaok,
    @JMMaok@mastodon.online avatar

    @jonny

    I was talking with a young code camp graduate. Her team had built a tool where you paste a URL then ask ChatGPT a question about the web page. They were looking to understand ‘Why does it give the right answer sometimes but the wrong answer other times?’

    How to kindly explain why that question was so problematic??? My sense was the code camp instructor was basically pleased that it accepted input and generated output.

    I think about that a lot.

    foolishowl,
    @foolishowl@social.coop avatar

    @jonny There's a lot of discussion on here about how bad ChatGPT is, but off the Fediverse people seem completely unaware of the criticisms.

    Almost every major tech company -- even Mozilla, who I'd expect to be fighting hard against this -- is heavily promoting LLMs, and it turns out advertising works.

    tknarr,
    @tknarr@mstdn.social avatar

    @jonny It sounds more like this person didn't understand the tools they were using (though the presentation doesn't help with that). If they don't correct that, it doesn't bode well for their career.

    The same issue is why, when out cutting wood, we never gave the new guy the chainsaw right off.

    eons,
    @eons@mastodon.gamedev.place avatar

    @jonny education systems are failing at teaching people how to learn; that's why people get so dependent on those "magical" systems

    Walop,

    @jonny I already forgot where I read someone say "LLMs are great for the people who do not need them and the worst for the people who do".

    If you just need inspiration or a nudge in the right direction, they can help you move faster.
    But if you are trying to learn or figure something out for the first time, they are worse than a waste of your time.

    Itty53,
    @Itty53@mstdn.social avatar

    @jonny

    I saw a thread earlier complaining about how tutorials are all in video format now, and I tried to explain: tutorials were never about learning how to do it right. They're the opposite. They're about doing the thing WITHOUT learning how to do it. They should be focusing on documentation instead.

    That's the same pitfall you're highlighting. People want their answers without understanding the problems. It's not distinctly a programming 'thing' either. You'll see that mentality all over the place.

    Dianora,
    @Dianora@ottawa.place avatar

    @jonny
    Same problem as with IDEs: people stop understanding what is going on and lean on a crutch.

    Re the edit: we all often have blind spots, which is where the rubber duck comes in.

    suetanvil,
    @suetanvil@freeradical.zone avatar

    @jonny

    Mood. Also, a simple demo:

    1. On a REPL with bignums (I use Ruby but YMMV), generate a random number of several hundred digits.

    2. Ask ChatGPT to factor it.

    3. Paste the claimed factors back into your REPL (you may need an intermediate editor to fix the syntax) and multiply them together.

    4. Point out that the result is not even close to the original value, and that, furthermore, most or all of the "factors" aren't factors at all.
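
    A minimal sketch of that check in Python (which also has native bignums); the claimed_factors list here is a hypothetical stand-in for whatever the chatbot returns, not real output:

        import math
        import random

        original = random.getrandbits(1024)  # step 1: a number of ~300 digits
        print(original)                       # paste this into the chatbot

        claimed_factors = [3, 7, 104729]      # step 3: hypothetical chatbot answer

        print(math.prod(claimed_factors) == original)         # step 4: almost certainly False
        print([original % f == 0 for f in claimed_factors])   # most won't even divide it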

    feld,
    @feld@bikeshed.party avatar

    deleted_by_author

    jonny,
    @jonny@neuromatch.social avatar

    @feld
    A use case so useful it's literally what tailwind does!

    feld,
    @feld@bikeshed.party avatar

    deleted_by_author

    jonny,
    @jonny@neuromatch.social avatar

    @feld
    It's maybe the most resource-intensive way of doing a find-and-replace, and displacing the culture of open-source toolbuilding (like the many tools for inlining Tailwind that don't also risk hallucination) with consulting the magic subscription Oracle is extremely grim to me, but glad it worked for you.

    Not sure what it working for you in this case has to do with it being presented misleadingly and short-circuiting self-education, but thanks for sharing.

    feld,
    @feld@bikeshed.party avatar

    deleted_by_author

    chris_hayes,
    @chris_hayes@fosstodon.org avatar

    @jonny
    Knowing the limits of your tools is pretty essential. ChatGPT will do things like that.

    But I'm surprised at the comments deriding developers who use it. Like, I know for a fact that I'm at least 30-50% more productive with Copilot + ChatGPT. Sure, I have lost time to bad ChatGPT outputs, but that pales in comparison to the time saved.

    I'd be curious whether he was using GPT-3.5 or GPT-4. LLMs inherently aren't good at math, but there is a noticeable gap in capabilities between GPT-3.5 and GPT-4.

    jonny,
    @jonny@neuromatch.social avatar

    @chris_hayes
    It was GPT-4!

    chris_hayes,
    @chris_hayes@fosstodon.org avatar

    @jonny ah! Looks like GPT-4's mathematical capabilities still have a way to go.

    jonny,
    @jonny@neuromatch.social avatar

    @chris_hayes
    I suppose it is an empirical question whether one can learn, by example, a well-defined natural-language interface to a well-defined symbolic system. I have my bet, but yes, I suppose we'll see.

    triphazard,
    @triphazard@retro.pizza avatar

    @jonny My evaluation has ultimately been that the fundamental problem with these LLMs, at least in terms of the output they give, is that they are designed to give a satisfying answer to whatever is posed to them, even when they can't. So rather than say "I can't answer that", they will instead just invent something that sounds good. They may not know the answer, but they damn well know what an answer looks like, and appearing to answer is preferable to giving a disappointing result.

    AdeptVeritatis,
    @AdeptVeritatis@social.tchncs.de avatar

    @jonny

    Thanks for your post, and a special thanks for calling out the gatekeeping (in the edit).

    jimbob,
    @jimbob@aus.social avatar

    @jonny it's incredible to me that Certified Smart People I work with (you know, PhDs, experience in research, scientific publications, distinguished academic careers) have utterly bought into LLMs and seem to have no understanding of what they actually do.

    davoloid,
    @davoloid@qoto.org avatar

    @jonny I think I'm not using it much because the results are rarely useful straight away. There's extra work required to verify the results, or to work out what's missing from the solution offered... so I might as well not have bothered and saved my time and the CPU cycles.

    SignorMacchina, (edited )
    @SignorMacchina@hessen.social avatar

    @jonny I have been developing software professionally for about 15 years, and this summer was the first time I consulted ChatGPT, asking it to suggest what was needed to implement a desired feature.

    After that I set it aside and worked out by myself how to use and incorporate the suggested parts.

    My intention was not to copy and paste a solution but to find out more quickly where to start.

    It's just a tool.
    You still need to figure things out and validate its outputs, just as you would with banter talk.

    AdeptVeritatis,
    @AdeptVeritatis@social.tchncs.de avatar

    @SignorMacchina @jonny

    It is not just a tool.

    "These things are worse than disempowering, teaching people to be dependent on something that teaches them bullshit."

    SignorMacchina,
    @SignorMacchina@hessen.social avatar

    @AdeptVeritatis @jonny If you are aware of how to retrieve information the classical way, it's like taking the bus instead of walking down the street, so for me, it's a tool. ;)

    AdeptVeritatis,
    @AdeptVeritatis@social.tchncs.de avatar

    @SignorMacchina @jonny

    No, a bus is a public service (in most places). You pay for it directly and/or with your taxes.

    With ChatGPT you step into a robo taxi, which drives you to random places. Might be nice there.

    At least it is free of charge at the moment.

    liw,

    @jonny I want to raise this point in particular: "Empowerment is learning how to figure things out, how to make tools for yourself and how to debug problems."

    https://neuromatch.social/@jonny/111326984855400462

    sesquipedality,
    @sesquipedality@mendeddrum.org avatar

    @jonny LLMs can really help when you are trying to get up to speed with a particular interface, so long as you're aware they lie to you. If you are able to identify the lies and tell them about them, then sometimes they might even generate usable code. The problem comes as soon as you start trying to do something a little bit unusual, and the LLM steps up the pace from "bullshit" to "delusional fantasy".

    Absolutely agree with the problem being the dishonest presentation of results.

    PurpleShadow,
    @PurpleShadow@eldritch.cafe avatar

    @jonny When I was playing with ChatGPT in the past, it even began making up words that didn't exist. I never told it to do so.

    nazokiyoubinbou,
    @nazokiyoubinbou@mastodon.social avatar

    @jonny It's sad really. LLMs do have a lot of potential. For example, the basic mechanism was just used as part of a multi-pronged effort to figure out how some bacteria are able to go into a sort of hibernation and survive extreme conditions for decades or longer. This may come in handy for finding a way to stop harmful ones. They can be useful when used right. It's really sad watching them be used wrongly instead. ChatGPT and similar just aren't AI and need to stop pretending to be.
