@steely_glint@saghul
Yeah, it keeps giving. While it can answer correctly if the question is copy-pasted from the first picture, it shits itself if the question is modified slightly.
It looks like developers are fixing it on the fly.
Now, can we all stop feeding the monkey please? ChatGPT and its stupid, useless ilk help burn the world with their energy usage. It is known. And decidedly not funny.
@fedithom It isn't 'known' everywhere. Over on LinkedIn everyone is talking about how LLMs will revolutionise X, Y and Z. I felt it worth a single query to see if I could get the message across that LLMs are more of a party trick than a solution to any actual problem.
Also, are you Canadian? I've only ever heard a Canadian say "thank you kindly".
Who remembers the Prolog programming language? It was intended for building a machine-readable knowledge base in a way that would allow it to answer the above question correctly.
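For anyone who hasn't seen the style: a rough sketch of the idea, in Python rather than Prolog, so treat it as an illustration and not as what a real Prolog knowledge base looks like. The `Item` type and the `heavier_answer` function are made-up names. The point is only that the rule compares the stated masses and never looks at the substance or the word order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A fact in the toy knowledge base (hypothetical structure)."""
    substance: str
    mass_kg: float


def describe(item: Item) -> str:
    # :g drops a trailing ".0", so 2.0 is shown as "2"
    return f"{item.mass_kg:g}kg of {item.substance}"


def heavier_answer(a: Item, b: Item) -> str:
    """Rule: whichever item has the greater mass is heavier.

    The substance plays no part in the comparison.
    """
    if a.mass_kg > b.mass_kg:
        return f"{describe(a)} is heavier than {describe(b)}"
    if b.mass_kg > a.mass_kg:
        return f"{describe(b)} is heavier than {describe(a)}"
    return f"{describe(a)} and {describe(b)} weigh the same"


if __name__ == "__main__":
    feathers = Item("feathers", 2)
    lead = Item("lead", 1)
    # The order of the question does not change the answer.
    print(heavier_answer(feathers, lead))
    print(heavier_answer(lead, feathers))
```

Swapping the order of the arguments gives the same sentence, which is exactly the property the chatbot answers in this thread lack.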
@kravietz
I remember Prolog - one of the research groups at Uni used it to model some British legislation. They struggled with counterfactuals. "If your grandfather had been alive in 1970, would he have been a British citizen?" required them to build a whole duplicate ruleset with a resurrected grandfather to answer the question, capture the result in the current rules, then discard the alternate universe before carrying on. ;-)
@kravietz They also had a sign over their door saying "abandon all Hope ye who enter here" as a dig at the other group who were working on a transputer-friendly functional language called Hope.
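Roughly what that dance looks like, as a toy Python sketch and not the group's actual system. The facts, the `is_citizen` rule and the `counterfactual` helper are all invented for illustration, and the rule is deliberately simplistic; it is not real British nationality law.

```python
import copy

# Current-world facts (hypothetical data).
facts = {"grandfather": {"born_in_uk": True, "died": 1965}}


def alive_in(person: dict, year: int) -> bool:
    """A person is alive in `year` if they have no death year or died later."""
    died = person["died"]
    return died is None or died > year


def is_citizen(person: dict, year: int) -> bool:
    """Toy rule, NOT real law: born in the UK and alive in the given year."""
    return person["born_in_uk"] and alive_in(person, year)


def counterfactual(world: dict, name: str, changes: dict, query):
    """Answer `query` in a duplicated world where `name` has `changes` applied."""
    alternate = copy.deepcopy(world)   # build the duplicate universe
    alternate[name].update(changes)    # resurrect the grandfather
    result = query(alternate)          # ask the question over there
    return result                      # the alternate world is dropped on return


if __name__ == "__main__":
    answer = counterfactual(
        facts,
        "grandfather",
        {"died": None},
        lambda w: is_citizen(w["grandfather"], 1970),
    )
    facts["grandfather"]["would_have_been_citizen_1970"] = answer  # capture result
    print(answer)
```

The deep copy is what keeps the real facts untouched: only the captured answer makes it back into the current rules.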
@bjoernd@saghul yep, the order matters. Presumably because it matches more of the "feathers or lead" text out there, whereas "lead or feathers" has fewer matches and so the numbers take precedence.
@quantensalat@saghul Yep, elsewhere in the comments someone pointed out that ChatGPT 4 gets it right (ChatGPT 3 does not). We were discussing whether that's a fix for this riddle or a re-weighting of the importance of numbers in comparisons.
2kg of feathers is heavier than 1kg of lead. Even though the feathers take up more space, their total mass is greater than the lead's.
To clarify, the term "heavy" can refer to either an object's mass (the amount of matter it contains) or its density (the amount of mass per unit volume). In this case, the 2kg of feathers has a greater mass, but the 1kg of lead is denser. Let me know if you have any other questions!
@steely_glint@saghul It makes me wonder if they specifically manipulated the training data for GPT4 so it answers that correctly now... or if it learned to do so with no specific manipulation to the data set...
@solarisfire@saghul Could be a halfway house where they upped the weighting of numbers in questions that involve comparisons. Essentially that is what is 'wrong' with 3.5: it allows the volume/placing of blather in the text to outweigh the difference between 1 and 2.
@the_moep Kinda sure, since it replaces kg with kilograms and does the right thing with the plurals - unless there is an American meaning of kilograms that translates as 'falls faster' :-)
@steely_glint@the_moep But it doesn't understand it, it just has strong matches statistically between kg and kilogram. That's it. It doesn't know what they mean. It just generates words based on statistical models based on what humans wrote (though now it's probably choking on its own products). It's like trying to work out the height of a particular person by getting the average of all the heights of all the people.
@steely_glint@saghul LLMs are "sound good" text generators. Any meaning of the text must be supplied by a human. If the human "author" doesn't supply meaning, it's devoid of all content and making sense of it is left as an exercise to the reader, by "reading into it" and interpreting the nonsense. It's useless to use such a text to communicate.
This is effectively like shuffling the cards and playing tarot: You get rid of any "author" who could add meaning. If you instead have a human take the cards, assign meanings to them beforehand and use them to send a message, you'd actually be able to communicate (although it would be a bit cumbersome).