hosford42,
@hosford42@techhub.social avatar

I am really, really, REALLY irritated by what I just saw. The image description function of Microsoft's Bing is outright lying to people with vision impairments about what appears in the images it receives. It's bad enough when an AI is allowed to tell lies that a person can easily check for veracity themselves. But how the hell are you going to offer this so-called service to someone who can't check the claims being made and NEEDS those claims to be correct?

How long till someone gets poisoned because Bing lied and told them that expired food hasn't expired, or that cleaning solution is safe to drink, or God knows what? This is downright irresponsible and dangerous. Microsoft either needs to put VERY CLEAR disclaimers on this service, or just take it down until it can actually be trusted.

simon,

@hosford42 @AparnaSachdev@disabled.social I would much rather have a disclaimer, because when it does get things right, it gets them very, very right, and has interpretation abilities that far outstrip anything else that exists. If you're smart about it and really understand the limitations of what you're using, it can be an irreplaceable tool (short of calling an actual person on video). I would rather carefully use something that half-works now than wait years until everyone is good and satisfied that it's safe to use in all situations.

AparnaSachdev,

@simon @hosford42 I agree with you. I use it myself; it’s just that I’d feel better about it if there were a disclaimer saying some variation of “use with caution”.

objectinspace,
@objectinspace@freeradical.zone avatar

@hosford42 What does "until it can actually be trusted" mean? Trusted by whom? According to what metrics? Agreed that they should put up a handwavy NUX notice telling people the results may not be correct, but apart from that I don't think they can do much in the short term. It is an industry-wide problem. Would you rather they remove the feature indefinitely? I would not. It's extremely useful and something we've been asking for for a long time.

hosford42,
@hosford42@techhub.social avatar

@objectinspace No, I think it's clear that people find it useful and there is an enormous demand for it, even if it isn't reliable. They just need to be clear about the (lack of) reliability of the model. It's as simple as measuring the rates and types of mistakes and posting them clearly. Measuring these things is standard practice in ML. Posting the figures should be, as well.

Knowing whether a model can be trusted is as simple as setting some reasonable standards for trustworthiness, in terms of these error rates and error types. A model can be certified to have an error rate under a certain level, or else it should be provided with clearly posted disclaimers if these standards are not met. A reasonable error rate would be based on a real human's typical performance, which can be measured for comparison.
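[Editorial sketch] A minimal illustration of what measuring and publishing those error rates could look like, assuming a hand-labeled evaluation set and a measured human baseline. Every name, error category, and figure below is an illustrative placeholder, not anything Microsoft actually publishes.

from dataclasses import dataclass

@dataclass
class EvalItem:
    image_id: str            # identifier of the test image
    model_claim: str         # what the model said about the image
    is_correct: bool         # human judgement of that claim
    error_type: str | None   # e.g. "wrong object", "wrong date"; None if correct

def error_report(items: list[EvalItem], human_error_rate: float) -> dict:
    # Overall error rate, per-type breakdown, and a pass/fail flag against
    # a human baseline measured on the same evaluation set.
    errors = [it for it in items if not it.is_correct]
    by_type: dict[str, int] = {}
    for it in errors:
        key = it.error_type or "unspecified"
        by_type[key] = by_type.get(key, 0) + 1
    rate = len(errors) / len(items) if items else 0.0
    return {
        "error_rate": rate,
        "errors_by_type": by_type,
        "meets_human_baseline": rate <= human_error_rate,
    }

# Illustrative usage: three labeled items, hypothetical 5% human baseline.
items = [
    EvalItem("img-001", "Milk, best before 2024-05-01", True, None),
    EvalItem("img-002", "Bottle of drinking water", False, "wrong object"),
    EvalItem("img-003", "Expiry date 2023-11-30", True, None),
]
print(error_report(items, human_error_rate=0.05))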

fastfinge,

@hosford42 @noahcarver Checking the claims isn’t that hard. Just ask the same question a few different ways: When does this expire? Is this expired? How long until this expires? If the answers don’t match, it’s telling lies. If I’m at the point where I’m asking an LLM, it’s a last resort before I sniff and guess. The LLM is at least better than that. Another option is always better than no options at all. And until all packages have accessible QR codes, OCR or an LLM or random guessing are all I have sometimes. Don’t worry, liability concerns from folks like you will soon deny me even this tiny win. Without offering anything better, obviously.
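[Editorial sketch] A rough illustration of that cross-checking habit in code. ask_model is a hypothetical stand-in for whatever image-question service is actually being called, and the agreement test is a deliberately crude string comparison, not a real verification method.

def ask_model(image_path: str, question: str) -> str:
    # Hypothetical placeholder for the real vision/LLM call.
    raise NotImplementedError("wire this up to the actual service")

def answers_agree(image_path: str, questions: list[str]) -> tuple[bool, list[str]]:
    # Ask several rephrasings of the same question and report whether the
    # normalized answers match. Exact matching is crude; in practice the
    # rephrasings should all expect the same kind of answer (e.g. a date).
    answers = [ask_model(image_path, q).strip().lower() for q in questions]
    return len(set(answers)) == 1, answers

# Example: two date-focused rephrasings of the expiry question.
# If the answers disagree, treat the result as untrustworthy.
# ok, answers = answers_agree(
#     "soup_can.jpg",
#     ["What is the expiry date printed on this?",
#      "What date does this product expire?"],
# )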

bryansmart,

@fastfinge @hosford42 @noahcarver Ha! You said it. How long until some blind asshole is in court, crying that AI told them the rancid food was safe, and blind people are too helpless to figure that out or know better, "so give me umpteen million dollars!" No company can assume liability for that, either for AI or live agents, so they'll probably mod description features to be useless to blind people, and specifically prohibit that use in their TOS. Thanks!

fastfinge,

@bryansmart @hosford42 @noahcarver Yup. Cops and companies will go right on using them for data mining though.

noahcarver,

@fastfinge @bryansmart @hosford42 I understand both points. I’m not an engineer and not knowledgeable about large language models, however I do think that companies offering LLMs should, if/whenever possible, implement reasonable mechanisms that can reduce the possibility that their LLM might tell fibs. With that being said, users should know that LLMs do lie and should know basic methods for sussing out these falsehoods.

bryansmart,

@noahcarver @fastfinge @hosford42 The problem is the LLMs don’t know they’re telling lies. They don’t know what is real or not. People have incorrect expectations of them, partly due to marketing people promoting statistics as intelligence.

noahcarver,

@bryansmart @fastfinge @hosford42 So, you're saying there’s no way of automatically cross-checking an LLM's output for errors and fibs?

bryansmart,

@noahcarver @fastfinge @hosford42 No, there is not. Definitely not for a model that can practically output anything. And, remember, they aren’t fibs. The LLM isn’t trying to lie. It doesn’t know the truth, or what’s real, or any facts. Everything is a complete and total guess, based on how it adjusted to what it encountered during training.

hosford42,
@hosford42@techhub.social avatar

@bryansmart @noahcarver @fastfinge There are ways to mitigate the chances of BS coming out of the model, and these will continue to improve over time, but until machines really understand the world (and probably even after that) there will always be the chance of the model being wrong.

hosford42,
@hosford42@techhub.social avatar

@bryansmart @noahcarver @fastfinge And this pinpoints my real issue. If it's a useful tool, it should be available. But companies should be open and honest about the capabilities and trustworthiness of the tools. I don't want to take anything people actually want or need away. I just want them to be real about it if they're going to put it out there.

cassius1213,

@bryansmart @noahcarver @fastfinge @hosford42

Marketing people making material misrepresentations regarding easily misunderstood, pseudo-black box technology that they fundamentally don't understand themselves... you don't say...

fastfinge,

@noahcarver @bryansmart @hosford42 If we had demanded perfection from OCR companies in the 90s, we just wouldn’t have OCR today. The scanning system that got me through university was only ever 67 percent accurate. But without it, I couldn’t have gone to university at all. How many opportunities are AI Luddites denying us today, for our own good?

hosford42,
@hosford42@techhub.social avatar

@fastfinge @noahcarver @bryansmart To be clear, I am not an AI luddite. I do ML for a living, so that would be a very odd position for me to take. We shouldn't expect or demand perfection before stuff is made available. The perfect is the enemy of the good, and all that. But people are coming to this tool, believing the media hype about LLMs being just about miraculous. That bubble should be burst right up front before you get to use the tool.

bryansmart,

@hosford42 @fastfinge @noahcarver Yes, and that goes for all of them. A recurring type of post is the "look what nonsense ChatGPT made up when I asked it this thing." That’s great. It needs to become a meme or a big popular punchline. Everyone would stop thinking of them as "thinking computers", and start thinking of them as useful and entertaining bullshit machines.

fastfinge,

@hosford42 @noahcarver @bryansmart Also to be clear, what I'm responding to is the tone in your post that seems to encourage AI doomerism. Talking about not doing things, or legal disclaimers, reinforces some of the false narratives put out by the larger AI companies about government regulation, which they only want so they can restrict open source and advantage themselves. If blind people are ever going to get an AI that works for us, we must have the ability to twiddle the knobs. For example, no commercial AI will describe NSFW images to blind people under any circumstances.

hosford42,
@hosford42@techhub.social avatar

@fastfinge @noahcarver @bryansmart Ah, that makes sense. That was not my intent at all. I'm 100% on board with tools like this being fully under the control of the users. I'm pro open-source and contribute regularly. I was just angry at what I witnessed, which was people being lied to and manipulated and put at risk, with no obvious reason to doubt what they were being told. I want more power in the end-user's hands, not less.

fastfinge,

@hosford42 @noahcarver @bryansmart The trick is to express these ideas carefully. I suspect the day is coming, in a matter of weeks or months, not years, where I take a photo of something and get "This appears to be medication or food. As a large language model, I can't answer questions about this photo." The same way Bing can't describe people to me, because all faces are blurred, and no AI is permitted to interact with NSFW images in any way. Anthropic V2, for example, already won't describe the visual appearance of famous adult film stars and supermodels, and they've added a moderation API to V1 to disallow those responses as well. Even just mentioning the name Jenna Jameson will trigger the moderation, for example. That's a really trivial and silly example. But these things are coming, and formerly useful tools are going to be limited.

bryansmart,

@fastfinge @hosford42 @noahcarver LLMs may have read racisms somewhere, so can't risk them describing people and accidentally letting one out. Or saying something sexy if too much skin is visible. Or getting facts wrong, so they can't give facts. Or not knowing what "current opinion" is on something, so no opinions. Or might promote stereotypes by telling stories, so no telling stories. Or might forget something important, so no summarizing. Or...

fastfinge,

@bryansmart @hosford42 @noahcarver The HubSpot AI is my favourite example of this. They have it so restricted that you have to command it like Siri, even though the back-end is OpenAI. And yet, it's equally inaccurate. It's just blander about it, or instead of making up data, it claims that data you know is directly in its context doesn't exist.

bryansmart,

@fastfinge @hosford42 @noahcarver Not sure if it's mostly attributable to companies afraid of lawsuits over inaccuracy, or censorious assholes, or political assholes who can't get it to be an ideological parrot, or weak assholes who demand everything be made completely safe for them, but, every time I get a roadblock response from an LLM, I hate them all a little more, and want dark things to happen to them.

fastfinge,

@bryansmart @hosford42 @noahcarver It's all about the lawsuits. The only thing any corporation cares about is "line go up". If they cared about politics in any way other than "how can we make line go up more", the internet would be way more censored than it is now. But allowing users to post objectionable content can still make them money. Allowing their AI to generate it makes them less money and exposes them to more risk. If a user posts something bad, they can just turn all that user's personal information over to whatever government is complaining, and wash their hands of it. That's not possible with an AI.

bryansmart,

@fastfinge @hosford42 @noahcarver "line go up" is the bottom line, but all that other is related. "Hello. As a member of the missing-left-finger community, me and my 5,000 social media followers demand you stop traumatizing us by ever mentioning left fingers. That's hate speech, and we'll sue you in 6 different states. Please don't force us to bring this to the media. Never mention left fingers again, and we won't deplatform you."

hosford42,
@hosford42@techhub.social avatar

@fastfinge @bryansmart @noahcarver This is why I'm a fan of consumer cooperatives. They are companies motivated to make their customers happy instead of increasing profits, because any profits would just go back to the people they were taken from.

bryansmart,

@hosford42 @fastfinge @noahcarver Cooperatives sound nice. I joined a tech workers' cooperative once. It was like herding cats. The workers were essentially equal in authority. Everyone had their own idea of what would be best. They ignored the leaders and did what they wanted. If anyone got the slightest pushback, they were like "if I can't do it my way, I'm gone." I realized top-down authority may be the only way to get somewhere in business.

fastfinge,

@bryansmart @hosford42 @noahcarver I think that might be a problem with tech workers. These are the kind of people who would rather write encryption and date-handling code from scratch than use a library, because they know best. I include myself in that; on the whole, tech workers are some of the worst people at cooperating with one another. See also: the 923,485 open source versions of every single tool on GitHub, all of them doing exactly the same thing differently.

hosford42,
@hosford42@techhub.social avatar

@bryansmart @fastfinge @noahcarver That's just tech workers in general. lol My old boss used to complain about herding cats all the time, because we had exactly the same attitude. It has nothing to do with cooperatives. I am chuckling at the memories now.

hosford42,
@hosford42@techhub.social avatar

@bryansmart @fastfinge @noahcarver All they need to do is make it very clear up front how spectacularly and frequently the tool can fail. A clear disclaimer is enough. But they didn't do that. I know they didn't provide an adequate disclaimer, because I just had to tell several people that the tool was lying to them, when they thought it could be trusted.

Superfreq,

@hosford42 I agree with adding the disclaimers, but this was always going to happen and it is going to happen with the one from Be My Eyes as well. It really is up to the end-user to either choose less critical applications to use it on, or use all other available senses to confirm.

hosford42,
@hosford42@techhub.social avatar

@Superfreq It is on the user, but I don't think it should be.

Superfreq,

@hosford42 We can't have our hands held forever, and we shouldn't deny this technology to the 90% of people who will use it mostly right. As I said, I agree with disclaimers and realistic claims, but beyond that the technology is just going to have to improve through usage, and users are going to have to get used to not trusting AI completely. I don't think most people are going to use it on a critical application the first time, and the results will likely speak for themselves.

callionica,
@callionica@mastodon.social avatar

@Superfreq @hosford42 We don’t allow restaurants to give their customers food poisoning just because the food is cheap. We don’t give the restaurants a pass because not everyone who ate there got sick. And we don’t blame people that get food poisoning because they should have known that restaurants sometimes give people food poisoning.

Tech companies are promoting these products like they’re mass market reliable tools that anyone can use. They’re not though are they?
