dfeldman,
@dfeldman@hachyderm.io avatar

If you feed AI an MRI, it will happily write a detailed and very convincing diagnosis...

even if the patient is a dead salmon.

it's a salmon

singe,
@singe@chaos.social avatar

@dfeldman oh my.

xs4me2,
@xs4me2@mastodon.social avatar

@singe

For freak's sake…!!!

singe,
@singe@chaos.social avatar

@dfeldman after I corrected it, it got the reference.

caiocgo,
@caiocgo@mastodon.social avatar

deleted_by_author

  • lerxst,
    @lerxst@az.social avatar

    @caiocgo @singe @blogdiva @dfeldman I got one of these so-called AIs to write a timeline of mobile telephones. It didn't have the full history, so when I told it about Carterfone it gave me the "My apologies" line, and I kept correcting it with progressively more ludicrous statements and it never once called me out. By the end, I had it telling me about Alcibiades inventing the first mobile phone to help win the Peloponnesian War.

    remenca,
    @remenca@mastodont.cat avatar

    @dfeldman "If I do this thing which the system was not designed to process, then it breaks. Look how awesome and smart I am". Great.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @dfeldman this is actually a very important criticism — systems should not work on "garbage in, garbage out". As far as possible, they should work on "garbage in, error message out". That the system is not only incapable of spotting the mistake but actually affirms, in its first sentence, that the input was correct means you can't trust it. If its only failure mode is to confidently make an incorrect diagnosis, then how can you ever trust that anything it says isn't just that?

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @dfeldman There are plenty of systems that do not report "error message out" and are still useful. It reminds me of that story about the old woman who put her cat in a microwave oven, and now all microwave ovens come with explicit instructions not to put cats in them. GPT-4 has a big warning on its front page too about not trusting it. Yet the author tries anyway and feigns surprise when the expected outcome occurs. This is not about the system but the user.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @dfeldman this is the issue though, right? A microwave is marketed as a cooker, AI is marketed as a general purpose trustworthy answer machine. It has the "do not trust the answer" disclaimer only for the same reason that psychic hotlines have a "for entertainment purposes only" disclaimer — so they can blame the user when something goes wrong. If you aren't meant to pay any attention to its answers, then what is the point of it?

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman the real answer, if you read the model cards, is more complex, too much for your average user. To put it simply though, depending on which model is in use, what you'll receive is a statistically likely response based on whatever you put in as input, and it can only draw on tags matching what that particular model has been trained on. The less data that exists on a specific thing, the less likely it gets it right, so off-the-wall inputs will often result in crazy outputs.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman yeah, but like, that's exactly the problem, isn't it? I was once in a clinical trial looking at gingivitis and we said in the protocol we'd look at the central gingival margin — the bit of gum between the two front teeth. Then a subject turned up who had three front teeth. We had to make a call on what to do. An AI would have spat out some utter nonsense. You can't simply insist that the real world conform to the assumptions made when training the system, because it will always find a way not to, and any system you use has to be able to handle that.

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman kind of, I'd say it's a failure on google/meta's part that the public assumes a generic model is good at everything, plus models are VERY sensitive to prompt formatting (natural language works, but it's not ideal). You could quite easily train a model for your case specifically on data and images of human teeth; with enough well-tagged inputs and exclusion of outside noise it would do an exceptionally good job of describing new images thrown at it.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman sure, but only if those images were like the ones it was trained on — if something unexpected turned up, which it definitely would, the model would not only fail, but fail in a completely random direction with no indication that it had done so. that's clearly much worse than if you had a human dentist look at the image and make an assessment because the human would (a) know something was wrong and (b) respond sensibly

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman not really true at all, that's again more of a problem specific to google/meta's generic settings in their web front end. Most models, when used in combination with good settings/prompting, will tell you they can't make sense of an input; it's quite easy to determine that something put in simply doesn't have enough data matching the tags to be reliable. Can't really take chatgpt's generic "google it and write a reasonable answer" settings as how everything works.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman I mean yes and no, like I'm sure a properly trained model designed exclusively to detect things in MRI scans could reliably reject scans that are actually fish, my worry is images of patients with unrelated benign tumours, or who've had strokes and had to rewire things around the damage — things that closely resemble the training data, but differ from it in important ways that a human would understand and an AI model cannot. And there's no real way to know if you've accounted for all of them because as I say, the real world will throw some pretty unlikely things at your system if you use it long enough.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman you can account for that by having a good test set where you can evaluate the model. Obviously, the unseen cannot be accounted for, but that also applies to humans. If a human doctor who is an expert on, let's say, stomachs is presented with a picture of a lung, he will fail or succeed depending on whether the disease he is trying to find also appears in stomachs or not. I don't think that this is too different from machines.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman no, if a human doctor who is an expert in stomachs is shown a picture of a lung, he will say "that's a lung, go ask Jane, she knows about lungs" because human doctors are intelligent beings with life experience beyond a million pictures of stomachs

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman This is a good point. I concede. Still, when pressed to give an answer the result will be similar.

    wonka,
    @wonka@chaos.social avatar

    @remenca But they will still have pointed out that it's a lung and not their expertise, instead of confidently blabbing BS like the stochastic parrots they call "AI"/"AGI" do.

    @andrewt @raptor85 @dfeldman

    remenca,
    @remenca@mastodont.cat avatar

    @wonka @andrewt @raptor85 @dfeldman
    I love how you guys use "stochastic parrots" as an insult without realizing that we humans are exactly the same, hahaha.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @wonka @raptor85 @dfeldman this is a baseless statement. Please provide evidence supporting it.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @wonka @raptor85 @dfeldman

    I love it when people ask for evidence.

    Trial-and-error, which is the basis of all learning, is just stochastic gradient descent in an abstract form.

    We can model a human as a function $h: \mathcal{X} \to \mathcal{Y}$, $h \in \mathcal{H}$, such that for an input $x$ it produces an output $\hat{y}$. Now, there exists a functional $\mathcal{L}$ that takes an $h$ and produces a positive measure of error. Then the human updates its behaviour to minimize that error, doing the opposite of whatever caused that error.
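
    (Roughly, the update this sketches is the familiar gradient step $h_{t+1} = h_t - \eta \, \nabla_h \mathcal{L}(h_t)$, with $\eta$ just some small step size I'm assuming for illustration: nudge the behaviour a little in whichever direction reduces the measured error.)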

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @wonka @raptor85 @dfeldman this is the dumbest thing I ever heard

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @wonka @raptor85 @dfeldman

    Still, you cannot challenge it.

    wonka,
    @wonka@chaos.social avatar

    @remenca It's on you to provide evidence, not on others to disprove your claim.

    https://en.m.wikipedia.org/wiki/Sagan_standard

    @andrewt @raptor85 @dfeldman

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman and that's really the job of the software involved. Realistically, if you were implementing a system for something critical like this, the real workflow is to sort, flag, add descriptions, then put anything questionable at high priority on top for a doctor to verify. Remember the models themselves are basically just tags and math, you still need to WRITE the software that uses it.
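
    A minimal sketch of the kind of triage wrapper that workflow describes, purely to illustrate the shape of it; classify_scan, REVIEW_THRESHOLD and the (scan_id, image) input format are hypothetical placeholders rather than any real API, and the model call is assumed to return a label plus a confidence score:

    from dataclasses import dataclass

    @dataclass
    class ScanResult:
        scan_id: str
        label: str
        confidence: float

    REVIEW_THRESHOLD = 0.85  # assumed cut-off: anything less certain goes to a human

    def triage(scans, classify_scan):
        """Sort scans so uncertain or critical cases reach a doctor first."""
        results = [ScanResult(scan_id, *classify_scan(image))
                   for scan_id, image in scans]
        flagged = sorted((r for r in results if r.confidence < REVIEW_THRESHOLD),
                         key=lambda r: r.confidence)   # least certain on top
        routine = [r for r in results if r.confidence >= REVIEW_THRESHOLD]
        return flagged, routine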

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman but that's exactly the point that was being made, isn't it? Nobody's saying there's no place for basic image tagging algorithms, the argument is against slapping a language model into something important and acting like it's a replacement for human judgement, which is the main drive of the "AI" "industry" at the moment. And we figured out image tagging years ago, it's not some newfangled pipedream, I've got it on my phone and it works pretty well.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman what happens if that LLM works perfectly for that concrete problem?

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman I mean the one in the example thought a fish was a human so I don't think that's terribly likely to come up

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman so if I show a random, untrained human on the street an MRI of a brain tumor, would they correctly identify it?

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman I have no idea what point you're trying to make

    I am saying that the ai industry has invented a bullshitting machine and is trying to foist it into everything because that's how it makes money and I think this is a bad idea

    I do not think language models should be allowed to make diagnoses. I do not think random untrained humans should be allowed to make diagnoses. I think doctors should make diagnoses.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman even when AIs make better diagnoses? This makes no sense.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman ais will never make better diagnoses

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman this is a baseless statement. Please provide evidence supporting it.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman well, you see, in order to make a diagnosis, an ai would need to exist

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman I don't want to delve into what it means to be intelligent or not. I'm just saying that there is a machine that gets a diagnosis right more often than a human doctor does.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman can the machine talk to a patient, ask questions, examine them, work out what scan needs to be done, suggest medicine, discreetly enquire if those bruises are really from falling down the stairs and offer a comforting bedside manner? There is more to diagnosis than analysing images and you do need fully general intelligence and a knowledge of the world outside the human body to do it properly

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman personally I think this is asking the wrong question. While to a degree the answer to most of this is "yes", it assumes the process would fully replace doctors, when in practice it would be more efficiently used as a way to help doctors prioritize and get the right information into their hands faster, without them having to manually sort through all the information themselves.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @raptor85 @remenca @dfeldman ok but the question was specifically about making a diagnosis, and that's what "making a diagnosis" is. And we're so far away from building a machine that can do it it's laughable. Current technologies and future iterations on them can doubtless help a human do it, but they can't do it

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman I wouldn't trust a current-gen system just yet to be accurate enough on its own, but I'd be cautious about assuming future iterations couldn't be; machines are exceptionally good at finding patterns, which in the end is what a diagnosis is based on. If I had to wager, I'd give it 5 years before there's some level of automated urgent care centers. Don't forget we're doing things now that were widely considered almost impossible 5 years ago.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman Of course, if we are entitled to select only the cases in which our claims work perfectly, there would not be much of a discussion, don't you think? My point here is: if we train a network to process MRIs or any other medical problem and it turns out that it works better than the human doctors, what do we do?

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @remenca @andrewt @dfeldman or, better, train a model to sort thousands of MRIs and find the ones with a high likelihood of critical issues for immediate review, which is a more likely and immediately useful use case.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman I think this is relevant because gpt-4 failing to process an MRI and the entire field of AI failing in the same way are different things. You are using the first failure to generalize to every AI. That is a hasty generalization fallacy.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman I don't think anyone anywhere is saying that all machine learning algorithms are uniformly terrible at everything. The current pushback against ai is fairly specifically against the "use as much energy as a small town to create a plagiarism machine and push it as the solution to everything" model of ai being pushed at the moment by the grifters who got out of crypto and still have a warehouse full of graphics cards to think of a use for

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman I do agree that capitalism is using AI in the worst possible way, yes. I am against many of its uses. But that does not mean that AI is useless. Only that we should get it out of the hands of our capitalist overlords.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman I think it means it's reasonable and fair to highlight the shortcomings of the grift kind of ai using funny pictures of fish, though

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman while it is funny you can see how the conversation quickly turns to "ALL AI IS EVIL AND BAD, people using it should be treated as criminals!" though with many people making claims that frankly have little basis in reality, even with an example of someone simply using a tool in an unexpected way in an attempt to force it to output a bad result. Basically drumming up a lot of drama over nothing

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman this is about as accurate as saying "all doctors are just witches using leeches and snake oil". I can't really blame you though, misinformation about this industry spreads like WILDFIRE. The truth is most models can run on boards that use less than 10 watts, licenses for input data are HIGHLY policed in most models, and the only people pushing it as a solution to everything are idiots unrelated to development.

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman you don't know what an AI will do. If the AI has seen pictures of other people with three front teeth, it will probably get it right. If not, it will likely fail, or maybe it will be able to generalize from the two-teeth cases, like you did.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman right so EITHER you need to specifically train it on the edge cases that almost never come up OR it's going to guess in a haphazard way. What we did was to flag up the issue, discuss it, and work out if we needed to exclude the data, and note in the publication what we did about it. (I forget what the final decision was.) If the AI takes a guess then it won't tell you it's done it or justify its approach, it'll just spit out a number and nobody will ever know the input was weird

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman You're basing a (wrong) assumption about how all the software works on a single example of a generic implementation that's designed for entertainment and getting people to watch ads. The software using the model does literally whatever you program it to do. To put it into the context of doctors, it's like basing your understanding of how hospitals work on having seen an episode of "House".

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman I don't think it is so different. After all, you tried to come up with an answer from your previous knowledge. The network will do the same. You could train it to output a confidence level if you want to. But still, it would be very much the same in my eyes.

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman I'm not convinced by "confidence level" stuff — it's not really a confidence level, is it, it's just another output. When ChatGPT says it doesn't know something, it doesn't mean it doesn't know, because it never knows; it means "this is the sort of question that I've been trained to think 'I don't know' is an acceptable answer to"

    remenca,
    @remenca@mastodont.cat avatar

    @andrewt @raptor85 @dfeldman I'm not talking about chatgpt. I'm talking about things like Bayesian networks or VAEs that let you output a confidence score with each prediction. You could augment an LLM with some of that.
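
    (For a concrete idea of what "a confidence score with each prediction" can look like, here's a minimal sketch, assuming a hypothetical forward_pass(x) that returns a probability vector and stays stochastic across calls, e.g. with dropout left on; none of these names come from a real library.)

    import numpy as np

    def predict_with_confidence(forward_pass, x, n_samples=20):
        # Average several stochastic forward passes (Monte Carlo style) and
        # turn the entropy of the averaged distribution into a 0..1 confidence.
        probs = np.mean([forward_pass(x) for _ in range(n_samples)], axis=0)
        entropy = -np.sum(probs * np.log(probs + 1e-12))
        confidence = 1.0 - entropy / np.log(len(probs))  # 1.0 = very sure
        return int(np.argmax(probs)), float(confidence)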

    andrewt,
    @andrewt@mathstodon.xyz avatar

    @remenca @raptor85 @dfeldman why not just take the language model out altogether? What's it adding beyond an air of authority?

    raptor85,
    @raptor85@mastodon.gamedev.place avatar

    @andrewt @remenca @dfeldman I think I see the disconnect here: when I'm talking about AI and models, LLMs are one small subset. You wouldn't identify pictures with a language model, you'd have an image recognition model for that. Sure, if you want to format your output in plain English you can also use a language model for that, but that's again up to whoever designs the system, it's by no means required. (For instance I have a model I use that produces tag clouds for input images)
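
    (A tiny sketch of that separation, with made-up placeholder functions rather than any real API: the image model produces the tags, and a language model is only an optional wording step at the very end.)

    def describe_scan(image, tag_image, phrase_report=None):
        tags = tag_image(image)          # e.g. {"brain": 0.97, "salmon": 0.01}
        if phrase_report is None:
            return tags                  # the tags themselves are the output
        return phrase_report(tags)       # plain-English phrasing is optional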

    oblomov,
    @oblomov@sociale.network avatar

    @remenca @dfeldman OTOH, a human would be able to tell they're looking at something that doesn't fit the normal parameters.

    dfeldman,
    @dfeldman@hachyderm.io avatar

    inspired by this famous satirical study poking fun at fMRI research https://www.psychology.mcmaster.ca/bennett/psy710/readings/BennettDeadSalmon.pdf

    jake4480,
    @jake4480@c.im avatar

    @dfeldman that's interesting. It's also interesting that I often actually feel like a dead salmon

    metin,
    @metin@graphics.social avatar

    @dfeldman Hilarious! 😆👍

    levampyre,
    @levampyre@chaos.social avatar

    @dfeldman Is this still news?

    jnfingerle,
    @jnfingerle@social.saarland avatar

    @levampyre
    No, but it's funny.

    @dfeldman

    levampyre,
    @levampyre@chaos.social avatar

    @jnfingerle I don't know. I'm already at the "it's sad" stage. It is already a pain to use web search engines these days, because they dig up garbage that was generated by LLMs that have been fed garbage. We've lost so much depth already, and people still think it's funny? @dfeldman

    jnfingerle,
    @jnfingerle@social.saarland avatar

    @levampyre
    Many years ago I was on a train, together with my mother and sister. We were going to my granddad's funeral. And we were telling jokes.

    Making fun can be cathartic.

    @dfeldman

    flockofnazguls,
    @flockofnazguls@mastodon.flockofnazguls.com avatar

    deleted_by_author

  • synapsenkitzler,

    @flockofnazguls @dfeldman
    capital is (m) taking this shit seriously

    joelanman,
    @joelanman@hachyderm.io avatar

    @dfeldman It's picked up a lot of issues that a human 'expert' wouldn't have /s

    HeavenlyPossum,
    @HeavenlyPossum@kolektiva.social avatar

    @dfeldman

    But I’ve been told by Very Serious People that these fancy autocorrects are on the cusp of self-awareness and world-breaking super intelligence.

    mike_k,
    @mike_k@mstdn.social avatar

    @dfeldman it's not a well salmon, all kinds of back pain

    nblr,
    @nblr@chaos.social avatar

    @mike_k @dfeldman
    Actual salmon expert here. This salmon is not experiencing any pain. It's perfectly normal and dead. All it needs is a bit of lemon vinaigrette.

    quincy,
    @quincy@chaos.social avatar

    @nblr @mike_k @dfeldman

    To me it looks like it's pining for the fjords.
