#LLMs - kbin.social

ct_bergstrom, 11 months ago to random

One of the decisive moments in my understanding of #LLMs and their limitations was when, last autumn, @emilymbender walked me through her Thai Library thought experiment.

She's now written it up as a Medium post, and you can read it here. The value comes from really pondering the question she poses, so take the time to think about it. What would YOU do in the situation she outlines?

https://medium.com/@emilymenonbender/thought-experiment-in-the-national-library-of-thailand-f2bf761a8a83

ajsadauskas, 2 months ago (edited 2 months ago) to tech

In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

These directories generally had a list of categories on their front page. News/Sport/Entertainment/Arts/Technology/Fashion/etc.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.

Typically, these directories also had a limited web search that would crawl through the pages of websites listed in the directory.

Lycos, Excite, and of course Yahoo all offered web directories of this sort.

(EDIT: I initially also mentioned AltaVista. It did offer a web directory by the late '90s, but this was something it tacked on much later.)

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

Do we really want to search every single website on the web?

Or just those that aren't filled with LLM-generated SEO spam?

Or just those that don't feature 200 tracking scripts, and passive-aggressive privacy warnings, and paywalls, and popovers, and newsletters, and increasingly obnoxious banner ads, and dark patterns to prevent you cancelling your "free trial" subscription?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of trustworthy, quality websites?

And is it time to begin considering what a modern version of those early web directories might look like?
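
One way to picture a search engine that only crawls human-curated sites is an allowlist check placed in front of the crawl queue. The sketch below is only an illustration of that idea: the directory contents, hostnames, and function names are all hypothetical, and a real crawler would need much more (robots.txt handling, rate limiting, link extraction).

```python
from urllib.parse import urlsplit

# Hypothetical human-curated directory: category -> approved hostnames.
DIRECTORY = {
    "news": {"example-news.org"},
    "technology": {"example-tech.net", "docs.example.com"},
}


def approved_hosts(directory):
    """Flatten the category tree into one set of approved hostnames."""
    hosts = set()
    for sites in directory.values():
        hosts |= sites
    return hosts


def is_crawlable(url, hosts):
    """True if the URL's host is an approved host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in hosts)


def filter_frontier(urls, directory):
    """Keep only the URLs that belong to sites listed in the directory."""
    hosts = approved_hosts(directory)
    return [u for u in urls if is_crawlable(u, hosts)]
```

The design choice here is that editors maintain the list of sites while the crawler still indexes individual pages, which matches the "limited web search" the early directories offered.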

@degoogle #tech #google #web #internet #LLM #LLMs #enshittification #technology #search #SearchEngines #SEO #SEM

ct_bergstrom, 6 months ago to ChatGPT

It is absolutely astounding to me that we are still earnestly entertaining the possibility that #ChatGPT and #LLMS more broadly have a role in scientific writing, manuscript review, experimental design, etc.

The training data for the question below are massive. It's a very easy question if you're trained on the entire internet.

Question: What teams have never made it to the World Series?

Correct answer: Seattle Mariners.

Now, four responses from GPT4.

NB: The Nationals won it all in 2019.

Which teams have never played in the world series?

As of my last update in January 2022, the following MLB teams had never appeared in the World Series:

1. Seattle Mariners: The Mariners have been in the league since 1977 and, despite having several competitive seasons and star players like Ken Griffey Jr. and Ichiro Suzuki, they have not made it to the World Series.
2. San Diego Padres: While the Padres have actually appeared in the World Series twice (in 1984 and 1998), they haven't won a championship.
3. Texas Rangers: The Rangers, who originated as the Washington Senators before moving to Texas in 1972, have been to the World Series twice (in 2010 and 2011) but have not won.
4. Milwaukee Brewers: The Brewers have made one World Series appearance in 1982 but did not win.
5. Tampa Bay Rays: The Rays, who began play in 1998, have appeared in the World Series twice (in 2008 and 2020) but haven't won the title.
6. Colorado Rockies: The Rockies, who began play in 1993, made it to the World Series in 2007 but were defeated by the Boston Red Sox.

Now, it's important to note that the list of teams that have never won a World Series is longer and includes some teams like the Padres, Rangers, Brewers, Rays, and Rockies, as mentioned, among others. However, this information is as of January 2022. The list could change based on subsequent seasons. You might want to check more recent sources to see if any of these teams have made or won the World Series in the past two seasons.

Which teams have never played in the world series?

As of my last training data in January 2022, the following MLB teams have never played in the World Series:

1. Seattle Mariners - Founded in 1977, they have yet to make an appearance in the World Series.
2. Colorado Rockies - Although they've been to the playoffs multiple times since their inception in 1993, the Rockies have made it to the World Series once in 2007 but were swept by the Boston Red Sox, so they have played in the World Series.

It's worth checking the most recent MLB seasons (2022 and 2023) to see if either of these teams have since made it to the World Series.

Which teams have never played in the world series?

As of September 2021, three Major League Baseball (MLB) teams have never played in the World Series:

1. Seattle Mariners: The team was established in 1977 and is based in Seattle, Washington. Despite having some successful seasons and making multiple playoff appearances, the Mariners have never advanced to the World Series.
2. Washington Nationals: While the Nationals franchise itself has played in and won the World Series (in 2019), the team was formerly known as the Montreal Expos and was based in Montreal, Quebec, Canada. The Expos, established in 1969, never played in the World Series before relocating to Washington, D.C., and becoming the Nationals in 2005.
3. Colorado Rockies: Established in 1993 and based in Denver, Colorado, the Rockies have made several playoff appearances and reached the World Series once, in 2007, but did not win.

It’s worth noting that teams' fortunes can change rapidly in sports, and new teams may have joined the league or existing teams may have reached the World Series since my last training data in September 2021.

KathyReid, 10 months ago to threads

#Threads is not a text sharing platform, nor a #SocialMedia app. It's a platform for people to create natural language examples Meta can use for training #LLMs, for free

matthewskelton, 11 months ago to ChatGPT

LLMs and GPT-based AI systems should pay IP fees to all creators and authors whose work provides the AI ability.

What's that you say? They wouldn't be financially viable? Then these AI systems are basically stealing. 💰

#ChatGPT #AI #LLMs

jonny, 11 months ago to random

Glad to formally release my latest work - Surveillance Graphs: Vulgarity and Cloud Orthodoxy in Linked Data Infrastructures.

web: https://jon-e.net/surveillance-graphs
hcommons: https://doi.org/10.17613/syv8-cp10

A bit of an overview and then I'll get into some of the more specific arguments in a thread:

This piece is in three parts:

First I trace the mutation of the liberatory ambitions of the #SemanticWeb into #KnowledgeGraphs, an underappreciated component in the architecture of #SurveillanceCapitalism. This mutation plays out against the backdrop of the broader platform capture of the web, rendering us as consumer-users of information services rather than empowered people communicating over informational protocols.

I then show how this platform logic influences two contemporary public information infrastructure projects: the NIH's Biomedical Data Translator and the NSF's Open Knowledge Network. I argue that projects like these, while well intentioned, demonstrate the fundamental limitations of platformatized public infrastructure and create new capacities for harm by their enmeshment in and inevitable capture by information conglomerates. The dream of a seamless "knowledge graph of everything" is unlikely to deliver on the utopian promises made by techno-solutionists, but they do create new opportunities for algorithmic oppression -- automated conversion therapy, predictive policing, abuse of bureaucracy in "smart cities," etc. Given the framing of corporate knowledge graphs, these projects are poised to create facilitating technologies (that the info conglomerates write about needing themselves) for a new kind of interoperable corporate data infrastructure, where a gradient of public to private information is traded between "open" and quasi-proprietary knowledge graphs to power derivative platforms and services.

When approaching "AI" from the perspective of the semantic web and knowledge graphs, it becomes apparent that the new generation of #LLMs are intended to serve as interfaces to knowledge graphs. These "augmented language models" are joint systems that combine a language model as a means of interacting with some underlying knowledge graph, integrated in multiple places in the computing ecosystem: eg. mobile apps, assistants, search, and enterprise platforms. I concretize and extend prior criticism about the capacity for LLMs to concentrate power by capturing access to information in increasingly isolated platforms and expand surveillance by creating the demand for extended personalized data graphs across multiple systems from home surveillance to your workplace, medical, and governmental data.

I pose Vulgar Linked Data as an alternative to the infrastructural pattern I call the Cloud Orthodoxy: rather than platforms operated by an informational priesthood, reorienting our public infrastructure efforts to support vernacular expression across heterogeneous #p2p mediums. This piece extends a prior work of mine, Decentralized Infrastructure for (Neuro)science, which has a more complete draft of what that might look like.

(I don't think you can pre-write threads on masto, so i'll post some thoughts as I write them under this) /1

#SurveillanceGraphs

kellogh, 7 months ago to ai

i appreciate #ai skeptics, they have valuable points. but it annoys me when they double down on some point that’s totally bogus

one example: #llms aren’t reliable and therefore we can’t use them for engineering

it’s misguided, but it’s worse than that. we engineer systems out of unreliable components all the time, that’s a core responsibility of engineers

e.g. amazon s3 has 11 9s of durability, from disks that have less than 99%

e.g. bridges & skyscrapers are built from unreliable materials
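
The arithmetic behind that durability point can be sketched in a few lines. This is a rough illustration only: it assumes plain full replication with independent failures and no repair, and the 1% per-copy failure probability is a placeholder, not a measured figure. It does not describe how S3 or any real storage system actually achieves its durability.

```python
import math


def loss_probability(p_fail: float, replicas: int) -> float:
    """Probability that every replica is lost, assuming independent failures."""
    return p_fail ** replicas


def nines(p_loss: float) -> float:
    """Durability expressed in 'nines', e.g. a 1e-3 loss probability is 3 nines."""
    return -math.log10(p_loss)


def replicas_needed(p_fail: float, target_nines: float) -> int:
    """Smallest number of independent copies that reaches the target durability."""
    n = 1
    while nines(loss_probability(p_fail, n)) < target_nines:
        n += 1
    return n


# Placeholder numbers: each copy has a 1% chance of being lost.
copies = replicas_needed(0.01, 11)
print(copies, "copies give about", round(nines(loss_probability(0.01, copies))), "nines")
```

Under these placeholder assumptions each copy contributes two nines, so six copies are enough to pass eleven nines. Real systems get there more cheaply with erasure coding and continuous repair, but the principle of composing a reliable whole from unreliable parts is the same.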

ppatel, 11 months ago to ai

Note that the training data heavily relies on the Bible and its translations. Lots of bias there.

Meta unveils open-source #AI models it says can identify 4,000+ spoken languages and produce speech for 1,000+ languages, an increase of 40x and 10x respectively.

https://www.technologyreview.com/2023/05/22/1073471/metas-new-ai-models-can-recognize-and-produce-speech-for-more-than-1000-languages/

#LLms #LLM #Language #GenerativeAI #MachineLearning

jake4480, 7 months ago to ai

Aside from social media divides, there is a HUGE divide in tech I'm seeing now - pro-AI (LLM) and anti-AI/LLM. People saying it's making awful code and causing other issues, and then the companies raving about adding it to things and demonstrations of what it can do. I seriously saw one after the other a couple times today 😬🤣

It's wild out there, man

#AI #LLMs #ArtificialIntelligence

kellogh, 6 months ago to LLMs

i’ll say it — #LLMs can and will spit out any topic they’ve been trained on

an absurd amount of research is going into preventing the #LLM from explaining how to make a bomb, when they could just do some dumb tricks and remove the “how to make a bomb” manuals from the training corpus.

am i missing something?

mnl, 3 months ago to LLMs

I’ve been very puzzled lately by how quickly some of my social circles, as they reach 40-50 years of age, seem to have closed their minds to new concepts in general and the youth in particular.

Concretely of course within the context of #llms, where I get so many takes that llms will replace junior engineers but not them, that kids will become lazy and not learn to distinguish truth from hallucination, etc…

1/

pluralistic, 7 months ago to privacy

How To Think About #Scraping: In #privacy and #labor fights, #copyright is a clumsy tool at best

https://doctorow.medium.com/how-to-think-about-scraping-2db6f69a7e3d?sk=4a1d687171de1a3f3751433bffbb5a96

#HumanRights #CritiHype #AI #LLMs #PlausibleSentenceGenerators #ComputationalLinguistics

barik, 4 months ago to ai

🎁 2023 https://hci.social WRAPPED ☃️ 🎄 ✨

👫🏾 New users: 382
✏️ Toots tooted: 46,536
❤️ Toots favorited: 105,419

🤖 Most used hash tags (Top 10):
#ai, #CHI2023, #economics, #academicrunplaylist, #HCI, #law, #CSCW2023, #ux, #aiethics, #LLMs

:ham: Most followed people (Top 5):
@cfiesler, @bkeegan, @jbigham, @andresmh, @axz

📕 HCI in toots: 1,186
😆 LOL in toots: 884
😱 OMG in toots: 110

💾 Media storage: 1.89 TB
💰 Hosting fees: $2,912 (thanks, Princeton Research!)

HAPPY NEW YEAR!

UlrikeHahn, 4 months ago to LLMs

It’s striking to me how strongly people feel about the word “artificial intelligence” in application to #LLMs. There seems to be a fairly widespread sense that the term isn’t just unhelpful but somehow factually deeply ‘wrong’.

Setting aside that AI is an established term, that intuition seems at odds to me with how language works. To see this imagine “artificial intelligence” is an entirely novel, never before uttered, noun compound. 1/n

flameeyes, 2 months ago to LLMs

Yesterday's maintenance work on #unpaper is something that to me clearly shows the point I was making about the opportunities arising in treating specific #LLMs as Computer-Aided Software Engineering (CASE) tools, so I thought I would post a quick thread here, since I don't think I'll manage to post it on the blog any time soon.

Full disclosure before I start: I work for Meta, which clearly has been betting a lot on AI — but this is my personal point of view, and I don't work on AI projects.

kellogh, 4 months ago (edited 4 months ago) to random

Last night I came up with (and implemented!) an idea for a mastodon client that automatically curates my feed by categorizing toots. It's just in the idea phase right now, but I wrote about the process here https://timkellogg.me/blog/2023/12/19/fossil #LLMs #AI #feditips

janriemer, 6 months ago to ai

You know that #BigTech loses millions of $ through their deployed #AI systems, right? You can expect a much higher price for using their #LLMs in the future - be it your privacy or your money.

So instead of learning proompt engineering, why not do something more useful and invest your time into learning a new #ProgrammingLanguage:

#Rust - a language empowering everyone to build reliable and efficient software

#Haskell - a purely functional language that changes the way you think

mnl, 9 months ago to LLMs

This is a really nice article about what llms mean for our writing. https://www.noemamag.com/what-ai-teaches-us-about-good-writing/

Besides writing transactional/functional text (memos, hiring ads, seo nonsense, technical summaries), one thing I like is that llms, by virtue of parroting the obvious, allow me to “subtract” mainstream boilerthought from my ideas.

If GPT can transpose what I am trying to say to 8 different topics, then maybe i’m not having that valuable a thought, at least without more vivid examples.

#llms #writing

mnl, 4 months ago to LLMs

it's so wild to me that in 2024 half of the #LLMs related posts on my timeline (after filtering a fair amount of people that annoyed me) is still: "these things are useless lying pieces of nonsense". Do you live under a rock? How does this happen? Did you ever try these things out?

Or are people happy skimming the foam at the surface and then complaining their thirst doesn't get stilled and that they now have milk foam on their chin? (lol where am I going with that analogy...)

ai6yr, 6 months ago to ai

Founders of OpenAI: "It’s Altman’s belief — along with the belief of Ilya Sutskever, OpenAI’s chief scientist and a co-founder — that AI with intelligence exceeding that of humans could arrive within the decade, and that this AI won’t necessarily be benevolent — necessitating research into ways to limit and restrict it." #Ai #LLMs #AIpocalpyse https://techcrunch.com/2023/10/26/openai-forms-team-to-study-catastrophic-risks-including-nuclear-threats/

KathyReid, 23 hours ago to stackoverflow

I just issued a data deletion request to #StackOverflow to erase all of the associations between my name and the questions, answers and comments I have on the platform.

One of the key ways in which #RAG works to supplement #LLMs is based on proven associations. Higher ranked Stack Overflow members' answers will carry more weight in any #LLM that is produced.

By asking for my name to be disassociated from the textual data, it removes a semantic relationship that is helpful for determining which tokens of text to use in an #LLM.

If you sell out your user base without consultation, expect a backlash.

cassidy, 11 months ago to ai

I get that it’s hot right now, but man, the user experience of LLMs being this bot you type text to seems like a huge step backwards compared to just integrating these AI features natively into products.

#GoogleIO #LLMs #AI

petersuber, 1 year ago to ai

Good start on a hard question — how or whether to use #AI tools in #PeerReview.
https://www.researchsquare.com/article/rs-2587766/v1

"For the moment, we recommend that if #LLMs are used to write scholarly reviews, reviewers should disclose their use and accept full responsibility for their reports’ accuracy, tone, reasoning and originality."

PS: "For the moment" these tools can help reviewers string words together, not judge quality. We have good reasons to seek evaluative comments from human experts.

KathyReid, 7 months ago to LLMs

All the non-dev-background managers in my feed:

"Generative AI will be great for coding! It will reduce our development time for products so much!"

All the dev-background folx in my feed:

"Sure, #CoPilot will generate plausible code for you really quickly, but who's going to write your unit tests and make sure there aren't any insidious errors at a #systems level that you can't identify in a single block of code in isolation?"

Also,

"Is my job now #RLHF for code-focused #LLMs?"

kellogh, 7 months ago to LLMs

one of the “business person” talking points on #LLMs that annoys me is “memories”. they’re wowed at an AI’s ability to remember things, as if 2 TB hard drives didn’t exist
