remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

: "There are two reasons why using a publicly available LLM such as ChatGPT might not be appropriate for processing internal documents. Confidentiality is the first and obvious one. But the second reason, also important, is that the training data of a public LLM did not include your internal company information. Hence that LLM is unlikely to give useful answers when asked about that information.

Enter retrieval-augmented generation, or RAG. RAG is a technique used to augment an LLM with external data, such as your company documents, that provide the model with the knowledge and context it needs to produce accurate and useful output for your specific use case. RAG is a pragmatic and effective approach to using LLMs in the enterprise.

In this article, I’ll briefly explain how RAG works, list some examples of how RAG is being used, and provide a code example for setting up a simple RAG framework." https://www.infoworld.com/article/3712860/retrieval-augmented-generation-step-by-step.html
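
The article goes on to walk through its own code example; as a rough sketch of the retrieve-then-prompt shape described above, here is a minimal illustration in Python. Everything in it (the toy document store, the TF-IDF retriever, and the ask_llm stub) is an assumption for demonstration, not the article's actual framework.

```python
# Minimal RAG sketch (illustrative only): retrieve the most relevant
# internal documents, then hand them to the LLM as prompt context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-ins for "your internal company documents".
documents = [
    "Expense reports must be filed within 30 days of travel.",
    "The VPN root certificate is rotated every January.",
    "On-call handoff happens Mondays at 10:00 UTC.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by TF-IDF cosine similarity to the query."""
    vectorizer = TfidfVectorizer().fit(documents + [query])
    doc_vectors = vectorizer.transform(documents)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    ranked = sorted(zip(scores, documents), reverse=True)
    return [doc for _, doc in ranked[:k]]

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    return f"[LLM answer grounded in {len(prompt)} chars of prompt]"

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return ask_llm(prompt)

print(answer("When do I need to submit travel expenses?"))
```

In production the TF-IDF step is typically replaced by embeddings and a vector store, but the augment-then-prompt shape stays the same.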

kellogh, to python
@kellogh@hachyderm.io

years ago, the “language of machine learning” was split between #R and #Python, but it’s been steadily shifting toward python. At this point, after all the developments, i think it’s clearly python. i don’t see much R in the LLM world at all. And increasingly, i’m seeing #Rust being the “systems language of #AI”.

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

: "Q. You've reported quite a lot on technology and algorithms in the past. How do you think journalists should cover the rise of generative AI?

A. Journalists should stop speaking about AI models as if they have personalities, and they are sentient. That is really harmful because it changes the conversation from something that we as humans control to a peer-to-peer relationship. We built these tools and we can make them do what we want.

Another thing I would recommend is talking about AI specifically. Which AI model are we talking about? And how does that compare to the other AI models? Because they are not all the same. We also need to talk about AI in a way that’s domain-specific. There’s a lot of talk about what AI will do to jobs. But that is too big a question. We have to talk about this in each field.

A classic example of that is that people have been predicting forever that AI is going to replace radiologists and it hasn't happened. So I would like to know why. That's the kind of question you can answer. So part of what we’d like to do at Proof News is focusing on a testable hypothesis. Focusing on a testable hypothesis forces you to be a little more rigorous in your thinking." https://reutersinstitute.politics.ox.ac.uk/news/julia-angwin-fears-public-sphere-about-get-worse-ai-makes-it-easier-flood-zone-misinformation

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

: "...[T]he AI hype of the last year has also opened up demand for a rival perspective: a feeling that tech might be a bit disappointing. In other words, not optimism or pessimism, but scepticism. If we judge AI just by our own experiences, the future is not a done deal.

Perhaps the noisiest AI questioner is Gary Marcus, a cognitive scientist who co-founded an AI start-up and sold it to Uber in 2016. Altman once tweeted, “Give me the confidence of a mediocre deep-learning skeptic”; Marcus assumed it was a reference to him. He prefers the term “realist”.

He is not a doomster who believes AI will go rogue and turn us all into paper clips. He wants AI to succeed and believes it will. But, in its current form, he argues, it’s hitting walls.

Today’s large language models (LLMs) have learnt to recognise patterns but don’t understand the underlying concepts. They will therefore always produce silly errors, says Marcus. The idea that tech companies will produce artificial general intelligence by 2030 is “laughable”.

Generative AI is sucking up cash, electricity, water, copyrighted data. It is not sustainable. A whole new approach may be needed. Ed Zitron, a former games journalist who is now both a tech publicist and a tech critic based in Nevada, puts it more starkly: “We may be at peak AI.”" https://www.ft.com/content/648228e7-11eb-4e1a-b0d5-e65a638e6135

cassidy, to ai
@cassidy@blaede.family

“AI” as currently hyped is giant billion-dollar companies blatantly stealing content, disregarding licenses, misrepresenting capabilities, and burning the planet in the process.

It is the largest theft of intellectual property in the history of humankind, and these companies are knowingly and willingly ignoring the licenses, terms of service, and laws that we lowly individuals are beholden to.

https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html?unlocked_article_code=1.ik0.Ofja.L21c1wyW-0xj&ugrp=m

cassidy,
@cassidy@blaede.family

I guess we wait this one out until the “AI” bubble bursts due to the incredible subsidization the entire industry is undergoing. It is not profitable. It is not sustainable.

It will not last—but the damage to our planet and fallout from the immense amount of wasted resources will.

https://arstechnica.com/information-technology/2023/10/so-far-ai-hasnt-been-profitable-for-big-tech/

gerrymcgovern, to random
@gerrymcgovern@mastodon.green

Asked if a restaurant could serve cheese nibbled on by a rodent, the Microsoft / New York City government official AI chatbot replied:

“Yes, you can still serve the cheese to customers if it has rat bites,” before adding that it was important to assess “the extent of the damage caused by the rat” and to “inform customers about the situation.”

AI is spewing out this sort of surreal garbage all over the world right now. AI is a monumental grift.

https://apnews.com/article/new-york-city-chatbot-misinformation-6ebc71db5b770b9969c906a7ee4fae21

simon_brooke,
@simon_brooke@mastodon.scot

@ikt @gerrymcgovern This is a misunderstanding. LLMs have no semantic layer; consequently they have no concept of truth and falsity. All they know is the statistical probability that words will fit together in a particular order.

No LLM can ever be 'right', except by accident (which, statistically, will sometimes happen).
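
A toy bigram sampler (invented for illustration; real LLMs are large neural networks, not lookup tables) makes the point concrete: generation is driven entirely by word-co-occurrence statistics, so a fluent falsehood and a fluent truth are produced by exactly the same mechanism.

```python
# Toy illustration only: a bigram "language model" that knows nothing but
# how often one word follows another. Truth never enters the computation.
import random
from collections import Counter, defaultdict

training_text = (
    "you can serve the cheese . you can serve the customers . "
    "the cheese has rat bites . you can serve the cheese ."
).split()

# Count which word follows which in the training text.
follows: defaultdict[str, Counter] = defaultdict(Counter)
for a, b in zip(training_text, training_text[1:]):
    follows[a][b] += 1

def next_word(word: str) -> str:
    """Sample the next word purely by frequency of co-occurrence."""
    candidates = follows[word]
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# Generate a fluent-looking continuation; nothing checks whether it is true.
word, output = "you", ["you"]
for _ in range(6):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Scaled up billions of times the mechanism is far more sophisticated, but the training objective is still "plausible next token", not "true statement".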

ppatel, to ai
@ppatel@mstdn.social

Are we heading into another bubble?

AI Is Putting the Silicon Back in Silicon Valley

A new startup called MatX from former Google engineers reflects a renewed enthusiasm for chipmakers.

https://www.bloomberg.com/news/articles/2024-03-26/ai-chip-startups-like-matx-storm-silicon-valley

ppatel, to LLMs
@ppatel@mstdn.social

Large language models can do jaw-dropping things. But nobody knows exactly why.
And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.

https://www.technologyreview.com/2024/03/04/1089403/large-language-models-amazing-but-nobody-knows-why/

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

: "The appearance of large language models (LLMs) and other forms of generative AI portend a new era of disruption and innovation for the news industry, this time focused on the production and consumption of news rather than on its distribution. Large news organizations, however, may be surprisingly well-prepared for at least some of this disruption because of earlier innovation work on automating workflows for personalized content and formats using structured techniques. This article reviews this work and uses examples from the British Broadcasting Corporation (BBC) and other large news providers to show how LLMs have recently been successfully applied to addressing significant barriers to the deployment of structured approaches in production, and how innovation using structured techniques has more generally framed significant editorial and product challenges that might now be more readily addressed using generative AI. Using the BBC's next-generation authoring and publishing stack as an example, the article also discusses how earlier innovation work has influenced the design of flexible infrastructure that can accommodate uncertainty in audience behavior and editorial workflows – capabilities that are likely to be well suited to the fast-approaching AI-mediated news ecosystem." https://onlinelibrary.wiley.com/doi/10.1002/aaai.12168

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

: "Models like ChatGPT and Claude are deeply dependent on training data to improve their outputs, and their very existence is actively impeding the creation of the very thing they need to survive. While publishers like Axel Springer have cut deals to license their companies' data to ChatGPT for training purposes, this money isn't flowing to the writers that create the content that OpenAI and Anthropic need to grow their models much further. It's also worth considering that these AI companies may already have already trained on this data. The Times sued OpenAI late last year for training itself on "millions" of articles, and I'd bet money that ChatGPT was trained on multiple Axel Springer publications along with anything else it could find publicly-available on the web.

This is one of many near-impossible challenges for an AI industry that's yet to prove its necessity. While one could theoretically make bigger, more powerful chips (I'll get to that later), AI companies face a Kafkaesque bind where they can't improve a tool for automating the creation of content without human beings creating more content than they've ever created before. Paying publishers to license their content doesn't actually fix the problem, because it doesn't increase the amount of content that they create, but rather helps line the pockets of executives and shareholders. Ironically, OpenAI's best hope for survival would be to fund as many news outlets as possible and directly incentivize them to do in-depth reporting, rather than proliferating a tech that unquestionably harms the media industry." https://www.wheresyoured.at/bubble-trouble/

moorejh, to ArtificialIntelligence
@moorejh@mastodon.online

Researchers are raising safety concerns over “penetrative AI,” designed to allow LLMs to extend their reach from their Web-scraped learning datasets into probing, and acting upon, the data that people generate on their own devices. https://cacm.acm.org/news/safety-fears-raised-over-risks-of-penetrative-ai/

happyborg, to ai
@happyborg@fosstodon.org

I remember the days of email before spam, phishing or any of that.

Decades of those, and we have accepted them; most people never knew a time without them.

AI isn't just going to automate spam-like activity, make better malware, etc.

AI is going to be much, much worse, creating indistinguishable, human-like personas that control people rather than laying specific traps. And they will be much harder to spot than spam.

The internet of shit right now is nothing compared to what's coming, and making it is legal.

metin, to ai
@metin@graphics.social

𝚆𝚑𝚎𝚗 𝚆𝚒𝚕𝚕 𝚝𝚑𝚎 𝙶𝚎𝚗𝙰𝙸 𝙱𝚞𝚋𝚋𝚕𝚎 𝙱𝚞𝚛𝚜𝚝?

https://garymarcus.substack.com/p/when-will-the-genai-bubble-burst

remixtures, to ai Portuguese
@remixtures@tldr.nettime.org

#AI #GenerativeAI #LLMs #Automation #Hallucinations: "The only reason bosses want to buy robots is to fire humans and lower their costs. That's why "AI art" is such a pisser. There are plenty of harmless ways to automate art production with software – everything from a "healing brush" in Photoshop to deepfake tools that let a video-editor alter the eye-lines of all the extras in a scene to shift the focus. A graphic novelist who models a room in The Sims and then moves the camera around to get traceable geometry for different angles is a centaur – they are genuinely offloading some finicky drudgework onto a robot that is perfectly attentive and vigilant.

But the pitch from "AI art" companies is "fire your graphic artists and replace them with botshit." They're pitching a world where the robots get to do all the creative stuff (badly) and humans have to work at a robotic pace, with robotic vigilance, in order to catch the mistakes that the robots make at superhuman speed.

Reverse centaurism is brutal. That's not news: Charlie Chaplin documented the problems of reverse centaurs nearly 100 years ago:" https://pluralistic.net/2024/04/01/human-in-the-loop/#monkey-in-the-middle

maxleibman, (edited) to LLMs
@maxleibman@mastodon.social

I have eaten
the text
that was on
the internet

and which
you had published
without
granting license

Forgive me
I'm an LLM
I steal
to make lies

williamgunn, to llm
@williamgunn@mastodon.social

Wiley licenses content for training an LLM. The company was not named, but I would suspect it's the one that has been signing a lot of licensing deals lately. Access to STM content could be a big differentiator, though I wouldn't expect it to be exclusive. Also, $23M sounds small. https://finance.yahoo.com/news/q3-2024-john-wiley-sons-043941097.html?guccounter=1

Crell, to LLMs
@Crell@phpc.social

AI, even generative LLMs, doesn't create new things. It just remixes at a phenomenal rate. High-output remixing tends to produce low-quality output, with occasional gems. So on net, they increase the ratio of low-quality garbage to gems. That's all they can do. And that just gums up the works of, well, everything.

Generative AI, in its current trajectory, cannot be anything but a net-negative for society. But it is a great grift to rip off investors.

sotneStatue, to ChatGPT
@sotneStatue@fosstodon.org

A colleague had some problems in their experiments, and I found out the issue was that they were using ChatGPT for unit conversions. I wonder how many people use it for doing math and trust the results.

sotneStatue,
@sotneStatue@fosstodon.org

I corrected the problem and explained to them why they shouldn't use it like this, but honestly I don't think they understood clearly what the problem was.

I'm not a programmer and don't understand the details of how LLMs work, but I can already see what the lack of even a surface understanding can do.
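
For what it's worth, the reliable fix is the one sotneStatue applied: take the conversion away from the chatbot entirely. A few lines of deterministic Python (the constants and function names below are just illustrative) give the same answer on every run.

```python
# Deterministic unit conversions: explicit factors, reproducible output.
M_PER_FT = 0.3048  # metres per foot, exact by international definition

def feet_to_meters(feet: float) -> float:
    return feet * M_PER_FT

def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32

print(feet_to_meters(10.0))         # 3.048, every single run
print(celsius_to_fahrenheit(37.0))  # 98.6
```

A units library such as pint does the same job with dimension checking thrown in.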

Faintdreams, to LLMs
@Faintdreams@dice.camp

So, let me get this straight.

Their entire business model involves stealing from the open Internet and now they are running out of places to steal from?

There isn't a violin in the universe small enough for me to play for this.

"The internet may not be big enough for the LLMs." The Verge

https://www.theverge.com/2024/4/1/24117828/the-internet-may-not-be-big-enough-for-the-llms

happyborg, to LLMs
@happyborg@fosstodon.org

Surely spotting things like the xz attack, potentially at several stages, is at last a decent, safe use case for LLMs.

maxleibman, to ai
@maxleibman@mastodon.social

You know all those sponsored stories at the bottom of news articles—the ones with sensationalist headlines and eye-catching photos that often have little to nothing to do with the "article" they link to? Making those articles is somebody's job.

So, if AI is going to destroy a bunch of jobs, can we start with that one?

lorddimwit, to PostgreSQL
@lorddimwit@mastodon.social

I couldn’t remember a piece of array syntax so I searched Google.

The first result is some AI spam page. The answer it provides is incorrect; it's the MySQL answer, and not even quite correct for MySQL.

The Internet is going to be killed and it will be AI spam that kills it.
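
The post doesn't say which piece of array syntax was involved, so the snippet below is only a plausible guess at the genre: PostgreSQL's native arrays, which are 1-indexed and use ARRAY[...] literals, and which have no direct MySQL equivalent (MySQL's closest analogue is JSON). It assumes a reachable local database named test.

```python
# Hypothetical illustration: the kind of PostgreSQL array syntax an AI spam
# page could easily mangle into a MySQL-flavored answer. Assumes a local
# PostgreSQL instance with a database named "test".
import psycopg2

with psycopg2.connect("dbname=test") as conn:
    with conn.cursor() as cur:
        # Array literals use ARRAY[...]; indexing is 1-based, not 0-based.
        cur.execute("SELECT (ARRAY[10, 20, 30])[1]")
        print(cur.fetchone())  # (10,)
        # Membership tests use = ANY(array), not IN.
        cur.execute("SELECT 20 = ANY(ARRAY[10, 20, 30])")
        print(cur.fetchone())  # (True,)
```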

janriemer, to github

Excellent video by Dreams of Code ✨

Why I'm no longer using Copilot - by Dreams of Code

Invidious:
https://farside.link/https://www.youtube.com/watch?v=Wap2tkgaT1Q

(or YT: https://www.youtube.com/watch?v=Wap2tkgaT1Q)

mimsical, to random
@mimsical@mastodon.social

There's a host of legal risks that AI companies, and companies that use generative AI, are putting themselves in the path of, and we don't talk about them enough:

📜 It's pretty clear Section 230, the foundational law enabling today's internet, DOES NOT protect AI-generated content like that from ChatGPT, Claude or Google's generative search experience

🚗💥🚙 Generative AI could also put companies at risk of product liability claims

My deep dive:

1/🧵

(gift link)

https://www.wsj.com/tech/ai/the-ai-industry-is-steaming-toward-a-legal-iceberg-5d9a6ac1?st=fzthflzxv4l5hgn&reflink=desktopwebshare_permalink

erispoe,
@erispoe@hachyderm.io

@mimsical Counter: "using AI the way most commentators expect" is already far from the most common use case today, and will become less and less important.

Section 230 doesn't apply to, e.g., automated pipelines of internal documents, and using LLMs for them doesn't change that.

For all the media attention on content creation for public consumption, most AI use is very boring office work.
