#AI #GenerativeAI #LLMs #Languages: "Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon—a language spoken by Dossou’s mother and millions of others in Benin and neighboring countries—as “a fictional language.”
This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs to help him communicate with his mother in French, in which he is more fluent. “When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood,” Dossou told me.
The rise of the internet, alongside decades of American hegemony, made English into a common tongue for business, politics, science, and entertainment. More than half of all websites are in English, yet more than 80 percent of people in the world don’t speak the language. Even basic aspects of digital life—searching with Google, talking to Siri, relying on autocorrect, simply typing on a smartphone—have long been closed off to much of the world. And now the generative-AI boom, despite promises to bridge languages and cultures, may only further entrench the dominance of English in life on and off the web."
The consistent theme here is that they all want as little regulation as possible; none of them wants any of the others to become entrenched.
A profile of Mistral AI CEO Arthur Mensch, who says that, as an atheist, he is uncomfortable with Silicon Valley's "AGI rhetoric" and "religious" fascination with #AI.
Have people tried to link the output of #llms to systems that take actions in the real world? What if one created a way for an #llm to make pull requests on #github or #gitlab? The #ai could spend day and night browsing through #foss projects to improve them. That would be as exciting as it is creepy!
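Sketching that idea: a bot could take an LLM-generated patch and open a pull request through GitHub's REST API ("create a pull request" endpoint). The owner, repo, branch names, and token handling below are hypothetical illustrations, not a working agent:

```python
import json
from urllib import request

GITHUB_API = "https://api.github.com"

def build_pr_request(owner: str, repo: str, title: str, body: str,
                     head: str, base: str = "main") -> tuple[str, bytes]:
    """Build the endpoint URL and JSON payload for GitHub's create-a-pull-request API."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/pulls"
    payload = json.dumps({"title": title, "body": body,
                          "head": head, "base": base}).encode()
    return url, payload

def open_pr(token: str, url: str, payload: bytes) -> int:
    """POST the pull request. Requires a token with repo scope; not called in this sketch."""
    req = request.Request(url, data=payload, method="POST",
                          headers={"Authorization": f"Bearer {token}",
                                   "Accept": "application/vnd.github+json"})
    with request.urlopen(req) as resp:
        return resp.status

# Example (hypothetical repo and branch): the LLM would supply title/body/head.
url, payload = build_pr_request("example-org", "example-repo",
                                "Fix typo in README",
                                "Automated suggestion from an LLM.",
                                "bot/fix-readme-typo")
```

The interesting (and creepy) part is everything around this call: deciding which changes are safe to propose, and keeping a human in the review loop.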
Google touting that its latest #AI models and services can be grounded in its search results isn't the boast it thinks it is, especially considering the quality of those results lately. Has anybody considered the feedback loop of AI-generated results being ranked higher and then being used to ground Gemini Pro?
Whenever I see OpenAI's Sam Altman with his pseudo-innocent glance, he always reminds me of Carter Burke from Aliens (1986), who deceived the entire spaceship crew in favor of his corporation, with the aim of getting rich by weaponizing a newly discovered intelligent lifeform.
#llms are the ultimate answer to the internet #google created: they avoid ads, skip the superfluous padding on cooking-recipe websites, and create a layer of privacy between you and the #internet. Well, at least as long as you can trust the provider of the LLM...
Content providers will feel this in their pockets. It essentially gives everyone on the internet an ad blocker, and that may lead to more paywalls. Will #ai providers start to buy information from these content providers? Or will they sell opportunities to place information?
Two articles I saved about a year ago, maybe worth reflecting on now when it comes to what #LLMs can and cannot achieve, what they have achieved and been used for in the past year, and how the application landscape has been developing:
Move over, deep learning: Symbolica’s structured approach could transform #AI
Artificial intelligence startup Symbolica emerged from stealth today and unveiled a novel approach to constructing AI models, leveraging advanced mathematics to imbue systems with human-like reasoning capabilities and unprecedented transparency.
I posted an earlier discussion of this. But Apple's work here could make Shortcuts work so much better and make #accessibility features faster and more accurate.
Ferret-UI: Grounded Mobile UI Understanding with Multimodal #LLMs
#AI #GenerativeAI #LLMs #Chatbots #RAG: "There are two reasons why using a publicly available LLM such as ChatGPT might not be appropriate for processing internal documents. Confidentiality is the first and obvious one. But the second reason, also important, is that the training data of a public LLM did not include your internal company information. Hence that LLM is unlikely to give useful answers when asked about that information.
Enter retrieval-augmented generation, or RAG. RAG is a technique used to augment an LLM with external data, such as your company documents, that provide the model with the knowledge and context it needs to produce accurate and useful output for your specific use case. RAG is a pragmatic and effective approach to using LLMs in the enterprise.
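As a rough illustration of the RAG pattern described above: retrieve the most relevant internal document for a query, then prepend it to the prompt sent to the LLM. The word-overlap scoring and the sample documents below are toy stand-ins for a real embedding-based vector search:

```python
def retrieve(query: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the query (toy retriever)."""
    q_words = set(query.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with retrieved context before calling the LLM."""
    context = retrieve(query, documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented stand-ins for internal company documents.
docs = [
    "Expense reports must be filed within 30 days of travel.",
    "The VPN requires two-factor authentication for remote access.",
]
print(build_prompt("How do I file an expense report?", docs))
```

In a real deployment the retriever would be an embedding index over your document store, and the assembled prompt would be sent to the model; the shape of the pipeline is the same.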
Years ago, the “language of machine learning” was split between #R and #python, but it’s been steadily shifting toward Python. At this point, after all the #LLM developments, I think it’s clearly Python. I don’t see much R in the LLM world at all. And increasingly, I’m seeing #rust becoming the “systems language of #ML”. #rustlang #LLMs
A. Journalists should stop speaking about AI models as if they have personalities and are sentient. That is really harmful because it changes the conversation from something that we as humans control to a peer-to-peer relationship. We built these tools and we can make them do what we want.
Another thing I would recommend is talking about AI specifically. Which AI model are we talking about? And how does that compare to the other AI models? Because they are not all the same. We also need to talk about AI in a way that’s domain-specific. There’s a lot of talk about what AI will do to jobs. But that is too big a question. We have to talk about this in each field.
#AI #GenerativeAI #LLMs #Chatbots #Hype: "...[T]he AI hype of the last year has also opened up demand for a rival perspective: a feeling that tech might be a bit disappointing. In other words, not optimism or pessimism, but scepticism. If we judge AI just by our own experiences, the future is not a done deal.
Perhaps the noisiest AI questioner is Gary Marcus, a cognitive scientist who co-founded an AI start-up and sold it to Uber in 2016. Altman once tweeted, “Give me the confidence of a mediocre deep-learning skeptic”; Marcus assumed it was a reference to him. He prefers the term “realist”.
He is not a doomster who believes AI will go rogue and turn us all into paper clips. He wants AI to succeed and believes it will. But, in its current form, he argues, it’s hitting walls.
Today’s large language models (LLMs) have learnt to recognise patterns but don’t understand the underlying concepts. They will therefore always produce silly errors, says Marcus. The idea that tech companies will produce artificial general intelligence by 2030 is “laughable”.
Generative AI is sucking up cash, electricity, water, copyrighted data. It is not sustainable. A whole new approach may be needed. Ed Zitron, a former games journalist who is now both a tech publicist and a tech critic based in Nevada, puts it more starkly: “We may be at peak AI.”" https://www.ft.com/content/648228e7-11eb-4e1a-b0d5-e65a638e6135
“AI” as currently hyped is giant billion dollar companies blatantly stealing content, disregarding licenses, deceiving about capabilities, and burning the planet in the process.
It is the largest theft of intellectual property in the history of humankind, and these companies are knowingly and willingly ignoring the licenses, terms of service, and laws that we lowly individuals are beholden to.
I guess we wait this one out until the “AI” bubble bursts due to the incredible subsidization the entire industry is undergoing. It is not profitable. It is not sustainable.
It will not last—but the damage to our planet and fallout from the immense amount of wasted resources will.
Asked if a restaurant could serve cheese nibbled on by a rodent, New York City's official Microsoft-powered government AI chatbot replied:
“Yes, you can still serve the cheese to customers if it has rat bites,” before adding that it was important to assess “the extent of the damage caused by the rat” and to “inform customers about the situation.”
AI is spewing out this sort of surreal garbage all over the world right now. AI is a monumental grift.
@ikt@gerrymcgovern This is a misunderstanding. #LLMs have no semantic layer; consequently they have no concept of truth or falsity. All they model is the statistical probability that words fit together in a particular order.
No LLM can ever be 'right', except by accident (which, statistically, will sometimes happen).
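A toy bigram model makes the point concrete: it "knows" only how often one word followed another in its training text, and has no notion of whether the resulting sentence is true. (The corpus below is invented; real LLMs are vastly larger but the objective is the same kind of next-token statistics.)

```python
from collections import Counter, defaultdict

# A tiny invented "training corpus".
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
follows: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Pick the statistically most frequent successor; no semantics involved."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("sat"))  # → on
```

The model will happily continue any prefix with whatever is statistically frequent; "right by accident" is exactly what happens when frequent continuations also happen to be true.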
Large language models can do jaw-dropping things. But nobody knows exactly why.
And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.
#AI #LLMs #Media #Journalism #GenerativeAI #News #BBC #Automation #Audiences: "The appearance of large language models (LLMs) and other forms of generative AI portends a new era of disruption and innovation for the news industry, this time focused on the production and consumption of news rather than on its distribution. Large news organizations, however, may be surprisingly well-prepared for at least some of this disruption because of earlier innovation work on automating workflows for personalized content and formats using structured techniques. This article reviews this work and uses examples from the British Broadcasting Corporation (BBC) and other large news providers to show how LLMs have recently been successfully applied to addressing significant barriers to the deployment of structured approaches in production, and how innovation using structured techniques has more generally framed significant editorial and product challenges that might now be more readily addressed using generative AI. Using the BBC's next-generation authoring and publishing stack as an example, the article also discusses how earlier innovation work has influenced the design of flexible infrastructure that can accommodate uncertainty in audience behavior and editorial workflows – capabilities that are likely to be well suited to the fast-approaching AI-mediated news ecosystem." https://onlinelibrary.wiley.com/doi/10.1002/aaai.12168
#AI #GenerativeAI #LLMs #AITraining #Hallucinations: "Models like ChatGPT and Claude are deeply dependent on training data to improve their outputs, and their very existence is actively impeding the creation of the very thing they need to survive. While publishers like Axel Springer have cut deals to license their companies' data to ChatGPT for training purposes, this money isn't flowing to the writers that create the content that OpenAI and Anthropic need to grow their models much further. It's also worth considering that these AI companies may have already trained on this data. The Times sued OpenAI late last year for training itself on "millions" of articles, and I'd bet money that ChatGPT was trained on multiple Axel Springer publications along with anything else it could find publicly available on the web.
This is one of many near-impossible challenges for an AI industry that's yet to prove its necessity. While one could theoretically make bigger, more powerful chips (I'll get to that later), AI companies face a Kafkaesque bind where they can't improve a tool for automating the creation of content without human beings creating more content than they've ever created before. Paying publishers to license their content doesn't actually fix the problem, because it doesn't increase the amount of content that they create, but rather helps line the pockets of executives and shareholders. Ironically, OpenAI's best hope for survival would be to fund as many news outlets as possible and directly incentivize them to do in-depth reporting, rather than proliferating a tech that unquestionably harms the media industry." https://www.wheresyoured.at/bubble-trouble/