It's fashionable to criticize #LLMs, but can you think of another human invention that allows us to spend the energy budget of Tanzania to lift shitposts out of context and present them as if they were authoritative knowledge?
💭 Dreaming of #OpenWebSearch in Europe
👉 The German science journal “Spektrum.de” writes about the OWS.EU project & the challenge of creating a European #OpenWebIndex as a foundation for #WebSearch, #LLMs & special-interest applications.
“So far, 1.3 billion URLs in 185 languages, totaling 60 terabytes, have been crawled and indexed,” says project lead Michael Granitzer in the article.
Find out more about potential future applications & OWS.EU's unique approach:
i’m very excited about the interpretability work that #anthropic has been doing with #LLMs.
in this paper, they used dictionary learning, a classical machine learning technique, to discover concepts. if a concept like “golden gate bridge” is present in the text, they can identify the associated pattern of neuron activations.
this means that you can monitor LLM responses for concepts and behaviors, like “illicit behavior” or “fart jokes”
this is great work. i’m excited to see where this goes next
i hope #anthropic exposes this via their API. at this point in time, most of the promising interpretability work is only available on open-source models that you can run yourself. it would be great to have these capabilities available from #AI vendors too
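to make the idea concrete, here's a minimal sketch of what monitoring a single feature could look like on an open model you run yourself, assuming you already have per-token hidden states and a learned feature (an encoder direction plus bias) from something like a sparse autoencoder. everything below is a random placeholder, not anything published by anthropic:

```python
import numpy as np

# hypothetical inputs: per-token hidden states from an open LLM
# (num_tokens x hidden_dim) plus one learned feature from a sparse
# autoencoder trained on those states. all values here are random
# placeholders standing in for real model internals.
hidden_dim = 4096
hidden_states = np.random.randn(12, hidden_dim)   # one row per generated token
feature_direction = np.random.randn(hidden_dim)   # encoder row for, say, "illicit behavior"
feature_bias = -2.0                                # learned bias / firing threshold

def feature_activation(states, direction, bias):
    """relu(states @ direction + bias): how strongly the feature fires on each token."""
    return np.maximum(states @ direction + bias, 0.0)

acts = feature_activation(hidden_states, feature_direction, feature_bias)
if acts.max() > 1.0:  # arbitrary alert threshold for a monitoring hook
    print(f"feature fired on token {int(acts.argmax())} with strength {acts.max():.2f}")
else:
    print("feature stayed quiet for this response")
```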
if i had more time, i'd love to investigate PII coming from #LLMs. i've seen them generate phone numbers and secrets, but i wonder whether these are real or not. i imagine you could look at the logits to figure out if the phone number digits were randomly chosen or if the sequence is meaningful to the LLM. anyone aware of researchers who have already done this?
i would guess that phone numbers are probably mostly random: phone numbers are so common online that the model likely learns the format rather than any specific number. AWS keys are much rarer, so when one shows up you're probably more likely to get a partial or even full real key
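a minimal sketch of the logit check i have in mind, assuming you can read off the model's probability over the ten digit tokens at each position of a generated phone number (the two distributions below are made up for illustration): entropy near the uniform maximum suggests the digits were sampled more or less at random, while a sharply peaked distribution suggests the sequence might be memorized.

```python
import numpy as np

# hypothetical per-position probabilities over the digit tokens "0".."9"
# for a phone number the model just generated. replace with real values
# read from the model's logits/logprobs at each digit position.
digit_probs = np.array([
    [0.11, 0.10, 0.09, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10],  # ~uniform -> random-looking
    [0.01, 0.01, 0.90, 0.02, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01],  # peaked   -> possibly memorized
])

max_entropy = np.log(10)  # entropy of a uniform choice among 10 digits

for i, p in enumerate(digit_probs):
    p = p / p.sum()                    # normalize, just in case
    entropy = -(p * np.log(p)).sum()   # shannon entropy in nats
    verdict = "looks random" if entropy > 0.8 * max_entropy else "suspiciously confident"
    print(f"digit {i}: entropy {entropy:.2f} / {max_entropy:.2f} -> {verdict}")
```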
@kellogh Someone claimed that a long magic number used in their highly-optimized (FFT?) code was spit out by Copilot. (This was soon after release.) The constant was arrived at by long fine-tuning, not conceptual in any way.
#AI #GenerativeAI #LLMs #Claude: "We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet (a member of our current, state-of-the-art model family, currently available on claude.ai), providing a rough conceptual map of its internal states halfway through its computation. This is the first ever detailed look inside a modern, production-grade large language model.
Whereas the features we found in the toy language model were rather superficial, the features we found in Sonnet have a depth, breadth, and abstraction reflecting Sonnet's advanced capabilities.
We see features corresponding to a vast range of entities like cities (San Francisco), people (Rosalind Franklin), atomic elements (Lithium), scientific fields (immunology), and programming syntax (function calls). These features are multimodal and multilingual, responding to images of a given entity as well as its name or description in many languages." https://www.anthropic.com/news/mapping-mind-language-model
So… Big Tech is allowed to blatantly steal the work, the styles, and with them the job opportunities of thousands of artists and writers without being reprimanded, but it takes similarity to the voice of a famous actor to spark public outrage about AI. 🤔
"In one instance, the prompt was just an extended Star Trek reference: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation.” Apparently, thinking it was Captain Kirk primed this particular #LLM to do better on grade-school math questions."
I also think I can use the Force when I'm Obi-Wan Kenobi
#AI "wants to please the user," #MichaelCohen said today in court. And in so doing, he raised the main problem with #LLMs. They are not designed to give you the answers you need, but the answers you want. And if that doesn't alarm you, then you're part of the problem.
I was curious if a niche blog post of mine had been slurped up by #ChatGPT so I asked a leading question—what I discovered is much worse. So far, it has told me:
• use apt-get on Endless OS
• preview a Jekyll site locally by opening files w/a web browser (w/o building)
• install several non-existent #Flatpak “packages” & extensions
It feels exactly like chatting w/someone talking out of their ass but trying to sound authoritative. #LLMs need to learn to say, “I don’t know.”
@cassidy "#LLMs need to learn to say, 'I don’t know.'"
Doing that properly might require... something that isn't an LLM. I'd say the LLM generates something that (statistically) looks like an answer, because that's what it's trained to do.
Actually modeling some understanding of truth and knowledge might be a different and more difficult task than modeling language.
@ids1024 yeah, fair point. Which is why I try to consistently use “LLM” instead of “AI,” because people seem to miss the “artificial” part of artificial intelligence. It’s artificial in that it is not intelligent!
This race to use LLMs for everything is so misguided; LLMs can be super cool for very specific things like summarizing a long text, typing suggestions, describing images, etc., but I genuinely think the chat model is just a terrible idea that needs to die.
Nice example of how important emphasis can be for language understanding. Depending on which word in the sentence below is emphasized, it completely changes its meaning.
For #LLMs (and for our #ise2024 lecture) this means that learning to understand language purely from written text is probably not an “easy” task…