Large Language Models

informapirata, 1 month ago Italian

How #LLM work, explained without math

Where the apparent intelligence of these models comes from. In this article, I will try to explain in simple terms, and without using advanced math, how generative text models work, to help you think of them as computer algorithms rather than magic.

@aitech

https://blog.miguelgrinberg.com/post/how-llms-work-explained-without-math

pseudocurious, 1 month ago

@snoopy I can't access your post for the #DMNQ on #LLM. Is that expected?

pseudocurious, 1 month ago

@snoopy @snoopy I did see your post on jlai.lu. I opened #sharkey so I could boost it, and that is where I could no longer see it.

snoopy, 1 month ago

@pseudocurious @snoopy yes, I thought that pinging myself from jlai.lu would force its discovery. Can you see it via my mastodon account?

jemoka, 1 month ago

🎉 new preprint day

Wrote some multi-hop reasoning work recently, formalizing #llm inference as a #pomdp

achieved #sota results on the Game of 24 problem from Tree of Thoughts

https://arxiv.org/abs/2404.19055

nyergler, 1 month ago

Guys! My Rabbit R0 arrived! Can’t wait to see how useful it is! #llm #RabbitR1 #ai

nyergler, 1 month ago

@luis_in_brief Also, this is one of two FirefoxOS devices I owned. It took extra digging in the garage to find the orange one, which was obviously critical for the bit.

luis_in_brief, 1 month ago

@nyergler thank you for your service ‽

jmcastagnetto, 1 month ago

"Researchers at #Stanford Introduce #SUQL: A Formal Query Language for Integrating Structured and Unstructured #Data"

https://www.marktechpost.com/2024/05/04/researchers-at-stanford-introduce-suql-a-formal-query-language-for-integrating-structured-and-unstructured-data/

See paper at: https://arxiv.org/abs/2311.09818, and code at: https://github.com/stanford-oval/suql

#UnstructuredData #LLM #StructuredData

chikim, 1 month ago

I created a multi-needle in a haystack test where a randomly selected secret sentence was split into pieces and scattered throughout the document with 7.5k tokens in random places. The task was to find these pieces and reconstruct the complete sentence with exact words, punctuation, capitalization, and sequence. After running 100 tests, llama3:8b-instruct-q8 achieved a 44% success rate, while llama3:70b-instruct-q8 achieved 100%! #LLM #AI #ML https://github.com/chigkim/haystack-test
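A minimal sketch of how a test like this could be set up. It is not the code from the linked repository: the function names, the `[secret: ...]` marker, and the choice to keep the pieces in sentence order are all assumptions made for illustration.

```python
import random


def scatter_needles(filler_words, secret, n_pieces, rng):
    """Split `secret` into n_pieces chunks of words and insert them at
    random positions in the filler text (hypothetical helper)."""
    words = secret.split()
    # Pick n_pieces - 1 cut points so that every piece is non-empty.
    cuts = sorted(rng.sample(range(1, len(words)), n_pieces - 1))
    bounds = [0] + cuts + [len(words)]
    pieces = [" ".join(words[a:b]) for a, b in zip(bounds, bounds[1:])]
    # Sorted insertion points keep the pieces in sentence order (assumption).
    positions = sorted(rng.sample(range(len(filler_words) + 1), n_pieces))
    doc = list(filler_words)
    # Insert from the back so the earlier indices stay valid.
    for pos, piece in reversed(list(zip(positions, pieces))):
        doc.insert(pos, f"[secret: {piece}]")
    return " ".join(doc), pieces


def is_exact_match(answer, secret):
    """Strict scoring: words, punctuation, capitalization, and order must match."""
    return answer.strip() == secret


def success_rate(results):
    """Fraction of runs (list of booleans) in which the model succeeded."""
    return sum(results) / len(results)
```

In a real run, the generated document would be sent to the model together with an instruction to reconstruct the sentence, and `is_exact_match` would score each of the 100 answers.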

cerisara, 1 month ago

#LLM on CPU only.

For inference, the best option right now is llama.cpp with a quantized LLM in GGUF format. There are several high-level wrappers around llama.cpp that make it easy to use: ollama, vllama...

For inference with a very big LLM and very little RAM, the only option is airLLM: it's slow, but you can run llama3-70b

For finetuning a quantized LLM with LoRA, the only option afaik is also llama.cpp (look for "finetune"). It's a work in progress but usable and promising!
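For readers wondering what "quantized" means here, a toy sketch of symmetric 8-bit block quantization follows. It only illustrates the general idea of storing one scale plus small integer codes per block of weights. It is not the actual GGUF layout or llama.cpp code, and the function names are hypothetical.

```python
def quantize_block(values):
    """Symmetric 8-bit quantization of one block of weights:
    returns a single float scale and integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero for an all-zero block.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate weights from the scale and the integer codes."""
    return [scale * c for c in codes]


weights = [0.12, -0.5, 0.33, 0.0, 0.49]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
# Each restored weight differs from the original by at most about scale / 2.
```

The memory saving comes from storing each weight as a small integer instead of a 16-bit or 32-bit float, at the cost of a small rounding error per weight.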

wildebees, 1 month ago

Beyond the brain: Our intelligence leverages the power of culture and language. Channeling Ted Underwood and François Chollet, I argue that language models, despite their biases and lack of understanding, are important tools for thinking. 🗣️🌍💡 cc @TedUnderwood
https://leviathan.substack.com/p/beyond-the-brain

#LLM #culture #AI

Exxo, 1 month ago German

Pretty arrogant, the way some people demonize #KI (AI) or #LLM.

Not everyone has words falling from their fingers like early-ripened fruit; for many, writing is a tough, sluggish meandering.

And if AI helps with the writing and gives these people more participation, more productivity, and simply a better feeling, that's great!

ianRobinson, 1 month ago

Anthropic released an iOS app for their Claude 3 LLM.

I’m past the stage that dismisses LLMs. Some variant will be a useful tool for me. For various tasks. Some I haven’t thought of yet. I’m currently using them as research assistants on topics I’m writing about. To see if detailed prompts (several hundred words with topic headings etc) get responses that include things I’d overlooked. I don’t use any generated text directly.

I might use Claude as a tutor for some studying I plan. #LLM

knitter, 1 month ago

@ianRobinson Actually, in #Europe I cannot access the #Claude #iOS app. Shame, would have liked to try it.

ianRobinson, 1 month ago

@knitter Yikes. Have we stumbled on a Brexit benefit! It’d be a first! Use the web page directly via VPN if you need to. The iOS app experience is the same as the web app. https://claude.ai/

bsletten, 1 month ago

Thank goodness for small favors. The U.S. military is halting exploration of generative AI because <checks notes> it sucks.

#genai #llm

https://www.axios.com/2024/05/01/pentagon-military-ai-trust-issues

dhinojosa, 1 month ago

@bsletten If we can attack the wrong country without AI, imagine the possibilities with AI.

punkscience_ns, 1 month ago

An #LLM #AI trained on playlists and music reviews who talks with a sneer like a local record store clerk who just doesn't have time for your pedestrian tastes. #ideas But if it must, it will make recommendations.

kellogh, 1 month ago

@punkscience_ns @actuallybot

obrhoff, 1 month ago

The amazing thing about LLMs is how much knowledge they possess in their small size. The llama3-8b model, for instance, weighs only 4.7GB yet can still answer your questions about everything (despite some hallucinations).
#llm #ai #ollama #llama3
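A rough back-of-the-envelope check of that file size, as a sketch under stated assumptions: the parameter count is rounded to 8 billion, and 4.5 bits per weight is assumed as a typical figure for a 4-bit block format that also stores a per-block scale. Real files add metadata and may keep some tensors at higher precision, so the estimate only needs to land in the same ballpark as 4.7GB.

```python
def approx_model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 8e9  # assumption: "8b" rounded to 8 billion parameters

fp16_gb = approx_model_size_gb(N_PARAMS, 16)   # unquantized 16-bit weights
q4_gb = approx_model_size_gb(N_PARAMS, 4.5)    # assumed ~4.5 bits per weight

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {q4_gb:.1f} GB")
```

Under these assumptions the 16-bit model would need about 16 GB, while the 4-bit version comes to about 4.5 GB, which is consistent with the size quoted in the post.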

noplasticshower, 1 month ago

@obrhoff being DEAD WRONG is not really a "hallucination"...but your point is well taken. Cramming information into a smaller space is amazing.

When you re-represent and compress, information in the long tails of gradient Gaussians disappears.

nithinbekal, 1 month ago

Finally got around to playing with LLMs locally, and turns out ollama makes it incredibly easy.

https://nithinbekal.com/posts/ollama-llama3-phi3/

As a newbie, this was much easier than the last time I looked at this 6 months ago, when I was confused by the tooling around it.

#llm #ollama #llama3 #phi3 #ai #ml

kjr, 1 month ago

I am trying to build a RAG pipeline with Llama 3 and... going really crazy over the strange formats I get in the response...
Not only the answer itself, but additional text, XML tags...
#Llama3 #LLM #RAG

kjr, 1 month ago

I realize now that maybe that is a question for @raf

raf, 1 month ago

@kjr

Do you have a desired output format?
