Stefano Fancello is talking about #LangChain, an open-source Python-based toolkit for Retrieval-Augmented Generation #RAG. It helps you prepare your own data as context for a question you send to a Large Language Model #LLM. LangChain tools can ingest all kinds of document formats, split documents into chunks, create so-called #Embeddings, and send them to the LLM. #fosdem2024 #ai #opensource
not sure if this exists — i wish i could prompt an embedding model.
i.e. provide context for the prompt to be interpreted, but not have the prompt itself contribute to the embedding value. like, the prompt could contain concepts that aren’t lit up at all in the embedding unless the core text references them #LLMs #embeddings #transformers
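the wish above can be sketched with a toy model. this is not a real embedding model, just a bag-of-words illustration of the gating behavior being asked for: prompt terms amplify dimensions the core text already activates, but contribute nothing on their own. the vocabulary, `boost` parameter, and function names are all made up for illustration.

```python
import numpy as np

# Tiny fixed vocabulary for the toy bag-of-words "embedding"
VOCAB = ["bank", "river", "money", "water", "loan"]

def bow(text):
    """Bag-of-words vector over the toy vocabulary."""
    words = text.lower().split()
    return np.array([words.count(w) for w in VOCAB], dtype=float)

def prompted_embedding(core_text, prompt, boost=2.0):
    """Hypothetical prompt-conditioned embedding: prompt terms act as a
    gate, amplifying dimensions already present in the core text, but a
    prompt concept absent from the core text stays at zero."""
    core = bow(core_text)
    context = bow(prompt)
    gate = 1.0 + boost * (context > 0)  # amplify only where prompt overlaps
    return core * gate                  # prompt alone contributes nothing

# "water" is in the prompt but not the core text, so that dimension stays 0;
# "river" is in both, so it gets boosted
plain = prompted_embedding("the bank of the river", prompt="")
hinted = prompted_embedding("the bank of the river", prompt="river water")
```

a real version would need this gating to happen inside the model's attention, which is the part that (as far as i know) doesn't exist off the shelf.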
I've dreamed about this for quite some time now, and now I've finally been able to cobble it together!
What you're seeing is org-roam node (subtree or file) live Jina AI (fully local) similarity search in the org-roam buffer, alongside your backlinks and reflinks. This automatically surfaces other org-roam nodes related to the one you're currently reading, or even working on!
This open source setup currently works as follows:
- export all of your org-roam nodes as text files using the supplied Emacs Lisp
- use embed.py to calculate embeddings for all of these txt files and store them in a Parquet file
- run serve.py, which waits for any submitted text and returns the N closest node ids according to the Jina AI learned embeddings. These are really quite good and fully local, but it would be straightforward to swap in a service like OpenAI embeddings instead
- additional Emacs Lisp customizes the org-roam buffer setup to call serve.py's endpoint and render the list of similar nodes
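the core of the serve.py step can be sketched like this. the node ids, vectors, and function name are illustrative stand-ins (the real embeddings would come from the Jina model and be loaded from the Parquet file); this only shows the top-N cosine-similarity lookup.

```python
import numpy as np

def top_n_nodes(query_vec, node_ids, node_vecs, n=3):
    """Return the n node ids whose embeddings are closest to query_vec
    by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    m = node_vecs / np.linalg.norm(node_vecs, axis=1, keepdims=True)
    sims = m @ q                   # cosine similarity per node
    order = np.argsort(-sims)[:n]  # highest similarity first
    return [node_ids[i] for i in order]

# Toy data standing in for the Parquet store of node embeddings
ids = ["node-a", "node-b", "node-c"]
vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
nearest = top_n_nodes(np.array([1.0, 0.05]), ids, vecs, n=2)
```

wrapping this in a small HTTP handler that accepts text, embeds it, and returns the id list is all serve.py would need on top.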
@sleslie@clintlalonde That’s logical to want a more purely trained LLM, but isn’t the amount of data needed super large? I’m trying to wrap my head around it, but if I read Simon Willison right, we don’t necessarily need to build new LLMs but rather understand how to deploy embeddings over the source content we want it to draw from? https://simonwillison.net/2023/Aug/27/wordcamp-llms/#embeddings I’ve read all the explanations of them being 1536-dimensional vectorized representations of tokens but struggle to grok it
i wish i knew more about comparing #embeddings. anyone have resources? one thing i’ve wondered is how to convert an embedding from a “point” to an “area” or “volume”. e.g. an embedding of a 5 paragraph essay will occupy a single point in embedding space, but if you broke it down (e.g. by paragraph), there would be several points and the whole would presumably be at the center. is there a way to trace the full space a text occupies in #embedding space? #LLMs #LLM #AI #NLP
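one simple way to make the "point vs. volume" idea concrete: embed each paragraph separately, then summarize the resulting point cloud by its centroid and spread. the vectors below are made up and the function name is mine; a real setup would get the per-paragraph vectors from a sentence-embedding model.

```python
import numpy as np

def embedding_region(paragraph_vecs):
    """Describe the region a text occupies in embedding space:
    the centroid of the per-paragraph points, plus the mean distance
    of those points from the centroid (a rough 'radius')."""
    pts = np.asarray(paragraph_vecs, dtype=float)
    centroid = pts.mean(axis=0)
    radius = np.linalg.norm(pts - centroid, axis=1).mean()
    return centroid, radius

# Four toy "paragraph embeddings" in 2D
paras = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
centroid, radius = embedding_region(paras)
```

a centroid-plus-radius summary is the crudest possible shape (a ball); fitting a covariance matrix over the paragraph points would give an ellipsoid and capture which directions the text actually spreads in.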