just had a conversation with my neighbor — he’s wearing an NRA t-shirt and explaining that the pistol he has strapped to his hip is for shooting copperheads. And also he’s concerned that we leave our garage open too much and that copperheads might get in
Looks like today I finally found a good application for #llm's: learning languages!
I've been attempting to learn #arabic through duolingo for a while now, without much success. I figured if there's one thing language models should be good at, it's languages. So far the thing has actually been pretty helpful.
I was looking over some of the new EU AI restrictions; notably, they bar educational institutions and employers from using biometric emotion detection.
This seems fine, but #gpt4o makes this complicated. You can't turn off its emotion detection; it isn't even an explicit feature, but it's very much there. The demo where the person breathes hard and soft is one example, but GPT-4V (vision) already showed how it could detect emotion in images and act on that.
@ianbicking it seems like the obvious way to turn it off is to not use audio or vision, right? not a great solution, but it’s only for employers (presumably against their employees) and educators (presumably against their students)
also, now with #gpt4o, latency is going to be critical if you’re doing streaming audio/video, so #python may start looking less appealing. what’s the new #LLM language? #rust? #go? #cpp? #fortran?
seriously though, idk, python is not my go-to low latency language. so much of the language is written without performance in mind, e.g. a for loop raises a StopIteration exception to stop looping, a small integer takes 28 bytes, etc.
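both of those overheads are easy to see from the interpreter; a minimal sketch (sizes are CPython-on-64-bit specifics and may differ on other builds):

```python
import sys

# CPython stores every int as a full heap object with a header,
# so even a tiny integer typically costs 28 bytes on a 64-bit build.
print(sys.getsizeof(1))

# Iteration is driven by the StopIteration protocol: every for loop
# ends by an exception being raised and caught under the hood.
it = iter([1, 2, 3])
next(it); next(it); next(it)
try:
    next(it)
except StopIteration:
    print("loop ends by raising an exception")
```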
@dr2chase you're totally right. python has an enormous ecosystem, which makes it fantastic for slinging pipelines together. however, the announcement today says that the audio responds within ~350 milliseconds. i'd say your app can add 50-100 ms more without changing the experience much, but python apps typically add more like 200-400 ms
imo rust might be the next contender. they’ve been building out a lot of new data science & engineering tools with it. idk
i predict that there’s always going to be strong advantages to using #python for #ai, but with streaming audio & video of #gpt4o, there’s not enough latency slack for python.
i think a framework will emerge, similar to pyspark, where you can write python code that gets compiled into a streaming plan and executed as highly optimized low level #rust code, with the possibility of python UDFs. i figure it's still a couple of years from being really usable
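to make the pyspark comparison concrete, here's a toy sketch of what such a lazy plan builder might look like in python — all names are invented, and the `run` step just interprets the plan in python where a real engine would hand it to native code:

```python
# Hypothetical sketch: a PySpark-style lazy plan. Calling map/filter
# only records operations; nothing executes until run().
class Stream:
    def __init__(self, ops=None):
        self.ops = ops or []

    def map(self, fn):
        return Stream(self.ops + [("map", fn)])

    def filter(self, fn):
        return Stream(self.ops + [("filter", fn)])

    def run(self, items):
        # A real engine would compile self.ops to optimized native
        # (e.g. rust) code; here we just interpret the plan.
        for op, fn in self.ops:
            items = map(fn, items) if op == "map" else filter(fn, items)
        return list(items)

plan = Stream().map(lambda x: x * 2).filter(lambda x: x > 4)
print(plan.run([1, 2, 3, 4]))  # [6, 8]
```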
My notes on this morning's OpenAI release of the new GPT-4o model - not a huge leap in "intelligence" (whatever that might mean) but still very significant thanks mainly to the impressive new audio capabilities and the drop in price - 50% cheaper via API, and should soon be available for free ChatGPT users as well https://simonwillison.net/2024/May/13/gpt-4o/
@simon how do you think embeddings will fit into streaming audio & video? i can’t imagine RAG can go away, but certainly we need new embedding models, right?
When pg_vector dropped for Postgres I said it would only be a matter of time until vector search was just a standard feature of databases. It's available in OLAP platforms like Snowflake and BigQuery, and now it's come to workhorses like Cassandra too.
i wish “type checking for infrastructure” was a thing
my code declares that there should be a S3 bucket that’s different from that other S3 bucket, etc. -> spin up the type checker, it reads APIs and verifies, “yep, this code should run fine”