@simon@simonwillison.net

simon

Open source developer building tools to help journalists, archivists, librarians and others analyze, explore and publish their data. https://datasette.io and many other #projects.


simon, to random
@simon@simonwillison.net avatar

I wrote about a common misconception I see people have about LLM tools like ChatGPT

Training is not the same as chatting: ChatGPT and other LLMs don’t remember everything you say

https://simonwillison.net/2024/May/29/training-not-chatting/

simon,
@simon@simonwillison.net avatar

If you spend a lot of time with LLMs it's easy to fall into the trap of assuming that other people already understand things like this - which can lead to frustrating conversations where people are bringing very different mental models of how these things work

simon,
@simon@simonwillison.net avatar

@ericflo yeah I don't think that's documented at all - for different chat bots, how is hitting the context limit in a conversation handled? Some might truncate earlier messages but there are summarization tricks that might end up used as well
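A minimal sketch of the "truncate earlier messages" approach mentioned above. This is not how any particular chat bot is known to work (that is undocumented); the function name `truncate_history`, the message dictionaries, and the whitespace-based token count are all assumptions made for illustration.

```python
def truncate_history(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits in max_tokens.

    count_tokens is a stand-in (whitespace word count); a real system would
    use the model's own tokenizer.
    """
    kept = []
    total = 0
    # Walk backwards from the newest message so the oldest are dropped first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept
```

A summarization strategy would instead replace the dropped messages with a short model-written summary, at the cost of an extra model call.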

simon,
@simon@simonwillison.net avatar

This is yet another of those unintuitive things that stem from wrapping a chat interface around an autocompletion language model
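As a rough illustration of that wrapping, here is a sketch of how a chat turn could be flattened into a single piece of text for a completion model to continue. The `User:` / `Assistant:` labels and the `build_prompt` name are hypothetical; real systems use their own special tokens and formats.

```python
def build_prompt(history, new_message):
    """Flatten a conversation into one text prompt for a completion model.

    history is a list of (speaker, text) pairs. The model only ever sees
    this text: nothing persists between calls unless it is replayed here.
    """
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"User: {new_message}")
    # End on the assistant label so the model "autocompletes" the reply.
    lines.append("Assistant:")
    return "\n".join(lines)
```

Under this assumption, every reply comes from re-sending the whole transcript, which is why starting a new conversation starts from a blank slate.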

simon,
@simon@simonwillison.net avatar

@troed I mention that in my article - there are plenty of reasonable reasons that people end up believing this!

simon,
@simon@simonwillison.net avatar

@mcc that's part of my frustration here: I can't say for sure how this stuff works because OpenAI don't document it!

I frequently use prompt leaking tricks (or chat exports) to confirm that I understand how their low-level prompting for ChatGPT works myself, but that's a pretty unreliable form of documentation

simon,
@simon@simonwillison.net avatar

@glyph @mcc headlines are always hard, especially since these days it's clear that a lot of people genuinely won't read more than the headline

simon,
@simon@simonwillison.net avatar

@mcc @glyph the thing I care about most here is that lots of people really do believe that anything they say to a model is instantly memorized and becomes part of its global "brain" (available to all users) - and that's what the term "training" means to them

simon, to random
@simon@simonwillison.net avatar

Wrote up an alternative way of getting Cloudflare to redirect one domain (in this case a no-www domain) to another using redirect rules https://til.simonwillison.net/cloudflare/redirect-rules

simon,
@simon@simonwillison.net avatar

One of those TILs where I figure something out, then go to my TIL website to write it up and discover I already figured out a similar solution just a few months ago! https://til.simonwillison.net/cloudflare/redirect-whole-domain

simon,
@simon@simonwillison.net avatar

@alexelcu I have a detailed TIL about how the most recent version of that works here https://til.simonwillison.net/shot-scraper/social-media-cards

KevinGimbel, to random
@KevinGimbel@fosstodon.org avatar

They be putting AI into everything, and they destroy years of trust.

https://kevingimbel.de/blog/2024/05/re-trust/

simon,
@simon@simonwillison.net avatar

@KevinGimbel was that Golden Gate Bridge suicide one confirmed as real? I got the impression it might have been one of the faked screenshots that were floating around (pretty damning that it's so easy for us to believe in those fakes if that's true though)

simon, to random
@simon@simonwillison.net avatar

Weeknotes: mainly PyCon US and updates to both LLM and some Datasette plugins to support GPT-4o and Google's Gemini Flash https://simonwillison.net/2024/May/28/weeknotes/

jonkeegan, to random
@jonkeegan@mastodon.social avatar

NEW POST on my @Beautifulpublicdata newsletter:

New FOIA records from the FAA shed light on the frantic effort in 2015 to rename navigation waypoints related to Donald Trump and reveal the list of naughty waypoint names that were changed over the years.

https://www.beautifulpublicdata.com/trump-naughty-faa-waypoints/

simon,
@simon@simonwillison.net avatar

@jonkeegan @Beautifulpublicdata Is it possible to FOIA the rationale for more of those renames? As a big Narwhal fan I'd love to know why NARWL became FOLET for example

tannewt, to random
@tannewt@mastodon.online avatar

@simon I’m finally understanding the value of GitHub Copilot when coding. Thanks for encouraging folks to try LLM tools. Any tips for trying the open code completion models when coding? I use Sublime Text but am curious enough to use another editor. Thanks!

simon,
@simon@simonwillison.net avatar

@tannewt I haven't spent much time with Copilot alternatives yet - not sure what the best tooling is for that right now

22, to random
@22@sfba.social avatar

@simon I hope it's ok to ask for support in this manner, if not I apologize!

Does the llm package allow me to download (and sync locally) chats I've had with the web version of ChatGPT? Or does logging only support chats done through the llm package itself?

simon,
@simon@simonwillison.net avatar

@22 it doesn't support that

It's an interesting idea for a plugin though! Could work by first having you request your JSON export from ChatGPT, then converting that JSON to the LLM SQLite schema
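A hedged sketch of what the conversion step could look like. The JSON shape used here (a list of conversations, each with a `title` and a list of `messages`) and the `imported_messages` table are simplified assumptions for illustration only; they are not the real ChatGPT export format or the actual LLM SQLite schema, both of which would need to be checked before writing a real plugin.

```python
import json
import sqlite3


def import_conversations(export_json, db):
    """Load a (hypothetical) chat export into a (hypothetical) SQLite table.

    Returns the number of message rows inserted.
    """
    conversations = json.loads(export_json)
    db.execute(
        "CREATE TABLE IF NOT EXISTS imported_messages ("
        "conversation_title TEXT, position INTEGER, role TEXT, content TEXT)"
    )
    rows = [
        (conversation["title"], position, message["role"], message["content"])
        for conversation in conversations
        for position, message in enumerate(conversation["messages"])
    ]
    db.executemany("INSERT INTO imported_messages VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)
```

It could be called with `import_conversations(open("export.json").read(), sqlite3.connect("chats.db"))`, where both file names are placeholders.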

simon, to random
@simon@simonwillison.net avatar

Just had a delightful voice conversation with ChatGPT (4o) where I asked it how come there was an Elizabeth line train on platform 8 (surface platform) at London Paddington and it explained that there is a set of ramps at Paddington to allow trains to get from the deep below ground lines up to the surface, and then wrote some Python code to render me a diagram https://chatgpt.com/share/1afcc398-b1cf-424a-8835-5b6a2985168b

simon,
@simon@simonwillison.net avatar

@researchbuzz yeah the ChatGPT iPhone app has a little headphones icon that starts an audio conversation, currently using whisper for speech-to-text and their TTS model for text-to-speech - it's really fun

(At some point they'll be switching that over to their creepy new 4o voice mode but that's not enabled yet https://simonwillison.net/2024/May/15/chatgpt-in-4o-mode/ )

simon,
@simon@simonwillison.net avatar

@boffbowsh I am 99% confident that no such ramp exists!

simon,
@simon@simonwillison.net avatar

To clarify, I am 99.9% confident that such a ramp does not exist!

simon,
@simon@simonwillison.net avatar

@eliocamp yes, but I found it VERY amusing

jonty, to random
@jonty@chaos.social avatar

Well, that was the worst day I have had in months. Please send animal gifs.

simon, to random
@simon@simonwillison.net avatar

There is a limited time opportunity right now to try a version of Claude that's completely obsessed with the Golden Gate Bridge, and it is howlingly entertaining https://www.anthropic.com/news/golden-gate-claude

Visit https://claude.ai/ and click the little bridge icon

simon,
@simon@simonwillison.net avatar

In tragic news, it looks like Golden Gate Claude is no longer available

mpesce, to random
@mpesce@arvr.social avatar

Now it can be told:

While doing some AI engineering work for a client, I developed a prompt - completely inadvertently - that reduced every AI chatbot to gibberish (except Anthropic's Claude 3). I then spent a week trying to alert the LLM vendors to this issue - and largely failed. There is no mechanism to report flaws in these models that are already deployed to billions of users. Read the whole story in @theregister

https://www.theregister.com/2024/05/23/ai_untested_unstable/

simon,
@simon@simonwillison.net avatar

@mpesce @researchbuzz @theregister this keeps on happening with prompt injection

Here's an example that was responsibly disclosed in December, nothing happened, the researcher published 4 months later and THEN Google finally mitigated it in response to the public disclosure https://embracethered.com/blog/posts/2024/google-notebook-ml-data-exfiltration/
