simon (@simon@simonwillison.net)

The only way to evaluate an LLM continues to be on its vibes.

The vibes of Claude 3 Opus are looking /really/ good right now: people whose opinion I trust are treating it as a step up from GPT-4!

I've not spent enough time with it yet, but my impressions so far have been very positive.
