Zeugs (@Zeugs@social.cologne)

@simon Oh, vendor performance information...
These benchmarks and their data are also in the training data. LLMs generally perform worse on alternative formulations of the benchmark questions.
https://arxiv.org/pdf/2402.19450.pdf
GPT-4 is the best, but the improvement does not justify the cost/size. GPT-3.5 is now the "vanilla LLM".
It's the defined normal and a standard you can talk about.
