falken,
@falken@qoto.org avatar

@noroute @yoasif local LLM execution times can be very fast on recent consumer hardware. No need to send anywhere, just like their translation - do it all on-device.
As an example, with no optimization or GPU support, my @frameworkcomputer (AMD) generates around 5 characters/sec from a 4 gigabyte pre-quantized model.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • firefox@fedia.io
  • DreamBathrooms
  • everett
  • tacticalgear
  • magazineikmin
  • thenastyranch
  • rosin
  • tester
  • Youngstown
  • khanakhh
  • slotface
  • ngwrru68w68
  • kavyap
  • mdbf
  • InstantRegret
  • megavids
  • osvaldo12
  • GTA5RPClips
  • ethstaker
  • normalnudes
  • Durango
  • cisconetworking
  • anitta
  • modclub
  • cubers
  • Leos
  • provamag3
  • JUstTest
  • lostlight
  • All magazines