chikim (@chikim@mastodon.social)

Cool tip for running LLMs on Apple Silicon! By default, macOS lets the GPU use up to 2/3 of RAM on machines with <=36GB and 3/4 on machines with >36GB. I used the command sudo sysctl iogpu.wired_limit_mb=57344 to override that and allocate 56GB of my 64GB to the GPU. This let me load all layers of larger models for faster speeds!
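
A minimal sketch of that workflow, assuming a 64GB machine like mine (the value is in MB, so 56GB = 56 * 1024 = 57344; as far as I know the setting does not persist across reboots, and setting it back to 0 should restore the OS default):

    # Check the current GPU wired-memory limit (0 means the macOS default applies)
    sysctl iogpu.wired_limit_mb

    # Allow the GPU to wire up to 56GB on a 64GB machine
    sudo sysctl iogpu.wired_limit_mb=57344

    # Revert to the default behavior
    sudo sysctl iogpu.wired_limit_mb=0

Leave enough RAM for the OS and your other apps, or the system can become unstable.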
