scottjenson (@scottjenson@social.coop)

@simon
Simon, I'm working with @homeassistant a bit and we just had a fascinating discussion about 'nanoLLMs' that could run locally. They would NOT need the sum-total-of-all-human-knowledge but would really just be there as a smart parser for speech-to-text commands, keeping everything local. This is clearly still not trivial but hopefully one way to reduce the model size.

Do you know of any 'reduced' LLMs that could work in this more limited context?
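
A sketch of what that 'smart parser' step could look like, assuming a small model is already being served locally (here through Ollama's Python client; the model tag "phi3", the endpoint, and the action/target schema are illustrative choices, not anything Home Assistant defines):

    # Hedged sketch: turn a transcribed voice command into a structured intent
    # with a small local model. Assumes Ollama is running locally and a small
    # model has been pulled under the tag "phi3"; the JSON schema is made up.
    import json
    import ollama

    PROMPT = (
        'Convert this smart-home voice command into JSON with the keys '
        '"action" and "target". Respond with JSON only.\n'
        "Command: {command}\nJSON:"
    )

    def parse_command(command: str) -> dict:
        resp = ollama.generate(
            model="phi3",                        # illustrative tag for a small local model
            prompt=PROMPT.format(command=command),
            format="json",                       # constrain the output to valid JSON
        )
        return json.loads(resp["response"])

    print(parse_command("turn off the kitchen lights"))
    # e.g. {"action": "turn_off", "target": "kitchen lights"}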

aallan (@aallan@mastodon.social)

@scottjenson @simon @homeassistant It's a fascinating area that a lot of people are looking at right now. Moving LLMs to edge hardware is certainly possible; I'm running LLaMA locally on my phone, for instance. But you have to think about architectures. I've seen some interesting ones built around key framing and feeding LLMs from TinyML models that look potentially pretty powerful.

simon (@simon@simonwillison.net)

@scottjenson @homeassistant yes, I'm really interested in that kind of model. Phi-3 is one of the most interesting of those at the moment I think - only about a 2GB file so it should be usable on a Raspberry Pi

scottjenson (@scottjenson@social.coop)

@simon @homeassistant Excellent news, thank you. I'll get it running locally on my Mac just to get started.

simon (@simon@simonwillison.net)

@scottjenson @homeassistant I ran it with llamafile following the instructions in the official README and it worked great https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf
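
When run that way, llamafile also serves a local OpenAI-compatible endpoint, so the same model can be scripted against; a rough sketch (the port, placeholder API key, and model name follow the llamafile README defaults as I remember them, so check them against that README):

    # Hedged sketch: call a locally running llamafile through its
    # OpenAI-compatible endpoint; nothing leaves the machine.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",    # llamafile's default local server
        api_key="sk-no-key-required",           # any placeholder string works locally
    )

    resp = client.chat.completions.create(
        model="LLaMA_CPP",                      # llamafile mostly ignores this name
        messages=[{"role": "user", "content": "In one sentence, what is a Raspberry Pi?"}],
    )
    print(resp.choices[0].message.content)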

scottjenson (@scottjenson@social.coop)

@simon @homeassistant Thanks again. I'll also give that a try. I'm currently running it with Ollama on a 5-year-old desktop (it was shockingly easy). It's only using 30% of the CPU when I ask it a question!

Even then, I'd (naively?) suggest that it is far more power than I need. But the fact that I can get this far in just 5 minutes has me shaking my head in disbelief.

simon (@simon@simonwillison.net)

@scottjenson @homeassistant Phi-3 is the first small model of that kind that has felt to me like it's capable of basic conversion tasks like summarization, RAG extraction, and extract-data-to-JSON. I was really impressed by it.
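
As an illustration of the extract-data-to-JSON side of that, a hedged sketch along the same lines as above (the input sentence, field names, and model tag are invented for the example):

    # Hedged sketch: pull structured fields out of free text with a small local
    # model via Ollama's chat API; the schema and sample text are made up.
    import json
    import ollama

    text = "The boiler was installed on 12 March 2019 by Acme Heating and is serviced yearly."

    resp = ollama.chat(
        model="phi3",
        messages=[
            {
                "role": "system",
                "content": 'Extract JSON with the keys "installed_on", "installer" '
                           'and "service_interval" from the text. Respond with JSON only.',
            },
            {"role": "user", "content": text},
        ],
        format="json",   # constrain the reply to valid JSON
    )
    print(json.loads(resp["message"]["content"]))
    # e.g. {"installed_on": "2019-03-12", "installer": "Acme Heating", "service_interval": "yearly"}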
