Meta Releases Paper on SuperHOT Technique (8k Context Length via Positional Interpolation)

If you want to learn more about how 8k context with SuperHOT was recently achieved (beyond the paper Meta shared), I highly recommend visiting kaiokendev's pages and posts below.
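
To give a rough sense of the core idea before you dive into the paper: positional interpolation rescales rotary position indices so that a longer sequence maps back into the position range the model was trained on, instead of extrapolating past it. The sketch below is my own minimal illustration in PyTorch (the function name and parameters are hypothetical, not code from the Meta paper or from kaiokendev):

```python
import torch

def rope_angles(dim: int, max_positions: int, scale: float = 1.0,
                base: float = 10000.0) -> torch.Tensor:
    """Precompute rotary position embedding angles.

    With scale < 1.0 this applies positional interpolation: position
    indices are compressed so a longer sequence (e.g. 8192 tokens) maps
    into the range the model was trained on (e.g. 2048), rather than
    extrapolating into positions the model has never seen.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(max_positions).float() * scale  # the interpolation step
    return torch.outer(positions, inv_freq)  # shape: (max_positions, dim // 2)

# Extend a model trained on 2048 tokens to 8192: scale = 2048 / 8192 = 0.25,
# the same factor kaiokendev used for SuperHOT's 8k context.
trained_ctx, target_ctx = 2048, 8192
angles = rope_angles(dim=128, max_positions=target_ctx,
                     scale=trained_ctx / target_ctx)
```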

I was curious to hear more about SuperHOT myself, so I emailed kaiokendev and asked for learning material suggestions.

Here is what they shared with me. Thank you for this list, kaiokendev!

Recommendations from the Developer of SuperHOT (kaiokendev):

Here are some resources to help with learning LLMs:

Andrej Karpathy's GPT from scratch: https://www.youtube.com/watch?v=kCc8FmEb1nY

Huggingface's NLP Course: https://huggingface.co/learn/nlp-course/chapter1/1

And for training specifically:

Alpaca LoRA: https://github.com/tloen/alpaca-lora#-alpaca-lora

Vicuna: https://github.com/lm-sys/FastChat#fine-tuning

Community training guide: https://rentry.org/llm-training

Of course, for papers, I recommend reading anything on arXiv's cs.CL (Computation and Language) listing that looks interesting to you: https://arxiv.org/list/cs.CL/recent

If you found this post interesting, please consider subscribing to the /c/FOSAI community at !fosai@lemmy.world, where I do my best to keep you in the know with the most important updates in free, open-source artificial intelligence.
