OC Long Context Lengths (And Low Resource Friendly)

Thought I'd ask and see if y'all are familiar with upcoming models or techniques that I'm not. I'm aware of the MPT-7B-StoryWriter model and the RWKV models that support up to 8192 tokens, but that's about it as far as "long" context lengths go. I'm also wanting to run this in a VM with limited resources. The most I'm comfortable allotting is 4 CPU cores and about 16 GB of RAM, which is a lot for a barebones Debian install but not much as far as running LLMs goes. I was able to run the 4096-token RWKV model at a reasonable rate, but the coherence just wasn't there compared to transformer models.
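For reference, here's roughly how I've been running it on CPU — a minimal sketch using the `rwkv` pip package. The model path and tokenizer file are placeholders for whatever checkpoint you download, and the thread cap just matches my VM:

```python
import os
os.environ['RWKV_JIT_ON'] = '1'  # TorchScript JIT, helps noticeably on CPU

import torch
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Pin PyTorch to the 4 cores the VM actually has
torch.set_num_threads(4)

# Placeholder paths -- point these at your downloaded checkpoint
# and the tokenizer JSON from the ChatRWKV repo
model = RWKV(model='/models/RWKV-4-Pile-3B-ctx4096', strategy='cpu fp32')
pipeline = PIPELINE(model, '20B_tokenizer.json')

out = pipeline.generate(
    "Once upon a time,",
    token_count=200,
    args=PIPELINE_ARGS(temperature=1.0, top_p=0.7),
)
print(out)
```

A 3B checkpoint in fp32 is roughly 12 GB of RAM, which is about the biggest that fits my 16 GB budget.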

So what else is out there? I've heard of LoRA fine-tunes being used to extend context length, but I've never been able to find much detail on that (I'm probably searching for the wrong keywords). Any novel model types out there? Methods I'm unaware of?
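If I'm piecing the LoRA trick together right, the idea is to fine-tune while interpolating the rotary position embeddings, so a longer sequence gets compressed into the position range the base model was trained on. A minimal sketch of what I mean — the function name and the `scale` value here are just my own illustration, not any particular repo's code:

```python
import torch

def rope_angles(seq_len: int, head_dim: int,
                base: float = 10000.0, scale: float = 0.25) -> torch.Tensor:
    """Rotary embedding angles with position interpolation.

    scale < 1 compresses positions: e.g. scale=0.25 lets an
    8192-token sequence occupy the 0..2048 position range the
    base model saw during training.
    """
    # Standard RoPE inverse frequencies, one per pair of dims
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # The only change from vanilla RoPE: scale the positions down
    positions = torch.arange(seq_len).float() * scale
    # (seq_len, head_dim/2) table of angles to rotate Q/K by
    return torch.outer(positions, inv_freq)
```

Corrections welcome if that's not actually how those LoRAs work.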
