Welcome to /m/localllama! Links & FAQ
Run LLaMa locally...
From their website...
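The "Run LLaMa locally" post above is truncated, so here is a minimal sketch of one common approach, assuming the llama-cpp-python bindings and a quantized model file (the file path below is hypothetical):

```python
# Minimal sketch: local LLaMA inference via the llama-cpp-python bindings.
# The model path is hypothetical; point it at whatever quantized model file
# you have downloaded.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-7b.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: Why run an LLM locally? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```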
cross-posted from: lemmy.world/post/2219010...
cross-posted from: lemmy.world/post/1750098...
cross-posted from: lemmy.world/post/1305651...
TL;DR: We trained a series of 7B LLMs named XGen-7B with standard dense attention, on up to 8K sequence length, for up to 1.5T tokens. We also fine-tuned the models on public-domain instructional data. The main takeaways are:
* On standard NLP benchmarks, XGen achieves comparable or better results...
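Since the post describes an 8K-context base model, here is a minimal sketch of loading it for inference, assuming the Hugging Face transformers library and the Salesforce/xgen-7b-8k-base repo id from the release (XGen ships a custom tokenizer, hence trust_remote_code):

```python
# Minimal sketch: load XGen-7B (8K context) and generate text.
# Assumes the Salesforce/xgen-7b-8k-base repo id from the announcement.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/xgen-7b-8k-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Long-context models are useful because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```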
baichuan-7B is an open-source large-scale pre-trained model developed by Baichuan Intelligent Technology. Based on the Transformer architecture, it is a 7-billion-parameter model trained on approximately 1.2 trillion tokens. It supports both Chinese and English, with a context window of 4,096 tokens. It achieves the best...
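Similarly, a minimal sketch for trying baichuan-7B, assuming the transformers library and the baichuan-inc/Baichuan-7B repo id on Hugging Face (the custom model code requires trust_remote_code):

```python
# Minimal sketch: bilingual generation with baichuan-7B (4,096-token context).
# Assumes the baichuan-inc/Baichuan-7B repo id on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)

# Prompt reads "The main advantage of large language models is" in Chinese.
inputs = tokenizer("大型语言模型的主要优点是", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```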
Exciting stuff!
Sounds like Cerebras will soon be training a model on this dataset, which will likely rival the OpenLLaMA and RedPajama models. Thoughts?
Click Here to be Taken to the Megathread!...
cross-posted from: lemmy.world/post/1894070...