Mixture of Experts Explained (Huggingface blog) (huggingface.co)
mistral-8x7b-chat (huggingface.co)
A very capable chat model built on top of the new Mistral MoE model, trained on the SlimOrca dataset for 1 epoch, using QLoRA.
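For readers unfamiliar with QLoRA: rather than updating the full weight matrices, it trains a small low-rank update on top of a frozen (4-bit quantized) base model. Below is a minimal, illustrative sketch of the low-rank idea only — pure Python with hypothetical tiny dimensions, not the PEFT library's actual implementation, and with none of the quantization machinery:

```python
# Sketch of the low-rank update behind (Q)LoRA.
# The adapted weight is W' = W + alpha * (B @ A), where only A and B are trained.

def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_update(W, A, B, alpha=1.0):
    """Return W + alpha * (B @ A).
    W: d x n frozen base weight; B: d x r; A: r x n, with r << min(d, n)."""
    delta = matmul(B, A)
    return [[W[i][j] + alpha * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Hypothetical tiny example: d = 4, n = 4, rank r = 1.
d, n, r = 4, 4, 1
W = [[0.0] * n for _ in range(d)]   # frozen base weights (here all zeros)
B = [[1.0] for _ in range(d)]       # d x r, trainable
A = [[0.5, 0.0, 0.0, 0.0]]          # r x n, trainable

W_adapted = lora_update(W, A, B)

# The point: only r*(d+n) parameters are trained instead of d*n.
full_params = d * n        # 16 for a full fine-tune of this matrix
lora_params = r * (d + n)  # 8 for the rank-1 adapter
```

At realistic sizes (e.g. d = n = 4096, r = 16) the trainable-parameter savings are dramatic, which is what makes fine-tuning a 7B MoE-adjacent model on modest hardware feasible.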
PixArt-LCM-XL-2-1024-MS Weights Released (huggingface.co)
SD-Turbo - Distilled Stable Diffusion 2.1 For Real-Time Synthesis (huggingface.co)
SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation. We release SD-Turbo as a research artifact, and to study small, distilled text-to-image models. For increased quality and prompt understanding, we recommend SDXL-Turbo....
I'm having a fantastic time with this model. (huggingface.co)
I’ve been using TheBloke’s Q8 quant of huggingface.co/teknium/OpenHermes-2.5-Mistral-7B, but I think this one (huggingface.co/…/OpenHermes-2.5-neural-chat-7b-v3…) is killing it. Has anyone else tested it?
Kandinsky 3 Released (huggingface.co)
Dolphin-2.1-mistral-7b · Hugging Face (huggingface.co)
SlimOrca 7B: Trained on 1GB dataset (huggingface.co)
This new dataset release provides an efficient means of reaching performance on par with using larger slices of our data, while including only ~500k GPT-4 completions....
LLaMA2 7B PoSE YaRN 16k: LLMs via Positional Skip-wise Training (huggingface.co)
Zephyr 7B: A model that people like, but it has biases too (huggingface.co)
We find GPT-4 judgments correlate strongly with humans, with human agreement with GPT-4 typically similar to or higher than inter-human annotator agreement....
Pay more attention: Recap of the last week (huggingface.co)
A short journey to long-context models....
LeoLM: German Foundation Language Model (huggingface.co)
Speechless 7B: coding, reasoning and planning (huggingface.co)
The model uses 200k samples from different datasets to fine-tune Open-Orca/Mistral-7B-OpenOrca....
Shurale: an open-domain dialogue model for chit-chat conversations (huggingface.co)
This model is based on Mistral-7B-v0.1 and was trained on 1,112,000 dialogues. The maximum sequence length during training was 2048 tokens....
Llama Nous-Capybara-7B, Mistral airoboros, Mistral Aria (huggingface.co)
Nous-Capybara-7B...
Dolphin 2.0 based on mistral-7b released by Eric Hartford (huggingface.co)
The model is trained on his own Orca-style dataset, as well as some airoboros data, apparently to increase creativity...
Mistral 7B OpenOrca released (huggingface.co)
This release is trained on a curated filtered subset of most of our GPT-4 augmented data....
Spread Your Wings: Falcon 180B is here (huggingface.co)
Context: Falcon is a popular free LLM; this is their biggest model yet, and they claim it is currently the best open model available.
Making LLMs lighter with AutoGPTQ and transformers (huggingface.co)
Hugging Face Transformers now officially supports AutoGPTQ. This is a pretty big deal and signals much wider adoption of quantized-model support, which is great for everyone!
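As a rough illustration of what quantization buys: GPTQ stores weights as low-bit integers plus scaling factors instead of 16-bit floats. The sketch below shows naive round-to-nearest 4-bit quantization in pure Python — this is *not* the actual GPTQ algorithm (GPTQ compensates for rounding error as it quantizes each layer), just the underlying storage idea:

```python
# Naive per-tensor 4-bit quantization: map floats onto integers in [0, 15]
# using a scale and zero-point, then reconstruct approximate floats.

def quantize_4bit(weights):
    """Return (int codes in [0, 15], scale, zero-point) for a list of floats."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0   # guard against a constant tensor
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_4bit(codes, scale, zero):
    """Reconstruct approximate float weights from the integer codes."""
    return [c * scale + zero for c in codes]

weights = [-0.42, 0.07, 0.31, -0.15, 0.5, -0.5]   # hypothetical fp weights
codes, scale, zero = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale, zero)

# Round-to-nearest keeps the reconstruction error within half a step.
max_err = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs 4 bits instead of 16, roughly a 4x memory saving per tensor (plus a small overhead for the scale and zero-point); real GPTQ implementations apply this per group of columns rather than per tensor.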
Introducing SafeCoder - huggingface (huggingface.co)
Real-Time Radiance Field Rendering (huggingface.co)
Achieves SOTA on quality AND on training time AND renders in real-time (60fps+)
WizardLM-70B-V1.0 Released on HF (huggingface.co)
These are the full weights; quants from TheBloke are already incoming. I’ll update this post when they’re fully uploaded...
Dolphin 7B by Eric Hartford based on Llama 2 released (huggingface.co)
And of course the quants by TheBloke...