SSD-1B: A 50% Smaller and 60% Faster Open-Source Distilled SDXL Model

Preface

Since the introduction of Stable Diffusion 1.5 by StabilityAI, the ML community has eagerly embraced the open-source model. In August, we introduced the ‘Segmind Distilled Stable Diffusion’ series with the compact SD-Small and SD-Tiny models, open-sourcing both the weights and the distillation training code. The models were inspired by the research presented in the paper “On Architectural Compression of Text-to-Image Diffusion Models”, and have 35% and 55% fewer parameters than the base model, respectively, while maintaining comparable image fidelity.

With the introduction of SDXL 1.0 in July, we saw the community move to the new architecture due to its superior image quality and better prompt coherence. In our effort to make generative AI models faster and more affordable, we began working on a distilled version of SDXL 1.0, and we succeeded in distilling it to half its size. Read on to learn more about our SSD-1B model.

Blog post: blog.segmind.com/introducing-segmind-ssd-1b/

Model: huggingface.co/segmind/SSD-1B

Demo: huggingface.co/spaces/…/Segmind-Stable-Diffusion
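
For anyone who wants to try the model, here is a minimal sketch of loading SSD-1B with Hugging Face diffusers. The model ID comes from the link above; the prompt, step count, and guidance scale are illustrative assumptions, not recommended settings.

import torch
from diffusers import StableDiffusionXLPipeline

# SSD-1B follows the SDXL architecture, so the standard SDXL pipeline
# should load it directly.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "segmind/SSD-1B",           # model ID from the link above
    torch_dtype=torch.float16,  # half precision to reduce VRAM use
    use_safetensors=True,
)
pipe.to("cuda")

# Illustrative prompt and settings (assumptions, not official recommendations).
image = pipe(
    prompt="an astronaut riding a green horse",
    negative_prompt="ugly, blurry, low quality",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("ssd_1b_sample.png")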

poVoq,
@poVoq@slrpnk.net

Bit of a shame that they didn’t manage to fit it into 12 GB of VRAM, so you still need a 16 GB GPU.

Even_Adder,

Yeah.

Zarxrax,

What do you mean? I run the normal SDXL on 12 GB of VRAM.

poVoq,
@poVoq@slrpnk.net

Hmm, odd. The linked explanation says that in operation SDXL needs 15 GB or so of VRAM (and this slimmed-down version just above 12 GB). Maybe 12 GB is only possible at lower resolutions?

vluz,

I do SDXL generation in 4 GB, at the extreme expense of speed, by using a number of memory optimizations.
I've done this kind of stuff since SD 1.4, for the fun of it. I like to see how low I can push VRAM use.

SDXL takes around 3 to 4 minutes per generation, including the refiner, but it works within the constraints.
The graphics cards used are hilariously bad for the task: a 1050 Ti with 4 GB and a 1060 with 3 GB of VRAM.

I have an implementation running on the 3 GB card, inside a Podman container, with no RAM offloading, 1 vCPU, and 4 GB of RAM.
The graphical UI (Streamlit) runs on a laptop outside the server to save resources.

I'm working on an example implementation of SDXL as we speak, and also on SDXL generation on mobile.
That's the reason I looked into this news; SSD-1B might be a good candidate for my dumb experiments.
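
vluz doesn't list the exact optimizations used, but for anyone curious, a sketch of the usual diffusers memory-saving knobs for low-VRAM SDXL might look like the following. The particular combination, model ID, and settings are assumptions, not their actual setup (note they run without RAM offloading, so the offload step below would be skipped in that configuration).

import torch
from diffusers import StableDiffusionXLPipeline

# Sketch of common diffusers memory optimizations; a guess at the kind of
# setup described above, not vluz's actual code.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
)

# Trade speed for memory: compute attention in slices, and run the VAE
# on slices/tiles of the latents instead of all at once.
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()

# Stream weights to the GPU one submodule at a time (requires accelerate).
# Very slow, but lets SDXL fit into a few GB of VRAM. vluz runs without
# RAM offloading, so this line would be omitted in that setup.
pipe.enable_sequential_cpu_offload()

image = pipe(
    "a lighthouse at dawn, oil painting",
    num_inference_steps=30,
).images[0]
image.save("sdxl_low_vram.png")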
