trafficnab, 10 months ago
7b q4 (quantized, i.e. compressed down to 4-bit precision) takes about 4 GB of RAM, 13b q4 about 8 GB, and 30b q4 is the one that needs about 25 GB.
30b generates slowly but more or less usably on a CPU; the rest generate on a CPU just fine.
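
For anyone who wants to sanity-check those numbers, here's a rough back-of-the-envelope sketch in Python. It assumes exactly 4 bits per weight and counts only the weights, so treat the result as a lower bound; the function name `weight_gb` is made up for illustration. Actual RAM use comes out higher because the quantized format stores some extra data alongside the weights and the runtime needs memory for context and buffers, which fits with the figures above being larger.

```python
def weight_gb(params_billions: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes).

    Assumption: every parameter takes exactly `bits_per_weight` bits.
    Runtime overhead (context, buffers, quantization metadata) is ignored.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Nominal model sizes from the comment above, in billions of parameters.
for size in (7, 13, 30):
    print(f"{size}b q4: at least ~{weight_gb(size):.1f} GB for the weights")
```

Passing `bits_per_weight=16` gives the same estimate for unquantized 16-bit weights, which shows why quantization matters so much for running these on ordinary hardware.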