cerisara, #LLM on CPU only.
For inference, the best option right now is llama.cpp with a quantized LLM in GGUF format. There are several high-level wrappers around llama.cpp that make it easy to use: ollama, vllama...
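For example, a minimal sketch with the llama-cpp-python bindings (not mentioned above, but one convenient way to drive llama.cpp from Python; the model file path is a placeholder):

```python
from llama_cpp import Llama

# Load a quantized GGUF model; n_threads controls how many CPU threads are used.
llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder path
    n_ctx=2048,
    n_threads=8,
)

out = llm(
    "Q: What is the capital of France? A:",
    max_tokens=64,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```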
For inference with a very large LLM and very little RAM, the only option is airLLM: it's slow, but you can run llama3-70b.
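A rough sketch of how airLLM is used (class names and arguments are taken from my reading of its README and may differ across versions; the model name and token counts are placeholders):

```python
from airllm import AutoModel

# Layers are streamed from disk one at a time, so the full 70B model never has
# to sit in memory at once -- that's why it's slow.
model = AutoModel.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")

input_tokens = model.tokenizer(
    ["What is the capital of France?"],
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=128,
    padding=False,
)

# The upstream example moves input_ids to the GPU; for a CPU-only run they are
# kept on the CPU here.
output = model.generate(
    input_tokens["input_ids"],
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)
print(model.tokenizer.decode(output.sequences[0]))
```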
For finetuning a quantized LLM with LoRA, the only option afaik is also llama.cpp (look for the "finetune" example). It's a work in progress but usable and promising!
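A sketch of what a finetuning run can look like, driven from Python via subprocess (the binary name, flags, and file names follow the examples/finetune README as I remember it and will likely change as the tool evolves; treat them all as placeholders):

```python
import subprocess

# All paths and flag values below are placeholders; check the finetune
# example's --help / README in your llama.cpp checkout for the exact options.
subprocess.run(
    [
        "./finetune",
        "--model-base", "open-llama-3b-v2-q8_0.gguf",  # quantized base model (GGUF)
        "--train-data", "my_corpus.txt",               # plain-text training data
        "--lora-out", "lora-adapter.bin",              # LoRA adapter written here
        "--threads", "8",
        "--adam-iter", "30",                           # optimizer iterations
        "--batch", "4",
        "--ctx", "256",                                # training context length
    ],
    check=True,
)
```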