---
base_model:
- black-forest-labs/FLUX.1-schnell
base_model_relation: quantized
library_name: diffusers
tags:
- sdnq
- flux
- 4-bit
license: apache-2.0
---

4-bit (UINT4 with SVD rank 32) quantization of [black-forest-labs/FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) using [SDNQ](https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization).

Usage:

```shell
pip install git+https://github.com/Disty0/sdnq
```

```py
import torch
import diffusers
from sdnq import SDNQConfig  # import sdnq to register it into diffusers and transformers

pipe = diffusers.FluxPipeline.from_pretrained(
    "Disty0/FLUX.1-schnell-SDNQ-uint4-svd-r32",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=0.0,
    num_inference_steps=4,
    max_sequence_length=256,
    generator=torch.manual_seed(0),
).images[0]
image.save("flux-schnell-sdnq-uint4-svd-r32.png")
```

Original BF16 vs. SDNQ quantization comparison:

| Quantization | Model Size | Visualization |
| --- | --- | --- |
| Original BF16 | 23.8 GB | ![Original BF16](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/sox_OBWXRWrI9JNTG-jjF.png) |
| SDNQ UINT4 | 6.8 GB | ![SDNQ UINT4](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/mbyPb2JkxEiasFLeFMmet.png) |
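The size reduction in the table above follows roughly from the bits stored per weight. A minimal back-of-envelope sketch, assuming the FLUX.1 transformer has roughly 12B parameters (an approximation; the real checkpoint also stores quantization scales and the SVD rank-32 residuals, which is why the quantized model is somewhat larger than a pure 4-bit count would suggest):

```python
# Rough weight-storage estimate for weight-only quantization.
# num_params ~12e9 is an assumed approximation for the FLUX.1 transformer.

def estimate_size_gb(num_params: float, bits_per_param: float) -> float:
    """Return weight storage size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

print(estimate_size_gb(12e9, 16))  # BF16 (16 bits/weight): 24.0 GB, near the 23.8 GB above
print(estimate_size_gb(12e9, 4))   # UINT4 (4 bits/weight): 6.0 GB; scales and SVD residuals bring it to ~6.8 GB
```
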