---
license: apache-2.0
language:
- en
base_model:
- Comfy-Org/Wan_2.1_ComfyUI_repackaged
pipeline_tag: text-to-video
tags:
- gguf-connector
- gguf-node
widget:
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00007_.webp
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00009_.webp
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00003_.webp
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00004_.webp
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00005_.webp
- text: >-
    a pig moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00006_.webp
- text: >-
    a fox moving quickly in a beautiful winter scenery nature trees sunset
    tracking camera
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00002_.webp
- text: >-
    a cute anime girl with massive fennec ears and a big fluffy tail wearing a
    maid outfit turning around
  parameters:
    negative_prompt: blurry ugly bad
  output:
    url: samples\ComfyUI_00008_.webp
- text: glass flower blossom
  output:
    url: samples\ComfyUI_00010_.webp
- text: >-
    An icicle dragon lunges forward, mouth wide open to exhale a stream of icy mist. Ultramarine energy flickers beneath its frost-coated scales as it twists. The camera circles slowly, capturing the swirling ice particles and the backdrop of floating glaciers and frozen nebulae under a cyan-blue filter.
  parameters:
    negative_prompt: bad quality, blurry, messy, chaotic
  output:
    url: samples\ComfyUI_00012_.mp4
- text: >-
    Fujifilm Portra 400H film still, slammed Nissan Skyline R33 GTR LM JGTC, in heavy motion blur,  7-11 Tokyo, Midnight
  parameters:
    negative_prompt: bad quality, blurry, messy, chaotic
  output:
    url: samples\ComfyUI_00011_.mp4
- text: >-
    Fujifilm Portra 400H film still, slammed Nissan Skyline R33 GTR LM JGTC, in heavy motion blur,  7-11 Tokyo, Midnight
  parameters:
    negative_prompt: bad quality, blurry, messy, chaotic
  output:
    url: samples\ComfyUI_00013_.mp4
---

# **gguf quantized version of wan video**
- drag **gguf** to > `./ComfyUI/models/diffusion_models`
- drag **t5xxl-um** to > `./ComfyUI/models/text_encoders`
- drag **vae** to > `./ComfyUI/models/vae`
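
The placement rules above (plus `clip_vision` for the i2v model) can be sketched as a small helper. This is only an illustration: the function name `target_folder` and the keyword matching are assumptions based on the file names linked in this card, not part of gguf-node or ComfyUI.

```py
from pathlib import Path

# Keyword -> ComfyUI models subfolder. The keywords are guesses taken from
# the file names linked in this card (e.g. "umt5", "vae"), not an official rule.
RULES = [
    ("clip-vision", "clip_vision"),
    ("clip_vision", "clip_vision"),
    ("vae", "vae"),
    ("t5", "text_encoders"),
]

def target_folder(filename: str, comfy_root: str = "./ComfyUI") -> Path:
    """Return the models subfolder a downloaded file should be dragged to."""
    name = filename.lower()
    for keyword, folder in RULES:
        if keyword in name:
            return Path(comfy_root) / "models" / folder
    # Anything that matches no keyword is treated as the diffusion model.
    return Path(comfy_root) / "models" / "diffusion_models"

for f in ["wan2.1-v5-vace-1.3b-q4_0.gguf",
          "umt5-xxl-encoder-q4_k_m.gguf",
          "pig_wan_vae_fp32-f16.gguf"]:
    print(f, "->", target_folder(f))
```

Adjust `comfy_root` if ComfyUI lives somewhere else; the helper only computes paths and does not move any file.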

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/wan-t2v.gif)

## **workflow**
- for i2v model, drag **clip-vision-h** to > `./ComfyUI/models/clip_vision`
- run the .bat file in the main directory (assuming you are using the gguf pack linked below)
- if you opt to use the [**fp8 scaled umt5xxl**](https://huggingface.co/calcuis/wan-gguf/blob/main/t5xxl_um_fp8_e4m3fn_scaled.safetensors) encoder (this applies to any fp8 scaled t5 encoder, actually), please use cpu offload: switch **device** from default to **cpu** in the **gguf clip loader** (this won't affect speed); btw, it also works fine with both [**gguf umt5xxl**](https://huggingface.co/calcuis/wan-1.3b-gguf/blob/main/umt5-xxl-encoder-q4_k_m.gguf) and [**gguf vae**](https://huggingface.co/calcuis/wan-1.3b-gguf/blob/main/pig_wan_vae_fp32-f16.gguf)
- drag any demo video (below) to > your browser to load its workflow

<Gallery />

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/wan-flf2v.png)

## **review**
- `pig` is a lazy architecture for gguf node; it applies to all model, encoder and vae gguf files; if you try to run such a file in the comfyui-gguf node, you might need to manually add `pig` to its IMG_ARCH_LIST (in loader.py), which is easier than editing the gguf file itself; btw, any model architecture compatible with comfyui-gguf, including `wan`, should work in gguf node
- 1.3b model: t2v and vace **gguf** are working fine; good for old or low-end machines
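
To illustrate the `IMG_ARCH_LIST` edit mentioned above, here is a minimal stand-alone sketch. It assumes the list is a set of architecture names and that the loader accepts a file only if its architecture is on that list; the entries and the `is_known_arch` helper are hypothetical stand-ins, so check the actual loader.py of your installed version.

```py
# Stand-in for the collection defined in comfyui-gguf's loader.py.
# The entries are illustrative; the real list depends on the installed version.
IMG_ARCH_LIST = {"flux", "sd1", "sdxl", "sd3", "wan"}

def is_known_arch(arch: str) -> bool:
    """Mimic a loader check that only accepts architectures on the list."""
    return arch in IMG_ARCH_LIST

print(is_known_arch("pig"))  # checked before the edit
IMG_ARCH_LIST.add("pig")     # the manual edit described above
print(is_known_arch("pig"))  # checked after the edit
```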


## **run it with diffusers🧨 (alternative 1)**
```py
import torch
from transformers import UMT5EncoderModel
from diffusers import AutoencoderKLWan, WanVACEPipeline, WanVACETransformer3DModel, GGUFQuantizationConfig
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler
from diffusers.utils import export_to_video

# Load the gguf-quantized transformer straight from the hub file
model_path = "https://huggingface.co/calcuis/wan-gguf/blob/main/wan2.1-v5-vace-1.3b-q4_0.gguf"
transformer = WanVACETransformer3DModel.from_single_file(
    model_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    )

# Load the gguf-quantized umt5xxl text encoder
text_encoder = UMT5EncoderModel.from_pretrained(
    "chatpig/umt5xxl-encoder-gguf",
    gguf_file="umt5xxl-encoder-q4_0.gguf",
    torch_dtype=torch.bfloat16,
    )

# Load the vae in float32
vae = AutoencoderKLWan.from_pretrained(
    "callgg/wan-decoder",
    subfolder="vae",
    torch_dtype=torch.float32
    )

# Assemble the pipeline from the three components loaded above
pipe = WanVACEPipeline.from_pretrained(
    "callgg/wan-decoder",
    transformer=transformer,
    text_encoder=text_encoder,
    vae=vae,
    torch_dtype=torch.bfloat16
)

# Swap in the UniPC scheduler, then reduce memory use with cpu offload and vae tiling
flow_shift = 3.0
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=flow_shift)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

prompt = "a pig moving quickly in a beautiful winter scenery nature trees sunset tracking camera"
negative_prompt = "blurry ugly bad"

# Generate 57 frames at 720x480 with a fixed seed, then save them as a 16 fps video
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=720,
    height=480,
    num_frames=57,
    num_inference_steps=24,
    guidance_scale=2.5,
    conditioning_scale=0.0,
    generator=torch.Generator().manual_seed(0),
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```
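
When changing `num_frames` in the script above, a small helper can keep the value in a shape the model is likely to accept and show how long the clip will be. This is a sketch under an assumption: that Wan pipelines expect frame counts of the form 4k + 1 (57 = 4 × 14 + 1), which follows from the 4x temporal compression of the Wan vae; confirm it against the diffusers documentation for your version. The helper names are hypothetical and not part of diffusers.

```py
def snap_num_frames(n: int, temporal_factor: int = 4) -> int:
    """Largest count of the form temporal_factor * k + 1 that is <= n (minimum 1)."""
    return max((n - 1) // temporal_factor * temporal_factor + 1, 1)

def clip_seconds(num_frames: int, fps: int = 16) -> float:
    """Duration of the exported video in seconds."""
    return num_frames / fps

frames = snap_num_frames(60)  # 60 is not of the form 4k + 1, so it is snapped down
print(frames, "frames,", clip_seconds(frames), "seconds at 16 fps")
```

With the values used above (`num_frames=57`, `fps=16`), the clip is a little over 3.5 seconds long.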

## **run it with gguf-connector (alternative 2)**
```
ggc v2
```

![screenshot](https://raw.githubusercontent.com/calcuis/gguf-pack/master/v2.png)

### **update**
- wan2.1-v5-vace-1.3b: all tensors except the block weights are kept in `f32` (this avoids triggering a time/text embedding key error during inference)

### **reference**
- base model from [wan-ai](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B)
- comfyui from [comfyanonymous](https://github.com/comfyanonymous/ComfyUI)
- pig architecture from [connector](https://huggingface.co/connector)
- gguf-connector ([pypi](https://pypi.org/project/gguf-connector))
- gguf-node ([pypi](https://pypi.org/project/gguf-node)|[repo](https://github.com/calcuis/gguf)|[pack](https://github.com/calcuis/gguf/releases))