SD1.5 and FG-CLIP 2

Similar to other text encoder replacements, but with three wishes granted:

An SD1.5 model trained with a flow matching objective,
FG-CLIP 2, the latest text encoder,
A focus on multiple characters.

Only the UNet2DConditionModel weights are fine-tuned; all other modules are kept frozen. Inference uses the FlowMatchEulerDiscreteScheduler.
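To make the training objective and the sampler concrete, here is a toy, dependency-free sketch of flow matching on a small list of floats standing in for a latent. It assumes the common rectified-flow convention (a straight path between data and noise, with the network regressing the velocity `noise - x0`) and a uniform sigma schedule; the exact parameterization, loss weighting, and timestep shifting used for this model are not stated here, so treat all of that as an assumption. The function names are hypothetical and `predict` stands in for the fine-tuned UNet.

```python
def interpolate(x0, noise, sigma):
    """Point on the straight path: sigma=0 is data, sigma=1 is pure noise."""
    return [(1.0 - sigma) * a + sigma * b for a, b in zip(x0, noise)]


def velocity_target(x0, noise):
    """Derivative of the path with respect to sigma (assumed regression target)."""
    return [b - a for a, b in zip(x0, noise)]


def flow_matching_loss(predict, x0, noise, sigma):
    """Mean squared error between predicted and target velocity at one sigma."""
    x_sigma = interpolate(x0, noise, sigma)
    pred = predict(x_sigma, sigma)
    target = velocity_target(x0, noise)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def euler_sample(predict, noise, num_steps):
    """Integrate the predicted velocity from sigma=1 down to sigma=0."""
    # Uniform schedule (assumption): 1, 1 - 1/N, ..., 0.
    sigmas = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = list(noise)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = predict(x, sigma)
        # Euler step: the step size (sigma_next - sigma) is negative.
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x
```

With a perfect velocity predictor the loss is zero and the Euler loop walks the noise sample back to the data point along the same straight path. In a real pipeline the lists would be latent tensors, and `predict` would also receive the FG-CLIP 2 text embeddings as conditioning.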

Datasets

The model was trained on pixiv images.

References

arXiv:2506.02070
arXiv:2509.25705