<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# DreamBooth

[DreamBooth](https://arxiv.org/abs/2208.12242) is a method for personalizing text-to-image models such as Stable Diffusion from just a few (3-5) images of a subject. The personalized model can then generate contextualized images of the subject in different scenes, poses, and views.

![Dreambooth examples from the project's blog](https://dreambooth.github.io/DreamBooth_files/teaser_static.jpg)
<small>Dreambooth examples from the <a href="https://dreambooth.github.io">project's blog.</a></small>

This guide shows how to finetune DreamBooth with the [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) model on GPUs of various sizes and with Flax. If you want to dig deeper and see how things work, all the DreamBooth training scripts used in this guide can be found [here](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth).

Before running the scripts, install the library's training dependencies. We also recommend installing 🧨 Diffusers from the `main` GitHub branch:
```bash
pip install git+https://github.com/huggingface/diffusers
pip install -U -r diffusers/examples/dreambooth/requirements.txt
```
xFormers is not a training requirement, but we recommend [installing](../optimization/xformers) it if you can, because it can speed up training and reduce memory usage.

Once all the dependencies are set up, initialize a [🤗 Accelerate](https://github.com/huggingface/accelerate/) environment with:
```bash
accelerate config
```

To set up a default 🤗 Accelerate environment without choosing any configuration options, run:

```bash
accelerate config default
```

Or, if your environment does not support an interactive shell (a notebook, for example), you can use:

```py
from accelerate.utils import write_basic_config

write_basic_config()
```
## Finetuning

<Tip warning={true}>

DreamBooth finetuning is very sensitive to hyperparameters and overfits easily. To help you choose appropriate hyperparameters, we recommend reading our [in-depth analysis](https://huggingface.co/blog/dreambooth), which includes recommended settings for different subjects.

</Tip>

<frameworkcontent>
<pt>

Let's try DreamBooth with [a few images of a dog](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ).
Download the images, save them to a directory, and set the `INSTANCE_DIR` environment variable to that path:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export OUTPUT_DIR="path_to_saved_model"
```

Then launch the training script with the following command (the full training script is available [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py)):

```bash
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400
```
</pt>
<jax>

If you have access to TPUs or want to train even faster, you can try the [Flax training script](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_flax.py). The Flax training script does not support gradient checkpointing or gradient accumulation, so you need a GPU with at least 30GB of memory.

Before running the script, make sure the requirements are installed:
```bash
pip install -U -r requirements.txt
```

You can then launch the training script with the following command:

```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --max_train_steps=400
```
</jax>
</frameworkcontent>
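Before launching either script, it can help to confirm that the folder behind `INSTANCE_DIR` really holds the handful of subject images DreamBooth expects. The following is a minimal sketch that uses only the Python standard library. The helper name `list_instance_images` and the set of accepted file extensions are assumptions made for this example; they are not part of the training scripts.

```python
from pathlib import Path

# Assumed set of image formats for this illustration; adjust it to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def list_instance_images(instance_dir):
    """Return the image files found directly inside `instance_dir`, sorted by name.

    Hypothetical helper (not part of Diffusers): files whose extension is not in
    IMAGE_EXTENSIONS are ignored, and a missing directory raises an error.
    """
    root = Path(instance_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"INSTANCE_DIR does not exist: {root}")
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Calling `len(list_instance_images("path_to_training_images"))` gives a quick count to compare against the 3-5 images mentioned at the top of this guide.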
### Finetuning with prior-preserving loss

Prior preservation is used to avoid overfitting and language drift (see the [paper](https://arxiv.org/abs/2208.12242) if you are interested in the details). For prior preservation, other images of the same class are used as part of the training process. The nice thing is that you can generate those images with the Stable Diffusion model itself! The training script saves the generated images to a local path that you specify.

According to the authors, it is a good idea to generate `num_epochs * num_samples` images for prior preservation. In most cases, 200-300 images work well.
<frameworkcontent>
<pt>

```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

</pt>
<jax>

```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --num_class_images=200 \
  --max_train_steps=800
```

</jax>
</frameworkcontent>
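To make the role of `--prior_loss_weight` more concrete, here is a minimal, framework-free sketch of how a prior-preservation objective combines two terms: a reconstruction loss on the instance images and a weighted loss on the class images. It uses plain Python lists in place of tensors, and the function names are illustrative assumptions; the training scripts compute comparable quantities on batched tensors.

```python
def mse(pred, target):
    """Mean squared error between two equally sized sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preservation_loss(instance_pred, instance_target,
                            class_pred, class_target,
                            prior_loss_weight=1.0):
    """Instance loss plus the class (prior) loss scaled by `prior_loss_weight`."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss


# Toy numbers: the instance term is 0.25 and the prior term is 1.0.
loss = prior_preservation_loss(
    [0.5, 0.5], [0.0, 0.0],  # instance prediction and target
    [1.0, 3.0], [2.0, 2.0],  # class prediction and target
    prior_loss_weight=1.0,
)
print(loss)  # 1.25
```

With `prior_loss_weight=0.0` the class term disappears and only the instance loss remains, which is the setting where overfitting and language drift are no longer counteracted.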
## Finetuning the text encoder and UNet

The script also lets you finetune the `text_encoder` together with the `unet`. In our experiments (see the [Training Stable Diffusion with DreamBooth using 🧨 Diffusers](https://huggingface.co/blog/dreambooth) post for details), this gives much better results, especially when generating images of faces.

<Tip warning={true}>

Training the text encoder requires additional memory and will not fit on a 16GB GPU. You need at least 24GB of VRAM to use this option.

</Tip>

Pass the `--train_text_encoder` argument to the training script to finetune both the `text_encoder` and the `unet`:
<frameworkcontent>
<pt>

```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

</pt>
<jax>

```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=2e-6 \
  --num_class_images=200 \
  --max_train_steps=800
```

</jax>
</frameworkcontent>
## Finetuning with LoRA

LoRA (Low-Rank Adaptation of Large Language Models) is a finetuning technique for accelerating the training of large models, and it can be used with DreamBooth. See the [LoRA training](training/lora#dreambooth) guide for details.

### Saving checkpoints while training

It is easy to overfit while training with Dreambooth, so it is sometimes useful to save regular checkpoints during training. One of the intermediate checkpoints may actually work better than the final model! To enable checkpoint saving, pass the following argument to the training script:
```bash
  --checkpointing_steps=500
```

This saves the full training state in subfolders of your `output_dir`. Subfolder names start with the prefix `checkpoint-`, followed by the number of steps performed so far. For example, `checkpoint-1500` is a checkpoint saved after 1500 training steps.

#### Resuming training from a saved checkpoint

To resume training from a saved checkpoint, pass the `--resume_from_checkpoint` argument followed by the name of the checkpoint you want to use. You can also use the special string `"latest"` to resume from the last saved checkpoint (that is, the one with the largest number of steps). For example, the following resumes training from the checkpoint saved after 1500 steps:
```bash
  --resume_from_checkpoint="checkpoint-1500"
```
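The `checkpoint-<step>` naming makes it straightforward to work out which subfolder `"latest"` refers to. The following is a minimal sketch using only the standard library; `find_latest_checkpoint` is a hypothetical helper written for illustration, not a function exported by Diffusers or Accelerate.

```python
from pathlib import Path


def find_latest_checkpoint(output_dir):
    """Return the name of the `checkpoint-<step>` subfolder with the largest step.

    Returns None when `output_dir` contains no such subfolder.
    """
    candidates = []
    for path in Path(output_dir).iterdir():
        prefix, _, step = path.name.partition("-")
        if path.is_dir() and prefix == "checkpoint" and step.isdigit():
            candidates.append((int(step), path.name))
    # Compare step counts numerically so that checkpoint-1500 beats checkpoint-500.
    return max(candidates)[1] if candidates else None
```

For an `output_dir` that contains `checkpoint-500` and `checkpoint-1500`, the helper returns `"checkpoint-1500"`, which is the value you would otherwise pass to `--resume_from_checkpoint` by hand.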
When resuming, you can adjust some of the hyperparameters if you wish.
#### Inference from a saved checkpoint

Saved checkpoints are stored in a format suitable for resuming training. They include not only the model weights, but also the state of the optimizer, the data loaders, and the learning rate.
If you have **`"accelerate>=0.16.0"`** installed, use the following code to run inference from an intermediate checkpoint.

```python
from diffusers import DiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel
import torch

# Load the pipeline with the same arguments (model, revision) that were used for training.
model_id = "CompVis/stable-diffusion-v1-4"

unet = UNet2DConditionModel.from_pretrained(
    "/sddata/dreambooth/daruma-v2-1/checkpoint-100/unet", torch_dtype=torch.float16
)

# If you trained with `--train_text_encoder`, make sure to load the text encoder as well.
text_encoder = CLIPTextModel.from_pretrained(
    "/sddata/dreambooth/daruma-v2-1/checkpoint-100/text_encoder", torch_dtype=torch.float16
)

# `torch_dtype` is the keyword that selects the precision; all components use float16 here.
pipeline = DiffusionPipeline.from_pretrained(
    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16
)
pipeline.to("cuda")

# Run inference, save the pipeline, or push it to the Hub.
pipeline.save_pretrained("dreambooth-pipeline")
```
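Whether the text encoder has to be loaded depends on whether the checkpoint folder contains one. As a small illustration, and assuming the layout implied by the paths above (a `unet` subfolder and, optionally, a `text_encoder` subfolder inside each checkpoint), this standard-library sketch reports which components a checkpoint provides. The helper name `checkpoint_components` is hypothetical.

```python
from pathlib import Path


def checkpoint_components(checkpoint_dir, names=("unet", "text_encoder")):
    """Map each expected component name to whether its subfolder exists.

    Hypothetical helper: the default names follow the subfolders used in the
    loading example and are an assumption about the checkpoint layout.
    """
    root = Path(checkpoint_dir)
    return {name: (root / name).is_dir() for name in names}
```

If `text_encoder` maps to `False`, the run was probably started without `--train_text_encoder`, and the `text_encoder=` argument can be left out so that the pipeline keeps the base model's text encoder.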
If you have **`"accelerate<0.16.0"`** installed, you need to convert the checkpoint to an inference pipeline first:

```python
from accelerate import Accelerator
from diffusers import DiffusionPipeline

# Load the pipeline with the same arguments (model, revision) that were used for training.
model_id = "CompVis/stable-diffusion-v1-4"
pipeline = DiffusionPipeline.from_pretrained(model_id)

accelerator = Accelerator()

# Include text_encoder if `--train_text_encoder` was used for the initial training.
unet, text_encoder = accelerator.prepare(pipeline.unet, pipeline.text_encoder)

# Restore the state from the checkpoint path. You have to use an absolute path here.
accelerator.load_state("/sddata/dreambooth/daruma-v2-1/checkpoint-100")

# Rebuild the pipeline with the unwrapped models (assigning to .unet and .text_encoder should work too).
pipeline = DiffusionPipeline.from_pretrained(
    model_id,
    unet=accelerator.unwrap_model(unet),
    text_encoder=accelerator.unwrap_model(text_encoder),
)

# Run inference, save the pipeline, or push it to the Hub.
pipeline.save_pretrained("dreambooth-pipeline")
```
## Optimizations for different GPU sizes

Depending on your hardware, there are several ways to optimize DreamBooth for GPUs ranging from 16GB down to 8GB!

### xFormers

[xFormers](https://github.com/facebookresearch/xformers) is a toolbox for optimizing Transformers, and it includes a [memory-efficient attention](https://facebookresearch.github.io/xformers/components/ops.html#module-xformers.ops) mechanism that is used in 🧨 Diffusers. [Install xFormers](./optimization/xformers) and then add the following argument to your training script:
```bash
  --enable_xformers_memory_efficient_attention
```
xFormers is not available in Flax.

### Set gradients to none

Another way to lower memory usage is to [set the gradients](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html) to `None` instead of zero. This may change certain behaviors, however, so if you run into problems, try removing this argument. Add the following argument to your training script to set the gradients to `None`:
```bash
  --set_grads_to_none
```
### 16GB GPU

With the help of gradient checkpointing and the 8-bit optimizer from [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), you can train DreamBooth on a 16GB GPU. Make sure bitsandbytes is installed:
```bash
pip install bitsandbytes
```

Then pass the `--use_8bit_adam` option to the training script:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
### 12GB GPU

To run DreamBooth on a 12GB GPU, you need to enable gradient checkpointing, the 8-bit optimizer, and xFormers, and set the gradients to `None`:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --set_grads_to_none \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
### Training on an 8GB GPU

For 8GB GPUs, you can use [DeepSpeed](https://www.deepspeed.ai/) to offload some of the tensors from VRAM to the CPU or NVMe, which lets you train with less GPU memory.

Run the following command to configure your 🤗 Accelerate environment:
```bash
accelerate config
```

During configuration, confirm that you want to use DeepSpeed.
By combining DeepSpeed stage 2 with fp16 mixed precision and offloading both the model parameters and the optimizer state to the CPU, you can train with under 8GB of VRAM.
The drawback is that this requires more system RAM (about 25GB). See the [DeepSpeed documentation](https://huggingface.co/docs/accelerate/usage_guides/deepspeed) for more configuration options.

You should also replace the default Adam optimizer with DeepSpeed's optimized version of Adam,
[`deepspeed.ops.adam.DeepSpeedCPUAdam`](https://deepspeed.readthedocs.io/en/latest/optimizers.html#adam-cpu), for a substantial speedup.
Enabling `DeepSpeedCPUAdam` requires the CUDA toolchain version on your system to match the one installed with PyTorch.

8-bit optimizers do not currently appear to be compatible with DeepSpeed.

Launch training with the following command:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --sample_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800 \
  --mixed_precision=fp16
```
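The memory recommendations in this section can be summarized as a small lookup. The sketch below only restates the flag combinations used in the commands of this guide; the function name `suggested_flags` and the exact thresholds are assumptions chosen for illustration, and real memory use depends on your hardware and library versions.

```python
def suggested_flags(vram_gb):
    """Return the memory-related flags this guide pairs with a given amount of VRAM (in GB).

    Hypothetical helper: it mirrors the example commands and is not an official rule.
    """
    if vram_gb >= 24:
        # Enough memory to finetune the text encoder along with the UNet.
        return ["--train_text_encoder", "--use_8bit_adam", "--gradient_checkpointing"]
    if vram_gb >= 16:
        return ["--gradient_checkpointing", "--use_8bit_adam"]
    if vram_gb >= 12:
        return [
            "--gradient_checkpointing",
            "--use_8bit_adam",
            "--enable_xformers_memory_efficient_attention",
            "--set_grads_to_none",
        ]
    # Around 8GB: DeepSpeed is enabled through `accelerate config`, not a script flag.
    # The 8-bit optimizer is left out because it does not seem to work with DeepSpeed.
    return ["--gradient_checkpointing", "--mixed_precision=fp16"]
```

For example, `suggested_flags(12)` lists the four options enabled in the 12GB command, while `suggested_flags(8)` omits `--use_8bit_adam` because DeepSpeed supplies the optimizer in that setup.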
## Inference

Once you have trained a model, you can run inference with the [`StableDiffusionPipeline`] by specifying the path where the model was saved. Make sure that your prompt includes the special `identifier` used during training (`sks` in the previous examples).

If you have **`"accelerate>=0.16.0"`** installed, you can use the following code to run inference from an intermediate checkpoint:
```python
from diffusers import StableDiffusionPipeline
import torch

model_id = "path_to_saved_model"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "A photo of sks dog in a bucket"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

image.save("dog-bucket.png")
```
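Because a prompt without the identifier falls back to the generic class, a quick check before generating can save a wasted run. This is a minimal sketch using only the standard library; `prompt_has_identifier` is a hypothetical helper, and `sks` is simply the identifier used in the examples of this guide.

```python
import re


def prompt_has_identifier(prompt, identifier="sks"):
    """Return True if `identifier` appears in `prompt` as a whole word."""
    pattern = rf"\b{re.escape(identifier)}\b"
    return re.search(pattern, prompt) is not None
```

`prompt_has_identifier("A photo of sks dog in a bucket")` is true, whereas a prompt such as `"A photo of a dog in a bucket"` is flagged as missing the identifier.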
You can also run inference from any of the [saved training checkpoints](#inference-from-a-saved-checkpoint).