References:
/modelscope/DiffSynth-Studio
/modelscope/DiffSynth-Studio/tree/main/examples/train/kolors
Download and installation:
Environment
conda create -n diff python==3.10
conda activate diff
git clone /modelscope/
cd DiffSynth-Studio
pip install -e .
Additional packages required for training
pip install peft lightning pandas torchvision
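Before starting a training run, it can help to confirm that the extra packages are importable in the active conda environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration and is not part of DiffSynth-Studio.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages named in the pip command above.
    required = ["peft", "lightning", "pandas", "torchvision"]
    missing = missing_packages(required)
    print("missing:", missing if missing else "none")
```

An empty result only shows that the packages can be located; it does not check their versions.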
Models
export HF_ENDPOINT=
cd models
mkdir kolors
cd kolors
huggingface-cli download --resume-download --local-dir-use-symlinks False Kwai-Kolors/Kolors --local-dir Kolors
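## Return to models/ first: the training command further down expects the VAE at models/sdxl-vae-fp16-fix
cd ..
mkdir sdxl-vae-fp16-fix
cd sdxl-vae-fp16-fix
## Note: the command below downloads into a folder named diffusion_pytorch_model.safetensors, which adds one extra directory level. The training command expects the file diffusion_pytorch_model.safetensors to sit directly under sdxl-vae-fp16-fix, so move it up one level after the download. If it is in the wrong place, training fails with: with safe_open(file_path, framework="pt", device="cpu") as f: OSError: No such device (os error 19)
huggingface-cli download --resume-download --local-dir-use-symlinks False madebyollin/sdxl-vae-fp16-fix diffusion_pytorch_model.safetensors --local-dir diffusion_pytorch_model.safetensors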
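The extra directory level described in the note can be removed by hand, or with a small script. The following is a minimal sketch, assuming the nested layout is `sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors/diffusion_pytorch_model.safetensors`; the function name `flatten_nested_download` is hypothetical.

```python
import shutil
from pathlib import Path


def flatten_nested_download(vae_dir, filename="diffusion_pytorch_model.safetensors"):
    """Move <vae_dir>/<filename>/<filename> up so that <vae_dir>/<filename> is the file itself."""
    vae_dir = Path(vae_dir)
    target = vae_dir / filename
    if target.is_file():
        return target  # Already in the expected place, nothing to do.
    nested = target / filename
    if not nested.is_file():
        raise FileNotFoundError(f"expected a file at {target} or {nested}")
    # Park the file under a temporary name, because the folder occupies the target name.
    tmp = vae_dir / (filename + ".tmp")
    shutil.move(str(nested), str(tmp))
    shutil.rmtree(target)  # Removes the leftover folder, including any download metadata in it.
    tmp.rename(target)
    return target
```

Calling `flatten_nested_download("models/sdxl-vae-fp16-fix")` from the repository root would then leave the weights where the training command looks for them.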
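Download the dataset:
Note: the metadata file (presumably metadata.csv) must be placed in the same train folder as the dataset images.
Address: /datasets/buptwq/lora-stable-diffusion-finetune/files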
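A quick consistency check on the dataset folder can catch a misplaced metadata file before training starts. This is a minimal sketch under stated assumptions: the metadata file is called `metadata.csv`, lives in `<dataset_path>/train`, and has a `file_name` column listing the images. None of these names are confirmed by the text above, so adjust them to the actual dataset.

```python
import csv
from pathlib import Path


def find_missing_images(dataset_path, metadata_name="metadata.csv", column="file_name"):
    """Return the image names listed in the metadata file that are absent from the train folder."""
    train_dir = Path(dataset_path) / "train"
    # Assumed layout: metadata file and images side by side in <dataset_path>/train.
    with open(train_dir / metadata_name, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return [row[column] for row in rows if not (train_dir / row[column]).is_file()]
```

For the dataset used here, `find_missing_images("data/dog")` should return an empty list when every listed image is present.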
Model training:
The dataset is very small, so one epoch is enough for a first run.
CUDA_VISIBLE_DEVICES=0 python examples/train/kolors/train_kolors_lora.py --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors --pretrained_text_encoder_path models/kolors/Kolors/text_encoder --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors --dataset_path data/dog --output_path ./models --max_epochs 1 --center_crop --use_gradient_checkpointing --precision "16-mixed"
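The one-line command is hard to read and easy to mistype. As a sketch, the same invocation can be assembled as an argument list in Python. The flags and paths are copied from the command above; the helper names `build_train_command` and `run_training` are made up for illustration.

```python
import os
import subprocess
import sys


def build_train_command(dataset_path, output_path, max_epochs=1, models_dir="models"):
    """Assemble the train_kolors_lora.py invocation as an argument list."""
    kolors = f"{models_dir}/kolors/Kolors"
    return [
        sys.executable, "examples/train/kolors/train_kolors_lora.py",
        "--pretrained_unet_path", f"{kolors}/unet/diffusion_pytorch_model.safetensors",
        "--pretrained_text_encoder_path", f"{kolors}/text_encoder",
        "--pretrained_fp16_vae_path",
        f"{models_dir}/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors",
        "--dataset_path", dataset_path,
        "--output_path", output_path,
        "--max_epochs", str(max_epochs),
        "--center_crop",
        "--use_gradient_checkpointing",
        "--precision", "16-mixed",
    ]


def run_training(command, gpu_ids="0"):
    """Run the command with CUDA_VISIBLE_DEVICES set, e.g. "0" or "0,1"."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    return subprocess.run(command, env=env, check=True)
```

From the repository root, `run_training(build_train_command("data/dog", "./models"))` would correspond to the single-GPU command, and passing `gpu_ids="0,1"` to the two-GPU variant.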
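One epoch takes about 6 minutes, so training for 10 epochs takes roughly 1 hour.
After training, the LoRA model is saved as a checkpoint (--output_path ./models can point to a more specific directory, for example ./models/lora):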
ls -lht models/lightning_logs/version_0/checkpoints/epoch=0-step=
(Screenshot in the original: the training loss.)
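Each run writes to a new `version_N` folder, so the checkpoint path changes between runs. The following is a minimal sketch for locating the most recently written checkpoint. It assumes the `lightning_logs/version_*/checkpoints/` layout shown in the `ls` command and a `.ckpt` file extension, which the truncated path above does not confirm.

```python
from pathlib import Path


def latest_checkpoint(output_path):
    """Return the most recently modified checkpoint under output_path, or None if there is none."""
    # Assumed layout: <output_path>/lightning_logs/version_N/checkpoints/<name>.ckpt
    pattern = "lightning_logs/version_*/checkpoints/*.ckpt"
    candidates = list(Path(output_path).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

With the settings used here, `latest_checkpoint("./models")` would point at the file to pass as `lora_path` when loading the LoRA.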
Multi-GPU training
CUDA_VISIBLE_DEVICES=0,1 python examples/train/kolors/train_kolors_lora.py --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors --pretrained_text_encoder_path models/kolors/Kolors/text_encoder --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors --dataset_path data/dog --output_path ./models --max_epochs 1 --center_crop --use_gradient_checkpointing --precision "16-mixed"
The training loss needs to drop into the 0.0x range (below 0.1) before the results look good.
Loading and using the LoRA
from diffsynth import ModelManager, KolorsImagePipeline
from peft import LoraConfig, inject_adapter_in_model
import torch


def load_lora(model, lora_rank, lora_alpha, lora_path):
    # Inject empty LoRA adapters into the attention projections,
    # then fill them with the trained weights from the checkpoint.
    lora_config = LoraConfig(
        r=lora_rank,
        lora_alpha=lora_alpha,
        init_lora_weights="gaussian",
        target_modules=["to_q", "to_k", "to_v", "to_out"],
    )
    model = inject_adapter_in_model(lora_config, model)
    state_dict = torch.load(lora_path, map_location="cpu")
    # strict=False tolerates keys that exist in the model but not in the checkpoint.
    model.load_state_dict(state_dict, strict=False)
    return model


# Load models
model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
                             file_path_list=[
                                 "models/kolors/Kolors/text_encoder",
                                 "models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors",
                                 "models/kolors/Kolors/vae/diffusion_pytorch_model.safetensors"
                             ])
pipe = KolorsImagePipeline.from_model_manager(model_manager)

# Generate an image with lora
pipe.unet = load_lora(
    pipe.unet,
    lora_rank=4, lora_alpha=4.0, # The two parameters should be consistent with those in your training script.
    lora_path="models/lightning_logs/version_0/checkpoints/epoch=9-step="
)

torch.manual_seed(0)
# Prompt: "A puppy bouncing around, surrounded by colorful flowers, with mountains in the distance"
image = pipe(
    prompt="一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是山脉",
    negative_prompt="",
    cfg_scale=4,
    num_inference_steps=50, height=1024, width=1024,
)
image.save("image_with_lora.jpg")
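The comment in the code says that `lora_rank` and `lora_alpha` should match the training script. A minimal pure-Python sketch of the standard LoRA formulation, W' = W + (alpha / r) * B * A, shows why: the same trained matrices give a different effective weight when the scale alpha / r changes. This illustrates the arithmetic only; it is not how peft or DiffSynth-Studio implement it, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, rank, alpha):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


if __name__ == "__main__":
    weight = [[1, 0], [0, 1]]   # frozen base weight
    lora_a = [[1, 2]]           # r x in, with r = 1
    lora_b = [[1], [3]]         # out x r
    print(merge_lora(weight, lora_a, lora_b, rank=1, alpha=1))
    print(merge_lora(weight, lora_a, lora_b, rank=1, alpha=2))
```

The two printed matrices differ even though `lora_a` and `lora_b` are unchanged, which is the effect of loading a checkpoint with a mismatched `lora_alpha`.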
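Prompt: "一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是海洋" (a puppy bouncing around, surrounded by colorful flowers, with the ocean in the distance)
Inference code without the LoRA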
from diffsynth import ModelManager, KolorsImagePipeline
from peft import LoraConfig, inject_adapter_in_model
import torch


# Same helper as in the LoRA example; it is defined here but never called.
def load_lora(model, lora_rank, lora_alpha, lora_path):
    lora_config = LoraConfig(
        r=lora_rank,
        lora_alpha=lora_alpha,
        init_lora_weights="gaussian",
        target_modules=["to_q", "to_k", "to_v", "to_out"],
    )
    model = inject_adapter_in_model(lora_config, model)
    state_dict = torch.load(lora_path, map_location="cpu")
    model.load_state_dict(state_dict, strict=False)
    return model


# Load models
model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
                             file_path_list=[
                                 "models/kolors/Kolors/text_encoder",
                                 "models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors",
                                 "models/kolors/Kolors/vae/diffusion_pytorch_model.safetensors"
                             ])
pipe = KolorsImagePipeline.from_model_manager(model_manager)

torch.manual_seed(0)
# Prompt: "A puppy bouncing around, surrounded by colorful flowers, with mountains in the distance"
image = pipe(
    prompt="一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是山脉",
    negative_prompt="",
    cfg_scale=4,
    num_inference_steps=50, height=1024, width=1024,
)
image.save("image_with_lora1.jpg")
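Both scripts use the same seed and prompt, so the two output files offer a rough sanity check. Because `load_state_dict(..., strict=False)` does not raise on mismatched keys, a LoRA that failed to load can go unnoticed, and byte-identical outputs would be a strong hint that this happened. The sketch below compares the two files by hash; the helper names are made up, and differing files only show that something changed, not that the LoRA is good.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def outputs_identical(path_a, path_b):
    """True if both files have exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)
```

For the files written above, the call would be `outputs_identical("image_with_lora.jpg", "image_with_lora1.jpg")`.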