
Common VAEs for diffusion models: usage and training


kl-f8-VAE

The Latent Diffusion Models repo ships several KL-regularized VAEs (kl-f8, kl-f4, ...), and these VAEs can be pretrained on your own dataset:

Loss used: L1 + LPIPS

Repository: GitHub - CompVis/latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models (https://github.com/CompVis/latent-diffusion)
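In that repo, autoencoder training is launched through its `main.py` together with one of the `configs/autoencoder/*.yaml` configs (the kl-f8 variant corresponds to the 32x32x4 latent config). As a rough illustration of the reconstruction objective mentioned above, here is a minimal PyTorch sketch of an L1 + LPIPS loss; the `lpips` package and the `lpips_weight` value are illustrative assumptions, and the repo's full objective additionally includes a KL regularization term and an adversarial (patch discriminator) term.

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips -- third-party perceptual-loss package, used here for illustration

# Perceptual metric (VGG backbone); build once and reuse, frozen.
perceptual = lpips.LPIPS(net="vgg").eval()
for p in perceptual.parameters():
    p.requires_grad_(False)

def reconstruction_loss(recon: torch.Tensor, target: torch.Tensor,
                        lpips_weight: float = 1.0) -> torch.Tensor:
    """L1 + LPIPS reconstruction loss; images are expected in [-1, 1].

    The weight is a hypothetical value -- the actual balance for kl-f8 /
    the ft checkpoints is set inside the latent-diffusion configs.
    """
    l1 = F.l1_loss(recon, target)
    percep = perceptual(recon, target).mean()
    return l1 + lpips_weight * percep
```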

f8-ft-EMA, f8-ft-MSE

No training code appears to have been released for these...

How the two differ from kl-f8-VAE:

kl-f8-VAE was trained on OpenImages (the per-model card notes this, not ImageNet), whereas f8-ft-EMA / f8-ft-MSE are fine-tuned from it to improve Stable Diffusion's reconstruction of faces and humans.

1). sd-vae-ft-ema

- trained on LAION-Aesthetics + LAION-Humans: the first, ft-EMA, was resumed from the original checkpoint, trained for 313k steps, and uses EMA weights (a brief sketch of EMA weight averaging follows after this list). It uses the same loss configuration as the original checkpoint (L1 + LPIPS).

stabilityai/sd-vae-ft-ema (https://huggingface.co/stabilityai/sd-vae-ft-ema)

2). sd-vae-ft-mse

- continued training on the same dataset, but in a way that makes the outputs smoother: the second, ft-MSE, was resumed from ft-EMA, also uses EMA weights, and was trained for another 280k steps with a different loss that puts more emphasis on MSE reconstruction (MSE + 0.1 * LPIPS). It produces somewhat "smoother" outputs. The batch size for both versions was 192 (16 A100s, batch size 12 per GPU).

stabilityai/sd-vae-ft-mse (https://huggingface.co/stabilityai/sd-vae-ft-mse)
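For reference, "EMA weights" means the released checkpoint is an exponential moving average of the training weights rather than the final raw iterate. A minimal sketch of how such an average is typically maintained during training (the decay value here is an assumption for illustration):

```python
import copy
import torch

@torch.no_grad()
def update_ema(ema_model: torch.nn.Module, model: torch.nn.Module,
               decay: float = 0.9999) -> None:
    """Blend the live training weights into the EMA copy after each optimizer step."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# Usage sketch: keep a frozen copy alongside the training model.
model = torch.nn.Linear(4, 4)            # stand-in for the VAE
ema_model = copy.deepcopy(model).eval()
for p in ema_model.parameters():
    p.requires_grad_(False)
# ... then call update_ema(ema_model, model) after every training step,
# and publish / evaluate with ema_model's weights.
```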

The links above include side-by-side comparisons of the two models when used to decode generated images. In practice, ft-EMA tends to give sharper results, while ft-MSE gives smoother ones.
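Both fine-tuned VAEs can be dropped into a Stable Diffusion pipeline with the diffusers library, following the pattern shown on the model cards; a minimal sketch (the base model ID and the prompt are just examples):

```python
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load the fine-tuned VAE and plug it into an SD 1.x pipeline.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")  # or "stabilityai/sd-vae-ft-ema"
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # example base model
    vae=vae,
)
pipe = pipe.to("cuda")

image = pipe("a portrait photo of a person, detailed face").images[0]
image.save("sample.png")
```

For a direct reconstruction comparison, the same `AutoencoderKL` can also be used on its own: `vae.encode(images).latent_dist.sample()` to get latents and `vae.decode(latents).sample` to reconstruct.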

From: https://blog.csdn.net/weixin_43135178/article/details/136614403
