kl-f8-VAE
The Latent Diffusion Models repository includes several KL-regularized VAEs (kl-f8, kl-f4, ...); these VAEs can be pretrained on your own dataset.
Loss function used: L1 + LPIPS
Repository: https://github.com/CompVis/latent-diffusion (High-Resolution Image Synthesis with Latent Diffusion Models)
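The L1 + LPIPS objective above can be sketched as follows. This is a minimal illustration, not the repository's actual training code: `toy_lpips` is a hypothetical stand-in for a real LPIPS network (e.g. the `lpips` package), and the weighting is illustrative.

```python
import numpy as np

# Sketch of the kl-f8 VAE reconstruction objective: a pixel-space L1
# term plus a perceptual (LPIPS) term. Illustrative only; a real
# LPIPS distance compares deep network features, not raw pixels.
def reconstruction_loss(recon, target, perceptual_fn, perceptual_weight=1.0):
    l1 = float(np.abs(recon - target).mean())    # L1 pixel term
    perceptual = perceptual_fn(recon, target)    # LPIPS-style term
    return l1 + perceptual_weight * perceptual

# Toy stand-in for LPIPS (a real one wraps a VGG/AlexNet model).
def toy_lpips(a, b):
    return float(((a - b) ** 2).mean())

target = np.zeros((1, 3, 8, 8))
recon = np.full((1, 3, 8, 8), 0.5)
print(reconstruction_loss(recon, target, toy_lpips))  # 0.5 + 0.25 = 0.75
```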
f8-ft-EMA / f8-ft-MSE
No training code was found for these...
How the two differ from "kl-f8-VAE":
kl-f8-VAE was trained on ImageNet, whereas f8-ft-EMA / f8-ft-MSE were fine-tuned to improve Stable Diffusion's rendering of faces.
1). sd-vae-ft-ema
- Trained on LAION-Aesthetics + human data. The first, ft-EMA, was resumed from the original checkpoint, trained for 313k steps, and uses EMA weights. It uses the same loss configuration as the original checkpoint (L1 + LPIPS).
stabilityai/sd-vae-ft-ema: https://huggingface.co/stabilityai/sd-vae-ft-ema
2). sd-vae-ft-mse
- Continued training on the same dataset, but in a way that makes the outputs smoother. The second, ft-MSE, was resumed from ft-EMA, also uses EMA weights, and was trained for another 280k steps with a different loss that puts more emphasis on MSE reconstruction (MSE + 0.1 * LPIPS). It produces somewhat "smoother" outputs. The batch size for both versions was 192 (16 A100s, batch size 12 per GPU).
stabilityai/sd-vae-ft-mse: https://huggingface.co/stabilityai/sd-vae-ft-mse
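For contrast, the ft-MSE objective swaps the pixel term to MSE and down-weights the perceptual term (MSE + 0.1 * LPIPS). A toy sketch, again with a hypothetical stand-in for the LPIPS network:

```python
import numpy as np

# Shape of the ft-MSE loss: MSE reconstruction plus a lightly
# weighted perceptual term (MSE + 0.1 * LPIPS). Illustrative only.
def ft_mse_loss(recon, target, perceptual_fn, perceptual_weight=0.1):
    mse = float(((recon - target) ** 2).mean())
    return mse + perceptual_weight * perceptual_fn(recon, target)

def toy_lpips(a, b):  # stand-in; real LPIPS compares deep features
    return float(np.abs(a - b).mean())

target = np.zeros((1, 3, 8, 8))
recon = np.full((1, 3, 8, 8), 0.5)
print(ft_mse_loss(recon, target, toy_lpips))  # 0.25 + 0.1 * 0.5 = 0.3
```

The heavier MSE term averages out high-frequency detail, which is consistent with the "smoother" outputs described above.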
The links above include side-by-side comparisons of the two models when used to assist image generation. In practical experience, ft-EMA tends to give sharper outputs and ft-MSE smoother ones.
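To actually use one of these fine-tuned VAEs with Stable Diffusion, the usual route is the `diffusers` library: load the AutoencoderKL from one of the Hugging Face repos linked above and pass it into the pipeline. A sketch; the base-model ID is an assumption (use whichever SD 1.x checkpoint you normally run):

```python
def load_sd_with_vae(vae_id="stabilityai/sd-vae-ft-mse",
                     base_id="runwayml/stable-diffusion-v1-5"):
    """Return a Stable Diffusion pipeline whose VAE is replaced by a
    fine-tuned one. Imports live inside the function because calling
    it downloads weights (pip install diffusers torch)."""
    from diffusers import AutoencoderKL, StableDiffusionPipeline
    vae = AutoencoderKL.from_pretrained(vae_id)
    pipe = StableDiffusionPipeline.from_pretrained(base_id, vae=vae)
    return pipe

# Usage (downloads several GB of weights on first call):
# pipe = load_sd_with_vae("stabilityai/sd-vae-ft-ema")
# image = pipe("a portrait photo").images[0]
```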
Tags: diffusion, EMA, training, f8, VAE, ft, MSE. From: https://blog.csdn.net/weixin_43135178/article/details/136614403