
Image Operations Based on Stable Diffusion in Diffusers

Posted: 2023-02-24 11:22:43

Contents

Image operations based on Stable Diffusion:

  • Text-to-image generation: StableDiffusionPipeline
  • Image-to-image text-guided generation: StableDiffusionImg2ImgPipeline
  • In-painting: StableDiffusionInpaintPipeline
  • Text-guided image super-resolution: StableDiffusionUpscalePipeline
  • Generating variations of an input image: StableDiffusionImageVariationPipeline
  • Image editing from text instructions: StableDiffusionInstructPix2PixPipeline
  • ......

Helper functions

import requests
from PIL import Image
from io import BytesIO

def show_images(imgs, rows=1, cols=3):
    """Paste a list of PIL images into a single rows x cols grid."""
    assert len(imgs) == rows * cols
    # Use the largest width/height among the inputs as the cell size
    w_ori, h_ori = imgs[0].size
    for img in imgs:
        w_new, h_new = img.size
        if w_new != w_ori or h_new != h_ori:
            w_ori = max(w_ori, w_new)
            h_ori = max(h_ori, h_new)

    grid = Image.new('RGB', size=(cols*w_ori, rows*h_ori))

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w_ori, i//cols*h_ori))
    return grid

def download_image(url):
    response = requests.get(url)
    response.raise_for_status()  # fail fast on a bad download
    return Image.open(BytesIO(response.content)).convert("RGB")
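As a quick sanity check, the grid logic above can be exercised with two solid-color stand-in images (a minimal sketch, independent of any model):

```python
from PIL import Image

# Two solid-color 64x48 test images, stand-ins for generated samples
imgs = [Image.new("RGB", (64, 48), color) for color in ("red", "blue")]

rows, cols = 1, 2
w, h = imgs[0].size
grid = Image.new("RGB", size=(cols * w, rows * h))
for i, img in enumerate(imgs):
    grid.paste(img, box=(i % cols * w, i // cols * h))

print(grid.size)  # (128, 48)
```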

Text-To-Image

Generating an image from text is implemented in diffusers by StableDiffusionPipeline; the only required input is a prompt. Example code:

from diffusers import StableDiffusionPipeline

image_pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

device = "cuda"
image_pipe.to(device)

prompt = ["a photograph of an astronaut riding a horse"] * 3
out_images = image_pipe(prompt).images
for i, out_image in enumerate(out_images):
    out_image.save("astronaut_rides_horse" + str(i) + ".png")

Example output:

Image-To-Image

Generate a new image from a text prompt and an initial image. In diffusers this is implemented by StableDiffusionImg2ImgPipeline; the pipeline has two required inputs: prompt and image (the initial image). Example code:

import torch
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda"
model_id_or_path = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
pipe = pipe.to(device)

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = download_image(url)
init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"

images = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images

grid_img = show_images([init_image, images[0]], 1, 2)
grid_img.save("fantasy_landscape.png")
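The strength parameter (0 to 1) controls how much of the noise schedule is applied to the init image, and therefore how far the output can drift from it. Roughly, the pipeline runs about int(num_inference_steps * strength) denoising steps; a sketch of that schedule-truncation logic (function name here is illustrative, not the library's API):

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    # strength=1.0 fully re-noises the init image, so all steps run;
    # smaller values keep more of the original image
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    return init_timestep

print(effective_steps(50, 0.75))  # 37
```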

Example output:

In-painting

Given a mask image and a text prompt, in-painting edits only a specific region of an image. It is implemented by StableDiffusionInpaintPipeline, and the input has three parts: the original image, the mask image, and a prompt.

Example code:

import torch
from diffusers import StableDiffusionInpaintPipeline

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
images = pipe(prompt=prompt, image=init_image, mask_image=mask_image).images
grid_img = show_images([init_image, mask_image, images[0]], 1, 3)
grid_img.save("overture-creations.png")
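By convention, white pixels in the mask mark the region to be regenerated and black pixels are preserved. A mask can also be built programmatically instead of downloaded (a sketch using Pillow; the rectangle coordinates are arbitrary):

```python
from PIL import Image, ImageDraw

# Black canvas = keep everything; white rectangle = region to repaint
mask = Image.new("L", (512, 512), 0)
ImageDraw.Draw(mask).rectangle([128, 128, 384, 384], fill=255)

print(mask.getpixel((0, 0)), mask.getpixel((256, 256)))  # 0 255
```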

Example output:

Upscale

Super-resolving a low-resolution image is implemented by StableDiffusionUpscalePipeline; the required inputs are a prompt and the low-resolution image. Example code:

import torch
from diffusers import StableDiffusionUpscalePipeline

# load model and scheduler
model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, torch_dtype=torch.float16, cache_dir="./models/")
pipeline = pipeline.to("cuda")

# let's download an example image
url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
low_res_img = download_image(url)
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
grid_img = show_images([low_res_img, upscaled_image], 1, 2)
grid_img.save("a_white_cat.png")
print("low_res_img size: ", low_res_img.size)
print("upscaled_image size: ", upscaled_image.size)

Example output; by default the 128 x 128 cat image is upscaled to 512 x 512:

By default both the width and the height are scaled up by a factor of four, i.e.:

input: 128 x 128 ==> output: 512 x 512
input: 64 x 256 ==> output: 256 x 1024
...
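That fixed factor (matching the "x4" in the model name) can be expressed directly, which is handy for pre-computing output dimensions (a trivial sketch):

```python
def upscaled_size(width: int, height: int, factor: int = 4):
    # The x4 upscaler multiplies both spatial dimensions by 4
    return (width * factor, height * factor)

print(upscaled_size(128, 128))  # (512, 512)
print(upscaled_size(64, 256))   # (256, 1024)
```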

In my experience, the prompt has little influence on the upscaled result; a rough description is enough.

For details about this model, see the reference.

Instruct-Pix2Pix

Key reference

Edit an image according to an instruction prompt, implemented by StableDiffusionInstructPix2PixPipeline; the required inputs are prompt and image. Example code:

import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, cache_dir="./models/")
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
image = download_image(url)

prompt = "make the mountains snowy"
images = pipe(prompt, image=image, num_inference_steps=20, image_guidance_scale=1.5, guidance_scale=7).images
grid_img = show_images([image, images[0]], 1, 2)
grid_img.save("snowy_mountains.png")
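The two scales interact through classifier-free guidance over three noise predictions: unconditional, image-conditioned, and text+image-conditioned. Roughly, guidance_scale pushes toward the text instruction while image_guidance_scale pushes toward faithfulness to the input image. A sketch with scalar stand-ins for the noise tensors (variable names are illustrative):

```python
def combine_guidance(eps_uncond, eps_image, eps_full,
                     guidance_scale=7.0, image_guidance_scale=1.5):
    # Text guidance moves from the image-only prediction toward the
    # full (text + image) one; image guidance moves from unconditional
    # toward image-conditioned.
    return (eps_uncond
            + guidance_scale * (eps_full - eps_image)
            + image_guidance_scale * (eps_image - eps_uncond))

print(combine_guidance(0.0, 1.0, 2.0))  # 8.5
```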

Example output:

From: https://www.cnblogs.com/shuezhang/p/17150635.html
