FCN (Fully Convolutional Network): Building It with PyTorch

Date: 2023-08-06 16:46:31
Tags: kernel, parameters, nn, convolution, self, pytorch, size, FCN, pool4

Code adapted from: https://github.com/sovit-123/Semantic-Segmentation-using-Fully-Convlutional-Networks

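Prerequisites:

Download the pretrained weights and extract the layer instances. Running the code below downloads the weights automatically into C:\Users\**\.cache\torch\hub\checkpoints.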

from torchvision import models
vgg = models.vgg16(pretrained=True)  # newer torchvision versions prefer the weights= argument

Extract the layers: vgg.features is the feature-extraction part of VGG16 (the convolutional layers), and vgg.classifier is the classifier part (the fully connected layers).

print("----show VGG16's features.children()----")

# feats = vgg.features.children()  # a generator: <generator object Module.children at 0x0000021CCC997580>
feats = list(vgg.features.children())
# print(*feats)  # unpack the list and print every element (*list is only valid as function arguments, not as a standalone expression)

for i, layer in enumerate(feats):
    print("====={0}======".format(i))
    print(layer)  # each layer of the network
# print(feats[0:9])  # layers 0-8, i.e. the first 9 layers
# print(*feats[0:9])  # unpacked: no longer a list but 9 separate arguments

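Convolution and transposed convolution ("deconvolution"): with matching kernel_size, stride and padding, the transposed convolution maps the output shape of the convolution back to the input shape. It reverses the shape change only; it does not recover the input values.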

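import torch
import torch.nn as nn

con = nn.Conv2d(1, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
dec = nn.ConvTranspose2d(16, 1, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
feat = torch.randn((1, 1, 5, 5))  # (batch, channels, height, width)
feat_c = con(feat)
feat_d = dec(feat_c)
print(feat.shape)    # torch.Size([1, 1, 5, 5])
print(feat_c.shape)  # torch.Size([1, 16, 3, 3])
print(feat_d.shape)  # torch.Size([1, 1, 5, 5])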
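
The spatial sizes printed above can be checked by hand with the standard output-size formulas for nn.Conv2d and nn.ConvTranspose2d (with dilation 1). A small sketch; the helper names conv_out and deconv_out are made up for illustration:

```python
def conv_out(n, kernel, stride, padding):
    """Output size of a convolution along one dimension (dilation = 1)."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out(n, kernel, stride, padding, output_padding=0):
    """Output size of a transposed convolution along one dimension (dilation = 1)."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


# Same hyperparameters as the con / dec layers above
print(conv_out(5, kernel=3, stride=2, padding=1))    # 3
print(deconv_out(3, kernel=3, stride=2, padding=1))  # 5

# Inputs of size 5 and 6 both map to 3, so the shape is only recovered up to
# output_padding; the values are not recovered at all.
print(conv_out(6, kernel=3, stride=2, padding=1))                      # 3
print(deconv_out(3, kernel=3, stride=2, padding=1, output_padding=1))  # 6
```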

The complete model-building code follows; only the model part is excerpted here for reference:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models
import logging
from itertools import chain

# Base class that gives a model its descriptive utilities, e.g. a logger and a printed summary
class BaseModel(nn.Module):
    def __init__(self):
        super(BaseModel, self).__init__()
        self.logger = logging.getLogger(self.__class__.__name__)
    # method that subclasses must override
    def forward(self):
        raise NotImplementedError
    
    # print the parameter counts and write them to the log
    def summary(self):
        # count all parameters
        total_params = sum(p.numel() for p in self.parameters())
        print(f"{total_params:,} total parameters.")
        total_trainable_params = sum(
            p.numel() for p in self.parameters() if p.requires_grad)
        print(f"{total_trainable_params:,} training parameters.")
        self.logger.info(f'Nbr of trainable parameters: {total_trainable_params}')
    
    # return a description of the model
    def __str__(self):
        total_params = sum(p.numel() for p in self.parameters())
        print(f"{total_params:,} total parameters.")
        total_trainable_params = sum(
            p.numel() for p in self.parameters() if p.requires_grad)
        print(f"{total_trainable_params:,} training parameters.")
        return super(BaseModel, self).__str__() + f'\nNbr of trainable parameters: {total_trainable_params}'
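
What summary() and __str__() count can be seen on a toy network. A minimal sketch, assuming a made-up two-layer model whose first layer is frozen:

```python
import torch.nn as nn

# Toy model (illustrative only): Linear(3, 2) has 3*2 + 2 = 8 parameters,
# Linear(2, 1) has 2*1 + 1 = 3 parameters.
net = nn.Sequential(nn.Linear(3, 2), nn.Linear(2, 1))

# Freeze the first layer by switching off gradients for its parameters
for p in net[0].parameters():
    p.requires_grad = False

total_params = sum(p.numel() for p in net.parameters())
trainable_params = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(f"{total_params:,} total parameters.")         # 11 total parameters.
print(f"{trainable_params:,} training parameters.")  # 3 training parameters.
```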

Upsampling weights

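# The upsampling kernel weights defined here are fixed values (they are not learned).
# Returns a tensor of shape (in_channels, out_channels, k, k): slice (i, i) is a bilinear
# interpolation kernel b and every other slice is zero (this requires in_channels == out_channels).
# For example, with 3 channels: [[b, 0, 0], [0, b, 0], [0, 0, b]]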
def get_upsampling_weight(in_channels, out_channels, kernel_size):
    factor = (kernel_size + 1) // 2
    if kernel_size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
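    # np.ogrid returns a (kernel_size, 1) column and a (1, kernel_size) row; their broadcast product is a kernel_size x kernel_size matrix (similar to meshgrid)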
    og = np.ogrid[:kernel_size, :kernel_size]
    filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    weight = np.zeros((in_channels, out_channels, kernel_size, kernel_size), dtype=np.float64)
    weight[list(range(in_channels)), list(range(out_channels)), :, :] = filt
    return torch.from_numpy(weight).float()
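
For intuition, the filter built by get_upsampling_weight can be printed for kernel_size=4, the size used by the 2x upsampling layers. The sketch below repeats only the filter computation with NumPy; the helper name bilinear_filter is illustrative and not part of the original code:

```python
import numpy as np


def bilinear_filter(kernel_size):
    """2-D filter computed the same way as in get_upsampling_weight (illustrative helper)."""
    factor = (kernel_size + 1) // 2
    center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    og = np.ogrid[:kernel_size, :kernel_size]  # shapes (k, 1) and (1, k)
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


filt = bilinear_filter(4)
print(filt[:, 0] / filt[0, 0])  # 1-D profile up to scale: [1. 3. 3. 1.]
print(filt)
# [[0.0625 0.1875 0.1875 0.0625]
#  [0.1875 0.5625 0.5625 0.1875]
#  [0.1875 0.5625 0.5625 0.1875]
#  [0.0625 0.1875 0.1875 0.0625]]
```

The profile falls off linearly from the center (a tent shape), which is what bilinear interpolation uses; it is not a Gaussian.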

The FCN8 model. Its backbone (the feature-extraction network) is VGG16, loaded with the pretrained weights provided by torchvision.

class FCN8(BaseModel):
    def __init__(self, num_classes, pretrained=True, freeze_bn=False, **_):
        super(FCN8, self).__init__()
        vgg = models.vgg16(pretrained=pretrained)
        features = list(vgg.features.children())
        classifier = list(vgg.classifier.children())
        features[0].padding = (100, 100)
        for layer in features:
            if 'MaxPool' in layer.__class__.__name__:
                # __class__ looks like torch.nn.modules.pooling.MaxPool2d
                # __name__ is then MaxPool2d
                # enable ceil mode in max pooling to avoid size mismatches when upsampling
                layer.ceil_mode = True
        # extract pool3, pool4 and pool5 from the VGG net
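        # the first 17 layers (indices 0-16, up to and including pool3) form the first feature block
        self.pool3 = nn.Sequential(*features[:17])
        # layers 17-23 (up to and including pool4) form the second feature block
        self.pool4 = nn.Sequential(*features[17:24])
        # layer 24 and everything after it (up to and including pool5) form the third feature block
        self.pool5 = nn.Sequential(*features[24:])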
        
        # adjust the depth of pool3 and pool4 to num_classes
        self.adj_pool3 = nn.Conv2d(256, num_classes, kernel_size=1)
        self.adj_pool4 = nn.Conv2d(512, num_classes, kernel_size=1)
        
        # replace the FC layer of VGG with conv layers
        conv6 = nn.Conv2d(512, 4096, kernel_size=7)
        conv7 = nn.Conv2d(4096, 4096, kernel_size=1)
        output = nn.Conv2d(4096, num_classes, kernel_size=1)
        
        # copy the weights from VGG's FC pretrained layers
        conv6.weight.data.copy_(classifier[0].weight.data.view(
            conv6.weight.data.size()))
        conv6.bias.data.copy_(classifier[0].bias.data)
        
        conv7.weight.data.copy_(classifier[3].weight.data.view(
            conv7.weight.data.size()))
        conv7.bias.data.copy_(classifier[3].bias.data)
        
        # get the outputs
        self.output = nn.Sequential(conv6, nn.ReLU(inplace=True), nn.Dropout(),
                                    conv7, nn.ReLU(inplace=True), nn.Dropout(), 
                                    output)
        
        # we need three upsampling layers:
        # 2x upsampling (size n -> 2n + 2) of the score output
        # 2x upsampling (size n -> 2n + 2) of the sum of pool4 and the upsampled output
        # 8x upsampling (size n -> 8n + 8) of the final sum (pool3 + upsampled pool4 sum)
        self.up_output = nn.ConvTranspose2d(num_classes, num_classes,
                                            kernel_size=4, stride=2, bias=False)
        self.up_pool4_out = nn.ConvTranspose2d(num_classes, num_classes, 
                                            kernel_size=4, stride=2, bias=False)
        self.up_final = nn.ConvTranspose2d(num_classes, num_classes, 
                                            kernel_size=16, stride=8, bias=False)
        
        # initialize the upsampling weights with bilinear interpolation kernels
        self.up_output.weight.data.copy_(
            get_upsampling_weight(num_classes, num_classes, 4))
        self.up_pool4_out.weight.data.copy_(
            get_upsampling_weight(num_classes, num_classes, 4))
        self.up_final.weight.data.copy_(
            get_upsampling_weight(num_classes, num_classes, 16))
        
        # freeze these weights: this is a fixed upsampling, not a learned deconvolution
        for m in self.modules():
            if isinstance(m, nn.ConvTranspose2d):
                m.weight.requires_grad = False
        if freeze_bn: self.freeze_bn()
    
    def forward(self, x):
        imh_H, img_W = x.size()[2], x.size()[3]
        
        # forward the image
        pool3 = self.pool3(x)
        pool4 = self.pool4(pool3)
        pool5 = self.pool5(pool4)

        # get the outputs and upsample them
        output = self.output(pool5)
        up_output = self.up_output(output)

        # adjust pool4, crop it, and add the upsampled outputs to it
        adjstd_pool4 = self.adj_pool4(0.01 * pool4)
        add_out_pool4 = self.up_pool4_out(adjstd_pool4[:, :, 5: (5 + up_output.size()[2]), 
                                            5: (5 + up_output.size()[3])]
                                           + up_output)
        
        # adjust pool3, crop it, and add it to the upsampled result of the previous addition
        adjstd_pool3 = self.adj_pool3(0.0001 * pool3)
        final_value = self.up_final(adjstd_pool3[:, :, 9: (9 + add_out_pool4.size()[2]), 9: (9 + add_out_pool4.size()[3])]
                                 + add_out_pool4)

        # crop away the padded regions to recover the input image size
        final_value = final_value[:, :, 31: (31 + imh_H), 31: (31 + img_W)].contiguous()
        return final_value
    
    def get_backbone_params(self):
        return chain(self.pool3.parameters(), self.pool4.parameters(), self.pool5.parameters(), self.output.parameters())

    def get_decoder_params(self):
        return chain(self.up_output.parameters(), self.adj_pool4.parameters(), self.up_pool4_out.parameters(),
            self.adj_pool3.parameters(), self.up_final.parameters())

    def freeze_bn(self):
        for module in self.modules():
            if isinstance(module, nn.BatchNorm2d): module.eval()
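
The step that copies VGG's fully connected weights into conv6 relies on a Linear layer over a flattened C x H x W feature map being equivalent to a convolution whose kernel covers the whole map. A minimal sketch with made-up small sizes (2 channels, a 3 x 3 map, 4 outputs) instead of VGG's 512 x 7 x 7 -> 4096:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only; VGG16's first FC layer is Linear(512 * 7 * 7, 4096)
fc = nn.Linear(2 * 3 * 3, 4)
conv = nn.Conv2d(2, 4, kernel_size=3)

# Same copy pattern as for conv6 in FCN8: reshape the FC weight to the conv weight shape
conv.weight.data.copy_(fc.weight.data.view(conv.weight.data.size()))
conv.bias.data.copy_(fc.bias.data)

x = torch.randn(1, 2, 3, 3)
with torch.no_grad():
    out_fc = fc(x.flatten(1))      # shape (1, 4)
    out_conv = conv(x).flatten(1)  # shape (1, 4, 1, 1) flattened to (1, 4)

print(torch.allclose(out_fc, out_conv, atol=1e-5))
```

On a larger input the convolutional version slides over the feature map and produces a score map instead of a single vector, which is why the FC layers are replaced by convolutions here.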

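Define a tensor to test the forward pass

fcn8 = FCN8(9)  # 9 output classes
x = torch.randn((4, 3, 28, 28))  # a batch of 4 RGB images, 28 x 28
out = fcn8(x)
print(out.shape)  # torch.Size([4, 9, 28, 28])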
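
Why the output has the same height and width as the input can be followed with size arithmetic alone. The sketch below traces one spatial dimension through FCN8 as defined above, assuming that 3 x 3 convolutions with padding 1 keep the size (as in VGG16) and that a 2 x 2, stride-2 max pool with ceil_mode=True maps n to ceil(n / 2); the function name trace_fcn8 is illustrative:

```python
import math


def trace_fcn8(h):
    """Trace one spatial dimension of an input of size h through FCN8 (illustrative)."""
    n = h + 2 * 100 - 2                 # first conv: 3x3 kernel, padding 100
    for _ in range(3):                  # pool1..pool3
        n = math.ceil(n / 2)
    pool3 = n
    pool4 = math.ceil(pool3 / 2)
    pool5 = math.ceil(pool4 / 2)
    score = pool5 - 6                   # conv6: 7x7 kernel, no padding
    up_output = 2 * score + 2           # ConvTranspose2d(kernel_size=4, stride=2)
    assert pool4 >= 5 + up_output       # crop pool4[5 : 5 + up_output] must fit
    up_pool4 = 2 * up_output + 2        # ConvTranspose2d(kernel_size=4, stride=2)
    assert pool3 >= 9 + up_pool4        # crop pool3[9 : 9 + up_pool4] must fit
    up_final = 8 * up_pool4 + 8         # ConvTranspose2d(kernel_size=16, stride=8)
    assert up_final >= 31 + h           # final crop [31 : 31 + h] must fit
    return {"pool3": pool3, "pool4": pool4, "pool5": pool5, "score": score,
            "up_output": up_output, "up_pool4": up_pool4, "up_final": up_final}


print(trace_fcn8(28))
# {'pool3': 29, 'pool4': 15, 'pool5': 8, 'score': 2, 'up_output': 6, 'up_pool4': 14, 'up_final': 120}
```

For a 28-pixel input the last upsampling yields 120 pixels, and the final crop [31 : 31 + 28] cuts this back to 28. The large padding of 100 on the first convolution is what leaves enough margin for the 7 x 7 conv6 and for the crops at offsets 5, 9 and 31.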

 

From: https://www.cnblogs.com/zhaoke271828/p/17609540.html
