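1. Introduction
This article documents how to use SCConv to optimize the YOLOv9 object detection model. Deep neural networks carry substantial redundancy, not only in their dense model parameters but also along the spatial and channel dimensions of their feature maps. The SCConv module limits feature redundancy by jointly reducing the spatial and channel redundancy in convolutional layers. This article uses the SCConv module to modify YOLOv9 in order to improve the model's performance and efficiency.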
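2. Introduction to SCConv
SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy
The SCConv (Spatial and Channel reconstruction Convolution) module was proposed to address the heavy consumption of computing resources caused by feature redundancy in convolutional neural networks. Its design principle and advantages are as follows: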
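2.1 Principle
SCConv consists of two units: the Spatial Reconstruction Unit (SRU) and the Channel Reconstruction Unit (CRU).
- SRU: uses separate-and-reconstruct operations to exploit the spatial redundancy of the features. Specifically, it uses the scaling factors of a Group Normalization (GN) layer to assess the information content of the different feature maps, splits the feature maps by weight into an information-rich part and a less informative part, and then combines the two parts through a cross-reconstruction operation to reduce spatial redundancy and strengthen the feature representation.
- CRU: uses a Split-Transform-Fuse strategy to exploit the channel redundancy of the features. It first splits and squeezes the channels of the spatially refined feature map, then transforms the split feature maps with efficient convolutions (group-wise convolution, GWC, and point-wise convolution, PWC) to extract high-level representative information at reduced computational cost, and finally fuses the output features adaptively with a simplified SKNet method, thereby reducing redundancy in the channel dimension.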
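The SRU gating and cross-reconstruction described above can be sketched on plain Python lists. This is a minimal illustration rather than the module itself: the function names `sru_gate` and `cross_reconstruct` are hypothetical, each list element stands in for one channel at a single spatial position, and the gating rule is assumed to follow the soft variant used in this article's implementation (weights above the threshold become 1 in one branch and 0 in the other, the rest keep their sigmoid value in both).

```python
import math


def sru_gate(gn_values, gammas, threshold=0.5):
    """Return the two gating weight lists (w1, w2) for one spatial position.

    gn_values: group-normalized activations, one per channel (assumed input).
    gammas: GN scaling factors, one per channel (assumed input).
    """
    total = sum(gammas)
    w_gamma = [g / total for g in gammas]  # normalized scaling factors
    reweights = [1.0 / (1.0 + math.exp(-v * w)) for v, w in zip(gn_values, w_gamma)]
    w1 = [1.0 if r > threshold else r for r in reweights]  # informative branch
    w2 = [0.0 if r > threshold else r for r in reweights]  # less informative branch
    return w1, w2


def cross_reconstruct(x1, x2):
    """Swap and add the channel halves of the two gated branches."""
    half = len(x1) // 2
    x11, x12 = x1[:half], x1[half:]
    x21, x22 = x2[:half], x2[half:]
    return [a + b for a, b in zip(x11, x22)] + [a + b for a, b in zip(x12, x21)]


# Toy data: four channels at one position (illustrative values only).
x = [2.0, -2.0, 1.0, -1.0]
gn_values = [2.0, -2.0, 4.0, -4.0]
gammas = [1.0, 1.0, 1.0, 1.0]

w1, w2 = sru_gate(gn_values, gammas)
x1 = [w * v for w, v in zip(w1, x)]
x2 = [w * v for w, v in zip(w2, x)]
y = cross_reconstruct(x1, x2)
print(y)
```

The output keeps the same number of channels as the input: channels whose weight exceeds the threshold pass through the first branch unchanged, while the remaining ones contribute a down-weighted value to both branches.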
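2.2 Advantages
- Less redundant computation: by exploiting redundancy in both the spatial and channel dimensions, SCConv reduces the model's computation and parameter count, which lowers computational cost.
- Better representative feature learning: the design of the SRU and CRU helps strengthen the feature representation, producing features that are more representative and expressive.
- Generality and flexibility: SCConv is a plug-and-play module that can directly replace the standard convolution in a variety of convolutional neural networks without further changes to the model architecture.
- Performance gains: experimental results indicate that models with SCConv embedded achieve better performance at lower complexity and computational cost, outperforming other state-of-the-art methods on tasks such as image classification and object detection.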
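To make the parameter-reduction claim concrete, the following sketch counts learnable parameters for the layer layout used by the implementation in this article and compares the total with a plain 3x3 convolution of the same width. The helper names are hypothetical, the baseline is assumed to be a bias-free 3x3 convolution with equal input and output channels, and the count covers this single layer only, not a full YOLOv9 model.

```python
def conv_params(c_in, c_out, k=1, groups=1, bias=False):
    """Number of learnable parameters in a 2D convolution."""
    return (c_in // groups) * c_out * k * k + (c_out if bias else 0)


def scconv_params(c, alpha=0.5, squeeze_radio=2, group_size=2, k=3):
    """Parameter count of SRU + CRU as laid out in this article's code."""
    up = int(alpha * c)
    low = c - up
    up_sq, low_sq = up // squeeze_radio, low // squeeze_radio
    total = 2 * c                                                       # GroupNorm weight and bias (SRU)
    total += conv_params(up, up_sq)                                     # squeeze1
    total += conv_params(low, low_sq)                                   # squeeze2
    total += conv_params(up_sq, c, k=k, groups=group_size, bias=True)   # GWC (bias enabled by default)
    total += conv_params(up_sq, c)                                      # PWC1
    total += conv_params(low_sq, c - low_sq)                            # PWC2
    return total


channels = 64
standard = conv_params(channels, channels, k=3)
scconv = scconv_params(channels)
print(f"standard 3x3 conv: {standard}, SCConv: {scconv}, ratio: {scconv / standard:.3f}")
```

With 64 channels and the default settings, this gives 7,616 parameters for SCConv against 36,864 for the plain convolution, roughly one fifth.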
3. SCConv Implementation
The implementation of the SCConv module is as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupBatchnorm2d(nn.Module):
    """Hand-written group normalization, used when SRU is built with torch_gn=False."""

    def __init__(self, c_num: int,
                 group_num: int = 16,
                 eps: float = 1e-10
                 ):
        super(GroupBatchnorm2d, self).__init__()
        assert c_num >= group_num
        self.group_num = group_num
        self.weight = nn.Parameter(torch.randn(c_num, 1, 1))
        self.bias = nn.Parameter(torch.zeros(c_num, 1, 1))
        self.eps = eps

    def forward(self, x):
        N, C, H, W = x.size()
        # Flatten each group so the statistics are computed per sample and per group
        x = x.view(N, self.group_num, -1)
        mean = x.mean(dim=2, keepdim=True)
        std = x.std(dim=2, keepdim=True)
        x = (x - mean) / (std + self.eps)
        x = x.view(N, C, H, W)
        return x * self.weight + self.bias

class SRU(nn.Module):
    """Spatial Reconstruction Unit: separate features by information content, then cross-reconstruct."""

    def __init__(self,
                 oup_channels: int,
                 group_num: int = 16,
                 gate_treshold: float = 0.5,  # gate threshold; spelling kept for compatibility with callers
                 torch_gn: bool = True
                 ):
        super().__init__()
        self.gn = nn.GroupNorm(num_channels=oup_channels, num_groups=group_num) if torch_gn else GroupBatchnorm2d(
            c_num=oup_channels, group_num=group_num)
        self.gate_treshold = gate_treshold
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        gn_x = self.gn(x)
        # Normalize the GN scaling factors so that they sum to 1
        w_gamma = self.gn.weight / self.gn.weight.sum()
        w_gamma = w_gamma.view(1, -1, 1, 1)
        reweights = self.sigmoid(gn_x * w_gamma)
        # Gate: weights above the threshold become 1 in w1 and 0 in w2;
        # the remaining weights keep their sigmoid value in both
        w1 = torch.where(reweights > self.gate_treshold, torch.ones_like(reweights), reweights)
        w2 = torch.where(reweights > self.gate_treshold, torch.zeros_like(reweights), reweights)
        x_1 = w1 * x
        x_2 = w2 * x
        y = self.reconstruct(x_1, x_2)
        return y

    def reconstruct(self, x_1, x_2):
        # Cross-reconstruction: swap and add the channel halves of the two branches
        x_11, x_12 = torch.split(x_1, x_1.size(1) // 2, dim=1)
        x_21, x_22 = torch.split(x_2, x_2.size(1) // 2, dim=1)
        return torch.cat([x_11 + x_22, x_12 + x_21], dim=1)

class CRU(nn.Module):
    """Channel Reconstruction Unit: Split, Transform, Fuse."""

    def __init__(self,
                 op_channel: int,
                 alpha: float = 1 / 2,
                 squeeze_radio: int = 2,  # squeeze ratio; spelling kept for compatibility with callers
                 group_size: int = 2,
                 group_kernel_size: int = 3,
                 ):
        super().__init__()
        self.up_channel = up_channel = int(alpha * op_channel)
        self.low_channel = low_channel = op_channel - up_channel
        # 1x1 convolutions that squeeze each split by squeeze_radio
        self.squeeze1 = nn.Conv2d(up_channel, up_channel // squeeze_radio, kernel_size=1, bias=False)
        self.squeeze2 = nn.Conv2d(low_channel, low_channel // squeeze_radio, kernel_size=1, bias=False)
        # up: group-wise convolution (GWC) and point-wise convolution (PWC1)
        self.GWC = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=group_kernel_size, stride=1,
                             padding=group_kernel_size // 2, groups=group_size)
        self.PWC1 = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=1, bias=False)
        # low: point-wise convolution (PWC2); its output is concatenated with the squeezed input
        self.PWC2 = nn.Conv2d(low_channel // squeeze_radio, op_channel - low_channel // squeeze_radio, kernel_size=1,
                              bias=False)
        self.advavg = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        # Split
        up, low = torch.split(x, [self.up_channel, self.low_channel], dim=1)
        up, low = self.squeeze1(up), self.squeeze2(low)
        # Transform
        Y1 = self.GWC(up) + self.PWC1(up)
        Y2 = torch.cat([self.PWC2(low), low], dim=1)
        # Fuse: softmax over the pooled channel statistics, then add the two halves
        out = torch.cat([Y1, Y2], dim=1)
        out = F.softmax(self.advavg(out), dim=1) * out
        out1, out2 = torch.split(out, out.size(1) // 2, dim=1)
        return out1 + out2

def autopad(k, p=None, d=1):
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p

class SCConv(nn.Module):
    """SCConv: applies the SRU followed by the CRU; input and output have op_channel channels."""

    def __init__(self,
                 op_channel: int,
                 group_num: int = 4,
                 gate_treshold: float = 0.5,
                 alpha: float = 1 / 2,
                 squeeze_radio: int = 2,
                 group_size: int = 2,
                 group_kernel_size: int = 3,
                 ):
        super().__init__()
        self.SRU = SRU(op_channel,
                       group_num=group_num,
                       gate_treshold=gate_treshold)
        self.CRU = CRU(op_channel,
                       alpha=alpha,
                       squeeze_radio=squeeze_radio,
                       group_size=group_size,
                       group_kernel_size=group_kernel_size)

    def forward(self, x):
        x = self.SRU(x)  # reduce spatial redundancy
        x = self.CRU(x)  # reduce channel redundancy
        return x
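Because the CRU splits, squeezes, and regroups channels, not every channel count works with this module. The sketch below traces the channel bookkeeping of the code above in plain Python and raises an error when a constraint is violated. The function name `trace_scconv_channels` and the keys of the returned dictionary are hypothetical; the checks reflect the divisibility requirements of `nn.GroupNorm`, of grouped `nn.Conv2d`, and of the two-way split in `SRU.reconstruct`.

```python
def trace_scconv_channels(op_channel, group_num=4, alpha=0.5,
                          squeeze_radio=2, group_size=2):
    """Return the intermediate channel counts of SCConv, or raise ValueError."""
    if op_channel % group_num != 0:
        raise ValueError("GroupNorm needs op_channel divisible by group_num")
    if op_channel % 2 != 0:
        raise ValueError("SRU.reconstruct splits the channels into two equal halves")

    up = int(alpha * op_channel)
    low = op_channel - up
    up_sq = up // squeeze_radio
    low_sq = low // squeeze_radio
    if up_sq < 1 or low_sq < 1:
        raise ValueError("too few channels left after squeezing")
    if up_sq % group_size != 0 or op_channel % group_size != 0:
        raise ValueError("GWC needs input and output channels divisible by group_size")

    pwc2_out = op_channel - low_sq
    y1 = op_channel            # GWC(up) + PWC1(up)
    y2 = pwc2_out + low_sq     # cat([PWC2(low), low])
    return {
        "up": up,
        "low": low,
        "up_squeezed": up_sq,
        "low_squeezed": low_sq,
        "pwc2_out": pwc2_out,
        "out": (y1 + y2) // 2,  # the fused tensor is split in half and summed
    }


info = trace_scconv_channels(32)
print(info)
```

Running such a check on the channel widths in a model configuration before training can catch shape errors early, for example a width that is not divisible by `group_num`.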
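4. Integration Steps
4.1 Modify common.py
The file to modify here is models/common.py. common.py defines the common building blocks of the network architecture, so adding a new module only requires placing its code in this file.
4.1.1 Novel Module ⭐
Module modification: RepNCSPELAN4 based on SCConv.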