
Hacking YOLOv8 || Using Gold-YOLO to Improve YOLOv8's Small-Object Detection Accuracy


1. Introduction

Gold-YOLO improves information-fusion efficiency through a novel Gather-and-Distribute (GD) mechanism, which uses convolution and self-attention operations to process information coming from different layers of the network. In this way, Gold-YOLO fuses multi-scale features more effectively and strikes a good balance between low latency and high accuracy.

For a detailed description of GOLD-YOLO, see the paper: https://arxiv.org/pdf/2309.11331.pdf

This article explains how to integrate GOLD-YOLO into YOLOv8.

Without further ado, on to the code!

2. Integrating GOLD-YOLO into YOLOv8

2.1 Step 1

First, locate the directory 'ultralytics/nn' and create an 'Addmodules' folder inside it. In that folder, create a gold.py file (name it however you like) and copy the GOLD-YOLO core code below into it. Note that the code depends on mmcv (it imports ConvModule and build_norm_layer), so mmcv must be installed.

# GOLD-YOLO core code
import torch
from torch import nn
import torch.nn.functional as F
import numpy as np
from mmcv.cnn import ConvModule, build_norm_layer
 
 
__all__ = ('Low_FAM', 'Low_IFM', 'Split', 'SimConv', 'Low_LAF', 'Inject', 'RepBlock', 'High_FAM', 'High_IFM', 'High_LAF')
 
class High_LAF(nn.Module):
    def forward(self, x1, x2):
        if torch.onnx.is_in_onnx_export():
            self.pool = onnx_AdaptiveAvgPool2d
        else:
            self.pool = nn.functional.adaptive_avg_pool2d
        
        N, C, H, W = x2.shape
        # output_size = np.array([H, W])
        output_size = [H, W]
        x1 = self.pool(x1, output_size)
        
        return torch.cat([x1, x2], 1)
 
class High_IFM(nn.Module):
    def __init__(self, block_num, embedding_dim, key_dim, num_heads,
                 mlp_ratio=4., attn_ratio=2., drop=0., attn_drop=0., drop_path=0.,
                 norm_cfg=dict(type='BN', requires_grad=True),
                 act_layer=nn.ReLU6):
        super().__init__()
        self.block_num = block_num
        # drop_path is passed as [max_rate, num_blocks] (e.g. [0.1, 2]); build a linear
        # stochastic-depth schedule from it, and leave scalar values (the default) untouched
        if isinstance(drop_path, (list, tuple)):
            drop_path = [x.item() for x in torch.linspace(0, drop_path[0], drop_path[1])]
        self.transformer_blocks = nn.ModuleList()
        for i in range(self.block_num):
            self.transformer_blocks.append(top_Block(
                    embedding_dim, key_dim=key_dim, num_heads=num_heads,
                    mlp_ratio=mlp_ratio, attn_ratio=attn_ratio,
                    drop=drop, drop_path=drop_path[i] if isinstance(drop_path, list) else drop_path,
                    norm_cfg=norm_cfg, act_layer=act_layer))
    
    def forward(self, x):
        # token * N 
        for i in range(self.block_num):
            x = self.transformer_blocks[i](x)
        return x
    
class Mlp(nn.Module):
    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.ReLU, drop=0.,
                 norm_cfg=dict(type='BN', requires_grad=True)):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = Conv2d_BN(in_features, hidden_features, norm_cfg=norm_cfg)
        self.dwconv = nn.Conv2d(hidden_features, hidden_features, 3, 1, 1, bias=True, groups=hidden_features)
        self.act = act_layer()
        self.fc2 = Conv2d_BN(hidden_features, out_features, norm_cfg=norm_cfg)
        self.drop = nn.Dropout(drop)
    
    def forward(self, x):
        x = self.fc1(x)
        x = self.dwconv(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x
 
class top_Block(nn.Module):
    
    def __init__(self, dim, key_dim, num_heads, mlp_ratio=4., attn_ratio=2., drop=0.,
                 drop_path=0., act_layer=nn.ReLU, norm_cfg=dict(type='BN2d', requires_grad=True)):
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        self.mlp_ratio = mlp_ratio
        
        self.attn = Attention(dim, key_dim=key_dim, num_heads=num_heads, attn_ratio=attn_ratio, activation=act_layer,
                              norm_cfg=norm_cfg)
        
        # NOTE: drop path for stochastic depth, we shall see if this is better than dropout here
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop,
                       norm_cfg=norm_cfg)
    
    def forward(self, x1):
        x1 = x1 + self.drop_path(self.attn(x1))
        x1 = x1 + self.drop_path(self.mlp(x1))
        return x1
 
def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output
 
 
class DropPath(nn.Module):
    """Drop paths (Stochastic Depth) per sample  (when applied in main path of residual blocks).
    """
    
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob
    
    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)
 
class Attention(torch.nn.Module):
    def __init__(self, dim, key_dim, num_heads,
                 attn_ratio=4,
                 activation=None,
                 norm_cfg=dict(type='BN', requires_grad=True), ):
        super().__init__()
        self.num_heads = num_heads
        self.scale = key_dim ** -0.5
        self.key_dim = key_dim
        self.nh_kd = nh_kd = key_dim * num_heads  # num_head key_dim
        self.d = int(attn_ratio * key_dim)
        self.dh = int(attn_ratio * key_dim) * num_heads
        self.attn_ratio = attn_ratio
        
        self.to_q = Conv2d_BN(dim, nh_kd, 1, norm_cfg=norm_cfg)
        self.to_k = Conv2d_BN(dim, nh_kd, 1, norm_cfg=norm_cfg)
        self.to_v = Conv2d_BN(dim, self.dh, 1, norm_cfg=norm_cfg)
        
        self.proj = torch.nn.Sequential(activation(), Conv2d_BN(
                self.dh, dim, bn_weight_init=0, norm_cfg=norm_cfg))
    
    def forward(self, x):  # x (B,N,C)
        B, C, H, W = get_shape(x)
        
        qq = self.to_q(x).reshape(B, self.num_heads, self.key_dim, H * W).permute(0, 1, 3, 2)
        kk = self.to_k(x).reshape(B, self.num_heads, self.key_dim, H * W)
        vv = self.to_v(x).reshape(B, self.num_heads, self.d, H * W).permute(0, 1, 3, 2)
        
        attn = torch.matmul(qq, kk)
        attn = attn.softmax(dim=-1)  # dim = k
        
        xx = torch.matmul(attn, vv)
        
        xx = xx.permute(0, 1, 3, 2).reshape(B, self.dh, H, W)
        xx = self.proj(xx)
        return xx
    
def get_shape(tensor):
    shape = tensor.shape
    if torch.onnx.is_in_onnx_export():
        shape = [i.cpu().numpy() for i in shape]
    return shape
    
class Conv2d_BN(nn.Sequential):
    def __init__(self, a, b, ks=1, stride=1, pad=0, dilation=1,
                 groups=1, bn_weight_init=1,
                 norm_cfg=dict(type='BN', requires_grad=True)):
        super().__init__()
        self.inp_channel = a
        self.out_channel = b
        self.ks = ks
        self.pad = pad
        self.stride = stride
        self.dilation = dilation
        self.groups = groups
        
        self.add_module('c', nn.Conv2d(
                a, b, ks, stride, pad, dilation, groups, bias=False))
        bn = build_norm_layer(norm_cfg, b)[1]
        nn.init.constant_(bn.weight, bn_weight_init)
        nn.init.constant_(bn.bias, 0)
        self.add_module('bn', bn)
 
class High_FAM(nn.Module):
    def __init__(self, stride, pool_mode='onnx'):
        super().__init__()
        self.stride = stride
        if pool_mode == 'torch':
            self.pool = nn.functional.adaptive_avg_pool2d
        elif pool_mode == 'onnx':
            self.pool = onnx_AdaptiveAvgPool2d
    
    def forward(self, inputs):
        B, C, H, W = get_shape(inputs[-1])
        H = (H - 1) // self.stride + 1
        W = (W - 1) // self.stride + 1
        
        # output_size = np.array([H, W])
        output_size = [H, W]
       
        if not hasattr(self, 'pool'):
            self.pool = nn.functional.adaptive_avg_pool2d
        
        if torch.onnx.is_in_onnx_export():
            self.pool = onnx_AdaptiveAvgPool2d
        
        out = [self.pool(inp, output_size) for inp in inputs]
        
        return torch.cat(out, dim=1)
 
class RepVGGBlock(nn.Module):
    '''RepVGGBlock is a basic rep-style block, including training and deploy status
    This code is based on https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py
    '''
    
    def __init__(self, in_channels, out_channels, kernel_size=3,
                 stride=1, padding=1, dilation=1, groups=1, padding_mode='zeros', deploy=False, use_se=False):
        super(RepVGGBlock, self).__init__()
        """ Initialization of the class.
        Args:
            in_channels (int): Number of channels in the input image
            out_channels (int): Number of channels produced by the convolution
            kernel_size (int or tuple): Size of the convolving kernel
            stride (int or tuple, optional): Stride of the convolution. Default: 1
            padding (int or tuple, optional): Zero-padding added to both sides of
                the input. Default: 1
            dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
            groups (int, optional): Number of blocked connections from input
                channels to output channels. Default: 1
            padding_mode (string, optional): Default: 'zeros'
            deploy: Whether to be deploy status or training status. Default: False
            use_se: Whether to use se. Default: False
        """
        self.deploy = deploy
        self.groups = groups
        self.in_channels = in_channels
        self.out_channels = out_channels
        
        assert kernel_size == 3
        assert padding == 1
        
        padding_11 = padding - kernel_size // 2
        
        self.nonlinearity = nn.ReLU()
        
        if use_se:
            raise NotImplementedError("se block not supported yet")
        else:
            self.se = nn.Identity()
        
        if deploy:
            self.rbr_reparam = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size,
                                         stride=stride,
                                         padding=padding, dilation=dilation, groups=groups, bias=True,
                                         padding_mode=padding_mode)
        
        else:
            self.rbr_identity = nn.BatchNorm2d(
                    num_features=in_channels) if out_channels == in_channels and stride == 1 else None
            self.rbr_dense = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size,
                                     stride=stride, padding=padding, groups=groups)
            self.rbr_1x1 = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride,
                                   padding=padding_11, groups=groups)
    
    def forward(self, inputs):
        '''Forward process'''
        if hasattr(self, 'rbr_reparam'):
            return self.nonlinearity(self.se(self.rbr_reparam(inputs)))
        
        if self.rbr_identity is None:
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)
        
        return self.nonlinearity(self.se(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out))
    
    def get_equivalent_kernel_bias(self):
        kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)
        kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)
        kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)
        return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid
    
    def _pad_1x1_to_3x3_tensor(self, kernel1x1):
        if kernel1x1 is None:
            return 0
        else:
            return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1])
    
    def _fuse_bn_tensor(self, branch):
        if branch is None:
            return 0, 0
        if isinstance(branch, nn.Sequential):
            kernel = branch.conv.weight
            running_mean = branch.bn.running_mean
            running_var = branch.bn.running_var
            gamma = branch.bn.weight
            beta = branch.bn.bias
            eps = branch.bn.eps
        else:
            assert isinstance(branch, nn.BatchNorm2d)
            if not hasattr(self, 'id_tensor'):
                input_dim = self.in_channels // self.groups
                kernel_value = np.zeros((self.in_channels, input_dim, 3, 3), dtype=np.float32)
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim, 1, 1] = 1
                self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
            kernel = self.id_tensor
            running_mean = branch.running_mean
            running_var = branch.running_var
            gamma = branch.weight
            beta = branch.bias
            eps = branch.eps
        std = (running_var + eps).sqrt()
        t = (gamma / std).reshape(-1, 1, 1, 1)
        return kernel * t, beta - running_mean * gamma / std
    
    def switch_to_deploy(self):
        if hasattr(self, 'rbr_reparam'):
            return
        kernel, bias = self.get_equivalent_kernel_bias()
        self.rbr_reparam = nn.Conv2d(in_channels=self.rbr_dense.conv.in_channels,
                                     out_channels=self.rbr_dense.conv.out_channels,
                                     kernel_size=self.rbr_dense.conv.kernel_size, stride=self.rbr_dense.conv.stride,
                                     padding=self.rbr_dense.conv.padding, dilation=self.rbr_dense.conv.dilation,
                                     groups=self.rbr_dense.conv.groups, bias=True)
        self.rbr_reparam.weight.data = kernel
        self.rbr_reparam.bias.data = bias
        for para in self.parameters():
            para.detach_()
        self.__delattr__('rbr_dense')
        self.__delattr__('rbr_1x1')
        if hasattr(self, 'rbr_identity'):
            self.__delattr__('rbr_identity')
        if hasattr(self, 'id_tensor'):
            self.__delattr__('id_tensor')
        self.deploy = True
 
class RepBlock(nn.Module):
    '''
        RepBlock is a stage block with rep-style basic block
    '''
    
    def __init__(self, in_channels, out_channels, n=1, block=RepVGGBlock, basic_block=RepVGGBlock):
        super().__init__()
        
        self.conv1 = block(in_channels, out_channels)
        self.block = nn.Sequential(*(block(out_channels, out_channels) for _ in range(n - 1))) if n > 1 else None
        '''
        if block == BottleRep:
            self.conv1 = BottleRep(in_channels, out_channels, basic_block=basic_block, weight=True)
            n = n // 2
            self.block = nn.Sequential(
                    *(BottleRep(out_channels, out_channels, basic_block=basic_block, weight=True) for _ in
                      range(n - 1))) if n > 1 else None
        '''
    
    def forward(self, x):
        x = self.conv1(x)
        if self.block is not None:
            x = self.block(x)
        return x
 
class Inject(nn.Module):
    def __init__(
            self,
            inp: int,
            oup: int,
            global_index: int,
            norm_cfg=dict(type='BN', requires_grad=True),
            activations=nn.ReLU6,
            global_inp=None,
    ) -> None:
        super().__init__()
        self.global_index = global_index
        self.norm_cfg = norm_cfg
        
        if not global_inp:
            global_inp = inp
        
        self.local_embedding = ConvModule(inp, oup, kernel_size=1, norm_cfg=self.norm_cfg, act_cfg=None)
        self.global_embedding = ConvModule(global_inp, oup, kernel_size=1, norm_cfg=self.norm_cfg, act_cfg=None)
        self.global_act = ConvModule(global_inp, oup, kernel_size=1, norm_cfg=self.norm_cfg, act_cfg=None)
        self.act = h_sigmoid()
    
    def forward(self, x_l, x_g):
        '''
        x_g: global features
        x_l: local features
        '''
        x_g = x_g[self.global_index]
        B, C, H, W = x_l.shape
        g_B, g_C, g_H, g_W = x_g.shape
        use_pool = H < g_H
 
        local_feat = self.local_embedding(x_l)
        global_act = self.global_act(x_g)
        global_feat = self.global_embedding(x_g)
        
        if use_pool:
            avg_pool = get_avg_pool()
            # output_size = np.array([H, W])
            output_size = [H, W]
            
            sig_act = avg_pool(global_act, output_size)
            global_feat = avg_pool(global_feat, output_size)
        
        else:
            sig_act = F.interpolate(self.act(global_act), size=(H, W), mode='bilinear', align_corners=False)
            global_feat = F.interpolate(global_feat, size=(H, W), mode='bilinear', align_corners=False)
        
        out = local_feat * sig_act + global_feat
        return out
    
class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)
    
    def forward(self, x):
        return self.relu(x + 3) / 6
    
def get_avg_pool():
    if torch.onnx.is_in_onnx_export():
        avg_pool = onnx_AdaptiveAvgPool2d
    else:
        avg_pool = nn.functional.adaptive_avg_pool2d
    return avg_pool
 
class Low_LAF(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.cv1 = SimConv(in_channels, out_channels, 1, 1)
        self.cv_fuse = SimConv(round(out_channels * 2.5), out_channels, 1, 1)
        self.downsample = nn.functional.adaptive_avg_pool2d
    
    def forward(self, x):
        N, C, H, W = x[1].shape
        # output_size = np.array([H, W])
        output_size = [H, W]
 
        
        if torch.onnx.is_in_onnx_export():
            self.downsample = onnx_AdaptiveAvgPool2d
            output_size = np.array([H, W])
        
        x0 = self.downsample(x[0], output_size)
        x1 = self.cv1(x[1])
        x2 = F.interpolate(x[2], size=(H, W), mode='bilinear', align_corners=False)
        return self.cv_fuse(torch.cat((x0, x1, x2), dim=1))
 
class SimConv(nn.Module):
    '''Normal Conv with ReLU VAN_activation'''
    
    def __init__(self, in_channels, out_channels, kernel_size, stride, groups=1, bias=False, padding=None):
        super().__init__()
        if padding is None:
            padding = kernel_size // 2
        self.conv = nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size=kernel_size,
                stride=stride,
                padding=padding,
                groups=groups,
                bias=bias,
        )
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU()
    
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
    
    def forward_fuse(self, x):
        return self.act(self.conv(x))
 
class Split(nn.Module):
    def __init__(self, trans_channels):
        super().__init__()
        self.trans_channels = trans_channels
 
    def forward(self, x):
        return x.split(self.trans_channels, dim=1)
 
class Low_IFM(nn.Module):
    def __init__(self, in_channels, embed_dims, fuse_block_num, out_channels):
        super().__init__()
        self.conv1 = Conv(in_channels, embed_dims, kernel_size=1, stride=1, padding=0)
        # an empty ModuleList keeps the forward loop a no-op when fuse_block_num == 0
        # (the original `else nn.Identity` would crash, since nn.Identity is not iterable)
        self.block = nn.ModuleList(
                [RepVGGBlock(embed_dims, embed_dims) for _ in range(fuse_block_num)]) if fuse_block_num > 0 else nn.ModuleList()
        self.conv2 = Conv(embed_dims, out_channels, kernel_size=1, stride=1, padding=0)
        
    
    def forward(self, x):
        x = self.conv1(x)
        for b in self.block:
            x = b(x)
        out = self.conv2(x)
        return out
 
class Low_FAM(nn.Module):
    def __init__(self):
        super().__init__()
        self.avg_pool = nn.functional.adaptive_avg_pool2d
    
    def forward(self, x):
        x_l, x_m, x_s, x_n = x
        B, C, H, W = x_s.shape
        # output_size = np.array([H, W])
        output_size = [H, W]
 
        if torch.onnx.is_in_onnx_export():
            self.avg_pool = onnx_AdaptiveAvgPool2d
 
        x_l = self.avg_pool(x_l, output_size)
        x_m = self.avg_pool(x_m, output_size)
        x_n = F.interpolate(x_n, size=(H, W), mode='bilinear', align_corners=False)
 
        out = torch.cat([x_l, x_m, x_s, x_n], 1)
        return out
 
 
def conv_bn(in_channels, out_channels, kernel_size, stride, padding, groups=1, bias=False):
    '''Basic cell for rep-style block, including conv and bn'''
    result = nn.Sequential()
    result.add_module('conv', nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                        kernel_size=kernel_size, stride=stride, padding=padding, groups=groups,
                                        bias=bias))
    result.add_module('bn', nn.BatchNorm2d(num_features=out_channels))
    return result
 
class Conv(nn.Module):
    '''Normal Conv with SiLU VAN_activation'''
    
    def __init__(self, in_channels, out_channels, kernel_size, stride, groups=1, bias=False, padding=None):
        super().__init__()
        
        if padding is None:
            padding = kernel_size // 2
        self.conv = nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size=kernel_size,
                stride=stride,
                padding=padding,
                groups=groups,
                bias=bias,
        )
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU()
    
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
    
def onnx_AdaptiveAvgPool2d(x, output_size):
    '''AvgPool2d-based replacement for adaptive_avg_pool2d that exports cleanly to ONNX'''
    output_size = np.array(output_size)  # accept plain lists as well as ndarrays
    stride_size = np.floor(np.array(x.shape[-2:]) / output_size).astype(np.int32)
    kernel_size = np.array(x.shape[-2:]) - (output_size - 1) * stride_size
    avg = nn.AvgPool2d(kernel_size=list(kernel_size), stride=list(stride_size))
    return avg(x)
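
Before wiring anything into YOLOv8, it can help to sanity-check the module shapes standalone. Below is a minimal sketch for Low_FAM; the channel counts and feature-map sizes are arbitrary illustrative values, not taken from any particular yaml:

# Minimal shape check (run gold.py directly; requires torch and mmcv installed).
if __name__ == '__main__':
    # four pyramid levels: x_l, x_m, x_s, x_n with arbitrary channels/resolutions
    feats = [torch.randn(1, c, s, s) for c, s in ((64, 80), (128, 40), (256, 20), (512, 10))]
    fused = Low_FAM()(feats)
    # x_l/x_m are average-pooled down to x_s's 20x20 grid, x_n is upsampled to it,
    # then all four are concatenated along channels: 64 + 128 + 256 + 512 = 960
    print(fused.shape)  # torch.Size([1, 960, 20, 20])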

Next, in the same directory (Addmodules), create a new .py file named '__init__.py' and add the following code to it.

The final layout is shown in the annotated figure.

from .gold import *

2.2 Step 2

Import the new modules in 'ultralytics/nn/tasks.py'.
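
The import itself was shown in a figure that is not reproduced here. Assuming the package layout from step 1, a line like the following near the top of tasks.py should work; a wildcard import is equally valid, since gold.py defines __all__:

from ultralytics.nn.Addmodules import (Low_FAM, Low_IFM, Split, SimConv, Low_LAF,
                                       Inject, RepBlock, High_FAM, High_IFM, High_LAF)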

2.3 Step 3

In def parse_model(d, ch, verbose=True):, add the following code at the position shown in the figure.
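
The figure (and the code it marked) is missing from this copy, so the exact snippet cannot be reproduced. Judging from modification 4 below, whose Split branch executes goldyolo = True, the addition here must at least initialize that flag before the layer loop; a hedged guess:

    goldyolo = False  # assumed initialization; flipped to True when a Split layer is parsed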

2.4 Modification 4

Add the following code at the corresponding position shown in the figure; after pasting, check that the indentation is correct.

        # --------------GOLD-YOLO--------------
        elif m in (Low_FAM, High_FAM, High_LAF):
            c2 = sum(ch[x] for x in f)
        elif m is Low_IFM:
            c1, c2 = ch[f], args[2]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            args = [c1, *args[:-1], c2]
        elif m is Low_LAF:
            c1, c2 = ch[f[1]], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            args = [c1, c2, *args[1:]]
        elif m is Inject:
            global_index = args[1]
            c1, c2 = ch[f[1]][global_index], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            args = [c1, c2, global_index]
        elif m is RepBlock:
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)
            nums_repeat = max(round(args[1] * depth), 1) if args[1] > 1 else args[1]  # depth gain
            args = [c1, c2, nums_repeat]
        elif m is Split:
            goldyolo = True
            c2 = []
            for arg in args:
                if arg != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                    c2.append(make_divisible(min(arg, max_channels) * width, 8))
            args = [c2]
        # --------------GOLD-YOLO--------------

2.5 Modification 5

Add the following code at the corresponding position shown in the figure; again, check the indentation after pasting.

        if m in [Inject, High_LAF]:
            # input nums
            m_.input_nums = len(f)
        else:
            m_.input_nums = 1

2.6 Modification 6

The modifications above follow the order of the file; this one does not. Go back to the beginning of 'ultralytics/nn/tasks.py', find the BaseModel class, and make the change inside it. Inject and High_LAF take multiple inputs, so the model's layer-execution call must unpack them; that is what the input_nums attribute set in modification 5 is for.

Replace the code shown in the figure.
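
The figure is not reproduced here; the target is the line that runs each layer in BaseModel (inside _predict_once in recent ultralytics versions), which in stock form reads:

            x = m(x)  # run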

Replace it with the following code:

            try:
                if m.input_nums > 1:
                    # input nums more than one
                    x = m(*x)  # run
                else:
                    x = m(x)
            except AttributeError:
                # AttributeError: 'Conv' object has no attribute 'input_nums'
                x = m(x)

The result is shown in the figure.

At this point the modules are registered; copy the yaml file below and it can be run directly.

YAML file


# Ultralytics YOLO 
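
With the yaml in place, training uses the standard ultralytics API. The config filename below is a placeholder; point it at wherever you saved the yaml above:

from ultralytics import YOLO

# 'yolov8-goldyolo.yaml' is a hypothetical name for the config shown above
model = YOLO('yolov8-goldyolo.yaml')
model.train(data='coco128.yaml', epochs=100, imgsz=640)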

From: https://blog.csdn.net/weixin_43986124/article/details/141276021
