YOLOv8 accuracy-improvement series: modifying the main branch of the C2f module

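Contents

Introduction to the C2f module
- Definition and basic principle
- Application scenarios
Steps for modifying the C2f module
- (1) Writing the C2f_up module
- (2) Declaring it in __init__.py and block.py
- (3) Declaring it in tasks.py
- Using the C2f_up module in YOLOv8
  - yolov8.yaml
  - yolov8.yaml with the C2f_up module
Benefits of the C2f modification for YOLOv8 detection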

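Introduction to the C2f module

Definition and basic principle

In YOLOv8, C2f is the block whose docstring in block.py reads "Faster Implementation of CSP Bottleneck with 2 convolutions": a 1x1 convolution splits the features into two halves, a chain of Bottleneck blocks processes one half, and all intermediate outputs are concatenated and fused by a second 1x1 convolution. The name is sometimes expanded as C2F (Coarse-to-Fine), which refers to a more general idea in computer vision: processing information from coarse to fine levels. In image segmentation, for example, a coarse-to-fine module first divides the whole image into rough regions (coarse-grained processing) and then progressively refines the boundaries and the details inside each region (fine-grained processing).
Structurally, a coarse-to-fine design contains processing units at several levels. Early units work on downsampled, low-resolution features to obtain global features with a large receptive field. Later in the network, operations such as upsampling combine these global features with high-resolution local features, gradually restoring detail and integrating information from coarse to fine.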

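Application scenarios

Image segmentation: in medical image segmentation, for example tissue segmentation of brain MRI, a coarse-to-fine module can first use the low-resolution image to separate the main regions of the brain (white matter, grey matter, cerebrospinal fluid) and then delineate the tissue boundaries at the fine stage. In semantic segmentation of natural scenes, it helps separate large categories such as sky, buildings, and roads while still tracing object edges and details, such as the outline of roadside trees.
Object detection: a coarse-to-fine module can first locate the approximate region that contains a target (coarse localization) and then refine the bounding box and the class (fine localization and classification). In pedestrian detection, for example, it first finds the scene regions likely to contain pedestrians and then uses finer features such as pose and clothing details to identify each individual more accurately.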

Steps for modifying the C2f module

(1) Writing the C2f_up module

The C2f module is defined in ultralytics/nn/modules/block.py. The original implementation is:

class C2f(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))
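
The channel bookkeeping in the class above can be checked by hand. The following is a minimal sketch in plain Python (no PyTorch needed) that only mirrors the arithmetic in `__init__`; the helper name `c2f_channels` is hypothetical and is not part of Ultralytics.

```python
def c2f_channels(c1, c2, n=1, e=0.5):
    """Return the channel widths used inside C2f for the given arguments."""
    c = int(c2 * e)  # hidden channels, same formula as self.c
    return {
        "cv1_in": c1,
        "cv1_out": 2 * c,        # later split into two chunks of c channels
        "bottleneck": c,         # every Bottleneck maps c -> c channels
        "cv2_in": (2 + n) * c,   # two chunks plus one output per Bottleneck
        "cv2_out": c2,
    }


info = c2f_channels(c1=128, c2=128, n=3)
print(info)
```

With `n=3` Bottlenecks the concatenated tensor fed to `cv2` has five chunks of `c` channels, which is why `cv2` is built with `(2 + n) * self.c` input channels.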

Modified code:

class C2f_up(nn.Module):
    """C2f variant that adds a SiLU activation after each Bottleneck."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        # Assumed intent: a SiLU after every Bottleneck. self.m stays an nn.ModuleList so that
        # forward() can iterate over it; an nn.SiLU wrapped around the whole list is not iterable.
        self.m = nn.ModuleList(
            nn.Sequential(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0), nn.SiLU())
            for _ in range(n)
        )

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))
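
The modified class relies on `nn.SiLU`. As a reminder of what that activation computes, here is a small plain-Python sketch of the formula SiLU(x) = x * sigmoid(x); the function `silu` below is an illustration written for this article, not the PyTorch implementation.

```python
import math


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


# Negative inputs are damped towards zero, large positive inputs pass almost unchanged.
for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"silu({v:+.1f}) = {silu(v):+.4f}")
```

SiLU has no learnable weights, so adding it changes the non-linearity of the branch but not the parameter count.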

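(2) Declaring it in __init__.py and block.py

Add C2f_up to the __all__ tuple in ultralytics/nn/modules/block.py, and to both the import list from .block and the __all__ tuple in ultralytics/nn/modules/__init__.py.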

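(3) Declaring it in tasks.py

In ultralytics/nn/tasks.py, import C2f_up from ultralytics.nn.modules and add it wherever C2f is listed in parse_model, so that its channel and repeat arguments are handled in the same way as those of C2f.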
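
Why the declaration matters: the YAML file names each module as a string, and the model builder has to turn that string into a class. The sketch below imitates only this name lookup with a plain dictionary; the classes are empty stand-ins and `resolve` is a hypothetical helper, so this is not the real `parse_model` logic.

```python
class C2f:       # stand-in for the real class in block.py
    pass


class C2f_up:    # stand-in for the newly added class
    pass


def resolve(name, namespace):
    """Turn a module name taken from the YAML into a class, or fail loudly."""
    try:
        return namespace[name]
    except KeyError:
        raise KeyError(f"module '{name}' is not declared or imported") from None


before = {"C2f": C2f}                    # C2f_up not yet declared
after = {"C2f": C2f, "C2f_up": C2f_up}   # C2f_up declared and imported

print(resolve("C2f_up", after).__name__)
try:
    resolve("C2f_up", before)
except KeyError as err:
    print("lookup failed:", err)
```

If the new class is missing from the namespace that the builder searches, a YAML file that mentions `C2f_up` cannot be parsed, which is the failure that steps (2) and (3) prevent.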

Using the C2f_up module in YOLOv8

yolov8.yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)
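
The `scales` entries turn the nominal repeats and channel counts in this file into the actual sizes of each model variant. The sketch below assumes the usual Ultralytics rule (repeats multiplied by depth and rounded, channels capped at max_channels, multiplied by width, and rounded up to a multiple of 8); the helper names are hypothetical and the rule should be checked against your installed version.

```python
import math

# [depth, width, max_channels], copied from the scales section of yolov8.yaml
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_layer(repeats, channels, scale):
    """Return (actual repeats, actual output channels) for one YAML row."""
    depth, width, max_channels = SCALES[scale]
    n = max(round(repeats * depth), 1) if repeats > 1 else repeats
    c2 = make_divisible(min(channels, max_channels) * width, 8)
    return n, c2


for scale in SCALES:
    print(scale, scale_layer(3, 1024, scale))
```

Under these assumptions the row `[-1, 3, C2f, [1024, True]]` becomes one Bottleneck with 256 output channels in YOLOv8n, and three Bottlenecks with 640 output channels in YOLOv8x.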

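yolov8.yaml with the C2f_up module

In the head, every C2f becomes C2f_up and the two downsampling Conv layers become GhostConv. The backbone is unchanged.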

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f_up, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f_up, [256]]  # 15 (P3/8-small)

  - [-1, 1, GhostConv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f_up, [512]]  # 18 (P4/16-medium)

  - [-1, 1, GhostConv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f_up, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)
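
To see exactly which layers differ between the two configuration files, the head module names can be compared position by position. The lists below are transcribed by hand from the two YAML files above; the variable names are illustrative only.

```python
FIRST_HEAD_INDEX = 10  # the backbone occupies layers 0-9

ORIGINAL_HEAD = [
    "nn.Upsample", "Concat", "C2f",
    "nn.Upsample", "Concat", "C2f",
    "Conv", "Concat", "C2f",
    "Conv", "Concat", "C2f",
    "Detect",
]
MODIFIED_HEAD = [
    "nn.Upsample", "Concat", "C2f_up",
    "nn.Upsample", "Concat", "C2f_up",
    "GhostConv", "Concat", "C2f_up",
    "GhostConv", "Concat", "C2f_up",
    "Detect",
]

# Collect (layer index, old module, new module) for every row that changed.
changes = [
    (FIRST_HEAD_INDEX + i, old, new)
    for i, (old, new) in enumerate(zip(ORIGINAL_HEAD, MODIFIED_HEAD))
    if old != new
]
for index, old, new in changes:
    print(f"layer {index}: {old} -> {new}")
```

Layers 15, 18, and 21 are among the changed rows, and they are the ones that feed Detect, so all three detection scales pass through a modified block.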

[Figures: before and after the change; images not available]

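Benefits of the C2f modification for YOLOv8 detection

More efficient feature fusion:
- Stronger multi-scale feature fusion: the C2f blocks in the head fuse feature maps from different levels, giving the model features that are both high-resolution and semantically rich. This helps detection across object scales, small or large. In a complex traffic scene, for example, this applies to both distant pedestrians and large nearby vehicles.
- Richer gradient flow: C2f concatenates the output of every Bottleneck, so gradients reach the block through several branches. A modified block that keeps or extends these branches can help the model learn image features and may speed up convergence.
Model performance:
- Higher detection accuracy: with more effective feature fusion and richer feature representations, the modified YOLOv8 can localize and classify targets more accurately. With cluttered backgrounds or occlusion, better target features can reduce false positives and missed detections.
- Better robustness: the modified block can make the model more tolerant of variation in the input, such as different lighting conditions, viewpoints, and image quality, so that detection performance degrades less.
Computational efficiency:
- Fewer parameters: a modified block can be structured to use fewer parameters, lowering storage and compute cost. In this configuration the SiLU activations add no parameters, and the GhostConv layers in the head use fewer weights than the Conv layers they replace. This matters when deploying on resource-limited hardware such as embedded or mobile devices.
- Faster inference: if accuracy is maintained, less computation means faster inference and better real-time behaviour, which matters for applications that process many images quickly, such as video surveillance and autonomous driving.
Easy to integrate and extend:
- More flexible module: the modified block can be combined with other modules or techniques, such as attention mechanisms or feature-enhancement modules, to improve the model further.
- Extensible: it leaves researchers and developers room to adapt the model to the requirements of a specific application and detection task.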
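
The parameter-reduction point can be made concrete for the two head layers that switch from Conv to GhostConv. The sketch below counts learnable parameters under stated assumptions: Conv is taken to be a bias-free convolution followed by BatchNorm (scale and shift per channel), and GhostConv is taken to be a primary convolution producing half of the output channels plus a 5x5 depthwise convolution on that half. The function names are hypothetical; verify the structure against your Ultralytics version.

```python
def conv_params(c1, c2, k, groups=1):
    """Bias-free k x k convolution weights plus BatchNorm scale and shift."""
    return (c1 // groups) * c2 * k * k + 2 * c2


def ghostconv_params(c1, c2, k):
    """Primary conv to c2 // 2 channels, then a 5x5 depthwise conv on them."""
    c_ = c2 // 2
    primary = conv_params(c1, c_, k)
    cheap = conv_params(c_, c_, 5, groups=c_)
    return primary + cheap


# Layer 16 at scale 'n': the nominal 256 channels become 64 in and 64 out.
plain = conv_params(64, 64, 3)
ghost = ghostconv_params(64, 64, 3)
print(plain, ghost, f"{ghost / plain:.2f}")
```

Under these assumptions the GhostConv version of that layer needs roughly half the parameters of the plain convolution. Whether this translates into faster inference on a given device has to be measured.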