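Preface
As the current state-of-the-art deep learning object detection algorithm, YOLOv8 already incorporates a large number of tricks, but there is still room for improvement: for the detection difficulties of a specific application scenario, different improvement methods can be applied. The articles in this series will explain in detail how to improve YOLOv8, with the aim of offering some modest help and reference to students doing research who need novel ideas and to those working on engineering projects who need better results. Now that YOLOv8 has been released, and a large number of papers improving YOLOv7 and YOLOv5 have appeared since 2020, such work no longer offers much research value or novelty for either researchers or practitioners. To keep up with the times, future improvements will be based on YOLOv7. The earlier YOLOv5 improvement methods apply equally to YOLOv7, so the numbering of the YOLOv5 improvement series continues here. The improvement methods can also be applied to YOLOv5 and other algorithms. I hope this is helpful.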
Link: https://pan.baidu.com/s/1e83xPdxwmSJ0Nohc_F9nFA
Extraction code: available by private message after following
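I. Problem Addressed
This post tries replacing the SPPF module in the original YOLOv5 with ASPP (Atrous Spatial Pyramid Pooling) to improve detection accuracy.
II. Basic Principle
Note: the figure is from the DeepLabV3 paper, Rethinking Atrous Convolution for Semantic Image Segmentation.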
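To make the principle concrete, here is a minimal PyTorch sketch of an ASPP block in the style of DeepLabV3: parallel branches (a 1x1 convolution, several 3x3 atrous convolutions with different dilation rates, and an image-level pooling branch) are concatenated and then fused by a 1x1 convolution. This is not the code used in this series. The class layout, the `(c1, c2)` argument order, the hidden width `c2 // 2`, the SiLU activation, and the choice to drop BatchNorm in the pooling branch are all assumptions made for illustration; the rates `(6, 12, 18)` are the ones DeepLabV3 uses at output stride 16.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_act(c1, c2, k, dilation=1):
    """k x k convolution that keeps the spatial size, followed by BatchNorm and SiLU."""
    padding = dilation * (k - 1) // 2  # "same" padding for odd k and stride 1
    return nn.Sequential(
        nn.Conv2d(c1, c2, k, stride=1, padding=padding, dilation=dilation, bias=False),
        nn.BatchNorm2d(c2),
        nn.SiLU(),
    )


class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling, DeepLabV3 style (illustrative sketch)."""

    def __init__(self, c1, c2, rates=(6, 12, 18)):
        super().__init__()
        c_ = c2 // 2  # hidden width per branch; an arbitrary choice for this sketch
        # Branch 0 is a plain 1x1 conv; the rest are 3x3 atrous convs, one per rate.
        self.branches = nn.ModuleList(
            [conv_bn_act(c1, c_, 1)] + [conv_bn_act(c1, c_, 3, dilation=r) for r in rates]
        )
        # Image-level branch: global average pooling, then a 1x1 conv.
        # BatchNorm is left out here because the pooled map is 1x1, which
        # BatchNorm cannot normalize in training mode with batch size 1.
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c1, c_, 1),
            nn.SiLU(),
        )
        # 1x1 projection that fuses all branches back to c2 channels.
        self.project = conv_bn_act(c_ * (len(rates) + 2), c2, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = self.image_pool(x)
        # Broadcast the 1x1 image-level feature back to the input resolution.
        feats.append(F.interpolate(pooled, size=(h, w), mode="nearest"))
        return self.project(torch.cat(feats, dim=1))


# Quick shape check on a small random feature map.
x = torch.randn(1, 64, 20, 20)
y = ASPP(64, 64)(x)
print(y.shape)  # torch.Size([1, 64, 20, 20])
```

Every branch preserves the spatial size, so the block can stand in for a pooling module without changing the feature map resolution. To use a class like this from a config row, the YOLOv5 and YOLOv7 model builders would also need to know about it; in those repositories that usually means defining it in models/common.py and adding it to the module list handled by parse_model in models/yolo.py so that the input channel count is passed as `c1`. Treat that wiring as an assumption and check it against your copy of the code.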
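III. How to Add It
(1) YOLOv5 network model changes
The network structure after the change is shown below (based on YOLOv5s):
(2) YOLOv7 network model changes
The network structure after the change is shown below. It is based on YOLOv7, with the first layer of the head changed to [-1, 1, ASPP, [1024]]. The resulting configuration is: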
# parameters
nc: 1 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [12,16, 19,36, 40,28] # P3/8
- [36,75, 76,55, 72,146] # P4/16
- [142,110, 192,243, 459,401] # P5/32
# yolov7 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [32, 3, 1]], # 0
[-1, 1, Conv, [64, 3, 2]], # 1-P1/2
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [128, 3, 2]], # 3-P2/4
[-1, 1, Conv, [64, 1, 1]],
[-2, 1, Conv, [64, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 11
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 16-P3/8
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 24
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 29-P4/16
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]], # 37
[-1, 1, MP, []],
[-1, 1, Conv, [512, 1, 1]],
[-3, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [512, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 42-P5/32
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]], # 50
]
# yolov7 head
head:
[[-1, 1, ASPP, [1024]], # 51
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[37, 1, Conv, [256, 1, 1]], # route backbone P4
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 63
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[24, 1, Conv, [128, 1, 1]], # route backbone P3
[[-1, -2], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [64, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [128, 1, 1]], # 75
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3, 63], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]], # 88
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3, 51], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]],
[-2, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]], # 101
[75, 1, RepConv, [256, 3, 1]],
[88, 1, RepConv, [512, 3, 1]],
[101, 1, RepConv, [1024, 3, 1]],
[[102,103,104], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]
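Each row in the config has the form [from, number, module, args], and the from field is easy to misread once a layer is replaced or inserted. The small helper below is my own illustration (the name `resolve_from` is hypothetical and not part of YOLOv7). It converts a from entry into absolute layer indices, assuming that negative values are relative to the current layer and non-negative values are absolute, which is how I understand the YOLOv5 and YOLOv7 model parsers to treat them. The rows are copied from the head above, keyed by their layer index.

```python
def resolve_from(layer_index, from_spec):
    """Return the absolute layer indices that feed layer `layer_index`.

    Negative entries are relative to the current layer (-1 is the previous
    layer); non-negative entries are absolute layer indices.
    """
    sources = from_spec if isinstance(from_spec, list) else [from_spec]
    return [layer_index + f if f < 0 else f for f in sources]


# A few rows copied from the head config, keyed by layer index.
head_rows = {
    51: [-1, 1, "ASPP", [1024]],
    52: [-1, 1, "Conv", [256, 1, 1]],
    54: [37, 1, "Conv", [256, 1, 1]],
    93: [[-1, -3, 51], 1, "Concat", [1]],
}

for index, row in head_rows.items():
    print(index, row[2], "<-", resolve_from(index, row[0]))

# Which of these layers read the ASPP output (layer 51)?
consumers = [i for i, row in head_rows.items() if 51 in resolve_from(i, row[0])]
print(consumers)  # [52, 93]
```

The ASPP output at layer 51 feeds both the next Conv (layer 52) and the later Concat at layer 93, so if rows are added or removed before it, the absolute index 51 in that Concat has to be updated to keep pointing at ASPP.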
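IV. Summary
Preview: the next post will continue to share improvement methods for deep learning algorithms. If you are interested, feel free to follow me; if you have questions, leave a comment or send me a private message.
PS: This method is not limited to improving YOLOv5. It can also be applied to other YOLO networks and other object detection networks, such as YOLOv7, v6, v4, v3, Faster R-CNN, and SSD.
Finally, if you need anything, follow me and send a private message. Followers can get deep learning study materials for free!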
YOLO Series Algorithm Improvement Methods | Table of Contents
[💡🎈☁️1. Add the SE attention mechanism](https://blog.csdn.net/m0_70388905/article/details/125379649)
[💡🎈☁️2. Add the CBAM attention mechanism](https://blog.csdn.net/m0_70388905/article/details/125892144)
[💡🎈☁️3. Add the CoordAtt attention mechanism](https://blog.csdn.net/m0_70388905/article/details/125379685)
[💡🎈☁️4. Add the ECA channel attention mechanism](https://blog.csdn.net/m0_70388905/article/details/125390766)
[💡🎈☁️5. Replace the PANet feature fusion network with BiFPN](https://blog.csdn.net/m0_70388905/article/details/125391096)
[💡🎈☁️6. Add a small-object detection layer](https://blog.csdn.net/m0_70388905/article/details/125392908)
[💡🎈☁️7. Loss function improvements](https://blog.csdn.net/m0_70388905/article/details/125419887)
[💡🎈☁️8. Replace non-maximum suppression (NMS) with Soft-NMS](https://blog.csdn.net/m0_70388905/article/details/125448230)
[💡🎈☁️9. Replace K-Means anchor clustering with K-Means++](https://blog.csdn.net/m0_70388905/article/details/125530323)
[💡🎈☁️10. Change the loss function to SIoU](https://blog.csdn.net/m0_70388905/article/details/125569509)
[💡🎈☁️11. Replace the C3 backbone with the lightweight MobileNetV3](https://blog.csdn.net/m0_70388905/article/details/125593267)
[💡🎈☁️12. Replace the C3 backbone with the lightweight ShuffleNetV2](https://blog.csdn.net/m0_70388905/article/details/125612052)
[💡🎈☁️13. Replace the C3 backbone with the lightweight EfficientNetV2](https://blog.csdn.net/m0_70388905/article/details/125612096)
[💡🎈☁️14. Replace the C3 backbone with the lightweight GhostNet](https://blog.csdn.net/m0_70388905/article/details/125612392)
[💡🎈☁️15. Lightweight the network with depthwise separable convolution](https://blog.csdn.net/m0_70388905/article/details/125612300)
[💡🎈☁️16. Replace the C3 backbone with the lightweight PP-LCNet](https://blog.csdn.net/m0_70388905/article/details/125651427)
[💡🎈☁️17. CNN + Transformer: integrate Bottleneck Transformers](https://blog.csdn.net/m0_70388905/article/details/125691455)
[💡🎈☁️18. Change the loss function to Alpha-IoU](https://blog.csdn.net/m0_70388905/article/details/125704413)
[💡🎈☁️19. Replace non-maximum suppression (NMS) with DIoU-NMS](https://blog.csdn.net/m0_70388905/article/details/125754133)
[💡🎈☁️20. Introduce the new Involution operator into the network](https://blog.csdn.net/m0_70388905/article/details/125816412)
[💡🎈☁️21. CNN + Transformer: replace the backbone with the fast, strong, lightweight EfficientFormer](https://blog.csdn.net/m0_70388905/article/details/125840816)
[💡🎈☁️22. Accuracy booster: introduce recursive gated convolution (gnConv)](https://blog.csdn.net/m0_70388905/article/details/126142505)
[💡🎈☁️23. Introduce the parameter-free SimAM attention](https://blog.csdn.net/m0_70388905/article/details/126456722)
[💡🎈☁️24. Introduce the quantum-inspired vision backbone WaveMLP (worth trying for an SCI paper)](https://blog.csdn.net/m0_70388905/article/details/126550613)
[💡🎈☁️25. Introduce Swin Transformer](https://blog.csdn.net/m0_70388905/article/details/126674046)
[💡🎈☁️26. Replace the PANet feature fusion network with the ASFF adaptive feature fusion network](https://blog.csdn.net/m0_70388905/article/details/126926244)
[💡🎈☁️27. Addressing small objects: replace regular convolutions in the feature extraction network with calibrated convolutions](https://blog.csdn.net/m0_70388905/article/details/126979207)
[💡🎈☁️28. ICLR 2022 accuracy booster: the plug-and-play dynamic convolution ODConv](https://blog.csdn.net/m0_70388905/article/details/127031843)
[💡🎈☁️29. Introduce Swin Transformer v2.0](https://blog.csdn.net/m0_70388905/article/details/127214397)
[💡🎈☁️30. Introduce MOAT, the latest Transformer vision model, published on October 4](https://blog.csdn.net/m0_70388905/article/details/127273808)
[💡🎈☁️31. CrissCrossAttention attention mechanism](https://blog.csdn.net/m0_70388905/article/details/127312771)
[💡🎈☁️32. Introduce the SKAttention attention mechanism](https://blog.csdn.net/m0_70388905/article/details/127330663)
[💡🎈☁️33. Introduce the GAMAttention attention mechanism](https://blog.csdn.net/m0_70388905/article/details/127330819)
[💡🎈☁️34. Change the activation function to FReLU](https://blog.csdn.net/m0_70388905/article/details/127381053)
[💡🎈☁️35. Introduce the S2-MLPv2 attention mechanism](https://blog.csdn.net/m0_70388905/article/details/127434190)
[💡🎈☁️36. Integrate the NAM attention mechanism](https://blog.csdn.net/m0_70388905/article/details/127398898)
[💡🎈☁️37. Combine with the CVPR 2022 ConvNeXt network](https://blog.csdn.net/m0_70388905/article/details/127533379)
[💡🎈☁️38. Introduce the RepVGG model structure](https://blog.csdn.net/m0_70388905/article/details/127532645)
[💡🎈☁️39. Introduce the Tri-Layer plugin for improved occlusion detection | BMVC 2022](https://blog.csdn.net/m0_70388905/article/details/127471913)
[💡🎈☁️40. Introduce the lightweight MobileOne backbone](https://blog.csdn.net/m0_70388905/article/details/127558329)
[💡🎈☁️41. Introduce SPD-Conv to handle low-resolution images and small objects](https://zhuanlan.zhihu.com/p/579212232)
[💡🎈☁️42. Introduce the ELAN network from v7](https://zhuanlan.zhihu.com/p/579533276)
[💡🎈☁️43. Combine with the latest Non-local Networks and Attention structure](https://zhuanlan.zhihu.com/p/579903718)
[💡🎈☁️44. Integrate the GPU-friendly lightweight G-GhostNet](https://blog.csdn.net/m0_70388905/article/details/127932181)
[💡🎈☁️45. First release: the latest feature fusion technique RepGFPN (DAMO-YOLO)](https://blog.csdn.net/m0_70388905/article/details/128157269)
[💡🎈☁️46. Change the activation function to ACON](https://blog.csdn.net/m0_70388905/article/details/128159516)
[💡🎈☁️47. Change the activation function to GELU](https://blog.csdn.net/m0_70388905/article/details/128170907)
[💡🎈☁️48. Build a new lightweight network: Slim-neck by GSConv (2022CVPR)](https://blog.csdn.net/m0_70388905/article/details/128198484)
[💡🎈☁️49. Model pruning, distillation, and compression](https://blog.csdn.net/m0_70388905/article/details/128222629)
[💡🎈☁️50. Beyond ConvNeXt! Conv2Former: a Transformer-style ConvNet for visual recognition](https://blog.csdn.net/m0_70388905/article/details/128266070?csdn_share_tail=%7B%22type%22:%22blog%22,%22rType%22:%22article%22,%22rId%22:%22128266070%22,%22source%22:%22m0_70388905%22%7D)
[💡🎈☁️51. Integrate the multi-branch dilated convolution structure RFB-Bottleneck to improve PANet into a new feature fusion network](https://blog.csdn.net/m0_70388905/article/details/128553832)
[💡🎈☁️52. Integrate the C2f module from YOLOv8 into YOLOv5](https://blog.csdn.net/m0_70388905/article/details/128661165)
[💡🎈☁️53. Integrate the ECVBlock module from CFPNet to improve small-object detection](https://blog.csdn.net/m0_70388905/article/details/128720459)