Real-Time Object Detection Algorithm Based on Attention Mechanism and Multi-Spatial Pyramid Pooling
Affiliation:

Shanxi University

Funding:

National Natural Science Foundation of China (11804209); Natural Science Foundation of Shanxi Province (201901D111031, 201901D211173); Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2019L0064, 2020L0051)

    Abstract:

    YOLOv4 has three shortcomings: its computational complexity is high; its spatial pyramid pooling module is applied only once, so it enhances the representational power of only the deep feature maps in the feature fusion network; and the feature maps of its detection head network do not readily highlight important channel features. To address these problems, a real-time object detection algorithm based on an attention mechanism and multi-spatial pyramid pooling is proposed. The algorithm uses multi-spatial pyramid pooling to extract local and global features and to fuse multiple receptive fields, which strengthens the representational power of the shallow, middle, and deep feature maps of the feature fusion network. A squeeze-and-excitation channel attention mechanism is introduced to model the interdependencies between channels and to adaptively recalibrate the weight of each channel, so that the network pays more attention to important features. Depthwise separable convolution is used in the feature fusion and detection head networks to reduce the number of parameters. Experimental results show that the mean average precision of the proposed algorithm is higher than that of the seven other mainstream algorithms used for comparison. Compared with YOLOv4, the number of parameters and the model size are reduced by 27.85 M and 106.25 MB, respectively, so the proposed algorithm improves detection accuracy while reducing complexity. Its detection speed reaches 33.70 frames per second (FPS), which meets real-time requirements.
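
The parameter saving that the abstract attributes to depthwise separable convolution can be illustrated with a small weight count. The sketch below is not the authors' code: the function names and the example layer (a 3 x 3 kernel mapping 256 to 512 channels) are assumptions chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv plus a pointwise 1 x 1 conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer, not taken from the paper: 3 x 3 kernel, 256 -> 512 channels.
    std = standard_conv_params(256, 512, 3)
    sep = depthwise_separable_params(256, 512, 3)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

For a layer of this assumed size, the separable form needs roughly 1/c_out + 1/k^2 of the weights of the standard form, which is the general reason such a substitution shrinks a network. The figures reported in the abstract (27.85 M parameters, 106.25 MB) come from the full model and cannot be reproduced from this sketch.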

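Cite this article

Wang Guogang (王国刚), Li Zexin (李泽欣), Dong Zhihao (董志豪). Real-time object detection algorithm based on attention mechanism and multi-spatial pyramid pooling[J]. Computer Measurement & Control (计算机测量与控制), 2024, 32(2): 56-64.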

History
  • Received: 2023-03-08
  • Last revised: 2023-04-20
  • Accepted: 2023-04-21
  • Published online: 2024-03-20