FPGA-Based High-Energy-Efficiency Cardboard Defect Detection System
Affiliation:

College of Electrical Engineering and Automation, Fuzhou University

CLC number:

TP391

Funding:

National Natural Science Foundation of China (61871133)


High-Power-Efficient YOLO Hardware Acceleration System Based on FPGA

    Abstract:

    Defective cardboard on industrial production lines is currently removed mainly by manual inspection, which is inefficient, so energy-efficient and accurate automatic detection of cardboard surface defects is of practical value. Building on the strong object-detection performance of the YOLO family of networks and the energy efficiency of FPGA-deployed network models, this work proposes an FPGA-based, energy-efficient cardboard defect detection system. A YOLOv7-Tiny network is trained on a cardboard defect dataset, then retrained and quantized with quantization-aware training (QAT); weights and feature maps are quantized to 8 bits at a cost of only 0.36% in detection accuracy, which reduces hardware resource consumption. The hardware accelerator uses a reusable, multi-node configurable architecture in which multiple configuration nodes accelerate inference for different network layers. Each network layer is optimized at the hardware level, and the design is pipelined both within and across layers. The complete system is built through hardware-software co-design: tasks are partitioned between hardware and software so that the hardware accelerator and the soft-core processor operate with a high degree of parallelism. On a Xilinx VC707 FPGA evaluation board running at 200 MHz, the system reaches a throughput of 177.96 GOPS while consuming only 6.5 W, for an energy efficiency of 27.38 GOPS/W, which is 19.7 times that of an I5-10400F CPU and 8.6 times that of a GTX 2070S GPU. The design balances detection speed and power consumption and meets the requirements of industrial cardboard production.
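The energy-efficiency figure in the abstract follows directly from the reported throughput and power. The short Python sketch below reproduces that arithmetic and derives the CPU and GPU efficiencies implied by the stated ratios. Only the four constants come from the abstract; the function names are hypothetical, and the baseline efficiencies are back-calculated here, not values reported by the authors.

```python
# Figures quoted in the abstract.
THROUGHPUT_GOPS = 177.96  # throughput at 200 MHz on the VC707 board
POWER_W = 6.5             # reported power consumption
CPU_RATIO = 19.7          # FPGA efficiency relative to the I5-10400F CPU
GPU_RATIO = 8.6           # FPGA efficiency relative to the GTX 2070S GPU


def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return energy efficiency in GOPS/W."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return throughput_gops / power_w


def implied_baseline(fpga_efficiency: float, ratio: float) -> float:
    """Back-calculate a baseline's GOPS/W from the FPGA-to-baseline ratio.

    This is an inference from the quoted ratio, not a measured value.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return fpga_efficiency / ratio


fpga_eff = energy_efficiency(THROUGHPUT_GOPS, POWER_W)
cpu_eff = implied_baseline(fpga_eff, CPU_RATIO)
gpu_eff = implied_baseline(fpga_eff, GPU_RATIO)

if __name__ == "__main__":
    print(f"FPGA efficiency: {fpga_eff:.2f} GOPS/W")
    print(f"Implied CPU efficiency: {cpu_eff:.2f} GOPS/W")
    print(f"Implied GPU efficiency: {gpu_eff:.2f} GOPS/W")
```

The FPGA value rounds to the 27.38 GOPS/W quoted above, so the three headline numbers are mutually consistent.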

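Cite this article

Chen Junjie (陈俊杰), Chen Zheyu (陈哲宇), Zheng Zibin (郑子滨), Li Sheng (李胜). FPGA-based high-energy-efficiency cardboard defect detection system[J]. Computer Measurement & Control, 2025, 33(1): 45-52.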

History
  • Received: 2023-11-15
  • Last revised: 2023-12-19
  • Accepted: 2024-01-02
  • Published online: 2025-02-07