Abstract: The wheelset treads of metro vehicles are prone to surface defects such as scratches, spalling, and corrosion. Tread images are further characterized by considerable feature redundancy and strong background interference, and multi-scale defects are difficult to detect. To address these problems, an improved YOLO11n-based method is proposed for wheelset tread defect detection in metro vehicles. A Spatial and Channel Reconstruction Convolution (ScConv) module is introduced into the later stage of the backbone network to suppress spatial and channel redundancy. A Global Channel-Spatial Attention (GCSA) module is embedded in the feature fusion layer to enhance multi-scale feature representation. The LWIoUv3 loss function is adopted in the regression stage to improve localization, which strengthens detection in complex scenarios. Experiments were conducted on 1,438 wheelset tread images collected at the Changsha Metro Line 6 depot. Compared with the baseline YOLO11n, the proposed method improves Precision, F1-score, mAP@0.5, and mAP@0.5–0.95 by 5.99%, 2.95%, 1.65%, and 1.32%, respectively. The results show that the proposed method improves the detection accuracy of metro vehicle wheelset tread defects while maintaining a lightweight model size and good real-time performance, and that it can provide technical support for trackside intelligent inspection systems.
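For readers less familiar with the reported metrics, the following is a minimal sketch of the two standard quantities behind them: the intersection over union (IoU) of two boxes, which decides whether a prediction counts as a match at a given threshold (0.5 for mAP@0.5), and the F1-score as the harmonic mean of precision and recall. The function names and the (x1, y1, x2, y2) box format are assumptions made for illustration; this is not the evaluation code used in the experiments, and it does not implement the LWIoUv3 loss.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under the common COCO-style convention, mAP@0.5–0.95 averages the average precision over IoU thresholds from 0.5 to 0.95 in steps of 0.05, so it rewards tighter localization than mAP@0.5 alone.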