Abstract: To address the problem that traditional unmanned aerial vehicle (UAV) monocular visual positioning relies heavily on prior background information, a monocular visual target localization method for UAVs that requires no prior knowledge of the target's size is proposed. The method uses deep learning-based target detection to lock onto the target and obtain its real-time position within the camera frame. The target's distance is computed by combining the UAV's motion data with the scale changes of the target in the frame, and the estimate is refined by least-squares fitting. A geometric target-positioning relationship and a coordinate transformation model are then established, and the target's position in the camera coordinate system is determined from the proportional relationship between the image measurements of the onboard sensors and the actual spatial dimensions. By incorporating the UAV's own localization, the target's position is transformed into geographic coordinates in the WGS-84 coordinate system. To validate the effectiveness of the proposed method, a software-in-the-loop simulation of UAV monocular visual target positioning is conducted under the ROS framework; the simulation results demonstrate that the method achieves high-precision ranging and localization of targets in environments lacking active high-precision ranging capability.
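The ranging and back-projection steps summarized above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a pinhole camera model in which the target's apparent size s satisfies s = fW/Z for focal length f, true target width W, and range Z, so that as the UAV advances a known distance d toward the target, 1/s is linear in d and the initial range follows from a least-squares line fit; the function and parameter names are hypothetical.

```python
import numpy as np

def estimate_initial_range(displacements, sizes):
    """Least-squares range estimate from apparent-size change.

    Pinhole model assumption: s_i = f*W / (Z0 - d_i), where d_i is the
    distance flown toward the target. Then 1/s_i = a - b*d_i with
    a = Z0/(f*W) and b = 1/(f*W), so Z0 = a/b. Fitting (a, b) over many
    frames averages out detection noise, as in the paper's refinement step.
    """
    d = np.asarray(displacements, dtype=float)
    inv_s = 1.0 / np.asarray(sizes, dtype=float)
    A = np.column_stack([np.ones_like(d), -d])   # model: inv_s = a - b*d
    (a, b), *_ = np.linalg.lstsq(A, inv_s, rcond=None)
    return a / b  # initial range Z0 along the optical axis

def camera_frame_position(z, u, v, fx, fy, cx, cy):
    """Back-project pixel (u, v) at depth z into camera coordinates,
    using the proportionality between image and spatial dimensions."""
    return np.array([z * (u - cx) / fx, z * (v - cy) / fy, z])
```

With the range known, `camera_frame_position` gives the target in the camera frame; converting that point onward to WGS-84 additionally requires the UAV's own pose (attitude and geodetic position), which is omitted here.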