Abstract: To address unstructured environmental interference, large initial pose deviations, and uncontrolled contact forces in the automated assembly of fixed-wing aircraft cabin sections, we experimentally studied and evaluated an automatic docking system based on multi-modal perception. We designed a three-stage closed-loop control strategy comprising vision-guided coarse alignment, laser-ranging adaptive feeding, and force-feedback compliant locking. For visual processing, we introduced an attention-enhanced U-Net model that makes feature extraction more robust under complex lighting. For control, we implemented impedance-based hybrid force/position control to provide active compliance during contact. To assess robustness, we designed a comprehensive stress test that simulated sudden lighting changes, dynamic vibration, and extreme initial deviations. In these experiments the system maintained high-precision feature extraction under complex lighting, achieved complete capture with zero overshoot under extreme deviations, and reduced contact impact forces through compliant control. The results indicate that the proposed multi-modal fusion and hierarchical control strategy is effective against complex operational disturbances and meets the engineering requirements for high-precision automated docking of aircraft cabin sections.
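To make the three-stage strategy and the compliant contact phase more concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. Every name, threshold, and gain in it (`Stage`, `Thresholds`, `next_stage`, `impedance_step`, `simulate_contact`, the 1 mm tolerance, the 20 N target force) is a hypothetical choice made for illustration; the abstract gives no numerical parameters. The contact phase uses a zero-stiffness, admittance-style form of impedance control for force tracking, and the mating surface is assumed to behave as a linear spring.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    COARSE_ALIGN = auto()    # vision-guided coarse alignment
    FEED = auto()            # laser-ranging adaptive feeding
    COMPLIANT_LOCK = auto()  # force-feedback compliant locking
    DONE = auto()


@dataclass
class Thresholds:
    # Hypothetical switching thresholds (SI units), not taken from the paper.
    align_tol_m: float = 1e-3     # pose error allowed before feeding starts
    contact_gap_m: float = 5e-4   # laser-measured gap at which contact is expected
    lock_force_n: float = 20.0    # desired contact force while locking
    force_tol_n: float = 1.0      # band around the desired force


def next_stage(stage, pose_error_m, gap_m, contact_force_n, th):
    """Return the next stage given the latest sensor readings."""
    if stage is Stage.COARSE_ALIGN and pose_error_m <= th.align_tol_m:
        return Stage.FEED
    if stage is Stage.FEED and gap_m <= th.contact_gap_m:
        return Stage.COMPLIANT_LOCK
    if (stage is Stage.COMPLIANT_LOCK
            and abs(contact_force_n - th.lock_force_n) <= th.force_tol_n):
        return Stage.DONE
    return stage


def impedance_step(x, v, f_contact, f_des, dt, m=2.0, b=300.0):
    """One semi-implicit Euler step of the target dynamics
    m*a + b*v = f_des - f_contact (virtual mass m, virtual damping b)."""
    a = (f_des - f_contact - b * v) / m
    v_new = v + dt * a
    x_new = x + dt * v_new
    return x_new, v_new


def simulate_contact(f_des=20.0, k_env=1.0e4, dt=1e-3, steps=2000):
    """Press against a spring-like surface located at x = 0.
    Returns (final_force, peak_force) in newtons."""
    x, v = 0.0, 0.0
    force, peak = 0.0, 0.0
    for _ in range(steps):
        force = k_env * max(0.0, x)  # assumed linear environment stiffness
        peak = max(peak, force)
        x, v = impedance_step(x, v, force, f_des, dt)
    return force, peak


if __name__ == "__main__":
    th = Thresholds()
    stage = next_stage(Stage.COARSE_ALIGN, 5e-4, 0.2, 0.0, th)
    final_force, peak_force = simulate_contact(f_des=th.lock_force_n)
    print(stage, round(final_force, 2), round(peak_force, 2))
```

The damping in this sketch is set slightly above the critical value for the assumed environment stiffness (2·sqrt(m·k_env) ≈ 283 N·s/m), so the simulated contact force approaches the target without oscillating. In a real system the environment stiffness is not known in advance, so the gains would have to be tuned or adapted.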