Abstract: Target search path planning with collision avoidance for a UAV aims to find the target faster and more efficiently by planning a reasonable flight path through an environment containing numerous, complex obstacles. First, this paper examines the law of finite-position Markov mobility from the perspective of probability theory and constructs the corresponding Markov mobility distribution model. Then, building on recent research on search-system trajectory planning and on Markov decision process theory, a negative reward mechanism is introduced into the iteration of the Q-Learning policy algorithm, and a single-UAV target search model is constructed. The impact of obstacle constraints on flight is presented visually using a method similar to a "risk well". Finally, simulation experiments demonstrate the feasibility and effectiveness of the algorithm, which can serve as a reference for the design of route planning algorithms.