Abstract: At present, most quadrotor UAVs rely on classical control methods to design the control law. However, classical designs depend on the careful selection of control parameters and on a mathematical model of the controlled object, and both remain persistent difficulties. To address this problem, a UAV control law design method based on the deep reinforcement learning algorithm Deep Q Network (DQN) is adopted. The quadrotor attitude angles and attitude angular rates serve as the inputs to a three-layer neural network, which outputs the action-value function. An action is then selected according to the greedy strategy. Through continuous interaction with the environment, the agent updates the weights of the neural network according to the reward and punishment information, so that it selects the actions that maximize the cumulative return. The simulation results show that, after reinforcement learning training, the quadrotor attitude angle tracks changes in the reference command quickly and accurately. This demonstrates the feasibility of a quadrotor UAV control law based on reinforcement learning and avoids the dependence of traditional control methods on the selection of control parameters and on the control model.
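The loop described in the abstract (attitude state in, action values out, greedy selection, weight update from the reward) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The layer sizes, the number of discrete actions, the ReLU activation, the learning rate, the discount factor, and all function names are assumptions made for illustration. "Three-layer" is read here as input, hidden, and output layers. Epsilon-greedy selection is also assumed, with `epsilon = 0` reducing to the purely greedy choice. Experience replay and a separate target network, both standard in Deep Q Network, are omitted for brevity.

```python
import numpy as np

# Assumed sizes (not given in the abstract): 3 attitude angles + 3 angular
# rates as state, 16 hidden units, 5 discrete actions.
STATE_DIM, HIDDEN_DIM, N_ACTIONS = 6, 16, 5


def init_params(rng):
    """Randomly initialise a small input-hidden-output network."""
    return {
        "W1": rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN_DIM)),
        "b1": np.zeros(HIDDEN_DIM),
        "W2": rng.normal(0.0, 0.1, (HIDDEN_DIM, N_ACTIONS)),
        "b2": np.zeros(N_ACTIONS),
    }


def q_values(params, state):
    """Forward pass: state vector -> one action value per discrete action."""
    hidden = np.maximum(0.0, state @ params["W1"] + params["b1"])  # ReLU
    return hidden @ params["W2"] + params["b2"]


def select_action(params, state, epsilon, rng):
    """Epsilon-greedy selection; epsilon = 0 gives the purely greedy action."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(params, state)))


def td_target(params, reward, next_state, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * float(np.max(q_values(params, next_state)))


def sgd_step(params, state, action, target, lr=0.01):
    """One gradient step on 0.5 * (Q(s, a) - target)^2; returns the error."""
    pre = state @ params["W1"] + params["b1"]
    hidden = np.maximum(0.0, pre)
    q = hidden @ params["W2"] + params["b2"]
    err = q[action] - target
    # Only the output unit of the action actually taken receives a gradient.
    grad_q = np.zeros(N_ACTIONS)
    grad_q[action] = err
    grad_hidden = (params["W2"] @ grad_q) * (pre > 0)
    params["W2"] -= lr * np.outer(hidden, grad_q)
    params["b2"] -= lr * grad_q
    params["W1"] -= lr * np.outer(state, grad_hidden)
    params["b1"] -= lr * grad_hidden
    return err


# Usage with placeholder data; the reward value below is hypothetical.
rng = np.random.default_rng(0)
params = init_params(rng)
state = rng.normal(size=STATE_DIM)
next_state = rng.normal(size=STATE_DIM)
action = select_action(params, state, epsilon=0.1, rng=rng)
target = td_target(params, reward=-0.5, next_state=next_state, done=False)
sgd_step(params, state, action, target)
```

In a full training loop, `state` and `next_state` would come from the quadrotor simulation, and the reward would penalize the attitude tracking error. The abstract does not specify the reward function, so none is defined here.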