Abstract: In research on autonomous grasping systems for manipulators, the spatial coordinates of the target object must be acquired automatically. To this end, a Kinect depth sensor is used to collect RGB images, and an improved version of the deep learning algorithm Mask RCNN is used to recognize and segment the target in the RGB image. Through the Kinect depth sensor model, the two-dimensional image coordinates are then transformed into three-dimensional space coordinates, and the object is modeled in three dimensions to achieve spatial positioning. A Mask RCNN model trained on a large amount of data can recognize many objects with different features at the same time, so it has wide applicability. Experiments show that the resulting three-dimensional coordinates of the target object are more accurate and less affected by the environment, which is of great significance to research on manipulator grasping systems.
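
The transformation from two-dimensional image coordinates to three-dimensional space coordinates can be illustrated with a minimal sketch. It assumes a standard pinhole camera model, a depth image already registered to the RGB image, and depth values in millimetres; these are assumptions made for illustration, not details confirmed by this abstract. The function names `mask_to_points` and `centroid`, and the intrinsic parameters `fx`, `fy`, `cx`, `cy`, are hypothetical, and the sketch is not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def mask_to_points(
    depth_mm: Sequence[Sequence[float]],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project the masked pixels into camera-frame coordinates (metres).

    Assumes a pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy,
    where (u, v) is the pixel column and row and Z is the measured depth.
    """
    points: List[Point] = []
    for v, (depth_row, mask_row) in enumerate(zip(depth_mm, mask)):
        for u, (d, inside) in enumerate(zip(depth_row, mask_row)):
            # Skip pixels outside the segmentation mask or without a valid
            # depth reading (assumed here to be reported as 0).
            if not inside or d <= 0:
                continue
            z = d / 1000.0  # millimetres to metres (assumed depth unit)
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def centroid(points: Sequence[Point]) -> Optional[Point]:
    """Return the mean of the 3D points, or None if there are none."""
    if not points:
        return None
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )
```

In this sketch, the mask would come from the segmentation step and the centroid serves as one simple estimate of the object's position; a real system would also need the calibrated intrinsics of the specific sensor and a transform from the camera frame to the manipulator frame.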