Abstract: To diagnose and control behaviors that pose potential safety hazards on construction sites, this paper proposes identifying workers' unsafe behaviors through deep learning. First, human skeleton motion information is extracted and used as an additional modality describing changes in human posture and movement, and the pose-based skeleton extraction process is briefly introduced. A CNN-LSTM model is then proposed to improve spatial feature extraction. BN-Inception serves as the spatial feature extractor of the CNN-LSTM behavior recognition model and learns the spatial structure information contained in all video frames. The temporal information across all frames of the complete video is then modeled with a long short-term memory (LSTM) network, and the model's result is the LSTM prediction at the final time step. The results show that the CNN-LSTM model reaches a recognition accuracy of 88.67%, and that the accuracy of single-modality behavior recognition models can be improved.
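The temporal half of the pipeline described above (per-frame features fed to an LSTM, with the prediction read from the final time step) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The class `TinyLSTMClassifier`, the dimensions, and the random weights are all hypothetical, and the per-frame vectors are random stand-ins for the features a BN-Inception extractor would produce. The weights are untrained, so the output probabilities carry no meaning beyond showing the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class TinyLSTMClassifier:
    """Hypothetical single-layer LSTM followed by a softmax classifier."""

    def __init__(self, feat_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: mat(hidden_dim, feat_dim + hidden_dim) for g in "ifog"}
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}
        self.W_out = mat(num_classes, hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        """Advance the LSTM by one frame and return the new (h, c)."""
        z = x + h  # list concatenation: [x_t ; h_{t-1}]
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c

    def predict_proba(self, frame_features):
        """Run over all frames; classify from the final hidden state only."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frame_features:
            h, c = self.step(list(x), h, c)
        logits = matvec(self.W_out, h)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Stand-in for CNN features: 16 frames, 8 values per frame (assumed sizes).
rng = random.Random(1)
video = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]

model = TinyLSTMClassifier(feat_dim=8, hidden_dim=4, num_classes=3)
probs = model.predict_proba(video)
```

Here `probs` holds one probability per assumed behavior class. In a real system, the random frame vectors would be replaced by the CNN's per-frame outputs and the weights would be learned from labeled video.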