Abstract: When the traditional SegNet semantic segmentation network is used to extract buildings from high-resolution remote sensing images, problems such as blurred boundaries, low accuracy, false detections and missed detections arise. To address these problems, an improved SegNet+CRF semantic segmentation method is proposed. The SegNet model is improved by adopting Atrous Spatial Pyramid Pooling (ASPP) in the encoding stage, which extracts feature maps over different receptive fields through multiple atrous (dilated) convolutions with designed dilation rates. In the decoding stage, a Feature Pyramid Network (FPN) is constructed to realize multi-scale feature fusion and reduce the loss of feature detail. Finally, the predicted images are post-processed with a fully connected Conditional Random Field (CRF) model to refine building edges. Experimental results on the test areas, evaluated both quantitatively and visually, show that the improved model achieves higher accuracy than the original SegNet model, improving the average pixel accuracy, recall and mean intersection-over-union by 0.48%, 1.29% and 2.36%, respectively. The improved method acquires buildings with clear, accurate boundaries and can be extended to building recognition in remote sensing images for urban mapping, management and planning.
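To illustrate the ASPP idea mentioned above, the following is a minimal NumPy sketch (not the paper's implementation) of a single-channel atrous convolution: inserting zeros between kernel taps enlarges the receptive field without adding parameters, and applying the same kernel at several dilation rates in parallel mimics the ASPP branches. The function name, kernel, and rates here are illustrative assumptions.

```python
import numpy as np

def atrous_conv2d(image, kernel, rate):
    """Single-channel 2D atrous (dilated) convolution with 'valid' padding.
    A dilation rate r places (r - 1) implicit zeros between kernel taps,
    so a k x k kernel covers an effective (k-1)*r + 1 receptive field."""
    kh, kw = kernel.shape
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1  # effective size
    H, W = image.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input at dilated positions only
            patch = image[i:i + eh:rate, j:j + ew:rate]
            out[i, j] = np.sum(patch * kernel)
    return out

# ASPP-style parallel branches: same 3x3 kernel, different dilation rates,
# yielding receptive fields of 3, 5 and 7 pixels respectively.
image = np.arange(64, dtype=float).reshape(8, 8)
kernel = np.ones((3, 3)) / 9.0
branches = {r: atrous_conv2d(image, kernel, r) for r in (1, 2, 3)}
for r, fmap in branches.items():
    print(f"rate {r}: output shape {fmap.shape}")
```

In a real ASPP module the branch outputs would be resized to a common resolution and concatenated before further processing.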