Abstract: To address the misclassification and incomplete segmentation that existing real-time semantic segmentation networks exhibit on multi-scale objects, a real-time semantic image segmentation method based on dual-branch fusion is proposed. The method introduces a scale attention fusion module that fuses the spatial features and semantic information extracted by the detail branch and the semantic branch, improving the network's recognition of multi-scale objects. An edge loss function guides the detail branch to learn object edge contours, improving the network's segmentation of edge details. Finally, a global perception module is constructed to enhance the network's global context awareness. Experimental results show that the proposed method achieves a mean Intersection over Union (mIoU) of 78.1% on the Cityscapes dataset and 76.2% on the CamVid dataset, with mean pixel accuracy (mPA) of 87.6% and 85.4%, respectively. The method segments small-scale object edges more accurately and meets real-time requirements on a single GTX 1080Ti GPU, reaching 59.8 and 43.5 frames per second (FPS) on the two datasets, respectively.
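The abstract does not specify the internals of the scale attention fusion module, but the general idea of attention-weighted fusion of two branches can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `scale_attention_fusion`, the use of global average pooling, and the sigmoid gating are hypothetical stand-ins, not the paper's actual design (which would use learned convolutions).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def scale_attention_fusion(detail_feat, semantic_feat):
    """Hypothetical sketch: fuse detail-branch and semantic-branch
    features with a per-channel attention weight derived from both.
    Inputs are arrays of shape (channels, H, W)."""
    # Global average pooling over spatial dims -> per-channel descriptor
    desc = (detail_feat + semantic_feat).mean(axis=(1, 2), keepdims=True)
    # Attention weight in (0, 1) per channel; a learned 1x1 convolution
    # would normally produce this, a plain sigmoid stands in here
    attn = sigmoid(desc)
    # Weighted combination: attention trades off detail vs. semantics
    return attn * detail_feat + (1.0 - attn) * semantic_feat

# Toy example: 4-channel feature maps of size 8x8
d = np.ones((4, 8, 8))
s = np.zeros((4, 8, 8))
fused = scale_attention_fusion(d, s)
print(fused.shape)  # (4, 8, 8)
```

The output keeps the input shape; each channel is a convex combination of the two branches, so the fused features stay within the range spanned by the detail and semantic features.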