Abstract: Because SAR images are extremely difficult to interpret effectively, this paper proposes a new model for translating SAR images into optical images. Unlike existing methods, which rely on strictly aligned datasets for training, the model does not depend on such preconditions, and the images it generates can assist SAR interpretation tasks. The proposed method uses diffusion Schrödinger bridge theory to decompose the complex transformation between SAR and optical images into a multi-step generative process. To mitigate the impact of coherent speckle noise (CSN) on feature extraction, an AD-DPM denoising preprocessing module is incorporated into the model. Furthermore, a ViT-UNet generator is employed for deep feature extraction, and a PatchNCE regularization term is added to the loss function to better preserve the underlying structural components during generation. Experiments on the SEN1-2 dataset demonstrate that the proposed method effectively converts SAR images into high-quality optical images that are consistent with human visual perception. Compared with the baseline method CUT, the PSNR, SSIM, FID, and LPIPS metrics improve by 38.4%, 42.2%, 30.6%, and 21.0%, respectively. These results provide a valuable reference for more efficient and accurate interpretation of SAR images.
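As an illustration of the PatchNCE regularization mentioned above, the following is a minimal sketch of a patch-wise contrastive (InfoNCE) loss in plain Python. It is not the paper's implementation: the function name `patch_nce_loss`, the list-of-vectors feature layout, and the temperature of 0.07 are assumptions made for this example. Each generated-image patch feature is pulled toward the feature at the same location in the input image and pushed away from features at other locations.

```python
import math


def _normalize(vec):
    """Scale a feature vector to unit length (guarding against zero norm)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def patch_nce_loss(queries, keys, temperature=0.07):
    """Hypothetical patch-wise InfoNCE loss.

    queries: list of N feature vectors from patches of the generated image.
    keys:    list of N feature vectors from the same patch locations in the
             input image. keys[i] is the positive for queries[i]; all other
             keys act as negatives.
    temperature: assumed value; it sharpens the softmax over similarities.
    """
    q = [_normalize(v) for v in queries]
    k = [_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity between this query and every key, scaled.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over all keys.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching location as the correct class.
        total += log_sum - logits[i]
    return total / len(q)


# Toy features: three patches with mutually orthogonal descriptors.
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = patch_nce_loss(feats, feats)
shuffled = patch_nce_loss(feats, feats[1:] + feats[:1])
```

In this toy case the loss is close to zero when patch locations correspond and much larger when they are shuffled, which is the property that encourages a generator to keep structure at the same spatial positions.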