Abstract: To address image blurring caused by camera shake or object motion, a Mamba and frequency-domain fusion network, MFNet, is proposed. The network adopts a flip-decoder architecture, combines the non-causal modeling capability of the Vision Transformer with the Mamba modeling framework, and improves deblurring performance by fusing frequency-domain information. A non-causal pixel-interaction method is designed that uses an attentional state-space equation to effectively model semantically similar pixels in unscanned sequences, while a Fourier-transform module alleviates long-distance information decay. Experimental results show that MFNet outperforms existing mainstream methods on the GoPro dataset, reaching a PSNR of 33.43 dB at 66.7 GFLOPs, thereby achieving higher restoration accuracy with lower computational overhead while effectively removing blur and recovering image detail.
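
The abstract attributes the relief of long-distance information decay to a Fourier-transform module. As a rough illustration of how a frequency-domain branch can give every spatial location a global receptive field, the sketch below applies a 2-D FFT, a learnable point-wise mixing of the real and imaginary parts, and an inverse FFT with a residual connection. The class name `FreqMixBlock`, the layer choices, and all hyperparameters are assumptions for illustration only and are not taken from MFNet.

```python
# Illustrative sketch (not the authors' implementation): a frequency-domain
# feature-mixing block. A 2-D FFT lets every spatial location interact with
# every other one, which is one way a Fourier-transform module can counteract
# long-distance information decay.
import torch
import torch.nn as nn


class FreqMixBlock(nn.Module):
    """Hypothetical frequency-domain branch: FFT -> learnable mixing -> iFFT."""

    def __init__(self, channels: int):
        super().__init__()
        # Point-wise convolution applied to the stacked real/imaginary parts
        # in the frequency domain (assumed design choice).
        self.freq_conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Real 2-D FFT: each frequency bin aggregates the whole feature map.
        spec = torch.fft.rfft2(x, norm="ortho")           # (B, C, H, W//2+1), complex
        feat = torch.cat([spec.real, spec.imag], dim=1)   # (B, 2C, H, W//2+1)
        feat = self.freq_conv(feat)
        real, imag = feat.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        # Back to the spatial domain; the residual keeps local detail intact.
        out = torch.fft.irfft2(spec, s=(h, w), norm="ortho")
        return x + out


if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)     # dummy feature map
    print(FreqMixBlock(32)(x).shape)   # torch.Size([1, 32, 64, 64])
```

Because the learnable weights act on frequency components rather than on local neighborhoods, even a 1x1 convolution in this domain mixes information across the entire image, which is the intuition behind pairing such a branch with a sequence model like Mamba.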