Fusion of images from different modalities, such as visible and far-infrared images, is an important image processing technique because the modalities can compensate for each other. Many existing image fusion algorithms assume that the input images are perfectly aligned. However, this assumption is not satisfied in many practical situations. In this paper, we propose an image alignment and fusion algorithm based on gradient-domain processing. First, we extract the gradient maps from both modality images. Then, for each assumed disparity between the two gradient maps, a candidate gradient map for the target fused image is generated by selecting, pixel by pixel, the gradient with the larger power from the two modalities. A key observation is that a wrong assumed disparity produces ghost edges in the fused image, whereas the correct disparity preserves each edge as a single edge without ghosting. Therefore, we evaluate the gradient power in the region of interest of the fused image over different disparities, and align the images using the disparity that yields the minimum gradient power. Finally, we apply gradient-based image fusion to the aligned image pairs. We experimentally validate that the proposed approach can effectively align and fuse visible and far-infrared images.
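To make the disparity search concrete, below is a minimal Python sketch of the alignment step, assuming a purely horizontal 1-D disparity and single-channel inputs. The function names (`gradient_power`, `align_by_gradient_power`), the forward-difference gradient, and the wrap-around shift are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def gradient_power(img):
    """Squared gradient magnitude via forward differences."""
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return gx**2 + gy**2

def align_by_gradient_power(vis, ir, disparities, roi):
    """Pick the disparity whose fused gradient map has minimum power.

    A wrong disparity duplicates edges (ghosting), which inflates the
    total gradient power inside the region of interest; the correct
    disparity keeps each edge single, so its power is minimal.
    """
    r0, r1, c0, c1 = roi
    g_vis = gradient_power(vis)
    best_d, best_power = None, np.inf
    for d in disparities:
        # Hypothetical horizontal shift; np.roll wraps around at the
        # border, which is acceptable for a sketch but not for real use.
        g_ir = gradient_power(np.roll(ir, d, axis=1))
        # Candidate fused gradient map: keep the stronger gradient per pixel.
        fused = np.maximum(g_vis, g_ir)
        power = fused[r0:r1, c0:c1].sum()
        if power < best_power:
            best_d, best_power = d, power
    return best_d
```

With this estimate in hand, the final gradient-based fusion step would be applied to the image pair after shifting by `best_d`.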
We present a system for joint registration and fusion of RGB and infrared (IR) video pairs. While RGB imagery matches human perception, IR imagery captures heat; however, IR images often lack contour and texture information. The goal of fusing visible and IR images is to combine the complementary information of both. This requires two fully registered images, yet classical methods that assume ideal imaging conditions fail to perform satisfactorily in practice, and from a data-driven modeling point of view, labeling such datasets is costly and impractical. In this context, we present a framework that tackles two challenging tasks. First, a video registration procedure aligns the IR and RGB videos. Second, a fusion method brings all the essential information from the two video modalities into a single video. We evaluate our approach on a challenging dataset of RGB and IR video pairs collected to help firefighters carry out their tasks effectively under poor visibility conditions, such as heavy smoke after a fire; see our project page.