Tags: edge-based computing, image fusion, multimodality
Abstract:
The fusion of multiple images from different modalities is the process of generating a single output image that combines the useful information of all input images. Ideally, the information-rich content of each input image is preserved, and the cognitive effort required of the user to extract this information is lower for the fused image than for examining all input images separately. We propose MobileFuse, an edge computing method targeted at processing large amounts of imagery in bandwidth-limited environments using depthwise separable Deep Neural Networks (DNNs). The proposed approach is a hybrid between generative and blending-based methods, and it can be applied in various fields that require low-latency interaction with a user or an autonomous system. The main challenge in training DNNs for image fusion is the scarcity of data with representative ground truth: registering images from different sensors is a major challenge in itself, and generating ground truth from them is an even greater one. For this reason, we also propose a multi-focus and multi-lighting framework for generating training datasets from unregistered images. We show that our edge network runs faster than its state-of-the-art baseline while improving fusion quality.
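
The abstract does not spell out the network internals, but the building block it names, the depthwise separable convolution, is what makes such networks cheap enough for edge deployment. The PyTorch sketch below illustrates that factorization only; it is not the MobileFuse architecture, and the channel sizes, kernel size, and activation are assumptions.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """One depthwise separable convolution block: a per-channel
    (depthwise) 3x3 convolution followed by a 1x1 (pointwise)
    convolution that mixes channels. Illustrative sketch, not the
    paper's exact block."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # groups=in_channels makes the 3x3 filter act on each channel independently
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   padding=1, groups=in_channels, bias=False)
        # The 1x1 convolution recombines the per-channel outputs
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.activation = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.activation(self.pointwise(self.depthwise(x)))

# A standard 3x3 convolution from C_in to C_out channels costs
# 9 * C_in * C_out multiplies per pixel; the factorized block costs
# 9 * C_in + C_in * C_out, roughly a 9x saving when C_out is large.
```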
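
Likewise, the data-generation framework is only named here. One common way to build multi-focus training pairs without any cross-sensor registration is to blur complementary regions of a single sharp image, which then serves as its own fusion ground truth. The sketch below follows that general idea under stated assumptions; the mask shape, Gaussian blur model, and `sigma` value are illustrative choices, not the paper's method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_multifocus_pair(image: np.ndarray, sigma: float = 3.0):
    """Synthesize a multi-focus training pair from one sharp image.

    A binary mask splits the image into two regions; each input keeps
    one region sharp and blurs the other, while the original image is
    kept as the fusion ground truth. Illustrative sketch only.
    """
    h, w = image.shape[:2]
    # Illustrative mask: left half in focus in input A, right half in input B.
    mask = np.zeros((h, w), dtype=np.float32)
    mask[:, : w // 2] = 1.0
    if image.ndim == 3:
        mask = mask[..., None]  # broadcast over color channels

    image = image.astype(np.float32)
    # Blur spatial axes only; leave the channel axis untouched for color images.
    blur_sigma = (sigma, sigma, 0) if image.ndim == 3 else sigma
    blurred = gaussian_filter(image, sigma=blur_sigma)

    input_a = mask * image + (1.0 - mask) * blurred   # left region sharp
    input_b = (1.0 - mask) * image + mask * blurred   # right region sharp
    return input_a, input_b, image  # ground truth is the original image
```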