Tags: Channel and spatial attentions, High-frequency information, Image rescaling, Multi-stage training strategy
Abstract:
Recent image rescaling methods adopt invertible bijective transformations to model downscaling and upscaling simultaneously, where the high-frequency information learned during downscaling is used to recover the high-resolution image by passing it through the model in reverse. However, less attention has been paid to exploiting this high-frequency information during upscaling. In this paper, we develop an efficient end-to-end learning model for image rescaling based on a newly designed neural network. The network consists of a downscaling generation sub-network (DSNet) and a super-resolution sub-network (SRNet), and learns to recover high-frequency signals. Concretely, we introduce dense attention blocks into the DSNet to produce a visually pleasing low-resolution (LR) image, and we model the distribution of the high-frequency information using a latent variable that follows a specified distribution. For the SRNet, we adapt an enhanced deep residual network by using residual attention blocks and adding a long skip connection; during upscaling, it transforms the predicted LR image and random samples of the latent variable back to the high-resolution image. Finally, we define a joint loss and adopt a multi-stage training strategy to optimize the whole network. Experimental results demonstrate the superior performance of our model over existing methods in terms of both quantitative metrics and visual quality.
Enhancing Image Rescaling Using High Frequency Guidance and Attentions in Downscaling and Upscaling Network