DCT-DWT-FFT Based Method for Text Detection in Underwater Images

EasyChair Preprint no. 6997

Date: November 6, 2021


Text detection in water images is an open challenge for the text detection community because of distortions caused by refraction, absorption of light, and particles, all of which vary with the depth, color, and nature of the water. Unlike existing models, which aim at detecting text in natural scene images, the proposed work focuses on developing a new model for text detection in water images through a new enhancement model. The enhancement model is motivated by the observation that, compared with non-text pixels, the fine details of text exhibit high energy, spatial resolution, and brightness, which are captured by the Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and Fast Fourier Transform (FFT), respectively. This step produces six combinations of enhanced images, namely, DCT-DWT-FFT, DCT-FFT-DWT, DWT-DCT-FFT, DWT-FFT-DCT, FFT-DCT-DWT, and FFT-DWT-DCT. The combination of enhanced images is fed to a modified Character Region Awareness for Text Detection (CRAFT) model to detect text in the water images. Experimental results on our dataset, which contains water images with text, and on benchmark datasets for natural scene text detection, namely, MSRA-TD500, ICDAR 2019 MLT, ICDAR 2019 ArT, Total-Text, CTW1500, and COCO-Text, show that the proposed method works well for both water images and natural scene images. The experimental results also show that the proposed method outperforms the existing methods on all the datasets.
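
The six combinations named in the abstract are simply every ordering of the three transforms. As a minimal sketch, the snippet below enumerates those orderings and applies a chain of stages in the stated order. The names `enhancement_orders`, `apply_chain`, and the `operators` mapping are hypothetical, introduced here only for illustration; the abstract does not specify how each transform-based enhancement is implemented, so the stages are left as caller-supplied placeholders.

```python
from itertools import permutations

# The three transforms named in the abstract.
STAGES = ("DCT", "DWT", "FFT")


def enhancement_orders(stages=STAGES):
    """Return every ordering of the stages as a hyphen-joined label."""
    return ["-".join(order) for order in permutations(stages)]


def apply_chain(image, order, operators):
    """Apply the operators named in `order` (e.g. "DCT-DWT-FFT") left to right.

    `operators` maps a stage name to a callable taking and returning an image.
    These callables are placeholders (an assumption of this sketch), not the
    enhancement steps used in the paper.
    """
    for name in order.split("-"):
        image = operators[name](image)
    return image


if __name__ == "__main__":
    for label in enhancement_orders():
        print(label)
```

Each label can then be paired with real transform-based operators, should those be available, to produce one enhanced image per ordering.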

Keyphrases: deep learning model, Discrete Cosine Transform, Enhancement, Fourier transform, text detection, Water images, wavelet transform

