Tags: Automatic Image Annotation, Convolutional Neural Networks, Ensemble Methods, Fast Fourier Transform
Abstract:
This paper discusses the machine learning models and methods used to solve the problem of automatic image annotation. Systems that can extract meaning from visual data are increasingly developed and used in both academia and industry. One practically important direction within this problem area is the development of automatic systems for understanding visual scenes. We present a brief survey of state-of-the-art machine learning approaches and methods that have been proposed for automatic image annotation, study the mathematical foundations of the surveyed methods, and analyze their strengths and limitations. Further, we develop a proof-of-concept image annotation system using convolutional neural networks and construct a neural network ensemble using the snapshot approach. In the image processing stage, we apply the Fast Fourier Transform. Finally, we outline a direction for further development of image annotation systems based on both theoretical and experimental models.
Automatic Image Annotation with Ensemble of Convolutional Neural Networks