09:20-10:20 Session 7A: Invited Speech 3&4
Location: Room 345
Color Image Processing via Quaternion Representation
Representation Learning for Face Image Analysis
10:30-11:30 Session 8A: Artificial Intelligence 2
Location: Room 345
Perceptual Fusion of Infrared and Visible Image Through Variational Multiscale with Guide Filtering

ABSTRACT. To address the poor noise suppression of current fusion methods and their tendency to lose edge contours and detail information, an infrared and visible light image fusion method based on variational multiscale decomposition is proposed. First, the source images are each processed by variational multiscale decomposition to obtain texture components and structural components. A guided filter is used to fuse the texture components of the source images. For the structural components, a method is proposed that sets the fusion weights from combined phase consistency, sharpness, and brightness information. Finally, the fused texture components and the fused structural components are added to obtain the final fused image. Experiments show that the proposed method effectively suppresses the influence of noise on the fusion result, and that the texture and structure information of the fusion result is clearer.

Infrared dim and small target detection based on denoising autoencoder network

ABSTRACT. Infrared small target detection is a crucial technology for infrared early-warning tasks, infrared imaging guidance, and large field-of-view target monitoring. In this paper, we propose an end-to-end infrared small target detection model (called CDAE) based on a denoising autoencoder network and a convolutional neural network, which treats small targets as "noise" in the infrared image and transforms the small target detection task into a denoising problem. In addition, we introduce the perceptual loss to solve the problem of background texture feature loss in the encoding process, and propose a structure loss to make up for the defect of the perceptual loss that lets small targets appear. We compare with ten methods on six sequences and one single-frame data set. Experimental results show that our method obtains the highest SCRG value on four sequences and the highest BSF value on six sequences. The ROC curves show that our method achieves the best results on all test sets.
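The idea of treating a small target as "noise" can be illustrated with a minimal sketch: estimate the background, then read the target off the residual between the input and that estimate. This is not the CDAE model from the abstract. A 3x3 median filter stands in for the denoising autoencoder, and the function names `estimate_background` and `target_map` are hypothetical.

```python
from statistics import median

def estimate_background(image):
    """Stand-in for the learned denoiser: 3x3 median filter (assumption, not CDAE)."""
    height, width = len(image), len(image[0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            row.append(median(window))
        background.append(row)
    return background

def target_map(image, background):
    """Residual between input and background; bright residuals are target candidates."""
    return [[max(0, pixel - bg) for pixel, bg in zip(img_row, bg_row)]
            for img_row, bg_row in zip(image, background)]

# A flat 5x5 background with one bright single-pixel "target" in the centre.
frame = [[10] * 5 for _ in range(5)]
frame[2][2] = 200
residual = target_map(frame, estimate_background(frame))
```

In this toy frame the residual is non-zero only at the centre pixel, which is the behaviour a denoising-based detector aims for.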

Implicit Preference in Social Recommendation
PRESENTER: Yuecheng Yu

ABSTRACT. Feedback information in social media often contains users’ implicit preferences. However, it is ignored by most social recommendation algorithms. To improve the users’ experience and reduce the push of unwelcome information, a new social recommendation algorithm that adopts user implicit preference is proposed in this paper. Unlike existing recommendation methods based on probabilistic matrix factorization, the user's implicit feedback information is incorporated into the user rating prediction function, and social network trust calculation is adopted to alleviate the sparsity of the implicit feedback data. The recommendation list is then optimized and most of the unwelcome content is filtered out. The experimental results on real-world datasets demonstrate the effectiveness of our proposed method.

A Multi-scale Progressive Method of Image Super-resolution
PRESENTER: Ying Surong

ABSTRACT. In recent years, researchers have increasingly focused on single image super-resolution for large scale factors. A single image contains few high-frequency details, which are insufficient to reconstruct a high-resolution image. To address this problem, we propose a multi-scale progressive image super-resolution reconstruction network (MSPN) based on the asymmetric Laplacian pyramid structure. Our proposed network allows us to separate the difficult problem into several subproblems for better performance. Specifically, we propose an improved multi-scale feature extraction block (MSFB) to widen our proposed network and achieve deeper and more effective exploitation of feature information. Moreover, weight normalization is applied in the MSFB to tackle the vanishing and exploding gradient problems and to accelerate training convergence. In addition, we introduce a pyramid pooling layer into the upsampling module to further enhance image reconstruction performance by aggregating local and global context information. Extensive evaluations on benchmark datasets show that our proposed algorithm performs well against state-of-the-art methods in terms of accuracy and visual quality.

VM Migration based on Gene Aggregation Genetic Algorithm
PRESENTER: Jinjin Wang

ABSTRACT. As a key technology of cloud computing, virtualization enables multiple virtual machines (VMs) to run on a host, improving the efficiency of the host. When a VM runs too many tasks, the host becomes overloaded and an exception occurs. To address this issue, this paper considers the communication cost of VM migration and proposes a VM migration algorithm based on the Gene Aggregation Genetic Algorithm (VMM-GAGA), which effectively reduces the number of genes in the chromosome, the search space, and the communication cost.

Multi-view Spectral Clustering via Complementary Information
PRESENTER: Shuangxun Ma

ABSTRACT. In this paper, a novel multi-view spectral clustering via complementary information (MSCC) is proposed, in which both the consensus information and complementary information are explored for multi-view clustering. Different from most existing multi-view spectral clustering methods, the proposed MSCC takes the difference among multiple views into consideration and constructs a desired similarity matrix for clustering. Furthermore, a convex relaxation is employed and an algorithm based on the Augmented Lagrangian Multiplier is proposed to optimize the objective function of the MSCC. Extensive experiments on four real-world benchmark datasets verified that the proposed method outperforms several state-of-the-art methods for multi-view clustering.

Fine-grained Facial Image-to-image Translation with an Attention based Pipeline Generative Adversarial Framework

ABSTRACT. Fine-grained feature detection and recognition is a challenging task because of limited resolution and noisy representation. Synthesizing images with a specified tiny feature is even more challenging. Existing image-to-image generation studies usually focus on improving image generation resolution and increasing representation learning ability under coarse features. However, generating images with fine-grained attributes under an image-to-image framework remains difficult. In this paper, we propose an attention based pipeline generative adversarial network (Atten-Pip-GAN) to generate various facial images under multi-label fine-grained attributes from only a neutral facial image. First, we use a pipeline adversarial structure to generate images with multiple features step by step. Second, we use an independent image-to-image framework as a preprocessing method to detect the small fine-grained features and provide an attention map that improves the generation of delicate features. Third, we propose an attention-based location loss to improve generation performance on small fine-grained features. We apply this method to the open facial image database RaFD and demonstrate the efficiency of Atten-Pip-GAN in generating facial images with fine-grained attributes.

Hyperspectral Image Classification Based on Set-to-Set Distance and Label Optimization

ABSTRACT. In this study, we develop an effective classification framework to classify a hyperspectral image (HSI), which consists of two fundamental components: set-to-set distance (STS) and label optimization. First, we propose a novel STS method to measure the distance between the test data and the sample matrix in each class. The STS model effectively models the spatial consistency among the neighboring pixels by using the convex hull model and set-to-set distance. In addition, we develop a novel label optimization method to enhance label consistency in the classification process, which is able to further improve the performance of the STS method. Finally, we evaluate the proposed methods by comparing them with other algorithms on several HSI classification data sets. Both qualitative and quantitative results demonstrate that the proposed methods perform favorably in comparison to the other algorithms.

High level Video Event Modeling, Recognition and Reasoning via Petri Net

ABSTRACT. A Petri net based framework is proposed for automatic high level video event description, recognition and reasoning. In comparison with the existing approaches reported in the literature, our work is characterized by a number of novel features: (i) the high level video event modeling and recognition based on Petri nets are fully automatic, and can cover not only a single video event but also multiple events without limit; (ii) more variations of the event path extracted from the training dataset can be found and modeled using the proposed algorithms; (iii) the recognition results are more accurate because they are based on automatically built high level event models. Experimental results show that the proposed method outperforms the existing benchmark in terms of recognition precision and recall. As an additional advantage, hidden variations of events that are hard for humans to identify can also be recognized.

Context-aware based Discriminative Siamese Neural Network for Face Verification

ABSTRACT. Face recognition and verification algorithms have achieved great success under controlled conditions in recent years. In real-world uncontrolled application scenarios, however, the discriminative ability of features remains the core challenge for face verification. To address this problem, we propose a context-aware discriminative Siamese neural network for face verification. The structure of a face is more stable than attributes such as hairstyle or jewelry. First, we use a context-aware module to anchor facial structure information by filtering out irrelevant information. For improved discrimination, we develop a Siamese network with two symmetrical branch subnetworks that learns discriminative features from labeled triplet training data. In experiments on the LFW face dataset, the proposed method outperforms some state-of-the-art face verification methods.

Question Generalization in Conversation
PRESENTER: Jianfeng Peng

ABSTRACT. Dialogue response generation is one of the important topics in natural language processing, but current systems struggle to produce human-like dialogues. The responses proposed by a chat-bot are only passive answers or assent, which do not arouse people's desire to continue communicating. To address this challenge, in this paper, we propose a question generalization method with three types of question proposing schemes for different conversation patterns. A probability-triggered multiple conversion mechanism is used to make the system actively propose different types of questions. In experiments, our proposed method demonstrates its effectiveness in dialogue response generalization on a standard dataset. In addition, it achieves good performance in subjective conversational assessment.

Optimal Scheduling of IoT Tasks in Cloud-Fog Computing Networks

ABSTRACT. Emerging IoT end devices that generate a huge volume of IoT tasks have triggered the rapid development of Fog computing in the past years, mainly because of the real-time requirements of IoT tasks. Fog computing aims to make use of idle edge devices with computing/storage resources in the vicinity of IoT end devices and form them into instantaneous small-scale Fog networks (Fogs), so as to provide one-hop service and thus minimize service latency. Since Fogs may consist of only wireless nodes, only wired nodes, or both, it is important to map the diverse IoT tasks with different QoS requirements to different types of Fog, in order to optimize the overall Fog performance in terms of OPEX and transmission latency. To this end, we propose an Integer Linear Programming (ILP) model to optimally map the IoT tasks to different Fogs and/or the Cloud, taking IoT task mobility and real-time requirements into consideration. Numerical results show that the real-time and mobility requirements have a significant impact on the OPEX of the integrated Cloud-Fog (iCloudFog) framework.
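The flavor of the task-to-Fog mapping problem can be shown with a toy sketch. It is not the ILP model from the abstract: it enumerates every assignment exhaustively instead of calling an ILP solver, and every field (`load`, `max_latency`, `cost_per_unit`, `latency`, `capacity`) and number below is a hypothetical assumption chosen for illustration.

```python
from itertools import product

def best_mapping(tasks, nodes):
    """Exhaustively search task-to-node assignments for the cheapest feasible one.

    Feasible means: each task's latency bound is met and no node's capacity
    is exceeded. All fields are illustrative assumptions, not the paper's model.
    """
    names = list(nodes)
    best_cost, best_assignment = None, None
    for assignment in product(names, repeat=len(tasks)):
        used = {name: 0 for name in names}
        cost = 0
        feasible = True
        for task, name in zip(tasks, assignment):
            node = nodes[name]
            if node["latency"] > task["max_latency"]:
                feasible = False  # real-time requirement violated
                break
            used[name] += task["load"]
            cost += task["load"] * node["cost_per_unit"]
        if feasible and all(used[n] <= nodes[n]["capacity"] for n in names):
            if best_cost is None or cost < best_cost:
                best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment

# Hypothetical numbers: the Fog is fast but expensive, the Cloud is cheap but slow.
nodes = {
    "fog": {"cost_per_unit": 3, "latency": 5, "capacity": 3},
    "cloud": {"cost_per_unit": 1, "latency": 50, "capacity": 100},
}
tasks = [
    {"load": 2, "max_latency": 10},   # real-time task
    {"load": 1, "max_latency": 100},  # delay-tolerant task
    {"load": 1, "max_latency": 10},   # real-time task
]
cost, assignment = best_mapping(tasks, nodes)
```

With these made-up numbers the two real-time tasks are forced onto the more expensive Fog, which is one way to see how latency bounds can drive up operating cost. Exhaustive search grows exponentially with the number of tasks, so it is only a teaching device.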

10:30-11:30 Session 8B: Computer Vision 2
Location: Room 348
An improved Semantic Image Segmentation based on Residual Network

ABSTRACT. Semantic segmentation is one of the key issues in computer vision. It is widely applied in autonomous driving, robotic picking, augmented reality and so on. Owing to breakthroughs in deep learning in recent years, methods based on the Fully Convolutional Network (FCN) have become the most popular in semantic segmentation. However, the traditional FCN has difficulty capturing global context information because of its inherent spatial invariance. Furthermore, image resolution is reduced by the pooling layers. In this paper, we propose a new architecture called WideSegt to address these shortcomings of FCN. The new architecture captures the image context on various spatial scales and is also effective for small objects. In addition, it loses little position information because its structure contains no pooling layers. The proposed method achieves a Mean Intersection over Union (MIoU) of 72.5% and a Global Accuracy (GA) of 92.4% on the CamVid dataset, and achieves higher performance than previous methods without using additional datasets.
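The two reported metrics can be computed from a per-pixel confusion matrix. The sketch below uses the common definitions (per-class IoU as TP / (TP + FP + FN), averaged over classes, and global accuracy as the fraction of correctly labeled pixels). The exact evaluation protocol used by the authors is not stated in the abstract, so the details here, including the function names, are assumptions.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, pred in zip(true_labels, pred_labels):
        matrix[truth][pred] += 1
    return matrix

def mean_iou(matrix):
    """Average of per-class TP / (TP + FP + FN), skipping classes that never occur."""
    size = len(matrix)
    ious = []
    for c in range(size):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(size)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

def global_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Six flattened pixels, three classes (toy data).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
miou = mean_iou(cm)
ga = global_accuracy(cm)
```

For this toy data the per-class IoUs are 1/3, 2/3 and 1/2, so the mean IoU is 0.5 while the global accuracy is 4/6, showing that the two metrics can differ noticeably on the same prediction.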

A New Image Transmission Compression Approach based on Beidou Navigation Satellite System on the High Sea

ABSTRACT. A great amount of image information needs to be transmitted in high sea aquaculture, while the Beidou Navigation Satellite System has limited transmitting capability. To solve these problems, this paper proposes a new compression approach for transmitting images over the Beidou Navigation Satellite System, i.e., the variable sequence progressive compression approach. First, the data volume is reduced by performing matrix operations and difference component extraction on the binary stream data. Then, the data volume is further reduced by an iterative compression approach. Simulation results show that, compared with the traditional Huffman and DCT (Discrete Cosine Transform) algorithms, the proposed approach can effectively reduce the amount of transmitted data, reduce the compression ratio and improve the data transmission efficiency while ensuring the same image quality.

Multilayer Depthmap Perceptron-based Underwater Image Enhancement

ABSTRACT. To improve the quality of underwater images, this paper improves the traditional Retinex algorithm by combining it with a neural network. First, the Retinex algorithm is used to defog the underwater image. Second, the image brightness is improved by Gamma correction. Finally, the transmission image is further refined using the dark channel prior and a multilayer perceptron (MLP) to improve the dynamic range of the image. Theoretical analysis and practical experiments show that the method can enhance image details and restore image color.

Underwater Image Enhancement Using Retinex and Multilayer Perceptron
PRESENTER: Tingting Zhang

ABSTRACT. To improve the quality of underwater images, this paper improves the traditional Retinex algorithm by combining it with a neural network. First, the Retinex algorithm is used to defog the underwater image. Second, the image brightness is improved by Gamma correction. Finally, the transmission image is further refined using the dark channel prior and a multi-layer perceptron to improve the dynamic range of the image. Theoretical analysis and practical experiments show that the method can enhance image details and restore image color.
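The Gamma correction step can be sketched in a few lines. The abstract does not give the exponent or the exact formula used, so the standard power-law mapping on 8-bit intensities and the value `gamma=0.5` below are assumptions, and `gamma_correct` is a hypothetical name.

```python
def gamma_correct(image, gamma):
    """Apply out = 255 * (in / 255) ** gamma to each 8-bit pixel.

    gamma < 1 brightens mid-tones, gamma > 1 darkens them;
    pure black (0) and pure white (255) are left unchanged.
    """
    return [[round(255 * (pixel / 255) ** gamma) for pixel in row]
            for row in image]

# One row of a grayscale image: black, a dark mid-tone, and white.
dark_row = [[0, 64, 255]]
brightened = gamma_correct(dark_row, gamma=0.5)
```

With an exponent below 1 the dark mid-tone is lifted while the end points stay fixed, which is why this mapping is a common way to raise brightness without clipping.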

Image Super-Resolution Algorithm Based on Color Features

ABSTRACT. Super-resolution (SR) reconstruction of a single image is a classical problem in computer vision, and its purpose is to obtain a high-resolution image from a low-resolution image. Current convolutional neural network reconstruction algorithms are trained only on the Y channel, which leads to insufficient acquisition of color features. Therefore, an image super-resolution reconstruction algorithm based on color features is proposed. First, the image is divided into R, G and B channels to extract more color features. These channels are then used as input images for convolutional neural network training. Finally, the output R, G and B channel images are fused to restore the high-resolution image. Experimental results show that the proposed method improves the reconstruction of image edges and texture details, improves image clarity, and enhances image color recovery.
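The split, process, and fuse structure described above can be sketched as follows. Nearest-neighbor upscaling stands in for the convolutional network, which the abstract does not specify, so only the per-channel data flow is illustrated here. All function names are hypothetical.

```python
def split_channels(image):
    """Turn rows of (r, g, b) tuples into three single-channel images."""
    return tuple([[pixel[c] for pixel in row] for row in image] for c in range(3))

def upscale_nearest(channel, factor):
    """Placeholder for the per-channel network: repeat each pixel factor x factor times."""
    upscaled = []
    for row in channel:
        wide_row = [value for value in row for _ in range(factor)]
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled

def merge_channels(red, green, blue):
    """Fuse three single-channel images back into rows of (r, g, b) tuples."""
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(red, green, blue)]

def super_resolve_rgb(image, factor, per_channel=upscale_nearest):
    """Process R, G and B independently, then fuse the results."""
    red, green, blue = split_channels(image)
    return merge_channels(*(per_channel(ch, factor) for ch in (red, green, blue)))

low_res = [[(1, 2, 3), (4, 5, 6)]]            # a 1x2 RGB image
high_res = super_resolve_rgb(low_res, factor=2)  # becomes 2x4
```

Passing a different callable as `per_channel` swaps in another per-channel model without touching the split and fuse steps.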

Classification of Hyperspectral image based on Shadow Enhancement by Dynamic Stochastic Resonance

ABSTRACT. Information extraction from shadow areas in hyperspectral images (HSIs) has always been a difficult problem in HSI processing. Dynamic stochastic resonance (DSR) theory has shown that the noise contained in a signal can enhance the strength of the original signal and improve the signal-to-noise ratio (SNR), and it has been applied in signal and image processing, communication and other fields. In this paper, DSR theory is introduced to shadow enhancement in HSIs for the first time. The spatial and spectral dimensions of the shadow areas in an HSI are each enhanced by DSR. Then, the enhanced shadow areas are fused with the other areas of the HSI. Finally, the fused image is classified to explore the information in the HSI. The experimental results show that DSR is promising for shadow enhancement in HSIs and can help to improve classification.

Improving the Representation of Image Descriptions for Semantic Image Retrieval with RDF

ABSTRACT. Recent years have witnessed a surge of interest in many tasks at the intersection of natural language processing and computer vision. In particular, using objects together with their attributes and relations to represent images or interpret language has proved useful across a wide variety of applications. In this paper, we focus only on textual image descriptions and improve their representation for semantic image retrieval. We use natural language processing tools to obtain a set of objects, attributes and relations, and then model them as a graph structure with RDF. We also illustrate some use cases to show how to handle text-based image retrieval for complex or multilingual queries. The experimental results show that our approach improves the representation of image descriptions, making it suitable for enhancing image retrieval with high-level semantics.