ISAIR2019: THE 4TH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND ROBOTICS 2019
PROGRAM FOR TUESDAY, AUGUST 20TH

09:20-10:20 Session 2A: Keynote Speech 1&2
Location: Room 345
09:20
Internet of Things: Applications and Security Challenges

ABSTRACT. The Internet is used daily for many online services, e.g., to communicate, find information, do transactions, and be entertained. And this is possible independent of location due to its openness and distributed nature. But this very openness poses vulnerabilities which allow attacks such as ID theft, denial of service, industrial espionage, and extortion.

The Internet of Things (IoT) anticipates connecting billions of objects to the Internet in the next few years, thus potentially creating additional challenging security and privacy issues.

The objective of this talk is to describe some trends affecting Internet security and discuss the sources of security problems, as well as provide an overview of IoT and its related security issues. Typical IoT applications are presented with focus on autonomous cars. Also, some potential solutions, safeguards, and defenses are offered.

09:50
A Cyber-Physical Billboard IoT Network: Architecture, Operation, and Performance
09:20-10:20 Session 2B: Keynote Speech 3&4
Location: Room 348
09:20
Machine Learning and High Performance Computing
09:50
Non-rigid Image Registration Techniques and Its Application to Medical Imaging
10:30-11:30 Session 3A: Keynote Speech 5&6
Location: Room 345
10:30
Affective Interaction through Wide-learning and Cognitive Computing

ABSTRACT. With the development of the 5th generation wireless systems (5G), the wireless world is to be interconnected without barriers. The new technology is giving rise to more challenging applications, and people expect more personalized and interactive services with resource-limited mobile terminals. Fortunately, Mobile edge computing (MEC) implemented in the context of 5G can overcome this bottleneck, which makes it possible to enable many resource-intensive services for mobile users with the support of mobile big data delivery and cloud-assisted computing. In this talk, a novel Emotion-aware Mobile Cloud Computing framework in 5G will be given, which provides personalized emotion-aware services by MEC and affective computing. Meanwhile, artificial intelligence (AI) and cognitive computing are significant for emotion-aware computing and emotion detection to meet various technical challenges. Especially, the role of wide learning is emphasized, which integrates multiple deep nets models and multidimensional data collections through IoT and 5G technologies. The practical testbed named Affective Interaction through Wide-Learning And Cognitive Computing (AIWAC) will be introduced.

11:00
Kansei Information Processing and Its Application
10:30-11:30 Session 3B: Invited Speech 1&2
Location: Room 348
10:30
3D Mapping of Outdoor Environments by Scan Matching and Motion Averaging
11:00
Image-to-Image Translation: Advances and Trends
13:30-15:30 Session 4A: Artificial Intelligence 1
Location: Room 345
13:30
Video Smoke Removal based on Low-rank Tensor Completion
PRESENTER: Lizhen Deng

ABSTRACT. Smoke degrades outdoor vision systems: the captured videos have poor visual quality, and their structure is obscured. In this paper, by analyzing the characteristics of smoke, we propose a video smoke removal method based on space-time consistency and a smoke mixing model. Many effective haze removal algorithms exist, but few algorithms remove smoke, so the proposed method has practical significance. Experiments on simulated data show that the proposed algorithm outperforms haze removal algorithms in visual effects and qualitative measures.

13:40
Classification of Epilepsy Period Based on Combination Features and Spiking Swarm Intelligent Optimization Algorithm
PRESENTER: Lijuan Duan

ABSTRACT. Epilepsy seriously damages the physical and mental health of patients. Detection of epileptic EEG signals in different periods can help doctors diagnose the disease. The change of frequency components during epileptic seizures is obvious, and epileptic EEG signals may contain noise. Moreover, epileptic seizures are closely related to the release of neuronal spiking in the brain. In this paper, we propose an approach for epilepsy period classification based on combination features and a spiking swarm intelligent optimization classification algorithm. First, the combination features take into account both the time-frequency features and the principal component features of epilepsy. We obtain the time-frequency features by WPT or STFT-PSD, and use PCA to remove noise while extracting the principal component features. Second, the spiking swarm intelligent optimization classification algorithm takes advantage of individual cooperation and information interaction and has strong robustness. Its simulated neurons are closer to reality: they consider more information and obtain stronger computing power. The experimental results show that the average classification accuracy of the proposed method reaches 98.9% and the highest classification accuracy reaches 100%. Compared with other methods, the proposed method has the best classification performance.

13:50
Learning Spectral Normalized Adversarial Systems with Stacked Structure for High-Quality 3D Object Generation
PRESENTER: Haiyong Zheng

ABSTRACT. This paper proposes a new method for generating 3D objects based on generative adversarial networks (GANs). Recently, GANs have been used in 3D object generation, but generating high-quality 3D objects remains very challenging because of the complex data distribution over 3D objects. In this paper, we propose a GAN-based system that makes the generated objects more realistic. We use multiple generators and discriminators to enhance the ability of the model to learn complex distributions. Such a stacked structure can be considered a coarse-to-fine or low-to-high-resolution mechanism. We employ spectral normalization to control the Lipschitz constant of the discriminators by literally constraining the spectral norm of each layer, which gives a more stable training process. In this way, the proposed model can generate realistic and high-quality 3D objects. Moreover, our system can also recover incomplete 3D objects into complete 3D objects. Experiments demonstrate that our model produces higher-quality objects than the baselines.
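
The spectral normalization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single 2D weight matrix and estimates its largest singular value by power iteration; the function names `spectral_norm` and `spectrally_normalize` are hypothetical and are not taken from the paper, whose network details are not given here.

```python
import numpy as np

def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of a 2D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=weight.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iters):
        # Alternate between the right and left singular vector estimates.
        v = weight.T @ u
        v /= np.linalg.norm(v)
        u = weight @ v
        u /= np.linalg.norm(u)
    return float(u @ weight @ v)

def spectrally_normalize(weight, n_iters=50):
    """Rescale the matrix so that its spectral norm is approximately 1."""
    return weight / spectral_norm(weight, n_iters)
```

Dividing each layer's weights by this estimate bounds the layer's Lipschitz constant by roughly 1, which is the stabilizing effect the abstract refers to.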

14:00
Robust Face Hallucination via Locality-Constrained Matrix Regression Network

ABSTRACT. Representation learning methods have attracted considerable attention in learning-based face hallucination in recent years. Conventional methods perform local model learning on the low-resolution (LR) manifold and face reconstruction on the high-resolution (HR) manifold respectively, leading to unsatisfactory reconstruction performance when the acquired LR face images are severely degraded (e.g., noisy, blurred). To this end, this paper proposes an efficient locality-constrained matrix regression network (LCMRN) model to learn the representation of the input LR patch while preserving the manifold of the original HR space. In particular, locality-constrained matrix regression (LCMR) uses nuclear norm regularization to capture the low-rank structure of the representation residual, and applies an adaptive neighborhood selection scheme to find the HR patches that are compatible with its neighbors. In addition, LCMRN iteratively applies the manifold structure of the desired HR space to induce the learning of reconstruction weights in the LR space, aiming to reduce the consistency gap between different manifolds. Experimental results on the standard FEI and real-world CMU face databases demonstrate the effectiveness of our proposed approach over several state-of-the-art face hallucination approaches in both objective metrics and visual quality.

14:10
Research on Image Encryption based on Wavelet Transform Integrating with 2D Logistic
PRESENTER: Xi Yan

ABSTRACT. This paper proposes an image encryption model that integrates the wavelet transform with the 2D Logistic chaotic map, addressing the limited information transfer accuracy and low security of existing image encryption algorithms. The model realizes multi-level image encryption by extracting local decomposition coefficients of the original image and then mapping and rearranging the decomposition coefficient matrix through 2D Logistic chaos. The model is evaluated in simulation. The results show that with only two wavelet decomposition levels, the encryption space already reaches the order of 10^126. Increasing the number of wavelet decomposition levels expands the key space rapidly: with three levels, the overall key space reaches the order of 10^180. The algorithm not only retains information with high precision but also significantly enlarges the key space, further improving the security of image encryption.

14:20
An Improved Algorithm Based on SSD for Vehicle Detection
PRESENTER: Yanting Qiao

ABSTRACT. Vehicle detection has attracted extensive attention from academia and industry because of its potential to improve traffic safety and economic benefits. Benefiting from the quick development of deep learning techniques, vehicle detection has achieved remarkable progress recently. However, it is still not fast and accurate enough to be applied in industry. To solve these problems, this paper proposes an improved algorithm based on the Single Shot Detector (SSD) network model. We replace the small convolution filter with the Inception block to improve its performance. Then we use a new method to set the scales and aspect ratios of the default bounding boxes, which is more suitable for vehicle detection. The validity of our algorithm is verified on the KITTI and UVD datasets. Compared with SSD, our algorithm achieves a higher mAP while maintaining a fairly fast speed.
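
For context on what "scales and aspect ratios of the default bounding boxes" means, the sketch below reproduces the linear scale rule of the original SSD detector, not the revised setting proposed in this talk (which the abstract does not specify). The function names and the default values `s_min=0.2` and `s_max=0.9` follow the original SSD convention and are otherwise assumptions.

```python
import math

def ssd_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced default-box scales, one per feature map (original SSD rule)."""
    return [s_min + (s_max - s_min) * k / (num_maps - 1) for k in range(num_maps)]

def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (width, height) pairs, relative to image size, for one scale."""
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]
```

A vehicle-specific variant would presumably change the scale range or the aspect-ratio set, for example by favoring wide boxes, but that choice is the paper's contribution and is not shown here.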

14:30
Multi-Channel Deep CNN for Object Detection Under Complex Background
PRESENTER: Bo Zhao

ABSTRACT. This paper focuses on robust object detection under complex backgrounds based on a multi-channel deep convolutional network. We propose a novel detection framework inspired by darknet-53, which takes color, infrared and motion images as network input. Images from multiple sources can provide complementary information, which is beneficial to accurate detection. Furthermore, feature aggregation modules are exploited to further integrate features of shallow and deep activation maps. We first train the network on the large-scale ImageNet dataset, then fine-tune it on small-scale datasets collected in the real world. Both quantitative and qualitative experiments are conducted, showing comparable results while maintaining real-time performance.

14:40
Building Label-Balanced Emotion Corpus Based On Active Learning for Text Emotion Classification
PRESENTER: Xin Kang

ABSTRACT. In supervised learning of emotions from human language, keeping the emotion labels balanced in the training set is a challenging task, since emotion labels are highly biased in raw data of human language. In this paper, we propose a novel method based on active learning that partially inhibits the selection of text samples with more frequently observed emotion labels when constructing the training set, and encourages the selection of samples with less frequently observed emotion labels. For each batch of unlabeled samples, the samples selected by our approach are given ground truth emotion labels by human experts before they are merged into the training data. Our experiment on multi-label emotion classification of Chinese Weibo messages suggests that the proposed method is effective in constructing a label-balanced training set for text emotion classification, and that the supervised text emotion classification results improve steadily with such a training set.

14:50
KA-Ensemble: Towards Imbalanced Image Classification Ensembling Under-sampling and Over-sampling
PRESENTER: Haiyong Zheng

ABSTRACT. Imbalanced learning has become a research emphasis in recent years because of the growing number of class-imbalance classification problems in real applications. It is particularly challenging when the imbalanced rate is very high. Sampling, including under-sampling and over-sampling, is an intuitive and popular way of dealing with class-imbalance problems; it regroups the original dataset and has proved to be efficient. The main deficiency is that under-sampling methods usually ignore many majority class examples, while over-sampling methods may easily cause over-fitting. In this paper, we propose a new algorithm dubbed KA-Ensemble, which ensembles under-sampling and over-sampling to overcome this issue. KA-Ensemble builds on the EasyEnsemble framework by randomly under-sampling the majority class and, at the same time, over-sampling the minority class via Kernel-ADASYN, yielding a group of balanced datasets on which corresponding classifiers are trained separately; the final result is decided by a vote of all the trained classifiers. By combining under-sampling and over-sampling in this way, KA-Ensemble is good at solving class-imbalance problems with a large imbalanced rate. We evaluated our proposed method against state-of-the-art sampling methods on 9 image classification datasets with imbalanced rates ranging from less than 2 to more than 15, and the experimental results show that KA-Ensemble performs better in terms of ACC, F-Measure, G-Mean, and AUC. Moreover, it can be used for both binary and multi-class classification, on image classification as well as other class-imbalance problems.
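
The ensembling scheme described in this abstract can be sketched roughly as follows. This is not the authors' implementation: plain random duplication stands in for Kernel-ADASYN, the labels are assumed to be binary (0/1) and genuinely imbalanced, and the names `balanced_subsets` and `majority_vote` are hypothetical.

```python
import numpy as np

def balanced_subsets(X, y, n_subsets=5, seed=0):
    """Yield balanced (X, y) subsets from an imbalanced binary dataset."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    min_idx = np.flatnonzero(y == classes[np.argmin(counts)])
    maj_idx = np.flatnonzero(y == classes[np.argmax(counts)])
    # Meet in the middle: shrink the majority class, grow the minority class.
    target = (len(min_idx) + len(maj_idx)) // 2
    for _ in range(n_subsets):
        maj_pick = rng.choice(maj_idx, size=target, replace=False)  # under-sample
        min_pick = rng.choice(min_idx, size=target, replace=True)   # over-sample (stand-in)
        idx = np.concatenate([maj_pick, min_pick])
        yield X[idx], y[idx]

def majority_vote(predictions):
    """Combine 0/1 predictions of shape (n_classifiers, n_samples) by voting."""
    predictions = np.asarray(predictions)
    return (predictions.mean(axis=0) >= 0.5).astype(int)
```

One classifier would be trained per subset, and `majority_vote` would then merge their predictions on the test samples.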

15:00
Large-Scale Optimization via Cooperatively Coevolving Competition Swarm Optimizer
PRESENTER: Rushi Lan

ABSTRACT. This paper presents a two-stage approach, named Cooperatively Coevolving Competition Swarm Optimizer (C3SO), for large-scale optimization. C3SO first detects interactions among the original variables by a differential grouping algorithm, thereby decomposing a large-scale problem into several subcomponents. In the next stage, a new approach is developed using the competition mechanism to independently optimize each subcomponent obtained in the first stage. Hence, C3SO takes advantage of both divide-and-conquer and competition strategies. The proposed method is evaluated by comparing it with several state-of-the-art algorithms on different benchmark functions, and the experimental results demonstrate its effectiveness.
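
As an illustration of a competition mechanism in swarm optimization, the sketch below implements one round of the generic competitive swarm optimizer update, in which particles are paired at random and each loser learns from its winner. It is an assumption that the second stage of C3SO resembles this; the differential grouping stage is not shown, and the name `cso_round` and the parameter `phi` are hypothetical.

```python
import numpy as np

def cso_round(x, v, fitness, rng, phi=0.1):
    """One pairwise-competition round; fitness is minimized."""
    order = rng.permutation(len(x))
    f = np.array([fitness(p) for p in x])
    mean = x.mean(axis=0)
    x, v = x.copy(), v.copy()
    # Pair particles at random; with an odd swarm size one particle sits out.
    for a, b in zip(order[0::2], order[1::2]):
        w, l = (a, b) if f[a] <= f[b] else (b, a)
        r1, r2, r3 = rng.random((3, x.shape[1]))
        # The loser moves toward the winner and, weakly, toward the swarm mean.
        v[l] = r1 * v[l] + r2 * (x[w] - x[l]) + phi * r3 * (mean - x[l])
        x[l] = x[l] + v[l]
    return x, v
```

Because winners pass through unchanged, the best fitness in the swarm never gets worse from one round to the next.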

15:10
Supervised Dictionary Learning with Regularization for near-infrared spectroscopy classification
PRESENTER: Lingqiao Li

ABSTRACT. This paper presents a new model for near-infrared spectroscopy classification. The proposed model provides a novel sparse classification mechanism through the design of appropriate regularization factors. First, inspired by the impressive results of SRC over different classification tasks, we propose to improve the original SRC models by designing a representation-constrained term and a coefficients incoherence term; by sharing a dictionary, the two added terms bring the reconstruction error of the coding coefficients and the correlations between similar samples under more stable control. Then, based on the proposed model, a supervised class-specific dictionary learning algorithm is developed by choosing appropriate samples with class labels. Finally, a classification scheme integrating the novel sparse model is designed to exploit such discriminative information. The experimental results show that the proposed sparse classification mechanism may be an alternative to traditional methods for classifying near-infrared spectroscopy.

15:20
Deep Compression: A Compression Technology of Apron Monitoring Video
PRESENTER: Zonglei Lu

ABSTRACT. This paper presents a method of deep compression, which uses some methods of object detection to separate the moving and stationary objects of the real frame in the apron surveillance video. The extracted object image, the corresponding position information and the background image are stored in the linked list. When the video is decompressed, the extracted object images are restored to the background image according to the corresponding information, and the overall adjustment is performed according to the stored information such as illumination, and finally a video with high similarity to the original video is generated. This video compression method greatly reduces the storage space without destroying the original video information.

13:30-15:30 Session 4B: Computer Vision 1
Location: Room 348
13:30
Blind Image Deblurring Based on Local Rank
PRESENTER: Long Jin

ABSTRACT. Current blind image deblurring algorithms often estimate the blur kernel inaccurately, and the recovery effect is far from perfect. Accordingly, this paper proposes a single-image blind deblurring method based on local rank. It first applies adaptive threshold segmentation to the conventional local rank transform, which is then used to construct a novel model for blind image deblurring. Subsequently, the half-quadratic splitting method is adopted to alternately estimate the blur kernel and the intermediate latent image by iterations. Finally, the desired latent image is obtained by a linear combination of the hyper-Laplacian model and the TV-l2 model, where the weight is calculated by the adaptive local rank. Experimental results on public datasets demonstrate that the proposed approach can accurately estimate the blur kernel and effectively suppress ringing effects.

13:40
Residual Feature Pyramid Networks for Salient Object Detection
PRESENTER: Ben Wang

ABSTRACT. Effective convolutional features play an important role in salient object detection, but how to extract powerful features for saliency is still a challenging task. FCN-based methods directly apply multi-level convolutional features without distinction, which leads to sub-optimal results due to the distraction from redundant details. In this paper, we propose a novel residual feature pyramid network for extracting more efficient convolutional features. Specifically, we first introduce richer convolutional features to fully exploit multiscale and multilevel information of objects, which makes these convolutional features more discriminative. Secondly, we further propose our residual feature pyramid networks by erasing the current predicted salient regions from side-output features, so that the network could learn residual features from unpredicted regions and result in high resolution prediction. Experiments on four benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods.

13:50
Obstacle Avoidance Path Planning of Unmanned Submarine Vehicle in Ocean Current Environment Based on Improved Firework-Ant Colony Algorithm
PRESENTER: Yan Ma

ABSTRACT. This paper applies an improved Fireworks-Ant Colony Hybrid Algorithm to the two-dimensional autonomous path planning problem of a UUV in an environment affected by ocean currents and obstacles. Firstly, a two-dimensional Lamb vortex ocean current environment model with randomly distributed obstacles is established, and each circular obstacle is approximated by a square grid. Then, the mathematical model of path planning is established considering the energy consumption cost, navigation time cost and navigation distance cost. Finally, the improved fireworks-ant colony hybrid algorithm is applied to solve the nonlinear optimization problem, and the algorithm is compared with the basic ant colony algorithm in simulation experiments in four different marine environments. The experimental results show that the algorithm can quickly find the global optimal solution, and that the more complex the environment, the more obvious its advantages. The algorithm proposed in this paper provides a new approach to autonomous path planning for underwater vehicles.

14:00
Superpixel-based Feature Tracking for Structure from Motion
PRESENTER: Mingwei Cao

ABSTRACT. Feature tracking in image collections significantly affects the efficiency and accuracy of Structure from Motion (SFM). Insufficient correspondences may result in disconnected structures and incomplete components, while redundant correspondences containing incorrect ones may yield folded and superimposed structures. In this paper, we present a superpixel-based feature tracking method that tackles the issues of completeness, efficiency and consistency in a unified framework. In the proposed method, we first use a joint approach to detect local keypoints and compute descriptors. Second, the superpixel-based approach is used to generate labels for the input image. Third, we combine Speeded-Up Robust Features (SURF) and a binary test in the generated label regions to produce a set of combined descriptors for the detected keypoints. Fourth, locality-sensitive hashing (LSH)-based k-nearest-neighbor (KNN) matching is used to produce feature correspondences, and then the ratio test is used to remove outliers from the resulting matches. Finally, we conduct comprehensive experiments on several challenging benchmark datasets, including highly ambiguous and duplicated scenes. Experimental results show that our method has superior performance in terms of accuracy and efficiency.
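
The KNN matching and ratio test steps in this abstract can be sketched as below. For brevity the sketch uses brute-force nearest-neighbor search in place of LSH, assumes real-valued descriptors compared by Euclidean distance, and needs at least two descriptors in the second set; the function name and the threshold `ratio=0.8` are assumptions.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to desc_b, keeping only unambiguous matches."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    # Indices of the two nearest neighbors for every descriptor in desc_a.
    nearest = np.argsort(d, axis=1)[:, :2]
    matches = []
    for i, (j1, j2) in enumerate(nearest):
        # Keep the match only if the best neighbor is clearly closer than the second.
        if d[i, j1] < ratio * d[i, j2]:
            matches.append((i, int(j1)))
    return matches
```

A descriptor whose two nearest neighbors are almost equally close is treated as ambiguous and dropped, which is how the ratio test removes likely outliers in repetitive scenes.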

14:10
Style Transfer using Deep Convolutional Neural Networks and Saliency Detection
PRESENTER: Hui-Huang Zhao

ABSTRACT. This paper presents a novel image style transfer method using Deep Convolutional Neural Networks and Saliency Detection to transfer the style of one image to another image. When standard neural style transfer approaches are used, the textures in different regions of the style image are often applied inappropriately to the content image. In order to reduce or avoid such effects, we propose a novel method based on saliency detection. First, an existing saliency detection method is used to detect the object in the content and style images, and the images are rescaled using the saliency detection results. Then, a loss function incorporating saliency detection is proposed. A gradient of saliency detection is added to the style transfer result in each iteration. Finally, the saliency detection images and source images are provided as multichannel input to an augmented deep CNN framework for style transfer, which incorporates a generative Markov random field (MRF) model. Results on various images show that our method outperforms the most recent approaches.

14:20
Multiple Faces Tracking using Particle filter and BP Neural Network
PRESENTER: Guangyong Zheng

ABSTRACT. Tracking multiple faces is more challenging than tracking a single face, since the similarity between faces is high. In addition, some problems arise in multiple-object tracking that do not exist in single-object tracking, such as object occlusion. This paper presents an occlusion robust tracking (ORT) method for multiple-face tracking. Given a video containing multiple faces, we first detect faces in the first frame using an off-the-shelf face detector, then extract wavelet packet transform (WPT) coefficients and color features from the detected faces, and finally design a back propagation (BP) neural network and track the faces with a particle filter and the BP neural network. The main contribution is twofold. Firstly, the WPT coefficients combined with traditional color features are used for face tracking. They describe faces efficiently due to their discrimination and simplicity. Secondly, we propose an improved method for occlusion robust tracking based on the BP neural network. When there is an occlusion, the BP neural network learns from previous tracking results and is used to refine the current result from the particle filter. Experimental results show that our ORT method can handle occlusion effectively and achieves better performance than several previous methods.
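
The particle filter component of such a tracker can be sketched in a few lines. This is a generic bootstrap filter for a single 2D position with a random-walk motion model and a Gaussian observation likelihood; the WPT and color features and the BP-network refinement described in the abstract are not modeled, and all names and noise parameters are assumptions.

```python
import numpy as np

def particle_filter_step(particles, observation, rng, motion_std=2.0, obs_std=5.0):
    """One predict-weight-resample step for particles of shape (n, 2)."""
    # Predict: diffuse particles with a random-walk motion model.
    particles = particles + rng.normal(scale=motion_std, size=particles.shape)
    # Weight: Gaussian likelihood of the observed position for each particle.
    sq_dist = np.sum((particles - observation) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / obs_std ** 2) + 1e-300  # guard against all-zero weights
    weights /= weights.sum()
    estimate = weights @ particles
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], estimate
```

In an occlusion-aware tracker, `estimate` is the quantity a learned model could correct when the observation becomes unreliable.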

14:30
Interactive segmentation and 3D Reconstruction of Foreground Objects from RGBD Images
PRESENTER: Rui Sun

ABSTRACT. The 3D reconstruction of objects in complex scenes has important applications in intelligent transportation systems and medical image processing. This paper aims to segment and locate the foreground object through interactive segmentation with multiple cues, including saliency, depth and color. The desired foreground object is obtained by using the foreground and background information provided by a saliency map and a heatmap. Then, the multi-view point cloud information of the foreground object is recovered from depth, and a multi-view point cloud registration algorithm based on color information is proposed. The 3D model of the object is reconstructed through multi-view point cloud registration that combines a motion averaging algorithm with a low-rank sparse matrix. The 3D point cloud model generated by the color-based multi-view registration can be applied in medicine, transportation, biology, and other fields; a face model example is considered for further research. The validity of the framework is empirically assessed with comparative experiments.

14:40
Sentence Semantic Matching Based on Hierarchical Encoding Model
PRESENTER: Lu Wenpeng

ABSTRACT. Sentence semantic matching (SSM) plays a critical role in natural language processing. Measuring the intrinsic similarity among sentence semantics is very challenging and has not been substantially addressed. Recent progress on this task usually relies on a shallow representation and interaction between pairs, which fails to capture deep semantic features and yields only limited performance improvement. In this paper, we present a deep Hierarchical Encoding Model (HEM) for sentence semantic matching. Unlike existing models, HEM takes a sentence pair as input and encodes the representation of each sentence with a hierarchical encoding model. Moreover, HEM realizes a hierarchical matching mechanism to fully capture the interactions between sentences. Extensive experiments on a public real-world dataset, the BQ corpus, demonstrate that HEM significantly outperforms existing state-of-the-art neural models.

14:50
Smart Financial Services Assisted by Computational Intelligence
PRESENTER: Jianmin Lu

ABSTRACT. With the advent of the industrial Internet era, IoT (Internet of Things) technology has been applied to all walks of life. Through the combination of IoT technology and neural network algorithms, fund correlation analysis can guide investors' investments and wealth management so that investors can avoid selecting highly correlated funds in the investment process and spread risk across funds. Fund correlation analysis is a multivariate time series prediction problem, which requires a good time series analysis model. An increasing number of studies have found that the LSTM (Long Short-Term Memory) model has better time series processing capabilities than traditional statistical models and machine learning models. However, due to various factors such as economic and political factors, fund data usually contain noise and strong correlation between features. Based on the above research, this paper proposes a fund correlation analysis system combining IoT technology and CI (Computational Intelligence) technology, and uses an encoder-decoder model implemented with LSTM to analyze the correlation of funds. In this paper, the model is applied to a historical performance dataset containing multiple public funds and compared with the LightGBM model and the MLP (Multi-Layer Perceptron) model using feature extraction. The encoder-decoder model implemented with LSTM gives the best prediction result. The research in this paper is significant for the use of machine learning methods in solving fund correlation analysis problems and provides new ideas for research in the field of intelligent finance.

15:00
Discriminative Multiview Nonnegative Matrix Factorization for Classification

ABSTRACT. Multiview nonnegative matrix factorization has shown many promising applications in computer vision and pattern recognition. However, most existing works focus on view consistency and ignore discrimination. In this paper, we introduce a novel discriminative multiview nonnegative matrix factorization (DMultiNMF) algorithm to learn a discriminative and consistent representation for facilitating classification. In the algorithm, we apply discriminative patch alignment to enhance the local discrimination for each view and use the large margin principle to improve the global discrimination. At the same time, we use a shared representation to propagate information among the multiple views to ensure consistency. In addition, we measure the reconstruction errors using the correntropy-induced metric to improve robustness. Experiments on face recognition, handwritten digit recognition, Xmedia and Wikipedia multiview data sets demonstrate that the proposed method has advantages over single-view methods using concatenated views and performs substantially better than other multiview nonnegative matrix factorization algorithms.

15:10
Fast Two-Cycle Level Set Tracking with Interactive Superpixel Segmentation and Its Application in Image Retrieval
PRESENTER: Huayun Pan

ABSTRACT. In this paper, a fast two-cycle level set contour tracking algorithm based on an interactive extraction and a superpixel segmentation for foreground objects is proposed. Then, image retrieval is carried out based on the contour tracking results. The method of contour tracking is divided into three stages: superpixel segmentation, interactive feature extraction, and foreground tracking. In the first phase, the image is preprocessed by a superpixel segmentation method. In the second stage, the feature pools are defined and established interactively according to the complexity of the image. In the last stage, based on the first two stages, a fast two-cycle level set method is used to track the contour of the target object. Finally, the results of the contour tracking are applied to an image matting of the target foreground, and then the foreground image retrieval is implemented. The results of experimental comparisons demonstrate the effectiveness of the proposed method.

15:20
ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation
PRESENTER: Yu Wang

ABSTRACT. Recent years have witnessed great advances in semantic segmentation using deep convolutional neural networks (DCNNs). However, a large number of convolutional layers and feature channels makes semantic segmentation a computationally heavy task, which is a disadvantage in scenarios with limited resources. In this paper, we design an efficient symmetric network, called ESNet, to address this problem. The whole network has a nearly symmetric architecture, which is mainly composed of a series of factorized convolution units (FCUs) and their parallel counterparts. On one hand, the FCU adopts a widely used 1D factorized convolution in residual layers. On the other hand, the parallel version employs a transform-split-transform-merge strategy in the design of the residual module, where the split branches adopt dilated convolutions with different rates to enlarge the receptive field. Our model has nearly 1.6M parameters and runs at over 62 FPS on a single GTX 1080Ti GPU. The experiments demonstrate that our approach achieves state-of-the-art results in terms of the speed and accuracy trade-off for real-time semantic segmentation on the CityScapes dataset.
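
To see why a 1D factorized convolution saves parameters, the sketch below compares the weight count of a standard k x k convolution with that of a k x 1 convolution followed by a 1 x k convolution. It assumes equal input and output channel counts and ignores biases and normalization layers; the helper names are hypothetical and do not describe the exact ESNet unit.

```python
def conv_params(kernel_h, kernel_w, in_ch, out_ch, bias=False):
    """Number of learnable weights in a 2D convolution layer."""
    return kernel_h * kernel_w * in_ch * out_ch + (out_ch if bias else 0)

def factorized_params(k, channels):
    """Weights of a k x 1 convolution followed by a 1 x k convolution."""
    return conv_params(k, 1, channels, channels) + conv_params(1, k, channels, channels)

# With 64 channels: a full 3 x 3 layer versus its factorized pair.
full = conv_params(3, 3, 64, 64)
factorized = factorized_params(3, 64)
```

For a 3 x 3 kernel the factorized pair needs 6C² weights instead of 9C², a saving of one third per layer.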

15:40-18:00 Session 5A: Internet of Things
Chair:
Location: Room 345
15:40
Rice Growth Prediction Based on Periodic Growth
PRESENTER: Yongzhong Cao

ABSTRACT. On the basis of studying the growth and development characteristics of rice and the historical growth data of recent years, this paper gives a definition of rice growth quantity that takes into account the periodic growth and the key growth indexes of each stage, so as to characterize the growth of rice in each growing period. An Elman neural network was used to determine the relationship between environmental factors and growth in each growth period. At the same time, in order to improve the convergence speed and accuracy of the algorithm, we propose an adaptive improved genetic algorithm to optimize the forward feedback data. The training samples are composed of various environmental parameters and physiological indexes of rice at each stage. In the experiment, the network was trained with historical samples from several years to obtain the weights of each layer of the model, and the improved model achieved higher precision.

15:50
Medical Expenses Prediction of Gastric Cancer Based on Process Mining
PRESENTER: Yalu Guo

ABSTRACT. At present, disputes caused by medical expenses are widespread. How to use information technology to provide accurate prediction of medical expenses for major diseases has become a research hotspot. The establishment of electronic medical record files in major hospitals provides the data foundation for cost forecasting using process mining technology. In this paper, gastric cancer was selected as the research disease. According to the basic characteristics and clinical stages of 1700 patients, the clinical medical logs of each cluster of patients were extracted semi-automatically. We use the α^tj algorithm to build a Petri net model for each log, mine the effective diagnosis and treatment schemes for the corresponding population, and count the frequency and health distance of each scheme. Then, combined with the length of hospitalization and the cost of each scheme, four measures of the medical schemes were given, and their values were standardized. The comprehensive evaluation value of each scheme was obtained by using UAV algorithm. According to the personal wishes of the patients, we recommend the optimal medical scheme. Finally, according to the type of personal medical insurance and the medical insurance reimbursement policy at the location of the hospital, we give the estimated medical expenses of the patients.

16:00
Generative Image Inpainting
PRESENTER: Jiajie Xu

ABSTRACT. Images are becoming more and more important as a carrier of information, and the demand for image inpainting is increasing. We present an approach for image inpainting in this paper. The completion model contains one generator and two discriminators. The generator is an autoencoder architecture with skip connections, and the discriminators are simple convolutional neural networks. The Wasserstein GAN loss is used to ensure stable training of our model. We also give the training algorithm for our model in this paper.

16:10
Research on Robustness of Emotion Recognition under Environmental Noise Conditions

ABSTRACT. Noise is a non-negligible problem in emotion recognition if we want to put it into practice. First, to address noise in speech, we design a new acoustic feature, the Long time frame Analysis Weighted Wavelet Packet Cepstral Coefficient (LW-WPCC), for better robustness. To extract the LW-WPCC feature, the best wavelet packet basis is first constructed. On this basis, a robust wavelet packet cepstral coefficient is extracted by combining short time frame analysis with long time frame analysis. After that, we introduce a sub-band spectral center-of-mass parameter with good robustness to additive noise and propose an extraction algorithm for LW-WPCC. Experiments on speech emotion recognition at different SNR levels show that our proposed method achieves better noise robustness and performance. Moreover, as facial expressions are not affected by noise, we perform bimodal emotion recognition based on audio-visual data to improve robustness through decision-level fusion. Experiments based on audio-visual data are conducted to evaluate the efficiency of our method. Results show that bimodal emotion recognition based on audio-visual data can improve robustness and achieve better performance by benefiting from different kinds of emotion data.

16:20
A New Enhanced Artificial Bee Colony Clustering Algorithm

ABSTRACT. This paper focuses on the shortcomings of the artificial bee colony (ABC) algorithm, such as slow convergence, susceptibility to premature convergence, and low development efficiency. To address these problems, an improved ABC algorithm named SABC (Strengthened Artificial Bee Colony Algorithm) is proposed, which can be used to solve the clustering problem. In the initial stage, the algorithm uses a K-means operator to generate the initial nectar sources, which not only improves their quality but also avoids the problem of low operating efficiency. To enhance the interaction between individuals, the concept of a global optimum is introduced, and the original one-dimensional information exchange is replaced by full-dimensional information exchange among nectar sources, so that the information exchange volume of the whole swarm is increased. In the phase of bee scouting, the global optimal solution is combined with the PSO algorithm to generate a brand-new nectar source search method, which enhances the ability of the ABC algorithm to develop nectar sources. In the phase of bee scouting, the global optimal solution also guides the scout bees to generate new nectar sources so as to improve the overall quality of the nectar sources. Experiments on two artificial data sets and four representative UCI machine learning data sets verify the clustering performance of the proposed SABC algorithm. In addition, the experimental results are compared with the standard ABC algorithm, several other recently proposed improved ABC algorithms, and classical clustering algorithms. Results clearly show that the improved ABC algorithm described in this paper has better accuracy and stability in solving clustering problems and is a more effective clustering algorithm.

16:30
Context-Aware Pub/Sub Control Method using Reinforcement Learning
PRESENTER: Joohyun Kim

ABSTRACT. Reinforcement learning has been widely applied in real-world settings. Single-agent reinforcement learning is usually applied to typical applications. However, most practical tasks require multiple agents for cooperative control processes. Multi-agent reinforcement learning raises complicated design issues, and numerous design possibilities should be considered for practical usefulness. We propose two reinforcement learning algorithms for a Message Queuing Telemetry Transport (MQTT) protocol system. The two types of implementation improve the message transportation efficiency of MQTT communication: (i) publisher-centric implementation and (ii) subscriber-centric implementation. We focus on different message priorities in a dynamic environment. The proposed algorithms improve communication efficiency by adjusting the loop cycle time of the broker and learning the importance of the messages in the system.

16:40
Research on Prediction and Analysis of Score Based on BP neural Network and Apriori Algorithm
PRESENTER: Yan Cheng

ABSTRACT. Student score prediction has always been a research hotspot in the field of educational data mining. Traditional forecasting methods use only the student's previous scores as a reference, do not take into account students' daily behavioral information, and cannot achieve the expected accuracy. Using data from the campus card management system, the paper first applies the Apriori algorithm to study the relationship between students' breakfast habits, the number of library visits, and their scores, and finds a strong correlation between student behavior information and graduation scores. A BP neural network is then used to predict the student's graduation score. Experiments show that predictions that use both students' previous scores and daily behavioral information as influencing factors are better than those that use only past achievements.
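
As a rough illustration of how Apriori can surface an association between behavior and scores, the sketch below mines frequent itemsets level by level and computes the confidence of one rule. The five toy records, the item names, and the 0.5 support threshold are invented for illustration; they are not data or settings from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: only extend itemsets whose subsets are all frequent."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent, k = {}, 1
    while current:
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: build k-item candidates from pairs of frequent (k-1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


# Hypothetical student records (one set of observed attributes per student).
records = [
    {"breakfast", "library", "high_score"},
    {"breakfast", "library", "high_score"},
    {"breakfast", "high_score"},
    {"library"},
    {"breakfast"},
]
freq = frequent_itemsets(records, min_support=0.5)
rule_confidence = (support(frozenset({"breakfast", "high_score"}), records)
                   / support(frozenset({"breakfast"}), records))
print(sorted(tuple(sorted(s)) for s in freq), rule_confidence)
```

In this toy data the rule "breakfast implies high_score" has a confidence of about 0.75, which is the kind of association strength the abstract refers to.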

16:50
Composite Nonlinear Multiset Canonical Correlation Analysis for Multiview Feature Learning and Recognition
PRESENTER: Yun-Hao Yuan

ABSTRACT. In this paper, we propose a composite nonlinear multiset canonical correlation projections framework where the original data from any view are transformed into multiple higher dimensional feature spaces by several implicitly nonlinear mappings determined by different kernels. With this framework, we further present an example algorithm called multi-kernel multiset canonical correlation or mKMCC, which can not only be viewed as a multi-kernel extension of multiset canonical correlation analysis, but also an extension of kernel multiset canonical correlations. In the proposed mKMCC, different weights are introduced into diverse kernels in all views and an alternating iterative optimization is designed for computational solution. A series of experimental results on practical datasets have demonstrated the effectiveness and robustness of our mKMCC, in contrast with existing kernel methods.

17:00
Acoustic Emission Source Localization Method on High-speed Train Bogie
PRESENTER: Lixia Huang

ABSTRACT. The bogie is one of the most critical parts of a high-speed train. It is directly related to the operation quality and safety of the train. However, there is currently no dynamic nondestructive testing method for real-time monitoring. For dynamic testing of high-speed train bogies, this paper puts forward a new damage localization method. Using acoustic emission testing technology, a time reversal localization method for the vulnerable welded parts of the bogie has been studied. First, a structural model of the bogie welded parts was established with finite element software; then an acoustic emission damage signal was sent from the model and the acoustic emission source signal was received by the preset acoustic emission sensor; finally, accurate localization of the damage was achieved by processing the received signal according to the time reversal focusing principle and imaging the acoustic emission source. The simulation results show that this method can locate damage accurately.

17:10
IP Blacklisting via Threat Modeling and Machine Learning from Security Logs
PRESENTER: Byungchul Tak

ABSTRACT. Blacklisting of IP addresses is one of the widely used techniques for safeguarding mission-critical IT systems. However, most of the security monitoring for blacklisting is, in large part, done by human agents. Although there are efforts to apply machine-learning techniques to automate the process, we are yet to see the successful application of such techniques in this problem domain. To investigate this problem, we have developed ML models based on a combination of linear regression techniques and studied their effectiveness. We propose a multi-staged method that combines data cleansing with ridge regression and classification by logistic regression. Our study using real-world data shows that it can reduce omissions and incorrect blacklisting by more than 90% compared to the results of human agents.

17:20
Multi-modal aerial sensing and scene understanding for search-and-rescue tasks
PRESENTER: Csaba Beleznai

ABSTRACT. Modern unmanned aerial vehicles (UAVs) with on-board image-based sensory setups and data analysis open up new possibilities in the aerial surveillance domain. Search-and-rescue tasks in particular, where a rapid assessment of critical situations involving humans, vehicles and their context is needed, might benefit from such automated platforms. However, automated analysis typically faces several difficulties. Accurate and real-time visual object recognition still represents a scientific challenge. In the context of UAVs, motion blur, the non-specific top-view appearance of ground objects, low image resolution and limited on-board computational resources are among the most important limiting factors to be considered. Furthermore, lighting and atmospheric conditions also impose strong visibility constraints. To cope with the above complexities and to achieve situation-aware recognition of situations embedded into a context, we propose a run-time-efficient multi-modal analysis framework using information from thermal infrared, passive stereo depth and intensity images. The proposed aerial sensing and analysis system is validated in qualitative and quantitative terms in a set of experiments representing complex scenarios (small objects, clutter, occlusions). The results indicate that the analysis of complementary sensing modalities yields high recognition accuracy and allows for a timely extraction of information, easing the first responders' task.

17:30
Estimating Execution Time of Computational Science and Engineering Simulations via Machine Learning Techniques
PRESENTER: Seounghyeon Kim

ABSTRACT. EDISON is a web-based computational science and engineering simulation platform widely used in Korea. EDISON allows users to readily conduct their high-performance computing simulations online. Depending on the input parameters entered into simulations, however, it sometimes takes an exceedingly long time to complete the simulations on EDISON. Such a huge execution cost raises an inefficiency issue in simulation. To alleviate such uncertainty, in this paper we propose a novel time estimation method via machine learning techniques. We train our models on a large amount of simulation provenance data and use these models to predict the execution time for specified input parameters. In our experiments, the proposed models achieved about 73% accuracy in time estimation across diverse simulation programs. Finally, our trained models are expected to help users avoid long wait times due to incorrect input parameters and design an optimal schedule to conduct their simulations more efficiently.

17:40
iCNN: A Convolutional Neural Network for Fractional Interpolation in Video Coding
PRESENTER: Chi Do-Kim Pham

ABSTRACT. Motion compensated prediction has significantly contributed to the removal of temporal redundancy in video coding by predicting the current frame from the list of previously reconstructed frames. The later video coding standard HEVC uses DCTIF to interpolate fractional pixels for more accurate motion compensated prediction. Although the fixed interpolation filters have been improved, they are not able to adapt to the diversity of video content. Inspired by super-resolution, we design the interpolation Convolutional Neural Network (iCNN) for fractional interpolation in video coding. Our work also solves two main problems in applying convolutional neural networks to fractional interpolation in video coding: there is no training set for fractional interpolation, and integer pixels change after processing. As a result, this work achieves a 2.6% BD-rate reduction compared to the baseline HEVC.

15:40-18:00 Session 5B: Robotics
Location: Room 348
15:40
Predicting Aesthetic Radar Maps using Richer Convolutional Multitasking Networks
PRESENTER: Xinghui Zhou

ABSTRACT. Giving a comprehensive aesthetic quality assessment of an image is a challenging task in computer vision because of its rich subjective semantic information. Recent research uses deep convolutional neural networks to evaluate the overall score of an image. However, the characteristics considered in aesthetics are often limited to the total score of the image rather than an exhaustive set of attributes. In this paper, a neural network with richer convolutional features is used to obtain more aesthetic characteristics than before. The resulting multi-attribute rating is called the Aesthetic Radar Map. Unlike previous work, we propose a multi-task convolutional neural network with more convolutional features. With more branches in the network, better scoring performance can be obtained for some attributes, and making the network deeper improves the effect of using stages. Through this method, more complete aesthetic information about the image can be obtained, which is of guiding significance for the comprehensive evaluation of image aesthetics.

15:50
Research on Cloud Robot Computation offloading Algorithm Based on Improved Game Theory
PRESENTER: Fei Xu

ABSTRACT. In the cloud robot computation offloading process, how to use edge cloud resources more reasonably, reduce the energy consumption of the machine equipment and ensure the shortest task completion time is a huge challenge. This paper transforms the computation offloading problem of multiple heterogeneous cloud robots into a game, and performs task segmentation on computationally intensive tasks, which makes partial offloading of tasks possible. Multiple edge cloud resources not only reduce the network load on the central cloud, but also reduce the transmission delay during offloading. By designing an improved distributed game theory algorithm, the offloading strategy is dynamically updated until the Nash equilibrium state is reached, and an optimal offloading strategy is obtained. The simulation results show that the improved distributed game offloading algorithm proposed in this paper can reduce the energy consumption of local computing and reduce the average completion time, which greatly improves the service quality of the edge cloud.

16:00
Road Boundaries Detection based on Modified Occupancy Grid Map Using Millimeter-wave Radar
PRESENTER: Fenglei Xu

ABSTRACT. Road region detection is a hot research topic in the autonomous driving field. It must balance accuracy, efficiency, and cost. We therefore choose millimeter-wave (MMW) radar for the road detection task and put forward a novel MMW-based method that meets real-time requirements. In this paper, a dynamic and static obstacle distinction step is first conducted to estimate the interference of dynamic obstacles with boundary detection. Then, we generate an occupancy grid map using modified Bayesian prediction to construct a 2D driving environment model based on static obstacles, while a clustering procedure is carried out to describe dynamic obstacles. Next, a Modified Random Sample Consensus (Modified RANSAC) algorithm is presented to estimate candidate road boundaries from static obstacle maps. Results of our experiments are presented and discussed at the end. Note that all experiments in this paper are run in real time on an experimental UGV (unmanned ground vehicle) platform equipped with a Continental ARS 408-21 radar.
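
The boundary estimation step can be pictured with a plain RANSAC line fit. The sketch below is the textbook algorithm, not the Modified RANSAC of the paper, and the point coordinates, tolerance, and iteration count are assumptions made up for illustration.

```python
import math
import random


def ransac_line(points, n_iters=200, tol=0.2, seed=0):
    """Fit a line a*x + b*y + c = 0 (unit normal) that has the most inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2          # normal of the line through both samples
        norm = math.hypot(a, b)
        if norm == 0:
            continue                     # degenerate sample (identical points)
        c = -(a * x1 + b * y1)
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= tol]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a / norm, b / norm, c / norm), inliers
    return best_line, best_inliers


# Hypothetical static-obstacle cells: a straight boundary at x = 2 plus clutter.
boundary = [(2.0, float(y)) for y in range(20)]
clutter = [(5.0, 1.0), (-3.0, 7.0), (8.0, 12.0), (0.0, 15.5), (6.5, 3.3)]
line, inliers = ransac_line(boundary + clutter)
print(line, len(inliers))
```

With this toy input the fitted line is the vertical boundary and the clutter points are rejected as outliers.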

16:10
A new region proposal network for far-infrared pedestrian detection
PRESENTER: Zhiwei Cao

ABSTRACT. An automatic region proposal network (ARPN) is proposed to generate bounding boxes with confidence scores for far-infrared (FIR) pedestrian detection. The model consists of two parts: first, the bounding boxes are predicted by the L2 loss function and a module designed based on a convolutional neural network. This module is simple and only has two layers, each with a 1×1 kernel. Second, a score map is obtained through FIR pedestrian segmentation based on a feature pyramid network. The scores are taken as the confidence levels for the predicted bounding boxes. To obtain the scores, a new labelling method is also introduced in this paper for FIR image segmentation. We can obtain the bounding boxes per pixel directly and efficiently without any manually designed hyper parameters related to the anchor boxes. To validate the model, this paper uses the LSI, CVC09, CVC14 and SCUT FIR pedestrian detection datasets in the experiments. The datasets consist of different sizes of FIR images collected from several different cameras. The datasets contain different outdoor urban scenes collected at different times, from day to night. The recall vs number of proposals, average recall and recall at IoU 0.5 are used to evaluate the proposed method and log-average miss rate is used to evaluate final detection results. Compared with other algorithms, experiments on most of the data sets also show better performance.
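
For readers unfamiliar with the evaluation metric, recall at IoU 0.5 can be computed as in the following minimal sketch. Boxes are assumed to be in (x1, y1, x2, y2) corner format, and the sample boxes are invented for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(ground_truth, proposals, threshold=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    hits = sum(any(iou(g, p) >= threshold for p in proposals) for g in ground_truth)
    return hits / len(ground_truth)


# Hypothetical pedestrians and proposals.
gt = [(0, 0, 2, 2), (10, 10, 12, 12)]
props = [(0, 0, 2, 3), (20, 20, 21, 21)]
print(iou(gt[0], props[0]), recall_at_iou(gt, props))
```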

16:20
Water Hazard Detection Using Conditional Generative Adversarial Network with Reflection Attention Units
PRESENTER: Li Wang

ABSTRACT. Water hazard detection is a challenging problem in the field of autonomous driving. It is also quite important, as water hazards such as puddles may conceal risks that make self-driving cars unsafe. As deep learning achieves remarkable performance on such image segmentation tasks, we use a Conditional Generative Adversarial Network (cGAN) to deal with the task. To improve the performance of cGAN, we take advantage of the Reflection Attention Unit (RAU) and create our new method, cGAN-RAU. We evaluate our method on the ‘Puddle-1000’ dataset. We found many annotation mistakes in the dataset and corrected them through re-annotation. We compare our method with FCN-8s-FL-5RAU, the state of the art. Both cGAN and cGAN-RAU outperform FCN-8s-FL-5RAU: cGAN achieves the best performance on the ‘Off Road’ subset and cGAN-RAU performs best on the ‘On Road’ subset.

16:30
Improved Sliding Mode Controller Design for Remotely Operated Vehicle with Fixed-time Disturbance Observer
PRESENTER: Ling Liu

ABSTRACT. Remotely operated vehicles (ROVs) are an important tool for underwater operations in complex marine environments. A key requirement for an ROV is the ability to sustain operation at high ambient pressures. In this paper, a novel fixed-time disturbance observer and an improved sliding-mode controller (IPSMC) with an adaptive fractional power times saturation function (AFPTSF) are described for trajectory tracking control of the permanent magnet synchronous motor (PMSM) of an ROV with an unknown dynamic model and unmeasured states, addressing the problem that the PMSM is susceptible to parameter variations and external disturbances. Exact disturbance estimation can be achieved by this controller independent of initial error. In addition, the AFPTSF is developed to realize coordinated control between chattering and tracking accuracy. Simulation results demonstrate the effectiveness of the proposed control scheme.

16:40
Spatiotemporal Disambiguation for EEG-based Classification
PRESENTER: Fang Wang

ABSTRACT. The spatiotemporal differences in EEG data cause EEG-based classification models to generalize poorly across different individuals or even different sessions. This paper proposes an EEG-based classification framework based on spatiotemporal disambiguation. Dimensionality reduction based on brain asymmetry is first used to obtain the effective asymmetry features. Then, a feature selection algorithm is used to select the spatially representative channels. Next, we use a time warping algorithm to align each signal with the nearest spatially representative channel. The experimental results show that the proposed disambiguation method not only effectively improves the performance of EEG-based visual classification, but also accelerates the convergence of the model to a certain extent.
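
The alignment step can be illustrated with classic dynamic time warping (DTW). The abstract only says "time warping algorithm", so treating it as standard DTW is an assumption, and the two short signals below are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b holds
                                 cost[i][j - 1],      # b advances, a holds
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical channel signals: the second is the first with a delayed onset.
reference = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]
print(dtw_distance(reference, delayed))
```

Because warping absorbs the delay, the two toy signals align at zero cost even though they differ sample by sample.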

16:50
User Preference-aware Trailer Generation via Deep Reinforcement Learning
PRESENTER: Han Wang

ABSTRACT. Automatically generating a real trailer for a video with regard to user preference is a challenging problem in multimedia content analysis. To make trailers attractive to different audiences, individualized trailers must be generated for different audience tastes, which is not trivial. In this paper, we describe a reinforcement learning-based framework that generates trailers in an end-to-end manner. Under this framework, a novel reward function that accounts for the relevance of user preference to candidate video snippets is introduced. This work makes the first attempt to create a personalized trailer with no dependence on manual labeling or user interaction at all. The effectiveness of our trailers is confirmed with quantitative and qualitative experiments on two challenging TV shows, i.e. a TV series, “The Big Bang Theory”, and a full-length feature video, “The 90th Annual Academy Awards”.

17:00
Experimental Study on Tomographic Imaging Algorithm with SAGA of Concrete Pile Foundation
PRESENTER: Qiufeng Li

ABSTRACT. The concrete pile foundation is the main load-carrying and aseismic structure for many large-scale building structures in the field of civil engineering. Its structural quality has a great impact on the safety of those building structures. However, in current ultrasonic computerized tomography (CT) imaging tests of concrete structures, the imaging results still fall short of practical demands. In view of these difficulties, a combined optimization tomography imaging method is proposed here. First, a quadric broadening objective function with clear physical meaning is established according to the characteristics of ultrasonic propagation in concrete. Then a new CT imaging method for concrete pile foundations is formed by combining the fast adaptive optimization search ability of GA with the global search control ability of SAGA, and a SAGA imaging system for concrete pile foundations is developed. Finally, the imaging system is verified by experiment: compared with GA alone, SAGA requires fewer iterations, computes faster, and produces more accurate imaging results.

17:10
Big Image Segmentation Based on Bilateral Grid
PRESENTER: Weiguo Du

ABSTRACT. Because high-resolution image segmentation suffers from low efficiency and inaccurate results, a novel image segmentation approach based on the bilateral grid is proposed. First, the bilateral grid is constructed according to the spatial sampling ratio and color sampling ratio, which are determined by image size. Then the image data is splatted into the bilateral grid, sample data lying between grid vertices is replaced by the nearest vertex data, and all grid vertices are assigned labels via GraphCut. Finally, the grid data is reconstructed into image space by interpolation. Experimental results show that the proposed algorithm effectively improves the efficiency of high-resolution image segmentation and achieves better segmentation results than traditional methods.

17:20
A UAV Patrol System using Panoramic Stitching and Object Detection
PRESENTER: Tongwei Ren

ABSTRACT. With their increasing popularity, drones are widely used in security and other fields, for example in patrols. There are two key issues in drone patrol: 1) showing the patrolling scene in a simple and intuitive manner; 2) marking the objects of interest. To this end, we propose a UAV patrol system based on panoramic stitching and object detection. Specifically, the system uses the SPHP algorithm combined with a region growing algorithm based on the difference image to generate a panorama and eliminate motion ghosting, and adopts the popular image object detector Faster RCNN to detect objects while using knowledge about the scene categories to refine the classification scores of the objects. We evaluate the system on the VisDrone video detection dataset. The experimental results show that the system performs well on panoramic stitching and object detection for drone videos.

17:30
Fuzzy Neural Network Sliding Mode Control for Permanent Magnet Synchronous Motor System Using Novel Sliding Mode Reaching Law
PRESENTER: Ling Liu

ABSTRACT. In order to optimize the speed-control performance of the permanent magnet synchronous motor (PMSM) system, a nonlinear speed-control algorithm using a fuzzy neural network sliding mode control strategy is developed to suppress chattering in this paper. Also, a fuzzy neural network switching gain controller is proposed which combines the advantages of both fuzzy control and neural network control. This strategy can dynamically adapt to variations of the controlled system, which allows chattering reduction on the control input while maintaining high tracking performance of the controller. Furthermore, a sliding mode control method based on a novel sliding mode reaching law with an adaptive fractional power times saturation function is presented to replace the conventional sign function, so as to overcome the discontinuity of the sign function. Subsequently, the Lyapunov theorem ensures the stability of the closed-loop system. Simulation and experimental results are provided to demonstrate the effectiveness of the proposed methods.

17:40
Minor Motion Magnification Based on Multi-Spectrum Analysis
PRESENTER: Fei Zhou

ABSTRACT. Magnifying subtle signals in video reveals subtle changes in the world that are significant and informative. In this paper, an Eulerian motion magnification technique based on BPD (BackProjection of the Derivative) Riesz pyramid space decomposition is proposed. At the same time, a new pyramid structure is adopted in order to obtain better inverse pyramid transform results. This algorithm does not involve the calculation of optical flow, supports a larger amplification factor, and is significantly more resistant to noise than the previous Eulerian video amplification method. This paper demonstrates the advantages of this method in generating natural video sequences, and discusses its application in scientific analysis, visualization and video enhancement.

17:50
Optimal Search Path Planning for Unmanned Surface Vehicle Based on an Improved Genetic Algorithm
PRESENTER: Hui Guo

ABSTRACT. A planning model that simultaneously optimizes the direction and speed of an unmanned surface vehicle (USV) is established for searching for submarines. The USV detection model is derived from the underwater sonar search principle. An improved genetic algorithm is employed to maximize the cumulative detection probability (CDP); it uses three control factors to adaptively control the direction and amplitude of mutation and improve the convergence speed. In the simulation, the direction of the escaping target is assumed to be unknown, and many reasonable and efficient search paths are obtained. Analysis of the evolutionary curve shows that the proposed algorithm offers strong stability and fast convergence and is suitable for the USV search problem.
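
The abstract does not define how the cumulative detection probability is computed. A common formulation in search theory, used here purely as an assumption, treats each sonar look as an independent glimpse, so the CDP is one minus the probability of missing on every look. The per-look probabilities below are invented for illustration.

```python
def cumulative_detection_probability(glimpse_probs):
    """CDP under the assumption of independent looks: 1 - prod(1 - p_k)."""
    miss = 1.0
    for p in glimpse_probs:
        miss *= 1.0 - p   # probability that every look so far has missed
    return 1.0 - miss


# Hypothetical per-look detection probabilities along one candidate search path.
path_looks = [0.1, 0.2, 0.3]
print(cumulative_detection_probability(path_looks))
```

A genetic algorithm of the kind described could use such a value as the fitness of a candidate path, although the fitness function actually used in the paper may differ.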