ISAIR2018: THE 3RD INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND ROBOTICS 2018
PROGRAM FOR MONDAY, NOVEMBER 19TH

08:30-10:30 Session 6: Keynote Speech
Chair:
Location: Room 111
08:30
Software Development Paradigms and Practices with Computing Environment

ABSTRACT. In software engineering and data analytics, Computational Intelligence (CI) paradigms have been adopted as intelligent decision support systems for prediction and optimization in a variety of applications. Traditional data analysis approaches lack efficiency, have limited computational capability, and handle unstructured data inadequately and imprecisely. In contrast, CI methodologies offer high computational efficiency for integrating, exploring and sharing high volumes of unstructured data in real time, using diverse analytical techniques for enhanced decision making. Further, CI can handle complex data via sophisticated mathematical models and analytical techniques. This keynote address gives a short overview of CI approaches and their noteworthy role in software engineering, the Internet of Things and data analytics. The focus of the talk is to study and analyse the effect of CI on the overall advancement of emerging intelligent computing environments.

09:00
Deep Prototype Learning for Robust Pattern Recognition

ABSTRACT. Existing pattern classification studies mostly concern generalized classification accuracy but ignore rejection and robustness in the open world. In recent years, deep learning methods have achieved huge successes in pattern recognition, but popular deep neural networks show inferior generalization when trained with small samples, and inferior robustness to noise and outliers. In this talk, I first explain the robustness of pattern recognition and introduce some methods for improving robustness from the viewpoint of rejection. The rejection methods fall into two categories, ambiguity rejection and outlier rejection, which are based on different models and learning methods. I will give the formulations of the two rejection modes and introduce some methods. Last, I will introduce a newly proposed deep learning method for robust pattern classification: deep convolutional prototype learning (CPL). CPL uses a prototype classifier for classification, which is inherently robust to outliers. Combined with feature learning by a convolutional neural network (CNN), CPL yields high classification accuracy. Through regularization based on maximum likelihood (ML), the generalization performance on small samples and the robustness can be further improved. The CPL model also shows potential in domain adaptation, online learning, novel class discovery, and so on.
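
As a rough illustration of why a prototype classifier lends itself to outlier rejection, the sketch below assigns a sample to its nearest class prototype and rejects it when every prototype is too far away. This is not the CPL model from the talk: the Euclidean distance, the single prototype per class, the fixed `reject_distance` threshold and the function name are all assumptions made for the example, and no feature learning is involved.

```python
import math

def classify_with_rejection(x, prototypes, reject_distance):
    """Nearest-prototype classification with a simple outlier test.

    x               -- feature vector (sequence of floats)
    prototypes      -- dict mapping class label -> prototype vector (hypothetical layout)
    reject_distance -- assumed threshold; samples farther than this from every
                       prototype are rejected as outliers (None is returned)
    """
    best_label, best_dist = None, math.inf
    for label, proto in prototypes.items():
        dist = math.dist(x, proto)  # Euclidean distance to this prototype
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= reject_distance else None
```

A sample near a prototype receives that prototype's label, while a sample far from all prototypes yields `None`, which stands in for the outlier rejection described in the abstract.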

09:30
Improve Health System with Intelligence

ABSTRACT. Recent advances in artificial intelligence and machine learning have demonstrated that deep learning technologies have superiority in solving various medical analysis tasks involving CT, X-ray and EHR. In this talk, Zongyuan will go through his research in general medical AI analytics during his career at IBM, NVIDIA research and Monash University. Moreover, he will cover the topics of his future research in general computer vision and robotics. 

10:00
Advanced Control Methods in Medical and Welfare Applications

ABSTRACT. In recent years, Artificial Intelligence (AI) technologies have been applied rapidly to various kinds of applications. For the development of control technology, intelligent algorithms bring many excellent features that expand the applications of control methods. Meanwhile, along with improvements in the quality of life, the demand for devices with advanced control methods combined with AI in the medical and welfare fields is growing very quickly. Novel applications of advanced control methods in the medical and welfare fields, which are expected to play a crucial role in connecting robotics, mechatronics and informatics technologies, are considered a promising development direction for the future. In this speech, some applications of advanced control methods using intelligent technologies will be introduced. The methods will be examined through a discussion of their advantages, limitations, and development. Our research on intelligent control methods for medical and welfare applications using neural networks will be introduced together with simulation and experimental results. Issues and challenges of intelligent control methods for medical and welfare applications will be presented for discussion.

10:30-10:40 Coffee Break
10:40-12:00 Session 7: Awards Nomination Session
Chair:
Location: Room 111
10:40
Local Binary Pattern Metric-based Multi-Focus Image Fusion
SPEAKER: Wenda Zhao

ABSTRACT. Multi-focus image fusion integrates partially focused images into a single image that is in focus everywhere. It has become an important research topic due to its applications in more and more scientific fields. However, preserving more information in the low-contrast parts of the focused area and maintaining edge information are two challenges for existing approaches. In this paper, we address these two challenges by presenting a simple yet efficient multi-focus fusion method based on the local binary pattern (LBP). In our algorithm, we measure clarity using the LBP metric and construct the initial weight map. We then use the connected area judgment strategy (CAJS) to reduce the noise in the initial map. Afterwards, the two source images are fused together by weighted arranging. The experimental results validate that the proposed algorithm outperforms state-of-the-art image fusion algorithms in both qualitative and quantitative evaluations, especially when dealing with low-contrast regions and edge information.
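
For readers unfamiliar with the LBP operator that the clarity measure builds on, here is a minimal sketch of the basic 8-neighbour LBP code. It is only the textbook operator: the abstract does not say how the authors turn LBP codes into a clarity metric or a weight map, so none of that is attempted here. The neighbour ordering (clockwise from the top-left) and the function name are assumptions.

```python
# Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
# The ordering is a convention chosen for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp8(img):
    """Return the 8-bit LBP code of every interior pixel of a 2D grey image.

    img is a list of equal-length rows; the result has two fewer rows and
    columns because border pixels lack a full neighbourhood.
    """
    h, w = len(img), len(img[0])
    codes = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the center.
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes
```

In a perfectly flat patch every neighbour equals the center, so every bit is set and all codes are 255; textured or edge regions produce a mix of codes.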

10:50
Human Detection in Crowded Situations by Combining Stereo Depth and Deeply-Learned Models

ABSTRACT. Human detection in crowded situations represents a challenging task in many practically relevant scenarios. In this paper we propose a passive stereo depth based human detection scheme employing a hierarchically-structured tree of learned shape templates for delineating clusters corresponding to humans. In order to enhance the specificity of the depth-based detection approach towards humans, we also incorporate a visual object recognition modality in form of a deeply-trained model. We propose a simple way to combine the depth and appearance modalities to better cope with complex effects such as heavily occluded and small-sized humans, and clutter. Obtained results are analyzed in terms of improvements and shortcomings introduced by the individual detection modalities. Our proposed combination achieves a good accuracy at a decent computational speed in difficult scenarios exhibiting crowded situations. Hence in our view, the presented concepts represent a detection scheme of practical relevance.

11:00
An Efficient Aggregation Method of Convolutional Features for Image Retrieval
SPEAKER: Xinsheng Wang

ABSTRACT. Deep features extracted from the convolutional layers of pre-trained CNNs have been widely used in image retrieval tasks. These features, however, are usually very large and not suitable for direct use in similarity evaluation. Therefore, the selection of discriminative features and an appropriate aggregation method play important roles in generating the final representation. This paper introduces a simple but effective method to select the region of interest on features based on the semantic content of feature maps. In addition, an effective channel weighting method for aggregated features is proposed, and an unsupervised semantic-based aggregation (SBA) method is improved. Based on these strategies, the final aggregated representation is obtained, which achieves retrieval results comparable with the state of the art on image retrieval benchmarks.

11:10
Feature-Based tracking via SURF detector and BRISK descriptor

ABSTRACT. Marker-less tracking has become vital for a variety of vision-based tasks because it tracks natural regions of the object rather than using fiducial markers. In marker-less feature-based tracking, salient regions in the images are identified by the detector, and information about these regions is extracted and stored by the descriptor for matching. Speeded-Up Robust Features (SURF) is considered the most robust detector and descriptor so far. SURF detects feature points that are unique and repeatable. It uses integral images, which provide a basis for its low computational expense. However, descriptor generation and matching for SURF are time-consuming tasks. Binary Robust Invariant Scalable Keypoints (BRISK) is a scale- and rotation-invariant binary descriptor. It reduces the computational cost due to its binary nature. This paper presents a marker-less tracking system that tracks natural features of the object in real time and is very economical in terms of computation. The proposed system is based on the SURF detector, as it identifies highly repeatable interest points in the object, and the BRISK descriptor, due to its low computational cost and its invariance to scale and rotation, which is vital for every visual tracking system.

11:20
Performance Modeling of Spark Computing Platform
SPEAKER: Yunyue Xie

ABSTRACT. Big Data has been widely used in all aspects of society. To solve the problem of storing and analyzing massive data, many big data solutions have been proposed. Spark is a newer general-purpose parallel framework similar to Hadoop MapReduce. Compared with Hadoop, Spark's performance is significantly higher. Because Spark is a data analysis framework, researchers are particularly concerned about its performance. In this paper, we use a stochastic process algebra, PEPA, to model the Spark architecture. This model demonstrates the usability of the compositional approach in modeling and analyzing the Spark architecture. This research also yields an algorithm that generates the service flow of the PEPA model. Finally, we state the benefits of this compositional method for modeling a large parallel system.

11:30
An Improved Unsupervised Band Selection of Hyperspectral Images Based on Sparse Representation

ABSTRACT. Rich band information provides favorable conditions for a tremendous range of applications, but it also brings problems such as the curse of dimensionality. Band selection, an effective method for reducing the spectral dimension, is a popular topic in hyperspectral remote sensing. Motivated by previous sparse representation methods, we present a novel framework for band selection based on multi-dictionary sparse representation (MDSR). By obtaining the sparse solutions for each band vector and the corresponding dictionary, the contribution of each band to the raw image is derived. The appropriate band subset is then selected according to these contributions. Five state-of-the-art band selection methods are compared with MDSR on three widely used hyperspectral datasets. Experimental results show that MDSR achieves marginally better performance in hyperspectral image classification, and better performance in average correlation coefficient and computational time.

11:40
Photo Aesthetic Scoring through Spatial Aggregation Perception DCNN
SPEAKER: Xin Jin

ABSTRACT. Aesthetic quality assessment of images is a challenging task in computer vision. Recent research has used deep convolutional neural networks to evaluate the aesthetic quality of images. However, the scores in existing image data sets follow a strongly normal distribution, which makes neural network training prone to over-fitting. In addition, traditional deep learning methods usually pre-process images, which destroys the original aesthetic features of the picture, so that the network can only learn some superficial aesthetic features. This paper presents a new data set, Images Distributed Evenly for Aesthetics (IDEA). This data set has fewer statistical characteristics, which helps the neural network learn deeper features. We propose a new spatial aggregation perception neural network architecture which can control channel weights automatically. The advantages and effectiveness of our method are demonstrated by experiments on different data sets.

11:50
Depth Map Prediction from a Single Image with Generative Adversarial Nets
SPEAKER: Zhibin Yu

ABSTRACT. The depth map is a fundamental component of 3D construction. Depth map prediction from a single image is a challenging task in computer vision. In this paper, we treat depth prediction as an image-to-image task and propose an adversarial convolutional architecture called Depth Generative Adversarial Network (DepthGAN) for depth prediction. To enhance the image translation ability, we take advantage of a Fully Convolutional Residual Network (FCRN) and combine it with a generative adversarial network, a class of models that has shown remarkable achievements in image-to-image tasks. We also present a new loss function that includes the scale-invariant (SI) error and the structural similarity (SSIM) loss in order to output high-quality depth maps. Experiments show that DepthGAN performs better in monocular depth prediction than the current best method on the NYU Depth v2 dataset.
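
To make the scale-invariant (SI) error concrete, the sketch below uses the widely cited log-space definition, D = (1/n) Σ d_i² − (λ/n²)(Σ d_i)² with d_i = log(pred_i) − log(target_i). Whether the authors use exactly this form, and with which λ, is an assumption; the abstract does not say. The function name and the flat-list input format are likewise illustrative.

```python
import math

def scale_invariant_error(pred, target, lam=1.0):
    """Scale-invariant log error between two sequences of positive depths.

    With lam = 1.0 the error is unchanged when every predicted depth is
    multiplied by the same positive constant; with lam = 0.0 it reduces to
    the mean squared log difference. The choice of lam is an assumption.
    """
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(d)
    return sum(v * v for v in d) / n - lam * sum(d) ** 2 / n ** 2
```

With `lam=1.0`, a prediction that is a uniformly scaled copy of the ground truth scores essentially zero (up to floating-point rounding), which is the property that gives the error its name.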

12:00-13:30 Lunch Break (Invitation ONLY)

Nanjing University International Conference Center

13:30-15:00 Session 8: Spotlight-Computer Vision
Chair:
Location: Room 111
13:30
Small Object Tracking in High Density Crowd Scenes
SPEAKER: Yujie Li

ABSTRACT. Computational approaches for analyzing collective behavior in social insects increasingly rely on motion paths as an intermediate layer from which one can infer individual behaviors or social interactions. Honey bees are a popular model for future social interactions. In this paper we present a detection method based on an improved three-frame difference method and the VIBE algorithm, and a tracking method based on unscented Kalman filtering, for tracking honey bees in high-density crowd scenes. We evaluate the performance of the proposed methods on datasets containing videos of crowded honey bee colonies. The experimental results show that the proposed methods perform well in detection and tracking.
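
The detection step starts from three-frame differencing, which can be sketched in its basic form as follows. The abstract refers to an improved variant combined with the VIBE background model; neither the improvement nor VIBE is reproduced here. The threshold value, the list-of-rows frame format and the function name are assumptions for illustration.

```python
def three_frame_difference(prev, curr, nxt, threshold):
    """Basic three-frame differencing on grey-level frames.

    A pixel is marked as moving (1) only if it differs by more than
    `threshold` from BOTH the previous and the next frame; otherwise 0.
    Frames are lists of equal-length rows of numbers.
    """
    mask = []
    for row_p, row_c, row_n in zip(prev, curr, nxt):
        mask.append([
            1 if abs(c - p) > threshold and abs(n - c) > threshold else 0
            for p, c, n in zip(row_p, row_c, row_n)
        ])
    return mask
```

Requiring a change in both frame pairs suppresses the ghost region that plain two-frame differencing leaves behind at the object's previous position.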

13:35
Chinese Medical Question Answer Matching with Stack-CNN
SPEAKER: Yuteng Zhang

ABSTRACT. Question and answer matching in Chinese medical science is a challenging problem, which requires an effective text semantic representation. In recent years, deep learning has achieved remarkable success in natural language processing, where it is used to capture various semantic features. In this paper, we propose a neural network, stack-CNN, to address question-answer matching; it stacks multiple convolutional neural networks to capture high-level semantic information from low-level n-gram features. Substantial experiments on a real-world dataset show that our proposed model significantly outperforms a variety of strong baselines.

13:40
Precise location and recognition of workpiece based on AdaBoost and template matching algorithm

ABSTRACT. In view of the dynamic random position and angle errors of workpieces on an automatic production line, workpiece position recognition and azimuth adjustment should be completed automatically by the robot. This paper presents a new algorithm for fast and accurate location and recognition of randomly distributed workpieces. First, the AdaBoost algorithm from machine learning is used to locate the workpiece and obtain its region of interest (ROI). Then, an improved template matching algorithm is applied to obtain the center coordinates and rotation angle of the workpiece. An experimental system for workpiece identification under natural illumination was set up, and images of 100 workpieces in different combination states were collected to test and validate the algorithm. The results show that the workpiece can be accurately positioned and identified by the combination of the AdaBoost algorithm and the improved template matching algorithm.

13:45
Two-stage Supervised Codebook Learning for Image Indexing
SPEAKER: Shichao Kan

ABSTRACT. Image indexing plays a significant role in large-scale image retrieval. In the past decades, unsupervised methods have been the most commonly used methods for image indexing. However, unsupervised methods cannot benefit from supervised information. In designing a supervised image indexing algorithm, the main difficulty is that the supervision information of an image must be quantized to a codeword in a codebook that is unknown. In this paper, a two-stage codebook learning method is proposed to realize supervised codebook learning. The first stage relabels the database images according to the designed algorithm, and the second stage is supervised iterative learning based on our newly established constrained optimization objective function. The proposed method is verified on the MNIST and CIFAR-10 data sets. Experimental results demonstrate the effectiveness of the proposed method.

13:50
Group K-SVD for Classification of Gene Expression Data
SPEAKER: Baichuan Bai

ABSTRACT. In this paper, we propose a novel sparse learning model, Group KSVD, for the classification of gene expression data. Group KSVD improves the dictionary representation to remove the redundancy of the over-complete dictionary, and the learned optimal dictionary is more suitable for classification. To solve the optimization target, we improve the traditional orthogonal matching pursuit (OMP) algorithm and propose a novel Group KSVD-MOMP algorithm under the Group KSVD model. Group KSVD-MOMP is advantageous in dealing with small samples in high dimensionality with uncertain noise, and is thus especially suitable for classification of gene expression data. Sufficient experimental results demonstrate the superior classification performance of our proposed algorithm in comparison with state-of-the-art classification algorithms on real-world gene expression data.

13:55
Occluded Face Recognition by Identity-Preserving Inpainting
SPEAKER: Chenyu Li

ABSTRACT. Occluded face recognition, which has attractive applications in the visual analysis field, is challenged by the cues missing due to heavy occlusions. Recently, several face inpainting methods based on generative adversarial networks (GANs) have filled in the occluded parts by generating images that fit the real image distribution. They can produce visually natural results that satisfy human perception. However, these methods fail to capture identity attributes, so the inpainted faces may be recognized with low accuracy by machines. To enable the convergence of human perception and machine perception, this paper proposes Identity Preserving Generative Adversarial Networks (IP-GANs) to jointly inpaint and recognize occluded faces. The IP-GANs consist of an inpainting network for regressing missing facial parts, a global-local discriminative network for guiding the inpainted face to the real conditional distribution, a parsing network for enhancing structure consistency, and an identity network for recovering missing identity cues. In particular, the novel identity network suppresses identity diffusion by constraining the consistency of features, taken from the early subnetwork of a well-trained face recognition network, between the inpainted face and its corresponding ground truth. In this way, it regularizes the inpainter, forcing the generated faces to preserve identity attributes. Experimental results show that the proposed IP-GANs can deal with a variety of occlusions and produce photorealistic and identity-preserving results, improving occluded face recognition performance.

14:00
Transferring Rich Deep Features for Facial Beauty Prediction
SPEAKER: Lu Xu

ABSTRACT. Feature extraction plays a significant part in computer vision tasks. In this paper, we propose a two-stage method, which firstly trains the face verification model with triplet loss on VGG Face dataset to obtain informative facial representation, and then trains Bayesian ridge regression for our facial beauty prediction (FBP) task. Through effective feature fusion strategy, we find that the features learned from a totally different task (face verification) with different dataset may also contain informative representation for facial beauty. Our method achieves state-of-the-art performance on SCUT-FBP dataset and ECCV HotOrNot dataset. Experiments demonstrate the effectiveness of the proposed method.

14:05
Domain Adaptation for Semantic Segmentation with Conditional Random Field

ABSTRACT. Fully-convolutional neural networks (CNNs) for semantic segmentation dramatically improve performance using end-to-end learning on whole images in a supervised manner. The success of CNNs for semantic segmentation depends heavily on the pixel-level ground truth, which is labor-intensive in general. To partially solve this problem, domain-adaptation techniques have been adapted to the two similar tasks for semantic segmentation, one of which is fully-labelled, while the other is unlabelled. Based on the adversarial learning method for domain adaptation in the context of semantic segmentation (AdaptSegNet), this paper proposes to employ the conditional random field (CRF) to refine the output of the segmentation network before domain adaptation. The proposed system fully integrates CRF modelling with CNNs, making it possible to train the whole system end-to-end with the usual back-propagation algorithm. Extensive experiments demonstrate the effectiveness of our framework under various domain adaptation settings, including synthetic-to-real scenarios.

14:10
An Unambiguous Acquisition Algorithm for BOC (n, n) Signal Based on Sub-Correlation Combination
SPEAKER: Xiyan Sun

ABSTRACT. To overcome the acquisition problems caused by the multiple peaks of the auto-correlation function of the Binary Offset Carrier (BOC) modulated signal, a technique for eliminating secondary peaks based on sub-correlation combination is proposed in this paper. According to the characteristics of the sub-functions of the BOC autocorrelation, the new method recombines the sub-correlation functions to gain the ability to eliminate the edge. Monte Carlo simulations show that the proposed method improves sensitivity by 3 dB-Hz in detection probability compared with ASPeCT when the number of non-coherent integrations is 10 for BOCs(1, 1). In addition, it can be applied to BOCc(1, 1) and achieves the same detection probability as the traditional BPSK-like method by appropriately increasing the number of non-coherent integrations.

14:15
A firework algorithm based parameter optimization method for random forest
SPEAKER: Xiao-Hua Tao

ABSTRACT. Although Random Forest (RF) has been used in research projects and real-world applications in diverse domains, how to set the number of trees in an RF and the number of samples used to train each tree still troubles users. To solve this problem, in this paper we use the Firework Algorithm (FWA) to adjust these parameters through its explosion and mutation mechanisms. Experiments on 15 UCI datasets show that the classification accuracy of the improved FWA-RF is 1.55% to 26.67% higher than that of RF without parameter adjustment. Meanwhile, it is equal to or better than Bagging, AdaBoost and Decision Tree. Additionally, through analysis of the experimental results, we found that this method also efficiently limits the search scope of the number of trees in an RF and the number of samples for training a tree for any dataset, making RF easier to understand.

14:20
Analysis of Urban Bicycles' Trip Behavior and Efficiency Optimization
SPEAKER: Haoyu Wen

ABSTRACT. Bicycle sharing systems are becoming more and more prevalent in urban environments. They provide an environmentally friendly transportation alternative for cities. The management of these systems raises many optimization problems. The most important of these concern bicycle rebalancing, the maintenance of shared facilities, and the asymmetric demand patterns created by system use. To address the unbalanced use of bicycles, a series of data mining analyses based on real data sets is developed around these issues. By analyzing the characteristics of each site, the sites are modeled from the perspective of individuals and clusters using different models. Evaluation indicators that measure the accuracy of the results show that this provides an effective method for shared bicycle prediction.

14:25
Pedestrian Attribute Recognition with Occlusion in Low Resolution Surveillance Scenarios
SPEAKER: Yuan Zhang

ABSTRACT. In surveillance scenarios, pedestrian images often suffer from poor resolution or occlusion. These problems make pedestrian attribute recognition more difficult. To address them, we propose an improved pedestrian attribute recognition method based on hand-crafted features. In this method, we use the Patch Match algorithm as a preprocessing step to enhance the pedestrian images. Experiments show that the proposed method performs very well when the pedestrian images suffer from occlusion and that it is robust to low resolution.

14:30
A No-ambiguity Acquisition Algorithm Based on Correlation Shift for BOC (n, n)
SPEAKER: Xiyan Sun

ABSTRACT. In the course of GPS modernization, Binary Offset Carrier (BOC) modulation is adopted to make rational use of frequency and to avoid mutual interference between navigation signal frequency bands. To deal with the multiple peaks of the auto-correlation function (ACF) of the BOC-modulated signal, an acquisition algorithm is proposed in this paper. The sub-correlation functions of the ACF are analyzed, and one sub-signal and a half-chip-shifted sub-signal are designed as local signals. The homologous sub-correlation functions of the ACF are obtained by correlating each of the two local signals with the received signal. The complexity of the algorithm and its detection probability under a constant false alarm rate are analyzed. Simulations show that the proposed method can effectively solve the problem of ambiguous acquisition.

14:35
Adaptive Gradient-based Block Compressive Sensing with Sparsity for Noisy Images

ABSTRACT. This paper develops a novel adaptive gradient-based block compressive sensing (AGbBCS_SP) methodology for noisy image compression and reconstruction. The AGbBCS_SP approach splits an image into blocks by maximizing their sparsity, and reconstructs images by solving a convex optimization problem. The main contribution is to provide an adaptive method for block shape selection, improving noisy image reconstruction performance. Experimental results with different image sets indicate that our AGbBCS_SP method is able to achieve better performance, in terms of peak signal to noise ratio (PSNR) and computational cost, than several classical algorithms.
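
For reference, the peak signal to noise ratio used as a quality measure here is conventionally PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the original and the reconstructed image. The sketch below computes it for small images given as lists of rows; the default peak value of 255 (8-bit images) and the function name are assumptions, and the block says nothing about how AGbBCS_SP itself works.

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Returns math.inf when the images are identical (zero mean squared error).
    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    """
    flat_o = [v for row in original for v in row]
    flat_r = [v for row in reconstructed for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_r)) / len(flat_o)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean a reconstruction closer to the original; an error of one grey level at every pixel of an 8-bit image corresponds to roughly 48 dB.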

14:40
A Novel Active Contour Model Using Oriented Smoothness and Infinite Laplacian for Medical Image Segmentation
SPEAKER: Chunhong Cao

ABSTRACT. The active contour model (ACM) has been widely used in image segmentation. The original ACM, or Snakes, preserves weak edges poorly and has difficulty converging into concavities, especially long and thin indentations. To mitigate these defects, a series of models such as gradient vector flow (GVF) and general gradient vector flow (GGVF) were proposed. To further address these issues, a new edge-preserving ACM using Oriented Smoothness, Infinite Laplacian and component-based normalization is proposed in this paper. Oriented Smoothness and Infinite Laplacian are adopted as the smoothness term in the energy function to improve the model’s weak edge preservation and concave convergence ability. Component-based normalization further accelerates the concave convergence rate. The experimental results show that the proposed method achieves better performance than the other representative methods.

14:45
Distilling Deep Networks with Selective Soft Decision Trees for Robust Image Classification
SPEAKER: Yingying Hua

ABSTRACT. Despite the impressive performance achieved by deep networks, they are often sensitive to adversarial examples, which lead to misclassification when even a slight perturbation is added to the original images. To improve classification robustness against adversarial attacks, this paper proposes a learning approach that distills deep networks via selective soft decision trees, which can transfer the selectively informative knowledge of well-trained deep networks by optimizing the training process and training examples. In this approach, we take the outputs of the softmax layer as the soft targets to train selective soft decision trees with knowledge distillation. Moreover, we adaptively select the optimal outputs of selective soft decision trees and deep neural networks to optimize training examples by amending the loss function. The experimental results demonstrate that the distilled models are more robust against adversarial examples than deep networks.

14:50
Multi-Feature Fusion Compressive Tracking Algorithm Based on Particle Filter

ABSTRACT. Particle filter object tracking algorithms based on a single feature are vulnerable to illumination changes and can suffer from particle scarcity after a number of iterations. To solve these problems, we propose a multi-feature fusion compressive tracking algorithm based on the particle filter. The algorithm integrates compressive sensing into particle filtering. We extract color histogram and texture features to describe the object, and project the high-dimensional features to low-dimensional features with a sparse measurement matrix, which reduces the computational complexity. Then the object model is set up. The particle filter is used to estimate the state of the target and obtain the target position accurately. In addition, to adapt to changes of object and scene, an adaptive updating strategy for the target model is designed that updates the number of particles dynamically. Experiments on publicly available benchmark video sequences show that the proposed algorithm can track the object accurately in complex scenes with partial occlusion and illumination changes, in comparison with other traditional algorithms.

14:55
Pedestrian Object Detection with Fusion of Visual Attention Mechanism and Semantic Computation
SPEAKER: Baotong Liu

ABSTRACT. Because primary visual features alone have difficulty handling pedestrian detection in complex scenes, a pedestrian detection method combining a visual attention mechanism with semantic computation is proposed. First, the saliency of the primary visual features is calculated with the Graph-Based Visual Saliency attention model. Second, the saliency of the skin semantic features is calculated using a regional skin color model, and Haar features are combined with a support vector machine to calculate the semantic features of the head and shoulders. Then, the static visual attention model is established through Laplacian pyramid fusion to obtain a total saliency map. Finally, the focus of attention is selected on the total saliency map through the Deformable Parts Model to complete the pedestrian detection. Experimental results show that the pedestrian detection accuracy reaches 92.78% with the Laplacian fusion strategy on the standard INRIA pedestrian database.

15:00-15:10 Coffee Break
15:10-17:00 Session 9: Spotlight-IoT&Robotics
Chair:
Location: Room 111
15:10
Saliency Detection via Objectness Transferring

ABSTRACT. In this paper, we present a novel framework that incorporates top-down guidance to identify salient objects. The salient regions/objects are predicted by transferring an objectness prior, without requiring a center-bias assumption. The proposed framework consists of two basic steps. In the top-down process, we create a location saliency map (LSM), which is identified by a set of overlapping windows likely to cover salient objects. The corresponding binary segmentation masks of training windows are treated as high-level knowledge to be transferred to the test image windows, which may share visual similarity with training windows. In the bottom-up process, a multi-layer segmentation framework is employed, providing local shape information that is used to delineate accurate object boundaries. By integrating top-down objectness priors and bottom-up image representation, our approach is able to produce an accurate pixel-wise saliency map. Extensive experiments show that our approach achieves state-of-the-art results on the MSRA 1000 dataset.

15:15
A Skin Lesion Segmentation Method Based on Saliency and Adaptive Thresholding in Wavelet Domain
SPEAKER: Kai Hu

ABSTRACT. Segmentation is an essential requirement in automated computer-aided diagnosis (CAD) of skin diseases. In this paper, we propose an unsupervised skin lesion segmentation method to address the difficulties found in dermoscopy images, such as low contrast, indistinct borders, and lesions close to the image boundary. Our method combines enhanced fusion saliency with adaptive thresholding based on the wavelet transform to obtain the lesion regions. Firstly, the saliency map increases the contrast between the skin lesion and healthy skin, and then an adaptive thresholding method based on the wavelet transform is used to obtain more accurate lesion regions. Experiments on dermoscopy images demonstrate the effectiveness of the proposed method over several state-of-the-art methods in terms of quantitative results and visual effects.

15:20
Distortion Correction Method of Zoom Lens Based on Vanishing Point Geometric Constraint
SPEAKER: Zhenmin Zhu

ABSTRACT. To solve the problem that the nonlinear distortion of a zoom lens varies with the focal length, a fast zoom lens correction method with minimum fitting error, based on the principle of vanishing points, is proposed. Firstly, based on the radial distortion model, the equation relating the vanishing point to the radial distortion coefficient is established according to the geometric constraint of the vanishing point. Then, according to the principle of deviation error minimization, the least-squares method is used to fit the straight-line equation of the corrected points. Finally, the variation of the distortion parameters with focal length is analyzed, and a table relating distortion parameter to focal length and a fitted empirical formula are established. The image correction results show that the proposed method can effectively correct the nonlinear distortion of a zoom lens.

15:25
Bidirectional Compressive Sensing for Classification of Gene Expression Data
SPEAKER: Baichuan Fan

ABSTRACT. Classification of gene expression data has received growing interest in recent years. Compressive sensing is an emerging sparse learning algorithm, which is often incorporated into classification algorithms. By representing out-of-sample test data as a sparse linear combination of training data, compressive sensing can facilitate the classification of test data. However, the traditional compressive sensing model only considers the column-wise sparse representation and ignores the relationship among different rows. For gene expression data, it is fundamentally important to take into account the correlation among genes. In this paper, we develop a novel bidirectional compressive sensing model for classification of gene expression data, and solve it with a proposed Bi-ADMM algorithm. Extensive experiments demonstrate that our model outperforms state-of-the-art classification methods on gene expression data.

15:30
Robust multi-user detection based on hybrid Grey wolf optimization
SPEAKER: Yuanfa Ji

ABSTRACT. To address the problem of high bit error rate (BER) in multi-user detection under impulse noise, a novel robust multi-user detection algorithm is proposed based on the directional information of individuals in the population and on the advantages of Grey wolf optimization and the adaptive differential evolution algorithm. The simulation results show that the multi-user detector based on the proposed algorithm needs fewer iterations than detectors based on the genetic algorithm, the differential evolution algorithm and the Grey wolf optimization algorithm, and achieves a lower BER.

15:35
Fast Dynamic Routing Based on Weighted Kernel Density Estimation

ABSTRACT. Capsules, together with dynamic routing between them, are recently proposed structures for deep neural networks. A capsule groups data into vectors or matrices as poses, rather than conventional scalars, to represent specific properties of a target instance. Besides the pose, a capsule should carry a probability (often denoted as activation) for its presence. Dynamic routing helps capsules achieve more generalization capacity with many fewer model parameters. However, the bottleneck that prevents widespread application of capsules is the computational expense of routing. To address this problem, we generalize existing routing methods within the framework of weighted kernel density estimation, and propose two fast routing methods with different optimization strategies. Our methods improve the time efficiency of routing by nearly 40% with negligible performance degradation. By stacking a hybrid of convolutional layers and capsule layers, we construct a network architecture that handles inputs at a resolution of 64 × 64 pixels. The proposed models achieve performance comparable to other leading methods on multiple benchmarks.

15:40
Pedestrian Detection in Unmanned Aerial Vehicle Scene

ABSTRACT. With the increasing adoption of unmanned aerial vehicles (UAVs), pedestrian detection with use of such vehicles has been attracting attention. Object detection algorithms based on deep learning have considerably progressed in recent years, but applying existing research results directly to the UAV perspective is difficult. Therefore, in this study, we present a new dataset called UAVs-Pedestrian, which contains various scenes and angles, for improving test results. To validate our dataset, we use the classical detection algorithms SSD, YOLO, and Faster-RCNN. Findings indicate that our dataset is challenging and conducive to the study of pedestrian detection using UAVs.

15:45
Identification of grape disease using image analysis and BP neural networks

ABSTRACT. The prevention and treatment of diseases is critical for improving grape yield and quality. For timely and effective prevention against insect pests, it is necessary to realize the automatic identification of grape diseases. This paper puts forward an automatic detection method to inspect grape leaf diseases using image analysis and back-propagation (BP) neural networks. The Wiener filtering method based on the wavelet transform (WT) was applied to denoise the disease images. The grape leaf disease regions were segmented by the Otsu method, and morphological algorithms were used to improve the lesion shape. The Prewitt operator was selected to extract the complete edge of the lesion region. Five effective characteristic parameters, namely perimeter, area, circularity, rectangularity and shape complexity, were extracted. The grape leaf disease recognition model based on the BP neural network can efficiently inspect and recognize five grape leaf diseases: leaf spot, sphaceloma ampelinum de Bary, anthracnose, round spot and downy mildew. The results indicate that the proposed grape leaf disease detection system can be used to inspect grape diseases with high classification accuracy.
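
Two of the steps named in this abstract, Otsu segmentation and the circularity parameter, can be sketched as follows. This is a minimal illustration on a flat list of 8-bit grey values, not the authors' pipeline; the function names are hypothetical, and circularity is assumed to be the common definition 4πA/P², which the abstract does not spell out.

```python
import math


def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # number of pixels in the lower class
    sum0 = 0        # sum of grey values in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def circularity(area, perimeter):
    """Assumed definition 4*pi*A / P^2: 1.0 for a circle, lower otherwise."""
    return 4.0 * math.pi * area / (perimeter ** 2)


# Hypothetical bimodal image: dark leaf background and bright lesion pixels.
threshold = otsu_threshold([10] * 50 + [200] * 50)
```

Any threshold between the two grey-level clusters separates them equally well, so the function returns the first such level it encounters.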

15:50
Trustworthy traceability of quality and safety for pig supply chain based on blockchain
SPEAKER: Yan Yuan

ABSTRACT. Pork safety incidents have happened frequently in China even though a traditional traceability system was established, and consumer confidence has declined rapidly as a result. Thus, to explore trustworthy traceability of quality and safety for the pig supply chain, this paper first proposes a framework for traceability of the pig supply chain based on blockchain. In our research, we found that HACCP is suitable for screening key information in every link of the pig supply chain and that GS1 can connect information along the supply chain, which reduces information isolation and increases transparency. However, the information provided by the traditional traceability system cannot guarantee authenticity and credibility because of hidden troubles such as monopoly, corruption, counterfeiting, hacker attacks and so on. For this reason, we verified the validity and credibility of pig traceability information by deploying a smart contract on a consortium blockchain and analysing its operating mechanism from the perspective of consumers. This paper concludes with a discussion of the practical problems and future research work.

15:55
Discrete Hashing Based on Point-wise Supervision and Inner Product
SPEAKER: Xingyu Liu

ABSTRACT. Recent years have witnessed the increasing popularity of supervised hashing in vision problems such as image retrieval. Compared with unsupervised hashing, supervised hashing can boost accuracy by leveraging semantic information. However, existing supervised methods either lack adequate performance or incur a low-quality optimization process by dropping the discrete constraints. In this work, we propose a novel supervised hashing framework called discrete hashing based on point-wise supervision and inner product (PSIPDH), which uses point-wise supervised information to make the hash codes correspond effectively to the semantic information; on this basis, the inner product of the codes is manipulated to introduce a penalty on the Hamming distance. By introducing two kinds of supervisory information, a discrete solution can be applied in which code generation and hash function learning are treated as separate steps, and discrete hash codes can be learned directly from semantic labels bit by bit. Experimental results on data sets with semantic labels demonstrate the superiority of PSIPDH over state-of-the-art hashing methods.
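
The link between code inner products and Hamming distance that this abstract relies on is a standard identity for codes with entries in {-1, +1}: for r-bit codes a and b, the inner product equals r minus twice the Hamming distance. A small check of that identity, with hypothetical example codes (this is not the PSIPDH method itself):

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def inner_product(a, b):
    """Plain dot product of two equal-length codes."""
    return sum(x * y for x, y in zip(a, b))


# Two hypothetical 8-bit hash codes with entries in {-1, +1}.
code_a = [1, -1, 1, 1, -1, -1, 1, -1]
code_b = [1, 1, 1, -1, -1, 1, 1, -1]

r = len(code_a)
d = hamming_distance(code_a, code_b)   # the codes differ in 3 positions
ip = inner_product(code_a, code_b)     # 5 agreements minus 3 disagreements
```

Because the identity is linear, penalising a small inner product between codes is equivalent to penalising a large Hamming distance.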

16:00
Experimental Study on Learning of Neural Network Using Particle Swarm Optimization in Predictive Fuzzy for Pneumatic Servo System
SPEAKER: Shenglin Mu

ABSTRACT. Based on the scheme of predictive fuzzy control combined with a neural network (NN) for a pneumatic servo system, this research studies the learning of the NN using Particle Swarm Optimization (PSO) through experimental investigation. A group of positioning experiments using an existing pneumatic servo system was designed to confirm the effectiveness and efficiency of the NN's learning with PSO in the imaginary plant construction for the pneumatic system in predictive fuzzy control. The analysis compares the results of the traditional back-propagation (BP) type NN and the PSO type NN.

16:05
A target detection-based milestone event time identification method

ABSTRACT. The flight and departure time nodes for the port and departure flights yield important information about the cooperative decision system of an airport. However, at present, because it would affect normal flight management, airports cannot obtain these data by technical means. By installing a camera on the airport apron and employing a regional convolutional neural network model to identify the targets in the video, such as the aircraft, staff, and working vehicle, the times of the milestone events were determined according to the identified changes in the target shape and target motion state. Furthermore, prior knowledge on the plane gliding curve and ground support operations was obtained by implementing the least squares method to fit the plane gliding curve, and subsequently used to compensate for the occlusion-induced recognition error and enhance the robustness of the algorithm. It was experimentally verified that the proposed target detection-based milestone event time recognition method is able to identify the flight times during the over-station, plane entry, and the milestone launch event.

16:10
Multi-view Point Cloud Registration with Adaptive Convergence Threshold and its Application on 3D Model Retrieval
SPEAKER: Ying Liu

ABSTRACT. Multi-view point cloud registration is a hot topic in the artificial intelligence and robotics community. In this paper, we propose a framework that reconstructs 3D models with a multi-view point cloud registration algorithm using an adaptive convergence threshold, and apply it to 3D model retrieval. The ICP algorithm is improved for point cloud registration and combined with the motion average algorithm for multi-view point clouds. After the registration of multi-view point clouds, we design applications for 3D model retrieval. For 3D face detection, the geometric saliency map is computed based on the vertex curvature. The test facial triangle is then generated based on the saliency map and used to compute the matching error with the standard facial triangle. The face and non-face models are then discriminated. The experiments and comparisons demonstrate the effectiveness of the proposed framework.

16:15
A Class of Chaos-Gaussian Measurement Matrix based on Logistic Chaos for Compressed Sensing
SPEAKER: Hongbo Bi

ABSTRACT. Accurate compressed sensing recovery theoretically depends on a large number of random measurements. In this study, we demonstrate that the correlation properties of non-piecewise and piecewise Logistic chaos systems follow a Gaussian distribution. These correlation properties can be used to generate a class of Chaos-Gaussian measurement matrices with low complexity, hardware-friendly implementation and desirable sampling efficiency. Thus, the proposed algorithm constructs the Chaos-Gaussian measurement matrix from the chaotic sequences. Experimental results show that the Chaos-Gaussian measurement matrix provides performance comparable to Gaussian and Bernoulli random measurement matrices.
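
The general idea of filling a measurement matrix from a deterministic chaotic sequence can be sketched as below. This is a generic illustration using the Logistic map x_{n+1} = μ·x_n·(1 − x_n), not the Chaos-Gaussian construction of the talk; the parameter values (`mu`, `x0`, `skip`, `stride`), the centring 1 − 2x, and the 1/sqrt(m) scaling are all assumptions.

```python
import math


def logistic_sequence(length, x0=0.3, mu=4.0, skip=100, stride=5):
    """Sample the Logistic map, discarding a transient and keeping every
    `stride`-th value to weaken correlation between neighbouring samples."""
    x = x0
    for _ in range(skip):
        x = mu * x * (1.0 - x)
    samples = []
    while len(samples) < length:
        for _ in range(stride):
            x = mu * x * (1.0 - x)
        samples.append(x)
    return samples


def chaotic_measurement_matrix(m, n, **kwargs):
    """Fill an m x n matrix row by row from the chaotic sequence.

    Samples in [0, 1] are shifted to [-1, 1] and scaled by 1/sqrt(m).
    """
    seq = logistic_sequence(m * n, **kwargs)
    scale = 1.0 / math.sqrt(m)
    return [[scale * (1.0 - 2.0 * seq[i * n + j]) for j in range(n)]
            for i in range(m)]


# Hypothetical sizes: 4 measurements of a length-16 signal.
Phi = chaotic_measurement_matrix(4, 16)
```

Only the seed value and a few parameters need to be stored to regenerate the whole matrix, which is the usual argument for hardware-friendliness of chaotic constructions.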

16:20
A New Dataset for Vehicle Logo Detection

ABSTRACT. This paper establishes a multilevel dataset for the vehicle logo detection task; we call it ‘VLD-30’. Vehicle logo detection is widely applied in Intelligent Transport Systems, for example in vehicle monitoring. For deep-learning object detection algorithms, a good dataset can improve robustness. Our dataset has very high reliability because it includes analysis of various factors. To confirm the dataset performance, we use typical target detection algorithms such as Faster-RCNN and YOLO. The experimental results show that our dataset achieves significant improvements for small object detection, and that vehicle logo detection has potential for further development.

16:25
Salt and Pepper Noise Suppression for Medical Image by Using Non-local Homogenous Information
SPEAKER: Hu Liang

ABSTRACT. In this paper, we propose a method to suppress salt and pepper noise for medical images based on the homogenous information obtained by non-symmetrical and anti-packing model (NAM). The NAM could divide the image into several homogenous blocks and it is sensitive to the additive extra energy. Thus the noise could be detected effectively due to the usage of bit-plane during the division. Then corrupted points are estimated by using a distance based weighted mean filter according to the homogenous information in its non-local region, which could keep local structure. Experimental results show that our method can obtain denoising results with high quality.
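
The restoration step described here, a distance-based weighted mean over uncorrupted neighbours, can be sketched in simplified form. The sketch swaps in a much cruder detector than the NAM and bit-plane analysis of the talk: it simply assumes that pixels equal to 0 or 255 are corrupted, and it uses a small square window rather than a non-local homogeneous region. The function name and parameters are hypothetical.

```python
import math


def denoise_salt_pepper(img, radius=1, low=0, high=255):
    """Replace suspected noise pixels by an inverse-distance weighted mean
    of the clean pixels inside a (2*radius+1)^2 window."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(height):
        for x in range(width):
            if img[y][x] not in (low, high):
                continue  # assumed clean, keep as is
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if (dy, dx) == (0, 0):
                        continue
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue
                    if img[ny][nx] in (low, high):
                        continue  # skip neighbours that look corrupted too
                    weight = 1.0 / math.hypot(dy, dx)
                    num += weight * img[ny][nx]
                    den += weight
            if den > 0:
                out[y][x] = round(num / den)
    return out


# Hypothetical 3x3 patch with one "salt" pixel in the centre.
noisy = [[100, 100, 100],
         [100, 255, 100],
         [100, 100, 100]]
restored = denoise_salt_pepper(noisy)
```

Closer neighbours receive larger weights, so edges inside the window are blurred less than with a plain mean filter.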

16:30
Medical Diagnosis based on Nonlinear Manifold Discriminative Projection

ABSTRACT. In recent years, medical diagnosis based on machine learning has become popular in interdisciplinary research between computer science and medical science. It is closely related to classification, which is one of the important problems in machine learning. However, traditional classification algorithms can hardly handle high-dimensional medical datasets. Manifold learning, as a nonlinear dimensionality reduction technique, can efficiently process high-dimensional medical datasets. In this paper, we propose an algorithm based on Nonlinear Manifold Discriminative Projection (NMDP). Our algorithm incorporates the label information of medical data into the unsupervised LLE method, so that the transformed manifold becomes more discriminative. Then we apply the discriminant mapping to the unlabeled test data for classification. Experimental results show that our method exhibits promising classification performance on different medical data sets.

16:35
A New Sketch-based 3D Object Retrieval Approach by Learning Semantic Attributes and LDA model
SPEAKER: Lei Haopeng

ABSTRACT. Recently, retrieving 3D objects based on hand-drawn sketches has become increasingly popular owing to its simplicity and intuitiveness. However, existing sketch-based 3D object retrieval methods only use low-level visual features, which cannot capture users’ search intention. In this paper, we propose a new sketch-based 3D object retrieval approach that learns semantic attributes to address these issues. We use two kinds of attributes in our retrieval system, called pre-defined attributes and latent attributes. Pre-defined attributes are defined manually and assigned directly to the different categories of 3D objects in the database and to the query sketch. Latent attributes, which are learned from the low-level features with a latent Dirichlet allocation (LDA) model, can be used to discriminate different details between the query sketch and 3D objects. Then we use these attributes to determine the category of the query sketch and retrieve relevant 3D objects from the database, ranked according to attribute similarity. The experimental results demonstrate that our method achieves more accurate retrieval results and performs better than previously proposed methods that use only low-level feature descriptors.

16:40
Correlation Filter Tracking Algorithm Based on Multiple Features and Average Peak Correlation Energy
SPEAKER: Xiyan Sun

ABSTRACT. The traditional target tracking algorithm adopts artificial features. However, artificial features are not strong enough to describe the appearance of the target, so the algorithm is difficult to apply to complex scenes. Moreover, the traditional target tracking algorithm does not judge the confidence of the response. If the confidence is low, the appearance model of the target can be easily disturbed and the tracking performance is degraded. This paper proposes the Multiple Features and Average Peak Correlation Energy (MFAPCE) tracking algorithm, which combines deep features with color features and uses average peak correlation energy to measure confidence. The algorithm uses multiple convolution layers and color histogram features to describe the target appearance; the response is obtained by optimizing the context information under the correlation filter framework; and the average peak correlation energy determines the final confidence of the response. Finally, the confidence determines whether the model is updated. Experiments show that the MFAPCE algorithm improves tracking performance compared with the traditional tracking algorithm.
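
The confidence measure named in this abstract is commonly defined in the tracking literature as APCE = |F_max − F_min|² / mean((F − F_min)²) over the response map F. The sketch below assumes that definition and uses hypothetical toy response maps; the abstract itself does not give the formula or an update threshold.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2D response map.

    A single sharp peak gives a high value; several competing peaks
    (e.g. under occlusion) give a low value.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    mean_sq = sum((v - f_min) ** 2 for v in values) / len(values)
    if mean_sq == 0:
        return 0.0  # constant map: no peak, no confidence
    return (f_max - f_min) ** 2 / mean_sq


# Hypothetical correlation responses: one clean peak vs. several peaks.
sharp = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
ambiguous = [[1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0]]

confident = apce(sharp) > apce(ambiguous)
```

A tracker would typically skip the model update when the value falls well below its recent average, which is the kind of confidence gate the abstract describes.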

16:45
A Deep Architecture for Chinese Semantic Matching with Pairwise Comparisons and Attention-Pooling
SPEAKER: Huiyuan Lai

ABSTRACT. Semantic sentence matching is a fundamental technology in natural language processing. In previous work, neural networks with attention mechanisms have been successfully extended to semantic matching. However, existing deep models often simply use operations such as summation and max-pooling to reduce the whole sentence to a single distributed representation. We present a deep architecture for matching two Chinese sentences that relies only on alignment, instead of a recurrent neural network, after the attention mechanism used to obtain interaction information between sentence pairs, which makes it more lightweight and simple. To capture enough of the original features, we employ a pooling operation named attention-pooling to aggregate information from the whole sentence. We also explore several English models with excellent performance on Chinese data. The experimental results show that our method achieves better results than other models on the Chinese dataset.

16:50
A Cooperative target search method based on intelligent water drops algorithm
SPEAKER: Xixia Sun

ABSTRACT. In this paper, we investigate the problem of path planning for multiple unmanned aerial vehicles (multi-UAVs), which collaborate with each other for target search. Specifically, we first establish the optimization model of the multi-UAV cooperative search path planning problem. Then, we propose an improved intelligent water drops algorithm to solve it. The proposed algorithm introduces several heterogeneous water drop populations, with each UAV having a water drop population that searches paths for it. Water drops in the same population cooperate with each other, while water drops in different populations compete with each other. To further improve the search efficiency of the algorithm, a new soil update mechanism is developed. It increases the soil on paths found by water drops that get trapped in map dead ends and adjusts the soil update rate automatically. Simulation results show the effectiveness of the proposed algorithm.

16:55
Foggy Image Restoration Using Histogram Equalization and Retinex

ABSTRACT. In order to improve the visibility of foggy images, this paper uses a new method to iteratively refine the image. Firstly, the foggy image is equalized using a histogram model to improve image contrast. Secondly, the image which is equalized by histogram is re-enhanced with the Retinex algorithm model. From a theoretical and practical point of view, this method improves the sharpness of the image while enhancing the image detail information and restoring the image color.
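
The first step of this pipeline, histogram equalization, can be sketched as follows on a flat list of 8-bit grey values. This is the textbook cumulative-histogram mapping, offered as an assumption about what "equalized using a histogram model" means here; the Retinex stage is not shown, and the sample pixel values are hypothetical.

```python
def equalize_histogram(pixels, levels=256):
    """Map grey levels through the normalised cumulative histogram so the
    output spans the full range [0, levels - 1]."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest used level
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to stretch
    return [round((cdf[p] - cdf_min) / (total - cdf_min) * (levels - 1))
            for p in pixels]


# Hypothetical low-contrast (foggy) grey values packed into a narrow band.
foggy = [100, 100, 110, 120, 130]
equalized = equalize_histogram(foggy)
```

The narrow band of input values is spread over the whole grey range while the brightness ordering of the pixels is preserved.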