08:30-09:00 Session 1: Opening Ceremony
Location: Room 111
General Chair Speech
Introduction to ISAIR2018
SPEAKER: Huimin Lu
Welcome to ISAIR2019
09:00-10:30 Session 2: Keynote Speech
Location: Room 111
Visual Distortion Detection and Reduction in 3D Video: Recent Advances

ABSTRACT. Multi-view video and more sophisticated depth-based 3D video for stereoscopic or autostereoscopic displays have vividly extended conventional 2D video with a third dimension, providing viewers with an immersive perception of the 3D scene. In a depth-based 3D video system, view synthesis with the depth-image-based rendering (DIBR) technique can generate a virtual view at an arbitrary viewpoint, thus supporting an adjustable (user-defined) disparity range based on the user’s preferred intensity of 3D perception. Presenting high-quality 3D video is the goal of 3D video system development, where the challenges of visual distortion detection and reduction are unique in that many of them may not be encountered in conventional 2D video systems. The talk will mainly present our recent work on 3D visual distortion detection and reduction in the framework of a DIBR-oriented 3D video system.

Image Processing Technique for Computer Aided Diagnosis

ABSTRACT. Cancer is the leading cause of death in Japan and worldwide. To analyze abnormal shadows in visual screening, many Computer Aided Diagnosis (CAD) systems have been introduced. Using a CAD system has many advantages, such as reducing the workload of radiologists and improving lesion detection rates. We have proposed several CAD systems to extract abnormalities from CT images. In this talk, I will show some of these CAD systems.

Continuum Robots for Minimally Invasive Surgery: Smaller, Softer, and Smarter

ABSTRACT. Medical robots have seen significant growth worldwide recently. It is estimated that around 10,700 sets of medical robots will be supplied to the market between 2018 and 2020. Although some medical robots have achieved commercial success, the robotics community is making continuous efforts to make robotic surgery safer, easier, and less invasive. In this talk, I will introduce a new type of robot, the continuum robot, for minimally invasive surgery. Due to their intrinsic flexibility and their ability to be scaled down, continuum robots are excellent candidates for robotic minimally invasive surgery. Instrumentation, human-robot interaction, and shape and force sensing have been identified as three grand challenges in the development of continuum robots for medical applications. I will introduce our efforts to address these challenges. Particularly, I will introduce two classes of continuum robots, concentric tube robots and cable-driven robots, that have recently been investigated for a variety of medical procedures. Finally, I will share my individual perspective on trends in the development of medical robots.

10:30-10:40 Coffee Break
10:40-12:00 Session 3: China-Korea Workshop
Location: Room 111
Prediction Model of Respiratory Outpatient Visits Based on Multi-dimensional Features
Delay Evaluation in Cache-Enabled Backhaul Networks
Analysis of Urban Shared Bicycle's Trip Behavior and Efficiency Optimization
The Consensus Protocol of Blockchain and Application of Blockchain Technology
12:00-13:30 Lunch Break (Invitation ONLY)

Nanjing University International Conference Center

13:30-15:00 Session 4: Keynote Speech
Location: Room 111
Extreme Optical Imaging for Underwater Robotic Vision

ABSTRACT. Absorption, scattering, and color distortion are three major issues in underwater optical imaging. Light rays traveling through water are scattered and absorbed depending on their wavelength. Scattering is caused by large suspended particles, which degrade optical images captured underwater. Color distortion occurs because different wavelengths are attenuated to different degrees in water. Consequently, images of the ambient underwater environment are dominated by a bluish tone. This talk will introduce some underwater imaging models that compensate for the attenuation discrepancy along the propagation path, a corresponding robust color-line-based background light estimator, and a locally adaptive filtering algorithm for enhancing underwater images.

Development of the Honeybee Life-Log Monitoring System Using RFID-tag and Image Processing

ABSTRACT. More than 50 years ago, Karl von Frisch discovered that honeybees (Apis mellifera) communicate the exact location of food sources to other bees through a complex movement called the waggle dance. Since then, analyzing communication between honeybee dancers and their followers in the hive has been one of the most important and interesting issues in revealing the mechanism of the honeybee’s language. In general, these behavior analyses have been conducted by manually extracting honeybees’ walking trajectories from long recorded videos. Therefore, to reduce the observers’ workload and human error, we previously proposed an automatic algorithm for tracking multiple honeybees using image processing and machine learning. In addition, we have constructed an automatic recording system for long-term tracking of honeybee behavior using Radio Frequency Identification (RFID) sensors and several high-resolution camera modules connected to multiple small single-board computers. Using this system, we recorded the observation hive and the corridor from 6:30 am to 7:30 pm over 4 weeks, once or twice per year from 2015 to 2018. Our target colony consists of about 800 honeybees, including a queen. As a result of the recording, we obtained more than 20 TB of video data. Analyzing honeybee behavior from this enormous amount of data requires an extremely long time, even on a high-spec computer. To deal with this issue, we first extracted the dance areas and durations from the recorded video using preprocessing based on a frame-difference approach. We then applied our tracking algorithm to the extracted partial videos. As a result, we could detect the trajectories of dancers and their followers in the observed videos. In this talk, I introduce our system and show some results obtained with our approach.
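
The frame-difference preprocessing mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-absolute-difference score, and the threshold are assumptions chosen for illustration, and frames are modelled as small nested lists of grayscale values.

```python
def frame_difference_score(prev_frame, curr_frame):
    """Mean absolute per-pixel difference between two grayscale frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count


def active_segments(frames, threshold):
    """Return inclusive (start, end) frame-index pairs where consecutive
    frames differ by more than `threshold` (hypothetical activity test)."""
    segments = []
    start = None
    for i in range(1, len(frames)):
        moving = frame_difference_score(frames[i - 1], frames[i]) > threshold
        if moving and start is None:
            start = i                        # a new active segment begins
        elif not moving and start is not None:
            segments.append((start, i - 1))  # the segment ended on the previous frame
            start = None
    if start is not None:                    # close a segment that runs to the last frame
        segments.append((start, len(frames) - 1))
    return segments
```

Only the frames inside the returned segments would then need to be passed to a more expensive tracking step.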

Large Scale Automated Screening and Analysis Using Retinal Fundus Images

ABSTRACT. Ophthalmology with machine learning is attracting attention in both industry and academia. Diseases such as diabetic retinopathy, age-related macular degeneration, and hypertension- and arteriosclerosis-related eye abnormalities can be diagnosed through the clues provided by retinal fundus images; the retina is the only part of the body where blood vessels can be observed. More recently, some research has shown that fundus information may serve as a signal for early detection of Alzheimer's disease, Parkinson's disease, and even HIV. In this talk, we tell the story behind the large-scale data collection and preparation at Airdoc and how we leverage state-of-the-art machine learning models with those datasets to benefit millions of people.

15:00-15:10 Coffee Break
15:10-17:00 Session 5: Spotlight-Artificial Intelligence
Location: Room 111
Compressive Sensing-based Optimal Design of an Emerging Optical Imager

ABSTRACT. The emerging optical imager can greatly reduce system weight and size compared to conventional telescopes. Compressive sensing (CS) theory demonstrates that incomplete and noisy measurements may actually suffice for accurate reconstruction of compressible or sparse signals. In this paper, we propose an optimized design of the emerging optical imager based on compressive sensing theory. It simplifies the data acquisition structure and reduces the data transmission burden. Moreover, the system robustness is improved.

Synthesizing Virtual-Real Artworks using Sun Orientation Estimation

ABSTRACT. Illumination effects are essential for realistic results in images created by inserting virtual objects into a real scene. For outdoor scenes, automatic estimation of the sun orientation from a single outdoor image is fundamental for inserting 3D models into that image. Traditional methods for outdoor sun orientation estimation often use handcrafted illumination features or cues. These cues rely heavily on human experience and on pre-processing with other image understanding technologies such as shadow and sky detection, geometry recovery, and intrinsic image decomposition, which limits their performance. We propose an end-to-end approach to outdoor sun orientation estimation via a novel deep convolutional neural network (DCNN), which directly outputs the orientation of the sun from an outdoor image. Our proposed SunOriNet contains a contact layer that directly contacts the intermediate feature maps to the high-level ones and learns hierarchical features automatically from a large-scale image dataset with annotated sun orientations. The experiments reveal that our DCNN can estimate sun orientation well from a single outdoor image. The estimation accuracy of our method exceeds that of both traditional methods based on handcrafted features and state-of-the-art DCNN-based methods.

Silhouette Photo Style Transfer

ABSTRACT. Silhouette photography is popular among photographers. However, it is hard for ordinary users to shoot this kind of photo because of the limitations of cameras, weather, and skills. In this work, we propose an automatic photo style transfer approach that can generate realistic silhouette images. First, we present a sky segmentation method to divide an input image into an object foreground and a sky background. Then, for the background, we implement a statistical color transfer method using a specified sky photo. Finally, to generate natural results, we develop an adaptive approach to adjust the color of the object foreground considering the ambient color computed from the stylized background. Extensive experiments show that our methods can achieve satisfactory sky segmentation results and generate aesthetically pleasing silhouette photos.
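
As a rough illustration of statistical color transfer, the sketch below matches the mean and standard deviation of one color channel to those of a reference channel. This is a common formulation assumed here for illustration; the abstract does not specify the color space or the exact statistics used, and the function names are hypothetical.

```python
import math


def channel_stats(values):
    """Return the mean and (population) standard deviation of a channel."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def transfer_channel(source, reference):
    """Shift and scale `source` so its mean/std match those of `reference`."""
    s_mean, s_std = channel_stats(source)
    r_mean, r_std = channel_stats(reference)
    # A flat source channel has no spread to rescale; only shift it.
    scale = r_std / s_std if s_std > 0 else 1.0
    return [(v - s_mean) * scale + r_mean for v in source]
```

Applying `transfer_channel` to each channel of the sky-background pixels, with the specified sky photo as the reference, would give a stylized background in this simplified setting.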

Low-Rank Matrix Recovery for Source Imaging with Magnetoencephalography
SPEAKER: Yegang Hu

ABSTRACT. Source imaging with magnetoencephalography (MEG) has obtained good spatial accuracy on the shallow sources, and has been successfully applied in the brain cognition and the diagnosis of brain disease. However, its utility with locating deep sources may be more challenging. In this study, a new source imaging method was proposed to find real brain activity on deep locations. A sensor array with MEG measurements including 306 channels was represented as a low-rank matrix plus sparse noises, where the sensor array that removed the interference can be explained by the low-rank matrix. The low-rank matrix was then used to estimate the source model using a minimum variance beamforming. Simulations with a realistic head model indicated that the proposed method was effective. This method was further verified in 10 patients with temporal lobe epilepsy, and the localization results may be more consistent with the clinical conclusion.

Multi-task Deep Learning for Fine-grained Classification/Grading in Breast Cancer Histopathological Images
SPEAKER: Xipeng Pan

ABSTRACT. The fine-grained classification or grading of breast cancer pathological images is of great value in clinical application. However, manual feature extraction methods not only require professional knowledge but are also costly, especially for high-quality features. In this paper, we devise an improved deep convolutional neural network model to achieve accurate fine-grained classification or grading of breast cancer pathological images. We also use online data augmentation and a transfer learning strategy to avoid model overfitting. To address the small inter-class variance and large intra-class variance in breast cancer pathological images, a multi-class recognition task and an image-pair verification task are combined in the representation learning process; in addition, prior knowledge (a relatively large distance between different subclasses and a small distance within the same subclass) is embedded in the feature extraction process. At the same time, the prior information that pathological images at different magnifications belong to the same subclass is embedded in the feature extraction process, which makes the model less sensitive to image magnification. Experimental results on three different pathological image datasets show that our method performs better than state-of-the-art methods, with good robustness and generalization ability.

Nuclear Norm Regularized Structural Orthogonal Procrustes Regression for Face Hallucination with Pose

ABSTRACT. In real applications, the observed low-resolution (LR) face images usually have pose variations. Conventional learning based methods ignore these variations, thus the learned representations are not beneficial for the following reconstruction. In this paper, we propose a nuclear norm regularized structural orthogonal Procrustes regression (N2SOPR) method to learn pose-robust feature representations for efficient face hallucination. The orthogonal Procrustes regression (OPR) seeks an optimal transformation between two images to correct the pose from one to the other. Additionally, our N2SOPR uses the nuclear norm constraint on the error term to keep image’s structural information. A low-rank constraint on the representation coefficients is imposed to adaptively select the training samples that belong to the same subspace as the inputs. Moreover, a locality constraint is also enforced to preserve the locality and the sparsity simultaneously. Experimental results on standard face hallucination databases indicate that our proposed method can produce more reasonable near frontal face images for recognition purpose.

An Adaptable Digital Camouflage Synthesis for 3D Surfaces
SPEAKER: Guangxu Li

ABSTRACT. Military camouflage should be varied in order to adapt quickly to environmental changes on the battlefield. Concealment is reduced if constant camouflage patterns are used on clothes and weapons, as is the case in most current militaries. In this paper, we propose a method for synthesizing texture from an image using convolutional neural networks, which is effective for generating camouflage on a 3D surface. We use the latest advances in style transfer for 2D images and add surface parameterization methods that apply to curved surfaces. This allows us to make an adaptive texture map for 3D objects, even those with complex topologies.

Non-destructive Detection of Medicines Using NIRS Based Collaborative Representation
SPEAKER: Zhenbing Liu

ABSTRACT. Near-infrared spectroscopy (NIRS) has potential for non-destructive detection (classification) of medicines. To address the problem of identifying medicines, a sparse signal representation model is established from the NIRS signal in the presence of spectral crossover and overlapping. However, finding the sparsest solution is difficult, even to approximate the initial absorption spectrum. Meanwhile, as in binary classification, the nonzero representation coefficients concentrate on two classes to distinguish the multi-label classification. Thus, a novel classification model, Collaborative Representation Classification with Gabor optimizer for Regularized Least Squares (CRC_GRLS), is constructed to overcome the two crucial issues: “sparsity” and binary classification. By using Gabor filters to handle “sparsity”, obtaining the more relevant factor vectors of the NIRS signal, and adding some justifications for the detection of medicines, a CRC_GRLS model with low classification error can be obtained. Experiments using NIRS samples from three data sets (active substance, Erythromycin Ethylsuccinate, and Domperidone) show that the proposed model has substantial potential to find differences in the chemical characteristics of medicines using NIRS data, and that it achieves a speed-up of about 1 times compared with Sparse Representation based Classification (SRC) and the Class L1-optimizer classifier with the closeness rule (C_CL1C).

Multi-view Registration Based on Expectation Maximization Algorithm

ABSTRACT. Many methods have been proposed and improved to deal with multi-view point cloud registration. Most of them are based on the classical Iterative Closest Point (ICP) method, which is fast and accurate in most cases. However, when the point clouds contain heavy noise, the equal weights that ICP assigns to all correspondences lead to unsatisfactory registration results. To address this issue, this paper proposes a new automatic multi-view registration method based on Expectation Maximization (EM). Instead of giving equal weights to all points as ICP does, we introduce a Gaussian distribution to model the probability of aligning the data shape with the model shape, assigning different weights to points according to their distance. EM is then used to optimize the likelihood function formulated with the Gaussian distribution. By iteratively setting each scan as the data shape and all the others together as the model shape, and aligning them repeatedly until convergence, we obtain the multi-view registration results. The experimental results demonstrate the accuracy and robustness of our method compared with three state-of-the-art algorithms, especially when noisy data exist.
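
The idea of replacing ICP's equal weights with distance-dependent Gaussian weights can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes correspondences are already given, estimates a translation only (no rotation), and uses a fixed `sigma`; all names are hypothetical.

```python
import math


def gaussian_weights(distances, sigma):
    """Weight each correspondence by exp(-d^2 / (2 * sigma^2))."""
    return [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in distances]


def weighted_translation(data_points, model_points, sigma):
    """One re-weighted update: the translation that moves the data points
    toward their matched model points, with distant pairs down-weighted."""
    distances = [math.dist(p, q) for p, q in zip(data_points, model_points)]
    weights = gaussian_weights(distances, sigma)
    total = sum(weights)
    dim = len(data_points[0])
    return [
        sum(w * (q[k] - p[k])
            for w, p, q in zip(weights, data_points, model_points)) / total
        for k in range(dim)
    ]
```

With equal weights, a single far-off correspondence would drag the estimate away; here its Gaussian weight is close to zero, so the well-matched pairs dominate.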

Film Clips Retrieval using Image Queries

ABSTRACT. The rise of the entertainment industry has driven explosive growth in automatic film trailer generation. Manually finding desired clips in a large number of films is time-consuming and tedious, which makes finding the moments that match a user's major or special preferences an urgent problem. Moreover, because users view films subjectively, no fixed trailer meets all users' interests. This paper addresses these problems by proposing a query-related film clip extraction framework that optimizes the selected frames to be both semantically related to the query and visually representative of the entire film. The experimental results show that our query-related film clip retrieval method is particularly useful for film editing, e.g. showing an abstract of the entire film while focusing on the parts that match the user's queries.

Improved Rao-Blackwellised Particle Filter based on randomly weighted PSO

ABSTRACT. In this paper, a new RBPF-SLAM based on randomly weighted PSO (Particle Swarm Optimization) is proposed to solve some problems in the Rao-Blackwellised particle filter (RBPF), including the depletion of particles and the loss of diversity during resampling. A PSO optimization strategy is introduced in the modified algorithm, with the inertia weight set randomly. The modified PSO is used to optimize the particle set to avoid particle degeneracy and maintain diversity. The proposed algorithm is simulated on the Qt platform and verified in ROS on a TurtleBot. Results show that the proposed RBPF outperforms RBPF-SLAM and FastSLAM 2.0.
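
A generic particle swarm optimizer with a randomly drawn inertia weight can be sketched as below. This only illustrates the PSO ingredient, not its integration into RBPF-SLAM; the inertia range [0.5, 1.0), the acceleration constants, and all names are assumptions made for the example.

```python
import random


def pso_minimize(objective, dim, lower, upper, n_particles=20, n_iters=50,
                 c1=1.5, c2=1.5, rng=None):
    """Minimize `objective` over the box [lower, upper]^dim with PSO."""
    rng = rng or random.Random()
    positions = [[rng.uniform(lower, upper) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]          # per-particle best positions
    best_val = [objective(p) for p in positions]  # per-particle best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            # Random inertia weight, redrawn for every particle update
            # (assumed range [0.5, 1.0)).
            w = 0.5 + 0.5 * rng.random()
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][k] = (
                    w * velocities[i][k]
                    + c1 * r1 * (best_pos[i][k] - positions[i][k])
                    + c2 * r2 * (global_pos[k] - positions[i][k])
                )
                # Move the particle and clamp it to the search box.
                moved = positions[i][k] + velocities[i][k]
                positions[i][k] = min(upper, max(lower, moved))
            value = objective(positions[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < global_val:
                    global_pos, global_val = positions[i][:], value
    return global_pos, global_val
```

Passing a seeded `random.Random` as `rng` makes a run reproducible, which helps when comparing a random inertia weight against a fixed one.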

A Novel Sliding Mode Control For Human Upper Extremity With Gravity Compensation

ABSTRACT. This paper studies the reaching movements of the human upper extremity with redundant muscles using sliding mode control based on fuzzy adaptive scale adjustment. A two-link planar human musculoskeletal arm model with six redundant muscles, based on the Hill-type muscle model, is adopted. The study focuses on gravity compensation for the muscle input during the reaching movement. Through the fuzzy adaptive system, the sliding mode controller can achieve adaptive approximation of the switching scale so as to eliminate chattering. Numerical simulations are performed to verify the control. The results reveal that the human upper extremity can accomplish the reaching movements very well with the proposed sliding mode controller.

GPU-Accelerated Feature Tracking for SFM Based 3D Reconstruction
SPEAKER: Mingwei Cao

ABSTRACT. This paper presents a novel GPU-accelerated feature tracking (GFT) method for large-scale structure from motion (SFM)-based 3D reconstruction. The proposed GFT method consists of GPU-based Difference of Gaussian (DOG), RootSIFT descriptor, k nearest neighbors matching, and outlier removing. Firstly, our GPU-based DOG implementation can detect thousands of keypoints in real-time, whose speed is 30 times faster than that of the CPU version. Secondly, our GPU-based RootSIFT descriptor can compute thousands of descriptors in real-time. Thirdly, the speed of our GPU-based descriptor matching is 10 times faster than that of the state-of-the-art methods. Finally, we conduct thorough experiments to evaluate the proposed method. Experimental results demonstrate the effectiveness and efficiency of the proposed method.
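
For readers unfamiliar with the RootSIFT descriptor named in this abstract, the sketch below shows the usual CPU formulation: L1-normalize a SIFT descriptor, then take the element-wise square root. It is a plain-Python illustration rather than the GPU implementation described by the authors, and the small `eps` guard is an assumption added to avoid division by zero.

```python
import math


def root_sift(descriptor, eps=1e-12):
    """Convert a SIFT descriptor to RootSIFT.

    The descriptor is L1-normalized and each element is square-rooted,
    so the result has (approximately) unit L2 norm.
    """
    l1_norm = sum(abs(v) for v in descriptor) + eps
    return [math.sqrt(abs(v) / l1_norm) for v in descriptor]
```

Comparing RootSIFT vectors with Euclidean distance corresponds to comparing the original descriptors with the Hellinger kernel, which is why nearest-neighbor matching can be applied to the transformed descriptors without other changes.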


ABSTRACT. Activity recognition is an important step in any video surveillance, tracking, or video activity system. This paper examines the results of the adaptive Gaussian Mixture Model using maximum a posteriori (MAP) updates on video clips (dataset) obtained from Adeyemi College of Education, Ondo, Nigeria. The results showed a reliable moving object detection algorithm, but shadows constitute a problem, in that moving shadows can be mistaken for moving objects. The shadows were suppressed using the HSV and Phong illumination models. The overall performance of this system was evaluated using the confusion matrix, the receiver operating characteristic (ROC), and shadow detection and shadow discrimination values, which showed better results than existing benchmarks.

Unsupervised Feature Selection Using Ideal Local Structure Learning
SPEAKER: Yanbei Liu

ABSTRACT. Unsupervised feature selection has become an important and challenging problem due to the vast amounts of unlabelled and high-dimensional data in machine learning. Traditional unsupervised feature selection algorithms usually need to build a similarity matrix, making the selected features heavily dependent on the learned structure. However, real-world data always contain many noisy samples and features, which may make the similarity matrix obtained from the original data unreliable. In this paper, we propose a novel Unsupervised Feature Selection using Ideal Local Structure Learning (LSL) method, which performs local structure learning and feature selection simultaneously. To obtain more accurate structure information, we learn an ideal local structure with exactly c connected components of data (where c is the number of clusters), so that the proposed method can select more valuable features. Furthermore, we present a simple yet effective iterative algorithm to optimize the proposed method. Experiments on various benchmark datasets, including biomedical data, letter recognition digit data and face image data, demonstrate the encouraging performance of our algorithm over state-of-the-art methods.

Semantics Consistent Adversarial Cross-Modal Retrieval
SPEAKER: Ruisheng Xuan

ABSTRACT. Cross-modal retrieval returns relevant results from other modalities given a query from one modality. The main challenge of cross-modal retrieval is the “heterogeneity gap” among modalities, because different modalities have different distributions and representations. Therefore, the similarity of different modalities cannot be measured directly. In this paper, we propose a semantics-consistent adversarial cross-modal retrieval approach, which learns a semantics-consistent representation for different modalities with the same semantic category. Specifically, we encourage the class centers of different modalities with the same semantic label to be as close as possible, and also minimize the distances between the samples and the class center with the same semantic label from different modalities. Comprehensive experiments on the Wikipedia dataset are conducted, and the experimental results show the efficiency and effectiveness of our approach in cross-modal retrieval.

Hatching Eggs Classification Based on CNN with Channel Weighting and Joint Supervision
SPEAKER: Huasong Liu

ABSTRACT. Convolutional neural networks (CNNs) show state-of-the-art performance in tackling a variety of visual tasks. It is expected that CNNs can be applied to the classification of 9-day hatching eggs, which are divided into fertile eggs and dead eggs. Because of the inter-class similarity and intra-class difference issue, a CNN classification method combining channel weighting (Squeeze-and-Excitation module) and joint supervision is proposed to improve the classification accuracy. We use Center loss and Softmax loss together as the joint supervision signal. With such joint supervision, the CNN can obtain deep features with inter-class dispersion and intra-class compactness, which enhances the discriminative and generalization power. Simultaneously, channel weighting is adopted in feature extraction and added to each convolutional layer to make better use of channel features. The experimental results demonstrate that the proposed method successfully solves the classification problem of hatching eggs. The accuracy of our method is 98.8%.
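
The joint supervision signal described here can be written, for a single sample, as the softmax cross-entropy plus a weighted center loss. The sketch below assumes the common form L = L_softmax + lam * 0.5 * ||x - c_y||^2; the weight `lam`, the function names, and the fixed class centers are illustrative assumptions, not details taken from the abstract.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, center):
    """Half the squared distance between a deep feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, lam):
    """Softmax loss plus `lam` times the center loss for one sample."""
    return (softmax_cross_entropy(logits, label)
            + lam * center_loss(feature, centers[label]))
```

The softmax term pushes classes apart, while the center term pulls each feature toward its own class center, which is the inter-class dispersion and intra-class compactness the abstract refers to.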

Truthful volume discount mechanism based on combinatorial double auction

ABSTRACT. In the auction market, allocation and pricing affect participants’ behavior, honesty, and the success of the auction. A proper mechanism helps to achieve higher utility. In a combinatorial double auction, buyers bid for commodity combinations from different sellers, which solves the problem of resource allocation in the real market more efficiently. In view of resource allocation and pricing problems such as cloud resource allocation and spectrum auctions, this paper designs the TCD4GB mechanism for the combinatorial double auction setting, which determines winners, allocates goods, and calculates payments. The concept of unit difference is introduced in order to solve the winner determination problem in the group-buying mechanism. The matched sellers with the minimum cost are directly chosen as the winning sellers in the process of selecting the winning buyers. The mechanism also calculates payment using the unit difference of overlapping buyers and apportions it to the matching sellers through the idea of the second price in the VCG mechanism. This mechanism prevents buyers from obtaining higher utility when they report falsely. Through theoretical analysis and simulation experiments, it is shown that the TCD4GB mechanism satisfies the economic properties of individual rationality and budget balance.

Equilibrium balking strategies of delay-sensitive customers in a clearing queueing system with service quality feedback
SPEAKER: Baoxian Chang

ABSTRACT. In this paper, we study a clearing queueing system with service quality feedback and system maintenance. Once the system receives an unsatisfied (negative) feedback from customers (i.e., a customer is unsatisfied with the service), all present customers are forced to leave the system, at the same time, the system undergoes a maintenance procedure. After experiencing an exponential maintenance time, the system resumes work immediately. We assume that arriving customers decide whether to join or balk the system based on a linear reward-cost structure. By considering the waiting cost and reward, we discuss the balking behavior of customers and obtain the corresponding equilibrium balking strategies and social optimal strategies for the observable case and the unobservable case. Finally, some numerical examples are provided to show the effect of several system parameters on the equilibrium and optimal balking strategies.

Research on fundus image registration and fusion method based on NSCT and adaptive PCNN
SPEAKER: Shihao Zhang

ABSTRACT. The fusion of fundus images can comprehensively show lesions and tissues across multiple image modalities, and is of great application value to doctors in the diagnosis of fundus diseases. In this paper, we propose a method of fundus image registration and fusion that combines NSCT (nonsubsampled contourlet transform) and an adaptive PCNN (pulse coupled neural network). The registration process is as follows. First, the two images are registered to eliminate the spatial differences between the source images: SURF (Speeded Up Robust Features) features are extracted as feature points, feature vectors are computed from the feature point descriptions, and the ratio of the nearest-neighbor distance to the next-nearest-neighbor distance is used for the initial matching of the feature points. The RANSAC (Random Sample Consensus) method is then used to remove mismatched point pairs. Finally, the transformation parameters between the images are calculated and the source images are registered by the spatial transformation. The fusion of the two registered images proceeds as follows. The low-frequency and high-frequency sub-bands of the images to be fused are obtained by NSCT decomposition. The low-frequency sub-band is fused according to regional energy. The high-frequency sub-band is processed with a simplified PCNN model and fused based on the number of times the image pixels fire. The experimental results show that this method outperforms other representative methods in fundus image fusion. The fused image synthesizes the image information, renders details clearly, and provides an effective reference for the clinical diagnosis of fundus diseases.
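
The initial matching step based on the nearest and next-nearest neighbor distance ratio can be sketched as follows. Descriptors are modelled as plain tuples of floats rather than real SURF output, and the ratio threshold of 0.8 is an assumed, commonly used value; neither the threshold nor the function name comes from the abstract.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in `desc_a` to its nearest neighbor in `desc_b`,
    keeping the match only if it is clearly better than the second nearest."""
    matches = []
    for i, a in enumerate(desc_a):
        # Sort candidates by Euclidean distance to the query descriptor.
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches
```

Matches that survive this test would still contain some outliers, which is where a RANSAC-style step such as the one described in the abstract comes in.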

Hyperspectral Image Classification Based on Multi-Feature Fusion

ABSTRACT. Traditional hyperspectral image (HSI) classification methods usually use the spectral feature and do not make full use of the spatial feature or other characteristics of the HSI. This paper proposes a novel HSI classification method based on multi-feature fusion (SST). First, the spectral-spatial features are extracted by a spectral-spatial feature network. Then, the texture features of the local binary pattern (LBP) image are extracted and fused with the spectral-spatial features. Finally, a kernel-based extreme learning machine (KELM) is used to classify the hyperspectral images. Experiments on different datasets show that the proposed method can effectively improve the classification accuracy of hyperspectral images.

Multi-Scale Adversarial Network for Underwater Image Restoration
SPEAKER: Jingyu Lu

ABSTRACT. Underwater image restoration, which is the keystone of underwater vision research, is still challenging work. The key point of underwater image restoration is how to remove the turbidity and the color distortion caused by the underwater environment. In this paper, we propose an underwater image restoration method based on transferring an underwater-style image into a recovered style using a Multi-Scale Cycle Generative Adversarial Network (MCycle GAN) system. We include a Structural Similarity Index Measure loss (SSIM loss), which provides more flexibility to model the detailed structure and improve the image restoration performance. We use the dark channel prior (DCP) algorithm to get the transmission map of an image and design an adaptive SSIM loss to improve underwater image quality. We input this information into the network for multi-scale calculation on the images, which achieves the combination of the DCP algorithm and Cycle-Consistent Adversarial Networks (CycleGAN). In quantitative and qualitative comparisons with existing state-of-the-art approaches, our method shows pleasing performance on the underwater image dataset.
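
The dark channel prior step used here to obtain a transmission map can be illustrated with a small sketch. It follows the standard DCP formulation (minimum over color channels and over a local patch, then t = 1 - omega * dark channel of the airlight-normalized image); the patch radius, `omega = 0.95`, the known `airlight`, and the nested-list image layout are assumptions made for this example.

```python
def dark_channel(image, patch_radius):
    """Per-pixel minimum over color channels and over a square local patch.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    channel_min = [[min(pixel) for pixel in row] for row in image]
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Patch bounds, clipped at the image border.
            y0, y1 = max(0, y - patch_radius), min(height, y + patch_radius + 1)
            x0, x1 = max(0, x - patch_radius), min(width, x + patch_radius + 1)
            dark[y][x] = min(channel_min[yy][xx]
                             for yy in range(y0, y1)
                             for xx in range(x0, x1))
    return dark


def transmission_map(image, airlight, patch_radius=1, omega=0.95):
    """Estimate transmission as 1 - omega * dark_channel(image / airlight)."""
    normalized = [[tuple(c / a for c, a in zip(pixel, airlight)) for pixel in row]
                  for row in image]
    dark = dark_channel(normalized, patch_radius)
    return [[1.0 - omega * d for d in row] for row in dark]
```

In the approach described above, a map like this is fed to the network as additional information; how it is combined with the adaptive SSIM loss is not detailed in the abstract and is not modelled here.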

18:00-20:00 Reception Banquet

Nanjing University International Conference Center

Location: XiongmangTing