08:30 | General Chair Speech |
08:40 | Introduction to ISAIR2018 SPEAKER: Huimin Lu |
08:55 | Welcome to ISAIR2019 |
Nanjing University International Conference Center
15:10 | Compressive Sensing-based Optimal Design of an Emerging Optical Imager SPEAKER: Gang Liu ABSTRACT. The emerging optical imager can greatly reduce system weight and size compared to conventional telescopes. The compressive sensing (CS) theory demonstrates that incomplete and noisy measurements may actually suffice for accurate reconstruction of compressible or sparse signals. In this paper, we propose an optimized design of the emerging optical imager based on compressive sensing theory. It simplifies the data acquisition structure and reduces the data transmission burden. Moreover, the system robustness is improved. |
15:15 | Synthesizing Virtual-Real Artworks using Sun Orientation Estimation SPEAKER: Xin Jin ABSTRACT. The illumination effect is essential for realistic results in images created by inserting virtual objects into a real scene. For outdoor scenes, automatic estimation of the sun orientation from a single outdoor image is fundamental for inserting 3D models into that image. Traditional methods for outdoor sun orientation estimation often use handcrafted illumination features or cues. These cues rely heavily on human experience and on pre-processing steps using other image understanding technologies such as shadow and sky detection, geometry recovery and intrinsic image decomposition, which limits their performance. We propose an end-to-end approach to outdoor sun orientation estimation via a novel deep convolutional neural network (DCNN), which directly outputs the orientation of the sun from an outdoor image. Our proposed SunOriNet contains a contact layer that directly contacts the intermediate feature maps to the high-level ones and learns hierarchical features automatically from a large-scale image dataset with annotated sun orientations. The experiments reveal that our DCNN can estimate sun orientation well from a single outdoor image. The estimation accuracy of our method outperforms both traditional handcrafted-feature-based methods and state-of-the-art DCNN-based methods. |
15:20 | Silhouette Photo Style Transfer SPEAKER: Henan Li ABSTRACT. Silhouette photography is popular among photographers. However, it is hard for ordinary users to shoot this kind of photos because of the limitations of cameras, weather and skills. In this work, we propose an automatic photo style transfer approach that can generate realistic silhouette images. First we present a sky segmentation method to divide an input image into an object foreground and a sky background. Then, for the background, we implement a statistic color transfer method using a specified sky photo. Finally, in order to generate natural results, we develop an adaptive approach to adjust the color of the object foreground considering the ambient color computed from the stylized background. Extensive experiments show that our methods can achieve satisfactory sky segmentation results and generate aesthetically pleasing silhouette photos. |
15:25 | Low-Rank Matrix Recovery for Source Imaging with Magnetoencephalography SPEAKER: Yegang Hu ABSTRACT. Source imaging with magnetoencephalography (MEG) has obtained good spatial accuracy on the shallow sources, and has been successfully applied in the brain cognition and the diagnosis of brain disease. However, its utility with locating deep sources may be more challenging. In this study, a new source imaging method was proposed to find real brain activity on deep locations. A sensor array with MEG measurements including 306 channels was represented as a low-rank matrix plus sparse noises, where the sensor array that removed the interference can be explained by the low-rank matrix. The low-rank matrix was then used to estimate the source model using a minimum variance beamforming. Simulations with a realistic head model indicated that the proposed method was effective. This method was further verified in 10 patients with temporal lobe epilepsy, and the localization results may be more consistent with the clinical conclusion. |
15:30 | Multi-task Deep Learning for Fine-grained Classification/Grading in Breast Cancer Histopathological Images SPEAKER: Xipeng Pan ABSTRACT. The fine-grained classification or grading of breast cancer pathological images is of great value in clinical application. However, the manual feature extraction methods not only require professional knowledge, but also the cost of feature extraction is high, especially the high quality features. In this paper, we devise an improved deep convolution neural network model to achieve accurate fine-grained classification or grading of breast cancer pathological images. Meanwhile, we use online data augmentation and transfer learning strategy to avoid model overfitting. According to the issue that small inter-class variance and large intra-class variance exist in breast cancer pathological images, multi-class recognition task and verification task of image pair are combined in the representation learning process; in addition, the prior knowledge (different subclasses with relatively large distance and small distance between the same subclass) are embedded in the process of feature extraction. At the same time, the prior information that pathological images with different magnification belong to the same subclass will be embedded in the feature extraction process, which will lead to less sensitive with image magnification. Experimental results on three different pathological image datasets show that the performance of our method is better than that of state-of-the-arts, with good robustness and generalization ability. |
15:35 | Nuclear Norm Regularized Structural Orthogonal Procrustes Regression for Face Hallucination with Pose SPEAKER: Dong Zhu ABSTRACT. In real applications, the observed low-resolution (LR) face images usually have pose variations. Conventional learning based methods ignore these variations, thus the learned representations are not beneficial for the following reconstruction. In this paper, we propose a nuclear norm regularized structural orthogonal Procrustes regression (N2SOPR) method to learn pose-robust feature representations for efficient face hallucination. The orthogonal Procrustes regression (OPR) seeks an optimal transformation between two images to correct the pose from one to the other. Additionally, our N2SOPR uses the nuclear norm constraint on the error term to keep image’s structural information. A low-rank constraint on the representation coefficients is imposed to adaptively select the training samples that belong to the same subspace as the inputs. Moreover, a locality constraint is also enforced to preserve the locality and the sparsity simultaneously. Experimental results on standard face hallucination databases indicate that our proposed method can produce more reasonable near frontal face images for recognition purpose. |
15:40 | An Adaptable Digital Camouflage Synthesis for 3D Surfaces SPEAKER: Guangxu Li ABSTRACT. Military camouflage should be varied in order to adapt quickly to environmental changes on the battlefield. Concealment is reduced if constant camouflage patterns are used on clothes and weapons, as is the case in most current militaries. In this paper, we propose a texture synthesis method from an image using convolutional neural networks, which is effective for camouflage generation on a 3D surface. We use the latest advances in style transfer for 2D images and add surface parameterization methods that apply to a curved surface. This allows us to make an adaptive texture map of 3D objects, even in the case of complex topologies. |
15:45 | Non-destructive Detection of Medicines Using NIRS Based Collaborative Representation SPEAKER: Zhenbing Liu ABSTRACT. Near-infrared spectroscopy (NIRS) has potential for non-destructive detection (classification) of medicines. To address the identification problem of medicines, a sparse signal representation model is established from the NIRS signal in the presence of spectral crossover and overlapping. However, the problem of finding the sparsest solution is difficult even to approximate the initial absorption spectra. Meanwhile, as in binary classification, the nonzero representation coefficients concentrate on two classes to distinguish the multi-label classification. Thus, a novel classification model, Collaborative Representation classification with Gabor optimizer for Regularized Least Square (CRC_GRLS), is constructed to overcome the two crucial issues: “Sparsity” and binary classification. By using Gabor filters to handle “Sparsity” to obtain the more relevant factor vectors of the NIRS signal and adding some justifications for the detection of medicines, a CRC_GRLS model with low classification errors could be obtained. The experiments using NIRS samples from three data sets (active substance, Erythromycin Ethylsuccinate and Domperidone) show that the proposed model has substantial potential to find the difference in the chemical characteristics of medicines using NIRS data, and it has a speed-up of about 1 times compared with the Sparse Representation based Classification (SRC) and Class L1-optimizer classifier with the closeness rule (C_CL1C). |
15:50 | Multi-view Registration Based on Expectation Maximization Algorithm ABSTRACT. Many methods have been proposed and improved to deal with multi-view point cloud registration. Most of them are based on the classical method Iterative Closest Point (ICP), which is fast and accurate in most cases. However in the case where great noise exists in cloud points, the equal weights ICP assigned to all correspondences would lead to an unsatisfactory registration result. To address this issue, this paper proposes a new automatic multi-view registration method based on Expectation Maximization(EM). Instead of giving equal weights to all points as did in ICP, we introduce a Gaussian distribution on modeling the probabilities of aligning data shape with the model shape to assign different weights to points according to the distance. Then Expectation Maximization(EM) is brought in to optimize the likelihood function formulated with Gaussian distribution. By iteratively setting each scan as data shape and others altogether as model shape, and aligning them repeatedly until convergence, we could obtain the multi-view registration results at last. The experimental results demonstrate accuracy and robustness of our methods over three state-of-the-art algorithms, especially when noisy data exist. |
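The distance-based weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes isotropic Gaussian weights, a hand-picked `sigma`, and pre-computed correspondence distances; the function name `gaussian_weights` and the sample values are hypothetical and are not taken from the paper.

```python
import math

def gaussian_weights(distances, sigma):
    """Hypothetical helper: normalized Gaussian weights for correspondence distances.

    Closer correspondences receive larger weights, unlike plain ICP,
    which treats every correspondence equally.
    """
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]

# Assumed example distances: two plausible matches and one likely outlier.
weights = gaussian_weights([0.1, 0.5, 3.0], sigma=0.5)
```

In an EM-style loop, weights of this kind would be recomputed in each expectation step before the rigid transformation is re-estimated; that outer loop is omitted here.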
15:55 | Film Clips Retrieval using Image Queries SPEAKER: Ling Zou ABSTRACT. The rise of the entertainment industry has driven explosive growth in automatically generated film trailers. Manually finding desired clips in these large amounts of film is time-consuming and tedious, which makes finding the moments that match a user's main or special preferences an urgent problem. Moreover, because each user's view of a film is subjective, no fixed trailer meets all users' interests. This paper addresses these problems by proposing a query-related film clip extraction framework which optimizes the selected frames to be both semantically related to the query and visually representative of the entire film. The experimental results show that our query-related film clip retrieval method is particularly useful for film editing, e.g. showing the abstraction of the entire film while placing focus on the parts that match the user queries. |
16:00 | Improved Rao-Blackwellised Particle Filter based on randomly weighted PSO SPEAKER: Ye Zhao ABSTRACT. In this paper, a new RBPF-SLAM based on randomly weighted PSO (Particle Swarm Optimization) is proposed in order to solve some problems in the Rao-Blackwellised particle filter (RBPF), including the depletion of particles and loss of diversity in the process of resampling. A PSO optimization strategy is introduced in the modified algorithm, and the inertia weight is set randomly. The modified PSO is utilized to optimize the particle set to avoid particle degeneration and keep diversity. The proposed algorithm is simulated on the Qt platform and verified in ROS with a turtlebot. Results show that the proposed RBPF outperforms RBPF-SLAM and FastSLAM2.0. |
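As a rough illustration of a randomly set inertia weight, the sketch below runs a generic particle swarm optimizer on a toy objective. It is not the RBPF-SLAM algorithm from the talk: the sampling interval [0.4, 0.9], the acceleration constants, the search range, and all function names are assumptions chosen for the example.

```python
import random

def pso_random_inertia(objective, dim, n_particles=20, iters=50, seed=0):
    """Generic PSO where the inertia weight is redrawn at random for each particle update."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val
    c1 = c2 = 1.5  # assumed cognitive and social coefficients
    for _ in range(iters):
        for i in range(n_particles):
            w = rng.uniform(0.4, 0.9)  # assumed range for the random inertia weight
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best, best_val, initial_val = pso_random_inertia(sphere, dim=2)
```

Because the global best is only replaced by strictly better positions, `best_val` can never be worse than the best value in the initial swarm.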
16:05 | A Novel Sliding Mode Control For Human Upper Extremity With Gravity Compensation ABSTRACT. The paper studied the reaching movements of redundant human upper extremity muscles using a sliding mode control based on fuzzy adaptive scale adjustment. A two-link planar human musculoskeletal arm model with six redundant muscles is adopted on the basis of the Hill type. The study focused on the gravity compensation for the muscle input during the reaching movements. Through the fuzzy adaptive system, the sliding mode controller may achieve adaptive approximation of the switching scale so as to eliminate chattering. Numerical simulations are performed in order to verify the control. The results revealed that the human upper extremity can accomplish the reaching movements very well with the proposed sliding mode controller. |
16:10 | GPU-Accelerated Feature Tracking for SFM Based 3D Reconstruction SPEAKER: Mingwei Cao ABSTRACT. This paper presents a novel GPU-accelerated feature tracking (GFT) method for large-scale structure from motion (SFM)-based 3D reconstruction. The proposed GFT method consists of GPU-based Difference of Gaussian (DOG), RootSIFT descriptor, k nearest neighbors matching, and outlier removing. Firstly, our GPU-based DOG implementation can detect thousands of keypoints in real-time, whose speed is 30 times faster than that of the CPU version. Secondly, our GPU-based RootSIFT descriptor can compute thousands of descriptors in real-time. Thirdly, the speed of our GPU-based descriptor matching is 10 times faster than that of the state-of-the-art methods. Finally, we conduct thorough experiments to evaluate the proposed method. Experimental results demonstrate the effectiveness and efficiency of the proposed method. |
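For readers unfamiliar with the RootSIFT descriptor named in this abstract, here is a minimal CPU-only sketch of the usual RootSIFT transform: L1-normalize a SIFT descriptor, then take the element-wise square root. It says nothing about the GPU implementation in the talk; the function name `root_sift`, the `eps` guard, and the three-element input are illustrative assumptions (real SIFT descriptors have 128 dimensions).

```python
import math

def root_sift(descriptor, eps=1e-12):
    """Apply the RootSIFT transform: L1-normalize, then element-wise square root."""
    l1 = sum(abs(v) for v in descriptor) + eps  # eps avoids division by zero
    return [math.sqrt(abs(v) / l1) for v in descriptor]

# Toy 3-dimensional stand-in for a SIFT descriptor.
vec = root_sift([4.0, 0.0, 12.0])
```

After the transform the vector has unit L2 norm (up to `eps`), so Euclidean distance between RootSIFT vectors corresponds to the Hellinger kernel on the original descriptors.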
16:15 | ACTIVITY RECOGNITION SYSTEM WITH SHADOW SUPPRESSION USING ADAPTIVE GAUSSIAN MIXTURE MODEL ABSTRACT. An activity recognition system is an important step in any video surveillance, tracking or video activity system. This paper examines the result of the adaptive Gaussian Mixture Model using maximum a posteriori (MAP) updates on video clips (dataset) obtained from Adeyemi College of Education Ondo, Nigeria. The results showed a reliable moving object detection algorithm; however, shadows constitute a problem, in that moving shadows can be mistaken for moving objects. The shadow was suppressed using the HSV and Phong illumination models. The overall performance of this system was evaluated using the confusion matrix, the receiver operating characteristic (ROC), and shadow detection and shadow discrimination values, which showed a better result compared to existing benchmarks. |
16:20 | Unsupervised Feature Selection Using Ideal Local Structure Learning SPEAKER: Yanbei Liu ABSTRACT. Unsupervised feature selection has become an important and challenging problem due to vast amounts of unlabelled and high-dimensional data in machine learning. Traditional unsupervised feature selection algorithms usually need to build the similarity matrix making the selected features heavily depend on the learned structure. However, the real-world data always contains lots of noise samples and features that may make the similarity matrix obtained by original data unreliable. In this paper, we propose a novel Unsupervised Feature Selection using Ideal Local Structure Learning (LSL) method, which performs local structure learning and feature selection simultaneously. To obtain more accurate structure information, we learn an ideal local structure with exactly c connected components of data (where c is the number of clusters), thus the proposed method can select more valuable features. Furthermore, we present a simple yet effective iterative algorithm to optimize our algorithm. Experiments on various benchmark datasets, including biomedical data, letter recognition digit data and face image data, demonstrate the encouraging performance of our algorithm over the state-of-the-arts. |
16:25 | Semantics Consistent Adversarial Cross-Modal Retrieval SPEAKER: Ruisheng Xuan ABSTRACT. Cross-modal retrieval returns the relevant results from the other modalities given a query from one modality. The main challenge of cross-modal retrieval is the “heterogeneity gap” amongst modalities, because different modalities have different distributions and representations. Therefore, the similarity of different modalities can not be measured directly. In this paper, we propose a semantics consistent adversarial cross-modal retrieval approach, which learns a semantics consistent representation for different modalities with same semantic category. Specifically, we encourage the class center of different modalities with same semantic label to be as close as possible, and also minimize the distances between the samples and the class center with same semantic label from different modalities. Comprehensive experiments on Wikipedia dataset are conducted and the experimental results show the efficiency and effectiveness of our approach in cross-modal retrieval. |
16:30 | Hatching Eggs Classification Based on CNN with Channel Weighting and Joint Supervision SPEAKER: Huasong Liu ABSTRACT. Convolutional neural network (CNN) shows the state-of-the-art performance in tackling a variety of visual tasks. It is expected that CNN can apply to 9-day hatching eggs classification. This kind of hatching eggs are divided into fertile eggs and dead eggs. Because of the inter-class similarity and intra-class difference issue, the CNN classification method combining channel weighting (Squeeze-and-Excitation module) and joint supervision is proposed to improve the classification accuracy. We use Center loss and Softmax loss together as joint supervision signal. With such joint supervision, CNN can obtain the deep features with inter-class dispersion and intra-class compactness, which enhances the discriminative and generalization power. Simultaneously, channel weighting is adopted in feature extraction, which is added in each convolutional layer to make better use of channel features. The experimental results demonstrate that the proposed method successfully solves the classification problem of hatching eggs. The accuracy of our method is 98.8%. |
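The joint supervision signal described in this abstract can be sketched for a single sample as softmax cross-entropy plus a weighted center loss. The weighting factor `lam`, the function names, and the toy inputs are assumptions made for illustration; the abstract does not state the weighting used, and center updates during training are omitted.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax loss for one sample, computed with the log-sum-exp trick for stability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Half the squared distance between a deep feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(logits, label, feature, centers, lam=0.5):
    """Softmax loss plus lam times center loss; lam is an assumed hyperparameter."""
    return softmax_cross_entropy(logits, label) + lam * center_loss(feature, centers[label])

# Toy two-class example (e.g. fertile vs. dead) with 2-D features and assumed centers.
centers = {0: [1.0, 0.0], 1: [0.0, 0.0]}
loss = joint_loss([0.0, 0.0], 0, [1.0, 2.0], centers, lam=0.5)
```

The softmax term pushes classes apart, while the center term pulls each feature toward its own class center, which is the inter-class dispersion and intra-class compactness the abstract refers to.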
16:35 | ABSTRACT. In the auction market, allocation and pricing will affect participants’ behavior, honesty, and the success of the auction. A proper mechanism will help to achieve higher utility. In a combinatorial double auction, buyers bidding for commodity combinations of different sellers solve the problem of resource allocation in the real market more efficiently. In view of the problem of allocation and pricing of resource such as cloud resource allocation and spectrum auction, this paper designs the TCD4GB mechanism based on the scene of combined double auction, which determines winners, allocates goods and calculates payments. The concept of unit difference is introduced to this paper in order to solve winner determination problem in the group-buying mechanism. The matched sellers who have the minimum cost is directly chosen to be the winning sellers in the process of selecting the winning buyers. The mechanism also calculates payment by using the unit difference of overlapping buyers and apportion it to the matching sellers through the idea of second-price in VCG mechanism. This mechanism avoids the higher utility of buyers when they report falsely. Through theoretical and simulation experiments, it is proved that the TCD4GB mechanism satisfies the economic attributes of individual rationality and budget balance. |
16:40 | Equilibrium balking strategies of delay-sensitive customers in a clearing queueing system with service quality feedback SPEAKER: Baoxian Chang ABSTRACT. In this paper, we study a clearing queueing system with service quality feedback and system maintenance. Once the system receives an unsatisfied (negative) feedback from customers (i.e., a customer is unsatisfied with the service), all present customers are forced to leave the system, at the same time, the system undergoes a maintenance procedure. After experiencing an exponential maintenance time, the system resumes work immediately. We assume that arriving customers decide whether to join or balk the system based on a linear reward-cost structure. By considering the waiting cost and reward, we discuss the balking behavior of customers and obtain the corresponding equilibrium balking strategies and social optimal strategies for the observable case and the unobservable case. Finally, some numerical examples are provided to show the effect of several system parameters on the equilibrium and optimal balking strategies. |
16:45 | Research on fundus image registration and fusion method based on NSCT and adaptive PCNN SPEAKER: Shihao Zhang ABSTRACT. The fusion of fundus images can comprehensively show lesions and tissues across multiple image modalities, and is of great value to doctors in the diagnosis of fundus diseases. In this paper, we propose a method of fundus image registration and fusion that combines NSCT (Nonsubsampled Contourlet Transform) and an adaptive PCNN (Pulse Coupled Neural Network). The process is as follows. First, the two images are registered to eliminate the spatial differences between the source images: SURF (Speeded Up Robust Features) feature points are extracted, feature vectors are calculated from the feature point descriptions, the ratio of the nearest-neighbor distance to the next-nearest-neighbor distance is used for the initial matching of the feature points, and the RANSAC (Random Sample Consensus) method is used to remove mismatched point pairs. Finally, the transformation parameters between the images are calculated and the source images are registered by spatial transformation. The two registered images are then fused as follows: the low-frequency and high-frequency sub-bands of the images to be fused are obtained by NSCT decomposition; the low-frequency sub-band is fused by regional energy; the high-frequency sub-bands are processed with a simplified PCNN model and fused based on the number of times the image pixels fire. The experimental results show that this method gives better fundus image fusion results than other representative methods. The fused image synthesizes the image information and presents details clearly, providing an effective reference for the clinical diagnosis of fundus diseases. |
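The nearest-neighbor distance ratio test used for initial feature matching in this abstract can be sketched as follows. The threshold `ratio=0.8`, the function name, and the 2-D toy descriptors are assumptions for illustration only; real SURF descriptors have 64 or 128 dimensions, and the subsequent RANSAC step is not shown.

```python
import math

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Keep a match only if the nearest neighbor is clearly closer than the second nearest."""
    matches = []
    for i, a in enumerate(desc_a):
        # Sort candidate descriptors in B by Euclidean distance to descriptor a.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio is undefined with fewer than two candidates
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches

# Toy descriptors: the first has one clear match, the second is ambiguous.
a = [[0.0, 0.0], [5.0, 5.0]]
b = [[0.1, 0.0], [4.0, 4.0], [6.0, 6.0]]
matches = ratio_test_matches(a, b)
```

The second descriptor is rejected because its two closest candidates are equally distant, which is exactly the kind of ambiguous match the ratio test is meant to filter out before RANSAC.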
16:50 | Hyperspectral Image Classification Based on Multi-Feature Fusion SPEAKER: Jie Wang ABSTRACT. Traditional hyperspectral image (HSI) classification methods usually use the spectral feature, and do not make full use of the spatial feature or other characteristics of the HSI. This paper proposes a novel HSI classification method based on multi-feature fusion (SST). First, the spectral-spatial features are extracted by a spectral-spatial feature network. Then, the texture features of the local binary pattern (LBP) image are extracted and fused with the spectral-spatial features. Finally, a kernel-based extreme learning machine (KELM) is used to classify the hyperspectral images. Experiments on different datasets show that the proposed method can effectively improve the classification accuracy of hyperspectral images. |
16:55 | Multi-Scale Adversarial Network for Underwater Image Restoration SPEAKER: Jingyu Lu ABSTRACT. Underwater image restoration, which is the keystone of underwater vision research, is still a challenging task. The key point of underwater image restoration is how to remove the turbidity and the color distortion caused by the underwater environment. In this paper, we propose an underwater image restoration method based on transferring an underwater style image into a recovered style using a Multi-Scale Cycle Generative Adversarial Network (MCycle GAN) system. We include a Structural Similarity Index Measure loss (SSIM loss), which provides more flexibility to model detailed structure and improves the image restoration performance. We use the dark channel prior (DCP) algorithm to get the transmission map of an image and design an adaptive SSIM loss to improve underwater image quality. We input this information into the network for multi-scale calculation on the images, which achieves the combination of the DCP algorithm and Cycle-Consistent Adversarial Networks (CycleGAN). In quantitative and qualitative comparisons with existing state-of-the-art approaches, our method performs well on the underwater image dataset. |