SOICT 2019: THE 10TH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY
PROGRAM FOR FRIDAY, DECEMBER 6TH

09:00-09:45 Session 10
Location: Panorama
09:00
Keynote 7: Personal data as new oil

ABSTRACT. Differential Privacy (DP) has been receiving attention as a rigorous privacy framework. In this talk, we introduce the basic notion of DP and our recent studies on extending DP to spatio-temporal data. The topics include i) DP mechanisms under temporal correlations in the context of continuous data release; and ii) location privacy for location-based services over road networks.
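
As a rough illustration of the basic notion of DP mentioned in this abstract, the sketch below adds Laplace noise to a counting query. It is a generic example of the standard Laplace mechanism and is not taken from the talk; the function names `laplace_noise` and `private_count` and the default sensitivity of 1 are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count under epsilon-DP using the Laplace mechanism.

    The noise scale is sensitivity / epsilon. A counting query changes by
    at most 1 when one record is added or removed (assumed sensitivity).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller `epsilon` gives a larger noise scale, that is, stronger privacy at the cost of accuracy.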

09:45-09:55 Coffee Break
09:55-11:55 Session 11A
Location: Panorama 1
09:55
Emotional Speech Generator by using Generative Adversarial Networks

ABSTRACT. In this paper, we propose an affective voice conversion method that generates emotional speech from neutral speech using cycle-consistent generative adversarial networks (CycleGAN). The method takes Mel-cepstral coefficients (MCEPs) extracted from the speech signal as input. We then apply a modified network model that comprises two components, a generator and a discriminator. In the generator network, the paired encoder-decoder structure allows accurate and fast calculation during learning. Furthermore, we construct two types of encoder to achieve more accurate emotional conversion: a content encoder that encodes linguistic information and a domain encoder that encodes emotional information. This separation contributes to the smooth reproduction of emotional information. Finally, we evaluate the emotional expression and sound quality of the speech converted by the proposed method. The results show that our method succeeds in converting the emotion of the speech, although additional effort is needed to improve sound quality.

10:15
Context-Aware Collaborative Object Recognition For Distributed Multi Camera Time Series Data

ABSTRACT. Recent research shows that multi-view systems for object recognition outperform single-viewpoint systems. Adding viewpoints, however, also adds communication cost and deployment cost. Prior work has shown that not all views are useful and that poor viewpoints can be excluded. This paper explores the application of dynamic context to a Context-Aware Neural Network. The Context-Aware Neural Network uses the Shannon entropy value to obtain a likelihood, and uses this likelihood value to reduce viewpoints in a distributed system. However, viewpoint reduction has so far been done for static image recognition, where the spatial relation between the views and the subject is fixed. Extension to dynamic context is essential since most real-world data is a series of images rather than a snapshot of the scene. Apart from testing on images of 3D CAD data, this paper illustrates the generation of videos from 3D CAD data and examines the analysis of the generated videos using the Context-Aware Neural Network. In this particular setup, relevant objects move with respect to a fixed set of cameras. We report that the viewpoints can be reduced and that the context of the training data matters in this setup.
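
To make the role of Shannon entropy concrete, the sketch below scores each camera view by the entropy of its predicted class distribution and keeps only the confident (low-entropy) views. This is an assumed, simplified selection rule for illustration, not the paper's network; the names `select_views`, `cam_front`, `cam_side` and the threshold value are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_views(view_probs, max_entropy):
    """Keep the views whose class distribution is confident enough."""
    return [view for view, probs in view_probs.items()
            if shannon_entropy(probs) <= max_entropy]


views = {
    "cam_front": [0.90, 0.05, 0.05],  # confident prediction, low entropy
    "cam_side": [0.34, 0.33, 0.33],   # near-uniform prediction, high entropy
}
kept = select_views(views, max_entropy=1.0)
```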

10:35
OCR Error Correction for Unconstrained Vietnamese Handwritten Text

ABSTRACT. Post-processing is an essential step in detecting and correcting errors in OCR-generated texts. In this paper, we present an automatic OCR post-processing model that comprises both error detection and error correction phases for OCR output texts of unconstrained Vietnamese handwriting. We propose a hybrid approach to generating and scoring correction candidates for both non-syllable and real-syllable errors, based on linguistic features as well as the error characteristics of OCR outputs. We evaluate our proposed model on a Vietnamese benchmark database at the line level. The experimental results show that our model achieves a character error rate (CER) of 4.17% and a word error rate (WER) of 9.82%, improving the CER and WER of an attention-based encoder-decoder approach by 0.5% and 3.5%, respectively, on the VNOnDB-Line dataset of the Vietnamese online handwritten text recognition competition (VOHTR2018). These results outperform those obtained by various recognition systems in the VOHTR2018 competition.

10:55
BK.Synapse: A scalable distributed training framework for deep learning

ABSTRACT. Training neural networks efficiently is a thoroughly-researched topic that plays an important role in their adoption. Major advancements have been made, including the use of multiple nodes to further decrease training time. However, training at scale usually means adding on multiple layers of complex deployment logic and parallelization concerns, distracting researchers from the core of their algorithms. This paper presents a framework called BK.Synapse that can facilitate distributed training while maintaining clarity, simplicity, and user-friendliness. The design is modular, allowing flexible and easy deployment on a variety of hardware specifications. The framework is benchmarked in a case study: training a neural network for an object detection problem. Our results show a good amount of improvements over conventional training, with very few modifications to the existing codebase. The resulting model also performs relatively well upon further testing.

11:15
Building Face Recognition System with Triplet-based Stacked Variational Denoising Autoencoder

ABSTRACT. Face recognition is a fundamental and critical topic in computer vision. In this work, a face recognition system based on stacked variational denoising autoencoders with triplet loss is proposed to overcome existing challenges with regard to face variations, including pose, illumination, expression, and low resolution, using less training data. In the proposed system, a stacked variational denoising autoencoder is used to build a deep architecture for extracting salient and latent features from data. In addition, by using a triplet loss function, we preserve categorical similarity between faces and thereby improve the performance of the autoencoders in the clustering task. The proposed system is evaluated on benchmark face datasets including ORL, Yale, and YouTube Faces. Preliminary results demonstrate that the proposed system yields results comparable to those of deep convolutional neural network (CNN) methods and non-deep-CNN methods.
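
For readers unfamiliar with triplet loss, a minimal sketch of one common formulation follows. It uses squared Euclidean distance and a margin of 1.0, both of which are assumptions; the abstract does not state which variant the authors use.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (in squared distance).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

Here `anchor` and `positive` would be embeddings of two images of the same person, and `negative` an embedding of a different person.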

09:55-11:55 Session 11B
Location: Panorama 2
09:55
A new Anomaly Traffic Detection Based on Fuzzy Logic Approach in Wireless Sensor Networks

ABSTRACT. In the Internet of Things area, Wireless Sensor Networks (WSNs) serve as a key infrastructure for many interesting applications. A WSN is a distributed, self-organizing wireless network subject to many constraints, such as limited communication capacity, limited energy, and its distributed nature. This type of wireless network is usually vulnerable to many types of attacks, including jamming attacks, DDoS attacks, and malicious nodes. Anomaly detection is a typical intrusion defense approach; among such approaches, entropy-based anomaly detection has been extensively studied and has produced some good results. However, it faces several challenges, such as complexity and accuracy. Hence, in this paper, a new anomaly detection method based on a fuzzy inference system is proposed to detect anomalous network traffic in wireless sensor networks. The fuzzy inference system model is based on a fuzzy estimator and is validated by numerical results.

10:15
Balanced landmark-based graph partitioning with application in navigating with limited resources

ABSTRACT. We study the problem of graph partitioning balanced on vertex weight, where we aim to maximize the average weight of the obtained subgraphs. Our approach partitions the graph by a graph Voronoi diagram whose Voronoi nodes, considered as landmarks, are carefully selected to achieve these targets. We also introduce a direct application of this problem, namely navigating with limited resources, where one would like to produce an inexpensive navigation device. Although this device has little battery capacity and limited memory, it can still show directions during a long hiking journey in the wilderness. This is achieved by using a specialized navigation map created from our proposed graph partitioning. We introduce two different heuristic techniques for selecting the landmarks in the desired manner. Initial experiments with our proposed algorithms show encouraging results: we successfully construct a 0.5-balanced partition for all tested graphs.
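
A graph Voronoi diagram assigns each vertex to its nearest landmark by shortest-path distance. The sketch below computes such a partition with multi-source Dijkstra. It only illustrates the partitioning step for a given set of landmarks; the landmark selection heuristics of the paper are not reproduced, and the function name `graph_voronoi` and the adjacency format are assumptions.

```python
import heapq


def graph_voronoi(adjacency, landmarks):
    """Assign every reachable vertex to its nearest landmark.

    adjacency maps a vertex to a list of (neighbour, edge_length) pairs.
    Returns a dict vertex -> landmark, computed with multi-source Dijkstra.
    """
    dist = {lm: 0.0 for lm in landmarks}
    owner = {lm: lm for lm in landmarks}
    heap = [(0.0, lm) for lm in landmarks]
    heapq.heapify(heap)
    while heap:
        d, vertex = heapq.heappop(heap)
        if d > dist[vertex]:
            continue  # stale queue entry
        for neighbour, length in adjacency.get(vertex, []):
            candidate = d + length
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                owner[neighbour] = owner[vertex]
                heapq.heappush(heap, (candidate, neighbour))
    return owner


# Path graph a - b - c - d - e with one long edge between c and d.
path = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 3.0)],
    "d": [("c", 3.0), ("e", 1.0)],
    "e": [("d", 1.0)],
}
cells = graph_voronoi(path, ["a", "e"])
```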

10:35
Blockchain-Based Cross-Organizational Integrated Platform for Issuing and Redeeming Reward Points

ABSTRACT. To enhance repurchase rates and customer loyalty, many enterprises have issued their own reward points. However, as more enterprises issue reward points, customers become less interested in them and less willing to collect them. Even when companies form a business alliance to make reward points more attractive to users, the alliance is usually managed by a central leading company. In other words, the security concerns of a centralized system and database remain. In this paper, we propose a blockchain-based cross-organizational integrated platform (BCOIP) for issuing and redeeming reward points. BCOIP provides customers with an electronic reward point system that is easier to collect and manage than physical points and that can circulate among many enterprises. In addition, since BCOIP deploys smart contracts to operate the reward points and stores all point transactions on a permissioned blockchain, it provides more interoperability and security to enterprises and more credibility for customers. Moreover, goods providers joining BCOIP can not only improve the diversity of the reward market but also reach more customers. Finally, we conduct two experiments to evaluate the efficiency and stability of BCOIP. The first shows that point transactions are stable regardless of the number of parallel processes, and the second shows that the number of nodes does not affect the efficiency of the system.

10:55
Recursive Gateway Allocation Combined with Self-localization and Model Checking in Mobile Ad-hoc Networks

ABSTRACT. The Internet of Things (IoT) has been applied to many systems and is growing rapidly to provide various services. IoT technologies rely on network systems; for example, a mobile ad-hoc network enables connections between gateways and devices. Allocating gateways so that system requirements are satisfied is a problem to be solved for IoT development. In this study, we propose a recursive approach to allocating gateways, combined with self-localization and model checking in mobile ad-hoc networks. The aim is to allocate gateways on a map so that a mobile device can estimate its position. The approach is recursive: we first construct an initial map, and then repeatedly apply a model checking technique to validate convergence and reconfigure the gateway allocation according to constraints obtained from the requirements of the target network system. Experimental studies present concrete examples that are small in scale but contain the essential foundations of the proposed approach. We also discuss possible extensions and how to apply this study to real systems.

11:15
Privacy Preserving Visual Log Service with Temporal Interval Query using Interval Tree-based Searchable Symmetric Encryption

ABSTRACT. Visual logs have become widely available via personal cameras, visual sensors in smart environments, and surveillance systems. Storing such data in public services is a common and convenient solution, but it is essential to devise a mechanism to encrypt such data while retaining the ability to query visual content in encrypted form at the service. This motivates our proposal to develop a smart secure service for visual logs with a temporal interval query. In our system, visual log data are analyzed to generate high-level contents, including entities, scenes, and activities happening in the visual data. Our system then allows data owners to query these high-level contents from their visual logs at the server side within a temporal interval while the data are still encrypted. Our searchable symmetric encryption scheme, TIQSSE, utilizes an interval tree structure, and we prove that our scheme achieves efficient search and update times while maintaining all important security properties, such as forward privacy and backward privacy, and without leaking information outside the desired temporal range.

11:35
Detecting Web Attacks using Stacked Denoising Autoencoder and Ensemble Learning Methods

ABSTRACT. Web-based anomalies remain a serious security threat on the Internet. This paper proposes the use of Sum Rule and XGBoost to combine the outputs of various Stacked Denoising Autoencoders (SDAEs) in order to detect abnormal HTTP queries. Sum Rule and XGBoost inherit the distinct advantage of SDAEs of not requiring handcrafted features to be extracted. Furthermore, these methods can cope with changing web vulnerabilities, where malicious code is added into different parts of the request header and body. Experiments were carried out on the DVWA dataset and on a dataset obtained from a real-world application. Sum Rule and XGBoost achieve a higher F1-score than the state-of-the-art Regularized Deep Autoencoders, Isolation Forest, C4.5 decision tree, and Long Short-Term Memory network.

09:55-11:35 Session 11C
Location: Ruby
09:55
Unsupervised pregnancy and physical activity detection in mammals using circadian rhythms

ABSTRACT. Circadian rhythms are daily cyclical biological processes expressed in almost all tissues throughout the body. While circadian rhythms are endogenous, they are affected and possibly modified by external factors ranging from light exposure, activity levels, food intake, and temperature adjustments, to other factors both inside and outside of the body. Providing a clear representation of our circadian system and the related disturbances derived from misalignments between our internal clock and external stimuli may enable us to associate the insurgence of a specific disease or condition to a modification of an established circadian rhythm. We examine if an individual’s circadian rhythm, assembled from ubiquitous physiological data such as core body temperature, can uncover anomalies related to the disease or condition being addressed in an unsupervised fashion. Here, we show that circadian rhythms derived from core body temperature data can successfully be employed to classify pregnancies in laboratory mice that will and won’t come to term, as well as states of low/high activity in goats and sheep without using labelled data. Our algorithm can be used for an efficient and fast visualization of successful and unsuccessful pregnancy status, within 1 day from the pairing episode, and a comparison between our component mapping and bipartition of raw activity data yielded 84% accuracy, 30% precision, 84% recall, and 42% F-score. Furthermore, we have proposed a new graph type to display these aforementioned components: C-lock. Our unsupervised approach can be applied to other types of unlabeled datasets that exhibit cyclical behavior.

10:15
Quantification of Pass Plays Based on Geometric Features of Formations in Team Sports

ABSTRACT. In recent years, geometric features of formations, such as players’ dominant regions and their adjacency, have been frequently computed and utilized for the analysis of team sports. Such geometric features have also been successfully applied to real game data in many team sports. In this paper, we propose a novel method for quantifying pass plays by combining multiple geometric features of formations from two viewpoints: (i) effectiveness in leading to a shot, and (ii) risk of losing the ball. The proposed method extracts many feature values, including passers’ and receivers’ geometric feature values, and then constructs quantification models for the two viewpoints based on statistical classification methods using the extracted feature values. The proposed method thus enables users to effectively understand pass plays and the tactics used through tools that visualize player and ball position data. For validation, the method is applied to real player position data from a soccer game.

10:35
Cow estrus detection with low-frequency accelerometer sensor by unsupervised learning

ABSTRACT. The Internet of Things (IoT) and Machine Learning (ML) have been applied successfully in agriculture, helping to increase productivity and reduce labour. Specifically, we focus on improving autonomous estrus detection of cows on modern farms in terms of energy consumption and precision. In the previous detection pipeline, an accelerometer is mounted on the neck of a cow to capture motion data at high frequency, followed by an ML algorithm that checks the data and determines whether the cow is in estrus. Instead, we configure the accelerometer to sample at low frequency to minimize its energy consumption; however, this leads to an undesirably higher false alarm rate. We mitigate these false alarms by designing a new unsupervised learning pipeline and proposing a new postprocessing algorithm. The proposed postprocessing algorithm is a backtracking algorithm that incorporates a timing constraint on the period, obtained from agricultural knowledge. With this constraint, the postprocessing algorithm achieves significantly higher precision on a simulated dataset than the simple thresholding techniques of previous studies. The overall result of the pipeline and the proposed algorithm is visualized on real-world data captured on the farm of our agriculture department.

10:55
Improving CRNN with EfficientNet-like feature extractor and multi-head attention for text recognition

ABSTRACT. Text recognition is one of the most important and challenging tasks in image-based sequence recognition, which has various potential applications in real life. In this paper, we propose a novel convolutional-recurrent neural network (CRNN) for text recognition. In particular, we adapt the EfficientNet architecture for extracting deep features and propose multi-head attention mechanisms to improve character localization. The experiments show that our EfficientNet-like feature extractor clearly outperforms previous CNN feature extractors such as those based on VGG and ResNet. Overall, our proposed method yields competitive performance in comparison with other state-of-the-art approaches. Specifically, our F1-score is equivalent to a top-4 result in the ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction.

11:15
YOLOv3-VD: A sparse network for vehicle detection using variational dropout

ABSTRACT. Deep neural networks (DNNs) are currently state-of-the-art methods in many important AI tasks. However, DNNs usually contain a large number of parameters, which makes them prone to overfitting and slow at inference. In this paper, we apply variational dropout to sparsify the YOLOv3 network for vehicle detection. We then prune redundant layers and compress the network to reduce the memory size and accelerate the inference speed of the model. Experiments show that we can eliminate up to 91% of the weights in the original YOLOv3 with a negligible decrease in accuracy.

11:35
Visual Assistant for Crowdsourced Anomaly Event Recognition in Smart City

ABSTRACT. City anomalies such as residential fires or urban floods cause a lot of damage each year, from loss of human lives to negative effects on the productivity of the city. One possible way to deal with these anomalies quickly and effectively is to use the information provided by the network of citizens who are likely to be present at the scene of any serious anomaly. However, this source of information may be of varying quality, and there may be too much data for human operators to inspect manually in a timely manner. This motivates us to propose an architecture that effectively makes use of the network of citizens to deal with city anomalies. At the center of our architecture is a neural network that automatically classifies incoming image data and uses this information to assist anomaly handling efforts. For our classifier to work well, we have also collected a dataset of city anomalies specific to Vietnam. We experiment with several neural network models on our collected dataset. Overall, they all achieve decent accuracy, with the best being MobileNet at 92.3% accuracy.

12:00-13:30 Lunch
13:30-14:15 Session 12
Location: Panorama
13:30
Keynote 8: Unveiling the potential of Graph Neural Networks

ABSTRACT. Recent advances in Artificial Intelligence (AI) have led to a new era of Machine Learning techniques such as Deep Learning. This has attracted the interest of the networking community, which is trying to take advantage of these novel techniques to develop a new breed of network operation techniques: models, management and optimization. In this context, however, ML applied to networking has not yet fulfilled its high expectations. The main reason is that existing proposals typically use Fully-Connected Neural Networks (FCNNs), which are not designed to learn computer networks. Indeed, data networks are fundamentally represented as graphs (topology, routing), and FCNNs are not designed to learn information structured as graphs. Graph Neural Networks (GNNs) are a new family of neural networks that have recently been proposed with the aim of learning graphs. In other words, GNNs facilitate the learning of relations between entities in a graph and the rules for composing them (i.e., they have a strong relational inductive bias). In general, this is achieved by using Message-passing Neural Networks (MPNNs), which have been used in the literature to develop new chemical compounds. In this talk we will present RouteNet, a pioneering GNN designed to learn the complex relationship between routing, topology, traffic and the resulting delay, jitter and drops. We will also show that RouteNet is able to generalize, providing accurate estimates for unseen topologies, routings and traffic. Finally, we will discuss the prospects of this new technology and its potential in the field of Computer Networks.

14:15-14:30 Coffee Break
14:30-15:50 Session 13A
Location: Panorama 1
14:30
Fast Distance-based Outlier Detection in Data Streams

ABSTRACT. Continuous outlier detection in data streams is an important topic in data mining. It has many applications in public health, network intrusion detection, and fraud detection. Over the last two decades of research, many studies have been conducted on distance-based outlier detection algorithms, which are viable, scalable, parameter-free approaches. Because streaming data points arrive and expire over time, the challenge is to monitor the outlier status of data points with time and space efficiency. In this study, we propose three algorithms: O-MCOD, U-MCOD, and M-MCOD. These algorithms improve upon MCOD, the state-of-the-art algorithm for distance-based outlier detection in data streams, by relaxing the constraints of micro-clusters and using the minimal probing principle. With extensive experiments on synthetic and real-world datasets, we show that the proposed algorithms outperform MCOD in time and space efficiency. Specifically, our proposed algorithms are 1.5 to 95 times faster than MCOD and require as little as 25% of the peak memory of MCOD.
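
As background, the sketch below shows a naive check under a commonly used definition of a distance-based outlier: a point in the current window is an outlier if it has fewer than a given number of neighbours within a given radius. This brute-force version is for illustration only and is not MCOD or any of the proposed algorithms; the function name and the parameters `radius` and `min_neighbours` are assumptions.

```python
import math


def distance_outliers(window, radius, min_neighbours):
    """Return indices of points with fewer than min_neighbours within radius.

    Brute force over all pairs, so it costs O(n^2) distance computations
    per window; streaming algorithms aim to avoid exactly this cost.
    """
    outliers = []
    for i, p in enumerate(window):
        neighbours = sum(
            1 for j, q in enumerate(window)
            if i != j and math.dist(p, q) <= radius
        )
        if neighbours < min_neighbours:
            outliers.append(i)
    return outliers
```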

14:50
An approach to improving group recommendation systems based on latent factor matrices

ABSTRACT. Group activities are becoming increasingly important in various fields, which motivates extending single-user recommendation systems to group recommendation systems. An effective approach to group recommendation is to represent a group as a virtual user and then perform single-user recommendation for this virtual user. In this paper, we propose a novel virtual user computation, named the observed-filled-rating-based method, and its extended version. The aim of our research is to overcome the weaknesses of previous methods of virtual user computation and thereby improve group recommendation systems.

15:10
Session-Based Recommendation with Self-Attention

ABSTRACT. The goal of session-based recommendation is to predict the next action of a user based on the current session and previous anonymous sessions. Recent works on session-based recommendation usually use neural network architectures such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs) to extract patterns from sessions. Such features have been shown to give promising results because they can capture the user's sequential behavior and the purpose of the current session. In this paper, we propose a neural network architecture for session-based recommendation that uses neither convolutional nor recurrent neural networks. Our model is inspired by the Transformer's design, in which the information of important items is passed directly to the hidden states via a self-attention mechanism. Experimental results on two real-world datasets show that our method outperforms several state-of-the-art models.

15:30
Simulated Annealing for the Assembly Line Balancing Problem in the Garment Industry

ABSTRACT. Assembly line balancing (ALB) is the problem of assigning a set of tasks to workstations such that the precedence relations among the tasks are satisfied while optimizing given objectives. ALB is an important task for the garment industry: when the product model changes, the assembly line must be balanced again. ALB has been investigated extensively with different objectives, such as minimizing the number of workstations, minimizing the balance delay, and minimizing the cycle time. In this paper, the objective of ALB is to minimize the number of workstations for a given cycle time, subject to constraints on the precedence relations among tasks and on the number of equipment types and tasks in each group of tasks. We first use a greedy strategy to find an initial solution and then apply Simulated Annealing (SA) to find the best possible solutions. The proposed algorithms have been evaluated on an actual data set from Dong Van Garment Factory, Hanoi Textile Garment Joint Stock Corporation, Vietnam. The experiments show that the approach is feasible in real-life situations, with very fast running times. In particular, we achieved optimal results on small test cases.

14:30-16:30 Session 13B
Location: Panorama 2
14:30
Blockchain for Cyber-Physical System in Manufacturing

ABSTRACT. This paper proposes a blockchain manufacturing framework that utilizes the concepts of secondary validation and collaborative distribution. Conventional and blockchain manufacturing simulators were developed to investigate the effects of blockchain technologies on manufacturing systems in terms of time and product quality. The simulation results show that products manufactured in the blockchain-based system are of better quality than those of the conventional system, but that the blockchain-based system does not necessarily reduce manufacturing times because of the secondary validation and collaborative distribution processes, which are also responsible for ensuring manufacturing quality. The results also emphasize the importance of utilizing collaborative distribution models to improve workcell utilization within blockchain-based systems. Implementing the collaborative distribution logic within the blockchain reduces manufacturing times. This paper highlights the potential of blockchain technologies to enhance Cyber-Physical Systems in the manufacturing industry.

14:35
Using Metaheuristic for Solving the Resource-Constrained Deliveryman Problem

ABSTRACT. The Traveling Repairman Problem (TRP) is a class of NP-hard combinatorial optimization problems with many practical applications. In this paper, a general variant of TRP, known as TRPTW, is introduced. TRPTW deals with finding a tour that serves a set of locations, each within a specified time window. TRPTW is more complex than TRP because it is a generalization of TRP. Because the problem is NP-hard, metaheuristics need to be developed to provide near-optimal solutions within a short computation time for large instances. However, the main issue with metaheuristics is that they can become trapped in local optima, since the search space of the problem grows combinatorially. To overcome this drawback, we propose a metaheuristic algorithm based mainly on Variable Neighborhood Search (VNS) and shaking techniques. VNS generates diverse neighborhoods by using various neighborhood searches, while shaking guides the search towards unexplored parts of the solution space. This combination helps our algorithm escape local optima. Extensive numerical experiments on benchmark instances show that our algorithm reaches the optimal solutions for problems with up to 100 vertices in a reasonable amount of time. In addition, our algorithm is comparable with state-of-the-art metaheuristic algorithms in terms of solution quality and running time for larger instances.

14:40
Systemic Approach for Modeling a Generic Smart Grid

ABSTRACT. Technological advances in smart grids present a recent class of complex interdisciplinary modeling problems and simulation problems that are increasingly difficult to solve using traditional computational methods. Simulating a smart grid requires a systemic approach to the integrated modeling of power systems, energy markets, demand-side management, and many other resources and assets that are becoming part of the current paradigm of the power grid.

This paper presents a backbone model of a smart grid to test alternative scenarios for the grid. This tool simulates disparate systems to validate assumptions before the human scale model. Thanks to a distributed optimization of subsystems, the production and consumption scheduling is achieved while maintaining flexibility and scalability.

14:45
Voice authentication by text dependent single utterance for in-car environment

ABSTRACT. Individual authentication using speech is called speaker verification. Speaker verification, which can be implemented in portable devices, is used in many scenarios. This study focuses on speaker verification while driving. Noise and long-term variability of features are problems associated with speaker verification while driving. Considering the characteristics of noise in a moving car, spectral subtraction and low-frequency cutting are implemented in the noise reduction phase. We describe the adaptation of templates to the noisy environment. We also evaluate the long-term variability of features using speech recorded approximately 10 months after the first enrollment. The false reject rate (FRR) decreased to 66.6% on average when the noise reduction phase was implemented. In addition, the FRR is improved by the noise reduction phase through a template update using accepted speech. With respect to long-term variability, the FRR did not change after 10 months. This result indicates that the GMM Posteriorgram is valid for inter-speaker variability. In future work, we need to consider the template text to reduce the variation in the GMM.

14:50
Improving Prediction of Pass Receivable Players in Basketball

ABSTRACT. Recently, a tool has been developed to reproduce basketball games by visualizing players and ball positions data in 3D space. This tool can also display predictive information of pass receivable players to understand the tactics in the games effectively based on real game data. In this paper, we propose an improved method to predict such pass receivable players considering the characteristics of basketball plays. In this method, kinetic models for ball and player movements are used to simulate a large number of virtual passes with parameters estimated from the players and ball positions data. The pass receivable players are predicted based on the number of receivable virtual passes for each player. For an initial validation, this method was applied to the players and ball positions data included in the APIDIS basketball dataset, and the result was compared with that of a conventional method based on players’ adjacency information.

14:55
An EM Algorithm based Method for Constructing Dynamic Saliency Maps considering Characteristics while Driving

ABSTRACT. In this paper, we propose a novel construction method for dynamic saliency maps to predict the gaze of human drivers. In the proposed method, multiple feature maps are calculated from input images recorded by a vehicle-mounted camera. The dynamic saliency map consists of these multiple feature maps after center-biasing and normalization processes. The mixing ratios for these processed feature maps are determined with the Expectation–Maximization algorithm by considering the dynamic saliency map as a mixture distribution consisting of the processed feature maps as components. In addition, this paper introduces two models for constructing dynamic saliency maps. While the mixing weights are static in the first model, the mixing ratios are dynamically computed based on scene features calculated from input images in the second model. The proposed method is validated with these two models using fixation point data extracted from the DR(eye)VE dataset, which consists of videos recorded by both eye tracking glasses and a roof-mounted camera.
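The static-weight case described in this abstract can be illustrated with a minimal sketch, assuming the processed feature maps act as fixed mixture components and only the mixing ratios are learned. The function name `em_mixing_weights` and the input layout (one row per fixation point, one column per feature map) are assumptions made for illustration; they are not taken from the paper.

```python
def em_mixing_weights(component_probs, n_iter=100):
    """Estimate mixing ratios for fixed mixture components with EM.

    component_probs[n][k] is the (normalized) value of processed feature
    map k at fixation point n. This layout is an assumption.
    """
    n_comp = len(component_probs[0])
    weights = [1.0 / n_comp] * n_comp  # start from uniform mixing ratios

    for _ in range(n_iter):
        totals = [0.0] * n_comp
        for row in component_probs:
            # E-step: responsibility of each component for this point
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # point not explained by any component
            for k in range(n_comp):
                totals[k] += joint[k] / norm
        total = sum(totals)
        if total == 0.0:
            break  # nothing to update from
        # M-step: new mixing ratio is the average responsibility
        weights = [t / total for t in totals]
    return weights


# Toy data: map 0 is high at three of the four fixation points.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
weights = em_mixing_weights(probs)
```

In this toy input the first map explains most fixation points, so it receives the larger mixing ratio. The second model in the abstract, where ratios depend on scene features, is not covered by this sketch.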

15:00
Stock Market Trend Prediction using Supervised Learning

ABSTRACT. Stock market trend prediction has received considerable attention from researchers in recent times. It is an important application in the area of machine learning. In this work, we propose a machine learning based stock trend prediction system with a focus on minimizing data sparseness in the acquired datasets. We perform outlier detection on the acquired datasets for dimensionality reduction and employ a K-nearest neighbor classifier to predict stock trends. The results show the effectiveness of the proposed system compared with baseline studies.
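As a rough illustration of the classification step, the following sketch shows a plain K-nearest neighbor majority vote. The two-dimensional feature vectors and the "up"/"down" labels are hypothetical; the abstract does not state which features or labels the system uses, and the outlier detection step is not shown.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each Euclidean distance with its label, nearest first
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per trading day (values are made up for illustration)
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = ["down", "down", "up", "up"]
trend = knn_predict(train_X, train_y, (0.95, 1.0), k=3)
```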

15:05
PdM - A predictive maintenance modeling tool implemented as R-package and web-application

ABSTRACT. In the current manufacturing world, maintenance has a critical role to play in improving companies' competitiveness. Among the available maintenance strategies, predictive maintenance seems to be the most promising because failures are predicted and a timely reaction is possible. Therefore, in this paper, we propose the PdM package to build predictive maintenance models for proactive decision support based on machine learning algorithms. The proposed package is implemented in R and provides several major functionalities to create and evaluate predictive maintenance models. The PdM package also provides an interactive graphical user interface (web application) that enables the user to conduct all steps of the predictive maintenance model building workflow from a browser without writing code. All features of the package can be attributed to one of the following groups: data import, data validation and preparation, data exploration and visualization, feature engineering, data preprocessing, model creation and evaluation, and report creation. For illustration, the proposed package is applied to the Turbofan Engine Degradation Simulation data set FD001 from NASA to estimate the remaining useful life (RUL) of turbofan engines.

15:10
Artificial Intelligence in the Cyber Domain: Offense and Defense

ABSTRACT. Artificial intelligence and machine learning have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools, but they can also help adversaries improve their methods of attack. Malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims to provide an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.

15:15
Deep learning approach for singer voice classification of Vietnamese popular music

ABSTRACT. Singer voice classification is a meaningful task in the digital era. With the huge number of songs available today, identifying a singer is very helpful for music information retrieval, music property indexing, and so on. In this paper, we propose a method to identify the singer's name in a song of Vietnamese popular music. We use vocal segment detection and singing voice separation as pre-processing steps. The purpose of these steps is to extract the singer's voice from the mixed sound. To build a singer classifier, we propose a neural network architecture with Mel Frequency Cepstral Coefficient (MFCC) input features computed from the extracted vocals. To verify the accuracy of our method, we use a dataset of 300 Vietnamese songs by 18 famous singers. The accuracy reaches 92.84% with 5-fold stratified cross-validation, the best result compared to other methods on the same data set.

15:20
Labelling Stomach Anatomical Locations In Upper Gastrointestinal Endoscopic Images Using a Convolution Neuronal Network

ABSTRACT. In this paper, we aim to develop a feasible diagnostic assistant system for labelling stomach anatomical locations in upper gastrointestinal endoscopy (UGIE) examinations. To address this task, we construct an approach that combines the capability of a convolutional neural network (CNN) with interaction between the machine and doctors. With the expectation of assisting more accurate diagnosis and contributing to training activities, we solve the problem with a two-phase scheme. The first phase is a coarse scheme that uses a CNN to classify six major anatomical locations: gastric body, fundic, antrum, pyloric ring, lesser curvature and greater curvature. The constructed CNN is compact, performs well, and is suitable for integration into a Graphical User Interface (GUI). In the second phase, to classify 13 detailed positions, a GUI is developed so that endoscopists can conveniently specify the anatomical locations from the results of the coarse scheme. In this phase, the doctors prune the automatic results and specify the more detailed positions within the major locations. The experimental results show that the developed application is an efficient aid in UGIE procedures. In a comparison with trainee endoscopists, it reduces the time required from 13:03 minutes for the manual procedure to 4:35 minutes with the developed system. The results of specifying anatomical locations satisfied the accuracy requirements and indicate a promising research direction for future application as a computer-aided GIE diagnostic system.

15:25
A Security-Enhanced Monitoring System for Northbound Interface in SDN using Blockchain

ABSTRACT. In Software-Defined Networking (SDN), the Northbound Interface provides APIs that allow network applications to communicate with SDN controllers. However, a malicious application can access the SDN controller and perform illegal activities via these APIs. Although some studies have proposed AAA (Authentication, Authorization, Accounting) systems to protect SDN controllers from malicious applications, the proposed systems have several limitations. Attackers can compromise such a system and then modify its database or files to gain higher privileges. The system can also be taken down because it is a single point of failure. To enhance security for the Northbound Interface, we propose a novel system using blockchain, namely BlockAS. It is used to authenticate, authorize and monitor applications' access to critical controller resources. Specifically, BlockAS leverages blockchain features to maintain the immutability and decentralization of credential data. Our proposed system has five key properties that enhance security for the SDN controller and the services it offers: immutability of the database, decentralization, authentication, authorization, and accounting.

15:30
Unsupervised methods for Software Defect Prediction

ABSTRACT. Software defect prediction (SDP) aims to assess software quality by using machine learning techniques. Zhang et al. [21] proposed applying a connectivity-based unsupervised learning method, namely spectral clustering, and reported impressive performance compared with supervised learning models. Inspired by their work, this paper focuses on replicating the spectral clustering experiment of Zhang et al. [21]. However, we observe a large gap in AUC compared with their results. The exhaustive experiment steps are recorded in this paper. Additionally, this paper examines three community structure detection methods on the selected datasets: AEEEM, NASA, and PROMISE. To the best of our knowledge, this is the first time these methods have been applied in SDP to evaluate their predictive power. This paper also uses another adjacency matrix, two feature selection methods, and two feature reduction methods to find the combination with the best performance on these datasets. To make replicating our work easy, a lightweight framework has been designed and released for future investigation.

14:30-15:50 Session 13C
Location: Ruby
14:30
Don’t Drive Me My Way – Subjective Perception of Autonomous Braking Trajectories for Pedestrian Crossings

ABSTRACT. Autonomous vehicles will encounter all the traffic situations that current drivers are confronted with. These vehicles are expected to handle the situations at least as well as human drivers or even better. This “better” can be split up in various ways and address different facets of traffic: safety, efficiency, and cooperativity, to name a few. The driving simulator study at hand investigated the effect of different braking trajectories of a fully autonomous vehicle (SAE Level 5) approaching a zebra crossing. In a dynamic driving simulator, participants had to rate two aspects, the perceived cooperativity and criticality, of the programmed braking trajectories as well as of a replay of their own manual approach. The results show significant differences between the approaches in terms of how critical and cooperative they were perceived to be. Remarkably, the participants’ individual driving style was, on average, not the safest or most cooperative one. Participants favoured an approach with an early brake onset and a gradually increasing and subsequently decreasing brake intensity (bell-shaped curve) until a full stop in front of the pedestrian crossing.

14:50
Punctuation Prediction for Vietnamese Texts Using Conditional Random Field

ABSTRACT. We investigate punctuation prediction for the Vietnamese language. This problem is crucial because its solution can be used to add suitable punctuation marks to machine-transcribed speech, which usually lacks such information. Similar to previous work for English and Chinese, we formulate this task as a sequence labeling problem. We then apply the conditional random field model to solve the problem and propose a set of features that are useful for prediction. We also build two corpora from Vietnamese online news and movie subtitles and perform extensive experiments on these data. Furthermore, we ask four volunteers to insert punctuation marks into a small sample of our dataset. The experimental results show that this problem is difficult even for humans, and that our model achieves performance close to that of a human.
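The sequence labeling formulation can be sketched as follows: each token receives a label naming the punctuation mark that follows it, or "O" if none does. The label names, the set of punctuation marks, and the function name `to_sequence_labels` are assumptions for illustration; the abstract does not specify the label scheme or the features fed to the conditional random field.

```python
# Hypothetical label set; the paper's actual labels are not given here.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}


def to_sequence_labels(text):
    """Turn punctuated text into (token, label) training pairs.

    The label of a token is the punctuation mark that follows it,
    or "O" when no mark follows.
    """
    pairs = []
    for raw in text.split():
        token, label = raw, "O"
        if token[-1] in PUNCT_LABELS:
            label = PUNCT_LABELS[token[-1]]
            token = token[:-1]
        if token:
            pairs.append((token.lower(), label))
        elif pairs and label != "O":
            # A free-standing mark is attached to the previous token
            pairs[-1] = (pairs[-1][0], label)
    return pairs


pairs = to_sequence_labels("Hello, how are you?")
```

At prediction time the input is the unpunctuated token sequence, and a sequence model such as a CRF predicts one label per token; the marks are then re-inserted from the labels.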

15:10
Spatio-temporal Multi-level Fusion for Human Action Recognition

ABSTRACT. Two-stream convolutional networks have achieved great success in action recognition tasks. In this paper, we propose a spatio-temporal network that integrates spatial and temporal features at multiple levels to model the correlation between spatial and temporal information. Based on the TSN model [16], where videos are divided into segments, our model integrates spatio-temporal information at both local and global levels. At the local level, temporal information is transferred to the spatial stream in each segment. At the global level, we integrate features of the entire action extracted from the two streams to obtain the final action representation. Moreover, to take the chronological order of the segments into account, we propose strategies for segment aggregation using Conv3D and LSTM (Long Short-Term Memory). In the training process, we also applied and evaluated several strategies, such as an auxiliary classifier and cross-modality initialization, to improve the convergence rate. Experiments on the standard dataset UCF-101 (split-1) demonstrate the effectiveness of the proposed network. Our model achieved an accuracy of 87.1% for the spatial network, higher than TSN (85.5%), thanks to the segment aggregation strategy with LSTM seq-to-seq. In the proposed two-stream network, the multi-level fusion strategy yields a better model than a network using only global fusion, with an improvement of 1.4% in accuracy and 1.2% in F1-score. Our two-stream network also obtained very promising results, with an accuracy of 92.57%.

15:30
Noise Removal Based Query Pre-processing to Improve Face Search Performance in Large Scale Video Databases

ABSTRACT. In most person search systems, for ease of use, the user is often required to provide only raw images that contain the person as query examples, without specifying the face location. As a result, a face detector needs to be used. However, current face detectors are robust to pose, so using all the detected faces as the query can hurt search performance. A good stage for removing bad examples from the query can therefore directly lead to better performance of the whole system. In this paper, we focus on analyzing how bad face examples affect a person search system. Moreover, we propose an automatic bad-face removal method that is robust to the case where bad faces are dominant in a query. Experiments show that our removal method yields better mAP in both the image example and shot example settings compared to that of the Peking state-of-the-art system.