SOICT 2019: THE 10TH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY
PROGRAM FOR THURSDAY, DECEMBER 5TH


09:05-09:50 Session 6
Location: Panorama
09:05
Keynote 5: Adaptive Control for multi flow traffic based on quality of experience and quality of service paradigms

ABSTRACT. Based on a convergence of network technologies, the Next Generation Network (NGN) is being deployed to carry high-quality video and voice data. In fact, the convergence of network technologies has been driven by the converging needs of end-users. The perceived end-to-end quality is becoming one of the main goals required by users and must be guaranteed by network operators and Internet Service Providers through manufacturer equipment. This is referred to as Quality of Experience (QoE) and is commonly used to represent user perception. QoE is not a technical metric, but rather a concept consisting of all elements of a user's perception of network services. In this talk, we focus on how to integrate QoE into a control-command chain in order to construct an adaptive network system. More precisely, in the context of Content-Oriented Networks, which are used to redesign the current Internet architecture to accommodate content-oriented applications and services, the talk aims to describe an end-to-end QoE model applied to a Content Distribution Network architecture and to examine the relationships between Quality of Service and Quality of Experience.

09:50-10:00 Coffee Break
10:00-12:00 Session 7A
Location: Panorama 1
10:00
Rule based English-Vietnamese bilingual terminology extraction from Vietnamese documents

ABSTRACT. Bilingual terminologies are important resources for natural language processing as well as for human use. The automatic acquisition of bilingual terminologies is mostly based on bilingual corpora. However, monolingual corpora could also be a good source for extracting bilingual terms. In fact, as English is used for international publications, we often find English words together with their translation in monolingual documents written in other languages. In this paper, we propose a rule-based approach for automatically extracting English-Vietnamese bilingual terminology from Vietnamese monolingual documents. Experiments on a medical corpus and on Wikipedia documents show that the extracted bilingual term candidates are of high quality.
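The kind of rule the abstract describes can be sketched as a pattern match. The rule below is an illustration of our own devising, not the paper's actual rule set: in Vietnamese text an English term often appears in parentheses right after its Vietnamese translation, e.g. "học máy (machine learning)", so the rule grabs up to three words before the parenthesis as the Vietnamese side.

```python
import re

# Illustrative rule (not the paper's): Vietnamese phrase of up to three
# words, immediately followed by a parenthesized English phrase.
PATTERN = re.compile(r"((?:\w+\s+){0,2}\w+)\s*\(([A-Za-z][A-Za-z\s\-]+)\)")

def extract_term_pairs(text):
    """Return (vietnamese_candidate, english_term) pairs found in text."""
    return [(v.strip(), e.strip()) for v, e in PATTERN.findall(text)]

sample = "Chúng tôi dùng học máy (machine learning) và khai phá dữ liệu (data mining)."
print(extract_term_pairs(sample))
```

A real system would trim the Vietnamese side with POS or chunking information; the fixed three-word window here is only for illustration.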

10:20
Vietnamese Sentiment Analysis for Hotel Review Based on Overfitting Training and Ensemble Learning

ABSTRACT. In this study, we propose a machine learning model for analyzing customer opinions in Vietnamese text, taking hotel service as a case study and classifying each review as positive or negative. In particular, our solution focuses on improving pre-processing, standardization, and training-data relabelling with an error analysis method. Besides, the training data is enriched with an emotional dictionary; 5-fold cross validation and a confusion matrix are used to control overfitting and underfitting and to test the model; hyperparameter tuning is used to optimize model parameters; and ensemble methods are used to combine several machine learning techniques into the most efficient predictive model. The data used is collected from the website booking.com.
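Two of the ingredients named above, k-fold splitting and hard-voting ensembling, can be sketched in a few lines. The three "models" below are just fixed label lists standing in for trained classifiers; everything else (names, data) is invented for illustration.

```python
from collections import Counter

def kfold_indices(n, k=5):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test_part = folds[i]
        train_part = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_part, test_part

def majority_vote(predictions):
    """Combine per-model label lists into one label list by hard voting."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]

# Toy example: three classifiers vote on four reviews ("pos"/"neg").
preds = [["pos", "neg", "pos", "neg"],
         ["pos", "pos", "pos", "neg"],
         ["neg", "neg", "pos", "neg"]]
print(majority_vote(preds))
print(len(list(kfold_indices(10, 5))))
```

In the paper's pipeline the voters would be distinct trained learners and the folds would drive the overfitting check; this sketch only shows the mechanics.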

10:40
Extractive Multi-document Summarization using K-means, Centroid-based Method, MMR, and Sentence Position

ABSTRACT. Multi-document summarization is more challenging than single-document summarization since it must handle overlapping information among sentences from different documents. Also, multi-document summarization datasets are rare, so methods based on deep learning are difficult to apply. In this paper, we propose an approach to multi-document summarization based on the k-means clustering algorithm, combined with a centroid-based method, maximal marginal relevance, and sentence positions. This approach is efficient in finding salient sentences and preventing overlap between sentences. Experiments using the DUC 2007 dataset show that our system is significantly more efficient than other approaches in this field.
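The maximal marginal relevance (MMR) step mentioned above has a standard greedy form: trade off relevance to the document centroid against redundancy with sentences already selected. The sketch below shows only that step, with toy bag-of-words sentences; the k-means clustering and sentence-position scores from the paper are omitted, and the lambda value is arbitrary.

```python
def cosine(a, b):
    """Cosine similarity between two bag-of-words dicts."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(sentences, centroid, k, lam=0.5):
    """Greedy MMR: pick k sentences, penalizing redundancy."""
    selected, candidates = [], list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((cosine(sentences[i], sentences[j])
                              for j in selected), default=0.0)
            return lam * cosine(sentences[i], centroid) - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy sentences: 0 and 1 are near-duplicates, 2 adds new information.
sents = [{"flood": 1, "rain": 1}, {"flood": 1, "rain": 1}, {"rescue": 1}]
centroid = {"flood": 2, "rain": 2, "rescue": 1}
print(mmr_select(sents, centroid, k=2))  # picks 0 then 2, skipping duplicate 1
```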

11:00
An Efficient Method for Discovering Variable-length Motifs in Time Series based on Suffix Array

ABSTRACT. Repeated patterns in time series, also called motifs, are approximately repeated subsequences embedded within a longer time series. There have been some popular and effective algorithms for discovering motifs in time series. Nevertheless, these algorithms still have weaknesses: users have to choose an appropriate value for the motif length, or the algorithms suffer high computational cost. In this paper, we propose an efficient method for discovering motifs based on the suffix array. The method consists of transforming a time series into a symbolic string and then finding repeated substrings in that string using the suffix array. Besides, we also speed up the method by applying multi-core parallelism. Experimental results reveal that our proposed method can discover variable-length motifs and performs very fast on large time series datasets while providing high accuracy.
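The repeated-substring core of this idea can be sketched directly: once the time series has been discretized into a symbolic string (that upstream step is assumed here), adjacent entries of the suffix array expose repeats via their longest common prefixes. The naive O(n² log n) construction below is only for illustration; a production system would use a linear-time suffix array.

```python
def suffix_array(s):
    """Suffix array by sorting suffixes (naive; fine for a sketch)."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def longest_repeated_substring(s):
    """Longest substring occurring at least twice, via the LCP of
    neighbouring suffixes in the suffix array."""
    sa = suffix_array(s)
    best = ""
    for a, b in zip(sa, sa[1:]):
        lcp = 0  # common prefix length of the two neighbouring suffixes
        while a + lcp < len(s) and b + lcp < len(s) and s[a + lcp] == s[b + lcp]:
            lcp += 1
        if lcp > len(best):
            best = s[a:a + lcp]
    return best

# "abc" occurs twice in this symbolic string, so it is the motif candidate.
print(longest_repeated_substring("abcabxabc"))  # -> "abc"
```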

11:20
Exploiting User Comments for Document Summarization with Matrix Factorization

ABSTRACT. Social media gives readers a new way to freely discuss the content of an event mentioned in a Web document by posting relevant comments. The comments provide additional information which can be used to enrich the information of the main document. This paper introduces a new model which integrates user comments into the summarization process. While prior methods assume the same number of topics for the sentences and comments of a document, we argue that sentences and comments should own different topics while also sharing common hidden topics in terms of shared or inferred words. From this, we define a new objective function which jointly combines sentences and comments to find a global optimum. The objective function is optimized by our non-negative matrix factorization algorithm to find the weights of the sentence matrix and comment matrix for ranking sentences and comments. Experimental results on two datasets in English and Vietnamese show the efficiency of our model.
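The factorization machinery underneath such a model can be illustrated with the standard Lee-Seung multiplicative updates, shown here as a stand-in for the paper's joint sentence/comment objective (the toy matrix and rank are invented for illustration):

```python
import numpy as np

def nmf(V, r, iters=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF: V is approximated by W @ H
    with all factors kept non-negative."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "sentence x term" matrix: rows are sentences, columns are terms.
# Row weights of W can then serve as sentence-importance scores.
V = np.array([[3., 0., 1.],
              [2., 0., 1.],
              [0., 4., 0.],
              [0., 3., 0.]])
W, H = nmf(V, r=2)
err = np.linalg.norm(V - W @ H)
print(round(float(err), 3))
```

The paper's algorithm couples two such factorizations (sentence-term and comment-term) through shared topics; this sketch shows only the single-matrix update rule.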

11:40
Improving a Language Model Evaluator for Sentence Compression Without Reinforcement Learning

ABSTRACT. We consider sentence compression as a binary classification task on tokens. In this paper we improve on a language-model evaluator by incorporating a score from a neural language model directly into the loss function instead of resorting to reinforcement learning. As a result, the model learns to remove individual tokens and to preserve readability at the same time, while maintaining the desired level of compression. We compare our model with a state-of-the-art model which uses a policy-based reinforcement learning method to evaluate compressed sentences for readability. We perform automatic evaluation and evaluation with humans. Experiments demonstrate that we were able to improve on the strong baselines. We also provide a human evaluation of 200 gold compressions from the Google dataset, setting a baseline for human evaluation in upcoming studies.

10:00-11:40 Session 7B
Location: Panorama 2
10:00
Quantum Key Distribution over Hybrid Fiber-Wireless System for Mobile Networks

ABSTRACT. A novel method of quantum key distribution (QKD) over a hybrid fiber-wireless system is proposed in this paper. The quantum key from a center station (i.e., Alice) is transmitted through an optical fiber to a base station and then wirelessly forwarded to mobile nodes (i.e., Bobs). The QKD protocol is implemented using subcarrier intensity modulation, where binary phase shift keying (BPSK) signaling is used for encoding and direct detection with a dual-threshold receiver is employed for decoding. We derive mathematical expressions for the security analysis of the proposed system in terms of quantum bit error rate and ergodic secret-key rate, taking into account the channel loss and receiver noise. The feasibility of the proposed QKD system is validated via numerical results.
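The dual-threshold idea can be illustrated with a toy Monte-Carlo run (this is a plain additive-Gaussian sketch, not the paper's channel model; noise level and thresholds are arbitrary): Bob only keeps samples falling outside the interval between the two thresholds, discarding in-between samples as inconclusive, which trades sifted-key rate for a lower quantum bit error rate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
bits = rng.integers(0, 2, n)
signal = np.where(bits == 1, 1.0, -1.0)          # BPSK symbols
received = signal + rng.normal(0.0, 0.8, n)      # toy channel + receiver noise

d0, d1 = -0.5, 0.5                               # dual thresholds
decided = (received <= d0) | (received >= d1)    # conclusive samples only
guess = (received >= d1).astype(int)

qber = np.mean(guess[decided] != bits[decided])
single = np.mean((received >= 0).astype(int) != bits)  # single-threshold QBER
print(f"kept {decided.mean():.2f} of bits, QBER {qber:.4f} vs {single:.4f}")
```

Widening the gap between the thresholds lowers the QBER further at the cost of discarding more of the raw key.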

10:20
Design and Implementation of Web Browser Secure Storage for Web Standard Authentication Based on FIDO

ABSTRACT. Recently, research has been actively conducted in line with government policy on authentication methods that comply with web standards, such as the abolition of the accredited certification system. The problem was that data stored in existing web repositories could be tampered with, resulting in a vulnerability. In this paper, we apply a mutual verification technique together with API (Application Programming Interface) forgery/tampering blocking and obfuscation to solve the authentication weakness in web browsers that comply with the FIDO (Fast IDentity Online) standard. In addition, user convenience is improved by a no-plugin implementation, which does not require the installation of a separate program. Performance tests show that most browsers perform at about 0.1 ms in RSA key generation. In addition, this study shows that the service is ready for commercialization, as the server-side digital signature verification also completes in less than 0.1 second. The service is expected to be useful as an alternative to browser authentication and for establishing a secure web repository that does not require a public certificate.

10:40
C500-CFGVex: A Novel Feature Extraction Method for Detecting Cross-Architecture IoT Malware

ABSTRACT. The widespread adoption of Internet of Things (IoT) devices built on different architectures gave rise to the creation and development of multi-architecture malware for mass compromise. Cross-architecture malware detection plays an important role in detecting malware early on devices using new or unusual architectures: prior knowledge of malware detection on traditional architectures can be inherited for the same task on new and uncommon ones. Based on the C500-CFG and the VEX intermediate representation, we propose a feature extraction method to detect cross-architecture malware, called C500-CFGVex. Experimental evaluation of the proposed approach on our large IoT dataset achieved good results for cross-architecture malware detection: although we trained an SVM model only on Intel 80386 samples, our method could detect IoT malware in MIPS samples with 95.72% accuracy and a 2.81% false positive rate.

11:00
Multi-Task Network Anomaly Detection using Federated Learning
PRESENTER: Junjun Chen

ABSTRACT. Because of the complexity of network traffic, there are various significant challenges in the network anomaly detection field. One of the major challenges is the lack of labeled training data. In this paper, we use federated learning to tackle the data scarcity problem and to preserve data privacy, with multiple participants collaboratively training a global model. Unlike the centralized training architecture, participants in federated learning do not need to share their training data with the server, which prevents the training data from being exploited by attackers. Moreover, most previous works focus on one specific anomaly detection task, which restricts the application areas and cannot provide more valuable information to network administrators. Therefore, we propose a multi-task deep neural network in federated learning (MT-DNN-FL) to perform a network anomaly detection task, a VPN (Tor) traffic recognition task, and a traffic classification task simultaneously. Compared with multiple single-task models, the multi-task method reduces training time overhead. Experiments conducted on the well-known CICIDS2017, ISCXVPN2016, and ISCXTor2016 datasets show that the detection and classification performance achieved by the proposed method is better than that of the baseline methods in a centralized training architecture.
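The aggregation step that makes raw-data sharing unnecessary can be sketched as federated averaging (FedAvg is assumed here as the canonical aggregator; the abstract does not name the specific scheme): each participant trains locally and uploads only model weights, which the server averages weighted by local dataset size.

```python
def fed_avg(local_weights, local_sizes):
    """Average same-shaped weight dicts, weighted by sample counts."""
    total = sum(local_sizes)
    return {
        k: sum(w[k] * s for w, s in zip(local_weights, local_sizes)) / total
        for k in local_weights[0]
    }

# Two participants with a toy 2-parameter "model".
w_a = {"w1": 1.0, "w2": 0.0}
w_b = {"w1": 3.0, "w2": 1.0}
global_w = fed_avg([w_a, w_b], local_sizes=[100, 300])
print(global_w)  # w1 = (1*100 + 3*300) / 400 = 2.5
```

In the paper's setting the dicts would hold the layers of the shared multi-task network, and the loop of local training plus averaging would repeat for many rounds.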

11:20
Secure EEG-Based User Authentication System Integrated with Robust Watermarking

ABSTRACT. Electroencephalogram (EEG) data has been widely used in the health care sector. Recent studies explore the potential of EEG in biometric authentication because of its advantages. However, there are security problems in EEG-based user authentication systems, especially in remote applications over unsecured channels, that have not been investigated thoroughly in previous works. In this paper, a secure user authentication system based on EEG data is integrated with a watermarking scheme which applies hybrid Discrete Wavelet Transform-Singular Value Decomposition (DWT-SVD) and Quantization Index Modulation (QIM). The developed model is able to enhance the security of the system against spoofing, relay, and communication attacks without degrading the EEG-based user authentication performance. The impact of watermarking on the recognition performance of the EEG-based user authentication system has also been investigated in this paper. Experimental results find only partial degradation of system performance while its security is strengthened.
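The QIM primitive named above has a simple scalar form that can be sketched directly (the coefficients below are made up; in the paper's scheme the values being quantized would be DWT-SVD coefficients of the EEG signal): each coefficient is quantized onto one of two interleaved lattices depending on the watermark bit.

```python
def qim_embed(x, bit, delta=1.0):
    """Quantize x to the lattice for `bit` (step delta, offset delta/2)."""
    offset = bit * delta / 2.0
    return round((x - offset) / delta) * delta + offset

def qim_extract(y, delta=1.0):
    """Recover the bit as the lattice whose point is nearest to y."""
    d0 = abs(y - qim_embed(y, 0, delta))
    d1 = abs(y - qim_embed(y, 1, delta))
    return 0 if d0 <= d1 else 1

coeffs = [0.37, -1.62, 2.91, 0.08]          # hypothetical host coefficients
bits = [1, 0, 1, 1]                         # watermark payload
marked = [qim_embed(c, b) for c, b in zip(coeffs, bits)]
print([qim_extract(m) for m in marked])     # payload recovered
```

Because each lattice point is delta/2 away from the nearest point of the other lattice, the bit survives any perturbation smaller than delta/4, which is what gives the scheme its robustness.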

10:00-11:40 Session 7C
Location: Ruby
10:00
jDomainApp: A Module-Based Domain-Driven Software Framework

ABSTRACT. Object-oriented domain-driven design (DDD) has been advocated to be the most common form of DDD, thanks to the popularity of object-oriented development methodologies and languages. Although the DDD method prescribes a set of design patterns for the domain model, it provides no languages or tools that realise these patterns. There have been several software frameworks developed to address this gap. However, these frameworks have not tackled two important software construction issues: generative, module-based software construction and development environment integration. In this paper, we propose a framework, named jDomainApp, and an Eclipse IDE plugin to address these issues. In particular, we extend our recent works on DDD to propose a software configuration language that expresses the software configuration, needed to automatically generate software from a set of modules. The modules are automatically generated using a module configuration language that we defined in a previous work. We demonstrate the framework and plug-in using a real-world software example. Further, we evaluate the performance of software construction to show that it is scalable to handle large software.

10:20
On Implementation of the Improved Assume-Guarantee Verification Method for Timed Systems

ABSTRACT. The two-phase assume-guarantee verification method for timed systems, using the TL∗ algorithm implemented in the learner, has been known as a promising way to address the state-space explosion problem in model checking thanks to its divide-and-conquer strategy. This paper presents three improvements to the verification method. First, we remove the untimed verification phase from the verification process; this removal reduces the time complexity of the verification process because of the great time complexity of that phase. Second, we introduce a max-bound to the equivalence-query answering algorithm implemented in the teacher, which acts as a way for the teacher to return "don't know" results to the learner and prevents the verification process from entering many endless scenarios. Finally, we introduce a technique to analyze the counterexample received from the teacher, and another, implemented in the equivalence-query answering algorithm, which keeps the teacher from returning a counterexample that has already been returned to the learner. This technique keeps the verification process from running forever in several circumstances. We give preliminary experimental results for both the two-phase assumption generation method and the improved one, with some discussion.

10:40
Preparation method in automated test case generation using machine learning

ABSTRACT. We have been working on the automation of software development processes. Among the different processes in software development, testing has the greatest influence on software quality. Test cases are therefore written by skilled engineers and are finalized after multiple reviews, requiring a large amount of manpower. Thoughtless reduction of these steps, however, could diminish software quality. We therefore used the know-how of skilled engineers in writing test cases as training data to automate the generation of homogeneous test cases through machine learning. Our method automatically extracts homogeneous test cases, independent of the skills and know-how of the engineer writing them, from requirements specification documents, which are the products of the basic design process. Use of machine learning, however, has made it difficult to achieve high accuracy in extracting test items. In this paper, we propose a method to increase accuracy through the preparation of the training data fed into the machine learning process, and report the results of evaluating the effectiveness of the method.

11:00
Prediction System for Fine Particulate Matter Concentration Index by Meteorological and Air Pollution Material Factors Based on Machine Learning

ABSTRACT. Fine dust (PM-10) and ultra-fine dust (PM-2.5) have serious effects on the human body as well as on nature, ecosystems, and assets. Previous studies that predict fine dust concentration through a weather model only considered weather factors, so prediction accuracy was very low in the absence of complex mathematical models optimized for the terrain. In order to solve this problem, this paper analyzes the factors affecting fine dust by reviewing various prior studies. Among them, meteorological factors and air pollution factors were found to have the most influence on fine dust, and were used as supervised training data. Four supervised learning algorithms were compared: Linear Regression, Multi-Layer Perceptron, Random Forest Tree, and Gradient Boosting. Experimental results showed that gradient boosting performed best, at 5.8/3.9 for fine and ultra-fine dust respectively. This study only utilized weather data from Seoul, but we will study and develop an optimized fine/ultra-fine dust prediction model for a metropolitan city by using the weather data of Korea Metropolitan City.
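The winning model family can be sketched in miniature (this is generic gradient boosting with depth-1 stumps on invented toy data, not the paper's tuned configuration or features): each round fits a stump to the current residuals and adds a damped copy of it to the ensemble.

```python
def fit_stump(x, r):
    """Best single-feature threshold split minimizing squared error."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gradient_boost(x, y, rounds=150, lr=0.1):
    """Squared-error gradient boosting: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = fit_stump(x, resid)
        stumps.append(s)
        pred = [pi + lr * s(xi) for pi, xi in zip(pred, x)]
    return lambda xi: base + lr * sum(s(xi) for s in stumps)

# Toy data: dust concentration rising with a single weather feature.
x = [1, 2, 3, 4, 5, 6]
y = [10, 12, 20, 22, 35, 38]
model = gradient_boost(x, y)
print(round(model(5), 1))
```

Real systems use multi-feature trees, shrinkage schedules, and early stopping, but the residual-fitting loop is the same.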

11:20
A Specification-Based Approach to Model Checking Event-Driven Systems

ABSTRACT. An event-driven system with multiple external events is difficult to verify manually. Model checking is an appropriate approach for exhaustively and automatically verifying this kind of system. In practice, adopting the specification language of a specific model checker to specify the behaviors of the system is hard because many events and their corresponding constraints need to be considered. To address this problem, we propose a domain-specific language (DSL) to facilitate the specification of an event-driven system with several configurations, together with validation rules for checking their properties. In our approach, the occurrences of these events can be defined using different scenarios. From the specification of the system, we generate a program in Promela (Process or Protocol Meta Language) for the verification. Model checking techniques are then applied to export counterexamples as the results of the verification. From these results, we can examine the occurrences of the events to find the corresponding errors in the system.

12:00-13:30 Lunch
13:30-14:15 Session 8
Location: Panorama
13:30
Keynote 6: Attractiveness Computing-Analysis and Enhancement of Attractiveness using Big Multimedia Data

ABSTRACT. Toshihiko YAMASAKI received his B.S., M.S., and Ph.D. degrees from The University of Tokyo in 1999, 2001, and 2004, respectively. He is currently an Associate Professor at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo. He was a JSPS Fellow for Research Abroad and a visiting scientist at Cornell University from Feb. 2011 to Feb. 2013.

His current research interests include multimedia big data analysis, pattern recognition, machine learning, and so on.

14:15-14:30 Coffee Break
14:30-16:30 Session 9A
Location: Panorama 1
14:30
Co-author Relationship Prediction in Bibliographic Network: A New Approach Using Geographic Factor and Latent Topic Information

ABSTRACT. In this research, we propose a novel approach for co-author relationship prediction in a bibliographic network that utilizes a geographic factor and latent topic information. We use a supervised method to predict co-author relationship formation, combining different features with different weighting coefficients. Firstly, besides the existing relations studied in previous research, we exploit a new relation based on the geographic factor, which contributes a topological feature. Moreover, we derive a content feature from the textual information in authors' papers using topic modeling. Finally, we combine the topological features and the content feature in co-author relationship prediction. We conducted experiments on different bibliographic-network datasets and attained satisfactory results.

14:50
A Novel Conditional Random Fields Aided Fuzzy Matching in Vietnamese Address Standardization

ABSTRACT. Address standardization is the process of recognizing and normalizing free-form addresses into a common standard format. In today's digital economy, this process is increasingly needed in areas such as e-commerce fulfillment, logistics planning, geographical data analysis, real estate, and social network mining. Traditional approaches mostly follow two directions: Named Entity Recognition (NER) and fuzzy matching. For Vietnamese addresses in particular, neither of these two approaches is efficient due to sparse and erroneous data. In this paper, we propose a novel approach that leverages an NER model as a suggestion to re-rank potential address candidates obtained in the fuzzy matching stage. We develop a log-linear model for this re-ranking purpose. Our experiments show that it outperforms both the NER and fuzzy matching approaches with an accuracy of 88%, and suggest further applications to data in other languages.
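The re-ranking idea can be sketched as follows (the addresses, feature values, and weights below are all invented; the paper learns the weights from data): fuzzy matching proposes candidates with a string-similarity score, and an NER-derived score over the parsed fields re-ranks them through a log-linear combination.

```python
import math

def log_linear_score(features, weights):
    """Unnormalized log-linear score: exp(w . f)."""
    return math.exp(sum(weights[k] * v for k, v in features.items()))

# Hypothetical candidates for one free-form input address.
candidates = [
    {"addr": "1 Dai Co Viet, Ha Noi", "fuzzy": 0.82, "ner": 0.95},
    {"addr": "1 Dai Co Viet, Ha Nam", "fuzzy": 0.85, "ner": 0.40},
]
weights = {"fuzzy": 1.0, "ner": 2.0}  # illustrative, not learned

best = max(candidates,
           key=lambda c: log_linear_score(
               {"fuzzy": c["fuzzy"], "ner": c["ner"]}, weights))
print(best["addr"])
```

Here the NER score overrules a slightly better fuzzy score, which is exactly the correction the paper's re-ranker is meant to provide.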

15:10
A New Variant of Truck Scheduling for Transporting Container Problem
PRESENTER: Son Nguyen Van

ABSTRACT. We consider in this paper the truck scheduling problem for container transportation in which trucks and trailers are separate objects that can be attached together for transporting containers. It is called the Truck-Trailer-Container Routing Problem (TTCRP). In this context, trucks, trailers, and containers are located in different places. A truck can be scheduled to take a trailer and then transport different containers from depots to customers, from customers to terminals, from terminals to customers, or from customers to depots. In a plan, a truck can carry one 40ft container or two 20ft containers at a time. Moreover, in case of a long waiting time for unloading, a truck can be detached from a trailer at a customer site and attached to another trailer for other itineraries. We describe the problem formulation and propose heuristic algorithms for solving large instances of the problem. We analyze the efficiency of the proposed algorithms on different problem instances.

15:30
Multifactorial Evolutionary Algorithm For Clustered Minimum Routing Cost Problem

ABSTRACT. The Minimum Routing Cost Clustered Tree Problem (CluMRCT) arises in various fields in both theory and application. Because CluMRCT is NP-hard, approximate approaches are suitable for finding solutions to this problem. Recently, the Multifactorial Evolutionary Algorithm (MFEA) has emerged as one of the most efficient approximation algorithms for dealing with many different kinds of problems. This paper therefore applies MFEA to solving the CluMRCT problem. In the proposed MFEA, we focus on crossover and mutation operators which create valid CluMRCT solutions on two levels: the first level constructs spanning trees for graphs within clusters, while the second level builds a spanning tree connecting the clusters. To reduce resource consumption, we also introduce a new method of calculating the cost of a CluMRCT solution. The proposed algorithm is evaluated experimentally on numerous types of datasets. The experimental results demonstrate the effectiveness of the proposed algorithm, particularly on large instances.

15:50
Clustering Method using Pareto Corner Search Evolutionary Algorithm for Objective Reduction in Many-Objective Optimization Problems

ABSTRACT. Many-objective optimization problems (MaOPs) have gained considerable attention from researchers recently. MaOPs pose a number of difficulties for multi-objective evolutionary algorithms (MOEAs). Although a number of many-objective optimization evolutionary algorithms (MaOEAs) exist for solving MaOPs, they still face difficulties as the number of objectives increases. One common way to alleviate these difficulties is objective dimensionality reduction (objective reduction for short). Moreover, instead of searching the whole objective space like existing MOEAs or MaOEAs, the Pareto Corner Search Evolutionary Algorithm (PCSEA) concentrates only on certain parts of the objective space, which decreases time consumption and thus speeds up objective reduction. However, PCSEA-based objective reduction needs a user-specified threshold to select or remove objectives, which is not straightforward to set. Based on the idea that the more conflicting two objectives are, the more distant they are, we introduce in this paper a new objective reduction method that integrates PCSEA with the k-means and DBSCAN clustering algorithms for solving MaOPs that are assumed to contain redundant objectives. The experimental results show that the introduced method can remove redundant objectives better than PCSEA-based objective reduction.
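The intuition behind clustering-based objective reduction can be sketched numerically (a greedy single-pass grouping with an illustrative threshold stands in here for the paper's k-means/DBSCAN step, and the objective values are synthetic): objectives whose values are strongly correlated across candidate solutions carry redundant information, so one representative per group suffices.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
f1 = rng.random(n)
f2 = 2 * f1 + rng.normal(0, 0.01, n)   # redundant copy of f1
f3 = rng.random(n)                      # genuinely independent objective
F = np.column_stack([f1, f2, f3])       # solutions x objectives

corr = np.corrcoef(F, rowvar=False)
kept, tau = [], 0.9                     # tau is illustrative, not tuned
for j in range(F.shape[1]):
    # keep objective j only if it is not highly correlated with a kept one
    if all(abs(corr[j, k]) < tau for k in kept):
        kept.append(j)
print(kept)  # objective 1 is dropped as redundant with objective 0
```

In the paper, the solutions providing these objective values come from PCSEA's corner search, and the clustering replaces the hand-set threshold decision.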

16:10
Sensor-based Abnormal Behavior Detection Using Autoencoder

ABSTRACT. The population of elderly people is increasing with the development of an aging society all over the world. As a result, the number of people who need care, such as elderly people living alone or suffering from dementia, is also increasing. Caring for these people imposes not only social burdens but also economic costs. A system that monitors their behavior is essential to reduce the cost of caring for them. In this study, we propose an abnormal behavior detection model using smart home sensor data to support elderly people living alone and people with dementia. Previous studies have used probability models such as a hidden Markov model (HMM) or a support vector machine (SVM). However, the HMM requires a process to estimate values such as the initial probabilities or to define states. It is also possible to detect behavior using a classification model such as an SVM, but in this study we used an autoencoder, a representative unsupervised learning model, to learn patterns from the behavior data. The autoencoder model can detect abnormal behavior by extracting the characteristics of the normal behavior data. The models used in this study were trained and tested with normal behavior data, showing an accuracy of more than 99%. For abnormal behavior data, a loss of about 10-30% was observed. This model is expected to assist in effectively monitoring elderly people or dementia patients and to reduce the cost of caring for them.
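The detection rule described above can be sketched on toy data: train a compact reconstruction on normal behavior only, then flag inputs whose reconstruction error exceeds the worst error seen on normal data. A rank-1 linear autoencoder stands in for the paper's network; since a linear autoencoder's optimum coincides with the PCA subspace, it is computed here in closed form via SVD. The sensor vectors are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Normal behaviour: sensor-activity vectors near the direction [1, 1, 0, 0].
normal = (rng.normal(0, 0.05, (200, 4))
          + np.outer(rng.random(200), [1.0, 1.0, 0.0, 0.0]))

mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
basis = Vt[:1]                     # 1-D "bottleneck"

def recon_error(x):
    z = (x - mean) @ basis.T       # encode
    r = z @ basis + mean           # decode
    return float(np.sum((r - x) ** 2))

threshold = max(recon_error(x) for x in normal)
abnormal = np.array([0.0, 0.0, 1.0, 1.0])   # pattern never seen in training
print(recon_error(abnormal) > threshold)
```

A nonlinear autoencoder generalizes the same recipe: the encoder/decoder change, but the flag-if-error-exceeds-threshold rule is identical.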

14:30-16:30 Session 9B
Location: Panorama 2
14:30
Minimum Graph Partition with Basis

ABSTRACT. In this paper, we introduce the minimum graph partition problem with basis. A graph partition with basis is a partition of the node set into connected parts, each containing exactly one element of the basis set. The objective is to minimize the size of the largest part. We propose three different mixed-integer linear models to solve this problem, enhanced with a heuristic procedure. Experimental tests are conducted to explore the efficiency of the models on wireless sensor networks. These tests also show that the best model can solve some problem instances with up to 400 nodes within 30 minutes.

14:50
Power-Splitting Protocol Non-Orthogonal Multiple Access (NOMA) in 5G Systems: Outage Performance

ABSTRACT. In this paper, we evaluate an energy harvesting (EH) relaying cooperative non-orthogonal multiple access (NOMA) system operating in half-duplex (HD) fixed decode-and-forward (DF) mode, where two data symbols can be received in two time slots at the receiver, leading to a higher transmission rate. In addition, the power-splitting (PS) protocol is considered to understand its impact on the outage performance and the delay-limited throughput. It is shown that the system performance is affected by the placement of the relay. We provide simulation results to prove the robustness of the system.

15:10
A Software Defined Networking Approach for Guaranteeing Delay in Wi-Fi Networks

ABSTRACT. Recently, low latency has become one of the most critical requirements in Wi-Fi networks (e.g., for Internet access). Many factors and events that happen unexpectedly, such as bufferbloat, can affect the delay of Wi-Fi networks. Hence, the delay requirement calls for a management mechanism which can 1) correctly detect negative latency behavior and 2) adjust network settings adaptively. In this paper, we introduce a solution for these two issues that uses the emerging Software Defined Networking (SDN) technology. Our SDN-based method can guarantee a predetermined delay value in the Wi-Fi network by taking advantage of SDN. More specifically, we monitor the delay fluctuation in Wi-Fi networks on an SDN controller, and then trigger a queue adjustment when the delay value exceeds the predetermined one. We have developed and evaluated our method using the mininet-wifi emulator and the POX controller. The results show that we can maintain the guaranteed delay level for packet transmissions without significant effects on network performance.

15:30
Development of Dynamic QoT-aware Lightpath Provisioning Mechanism with Flexible Advanced Reservation for Distributedly-controlled Multi-domain Elastic Optical Networks

ABSTRACT. We have proposed an efficient dynamic network control scheme for multi-domain elastic optical networks to provide lightpaths dynamically. The developed scheme employs a distributed QoT-aware RMSA algorithm to find the path candidates, adaptively assigns the modulation formats and utilizes a flexible active/passive mechanism for the multiple path reservations. Numerical experiments demonstrate that the proposed control scheme significantly enhances the network performance compared with the conventional scheme; it offers more than 40% blocking probability reduction.

15:50
A Dynamic Routing Protocol for Maximizing Network Lifetime in WSNs with Holes

ABSTRACT. Extending network lifetime is one of the most critical issues in operating wireless sensor networks. The network lifetime is determined by three factors: routing path length, control packet overhead, and load balance among the nodes. Many routing protocols have been proposed in the literature, but none of them jointly considers all of these factors, so they cannot solve the network lifetime maximization problem thoroughly. In this paper, we aim at designing a routing protocol whose objective is to minimize the routing path length and the control overhead while maximizing the load balance. The experiment results show that our proposed protocol strongly outperforms state-of-the-art protocols on many metrics, including network lifetime, routing path stretch, load balance, and control overhead.

16:10
NextLab: A new hybrid testbed platform for Software-defined Networking

ABSTRACT. Software-Defined Networking (SDN) is considered a revolutionary paradigm shift in computer networks, bringing programmable networks and control-data plane separation. SDN has simplified network management by providing a self-adaptive networking mechanism, encouraging innovation through programmability, and lowering cost and power consumption (e.g., CAPEX and OPEX). As a consequence, SDN has acquired remarkable attention from the networking research community. Studying SDN networks always requires a testing environment; however, the majority of today's experimental works are based on simulation tools, which cannot reflect results as accurate and realistic as real testbeds. Accordingly, in this paper we propose a novel hybrid SDN testbed platform, called NextLab, combining virtual and physical network components. The main objective is to make experimental results more realistic than those from a purely virtual testing environment. Besides, NextLab also provides a user-friendly GUI, even for users unfamiliar with SDN configuration syntax. The experimental works show that our ONOS-based platform gives better performance results than a traditional network simulator.

14:30-16:10 Session 9C
Location: Ruby
14:30
Browser Extension-based Crowdsourcing Model for Website Monitoring

ABSTRACT. Websites play an increasingly important role; they are not only a good marketing resource but also a powerful business tool. Websites need to be reliable and always available for customers; error-prone websites suggest an unsafe and untrustworthy business. A large number of website monitoring services have been developed to periodically check website uptime and performance. These services typically have a very limited number of checkpoints (i.e., the geographical locations from which monitoring requests are sent to check the websites) and internet service providers (ISPs) used by these checkpoints, because deploying checkpoints with low usage frequency would cost them a huge amount of money. Thus they cannot detect that a website is reachable from one city and ISP but unreachable from others. To address this issue, this paper presents a crowdsourcing-based approach that uses browser extensions as checkpoints to monitor websites. Our main contributions include: (i) a website monitoring approach that combines the concepts of crowdsourcing and browser extensions; (ii) a batch processing technique for handling monitoring requests; (iii) the architecture of a conceptual crowdsourcing-based system for website monitoring that can coordinate a large number of crowd members; and (iv) various techniques to enhance the quality of monitoring. Our concepts have been implemented in a prototype system with innovative functions for website monitoring.

14:50
Risk Management for Agile Projects in Offshore Vietnam

ABSTRACT. With Agile development increasingly becoming "the norm" in the software industry, experienced Project Managers from a hierarchical, 14,000-employee company often find themselves facing difficulties in adapting their Risk Management practices to Agile projects. While Risk Management in an Agile offshore context has been explored in various studies, these works do not directly explain why a certain company had placed an emphasis on Risk Management, and how this emphasis has led to difficulties in the Agile era. Building on this explanation, the paper proposes a model to more seamlessly integrate Risk Management into Agile projects. Empirical results from case studies are also provided to demonstrate the effectiveness of the model. Lastly, a reference Risk Register (risks and their responses) is provided for Project Managers to use as a starting point in their projects.

15:10
A Genetic Algorithm for Large Graph Partitioning Problem
PRESENTER: Xuan-Tung Nguyen

ABSTRACT. In this paper, we introduce the graph partitioning problem and its present-day importance. This problem is NP-complete. We propose a genetic algorithm (GA) for distributing large-scale graphs for parallel computation, covering the individual representation, fitness function, and genetic operators. We have reimplemented existing algorithms (such as the Greedy algorithm and the Bulk Swap algorithm) and compared the results of all algorithms on several data sets (including random and real data). All algorithms are evaluated by the number of edges between different partitions. The empirical results show that our genetic algorithm outperforms the Greedy algorithm by 62%, the Greedy graph growing algorithm by 91%, and the Bulk Swap algorithm by 90% on a real Facebook dataset.

15:30
Abstractive Text Summarization Using Pointer-Generator Networks With Pre-trained Word Embedding

ABSTRACT. Abstractive text summarization is the task of generating a summary that captures the main content of a text document. As a state-of-the-art method for abstractive summarization, the pointer-generator network produces more fluent summaries and addresses two shortcomings: reproducing factual details inaccurately and phrase repetition. Though this network can generate Out-Of-Vocabulary (OOV) words, it cannot completely represent them in context and may suffer from information loss. This paper aims to improve the quality of abstractive summarization by adding an extra pre-trained word embedding layer to the pointer-generator network. This mechanism helps maintain the meaning of words across a wider variety of contexts and ensures that every word has its own representation, even if it does not exist in the vocabulary. We modify the network with two recent word embedding mechanisms, i.e. Word2vec and FastText, to represent the semantic information of words more accurately. OOV words that would be marked as unknown tokens can now receive proper embeddings and be well considered in summary generation. Experiments on the CNN/Daily Mail corpus show that the new mechanism outperforms the plain pointer-generator network by at least 3 ROUGE points.

15:50
Enhanced Genetic Algorithm for Single Document Extractive Summarization

ABSTRACT. In extractive summarization, summaries are generated by selecting the most salient sentences from the original text. Text summarization can thus be seen as a classification of sentences into two groups: in-summary and not-in-summary. Many approaches have been proposed to extract key sentences, among which Genetic Algorithms (GAs) have shown promising results. In this paper, we propose an enhanced genetic algorithm to improve the quality of extractive text summarization. More concretely, we first evaluate the role of several sentence features and their contribution to improving the fitness function. We then investigate several crossover and mutation mechanisms to improve both the accuracy of summarization and the performance of our model. Experiments have been conducted on the Daily Mail dataset to assess the proposed model against previous works. The empirical results show that our proposed GA gives better accuracy than TextRank and SummaRuNNer, increasing accuracy by 7.2% and 6.9% respectively.