
09:30-10:30 Session 5A: Evolutionary Computation
Knowledge-based Solution Construction for Evolutionary Minimization of Systemic Risk

ABSTRACT. This paper concerns the problem of minimizing systemic risk in a system composed of interconnected entities, such as companies on a market. Systemic risk arises when, because of an initial failure of a limited number of elements, a significant part of the system fails. The system is modelled as a graph, with some nodes in the graph initially failing. The spreading of failures can be stopped by protecting nodes in the graph, which in the case of companies can be achieved by setting aside reserve funds. The goal of the optimization problem is to reduce the number of nodes that eventually fail due to connections in the system.

This paper studies the possibility of utilizing external knowledge for solution construction in this problem. Rules representing reusable information are extracted from solutions of problem instances and are used when solving new instances.

Experiments presented in the paper show that using a rule-based knowledge representation for constructing the initial population allows the evolutionary algorithm to attain better results during the optimization run.

Crossover Operator using Knowledge Transfer for the Firefighter Problem

ABSTRACT. This paper concerns the Firefighter Problem (FFP), a graph-based problem in which solutions can be represented as permutations. A new crossover operator is proposed that uses a machine learning model to decide how to combine two parent solutions of the FFP into an offspring. The operator works on two parent permutations, and the machine learning model indicates which parent the next permutation element should be taken from when constructing a new solution. Training data is collected during a training run in which transpositions are applied to solutions found by an evolutionary algorithm for a small problem instance. The machine learning model is trained to classify pairs of graph vertices into two classes corresponding to which vertex should be placed earlier in the permutation.

In the experiments, the machine learning model was trained on a set of FFP instances with 1000 vertices. Subsequently, the proposed operator was used for solving FFP instances with up to 10 000 vertices. The experiments showed that the proposed operator is able to effectively use knowledge gathered when solving smaller instances for solving larger instances of the same problem.
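
As a rough illustration of the operator described above, the sketch below merges two parent permutations by repeatedly asking a decision function which parent's next unused element to take. The function `prefer_first` is a hypothetical stand-in for the trained machine learning model; its name, its signature, and the toy rule in the usage lines are assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence


def model_guided_crossover(
    parent_a: Sequence[int],
    parent_b: Sequence[int],
    prefer_first: Callable[[int, int], bool],
) -> List[int]:
    """Build an offspring permutation from two parent permutations.

    prefer_first(u, v) is a hypothetical placeholder for the classifier:
    it returns True when vertex u should be placed before vertex v.
    """
    n = len(parent_a)
    used = set()
    child: List[int] = []
    i = j = 0
    while len(child) < n:
        # Advance each pointer past elements already placed in the child.
        while i < n and parent_a[i] in used:
            i += 1
        while j < n and parent_b[j] in used:
            j += 1
        a, b = parent_a[i], parent_b[j]
        # Both parents agree, or the decision function picks one candidate.
        chosen = a if a == b or prefer_first(a, b) else b
        child.append(chosen)
        used.add(chosen)
    return child


# Toy decision rule (assumption): always place the smaller vertex id first.
offspring = model_guided_crossover([0, 1, 2, 3], [3, 2, 1, 0], lambda u, v: u < v)
```

Because both parents contain the same elements, each pointer always lands on an unused element while the child is incomplete, so the result is a valid permutation.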

Differential evolution for association rule mining using categorical and numerical attributes

ABSTRACT. Association rule mining is a method for identifying dependence rules between features in a transaction database. In past years, researchers mostly applied the method to features consisting of categorical attributes; numerical attributes were rarely used in these studies. In this paper, we present a novel approach for mining association rules based on differential evolution, where features consist of numerical as well as categorical attributes. The problem is presented as a single-objective optimization problem, where the support and confidence of association rules are combined into a fitness function in order to determine the quality of the mined association rules. Initial experiments on sport data show that the proposed solution is promising for future development. Further challenges and problems are also exposed in this paper.
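
To make the two measures concrete, here is a minimal sketch of how support and confidence can be computed for a rule and combined into a single score. The weighted sum and the weights `alpha` and `beta` are assumptions chosen for illustration, since the abstract does not state the exact fitness function. The toy transactions use categorical items only.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Set[str]], antecedent: Set[str], consequent: Set[str]
) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


def rule_fitness(
    transactions: List[Set[str]],
    antecedent: Set[str],
    consequent: Set[str],
    alpha: float = 0.5,  # assumed weight of support
    beta: float = 0.5,   # assumed weight of confidence
) -> float:
    """Weighted sum of support and confidence (illustrative assumption)."""
    s = support(transactions, antecedent | consequent)
    c = confidence(transactions, antecedent, consequent)
    return alpha * s + beta * c


# Made-up toy transactions, not data from the paper.
toy = [{"run", "long"}, {"run", "short"}, {"bike", "long"}, {"run", "long"}]
score = rule_fitness(toy, {"run"}, {"long"})
```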

Intelligent Rub-Impact Fault Diagnosis based on Genetic Algorithm-based IMF Selection in Ensemble Empirical Mode Decomposition and Diverse Features Models

ABSTRACT. Condition monitoring of rub-impact faults is a challenging problem due to the complexity of the vibration signals that such faults produce. This complexity makes it hard to use traditional time- and frequency-domain analysis. Recently, various time-frequency analysis approaches, such as empirical mode decomposition (EMD) and ensemble EMD (EEMD), have been used for rubbing fault diagnosis. However, traditional EMD suffers from “mode-mixing” problems that make it difficult to find physically meaningful intrinsic mode functions (IMFs) for feature extraction. We propose an intelligent rub-impact fault diagnosis scheme using a genetic algorithm (GA)-based meaningful IMF selection technique for EEMD and diverse feature extraction models. First, the acquired signal is adaptively decomposed by EEMD into a series of IMFs that correspond to different frequency bands of the original signal. Then, a GA search using a new fitness function, which combines the mean-peak ratio (MPR) of rub impact and a mutual information (MI)-based similarity measure, is applied to select the meaningful IMF components. The designed fitness function ensures the selection of discriminative IMFs that carry explicit information about rubbing faults. The selected IMFs are used to extract fault features, which are then passed to a k-nearest neighbor (k-NN) classifier for fault diagnosis. The obtained results show that the proposed methodology efficiently selects discriminant signal-dominant IMFs, and that the presented diverse feature models achieve high classification accuracy for rub-impact fault diagnosis.

09:30-10:30 Session 5B: Time Series analysis
Location: Mixed room
Improving Time Series Prediction via Modification of Dynamic Weighted Majority in Ensemble Learning
SPEAKER: Peter Pavlík

ABSTRACT. In this paper, we explore how a modified Dynamic Weighted Majority (DWM) method of ensemble learning can enhance time series prediction. The DWM approach was originally introduced as a method to combine the predictions of multiple classifiers. We propose a modification that solves regression problems and uses differing features to further improve the accuracy of the ensemble. The proposed method is then tested in the domain of energy consumption forecasting.

Assessment and Adaption of Pattern Discovery Approaches for Time Series under the Requirement of Time Warping

ABSTRACT. In the automotive industry, the cars themselves as well as the production lines produce a large amount of data every day. Extracting information from these data requires high-performance data mining tools. One of these tools is pattern discovery. This paper addresses the assessment of different approaches to discovering frequent patterns in time series. Our special requirement is the detection of time-warped patterns of variable length. The comparison includes approaches based on dynamic time warping (DTW), discretization, and Keogh’s Matrix Profile. Every approach is implemented in MATLAB as an example and, if necessary, adapted to our use cases. The assessment focuses on the quality of the results, the runtime, and the effort of parametrization. For evaluation, time series test datasets are generated with predefined patterns based on random walks. The output patterns identified by the different pattern discovery algorithms are compared with the initial patterns and evaluated with respect to the Jaccard index. This yields a quality score for every algorithm and every parametrization, making it possible to compare different algorithms as well as approaches.
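
Two of the building blocks named above can be sketched briefly. The code shows the textbook dynamic time warping distance with an absolute-difference local cost, and a Jaccard index over the sets of time indices covered by a discovered pattern and a planted pattern. The authors' implementation is in MATLAB; this Python version, the choice of local cost, and the index-set representation are assumptions for illustration.

```python
from typing import Sequence, Set


def dtw_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Classic O(len(x) * len(y)) dynamic time warping distance."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def jaccard_index(found: Set[int], planted: Set[int]) -> float:
    """Size of the intersection divided by size of the union."""
    union = found | planted
    if not union:
        return 1.0
    return len(found & planted) / len(union)


# A time-warped copy of a pattern has DTW distance zero.
warped = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
# A discovered pattern at indices 15..24 versus a planted one at 10..19.
overlap = jaccard_index(set(range(15, 25)), set(range(10, 20)))
```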

ALoT: A time-series similarity measure based on alignment of textures

ABSTRACT. Inferring the similarity between two time-series signals is a key step in several data analysis tasks for a variety of engineering applications. In this paper, we introduce a novel elastic similarity measure (ALoT) based on the alignment of textures, instead of observed values, extracted from input time-series signals. To obtain the texture information, Local Binary Patterns are adapted for one-dimensional signals. According to experiments performed on a large number of benchmark time-series classification datasets, the proposed method achieves higher accuracy than current pairwise similarity measures for several cases in a 1-Nearest Neighbor classification setup.
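
The abstract does not spell out how Local Binary Patterns are adapted to one dimension. One common way, shown here purely as an assumption, is to compare each sample with its neighbours inside a fixed radius and pack the comparison results into an integer code. The function name `lbp_1d` and the `radius` parameter are hypothetical.

```python
from typing import List, Sequence


def lbp_1d(signal: Sequence[float], radius: int = 2) -> List[int]:
    """Compute a 1-D local binary pattern code for each interior sample.

    A bit is set when the neighbour is greater than or equal to the centre.
    Samples closer than `radius` to either end are skipped.
    """
    values = list(signal)
    codes: List[int] = []
    for i in range(radius, len(values) - radius):
        centre = values[i]
        # Neighbours to the left, then to the right, excluding the centre.
        neighbours = values[i - radius:i] + values[i + 1:i + radius + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= centre:
                code |= 1 << bit
        codes.append(code)
    return codes


codes = lbp_1d([1, 3, 2, 5, 4], radius=1)
shifted_codes = lbp_1d([11, 13, 12, 15, 14], radius=1)
```

Because the codes depend only on whether each neighbour is at least as large as the centre, adding a constant offset to the signal leaves them unchanged, which is one reason texture codes can be more robust than raw values when aligning signals.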

Inferring Temporal Structure from Predictability in Bumblebee Learning Flight
SPEAKER: Stefan Meyer

ABSTRACT. Insects succeed at remarkable navigational tasks. Bumblebees, for example, are capable of learning their nest location with sophisticated flight manoeuvres, forming a so-called learning flight. The learning flights, thought to be partially pre-programmed, enable the bumblebee to memorise spatial relations between its inconspicuous nest entrance and environmental cues. To date, environmental features (e.g. object positions on the eyes) and the learning experience of the insect have been used to describe the flights, but their structure, thought to facilitate learning, has not been investigated systematically. In this work, we present a novel approach to investigate whether, and over which time span, flight behaviour is predictable based on intrinsic properties only rather than external sensory information. We study the temporal composition of learning flights by estimating the smoothness of the underlying process. We then use echo state networks (ESN) and linear models (ARIMA) to predict the bumblebee trajectory from its past motion and identify different time-scales in learning flights using their predictive power. We found that visual information is not necessary within a 200 ms time window to explain the bumblebee learning behaviour.

09:30-10:30 Session 5C: Special Session on Data Selection in Machine Learning (DSML 2)
Location: Meeting room
Novelty Detection using Elliptical Fuzzy Clustering in a Reproducing Kernel Hilbert Space

ABSTRACT. Nowadays, novelty detection methods based on one-class classification are widely used for many important applications associated with computer and information security. In these areas, there is a need to detect anomalies in complex high-dimensional data. An effective approach for analyzing such data uses kernels that map the input space into a reproducing kernel Hilbert space (RKHS) for further outlier detection. The most popular methods of this type are support vector clustering (SVC) and kernel principal component analysis (KPCA). However, they have some drawbacks related to the shape and the position of the contours they build in an RKHS. To overcome these disadvantages, a new algorithm based on fuzzy clustering with the Mahalanobis distance in an RKHS is proposed in this paper. Unlike SVC and KPCA, it simultaneously builds elliptic contours and finds the optimal center in the RKHS. The proposed method outperforms SVC and KPCA in such important security-related problems as user authentication based on keystroke dynamics and detecting online extremist information on web forums.

Bare Bones Fireworks Algorithm for Medical Image Compression
SPEAKER: Milan Tuba

ABSTRACT. Digital images are of great importance in medicine. Efficient and compact storage of medical digital images is a major issue that needs to be solved. The JPEG lossy compression algorithm is the most widely used, and a better compression-to-quality ratio can be obtained by selecting appropriate quantization tables. Finding the optimal quantization tables is a hard combinatorial optimization problem, and stochastic metaheuristics have been proven to be very efficient for solving such problems. In this paper we propose an adjusted bare bones fireworks algorithm for quantization table selection. The proposed method was tested on different medical digital images. The results were compared to the standard JPEG algorithm. Various image similarity metrics were used, and it has been shown that the proposed method was more successful.

Data Pre-processing to apply Multiple Imputation techniques: A case study on real-world census data

ABSTRACT. Improving accuracy and reducing computational cost are the main goals of machine learning techniques, but both depend heavily on the test data used. This is even more so when the data come from real-world sources such as censuses, surveys or tokens that contain a high level of missing values. The absence of data and the presence of outliers are problems that must be treated carefully prior to any process related to data analysis. This work presents an overview of data pre-processing and describes the steps to follow prior to processing large volumes of high-dimensional data with categorical variables. As part of the dimensionality reduction process, when there is a high level of missing values in one or more variables, we use the Pairwise and Listwise Deletion methods. The generation of m clusters using the Kohonen Self-Organizing Maps (SOM) algorithm with H2O over R is also considered as a division of the data into similar groups, which are used as clusters for applying Multiple Imputation algorithms, creating m different values to impute a missing value.

Imbalanced data classification based on feature selection techniques

ABSTRACT. The difficulty of many classification tasks lies in the nature of the analyzed data, such as a disproportionate number of examples from different classes in a learning set. Ignoring this characteristic causes canonical classifiers to display strongly biased performance on imbalanced datasets. In this work, a novel technique for forming classifier ensembles for imbalanced datasets is presented. On the one hand, it takes into consideration the selected features used for training individual classifiers; on the other hand, it ensures an appropriate diversity of the classifier ensemble. The proposed method was tested in computer experiments carried out on several benchmark datasets. The results seem to confirm the usefulness of the proposed concept.

MapReduce model for Random Forest algorithm: experimental studies

ABSTRACT. In a world where technology has come to dominate almost every aspect of human life, the amount of data generated each minute grows at a rapid rate. The need to analyze massive volumes of data poses new challenges for researchers and specialists around the world. The MapReduce model became a center of interest because it offers a way of execution that allows tasks to be parallelized. Machine learning has also gained significantly more attention due to the many applications where it can be used. In this paper, we discuss the challenges of Big Data analysis and provide an overview of the MapReduce model. We conducted experiments to examine the performance of MapReduce on the example of a Random Forest algorithm to determine its effect on the overall quality of the analysis. The paper ends with remarks on the strengths and pitfalls of using MapReduce as well as ideas on improving its potential.

10:30-11:30 Session 6: Tutorial
Tutorial: Applying Machine Learning to detect Android Malware

ABSTRACT. The possibilities and advantages of applying Machine Learning to solve the most diverse problems are beyond question. It has been shown how this wide set of techniques can help to address varied issues related to computer vision, natural language processing, fraud detection, robotics or bioinformatics, among many others. In this tutorial we aim to present the possibilities of this field when dealing with a complex, current and critical problem: the detection of malware on Android devices. As we will show, Machine Learning techniques such as classification and clustering algorithms, deep learning or evolutionary computation are currently being employed to detect malware samples whose behavior exhibits malicious patterns. Furthermore, we will explain the different tools designed for performing Android malware analysis and reverse engineering processes. Finally, we will describe first our framework AndroPyTool, aimed at extracting a wide set of features from Android applications with the goal of characterizing their behavior in depth, and second the OmniDroid dataset, a comprehensive dataset of features from benign and malicious Android applications.

11:30-12:00 Coffee Break
12:00-13:00 Session 7: Plenary talk
Tackling Many Objectives

ABSTRACT. Many optimisation problems in the real world need to consider multiple conflicting objectives simultaneously. Evolutionary algorithms are excellent candidates for finding a good approximation to the Pareto optimal front in a single run. However, many multi-objective optimisation algorithms are effective for two or three objectives only. It is an on-going challenge to deal with a larger number of objectives. In this talk, I will explain several methods for dealing with many objectives. First, we will describe a method for reducing a large number of objectives to a smaller one, especially when there is redundancy among different objectives. Second, alternative dominance relationships, other than Pareto dominance, will be introduced to make previously non-comparable solutions comparable. Lastly, new algorithms will be introduced to cope with many objectives through the use of two separate archives, for convergence and diversity, respectively. Our studies show that these methods are very effective and outperform other popular methods in the literature.
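
For readers unfamiliar with the baseline relation the talk starts from, the following sketch shows standard Pareto dominance and a simple filter for the non-dominated set. It assumes that all objectives are minimised; the alternative dominance relations and the two-archive algorithms mentioned in the abstract are not shown.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumes every objective is to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


front = non_dominated([(1, 4), (2, 2), (3, 3), (4, 1)])
```

Two points such as `(1, 4)` and `(4, 1)` are non-comparable under this relation because neither dominates the other.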

13:00-14:00 Lunch Break
14:10-16:10 Session 8A: Anomaly Detection and Trust Management
Effective centralized trust management model for Internet of Things
SPEAKER: Hela Maddar

ABSTRACT. The emergence of the Internet of Things (IoT) is a result of convergence between multiple technologies, such as the Internet, wireless communication, embedded systems, microelectronic systems and nanotechnology. In 2016, 5.5 million objects were connected every day in the world, a number that could quickly reach billions by 2020 [1]. Gartner predicts that 26 billion objects will be installed in 2020. The market for connected objects could range from a few tens of billions to up to several thousand billion units. Wireless sensor networks (WSNs) are among the vital components of the IoT: they allow the representation of dynamic characteristics of the real world in the virtual world of the Internet. Nevertheless, opening these types of network to the Internet presents a serious problem from a security standpoint. The implementation of intrusion detection mechanisms is therefore essential to limit the internal and external attacks that threaten the smooth running of the network. In this paper, we propose an efficient trust management model which inspects the nodes in depth to detect attacks that threaten wireless sensor networks. Our model includes a geographical localization system used to identify the location of the nodes. It also includes a set of attack detection rules based on the analysis of different parameters. Furthermore, we propose a mathematical model for trust establishment and its update on the network. During the simulations we observe an improvement in the efficiency of the implemented geo-location model as well as reasonable energy consumption. Similarly, we have been able to show the efficiency of our model in terms of attack detection rate.

Applying Tree Ensemble to Detect Anomalies in Real-World Water Composition Dataset
SPEAKER: Minh Nguyen

ABSTRACT. Drinking water is one of the fundamental human needs. During delivery through the distribution network, drinking water is susceptible to contaminants. Early recognition of changes in water quality is essential to the provision of clean and safe drinking water. For this purpose, a contamination warning system (CWS) composed of sensors, a central database and an event detection system (EDS) has been developed. Conventionally, an EDS employs time series analysis and domain knowledge for automated detection. This paper proposes a general data-driven approach to constructing an automated online event detection system for drinking water. Various tree ensemble models are investigated in application to real-world water quality data. In particular, gradient boosting methods are shown to overcome the challenges of time series data, class imbalance and collinearity, and to yield satisfactory predictive performance.

An Adaptive Anomaly Detection Algorithm for Periodic Real Time Data Streams
SPEAKER: Zirije Hasani

ABSTRACT. Real-time anomaly detection in big data streams (time series) is an important research topic nowadays because most of the world's data is generated in continuous temporal processes. Holt-Winters (HW) and Taylor Double Holt-Winters (TDHW) forecasting models are used to predict the normal behavior of periodic streams, and if the deviation between the observed and predicted values exceeds some predefined measure, an anomaly is detected. In this work, we propose an enhancement of this approach. A Genetic Algorithm (GA) is used to periodically optimize the HW and TDHW smoothing parameters, the two sliding window parameters that improve Hyndman's MASE measure of deviation, and the value of the threshold parameter that defines the no-anomaly confidence interval. We also propose a new optimization function based on input training datasets with annotated anomaly intervals, in order to detect the right anomalies and minimize the number of false ones. The proposed method is evaluated on the known anomaly detection benchmarks, the NUMENTA and Yahoo datasets with annotated anomalies, and on our real log data generated by the Macedonian national education management information system.
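
The deviation measure mentioned above can be illustrated with the standard mean absolute scaled error (MASE), which divides the forecast error by the in-sample error of a naive seasonal forecast. The sketch below shows only this standard definition; the sliding-window variant, the Holt-Winters forecaster, and the GA tuning described in the abstract are not reproduced, and the parameter names are assumptions.

```python
from typing import Sequence


def mase(
    train: Sequence[float],
    actual: Sequence[float],
    forecast: Sequence[float],
    period: int = 1,
) -> float:
    """Mean absolute scaled error with a seasonal-naive scaling term.

    period is the assumed season length; period=1 gives the plain naive
    forecast (each value predicted by its predecessor).
    """
    naive_errors = [
        abs(train[t] - train[t - period]) for t in range(period, len(train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant at the given period")
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return (sum(errors) / len(errors)) / scale


# Made-up numbers: the naive in-sample error is 1, the forecast error is 1.
deviation = mase(train=[1, 2, 3, 4], actual=[5, 6], forecast=[5, 8])
```

A value above 1 means the forecast was worse than the naive baseline on average; a detector in this style would flag an anomaly when the deviation exceeds a chosen threshold.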

ATM Fraud Detection using Outlier Detection

ABSTRACT. In this work, a fraud detection model that combines account behavior features with an anomaly detection method is proposed. Given a set of transactions, they are grouped into accounts that withdraw money using only local ATMs and accounts that use both local and foreign ATMs. Only known legitimate transactions are used to extract a set of features representing legitimate behavior. An unknown transaction is then classified using anomaly detection. The experimental results show that the proposed features, combined with Isolation Forest anomaly detection, are able to detect all fraudulent transactions.

Anomaly Detection in Spatial Layer Models of Autonomous Agents

ABSTRACT. For describing the complete state of complex environments with multiple mobile and autonomous agents, spatial layer models (SLM) are popular data structures. These models consist of several planes describing the spatial structure of selected features. Those SLM can directly be used for deep reinforcement learning tasks. However, detecting anomalies in such SLM poses two major challenges: the state space explosion in such settings and the spatial relations between the features. In this paper, we present a method for anomaly detection in SLM which solves both challenges by first extracting significant sub-patterns from training data and storing them in a dictionary. Afterwards, the entries of this dictionary are used for reconstructing SLM, which have to be validated. The resulting covering rate is an indicator for the (ab)normality of the given SLM. We show the applicability of our approach for a simple multi-agent scenario, and more complex smart factory scenarios with autonomous agents.

Intrusion Detection Using Transfer Learning in Machine Learning Classifiers Between Non-cloud and Cloud Datasets
SPEAKER: Roja Ahmadi

ABSTRACT. One of the critical issues in developing intrusion detection systems (IDS) in cloud-computing environments is the lack of publicly available cloud intrusion detection datasets, which hinders research into IDS in this area. There are, however, many non-cloud intrusion detection datasets. This paper seeks to leverage one of the well-established non-cloud datasets and analyze it in relation to one of the few available cloud datasets to develop a detection model using a machine learning technique. A complication is that these datasets often have different structures, contain different features and contain different, though overlapping, types of attack. The aim of this paper is to explore whether a simple machine learning classifier containing a small common feature set, trained using a non-cloud dataset that has a packet-based structure, can be usefully applied to detect specific attacks in the cloud dataset, which contains time-based traffic. Through this, the differences and similarities between attacks in the cloud and non-cloud datasets are analyzed and suggestions for future work are presented.

Signal Reconstruction using Evolvable Recurrent Neural Networks

ABSTRACT. Data loss is common in wireless networks due to noise, unexpected damage, unreliable links, and collisions, which greatly reduces the accuracy of reconstruction. Existing interpolation techniques fail to provide satisfactory accuracy when the amount of missing data becomes large. To address this problem, this paper proposes a novel approach to reconstructing randomly missing data based on interpolation and a machine learning technique, namely a Cartesian genetic programming evolved recurrent neural network (CGPRNN). Feed-forward neural networks have been very successful in signal processing in general, but recurrent neural networks have an edge where a system with memory is a priority. Recurrent neural networks provide not only non-linearity but also non-Markovian state information. The proposed method is used to reconstruct lost samples in audio signals, which are non-stationary in nature, through accurate prediction. Simulation results are presented to validate the performance of CGPRNN for accurate reconstruction of distorted signals. An error rate of 12% for 25% missing data and 18% for 50% distorted data is achieved where the system has low confidence in its prediction.

Applying cost-sensitive classifiers with reinforcement learning to IDS

ABSTRACT. When using an intrusion detection system as protection against certain kinds of attacks, the impact of classifying normal samples as attacks (False Positives) or attacks as normal traffic (False Negatives) is completely different. In order to prioritize the absence of one kind of error, we use reinforcement learning strategies which allow us to build a cost-sensitive meta-classifier. This classifier has been built using a DQN architecture over an MLP. While the DQN introduces extra effort during the training steps, it does not cause any penalty on the detection system. We show the feasibility of our approach on two different and commonly used datasets, achieving reductions of up to 100% in the desired error by changing the rewarding strategies.

14:10-16:10 Session 8B: Medical Applications of Artificial Intelligence
Location: Mixed room
Improving the Decision Support in Diagnostic Systems using Classifier Probability Calibration

ABSTRACT. In modern medical diagnosis, classifying a patient's disease is often realized with the help of a system-aided symptom interpreter. Most of these systems rely on supervised learning algorithms, which can statistically extend the doctor's logical capabilities for interpreting and examining symptoms, thus helping the doctor to find the correct diagnosis. In addition, these algorithms compute classifier scores and class labels that are used to statistically characterize the system's confidence level regarding a patient's type of disease. Unfortunately, most classifier scores are based on an arbitrary scale, so their interpretations often lack clinical significance. Especially when combining multiple classifier scores within a diagnostic system, it is essential to also apply a calibration process to make the different scores comparable.

We adapted isotonic regression, a frequently used calibration technique, for our medical diagnostic support system to provide a flexible and effective scaling process that calibrates the arbitrary scales of probability scores from classifiers. In a comparative evaluation, we show that isotonic regression can improve the system diagnosis based on an ensemble of classification results and effectively remove outliers from the data, thus optimizing the decision support system to obtain more accurate diagnostic results.
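
Isotonic regression for calibration is commonly computed with the pool adjacent violators algorithm. The sketch below is a plain-Python version of that standard algorithm, applied to binary labels sorted by classifier score; it is not the authors' system, and the function names and toy data are assumptions.

```python
from typing import List, Sequence, Tuple


def pool_adjacent_violators(values: Sequence[float]) -> List[float]:
    """Return the non-decreasing sequence closest to values in least squares."""
    blocks: List[List[float]] = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1.0])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def fit_calibration(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[List[float], List[float]]:
    """Sort by score and fit a monotone map from score to probability."""
    pairs = sorted(zip(scores, labels))
    calibrated = pool_adjacent_violators([label for _, label in pairs])
    return [score for score, _ in pairs], calibrated


# Made-up scores on an arbitrary scale with binary outcomes.
sorted_scores, probabilities = fit_calibration(
    [0.1, 0.2, 0.3, 0.4, 0.5], [0, 1, 0, 1, 1]
)
```

The fitted values are non-decreasing in the score and lie between 0 and 1 for binary labels, which is what makes scores from different classifiers comparable after calibration.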

Compound local binary pattern and enhanced Jaya optimized extreme learning machine for digital mammogram classification
SPEAKER: Figlu Mohanty

ABSTRACT. The fatality rate due to breast cancer still continues to remain high across the world and women are the frequent sufferers of this cancer. Such a high fatality rate can be lowered if the cancer is identified at its early stage. In this context, mammography is one of the powerful imaging modalities to detect and diagnose cancer effectively. A computer-aided diagnosis (CAD) system is a potential tool which analyses the mammographic images to reach a correct decision. The present work aims at developing a CAD framework which can classify the mammograms accurately. This work has primarily four stages. First, contrast limited adaptive histogram equalization (CLAHE) is used for pre-processing. Second, feature extraction is realized using compound local binary pattern (CLBP) followed by principal component analysis (PCA) for feature reduction. Finally, an enhanced Jaya-based extreme learning machine is utilized to classify the mammograms as normal or abnormal, and further, benign or malignant. The success rate in terms of classification accuracy achieves 100% and 99.48% for MIAS and DDSM datasets, respectively.

An ELM based Regression Model for ECG Artifact Minimization from Single Channel EEG

ABSTRACT. Electroencephalogram (EEG) is the most widely used non-invasive technique to record the electrical activity of the brain for analysis or diagnostic procedures. The sensitive electrodes of EEG are susceptible to high-amplitude electrocardiogram (ECG) signals, which superimpose on the recorded EEG. Minimizing this artifact effectively from a single-channel EEG without a reference ECG channel is a challenge. In this paper, the extreme learning machine (ELM) algorithm is implemented as a regression model for ECG artifact removal from single-channel EEG. The S-transform (ST) of the EEG signals is used as the feature set for ELM training and testing. ST uniquely combines progressive resolution and absolutely referenced phase information for the given time series. Trained with pairs of contaminated and clean EEG signals in both magnitude and phase, the ELM is able to effectively minimize the ECG artifact in the contaminated EEG signal in the testing phase. The average root mean square error (RMSE) and correlation coefficient (CC) between the actual EEG signal and the EEG signal estimated by the ELM-based regression model are 0.32 and 0.96, respectively.

Intelligent wristbands for the automatic detection of emotional states for the elderly

ABSTRACT. Over the last few years, research on computational intelligence has been conducted to detect the emotional states of people. This paper proposes the use of intelligent wristbands for the automatic detection of emotional states to develop an application that allows older people to be monitored in order to improve their quality of life. The paper describes the hardware design and the cognitive module that allows the recognition of the emotional states. The proposed wristband

Deep Learning-based Approach for the Semantic Segmentation of Bright Retinal Damage

ABSTRACT. Regular screening for the development of diabetic retinopathy is imperative for an early diagnosis and a timely treatment, thus preventing further progression of the disease. The conventional screening techniques based on manual observation by qualified physicians can be very time consuming and prone to error. In this paper, a novel automated screening model based on deep learning for the semantic segmentation of exudates in color fundus images is proposed with the implementation of an end-to-end convolutional neural network built upon U-Net architecture. This encoder-decoder network is characterized by the combination of a contracting path and a symmetrical expansive path to obtain precise localization with the use of context information. The proposed method was validated on E-OPHTHA and DIARETDB1 public databases achieving promising results compared to current state-of-the-art methods.

Machine Learning for drugs prescription
SPEAKER: Pedro Silva

ABSTRACT. In a medical appointment, patient information, including past exams, is analyzed in order to define a diagnosis. This process is prone to errors, since there may be many possible diagnoses. This analysis is very dependent on the experience of the doctor. Even with the correct diagnosis, prescribing medicines can be a problem, because there are multiple drugs for each disease and some may not be used due to allergies or high cost. It would thus be helpful, if the doctors were able to use a system that, for each diagnosis, provided a list of the most suitable medicines.

Our approach is to support the physician in this process. Rather than trying to predict the medicine, we aim to, given the available information, predict the set of the most likely drugs.

The prescription problem may be solved as a Multi-Label classification problem since, for each diagnosis, multiple drugs may be prescribed at the same time. Due to its complexity, some simplifications were made in order to make the problem tractable, and multiple approaches were tried under different assumptions. The data supplied was also complex and had significant quality problems, which led to a strong investment in data preparation, in particular feature engineering.

Overall, the results in each scenario are good with performances almost twice the baseline, especially using Binary Relevance as transformation approach.
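
To make the Binary Relevance transformation concrete, here is a minimal sketch and not the authors' system: one independent binary model is trained per drug, and the drugs are then ranked by predicted probability for a given diagnosis. The base learner (a simple frequency count per diagnosis), the function names and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


class FrequencyBinaryClassifier:
    """Toy binary base learner: estimates P(label = 1 | diagnosis) by counting."""

    def __init__(self):
        self.pos = defaultdict(int)
        self.total = defaultdict(int)

    def fit(self, diagnoses, targets):
        for diagnosis, target in zip(diagnoses, targets):
            self.total[diagnosis] += 1
            self.pos[diagnosis] += target
        return self

    def predict_proba(self, diagnosis):
        n = self.total.get(diagnosis, 0)
        return self.pos.get(diagnosis, 0) / n if n else 0.0


def fit_binary_relevance(diagnoses, drug_sets):
    """Binary Relevance: train one independent binary model per drug label."""
    labels = sorted(set().union(*drug_sets))
    models = {}
    for drug in labels:
        targets = [1 if drug in drugs else 0 for drugs in drug_sets]
        models[drug] = FrequencyBinaryClassifier().fit(diagnoses, targets)
    return models


def rank_drugs(models, diagnosis, top_k=3):
    """Return up to top_k drugs with non-zero estimated probability, best first."""
    scored = [(m.predict_proba(diagnosis), drug) for drug, m in models.items()]
    scored = [(p, drug) for p, drug in scored if p > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [drug for _, drug in scored[:top_k]]


# Hypothetical toy records: (diagnosis, set of prescribed drugs).
diagnoses = ["flu", "flu", "flu", "asthma", "asthma"]
drug_sets = [
    {"paracetamol"},
    {"paracetamol", "ibuprofen"},
    {"paracetamol"},
    {"salbutamol"},
    {"salbutamol", "budesonide"},
]
models = fit_binary_relevance(diagnoses, drug_sets)
print(rank_drugs(models, "flu"))
```

In practice the frequency counter would be replaced by any binary classifier over the engineered features; the per-label decomposition stays the same.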

Identification of individual glandular regions using LCWT and machine learning techniques

ABSTRACT. A new approach for the segmentation of gland units in histological images is proposed with the aim of contributing to the improvement of prostate cancer diagnosis. Clustering methods on several colour spaces are applied to each sample in order to generate a binary mask of the different tissue components. From the mask of lumen candidates, the Locally Constrained Watershed Transform (LCWT) is applied as a novel gland segmentation technique never before used on this type of image. 500 random gland candidates, both benign and pathological, are selected to evaluate the LCWT technique, yielding a Dice coefficient of 0.85. Several shape and textural descriptors in combination with contextual features and a fractal analysis are applied, in a novel way, on different colour spaces, giving a total of 297 features to discern between artefacts and true glands. The most relevant features are then selected by an exhaustive statistical analysis in terms of independence between variables and dependence with the class. 3,200 artefacts, 3,195 benign glands and 3,000 pathological glands are obtained from a data set of 1468 images at 10x magnification. A careful strategy of data partition is implemented to robustly address the classification problem between artefacts and glands. Both linear and non-linear approaches are considered using machine learning techniques based on Support Vector Machines (SVM) and feedforward neural networks, achieving values of sensitivity, specificity and accuracy of 0.92, 0.97 and 0.95, respectively.
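
For readers unfamiliar with the Dice coefficient used to score the segmentation, the following sketch computes it for two binary masks. It is illustrative only: the function name and the tiny masks are assumptions, and real masks would be full-size images.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks (nested lists)."""
    flat_pred = [v for row in pred for v in row]
    flat_truth = [v for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    size = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size == 0 else 2.0 * intersection / size


# Hypothetical 2x3 masks: predicted gland pixels vs. annotated gland pixels.
predicted = [[1, 1, 0],
             [0, 1, 0]]
annotated = [[1, 0, 0],
             [0, 1, 1]]
dice = dice_coefficient(predicted, annotated)
print(round(dice, 3))
```

A value of 1 means the two masks coincide exactly and 0 means they share no foreground pixel.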

Specifics Analysis of Medical Communities in Social Network Services

ABSTRACT. Social networks contain a lot of useful medical information in users’ and communities’ posts, especially about adverse drug reactions. However, before processing medical communities, it is important to be aware of their implicit features, which could affect the reliability of the information retrieved. We use principal component centrality evaluation to reveal features of the distribution of influence among community members. Cosine similarity is used to compare vocabularies and structural indicators of communities of different types. The research found that medical communities have significant similarities with care-related communities, so the latter can be used to extend the information base for collecting drug responses. In addition, medical communities may have an atypical structure with several users who have high influence in a particular group, which shows that it is necessary to verify the reliability of the information retrieved.
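
The vocabulary comparison by cosine similarity can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the whitespace tokenisation, the function name and the two example posts are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two term-frequency vectors (Counter objects)."""
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vocabularies of two communities, built from made-up posts.
medical = Counter("drug dose rash drug".split())
care = Counter("drug rash care".split())
similarity = cosine_similarity(medical, care)
print(round(similarity, 3))
```

The same function applies unchanged to vectors of structural indicators, provided both communities are described by the same keys.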

14:10-15:25 Session 8C-I: Special Session on Machine Learning for Renewable Energy applications (MLRE)
Location: Meeting room
Wind power ramp events ordinal prediction using minimum complexity echo state networks

ABSTRACT. Renewable energy has been the fastest growing source of energy in recent years. In Europe, wind energy is currently the energy source with the highest growth rate and the second largest production capacity, after gas energy. Some problems hinder the integration of wind energy into the electric network. These include wind power ramp events, which are sudden differences (increases or decreases) of wind speed in short periods of time. These wind ramps can damage the turbines in the wind farm, increasing the maintenance costs. Currently, the best way to deal with this problem is to predict wind ramps beforehand, in such a way that the turbines can be stopped before their occurrence, avoiding any possible damage. In order to perform this prediction, models that take advantage of the temporal information are often used. Among the best-known models of this kind are recurrent neural networks. In this work, we consider a type of recurrent neural network known as the Echo State Network (ESN), which has demonstrated good performance when predicting time series. Specifically, we propose to use Minimum Complexity ESNs to approach a wind ramp prediction problem at three wind farms located in Spain. We compare three different network architectures, depending on how we arrange the connections of the input layer, the reservoir and the output layer. From the results, a single reservoir for wind speed with delay line reservoir and feedback connections is shown to provide the best performance.
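
As a rough illustration of what a delay line reservoir looks like, the sketch below runs the state update of a chain-shaped reservoir in which each unit receives the input and the previous state of its single predecessor. It is a simplification under stated assumptions: the alternating input signs, the parameter values and the function name are illustrative, the feedback connections mentioned in the abstract are left out, and the trained (ordinal) readout is omitted.

```python
import math


def dlr_states(inputs, n_units=5, r=0.5, v=0.1):
    """Run a delay line reservoir over a scalar input series.

    Unit i receives the input scaled by +v or -v and the previous state of
    unit i-1 scaled by the single shared recurrent weight r.
    """
    # Assumed deterministic sign pattern for the input weights.
    signs = [1 if i % 2 == 0 else -1 for i in range(n_units)]
    state = [0.0] * n_units
    history = []
    for u in inputs:
        new_state = []
        for i in range(n_units):
            previous_neighbour = state[i - 1] if i > 0 else 0.0
            new_state.append(math.tanh(signs[i] * v * u + r * previous_neighbour))
        state = new_state
        history.append(state)
    return history


# A single impulse travels one unit further down the line at every step.
states = dlr_states([1.0, 0.0, 0.0], n_units=3)
for row in states:
    print([round(x, 4) for x in row])
```

Only two scalars (r and v) define the whole reservoir, which is the sense in which such networks are of minimum complexity; a linear or ordinal readout would then be fitted on the collected states.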

Distribution-based discretisation and ordinal classification applied to wave height prediction

ABSTRACT. Wave height prediction is an important task for ocean and marine resource management. Traditionally, regression techniques are used for this prediction, but estimating continuous changes in the corresponding time series can be very difficult. With the purpose of simplifying the prediction, wave height can be discretised in consecutive intervals, resulting in a set of ordinal categories. Although this discretisation could be performed using the criterion of an expert, the prediction could then be biased towards the opinion of the expert, and the obtained categories could be unrepresentative of the data recorded. In this paper, we propose a novel automated method to categorise the wave height based on selecting the most appropriate distribution from a set of well-suited candidates. Moreover, given that the categories resulting from the discretisation show a clear natural order, we propose to use different ordinal classifiers instead of nominal ones. The methodology is tested on real wave height data collected from two buoys located in the Gulf of Alaska and South Kodiak. We also incorporate reanalysis data in order to increase the accuracy of the predictors. The results confirm that this kind of discretisation is suitable for the time series considered and that the ordinal classifiers achieve outstanding results in comparison with nominal techniques.

Merging ELMs with Satellite Data and Clear-sky models for Effective Solar Radiation Estimation

ABSTRACT. This paper proposes a new approach to estimate Global Solar Radiation based on the use of the Extreme Learning Machine (ELM) technique combined with satellite data and a clear-sky model. Our study area is the radiometric station of Toledo, Spain. In order to train the Neural Network proposed, one complete year of hourly global solar radiation data (from the 1st of May 2013 to the 30th of April 2014) is used as the target of the experiments, and different input variables are considered: a cloud index, a clear-sky solar radiation model and several reflectivity values from Meteosat visible images. To assess the results obtained by the ELM we have selected as a reference a physical-based method which considers the relation between a clear-sky index and a cloud cover index. Then a measure of the Root Mean Square Error (RMSE) and the Pearson's Correlation Coefficient ($r^2$) is obtained to evaluate the performance of the suggested methodology against the reference model. We show the improvement of the results obtained by the ELM with respect to those obtained by the physical-based method considered.
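
The two evaluation measures named above can be computed as in this minimal sketch. The function names and the four hourly values are made up for illustration; the code returns Pearson's r, whose square gives the $r^2$ figure.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between two equally long series."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson's correlation coefficient r between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical hourly radiation values (W/m^2): measured vs. estimated.
measured = [100.0, 200.0, 300.0, 400.0]
estimated = [110.0, 190.0, 310.0, 390.0]
error = rmse(measured, estimated)
r = pearson_r(measured, estimated)
print(error, round(r, 4), round(r ** 2, 4))
```

RMSE is expressed in the units of the target, so it is only comparable across models evaluated on the same data, whereas r is unit-free.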

Studying the effect of measured solar power on evolutionary multi-objective prediction intervals

ABSTRACT. While it is common to make point forecasts for solar energy generation, estimating the forecast uncertainty has received less attention. In this article, prediction intervals are computed within a multi-objective approach in order to obtain an optimal coverage/width tradeoff. In particular, we study whether using the power measured at the time of prediction as an additional input, alongside meteorological forecast variables, improves the properties of prediction intervals for short time horizons (up to three hours). Results show that the intervals tend to be narrower (i.e. less uncertain), and the ratio between coverage and width is larger. The method has been shown to obtain intervals with better properties than baseline Quantile Regression.
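
The two competing objectives, coverage and width, can be measured as in the sketch below. It is illustrative only: the function name and the numbers are assumptions, and the multi-objective optimiser that trades the two off is not shown.

```python
def coverage_and_width(y_true, lower, upper):
    """Return (fraction of targets inside their interval, mean interval width)."""
    n = len(y_true)
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return covered / n, mean_width


# Hypothetical solar power observations and their predicted intervals.
observed = [5.0, 7.0, 9.0, 4.0]
lower = [4.0, 6.0, 10.0, 3.0]
upper = [6.0, 8.0, 12.0, 5.0]
coverage, width = coverage_and_width(observed, lower, upper)
print(coverage, width)
```

Widening every interval raises coverage trivially, which is why the two quantities have to be optimised jointly rather than one at a time.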

Gaussian Process Kernels for Support Vector Regression in Wind Energy Prediction

ABSTRACT. We consider wind energy prediction by Support Vector Regression (SVR) with generalized Gaussian Process kernels, proposing a validation-based kernel choice which is then used in two prediction problems instead of the standard Gaussian kernel. The resulting model beats the standard Gaussian SVR in one problem and ties with it in the other. Besides the flexibility this approach offers, SVR hyper-parameterization can also be simplified.

15:25-16:10 Session 8C-II: Special Session on Evolutionary Computing Methods for Data Mining: Theory and Applications (ECMD)
Location: Meeting room
Hospital admission and risk assessment associated to exposure of fungal bioaerosols at a municipal landfill using statistical models

ABSTRACT. The objective of this research is to determine the statistical relationship and degree of association between two variables: hospital admission days and diagnosis (disease) potentially associated with fungal bioaerosol exposure. Admissions included acute respiratory infections, atopic dermatitis, pharyngitis and otitis. Statistical analysis was done using Statgraphics Centurion XVI software. In addition, the occupational exposure to fungal aerosols in stages of a landfill was estimated using the BIOGAVAL method and represented with the Golden Surfer XVI program. Biological risk assessment with the sentinel microorganisms A. fumigatus and Penicillium sp. indicated that occupational exposure to fungal aerosols is at the biological action level. Preventive measures should be taken to reduce the risk of acquiring acute respiratory infections, dermatitis or other skin infections.

Bat algorithm swarm robotics approach for dual non-cooperative search with self-centered mode
SPEAKER: Eneko Osaba

ABSTRACT. This paper presents a swarm robotics approach for dual non-cooperative search, where two robotic swarms are deployed within a map with the goal to find their own target point, placed at an unknown location of the map. We consider the self-centered mode, in which each swarm tries to solve its own goals with no consideration to any other factor external to the swarm. This problem, barely studied so far in the literature, is solved by applying a popular swarm intelligence method called bat algorithm, adapted to this problem. Five videos show some of the behavioral patterns found in our computational experiments.

GELAB – A Matlab Toolbox for Grammatical Evolution

ABSTRACT. In this paper, we present a Matlab version of libGE, a well-known library for Grammatical Evolution (GE). GE was proposed initially in [1] as a tool for automatic programming. Since then, GE has been widely successful in innovation and in producing human-competitive results for various types of problems. However, its implementation in C++, libGE, was somewhat prohibitive for a wider range of scientists and engineers. libGE requires several tweaks and integrations before it can be used, and for anybody who does not have a background in computer science its usage could be a bottleneck. This prompted us to find a way to bring it to Matlab. Matlab is a fourth generation programming language used for numerical computing, and it is well known for its user-friendliness in the wider research community. By bringing GE to Matlab, we hope that many researchers across the world will be able to use it, regardless of their academic background. We call our implementation GELAB. GELAB is currently available online as open-source software. It can be readily used in research and development.
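
For context, the core of Grammatical Evolution is the mapping from an integer genome to a program through a grammar: each codon selects a production for the leftmost non-terminal by taking the codon modulo the number of available productions. The sketch below shows this mapping in Python; it does not use the GELAB or libGE APIs, and the grammar, the function name and the wrapping limit are assumptions chosen for illustration.

```python
# Hypothetical toy grammar: each non-terminal maps to a list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["y"]],
}


def ge_map(genome, grammar, start="<expr>", max_wraps=2):
    """Map a list of integer codons to a sentence of the grammar.

    Returns None if the genome, reused up to max_wraps extra times,
    runs out of codons before all non-terminals are expanded.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:
            output.append(symbol)  # terminal: emit as-is
            continue
        choices = grammar[symbol]
        if len(choices) == 1:
            production = choices[0]  # no choice to make, no codon consumed
        else:
            if used >= limit:
                return None  # mapping failed: invalid individual
            codon = genome[used % len(genome)]  # wrap around the genome
            used += 1
            production = choices[codon % len(choices)]
        symbols = list(production) + symbols  # expand leftmost non-terminal
    return " ".join(output)


print(ge_map([0, 1, 0, 1, 1, 1], GRAMMAR))
```

Because the search operates on plain integer lists, any standard evolutionary algorithm can be used unchanged while the grammar guarantees syntactically valid programs.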

16:10-16:40 Coffee Break
16:40-18:10 Session 9A: Image Analysis
Deep-Learning-based Classification of Rat OCT images after Intravitreal Injection of ET-1 for Glaucoma Understanding

ABSTRACT. Optical coherence tomography (OCT) is a useful technique to monitor retinal damage. We present an automatic method to accurately classify rodent OCT images as healthy or pathological (before and after 14 days of intravitreal injection of Endothelin-1, respectively), making use of a fine-tuned DenseNet-201 architecture and a customized top-model. We validated the performance of the method on 1912 OCT images, yielding promising results (AUC = 0.99 ± 0.01 in a P = 15 leave-P-out cross-validation). We also compared the results of the fine-tuned network with those achieved by training the network from scratch, obtaining some interesting insights. The presented method is a step forward in understanding pathological rodent OCT retinal images, as at the moment there is no known discriminating characteristic which allows this type of image to be classified accurately. The result of this work is a very accurate and robust automatic method to distinguish between healthy images and those of a rodent model of glaucoma, which is the backbone of future work dealing with human OCT images.

CCTV Image Sequence Generation and Modeling Method for Video Anomaly Detection using Generative Adversarial Network
SPEAKER: Won-Sup Shin

ABSTRACT. Video anomaly detection (VAD) is one of the most attractive problems in various fields such as computer vision. In this paper, we propose a VAD classifier modeling method that learns in a supervised learning manner. The basic idea is to solve the problem of labeled data shortage through transfer learning. The key idea is to create an underlying model for transfer learning through the discriminator of a GAN. We solved this problem by proposing a GAN model consisting of a generator that generates video sequences and a discriminator that follows the LRCN structure. In the experiments, the VAD classifier learned through GAN-based transfer learning obtained higher accuracy and recall than the pure LRCN classifier and other machine learning methods. Additionally, we demonstrated that the generator is able to stably generate images similar to the actual data as the learning progresses. To the best of our knowledge, this paper is the first to solve the VAD problem using a GAN model in a supervised learning manner.

Retinal Image Synthesis for Glaucoma Assessment using DCGAN and VAE Models

ABSTRACT. The performance of a Glaucoma assessment system is highly affected by the number of labelled images used during the training stage. However, labelled images are often scarce or costly to obtain. In this paper, we address the problem of synthesising retinal fundus images by training a Variational Autoencoder and an adversarial model on 2357 retinal images. The innovation of this approach is in synthesising retinal images without using previous vessel segmentation from a separate method, which makes this system completely independent. The obtained models are image synthesizers capable of generating any amount of cropped retinal images from a simple normal distribution. Furthermore, more images were used for training than in any other work in the literature. Synthetic images were qualitatively evaluated by 10 clinical experts and their consistency was estimated by measuring the proportion of pixels corresponding to the anatomical structures around the optic disc. Moreover, we calculated the mean-squared error between the average 2D-histogram of synthetic and real images, obtaining a small difference of 3e-4. Further analysis of the latent space and cup size of the images was performed by measuring the Cup/Disc ratio of synthetic images using a state-of-the-art method. The results obtained from this analysis and the qualitative and quantitative evaluation demonstrate that the synthesised images are anatomically consistent and the system is a promising step towards a model capable of generating labelled images.

Comparison of Local Analysis Strategies for Exudate Detection in Fundus Images

ABSTRACT. Diabetic Retinopathy (DR) is a severe and widespread eye disease. Exudates are one of the most prevalent signs during the early stage of DR and an early detection of these lesions is vital to prevent the patient's blindness. Hence, detection of exudates is an important diagnostic task of DR, in which computer assistance may play a major role. In this paper, a system based on local feature extraction and Support Vector Machine (SVM) classification is used to develop and compare different strategies for automated detection of exudates. The main novelty of this work is allowing the detection of exudates using non-regular regions to perform the local feature extraction. To accomplish this objective, different methods for generating superpixels are applied to the fundus images of the E-OPHTHA database, and texture and morphological features are extracted for each of the resulting regions. An exhaustive comparison among the proposed methods is also carried out.

Improved Architectural Redesign of MTree Clusterer in the Context of Image Segmentation

ABSTRACT. Image segmentation by clustering represents a classical use-case of unsupervised learning. A key aspect of this problem is that the instances being clustered may have various types and thus require specific algorithms that implement particular distance functions and quality metrics. This paper presents an improved version of the MTree clusterer that has been tested in the context of image segmentation in the same setup as the recently proposed k-MS algorithm. The redesigned MTree algorithm offers many levers for setup, so that many configurations are available depending on the particularities of the tackled problem. The experimental results are promising, especially as compared with the ones from the previous MTree version and also as compared with classical clustering algorithms or the newly developed k-MS algorithm. Further improvements in terms of available algorithms for configuration and algorithmic efficiency of integrations may lead the way to a general purpose clusterer that may be used for processing various data types.

Deep Neural Networks with Markov Random Field Models for Image Classification
SPEAKER: Hujun Yin

ABSTRACT. As one of the most intensively researched topics, image classification has attracted significant attention in recent years. Numerous approaches have been proposed to derive robust and effective image representations and to counter the intra-class variability. Conventional feature extraction and the recent deep neural networks are both feature-based methods attempting to find a good set of features for image description and recognition. In contrast to these feature-based approaches, Markov random fields (MRFs) are generative, probabilistic image texture models, in which a global model can be obtained by means of local relations and neighbourhood dependencies. This property is compatible with convolutional neural networks (CNNs) and enables the combination of feature-based CNNs and model-based MRF models. In this work, we propose an MRF loss function to minimise modelling errors and estimate parameters. Incorporated with CNNs, these estimated parameters are utilised as the initialised weights in the first convolutional layer. Then the networks are trained with MRF initialisation. Comprehensive experiments conducted on the MNIST, CIFAR-10 and CIFAR-100 datasets are reported.

16:40-18:10 Session 9B: Natural Language Processing & Computational Linguistics
Location: Mixed room
Linguistic Features to Identify Extreme Opinions: An Empirical Study

ABSTRACT. Studies in sentiment analysis and opinion mining have examined how effective different features are in polarity classification by making use of positive, negative or neutral values. However, the identification of extreme opinions (most negative and most positive opinions) has been overlooked in spite of its wide significance in many applications. In our study, we combine empirical features (e.g. bag of words, word embeddings, polarity lexicons, and a set of textual features) so as to identify extreme opinions and provide a comprehensive analysis of the relative importance of each set of features using hotel reviews.

Handwritten Character Recognition using Active Semi-Supervised Learning

ABSTRACT. Constructing a handwritten character recognition model is considered challenging partly due to the high variety of handwriting styles and the limited amount of training data. In practice, only a handful of labeled examples from a limited number of writers are provided during the training of the model. Still, a large collection of already available unlabeled handwritten character data from several sources is often left unused. To alleviate the problem of small training sample size, we propose a graph-based active semi-supervised learning approach for handwritten character recognizer construction. The method iteratively builds a neighborhood graph of all examples including the unlabeled ones, assigns pseudo labels to the unlabeled data and retrains the model. Additionally, the true label of the example with the least confident pseudo label, according to a newly proposed uncertainty measure, is requested from the oracle. Experiments on the NIST handwritten digits dataset demonstrated that the proposed learning method better utilizes the unlabeled data compared to existing approaches as measured by recognition accuracy. In addition, our active learning strategy is also more effective compared to baseline strategies.

Exploring Online Novelty Detection Using First Story Detection Models

ABSTRACT. Online novelty detection is an important technology in understanding and exploiting streaming data. One application of online novelty detection is First Story Detection (FSD) which attempts to find the very first story about a new topic, e.g. the first news report discussing the "Beast from the East" hitting Ireland. Although hundreds of FSD models have been developed, the vast majority of these only aim at improving the performance of the detection for some specific dataset, and very few focus on the insight of novelty itself. We believe that online novelty detection, framed as an unsupervised learning problem, always requires a clear definition of novelty. Indeed, we argue the definition of novelty is the key issue in designing a good detection model. Within the context of FSD, we first categorise online novelty detection models into three main categories, based on different definitions of novelty scores, and then compare the performances of these model categories in different feature spaces. Our experimental results show that the challenge of FSD varies across novelty scores (and corresponding model categories); and, furthermore, that the detection of novelty in the very popular Word2Vec feature space is more difficult than in a normal frequency-based feature space because of a loss of word specificity.

Semantic WordRank: Generating Finer Single-Document Summarizations

ABSTRACT. We present Semantic WordRank (SWR), an unsupervised method for generating an extractive summary of a single document. Built on a weighted word graph with semantic and co-occurrence edges, SWR scores sentences using an article-structure-biased PageRank algorithm with a Softplus function adjustment, and promotes topic diversity using spectral subtopic clustering under the Word Mover's Distance metric. We evaluate SWR on the DUC-02 and SummBank datasets and show that SWR produces better summaries than the state-of-the-art algorithms over DUC-02 under common ROUGE measures. We then show that, under the same measures over SummBank, SWR outperforms each of the three human annotators (a.k.a. judges) and compares favorably with the combined performance of all judges.

Concatenating or Averaging? Hybrid Sentence Representations for Sentiment Analysis

ABSTRACT. Performances in sentiment analysis - the crucial task of automatically classifying the huge amount of users' opinions generated online - heavily rely on the representation used to transform words or sentences into numbers. In the field of machine learning for sentiment analysis the most common embedding is the bag of words (BOW) model, which works well in practice but which is essentially a lexical conversion. Another well-known method is the Word2vec approach which, instead, attempts to capture the meaning of the terms. Given the complementarity of the information encoded in the two models, the knowledge offered by Word2vec can be helpful to enrich the information comprised in the BOW scheme. Based on this assumption we designed and tested four hybrid sentence representations which combine the two former approaches. Experiments performed on publicly available datasets confirm the effectiveness of the hybrid embeddings which led to a stable increase in the performances across different sentiment analysis domains.

Weighted Voting and Meta-Learning for Combining Authorship Attribution Methods

ABSTRACT. Our research concentrates on ways to combine machine learning techniques for authorship attribution. Traditionally, research in authorship attribution is focused on the development of new base-classifiers (combinations of stylometric features and learning methods). A large number of base-classifiers developed for authorship attribution vary in accuracy, often proposing different authors for a disputed document. In this research, we use predictions of multiple base-classifiers as a knowledge base for learning the true author. We introduce and compare two novel methods that utilize multiple base-classifiers. In the Weighted Voting approach, each base-classifier supports an author in proportion to its accuracy in leave-one-out classification. In our Meta-Learning approach, each base-classifier is treated as a feature and methods’ predictions in leave-one-out cross-validation are used as training data from which machine learning methods produce an aggregated decision. We illustrate our results through a collection of 18th century political writings. Anonymously written essays were common during this period, leading to frequent disagreements between scholars over their attribution.
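
The Weighted Voting idea can be sketched in a few lines: each base-classifier adds its leave-one-out accuracy to the score of the author it proposes, and the author with the highest total wins. The classifier names, accuracies and author labels below are hypothetical, and the function is an illustration rather than the authors' code.

```python
from collections import defaultdict


def weighted_vote(predictions, accuracies):
    """Pick the author whose supporting classifiers have the largest summed accuracy.

    predictions: classifier name -> proposed author
    accuracies:  classifier name -> leave-one-out accuracy used as vote weight
    """
    scores = defaultdict(float)
    for name, author in predictions.items():
        scores[author] += accuracies[name]
    # Sorting first makes ties resolve alphabetically instead of arbitrarily.
    return max(sorted(scores), key=lambda author: scores[author])


# Hypothetical base-classifiers voting on one disputed essay.
predictions = {
    "char_ngrams_svm": "Author A",
    "function_words_nb": "Author B",
    "pos_tags_knn": "Author B",
}
accuracies = {
    "char_ngrams_svm": 0.9,
    "function_words_nb": 0.5,
    "pos_tags_knn": 0.3,
}
print(weighted_vote(predictions, accuracies))
```

In this toy case the single accurate classifier outweighs the two weaker ones, which a plain majority vote would not allow.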

16:40-17:55 Session 9C: Special Session on Intelligent Techniques for the Analysis of Scientific Articles and Patents (ITASAP)
Location: Meeting room
Measuring the impact of the international relationships of the Andalusian universities using Dimensions database

ABSTRACT. Researchers have usually been inclined to publish papers with close collaborators: same University, region or even country. However, thanks to the advancements in communication technologies, members of international research networks can cooperate almost seamlessly. These networks usually tend to publish works with more impact than their local counterparts. In this paper, we test whether this assumption is also valid in the region of Andalusia (Spain). The Dimensions database is used to obtain the articles where at least one author is from an Andalusian University. The publication list is divided into 4 geographical areas: local (only one affiliation), regional (only Andalusian affiliations), national (only Spanish affiliations) and international (any affiliation). Results show that the average number of citations per paper increases as the author collaboration network widens geographically.

Constructing bibliometric networks from Spanish doctoral Theses

ABSTRACT. Bibliometric networks, as representations of complex systems, provide valuable information that allows different aspects of the behavior and interaction of the network participants to be discovered. In this contribution, we have built a fairly large bibliometric network based on data from Spanish doctoral theses. Specifically, we have used the data of each thesis defense committee to build the network with its members, and we have conducted a study to discover how the nodes of this network interact, which are the most representative, and how they are grouped within communities according to their participation in thesis defense committees.

A new approach for implicit citation extraction

ABSTRACT. The extraction of implicit citations is becoming more important since it is a fundamental step in many other applications such as paper summarization, citation sentiment analysis, citation classification, etc. This paper describes the limitations of previous works in citation extraction and then proposes a new approach which is based on topic modeling and word embedding. As a first step, our approach uses the LDA technique to identify the topics discussed in the cited paper. Following the same idea as the Doc2Vec technique, our approach proposes two models. The first, called Sentence2Vec, is used to represent all sentences following an explicit citation. These sentences are candidates to be implicit citation sentences. The second model, called Topic2Vec, is used to represent the topics covered in the cited paper. Based on the similarity between the Sentence2Vec and Topic2Vec representations, we can label a candidate sentence as implicit or not.

Bibliometric network analysis to identify the intellectual structure and evolution of the Big Data research field

ABSTRACT. Big Data has evolved from being an emerging topic to a growing research area in the business, science and education fields. The Big Data concept has a multidimensional approach, and it can be defined as a term describing the storage and analysis of large and complex data sets using a series of advanced techniques. In this respect, the researchers and professionals involved in this area of knowledge are seeking to develop a culture based on data science, analytics and intelligence. To this end, there is a clear need to identify and examine the intellectual structure, current research lines and main trends. This paper reviews the literature on Big Data, evaluating 23,378 articles from 2012 to 2017, and offers a holistic view of the research area by using the techniques of bibliometric and network analyses. Furthermore, it evaluates the top contributing authors, countries and research themes that are directly related to Big Data. Finally, a science map is developed to understand the evolution of the intellectual structure and the main research themes related to Big Data.

Evidence-based Systematic Literature Reviews in the Cloud

ABSTRACT. Systematic literature reviews and mapping studies are useful research methods used to lay the foundations of further research. These methods are widely used in the Health Sciences and, more recently, also in Computer Science. Despite existing tool support for systematic reviews, more automation is required to conduct the complete process. This paper describes CloudSERA, a web-based app to support the evidence-based systematic review of scientific literature. The tool supports researchers in carrying out studies through additional facilities such as collaboration, usability, parallel searches and search integration with other systems. The flexible data scheme of the tool enables the integration of bibliographic databases of common use in Computer Science and can be easily extended to support additional sources. It can be used as a service in a cloud environment or as on-premises software.

20:00-23:30 Gala dinner