Using Genetic Algorithms for the prediction of cognitive impairments
ABSTRACT. Cognitive impairments affect millions of persons worldwide, and especially elderly ones. These impairments may be one of the first signs of the arising of neurodegenerative diseases, such as Alzheimer's and Parkinson's, and it is expected that the incidence of this kind of diseases will dramatically increase worldwide in the near future. For this reason, the improvement of the tools currently used to diagnose these diseases is becoming crucial. Handwriting is one of the human skills affected by this kind of impairments, and anomalies such as micrographia have been adopted as diagnosis sign for the Parkinson's disease.
In a previous paper, we presented a study in which the handwriting of the subjects involved was recorded while they were performing some elementary tasks, such as the copy of simple words or the drawing of elementary forms. Then we extracted the features characterizing the dynamics of the handwriting and used them to train a classifier to predict whether the subject analyzed was affected by a cognitive impairment or not.
In this paper, we present a GA-based system for the improvement of the performance of the system previously presented. The GA has been used to select the subset of tasks that allow improving the prediction ability of the previous system. The experimental results confirmed the effectiveness of the proposed approach.
Differential Evolution Multi-Objective for Tertiary Protein Structure Prediction
ABSTRACT. The determination of proteins' structure is very expensive and time-consuming, making computer-aided methods attractive. However, in computational terms, the protein structure prediction is a NP-Hard problem, meaning that there is no efficient algorithm that can find a solution in a viable computational time. Nonetheless, the energy terms that compose different force fields seem to be conflicting among themselves, leading to a multi-objective problem. In this sense, different works in the literature have proposed multi-objective formulations of search mechanisms. Hence, we use the Differential Evolution Multi-Objective (DEMO) algorithm with the Rosetta score3 energy function as a force field. In our work, we split the energy terms into two objectives, one with only the \textit{van der Waals} values, while the second one contains the remaining bonded and non-bonded, including the secondary structure reinforcement. Moreover, we enhance the DEMO algorithm with structural knowledge provided by the Angle Probability List (APL). From this perspective, our work provides different contributions to the research area, since the DEMO algorithm was never used in this problem, neither the APL with this algorithm. Also, the multi-objective formulation using Rosetta score3 was not yet explored by related works, even though its relevance for the problem. Results obtained show that the DEMO found better structures than the single-objective differential evolution that uses the same mutation mechanism, energy function, and APL. Also, DEMO reached competitive results when comparing with state-of-art bi-objective approaches.
Short and Medium Term Blood Glucose Prediction using Multi-objective Grammatical Evolution
ABSTRACT. In this paper we investigate the benefits of applying a multi-objective approach for solving a symbolic regression problem by means of grammatical evolution. In particular, we continue with previous research about finding expressions to model the glucose levels in blood of diabetic patients. We use here a multi-objective Grammatical Evolution algorithm based on the NSGA-II algorithm, considering the mean squared error and an ad-hoc fitness function as objectives. This ad-hoc function is based on the Clarke Error Grid analysis, which is useful for showing the potential danger of mispredictions. Experimental results show that the multi-objective approach improves previous results in terms of Clarke Error Grid analysis.
Multiobjective Optimization of a Targeted Vaccination Scheme in the Presence of Non-diagnosed Cases
ABSTRACT. One of the problems in disease prevention and epidemics control is the large level of uncertainty present in the solved problems.
For example, when targeted vaccination schemes are elaborated, the effectiveness of a given scheme depends on the understanding of the epidemic dynamics, the response of the public to the proposed scheme, etc.
If an optimization of targeted vaccination scheme is attempted, the evaluation of solutions usually involves simulating the non-deterministic spreading of the disease.
However, the non-determinism of the epidemic dynamics is just one of the sources of uncertainty in this problem.
This paper studies another source of uncertainty in optimization of targeted vaccination schemes, which is the existence of non-diagnosed cases of the disease.
In the paper the multiobjective optimization of a targeted vaccination scheme is carried out under an assumption that only a fraction of really existing cases of the disease are known.
Solutions found by the optimization algorithm in this limited-knowledge scenario are then evaluated in a simulation in which all existing cases of the disease are taken into account.
Optimized solutions are compared with the mass vaccination scheme which is not based on the knowledge of existing cases of the disease.
An uncertainty-handling technique is proposed that randomly adds infected individuals to the simulations in order to compensate for the unknown ones.
Understanding Aesthetic Evaluation with Deep Learning
ABSTRACT. A bottleneck in any evolutionary art system is aesthetic evaluation. Many different methods have been proposed to automate the evaluation of aesthetics, including measures of symmetry, coherence, complexity, contrast and grouping. The interactive genetic algorithm (IGA) relies on human-in-the-loop, subjective evaluation of aesthetics, but limits possibilities for large search due to user fatigue and small population sizes. In this paper we look at how recent advances in deep learning can assist in automating personal aesthetic judgement. Using a leading artist's computer art dataset, we use dimensionality reduction methods to visualise both geneotype and phenotype space in order to support the exploration of new territory in any generative system. Convolutional Neural Networks trained on the user's prior aesthetic evaluations are used to suggest new possibilities similar or between known high quality genotype-phenotype mappings.
Objective Evaluation of Tonal Fitness for Chord Progressions
ABSTRACT. Chord progressions are core elements of Western tonal harmony regulated by multiple theoretical and perceptual principles. Ideally, objective measures to evaluate chord progressions should reflect their tonal fitness. In this work, we propose an objective measure of the fitness of a chord progression within the Western tonal context computed in the Tonal Interval Space, where distances capture tonal music principles. The measure considers four parameters, namely tonal pitch distance, consonance, hierarchical tension and voice leading between the chords in the progression. We performed a listening test to perceptually assess the proposed tonal fitness measure across different chord progressions, and compared the results with existing related models. The perceptual rating results show that our objective measure improves the estimation of a chord progression's tonal fitness in comparison with existing models.
An aesthetic-based fitness measure and a framework for guidance of evolutionary design in architecture
ABSTRACT. The authors introduced an interactive design framework for grammar evolu-tion for application in architecture. This system incorporates an aesthetics-based fitness measure as guidance procedure. The proposed guidance mecha-nism for grammar evolution in architecture facilitates the interactive explo-ration of solution spaces for building shapes. Design solutions in architecture refer to criteria in both, form and function. The novelty of the framework for guidance of evolutionary design in architecture was the aesthetics-based fit-ness measure. This guidance procedure allowed the user to introduce initial reference image as input to the evolutionary process. A case study displayed the application potential of the framework for architectural design.
A Local Search with a Surrogate Assisted Option for Instance Reduction
ABSTRACT. In data mining, instance reduction is a key data pre-processing step that simplifies and cleans raw data, by either selecting or creating new samples, before applying a learning algorithm.
This usually yields to a complex large scale and computationally expensive optimisation problem which has been typically tackled by sophisticated population-based metaheuristics.
Unlike the recent literature, in order to accomplish this target, this article proposes the use of a simple local search algorithm and its integration with a surrogate assisted model. This local search, in accordance with variable decomposition techniques for large scale problems, perturbs an n-dimensional vector along the directions identified by its design variables one by one.
Empirical results in 40 small data sets show that, despite its simplicity, the proposed baseline local search on its own is competitive with more complex algorithms representing the state-of-the-art for instance reduction in classification problems. The use of the proposed local surrogate model enables a reduction of the computationally expensive objective function calls with accuracy test results overall comparable with respect to its baseline counterpart.
Evolutionary Latent Space Exploration of Generative Adversarial Networks
ABSTRACT. Generative Adversarial Networks (GANs) have gained popularity over the years, presenting state-of-the-art results in the generation of samples that follow the distribution of the input training dataset. While research is being done in order to make GANs more reliable and able to generate better samples, the exploration of its latent space does not have as much attention. The latent space is unique for each model and is, ultimately, what determines the output from the generator. Usually, a random sample vector is taken from the latent space without regardless of which output it produces through the generator. In this paper, we move towards an approach for generating latent vectors and traversing the latent space with pre-determined criteria, using different approaches. We focus on the generation of sets of diverse examples by searching in the latent space using Genetic Algorithms and Map Elites. A set of experiments are performed and analysed, comparing the implemented approaches with the traditional approach.
Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution
ABSTRACT. The deployment of Machine Learning (ML) models is a difficult and time-consuming job that comprises a series of sequential and correlated tasks that go from the data pre-processing, and the design and extraction of features, to the choice of the ML algorithm and its parameterisation. The task is even more challenging considering that the design of features is in many cases problem specific, and thus requires domain-expertise. To overcome these limitations Automated Machine Learning (AutoML) methods seek to automate, with few or no human-intervention, the design of pipelines, i.e., automate the selection of the sequence of methods that have to be applied to the raw data. These methods have the potential to enable non-expert users to use ML, and provide expert users with solutions that they would unlikely consider. In particular, this paper describes AutoML-DSGE -- a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution to the evolution of Scikit-Learn classification pipelines. The experimental results include comparing AutoML-DSGE to another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE), and show that the average performance of the classification pipelines generated by AutoML-DSGE is always superior to the average performance of (RECIPE); the differences are statistically significant in 3 out of the 10 used datasets.
A Greedy Iterative Layered Framework for Training Feed Forward Neural Networks
ABSTRACT. In recent years neuroevolution has become a dynamic and rapidly growing research field. Interest in this discipline is motivated by the need to create ad-hoc networks, the topology and parameters of which are optimized, according to the particular problem at hand. Al- though neuroevolution-based techniques can contribute fundamentally to improving the performance of artificial neural networks (ANNs), they present a drawback, related to the massive amount of computational resources needed. This paper proposes a novel population-based frame-work, aimed at finding the optimal set of synaptic weights for ANNs. The proposed method partitions the weights of a given network and, using an optimization heuristic, trains one layer at each step while “freezing“ the remaining weights. In the experimental study, particle swarm optimiza- tion (PSO) was used as the underlying optimizer within the framework and its performance was compared against the standard training (i.e., training that considers the whole set of weights) of the network with PSO and the backward propagation of the errors (backpropagation). Results show that the subsequent training of sub-spaces reduces training time, achieves better generalizability, and leads to the exhibition of smaller variance in the architectural aspects of the network.
A Local Search for Numerical Optimisation based on Covariance Matrix Diagonalisation
ABSTRACT. Pattern Search is a family of optimisation algorithms that improve upon an initial solution by performing moves along the directions of a basis of vectors. In its original definition Pattern Search moves along the directions of each variable. Amongst its advantages, the algorithm does not require any knowledge of derivatives or analytical expression of the function to optimise. However, the performance of Pattern Search is heavily problem dependent since the search directions can be very effective on some problems and lead to poor performance on others.
The present article proposes a novel enhancement of Pattern Search that explores the space by using problem-dependent search directions. Some points are sampled within the basin of attraction and the diagonalisation of the covariance matrix associated with the distribution of these points is performed. The proposed Covariance Pattern Search improves upon an initial point by varying it along the directions identified by the eigenvectors of the covariance matrix.
Using evolution to design modular robots: An empirical approach to select module designs
ABSTRACT. In modular robots, and other physical structures, the shape of the building blocks greatly influences the end result. By changing the physical properties of the module, different robotic structures with better performance for a given task can be found. In this paper, we change modules of a modular robot platform, the EMERGE modular robot, in two different ways: changing the length of the module and changing the shape of the starting (base) module. Two bases are tested: a cuboid base and a flat base. We use artificial evolution to optimize robots for a locomotion task using each different module length and base, and also evolve robots with combinations of modules of different length. Results show that, as the length of the module increases, the best robots obtained use fewer modules and fewer connections per module, and robots also become thinner. However, the increase in length results also in a decrease in locomotion performance for large length increases. Interestingly, very few of the best robots found show symmetric structures, which can be attributed to their tendency to roll over as their main means of locomotion. Modular robot designers can use the information about the effectiveness of modules with different lengths, and the use of different starting bases, to strike trade-offs between the desired number of modules in a robot and their effectiveness for a given task.
Evolving-controllers versus learning-controllers for morphologically evolvable robots
ABSTRACT. We investigate an evolutionary robot system where (simulated) modular robots can reproduce and create robot children that inherit the parents' morphologies by crossover and mutation. Within this system we compare two approaches to creating good controllers, i.e., evolution and learning. In the first one the controller of a robot child is produced by applying crossover and mutation to the controllers of its parents. In the second one the controller of the child is produced by a learning method that starts with a random brain.
The experiments show that the learning approach does not only lead to different fitness levels, but also to different (bigger) robots. This constitutes a quantitative demonstration that changes in brains, i.e., controllers, can induce changes in the bodies, i.e., morphologies.
Locating Odour Sources with Geometric Syntactic Genetic Programming
ABSTRACT. Using robots to locate odour sources is an interesting problem with important applications. Many researchers have drawn inspiration from nature for producing robotic methods, whilst others have attempted to automatically produce search strategies by resorting to Artificial Intelligence techniques. This paper extends Geometric Syntactic Genetic Programming and applies it to automatically produce robotic controllers, in the form of behaviour trees. The modification proposed to Geometric Syntactic Genetic Programming enables it to evolve trees containing multiple symbols per node. The behaviour trees produced by this algorithm are compared to those evolved by a standard Genetic Programming algorithm and to two bio-inspired strategies from the literature, both in simulation and in the real world. The statistically validated experimental results show that the Geometric Syntactic Genetic Programming algorithm is able to produce behaviour trees that outperform the bio-inspired strategies, while being significantly smaller than those evolved by the standard Genetic Programming algorithm. Moreover, there are no statistically significant differences between the performances of the strategies evolved by the two Genetic Programming algorithms.
Fitness Landscape Analysis of Automated Machine Learning Search Spaces
ABSTRACT. The field of Automated Machine Learning (AutoML) has as its main goal to automate the process of creating complete Machine Learning (ML) pipelines to any dataset without requiring deep user expertise in ML. Several AutoML methods have been proposed so far, but there is not a single one that really stands out. Furthermore, there is a lack of studies on the characteristics of the fitness landscape of AutoML search spaces. Such analysis may help to understand the performance of different search-based AutoML methods and how to improve them. This paper adapts classic fitness landscape analysis measures to the context of AutoML. This is a challenging task, as AutoML search spaces include discrete, continuous, categorical and conditional hyperparameters. We propose an ML pipeline representation, a neighborhood definition and a distance metric between pipelines, and use them in the evaluation of the fitness distance correlation (FDC) and the neutrality ratio for a given AutoML search space. Results of FDC are counter-intuitive and require a more in-depth analysis of a range of search spaces. Results of neutrality, in turn, show a strong positive correlation between the mean neutrality ratio and the fitness value.
Cooperative Parallel SAT Local Search with Path Relinking
ABSTRACT. In this paper, we propose the use of path-relinking to improve the performance of parallel portfolio-based local search solvers for the Boolean Satisfiability problem. In the portfolio-based framework several algorithms explore the search space in parallel, either independently or cooperatively with some communication between the solvers. Path-relinking is a method to maintain an appropriate balance between diversification and intensification (and explore paths that aggregate elite solutions) to properly craft a new assignment for the variables to restart from. We present an empirical study that suggest that path-relinking outperforms a set of well-known parallel portfolio-based local search algorithms with and without cooperation.
An algebraic approach for the search space of permutations with repetition
ABSTRACT. We present an algebraic approach for dealing with combinatorial optimization problems based on permutations with repetition. The approach is an extension of an algebraic framework defined for combinatorial search spaces which can be represented by a group (in the algebraic sense). Since permutations with repetition does not have the group structure, in this work we derive some definitions and we devise discrete operators that allow to design algebraic evolutionary algorithms whose search behavior is in line with the algebraic framework. In particular, a discrete Differential Evolution algorithm which directly works on the space of permutations with repetition is defined and analyzed. As a case of study, an implementation of this algorithm is provided for the Job Shop Scheduling Problem. Experiments have been held on commonly adopted benchmark suites, and the results show that the proposed approach is competitive with the known optimal objective values.
The Local Optima Level in Chemotherapy Schedule Optimisation
ABSTRACT. In this paper a multi-drug Chemotherapy Schedule Optimisation
Problem (CSOP) is subject to Local Optima Network (LON)
analysis. LONs capture global patterns in fitness landscapes. CSOPs
have not previously been subject to fitness landscape analysis. We fill
this gap: LONs are constructed and studied for meaningful structure.
The CSOP formulation presents novel challenges and questions for the
LON model because there are infeasible regions in the fitness landscape
and an unknown global optimum; it also brings a topic from healthcare
to LON analysis. Two LON Construction algorithms are proposed for
sampling CSOP fitness landscapes: a Markov-Chain Construction Algorithm
and a Hybrid Construction Algorithm. The results provide new
insight into LONs of highly-constrained spaces, and into the proficiency
of search operators on the CSOP. Iterated Local Search and Memetic
Search, which are the foundations for the LON algorithms, are found to
markedly out-perform a Genetic Algorithm from the literature.
Ant-based Neural Topology Search (ANTS) for Optimizing Recurrent Networks
ABSTRACT. Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, we propose a novel neuro-evolution algorithm based on ant colony optimization (ACO), called Ant-based Neural Topology Search (ANTS), for directly optimizing RNN topologies. The procedure selects from multiple modern recurrent cell types such as Delta-RNN, GRU, LSTM, MGU and UGRNN cells, as well as recurrent connections which may span multiple layers and/or steps of time. In order to introduce an inductive bias that encourages the formation of sparser synaptic connectivity patterns, we investigate several variations of the core algorithm. We do so primarily by formulating different functions that drive the underlying pheromone simulation process (which mimic L1 and L2 regularization in standard machine learning) as well as by introducing ant agents with specialized roles (inspired by how real ant colonies operate), i.e., explorer ants that construct the initial feed forward structure and social ants which select nodes from the feed forward connections to subsequently craft recurrent memory structures. We also incorporate communal intelligence, where best weights are shared by the ant colony for weight initialization, reducing the number of backpropagation epochs required to locally train candidate RNNs, speeding up the neuro-evolution process. Our results demonstrate that the sparser RNNs evolved by ANTS significantly outperform traditional one and two layer architectures consisting of modern memory cells, as well as the well-known NEAT algorithm. Furthermore, we improve upon prior state-of-the-art results on the time series dataset utilized in our experiments.
Neuro-Evolutionary Transfer Learning through Structural Adaptation
ABSTRACT. Transfer learning involves taking an artificial neural network (ANN) trained on one dataset (the source) and adapting it on a new, second dataset (the target). While transfer learning has been shown to be quite powerful and is commonly used in most modern-day statistical learning setups, its use has generally been restricted by architecture, i.e., in order to facilitate the reuse of internal learned synaptic weights, the underlying topology of the ANN to be transferred across tasks must remain the same and a new output layer must be attached (entailing tossing out the old output layer's weights).
This work removes this restriction by proposing a neuro-evolutionary approach that facilitates what we call adaptive structure transfer learning, which means that an ANN can be transferred across tasks that have different input and output dimensions while having the internal latent structure continuously optimized. We test the proposed optimizer on two challenging real-world time series prediction problems -- our process adapts recurrent neural networks (RNNs) to 1) predict coal-fired power plant data before and after the addition of new sensors, and to 2) predict engine parameters where RNN estimators are trained on different airframes with different engines.
Experiments show that not only does the proposed neuro-evolutionary transfer learning process result in RNNs that evolve and train faster on the target set than those trained from scratch but, in many cases, the RNNs generalize better even after a long training and evolution process. To our knowledge, this work represents the first use of neuro-evolution for transfer learning, especially for RNNs, and is the first methodological framework capable of adapting entire structures for arbitrary input/output spaces.
Detection of Frailty using Genetic Programming : The Case of Older People in Piedmont, Italy
ABSTRACT. Frailty appears to be the most problematic expression of elderly people. Frail older adults have a high risk of mortality, hospitalization, disability and other adverse outcomes, resulting in burden to individuals, their families, health care services and society. Early detection and screening would help to deliver preventive interventions and reduce the burden of frailty. For this purpose, several studies have been conducted to detect frailty that demonstrates its association with mortality and other health outcomes. Most of these studies have focused on the possible risk factors associated with frailty in the elderly population; however, efforts to identify and predict groups of elderly people who are at increased risk of frailty is still challenging in clinical settings. In this paper, Genetic Programming (GP) is exploited to detect and define frailty based on the whole elderly population of the Piedmont, Italy, using administrative databases of clinical characteristics and socio-economic factors. Specifically, GP is designed to predict frailty according to the expected risk of mortality, urgent hospitalization, disability, fracture, and access to the emergency department. The performance of GP model is evaluated using sensitivity, specificity, and accuracy metrics by splitting the data into a training set and test set. We find that GP shows competitive performance in predicting frailty compared to the traditional machine learning models. The study demonstrates that the proposed model might be used to screen future frail older adults using clinical, psychological and socio-economic variables, which are commonly collected in community healthcare institutions.
Seeding Grammars in Grammatical Evolution to Improve Search Based Software Testing
ABSTRACT. Software-based optimization techniques have been increasingly used to automate code coverage analysis since the nineties. Although several studies suggest that interdependencies can exist between condition constructs in branching conditions of real life programs e.g. (i<=100) or (i==j), etc., to date, only the Ariadne system, a Grammatical Evolution (GE)-based Search Based Software Testing (SBST) technique, exploits interdependencies between variables to efficiently automate code coverage analysis.
Ariadne employs a simple attribute grammar to exploit these dependencies, which enables it to very efficiently evolve highly complex test cases, and has been compared favourably to other well-known techniques in the literature. However, Ariadne does not benefit from the interdependencies involving constants e.g. (i<=100), which are equally important constructs of condition predicates. Furthermore, constant creation in GE can be difficult, particularly with high precision.
We propose to seed the grammar with constants extracted from the source code of the program under test in order to enhance and extend Ariadne’s capability to exploit richer types of dependencies (involving all combinations of both variables and constant values). We compared our results with the original system of Ariadne against a large set of benchmark problems which include 10 numeric programs in addition to the ones originally used for Ariadne. Our results demonstrate that the seeding strategy not only dramatically improves the generality of the system, as it improved the code coverage (effectiveness) by impressive margins, but it also reduces the search budgets (efficiency) often up to an order of magnitude.
Incremental Evolution and Development of Deep Artificial Neural Networks
ABSTRACT. NeuroEvolution (NE) methods are known for applying Evolutionary Computation to the optimisation of Artificial Neural Networks (ANNs). Despite aiding non-expert users to design and train ANNs, the vast majority of NE approaches disregard the knowledge that is gathered when solving other tasks, i.e., evolution starts from scratch for each problem, ultimately delaying the evolutionary process. To overcome this drawback, we extend Fast Deep Evolutionary Network Structured Representation (Fast-DENSER) to incremental development. We hypothesise that by re-using the knowledge gained from previous tasks we can attain superior results and speedup evolution. The results show that the average performance of the models generated by incremental development is statistically superior to non-incremental. In case the number of evaluations performed by incremental development is smaller than the performed by non-incremental development the attained results are similar in performance, which indicates that incremental development speeds up evolution. Lastly, the models generated using incremental development are more robust, and thus perform better, without further evolution, on unseen problems.
ABSTRACT. This work uses Push GP to automatically design both local and population-based optimisers for continuous-valued problems. The optimisers are trained on a single function optimisation landscape, using random transformations to discourage overfitting. They are then tested for generality on larger versions of the same problem, and on other continuous-valued problems. In most cases, the optimisers generalise well to the larger problems. Surprisingly, some of them also generalise very well to previously unseen problems, outperforming existing general purpose optimisers such as CMA-ES. Analysis of the behaviour of the evolved optimisers indicates a range of interesting optimisation strategies that are not found within conventional optimisers, suggesting that this approach could be useful for discovering novel and effective forms of optimisation in an automated manner.
ABSTRACT. This paper intends to understand and to improve the working principle of decomposition-based multi-objective evolutionary algorithms. We propose to review the design of the well-established MOEA/D framework to support the smooth integration of different strategies for sub-problem selection, while emphasizing the role of the population size and of the number of offsprings created at each generation. By conducting a comprehensive empirical analysis on a wide range of multi- and many-objective combinatorial NK landscapes with a different ruggedness, we provide new insights into the combined effect of those parameters on the anytime performance of the underlying multi-objective search process. In particular, we show that even a simple random strategy selecting a subset of sub-problems at random outperforms existing sophisticated strategies. We also study the sensitivity of such strategies with respect to the ruggedness and the objective space dimension of the target problem.
MILPIBEA: Algorithm for Multi-objective Features Selection in (Evolving) Software Product Lines
ABSTRACT. Software Product Lines Engineering (SPLE) proposes techniques to model, create and improve groups of related software systems in a systematic way, with different alternatives formally expressed, e.g., as Feature Models. Selecting the ‘best’ software system(s) turns into a problem of improving the quality of selected subsets of software features
(components) from feature models, or as it is widely known, Feature Configuration. When there are different independent dimensions to assess how good a software product is, the problem becomes even more challenging – it is then called multi-objective optimisation. Another big issue for software systems is evolution where software components change. This is common in the industry but, as far as we know, there is no algorithm designed to the particular case of multi-objective optimisation of evolving software product lines. In this paper we present MILPIBEA, a novel hybrid algorithm which combines the scalability of a genetic algorithm (IBEA) with the accuracy of a mixed-integer linear programming
solver (IBM ILOG CPLEX). We also study the behaviour of our solution (MILPIBEA) in contrast with the state-of-the-art in static software product lines (SATIBEA). We demonstrate that MILPIBEA is 42% better than SATIBEA on average, especially for the most challenging problem instances, and that MILPIBEA is the one that continues to improve the quality of the solutions when SATIBEA plateau (in the evolving context).
ABSTRACT. The use of multi-graphs in modelling multi-objective transportation problems is gaining popularity, necessitating the consideration of the Multi-objective Shortest Path Problem (MSPP) on multigraphs. This problem is encountered in time-dependent vehicle routing, multimodal transportation planning and in optimising airport operations.
This problem is more complex than the NP-hard simple graph MSPP, and thus approximate solution methods are needed to find a good representation of the true Pareto front in a given time budget.
Evolutionary algorithms have been applied with success to the simple graph MSPP, however their performances on multigraph MSPP were not systematically investigated. To this aim, we extend the most popular genetic representations to the multigraph case and compare the achieved performances. We find that the priority based encodings outperform the direct one with purely random initialisation.
We further introduce a novel heuristic initialisation technique, that is generic enough for many representations, and that further improves the convergence speed and solution quality of the algorithms.
The results are encouraging for later application to the time constrained multigraph MSPP.
Testing hybrid computational intelligence algorithms for general game playing
ABSTRACT. General Videogame Playing is one of the hottest topics in the research field of AI in videogames. It aims at the implementation of algorithms or autonomous agents able to play a set of unknown games efficiently, just receiving the set of rules to play in real time.
Thus, this work presents the implementation of eight approaches based on the main techniques applied in the literature to face this problem, including two different hybrid implementations combining Montecarlo Tree Search and Genetic Algorithms.
They have been created within the General Video Game Artificial Intelligence (GVGAI) Competition platform. Then, the algorithms have been tested in a set of 20 games from that competition, analyzing its performance. According to the obtained results, we can conclude that the proposed hybrid approaches are the best approaches, and they would be a very competitive entry for the competition.
An Empirical Exploration of Deep Recurrent Connections Using Neuro-Evolution
ABSTRACT. Neuro-evolution and neural architecture search algorithms have gained increasing interest due to the challenges involved in designing optimal artificial neural networks (ANNs). While these algorithms have been shown to possess the potential to outperform the best human crafted architectures, a less common use of them is as a tool for analysis of ANN structural components and connectivity structures. By performing these techniques while varying the allowable components, best performing architectures for those components can be found and then compared to best performing architectures for other components, allowing for a best case comparison of component capabilities -- a more rigorous examination than simply applying those components in some standard fixed topologies. In this work, we utilize the Evolutionary eXploration of Augmenting Memory Models (EXAMM) algorithm to perform a rigorous examination and comparison of recurrent neural networks (RNNs) applied to time series prediction. Specifically, we use our EXAMM-based analysis to investigate the capabilities of recurrent memory cells and the generalization ability afforded by various complex recurrent connectivity patterns that span one or more steps in time, i.e., deep recurrent connections. EXAMM was used to evolve and train over 10.56 million RNNs in 5,280 repeated experiments with varying components. While many modern, often hand-crafted RNNs rely on complex memory cells (which have internal recurrent connections that only span a single time step) operating under the assumption that these sufficiently latch information and handle long term dependencies, our results show that networks evolved with deep recurrent connections perform significantly better than those without. More importantly, in some cases, the best performing RNNs consisted of only simple neurons and deep time skip connections, without any memory cells. These results strongly suggest that utilizing deep time skip connections in RNNs for time series data prediction not only deserves further, dedicated study, but also demonstrate the potential of neuro-evolution as a means to better study, understand, and train effective RNNs.
Optimizing the Hyperparameters of a Mixed Integer Linear Programming Solver to Speed Up Electric Vehicle Charging Control
ABSTRACT. Optimization of charging profiles for controlled charging of electric vehicles is commonly done via mixed integer linear programming. The runtime of the optimization can represent an issue for the practical use. However, by tuning the parameter setting of the employed solver, it is possible to speed up the optimization process. The present work evaluates two popular hyperparameter tuning tools - irace and SMAC -
for the optimization of parameters of the SCIP solver with the objective to speed up the solving process for four common variants of the electric vehicle charging scheduling problem. Based on the results, the most important solver parameters are identified. It is shown that by tuning a very limited number of parameters, speed-ups of 60% and more can be achieved.
Challenges of Program Synthesis with Grammatical Evolution
ABSTRACT. Program synthesis is an emerging research topic in the field
of EC with the potential to improve real-world software development.
Grammar-guided approaches like GE are suitable for program synthesis
as they can express common programming languages with their required
properties. This work uses common software metrics (lines of code, McCabe
metric, size and depth of the abstract syntax tree) for an analysis
of GE’s search behavior and the resulting problem structure. We find
that GE is not able to solve program synthesis problems, where correct
solutions have higher values of the McCabe metric (which means they
require conditions or loops). Since small mutations of high-quality solu-
tions strongly decrease a solutions fitness and make a high percentage of
the solutions non-executable, the resulting problem constitutes a needle-
in-a-haystack problem. To us, one of the major challenges of future GP
research is to come up with better and more adequate fitness functions
and problem specifications to turn the current needle-in-a-haystack prob-
lems into problems that can be solved by guided search.
SGP-DT: Semantic Genetic Programming Based on Dynamic Targets
ABSTRACT. Semantic GP is a promising approach that introduces semantic awareness during genetic evolution. This paper presents a new approach called Semantic Genetic Programming based on Dynamic Target (SGP-DT) that divides the search problem into multiple GP runs. The evolution in each run is guided by a new (dynamic) target based on the residual errors. To obtain the final solution, SGP-DT combines the solutions of each run using linear scaling. SGP-DT proposes a new approach to exchange functionalities between individuals without relying on the classic formulations of crossovers. The synergy between linear scaling and mutation yields to final solutions with remarkable performances both in approximation errors and computational cost. We valuate SGP-DT on eight well known data sets and compare with e-lexicase, a state-of-the-art evolutionary technique. SGP-DT achieves small RMSE values, on average 23.19 % smaller than the one of e-lexicase.
An Evolutionary View on Reversible Shift-invariant Transformations
ABSTRACT. We consider the problem of evolving a particular kind of shift-invariant transformation -- namely, Reversible Cellular Automata (RCA) defined by conserved landscape rules -- using GA and GP. To this end, we employ three different optimization strategies: a single-objective approach carried out with GA and GP where only the reversibility constraint of marker CA is considered, a multi-objective approach based on GP where both reversibility and the Hamming weight are taken into account, and a lexicographic approach where GP first optimizes only the reversibility property until a conserved landscape rule is obtained, and then maximizes the Hamming weight while retaining reversibility. The results are discussed in the context of three different research questions stemming from exhaustive search experiments on conserved landscape CA, which concern 1) the difficulty of the associated optimization problem for GA and GP, 2) the utility of conserved landscape CA in the domain of cryptography and reversible computing, and 3) the relationship between the reversibility property and the Hamming weight.
ABSTRACT. The effects of various genetic operators and parent selection algorithms on the performance of a genetic programming system on different problems have been well studied. In this paper, we analyze how different selection algorithms influence modularity in the population of evolving programs. In particular, we observe how the number of individuals with some form of modular structure changes over generations for various selection algorithms.
ABSTRACT. Dynamic flexible job shop scheduling (DFJSS) is a very valuable practical application problem that can be applied in many fields such as cloud computing and manufacturing. In DFJSS, we need to make machine assignment and operation sequencing decisions simultaneously in dynamic environments with unpredicted events such as new job arrivals. Scheduling heuristic is an ideal candidate for solving the DFJSS problem due to its efficiency and simplicity. Genetic programming (GP) has been successfully applied to evolve scheduling heuristics for job shop scheduling automatically. However, GP has a huge search space, and the traditional search algorithms do not utilise effectively the information obtained from the evolutionary process. In this paper, we propose a new method to make better use of the information during the evolutionary process of GP to further enhance its ability of GP. To be specific, we propose two adaptive search strategies based on the frequency of features in promising individuals to guide GP to evolve effective rules. The adaptive search aims to guide the behaviour of GP over time. The results show that the proposed GP with adaptive search can converge faster and achieve significantly better performance than the GP without adaptive search in most scenarios while no worse in all other scenarios without increasing the computational cost.
A Grouping Genetic Algorithm for Multi Depot Pickup and Delivery Problems with Time Windows and Heterogeneous Vehicles
ABSTRACT. The Multi Depot Pickup and Delivery Problem with Time Windows and Heterogeneous Vehicles is a Rich Vehicle Routing Problem as it combines many real-world problems and is therefore relevant to practice. In this paper a new mathematical two-index model formulation for the MDPDPTWHV is developed as well as a Grouping Genetic Algorithm (GGA), which features a grouping-oriented individual representation. Therefore, each chromosome contains only the assignment of requests to vehicles, i. e., no information about the customer sequence is included. In order to compare different variants of the GGA to each other as well as the best one to solutions calculated by Cplex, 120 MDPDPTWHV datasets are created by a self-implemented generator. In a benchmark study, it can be shown that the way in which population management is performed is important to enhance the solution quality of the GGA. On average, the best GGA variant is 2.43 % worse than the best known solution.
ABSTRACT. Containers have gain popularity because they support fast development and deployment of cloud-native software such as micro-services and server-less applications. Additionally, containers have low overhead, hence they save resources in cloud data centers. However, the difficulty of the Resource Allocation in Container-based clouds (RAC) is far beyond VM-based clouds. The allocation task selects heterogeneous VMs to host containers and consolidate VMs to PMs simultaneously. Due to the high complexity, existing approaches use simple rule-based heuristics and meta-heuristics to solve the RAC problem. They either prone to stuck at local optima or have inherent defects in their indirect representation. To address these issues, we propose a novel group genetic algorithm (GGA) with direct representation and problem-specific operators. This design has shown significantly better performance than the state-of-the-art algorithms in with a wide range of test datasets.
Optimizing prices and periods in time-of-use electricity tariff design using bilevel programming
ABSTRACT. In this paper, a comparison is made between two bilevel programming models to design time-of-use tariffs in the electricity retail market. The upper-level objective function consists of the maximization of the retailer’s profit and the lower-level problem relates to the minimization of the consumer’s cost. In the first model, the periods in which prices apply are pre-defined and the aim is to determine the price values. In the second model, which is developed in this paper, both the periods and prices are decision variables, thus leading to a very large search space for the upper-level problem due to the number of configurations periods-prices. For the model with variable periods, a hybrid approach combining a genetic algorithm for the upper-level search with a mixed-integer linear programming solver to obtain optimal solutions to the lower-level problem is herein developed. Computational results comparing the two models are presented.
Particle Swarm Optimization: A Wrapper-based Feature Selection Method for Ransomware Detection and Classification
ABSTRACT. Ransomware has emerged as a grave cyber threat. Many of the existing ransomware detection and classification models use data set created through dynamic or behaviour analysis of ransomware, hence known as behaviour-based detection models. A big challenge in automated behaviour-based ransomware detection and classification is high dimensional data with numerous features distributed into various groups. Feature selection algorithms usually help to deal with high dimensionality for improving classification performance. In connection with ransomware detection and classification, the majority of the feature selection methods used in existing literature ignores the varying importance of various feature groups in ransomware behaviour analysis data set. For ransomware detection and classification, we propose a two-stage feature selection method that considers the varying importance of each of the feature group in data set. The proposed method utilizes particle swarm optimization, a wrapper-based feature selection algorithm, for selection of optimal number of features from each feature group to produce better classification performance. Although the proposed method shows comparable performance for binary classification, it has significantly better performance for multi-class classification as compared to existing feature selection method used for ransomware detection and classification.
ABSTRACT. The constant growth of vehicles circulating in urban environments poses a number of challenges in terms of city planning and traffic regulation. A key aspect that affects the safety and efficiency of urban traffic is the configuration of traffic lights and junctions. Here, we propose a general framework, based on a realistic urban traffic simulator, to aid city planners to optimize traffic, based on a customized version of NSGA-2. We show how different metrics -such as number of accidents, average speed of vehicles, and number of traffic jams- can be taken into account in a multi-objective fashion to obtain a number Pareto-optimal configurations.
Our experiments, conducted on two different city scenarios in Italy and different combinations of fitness functions, demonstrate the validity of this approach and show how evolutionary optimization can be a practical and effective tool for traffic planning.
A Decomposition-Based Evolutionary Algorithm with Adaptive Weight Vectors for Multi- and Many-objective Optimization
ABSTRACT. The multi-objective evolutionary algorithms based on decomposition (MOEA/D) has achieved great success in the area of evolutionary multi-objective optimization. Numerous MOEA/D variants are focused on solving the normalized multi- and many-objective problems without paying attention to problems having objectives with different scales. For this purpose, this paper proposes a decomposition-based evolutionary algorithm with adaptive weight vectors (DBEA-AWV) for both the normalized and scaled multi- and many-objective problems. In the light of this direction, we compare existing popular decomposition approaches and choose the best suitable one incorporated into DBEA-AWV. Moreover, one novel replacement strategy is adopted to attain the balance between convergence and diversity for multi- and many-objective optimization problems. Our experimental results demonstrate the proposed algorithm is efficient and reliable for dealing with different normalized and scaled problems, outperforming other several state-of-the-art multi- and many-objective evolutionary algorithms.
Automatic generation of adversarial metamorphic malware using MAP-Elites
ABSTRACT. In the field of metamorphic malware detection, training a detection model with malware samples that reflect potential mutants of the malware is crucial in developing a model resistant to future attacks. In this paper, we use a Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm to generate a large set of novel, malicious mutants that are diverse with respect to their behavioural and structural similarity to the original mutant. Using two classes of malware as a test-bed, we show that the MAP-Elites algorithm produces a large and diverse set of mutants, that evade between 64\% to 72\% of the 63 detection engines tested. When compared to results obtained using repeated runs of an Evolutionary Algorithm that converges to a single solution result, the MAP-Elites approach is shown to produce a significantly more diverse range of solutions, while providing equal or improved results in terms of evasiveness, depending on the dataset in question. In addition, the archive produced by MAP-Elites sheds insight into the properties of a sample that lead to them being undetectable by a suite of existing detection engines.