Hessian Complexity Measure for Genetic Programming-based Imputation Predictor Selection in Symbolic Regression with Incomplete Data
ABSTRACT. Missing values bring several challenges when learning from real-world data sets. Imputation is a widely adopted approach to estimating missing values. However, it has not been adequately investigated in symbolic regression. When imputing the missing values in an incomplete feature, the other features that are used in the prediction process are called imputation predictors. In this work, a method for imputation predictor selection using regularized genetic programming (GP) models is presented for symbolic regression tasks on incomplete data. A complexity measure based on the Hessian matrix of the phenotype of the evolving models is proposed. It is employed as a regularizer in the fitness function of GP for model selection and the imputation predictors are selected from the selected models. In addition to the baseline which uses all the available predictors, the proposed selection method is compared with two GP-based feature selection variations: the standard GP feature selector and GP with feature selection pressure. The trends in the results reveal that in most cases, using the predictors selected by regularized GP models could achieve a considerable reduction in the imputation error and improve the symbolic regression performance as well.
Comparing Genetic Programming Approaches for Non-Functional Genetic Improvement
ABSTRACT. Genetic improvement (GI) uses automated search to find improved versions of existing software. While most GI work use genetic programming (GP) as the underlying search process, focus is usually given to the target software only. As a result, specifics of GP algorithms for GI are not well understood and rarely compared to one another. In this work, we propose a robust experimental protocol to compare different GI search processes and investigate several variants of GP- and random-based approaches. Through repeated experiments, we report a comparative analysis of these approaches, using one of the previously used GI scenarios: improvement of runtime of the MiniSAT satisfiability solver. We conclude that the test suites used have the most significant impact on the GI results. Both random and GP-based approaches are able to find improved software, even though the percentage of viable software variants is significantly smaller in the random case (14.5% vs. 80.1%). We also report that GI produces MiniSAT variants up to twice as fast as the original on sets of previously unseen instances from the same application domain.
Time Control or Size Control? Reducing Complexity and Improving Accuracy of Genetic Programming Models
ABSTRACT. Complexity of evolving models in genetic programming (GP) can impact both the quality of the models and the evolutionary search. While previous studies have proposed several notions of GP model complexity, the size of a GP model is by far the most researched measure of model complexity. However, previous studies have also shown that controlling the size does not automatically improve the accuracy of GP models, especially the accuracy on out of sample (test) data. Furthermore, size does not represent the functional composition of a model, which is often related to its accuracy on test data. In this study, we explore the evaluation time of GP models as a measure of their complexity; we define the evaluation time as the time taken to evaluate a model over some data. We demonstrate that the evaluation time reflects both a model's size and its composition; also, we show how to measure the evaluation time reliably. To validate our proposal, we leverage four well-known methods to size-control but to control evaluation times instead of the tree sizes; we thus compare size-control with time-control. The results show that time-control with a nuanced notion of complexity produces more accurate models on 17 out of 20 problem scenarios. Even when the models have slightly greater times and sizes, time-control counterbalances via superior accuracy on both training and test data. The paper also argues that time-control can differentiate functional complexity even better in an identically-sized population. To facilitate this, the paper proposes Fixed Length Initialisation (FLI) that creates an identically-sized but functionally-diverse population. The results show that while FLI particularly suits time-control, it also generally improves the performance of size-control. Overall, the paper poses evaluation-time as a viable alternative to tree sizes to measure complexity in GP.
Is k Nearest Neighbours Regression Better than GP?
ABSTRACT. The original objective of this work was to define a new geometric semantic mutation for genetic programming, that was able to generate more manageable models than the traditional one. This paper tells the story of how, starting from this objective, we ended up with rather different findings. The new mutation replaces the random trees used by the traditional one with a vector of random numbers. To evaluate the individuals on unseen data, it uses the output on the k closest training observations, analogously to the k nearest neighbours. The results presented in the first part of the paper demonstrate that the new mutation outperforms the traditional one. In the second part, we show that the new mutation simply approximates the k nearest neighbours, returning comparable, but not better, results. This finding raises several questions, that arrive up to doubting the opportunity of using genetic programming itself. However, comparisons with state-of-the art machine learning techniques (random forests) and an analysis of relevant literature, allow us to provide a more nuanced perspective. The paper terminates leaving space for future research in genetic programming.
ABSTRACT. This paper introduces a strategy for exploring possibility spaces for creative design of emerging technology systems. These systems are represented as directed acyclic graphs with nodes representing different technologies and data or media types. The naming of system and data components is intended to be populated by a subjective, personal ontology. The source material can be simply modified by a user who wishes to explore a space of possibilities, con- strained by specific resources, to generate simple visual descriptions of systems that can be quickly browsed. These dataflow visualizations are a familiar repre- sentation to those who use visual programming interfaces for creative media processing. A simple interactive genetic programming approach is used with small populations of individual designs. The system is being developed using web browser technology. A minimum functionality proof of concept system has been implemented. Next steps and current challenges are discussed.
ABSTRACT. The emergence of artificial intelligence expedited the computational fabrication of visual information, especially photography, with realism and easiness never seen before. In this paper, we present an interactive installation that explores the generation of facial portraits in the borderline between the artificial and real. This installation creates new faces by recombining the facial features of the audience and displays them on the walls of the room. The array of faces that are displayed in the installation space is contaminated with real faces, making people question about the veracity of the portraits they are observing. The photorealism of the generated faces makes the process of distinguishing the real portraits of the fictional ones.
Adapting and Enhancing Evolutionary Art for Casual Creation
ABSTRACT. Casual creators are creativity support tools designed for non-experts to have
fun with while they create, rather than for serious creative production. We
discuss here how we adapted and enhanced an evolutionary art approach for casual
creation. Employing a fun-first methodology for the app design, we improved
image production speed and the quality of randomly generated images. We further
employed machine vision techniques for image categorisation and clustering and
designed a user interface for fast, fun image generation, adhering to numerous
principles arising from the study of casual creators. We describe the
implementation and experimentation performed during the first stage of
development, and evaluate the app in terms of efficiency, image quality,
feedback quality and the potential for users to have fun. We conclude with a
description of how the app will be used as a research tool, including a planned
art installation.
ABSTRACT. We have recently developed OMNIREP, a coevolutionary algorithm to discover both a representation and an interpreter that solve a particular problem of interest.
Herein, we demonstrate that the OMNIREP framework can be successfully applied within the field of evolutionary art. Specifically, we coevolve representations that encode image position, alongside interpreters that transform these positions into one of three pre-defined shapes (chunks, polygons, or circles) of varying size, shape, and color. We showcase a sampling of the unique image variations produced by this approach.
Comparing Fuzzy Rule Based Approaches for Music Genre Classification
ABSTRACT. Most of the studies on music genre classification are focused on classification quality only. However, listeners and musicologists would favor comprehensible models, which describe semantic properties of genres like instrument or chord statistics, instead of complex black-box transforms of signal features either manually engineered or learned by neural networks. Fuzzy rules - until now not a widely applied method in music classification - offer the advantage of understandability for end users, in particular in combination with carefully designed semantic features. In this work, we tune and compare three approaches which operate on fuzzy rules: a complete search of primitive rules, an evolutionary approach, and fuzzy pattern trees. Additionally, we include random forest classifier as a baseline. The experiments were conducted on an artist-filtered subset of the 1517-Artists database, for which 245 semantic properties describing instruments, moods, singing style, melody, harmony, influence on listener, and effects were extracted to train the classification models.
Emulation Games. See and Be Seen, an Subjective Approach to Analog Computational Neuroscience
ABSTRACT. This paper proposes to describe an emulation game that addresses an approach to the area of Computational Neuroscience. It is in this context where we are designing, performing and valuing certain technical devices that could reveal, or make effective, computational events associated with the emergence of consciousness. The conceptual framework, in which is situated our game, can be defined as a possible epistemogony, a space where the physiology of knowledge and the making are activated, and from which we intend to highlight certain con-ditions that make possible both the so-called scientific research, and art making.
Iterated Granular Neighborhood Algorithm forthe Taxi Sharing Problem
ABSTRACT. One of the most popular issues that we can find in cities is transportation problems; traffic jams, pollution and the transportation costs fees. The concept of Taxi sharing is considered as a promising idea to reduce some of the transportation problems. A group of people travel from the same origin to different destinations, our goal is to assign them to several taxis while reducing the cost of all trips.
The taxi sharing problem is NP-hard, since it is a variant of the car pooling problem.
We adapt Capacitated Vehicle Routing Problem (CVRP) to solve the taxi sharing problem, in which goods are changed by passengers and trucks by taxis.
We describe a new algorithm, called Iterated Granular Neighborhood Algorithm (IGNA), based on the use of the restricted swap neighborhoods in the local search phase, eliminating moves that involve long arcs that may not be part of the best solution. We empirically analyze our algorithm solving different real-like instances of the problem with 9 to 57 passengers.
The results show that the proposed IGNA is quite competitive with the parallel micro evolutionary algorithm (p$\mu$EA).
A fast, scalable meta-heuristic for network slicing under traffic uncertainty
ABSTRACT. Perceived to be one of the cornerstones of the emerging next generation (5G) networks, \emph{Network slicing} enables the accommodation of multiple logical networks with diverse performance requirements on a common substrate platform. Of particular interest among different facets of network slicing is the problem of designing an individual network slice tailored specifically to match the requirements of the big-bandwidth next generation network services. In this work, we present an exact formulation for the network slice design problem under traffic uncertainty. As the considered mathematical formulation is known to pose a high degree of computational difficulty to state-of-the-art commercial mixed integer programming solvers owing to the inclusion of robust constraints, we propose a meta-heuristic based on ant colony optimisation algorithms for the robust network slice design problem. Experimental evaluation conducted on realistic network topologies from SNDlib reveals that the proposed meta-heuristic can indeed be an efficient alternative to the commercial mixed integer programming solvers.
Designing cable-stayed bridges with Genetic Algorithms.
ABSTRACT. Cable-stayed bridges construction involves the determination of a high number of design variables, both static and dynamic. Moreover, the properties of such variables make them statically indeterminate, meaning that a change in one design variable affects the response of the entire structure. This property makes the design of cable-stayed bridges a complex optimization problem. In this work, we use a Genetic Algorithm to evolve solutions for this problem. A set of experiments are executed, where conventional variation operators are used for exploring the solution space. The first experiments suggest that this is a problem with a deceptive landscape. However, we show that we can design solutions that optimize structural objectives. Moreover, we want also to minimize costs while presenting different optimized solutions. In the second set of experiments, we included a baseline solution in the population to evaluate if we could find better solutions using this approach. The results on the second set showed that it was possible, thus we moved to the third set of experiments with more parameter tuning. The experimental results suggest that we are able to find new and suitable solutions for the problem comparable to the existing baseline approach.
Social Learning vs Self-teaching in a Multi-agent Neural Network System
ABSTRACT. The idea that learning can be beneficial to evolution has beentaken on evolving neural networks. Moreover, learning can be classifiedinto two types, namely social learning, or learning from others (e.g., imitation), and individual trial-and-error learning. A “social learning strategy”– a rule governing whether and when to use social or individuallearning, is often said to be more beneficial than relying on social orindividual learning alone. In this paper we compare the effect on evolu-tion of social learning in comparison with that of individual learning. Aneural architecture called a “self-taught neural network” is proposed inorder to allow an agent to learn on its own, without any supervision. Wesimulate a multi-agent system in which the agent controlled by a neuralnetwork have to develop adaptive behaviour and compete with each otherfor survival. Experimental results show that self-teaching alone is moreadaptive than when combined social learning in our simulated world. Weconclude this paper with some indications for future work.
EvoDynamic: a framework for the evolution of generally represented dynamical systems and its application to criticality
ABSTRACT. Dynamical systems possess a computational capacity that may be exploited in a reservoir computing paradigm. This paper presents a general representation of dynamical systems which is based on matrix multiplication. That is similar to how an artificial neural network (ANN) would be represented in a deep learning library and its computation would be faster because of the optimized matrix operations that such type of libraries have. Initially, we implement the simplest dynamical system, a cellular automaton. The mathematical fundamentals behind an ANN are maintained, but the weights of the connections and the activation function are adjusted to work as an update rule in the context of cellular automata. The advantages of such implementation are its usage on specialized and optimized deep learning libraries, the capabilities to generalize it to other types of networks and the possibility to evolve cellular automata and other dynamical systems in terms of connectivity, update and learning rules. Our implementation of cellular automata constitutes an initial step towards a more general framework for dynamical systems. Our objective is to evolve such systems to optimize their usage in reservoir computing and to model physical computing substrates. Furthermore, we present promising preliminary results toward the evolution of complex behavior and criticality using genetic algorithm in stochastic elementary cellular automata.
Automatic rule extraction from access rules using Genetic Programming
ABSTRACT. The security policy rules in companies are generally proposed by the
Chief Security Officer (CSO), who must, for instance, select by hand
which access events are allowed and which ones should be forbidden. In
this work we propose a way to automatically obtain rules that
generalize these single-event based rules using Genetic Programming
(GP), which, besides, should be able to present them in an
understandable way. Our GP-based system obtains good dataset coverage
and small ratios of false positives and negatives in the simulation
results over real data, after testing different fitness functions and
configurations in the way of coding the individuals.
Finding behavioural patterns among League of Legends players through Hidden Markov Models
ABSTRACT. This work is aimed at finding behavioural patterns among professional players of League of Legends, one of the greatest recent phenomena in the world of video games. For that purpose, Hidden Markov Models (HMM) are used to model the sequence of events produced by a gameplay. First, the set of interesting game events for analysis is defined, and based on that, each gameplay of the dataset is transformed into a sequences of events. Then, four HMMs will be trained with the data from four different groups of sequences, according to the team that produces the events of the sequence (red/blue) and to whether that sequence led to a victory of the team or not. Finally, the resulting HMMs will be visualized and compared in order to achieve some conclusions about the macro
game strategy in League of Legends, which will help to understand the game at the highest level of its competition.
Security Risk Optimization for Multi-Cloud Applications
ABSTRACT. Security proved to be major concern when organizations outsource their data storage and processing. Encryption schemes do not provide solutions as they disable data processing in the cloud. Researchers have used constraint-based data fragmentation to increase security while maintaining availability. We build on this approach by applying fragmentation to the application logic in addition to the data in the database and propose a model for security risk assessment in multi-cloud environment. By applying a multi-objective optimization algorithm to the proposed model, we determine pareto-optimal distributions of application and data fragments to available cloud providers.
Using Skill Rating as Fitness on the Evolution of GANs
ABSTRACT. Generative Adversarial Networks (GAN) is an adversarial model that achieved impressive results on generative tasks. In spite of the relevant results, GANs present some challenges regarding the stability, making the training usually a hit-and-miss process. To overcome these challenges, several improvements were proposed to better handle the internal characteristics of the model, such as alternative loss functions or architectural changes on the neural networks used by the generator and the discriminator. Recent works proposed the use of evolutionary algorithms on the GAN training, aiming to solve these challenges and to provide an automatic way to find good models. In this context, COEGAN proposes the use of coevolution and neuroevolution to orchestrate the training of GANs. However, previous experiments detected that some of the fitness functions used to guide the evolution are not ideal.
In this work we propose the evaluation of a game-based fitness function to be used within the COEGAN method. Skill rating is a metric to quantify the skill of players in a game and has already been used to evaluate GANs. We extend this idea using the skill rating in an evolutionary algorithm to train GANs. The results show that skill rating can be used as fitness to guide the evolution in COEGAN without the dependence of an external evaluator.
Community Detection in Attributed Graphs with Differential Evolution
ABSTRACT. Detecting communities in networks, by taking into account not only node connectivity but also the features characterizing nodes, is becoming a research activity with increasing interest because of the information nowadays available for many real-world networks of attributes associated with nodes. In this paper, we investigate the capability of differential evolution to discover groups of nodes which are both densely connected and share similar features. Experiments on two real-world networks with attributes for which the ground-truth division is known show that differential evolution is an effective approach to uncover communities.
Accelerated Design of HIFU Treatment Plans Using Island-based Evolutionary Strategy
ABSTRACT. High Intensity Focused Ultrasound (HIFU) is an emerging technique for non-invasive cancer treatment where malignant tissue is destroyed by thermal ablation. Such a treatment consists of a series of short sonications destroying small volumes of tissue. High-quality treatment plans allow to precisely target malignant tissue and protect surrounding healthy tissue. Recently, we developed an evolutionary strategy to design such HIFU treatment plans using patient-specific material properties and a realistic thermal model. Unfortunately, the execution time was prohibitive for routine use. Here, we present an optimized evolutionary strategy based on island model parallelization and a revised fitness function implementation. The proposed improvements allow to develop a good treatment plan four times faster and with 5% higher success rate.
ABSTRACT. The dynamic flexible job shop scheduling (DFJSS) problem has been widely concerned in both academia and industry. Both machine assignment and operation sequencing decisions need to be made simultaneously as an operation can be processed by a set of machines in DFJSS. Scheduling heuristic has become an effective way to solve the
DFJSS problem due to its efficiency and simplicity. Genetic programming (GP) has been successfully applied to evolve scheduling heuristics for job shop scheduling automatically. Crossover and mutation are two basic genetic operators to generate offsprings in GP. However, the subtrees of the selected parents are randomly chosen in traditional GP, which may not be sufficiently effective, especially in a huge search space. In this paper, we propose new strategies to guide the subtree selection rather than pick them randomly. To be specific, the occurrences of features are used to measure the importance of each subtree of the selected parents.
The probability that a subtree is selected will be set according to its importance and the type of genetic operators. The results show that the proposed GP algorithm with the guided subtree selection for crossover can converge faster and achieve significantly better performance than its counterpart in half of the scenarios while no worse in all other scenarios without increasing the computational time.
Investigating the Use of Geometric Semantic Operators in Vectorial Genetic Programming
ABSTRACT. Vectorial Genetic Programming (VE_GP) is a new GP approach for panel data forecasting. Besides permitting the use of vectors
as terminal symbols to represent time series and including aggregate
functions to extract time series features, it introduces the possibility of
evolving the window of aggregation. The local aggregation of data allows the identification of meaningful patterns overcoming the drawback
of considering always the previous history of a series of data. In this
work, we investigate the use of geometric semantic operators (GSOs) in
VE_GP, comparing its performance with traditional GP with GSOs. Experiments are conducted on two real panel data forecasting problems,
one allowing the aggregation on moving windows, one not. Results show
that classical VE_GP is the best approach in both cases in terms of predictive accuracy, suggesting that GSOs are not able to evolve efficiently
individuals when time series are involved. We discuss the possible reasons
of this behaviour, to understand how we could design valuable GSOs for
time series in the future.
Automatically Evolving Lookup Tables for Function Approximation
ABSTRACT. Many functions, such as square root, are approximated and sped up with lookup tables containing pre-calculated values.
We introduce an approach using genetic algorithms to evolve such lookup tables for any smooth function. It provides double precision and calculates most values to the closest bit, and outperforms reference implementations in most cases with competitive run-time performance.
ABSTRACT. Ensemble learning is a powerful paradigm that has been used in the top state-of-the-art machine learning methods like Random Forests and XGBoost. Inspired by the success of such methods, we have developed a new Genetic Programming method called Ensemble GP. The evolutionary cycle of Ensemble GP follows the same steps as other Genetic Programming systems, but with differences in the population structure, fitness evaluation and genetic operators. We have tested this method on eight binary classification problems, achieving results significantly better than standard GP, with much smaller models. Although other methods like M3GP and XGBoost were the best overall, Ensemble GP was able to achieve exceptionally good generalization results on a particularly hard problem where none of the other methods was able to succeed.
Benchmarking manifold learning methods on a large collection of datasets
ABSTRACT. Manifold learning, a non-linear approach of dimensionality reduction, assumes that the dimensionality of multiple datasets is artificially high and a reduced number of dimensions is sufficient to maintain the information about the data. In this paper, a large scale comparison of manifold learning techniques is performed for the task of classification. We show the current standing of genetic programming (GP) for the task of classification by comparing the classification results of two GP-based manifold leaning methods: GP-Mal and ManiGP - an experimental manifold learning technique proposed in this paper. We show that GP-based methods can more effectively learn a manifold across a set of 155 different problems and deliver more separable embeddings than many established methods. We also ask questions about the fairness of benchmarking.
Classification of Autism Genes using Network Science and Linear Genetic Programming
ABSTRACT. Understanding the genetic background of complex diseases and disorders plays an essential role in the promising precision medicine. Deciphering what genes are associated with a specific disease/disorder helps better diagnose and treat it, and may even prevent it if predicted accurately and acted on effectively at early stages. The evaluation of candidate disease-associated genes, however, requires time-consuming and expensive experiments given the large number of possibilities. Due to such challenges, computational methods have seen increasing applications in predicting gene-disease associations. Given the intertwined relationships of molecules in human cells, genes and their products can be considered to form a complex molecular interaction network. Such a network can be used to find candidate genes that share similar network properties with known disease-associated genes. In this research, we investigate autism spectrum disorders and propose a linear genetic programming algorithm for autism-gene prediction using a human molecular interaction network and known autism-genes for training. We select an initial set of network properties as features and our LGP algorithm is able to find the most relevant features while evolving accurate predictive models. Our research demonstrates the powerful and flexible learning abilities of GP on tackling a significant biomedical problem, and is expected to inspire further exploration of wide GP applications.
GenImpro: An Evolutionary Model For Musical Improvisation
ABSTRACT. Musical improvisation and biological evolution are similarly based on the principles of unpredictability and adaptivity. Within this framework, this research examines whether and how structures of evolutionary developmental logic can be detected and described in free improvisation. The underlying concept of improvisation is participative in nature and, in this reading, contains similar generative strategies as there are in evolutionary processes. Further implications of the theory of evolution for cultural development in the concept of memetics and in the form of genetic algorithms build an interdisciplinary network of different theories and methodologies, from which the proposed model of genetic improvisation emerges.
Fusion of Hilbert-Huang Transform and Deep Convolutional Neural Network for Predominant Musical Instruments Recognition
ABSTRACT. As a subset of music information retrieval (MIR), predominant musical instruments recognition (PMIR) has attracted substantial interest in recent years due to its uniqueness and high commercial value in key areas of music analysis research such as music retrieval and automatic music transcription. With the attention paid to deep learning and artificial intelligence technology, they have been more and more widely applied in the field of MIR, thus making break-throughs in some sub-fields that have been stuck in the bottleneck. In this paper, the Hilbert-Huang Transform (HHT) is employed to map one-dimensional audio data into two-dimensional matrix format and then a deep convolutional neural network is developed to learn affluent and effective features for PMIR. In the experiment, 6705 audio pieces including 11 musical instruments are used to validate the efficacy of our proposed approach. The results are compared to four benchmarking methods and show significant improvements in terms of precision, recall and F1 measures.
Genetic Reverb: Synthesizing Artificial Reverberant Fields Via Genetic Algorithms
ABSTRACT. We present Genetic Reverb, a user-friendly VST2 audio effect plugin that performs convolution with Room Impulse Responses (RIRs) generated via a Genetic Algorithm (GA). The parameters of the plugin include some of the standard room acoustics parameters mapped to perceptual correlates (decay time, intimacy, clarity, warmth, among others). These parameters provide the user with some control over the resulting RIRs as they determine the fitness values of potential RIRs. In the GA, these RIRs are initially generated via an image method, and then evolved via truncation selection, multi-point crossover, and Gaussian mutation. These operations repeat until a certain number of generations has passed or the fitness value reaches a threshold. Either way, the best-fit RIR is returned. The user can also generate two different RIRs simultaneously, and assign each of them to the left and right stereo channels for a binaural reverberation effect. With Genetic Reverb, the user can generate and store new RIRs that represent virtual rooms, some of which may even be impossible to replicate in the physical world. An original musical composition using the Genetic Reverb plugin is presented to demonstrate its applications.
What is Your MOVE: Modeling Adversarial Network Environments
ABSTRACT. Optimal adversarial dynamics between defenders and attackers in large network systems is a complex problem one can approach from several perspectives. Still, the results obtained are often not satisfactory since they either concentrate on only one party or run very simplified scenarios that are hard to correlate with realistic settings.
To truly find which are the most robust defensive strategies, the adaptive attacker ecosystem must be given as many degrees of freedom as possible, to accurately model real attacking scenarios.
We propose a coevolutionary-based simulator called MOVE that can evolve both attack and defense strategies. To test it, we investigate several different but realistic scenarios, taking into account features such as network topology and even possible applications. The results show that the evolved strategies far surpass randomly generated strategies. Finally, the evolved strategies can help us to reach some more general conclusions for both attacker and defender sides.
Modelling the Search Process of Population-based Algorithms in Continuous Optimisation
ABSTRACT. We introduce search trajectory networks (STNs) as a tool to analyse and visualise the behaviour of population-based algorithms in continuous spaces. Inspired by local optima networks (LONs) that model the global structure of search spaces, STNs model the search trajectories of algorithms. Unlike LONs, the nodes of the network are not restricted to local optima but instead represent a given state of the search process. Edges represent search progression between consecutive states. This extends the power and applicability of network-based models to understand heuristic search algorithms. We extract and analyse STNs for two well-known population-based algorithms: particle swarm optimisation and differential evolution when applied to benchmark continuous optimisation problems. We also offer a comparative visual analysis of the search dynamics in terms of merged search trajectory networks.
On the sampling of all basins of attraction of multimodal functions
ABSTRACT. In general terms, the modality of a mathematical object refers to the number of highest or lowest values (so-called modes) that the object possesses. A multimodal function is therefore a type of function that has multiple modes as opposed to an unimodal function that has only one. Each of these modes is enclosed within a boundary region or basin, where the value acts as an attractor. In optimization, exploiting a basin of attraction means using the gradient information of the region to converge to the mode as it represents a local or global optimum. Therefore, if the aim is to locate all possible solutions in a multimodal function, we should first locate all its basins of attraction and then exploit them. This paper focuses on the former step: analyzing the cost of locating all basins of attraction of a multimodal function by sampling uniformly at random the problem space, i.e. a common approach of initialization in population-based metaheuristics. From the conducted experimentation, we will discuss the implications of the proposed model on the design of multimodal benchmarks. Among other conclusions, the proposed model
is equivalent to the problem of determining a minimum population size and therefore we can derive lower bounds on the computational budget necessary to optimize a multimodal function.
A Multi-strategy LSHADE Algorithm and its Applications on Temporal Alignment
ABSTRACT. Temporal alignment is a common problem, but performance of corresponding techniques is easily affected by time series characteristic. In this paper, we propose a novel temporal alignment approach with a combination of multi-strategy success-history based adaptive differential evolution with linear population size reduction (MLSHADE) and dynamic time warping (DTW), namely, MLDTW. To strengthen efficiency of optimization, weighted mutation strategy, inferior solution search strategy and eigen Gaussian random walk strategy is presented. Moreover, the MLDTW operators which include initialization and updating operators are given to enhance the accuracy and speed of DTW based on MLSHADE. The performance of MLSHADE is verified by using CEC 2018 test suite. Meanwhile, the availability and robustness of MLDTW is proved by using sinusoidal signal and UCR time series datasets.
From Business Curated Products to Algorithmically Generated
ABSTRACT. This document outlines how at FabFitFun we have developed a bundle of machine learning models that enabled us algorithmically assemble future boxes that are shipped to our members. FabFitFun is a technology and e-commence startup that works with small vendors to discover cool products in a similar fashion to Amazon marketplace. The team of data scientists utilize our historical data which includes details of our customers and products we have sold and are planning to sell, programmatically discover new products.We use classical machine learning supervised models to classify our products and latest technologies like genetic programming, linear optimization, Elastic Search and NLP to help business to curate products to members in a new era. Majority of products never repeat, thus that we are faced with a cold start problem. We are addressing this problem via genetic programming and feature enrichment from text. We utilize Elastic Search engine to create and score word tokens to have a rich data and ability to spot repetitive features for our supervised learning models. Our paper demonstrates novel way to evaluate algorithmically created boxes usingSharpe Ratio Concept from finance. Utilize genetic programming to generate synthetic data, provide diversity, novelty across our products
ABSTRACT. The Flexible Job Shop Problem is one of the hardest problems in the area of combinatorial optimization. Genetic Algorithms are among the methods of resolution. Despite their natural parallelism, few studies exploit the poten-tial of their parallelization in the emerging powerful Graphics Processing Units. This paper presents a conceptual model for a Parallel Microevolutionary Genetic Algorithm. Taking advantage of the Computer Unified Device Architecture, this conceptual model focuses on thread blocks organiza-tion and memory management. Future implementations may validate the potential for good results and its use to tackle more complex extensions of the problem.
ABSTRACT. Although the Hamiltonian cycle problem is known to be NP-complete, only a few graphs are actually hard to decide for complete backtracking algorithms running on large ensembles of random graphs. Historically, these hard instances are found near the Komlo ́s-Szemer ́edi bound, the average vertex degree where the Hamiltonian probability phase transition occurs. In this preliminary investigation, we take a different approach, generating hard graphs with two evolutionary algorithms. We find comHamiltonian cycle problem · instance hardness · phase tran- sition · evolutionary algorithms · plant propagation algorithm
pletely new and counterintuitive results.
Plant Propagation Parameterization: Offspring & Population Size
ABSTRACT. We investigate a wide range of interdependent parameter values for the number of offspring and the population size in the Plant Propagation Algorithm. An ‘optimal window’ of parameter values is found, for which the algorithm performs substantially better on five benchmark test functions. Moreover, apart from being within or outside the window, values appear to be largely interchangable, making the algorithm largely independent from specific settings of these parameters.
ABSTRACT. The simplified paintings-from-polygons problem (SPFP), in
which paintings or other digital images are approximated by heuristically arranging overlapping semi-opaque colored polygons, is NP-hard.
Every instance of the subset sum problem can be transformed to a SPFP
instance, solved by some algorithm, and transformed back. Whichever algorithm one chooses, it cannot be more efficient than the most efficient
algorithm for subset sum. Since subset sum is known to be NP-hard, and
SPFP is at least equally hard, SPFP must also be NP-hard.
A deep learning neural network for classifying good and bad photos
ABSTRACT. Current state-of-the-art solutions that automate the assessment of photo aesthetic quality use deep learning neural networks. Most of these networks are either binary classifiers or regression models that predict the aesthetic rating of photos. In this paper, we developed a deep learning neural network that predicts the opinion score rating distribution of a photo's aesthetic quality. Our work focused mainly on developing a deep learning network that use the best image pre-processing method to improve the correlation of ground truth and the predicted aesthetic rating distribution of photos. Earth mover distance (EMD), Spearman's rank correlation coefficient and linear rank correlation were used to measure correlation while accuracy and F1 score were used to rank the best trained network. We have identified five image pre-processing algorithms namely random, global, center, salience merge and salience clustering which were used to resize and extract the regions in a photo used for aesthetic assessment. Using InceptionV2 as base architecture and EMD loss as the objective function, we trained our network using the aesthetics visual analysis (AVA) benchmark dataset. To identify the best algorithm, we trained our network using photos of varied aesthetic rating ranges: unambiguously rated photos with mean rating < 4.0 and mean rating > 6.0, ambiguously rated photos with mean rating in the range 4.0 < mean rating <= 6.0, and all images regardless of rating. Our result show center produce the best model and significantly outperform global, random and both salience based algorithms when models are trained using unambiguously rated photos. On the other hand, both salience based algorithm outperform random, global and center when models are trained using ambiguously rated photos. Lastly, center, random and salience merge outperforms both global and salience clustering with statistically significant difference in terms of accuracy and F1 score when models are trained using all photos regardless of aesthetic rating.
Controlling Self-Organization in Generative Creative Systems
ABSTRACT. We present a new tool which simulates the development of Artificial Chemistries (AChems) to produce real-time imagery for artistic/entertainment purposes. There have been many such usages of complex systems for artistic purposes, but deciding which parameters to use for such unpredictable systems can lead to a feeling of lack of control. For our purposes, we struggled to gain enough control over the AChem real-time image generation tool to accompany music in a video-jockeying application. To overcome this difficulty, we developed a general-purpose clustering approach that attempts to produce sets of starting configurations which lead to maximally distinct visualisations, thus ensuring users feel that they have influence over the AChem when controlled with a suitable GUI. We present this approach and its application to controlling the development of AChems, along with the results from experiments with different clustering approaches, aided by both machine vision analysis and human curation. We conclude by advocating an overfitting approach supplemented by a final check by a designer, and discuss potential applications of this in artistic and entertainment settings.
Quantum Zentanglement: Combining Picbreeder and Wave Function Collapse to Create Zentangles
ABSTRACT. This paper demonstrates a computational approach to generating art reminiscent of Zentangles by combining Picbreeder with Wave Function Collapse (WFC). Picbreeder interactively evolves images based on user preferences and selected image tiles are sent to WFC, which generates patterns by filling a grid with various rotations of the tile images, placed according to simple constraints. Then other images from Picbreeder act as templates for combining patterns into a final Zentangle image. Although traditional Zentangles are black and white, the system is also capable of producing vibrant color Zentangles. Automatic evolution experiments using fitness functions instead of user selection were also conducted. Although certain automated fitness functions occasionally produce simplistic degenerate images, many automatically generated Zentangles are aesthetically pleasing and consist of interesting naturalistic patterns. The interactively generated Zentangles are nearly always pleasing and tailored to the preferences of the user creating them.
An Event-based Architecture for Cross-Breed Multi-population Bio-inspired Optimization Algorithms
ABSTRACT. Multi-population methods can combine multiple algorithms, with different parameters, interacting with each other at the same time. For instance, a genetic algorithm could find a promising global solution that is not optimal while another algorithm, more suitable for a local search, finds the global optimum. This approach has been followed extensively in recent years, with success. Moreover, there is a need for frameworks, architectures, and implementation models that can allow researchers the development of new parallel, asynchronous, heterogeneous, and parameter-free algorithms in a scalable way. In this work, we present an event-driven architecture, designed to distribute the processing of population-based algorithms asynchronously. The search algorithm uses a multi-population approach, creating multiple populations with different parameters of execution, allowing the implementation of multiple algorithms. In this work, we cross-breed Genetic Algorithms (GAs) and Particle Swarm Optimization (PSO). Experiments show that the framework allows the combined algorithms outperform, with a high probability, single-algorithm versions. The framework we provide also has few parameters to tune since single-algorithm parameters are selected randomly; in general, this will boost the diversity of every algorithm by itself.
ABSTRACT. Most Genetic Programming implementations use an interpreter to execute an individual, in order to obtain its outcome. Usually, such interpreter is the main bottleneck of the algorithm, since a single individual may contain thousands of instructions that must be executed on a dataset made of a large number of samples. Although one can use SIMD (Single Instruction Multiple Data) intrinsics to execute a single instruction on a few samples at the same time, multiple passes on the dataset are necessary to calculate the result. To speed up the process, we propose using MIMD (Multiple Instruction Multiple Data) instruction sets. This way, in a single pass one can execute several instructions on the dataset. We employ AVX2 intrinsics to improve the performance even further, reaching a median peak of $7.5$ billion genetic programming operations per second in a single CPU core.
Using evolutionary algorithms for server hardening via the moving target defense technique
ABSTRACT. The {\em moving target defense} from cyberattacks consists in changing
the profile or signature of certain services in an Internet node so
that an attacker is not able to identify it uniquely, or find specific
angles of attack for it. From an optimization point of view,
generating profiles that change and, besides, optimize security is a
combinatorial optimization problem where different service
configurations are generated and evaluated, seeking the optimum
according to a standard server vulnerability evaluation score. In this
paper we will use an evolutionary algorithm to generate different
server profiles that also minimize the risk of being attacked. Working
on the well-known web server nginx, and using an
industry-standard web configuration, we will prove that this
evolutionary algorithm is able to generate a sufficient amount of
different and secure profiles in time for them to be deployed in the
server. The system has been released as free software, as is the best
practice in security tools.