SYNASC 2015: 17TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING
CONFERENCE ON WEDNESDAY, SEPTEMBER 23RD

09:00-09:50 Session 13: Invited talk: Building a Nature-Inspired Computer - Peter Bentley

Peter Bentley -  Building a Nature-Inspired Computer

Location: A11
09:00
Building a Nature-Inspired Computer
SPEAKER: Peter Bentley

ABSTRACT. Since before the birth of computers we have strived to make intelligent machines that share some of the properties of our own brains. We have tried to make devices that quickly solve
problems that we find time consuming, that adapt to our needs, and that learn and derive new information. In more recent years we have tried to add new capabilities to our devices: self-adaptation, fault tolerance, self-repair, even self-programming, or self-building. In pursuing these challenging goals we push the boundaries of computer and software architectures. We invent new parallel processing approaches or we exploit hardware in new ways.

For the last decade Peter Bentley and his group have made their own journey in this area. In order to overcome the observed incompatibilities between conventional architectures and
biological processes, Bentley created the Systemic Computer – a computing paradigm and architecture designed to process information in a way more similar to natural systems. The computer uses a systemics world-view. Instead of the traditional centralized view of computation, here all computation is distributed. There is no separation of data and code/functionality into memory, ALU, and I/O. Everything in systemic computation is composed of systems, which may not be destroyed, but may transform each other through their interactions, akin to collision-based computing. Two systems interact in the context of a third
system, which defines the result of their interaction. All interactions may be separated and embedded within scopes, which are also systems, enabling embedded hierarchies. Systemic computation makes the following assertions:

  • Everything is a system.
  • Systems can be transformed but never destroyed or created from nothing.
  • Systems may comprise or share other nested systems.
  • Systems interact, and interaction between systems may cause transformations of those systems, where the nature of that transformation is determined by a contextual system.
  • All systems can potentially act as context and affect the interactions of other systems, and all systems can potentially interact in some context.
  • The transformation of systems is constrained by the scope of systems, and systems may have partial membership within the scope of a system.
  • Computation is transformation.

Computation has always meant transformation in the past, whether it is the transformation of position of beads on an abacus, or of electrons in a CPU. But this simple definition also allows us to call the sorting of pebbles on a beach, or the transcription of protein, or the growth of dendrites in the brain, valid forms of computation. Such a definition is important, for it provides a common language for biology and computer science, enabling both to be understood in terms of computation.
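
A hypothetical Python toy (not Bentley's architecture or code) that mimics the world-view sketched above: every entity is a system, two systems interact in the context of a third, interactions are confined to a scope, and values are only transformed, never created or destroyed:

    import random

    class System:
        """Everything is a system; here a system just carries an integer value."""
        def __init__(self, value):
            self.value = value

    def order_pair(a, b):
        """A toy contextual rule: the context orders the two interacting systems."""
        a.value, b.value = min(a.value, b.value), max(a.value, b.value)

    class Scope:
        """A scope is itself a system; it confines which systems may interact."""
        def __init__(self, members, context_rule):
            self.members = members            # nested systems
            self.context_rule = context_rule  # behaviour of the contextual system

        def step(self):
            # two systems interact in the context of a third (the rule);
            # systems are transformed, never created or destroyed
            a, b = random.sample(self.members, 2)
            self.context_rule(a, b)

    if __name__ == "__main__":
        scope = Scope([System(v) for v in [5, 2, 9, 1]], order_pair)
        for _ in range(100):
            scope.step()
        print([s.value for s in scope.members])  # values conserved, only rearranged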

The systemic computer is designed to enable many features of natural computation and provide an effective platform for biological modeling and bio-inspired algorithms. Several
different implementations of the systemic computer have been created, each with their own advantages and disadvantages. Simulators on conventional computers have enabled the
demonstration of bio-inspired algorithms, fault tolerance, self-repair, and modeling of biological processes. To improve speed, an FPGA-based hardware implementation was created
and shown to be several orders of magnitude faster. A GPU-based implementation was also created in order to combine flexibility, scalability, and speed.

Through this work, many important lessons have been learned. In addition to the advances in bio-inspired computing, it is increasingly possible to see parallels between systemic computing and other techniques and architectures under development. High performance graph-based computing or novel hardware based on memristors or neural modeling may provide excellent new substrates for systemic-style computation in the future.

09:50-10:10 Coffee Break
10:10-12:10 Session 14A: Artificial Intelligence (II)
Location: A11
10:10
A Surrogate-Based Strategy for Multi-Objective Tolerance Analysis in Electrical Machine Design

ABSTRACT. By employing state-of-the-art automated design and optimization techniques from the field of evolutionary computation, engineers are able to discover electrical machine designs that are highly competitive with respect to several (usually conflicting) objectives like efficiency, material costs, torque ripple and others. Apart from being Pareto-optimal, a good electrical machine design must also be quite robust, i.e., it must not be overly sensitive with respect to its design parameters, as this would severely increase manufacturing costs or make the physical machine exhibit characteristics that are very different from those of its computer simulation model. Even when using a modern parallel/distributed computing environment, carrying out a (global) tolerance analysis of an electrical machine design is extremely challenging because of the number of evaluations that must be performed and because each evaluation requires (a series of) very time-intensive non-linear finite element (FE) simulations. In the present research, we describe how global surrogate models (ensembles of fast-to-train artificial neural networks) that are created in order to speed up the multi-objective evolutionary search can be easily reused to perform a fast tolerance analysis of the optimized designs. Using two industrial optimization scenarios, we show that the surrogate-based approach can offer very valuable insights (and sometimes very accurate predictions) regarding the local and global sensitivities of the considered objectives at a fraction of the computational cost required by an FE-based strategy. Encouraged by the good performance on individual designs, we also used the surrogate approach to track the average sensitivity of the Pareto front during the entire optimization procedure. Our results indicate that there is no generalized increase of sensitivity during the runs, i.e., the evolutionary algorithms used do not enter a stage where they discover electrical drive designs that trade robustness for quality.
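
A minimal scikit-learn sketch of the general idea, not the authors' pipeline: an ensemble of small neural-network surrogates is fitted to sampled design evaluations and then reused for a Monte-Carlo tolerance analysis around an optimized design; the function expensive_objective is an invented stand-in for the finite-element simulation:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    def expensive_objective(x):
        # stand-in for a time-intensive FE evaluation of one objective
        return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

    # 1) design evaluations gathered during the evolutionary search
    X = rng.uniform(-1, 1, size=(200, 2))
    y = expensive_objective(X)

    # 2) ensemble of fast-to-train ANN surrogates
    ensemble = [MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000,
                             random_state=seed).fit(X, y) for seed in range(5)]
    predict = lambda Z: np.mean([m.predict(Z) for m in ensemble], axis=0)

    # 3) reuse the surrogates for tolerance analysis around an optimized design
    design = np.array([0.3, -0.2])
    perturbed = design + rng.normal(scale=0.02, size=(10000, 2))  # tolerances
    spread = predict(perturbed)
    print("objective spread under tolerances:", spread.std())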

10:30
Lie Algebra-Valued Hopfield Neural Networks

ABSTRACT. This paper introduces Lie algebra-valued Hopfield neural networks, for which the states, outputs, weights and thresholds are all from a Lie algebra. This type of network represents an alternative generalization of real-valued neural networks besides the complex-, hyperbolic-, quaternion-, and Clifford-valued neural networks that have been intensively studied over the last few years. The dynamics of these networks is studied from the energy function point of view, by giving the expression of such a function and proving that it is indeed an energy function for the proposed network.

10:50
Feature creation using genetic programming with application in malware detection

ABSTRACT. This paper extends the authors' previous research on a malware detection method, focusing on improving the accuracy of the perceptron-based One Side Class Perceptron algorithm via the use of Genetic Programming. We are concerned with finding a proper balance between the three basic requirements for malware detection algorithms: a) that their training time on large datasets falls below acceptable upper limits; b) that their false positive rate (clean/legitimate files/software wrongly classified as malware) is as close as possible to 0; and c) that their detection rate is as close as possible to 1. When the first two requirements are set as objectives for the design of detection algorithms, it often happens that the third objective is missed: the detection rate is low.

This study focuses on improving the detection rate while preserving the small training time and the low rate of false positives.

Another concern is to exploit the perceptron-based algorithm's good performance on linearly separable data by extracting new features from existing ones. In order to keep the overall training time low, the huge search space of possible extracted features is explored efficiently – in terms of both time and memory footprint – using Genetic Programming; better separability is sought.

For experiments we used a dataset consisting of 350,000 executable files with an initial set of 300 Boolean features describing each of them. The feature-extraction algorithm is implemented in a parallel manner in order to cope with the size of the data set. We also tested different ways of controlling the growth in size of the variable-length chromosomes.

The experimental results show that the features produced by this method are better than the best ones obtained through mapping, allowing for an increase in detection rate.
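
A toy Python sketch of the feature-construction idea only (the evolutionary loop, parallelization and bloat control from the paper are omitted, and the dataset and fitness measure are invented): random Boolean expression trees over the original features are generated and scored by how well they separate the two classes:

    import numpy as np

    rng = np.random.default_rng(1)

    # toy stand-in for the malware dataset: rows = files, columns = Boolean features
    X = rng.integers(0, 2, size=(1000, 20), dtype=bool)
    y = (X[:, 0] & ~X[:, 3]) | (X[:, 7] & X[:, 11])   # hidden "malicious" pattern

    def random_tree(depth=3):
        """Grow a random Boolean expression over the original features."""
        if depth == 0 or rng.random() < 0.3:
            return ("feat", int(rng.integers(X.shape[1])))
        op = rng.choice(["and", "or", "not"])
        if op == "not":
            return ("not", random_tree(depth - 1))
        return (op, random_tree(depth - 1), random_tree(depth - 1))

    def evaluate(tree):
        if tree[0] == "feat":
            return X[:, tree[1]]
        if tree[0] == "not":
            return ~evaluate(tree[1])
        a, b = evaluate(tree[1]), evaluate(tree[2])
        return (a & b) if tree[0] == "and" else (a | b)

    def fitness(tree):
        # how well the created feature separates the two classes on its own
        col = evaluate(tree)
        return abs(col[y].mean() - col[~y].mean())

    population = [random_tree() for _ in range(500)]
    best = max(population, key=fitness)
    print("best created feature:", best, "fitness:", round(fitness(best), 3))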

11:10
High Probability Mutation and Error Thresholds in Genetic Algorithms

ABSTRACT. The Error Threshold is a concept from molecular biology that has been introduced into Genetic Algorithms [G. Ochoa (2006) Error Thresholds in Genetic Algorithms. Evolutionary Computation Journal, 14:2, pp 157-182, MIT Press] and has been linked to the concept of Optimal Mutation Rate. In this paper, the author expands previous work with a study of Error Thresholds near 1 (i.e., mutation probabilities of approx. 0.95), in the context of binary encoded chromosomes. Comparative empirical tests are performed, and the author draws conclusions in the context of population consensus sequences, population size, error thresholds, selection pressure and the quality of the optima found.
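
A small illustrative snippet (with invented parameters, not the paper's experimental setup) showing per-bit mutation at rates up to 0.95 and the population consensus sequence referred to in the abstract:

    import numpy as np

    rng = np.random.default_rng(2)
    pop = rng.integers(0, 2, size=(50, 30))        # 50 binary chromosomes, length 30

    def mutate(population, p):
        """Flip every bit independently with probability p (p near 1 in the study)."""
        flips = rng.random(population.shape) < p
        return population ^ flips

    def consensus(population):
        """Per-locus majority vote over the population."""
        return (population.mean(axis=0) >= 0.5).astype(int)

    for p in (0.01, 0.5, 0.95):
        mutated = mutate(pop, p)
        changed = (consensus(pop) != consensus(mutated)).mean()
        print(f"mutation rate {p}: fraction of consensus loci changed = {changed:.2f}")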

11:30
A Study on Techniques for Proactively Identifying Malicious URLs

ABSTRACT. As most malware nowadays uses the Internet as its main doorway to infect a new system, it has become imperative for security vendors to provide cloud-based solutions that can filter and block malicious URLs. This paper presents different practical considerations related to this problem. The key points that we focus on are the usage of different machine learning techniques and unsupervised learning methods for detecting malicious URLs with respect to memory footprint. The database that we have used in this paper was collected during a period of 48 weeks and consists of approximately 6,000,000 benign and malicious URLs. We also evaluated how the detection rate and false positive rate evolved during that period and drew some conclusions related to the current malware landscape and Internet attack vectors.
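
The paper's concrete models are not reproduced here; as one standard way to bound the memory footprint of a URL classifier, the following sketch uses the hashing trick over character n-grams with a linear model (the URLs and labels are made up):

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import LogisticRegression

    # made-up training sample; the real study used ~6,000,000 URLs over 48 weeks
    urls = ["example.com/index.html", "login-paypa1-secure.xyz/verify",
            "news.bbc.co.uk/article", "free-gift-cards.ru/win.exe"]
    labels = [0, 1, 0, 1]   # 0 = benign, 1 = malicious

    # the hashing trick fixes the feature-vector size regardless of vocabulary growth
    vec = HashingVectorizer(analyzer="char_wb", ngram_range=(3, 5),
                            n_features=2 ** 18, alternate_sign=False)
    X = vec.transform(urls)

    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    print(clf.predict(vec.transform(["secure-paypa1-login.xyz/update"])))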

11:50
Stock Market Trading Strategies - Applying Risk and Decision Analysis Models for Detecting Financial Turbulence
SPEAKER: Monica Tirea

ABSTRACT. Risk handling and evaluation plays an important role in optimizing an investment portfolio. This paper's goal is to describe a system that determines, classifies and handles the risk associated with any type of investment based on sentiment analysis, price movement, information related to companies, certain characteristics, the trader's confidence level, and by measuring the potential loss over a certain period of time. This research implies analyzing the trader's risk, market risk, risk associated with each evaluated company or financial group, and political and governmental risk. The system is able to create different types of portfolio options based on the investor/trader profile, which is built based on the user's tolerance to risk (determined by the results of an interactive quiz that the user must complete when entering the system). We propose a multi-agent system that uses different types of data (numerical, textual) to choose the appropriate mix of investments in order to minimize the risk and maximize the gain on a stock portfolio. In order to validate the results, a system was constructed.

12:10-13:10 Lunch Break
13:10-14:00 Session 15: Invited talk: Dynamic Programming on Tree Decomposition in Practice. Some Lessons Learned - Stefan Woltran

Stefan Woltran -   Dynamic Programming on Tree Decomposition in Practice. Some Lessons Learned

Location: A11
13:10
Dynamic Programming on Tree Decompositions in Practice -- Some Lessons Learned

ABSTRACT. Many prominent NP-hard problems have been shown tractable for
bounded treewidth. To put these results to practice, dynamic
programming (DP) via a traversal of the nodes of a tree
decomposition is the standard technique. However, the concrete
implementation of suitable efficient algorithms is often
tedious.  To this end, we have developed the D-FLAT system
which allows the user to specify the DP algorithm in a
declarative fashion and where the computation of intermediate
solutions is delegated to a logic-programming system.

In this talk, we first give a brief introduction to the
D-FLAT system and demonstrate its usage on some standard DP
algorithms.  In the second part of the talk, we reflect on some
of our experiences. In particular, we address the following
issues:
(i) The actual structure of the tree decomposition heavily
influences the performance of DP algorithms in practice; we
discuss some solutions for overcoming this problem.
(ii) DP algorithms for AI problems typically show recurring
patterns that call for tasks like subset-minimization. We
introduce a new method for obtaining DP algorithms from simpler
principles, which is also supported by D-FLAT.
(iii) We present ongoing work on an alternative version of our
approach, in which BDDs are employed as internal data structures.
This not only leads to better performance but also gives a handle
for defining alternative DP algorithms which delay the
materialization of partial solutions.
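
The talk's subject is declarative DP over tree decompositions; D-FLAT itself is not shown here. As a much-simplified illustration of the flavour of such algorithms, the following Python sketch runs the textbook dynamic program for maximum-weight independent set on a tree (the treewidth-1 special case); the example graph and weights are made up:

    def max_weight_independent_set(tree, weights, root=0):
        """tree: adjacency lists of an undirected tree; weights: node -> weight."""
        best = {}   # node -> (best weight if node excluded, best weight if included)

        def solve(node, parent):
            excluded, included = 0, weights[node]
            for child in tree[node]:
                if child == parent:
                    continue
                solve(child, node)
                child_ex, child_in = best[child]
                excluded += max(child_ex, child_in)  # child free to be in or out
                included += child_ex                 # child must be out if node is in
            best[node] = (excluded, included)

        solve(root, None)
        return max(best[root])

    tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
    weights = {0: 3, 1: 5, 2: 2, 3: 4, 4: 1}
    print(max_weight_independent_set(tree, weights))   # 8, from nodes 0, 3 and 4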

14:00-14:20 Coffee Break
14:20-15:40 Session 16A: Symbolic Computation (I)
Location: A11
14:20
Symbolic Derivation of Mean-Field PDEs from Lattice-Based Models

ABSTRACT. Transportation processes, which play a prominent role in the life and social sciences, are typically described by discrete models on lattices. For studying their dynamics a continuous formulation of the problem via partial differential equations (PDE) is employed. In this paper we propose a symbolic computation approach to derive mean-field PDEs from a lattice-based model. We start with the microscopic equations, which state the probability to find a particle at a given lattice site. Then the PDEs are formally derived by Taylor expansions of the probability densities and by passing to an appropriate limit as the time steps and the distances between lattice sites tend to zero. We present an implementation in a computer algebra system that performs this transition for a general class of models. In order to rewrite the mean-field PDEs in a conservative formulation, we adapt and implement symbolic integration methods that can handle unspecified functions in several variables. To illustrate our approach, we consider an application in crowd motion analysis where the dynamics of bidirectional flows are studied. However, the presented approach can be applied to various transportation processes of multiple species with variable size in any dimension, for example, to confirm several proposed mean-field models for cell motility.
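
A minimal sympy sketch of the kind of transition described, for the simplest case of an unbiased one-dimensional walk (an illustration only, not the paper's implementation, which handles a general class of lattice models):

    import sympy as sp

    x, t, h, tau, D = sp.symbols('x t h tau D', positive=True)
    P = sp.Function('P')

    # Microscopic equation: the probability of being at site x at time t + tau is
    # the average of the two neighbouring sites at time t (unbiased 1-D walk).
    lhs = P(x, t + tau)
    rhs = (P(x - h, t) + P(x + h, t)) / 2

    # Formal Taylor expansions in the small parameters tau and h
    lhs_series = lhs.series(tau, 0, 2).removeO()
    rhs_series = rhs.series(h, 0, 3).removeO()

    # Divide by tau and apply the diffusive scaling h**2 = 2*D*tau.
    balance = ((lhs_series - rhs_series) / tau).doit()
    balance = sp.expand(balance).subs(h, sp.sqrt(2 * D * tau))

    # The terms combine to P_t - D*P_xx = 0, i.e. the heat equation.
    print(sp.simplify(balance))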

14:40
Computation of GCD of Sparse Multivariate Polynomials by Extended Hensel Construction
SPEAKER: unknown

ABSTRACT. Let $F$ be a squarefree multivariate polynomial in main variable $x$ and subvariables $u,v,\dots$. If the leading coefficient (LC) of $F$ w.r.t. $x$ vanishes at the origin of subvariables then we say that the LC of $F$ is singular. A representative algorithm for multivariate polynomial GCD is the EZ-GCD, which is based on the generalized Hensel construction (GHC). In order to apply the GHC, $F$ must be such that 1) the LC of $F$ is non-singular and 2) the initial Hensel factor of the GCD is ``lucky''. These requirements are usually satisfied by ``nonzero substitution'', i.e., shifting the origin of the subvariables. However, the origin shifting may cause a drastic increase in the number of terms of $F$ if $F$ is sparse. In 1993, Sasaki and Kako proposed the extended Hensel construction (EHC), which is the Hensel construction for multivariate polynomials with singular LC. Using the EHC, Inaba implemented an algorithm for multivariate polynomial factorization and verified that it is very useful for sparse polynomials. In this paper, we apply the EHC to the computation of the GCD of sparse multivariate polynomials. In order to find a lucky initial factor, we utilize the weighting of subvariables, etc. We report results of preliminary experiments which show that our method is promising.

15:00
Lagrange Inversion and series for Lambert W
SPEAKER: unknown

ABSTRACT. We show that Lagrange inversion can be used to obtain closed-form expressions for a number of series expansions of the Lambert $W$ function. Equivalently, we obtain expressions for the $n$th derivative. Various integer sequences related to the series expansions can now be expressed in closed form.
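
A short sympy check of the classical computation the abstract builds on (not the paper's own derivations): applying Lagrange inversion to z = w*exp(w) reproduces the series coefficients (-n)^(n-1)/n! of the principal branch of Lambert W:

    import sympy as sp

    w, z = sp.symbols('w z')
    N = 6

    # Lagrange inversion: if z = f(w), f(0) = 0, f'(0) != 0, then the inverse
    # series w = W(z) has coefficients  [z^n] W = (1/n) * [w^(n-1)] (w/f(w))**n.
    f = w * sp.exp(w)                     # Lambert W inverts z = w*exp(w)

    coeffs = []
    for n in range(1, N + 1):
        g = sp.series((w / f) ** n, w, 0, n).removeO()    # = exp(-n*w), truncated
        coeffs.append(sp.Rational(1, n) * g.coeff(w, n - 1))

    print(coeffs)                                   # [1, -1, 3/2, -8/3, 125/24, -54/5]
    print([sp.Integer((-n) ** (n - 1)) / sp.factorial(n) for n in range(1, N + 1)])
    print(sp.series(sp.LambertW(z), z, 0, N + 1))   # agrees term by term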

15:20
Towards an automatic tool for multi-scale model derivation illustrated with a micro-mirror array
SPEAKER: Walid Belkhir

ABSTRACT. This paper reports recent advances in the development of a symbolic asymptotic modeling software package, called MEMSALab, which will be used for the automatic generation of asymptotic models for arrays of micro- and nanosystems. The main purpose of this software is to construct models incrementally so that model features can be included step by step. This idea, conceptualized under the name "by extension-combination", is presented for the first time, after a recall of the general functioning principles. A user-friendly language recently introduced is also briefly discussed. We illustrate the mathematical operations that need to be implemented in MEMSALab with an example of an asymptotic model for the stationary heat equation in a Micro-Mirror Array developed for astrophysics.

15:40-16:00 Coffee Break
16:00-17:00 Session 17A: Symbolic Computation (II) and Logic and Programming (III)
Location: A11
16:00
Computation of Stirling numbers and generalizations
SPEAKER: unknown

ABSTRACT. We consider the computation of Stirling numbers for positive and negative arguments. We describe a new, more efficient computational scheme for Stirling cycle numbers than the one presently implemented in (for example) \textsc{Maple}. We also discuss generalizations of Stirling numbers, specifically those due to Flajolet and Prodinger. It becomes possible to evaluate Stirling numbers for negative arguments. The question of the value at the origin is also addressed: the point is a singular one, and different possibilities are discussed.
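
The paper's improved scheme and its extension to negative arguments are not reproduced here; for reference, a memoized implementation of the standard recurrences for Stirling cycle and subset numbers with non-negative arguments:

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def stirling_cycle(n, k):
        """Unsigned Stirling numbers of the first kind: permutations of n with k cycles."""
        if n == k:
            return 1
        if n == 0 or k == 0:
            return 0
        return stirling_cycle(n - 1, k - 1) + (n - 1) * stirling_cycle(n - 1, k)

    @lru_cache(maxsize=None)
    def stirling_subset(n, k):
        """Stirling numbers of the second kind: partitions of an n-set into k blocks."""
        if n == k:
            return 1
        if n == 0 or k == 0:
            return 0
        return stirling_subset(n - 1, k - 1) + k * stirling_subset(n - 1, k)

    print([stirling_cycle(5, k) for k in range(1, 6)])    # [24, 50, 35, 10, 1]
    print([stirling_subset(5, k) for k in range(1, 6)])   # [1, 15, 25, 10, 1]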

16:20
Solving SAT by an Iterative Version of the Inclusion-Exclusion Principle
SPEAKER: Gabor Kusper

ABSTRACT. Our goal is to present a basic, novel, and correct SAT solver algorithm; show its soundness; compare it with a standard SAT solver; and give some ideas about the cases in which it might be competitive. We do not present a fine-tuned, state-of-the-art SAT solver, only a new basic algorithm. So we introduce \texttt{CCC}, a SAT solver algorithm which is an iterative version of the inclusion-exclusion principle. \texttt{CCC} stands for \texttt{C}ounting \texttt{C}lear \texttt{C}lauses. It counts those full-length (in our terminology: clear) clauses which are subsumed by the input SAT problem. Full-length clauses are $n$-clauses, where $n$ is the number of variables in the input problem. A SAT problem is satisfiable if it does not subsume all $n$-clauses. The idea is that in an $n$-clause each of the $n$ variables is present either as a positive literal or as a negative one, so we can represent such a clause by $n$ bits. \texttt{CCC} is motivated by the inclusion-exclusion principle: it counts full-length clauses as the principle does in the case of the SAT problem, but in an iterative way. It works in the following way: It sets its counter to $0$. It converts $0$ to an $n$-clause, which is the one with only negative literals. It checks whether this $n$-clause is subsumed by the input SAT problem. If yes, it increases the counter and repeats the loop. If not, we have a model, which is given by the negation of this $n$-clause. We show that almost always we can increase the counter by more than one. We show that this algorithm always stops and finds a model if there is one. We present a worst case time complexity analysis and a lot of test results. The test results show that this basic algorithm can outperform a standard SAT solver, although its implementation is very simple and without any optimization. \texttt{CCC} is competitive if the input problem contains a lot of short clauses. Our implementation can be downloaded and the reader is welcome to make a better solver out of it. We believe that this new algorithm could serve as a good basis for parallel algorithms, because its memory usage is constant and no communication is needed between the nodes.
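
A compact Python sketch of the counting loop as described above, with naive counter increments (the paper's algorithm can usually increase the counter by more than one) and an illustrative bit encoding of clear clauses:

    def ccc_sat(clauses, n):
        """Toy version of the CCC loop: enumerate full-length ("clear") clauses via a
        counter and look for one that is not subsumed by the input.
        clauses: list of sets of DIMACS-style literals over variables 1..n."""
        for counter in range(2 ** n):
            # Counter -> n-clause: bit i = 0 gives literal -(i+1), bit i = 1 gives +(i+1).
            clear_clause = {(i + 1) if (counter >> i) & 1 else -(i + 1) for i in range(n)}
            # The clear clause is subsumed if some input clause is a subset of it.
            if any(c <= clear_clause for c in clauses):
                continue    # counted; the paper shows one can often jump ahead by more
            # Not subsumed: the negation of this n-clause is a model.
            return {v: ((counter >> (v - 1)) & 1) == 0 for v in range(1, n + 1)}
        return None         # every clear clause is subsumed: unsatisfiable

    # (x1 or x2) and (not x1 or x2) and (x1 or not x2)
    print(ccc_sat([{1, 2}, {-1, 2}, {1, -2}], 2))   # {1: True, 2: True}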

17:00-17:20 Coffee Break
17:20-18:40 Session 18A: Artificial Intelligence (III)
Location: A11
17:20
Automatic Language Identification for Romance Languages using Stop Words and Diacritics

ABSTRACT. Automatic language identification is a natural language processing problem that tries to determine the natural language of given content. In this paper we present a statistical method for automatic language identification of written text using dictionaries containing stop words and diacritics. We propose different approaches that combine the two dictionaries to accurately determine the language of textual corpora. This method was chosen because stop words and diacritics are very specific to a language; although some languages share similar words and special characters, not all of them are common. The languages taken into account were Romance languages, because they are very similar and it is usually hard to distinguish between them from a computational point of view. We have tested our method using a Twitter corpus and a news article corpus. Both corpora consist of UTF-8 encoded text, so the diacritics could be taken into account; in the case that the text has no diacritics, only the stop words are used to determine the language of the text. The experimental results show that the proposed method has an accuracy of over 90% for small texts and over 99.8% for large texts.
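
A toy illustration of the scoring idea with tiny, deliberately incomplete dictionaries (the real method uses full stop-word and diacritic lists for each Romance language and combines the two dictionaries in several ways):

    # Tiny, incomplete dictionaries -- for illustration only.
    STOP_WORDS = {
        "romanian": {"și", "de", "la", "cu", "este"},
        "french":   {"le", "la", "de", "et", "est"},
        "spanish":  {"el", "la", "de", "que", "es"},
        "italian":  {"il", "la", "di", "che", "è"},
    }
    DIACRITICS = {
        "romanian": set("ăâîșț"),
        "french":   set("éèêàçù"),
        "spanish":  set("ñáéíóú"),
        "italian":  set("àèéìòù"),
    }

    def identify_language(text, w_stop=1.0, w_diac=1.0):
        words = text.lower().split()
        scores = {}
        for lang in STOP_WORDS:
            stop_hits = sum(1 for w in words if w in STOP_WORDS[lang])
            diac_hits = sum(1 for ch in text.lower() if ch in DIACRITICS[lang])
            scores[lang] = w_stop * stop_hits + w_diac * diac_hits
        return max(scores, key=scores.get), scores

    print(identify_language("aceasta este o știre despre vreme și trafic"))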

17:40
Improving malware detection response time with behavior-based statistical analysis techniques

ABSTRACT. Detection of malicious software is a current problem for which there are several approaches, among them signature-based detection, heuristic detection and behavioral analysis. In the last year the number of malicious files has increased exponentially. At the same time, automated obfuscation methods (used to generate malicious files with similar behavior and a different aspect) have grown significantly. In response to these new obfuscation methods, many security vendors have introduced file reputation techniques to quickly find out potentially clean and malicious samples. In this paper we present a statistics-based method that can be used to identify a specific dynamic behavior of a program. The main idea behind this solution is to analyze the execution flow of every file and to extract sequences of native system functions with a potentially malign outcome. This technique is reliable against most forms of malware polymorphism and is intended to work as a filtering system for different automated detection systems. We use a database consisting of approximately 50,000 malicious files gathered over the last three months and almost 3,000,000 clean files collected over a period of three years. Our technique proved to be an effective filtering method and helped us improve our detection response time against the most prevalent malware families discovered in the last year.
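
The paper's statistical model is not detailed in the abstract; as a minimal sketch of the general approach (sequences of native system calls as behavioural features), the following extracts call n-grams from a trace and scores their overlap with previously seen malicious n-grams; the call names, the n-gram set and the score are made up:

    from collections import Counter

    def call_ngrams(trace, n=3):
        """Sliding-window n-grams over a sequence of native API call names."""
        return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))

    # n-grams previously extracted from known-malicious executions (invented here)
    MALICIOUS_NGRAMS = {
        ("NtCreateFile", "NtWriteFile", "NtSetValueKey"),
        ("NtOpenProcess", "NtAllocateVirtualMemory", "NtWriteVirtualMemory"),
    }

    def behaviour_score(trace, n=3):
        grams = call_ngrams(trace, n)
        hits = sum(count for gram, count in grams.items() if gram in MALICIOUS_NGRAMS)
        return hits / max(1, sum(grams.values()))   # fraction of suspicious n-grams

    trace = ["NtOpenProcess", "NtAllocateVirtualMemory", "NtWriteVirtualMemory",
             "NtCreateThreadEx", "NtClose"]
    print(behaviour_score(trace))   # 0.33... -> pass on for deeper automated analysis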

18:00
Business reviews classification using sentiment analysis

ABSTRACT. The research area of sentiment analysis, also known as opinion mining, sentiment mining, or sentiment extraction, has been gaining popularity in recent years. Online reviews are becoming very important in measuring the quality of a business. This paper presents a sentiment analysis approach to business review classification using a large review dataset provided by Yelp: the Yelp Challenge dataset. In this work we propose several approaches for automatic sentiment classification, using two feature extraction methods and four machine learning models, and illustrate a comparative study on the effectiveness of ensemble methods for review sentiment classification.
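
A minimal scikit-learn sketch of the kind of pipeline the abstract describes (TF-IDF features and a hard-voting ensemble of several classifiers), with a made-up toy corpus standing in for the Yelp data; the concrete feature extractors and models used in the paper may differ:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.svm import LinearSVC
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.pipeline import make_pipeline

    # toy stand-in for the Yelp reviews
    reviews = ["great food and friendly staff", "terrible service, will not return",
               "loved the atmosphere", "overpriced and bland"]
    labels = [1, 0, 1, 0]   # 1 = positive, 0 = negative

    ensemble = VotingClassifier(estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", MultinomialNB()),
        ("svm", LinearSVC()),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ], voting="hard")

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), ensemble)
    model.fit(reviews, labels)
    print(model.predict(["the staff was friendly and the food was great"]))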

18:20
Comparative Analysis of Existing Architectures for General Game Agents

ABSTRACT. This paper addresses the development of general purpose game agents able to learn a vast number of games using the same architecture. The article analyzes the main existing approaches to general game playing and reviews their results. Methods such as deep learning, reinforcement learning and evolutionary algorithms are considered for this problem. The testing platform is the popular video game console Atari 2600. Research into developing general purpose agents for games is closely related to achieving artificial general intelligence (AGI).