SYNASC2022: 24TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING
PROGRAM FOR MONDAY, SEPTEMBER 12TH

09:20-10:50 Session 2: Tutorial: Quantified Boolean Formulas - Martina Seidl

Tutorial: Quantified Boolean Formulas

Abstract:

Many application problems from artificial intelligence, verification, and formal synthesis can be efficiently encoded as quantified Boolean formulas (QBFs), the extension of propositional logic with quantifiers. This extension makes the QBF decision problem PSPACE-complete (in contrast to SAT, which is NP-complete). Over the last few years, much progress has been made concerning the theory and practice of QBF research. In this tutorial, we review the state of the art of QBF technology and show how to perform automated reasoning with QBFs.

Location: Room 1
10:50-11:10 Coffee Break
11:10-12:00 Session 3: Invited talk: Reasoning with SAT and Beyond - Martina Seidl

Invited talk: Reasoning with SAT and Beyond

Abstract:

One of the biggest success stories of symbolic artificial intelligence is written by SAT solvers, i.e., by tools that answer the question of whether a given propositional formula is satisfiable. Despite the NP-hardness of SAT, modern solvers are able to decide huge formula instances from industrial application domains like formal verification. The performance boost of SAT solvers seen over the last 20 years became possible only by cleverly exploiting the problem structure of the formula, by tightly integrating powerful search and inference techniques, and by advanced heuristics. Because of their power, SAT solvers are nowadays also the backend reasoning engines for solving decision problems beyond NP, such as QBF, SMT, and MaxSAT.

In this talk, we briefly trace the story of SAT solving and show how SAT helps to solve problems beyond NP. We identify open challenges related to other branches of symbolic reasoning and artificial intelligence.

Location: Room 1
12:00-13:30 Lunch Break
13:30-15:10 Session 4: Symbolic Computation Session (1)
Location: Room 1
13:30
F-polynomials and Newton polytopes

ABSTRACT. We provide an effective algorithmic method for computing the Gross-Keel-Hacking-Kontsevich potential, F-polynomials and the Bernstein-Kazhdan decoration function, together with its complexity bounds. For simply laced Lie algebras we make a conjecture, and provide experimental evidence, that the Newton polytopes of the Gross-Keel-Hacking-Kontsevich potential do not contain any interior lattice points.

13:50
Homotopy Techniques for Analytic Combinatorics in Several Variables
PRESENTER: Kisun Lee

ABSTRACT. We combine tools from homotopy continuation solvers with the methods of analytic combinatorics in several variables to give the first practical algorithm and implementation for the asymptotics of multivariate rational generating functions that does not rely on a `combinatorial' non-negativity assumption, which cannot be checked algorithmically. Our homotopy implementation terminates on examples from the literature in three variables, and we additionally describe heuristic methods that terminate and correctly predict asymptotic behaviour in reasonable time on examples in even higher dimensions. Our results are implemented in Julia, using the HomotopyContinuation.jl package, and we provide a selection of examples and benchmarks.
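
For readers new to the underlying numerics, the Python sketch below shows the basic predictor-corrector idea behind homotopy continuation on a single univariate polynomial. The target polynomial, step count and gamma constant are illustrative choices only; the paper's actual implementation builds on the Julia package HomotopyContinuation.jl and works with full ACSV systems.

import numpy as np

# Target system F(x) = x^3 - 2x + 1 and start system G(x) = x^3 - 1,
# whose roots (the cube roots of unity) are known in closed form.
F  = lambda x: x**3 - 2*x + 1
dF = lambda x: 3*x**2 - 2
G  = lambda x: x**3 - 1
dG = lambda x: 3*x**2

gamma = np.exp(1j * 0.7)          # random complex constant ("gamma trick")

def H(x, t):  return (1 - t) * gamma * G(x) + t * F(x)
def Hx(x, t): return (1 - t) * gamma * dG(x) + t * dF(x)
def Ht(x, t): return F(x) - gamma * G(x)

def track(x, steps=200):
    """Follow one start root along the homotopy from t = 0 to t = 1."""
    for k in range(steps):
        t, dt = k / steps, 1.0 / steps
        x = x - dt * Ht(x, t) / Hx(x, t)          # Euler predictor
        for _ in range(3):                        # Newton corrector at t + dt
            x = x - H(x, t + dt) / Hx(x, t + dt)
    return x

starts = [np.exp(2j * np.pi * k / 3) for k in range(3)]   # roots of G
roots = [track(x0) for x0 in starts]
print(np.round(roots, 6))           # approximations of the roots of F
print([abs(F(r)) for r in roots])   # residuals should be tiny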

14:10
Detecting Implicit Indeterminates in Symbolic Computation

ABSTRACT. In the design of symbolic mathematical computation systems, it is a popular choice to use the same syntax for both mathematical indeterminates and programming variables. While mathematical indeterminates are used without specific values, programming variables must be initialized before use. A problem occurs when programming variables that are mistakenly left uninitialized are silently taken to be mathematical indeterminates. This article explores how this problem can arise and its consequences. An algorithm to analyze programs for this defect is presented, along with a Maple implementation.
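
As a rough illustration of the kind of static check the abstract describes (the paper targets Maple, where an unassigned name silently becomes an indeterminate; in Python the analogue is a name that is read but never bound), the sketch below scans a snippet for such names using the standard ast module. The helper name and the example snippet are invented for illustration and are not from the paper.

import ast
import builtins

def unbound_names(src):
    """Names read somewhere in `src` but never assigned, imported, defined,
    or bound as a parameter (a rough proxy for the defect in the abstract)."""
    tree = ast.parse(src)
    assigned, loaded = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
        elif isinstance(node, ast.arg):
            assigned.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            assigned.update(a.asname or a.name.split('.')[0] for a in node.names)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            assigned.add(node.name)
    return loaded - assigned - set(dir(builtins))

example = """
def area(r):
    pi_approx = 3.14159
    return pi_aprox * r ** 2   # typo: 'pi_aprox' was never assigned
"""
print(unbound_names(example))   # -> {'pi_aprox'}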

14:30
Balanced Dense Multivariate Multiplication: The General Case

ABSTRACT. Multiplication of dense polynomials is a core operation in most algorithms dealing with algebraic computations. Considering only the cost of arithmetic operations performed on the coefficients, any algorithm for polynomial multiplication has super-linear time complexity. For polynomials of moderate size, this time grows quadratically with the size of the input, while for larger sizes, algorithms based on Fast Fourier Transforms (FFTs) are employed. However, in practice, the cost of accessing memory is also significant; this cost depends not only on the size of the input data, but also on the “shape” of the data representing the input polynomials. As with matrices in linear algebra, shape refers to the distribution of the non-zero coefficients; specifically to polynomial algebra, it also refers to the number of variables and the distribution of their partial degrees. Polynomials that are bivariate and balanced (that is, with equal partial degrees) have been shown to maximize parallelism and minimize memory access cost for multiplication based on FFT techniques. In addition, the authors proposed a strategy for reducing multiplication of multivariate (or univariate) dense polynomials to multiplication of a pair of bivariate and balanced polynomials. This strategy reduces cache misses and increases parallelism while increasing work only by a small constant factor, typically 2. However, the strategy of that previous work makes some genericity assumptions (referred to as the shape lemma) on the distribution of the partial degrees of the input polynomials. With this paper, we relax the assumptions of the previous work. As a result, we obtain general preprocessing techniques to reshape the input polynomials in order to minimize the cost of memory access, while preserving sufficient parallelism, so as to reduce the running time of polynomial multiplication in multi-threaded implementations.
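
The core idea of trading variables for degree can be illustrated with a toy Kronecker-style packing in SymPy: two trivariate polynomials are mapped to bivariate ones, multiplied, and the product is unpacked. This is only a simplified illustration of variable packing under a degree bound, not the balanced bivariate reduction or the preprocessing techniques of the paper.

import sympy as sp

x, y, z, u = sp.symbols('x y z u')

# Two small trivariate polynomials to multiply.
f = 1 + x + 2*x*y + y**2 + z + x*z
g = 3 + y + x**2 + y*z

# Degree bound for x in the product; pack (x, y) into the single variable u.
Dx = sp.degree(f, x) + sp.degree(g, x)      # here: 1 + 2 = 3
shift = Dx + 1
pack = {x: u, y: u**shift}

# Multiply the packed (bivariate, in u and z) images instead.
hp = sp.expand(f.subs(pack) * g.subs(pack))

# Unpack: every u-exponent e splits uniquely as e = ex + ey*shift with ex <= Dx.
h = 0
for (e_u, e_z), coeff in sp.Poly(hp, u, z).terms():
    h += coeff * x**(e_u % shift) * y**(e_u // shift) * z**e_z

assert sp.expand(h - f*g) == 0              # same product, fewer variables
print(sp.expand(h))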

14:50
A web version of TARSKI, a system for computing with Tarski formulas and semialgebraic sets
PRESENTER: Robert Vajda

ABSTRACT. We report on successfully porting TARSKI, a system for computing with Tarski formulas (Boolean combinations of polynomial sign conditions over the real numbers), and making it available online. TARSKI provides operations like formula simplification and quantifier elimination, and is able to read and write SMT-LIB syntax. The web version of TARSKI was created using the Emscripten C++-to-JavaScript/WebAssembly compiler. A terminal-based interpreter is provided via jquery.terminal. TARSKI is also available as a JavaScript library. We report on an experimental use in the dynamic geometry software GeoGebra.

15:10-15:30 Coffee Break
15:30-16:50 Session 5A: Symbolic Computation Session (2) + Numerical Computing Session
Location: Room 1
15:30
Artificial Conflict Sampling for Real Satisfiability Problems

ABSTRACT. We outline some preliminary ideas on a guided theory assignment of variables in a satisfiability problem. One objective of this approach is to mix the top-down approach of cylindrical algebraic decomposition and the bottom-up approach of partial theory assignments of modern SAT/SMT solvers. We use equational constraints and a single strict inequality at a time to artificially create conflicting variable assignment traces, which can later be used in conflict resolution to enrich the constraints of the original satisfiability problem.

15:50
Algorithm for intersecting symbolic and approximate linear differential varieties
PRESENTER: Siyuan Deng

ABSTRACT. This article provides algorithms for systems of approximate linear partial differential equations that exploit exact subsystems. Such exact systems have rational function coefficients over Q and can be reduced to forms (e.g. differential Gröbner bases) by a finite number of differentiations and eliminations using available computer implementations. We use the rifsimp algorithm in Maple for this purpose. Such algorithms use solvers based on orderings (rankings) of their derivatives, are coordinate dependent, and are prone to instability when applied to approximate input. In contrast, our Geometric Involutive Form algorithm uses a sequence of geometric differentiations (prolongations) and projections to complete approximate linear systems to geometric involutive form. In particular, it uses numerical linear algebra (especially the SVD) to monitor dimension criteria for termination. However, this latter method can be expensive, as the size of the matrices rapidly increases with the number of variables and the order of derivatives. Approximate differential systems in applications often have exact subsystems, and this motivated us to develop the hybrid method described in this article. The first step of the method is to partition the input into an exact subsystem and an approximate subsystem. The exact subsystem is reduced using the rifsimp algorithm and used to simplify the approximate system. The partition, reduction and simplification steps are repeated until no new exact equations are found. Then the reduced exact system is used to simplify prolongations of the approximate subsystem. Checking that the jointly prolonged system is geometrically involutive is done by computing dimension criteria of the simplified prolonged approximate system and using the differential Hilbert function of the reduced exact system. Our algorithm is illustrated by the determination of approximate symmetry properties of a gravitational potential for a gaseous cloud. It enables a significant reduction of the size of the coefficient matrices of prolongations involved in numerical computations compared to our previous approach.

16:10
A Neural Network based approach to find solutions to diffusion equations

ABSTRACT. In this research, the authors experiment with the Forward Euler approach and a Neural Network approach to solving the diffusion equation. A diffusion equation relates the rate of change over time of a quantity, such as the concentration of a chemical species at a given point, to its variation in space. The neural network approach is a non-linear model built from a series of neurons, each with its own weight vector and bias. The performance is compared with results obtained using finite difference methods. Future work would include investigating how the neural network performs for different learning rates and numbers of iterations.
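
For reference, the Forward Euler (explicit finite-difference) scheme mentioned above can be written in a few lines of NumPy for the one-dimensional diffusion equation u_t = D u_xx; the grid, initial condition and time step below are illustrative choices, not those of the paper.

import numpy as np

D, L, nx = 1.0, 1.0, 101               # diffusivity, domain length, grid points
dx = L / (nx - 1)
dt = 0.4 * dx**2 / D                   # respects the stability limit dt <= dx^2 / (2 D)
x = np.linspace(0.0, L, nx)

u = np.exp(-100 * (x - 0.5)**2)        # initial condition: a narrow Gaussian pulse
u[0] = u[-1] = 0.0                     # homogeneous Dirichlet boundaries

for _ in range(500):                   # Forward Euler time stepping
    u[1:-1] += dt * D * (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2

print(u.max())                         # the pulse spreads out and decays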

16:30
Extreme Points of the Unit Ball B4 in a Space of Real Polynomials of Degree at most Four with the Supremum Norm

ABSTRACT. Denote by $P_n$ the space of real polynomials $p$ of degree at most $n$ equipped with the sup norm on the interval $I=[-1,1]$. Let $B_n=\{p \in P_n: ||p||_{\infty, I}\le1\}$ be the unit ball and let $EB_n$ denote the set of extreme points of $B_n$.

In this work we are concerned with the explicit description of $EB_4$. The sets $EB_0, EB_1, EB_2, EB_3$ have already been characterized but, to the best of our knowledge, a complete description of $EB_4$ has not yet been given. We achieve our characterization with the aid of symbolic computation.

15:30-16:50 Session 5B: Workshop on Natural Computing and Applications
Location: Room 2
15:30
Organic Structures Emerging From Bio-Inspired Graph-Rewriting Automata

ABSTRACT. Graph-Rewriting Automata (GRA) are an extension of Cellular Automata to a dynamic structure using local graph-rewriting rules. This work introduces linear-algebra-based tools that allow for a practical investigation of their behavior over deeply extended time scales. A natural subset of GRA is explored in different ways, thereby demonstrating the benefits of this method. Some elements of the subset were discovered to create chaotic patterns of growth and others to generate organic-looking graph structures. These phenomena suggest a strong relevance of GRA to the modeling of complex natural systems. The approach presented here can easily be adapted to a wide range of GRA beyond the chosen subset.

15:50
The XGBoost Tuning by Improved Firefly Algorithm for Network Intrusion Detection

ABSTRACT. The research proposed in this article presents a novel improved version of the widely adopted firefly algorithm and its application to tuning the eXtreme Gradient Boosting (XGBoost) hyper-parameters for network intrusion detection. One of the greatest issues in the domain of network intrusion detection systems is the relatively high false positive and false negative rates. The proposed study addresses this challenge by using XGBoost optimized with the enhanced firefly algorithm. The devised method was tested on the recent UNSW-NB15 benchmark dataset for network intrusion detection. The results achieved by the proposed method were compared to those obtained by standard machine learning methods, as well as to XGBoost models tuned by other swarm algorithms. The reported comparative analysis shows that the proposed metaheuristic has significant potential in tackling the machine learning hyper-parameter optimization challenge and that it can be used to improve classification accuracy, precision, recall, F1-score and area under the receiver operating characteristic curve on network intrusion detection datasets.
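
As an illustration of the general workflow (not the improved firefly variant proposed in the paper), the sketch below tunes a few XGBoost hyper-parameters with the standard firefly update on a synthetic classification task; it assumes the xgboost and scikit-learn packages are available, and the search space and parameter values are arbitrary choices.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Search space: (learning_rate, max_depth, subsample), each scaled to [0, 1].
lo = np.array([0.01, 2.0, 0.5])
hi = np.array([0.30, 10.0, 1.0])

def fitness(p):
    lr, depth, sub = lo + p * (hi - lo)
    model = XGBClassifier(n_estimators=50, learning_rate=lr,
                          max_depth=int(round(depth)), subsample=sub,
                          eval_metric="logloss")
    return cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()

# Plain firefly update: each firefly moves towards every brighter firefly.
n, iters, beta0, gamma, alpha = 5, 5, 1.0, 1.0, 0.1
pos = rng.random((n, 3))
light = np.array([fitness(p) for p in pos])

for _ in range(iters):
    for i in range(n):
        for j in range(n):
            if light[j] > light[i]:
                r2 = np.sum((pos[i] - pos[j]) ** 2)
                pos[i] += (beta0 * np.exp(-gamma * r2) * (pos[j] - pos[i])
                           + alpha * (rng.random(3) - 0.5))
                pos[i] = np.clip(pos[i], 0.0, 1.0)
                light[i] = fitness(pos[i])

best = pos[np.argmax(light)]
print("best AUC %.4f at" % light.max(), lo + best * (hi - lo))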

16:10
An Evaluation of Image Texture Descriptors and their Invariant Properties

ABSTRACT. Image processing applications include image classification, image segmentation, image synthesis and many others. Each such task depends on extracting an effective set of features to characterize the images, and texture analysis has proven to yield some of the most valuable features. For this reason, image texture analysis has been an actively researched topic and numerous methods have been proposed, each with its advantages and limitations. In practical applications, it is impossible to ensure that all images have the same scale, rotation, viewpoint, etc., so texture analysis methods should ideally be invariant. This study inspects the most commonly used texture descriptors, tests their accuracy in classifying the Kylberg texture dataset, and evaluates their invariant properties by means of various synthetically transformed images. Building on this analysis, in future work we would like to formulate improvements to existing descriptors, to increase their accuracy and to make them invariant to a larger set of transformations.

16:30
On Using Perceptual Loss within the U-Net Architecture for the Semantic Inpainting of Textile Artefacts with Traditional Motifs

ABSTRACT. It is impressive to see an artefact hundreds or thousands of years old exhibited in a museum, its appearance seemingly untouched by the centuries. Its restoration has been in the hands of a multidisciplinary team of experts and it has undergone a series of complex procedures. To this end, computational approaches that can support the decision on the most visually appropriate inpainting for severely degraded historical items would be helpful as an objective second opinion for the restorers. The present paper thus attempts to put forward a U-Net approach with a perceptual loss for the semantic inpainting of traditional Romanian vests. Images of pieces from the collection of the Oltenia Museum in Craiova, along with images of such garments from the Internet, have been given to the deep learning model. The resulting numerical error for inpainting the corrupted parts is adequately low; however, the visual similarity still has to be improved by exploring further possibilities for finer tuning.
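
A perceptual (feature-reconstruction) loss of the kind referred to above is commonly built from pretrained VGG features; the PyTorch sketch below shows one generic way to do this. The layer cut-off, loss weights and choice of VGG16 are assumptions for illustration, not the exact configuration of the paper, and in real use the inputs would first be resized and ImageNet-normalized.

import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(nn.Module):
    """MSE between VGG16 feature maps of prediction and target
    (a feature-reconstruction / 'perceptual' loss)."""
    def __init__(self, cut=16):
        super().__init__()
        # Downloads pretrained ImageNet weights on first use.
        features = vgg16(weights=VGG16_Weights.DEFAULT).features[:cut].eval()
        for p in features.parameters():
            p.requires_grad_(False)          # the feature extractor stays frozen
        self.features = features
        self.mse = nn.MSELoss()

    def forward(self, pred, target):
        return self.mse(self.features(pred), self.features(target))

# Typical use inside a training loop: combine with a pixel-wise term.
if __name__ == "__main__":
    loss_fn = PerceptualLoss()
    pred   = torch.rand(2, 3, 224, 224)      # U-Net output (batch of RGB images)
    target = torch.rand(2, 3, 224, 224)      # ground-truth images
    total = nn.functional.l1_loss(pred, target) + 0.1 * loss_fn(pred, target)
    print(total.item())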

16:50-17:10 Coffee Break
17:10-19:10 Session 6A: Special Session on Symbolic Regression
Location: Room 1
17:10
Symbolic Regression with augmented dataset using RuleFit

ABSTRACT. Symbolic Regression models are often associated with transparency and interpretability. The main motivation is their ability to describe nonlinear models while balancing accuracy and conciseness. But, in practice, Symbolic Regression may generate models that are as hard to understand as opaque models. From another perspective, linear models are guaranteed to be transparent but fail to model nonlinearities. The RuleFit algorithm uses a tree-based nonlinear model to create meta-features that augment the dataset, increasing the accuracy of linear models while maintaining their transparency. In this paper we test whether this augmented dataset can help Symbolic Regression models find more transparent models without reducing the overall accuracy. The results indicate that the augmented models have slightly better accuracy on a class of benchmarks while keeping the expression size small and closer to a linear model. As a caveat, the models also tend to become closer to a step function, which limits the interpretability of the studied phenomena.

17:30
Comparison of OLS and NLS to fit Transformation-Interaction-Rational expressions

ABSTRACT. Transformation-Interaction-Rational is a representation for Symbolic Regression created with the intent of constraining the search space of mathematical expressions to simple models. One of the benefits of this representation is that, under certain conditions, the numerical coefficients can be fitted using the Ordinary Least Squares (OLS) method. The conditions are that either the invertible function is the identity or the training dataset is noiseless. A noisy dataset may bias the optimization of the numerical parameters towards a wrong solution. In this work, we bring evidence that OLS is actually robust to noise and returns a solution similar to that of a nonlinear search.
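
The contrast between the two fitting strategies can be illustrated on a toy model that is linear in its coefficients once fixed transformations are applied; the model, noise level and data below are invented for illustration and are not the TIR representation itself.

import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
x1, x2 = rng.uniform(1, 5, 200), rng.uniform(0, 2, 200)
y = 1.5 * np.log(x1) + 0.8 * x1 * x2 - 2.0
y_noisy = y + rng.normal(0, 0.2, y.size)          # noisy training targets

# Fixed transformed/interaction features -> the model is linear in its coefficients.
Z = np.column_stack([np.log(x1), x1 * x2, np.ones_like(x1)])

# (1) Ordinary least squares: a single linear-algebra call.
w_ols, *_ = np.linalg.lstsq(Z, y_noisy, rcond=None)

# (2) Generic nonlinear least squares on the same residuals.
res = least_squares(lambda w: Z @ w - y_noisy, x0=np.zeros(3))

print("OLS coefficients:", np.round(w_ols, 3))
print("NLS coefficients:", np.round(res.x, 3))    # should agree closely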

17:50
Steel Phase Kinetics Modeling using Symbolic Regression
PRESENTER: David Piringer

ABSTRACT. We describe an approach for empirical modeling of steel phase kinetics based on symbolic regression and genetic programming. The algorithm takes processed data gathered from dilatometer measurements and produces a system of differential equations that models the phase kinetics. Our initial results demonstrate that the proposed approach makes it possible to identify compact differential equations that fit the data. The model predicts ferrite, pearlite and bainite formation for a single steel type. Martensite is not yet included in the model. Future work shall incorporate martensite and generalize to multiple steel types with different chemical compositions.

18:10
Identification of Discrete Non-Linear Dynamics of a Radio-Frequency Power Amplifier Circuit using Symbolic Regression

ABSTRACT. The identification of non-linearities or undesirable dynamic behavior of electrical components, in order to create realistic models for circuit simulators, is a common problem. Previous modeling approaches are largely based on extensive physical knowledge at the semiconductor level, which has produced reliable solutions over the past decades. This, however, implies the measurement of physical prototypes in laboratories, which can be costly. It is therefore desirable to have reliable software models of the prototypes available so that this procedure can be outsourced to simulators. This paper presents a number of solutions from the field of empirical modeling, including symbolic regression, which allow such models to be parameterized from measured values. As an example, we use time-domain data from a real radio-frequency power amplifier circuit. We compare a Hammerstein-Wiener model with two methods for symbolic regression, and find that the Hammerstein-Wiener model produces the best predictions but has many non-zero coefficients. Both symbolic regression methods produce short linear models with slightly higher prediction error than the HW model.

18:30
Local Optimization Often is Ill-conditioned in Genetic Programming for Symbolic Regression

ABSTRACT. Gradient-based local optimization has been shown to improve the results of genetic programming (GP) for symbolic regression. Several state-of-the-art GP implementations use iterative nonlinear least squares (NLS) algorithms such as the Levenberg-Marquardt algorithm for local optimization. The effectiveness of NLS algorithms depends on appropriate scaling and conditioning of the optimization problem. This has so far been ignored in the symbolic regression and GP literature. In this study we use a singular value decomposition of NLS Jacobian matrices to determine the numeric rank and the condition number. We perform experiments with a GP implementation and six different benchmark datasets. Our results show that rank-deficient and ill-conditioned Jacobian matrices occur frequently and for all datasets. The issue is less extreme when restricting the GP tree size and when using many non-linear functions in the function set.
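
The diagnostic described above reduces to a few NumPy calls; the sketch below computes the numeric rank and condition number of a Jacobian from its singular values. The Jacobian comes from a small illustrative exponential model, not from the paper's benchmark problems.

import numpy as np

def diagnose(J, rtol=1e-12):
    """Numeric rank and condition number of a Jacobian via its singular values."""
    s = np.linalg.svd(J, compute_uv=False)
    rank = int(np.sum(s > rtol * s[0]))
    cond = s[0] / s[-1] if s[-1] > 0 else np.inf
    return rank, cond

# Jacobian of the residuals of y = a*exp(b*x) + c*exp(d*x) w.r.t. (a, b, c, d):
# when b is close to d the two exponential basis functions become nearly
# collinear, so the corresponding columns almost coincide.
x = np.linspace(0, 1, 50)
a, b, c, d = 1.0, -1.0, 2.0, -1.0001
J = np.column_stack([np.exp(b*x), a*x*np.exp(b*x), np.exp(d*x), c*x*np.exp(d*x)])

rank, cond = diagnose(J)
print(f"numeric rank = {rank} of {J.shape[1]}, condition number = {cond:.2e}")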

18:50
Application of Symbolic Regression in Polymer Processing
PRESENTER: Wolfgang Roland

ABSTRACT. Modeling and simulation are essential in polymer processing for predicting process characteristics and designing processing machines. Traditional models are based on analytical approaches. Over the last decades, numerical simulation techniques have grown significantly with rising computational power. With ongoing digitalization, the amount of available data has increased significantly, and data-based modeling techniques have also become popular for production systems. Utilizing the available data, powerful models such as decision trees and artificial neural networks can be trained. Their prediction accuracy is strongly governed by the quality of the underlying training data. In this work, a hybrid approach is presented that efficiently combines analytical, numerical and data-based approaches to overcome the limitations of the individual techniques. As a result, explicit symbolic regression models are obtained, which are optimized on the basis of a numerically derived dataset. The power of this approach is demonstrated on a selected use case. The resulting highly accurate models may be integrated into further applications.

17:10-18:30 Session 6B: Special Session for PhD students (1)
Location: Room 2
17:10
Accelerating Gröbner Basis Computation via Weighted Ordering in Parameter Identifiability of ODE Models
PRESENTER: Ilia Ilmer

ABSTRACT. We consider a specific class of polynomial systems that arise in the parameter identifiability problems of ordinary differential equation (ODE) models and present a method for speeding up the Gröbner basis computation by using a weighted ordering. Our method exploits the structure of the ODE model to generate weight assignments for each variable. We provide empirical results that show an improvement across different symbolic computing frameworks.
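
SymPy does not expose weighted monomial orderings directly, but the sensitivity of Gröbner basis computation to the chosen ordering, which the weighting exploits, is easy to observe; the small zero-dimensional system below is an arbitrary example, not one of the identifiability systems from the paper.

import time
import sympy as sp

x, y, z = sp.symbols('x y z')
system = [x**2 + y*z - 2,
          y**2 + x*z - 3,
          z**2 + x*y - 5]

# The same ideal, computed under three different monomial orders.
for order in ('grevlex', 'grlex', 'lex'):
    t0 = time.perf_counter()
    gb = sp.groebner(system, x, y, z, order=order)
    dt = time.perf_counter() - t0
    print(f"{order:8s}: {len(gb.exprs)} basis polynomials in {dt:.3f}s")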

17:30
AlphaFold-based protein analysis pipeline

ABSTRACT. During the 14th edition of the Critical Assessment of protein Structure Prediction competition, great progress towards solving the protein structure prediction problem has been achieved by the winning model, DeepMind's AlphaFold2. Thanks to AlphaFold2's significant leap in accuracy, new possibilities in protein structure analysis and design have been opened. This paper presents a new protein analysis pipeline that builds upon the predictions of AlphaFold2. The core functionality of the pipeline is to determine and present different properties based on the protein sequence and the predicted three-dimensional structure. Some of the available features include computing physicochemical properties, executing an evolutionary analysis by aligning the sequence against databases such as Pfam and Swiss-Prot/UniRef90, the detection of binding pockets using P2Rank, and the molecular docking of ligands using AutoDock Vina. The results produced by the pipeline can be visualized as a MultiQC HTML report. The performance of the pipeline has been analyzed using a small dataset of protein structures, and the developed workflow has then been used to compare the accuracy of AlphaFold2's predictions against other experimental structures. The pipeline has been developed using Nextflow, a popular workflow manager for bioinformatic analyses, and has been made freely available at https://github.com/OtimusOne/AFPAP.

17:50
Effective Prediction System For Affective Computing On Emotional Psychology With Artificial Neural Network

ABSTRACT. Human emotions reveal aspects of mental health, and understanding people's emotions helps in making vital decisions. With recent improvements in AI and machine learning, affective computing has become an interesting research topic that adapts to human emotional behavior and improves learning outcomes connected to behavioral psychiatry. Careful evaluation of machine learning algorithms improves prediction quality, yet problems arise with the decisions connected to it. The proposed research predicts true emotion using neurophysiological data, since emotional changes trigger physiological responses. The proposed system uses Gaussian mixture models to develop a novel prediction algorithm on the AMIGOS dataset, which includes ECG, EEG, and galvanic skin response (GSR) signals. The findings are affected by the statistical response following data processing, by the measurable emotion labeling, and by the training samples. The system is compared to state-of-the-art statistical measures such as the standard deviation and population mean, and can be compared with an interpreter to validate the emotion labeling parameters. The proposed Emotion-detecting Artificial Neural Network (EMONN) system is further improved by analysing deep learning models to find covariate values that help identify participant personality traits, and achieves 93% accuracy with reduced computing time.

18:10
Different approaches for biological strings

ABSTRACT. The antimicrobial resistance (AMR) of bacteria to antibiotics is tightly connected with changes in their genetic structure. Many methods belonging to various disciplines are applied to analyze these genetic changes. The genes involved in AMR undergo different changes, which can be analyzed using distances between different variants (alleles) of a gene. This paper studies the different variants of the sul2 gene using the Levenshtein distance and a distance method based on binary representation, and compares the results.
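
The Levenshtein distance mentioned above is the classic dynamic-programming edit distance; the sketch below applies it to two made-up short DNA fragments standing in for gene variants (the study itself works with sul2 alleles).

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (ca != cb)  # substitution / match
                            ))
        prev = curr
    return prev[-1]

# Two made-up short DNA fragments standing in for two alleles of a gene.
allele_a = "ATGCGTACGTTAG"
allele_b = "ATGCGTTCGTAAG"
print(levenshtein(allele_a, allele_b))   # -> 2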

18:30-18:50 Session 7: Special Session on Advances in Distributed, Secure and Scalable Algorithms for Blockchain Technology
Location: Room 2
18:30
A Practical Analysis of Techniques for Minting Genetic Information as NFTs in Blockchain Technology

ABSTRACT. In this paper we perform a practical analysis of techniques for minting genetic information, in particular genomic data, as non-fungible tokens (NFTs) on supporting blockchains, together with a wide-ranging analysis of the best and most efficient tools to ensure such data's privacy, non-repudiation and storage efficiency. We analyze the demands of today's NFT-driven blockchain ecosystem and discuss how common approaches to storing and accessing genomic data, such as compression methods and encryption techniques, can be applied to the NFT world. We carry out a practical experiment based on our assessment, comparing the publicly available tools and libraries for achieving the aforementioned goal in order to determine which one provides the best results in terms of speed and storage efficiency, and we draw conclusions as to which approach is most beneficial for NFT-driven ecosystems in which genomic data could be safely preserved and actively traded.

19:00-20:00 Welcome Party