Invited talk
SAT and applications
10:30 | ABSTRACT. Certifying results by checking proofs and models is an essential feature of modern SAT solving. Even though incremental solving with assumptions and core extraction is crucial for many applications, support for incremental proof certificates is lacking. We propose a proof format for incremental SAT solving and corresponding checkers. We further extend it to make use of resolution hints. Experiments on incremental SAT solving for satisfiability modulo theories and bounded model checking show the feasibility of our approach and confirm that resolution hints substantially reduce checking time. |
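As a concrete illustration of the workflow being certified, the following sketch shows incremental solving with assumptions and core extraction using the PySAT library (an assumed dependency here; the paper's incremental proof format and checkers are not modeled):

```python
from pysat.solvers import Glucose3

with Glucose3() as solver:
    solver.add_clause([1, 2])    # x1 or x2
    solver.add_clause([-1, 3])   # not x1 or x3

    # First incremental call: assume not x2 and not x3 (UNSAT).
    if not solver.solve(assumptions=[-2, -3]):
        print("core:", solver.get_core())   # subset of the failed assumptions

    # A later call on the same solver reuses learned clauses; only the
    # assumptions change between calls.
    print("sat under [2]?", solver.solve(assumptions=[2]))
```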
11:00 | ABSTRACT. Adding blocked clauses to a CNF formula can substantially speed up SAT solving, both in theory and in practice. In theory, the addition of blocked clauses can exponentially reduce the length of the shortest refutation for a formula. In practice, it has recently been shown that the runtime of CDCL solvers decreases significantly on certain instance families when blocked clauses are added as a preprocessing step. This is in contrast with, but not in contradiction to, Blocked-Clause Elimination (BCE) having been known as a sometimes effective preprocessing step for over a decade, and it suggests that the practical role of blocked clauses in SAT solving might be richer than expected. In this paper we propose a theoretical study of the complexity of Blocked-Clause Addition (BCA) as a preprocessing step for SAT solving, and in particular consider the problem of adding the maximum number of blocked clauses of a given arity k to an input formula F. While BCE is a confluent process, meaning that the order in which blocked clauses are eliminated is irrelevant, this is not the case for BCA: adding a blocked clause to a formula might unblock a different clause that was previously blocked, and this order-sensitivity turns out to be a crucial obstacle to carrying out BCA efficiently as a preprocessing step. Our main result is that computing the maximum number of k-ary blocked clauses that can be added to an input formula F is NP-hard for every k >= 2. |
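For readers unfamiliar with the underlying notion, here is a minimal Python check of the blocked-clause condition the paper builds on (clauses as lists of signed integers; this illustrates the definition only, not the paper's NP-hardness construction):

```python
def is_tautology(clause):
    s = set(clause)
    return any(-lit in s for lit in s)

def resolvent(c, d, lit):
    # Resolve c (containing lit) with d (containing -lit) on lit.
    return (set(c) - {lit}) | (set(d) - {-lit})

def blocks(formula, clause, lit):
    # lit blocks clause iff every resolution partner yields a tautology.
    return all(is_tautology(resolvent(clause, d, lit))
               for d in formula if -lit in d)

def is_blocked(formula, clause):
    return any(blocks(formula, clause, lit) for lit in clause)

F = [[-1, -2], [2, 3]]
print(is_blocked(F, [1, 2]))  # True: resolving on 1 with [-1,-2] gives {2,-2}
```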
11:30 | ABSTRACT. Sorting networks are sorting algorithms that execute a sequence of operations independently of the input. Since they can be implemented directly as circuits, sorting networks are easy to implement in hardware -- but they are also often used in software to improve the performance of the base cases of standard recursive sorting algorithms. For this purpose, they are translated into machine-code instructions in a systematic way. Recently, a deep-learning system discovered implementations of some sorting networks with up to 8 inputs that are better than those previously known. In this article, we show that all these examples are instances of a general pattern whereby some instructions are removed. We show that this removal can be done when a particular set of constraints on integers is satisfiable, and we identify conditions under which we can reduce this problem to propositional satisfiability. We systematically apply this general construction to improve the best-known implementations of sorting networks of size up to 128, which are the ones most commonly found in software implementations. |
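A sorting network is just a fixed comparator sequence; the sketch below applies the standard 5-comparator network on 4 inputs and shows the branch-free compare-exchange used when networks are translated to machine code (the networks and instruction removals studied in the paper are not reproduced here):

```python
# The optimal 5-comparator network on 4 inputs.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def apply_network(xs, comparators):
    xs = list(xs)
    for i, j in comparators:
        # Branch-free compare-exchange: min to position i, max to j,
        # mirroring the min/max instruction pairs used in machine code.
        xs[i], xs[j] = min(xs[i], xs[j]), max(xs[i], xs[j])
    return xs

print(apply_network([3, 1, 4, 2], NETWORK_4))  # [1, 2, 3, 4]
```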
12:00 | PRESENTER: Simon Schwarz ABSTRACT. We propose a translation from eBPF (extended Berkeley Packet Filter) code to CHC (Constrained Horn Clause sets) over the combined theory of bitvectors and arrays. eBPF is used in particular in the Linux kernel, where user code is executed under kernel privileges. To protect the kernel, a well-known verifier statically checks the code for any harm, and a number of research efforts have been devoted to securing the verifier and improving its performance. This paper is about verifying functional properties of the eBPF code itself. Our translation procedure bpfverify is precise and covers almost all details of the eBPF language. Functional properties are automatically verified using z3. We prove termination of the procedure and show by real-world eBPF code examples that full-fledged automatic verification is actually feasible. |
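To give a flavor of the target logic, the following is a toy CHC system over 32-bit bitvectors expressed with z3's Fixedpoint API; the relation names and the one-variable transition system are illustrative only and are far simpler than bpfverify's actual encoding of eBPF:

```python
from z3 import (Fixedpoint, Function, BitVec, BitVecSort, BitVecVal,
                BoolSort, ULT, UGT)

# Toy transition system: x := 0; while (x < 10) x := x + 1.
BV = BitVecSort(32)
fp = Fixedpoint()
fp.set(engine='spacer')

inv = Function('inv', BV, BoolSort())   # inductive invariant candidate
err = Function('err', BV, BoolSort())   # error relation
x, xp = BitVec('x', 32), BitVec('xp', 32)

fp.register_relation(inv)
fp.register_relation(err)
fp.declare_var(x, xp)

fp.rule(inv(BitVecVal(0, 32)))                        # init: x = 0
fp.rule(inv(xp), [inv(x), ULT(x, 10), xp == x + 1])   # step
fp.rule(err(x), [inv(x), UGT(x, BitVecVal(10, 32))])  # bad: x > 10

print(fp.query(err))   # unsat: the bad state is unreachable
```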
Theorem Proving Unleashed
14:00 | ABSTRACT. The CVC5 solver is today one of the strongest systems for solving first-order problems, both with and without theories. In this work we equip its enumeration-based instantiation method with a neural network that guides the choice of instantiated quantifiers and their ground instances. To that end, we develop a relatively fast graph neural network that repeatedly scores all available instantiation options with respect to the available formulas. The network runs directly on a CPU without the need for any special hardware. We train the neural guidance on a large set of proofs generated by the e-matching instantiation strategy and evaluate its performance on a set of previously unseen problems. |
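Schematically, guided enumerative instantiation scores all available (quantifier, ground-terms) options each round and instantiates the best one. In the sketch below, the `score` function is a trivial placeholder for the paper's graph neural network:

```python
from itertools import product

def term_size(t):
    # Terms are strings (constants) or tuples (function, arguments...).
    return 1 + sum(term_size(a) for a in t[1:]) if isinstance(t, tuple) else 1

def score(quantifier, args):
    # Placeholder for the paper's graph neural network: prefer instances
    # built from small ground terms.
    return -sum(term_size(a) for a in args)

def pick_instantiation(quantifiers, ground_terms):
    # Enumerate every (quantifier, argument-tuple) option, score all of
    # them, and return the best one for instantiation this round.
    options = [(q, args)
               for q, arity in quantifiers
               for args in product(ground_terms, repeat=arity)]
    return max(options, key=lambda opt: score(*opt))

quants = [("forall x. P(x)", 1), ("forall x, y. Q(x, y)", 2)]
terms = ["a", "b", ("f", "a")]
print(pick_instantiation(quants, terms))  # ('forall x. P(x)', ('a',))
```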
14:20 | Waste Reduction: Experiments in Sharing Clauses between Runs of a Portfolio of Strategies (Experimental Paper) ABSTRACT. Most high-end automated theorem provers for first-order logic (FOL) split the available time between short runs of a large portfolio of search strategies. These runs are typically independent and can be parallelized to exploit all the available processor cores. We explore several avenues for reusing clauses generated by earlier runs and present experimental results on their usefulness, or lack thereof. |
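The idea can be sketched as follows, with `run_strategy` standing in as a hypothetical interface to a prover that can export learned clauses and import them into a later run:

```python
def run_portfolio(problem, strategies, time_slice, run_strategy):
    shared = []                          # clauses learned by earlier runs
    for strategy in strategies:
        result, learned = run_strategy(problem, strategy, time_slice,
                                       initial_clauses=shared)
        if result is not None:           # proof found
            return result
        shared.extend(learned)           # re-use instead of discarding

# Demo with a fake prover: strategy s2 succeeds only because it receives
# the clause learned (and otherwise wasted) by the earlier s1 run.
def fake_prover(problem, strategy, time_slice, initial_clauses):
    if strategy == "s2" and "c1" in initial_clauses:
        return "proof", []
    return None, (["c1"] if strategy == "s1" else [])

print(run_portfolio("p", ["s1", "s2"], 1.0, fake_prover))  # proof
```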
14:40 | ABSTRACT. In this paper we demonstrate how logic programming systems and automated first-order logic theorem provers (ATPs) can improve the accuracy of Large Language Models (LLMs) on logical reasoning tasks, where the baseline performance is given by direct LLM solutions. We first evaluate LLM reasoning on steamroller problems using the PRONTOQA benchmark. We show how accuracy can be improved with a neuro-symbolic architecture in which the LLM acts solely as a front-end that translates a given problem into a formal logic language, and an automated reasoning engine is called to solve it. However, this approach critically hinges on the correctness of the LLM translation. Second, to assess this translation correctness, we define a framework of syntactic and semantic error categories. We implemented the framework and used it to identify the errors that LLMs make in the benchmark domain. Third, based on these findings, we extended our method with capabilities for automatically correcting syntactic and semantic errors. For semantic error correction we integrate first-order logic ATPs, which is our main and novel contribution. We demonstrate that this approach reduces semantic errors significantly and further increases the accuracy of LLM logical reasoning. |
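The architecture reduces to a short pipeline; all components below are stubs passed in by the caller, since the paper's concrete LLM prompts, error taxonomy, and ATP integration are not reproduced here:

```python
def solve(problem_text, llm_translate, check_syntax, repair, prove):
    formulas = llm_translate(problem_text)   # LLM as translation front-end only
    errors = check_syntax(formulas)          # syntactic/semantic error detection
    if errors:
        formulas = repair(formulas, errors)  # automatic correction step
    return prove(formulas)                   # symbolic engine decides the query

# Toy instantiation with stand-in components:
print(solve("All cats are animals. Tom is a cat. Is Tom an animal?",
            llm_translate=lambda text: ["cat(tom).", "animal(X) :- cat(X)."],
            check_syntax=lambda formulas: [],
            repair=lambda formulas, errors: formulas,
            prove=lambda formulas: True))    # True
```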
15:10 | ABSTRACT. While many state-of-the-art Automated Theorem Provers (ATPs), such as E and Vampire, have been subject to extensive tuning of strategy schedules in the last decade, the classical ATP Prover9 has never been optimized in this direction. Both E and Vampire provide the user with an automatic mode that selects good proof search strategies based on the properties of the input problem, while Prover9 by default provides only a relatively weak auto mode. Interestingly, Prover9 offers more varied means for proof control than its competitors. These means, however, must be investigated manually, which is possible only for experienced Prover9 users with a good understanding of how Prover9 works. In this paper, we investigate the possibilities of automatic configuration of Prover9 for user-specified benchmark problems. We employ the automated strategy invention system Grackle to generate Prover9 strategies with both basic and advanced proof search options, the latter requiring sophisticated strategy space features in Grackle. We test the strategy invention on the AIM train/test problem collection and show that Prover9 can outperform both E and Vampire on these problems. To test the generality of our approach, we also train and evaluate strategies on TPTP problems, showing that Prover9 can achieve reasonable complementarity with other ATPs. |
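A Prover9 strategy is just a set of input-file commands, which is what makes automated invention possible. The sketch below renders a candidate strategy; the option names (`binary_resolution`, `paramodulation`, `max_weight`, `order`) are real Prover9 options, while the particular values and the flags/params encoding are only illustrative of what a system like Grackle searches over:

```python
def render_strategy(flags, params):
    lines = [f"set({flag})." for flag in flags]
    lines += [f"assign({key}, {value})." for key, value in params.items()]
    return "\n".join(lines)

print(render_strategy(
    flags=["binary_resolution", "paramodulation"],
    params={"max_weight": 30, "order": "kbo"},
))
# set(binary_resolution).
# set(paramodulation).
# assign(max_weight, 30).
# assign(order, kbo).
```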
15:30 | PRESENTER: Marton Hajdu ABSTRACT. Rewriting techniques based on reduction orderings generate “just enough” consequences to retain first-order completeness. This is ideal for superposition-based first-order theorem proving, but for at least one approach to inductive reasoning we show that we are missing crucial consequences. We therefore extend the superposition calculus with rewriting-based techniques to generate sufficient consequences for automating induction in saturation. When applying our work within the unit-equational fragment, our experiments with the theorem prover Vampire show significant improvements for inductive reasoning. |
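As a point of reference, the kind of consequence an induction-aware saturation prover must generate is an instance of a structural induction schema; for a unit-equational goal over natural numbers it takes roughly the following shape (an illustrative instance, not Vampire's exact inference rule):

```latex
\[
  \bigl( L[0] \simeq R[0] \,\wedge\,
         \forall y\, \bigl( L[y] \simeq R[y] \rightarrow
                            L[s(y)] \simeq R[s(y)] \bigr) \bigr)
  \rightarrow \forall x\, \bigl( L[x] \simeq R[x] \bigr)
\]
```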
Verification
16:30 | ABSTRACT. We present a reasoning framework in support of establishing the functional correctness of programs over recursive data structures by automated first-order theorem proving. Specifically, we focus on functional programs implementing sorting algorithms. We formalize the semantics of recursive programs in many-sorted first-order logic and introduce sortedness/permutation properties as part of our first-order formalization. Rather than focusing on sorting lists of elements of specific first-order theories, such as integer arithmetic, our list formalization relies on a sort parameter abstracting (arithmetic) theories and, hence, concrete sorts. Additionally, we formalize the permutation property of lists in first-order logic in an effective way that allows verification conditions of such algorithms to be discharged automatically, purely by means of first-order theorem proving. Proof search is powered by automated theorem proving: we adjust recent efforts for automating inductive reasoning in saturation-based first-order theorem proving. Importantly, we advocate a compositional reasoning approach for automating the proofs by induction required to verify functional programs implementing and preserving sorting and permutation properties over parameterized list structures. We showcase the applicability of our framework on recursive sorting algorithms, including Mergesort and Quicksort; to this end, we turn first-order theorem proving into an automated verification engine by guiding automated inductive reasoning with manual proof splits. |
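The style of such a formalization can be approximated in a few lines; the sketch below defines a recursive sortedness predicate over integer lists using z3's recursive function definitions. Note the paper instead targets saturation-based provers and keeps the element sort parametric, so fixing Int here is a simplification:

```python
from z3 import (And, BoolSort, BoolVal, Const, Datatype, If, IntSort,
                RecAddDefinition, RecFunction, Solver)

# Integer lists as an algebraic datatype.
List = Datatype('List')
List.declare('nil')
List.declare('cons', ('head', IntSort()), ('tail', List))
List = List.create()

# sorted(l) defined by structural recursion.
srt = RecFunction('sorted', List, BoolSort())
l = Const('l', List)
RecAddDefinition(srt, [l],
    If(List.is_nil(l), BoolVal(True),
    If(List.is_nil(List.tail(l)), BoolVal(True),
       And(List.head(l) <= List.head(List.tail(l)),
           srt(List.tail(l))))))

s = Solver()
s.add(srt(List.cons(1, List.cons(2, List.nil))))
print(s.check())  # sat
```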
17:00 | PRESENTER: Joseph Tafese ABSTRACT. Simulation is an important aspect of model checking, serving as an invaluable pre-processing step that can quickly generate a set of reachable states. This is evident in the model checking tools at the Hardware Model Checking Competitions, where Btor2 is used to represent verification problems. Recently, Btor2MLIR was introduced as a novel format for representing safety and correctness constraints for hardware circuits. It provides executable semantics for circuits represented in Btor2 by producing an equivalent program in LLVM-IR. One challenge in simulating Btor2 circuits is the use of persistent (i.e., immutable) arrays to represent memory. Persistent arrays work well for symbolic reasoning in SMT, but they require copy-on-write semantics when simulated natively. We provide an algorithm for converting persistent arrays to transient (i.e., mutable) arrays with efficient native execution. This approach is implemented in Btor2MLIR, which opens the door to rapid prototyping, dynamic verification techniques, and random testing using established tool chains such as LibFuzzer and KLEE. Our evaluation shows that our approach, when compared with BtorSim, achieves a speedup of three orders of magnitude when safety properties are trivial, and of at least one order of magnitude when constraints are disabled. |
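The distinction at the heart of the algorithm is easy to see in miniature: a persistent store must copy on every write, while a transient store updates in place once no older array version can be observed. A sketch (not Btor2MLIR's actual implementation):

```python
def write_persistent(arr, i, v):
    new = list(arr)    # copy-on-write: every store copies the whole array
    new[i] = v
    return new

def write_transient(arr, i, v):
    arr[i] = v         # in-place store: constant time, but destroys the
    return arr         # old version, so it is safe only when no earlier
                       # version of the array can still be read

mem0 = [0] * 4
mem1 = write_persistent(mem0, 2, 7)   # both versions remain observable
write_transient(mem1, 0, 5)           # destructive update
print(mem0, mem1)                     # [0, 0, 0, 0] [5, 0, 7, 0]
```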
17:20 | ABSTRACT. The various possible semantics of Dijkstra’s guarded commands are elucidated by means of a new low-level, generic, graph-based specification language designed for asynchronous nondeterministic algorithms. The graphs incorporate vertices for actions and edges indicating control flow, alongside edges for data flow. This formalism allows one to specify not just input-output behavior, nor only which actions take place, but precisely how often and in what order they transpire. |
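For concreteness, a tiny executable reading of Dijkstra's guarded do-loop, with Euclid's algorithm as the classic example; the random choice merely simulates nondeterminism and is not part of the paper's graph-based formalism:

```python
import random

def run_guarded_do(state, commands):
    # do G1 -> A1 [] G2 -> A2 ... od : repeat while some guard holds,
    # each time executing an arbitrarily chosen enabled command.
    while True:
        enabled = [action for guard, action in commands if guard(state)]
        if not enabled:
            return state
        random.choice(enabled)(state)   # simulated nondeterminism

# Euclid's gcd:  do x > y -> x := x - y  []  y > x -> y := y - x  od
state = {'x': 12, 'y': 18}
run_guarded_do(state, [
    (lambda s: s['x'] > s['y'], lambda s: s.update(x=s['x'] - s['y'])),
    (lambda s: s['y'] > s['x'], lambda s: s.update(y=s['y'] - s['x'])),
])
print(state)   # {'x': 6, 'y': 6}, i.e. gcd(12, 18) in both components
```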
17:30 | PRESENTER: Mudathir Mahgoub Yahia Mohamed ABSTRACT. We present a number of first- and second-order extensions to SMT theories specifically aimed at representing and analyzing SQL queries with join, projection, and selection operations. We support reasoning about SQL queries with either bag or set semantics for database tables. We provide the former via an extension of a theory of finite bags and the latter via an extension of the theory of finite relations. Furthermore, we add the ability to reason about tables with null values by introducing a theory of nullable sorts based on an extension of the theory of algebraic datatypes. We implemented solvers for these theories in the SMT solver cvc5 and evaluated them on a set of benchmarks derived from public sets of SQL equivalence problems. |
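The bag-versus-set distinction the solver must track shows up already in a one-column projection, as in this small illustration using plain Python containers; two queries that agree under set semantics can disagree under bag semantics, which is why both theories are supported:

```python
from collections import Counter

rows = [(1, 'a'), (1, 'b'), (2, 'a')]      # a table with three rows

proj_bag = Counter(r[0] for r in rows)     # bag semantics: {1: 2, 2: 1}
proj_set = {r[0] for r in rows}            # set semantics: {1, 2}

print(proj_bag, proj_set)
```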