CADE-30: 30TH INTERNATIONAL CONFERENCE ON AUTOMATED DEDUCTION
Accepted Papers with Abstracts
Fabian Achammer (TU Wien)
Stefan Hetzl (Vienna University of Technology)
Renate A. Schmidt (University of Manchester)
Computing Witnesses Using the SCAN Algorithm

ABSTRACT. Second-order quantifier elimination is the problem of finding, given a formula with second-order quantifiers, a logically equivalent first-order formula. While such a formula is not computable in general, there are practical algorithms and subclasses with applications throughout computational logic. One of the most prominent algorithms for second-order quantifier elimination is the SCAN algorithm, which is based on saturation theorem proving. In this paper we show how the SCAN algorithm on clause sets can be extended to solve a more general problem: namely, finding an instance of the second-order quantifiers that results in a logically equivalent first-order formula. In addition we provide a prototype implementation of the proposed method. This work paves the way for applying the SCAN algorithm to new problems in application domains such as modal correspondence theory, knowledge representation, and verification.

Tanbir Ahmed (University of Windsor)
Lamina Zaman (University of Windsor)
Curtis Bright (University of Windsor)
[FP] Symbolic Sets for Proving Bounds on Rado Numbers

ABSTRACT. Given a linear equation ℰ of the form ax + by = cz where a, b, c are positive integers, the k-colour Rado number R_k(ℰ) is the smallest positive integer n, if it exists, such that every k-colouring of the positive integers {1, 2, ..., n} contains a monochromatic solution to ℰ. In this paper, we consider k = 3 and the linear equations ax + by = bz and ax + ay = bz. Using SAT solvers, we compute a number of previously unknown Rado numbers corresponding to these equations. We prove new general bounds on Rado numbers inspired by the satisfying assignments discovered by the SAT solver. Our proofs require extensive case-based analyses that are difficult to check for correctness by hand, so we automate checking the correctness of our proofs via an approach which makes use of a new tool we developed with support for operations on symbolically-defined sets—e.g., unions or intersections of sets of the form {f(1), f(2), ..., f(a)} where a is a symbolic variable and f is a function possibly dependent on a. No computer algebra system that we are aware of currently has sufficiently capable support for symbolic sets, leading us to develop a tool supporting symbolic sets using the Python symbolic computation library SymPy coupled with the Satisfiability Modulo Theories solver Z3.
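To make the definition concrete, here is a small brute-force sketch (our own illustration, not from the paper, which relies on SAT solvers for much larger instances and different equations) computing the 2-colour Rado number of the Schur equation x + y = z; all function names are ours:

```python
from itertools import product

def has_mono_solution(colouring, n):
    # is there a monochromatic x + y = z with x, y, z in {1..n}?
    for x in range(1, n + 1):
        for y in range(x, n + 1):
            z = x + y
            if z <= n and colouring[x] == colouring[y] == colouring[z]:
                return True
    return False

def rado_number(k, max_n=10):
    # smallest n such that EVERY k-colouring of {1..n} contains a
    # monochromatic solution to x + y = z (naive enumeration; a SAT
    # solver replaces this search for the equations in the paper)
    for n in range(1, max_n + 1):
        colourings = product(range(k), repeat=n)
        if all(has_mono_solution(dict(zip(range(1, n + 1), c)), n)
               for c in colourings):
            return n
    return None

print(rado_number(2))  # 5, by Schur's theorem
```

The colouring {1, 4} / {2, 3} shows that n = 4 does not suffice, which is why the answer is 5.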

Emma Ahrens (RWTH Aachen University)
Jan-Christoph Kassing (RWTH Aachen, Research Group Computer Science 2)
Jürgen Giesl (RWTH Aachen University)
Joost-Pieter Katoen (RWTH Aachen University)
Analyzing Weighted Abstract Reduction Systems via Semirings

ABSTRACT. We present novel semiring semantics for abstract reduction systems (ARSs), i.e., a set A together with a binary relation → denoting reductions. More precisely, we provide a weighted version of ARSs, where the reduction steps induce weights from a semiring S = (S, ⊕, ⊙, 0, 1), which consists of a set S associated with two operations ⊕ and ⊙ with identity elements 0 and 1, respectively. Inspired by provenance analysis in database theory and logic, we obtain a formalism that can be used for provenance analysis of arbitrary ARSs. Our semantics handle (possibly unbounded) non-determinism and possibly infinite reductions. For that reason, we consider semirings where the natural order (that is induced by addition of the semiring) forms a complete lattice. In most applications of semiring semantics in logic, a higher “truth value” w.r.t. the natural order is more desirable, e.g., for the confidence semiring Sconf = ([0, 1], max, ·, 0, 1) that can be used to assign confidence values to facts, one would like to obtain a value close to the most desirable confidence 1. However, in the application of software verification, it is often the reverse, e.g., for computing the runtime of a reduction via the arctic semiring Sarc = (N±∞, max, +, −∞, 0). While every runtime t < ∞ may still be acceptable, the aim is to prove boundedness, i.e., that the maximum ∞ (an infinite runtime) cannot occur. Our semiring semantics capture and generalize numerous formalisms that have been studied in the literature. For example, boundedness w.r.t. different semirings can imply termination of the underlying ARS, it can ensure that certain bad states cannot be reached (safety), etc. By considering tuples over different semirings, we can combine multiple analyses into a single combined framework analyzing combinations of properties, where simply performing each analysis separately fails. 
Due to our generalization of existing formalisms via our semiring semantics, we can now use techniques and ideas from, e.g., termination analysis, to prove boundedness or other properties (or combinations of properties) of reduction systems using a completely different semiring. We present sufficient conditions for boundedness and show that the interpretation method, a well-known method to, e.g., prove termination of ARSs, can be generalized to a sound (and sometimes complete) technique to prove boundedness. On the other hand, we also try to find worst-case lower bounds on weights in order to detect bugs and possible denial of service attacks.
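As a tiny illustration of the arctic-semiring view described above (our own toy example, not from the paper), the worst-case runtime of a terminating, finitely branching ARS can be computed by interpreting sequencing of steps as + (the semiring ⊙) and nondeterministic choice as max (the semiring ⊕):

```python
from functools import lru_cache

# A toy weighted ARS given as a DAG (hypothetical example): each state
# maps to its (successor, weight) reduction steps. In the arctic
# semiring (max, +), sequencing adds weights and choice takes the max.
steps = {
    "a": [("b", 1), ("c", 2)],
    "b": [("d", 3)],
    "c": [("d", 1)],
    "d": [],  # normal form
}

@lru_cache(maxsize=None)
def weight(state):
    # semiring unit 0 for a normal form; otherwise fold the
    # nondeterministic choices with max and sequencing with +
    if not steps[state]:
        return 0
    return max(step_w + weight(succ) for succ, step_w in steps[state])

print(weight("a"))  # 4: the worst-case reduction a -> b -> d
```

Boundedness in the paper's sense corresponds here to `weight` never being the maximal element ∞, which for this finite DAG is immediate.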

Franz Baader (TU Dresden)
Stefan Borgwardt (TU Dresden)
Filippo De Bortoli (TU Dresden)
Patrick Koopmann (Vrije Universiteit Amsterdam)
Concrete Domains Meet Expressive Cardinality Restrictions in Description Logics

ABSTRACT. Standard Description Logics (DLs) can encode quantitative aspects of an application domain through either *number restrictions*, which constrain the number of individuals that are in a certain relationship with an individual, or *concrete domains*, which can be used to assign concrete values to individuals using so-called features. These two mechanisms have been extended towards very expressive DLs, for which reasoning nevertheless remains decidable. Number restrictions have been generalized to more powerful comparisons of sets of role successors in ALCSCC, while the comparison of feature values of different individuals in ALC(D) has been studied in the context of ω-admissible concrete domains D. In this paper, we combine both formalisms and investigate the complexity of reasoning in the thus obtained DL ALCOSCC(D), which additionally includes the ability to refer to specific individuals by name. We show that, in spite of its high expressivity, the consistency problem for this DL is ExpTime-complete, assuming that the constraint satisfaction problem of D is also decidable in exponential time. It is thus not higher than the complexity of the basic DL ALC. At the same time, we show that many natural extensions to this DL, including a tighter integration of the concrete domain and number restrictions, lead to undecidability.

Pablo Barenbaum (Universidad Nacional de Quilmes (CONICET), Quilmes, Argentina and ICC, Universidad de Buenos Aires, Argentina)
Delia Kesner (Université Paris Cité, CNRS, IRIF, F-75013 Paris, France)
Mariana Milicich (Université Paris Cité, CNRS, IRIF, F-75013 Paris, France)
A Fresh Inductive Approach to Useful Call-by-Value

ABSTRACT. Useful evaluation is an optimised evaluation mechanism for functional programming languages, introduced by Accattoli and Dal Lago. The key to useful evaluation is to represent programs with sharing and to implement substitution of terms only when this contributes to the progress of the computation. Initially defined in the framework of call-by-name, useful evaluation has since been extended to call-by-value. The definitions of usefulness in the literature are complex and lack inductive structure, which makes it challenging to (formally) reason about them. In this work, we define useful call-by-value evaluation inductively, proceeding in two stages. First, we refine the well-known Value Substitution Calculus, so the substitution operation becomes linear, yielding the lcbv calculus. The two calculi are observationally equivalent. We then further refine lcbv by restricting linear substitution only when it contributes to the progress of the computation, yielding the ucbv strategy. This new substitution notion is sensitive to the surrounding evaluation context, so it is non-trivial to capture it inductively. Moreover, we formally show that the resulting ucbv is a sound and complete implementation of lcbv, optimised to implement useful evaluation. As a further contribution, we show that the ucbv strategy can be implemented by an existing lower-level abstract machine called GLAMoUr with polynomial overhead in time. This entails, as a corollary, that ucbv is time-invariant, i.e., that the number of reduction steps to normal form in ucbv can be used as a measure of time complexity. Our ucbv strategy is part of the preliminary work required to develop semantic interpretations of useful evaluation, for which its inductive formulation is more suitable than the (non-inductive) existing ones.

Lukas Bartl (Universität Augsburg)
Jasmin Blanchette (Ludwig-Maximilians-Universität München)
Tobias Nipkow (Technische Universität München)
Exploiting Instantiations from Paramodulation Proofs in Isabelle/HOL

ABSTRACT. Metis is an ordered paramodulation prover built into the Isabelle/HOL proof assistant. It attempts to close the current goal using a given list of lemmas. Typically these lemmas are found by Sledgehammer, a tool that integrates external automatic provers. We present a new tool that analyzes successful Metis proofs to derive variable instantiations. These increase Sledgehammer’s success rate, improve the speed of Sledgehammer-generated proofs, and help users understand why a goal follows from the lemmas.

Christoph Benzmüller (Otto-Friedrich-Universität Bamberg)
Faithful Logic Embeddings in HOL – A recipe to have it all: deep and shallow, automated and interactive, heavy and light, proofs and counterexamples, meta and object level

ABSTRACT. Deep and shallow embeddings of non-classical logics in classical higher-order logic have been explored, implemented, and used in various automated reasoning tools in recent years. This paper presents a recipe for the simultaneous deployment of different forms of deep and shallow embeddings in classical higher-order logic, enabling not only flexible interactive and automated theorem proving and counterexample finding at meta and object level, but also automated faithfulness proofs between the logic embeddings. The approach, which is fruitful for logic education, research and application, is deliberately illustrated here using simple propositional modal logic. However, the work presented is conceptual in nature and not limited to such a simple logic context.

Jasmin Blanchette (Ludwig-Maximilians-Universität München)
Mathias Fleury (Albert-Ludwigs-Universität Freiburg)
Martin Suda (Czech Technical University in Prague)
Sophie Tourret (Max-Planck-Institut für Informatik, Saarbrücken, Germany)
First-Order Reasoning, Below and Beyond: Workshop in Honor of Christoph Weidenbach’s 60th Birthday

ABSTRACT. This event is dedicated to Christoph Weidenbach on the occasion of his 60th birthday. It honors his career as a scientist and teacher, and will be an occasion for researchers to present their research related to the automation of first-order logic and its fragments (broadly understood). Topics of interest include (and are not limited to) first-order reasoning with resolution and superposition, SAT and SMT solving, tableau methods, and formal proofs of logical calculi. A festschrift will be published to collect the new contributions presented at the event.

Yasmine Briefs (Max-Planck-Institut für Informatik)
Mathias Fleury (University of Freiburg)
From Building Blocks to Real SAT Solvers

ABSTRACT. In his *Automated Reasoning Building Blocks*, Weidenbach presented a version of the CDCL calculus (and a proof of non-redundancy). This calculus is the basis for IsaSAT, the only verified SAT solver to take part in the SAT Competition.

In this abstract presentation, we discuss the differences between the abstract calculus and the concrete implementations found in SAT solvers.

Recently, Briefs and Weidenbach have rediscovered chronological backtracking with a different aim: generalizing the non-redundancy result to SMT solving. We discuss the differences between Weidenbach's presentation and other existing ones.

Martin Bromberger (Max Planck Institute for Informatics)
Martin Desharnais (Max Planck Institute for Informatics)
Christoph Weidenbach (Max Planck Institute for Informatics)
A Stepwise Refinement Proof that SCL(FOL) Simulates Ground Ordered Resolution

ABSTRACT. Recently, it has been demonstrated that SCL(FOL) can simulate ground ordered resolution. We revisit this result and provide a new formal proof in Isabelle/HOL. The existing pen-and-paper proof is monolithic and challenging to comprehend. In order to improve clarity, we develop an alternative proof structured as eleven (bi)simulation steps between the two calculi, transitioning from ordered resolution to SCL(FOL). A key simulation lemma ensures that, under certain conditions, one simulation direction can be automatically lifted to the other. Consequently, for each of the eleven steps, it suffices to establish only one direction of simulation. The complete proof is included in the Isabelle Archive of Formal Proofs.

Chad Brown (Czech Technical University in Prague)
Karel Chvalovský (Czech Technical University in Prague)
Mikoláš Janota (Czech Technical University in Prague)
Miroslav Olšák (University of Innsbruck)
Stefan Ratschan (Institute of Computer Science Academy of Sciences of the Czech Republic)
SMT and Functional Equation Solving over the Reals: Challenges from the IMO

ABSTRACT. We use SMT technology to address a class of problems involving uninterpreted functions and nonlinear real arithmetic. In particular, we focus on problems commonly found in mathematical competitions, such as the International Mathematical Olympiad (IMO), where the task is to determine all solutions to constraints on an uninterpreted function. Although these problems require only high-school-level mathematics, state-of-the-art SMT solvers often struggle with them. We propose several techniques to improve SMT performance in this setting.
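A finite toy analogue of "determine all solutions" (our illustration; the paper works over the reals with uninterpreted functions in SMT) is to brute-force all functions on Z_5 satisfying the Cauchy equation f(x + y) = f(x) + f(y):

```python
from itertools import product

# Determine ALL functions f : Z_5 -> Z_5 with f(x + y) = f(x) + f(y),
# a finite stand-in for an IMO-style functional equation. A tuple f
# represents the table (f(0), ..., f(4)).
n = 5
solutions = [f for f in product(range(n), repeat=n)
             if all(f[(x + y) % n] == (f[x] + f[y]) % n
                    for x in range(n) for y in range(n))]

# exactly the 5 linear maps f(x) = c * x mod 5
print(len(solutions))
```

Over the reals no such enumeration exists, which is why the paper's quantified SMT encoding with uninterpreted functions is needed.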

Christopher W. Brown (United States Naval Academy)
Presentation Only: Semantics of Division for Polynomial Solvers

ABSTRACT. How to handle division in systems that compute with logical formulas involving what would otherwise be polynomial constraints over the real numbers is a surprisingly difficult question. This presentation reports recently published work that surveys existing solutions, describes their issues, and proposes a novel solution for the problem that is specifically tailored to tools from symbolic computing and real algebraic geometry.

Gianpiero Cabodi (DAUIN - Department of Control and Computer Engineering, Politecnico di Torino, Turin, IT)
Paolo Enrico Camurati (DAUIN - Department of Control and Computer Engineering, Politecnico di Torino, Turin, IT)
Alberto Griggio (FBK - Fondazione Bruno Kessler, Trento, IT)
Marco Palena (CNIT - National Inter-University Consortium for Telecommunications, Turin, IT)
Paolo Pasini (DET - Department of Electronics and Telecommunications, Politecnico di Torino)
Marco Roveri (DISI - Department of Information Engineering and Computer Science University of Trento, Trento, IT)
Giulia Sindoni (FBK - Fondazione Bruno Kessler, Trento, IT)
Stefano Tonetta (FBK - Fondazione Bruno Kessler, Trento, IT)
A Theorem Prover Based Approach for SAT-Based Model Checking Certification

ABSTRACT. In the field of formal verification, certifying proofs serve as compelling evidence to demonstrate the correctness of a model within a deductive system. These proofs can be automatically generated as a by-product of the verification process and are key artefacts for high-assurance systems. Their significance lies in their ability to be independently verified by proof checkers, which provides a more convenient approach than certifying the tools that generate them. Modern model checking algorithms adopt deductive methods and usually generate proofs in terms of inductive invariants, assuming that these apply to the original system under verification. Model checkers, though, often make use of a range of complex pre-processing simplifications and transformations to ease the verification process, which add another layer of complexity to the generation of proofs. In this paper, we present a novel approach for certifying model checking results exploiting a theorem prover and a theory of temporal deductive rules that can support various kinds of transformations and simplification of the original circuit. We implemented and experimentally evaluated our contribution on invariants generated using two state-of-the-art model checkers, nuXmv and PdTRAV, and by defining a set of rules within a theorem prover, to validate each certificate.

Ali Khan Caires Ribeiro Santos (Universidade de Brasília)
Maribel Fernandez (King's College London)
Daniele Nantes Sobrinho (Imperial College London)
Equational Reasoning Modulo Commutativity in Languages with Binders

ABSTRACT. In computer science and mathematical logic, formal languages abstract away concrete syntax details, focusing on abstract syntax to capture structural properties. However, conventional abstract syntax falls short for languages with variable-binding constructs. In this paper, we work with the nominal language, a general formal language for representing binders, freshness conditions and α-renaming. Rather than relying on the usual freshness constraints, we introduce a nominal algebra which employs a generalisation of permutation fixed-point constraints in α-equivalence judgements, seamlessly integrating commutativity into the reasoning process.

We investigate the proof-theoretical properties of the framework and provide a sound and complete semantics in the setting of nominal sets. Additionally, we propose a novel algorithm for nominal unification modulo commutativity, which we prove terminating and correct. Moreover, thanks to the use of fixed-point constraints, the resulting unification theory is finitary, unlike the standard nominal unification modulo commutativity using freshness constraints. This extension provides a robust foundation for structural induction and recursion over syntax with binding constructs and commutative operators, such as first-order logic and the π-calculus, enabling reasoning modulo both α-equivalence and commutativity.

Mohamed Chaabani (Université de Montpellier)
Simon Robillard (Université de Montpellier)
Verified Path Indexing

ABSTRACT. Indexing of syntactic terms is a key component for the efficient implementation of automated theorem provers. This paper presents the first verified implementation of a term indexing data structure, namely a formalization of path indexing in the proof assistant Isabelle/HOL. We define the data structure, maintenance operations, and retrieval operations, including retrieval of unifiable terms, instances, generalizations and variants. We prove that maintenance operations preserve the invariants of the structure, and that retrieval operations are sound and complete.
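To give the flavour of the indexed data structure (a rough Python sketch under our own simplified term representation, not the paper's verified Isabelle/HOL formalization), a path index stores each term under its root-to-leaf paths of (symbol, argument-position) pairs and retrieves unification candidates by set operations over those paths:

```python
from collections import defaultdict

# Terms are nested tuples ("f", arg1, arg2, ...); variables are strings.
def paths(term, prefix=()):
    # enumerate the root-to-leaf paths of a term; "*" marks a variable
    if isinstance(term, str):
        yield prefix + ("*",)
        return
    head, *args = term
    if not args:
        yield prefix + (head,)
    for i, arg in enumerate(args, 1):
        yield from paths(arg, prefix + (head, i))

class PathIndex:
    def __init__(self):
        self.table = defaultdict(set)

    def insert(self, term_id, term):
        for p in paths(term):
            self.table[p].add(term_id)

    def candidates(self, query):
        # over-approximate the unifiable terms: for each non-variable
        # query path, keep terms sharing that path or holding a
        # variable at that position, and intersect the results
        result = None
        for p in paths(query):
            if p[-1] == "*":
                continue
            hits = self.table[p] | self.table[p[:-1] + ("*",)]
            result = hits if result is None else result & hits
        return result

idx = PathIndex()
idx.insert(1, ("f", ("a",), ("b",)))   # f(a, b)
idx.insert(2, ("f", "x", ("b",)))      # f(x, b)
idx.insert(3, ("g", ("a",)))           # g(a)
print(sorted(idx.candidates(("f", ("a",), "y"))))  # [1, 2]
```

Retrieval returns candidates only; an actual unification check filters them afterwards, which is the usual division of labour for path indexing.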

Roland Coghetto (Independent scholar (cafr-msa2p asbl - Belgium))
IsaGeoCoq, a port of GeoCoq with Isabelle/HOL

ABSTRACT. IsaGeoCoq is a port of GeoCoq 2.4.0, a library written for Rocq, to Isabelle/HOL. A brief description of all the translated files is presented, as well as the difficulties encountered during the manual translation and useful resources such as the Sledgehammer tool. Particular emphasis is placed on the evolution of the work: initially, the translation contained some implications at the level of the postulates of parallels. Currently, many equivalences between postulates have been formalized, thanks to the translation of the results obtained in GeoCoq, but also to the construction of many notions included in the first part of the book "Metamathematische Methoden in der Geometrie", allowing us to demonstrate the Pythagorean theorem, solve secondary-school-level exercises, and establish interpretations between different axiom systems such as those of Tarski and Hilbert.

Prunelle Colin (Université de Strasbourg)
Pierre Boutry (Université de Strasbourg)
On the Coq/Rocq Mechanization of Beeson's "On the Notion of Equal Figures in Euclid"

ABSTRACT. When Beeson started his project of computer-checking Euclid's Elements, he used a set of 36 axioms. They can be split into two kinds: 20 of them are Tarski-style axioms and the remaining 16 are about the notion of equal figures. Then, at the Logic Colloquium 2019, he presented his approach of defining the concept of equal figures and how to prove that it verifies the axioms needed for computer-checking Euclid's Elements. This work was completed with "On the Notion of Equal Figures in Euclid". Our work aims at the formal verification, in the Coq/Rocq proof assistant, of the proofs from "On the Notion of Equal Figures in Euclid". This would not only guarantee that there is no gap in the proofs, but it would also allow us to establish the mutual interpretability between the axioms used in "Proof-Checking Euclid" and the ones used by Tarski and his co-authors in "Metamathematische Methoden in der Geometrie". We use the GeoCoq library as a basis for this work. To the best of our knowledge, there is no formal verification of the results from "On the Notion of Equal Figures in Euclid".

Thierry Dana-Picard (Jerusalem College of Technology)
Singular points of plane curves obtained from geometric constructions: automated methods in man-and-machine collaboration

ABSTRACT. We explore envelopes and offsets, using automated commands for envelopes and geometric loci. The topology of envelopes and offsets is generally more complicated than that of the progenitor, hence the importance of studying the singularities. The work involves both analytic and algebraic methods (Gröbner bases, elimination), and networking between different kinds of software. Accurate plotting is a central issue, for which the new command Plot2D provided by GeoGebra-Discovery is efficient. Finally, we discuss briefly how some pitfalls of AI can and should be used to develop critical thinking and other skills. The activities presented here have been proposed to in-service teachers working towards an advanced degree.

Thiago Felicissimo (INRIA)
Jean-Pierre Jouannaud (Université Paris-Saclay, INRIA, Deducteam, Laboratoire de Méthodes Formelles)
Sort-Based Confluence Criteria for Non-Left-Linear Higher-Order Rewriting

ABSTRACT. Powerful confluence criteria for higher-order rewriting exist for left-linear systems, even in the presence of non-termination. On the other hand, confluence criteria that allow for mixing non-termination and non-left-linearity are either extremely limited or hardly usable in practice. In this paper, we study confluence criteria which explore sort information to make proving higher-order confluence possible, even in the presence of non-termination and non-left-linearity. We give many interesting examples of systems covered by our results, including a (confluent) variant of Klop's counterexample, and a calculus issuing from a dependent type theory with cumulative universes.

Florian Frohn (RWTH Aachen University)
Jürgen Giesl (RWTH Aachen University)
Infinite State Model Checking by Learning Transitive Relations

ABSTRACT. We propose a new approach for proving safety of infinite state systems. It extends the analyzed system by transitive relations until its diameter D becomes finite, i.e., until constantly many steps suffice to cover all reachable states, irrespective of the initial state. Then we can prove safety by checking that no error state is reachable in D steps. To deduce transitive relations, we use recurrence analysis. While recurrence analyses can usually find conjunctive relations only, our approach also discovers disjunctive relations by combining recurrence analysis with projections. An empirical evaluation of the implementation of our approach in our tool LoAT shows that it is highly competitive with the state of the art.

Carsten Fuhs (Birkbeck, University of London)
Liye Guo (Radboud University)
Cynthia Kop (Radboud University)
Proving Call-by-value Termination of Constrained Higher-order Rewriting by Dependency Pairs

ABSTRACT. n/a (the submission itself is a one-page abstract)

Thibault Gauthier (Czech Technical University in Prague)
Josef Urban (Czech Technical University in Prague)
Learning Conjecturing from Scratch

ABSTRACT. We develop a self-learning approach for conjecturing of induction predicates on a dataset of 16197 problems derived from the OEIS. These problems are hard for today's SMT and ATP systems because they require a combination of inductive and arithmetical reasoning.

Starting from scratch, our approach consists of a feedback loop that iterates between (i) training a neural translator to learn the correspondence between the problems solved so far and the induction predicates useful for them, (ii) using the trained neural system to generate many new induction predicates for the problems, (iii) fast runs of the Z3 prover attempting to prove the problems using the generated predicates, (iv) using heuristics such as predicate size and solution speed on the proved problems to choose the best predicates for the next iteration of training.

The algorithm discovers on its own many interesting induction predicates, ultimately solving over 5500 problems, compared to 2265 problems solved by CVC5, Vampire or Z3.

Jannis Gehring (DHBW Stuttgart)
Stephan Schulz (DHBW Stuttgart)
Premise Selection with the Alternating-Path-Method

ABSTRACT. We describe the implementation of the alternating path method for premise selection in the theorem prover PyRes.

Estifanos Getachew (University of Waterloo)
Arie Gurfinkel (University of Waterloo)
Richard Trefler (University of Waterloo)
Relatively Complete and Efficient Partial Quantifier Elimination

ABSTRACT. Quantifier elimination is used in various automated reasoning tasks, including quantified SMT solving, Exists/Forall solving, program synthesis, model checking, and constrained Horn clause (CHC) solving. Complete quantifier elimination, however, is computationally intractable for many theories. The recent algorithm QEL shows a promising approach to approximate quantifier elimination, which has resulted in improvements in solver performance. QEL performs partial quantifier elimination with a completeness guarantee that depends on a certain semantic property of the given formula. Considerably generalizing the previous approach, we identify a subclass of local theories in which partial quantifier elimination can be performed efficiently. We present T-QEL, a sound extension of QEL, and an efficient algorithm for this class of theories. The algorithm utilizes a proof-theoretic characterization of these theories based on restricted derivations. Finally, we prove that T-QEL is sound in general and relatively complete with respect to the identified class of theories.

Massin Guerdi (Ludwig-Maximilians-Universität München)
A Lambda-Superposition Tactic for Isabelle/HOL

ABSTRACT. We introduce jeha, an Isabelle/HOL tactic and automated theorem prover based on the λ-superposition calculus. An alternative to Isabelle’s metis tactic, jeha targets higher-order logic directly, avoiding the overhead introduced by metis’ translations to first-order logic. Like metis, jeha can be used as a reconstruction backend for Sledgehammer to improve the reconstruction rate of proofs produced by external higher-order automated theorem provers such as E, Vampire, and Zipperposition.

Simon Guilloud (EPFL)
Sankalp Gambhir (EPFL)
Auguste Poiroux (EPFL)
Yann Herklotz (EPFL)
Thomas Bourgeat (EPFL)
Viktor Kuncak (EPFL)
Julie Cailler (Université de Lorraine)
Interoperability of Proof systems with SC-TPTP (System Description)

ABSTRACT. We introduce SC-TPTP, an extension of the TPTP derivation format that encompasses sequent formalism, enabling seamless proof exchange between interactive theorem provers and first-order automated theorem provers. We provide a way to represent non-deductive steps (Skolemization, clausification, and Tseitin normal form) as deductive steps within the format. Building upon the existing support in the Lisa proof assistant and the Goéland theorem prover, the SC-TPTP ecosystem is further enhanced with proof output interfaces for Egg and Prover9, as well as proof reconstruction support for HOL Light, Lean, and Rocq.

Yoan Géran (Université de Strasbourg)
Pierre Boutry (Université de Strasbourg)
First-Order Simplification of GeoCoq using Dedukti

ABSTRACT. In this paper we present a procedure used to translate the GeoCoq library, a Rocq formalization of proofs in geometry, into a first-order logic encoding in the logical framework Dedukti. We describe the modifications made to the tool to permit this translation and the process that led to the first-order simplification. As a result, this translation enables us to share the proofs with other proof assistants.

Gábor Gévay (Bolyai Institute, University of Szeged)
Benedek Kovács (ADG Foundation)
Zoltán Kovács (The Private University College of Education of the Diocese of Linz, Institute of Initial Teacher Training)
Is a regular polygon determined by its diagonals?

ABSTRACT. We discuss some possible characterizations of regular polygons that can be directly used in algebraic provers in automated reasoning in geometry.

Thomas Hader (TU Wien)
Ahmed Irfan (SRI International)
Stéphane Graham-Lengrand (SRI International)
PO: Decision Heuristics in MCSat

ABSTRACT. The Model Constructing Satisfiability (MCSat) approach to Satisfiability Modulo Theories (SMT) has demonstrated strong performance when handling complex theories such as nonlinear arithmetic. Despite being in development for over a decade, there has been limited research on the heuristics utilized by MCSat solvers such as Yices2.

In this paper, we discuss the decision heuristics employed in the MCSat approach of Yices2 and empirically show their significance on QF_NRA and QF_NIA benchmarks. Additionally, we propose new ideas to enhance these heuristics by leveraging theory-specific reasoning and drawing inspiration from recent advancements in SAT solvers.

Our new version of the MCSat Yices2 solver not only solves more nonlinear arithmetic benchmarks than before but is also more efficient compared to other leading SMT solvers.

Marton Hajdu (TU Wien)
Robin Coutelier (TU Wien)
Laura Kovács (TU Wien)
Andrei Voronkov (The University of Manchester)
Term Ordering Diagrams

ABSTRACT. The superposition calculus for reasoning in first-order logic with equality relies on simplification orderings on terms. Modern saturation provers use the Knuth-Bendix order (KBO) and the lexicographic path order (LPO) for discovering redundant clauses and inferences. Implementing term orderings is however challenging. While KBO comparisons can be performed in linear time and LPO checks in quadratic time, using the best known algorithms for these orders is not enough. Indeed, our experiments show that for some examples term ordering checks may use about 98% of the overall proving time. The reason for this is that some equalities that cannot be ordered can become ordered after applying a substitution (post-ordered), and we have to check for post-ordering repeatedly for the same equalities. In this paper, we show how to improve post-ordering checks by introducing a new data structure called term ordering diagrams, in short TODs, which creates an index for these checks. We achieve efficiency by lazy modifications of the index and by storing and reusing information from previously performed checks to speed up subsequent checks. Our experiments demonstrate efficiency of TODs.
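For readers unfamiliar with KBO, here is a ground-term sketch (our illustration with made-up weights and precedence; full KBO additionally tracks variable occurrence counts, and it is exactly the comparison of terms under substitutions where the paper's post-ordering checks arise):

```python
# Ground Knuth-Bendix order sketch. Terms are nested tuples:
# ("f", ("a",)) represents f(a). Weights and precedence are
# hypothetical choices for illustration.
WEIGHT = {"f": 1, "g": 1, "a": 1, "b": 1}
PREC = {"a": 0, "b": 1, "g": 2, "f": 3}  # precedence a < b < g < f

def w(t):
    # total symbol weight of a ground term
    head, *args = t
    return WEIGHT[head] + sum(w(s) for s in args)

def kbo_greater(s, t):
    # s > t iff s is heavier, or equally heavy with a head of greater
    # precedence, or same head with lexicographically greater arguments
    if w(s) != w(t):
        return w(s) > w(t)
    (sh, *sa), (th, *ta) = s, t
    if sh != th:
        return PREC[sh] > PREC[th]
    for si, ti in zip(sa, ta):
        if si != ti:
            return kbo_greater(si, ti)
    return False

print(kbo_greater(("f", ("a",)), ("g", ("b",))))  # True: equal weight, f > g
```

Even this simple comparison is recursive over both terms, which hints at why repeated (post-)ordering checks dominate runtime and why caching them in an index such as TODs pays off.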

Marton Hajdu (TU Wien)
Laura Kovács (TU Wien)
Andrei Voronkov (The University of Manchester)
Partial Redundancy in Saturation

ABSTRACT. Redundancy elimination is one of the crucial ingredients of efficient saturation-based proof search. We improve redundancy elimination by introducing a new notion of redundancy, based on partial clauses and redundancy formulas, which is more powerful than the standard notion: there are both clauses and inferences that are redundant when we use our notions and not redundant when we use standard notions. In a way, our notion blurs the distinction between redundancy at the level of inferences and redundancy at the level of clauses. We present a superposition calculus PaRC on partial clauses. Our calculus is refutationally complete and is strong enough to capture some standard restrictions of the superposition calculus. We discuss the implementation of the calculus in the theorem prover Vampire. Our experiments show the power of the new approach: we were able to solve 24 TPTP problems not previously solved by any prover, including previous versions of Vampire.

Simone Heisinger (Johannes Kepler University Linz)
Katharina Blaimschein (Johannes Kepler University Linz)
Martina Seidl (Johannes Kepler University Linz)
Evaluating LLMs on Formal Models Coursework

ABSTRACT. The use of Large Language Models is on the rise in educational settings, for tasks such as creating teaching material, helping with assignments, or solving exercises. At the same time, these new possibilities bring new challenges regarding the understanding of the capabilities of LLMs. To offer a glimpse into the current capabilities of LLMs in solving coursework, we developed a set of questions from our Bachelor's course Formal Models, which allows us to rate and compare the answers generated by LLMs. We developed an evaluation template to rate the responses, which allows us to compare not only the end result of given answers, but also intermediate steps and the understanding of given tasks. The evaluations are presented on an interactive website, which allows us to offer new insights and enables a faster response to newly released models.

Jonathan Hellwig (Karlsruhe Institute of Technology)
André Platzer (Karlsruhe Institute of Technology)
A Real-Analytic Approach to Differential-Algebraic Dynamic Logic

ABSTRACT. This paper introduces a proof calculus for real-analytic differential-algebraic dynamic logic, a variant of differential-algebraic dynamic logic. Our calculus allows correct reduction of differential-algebraic equations to ordinary differential equations through axiomatized index reduction. Additionally, it ensures compatibility between the differential-algebraic equation proof calculus and (differential-form) differential dynamic logic for hybrid systems. One key contribution is ghost switching, which establishes precise conditions to decompose multi-modal systems into hybrid programs, thereby correctly hybridizing sophisticated differential-algebraic dynamics. We demonstrate our calculus on a Euclidean pendulum and formally verify its energy conservation.

Roland Herrmann (University of Regensburg)
Philipp Rümmer (University of Regensburg)
What's Decidable About Arrays With Sums?

ABSTRACT. The theory of arrays, supported by virtually all SMT solvers, is one of the most important theories in program verification. The standard theory of arrays, which provides \emph{read} and \emph{write} operations, has been extended in various different ways in past research, among others by adding extensionality, constant arrays, function mapping (resulting in combinatorial array logic), counting, and projections. This paper studies array theories extended with \emph{sum constraints,} which capture properties of the sum of all elements of an integer array. The paper shows that the theory of extensional arrays extended with constant arrays and sum constraints can be decided in non-deterministic polynomial time. The decision procedure works both for finite and infinite index sorts, as long as the cardinality is fixed a priori. In contrast, adding sum constraints to combinatorial array logic gives rise to an undecidable theory. The paper concludes by studying several fragments in between standard arrays with sums and combinatorial arrays with sums, aiming at providing a complete characterization of decidable and undecidable fragments.

Andrzej Indrzejczak (University of Lodz, Poland)
Cut Elimination for Negative Free Logics with Definite Descriptions

ABSTRACT. We present a sequent calculus GNFL for negative free logic with definite descriptions in the classical and intuitionistic version, with empty and nonempty domains. It is shown constructively that GNFL satisfies the cut elimination theorem and its cut-free version satisfies the subformula property.

Predrag Janicic (University of Belgrade)
Julien Narboux (IRIF - Université Paris Cité)
Geometry Machine Revisited

ABSTRACT. Gelernter's Geometry Machine is significant not only as the very first automated prover for geometry but also as one of the earliest artificial intelligence systems. Developed almost seventy years ago, it remains a seminal work, in some aspects still fresh and inspiring. In this paper, we provide historical reflections on the Geometry Machine, interpret its results in modern terms, and attempt to reproduce its outputs.

Alexandre Jean (ICube UMR 7357 CNRS Université de Strasbourg)
Pierre Boutry (ICube UMR 7357 CNRS Université de Strasbourg)
Nicolas Magaud (ICube UMR 7357 CNRS Université de Strasbourg)
An Automated Approach towards Constructivizing the GeoCoq Library

ABSTRACT. In this work, we aim at the verification in Coq/Rocq of Beeson's main result in A constructive version of Tarski's geometry, namely that the arithmetization of geometry can be obtained without assuming any decidability property. Our first goal is to deal with the lemmas that do not assert the existence of some geometrical objects. This is what Beeson calls the "wholesale importation of proofs of negative theorems from classical to constructive geometry". Instead of relying on the Gödel double-negation interpretation, we work at the level of tactics, the language used to build proofs interactively. Reasoning steps that are not intuitionistically acceptable can still be performed under the assumption that they are used to prove a negative theorem, thus constructivizing the proofs. We rely on coq-ditto to automate the constructivization process. We report on the current progress towards this goal.

Michael Katzenberger (Technical University of Munich)
Jürgen Richter-Gebert (Technical University of Munich)
Manifold-based Proving Methods in Projective Geometry

ABSTRACT. This article compares different proving methods for projective incidence theorems. In particular, a technique using quadrilateral tilings recently introduced by Sergey Fomin and Pavlo Pylyavskyy is shown to be at most as strong as proofs using bi-quadratic final polynomials and thus, also proofs using Ceva-Menelaus-tilings. Furthermore, the equivalence between quadrilateral-tiling-proofs and proofs using exclusively Menelaus configurations is shown as well as additional results for Ceva-Menelaus-proofs.

Daniela Kaufmann (TU Wien)
James Davenport (University of Bath)
10th SC-Square Workshop

ABSTRACT. Name of the Event: SC-Square, the Workshop on Satisfiability Checking and Symbolic Computation

Contact Information: At the moment, the appointment process of the workshop organisers is still in progress. In the meantime, the Steering Committee of SC-Square stand as organisers. To contact us, please send a mail to: scsc-steering@lists.bath.ac.uk

Brief description for the website: Symbolic Computation is concerned with the efficient algorithmic determination of exact solutions to complicated mathematical problems. Satisfiability Checking has recently started to tackle similar problems but with different algorithmic and technological solutions. The two communities share many central interests, but researchers from these two communities rarely interact. Also, the lack of common or compatible interfaces for tools is an obstacle to their fruitful combination. Bridges between the communities in the form of common platforms and road-maps are necessary to initiate an exchange, and to support and direct their interaction. The aim of this workshop is to provide an opportunity to discuss, share knowledge and experience across both communities.

Daniela Kaufmann (TU Wien)
Clemens Hofstadler (Johannes Kepler University Linz)
EA Recycling Algebraic Proof Certificates

ABSTRACT. Proof certificates can be used to validate the correctness of algebraic derivations. However, in practice, we frequently observed that the exact same proof steps are repeated for different sets of variables, which leads to unnecessarily large proofs. To overcome this issue we extend the existing Practical Algebraic Calculus with linear combinations (LPAC) with two new proof rules that allow us to capture and reuse parts of the proof to derive a more condensed proof certificate. We integrate these rules into the proof checker Pacheck 2.0. Our experimental results demonstrate that the proposed extension helps to reduce both proof size and verification time.

András Kerekes (Radnóti Miklós High School, Szeged)
Zoltán Kovács (The Private University College of Education of the Diocese of Linz, Institute of Initial Teacher Training)
Towards automatic detection of geometric difficulty of geometry problems

ABSTRACT. In this contribution, we compare three different options for defining the geometric difficulty of geometry problems. By computing the correlation between the obtained values and the algebraic complexity, and, alternatively, ad hoc human intuition, we confirm that our former definition seems acceptable.

Laura Kovács (TU Wien)
Michael Rawson (TU Wien)
Vampire 2025: The 9th Vampire Workshop

ABSTRACT. The Vampire 2025 workshop discusses recent developments in implementing, applying, benchmarking, and comparing first-order theorem provers and their combinations with other systems. The workshop will shed light on questions such as: what is essential for substantial progress in theorem proving tools; what are the best implementation principles; what are the best heuristics and strategies, depending on application areas; what can be learned from both successful and unsuccessful case studies; and which features are missing in modern theorem provers.

Zoltán Kovács (The Private University College of Education of the Diocese of Linz, Institute of Initial Teacher Training)
Pedro Quaresma (Department of Mathematics, School of Science and Technology, University of Coimbra)
Automated Deduction in Geometry (ADG’25)

ABSTRACT. Proposal for a Workshop at CADE 2025

Zoltán Kovács (The Private University College of Education of the Diocese of Linz, Institute of Initial Teacher Training)
Proving theorems on regular polygons by elimination --- the elementary way

ABSTRACT. We use elementary geometry and elimination to prove some non-trivial theorems on regular polygons, including regular 5-, 7- and 9-gons. The required knowledge to follow the proofs does not go beyond high school knowledge.

Hendrik Leidinger (Max Planck Institute for Informatics)
Christoph Weidenbach (Max Planck Institute for Informatics)
Computing Ground Congruence Classes

ABSTRACT. Congruence closure on ground equations is a well-established and efficient algorithm for deciding ground equalities. It constructs an explicit representation of ground equivalence classes based on a given set of input equations, allowing ground equalities to be decided by membership. In many applications, these ground equations originate from grounding non-ground equations.

We propose an algorithm that directly computes a non-ground representation of ground congruence classes for non-ground equations. Our approach is sound and complete with respect to the corresponding ground congruence classes. Experimental results demonstrate that computing non-ground congruence classes often outperforms the classical ground congruence closure algorithm in efficiency.
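The classical ground congruence closure that the paper's non-ground approach is compared against can be sketched with a union-find. This is a minimal, naive-fixpoint version under the usual textbook presentation, not the authors' algorithm; efficient implementations use signature tables and a worklist instead of the quadratic scan shown here.

```python
# Minimal ground congruence closure via union-find.
# Terms are nested tuples such as ("f", ("a",)) for f(a).
# Naive fixpoint sketch; not the paper's non-ground algorithm.

class CongruenceClosure:
    def __init__(self):
        self.parent = {}

    def find(self, t):
        self.parent.setdefault(t, t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]  # path halving
            t = self.parent[t]
        return t

    def union(self, s, t):
        self.parent[self.find(s)] = self.find(t)

    def merge(self, s, t, terms):
        self.union(s, t)
        # Congruence: merge f(s1, ..., sn) with f(t1, ..., tn) whenever
        # all corresponding arguments are already equal, until fixpoint.
        changed = True
        while changed:
            changed = False
            for u in terms:
                for v in terms:
                    if (u[0] == v[0] and len(u) == len(v)
                            and self.find(u) != self.find(v)
                            and all(self.find(a) == self.find(b)
                                    for a, b in zip(u[1:], v[1:]))):
                        self.union(u, v)
                        changed = True

a, b = ("a",), ("b",)
fa, fb = ("f", a), ("f", b)
cc = CongruenceClosure()
cc.merge(a, b, [a, b, fa, fb])   # from a = b, derive f(a) = f(b)
print(cc.find(fa) == cc.find(fb))  # → True
```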

Enrico Lipparini (University of Cagliari)
Thomas Hader (TU Wien)
Ahmed Irfan (SRI International)
Stéphane Graham-Lengrand (SRI International)
Boosting MCSat Modulo Nonlinear Integer Arithmetic via Local Search.

ABSTRACT. The Model Constructing Satisfiability (MCSat) approach to the SMT problem extends the ideas of CDCL from the SAT level to the theory level. Like SAT, its search is driven by incrementally constructing a model by assigning concrete values to theory variables and performing theory-level reasoning to learn lemmas when conflicts arise. Therefore, the selection of values can significantly impact the search process and the solver’s performance. In this work, we propose guiding the MCSat search by utilizing assignment values discovered through local search. First, we present a theory-agnostic framework to seamlessly integrate local search techniques within the MCSat framework. Then, we highlight how to use the framework to design a search procedure for Nonlinear Integer Arithmetic (NIA), utilizing accelerated hill-climbing and a new operation called feasible-sets jumping. We implement the proposed approach in the MCSat engine of the Yices2 solver, and empirically show its performance using the NIA benchmarks of SMT-LIB.
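The hill-climbing component can be illustrated on a toy nonlinear integer constraint. This is a sketch only: the constraint, the violation score, and the unit-move scheme are illustrative assumptions, and neither the paper's accelerated hill climbing nor feasible-sets jumping is shown.

```python
# Toy local search for a nonlinear integer constraint, illustrating
# plain hill climbing on a violation score.  Illustrative only; the
# paper's accelerated hill climbing and feasible-sets jumping differ.

def violation(assign):
    # Example constraints: x*y = 12 and x + y = 7 over the integers.
    x, y = assign["x"], assign["y"]
    return abs(x * y - 12) + abs(x + y - 7)

def hill_climb(assign, max_steps=1000):
    for _ in range(max_steps):
        if violation(assign) == 0:
            return assign                 # all constraints satisfied
        # Try unit moves on each variable; take the best strict improvement.
        best, best_score = None, violation(assign)
        for var in assign:
            for delta in (-1, 1):
                cand = dict(assign, **{var: assign[var] + delta})
                if violation(cand) < best_score:
                    best, best_score = cand, violation(cand)
        if best is None:
            return None                   # stuck in a local minimum;
        assign = best                     # a real solver would escape
    return None

sol = hill_climb({"x": 0, "y": 0})
print(sol, violation(sol))
```

From the all-zero start, this run reaches the model x = 4, y = 3; in general, plain hill climbing can get stuck, which is precisely why escape mechanisms such as the paper's feasible-sets jumping matter.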

Salvador Lucas (Universitat Politècnica de València)
Confluence of Almost Parallel-Closed Generalized Term Rewriting Systems

ABSTRACT. *Generalized Term Rewriting Systems* (GTRSs) extend Conditional Term Rewriting Systems by (i) restricting rewritings on arguments of function symbols and (ii) allowing for more general conditions in rules, namely, atoms defined by a set of Horn clauses. They are useful to model and analyze properties of computations with sophisticated languages like Maude. Huet proved that left-linear and parallel closed Term Rewriting Systems (TRSs) are confluent. In this paper, we generalize and extend Huet's results to GTRSs *without requiring left-linearity*. This improves Huet's result, as we show with some examples. Also, Huet's result entails confluence of weakly orthogonal TRSs, thus providing a syntactic criterion for proving confluence *without requiring termination*. We similarly introduce *weakly V-orthogonal* GTRSs, which are confluent. These results, implemented in the confluence tool CONFident, improve its ability to deal with context-sensitive and conditional term rewriting systems. We briefly report on this.

Alessio Mansutti (IMDEA Software Institute)
PO Integer Linear-Exponential Programming

ABSTRACT. In this talk, I will present Integer linear-exponential programming (ILEP), an extension of integer linear programming (ILP) in which linear systems of inequalities are generalized to also feature two additional functions: the exponential function x → 2^x and the remainder function (x, y) → (x mod 2^y). We will focus on interesting aspects of ILEP, and on the symbolic techniques used to design the current best-known algorithm for solving this problem.

Vesna Marinkovic (Faculty of Mathematics, University of Belgrade)
Tijana Sukilovic (Faculty of Mathematics, University of Belgrade)
Filip Maric (University of Belgrade)
Different Types of Locus Dependencies in Solving Geometry Construction Problems

ABSTRACT. Over the centuries, geometric construction problems have received significant attention. Although many of these problems can be solved using only a ruler and a compass, there are some that cannot. Relatively little effort has been devoted to identifying construction problems that have an obstruction to their solution. This paper focuses on locus-dependent problems – problems where the required figure exists only under specific constraints on the given elements. We explore various types of locus dependencies encountered in geometric constructions and propose automated procedures to solve these types of problems.

Tomaz Mascarenhas (Federal University of Minas Gerais)
Harun Khan (Stanford University)
Abdalrhman Mohamed (Stanford University)
Andrew Reynolds (University of Iowa)
Haniel Barbosa (Universidade Federal de Minas Gerais)
Clark Barrett (Stanford University)
Cesare Tinelli (The University of Iowa)
PO Formalization of a Proof Calculus for Incremental Linearization for NTA and NRA Satisfiability

ABSTRACT. Determining the satisfiability of formulas involving nonlinear real arithmetic and transcendental functions, such as exponential and sine, is necessary in many applications, such as formally verifying dynamic systems, digital signal processing, and motion planning. Doing this automatically generally requires costly and intricate methods, which limits their applicability. In the context of SMT solving, Incremental Linearization was introduced recently to facilitate reasoning in this domain via an incomplete but easy-to-implement and highly effective approach. The approach, based on abstraction-refinement via an incremental axiomatization of the nonlinear and transcendental operators, is currently implemented in the SMT solvers MathSAT and cvc5. The cvc5 implementation is also proof-producing.

In this paper, we present a formalization in the Lean proof assistant of the proof calculus employed by cvc5. This formalization ensures the soundness of the proof calculus, not only making the underlying algorithm more trustworthy but also enabling the possibility of reconstructing in Lean the proofs emitted by cvc5 for this logic fragment. We discuss how we modeled the rules in the proof assistant and the challenges encountered while formalizing them. As far as we know, this is the first formalization of a proof calculus for incremental linearization for satisfiability modulo nonlinear arithmetic and transcendental functions.

Stephan Merz (Inria Nancy)
Invariant Synthesis: Decidable Fragments to the Rescue

ABSTRACT. Techniques for synthesizing inductive invariants of distributed algorithms are becoming useful. This presentation reports on recent work for generating an inductive invariant for the Raft consensus algorithm, and it points out the importance of designing decidable and expressive fragments of first-order logic.

Jasper Nalbach (RWTH Aachen University)
Lucas Michel (Université de Liège)
Erika Abraham (RWTH Aachen University)
Christopher Brown (United States Naval Academy)
James H. Davenport (University of Bath)
Matthew England (Coventry University)
Pierre Mathonet (Université de Liège)
Naïm Zenaïdi (Université de Liège)
FP Projective Delineability for Single Cell Construction

ABSTRACT. The cylindrical algebraic decomposition (CAD) is the only complete method used in practice for solving problems like quantifier elimination or SMT solving related to real algebra, despite its doubly exponential complexity. Recent exploration-guided algorithms like NLSAT, NuCAD, and CAlC rely on CAD technology but reduce the computational effort heuristically. Single cell construction is a paradigm that is used in each of these algorithms.

The central property on which the CAD algorithm is based is called delineability. Recently, we introduced a weaker notion called projective delineability which can require fewer computations to guarantee, but needs to be applied carefully. This paper adapts the single cell construction for exploiting projective delineability and reports on experimental results.

Jasper Nalbach (RWTH Aachen University)
Erika Ábrahám (RWTH Aachen University)
FP A Variant of Non-uniform Cylindrical Algebraic Decomposition for Real Quantifier Elimination

ABSTRACT. The Cylindrical Algebraic Decomposition (CAD) method is currently the only complete algorithm used in practice for solving real-algebraic problems. To ameliorate its doubly-exponential complexity, different exploration-guided adaptations try to avoid some of the computations. The first such adaptation named NLSAT was followed by Non-uniform CAD (NuCAD) and the Cylindrical Algebraic Covering (CAlC). Both NLSAT and CAlC have been developed and implemented in SMT solvers for satisfiability checking, and CAlC was recently also adapted for quantifier elimination. However, NuCAD was designed for quantifier elimination only, and no complete implementation existed before this work.

In this paper, we present a novel variant of NuCAD for both real quantifier elimination and SMT solving, provide an implementation, and evaluate the method by experimentally comparing it to CAlC.

Julien Narboux (IRIF - Université Paris Cité)
Walther Neuper (Johannes Kepler University)
Jørgen Villadsen (Technical University of Denmark)
Pedro Quaresma (Department of Mathematics, School of Science and Technology, University of Coimbra)
Theorem Proving Components for Educational Software (ThEdu'25) Proposal for a Workshop at CADE 2025

ABSTRACT. (Computer) Theorem Proving (TP) is becoming a paradigm as well as a technological base for a new generation of educational software in science, technology, engineering and mathematics.

The paradigm calls for complete, transparent and interactive software models of mathematics: the models shall be \emph{complete} with respect to deduction from first principles (``systems that explain themselves'') and with respect to coverage in doing mathematics --- formal specification, separation into sub-problems and assigning methods for solving a specified problem.

The models shall be \emph{transparent} concerning underlying mathematics knowledge, which is mechanized in a human-readable representation, and concerning intermediate steps towards a problem solution. The models shall be \emph{interactive} such that the input of students is checked reliably and verbosely, and such that the system provides feedback.

The ThEdu workshop brings together experts in automated theorem proving or interactive theorem proving with experts in education to design intelligent systems for teaching and evaluate their impact.

Johannes Niederhauser (University of Innsbruck)
Aart Middeldorp (University of Innsbruck)
The Computability Path Ordering for Beta-Eta-Normal Higher-Order Rewriting

ABSTRACT. We lift the computability path ordering and its extensions from plain higher-order rewriting to higher-order rewriting on beta-eta-normal forms where matching modulo beta-eta is employed. The resulting order NCPO is shown to be useful on practical examples. In particular, it can handle systems where its cousin NHORPO fails even when it is used together with the powerful transformation technique of normalization. We also argue that automating NCPO efficiently is straightforward using SAT/SMT solvers whereas this cannot be said about the transformation technique of neutralization. Our prototype implementation supports automatic termination proof search for NCPO and is also the first one to automate NHORPO with neutralization.

Jan Otop (University of Wrocław)
Anti-pattern templates

ABSTRACT. We introduce anti-pattern templates, which extend both anti-pattern matching, by abstracting ground terms to variables, and unification, by negative constraints. For anti-pattern templates, we study the satisfiability problem, which asks whether a given template can be instantiated to a valid anti-pattern matching instance.

We show that the satisfiability problem for anti-pattern templates is NP-complete, but it becomes tractable if the number of negation symbols is bounded. Next, we consider anti-pattern templates modulo an equational theory and discuss its relations with other variants of equational frameworks with negative constraints. In particular, we conclude that the satisfiability problem for anti-pattern templates considered modulo associativity becomes undecidable, while it is decidable modulo associativity and commutativity.

Dirk Pattinson (The Australian National University)
Cláudia Nalon (University of Brasília)
Sourabh Peruri (Indian Institute of Science Bengaluru)
From Modal Sequent Calculi to Modal Resolution

ABSTRACT. We establish a systematic correspondence between sequent calculi and resolution calculi for a broad class of modal logics. Our main result is that soundness and completeness transfer from sequent to resolution calculi as long as cut and weakening are admissible. We first construct generative calculi that essentially import modal rules to a resolution setting, and then show how soundness and completeness transfer to absorptive calculi, where modal rules are translated into generalised resolution rules. We discuss resolution calculi that establish validity, and then introduce local clauses to generate calculi for inconsistency. Finally, for modal rules of a certain shape, we show how to construct layered resolution calculi, a technique that has so far only been established for normal modal logics. Our work directly yields new sound and complete resolution calculi, and layered resolution calculi, for a large number of modal logics.

Álvaro Pereira-Albert (Universidad Nebrija)
Tomas Recio (Universidad Nebrija)
M. Pilar Velez (Universidad Nebrija)
Computing loci with GeoGebra Discovery: some issues and suggestions

ABSTRACT. In our contribution, we discuss the current performance of the GeoGebra Discovery LocusEquation command, which automatically outputs where to place some point in a construction so that a given property is satisfied. We will show some examples of unexpected output of this command, regarding different computational or interpretative issues (number of conditions, non-principal locus ideals, nondegeneracy conditions, etc.). Then we focus on the proposal of some theoretical and algorithmic recommendations to improve the performance of this LocusEquation command.

Valentin Promies (RWTH Aachen University)
Jasper Nalbach (RWTH Aachen University)
Erika Abraham (RWTH Aachen University)
Paul Wagner (RWTH Aachen University)
More is Less: Adding Polynomials for Faster Explanations in NLSAT

ABSTRACT. To check the satisfiability of (non-linear) real arithmetic formulas, modern satisfiability modulo theories (SMT) solving algorithms like NLSAT depend heavily on single cell construction, the task of generalizing a sample point to a connected subset (cell) of $\RR^n$ that contains the sample and over which a given set of polynomials is sign-invariant.

In this paper, we propose to speed up the computation and simplify the representation of the resulting cell by extending the considered set of polynomials with further linear polynomials. While this increases the total number of (smaller) cells generated throughout the algorithm, our experiments show that it can pay off when using suitable heuristics due to the interaction with Boolean reasoning.

Long Qian (Carnegie Mellon University)
Eric Wang (Carnegie Mellon University)
Bernardo Subercaseaux (Carnegie Mellon University)
Marijn Heule (Carnegie Mellon University)
Unfolding Boxes with Local Constraints

ABSTRACT. We consider the problem of finding and enumerating polyominos that can be folded into multiple non-isomorphic boxes. While several computational approaches have been explored, including SAT, randomized algorithms, and decision diagrams, none has been able to perform at scale. We argue that existing SAT encodings are hindered by the presence of global constraints (e.g., graph connectivity or acyclicity), which are generally hard to encode effectively, and hard for solvers to reason about. In this work, we propose a new SAT-based approach that replaces these global constraints with simple local constraints that have substantially better propagation properties. Our approach dramatically improves the scalability of both computing and enumerating common box unfoldings: (i) while previous approaches could only find common unfoldings of two boxes up to area 88, ours easily scales up to 150, and (ii) while previous approaches were only able to enumerate common unfoldings up to area 30, ours scales up to 60. This allows us to rule out 46, 54, and 58 as the smallest areas allowing a common unfolding of three boxes, thereby refuting a conjecture of Xu et al. (2017).

Pedro Quaresma (Department of Mathematics, School of Science and Technology, University of Coimbra)
Nuno Baeta (University of Coimbra)
A Generator of Geometry Deductive Database Method Provers

ABSTRACT. Geometry Automated Theorem Provers (GATPs) based on the deductive database method use a data-based search strategy to improve the efficiency of forward chaining. An implementation of such a method is expected to be able to efficiently prove a large set of geometric conjectures, producing readable proofs. The number of conjectures a given implementation can prove will depend on the set of inference rules chosen. Natural language and visual renderings of the proofs are made possible by the use of a synthetic forward-chaining method. In the original approach, the method and the corresponding implementation had the axioms and rules of inference hard-coded. In our implementation, the OGP-GDDM prover, we used an approach based on an SQL database library and an in-memory database to allow a more flexible approach to re-implementations of the provers whenever a different set of axioms and rules of inference is needed. The Provers-Generator for the Geometric Deductive Databases Method (PG for short) is the next natural step. The PG is a program that, given a set of rules, generates a prover, an OGP-GDDM prover, for that specific set of rules. The applications in areas such as education are very important given the possibility, opened by the PG, of having a prover capable of producing readable proofs adapted to a specific audience.

Pedro Quaresma (Department of Mathematics, School of Science and Technology, University of Coimbra)
Pierluigi Graziani (University of Urbino)
Towards a Structural Analysis of Interesting Geometric Theorems: Analysis of a Survey

ABSTRACT. The concept of what makes a geometric theorem “interesting” is deeply rooted in human intuition, pedagogy, and aesthetic perception. Despite its centrality in the mathematical experience, “interestingness” remains elusive to formal characterisation. Larry Wos highlighted this profound challenge in his 31st problem [8], calling for methods not only to automate theorem proving but to enable the generation and recognition of new and interesting theorems. Building upon Wos’ foundational insight, Quaresma, Graziani, and Nicoletti [6] formalised a related and precise computational question: whether it is algorithmically decidable if a Turing Machine can produce interesting theorems. By reduction to Rice’s theorem [7], they demonstrated that this problem is undecidable, establishing significant theoretical limits. However, this does not mean we cannot attempt to address Wos’ problem (specifically its aspect concerning interestingness) in a heuristic way. In our project, we aim to present a methodology capable of exploring the structural properties of geometric theorems that may underlie their perceived interestingness. Our approach systematically integrates human-based survey data, automated theorem proving, and Geometrographic analysis. First, we will show how it is possible to build upon the results of a comprehensive survey conducted among mathematicians, educators, and students. Participants were asked to list geometric theorems they considered interesting and to provide qualitative explanations for their choices. This phase allowed us to capture the human dimension of mathematical interest, revealing subjective factors such as simplicity, elegance, surprise, utility, and conceptual depth. Second, we will show how to construct formal proofs for each identified theorem using automated theorem proving tools, specifically employing the Area Method [2] (and similar methods that can be characterised geometrographically) as implemented in the GCLC prover [1,3].
These proofs will be synthetically generated, ensuring uniformity and enabling precise structural analysis. Third, we will show how the given formal proofs can be analysed using a set of quantitative metrics known as Geometrographic coefficients. A precondition for this project is a rigorous characterisation of the human perception of interestingness. To that end, we designed a survey; answers have been collected on the subject of interesting geometric theorems.

Pedro Quaresma (Department of Mathematics, School of Science and Technology, University of Coimbra)
Predrag Janicic (University of Belgrade)
Julien Narboux (Université Paris Cité, IRIF)
Zoltán Kovács (The Private University College of Education of the Diocese of Linz, Institute of Initial Teacher Training)
Anna Petiurenko (University of the National Education Commission, Kraków)
Filip Marić (University of Belgrade)
Nuno Baeta (University of Coimbra)
ADG-Lib Initiative

ABSTRACT. This note briefly presents the ADG-Lib initiative. Our goal is to provide rigorous descriptions of the axiom systems used by the different automatic and interactive geometry theorem provers, propose common input and output languages for geometry theorem provers, and establish and make available to the research community a large library of geometric problems.

Florian Rabe (FAU Erlangen-Nürnberg)
Uwe Waldmann (MPI Informatics)
Özgür Lütfü Özcep (University of Hamburg)
Kai Sauerwald (FernUni Hagen)
Deduktionstreffen

ABSTRACT. The annual Workshop Deduktionstreffen (German, roughly translates to ``deduction meeting'') is the prime activity of the Special Interest Group on Deduction Systems (Fachgruppe Deduktionssysteme) of the AI Chapter (Fachbereich KI) of the German Society of Informatics (Gesellschaft fuer Informatik, GI) and will be organized this year jointly with the Special Interest Group on Knowledge Representation and Reasoning (Fachgruppe Wissensrepräsentation und Schließen).

The Deduktionstreffen is an international meeting with an informal and friendly atmosphere, where everyone interested in deduction systems and related topics can report on their work in an accessible setting. It has a long tradition for being a meeting place for the German community on automated reasoning. Some previous meetings have been associated with CADE.

A special focus of the workshop is on young researchers and students, who are particularly encouraged to present their ongoing research projects to a wider audience, and to receive constructive feedback from more experienced participants. Another goal of the meeting is to stimulate networking effects and to foster collaborative research projects.

Teppei Saito (JAIST)
Nao Hirokawa (JAIST)
Lexicographic Combination of Reduction Pairs

ABSTRACT. We present a simple criterion for combining reduction pairs lexicographically. The criterion is applicable to arbitrary classes of reduction pairs, such as the polynomial interpretation, the matrix interpretation, and the Knuth--Bendix order. In addition, we investigate a variant of the matrix interpretation where the lexicographic order is employed instead of the usual component-wise order. Effectiveness is demonstrated by experiments and examples, including Touzet's Hydra Battle.
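As a rough illustration of the underlying idea (not the paper's criterion, which concerns reduction pairs on terms with their closure properties), a lexicographic combination uses the strict part of the first pair and, on ties in its non-strict part, falls back to the second. The sketch below abstracts terms to tuples of interpretation values; all names are ours.

```python
# Toy sketch: lexicographically combining two "reduction pairs", each modelled
# here as a pair of comparison functions (non-strict, strict) on abstracted terms.

def lex_combine(pair1, pair2):
    ge1, gt1 = pair1
    ge2, gt2 = pair2

    def gt(s, t):
        # s > t in the combination iff s >_1 t, or s >=_1 t and s >_2 t
        return gt1(s, t) or (ge1(s, t) and gt2(s, t))

    def ge(s, t):
        # non-strict part: s >_1 t, or s >=_1 t and s >=_2 t
        return gt1(s, t) or (ge1(s, t) and ge2(s, t))

    return ge, gt


# Example: first pair compares the first component, second pair the second
# (think of two polynomial interpretations measuring different quantities).
ge1, gt1 = (lambda s, t: s[0] >= t[0]), (lambda s, t: s[0] > t[0])
ge2, gt2 = (lambda s, t: s[1] >= t[1]), (lambda s, t: s[1] > t[1])
ge, gt = lex_combine((ge1, gt1), (ge2, gt2))
```

A tie in the first measure, e.g. `(1, 2)` versus `(1, 1)`, is broken by the second, which is what makes the combination strictly more powerful than either pair alone.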

Renate A. Schmidt (University of Manchester)
Hongkai Yin (Central European University)
Applying SCAN to Basic Path Logic and the Ordered Fragment (Extended Abstract)

ABSTRACT. The SCAN algorithm aims to solve the problem of eliminating second-order quantifiers from an existentially second-order quantified first-order formula and equivalently expressing it in first-order logic. This problem has applications in many subfields of computer science, from modal correspondence theory, knowledge representation, and automated reasoning to verification. In this paper we consider the application of the SCAN algorithm to basic path logic and the ordered fragment of first-order logic. We extend the clausal class of basic path logic to accommodate inequalities, which characterizes the search space of SCAN for basic path logic. By introducing constant Skolemization, we show that sentences of the ordered fragment can be transformed into basic path clauses and the generalized class, and this transformation preserves logical consequences in the ordered fragment. We prove that SCAN terminates on clauses of basic path logic, and therefore it decides second-order quantifier elimination and semantic forgetting for basic path logic. We show a general result that inequalities between constants can be eliminated without cost to consequences in basic path logic or the ordered fragment. The result also shows that SCAN can compute uniform interpolants expressed in basic path logic of ordered sentences.

Roberto Sebastiani (University of Trento)
Entailment vs. Verification for Partial-assignment Satisfiability and Enumeration

ABSTRACT. Many procedures for SAT-related problems, in particular those requiring the complete enumeration of satisfying truth assignments, owe their efficiency and effectiveness to the detection of (possibly small) partial assignments satisfying an input formula. Surprisingly, there seems to be no unique universally-agreed definition of formula satisfaction by a partial assignment in the literature. In this paper we analyze in depth the issue of satisfaction by partial assignments, highlighting some ambiguities and subtleties of this concept, and investigating their practical consequences. We identify two alternative notions that are implicitly used in the literature, namely verification and entailment, which coincide if applied to CNF formulas but differ and present complementary properties if applied to non-CNF or to existentially-quantified formulas. We show that, although the former is easier to check and as such is implicitly used by most current search procedures, the latter has better theoretical properties, and can improve the efficiency and effectiveness of enumeration procedures.
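The gap between the two notions can be seen on a tiny non-CNF example. The sketch below (our own illustration, not the paper's formalisation) models verification as three-valued evaluation under the partial assignment and entailment as truth in every total extension; the formula encoding and function names are assumptions.

```python
from itertools import product

# Formulas as nested tuples: ("var", name), ("not", f), ("and", ...), ("or", ...).
# Three-valued (Kleene) evaluation: None stands for "unassigned".
def evaluate(f, mu):
    op = f[0]
    if op == "var":
        return mu.get(f[1])
    if op == "not":
        v = evaluate(f[1], mu)
        return None if v is None else (not v)
    vals = [evaluate(g, mu) for g in f[1:]]
    if op == "and":
        return False if False in vals else (None if None in vals else True)
    if op == "or":
        return True if True in vals else (None if None in vals else False)

def verifies(f, mu):
    # "verification": the partial assignment already evaluates f to true
    return evaluate(f, mu) is True

def entails(f, mu, variables):
    # "entailment": every total extension of mu satisfies f (brute force)
    free = [v for v in variables if v not in mu]
    return all(evaluate(f, {**mu, **dict(zip(free, bits))}) is True
               for bits in product([False, True], repeat=len(free)))

x, y = ("var", "x"), ("var", "y")
nonCNF = ("or", ("and", x, y), ("and", x, ("not", y)))   # (x∧y) ∨ (x∧¬y)
mu = {"x": True}
# mu entails nonCNF (any value of y works), yet evaluation under mu is undefined,
# so mu does not verify it; on the CNF form (x∨y) ∧ (x∨¬y) the notions coincide.
```

This matches the abstract's observation: on CNF inputs the two notions coincide, while on non-CNF inputs entailment can accept strictly smaller partial assignments than verification.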

Shivam Sharma (Texas A&M University)
John Keyser (Texas A&M University)
Inference Maximizing Point Configurations for Parsimonious Algorithms

ABSTRACT. We present an exploration of inferring orientations for point configurations. We can compute the orientation of every triple of points by a combination of direct calculation and inference from prior computations. This ``parsimonious'' approach was suggested by Knuth in 1992, and aims to minimize calculation by maximizing inference. We investigate the efficacy of this approach by determining the minimum and maximum number of inferences that are possible for a point set configuration. No direct formula is known for the configurations which yield the maximum number of inferences, so we suggest two constructive approaches. We also present a basic analysis of sequences that achieve those maximum inferences and prove some properties regarding their existence.
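For readers unfamiliar with the setting, a minimal sketch (ours, not the paper's algorithm): the orientation of a planar point triple is the sign of a 2x2 determinant, and the simplest of Knuth's CC axioms (cyclic symmetry and antisymmetry) already let some orientations be inferred from earlier ones instead of recomputed.

```python
def orientation(p, q, r):
    # Sign of the determinant |q - p, r - p|:
    # +1 counterclockwise, -1 clockwise, 0 collinear.
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)

def infer_by_symmetry(known, p, q, r):
    """Infer orientation(p, q, r) for free when a permutation of the triple
    was already computed: cyclic shifts preserve the sign, swaps flip it."""
    for key, sign in (((p, q, r), 1), ((q, r, p), 1), ((r, p, q), 1),
                      ((p, r, q), -1), ((r, q, p), -1), ((q, p, r), -1)):
        if key in known:
            return sign * known[key]
    return None  # no prior computation applies: fall back to direct calculation
```

The full parsimonious scheme also exploits Knuth's stronger axioms (e.g. the five-point axiom) to infer orientations of genuinely new triples; the sketch only shows the cheapest kind of inference.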

Romain Sidhoum (Université de Montpellier)
Simon Robillard (Université de Montpellier)
Teaching Automated Deduction: an Implementation-Based Approach with Maude

ABSTRACT. This paper presents a pedagogical experience integrating the Maude language into a master's-level course on automated software verification. The course aims to provide students with both theoretical foundations and practical skills in formal methods, particularly through model checking and equational reasoning. Maude, a high-performance logical framework based on rewriting logic, was selected due to its strong formal semantics and versatility. Over the semester, students engaged in hands-on assignments using Maude to specify and verify system properties, culminating in a mini-project. The outcomes indicate that Maude facilitated a deeper understanding of formal verification techniques and allowed students to model complex systems concisely and rigorously. Reflections on student feedback and course design are discussed to inform future integration of formal tools in software verification curricula.

Viorica Sofronie-Stokkermans (University of Koblenz)
On Symbol Elimination and Uniform Interpolation in Theory Extensions

ABSTRACT. We study the links between symbol elimination in local theory extensions and possibilities of computing uniform interpolants for ground formulae w.r.t. such theories. In contrast to approaches which only consider the problem of eliminating existentially quantified variables, we distinguish between interpreted and uninterpreted function symbols and analyze possibilities of obtaining uniform interpolants which can contain arbitrary interpreted function symbols, and uninterpreted function symbols in a given, specified set. We identify situations in which a hierarchical form of symbol elimination can be successfully used for this, and also investigate the limitations of the method we propose.

Lukas Stevens (Technical University of Munich)
Rebecca Ghidini (École normale supérieure, Département d’informatique de l’ENS and Inria)
Simplified and Verified: A Second Look at a Proof-Producing Union-Find Algorithm

ABSTRACT. Using Isabelle/HOL, we verify a union-find data structure with an explain operation due to Nieuwenhuis and Oliveras. We devise a simpler, more naive version of the explain operation whose soundness and completeness are easy to verify. Then, we prove the original formulation of the explain operation to be equal to our version. Finally, we refine this data structure to Imperative HOL, enabling us to export efficient imperative code. The formalisation provides a stepping stone towards the verification of proof-producing congruence closure algorithms, which are a core ingredient of Satisfiability Modulo Theories (SMT) solvers.
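To make the "naive explain" idea concrete, here is a small sketch in the same spirit (our illustration, not the verified Isabelle/HOL development): every union records a labelled edge, and explain(a, b) returns the labels on a path between a and b, which together justify a = b.

```python
from collections import deque

class UnionFindExplain:
    """Union-find plus a naive explain: unions are remembered as labelled
    edges of an undirected graph, and explain searches that graph."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.edges = {i: [] for i in range(n)}  # node -> [(neighbour, label)]

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, label):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            self.edges[a].append((b, label))
            self.edges[b].append((a, label))

    def explain(self, a, b):
        # BFS over recorded union edges; the labels on the path prove a = b.
        prev = {a: None}
        queue = deque([a])
        while queue:
            x = queue.popleft()
            if x == b:
                break
            for y, lab in self.edges[x]:
                if y not in prev:
                    prev[y] = (x, lab)
                    queue.append(y)
        if b not in prev:
            return None  # a and b were never merged
        path = []
        x = b
        while prev[x] is not None:
            x, lab = prev[x]
            path.append(lab)
        return path[::-1]
```

The Nieuwenhuis–Oliveras version achieves the same result more efficiently with a proof forest; the graph-search formulation above is the kind of simple specification against which such an implementation can be proved equal.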

Martin Suda (Czech Technical University in Prague)
Efficient Neural Clause-Selection Reinforcement

ABSTRACT. Clause selection is arguably the most important choice point in saturation-based theorem proving. Framing it as a reinforcement learning task is a way to challenge the human-designed heuristics of state-of-the-art provers and to instead automatically evolve—just from prover experiences—their potentially optimal replacement.

In this work, we present a neural network architecture for scoring clauses for clause selection that is powerful yet efficient to evaluate. Following reinforcement learning principles to make design decisions, we integrate the network into the Vampire theorem prover and train it from successful proof attempts. An experiment on the diverse TPTP benchmark finds that the neurally guided prover improves over the baseline strategy from which it initially learns—in terms of the number of in-training-unseen problems solved under a practically relevant, short CPU instruction limit—by 20%.

Geoff Sutcliffe (University of Miami)
The CADE-30 ATP System Competition (CASC-30)

ABSTRACT. The CADE ATP System Competition (CASC) is the annual evaluation of fully automatic, classical logic, ATP systems – the world championship for such systems. One purpose of CASC is to provide a public evaluation of the relative capabilities of ATP systems. Additionally, CASC aims to stimulate ATP research, motivate development and implementation of robust ATP systems that can be easily and usefully deployed in applications, provide an inspiring environment for personal interaction between ATP researchers, and expose ATP systems within and beyond the ATP community. CASC evaluates the performance of ATP systems in terms of the number of problems solved with an acceptable solution output, and the average time taken for problems solved, in the context of a bounded number of eligible problems and specified time limits. CASC-30 will be held during the 30th International Conference on Automated Deduction.

Geoff Sutcliffe (University of Miami)
The TPTP Tea Party (TPTPTP)

ABSTRACT. The TPTP Tea Party (TPTPTP) brings together developers and users of the TPTP World, including (but not limited to): the TPTP problem library, the TSTP solution library, the TPTP syntax, the TPTPWorld software, etc. The meeting aims to elicit feedback, suggestions, criticisms, etc., of these resources, in order to ensure that the continued development of the TPTP World meets the needs of successful automated reasoning.

Melanie Taprogge (University of Greifswald, Institute of Mathematics and Computer Science, Greifswald, Germany)
Towards the Verification of Higher-Order Logic Automated Reasoning

ABSTRACT. The diversity of output encodings of different reasoning systems hinders proof verification and interoperability. The Dedukti framework addresses this issue by implementing the λΠ-calculus modulo, enabling proofs from different frameworks to be expressed, combined, and checked automatically. We identify common challenges and requirements for verifying proofs of automated theorem provers for higher-order logic, and develop general strategies for systematically deriving encodings of calculus rules and proof steps. Building on these strategies, we propose a modular encoding approach, which we demonstrate and evaluate using the higher-order logic prover Leo-III. This effort has already led to the discovery and resolution of minor bugs in the original prover.

Guilherme Toledo (State University of Campinas)
Benjamin Przybocki (University of Cambridge)
Yoni Zohar (Bar-Ilan University)
Being polite is not enough (and other limits of theory combination)

ABSTRACT. In the Nelson–Oppen combination method for satisfiability modulo theories, the combined theories must be stably infinite; in gentle combination, one theory has to be gentle, and the other has to satisfy a similar yet weaker property; in shiny combination, only one has to be shiny (smooth, with a computable minimal model function and the finite model property); and for polite combination, only one has to be strongly polite (smooth and strongly finitely witnessable). For each combination method, we prove that if any of its assumptions are removed, then there is no general method to combine an arbitrary pair of theories satisfying the remaining assumptions. We also prove new theory combination results that weaken the assumptions of gentle and shiny combination.

Guilherme Toledo (State University of Campinas)
Benjamin Przybocki (Stanford University)
Yoni Zohar (Bar-Ilan University)
Polite theory combination, an overview

ABSTRACT. Satisfiability modulo theories (SMT), if summarized into a single sentence, studies algorithms to test whether a formula is satisfiable in a given theory. Theory combination is a branch of SMT where we combine these algorithms in order to obtain one that works for the combined theory: methods include Nelson-Oppen ([NO79]), and those involving shininess ([RRZ05a]) and gentleness ([Fon09]).

One method that, although extremely useful and used in cvc5 ([BBB+22]), had a complicated history is the one involving strong politeness ([JB10]). It started as only politeness in [RRZ05b], meaning smooth and finitely witnessable. A theory is smooth when we can increase one of its models as we please while keeping a fixed formula ϕ satisfied. Finite witnessability, on the other hand, is far more difficult to intuit. It involves a computable function wit, called a witness, that takes a quantifier-free formula ϕ and returns another quantifier-free formula wit(ϕ) such that: first, ϕ and wit(ϕ) are essentially equivalent (more precisely, ϕ is T-equivalent to the existential closure of wit(ϕ) with respect to those variables not in ϕ); and, second, if ϕ is actually satisfiable in T, then there is a T-interpretation that satisfies ϕ where every element is the interpretation of a variable in wit(ϕ). In a sense, wit(ϕ) ``witnesses'' all elements of some T-interpretation that satisfies ϕ, so we can say that this interpretation is ``witnessed''.

But the proof that this method worked was flawed. A correction was found in [JB10], the corresponding property later being named strong politeness. The underlying algorithm is the same as the one for politeness, but in the earlier case its correctness proof did not go through. The combination method then guarantees, essentially, that if two theories are decidable, and one of them is strongly polite, then their combination is also decidable, strong politeness being the conjunction of smoothness and strong finite witnessability. The latter is almost the same as finite witnessability, except that now wit is required to maintain, in the witnessed model, the relationships between variables found in the starting model, where ϕ is known to be satisfied. Notice, however, that [JB10] did not prove that theory combination assuming only politeness is impossible, only that the provided proof did not work.

Later on, in [CR13,CR18], strong politeness plus decidability was proven to be equivalent to shininess, a property involved in another theory combination method. A theory is shiny when it is smooth; stably finite, meaning that if a quantifier-free formula is satisfied by some T-interpretation, then it is also satisfied by a T-interpretation whose domains are finite and of cardinality at most that of the corresponding domains in the starting interpretation; and such that one can computably find the minimal (tuples of) cardinalities of a T-interpretation that satisfies a given quantifier-free formula. Were strong politeness to imply decidability, not only would strong politeness itself be equivalent to shininess, but the theory combination method involving strong politeness could be simplified: if a theory is strongly polite and another is decidable, then their combination is decidable.

A natural follow-up question was whether there actually are theories that are polite without being strongly polite: if none exist, then the proof that theory combination with strong politeness is possible also proves that theory combination with only politeness is possible. A simple counterexample was eventually found in [SZR+21], but it was not stably finite. This is important because strong finite witnessability is deeply related to stable finiteness: it seemed like politeness and stable finiteness could imply strong politeness.

In summary, many open questions were left in the study of polite and strongly polite theories: 1. Can we simplify strong politeness (to, for example, politeness)? 2. Does strong politeness imply decidability? 3. Are polite and stably finite theories also strongly polite? Notice that a positive answer to the third question could possibly imply a positive one for the first as well, were we to find a combination method assuming only politeness and stable finiteness. We have recently found a partial simplification of strong politeness, giving a somewhat positive answer to the first question, but proved that the others have negative answers. We here present some of these results.

In [TZB23] we started an analysis of the interplay between several combination methods, showing that strong politeness could be replaced by strong finite witnessability and stable infiniteness, a property already well-known given its use in the Nelson and Oppen combination method and weaker than smoothness. Later on, in [PdTZB24], this result was strengthened, at the cost of a significantly more challenging proof, to show that actually stable infiniteness and strong finite witnessability imply strong politeness, thus offering a positive answer to the first question.

In addition, [TZB23] presented a theory that is polite and stably finite without being strongly polite, answering the third question in the negative. This theory achieves this by, in a way, having a predictable (read: computable) behavior for its models whose cardinality is a power of 2, while its other models are somewhat unpredictable. Finally, in a paper under peer review, we construct a theory that is strongly polite yet not decidable, settling the last of these open problems, again in the negative. There is, however, a silver lining to this submitted paper, which also shows that a ``decidability-free'' definition of shininess still implies decidability.

All of these results follow, as well as complement, the history of politeness in theory combination so far, enhancing our understanding of this sometimes difficult technique. And they herald the use of new instruments, from areas such as model theory, complexity theory or order theory, in theory combination, that in turn lead to new, sharper results.

Sophie Tourret (INRIA and MPI for Informatics)
Conflict-based Literal Model Generation

ABSTRACT. Simple Clause Learning (SCL) is an automated theorem proving technique for first-order logic that Christoph Weidenbach and his team have designed and studied in recent years. In this extended abstract, I adapt a method for model construction presented by Bromberger, Krasnopol, Möhle and Weidenbach in 2024 at IJCAR to make it more directly useful to SCL.

Jørgen Villadsen (Technical University of Denmark)
Simon Tobias Lund (Technical University of Denmark)
Anders Schlichtkrull (Aalborg University)
Formalizing Axiomatics for First-Order Logic

ABSTRACT. We formalize classical first-order logic with functions in the proof assistant Isabelle/HOL. The standalone formalization is based on entries in the Isabelle Archive of Formal Proofs. The soundness and completeness theorems hold for languages of arbitrary cardinalities. The 3 axioms and 3 rules together cover implication, negation and universal quantification. The unique approach has been used to teach formal methods and automated reasoning.

Jørgen Villadsen (Technical University of Denmark)
Teaching Axiomatic Systems and Metatheory in Isabelle

ABSTRACT. For more than five years we have used the Isabelle proof assistant to teach metatheory of propositional and first-order logic, not only for natural deduction and sequent calculus, but also for simple axiomatic systems. Languages of any cardinality are supported. We present the tools and the lessons learned. The focus is on the formal soundness and completeness theorems for classical propositional logic.

Andrei Voronkov (The University of Manchester)
Michael Rawson (University of Southampton)
Anja Petković Komel (Vienna University of Technology)
Johannes Schoisswohl (TU Wien)
Ground Truth: Checking Vampire Proofs via Satisfiability Modulo Theories

ABSTRACT. The Vampire automated theorem prover is extended to output proofs in such a way that each inference is represented by a quantifier-free SMT instance. If every instance is unsatisfiable, the proof can be considered verified by an external SMT solver. This pragmatic form of proof checking places only a very light burden on the SMT solver, and can easily handle inferences that other systems may find difficult, such as theory inferences or extensive ground reasoning. The method is considerably easier to implement than proof formats based on small kernels and covers a greater variety of modern-day inferences.

Jelle Wemmenhove (Eindhoven University of Technology)
Nick Hoofd (Eindhoven University of Technology)
Alexander Schüler-Meyer (Eindhoven University of Technology)
Jim Portegies (Eindhoven University of Technology)
Reinforcing students’ proof structure with Waterproof: an analysis

ABSTRACT. Waterproof is educational software, based on a proof assistant, that is intended to help students with acquiring the skill of writing mathematical proofs. This extended abstract compares the proof structure of students who interacted with the software versus that of students who did not, by means of a coding scheme that combines the concepts of proof framework and natural deduction. The results suggest that students become skilled in the type of behavior that gets trained in Waterproof.

Christoph Wernhard (University of Potsdam)
Zsolt Zombori (HUN-REN Alfréd Rényi Institute of Mathematics)
Mathematical Knowledge Bases as Grammar-Compressed Proof Terms: Exploring Metamath Proof Structures

ABSTRACT. Metamath is an established computer language for archiving, verifying, and studying mathematical proofs. Its basis is condensed detachment, with unification and a view on proofs as terms. Grammar-based tree compression applied to such proof terms factorizes repeated patterns, which provide proofs of definite clauses, pattern variables corresponding to body atoms. A Metamath database is then just a grammar that compresses a set of giant proof trees. We provide a formal account of this view and have implemented a practical toolkit. In experiments we compare the proof structuring of Metamath by human experts with machine-generated structurings, and address possibilities of the interplay of both.
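A toy version of the central idea (our sketch, not the paper's toolkit): represent proof terms as nested tuples headed by an inference symbol such as condensed detachment ``D'', and factor every repeated compound subterm into a grammar rule, so the term becomes a start word over nonterminals plus a set of rules.

```python
from collections import Counter

def subterms(t):
    # Enumerate t and all its subterms; t[0] is the head symbol, t[1:] children.
    yield t
    if isinstance(t, tuple):
        for child in t[1:]:
            yield from subterms(child)

def compress(term):
    """Toy grammar-based tree compression: every compound subterm occurring
    at least twice becomes a rule (nonterminal -> subterm)."""
    counts = Counter(s for s in subterms(term) if isinstance(s, tuple))
    rules = {}  # original subterm -> (nonterminal name, rewritten body)

    def rewrite(t):
        if not isinstance(t, tuple):
            return t
        body = (t[0],) + tuple(rewrite(c) for c in t[1:])
        if counts[t] >= 2:
            if t not in rules:
                rules[t] = ("N%d" % len(rules), body)
            return rules[t][0]  # replace the repeated pattern by its nonterminal
        return body

    start = rewrite(term)
    grammar = {name: body for name, body in rules.values()}
    return start, grammar

# The proof term D(D(a, b), D(a, b)) shares the repeated subproof D(a, b):
term = ("D", ("D", "a", "b"), ("D", "a", "b"))
start, grammar = compress(term)
# start is ("D", "N0", "N0") with grammar rule N0 -> ("D", "a", "b")
```

In the paper's view, such rules are exactly reusable lemmas: a nonterminal with pattern variables proves a definite clause, and a Metamath database is a grammar compressing a set of giant proof trees.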

Yizheng Zhao (Nanjing University)
Renate A. Schmidt (The University of Manchester)
Uniform Interpolation and Forgetting for Large-Scale Ontologies with Application to Semantic Difference in SNOMED CT

ABSTRACT. In this paper, we present a practical method for computing uniform interpolation (UI) and forgetting in ELIH-ontologies, addressing fundamental operations critical for detecting semantic differences in ontology evolution. Our method extends previous work to accommodate inverse roles while maintaining termination and soundness guarantees. Empirical evaluation across industry benchmark datasets Oxford ISG and NCBO BioPortal demonstrates 100% success rates with significant improvements in computational efficiency compared to state-of-the-art systems. The practical impact of our approach is validated through application to SNOMED CT, the world's largest clinical terminology supporting healthcare information systems globally. This enables terminologists and developers to systematically track semantic differences, detect unintended consequences, and validate change safety across SNOMED CT releases.

Yi Zhou (Carnegie Mellon University)
Amar Shah (Carnegie Mellon University)
Zhengyao Lin (Carnegie Mellon University)
Marijn Heule (Carnegie Mellon University)
Bryan Parno (Carnegie Mellon University)
Cazamariposas: Automated Instability Debugging in SMT-based Program Verification

ABSTRACT. Program verification languages such as Dafny and F⋆ often rely heavily on Satisfiability Modulo Theories (SMT) solvers for proof automation. However, SMT-based verification suffers from instability, where semantically irrelevant changes in the source program can cause spurious proof failures. While existing mitigation techniques emphasize preemptive measures, we propose a complementary approach that focuses on diagnosing and repairing specific instances of instability-induced failures. Our key technique is a novel differential analysis to pinpoint problematic quantifiers in an unstable query. We instantiate this technique in Cazamariposas, a tool that automatically identifies such quantifiers and suggests fixes. We evaluate Cazamariposas on multiple large-scale systems verification projects written in three different program verification languages. Our results demonstrate Cazamariposas' effectiveness as an instability debugger. In the majority of cases, Cazamariposas successfully isolates the issue to a single problematic quantifier and provides a stabilizing fix.