Interoperability of Proof systems with SC-TPTP (System Description)
ABSTRACT. We introduce SC-TPTP, an extension of the TPTP derivation format that encompasses sequent formalism, enabling seamless proof exchange between interactive theorem provers and first-order automated theorem provers.
We provide a way to represent non-deductive steps---Skolemization, clausification, and Tseitin normal form---as deductive steps within the format.
Building upon the existing support in the Lisa proof assistant and the Goéland theorem prover, SC-TPTP ecosystem is further enhanced with proof output interfaces for Egg and Prover9, as well as proof reconstruction support for HOL Light, Lean, and Rocq.
ABSTRACT. Indexing of syntactic terms is a key component for the efficient
implementation of automated theorem provers. This paper presents the
first verified implementation of a term indexing data structure,
namely a formalization of path indexing in the proof assistant
Isabelle/HOL. We define the data structure, maintenance operations,
and retrieval operations, including retrieval of unifiable terms,
instances, generalizations and variants. We prove that maintenance
operations preserve the invariants of the structure, and that
retrieval operations are sound and complete.
Simplified and Verified: A Second Look at a Proof-Producing Union-Find Algorithm
ABSTRACT. Using Isabelle/HOL, we verify an union-find data structure with an explain operation due to Nieuwenhuis and Oliveras. We devise a simpler, more naive version of the explain operation whose soundness and completeness is easy to verify. Then, we prove the original formulation of the explain operation to be equal to our version. Finally, we refine this data structure to Imperative HOL, enabling us to export efficient imperative code. The formalisation provides a stepping stone towards the verification of proof-producing congruence closure algorithms which are a core ingredient of Satisfiability Modulo Theories (SMT) solvers.
Faithful Logic Embeddings in HOL – Deep and Shallow
ABSTRACT. Deep and shallow embeddings of non-classical logics in classical higher-order logic have been explored, implemented, and used in various automated reasoning tools in recent years. This paper presents a recipe for the simultaneous deployment of different forms of deep and shallow embeddings in classical higher-order logic, enabling not only flexible interactive and automated theorem proving and counterexample finding at meta and object level, but also automated faithfulness proofs between the logic embeddings. The approach, which is fruitful for logic education, research and application, is deliberately illustrated here using simple propositional modal logic. However, the work presented is conceptual in nature and not limited to such a simple logic context.
A Stepwise Refinement Proof that SCL(FOL) Simulates Ground Ordered Resolution
ABSTRACT. Recently, it has been demonstrated that SCL(FOL) can simulate ground ordered resolution. We revisit this result and provide a new formal proof in Isabelle/HOL. The existing pen-and-paper proof is monolithic and challenging to comprehend. In order to improve clarity, we develop an alternative proof structured as eleven (bi)simulation steps between the two calculi, transitioning from ordered resolution to SCL(FOL). A key simulation lemma ensures that, under certain conditions, one simulation direction can be automatically lifted to the other. Consequently, for each of the eleven steps, it suffices to establish only one direction of simulation. The complete proof is included in the Isabelle Archive of Formal Proofs.
Cut Elimination for Negative Free Logics with Definite Descriptions
ABSTRACT. We present a sequent calculus GNFL for negative free logic with definite descriptions in the classical and intuitionistic version, with empty and nonempty domains. It is shown constructively that GNFL satisfies the cut elimination theorem and its cut-free version satisfies the subformula property.
ABSTRACT. We establish a systematic correspondence between sequent
calculi and resolution calculi for a broad class of modal logics. Our main
result is that soundness and completeness transfer from sequent to resolution
calculi as long as cut and weakening are admissible. We first
construct generative calculi that essentially import modal rules to a res-
olution setting, and then show how soundness and completeness transfer
to absorptive calculi, where modal rules are translated into generalised
resolution rules. We discuss resolution calculi that establish validity, and
then introduce local clauses to generate calculi for inconsistency. Finally,
for modal rules of a certain shape, we show how to construct layered
resolution calculi, a technique that has so far only been established for
normal modal logics. Our work directly yields new sound and complete
resolution calculi, and layered resolution calculi, for a large number of
modal logics.
A Fresh Inductive Approach to Useful Call-by-Value
ABSTRACT. Useful evaluation is an optimised evaluation mechanism for functional programming languages, introduced by Accattoli and Dal Lago.
The key to useful evaluation is to represent programs with sharing and to implement substitution of terms only when this contributes to the progress of the computation. Initially defined in the framework of call-by-name, useful evaluation has since been extended to call-by-value. The definitions of usefulness in the literature are complex and lack inductive structure, which makes it challenging to (formally) reason about them.
In this work, we define useful call-by-value evaluation inductively, proceeding in two stages. First, we refine the well-known Value Substitution Calculus, so the substitution operation becomes linear, yielding the lcbv calculus. The two calculi are observationally equivalent. We then further refine lcbv by restricting linear substitution only when it contributes to the progress of the computation, yielding the ucbv strategy. This new substitution notion is sensitive to the surrounding evaluation context, so it is non-trivial to capture it inductively. Moreover, we formally show that the resulting ucbv is a sound and complete implementation of lcbv, optimised to implement useful evaluation.
As a further contribution, we show that the ucbv strategy can be implemented by an existing lower-level abstract machine called GLAMoUr with polynomial overhead in time. This entails, as a corollary, that ucbv is time-invariant, i.e., that the number of reduction steps to normal form in ucbv can be used as a measure of time complexity.
Our ucbv strategy is part of the preliminary work required to develop semantic interpretations of useful evaluation, for which its inductive formulation is more suitable than the (non-inductive) existing ones.
ABSTRACT. Clause selection is arguably the most important choice point in saturation-based theorem proving. Framing it as a reinforcement learning task is a way to challenge the human-designed heuristics of state-of-the-art provers and to instead automatically evolve—just from prover experiences—their potentially optimal replacement.
In this work, we present a neural network architecture for scoring clauses for clause selection that is powerful yet efficient to evaluate. Following reinforcement learning principles to make design decisions, we integrate the network into the Vampire theorem prover and train it from successful proof attempts. An experiment on the diverse TPTP benchmark finds the neurally guided prover improve over a baseline strategy from which it initially learns—in terms of the number of in-training-unseen problems solved under a practically relevant, short CPU instruction limit—by 20%.
ABSTRACT. We develop a self-learning approach for conjecturing of induction predicates on a dataset of 16197 problems derived from the OEIS. These problems are hard for today's SMT and ATP systems because they require a combination of inductive and arithmetical reasoning.
Starting from scratch, our approach consists of a feedback loop that iterates between
(i) training a neural translator to learn the correspondence between the problems solved so far and the induction predicates useful for them,
(ii) using the trained neural system to generate many new induction predicates for the problems,
(iii) fast runs of the z3 prover attempting to prove the problems using the generated predicates,
(iv) using heuristics such as predicate size and solution speed on the proved problems to choose the best predicates for the next iteration of training.
The algorithm discovers on its own many interesting induction predicates, ultimately solving over 5500 problems, compared to 2265 problems solved by CVC5, Vampire or Z3.