CADE-28: THE 28TH INTERNATIONAL CONFERENCE ON AUTOMATED DEDUCTION
PROGRAM FOR THURSDAY, JULY 15TH
Days:
previous day
next day
all days

View: session overviewtalk overview

08:00-10:00 Session 30: System Descriptions and Competition Reports (CADE-28)
Location: ZoomRoom 1
08:00
Twee: an equational theorem prover (System Description)

ABSTRACT. Twee is an automated theorem prover for equational logic. It implements unfailing Knuth-Bendix completion with ground joinability testing and a connectedness-based redundancy criterion. Its performance is fairly strong: it finished in second place in the UEQ division of CASC-J10, and solved some problems that the other systems could not. This paper describes Twee’s design and implementation.

08:15
The Isabelle/Naproche Natural Language Proof Assistant (System Description)

ABSTRACT. We present the recent Isabelle/Naproche system which combines proof editing in the well-known Isabelle Proof Interactive Development Environment with concurrent checking by the Naproche system (Natural Proof Checking). Naproche accepts proof texts in the controlled natural language ForTheL (Formula Theory Language), which is close to the mathematical vernacular. ForTheL files may use LaTeX symbols and constructs for direct mathematical typesetting. Our system description covers - general remarks on formal mathematics with natural language input, and technical information on - natural language processing in Naproche, - (weakly typed) first-order processing, - type checking, - logical checking, - communication with the Eprover as external ATP, - integration of Naproche into Isabelle. We also include extracts from an example formalization.

08:30
The Lean 4 Theorem Prover and Programming Language (System Description)

ABSTRACT. Lean 4 is a reimplementation of the Lean interactive theorem prover (ITP) in Lean itself. It addresses many shortcomings of the previous versions and contains many new features. Lean 4 is fully extensible: users can modify and extend the parser, elaborator, tactics, decision procedures, pretty printer, and code generator. The new system has a hygienic macro system custom-built for ITPs. It contains a new typeclass resolution procedure based on tabled resolution, addressing significant performance problems reported by the growing user base. Lean 4 is also an efficient functional programming language based on a novel programming paradigm called functional but in-place. Efficient code generation is crucial for Lean users because many write custom proof automation procedures in Lean itself.

08:45
Harpoon: Mechanizing Metatheory Interactively (System Description)

ABSTRACT. Beluga is a proof checker that provides a sophisticated infrastructure for implementing formal systems based on the logical framework LF and meta-theoretic proofs as total,recursive functions that transform LF derivations.

In this paper, we describe Harpoon, an interactive proof engine built on top of Beluga. It allows users to develop proofs interactively using a small, fixed set of high-level actions that safely transform a subgoal. A sequence of actions elaborates into a (partial) proof script that serves as an intermediate representation describing an assertion-level proof. Last, poof scripts translate into Beluga programs which can be type-checked independently.

Harpoon is available on GitHub. We have used Harpoon to replay a wide array of examples covering all features supported by Beluga. In particular, we have used it for normalization proofs, including the recently proposed POPLMark reloaded challenge.

09:00
The CADE-28 ATP System Competition (CASC-28) Results

ABSTRACT. In order to stimulate ATP research and system development, and to expose ATP systems within and beyond the ATP community, the CADE ATP System Competition (CASC) is held at each CADE and IJCAR conference. CASC-28 was held on 13th July 2021, during the 28th International Conference on Automated Deduction (CADE-28). The results of the competition will be presented.

09:30
Termination and Complexity Competition Results

ABSTRACT. The termination and complexity competition (termCOMP) focuses on automatedtermination and complexity analysis for various kinds of programmingparadigms, including categories for term rewriting, integer transitionsystems, imperative programming, logic programming, and functionalprogramming. We report on the results of termCOMP 2020, held on 13th July 2021during CADE-28.

10:00-10:30Break
10:30-12:00 Session 31: ATP+AI (CADE-28)
Location: ZoomRoom 1
10:30
Confidences for Commonsense Reasoning

ABSTRACT. Commonsense reasoning has long been considered one of the holy grails of artificial intelligence. Our goal is to develop a logic-based component for hybrid -- machine learning plus logic -- commonsense question answering systems. A critical feature for the component is estimating the confidence in the statements derived from knowledge bases containing uncertain contrary and supporting evidence obtained from different sources. Instead of computing exact probabilities or designing a new calculus we focus on extending the methods and algorithms used by the existing automated reasoners for full classical first-order logic. The paper presents the CONFER framework and implementation for confidence estimation of derived answers.

11:00
Neural Precedence Recommender
PRESENTER: Filip Bártek

ABSTRACT. The state-of-the-art superposition-based theorem provers for first-order logic rely on simplification orderings on terms to constrain the applicability of inference rules, which in turn shapes the ensuing search space. The popular Knuth-Bendix simplification ordering is parameterized by symbol precedence—a permutation of the predicate and function symbols of the input problem’s signature. Thus, the choice of precedence has an indirect yet often substantial impact on the amount of work required to complete a proof search successfully. This paper describes and evaluates a symbol precedence recommender, a machine learning system that estimates the best possible precedence based on observations of prover performance on a set of problems and random precedences. Using the graph convolutional neural network technology, the system does not presuppose the problems to be related or share a common signature. When coupled with the theorem prover Vampire and evaluated on the TPTP problem library, the recommender is found to outperform a state-of-the-art heuristic by more than 4 % on unseen problems.

11:30
Improving ENIGMA-Style Clause Selection While Learning From History

ABSTRACT. We re-examine the topic of machine-learned clause selection guidance in saturation-based theorem provers, propose several improvements to this approach, and experimentally confirm their viability.

For the demonstration we use a Recursive Neural Network to classify clauses based on their derivation history and the presence or absence of automatically supplied theory axioms therein. The automatic theorem prover Vampire guided by the network achieves a 41% improvement on a relevant subset of SMTLIB in real time evaluation.

12:00-12:30Break
12:30-13:30 Session 32: Invited Talk (CADE-28)
Location: ZoomRoom 1
12:30
Towards the Automatic Mathematician

ABSTRACT. Over the recent years deep learning has found successful applications in mathematical reasoning. Today, we can predict fine-grained proof steps, relevant premises, and even useful conjectures using neural networks. This extended abstract summarizes recent developments of machine learning in mathematical reasoning and the vision of the N2Formal group at Google Research to create an automatic mathematician. The second part discusses the key challenges on the road ahead.

13:30-14:00Break