EUROPROOFNET-WG5: THEOREM PROVING AND MACHINE LEARNING IN THE AGE OF LLMS
PROGRAM

Days: Monday, April 7th Tuesday, April 8th

Monday, April 7th

View this program: with abstractssession overviewtalk overview

09:10-10:10 Session 2: Keynote 1. Sergei Gukov (Caltech & Dublin Inst. Adv. Studies). Machine Learning for Mathematics Research

Abstract: In this talk, we explore the transformative potential of custom reinforcement learning (RL) algorithms in accelerating solutions to complex, research-level mathematical challenges. We begin by illustrating how these algorithms have achieved a 10X improvement in areas where previous advances of the same magnitude required many decades. A comparative analysis of different network architectures is presented to highlight their performance in this context. We then delve into the application of RL algorithms to exceptionally demanding tasks, such as those posed by the Millennium Prize problems and the smooth Poincaré conjecture in four dimensions. Drawing on our experiences, we discuss the prerequisites for developing new RL algorithms and architectures that are tailored to these high-level challenges. This talk is largely based on recent work with A.Shehper, A.Medina-Mardones, L.Fagan, B.Lewandowski, A.Gruen, Y.Qiu, P.Kucharski, and Z.Wang.

Bio: After receiving his PhD from Princeton University, Sergei Gukov spent five years at Harvard University as a research fellow of the Clay Mathematics Institute and two years at the school of mathematics at the Institute for Advanced Studies. His passion is building new bridges between different areas of mathematical physics and pure mathematics, such as quantum topology, mirror symmetry, and gauge theory. His more recent interests involve new connections between mathematics and machine learning.

 

Chair:
Christian Saemann (Heriot-Watt University, UK)
10:40-12:30 Session 3: Fundamental Research in Machine Learning Methods for Proof Generation: Structures, Representations, Abstractions, Algorithms

Contributed Talks devoted to general methodological issues in applications of machine learning in proof finding and proof generation 

Chair:
Kathrin Stark (Heriot-Watt University, UK)
10:40
Cameron Freer (Massachusetts Institute of Technology, United States)
Alexander Lew (Massachusetts Institute of Technology, United States)
Timothy O'Donnell (McGill University, Mila – Québec AI Institute,, Canada)
Vikash Mansinghka (Massachusetts Institute of Technology, United States)
Autoformalization and autoinformalization via probabilistic reasoning (abstract)
11:10
Huajian Xin (University of Edinburgh, China)
Frontier of Formal Theorem Proving with Large Language Models: Insights from the DeepSeek-Prover (abstract)
11:40
Guy Axelrod (Chalmers, Sweden)
Moa Johansson (Chalmers, Sweden)
Andrea Silvi (Chalmers, Sweden)
Intrinsic Motivation To Construct Terms In A Dependent Type Theory (abstract)
12:00
Konstantinos Kogkalidis (Aalto University, Finland)
Orestis Melkonian (IOG, UK)
Jean-Philippe Bernardy (Gothenburg University, Department of Computer Science and Engineering, Sweden)
Structure-Aware Neural Representations of Agda Programs (abstract)
12:20
Steven Obua (Practal, UK)
Abstraction Logic Is All You Need (abstract)
12:30-13:30 Lunch with GAIL, the Generative AI Lab

The Generative AI Laboratory (GAIL) at the University of Edinburgh is a centre for excellence dedicated to researching all aspects of generative artificial intelligence (AI) in society. Uniting the diverse research expertise across the University with generative AI at its core, GAIL taps into a thriving AI landscape with recognised strengths in natural language processing, machine learning, and data-driven innovation.

This interdisciplinary initiative focuses on advancing generative AI techniques in the key areas of future health and care, climate and sustainability, and economic growth, and is committed to innovating in generative AI through partnerships with global stakeholders, driving new collaborations and dynamic projects across government, industry, and the public sector.

To learn more about GAIL, see here: https://gail.ed.ac.uk/

13:30-14:30 Session 4: Using Large Language Models for Proof Translation

Contributed Talks devoted to Large Language Models for translating betwen different provers, proof formats and mathematical proofs written in natural languages

Chair:
Andrew G. Gordon (Cogna and University of Edinburgh, UK)
13:30
Jules Viennot (IRIF, Université Paris Cité, Inria, CNRS, France)
Guillaume Baudart (IRIF, Université Paris Cité, Inria, CNRS, France)
Emilio Jesús Gallego Arias (IRIF, Université Paris Cité, Inria, CNRS, France)
Marc Lelarge (DI ENS, PSL University, Inria, France)
MiniF2F in Rocq: Automatic Translation Between Proof Assistants — A Case Study (abstract)
14:00
Laetitia Teodorescu (AdaptiveML, France)
Guillaume Baudart (IRIF, Université Paris Cité, Inria, CNRS, France)
Emilio Jesús Gallego Arias (IRIF, Université Paris Cité, Inria, CNRS, France)
Marc Lelarge (DI ENS, PSL University, Inria, France)
NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving (abstract)
14:30-15:30 Session 5: Keynote 2. Swarat Chaudhuri (UT Austin and Google Deepmind). Formal Mathematical Reasoning: A New Frontier for Large Language Models

Abstract: AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven discovery in science, engineering, and beyond. In the recent past, there have been many efforts on AI4Math that mirror techniques in NLP, in particular, by training large language models on textual math datasets. As a complementary yet less explored avenue, formal mathematical reasoning is grounded in formal systems such as proof assistants, which can verify the correctness of reasoning and provide automatic feedback. In this talk, we advocate for formal mathematical reasoning as a task for LLMs and argue that it is indispensable for advancing AI4Math to the next level. In recent years, we have seen steady progress in using AI to perform formal reasoning, including core tasks such as theorem proving and autoformalization, as well as emerging applications such as verifiable generation of code and hardware designs. However, significant challenges remain to be solved for AI to truly master mathematics and achieve broader impact. We summarize existing progress, discuss open challenges, and envision critical milestones to measure future success. We end by calling on the research community to come together to drive transformative advancements in this field.

 

Bio: Swarat Chaudhuri is a Professor of Computer Science and the director of the Trishul laboratory at UT Austin, as well as a Visiting Researcher at Google Deepmind, London. His research lies at the interface of machine learning, programming languages, and formal methods. Through a synthesis of ideas from these areas, he seeks to develop a new generation of intelligent systems that are reliable, transparent, and secure by construction and can solve mathematical reasoning, scientific discovery, and system engineering tasks beyond the scope of contemporary AI. Dr. Chaudhuri has received the NSF CAREER award, the ACM SIGPLAN John Reynolds Dissertation award, Meta and Google Research awards, and several ACM SIGPLAN and SIGSOFT distinguished paper awards. He also served as a Program Chair for CAV 2016 and ICLR 2024.

Chair:
Ekaterina Komendantskaya (Department of Computer Science, Heriot-Watt University, UK)
16:00-17:00 Session 6: Using Large Language Models in Software Verification

Contributed Talks devoted to applications of Large Language Models in formal verification for software, hardware, and systems engineering, including industrial applications

Chair:
Elizabeth Polgreen (University of Edinburgh, UK)
16:00
Aleksandr Shefer (Jetbrains Research, Constructor University Bremen, Germany)
Igor Engel (JetBrains Research, Constructor University Bremen, Germany)
Stanislav Alekseev (JetBrains Research, Neapolis University Pafos, Cyprus)
Daniil Berezun (JetBrains Research, Netherlands)
Ekaterina Verbitskaia (JetBrains Research, Constructor University Bremen, Netherlands)
Anton Podkopaev (JetBrains Research, Constructor University Bremen, Netherlands)
Are LLMs Ready for Software Verification? (abstract)
16:30
Samuel Teuber (Karlsruhe Institute of Technology, Germany)
Bernhard Beckert (Karlsruhe Institute of Technology, Germany)
Towards LLM-support for Deductive Verification of Java Programs (abstract)
17:10-17:30 Session 8: Ethical and Philosophical Issues; AI Creativity

Contributed Talks devoted to Ethical and Philosophical issues arising in using Generative AI in Theorem Proving

Chair:
Yixuan Li (University of Edinburgh, UK)
17:10
Markus Pantsar (RWTH Aachen University, Germany)
The need for ethical guidelines in mathematical research in the time of generative AI (abstract)
17:20
Irina Starikova (Institute of philosophy, Slovak Academy of Science, Slovakia)
Mathematics and Creative AI (abstract)
19:00-22:00 Workshop Dinner at the Surgeons's Hall, Edinburgh City Centre

Further Venue details: https://museum.rcsed.ac.uk/

Prior registration for the dinner is essential.

Tuesday, April 8th

View this program: with abstractssession overviewtalk overview

10:30-12:20 Session 10: Using Large Language Models in Automated Theorem Proving

Contributed Talks devoted to Machine Learning and Generative AI in automating proof search and proof construction, in the context of the Automated Theorem Proving

Chair:
Wenda Li (Edinburgh University, UK)
10:30
Stephan Schulz (DHBW Stuttgart, Germany)
Automated Theorem Provers as the Hub of the AI Math Ecosystem (abstract)
10:50
Alexei Lisitsa (University of Liverpool, UK)
In search of simplicity: a case study in ML and LLM assisted applications of ATP in mathematics (abstract)
11:10
Robin Rawiel (Osnabrück University, Germany)
Lukas Niehaus (Osnabrück University, Germany)
Reinforcement Learning for Term Rewrite Systems (abstract)
11:30
Richard Thompson (Naval Postgraduate School, United States)
Adam Pease (Naval Postgraduate School, United States)
Angelos Toutsios (Naval Postgraduate School, United States)
Roberto Milanese Jr. (Naval Postgraduate School, United States)
Jarrad Singley (Naval Postgraduate School, United States)
Formalizing Natural Language: Cultivating LLM Translations Using Automated Theorem Proving (abstract)
12:00
Rashid Barket (Coventry University, UK)
Uzma Shafiq (Coventry University, UK)
Matthew England (Coventry University, UK)
Jürgen Gerhard (Maplesoft, Canada)
Transformers to Predict the Applicability of Symbolic Integration Routines (abstract)
12:20-13:00 Session 11: Lemma Conjecturing, Abduction with LLM

Contributed Talks devoted to Symbolic AI and Generative AI for Lemma Conjecturing

Chair:
Josef Urban (CIIRC/CVUT, Czechia)
12:20
Yutaka Nagashima (The Czech Academy of Sciences, Czechia)
Daniel Sebastian Goc (University of Cambridge, UK)
Proof By Abduction in Isabelle/HOL (abstract)
12:30
Yousef Alhessi (UCSD, United States)
Sólrún Halla Einarsdóttir (Chalmers University of Technology, Sweden)
Emily First (University of California, San Diego, United States)
George Granberry (Chalmers University of Technology, Sweden)
Moa Johansson (Chalmers University of Technology, Sweden)
Nicholas Smallbone (Chalmers University of Technology, Sweden)
Neuro-Symbolic Lemma Conjecturing (abstract)
14:00-15:20 Session 12: Practical Use of LLMs in Theorem Proving, Intelligent User Interfaces

Contributed Talks. Can Generative AI improve user experience? This session will discusses extensions to theorem proving user interfaces that include Large Language Models and other Machine Learning assistance 

Chair:
Swarat Chaudhuri (UT Austin and Google Deepmind, United States)
14:00
Jonathan Julian Huerta Y Munive (Czech Technical University in Prague, Czechia)
DeepIsaHOL progress report: current machine learning for the Isabelle proof assistant (abstract)
14:30
Gleb Solovev (JetBrains Research, Constructor University Bremen, Germany)
Nikita Khramov (JetBrains Research, Constructor University Bremen, Germany)
Andrei Kozyrev (JetBrains Research, Constructor University Bremen, Germany)
Anton Podkopaev (JetBrains Research, Constructor University Bremen, Netherlands)
CoqPilot benchmarking framework (abstract)
15:00
Job Petrovčič (Faculty of Mathematics and Physics, University of Ljubljana, Slovenia)
Sebastian Mežnar (Jozef Stefan Institute, Faculty of Mathematics and Physics, University of Ljubljana, Slovenia)
Ljupčo Todorovski (Faculty of Mathematics and Physics, University of Ljubljana, Slovenia)
Kernel-level expression generator (abstract)
15:20-16:05 Session 13A: EuroProof Net WG5 meeting: open to all participants

To learn more about Euro Proof Net, its events and working groups, see here: https://europroofnet.github.io/

15:20-16:30 Session 13B: Poster Session (coffee served in the poster room)

Contributed Posters devoted to a range of methods that combine machine learning and theorem proving.

15:20
Alberto Gandolfi (New York University Abu Dhabi, UAE)
How LLMs could Fool a Proof Checker: The Risks of Inconsistent Assumptions in Automated Proof Systems (abstract)
15:30
Tim Fernando (Computer Science, Trinity College Dublin, Ireland)
Between min cost search and least action (abstract)
15:40
Alexandra Fikiori (NTUA, Greece)
Asterios Gkantzounis (national technical university of athens, Greece)
Foivos Skarpelos (national technical university of athens, Greece)
Petros Stefaneas (NTUA, Greece)
On Abstract Logics for Interacting Provers (abstract)
15:50
Artjoms Sinkarovs (University of Southampton, UK)
Correct by Construction Machine Learning (abstract)
16:00
Luca Pasetto (University of Luxembourg, Luxembourg)
Christoph Benzmüller (Otto-Friedrich-Universität Bamberg, Germany)
The Fatio Protocol for Formal Dialogue in Isabelle/HOL (abstract)
16:10
Maxim Zyskin (University of Oxford, UK)
Chessboard covers and AI. (abstract)
16:20
Bartosz Naskręcki (Adam Mickiewicz University, Poland)
From Mathematical Theory to Machine Learning: Detecting Symmetry Groups in Crystallographic Tilings (abstract)
16:30-17:30 Session 14: Discussion Panel: What is next for Generative AI and Theorem Proving? Future directions and research opportunities

This panel session will provide an opportunity for community building, and free-format discussion of the current trends in the area, as well as its future directions.

Panel members:

Chair:
Ekaterina Komendantskaya (Department of Computer Science, Heriot-Watt University, UK)
17:30-19:00 Free Social Time on Campus

Possible drinks and meal at Mariott Hotel: https://www.marriott.com/en-us/hotels/edihw-courtyard-edinburgh-west/overview/?scid=f2ae0541-1279-4f24-b197-a979c79310b0