Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic

Tags: combinators, Diophantine equations, HOL, reinforcement learning, tree neural networks
Abstract:
The paper describes a deep reinforcement learning framework based on self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by choosing tree neural networks (TNNs) as machine learning models and by using HOL4 terms internally to represent the tree structures processed by the TNNs. Recursive improvement is possible when a task is expressed as a search problem. In this case, a Monte Carlo Tree Search (MCTS) algorithm guided by a TNN can be used to explore the search space and produce better examples for training the next TNN. As an illustration, term synthesis tasks on combinators and Diophantine equations are specified and learned. The success rates on the test sets generated for the two tasks are 65% and 78.5%, respectively. These results are compared with state-of-the-art ATPs on combinators and set a precedent for statistically guided synthesis of Diophantine equations.
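To make the TNN idea concrete, the following is a minimal sketch, not the paper's HOL4/SML implementation, of how a tree neural network can embed a term bottom-up with one small network per operator. The SK-combinator signature, embedding dimension, and random weights below are illustrative assumptions.

```python
# Minimal sketch of a tree neural network (TNN) over term trees.
# Each operator f of arity n gets its own layer that maps the
# concatenated embeddings of its n arguments to the embedding of the
# whole term; arity-0 operators are learned constant embeddings.
import numpy as np

DIM = 4  # embedding dimension (illustrative)
rng = np.random.default_rng(0)

# Hypothetical signature: S and K combinators plus binary application.
OPS = {"S": 0, "K": 0, "app": 2}
params = {
    op: (rng.normal(size=(DIM, DIM * ar)) if ar else rng.normal(size=DIM))
    for op, ar in OPS.items()
}

def embed(term):
    """Recursively embed a term given as (operator, [subterms])."""
    op, args = term
    if not args:                        # leaf: stored embedding
        return params[op]
    child = np.concatenate([embed(a) for a in args])
    return np.tanh(params[op] @ child)  # one layer per operator

# Embedding of the term (app (app S K) K), i.e. S K K.
t = ("app", [("app", [("S", []), ("K", [])]), ("K", [])])
print(embed(t))
```

In the full framework described by the abstract, heads on top of such term embeddings would supply the policy and value estimates that guide the MCTS during search.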