LeanProgress: Guiding Search for Neural Theorem Proving via Proof Progress Prediction

Robert Joseph George, Suozhi Huang, Peiyang Song, Anima Anandkumar

cs.AI Feb 25, 2025 · v1

theorem-proving-ml ai-agents mathlib dataset

Read PDF arXiv abstract Code

TL;DR

Trains a model predicting remaining proof steps from Lean Workbook Plus and Mathlib4 to guide best-first search in neural theorem proving.

Abstract

Mathematical reasoning remains a significant challenge for Large Language Models (LLMs) due to hallucinations. When combined with formal proof assistants like Lean, these hallucinations can be eliminated through rigorous verification, making theorem proving reliable. However, even with formal verification, LLMs still struggle with long proofs and complex mathematical formalizations. While Lean with LLMs offers valuable assistance with retrieving lemmas, generating tactics, or even complete proofs, it lacks a crucial capability: providing a sense of proof progress. This limitation particularly impacts the overall development efficiency in large formalization projects. We introduce LeanProgress, a method that predicts the progress in the proof. Training and evaluating our models made on a large corpus of Lean proofs from Lean Workbook Plus and Mathlib4 and how many steps remain to complete it, we employ data preprocessing and balancing techniques to handle the skewed distribution of proof lengths. Our experiments show that LeanProgress achieves an overall prediction accuracy of 75.8% in predicting the amount of progress and, hence, the remaining number of steps. When integrated into a best-first search framework using Reprover, our method shows a 3.8% improvement on Mathlib4 compared to baseline performances of 41.4%, particularly for longer proofs. These results demonstrate how proof progress prediction can enhance both automated and interactive theorem proving, enabling users to make more informed decisions about proof strategies. Our code is merged in this library here https://github.com/lean-dojo/LeanDojo-v2.

Problem

LLM-based theorem proving in Lean lacks a notion of proof progress, meaning search algorithms cannot distinguish productive proof steps from dead ends, leading to wasted computation on stalled proof attempts.

Approach

LeanProgress trains a language model to predict the number of remaining steps to proof completion given a current proof state. Training data is generated by extracting proof trees via best-first search with ReProver, then balancing the distribution of proof lengths. The step predictor is integrated into LeanCopilot as a tactic (predict_steps) to guide search.

Results

The step predictor provides an interpretable progress signal that improves search efficiency compared to traditional best-first search guided by log probability alone. The framework, data pipeline, and integration with LeanCopilot are open-sourced.

Papers With

LeanProgress: Guiding Search for Neural Theorem Proving via Proof Progress Prediction