LeanConjecturer: Automatic Generation of Mathematical Conjectures for Theorem Proving

Naoto Onda, Kazumi Kasaura, Yuta Oriike, Masaya Taniguchi, Akiyoshi Sannai, Sho Sonoda

cs.AI Jun 27, 2025 · v1

autoformalization theorem-proving-ml dataset mathlib topology

Read PDF arXiv abstract Code

TL;DR

LeanConjecturer generates university-level mathematical conjectures as Lean 4 statements from Mathlib seed files for theorem-proving training.

Abstract

We introduce LeanConjecturer, a pipeline for automatically generating university-level mathematical conjectures in Lean 4 using Large Language Models (LLMs). Our hybrid approach combines rule-based context extraction with LLM-based theorem statement generation, addressing the data scarcity challenge in formal theorem proving. Through iterative generation and evaluation, LeanConjecturer produced 12,289 conjectures from 40 Mathlib seed files, with 3,776 identified as syntactically valid and non-trivial, that is, cannot be proven by \texttt{aesop} tactic. We demonstrate the utility of these generated conjectures for reinforcement learning through Group Relative Policy Optimization (GRPO), showing that targeted training on domain-specific conjectures can enhance theorem proving capabilities. Our approach generates 103.25 novel conjectures per seed file on average, providing a scalable solution for creating training data for theorem proving systems. Our system successfully verified several non-trivial theorems in topology, including properties of semi-open, alpha-open, and pre-open sets, demonstrating its potential for mathematical discovery beyond simple variations of existing results.

Problem

Formal theorem proving with LLMs is limited by the scarcity of formalized mathematical statements and proofs. Directly prompting LLMs to generate conjectures often results in hallucinated libraries or syntactically incorrect statements.

Approach

LeanConjecturer is a pipeline for automatically generating university-level mathematical conjectures in Lean 4 using LLMs. It combines rule-based context extraction with LLM-based theorem statement generation. Conjectures are iteratively generated and evaluated for syntactic validity and non-triviality (cannot be proven by aesop). The generated conjectures are then used as training data for GRPO-based reinforcement learning to improve theorem proving capabilities.

Results

From 40 Mathlib seed files, LeanConjecturer produced 12,289 conjectures, with 3,776 identified as syntactically valid and non-trivial. GRPO training on the generated conjectures improved proof success rates, with DeepSeek-Prover-V2-7B improving from 2285/24576 proofs to 3657/24576 after 10 epochs of GRPO training.

Total	Valid	Novel	Non-Trivial
12289	10950	4130	3776

Conjecture generation statistics

Papers With

LeanConjecturer: Automatic Generation of Mathematical Conjectures for Theorem Proving