← All papers
First page of Translating Informal Proofs into Formal Proofs Using a Chain of States

Translating Informal Proofs into Formal Proofs Using a Chain of States

Ziyu Wang, Bowen Yang, Chenyi Li, Yuan Zhang, Shihao Zhou, Bin Dong, Zaiwen Wen

cs.LO Dec 11, 2025 · v1
Translates informal proofs into Lean 4 via an intermediate Chain of States representation, with custom datasets and benchmarks.
We address the problem of translating informal mathematical proofs expressed in natural language into formal proofs in Lean4 under a constrained computational budget. Our approach is grounded in two key insights. First, informal proofs tend to proceed via a sequence of logical transitions - often implications or equivalences - without explicitly specifying intermediate results or auxiliary lemmas. In contrast, formal systems like Lean require an explicit representation of each proof state and the tactics that connect them. Second, each informal reasoning step can be viewed as an abstract transformation between proof states, but identifying the corresponding formal tactics often requires nontrivial domain knowledge and precise control over proof context. To bridge this gap, we propose a two stage framework. Rather than generating formal tactics directly, we first extract a Chain of States (CoS), a sequence of intermediate formal proof states aligned with the logical structure of the informal argument. We then generate tactics to transition between adjacent states in the CoS, thereby constructing the full formal proof. This intermediate representation significantly reduces the complexity of tactic generation and improves alignment with informal reasoning patterns. We build dedicated datasets and benchmarks for training and evaluation, and introduce an interactive framework to support tactic generation from formal states. Empirical results show that our method substantially outperforms existing baselines, achieving higher proof success rates.

Translating informal mathematical proofs into formal Lean 4 proofs is difficult because informal proofs omit intermediate states and auxiliary lemmas, while formal systems require explicit representation of each proof state and connecting tactics.

The authors propose CoSProver, which introduces a Chain of States (CoS) representation as an intermediate layer between informal and formal proofs. Each informal reasoning step is decomposed into a sequence of explicit proof states connected by tactics. A CoS generator LLM (fine-tuned DeepSeek Qwen-7B) produces the state sequences from informal proofs, followed by a prover that fills in tactics between states. The system applies deduplication and completeness verification to ensure valid chains.

CoSLLM (7B parameters) achieves 58.60% on MiniF2F-test and 55.73% on MiniF2F-valid with a single sample, and 95.49%/94.67% with 5 samples. On the CoS benchmark, it reaches 53% with 5 samples. GPT-4o with 5 samples achieves 95.90%/94.90% on MiniF2F but only 40% on the CoS benchmark. DeepSeek-R1 (600B) reaches 67.62%/61.48% with 5 samples.