← All papers
First page of Understanding Tool-Augmented Agents for Lean Formalization: A Factorial Analysis

Understanding Tool-Augmented Agents for Lean Formalization: A Factorial Analysis

Ke Zhang, Patricio Gallardo, Maziar Raissi, Sudhir Murthy

cs.SE Apr 16, 2026 · v1
Factorial study of tool-augmented agents (incl. a Lean REPL) for Lean 4 formalization.
Automatic translation of natural language mathematics into faithful Lean 4 code is hindered by the fundamental dissonance between informal set-theoretic intuition and strict formal type theory. This gap often causes LLMs to hallucinate non-existent library definitions, resulting in code that fails to compile or lacks semantic fidelity. In this work, we investigate the effectiveness of tool-augmented agents for this task through a systematic factorial analysis of three distinct tool categories: Fine-tuned Model Querying (accessing expert drafts), Knowledge Search (retrieving symbol definitions), and Compiler Feedback (verifying code via a Lean REPL). We first benchmark the agent against one-shot baselines, demonstrating large gains in both compilation success and semantic equivalence. We then use the factorial decomposition to quantify the impact of each category, isolating the marginal contribution of each tool type to overall performance.

Automatic translation of natural language mathematics into Lean 4 code is hindered by the dissonance between informal set-theoretic intuition and strict formal type theory, causing LLMs to hallucinate non-existent library definitions.

The authors conduct a systematic factorial analysis of three tool categories for Lean formalization agents: Fine-tuned Model Querying (accessing expert drafts), Knowledge Search (retrieving symbol definitions), and a third tool category. The study investigates how combinations of these tools affect the effectiveness of LLM-based autoformalization agents working in Lean 4.

The factorial analysis identifies which tool combinations most effectively address the compilation and semantic fidelity challenges in automated Lean formalization.