LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Manooshree Patel, Rayna Bhattacharyya, Thomas Lu, Arnav Mehta, Niels Voss, Narges Norouzi, Gireeja Ranade

cs.LG Jan 24, 2026 · v1

autoformalization ai-agents dataset logic

Read PDF arXiv abstract

TL;DR

An autoformalizer/proof-checker module uses Lean to verify student proof steps, paired with the PeanoBench Lean/NL proof dataset.

Abstract

This paper considers the development of an AI-based provably-correct mathematical proof tutor. While Large Language Models (LLMs) allow seamless communication in natural language, they are error prone. Theorem provers such as Lean allow for provable-correctness, but these are hard for students to learn. We present a proof-of-concept system (LeanTutor) by combining the complementary strengths of LLMs and theorem provers. LeanTutor is composed of three modules: (i) an autoformalizer/proof-checker, (ii) a next-step generator, and (iii) a natural language feedback generator. To evaluate the system, we introduce PeanoBench, a dataset of 371 Peano Arithmetic proofs in human-written natural language and formal language, derived from the Natural Numbers Game.

Problem

LLMs allow natural language communication but are error-prone in mathematical reasoning, while theorem provers like Lean provide provable correctness but are hard for students to learn. There is no system combining both for mathematical tutoring.

Approach

LeanTutor combines the complementary strengths of LLMs and theorem provers in three modules: an autoformalizer/proof-checker that translates student input to Lean and verifies it, a next-step generator that suggests proof continuations, and a natural language feedback generator. To evaluate the system, the authors introduce PeanoBench, a dataset of 371 Peano Arithmetic proofs in both natural and formal language, derived from the Natural Numbers Game.

Results

The paper presents a proof-of-concept system demonstrating that combining LLMs with Lean 4 verification can provide provably correct mathematical tutoring, evaluated on 371 proofs in PeanoBench.

Papers With

LeanTutor: Towards a Verified AI Mathematical Proof Tutor