← All papers
First page of Visored: A Controlled-Natural-Language Prover for LLM-Generated Mathematics

Visored: A Controlled-Natural-Language Prover for LLM-Generated Mathematics

Xiyu Zhai, Xinyi Chen, Yiping Wang, Runlong Zhou, Liao Zhang, Simon S. Du

cs.PL Jun 16, 2026 · v1 cs.AI
A controlled-natural-language prover whose accepted proofs are re-emitted as checked Lean files, driven by an LLM on miniF2F.
We present a dependent-type-based prover designed around the way LLMs (and humans) tend to write mathematics, complementing existing systems such as Lean and Rocq. Its core design choices are a surface that imitates mathematical natural language and a rule-driven automation layer that closes the routine steps a textbook would omit, so that an accepted proof can be re-emitted as a checked Lean file. Early experiments suggest that, even without any prover-specific training data, LLMs can learn to use it effectively on the miniF2F benchmark. Lean output excerpts: https://github.com/xiyuzhai-husky-lang/visored/

LLMs write mathematics in natural-language style but still hallucinate, and directly producing checked Lean from that style is hard. The paper asks how to give LLM workflows reliable mathematical feedback closer to how proofs are actually written.

Visored is a dependent-type-based prover with a controlled-natural-language surface that imitates mathematical prose. A rule-driven, cost-bounded automation layer discharges routine steps a textbook would omit. The pipeline elaborates CNL into a typed AST and Visored MIR, checks well-definedness, and transpiles an accepted proof into a Lean file built with lake. An LLM coding agent (Claude) drives the CLI, reading diagnostics and revising.

A coverage study on miniF2F-valid (244 problems) shows LLMs can use the CNL surface effectively with no prover-specific training data; the authors present it as a prototype and a lower bound on expressivity rather than a tuned pass rate.