Gödel's Poetry
Kelly J. Davis
cs.AI
Dec 16, 2025 · v1
TL;DR
Uses specialized language models for Lean4 proof generation with recursive theorem decomposition via an AST-extended Kimina Lean Server.
Abstract
Formal, automated theorem proving has long been viewed as a challenge to artificial intelligence. We introduce here a new approach to computer theorem proving, one that employs specialized language models for Lean4 proof generation combined with recursive decomposition of difficult theorems into simpler entailing propositions. These models are coordinated through a multi-agent architecture that orchestrates autoformalization (if required), proof generation, decomposition of difficult theorems into simpler entailing propositions, and recursive proof (and/or decomposition) of these propositions. Without decomposition, we achieve a 90.4% pass rate on miniF2F. With decomposition, this is significantly improved. A key technical contribution lies in our extension of the Kimina Lean Server with abstract syntax tree (AST) parsing capabilities to facilitate automated, recursive proof decomposition. The system is made available on PyPI as goedels-poetry (at
https://pypi.org/project/goedels-poetry ), and the open-source implementation KellyJDavis/goedels-poetry (at
https://github.com/KellyJDavis/goedels-poetry ) facilitates both adaptation to alternative language models and extension with custom functionality.
Problem
Formal automated theorem proving remains a challenge for AI, particularly for complex theorems that require decomposition into manageable subproblems. Existing neural theorem provers lack systematic mechanisms for recursive proof decomposition.
Approach
The system (Godel's Poetry) employs a multi-agent architecture coordinated by a supervisor agent. Specialized agents handle autoformalization, proof generation, recursive decomposition of difficult theorems into simpler entailing propositions, and proof validation. The system uses specialized language models for Lean 4 proof generation and extends the Kimina Lean Server with abstract syntax tree (AST) parsing capabilities to facilitate automated recursive proof decomposition.
Results
Without decomposition, the system achieves a 90.4% pass rate on miniF2F. With recursive decomposition, this rate is significantly improved. The system is publicly available on PyPI as goedels-poetry.