KELPS: A Framework for Verified Multi-Language Autoformalization via Semantic-Syntactic Alignment
Jiyao Zhang, Chengli Zhong, Hui Xu, Qige Li, Yi Zhou
cs.CL
Jul 11, 2025 · v1
TL;DR
KELPS translates informal math into Lean, Coq, and Isabelle via Knowledge Equations, producing a verified multilingual parallel corpus.
Abstract
Modern large language models (LLMs) show promising progress in formalizing informal mathematics into machine-verifiable theorems. However, these methods still face bottlenecks due to the limited quantity and quality of multilingual parallel corpora. In this paper, we propose a novel neuro-symbolic framework KELPS (Knowledge-Equation based Logical Processing System) to address these problems. KELPS is an iterative framework for translating, synthesizing, and filtering informal data into multiple formal languages (Lean, Coq, and Isabelle). First, we translate natural language into Knowledge Equations (KEs), a novel language that we designed, theoretically grounded in assertional logic. Next, we convert them to target languages through rigorously defined rules that preserve both syntactic structure and semantic meaning. This process yielded a parallel corpus of over 60,000 problems. Our framework achieves 88.9% syntactic accuracy (pass@1) on MiniF2F, outperforming SOTA models such as Deepseek-V3 (81%) and Herald (81.3%) across multiple datasets. All datasets and codes are available in the supplementary materials.
Problem
Autoformalization methods face bottlenecks due to limited quantity and quality of multilingual parallel corpora between natural language and formal proof languages (Lean, Coq, Isabelle).
Approach
The authors propose KELPS (Knowledge-Equation based Logical Processing System), an iterative neuro-symbolic framework for translating informal mathematics into multiple formal languages. KELPS first translates natural language into Knowledge Equations (KEs), an intermediate representation grounded in assertional logic. KEs are then converted to target languages (Lean, Coq, Isabelle) through rigorously defined rules preserving syntactic structure and semantic meaning. The process yields a parallel corpus of over 60,000 problems.
Results
KELPS achieves 88.9% syntactic accuracy (pass@1) on MiniF2F, outperforming DeepSeek-V3 (81%) and Herald (81.3%) across multiple datasets. The framework produces verified formalizations in Lean, Coq, and Isabelle simultaneously from the same intermediate representation.