AXLE: A Cloud Infrastructure for Lean 4 Theorem Proving Utilities
Jimmy Xin, Alex Schneidman, Chris Cummins, Karun Ram, Srihari Ganesh, Jannis Limperg
cs.LO
Jun 24, 2026 · v1
cs.AI
TL;DR
Presents AXLE, a scalable cloud service providing 14 Lean 4 metaprogramming tools for proof verification, manipulation, and extraction.
Abstract
We present AXLE (Axiom Lean Engine), a cloud service for Lean 4 proof manipulation, extraction, and verification. Recent progress in AI for mathematics -- reinforcement learning pipelines, agentic proving workflows, dataset curation -- demands Lean 4 tooling that scales to millions of requests while remaining correct and robust; existing infrastructure offers parallel compilation but not scalable proof verification, higher-level proof manipulation, multi-version support, or per-request isolation at the throughput modern AI workflows require. AXLE provides 14 Lean 4 metaprogramming tools spanning strict proof verification, declaration metadata extraction, semantic source manipulation, deterministic proof repair and simplification, and lemma extraction. The service runs as a multi-tenant cloud deployment with per-request isolation and concurrent support for multiple Lean 4 and Mathlib versions, accessible via a Python SDK, command-line interface, web UI, MCP server, and raw HTTP API. AXLE is publicly available and free to use at
https://axle.axiommath.ai and via the axiom-axle PyPI package, with no local Lean 4 installation required. It has served over 500 million requests to date and is the underlying infrastructure for Axiom Math's proving efforts, including its 12/12 score on the 2025 Putnam competition.
Problem
AI-driven Lean 4 workflows (RL training, agentic proving, dataset curation) need fast, correct, and scalable tooling with per-request isolation and multi-version support, which existing infrastructure does not provide.
Approach
AXLE implements 14 Lean 4 metaprogramming tools (strict verification, metadata extraction, proof repair, simplification, lemma extraction) served from a multi-tenant cloud deployment. Each request runs in isolation with concurrent support for multiple Lean/Mathlib versions. Access is via Python SDK, CLI, web UI, MCP server, or HTTP API.
Results
Median verify_proof latency of 0.97s, 7.27x near-linear scaling to 8 concurrent requests, 100% agreement with alternative strict checkers, and over 500 million requests served. Underlies Axiom Math's 12/12 Putnam 2025 result.
| system | median latency | throughput | total wall | success |
|---|
| Axle verify_proof | 0.97 s | 0.43 req/s | 9,205 s | 100.0% |
| SafeVerify | 10.1 s | 0.107 req/s | 37,258 s | 99.9% |
| Comparator | 95.7 s | 0.026 req/s | 151,267 s | 93.8% |
Strict-checker comparison