← All papers
First page of Keep the Proof State Live: Snapshotting for Efficient Tactic Search in Lean 4

Keep the Proof State Live: Snapshotting for Efficient Tactic Search in Lean 4

Austin Shen, Yunong Shi

cs.LO May 25, 2026 · v1
Introduces proof-state snapshotting via a Lean 4 language-server extension to speed parallel tactic search.
Automated theorem proving systems built on Lean 4 increasingly rely on parallel tactic search over partially specified proofs, such as those generated by Draft-Sketch-Prove (DSP) pipelines. In current systems, each search branch reconstructs a proof state by re-running elaboration, leading to substantial per-branch overhead. In Lean 4 with Mathlib, this cost has two components: (1) import loading, which deserializes pre-compiled libraries (~60 s per branch); and (2) theorem-body elaboration, which re-checks the theorem context up to the target goal (estimated 18-735 s depending on proof complexity). Together, these account for >99% of per-branch wall time, making portfolio-based search impractical at scale. We observe that this overhead arises from a mismatch between the structure of proof search and its execution model: branching is implemented via repeated reconstruction of proof states rather than direct reuse. To address this, we introduce proof-state snapshotting, which captures the elaborated proof state once and reuses it across branches via a small extension to the Lean 4 language server. Across 48 miniF2F-v2 problems (45 prove-phase benchmarks and 3 full end-to-end runs), our approach achieves a 5.6-50x wall-time speedup over the standard fallback (average 14x, median 9.7x). Speedup increases with the number of proof branches. Our method is orthogonal to import-level caching (e.g., Kimina Lean Server), which avoids import loading but not theorem-body elaboration. The patched Lean binary and the Snapshot-DSP pipeline will be released as open source upon publication.

In portfolio-based tactic search for Lean 4 theorem proving, each search branch reconstructs the proof state from scratch (import loading ~60s plus theorem-body elaboration 18-735s), consuming over 99% of per-branch wall time and making parallel search impractical.

The authors introduce proof-state snapshotting, which captures the elaborated proof state once via the Lean 4 language server and reuses it across all branches through three custom LSP methods. The system forks the already-elaborated state at each sorry position and fans out tactic attempts in parallel using Lean's work-stealing scheduler, achieving O(1) elaboration cost regardless of branch count.

Figure 1: (a) Fallback (Level 0): every branch independently reloads Mathlib and re-elaborates the theorem body (75–795 s per branch depending on theorem complexity; 75 s for simple theorems, up to 795 s for the hardest benchmark), yielding O(N) total overhead. The diagram shows W\!=\!2 concurrent workers; in general W\approx\lfloor\text{RAM}/3\,\text{GB}\rfloor , so W\ll N on any single machine.

Across 48 miniF2F-v2 problems, snapshotting achieves 5.6-50x wall-time speedup over the standard fallback (average 14x, median 9.7x). Speedup scales with sorry-hole count: 7.9x at 1 hole, 29.8x at 5 holes. The approach is orthogonal to import-level caching and composes with it for potentially over 100x combined speedup.

ApproachWall timeSpeedup
Level 0 (lake build, per branch)2,641 s1x
Level 1 (import cached, W=2)~330 s~8x
Level 2 (snapshot, this work)133 s~20x
Level 1+2 (combined, projected)~17 s>100x
Speedup by approach level