← All papers
Finiteness of Symbolic Derivatives in Lean
cs.LO
Feb 1, 2025 · v1
TL;DR
Formalizes a constructive finiteness proof of symbolic regex derivatives in Lean 4.
Abstract
This paper extends Brzozowski's classical result on finiteness of regular expression derivatives to symbolic location-based derivatives. The authors prove that the derivative state space is finite when quotienting only by associativity, deduplication, and idempotence (ADI), without requiring commutativity of union, enabling application to derivative-based backtracking (PCRE) match semantics where union is noncommutative. The proof is constructive, building a finite overapproximation of the derivative set, and is formalized in Lean 4.
Problem
Brzozowski's classical result shows finiteness of regular expression derivatives, but extending this to symbolic location-based derivatives under non-commutative union semantics (needed for PCRE backtracking match) was open.
Approach
The authors prove that the derivative state space is finite when quotienting only by associativity, deduplication, and idempotence (ADI), without requiring commutativity of union. The proof is constructive, building a finite overapproximation of the derivative set, and is formalized in Lean 4.
Results
Finiteness of symbolic derivatives under ADI quotienting is established, enabling derivative-based algorithms for backtracking regex semantics where union order matters.
