Null Measurability at the Symmetrization Interface in VC Learning
Dhruv Gupta
cs.LG
Apr 27, 2026 · v1
TL;DR
Formalizes the descriptive-set-theory infrastructure and main results in Lean 4.
Abstract
Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.
Problem
The standard symmetrization proof in PAC learning theory imposes Borel measurability of ghost-gap suprema, but this requirement may be stronger than necessary for the fundamental theorem of statistical learning.
Approach
The authors show that for any Borel-parameterized concept class on a Polish domain, the bad event at the symmetrization interface is analytic and therefore null-measurable (measurable in the completion of every finite Borel measure) by Choquet capacitability. They construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation. They also prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation. The main results and the descriptive-set-theoretic infrastructure are formalized in Lean 4.
Results
The paper weakens the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability, replacing Borel measurability with null-measurability and proving this weaker condition is strictly more general yet stable under natural concept-class constructors.