← All papers

Yanasse: Finding New Proofs from Deep Vision's Analogies, Part 1

Alexandre Linhares

cs.AI Apr 19, 2026 · v1

theorem-proving-ml mathlib

Read PDF arXiv abstract

TL;DR

Transfers Mathlib tactic-invocation patterns across areas to find new Lean-verified proofs.

Abstract

Project Yanasse presents a method for discovering new proofs of theorems in one area of mathematics by transferring proof strategy patterns (e.g., Lean 4 tactic invocation patterns) from a structurally distant area. The system extracts tactic usage distributions across 27 top-level areas of Mathlib (217,133 proof states), computes z-scores to identify tactics that are heavily used in a source area but rare or absent in a target area, matches source and target proof states via GPU-accelerated NP-hard analogy (running on a MacBook Air via Apple's MPS backend), and then asks an AI reasoning agent to semantically adapt--not symbol-substitute--the source tactics invocation pattern to the target theorem. In this first part of the study, the method is applied to the pair Probability -> Representation Theory, producing 4 Lean-verified new proofs out of 10 attempts (40%). The proofs compile with zero sorry declarations. The key finding is that tactic schemas decompose into a head (domain-gated, rarely transfers) and a modifier (domain-general, often transfers): filter upwards's head fails in representation theory (no Filter structure), but its [LIST] with ω modifier transfers cleanly as ext1 + simp [LIST] + rfl. Crucially, the underlying matching engine--deep vision lib.py--is entirely domain independent: the same optimization code for an NP-hard matching that matches chess positions by analogy matches Lean proof states by analogy, without knowing which domain it is processing. Only a relation extractor is domain-specific.

Problem

Proof strategy transfer between distant areas of mathematics has historically relied on human insight. No systematic method existed to automatically identify tactics that are routine in one mathematical area but absent in another, and then apply them to discover new proofs.

Approach

Project Yanasse extracts tactic usage distributions across 27 top-level areas of Mathlib (217,133 proof states), computes z-scores to identify tactics heavily used in a source area but rare in a target area, and matches source and target proof states via GPU-accelerated analogy on Apple's MPS backend. An AI reasoning agent then attempts to close target goals using the transferred tactic patterns. Lean 4 verifies all candidate proofs.

Results

For the top-ranked source-target pair (Probability to Representation Theory), four out of ten attempted schema transfers produced Lean-verified alternative proofs. Three new proofs were discovered using tactics like filter_upwards, ext1+simp, and any_goals that are common in probability but rare in representation theory.

Papers With

Yanasse: Finding New Proofs from Deep Vision's Analogies, Part 1