Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions

Jingxuan Wei, Caijun Jia, Qi Chen, Honghao He, Linzhuang Sun, Conghui He, Lijun Wu, Bihui Yu, Cheng Tan

cs.AI Aug 5, 2025 · v1

Read PDF arXiv abstract Code

TL;DR

Geoint-R1 generates Lean4 code for auxiliary geometric constructions, with a 1,885-problem benchmark of Lean4-annotated geometry problems.

Abstract

Mathematical geometric reasoning is essential for scientific discovery and educational development, requiring precise logic and rigorous formal verification. While recent advances in Multimodal Large Language Models (MLLMs) have improved reasoning tasks, existing models typically struggle with formal geometric reasoning, particularly when dynamically constructing and verifying auxiliary geometric elements. To address these challenges, we introduce Geoint-R1, a multimodal reasoning framework designed to generate formally verifiable geometric solutions from textual descriptions and visual diagrams. Geoint-R1 uniquely integrates auxiliary elements construction, formal reasoning represented via Lean4, and interactive visualization. To systematically evaluate and advance formal geometric reasoning, we propose the Geoint benchmark, comprising 1,885 rigorously annotated geometry problems across diverse topics such as plane, spatial, and solid geometry. Each problem includes structured textual annotations, precise Lean4 code for auxiliary constructions, and detailed solution steps verified by experts. Extensive experiments demonstrate that Geoint-R1 significantly surpasses existing multimodal and math-specific reasoning models, particularly on challenging problems requiring explicit auxiliary element constructions.

Problem

Multimodal LLMs struggle with formal geometric reasoning, especially dynamically constructing and verifying auxiliary geometric elements.

Approach

Geoint-R1 generates formally verifiable geometric solutions from text and diagrams, integrating auxiliary-element construction, formal reasoning represented in Lean4, and interactive visualization, trained via SFT then reinforcement learning with a verification reward. The Geoint benchmark provides 1,885 annotated problems with Lean4 code for auxiliary constructions.

Figure 2: The Geoint-R1 architecture and training pipeline.

Results

Geoint-R1 surpasses existing multimodal and math-specific reasoning models, particularly on problems requiring explicit auxiliary constructions; ablations show the verification reward, RL, and curriculum each contribute.

Papers With

Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions