ForEx: A Formal Verification Framework for Explainable Reasoning in Logical Fallacy Detection and Annotation

Pei-Cing Huang, Chienyu Liu, Chan Hsu, Ci-Siang Chen, Pei-Ju Lee, Yihuang Kang

cs.AI Jun 20, 2026 · v1 cs.CL cs.SC

Read PDF arXiv abstract

TL;DR

Translates LLM-generated fallacy-detection explanations into Lean4 and checks whether the rationale is derivable from encoded premises.

Abstract

Current evaluations of Large Language Models (LLMs) on logical fallacy detection focus on predicted labels, but do not establish whether those labels are supported by the reasoning the models provide. We propose ForEx (Formal Verification for Explainable Reasoning), a framework that translates LLM-generated explanations into Lean4 and verifies whether the translated rationale is derivable under encoded premises, not the logical validity of the original natural language argument. To distinguish prediction outcomes from the formal status of the supporting reasoning, we introduce the LLM Argument Verification Matrix, which separates label consistency from formal verification status. Experiments on LOGIC-Climate show that over 90% of LLM outputs can be translated into formal reasoning chains that pass verification, while agreement with human annotations remains around 20%. These results expose a systematic gap between formal derivability and label agreement, a distinction invisible to prediction-based metrics. ForEx moves LLM evaluation beyond label correctness toward machine-checkable analysis of formalized reasoning chains.

Problem

LLM evaluations on logical fallacy detection focus on predicted labels and do not establish whether the labels are supported by the reasoning the models provide.

Approach

ForEx translates LLM-generated explanations into Lean4 and verifies whether the translated rationale is derivable under encoded premises. The LLM Argument Verification Matrix separates label consistency from formal verification status.

Results

On LOGIC-Climate, over 90% of LLM outputs translate into formal chains that pass verification, while agreement with human annotations stays around 20%, exposing a gap between formal derivability and label agreement.

Papers With

ForEx: A Formal Verification Framework for Explainable Reasoning in Logical Fallacy Detection and Annotation