← All papers
First page of Can Proof Assistants Verify Multi-Agent Systems?

Can Proof Assistants Verify Multi-Agent Systems?

Julian Alfredo Mendez, Timotheus Kampik

cs.PL Mar 10, 2025 · v1
The Soda language compiles to Lean so multi-agent systems can be formally verified in the Lean proof assistant.
This paper presents the Soda language for verifying multi-agent systems. Soda is a high-level functional and object-oriented language that supports the compilation of its code not only to Scala, a strongly statically typed high-level programming language, but also to Lean, a proof assistant and programming language. Given these capabilities, Soda can implement multi-agent systems, or parts thereof, that can then be integrated into a mainstream software ecosystem on the one hand and formally verified with state-of-the-art tools on the other hand. We provide a brief and informal introduction to Soda and the aforementioned interoperability capabilities, as well as a simple demonstration of how interaction protocols can be designed and verified with Soda. In the course of the demonstration, we highlight challenges with respect to real-world applicability.

Multi-agent systems are notoriously difficult to test due to their dynamic and distributed nature, and ensuring that agent autonomy does not lead to safety violations requires formal verification methods that can also integrate with mainstream software ecosystems.

The authors present Soda, a high-level functional and object-oriented language that compiles both to Scala (for mainstream software integration) and to Lean 4 (for formal verification). This dual-target design allows multi-agent systems implemented in Soda to be both deployed in production and formally verified. The paper demonstrates how interaction protocols can be designed and verified, highlighting challenges with respect to real-world applicability.

The demonstration shows that interaction protocols for multi-agent systems can be formally verified via Lean compilation. Performance benchmarks show the system scales from 59,578 transactions/second with 1 user to 3,574 transactions/second with 2,048 users and 131,072 transactions.