A selected list of Autoformalization papers, projects, frameworks, libraries, tools, and other resources.
This repo is intended to be a manually selected and curated list rather than an exhaustive or authoritative bibliography. The entries, metadata, descriptions, and reading guides are manually written by the maintainer, and thus inevitably constrained by one's knowledge and perspective. Important work may be missing, and some details or descriptions may be incomplete, outdated, or mistaken.
Corrections, suggestions, and friendly discussions are more than welcome! Please feel free to open an issue or pull request.
| Year | Title | Venue | Main Techniques | Main Takeaways | Reading Guide |
|---|---|---|---|---|---|
| 2019 | A Promising Path Towards Autoformalization and General Artificial Intelligence | CICM 2020 | - | 1. Neural Autoformalizer + Neural Formal Theorem Prover + Symbolic Proof Verifier = Self-Improving Mathematical General Intelligence. 2. Autoformalization is essential for Neural Formal Theorem Proving to provide massive training data. |
Pioneering Manifesto: a must-read if you're interested in the history; most visions anticipated current standard practices, while a few are slightly outdated (e.g. underestimation of Informal Theorem Provers). ★★★½☆ |
| 2024 | Formal Mathematical Reasoning: A New Frontier in AI | ICML 2025 Position Track |
- | 1. Autoformalization has 3 important applications: synthesizing data for theorem provers, guiding formal theorem proving, and validating informal reasoning. 2. There are 3 main challenges in Autoformalization: automated evaluation, task decomposition, and interaction with formal systems. |
Best First Read: great intro to the field; frames Autoformalization in the greater context of AI for Formal Mathematics, identifies critical challenges and future milestones. ★★★★½ |
| 2025 | Autoformalization in the Era of Large Language Models: A Survey | - | - | 1. A taxonomy of Autoformalization tasks based on the workflow: Data Pre-processing, Autoformalization Methods, Post-processing, and Evaluation. 2. Autoformalization is helpful in 3 applications: Software Verification, Medical Domains, and LLM Output Verification. |
Good Resource Collection: a nice collection of Autoformalization datasets, benchmarks, methods, and progress in various mathematical domains; the resource coverage slightly lags the fast-moving frontier, and it focuses mostly on math and statement autoformalization. ★★★☆☆ |
| 2025 | Towards a Common Framework for Autoformalization | AAAI 2026 | - | 1. An Autoformalization task consists of 3 components: The Informal Language L_i (a domain-specific subset of it), The Formal Language L_f, and the Semantic Equivalence Criterion E. 2. The general framework encompasses targets beyond Interactive Theorem Prover Languages: Automated Theorem Prover Inputs (e.g. Prover9), Logic Programming Languages (e.g. SWI-Prolog), Planning Languages (e.g. PDDL), and Knowledge Representations (e.g. OWL). |
Inclusive Conceptual Framework: an inclusive conceptual view of Autoformalization with many applications beyond Mathematics. ★★★★☆ |
| 2026 | Theory-Level Autoformalization: From Isolated Statements to Unified Formal Knowledge Bases | ICML 2026 Position Track |
- | 1. Advocates a focus shift from Statement-Level Autoformalization to Theory-Level: formalizing the entire context including axioms, definitions, notations, theorems, proofs, tactics, and their inter-dependencies. 2. There are 4 key challenges in theory-level: reliable equivalence checking, hierarchical decomposition and abstraction learning, low-resource domain-specific languages, and multi-modal inputs. 3. There are 3 promising paths: benchmarks with sound equivalence checking, general-purpose models over specialized autoformalizers, and a common intermediate representation as a unified formalization target. |
Theory-Level Agenda: a forward-looking agenda focusing on Theory-Level Mathematics and other real-world applications (e.g. Automated Theorem Proving, Protocol Verification, Hardware Verification, Declarative Programming). No Rating: paper authored by the repo maintainer. |
Under construction.
| Year | Title | Venue | Main Techniques | Main Takeaways | Reading Guide |
|---|---|---|---|---|---|
| 2015 | Learning to Parse on Aligned Corpora (Rough Diamond) | ITP 2015 | - | - | No Rating: mainly for historical context. |
| 2017 | System Description: Statistical Parsing of Informalized Mizar Formulas | SYNASC 2017 | - | - | No Rating: mainly for historical context. |
| Year | Title | Venue | Main Techniques | Main Takeaways | Reading Guide |
|---|---|---|---|---|---|
| 2018 | First Experiments with Neural Translation of Informal to Formal Mathematics | CICM 2018 | - | - | No Rating: mainly for historical context. |
| 2019 | Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar | CPP 2020 | - | - | No Rating: mainly for historical context. |
| Year | Title | Venue | Main Techniques | Main Takeaways | Reading Guide |
|---|---|---|---|---|---|
| 2022 | Autoformalization with Large Language Models | NeurIPS 2022 | - | 1. General Foundation Models + Few-Shot Prompting works surprisingly well for Statement Autoformalization and generalizes to non-trivial cases. 2. Synthetic data curated via Autoformalization improves Neural Theorem Proving. 3. There is an asymmetry between Informalization and Formalization: LLMs are better at the former than the latter, providing the basis for methods leveraging back-translation. |
Foundational Empirical Results: the findings are now widely accepted as common wisdom, which inspire the development of many important methods in the field. ★★★★☆ |
| 2022 | Towards a Mathematics Formalisation Assistant using Large Language Models | - | - | - | No Rating: mainly for historical context. |
Under construction.
Under construction.
Under construction.
Under construction.
Under construction.
Under construction.
Under construction.
Under construction.