Mathlib4 Theorem Dependency Graph (LeanDojo Benchmark 4, v10)
Description
This dataset contains a directed graph of theorem dependencies extracted from the Mathlib4 library, a formal mathematics ecosystem for the Lean 4 proof assistant. The graph was constructed from the LeanDojo Benchmark 4 (v10), originally published on Zenodo (DOI: 10.5281/zenodo.8040109). Each node represents a theorem or lemma identified by its fully qualified name within Mathlib4. A directed edge from node A to node B indicates that theorem A was used as a tactic in the proof of theorem B. Nodes are annotated with the file path of their source file within the Mathlib4 repository, which encodes the mathematical area the theorem belongs to (e.g., Mathlib/Topology/Semicontinuous.lean). The graph was exported from a Python pipeline using NetworkX and is provided in GraphML and GEXF formats for interoperability. Basic statistics: Nodes: 137,046 Edges: 304,433 Average degree: 4.44 Isolated nodes: 43,138 (~32%) Strongly connected components: 137,046 (all trivial, confirming the graph is a DAG) Weakly connected components: 43,936 Source data: LeanDojo Benchmark 4, v10 (random/train.json, random/test.json, random/val.json) Related publication: Yang et al., LeanDojo; The mathlib Community, CPP 2020.
Files
Steps to reproduce
Download the LeanDojo Benchmark 4 (v10) tar.gz from Zenodo (DOI: 10.5281/zenodo.8040109)). Extract random/train.json, random/test.json, and random/val.json and merge them into a single list of theorems. For each theorem, create a node using full_name as the identifier and file_path as a node attribute. For each entry in traced_tactics, inspect the second element of annotated_tactic. If non-empty, each item represents a theorem used in the proof - create a node for it (using full_name and def_path) and add a directed edge from it to the parent theorem. Empty lists correspond to automated proof strategies with no theorem dependencies and should be skipped. Export the resulting graph in GEXF format.
Institutions
- Universidade Estadual de Campinas (UNICAMP)São Paulo, Campinas