A training dataset for mathematical embedding models, built on the principle that the same concept can be expressed in multiple surface forms. It pairs informal natural-language statements and their rephrasings with Lean 4 type signatures and declarations, and is designed for contrastive embedding training. The dataset was created by uw-math-ai and was last updated on 2026-05-29.
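To make the contrastive setup concrete, here is a minimal sketch of an InfoNCE-style loss with in-batch negatives, one common objective for this kind of training. It is an assumption that this matches the intended training recipe, since the listing does not name a loss. The example pair, the toy 2-dimensional vectors, and the names `cosine` and `info_nce` are all hypothetical and are not drawn from the dataset.

```python
import math

# Hypothetical pair of surface forms for one concept (illustrative only,
# not taken from the dataset): an informal statement and a Lean 4 signature.
example_pair = (
    "Addition of natural numbers is commutative.",
    "theorem add_comm' (n m : Nat) : n + m = m + n",
)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the matching surface form for anchors[i]; every other
    entry of `positives` serves as an in-batch negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy embeddings standing in for encoder outputs (assumed, 2-dimensional).
informal_vecs = [[1.0, 0.0], [0.0, 1.0]]
lean_vecs_aligned = [[0.9, 0.1], [0.1, 0.9]]
lean_vecs_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(informal_vecs, lean_vecs_aligned)
loss_shuffled = info_nce(informal_vecs, lean_vecs_shuffled)
```

In this sketch the loss is lower when each informal statement sits closest to its own Lean counterpart, which is the behavior contrastive training is meant to encourage. A real pipeline would replace the toy vectors with the outputs of a trained encoder.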
The license is unknown; users must verify the licensing terms before use.