A pretraining corpus for MachineLearningLM, a framework that equips large language models with in-context machine learning capabilities. The dataset consists of ML tasks synthesized from millions of structural causal models, with in-context shot counts ranging up to 1,024. It was created by the author 'eshmoideas' and was last updated on June 15, 2026.
The license is unknown; verify the terms of use before using the dataset.