Description
A collection of 650 relational databases spanning domains such as e-commerce, finance, sports, biomedicine, and government, ported to the RelBench manifest format. It was created by stanford-rdl for large-scale pretraining of relational and tabular foundation models. Each database is self-describing, and tasks ship their labels as-is.
Use Cases
Pretraining relational and tabular foundation models on the collection's 650 multi-domain databases.
Benchmarking tabular machine learning models across diverse domains such as e-commerce, finance, sports, biomedicine, and government.
Developing self-describing data systems that build on the RelBench manifest format used by the collection.