General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
169,018 datasets
A catalog consolidating information on the operation and management of administrative procedures. It includes data on registration, responsible parties, service channels, request status, and PQRD (Petitions, Complaints, Claims, and Reports). The dataset was last updated on 2026-05-15 and is hosted by www.datos.gov.co.
Companies registered with the Sevilla Chamber of Commerce from 2005 to 2019, including active, inactive, and canceled entities. The dataset includes columns for tax ID, registration dates, business name, and economic activity codes. It was published on www.datos.gov.co and last updated in May 2026.
Data supporting research on circular economy pathways for rare earth magnets, published by Zegui Sun in May 2026. The dataset includes 10.2 MB of files in PNG, CSV, and XLSX formats. It is licensed under CC-BY-4.0.
17.8 MB of raw data and molecular dynamics structures for concentration battery research. The dataset was authored by Wentao Hou and last updated on 2026-06-02. It is shared under a CC-BY-4.0 license.
CPR Land Sales Tabular Data contains records of agricultural land sales by the Canadian Pacific Railway to settlers in Alberta, Saskatchewan, and Manitoba between 1881 and 1927. These records were transcribed from original ledgers by Glenbow Archives volunteers and later enhanced by Archives and Special Collections at UCalgary. The dataset includes purchaser names, legal land descriptions, acres purchased, and cost per acre.
Records of agricultural land sales by the Canadian Pacific Railway to settlers in Manitoba from 1881 to 1927. The dataset includes purchaser names, legal land descriptions, acreage, and cost per acre, transcribed by volunteers from Glenbow Archives and enhanced by Archives and Special Collections at UCalgary. An interactive web application is available for exploring the spatial data.
International Organization for Migration (IOM) data on Internally Displaced Persons (IDPs) and Returnees in South Sudan, tracked through the Displacement Tracking Matrix (DTM). The dataset provides sub-national level figures, last updated on 2026-05-27.
551 records across eight game categories, including Chess, Go, and Texas Hold'em, designed for fine-tuning large language models on complex strategic reasoning. The dataset, created by '3amthoughts', is heavily weighted towards hard difficulty, with 522 of the 551 samples classified as hard. It was last updated on May 20, 2026.
Data.mo.gov provides a dataset of asbestos abatement project notifications. The data includes columns such as Contractor Name, Project Address, Start Date, Complete Date, and Cubic Feet, suggesting records of regulated construction activities. The dataset was last updated on 2026-05-28 22:01:13.
A geospatial dataset mapping coastal erosion constraint areas in Quebec where government standards must apply. The data is provided by the Government and Municipalities of Québec and was last updated on 2026-04-22. It is intended for integration into regional land use and development plans under the Land Use Planning Act.
A point dataset showing the locations of intakes and wellheads for drinking water systems in British Columbia, as defined by the Drinking Water Protection Act. The data is maintained by the Government of British Columbia and is updated as new information is received, with a last recorded update on 2026-04-22. It is one of three related datasets on drinking water sources.
Timber Supply Blocks are the primary unit for determining allowable annual cuts in British Columbia's forestry sector. This dataset provides the spatial representation of current Timber Supply Blocks, which are designated areas within larger Timber Supply Areas. The Government of British Columbia maintains this data, which was last updated on April 22, 2026.
Gene sets associated with neurological disorders. The dataset is a 10.7 KB XLSX file authored by Jiafang Li and shared under a CC-BY-4.0 license. It was last updated on May 29, 2026.
A small dataset listing the number of genes in a training set and candidate gene set. The 9.2 KB XLSX file was authored by Jiafang Li and last updated on May 29, 2026. It is shared under a CC-BY-4.0 license on figshare.
9.1 KB of gene expression data submitted to the CMAP platform for drug candidate discovery. The dataset is provided in XLSX format by Jiafang Li and was last updated on 2026-05-29. It is shared under a CC-BY-4.0 license.
9.0 KB of prevalence data for multiple disorders in a test set, shared by Jiafang Li on figshare. The dataset was last updated on 2026-05-29 and is available under a CC-BY-4.0 license.
Positiva Compañía de Seguros maintains this dataset of its healthcare provider network for serving affiliated members. The data includes provider names, types, locations, contact details, and official registration codes. It was last updated on 2026-05-18 and is hosted on the Colombian open data portal www.datos.gov.co.
A 5.5 KB Excel file containing Pearson correlations between objective functions and an emergency radius. The dataset was authored by Zahra Samadi Bahrami and last updated on May 29, 2026. The specific data points and sample size are not detailed in the available metadata.
5.5 KB of tabular results from a first scenario testing an Anti-Money Laundering (AML) system on large-scale cases. The dataset, authored by Zahra Samadi Bahrami, is available under a CC-BY-4.0 license and was last updated on May 29, 2026. Its small size suggests it likely contains summary metrics or aggregated outputs from the simulation.
Results from two scenarios evaluating the addition of an Anti-Money Laundering (AML) system on small-scale cases. The dataset was authored by Zahra Samadi Bahrami and last updated on May 29, 2026. It is a small 5.5 KB Excel file available under a CC-BY-4.0 license.