General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
147,277 datasets
A collection of digitized historical maps of Queensland from 1841 to 2005, provided by the Queensland Department of Natural Resources and Mines, Manufacturing and Regional and Rural Development. The dataset includes cadastral maps showing property boundaries, descriptions, and land tenure, along with miscellaneous related maps and key maps. The quality of the scans varies and some maps include annotations.
Young forest distribution and estimated stand ages across Russia at 500-meter resolution for 2012. NASA produced this dataset by modeling 12- to 27-year-old forests from MODIS records and augmenting them with 0- to 11-year-old forest data aggregated from 30-meter Landsat imagery. The dataset provides a detailed spatial snapshot of early-stage forest recovery for the entire country.
Casanare Department in Colombia maintains a registry of nonprofit entities active with the local chamber of commerce. The dataset includes columns for legal name, address, contact information, registration dates, economic activity codes, and legal representative. It was last updated on 2026-05-18 and is provided by the www.datos.gov.co platform.
A dataset from 2026 by Debomita Chakraborty containing the percentage distribution of parameter sets that produce specific functions in synthetic gene circuits. It includes 8,813 parameter sets for a toggle triad network and 6,951 sets for an equivalent circuit, mapped across 14 distinct functions.
NSW Timber Reserves boundaries created by Forestry Corporation of NSW from the state's cadastre. The dataset is a shapefile download of approximately 280 kilobytes, though boundaries may not align with the most current cadastre due to ongoing positional improvements by Spatial Services. It was last updated on 2026-05-13.
A dataset from figshare updated on June 4, 2026, containing model fit comparisons for parasite prevalence models. It ranks three model types—intercept-only, age-sex fixed effects, and age-sex interaction models—using expected log predictive density (ELPD) metrics. The data was authored by Stephanie M. Wu and is shared under a CC-BY-4.0 license.
10.8 KB of data enumerating individuals with available measurements for various intestinal parasites, subset to their first time point. The dataset includes demographic breakdowns by age and sex. It was authored by Stephanie M. Wu and last updated on June 4, 2026.
French UMCS-22 measurement invariance tests among age groups of volunteer firefighters. The dataset includes 194 individuals aged 16-24, 740 aged 25-49, and 214 aged 50-65. It was authored by Marina Burakova and last updated on 2026-05-28.
Lei Zhengyao's dataset provides time-averaged wall shear stress (TAWSS) values from transient cardiovascular simulations. Values were obtained from the final cardiac cycle to ensure periodic stability. The dataset includes results for below-normal, normal, and high blood viscosity conditions, representing inter-individual variability.
Candidate phonological probes from Angelo Maria Sabatini, last updated on 2026-06-04. The dataset is a 5.5 KB Excel file listing trigrams that exhibit significant correlations with block position. A 'Match' column indicates whether each trend aligns with expectations from MD dynamics.
Patrick Schmidt published this dataset on figshare in June 2026. It contains calculated Weibull modulus and characteristic strength values derived from Weibull plots of bending bar samples. The dataset is small, at 10.8 KB, and the number of bars analyzed per sample is indicated in brackets.
During the 1989 FIFE Intensive Field Campaign, a Russian Gemma spectroradiometer collected visible and near-infrared spectra from a helicopter over multiple study sites. The data provides an intermediate scale of sampling between surface measurements and higher-altitude aircraft and spacecraft imaging. Measurements were designed to characterize FIFE sites and allow comparison with SE-590 and Modular Multiband Radiometer (MMR) instruments.
Victoria Sivill's dataset provides classification accuracy scores with standard deviation for the binary-transformed P(M), P(P), and P(P|M) under a variety of classification models. The data is stored in a 5.5 KB XLS file and was last updated on May 28, 2026. The description notes that the most successful model on all three target variables was the neural network classifier.
Over 30 columns describe housing conditions, socioeconomic status, and pet ownership for the municipality of Paipa. The dataset includes variables such as Sisbén score, housing stratum, tenure, sanitation, lighting, and counts of vaccinated pets. It was published by datos.gov.co and last updated on 2026-05-18.
This dataset contains all fly-tipping incidents recorded in the City of York from July 2019 onwards, sourced from the council's customer relationship management tool. It is published by the Government Digital Service and is available in multiple geospatial and tabular formats. The data is served through a live API link to the council's GIS server, so changes to the master copy are reflected in the dataset.
Unresolved fly-tipping incidents recorded by the City of York Council from July 2019 onward. The data is sourced from the council's customer relationship management tool and is published via a live API link to their GIS server, excluding incidents created in the last 14 days. The dataset is provided by the Government Digital Service under the OGL-UK-3.0 license.
Graffiti reports recorded in York's customer relationship management system from November 2019 onward. This dataset contains the most recent incidents covering a 30-day period, but excludes reports created within the last 14 days. The data is published as a live API link to the City of York Council's GIS server, meaning changes to the master copy are reflected immediately.
13.4 MB of supplementary Excel tables from the BIO285D research thesis of a Biochemistry student at Pontificia Universidad Católica de Chile. The tables were not included in the final manuscript due to their length. The data was uploaded in June 2026.
Medición del Desempeño Municipal (MDM) measures, compares, and ranks municipalities in the Department of Boyacá, Colombia, based on their integral performance in management and development results. The dataset is provided by the Colombian National Planning Department (DNP) and covers the 2020 period. It analyzes two components—management and results—while accounting for initial endowments to create comparable groups.
Salt Bin - Last 30 Days Incidents contains the most recent salt bin incident reports in York, covering a 30-day period. The data originates from the City of York Council's customer relationship management tool, with records from November 2019 onwards. This dataset is a live API link to the council's GIS server, reflecting updates immediately, but excludes incidents created in the last 14 days.