General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
148,812 datasets
A dataset containing the results of a zeroed unitarization procedure for counties in the Podkarpackie region of Poland. The data served as input for a statistical analysis conducted for a master's thesis. The dataset is 45.4 KB in size, stored in an XLSX file, and was last updated on June 2, 2026.
Data collected on 24 February 2012 by the Great Barrier Reef Wireless Sensor Network, part of the Wireless Sensor Networks Facility. The facility is a component of the Great Barrier Reef Ocean Observing System (GBROOS) and the Australian Integrated Marine Observing System (IMOS). The data set is hosted by the Australian Ocean Data Network.
A 5.5 KB Excel dataset from figshare, last updated May 26, 2026. It contains percentages of calcified granulomas from naturally infected pigs with neurocysticercosis (NCC) that show positive immunoreactivity to residual cyst antigens, as detected by monoclonal antibody-based immunohistochemistry assays. The data is distributed across various post-treatment timepoints and was authored by Luz M. Toribio under a CC-BY-4.0 license.
An overview of performance metrics for different model topologies. The dataset reports NLLH, AIC, AICc, and BIC values, the number of model parameters, and a checkmark indicating if a model reproduces described observations. All values are estimated via maximum likelihood estimation. The dataset was authored by Yuhong Liu and last updated on June 2, 2026.
A collection of p-values from chi-square tests for uniform distribution and independence, and linear regression results for haul trends. The dataset, 11.9 KB in size, was authored by Pedro Leitão and last updated on June 2, 2026. It applies Bonferroni corrections and highlights significant positive trends in haul numbers per year.
A dataset listing main fishing gears and their associated counts of potential métiers, vessels, trips, and hauls. The data is provided in an XLS file and was authored by Pedro Leitão. It was last updated on June 2, 2026.
A 5.5 KB Excel file, uploaded by Uma Shashi Sharma on June 2, 2026, that compares clustering algorithms. The dataset likely contains performance scores for methods applied to a shape embedding of dendritic spines, a neuronal structure. It evaluates both hard clustering metrics, such as the Silhouette score, and soft clustering metrics, such as average entropy.
Field-collected Aedes aegypti mosquitoes were exposed to cypermethrin. The dataset contains mutation frequencies for alleles R52H, D126N, P162H, and A532E, comparing live and dead insects. Han-Hsuan Chung published the data on figshare in May 2026.
A 9.3 KB Excel file provides input data for a Mann-Whitney statistical test program. The dataset likely contains data from an officer training class to evaluate feedback on attitude. It was uploaded anonymously to figshare and is available under a CC-BY-4.0 license.
George Song provides datasets and auxiliary files for reproducing analyses from a paper on navigating protein fitness landscapes via deep learning. The 1.9 GB repository includes protein sequences, fitness scores, mutant libraries, and structure files. It was last updated on 2026-05-11 under a CC-BY-4.0 license.
Air quality data for Bucaramanga, Colombia, collected by the Regional Autonomous Corporation for the Defense of the Bucaramanga Plateau (CDMB) at a monitoring station at the Jorge Eliécer Gaitán school. The dataset contains real-time measurements of pollutants including PM10, PM2.5, SO2, NO2, O3, and CO, along with meteorological variables such as temperature, wind speed, and humidity. It is collected in compliance with Resolution 2254 of 2017 from Colombia's Ministry of Environment and Sustainable Development.
Detailed information on income received by the municipality of Yopal, Casanare, during the 2023 and 2024 fiscal years. The dataset aims to provide transparency and facilitate analysis of municipal financial management in terms of collection and revenue sources. It is published by the Colombian open data platform www.datos.gov.co and was last updated on 2026-05-18.
A survey dataset measuring citizen perception and knowledge regarding compliance with a departmental public policy on post-consumer waste management. The policy was approved via ordinance 025 in December 2019. The data is hosted by www.datos.gov.co and was last updated in May 2026.
A 46.9 MB dataset from the SAGE study, containing records from an article screening step for systematic reviews. The dataset, authored by Mathias Pietrancosta, is available in CSV, XLSX, and TXT formats under a CC-BY-4.0 license and was last updated on June 3, 2026.
A 489.6 MB collection of search strings and their development steps, likely for systematic literature reviews, authored by Mathias Pietrancosta and last updated in June 2026. The dataset is available in multiple formats including DOCX, XLSX, CSV, and TXT under a CC-BY-4.0 license. Its specific row count and column structure are not detailed in the provided metadata.
A 23.5 MB dataset related to the SAGE study's data extraction step, authored by Mathias Pietrancosta and last updated on June 3, 2026. The dataset is available under a CC-BY-4.0 license and includes files in PDF, XLSX, DOCX, CSV, and JSON formats.
The dataset for the SAGE study's data synthesis step is a 747.3 KB collection of CSV and XLSX files authored by Mathias Pietrancosta. It was last updated on June 3, 2026, and is shared under a CC-BY-4.0 license on figshare.
Location data for all recycling facilities within the Causeway Coast and Glens Borough Council area. The dataset is provided by the Government Digital Service under the UK Open Government Licence and is available in multiple geospatial formats including KML, GeoJSON, and ESRI Shapefile. Further operational details like opening hours are referenced via a link to the council's official website.
Three years of continuous methane and carbon dioxide measurements collected by the Australian atmospheric monitoring station 'Arcturus' in the Bowen Basin. The dataset underpins a simulation study modeling fugitive emissions from a typical coal seam gas field at varying rates and distances. Results, including statistical detection likelihoods and false alarm rates, were presented at the American Geophysical Union meeting in December 2013.
Change of name registration data from the NSW Registry of Births Deaths and Marriages. The data reflects the total number of registrations completed, not the number of actual life events. The dataset is licensed under CC-BY-4.0 and was last updated on 2026-05-28.