General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
166,085 datasets
A 1960 comparison of magnetic field instruments calibrated at Toolangi against a proton precession magnetograph at Woomera. Observations were conducted on June 27th and 28th, 1960 by the Weapons Research Establishment. The record likely contains calibration data and comparisons between different measurement technologies.
Ruiqiong Zhou's study on figshare analyzes 2,058 single frozen–thawed day-6 blastocyst transfers from August 2021 to August 2024. The research investigates the interaction between progesterone exposure duration and blastocyst expansion stage on live birth rates. The dataset likely contains clinical variables and outcomes from this retrospective cohort study.
Sentinel-1A SAR VV Backscatter Quicklooks are medium-resolution (approximately 20m pixel size) radar images processed by CSIRO for the eReefs Phase 5 project. The product is derived from Level 1 IW GRD data with VV polarization, reprojected onto a regular grid and filtered to reduce speckle. It is available as a multi-format geospatial service for environmental monitoring.
RyeAI/ftspeech-pnc-da is a text-only companion dataset for Danish speech processing. It provides restored punctuation and capitalization for utterances from the original FTSpeech dataset, with each row containing an utterance_id for joining back to the source. The dataset was created by RyeAI and last updated on 2026-06-19.
Wilson Castro's dataset on figshare describes the distribution of chicken meat types based on pH measured at 24 hours postmortem. The dataset is a 5.5 KB XLS file last updated on 2026-05-19. It is shared under a CC-BY-4.0 license.
A summary of adjusted survival models, including covariates, complete-case sample sizes, and event counts. The dataset is a 12.8 KB Excel file authored by Wenhan Yang and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
Hayato Harima published ANOVA and Tukey test results for viral titers on 2026-05-26. The 15.7 KB Excel file contains statistical analysis of recombinant virus infections in Vero E6 cells measured at 72 hours post-infection.
A dataset showing the distribution of microalbuminuria prevalence across quartiles of the Waist-to-Height Ratio (WHtR). The data includes 95% confidence intervals and is provided in an XLS file. It was authored by Xia Huang and last updated on 2026-05-19.
A synthetic Turkish-language dataset designed for training AI models on reasoning tasks related to automotive spare parts. It contains approximately 990 question-answer entries, primarily focused on Opel vehicles but includes examples for other brands. The dataset was created by aiprojecom and is licensed under MIT.
The Australian Communications and Media Authority published anticipatory notices for Opticomm Pty Ltd infrastructure projects. The data includes two project areas with estimated completion dates, addresses, and geographic coordinates. The notices were given on 31 May 2025 and partly declared on 29 April 2026.
A directory of registered vehicle mechanical workshops operating in the Huila Department of Colombia. The dataset includes location and contact information for workshops, compiled from public records maintained by the Huila Chamber of Commerce. It was last updated on 2026-05-18.
A 40.9 KB Excel file provides biomass data for dominant grassland species. Zhening Zhu published the dataset on figshare in June 2026. It likely contains aboveground and belowground biomass measurements for the top four species at each sampling site.
Brazilian industrial innovation indicators from the 2022 Semi-Annual Innovation Survey (PINTEC Semestral) conducted by IBGE. The dataset contains 108 tables covering industrial companies in mining/extractive and manufacturing sectors with 100 or more employees, with a thematic module on digital transformation. Data were restructured from IBGE spreadsheets and published as tabular data on Dataverse.
Colombian municipal and district administrations present initiatives for potable water and basic sanitation investment projects in their development plans. The dataset includes columns for project description, estimated value, funding sources, and study status. It was issued on 2024-08-02 and last updated on 2026-05-18 18:45:35 via the Socrata platform.
Mixed Beverage Tax revenue distributions are tracked for cities and counties in Texas. The dataset likely contains monthly or periodic reports detailing tax payments, year-to-date totals, and comparisons to prior periods. It is published by the City of Austin and appears on multiple data platforms.
Council spending data published monthly by the Government Digital Service. The dataset covers transactions from April 2016 onward and is available in multiple formats including CSV, XML, and JSON. The description notes that publication timescales may have been affected during the COVID-19 pandemic.
12,000 feet of marine conglomerate, sandstone, limestone, and shale are described for the Upper Devonian and Carboniferous platform sequence. The data, provided by the Australian Ocean Data Network, details a geological formation disconformably overlain by 350 feet of terrestrial sandstone. The metadata indicates the record was last updated on 2026-06-05.
5.3 GB of fitted model parameters from a computational neuroscience study on inter-area brain dynamics. The dataset, authored by Mitra Javadzadeh, supports the findings of a 2024 bioRxiv preprint on dynamic consensus-building between neocortical areas. It was last updated on May 17, 2026.
Members of the Risaralda Regional Competitiveness Commission in Colombia, including their affiliated entities and contact information. The dataset is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18. It lists individuals with their associated companies, positions, and municipality-level entity details.
20 comparative tests validated an ESP32-based weighing device for measuring cleaning agent use in hospital sterile supply departments. The code likely contains the firmware and application logic for the device, which was tested over 107 cleaning cycles. Wei Zheng authored this dataset, which was last updated on April 10, 2026.