General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
163,584 datasets
Complete career batting statistics for Indian cricketers in One Day Internationals. The data covers players from 1974 through 2026. It was sourced from Kaggle, but the author and organization are unknown.
Nucleotide identities of yad genes against the EC958 reference in yad-positive strains. The dataset is a 58.0 KB CSV file authored by Chloe Ellison and last updated on June 1, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
Beneficiary records for university welfare services offered by ESAP. The data includes columns for service type, academic program, student methodology, territory, and beneficiary group. It is hosted on the Colombian open data portal, datos.gov.co, and was last updated on 2026-05-18.
A 2020 inventory of information assets, published records, and available data from the Colombian Institute for the Evaluation of Education (ICFES). The dataset lists categories of information with details on format, location, and availability, sourced from the Colombian open data portal. It was last updated on the platform in May 2026.
Mean accuracies for a machine learning model across training, validation, and test datasets. The dataset was authored by Kristina Kobrock and last updated on May 26, 2026. It is a small 5.5 KB XLS file available under a CC-BY-4.0 license.
The NSW Cadastre web service provides a dynamic map of cadastral features extracted from the NSW Digital Cadastral Database (DCDB). The service offers access to a state-wide integrated database of current land titles and property boundaries, maintained by Spatial Services (DCS). It was last updated on 2026-05-12.
geoBoundaries maintains standardized, open-license political boundaries for every country globally. This dataset provides ADM0 (country), ADM1, and ADM2 level administrative divisions for Venezuela. The boundaries have been produced and maintained since 2017.
Pairwise comparisons of regional capacity scores from 2023 include mean differences, 95% confidence intervals, and adjusted p-values. The data is derived from a one-way ANOVA with Tukey’s HSD post-hoc test and was authored by Pratik Sharma. The 10.3 KB XLSX file was last updated on May 12, 2026.
A 5.5 KB Excel file containing results from an ablation study on the AVP task. The dataset was authored by Enyan Liu and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
A dataset detailing a spatiotemporal encoder architecture, with parameter counts verified against the implementation. The dataset is a 5.5 KB XLS file authored by Hongmin Wang and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
A validation dataset from the METABRIC breast cancer cohort examining associations among a 16-gene hypoxia score, copy number alteration (CNA) burden, and overall survival. The 5.5 KB Excel file was authored by Wenhan Yang and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on figshare.
Primer sequences used for RT-qPCR validation of potential key genes. The dataset is a 22.5 KB Excel file authored by Haiming Liang and shared under a CC-BY-4.0 license. It was last updated on June 2, 2026.
Nagara Wakgari Futasa authored this dataset on TH softening technology. The dataset is a 13.5 KB Excel file available under a CC-BY-4.0 license. It was last updated on June 2, 2026.
Fit statistics for latent class models ranging from 2 to 6 classes, published by Shannon A. H. Compton. The dataset is a 5.5 KB Excel file hosted on figshare and last updated in May 2026.
AA-Briefcase-Lite is a public example scenario for Artificial Analysis' frontier agentic evaluation of realistic, long-horizon knowledge work. The evaluation extends frontier model benchmarking beyond coding and short-form reasoning to the professional deliverables knowledge workers produce day to day. The full evaluation consists of four private scenarios in which agents complete realistic professional workflows across data science.
Spatial layers from Ku-Ring-Gai Council detail flood characteristics for design floods ranging from a 20% Annual Exceedance Probability to the Probable Maximum Flood. The dataset is hosted on data.gov.au and was last updated in May 2026. It provides flood map outputs for the Middle Harbour Northern Catchments area.
SS2013v02/GA4402 is a collection of marine underwater video and still images from Balls Pyramid. The dataset is provided by the Australian Ocean Data Network and was last updated on 2026-06-17.
The Australian Ocean Data Network provides a tool for calculating distances to marine features. The AMSIS Distance To tool outputs the distance from a given location to the nearest selected marine feature. The dataset was last updated on 2026-06-17.
A collection of one-to-one semantic matches between harmful and harmless prompts, created by aligning prompts from the mlabonne/harmful_behaviors and mlabonne/harmless_alpaca source datasets. The dataset was created by the organization heretic-org and was last updated on 2026-06-16.
A LoRA adapter that fine-tunes the Gemma-3-4B-Instruct model for the Sinhala language. The adapter is reported as 92.3% preferred in an open evaluation. The author, organization, and specific training data details are unknown.