General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
153,025 datasets
Detailed information on datasets collected from publications. The dataset is a 5.5 KB XLS file published by Enyan Liu under a CC-BY-4.0 license. It was last updated on June 2, 2026.
Distribution data for Ixodes persulcatus ticks that tested positive for phleboviruses. The 9.5 KB Excel file was authored by Mikhail Y. Kartashov and last updated in June 2026. Its specific geographic and temporal coverage must be inferred from the data file.
A replication package containing analysis scripts and documentation for a study on medical bias in large language models. The materials, authored by Qiufeng Jia, are available as a 222.2 KB ZIP file under a CC-BY-4.0 license and were last updated on May 19, 2026. The full study materials are hosted on a public GitHub repository.
Summary statistics of the Standardized Root Mean Square Error (SRMSE) for distributional fit, calculated across 20 independent runs. The dataset is a 5.5 KB Excel file authored by Michael Jones and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
Michael Jones published summary statistics of SRMSE for distributional fit across 109 countries. The dataset is a 5.5 KB Excel file last updated on June 2, 2026. It is available under a CC-BY-4.0 license on the figshare platform.
A 5.5 KB Excel file contains a cross-tabulation analysis of hypoxia groups by intrinsic subtype for the TCGA-BRCA cohort. The dataset was authored by Wenhan Yang and last updated on June 2, 2026. It is derived from The Cancer Genome Atlas Breast Invasive Carcinoma project.
Allele frequency data for the N868D variant in field-collected Aedes aegypti mosquitoes after exposure to cypermethrin insecticide. The dataset was authored by Han-Hsuan Chung and is available under a CC-BY-4.0 license. It was last updated on May 26, 2026.
Caifang Qiu published a dataset on figshare in June 2026 containing basic characteristics of female college students, comparing those majoring in dance to non-dance majors. The data is presented as means and standard deviations in a 5.5 KB Excel file. The dataset is licensed under CC-BY-4.0.
A 5.5 KB Excel file containing statistics on fire risk factors in urban villages, authored by Jiangxue Tian. The dataset was last updated on June 2, 2026, and is shared under a CC-BY-4.0 license.
A 9.5 KB Excel file from figshare details the distribution of fear-of-childbirth severity across selected baseline obstetric variables. Authored by Adeniyi Abiodun Adewunmi, the dataset was last updated on June 2, 2026. Its small size suggests a focused analysis of specific clinical factors related to maternal anxiety.
A metadata catalog from the Palmira municipal government's website, listing information resources for the 2022 fiscal year. The dataset includes columns such as ENLACE SITIO WEB, NOMBRE O TITULO DE LA INFORMACIÓN, and FRECUENCIA DE ACTUALIZACIÓN. It was published via the www.datos.gov.co platform and last updated on 2026-05-18.
Per-class detection metrics for Advanced Persistent Threat (APT) phases, calculated using 5-fold cross-validation. The 5.5 KB dataset reports mean values with standard deviations and was authored by Adel Alshamrani. It was last updated on June 2, 2026.
Tiered operational definitions and distribution data for CSSI risk factors. The dataset was authored by InHo Lee and is available under a CC-BY-4.0 license. It was last updated on June 2, 2026.
Participant training and racing characteristics are summarized by Tier level (1-3). The dataset, authored by Louise Burnie, is a 5.5 KB Excel file last updated on June 2, 2026. It presents means and standard deviations for an unspecified number of athletes.
This dataset maps the service areas for which DGTek Pty Ltd is the Statutory Infrastructure Provider (SIP). It forms part of the official SIP register managed by the Australian Communications and Media Authority (ACMA). The dataset was last updated on May 19, 2026.
Legal opinions issued by the Competition Advocacy Working Group of the Colombian Superintendence of Industry and Commerce (SIC) since 2009. The dataset includes details such as the regulating entity, project name, SIC recommendation, and entry and exit dates. It is published by www.datos.gov.co and was last updated on 2026-05-25.
Qatar National Library provides a high-resolution digital master copy of manuscript HC.MS.03224 from its Heritage Collection. The 48.6 MB ZIP file contains a digitized Quran manuscript, released under a CC0-1.0 license. The dataset was last updated on June 1, 2026.
Horse Racing Licensing is a registry of individuals authorized to compete in New York State. The dataset contains licensing details for participants, including their name, occupation, license status, and expiration dates. It is maintained by data.ny.gov and was last updated in early April 2026.
Table 1_An integrated clinical and imaging model for predicting post-traumatic nonunion.docx is a dataset from a retrospective cohort study of 343 patients with unilateral closed long bone fractures treated by internal fixation. The dataset was created by Bin Wang and last updated on 2026-04-13. It contains clinico-radiological variables used to develop and validate a machine learning model for predicting nonunion.
Twenty publicly available classification datasets selected for diversity in dimensionality, sample size, class distribution, and application domain. Gabriel Lima compiled the collection, provided in CSV format, for evaluating a Model-Agnostic Multivariate Separability Index. It was last updated on May 4, 2026.