General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
165,638 datasets
SpriteID-5K v2.0 is a dataset of 2D game sprite triplets annotated with pose information, designed for training motion models in the sprite domain. It was created by Mohamed-Yossri for the BitSoul project and was last updated on Hugging Face in May 2026. Version 2.0 revises the action labeling and collection quotas to address imbalances in the prior version.
Version 2 data from 2008, updated with a batch of tiles in 2012, identifies 167 landscape areas as polygons attributed with geological names related to mass movement. The British Geological Survey (BGS) created this data at a 1:25,000 scale, covering selected 'classic' geology areas like Llandovery, Coniston, and the Cuillin Hills. It includes deposits that have moved downslope (landslips) as well as foundered strata where ground has collapsed due to subsidence.
Mass movement version 7 identifies landscape areas across Great Britain attributed with types of mass movement, such as landslips. The data covers onshore England, Wales, Scotland, and the Isle of Man at a 1:50,000 scale and is provided by the British Geological Survey (BGS). It includes foundered strata not described in the standard rock classification scheme, but caution is advised as historical recording may be incomplete and the landscape is dynamic.
Magdalena Department in Colombia provides data on the number of students enrolled in official and private schools across its municipalities from 2010 to 2020. The dataset excludes the municipalities of Santa Marta and Ciénaga. It includes columns for official, private, and contracted schools, as well as subregion and municipality codes.
Weiren Wang published a dataset of annotated stress-strain curves on figshare in June 2026. The dataset includes figure titles, axis labels, sample identifiers, and point coordinates, and is packaged in a 64.5 MB RAR file. It is shared under a CC-BY-4.0 license.
21.5 KB of data on the frequency of current use of specific wearable devices among users, shared by André Hajek on figshare. The dataset was last updated on June 2, 2026, and is available under a CC-BY-4.0 license in XLS format.
Summary statistics for geometric characteristics and simulation outputs, stratified by sex and heart failure status. The dataset is a 20.7 KB XLSX file authored by José Alonso Solís-Lemus and last updated on June 2, 2026. It is licensed under CC-BY-4.0 and hosted on figshare.
Descriptive statistics from an analysis of Harman's two English translations of children's literature. The dataset was authored by He He and last updated on June 2, 2026. It is a small 5.5 KB dataset stored as an XLS file.
5.5 KB of descriptive statistics and correlation data from a research project's third study. The dataset, authored by Brendan Boyle, is available under a CC-BY-4.0 license and was last updated on June 2, 2026. It is stored in an XLS format on the figshare platform.
5.5 KB of descriptive statistics and correlation data from a research project's second study. The dataset, authored by Brendan Boyle, is available under a CC-BY-4.0 license and was last updated on June 2, 2026. Its small size suggests it likely contains summary-level results rather than raw observational data.
Brendan Boyle published a dataset containing descriptive statistics and correlations for key variables from Study 1. The data is stored in a 5.5 KB XLS file, indicating a small-scale summary. It was last updated on June 2, 2026, and is shared under a CC-BY-4.0 license.
Detailed information of datasets collected from publications. The dataset is a 5.5 KB XLS file published by Enyan Liu under a CC-BY-4.0 license. It was last updated on June 2, 2026.
Distribution data for Ixodes persulcatus ticks that tested positive for phleboviruses. The 9.5 KB Excel file was authored by Mikhail Y. Kartashov and last updated in June 2026. Its specific geographic and temporal coverage must be inferred from the data file.
A replication package containing analysis scripts and documentation for a study on medical bias in large language models. The materials are available as a 222.2 KB ZIP file under a CC-BY-4.0 license, authored by Qiufeng Jia and last updated on 2026-05-19. The full study materials are hosted on a public GitHub repository.
Summary statistics of the Standardized Root Mean Square Error (SRMSE) for distributional fit, calculated across 20 independent runs. The dataset is a 5.5 KB Excel file authored by Michael Jones and last updated on June 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
Michael Jones published summary statistics of SRMSE for distributional fit across 109 countries. The dataset is a 5.5 KB Excel file last updated on June 2, 2026. It is available under a CC-BY-4.0 license on the figshare platform.
A 5.5 KB Excel file contains a cross-tabulation analysis of hypoxia groups by intrinsic subtype for the TCGA-BRCA cohort. The dataset was authored by Wenhan Yang and last updated on June 2, 2026. It is derived from The Cancer Genome Atlas Breast Invasive Carcinoma project.
Allele frequency data for the N868D variant in field-collected Aedes aegypti mosquitoes after exposure to cypermethrin insecticide. The dataset was authored by Han-Hsuan Chung and is available under a CC-BY-4.0 license. It was last updated on May 26, 2026.
Caifang Qiu published a dataset on figshare in June 2026 containing basic characteristics of female college students, comparing those majoring in dance to non-dance majors. The data is presented as means and standard deviations in a 5.5 KB Excel file. The dataset is licensed under CC-BY-4.0.
A 5.5 KB Excel file containing statistics on fire risk factors in urban villages, authored by Jiangxue Tian. The dataset was last updated on June 2, 2026, and is shared under a CC-BY-4.0 license.