Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,906 datasets
A Data Management and Sharing Plan outlines the scientific data to be generated and/or used in a research project. The plan describes a strategy for managing and sharing project data related to the identification of first-in-class ligands that bind glial fibrillary acidic protein (GFAP). It was authored by Alison Axtman and last updated on May 11, 2026.
A Data Management and Sharing Plan outlines the scientific data to be generated and/or used in a research project and describes a strategy for managing and sharing that data. The plan is authored by Kevin Weeks and originates from the ODUM Harvested Dataverse. It was last updated on May 11, 2026.
A Data Management and Sharing Plan authored by Bethany Hedt-Gauthier, last updated on 2026-05-11. It describes the scientific data to be generated and/or used in the research project 'Sex, HIV, and Lung Health Across the Life Course: The Uganda Lung Health Study'. The plan outlines a strategy for managing and sharing the project's data.
Yearly data from the Electronic Labor Organization Reporting System (e-LORS), established under the Labor-Management Reporting and Disclosure Act. The system facilitates the electronic filing, storage, and disclosure of data submitted to the Department of Labor by labor unions, employers, and other entities. The dataset was last updated on March 7, 2026.
PAC-BENCH is a benchmark pipeline for evaluating multi-agent systems operating under privacy constraints, created by PAC-Bench. The dataset page was last updated on 2026-04-13.
Global ocean surface current observations collected by the U.S. Naval Oceanographic Office between 1853 and 1973. The data consist of 416 files of ship drift velocities averaged over one-degree and 2x5-degree squares, organized by year, month, and 10-degree latitudinal bands. Derived statistics include variances, covariance, and eddy kinetic energy.
A binary image classification dataset containing 2,300 images of hands with and without gloves. The dataset is hosted on Kaggle, but the author, organization, and creation date are unknown.
Data from ESSA and NOAA polar-orbiting satellites covering 1966 to 1978, providing global visible and infrared band observations. The collection consists of images scanned from archival 35mm film or paper photographs, converted to NetCDF format and mapped to a 10 km polar stereographic grid. Navigation and processing were performed by the National Snow and Ice Data Center (NSIDC).
InstVL is a large-scale dataset of images and videos designed for instance-aware vision-language pre-training. The dataset was created by wovenbytoyota-vai and introduced in the paper 'InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding'. It was last updated on the platform in April 2026.
EMD-62785 and PDB 9L3G contain cryo-electron microscopy density maps and atomic coordinates for the flotillin complex. The dataset includes map and model files, along with official validation reports, provided while the primary entries are under a hold for publication status. These structural data support research into the molecular organization of membrane microdomains.
50.2 million image-text pairs across 50 languages support training and evaluation of multilingual text recognition systems. The dataset was created by Interfaze-AI and released on Hugging Face in late 2025. It is designed to handle diverse scripts and domains.
Juan Moises de la Serna Tuya provides a dataset on the human menstrual cycle, focusing on hormonal, physiological, and anatomical changes. It details the four key stages of the cycle: menstruation, pre-ovulation, ovulation, and post-ovulation. The data is licensed under CC BY 4.0 and was last updated in March 2026.
Biology data on cell regeneration cycles in the human body. The dataset describes the varying renewal rates for different cell types, such as skin epidermis renewing every 30 days and red blood cells every 120 days. It was authored by Juan Moises de la Serna Tuya and last updated in March 2026.
Robomimic Organized is a dataset collection hosted on HuggingFace by VEHwang. It likely contains data for robot manipulation tasks, organized for imitation learning research. The dataset was last updated on 2026-05-25.
March 2025 survey data from households in Mbarara district, southwestern Uganda, on factors associated with willingness to pay for waste management services. The dataset contains results from bivariate and multivariable logistic regression analyses. It is a 9.5 KB Excel file.
Long-term vertical water temperature observations from a fixed mooring in the deepest area (approximately 155 m) of Lake Michigan's southern basin. The dataset was collected by the National Oceanic and Atmospheric Administration to understand the lake's vertical thermal structure. Observations were made as continuously as possible, with annual equipment maintenance or replacement causing some temporal gaps and variations in measurement depth.
Vertical water temperature data was collected from a single mooring in the central basin of Lake Superior from August 2018 through August 2020. Observations were recorded continuously at 21 distinct depths, with a maximum depth of 198 meters and a high measurement accuracy of ±0.002 °C. The dataset was collected by NOAA NCEI.
Andrew Gelman's research from Columbia University resolves two long-standing controversies in American politics regarding redistricting. The analysis demonstrates that redistricting increases electoral responsiveness and that any redistricting reduces partisan bias compared to a system without it. The work highlights the impact of statistical methods and assumptions on conclusions in this domain.
A Who does What Where (3W) dataset lists humanitarian organizations operating in Cameroon at the Admin 2 administrative level. The dataset is critical for identifying gaps and planning humanitarian response. It was published by OCHA Cameroon and last updated on 2026-03-18.
Surface measurements of dissolved inorganic carbon, total alkalinity, temperature, and salinity collected in the Indian Ocean and Southern Ocean during the R/V Marion-Dufresne OISO-02 cruise from August 19 to September 7, 1998. This dataset is part of the Ocean Indien Service d'Observations (OISO) program, initiated in 1998, which collects pCO2 and associated parameters along repeated ship tracks. The data are regularly included in international synthesis projects like the Surface Ocean CO2 Atlas (SOCAT) and the Global Ocean Data Analysis Project (GLODAP).