Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,189 datasets
Colombia's National Cultural Information System (SINIC) consolidates records on physical spaces dedicated to arts, culture, and knowledge. The dataset includes cultural facilities such as theaters, libraries, museums, and community-managed spaces. Information is sourced from national and territorial entities, with institutional verification and participatory updates.
Malayalam speech data from the Vividh-ASR benchmark, designed to evaluate automatic speech recognition models on real-world audio. The dataset is a test split organized by acoustic complexity to expose performance gaps in models trained predominantly on clean studio recordings. It was created by adalat-ai and last updated on Hugging Face in May 2026.
Natural England's 2000 Isles of Scilly Monitoring dataset collates surveys of infaunal organisms on intertidal sandflats. The data were gathered for purposes including Marine Conservation Zone verification, condition assessments, and Natura 2000 site feature surveys, following established methodologies. It is published under the Open Government Licence via the Government Digital Service.
Carla A. Ng's dataset analyzes trends in organofluorine chemistry from the last five years, comparing newly proposed substances with those studied for environmental degradation. It defines metrics like Fluorine Atom Fraction (FAF) and the presence of specific fluorinated groups to characterize compounds. The data is stored in a 1.0 MB XLSX file and was last updated on 2026-04-15.
A 2026 perspective analysis by Carla A. Ng compares trends in newly proposed organofluorine compounds over the last five years against those with existing environmental fate studies. The dataset defines and applies two metrics—Fluorine Atom Fraction (FAF) and the presence of specific fluorinated groups—to analyze the relationship between fluorination degree and environmental degradation. It is a 2.1 MB Excel file shared under a CC-BY-NC-4.0 license on figshare.
A curated subset of the COCO 2014 Validation set containing 9,986 image-caption pairs. The images are center-cropped and resized to 256x256 resolution in PNG format. The dataset was created by user 'byliutao' and is designed as a standard reference benchmark for Fréchet Inception Distance evaluation in text-to-image generation research.
TowerDataset is a power-line corridor point cloud dataset for semantic segmentation. It contains 661 annotated airborne and mobile LiDAR scenes of transmission towers, conductors, insulators, vegetation, and ground objects, totaling 2,466,076,987 points. The dataset was created by author tccx18 and was last updated on Hugging Face in May 2026.
Over 170 satellite-derived Sea Surface Temperature and Chlorophyll-a images from July 2002 to December 2016 map the seasonal Bonney Coast upwelling system. The dataset was created by researchers using MODIS data and presented at the 2017 Australian Marine Science Association Conference. It captures the spatial extent and intensity of upwelling events, which can influence over 10,000 km² of ocean surface.
A 2014 study in Montreal, Quebec, measured air pollutant exposure inside and outside vehicles during rush hour. Health Canada collected data on volatile organic compounds (VOCs) during two-hour drives in the fall as part of the Commuter Air Pollution Intervention study. The dataset provides descriptive VOC statistics intended to offer relevant Canadian data on driving exposure.
NASA's Wind spacecraft provides processed solar wind magnetic field data linearly interpolated to a 60-second resolution. The dataset was constructed by Dr. J.M. Weygand for Prof. R.L. McPherron and has been used in superposed epoch and cross correlation studies. Version 2 includes a correction for an offset found in the Bz component after November 2004.
A dataset of 320 demolished and undemolished informal settlements in Beijing, compiled between 2017 and 2018. It was created by Shiqi Ma for research published in the American Journal of Political Science. The data is based on satellite imagery, digital street views, and fieldwork.
Seismic refraction surveys conducted by the Bureau of Mineral Resources at three potential dam sites near Rathdowney in southern Queensland. The surveys were performed in 1959 to support the Irrigation and Water Supply Commission's development plans for the Logan River valley. The dataset is provided by Geoscience Australia.
Acoustic Doppler Current Profiler (ADCP) data collected during the RV Investigator's 2015 voyage titled 'Great Australian Bight deep water geological and benthic ecology program'. The vessel departed Hobart on 25 October 2015 and arrived in Port Lincoln on 28 November 2015, with data also collected during a 3-day mobilization period. This dataset was processed and archived by the CSIRO Oceans and Atmosphere Information and Data Centre in Hobart.
Fall 2022 data from the Sub-Mesoscale Ocean Dynamics Experiment (S-MODE) provides estimated chlorophyll-a and particulate organic carbon concentrations from approximately 300 km offshore of San Francisco. The Portable Remote Imaging Spectrometer (PRISM), mounted on a GIII aircraft, captured this Level 2 data with a pushbroom imaging spectrometer operating from 350-1050 nm and a co-aligned SWIR radiometer for atmospheric correction. This dataset is designed to study how short-spatial-scale ocean dynamics influence the vertical exchange of physical and biological variables.
A study measuring soil organic carbon (SOC), particulate organic carbon (POC), and mineral-associated organic carbon (MAOC) under different nitrogen addition rates. The dataset likely contains microbial diversity and network complexity metrics for bacteria and fungi across litter and soil layers. It was authored by Jun Chen and last updated on 2026-04-25.
Regional flood defences in the Dutch province of Groningen, established under the Environmental Ordinance. These structures are designed and maintained to withstand a water level with a statistical recurrence interval of once every 100 years. The data is provided by the Dutch Ministry of the Interior and Kingdom Relations; the defences are part of the Frisian Boezem water system.
Protected area A stock contains the strongest regional flood defences in the province of Groningen, Netherlands, as established by the provincial Environmental Ordinance. The dataset is provided by the Dutch Ministry of the Interior and Kingdom Relations under a CC-PDM-1.0 license. The defences are designed to withstand a water level occurring once every 100 years and meet a stability load occurring once every 1000 years.
Mary Mirvis provides the underlying data for a scoping review of the whole-cell imaging literature. The 244.8 KB Excel file contains metadata on 118 individual datasets, including imaging modality, resolution, cell type, and organelles. This data supports the analysis and figures in the associated 2026 publication.
A 5.5 KB Excel file compares the performance of Inception-ResNet V2 models using different attention mechanisms. Chao Zhang authored this small-scale benchmark dataset, which was last updated on May 14, 2026. The dataset likely contains tabular results from computer vision experiments.
Chao Zhang published a table of prediction probabilities from classical convolutional neural network models for peritoneal metastasis on a test set. The data compares model performance at dropout probabilities of 0 and 0.3. The dataset is 35.4 KB in size and was last updated on May 14, 2026.