Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,891 datasets
Oceanographic measurements include dissolved inorganic carbon, dissolved oxygen, hydrostatic pressure, potential temperature, salinity, and water temperature. Data were collected from the OCEANUS research vessel during a cruise in the North Atlantic Ocean from May 29 to June 3, 1995. The dataset was compiled by Taro Takahashi of Columbia University's Lamont-Doherty Earth Observatory and R. Pickart as part of the CARINA synthesis project.
Data collected in the Barents Sea from July 19 to 26, 1986, include dissolved inorganic carbon, alkalinity, pH, salinity, temperature, and nutrients. Measurements were taken using CTD and bottle instruments during the CARINA/58LA19860719 cruise. The data were collected by researchers from Gothenburg University and contributed to the CARINA synthesis project for biogeochemical investigations.
The Patuxent River estuary was monitored over a 25-hour tidal cycle from October 17-18, 1972. The dataset contains chemical and physical measurements, including dissolved oxygen, nutrients, chlorophyll, and heat concentration. Data were collected by the University of Maryland's Chesapeake Biological Lab and submitted to the National Oceanographic Data Center.
The GINGALOG database table contains selected information from the Large Area Counter aboard the Japanese Ginga X-ray astronomy satellite. The satellite operated from 5 February 1987 to November 1991 in a circular orbit with a 96-minute period. This catalog, prepared by NASA HEASARC from data provided by ISAS Japan, serves as a basic log pointing to archived FITS files for detailed analysis.
IFAG in Germany provides GPS satellite and ground station data for the European region as part of the International GPS Service for Geodynamics. The dataset includes high-accuracy satellite orbits, Earth rotation parameters, and daily RINEX observation files from a network of permanent tracking sites. The service has been operational since January 1994, with twenty days of data available online.
The International GPS Service for Geodynamics (IGS) has provided high-accuracy satellite and Earth observation data from 1994 to the present. The dataset includes GPS satellite orbits, Earth rotation parameters, and site coordinates, collected from a global network of about 40 permanent tracking stations. It is maintained by the Institut Geographique National (IGN) in France as a Global Data Center.
SARD is a large-scale synthetic dataset for Arabic Optical Character Recognition. It provides controlled and diverse training data simulating real-world book layouts. The dataset was created by riotu-lab and was last updated in April 2026.
The Ginga LAC Mode Catalog contains selected information from the Large Area Counter aboard the Japanese Ginga X-ray astronomy satellite. The mission operated from 5 February 1987 to November 1991, collecting data in a circular orbit with a 96-minute period. This catalog was prepared by NASA HEASARC from data provided by the Institute of Space and Astronautical Science in Japan.
Results from multiple multilingual OCR models applied to the test split of the GlotOCR-bench dataset, containing 16,375 samples. The dataset was created by cis-lmu and last updated on April 12, 2026. It includes outputs from models such as rednote-hilab/dots.ocr, zai-org/GLM-OCR, and deepseek-ai/DeepSeek-OCR-2.
Lead 90th percentile measurements for public water systems across Michigan counties, as reported by data.michigan.gov. The dataset tracks monitoring periods and upcoming sampling due dates for each supply system. It was last updated in March 2026.
Starrydata2 is a database containing experimental property data for inorganic materials. The dataset is a 51.0 MB ZIP file published by author Tomoya Mato and last updated in April 2026. It aggregates data from experimental studies in materials science.
reLAIONet is a manually proofread, web-sourced image classification benchmark aligned to ImageNet's label space. It is designed for out-of-distribution evaluation of class-conditional generative and discriminative models. The dataset was created by harvardairobotics and was last updated in April 2026.
Data from a 2026 study of 113 Bangladeshi non-governmental organisations (NGOs) capture managers' perceptions of sustainability capital. The dataset contains Likert-scale survey responses measuring intellectual, social, and environmental capital as drivers of sustainability capital, and the subsequent impact of sustainability capital on organizational performance. Analysis was conducted using structural equation modeling, informed by political economy, institutional, and incomplete contracts theories.
UNESCO-sourced education, demographic, and socio-economic indicators for the Democratic People's Republic of Korea, updated as of March 2026. The data aggregate SDG 4 Global and Thematic metrics alongside policy-relevant socio-economic indicators from the UIS bulk data service. The dataset provides a standardized view of educational progress and demographic trends within the country.
20,000 text samples compiled from three distinct sources: Wikipedia, Project Gutenberg, and CNN/DailyMail. The dataset was created by author 'brograrnmer' and last updated on May 5, 2026. Preprocessing involved regex cleaning to replace certain patterns with whitespace.
Supplementary information supports research on suppressing halide phase segregation in perovskite materials for tandem solar cells. The dataset, provided by author Yifan Xu, was last updated in April 2026. It is a 9.1 MB Excel file containing experimental data and analysis.
Supplementary Information for a research paper on suppressing halide phase segregation in wide-bandgap perovskites. The dataset, provided by author Yifan Xu, was last updated in April 2026 and is available as a 9.1 MB XLSX file under a CC-BY-4.0 license.
An image dataset containing signs for the letters A through Z and the numbers 1 through 9. The dataset was uploaded to Kaggle, but the author, organization, and specific collection details are not provided. The total number of images, file formats, and last update date are unknown.
PaveBench is a large-scale benchmark for pavement distress perception and interactive vision-language analysis on real-world highway inspection images. It supports four core tasks: classification, object detection, semantic segmentation, and vision-language question answering. The dataset was created by MML-Group and was last updated on the platform in April 2026.
This dataset contains discrete sample and profile observations of carbon dioxide partial pressure, dissolved inorganic carbon, and related biogeochemical variables collected aboard the USCGC POLAR SEA in the North Greenland Sea from July 18 to August 20, 1993. It was gathered by researchers from the Commonwealth Scientific and Industrial Research Organization, Universitat Kiel, and others using CTD and bottle instruments as part of the CARINA data synthesis project. The project aimed to produce an internally consistent data set for biogeochemical investigations, originally focused on the North Atlantic.