Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,608 datasets
IMP-8 spacecraft data processed to provide solar wind measurements at a 60-second resolution. The dataset consists of linearly interpolated plasma data in GSM coordinates, originally constructed by Dr. J.M. Weygand for Prof. R.L. McPherron. It was primarily used in superposed epoch and cross correlation studies on solar wind phenomena.
ISEE-1 spacecraft tri-axial fluxgate magnetometer data processed to 60-second resolution in GSE coordinates. The dataset consists of linearly interpolated solar wind magnetic field measurements. It was constructed by Dr. J.M. Weygand for Prof. R.L. McPherron and used in superposed epoch and cross correlation studies.
Cassini Ion and Neutral Mass Spectrometer telemetry packets contain all science data from the instrument's checkout through the entire Saturn tour. Each packet is converted from raw data numbers to dimensional values and organized as a spreadsheet row. NASA produced this dataset, which was last updated in March 2026.
Data from a study evaluating the retention of critical metals bound to organic matter by ultrafiltration membranes. The dataset includes metal and natural organic matter quantification, speciation measurements, and chromatogram data from HPSEC-ICP-MS analysis. It was authored by Océane Hourtané and last updated in May 2026.
Constitutive metabolome and lipidome profiles of FL510, a transgressive salinity-tolerant rice genotype, compared with those of its parents IR29 and Pokkali. The dataset includes analytes identified through principal component and partial least squares discriminant analyses, supported by transcriptome data on pathway-related genes. Isaiah Catalino Pabuayon published the data on figshare under a CC0 license; it was last updated in March 2026.
Natural hazard statistics for the Lao People's Democratic Republic track disaster frequency, human impact, and economic damage in XLSX format. Produced by the Centre for Research on the Epidemiology of Disasters (CRED) and updated through 2026, the data is aggregated by year and disaster subtype.
44,663 traditional Chinese pharmaceutical label documents from Taiwan's Food and Drug Administration (TFDA). The dataset was created by twinkle-ai and last updated on 2026-05-03. Each record contains rendered WebP images of all PDF pages and structured data extracted into a 17-field JSON schema.
Aggregated historical data on natural hazard events in Uganda, compiled by the Centre for Research on the Epidemiology of Disasters. The records quantify disaster frequency, human fatalities, and economic damages categorized by year and specific disaster subtypes.
Aggregated annual statistics on natural hazard events in the Democratic Republic of the Congo, categorized by disaster subtype. Produced by the Centre for Research on the Epidemiology of Disasters (CRED), the data tracks human impact and economic costs through early 2026. Each record summarizes disaster frequency, fatalities, and financial damage for a specific hazard type within a given year.
5,666 scene photographs annotated for mirror surfaces, introduced at CVPR 2020. The dataset includes 5,095 training images with masks and edges, and 571 test images with masks only. It was created by author 'garrying' and last updated on the Hugging Face platform in April 2026.
Liang Liu provides full electrophoresis and other images supporting research on how rRNA intermediates coordinate nucleolar architecture in the model organism C. elegans. The dataset is a 56.2 MB ZIP file published under a CC-BY-4.0 license on figshare. It was last updated on 2026-04-27.
Records detail kilograms of cocaine paste and base seized by Colombian public forces during operations. Data is reported by municipality and department, with entries including the seizure date. The dataset is published by Colombia's national open data portal, www.datos.gov.co, with a last recorded update in March 2026.
A dataset containing 100 episodes of robot action data, totaling 103,706 frames and 300 videos, created using LeRobot. It was uploaded by YOLO2431 to Hugging Face on May 7, 2026. The dataset is structured for a single task involving a Yam bimanual robot.
An automatically annotated dataset for the Corpus Clarification task, introduced in a 2026 paper by Lequeu et al. The dataset transforms noisy, multi-topic citizen contributions from the Grand Débat National into structured data. It was authored by LequeuISIR and last updated on Hugging Face in April 2026.
Information on the organizational structure of the financial department of the Lytyn Village Council, by year. The dataset is provided by the States site of Ukraine and was last updated on 2026-05-06. It is available in common tabular formats such as Excel and CSV.
An English translation of a Chinese corpus for training Socratic teaching models. The dataset was created by ulises-c and last updated on May 4, 2026. It enables English-language research and fine-tuning without requiring access to the original Chinese data.
Data from preclinical investigations and pilot clinical imaging studies of a series of peptide-derived, c-Met-targeted PET probes labeled with gallium-68. The dataset reports synthesis details, in vitro and in vivo stability (>90%), and clinical results including tumor-to-lung ratios and their correlation with c-Met expression (R=0.71). The dataset is 265.7 KB in size.
The GlaciStore pre-proposal cover sheet, submitted to the International Ocean Discovery Program (IODP) on 31 March 2014. The document, led by Heather Stewart of the British Geological Survey on behalf of a 25-member consortium, outlines scientific objectives and details 12 proposed drill sites for investigating glacial history and basin processes relevant to offshore CO2 storage in the North Sea. The publicly available cover sheet includes an abstract, research objectives, and a table of site coordinates, water depths, and drilling targets.
Replication data for 'How Filibuster Rhetoric Informs Perceptions of Politicians' by Kevin Banda (Legislative Studies Quarterly). The dataset contains results from a preregistered survey experiment and a secondary cross-sectional survey analysis, and was last updated on May 12, 2026. It examines how elite messaging about the filibuster shapes citizens' ideological and affective evaluations of political figures.
jcabshear created a dataset of manually captioned fantasy character images for training generative AI models. The newest version includes 30 images each for races and classes such as aarakocra, dragonborn, elf, and tiefling. This collection was last updated on 2026-04-24 and is hosted on Hugging Face.