Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,841 datasets
Three JSONL files contain linked layers of persona-grounded data. The dataset includes character profiles, character-grounded questions, and multi-turn dialogues built around those questions. Author yonglixiang uploaded it to Hugging Face, with a last recorded update on 2026-04-27.
This dataset comprises 192 Whole Slide Image (WSI) thumbnails, each with a shorter edge exceeding 1000 pixels. Author Christoph Blattgerste contributed the data, which was last updated on May 10, 2026. The dataset also includes a second folder of processed images to which tissue detection masks have been applied, yielding square 1024x1024 pixel images.
Humanitarian Outcomes maintains this dataset of aid worker security incidents in the Democratic Republic of the Congo, updated through March 2026. The data tracks violence against humanitarian personnel and undergoes an annual verification process to ensure the accuracy of historical records.
Anonymized participant data from a study investigating Hepatitis B Virus screening uptake and associated risk behaviors among adults attending an HIV clinic in Eastern Uganda. The dataset includes retrospective records and prospective questionnaires, a codebook, sample questionnaire, and study protocol. It was authored by Rose Mary Busingye and last updated in March 2026.
A Non-Destructive Computational Assaying framework uses a fine-tuned MobileNetV2 architecture to analyze microscopic surface textures. The system, developed by Ayush Jain and hosted on Harvard Dataverse, reportedly achieves 92.5% accuracy in differentiating genuine .925 sterling silver and flags 98% of fraudulent hallmarks. The dataset, last updated in April 2026, is derived from a specialized 'Jeweler's Macro-Library'.
Geochemical data on cobalt, nickel, and manganese concentrations in pyrite mineral samples from the Faro No. 1, Vangorda, and Swim Lakes deposits in Yukon's Anvil Range. The dataset originates from a thesis held by the Government of Yukon. Specific row counts, column features, and sample data are unavailable.
Nannofossil biostratigraphy, 46 bulk carbonate stable isotope measurements (oxygen and carbon), and 71 organic and inorganic carbon percentage measurements from between 1313.71 and 1326.82 meters below sea floor at IODP Site U1480. The dataset was produced under NERC grant NE/P021182/1 and is associated with the British Geological Survey. It was last updated on 2026-04-09.
A collection of photographs compiled from the Republic of Turkey's official website for wanted individuals. The dataset was created by user tibetyalman for academic analysis and awareness purposes. It was last updated on the platform in April 2026.
Two weeks of dashcam video recordings from Dhaka, Bangladesh, captured under daytime and nighttime conditions. The dataset comprises 173 high-resolution videos totaling 6 hours and 18 minutes, with approximately 634,000 frames annotated with 2,180,000 bounding boxes across 12 distinct vehicle classes. It was created by Mohammed, Saif and last updated on March 29, 2026.
Slakh2100 is a dataset created to study the impact of training data quality and quantity on music source separation. The dataset is a mirror copy hosted on Hugging Face, with its official repository on Zenodo. It was introduced in a 2019 IEEE WASPAA paper by researchers including Ethan Manilow.
A catalog of established local rates submitted by ground ambulance service organizations to the Oregon Division of Financial Regulation. The data is published by data.oregon.gov for transparency under HB 3243 and was last updated in March 2026.
Magnetite with combined titanium dioxide is the predominant heavy mineral in the Bougainville beach sands, derived from recent andesitic volcanoes. The data, published by Geoscience Australia, describes concentrations on the island's eastern and western coasts. The spectacularly high concentrations on the eastern coast are likely too small to qualify as iron ore deposits, while economically important concentrations may exist in coastal plains on the western side.
HSL Objects v1 contains 5,037 images annotated for object detection in the RoboCup Humanoid Soccer League. The dataset is split into 4,029 training, 504 validation, and 504 test images. It was created by whirlwind-ams and last updated in April 2026.
Annual statistics about the accomplishments of the Board of Public Works, with data collection beginning in 2018. The dataset is provided by the City of Bloomington and was last updated on March 22, 2026. It is available in multiple structured formats including CSV, JSON, RDF, and XML.
Organic geochemical studies provide insight into sediment origin and history. The data likely contains analyses of biological markers from sedimentary rock extracts, published by Geoscience Australia Data. The record was last updated on 2026-03-25.
BenthiCat - Raw comprises unprocessed data files from BenthiCat surveys, including ROS Bag optical recordings, XTF side-scan sonar files, and .xyz multibeam echosounder data. The dataset includes sector-specific subdirectories for sonar data and a GIS folder with georeferenced mosaics and habitat interpretations. Hayat Rajani authored this foundational collection for custom preprocessing and analysis.
Mistake Attribution (MATT) benchmarks from CVPR 2026 go beyond binary mistake detection to attribute semantic role violations and identify Points-of-No-Return. The dataset, created by researchers from the University of Michigan and Voxel51, provides large-scale benchmarks for fine-grained mistake understanding in first-person videos. It was last updated on April 17, 2026.
ROSETTA-ORBITER 67P RPCLAP 3PRL CALIBRATED V1.0 contains calibrated data from the Rosetta spacecraft's RPC-LAP instrument. The data, provided by the National Aeronautics and Space Administration, was acquired during the pre-landing mission phase targeting comet 67P/Churyumov-Gerasimenko. This version reorganizes the data into fewer files with longer time series, sorted by measurement type.
Chemical and mineralogical data for four ferromanganese samples, including two nodules and two crusts, collected from the Dampier Ridge and Lord Howe Rise off eastern Australia. The samples, gathered by the vessel Sonne, provide measurements of elements such as Ni, Cu, and Co, as well as Mn:Fe ratios. The deposits are up to 20 cm thick.
Geoscience Australia provides a selection of images and short animations explaining key aspects of the 2004 Indian Ocean tsunami. The resources were revised and reissued for the tenth anniversary of the disaster, updating previous materials. The dataset is hosted by the Australian Ocean Data Network and was last updated on 2026-04-10.