Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,476 datasets
uv-scripts created this dataset for bootstrapping object-detection datasets using vision-language models. The dataset provides bounding box JSON outputs from free-form prompts, designed for integration with labelling tools like Label Studio and FiftyOne. It was last updated on May 12, 2026.
National Defence | Défense nationale provides counts of Cadet Organizations Administration and Training Service (COATS) members for fiscal years 2003 to 2025. The dataset is published on the open_canada platform under the OGL-CA-2.0 license. It was last updated on April 29, 2026.
Information about the organizational structure of the Department of Investment Policy, Projects, International Relations, Tourism and Promotions of the Ivano-Frankivsk City Council. The data is structured according to the organizational and administrative document 'Structure and staffing'. The dataset was last updated on 2026-04-21.
A list of all Umbrella Bodies registered with AccessNI. Umbrella Bodies are organizations authorized to countersign applications on behalf of other organizations. The dataset is provided by the Government Digital Service under the UK Open Government Licence.
48.2 KB of data on agentic departments, which are self-organizing ensembles of AI agents assuming institutional positions to achieve complex functions. The dataset was authored by Paul Brützke and last updated on April 29, 2026. It is available as an XLSX file under a CC-BY-4.0 license.
A benchmark of 7,093 high-difficulty samples for large multimodal models, focused on real-world document processing. It covers 5 major OCR-centric tracks and emphasizes practical enterprise tasks and underrepresented corner cases. The dataset was created by Eioss and was last updated on Hugging Face in May 2026.
MetaphorVU-Bench is a benchmark dataset for metaphorical video understanding. It is characterized by a systematic taxonomy and contains metaphorical videos curated from billions of real-world candidates with rigorous human annotation. The dataset was created by lzq2021 and was last updated on Hugging Face in May 2026.
The Province of Groningen in the Netherlands provides geospatial data for the Delfzijl Zuid wind farm. This dataset is defined by the provincial Environmental Ordinance and is published by the Dutch Ministry of the Interior and Kingdom Relations. The data is available in JPEG, WFS, and WMS formats under a CC-PDM-1.0 license.
MONET is a large-scale, curated image-text dataset designed for training text-to-image systems. It contains 104.9 million high-quality image-text pairs distilled from 2.9 billion raw pairs across nine open sources. The dataset was created by jasperai and was last updated on the platform in May 2026.
Boundaries of the plan area for the Eemsmond-Delfzijl structural vision, as defined in the Environmental Ordinance of the Province of Groningen. The dataset is provided by the Dutch Ministry of the Interior and Kingdom Relations and is available under a Creative Commons Public Domain Mark license. The last update date is unknown.
Forest and nature reserves in 2001. This map layer was compiled on the basis of various maps, such as property situations, nature-managing organisations, and topographical data. The dataset is provided by the Dutch Ministry of the Interior and Kingdom Relations (Ministerie van Binnenlandse Zaken en Koninkrijksrelaties) and is licensed under CC-PDM-1.0.
OVPD is a virtual-physical fusion testing dataset derived from the 2025 OnSite Autonomous Driving Challenge. It is organized at the clip level, with each clip corresponding to a complete test run from a participating team. The dataset was created by Yuhang253820 and was last updated on May 8, 2026.
245 single-frame embryo images were classified by three human embryologists and two deep learning models, ResNet-34 and VGG16. The study, authored by Radhika Kakulavarapu and last updated in March 2026, evaluates accuracy, agreement, and interpretability using explainable AI techniques like Grad-CAM. Embryologists achieved 89.9% accuracy, outperforming the AI models, while interpretability assessments showed ResNet-34 explanations were rated biologically relevant 89% of the time.
Organic carbon records from sediment core BHB15-6 in the Bohai Sea, China. The dataset is a 62.8 KB Excel file authored by Hai Li and published under a CC-BY-4.0 license on figshare. Its last update is recorded as 2026-04-29.
126,380 story metadata records and 525,650 chapter text entries from the adult-fanfiction.org public archive. The dataset, organized into four parquet tables, includes 22 archive subdomains and 16,597 listing pages. It was created by trentmkley and last updated on 2026-05-15.
Twenty-seven CTD casts and eleven multiple corer drops characterize the physical and chemical properties of the water column and sediment at the ECOGIG seep and other sites in the northern Gulf of Mexico. The dataset includes binned profiles, bottle files, and measurements of salinity, pH, dissolved nutrients, gases, and particulate carbon and nitrogen. Data was collected by the National Oceanic and Atmospheric Administration during the R/V Endeavor cruise EN527 from June 20 to July 3, 2013.
1,328 refugee households from the Central African Republic were surveyed across four camps in the Democratic Republic of the Congo. The joint UNHCR and WFP assessment, conducted from August to September 2021, aimed to understand basic needs and vulnerabilities related to livelihoods. This dataset is an anonymized version of the original survey results.
770 households from the Central African Republic living with host families in Zongo and Yakoma, Democratic Republic of the Congo, were assessed from 3 to 17 October 2021. UNHCR and WFP conducted the joint assessment using systematic random sampling. The dataset is an anonymized version of the original, aimed at refining a targeting strategy for vulnerable populations.
Meijia Chang's 2026 figshare dataset contains experimental data on two nonfused ring electron acceptors (NFREAs), UF-CHex-2F and UF-CHexC3-2F. It supports research into molecular assembly engineering for high-efficiency organic solar cells, with a reported power conversion efficiency of 15.42%.
The AusSeabed Bathymetry Compilations Coverage Database contains polygon extents of bathymetry data acquisitions. It is a live database updated regularly and augmented by collaborators, identifying data spanning coastal, continental shelf, and deep sea locations, including the Australian Antarctic Territory. Each polygon contains metadata describing compilation details.