Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
14,859 datasets
The Prussian Urmesstischblätter are hand-drawn, one-off topographic maps produced starting in 1822 for the entire territory of Prussia. They were created on a scale of 1:25,000 and were not published, serving as the basis for smaller-scale maps. The Bundesamt für Kartographie und Geodäsie provides these sheets, which mark the beginning of modern topographic cartography.
Multi-temporal city data for the Democratic Republic of the Congo includes annual land consumption rate, population growth rate, their ratio, and built-up area per capita. The dataset was produced by the United Nations Human Settlements Programme (UN-Habitat) Data and Analytics Section. It was last updated on May 6, 2026.
Statistics Canada surveyed businesses in the third quarter of 2025 to measure their perceived ability to take on more debt. The data is disaggregated by industry classification, business size, type of business, activity, and majority ownership. It is available in multiple formats under the OGL-CA-2.0 license and was last updated on the platform in April 2026.
Prussian territory was surveyed for the Royal Prussian General Staff starting in 1822. The dataset likely contains high-quality prints of hand-drawn, one-off topographic maps at a scale of 1:25,000. These original survey sheets, produced in 1839, mark the beginning of modern topographic cartography.
A public ASR training dataset combining multiple audio sources. It includes 4,100 Hindi-dominant YouTube podcast segments, 1,160 English podcast segments, 787 Hindi-English code-mixed podcast segments, and 24,459 synthetic Hinglish entity-normalization speech clips. The dataset was created by sajalmadan0909 and was last updated on June 4, 2026.
Organogram data is released by all UK central government departments and their agencies. Snapshots for 31st March and 30th September are published twice a year, by 6th June and 6th December respectively. The data is validated and released in CSV format by Active Travel England under an OGL-UK-3.0 license.
A GIS Shapefile showing the extents of all Licensed Marine Disposal Sites for the UK, its Crown Dependencies, and Overseas Territories. The dataset is maintained by the Government Digital Service and includes sites regulated under the London Convention/Protocol of the International Maritime Organisation. Sites are classified as Open, Disused, or Closed based on their licensing activity within the last 5 to 10 years.
A 2026 figshare dataset by Jiwei Gu documents the synthesis and evaluation of 105 tiagabine-based compounds as potential brain-penetrant radioligands for GABA Transporter 1 (GAT-1). It details the screening process that identified four lead candidates (GATT-31, GATT-34, GATT-39, GATT-44) based on affinity, lipophilicity, and efflux liability. The data supports the first reported in vivo PET imaging of GAT-1 in nonhuman primates.
A dataset pairing three rendered views of laboratory assets with historical experimental actions and candidate next actions. The target is the protocol-consistent next action, framing the task as next-action prediction rather than generic image captioning. The dataset is the Level 1 split of LabHorizon, created by CongLab-Research and last updated on 2026-05-29.
Sediment particulate organic carbon and nitrogen data from marine samples collected between 1996 and 2017 around England and Wales. Measurements cover the <2 mm and <63 µm sediment fractions, silt/clay percentage, and sample weight. These data were collated from multiple sources, including CSEMP and MPA monitoring, to support the Cefas/Defra project 'Operational indicators of seafloor integrity'.
A GIS shapefile repository of all designated marine disposal sites across the UK and Crown Dependencies, regulated under the London Convention/Protocol. Sites are classified as Open, Disused, or Closed based on their licensing activity within the last 5 to 10 years. This dataset includes both historic legacy sites and newly designated areas for reporting disposal volumes to international bodies like IMO and OSPAR.
Maps at 30-meter resolution show landscape surface burn severity from the 2014-2015 fires in Northwest Territories and Northern Alberta, Canada. The dataset was created by the National Aeronautics and Space Administration using Landsat 8 imagery and regression models trained with field data. Field observations estimated burned area across five severity classes in stratified plots.
The Gym288-skeleton dataset is a human skeleton-based action recognition benchmark derived from the FineGym dataset. It provides temporally precise, fine-grained annotations of gymnastic actions along with 2D human pose sequences extracted from original video frames. The dataset was created by Lozumi and was last updated on Hugging Face in May 2026.
This database collects 3,032 occurrences of fusion temperatures and enthalpies for two-component molecular cocrystals and their pure components. German L. Perlovich compiled the data from literature published between 1900 and 2024. The dataset includes evaluated formation thermodynamic functions for 934 specific two-component crystals.
A list of programs and grants offered by the City of Montreal to its residents and local organizations. The data is published on the city's official website and is available in CSV format under a CC-BY-4.0 license. It was last updated on April 17, 2026.
An inventory of open source solutions developed and modified by the City of Montreal. The dataset is published by the city's Information Technology Department under a CC-BY-4.0 license. It was last updated on April 17, 2026.
Open Spatial Reasoning is a multiple-choice dataset for evaluating 3D spatial reasoning from single driving images. The dataset was created by ReasonCore and was last updated on 2026-05-29. Each image contains numbered bounding boxes referencing objects, and questions probe a model's ability to reconstruct the real 3D scene.
The Australian Ocean Data Network provides a study of uranium concentrations in organic-rich shales as a predictor of hydrocarbon potential. The data likely contains measurements of uranium, total organic carbon (TOC), and pyrolysis yields from shale samples across four Australian basins ranging in age from the Mesoproterozoic to the Cretaceous. The dataset was last updated on 2026-04-10.
RoboFine-Bench is a benchmark for evaluating Vision-Language Models on execution-level details of robot manipulation. It contains 500 held-out robot videos and is part of the FineVLA framework for fine-grained instruction alignment. The dataset was authored by xlangai and last updated on Hugging Face in May 2026.
Quantitative survey data from research participants, organized in a tabular format with respondents as rows and observed variables as columns. The 19.4 KB XLSX file was authored by Nguyen Thi Hang and last updated on May 2, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.