Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,996 datasets
A dataset named 'train_trocr_2026' published on Kaggle. The title suggests it is intended for training the TrOCR (Transformer-based Optical Character Recognition) model. Its specific content, size, and origin are not detailed in the provided metadata.
Short video clips and pose estimation outputs for individual signs from German Sign Language (Deutsche Gebärdensprache, DGS). The dataset is designed for training and evaluating models on isolated sign recognition tasks and includes both raw video data and .pev files. It was authored by fhswf and last updated on March 12, 2026.
A dataset from the State of Connecticut listing the administering agencies for licenses and credentials within the eLicensing system. The dataset was last updated on March 22, 2026. It is available in multiple structured formats including CSV, JSON, RDF, and XML.
120 dog breeds form the base of this image classification dataset. It is an expanded version of the classic Stanford Dogs Dataset, with thousands of new high-quality images fetched from Bing and filtered via YOLO object detection. The dataset is hosted by author fedehorl on Hugging Face in Parquet format.
An image dataset designed for basketball highlight and action recognition tasks. The dataset was sourced from Kaggle, but its creator, size, and specific temporal coverage are unknown.
A comparison of interval prediction performance between the IVMD-CNN-SVM-LSCV-B and Bootstrap methods under different confidence levels. The dataset is 5.5 KB in size, stored in XLS format, and was authored by Hong Ma. It was last updated on March 17, 2026.
Kaggle hosts a dataset titled 'HouseHold_ObjectDetection_Yolo'. The dataset likely contains images of common household items, annotated for training object detection models using the YOLO (You Only Look Once) framework. Specific details regarding the number of images, annotation format, collection method, and creator are not provided in the available metadata.
Property Valuation Services Corporation (PVSC) provides address, GPS coordinates, and five-year history of assessed and taxable assessed values for all property accounts in Nova Scotia. The dataset includes columns for civic address components, map coordinates, municipal unit, and assessment account number. Data is published via thedatazone.ca platform and was last updated in January 2026.
A 5.5 KB spreadsheet compiled by Ken Kunugitani, last updated in March 2026, aggregates reported performance metrics for convolutional neural networks designed to count tyrosine hydroxylase-positive cells. The dataset likely contains quantitative measures comparing AI-based counting to expert manual counting, a method described as having limited reproducibility. This collection appears focused on applications in Parkinson's disease research using mouse and human substantia nigra samples.
A 5.5 KB Excel file uploaded by Xiaofei Yin on March 17, 2026, provides a quantitative comparison between feeding raw data directly into a Convolutional Neural Network (CNN) and an alternative method proposed in the associated study. The dataset likely contains results from experiments on lithium battery voltage data, comparing metrics like running time and feature extraction capability. It is shared under a CC-BY-4.0 license on the figshare platform.
Australia's onshore gravity anomaly grid derived from 1,391,556 gravity station observations. The dataset integrates national and regional survey data, applying terrain corrections using bathymetry and topography.
Pagothenia borchgrevinki fish were exposed to a 10°C thermal stressor for 10 minutes, with physiological measurements taken before and during 48 hours of recovery. Data includes haematocrit, plasma chloride concentrations, and osmolarity for groups of six fish at multiple time points. The study was conducted by SCIOPS and published in December 1987.
Harvard Dataverse hosts replication data for a study on U.S. political donation behavior. The archive contains the data and code required to reproduce the findings of the paper 'Democrats Donate More to Their Party Where Dominant, Republicans Where Outnumbered'. It was authored by Xiajing Zhu and last updated in April 2026.
A 360-degree all-around view collection of 7 industrial parts, captured with one axis of rotation. It includes a training set of all-around images and test sets with ground truth anomaly masks. The dataset was created by author kentaito321 and was last updated in March 2026.
An image dataset for detecting Downy Mildew in vineyards. The description indicates a focus on early detection, which suggests the images may capture various stages of disease progression. The dataset's author, organization, and specific collection details are unknown.
A collection of aerial images of soybean crops captured by unmanned aerial vehicles (UAVs). The dataset is hosted on Kaggle, but details on its size, creation date, and authorship are not provided in the available metadata.
Parcel land size data includes assessment account numbers, civic addresses, and geographic coordinates. Each property is represented by a single civic address, even where a property has multiple addresses. The data is provided by Property Valuation Services Corporation (PVSC) for internal mass appraisal functions.
Xiaofan Shi's dataset provides performance evaluation results for various improved Faster-RCNN models tested on a rice leaf disease detection task. The data is stored in a 5.5 KB Excel file, indicating a small, focused set of experimental results. It was last updated in March 2026.
PaddlePaddle released Real5-OmniDocBench in 2026 as a benchmark for document analysis across five real-world scenarios including warping and illumination. The collection contains between 1,000 and 10,000 images, most of which were manually captured using handheld mobile devices to simulate authentic user conditions.
MiniImageNet is a widely used benchmark dataset derived from the larger ImageNet collection. It is published on Kaggle, though the specific author and organization are not listed. The dataset's exact size, composition, and last update date are unknown from the provided metadata.