Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,996 datasets
A collection of images for detecting water pollution through classification models. The dataset is hosted on Kaggle, but specifics on the number of images, collection method, and creator are not provided. The last update date and license information are also unknown.
Synthetic Surgical Scalpel (w/ Yolo Box) is a dataset of realistic synthetic images of surgical instruments. The description indicates it includes a reproducible pipeline for generating the data. The dataset is hosted on Kaggle, but specific details on size, author, and license are not provided.
National Oceanic and Atmospheric Administration collected dissolved inorganic carbon, total alkalinity, pH, and nutrient data in the Gulf of Alaska region. Samples were gathered via discrete sampling using Niskin bottles and other instruments between May 2007 and July 2013. The dataset is part of the Ocean Carbon and Acidification Data System.
A set of digital grids for Australia's margin provides bathymetry, gravity, and magnetic data at resolutions of 250-1000 meters. Produced by the Australian Geological Survey Organisation in cooperation with Desmond Fitzgerald and Associates and the Australian Hydrographic Service, the dataset integrates and levels marine ship-track data with satellite and onshore sources for geological interpretation.
Michael Hogan's book 'A Cross of Iron' analyzes the origins of the U.S. national security state during the Truman administration. The work, published by The Ohio State University, examines political culture, budget battles, and key policy documents like NSC-68. It covers the period from 1945 to 1954, focusing on the ideological and institutional shifts of the early Cold War.
Indira Gandhi International Airport temperature data published on Kaggle. The dataset likely contains time-series records of temperature measurements. Metadata is minimal; actual content requires verification after download.
An object detection dataset likely intended for training or evaluating YOLO-based computer vision models. The dataset is hosted on Kaggle, but its specific contents, size, and creation details are not provided. Further details such as the number of images, annotation types, and source are unknown.
A dataset of tomato leaf images annotated for object detection tasks. The description indicates it is intended for training YOLOv8 models and includes background noise. The dataset's author, organization, and specific size are unknown.
DGANet v0.1 is a dataset published on Kaggle. Its title suggests a focus on Generative Adversarial Networks (GANs), a common computer vision technique. The dataset's specific content, size, and creation details require verification after download.
A set of model weights for a DeepLabResNet34 neural network architecture. The weights represent a training checkpoint after 15 epochs, as indicated by the title. The dataset is hosted on Kaggle, a platform for data science and machine learning projects.
LSCD_YOLO is a dataset hosted on Kaggle, likely for training and evaluating computer vision models. The dataset's title suggests it is formatted for use with the YOLO (You Only Look Once) object detection framework. Specific details about its contents, size, and creation are unavailable from the provided metadata.
A dataset titled 'Preprocessed_keypoint' is hosted on Kaggle. The dataset's title suggests it contains keypoint data, likely for human pose estimation or similar computer vision tasks. No further metadata is available to confirm its size, origin, or specific structure.
Unet_resnet50_a2epoch30_trying is a dataset published on Kaggle. The title suggests it contains artifacts from training a U-Net model with a ResNet50 backbone for 30 epochs. Metadata is minimal; actual content requires verification after download.
A collection of banana images likely intended for training convolutional neural networks, sourced from Kaggle. The dataset's size, specific contents, and creation details are not provided in the metadata. Users must download the dataset to verify its scope and quality.
HR Crime image dataset is a collection of images published on Kaggle. The dataset's title suggests a focus on visual data related to crime within a human resources context. Specific details on the number of images, annotation methods, and collection dates are unavailable from the provided metadata.
CTGAN Outputs is a dataset published on Kaggle. The title suggests it contains data generated by a Conditional Tabular Generative Adversarial Network (CTGAN). The specific content, size, and origin are unknown from the provided metadata.
Storm Data is a chronological listing of U.S. weather phenomena from 1996 onward, including hurricanes, tornadoes, thunderstorms, hail, floods, drought, lightning, high winds, snow, and temperature extremes. The dataset and publication, produced by NOAA's National Climatic Data Center (NCDC), contain reports from the National Weather Service (NWS) with statistics on personal injuries and damage estimates. Preliminary data is available back to 1950.
A 51-day corporate simulation dataset produced by OrgForge with the insider threat module enabled. The corpus provides structured security telemetry in JSONL/Parquet format for benchmarking LLM-based detection, with ground truth derived deterministically from the simulation's event log. It was authored by aeriesec and last updated on 2026-03-25.
AMAP data provides information on the status and threats to the Arctic environment, supporting scientific advice for remedial actions. The program, established in 1991, is now a working group of the Arctic Council. Its objective is to deliver reliable information on Arctic environmental contaminants and changes.
Five scripts for converting, validating, and sampling object detection datasets, released by uv-scripts and updated in March 2026. These tools support six bounding box formats and let users process Hugging Face datasets without local cloning.
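As a rough illustration of what a bounding box format conversion involves, the sketch below converts between two widely used conventions: COCO-style pixel boxes (`x_min, y_min, width, height`) and YOLO-style normalized boxes (`x_center, y_center, width, height`). The listing does not say which six formats the scripts support, so the choice of these two is an assumption, and the function names `coco_to_yolo` and `yolo_to_coco` are hypothetical and not taken from the uv-scripts code.

```python
# Hypothetical helpers for illustration only; not part of the uv-scripts tools.

def coco_to_yolo(box, img_w, img_h):
    """Convert a COCO-style pixel box (x_min, y_min, w, h) to a
    YOLO-style normalized box (x_center, y_center, w, h)."""
    x_min, y_min, w, h = box
    return (
        (x_min + w / 2) / img_w,  # box center x, as a fraction of image width
        (y_min + h / 2) / img_h,  # box center y, as a fraction of image height
        w / img_w,                # box width, normalized
        h / img_h,                # box height, normalized
    )


def yolo_to_coco(box, img_w, img_h):
    """Convert a YOLO-style normalized box back to COCO-style pixels."""
    cx, cy, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return (cx * img_w - box_w / 2, cy * img_h - box_h / 2, box_w, box_h)


if __name__ == "__main__":
    yolo_box = coco_to_yolo((100, 50, 200, 100), img_w=400, img_h=200)
    print(yolo_box)
    print(yolo_to_coco(yolo_box, img_w=400, img_h=200))
```

The image dimensions are passed explicitly because normalized formats cannot be converted without them, which is one reason a validation step is useful alongside conversion.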