Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,950 datasets
Pretrained weights for ResNet-18, a standard benchmark model for image classification tasks. The weights are likely trained on the ImageNet dataset, a large-scale collection of labeled images. This resource is published on Kaggle, a platform for data science and machine learning.
A dataset of white blood cell (WBC) images from the LISC and Raabin sources, augmented using Generative Adversarial Networks (GANs). It is hosted on Kaggle, but the specific collection date, creator, and dataset size are unknown. The data is intended for computer vision tasks related to medical cell classification.
Dalian University of Technology's dataset for detecting and tracking unmanned aerial vehicles. It is published on Kaggle. The dataset likely contains images annotated for use with the YOLO object detection framework.
Uruguay telephone survey of 1,476 respondents conducted between January and February 2025. It covers five departments (Colonia, Cerro Largo, Maldonado, Salto, and Tacuarembó) and measures six dimensions of subnational democratic quality alongside sociodemographic variables. The dataset was authored by Martin Freigedo and last updated in March 2026.
Collected from August 14 to 31, 1996, this dataset contains temperature, salinity, fluorescence, light transmission, and water density measurements from Conductivity-Temperature-Depth (CTD) casts during the NOAA Endeavor cruise EN-287 in the Mid-Atlantic Bight. The data was gathered by NOAA NCEI to provide observations between SeaSoar tows on a cruise from Narragansett, Rhode Island.
Lichess provides 5,751,400 chess puzzles extracted from 300 million analyzed games and verified by Stockfish NNUE engines. Updated monthly, the collection includes tactical tags and difficulty ratings for every position. The dataset was generated using over 50 years of CPU time to ensure engine-verified accuracy.
ONOTE is a large-scale, omnimodal benchmark designed to evaluate music understanding across three major notation systems: Western Staff, Jianpu (Numbered Notation), and Guitar Tablature. The dataset is organized into sub-directories based on notation and instrument type. It was created by Weisiqing123 and last updated on April 1.
20 variables of soil profile data collected by Syeda Nyma Ferdous of MASBio, including organic carbon, total nitrogen, pH, texture fractions, and bulk density. The dataset supports land degradation assessment and carbon stock estimation. It was last updated on April 21, 2026.
IDCNNVIDMAE dataset is a computer vision dataset published on Kaggle. The dataset's specific content, size, and creation details are not provided in the available metadata. Its title suggests it likely contains video data intended for training or evaluating deep learning models.
A GitHub repository by carinahausladen, last updated 2026-05-06, collects materials related to fairness in artificial intelligence. It includes fairness metrics and datasets, and it discusses social choice theory for AI alignment as well as ethics debates on AI's impact on democracy. The repository is licensed under the MIT license.
A 5.5 KB dataset provides descriptive statistics comparing techniques for measuring industrial workpiece dimensions. The dataset, created by Yazid Saif and last updated in March 2026, focuses on techniques including Vision Inspection (VI), Coordinate Measuring Machines (CMM), and Convolutional Neural Networks (CNN). It addresses topics such as edge detection, varying hole sizes, and interference regions in machine vision applications.
VGGFace2_subset is a dataset of facial images, likely derived from the larger VGGFace2 collection. It is hosted on the Kaggle platform. The specific number of images, collection method, and time range are not provided in the available metadata.
EfficientNet_b0 is a set of pre-trained model weights for the EfficientNet-B0 architecture, a convolutional neural network designed for image classification. The dataset is hosted on Kaggle, a platform for data science and machine learning. The specific source, creation date, and size of the weights file are not detailed in the provided metadata.
Eight time series from 2017-08-11 to 2017-10-25 investigate photochemical degradation effects on the carbon cycle in Baffin Bay, Texas. The dataset includes measurements of dissolved inorganic carbon, its stable carbon isotopes, dissolved organic carbon isotopes, and chlorophyll a from closed-system incubation experiments. Data originates from a NOAA NCEI project and is hosted on multiple government platforms.
Bathymetry survey data from the Australian Hydrographic Office, covering 25 to 26 September 2020. The surface was created from a contracted national reference survey between Gantheaume Point and Talboys Rock, Broome WA, for calibrating multibeam echosounders. It provides 0.5 m resolution grids in the MSL, LAT, and Ellipsoid vertical datums, exported as 32-bit floating-point GeoTIFF files.
WildDet3D-Data provides 3D bounding box annotations for images sourced from major detection datasets like COCO and LVIS. The training split with human-reviewed annotations contains 102,979 images and 229,934 annotations across 11,879 categories. It was created by Allen Institute for AI (allenai) and was updated in April 2026.
SceneVerse++ contains 1019 3D scenes automatically reconstructed from unlabeled internet videos. It was created by bigai to address the scarcity of annotated 3D scene data. The dataset was last updated in April 2026.
This geospatial dataset documents artisanal mining site visits across Eastern Democratic Republic of Congo (DRC) between 2009 and 2020. Produced by the International Peace Information Service (IPIS), the data tracks mining activities with a focus on human rights indicators such as child labor. It provides longitudinal observations of mining locations over an 11-year period.
Sarah Sunn Bush from Yale University presents a text-based analysis of democracy promotion efforts. The work includes case studies on Jordan and Tunisia and an appendix containing data on categories of democracy assistance and major organizations. The dataset likely contains qualitative and quantitative information supporting the book's argument about the structure of foreign aid.
501 S&P 500 constituent stocks are represented by 1,374,694 candlestick chart images from January 2010 to March 2025. Each image is labeled with a 6-class forward return category based on price movement thresholds.
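As a rough illustration of threshold-based labeling like that described for the candlestick dataset, the sketch below maps a forward return to one of six classes. The cut points in `THRESHOLDS`, the function names, and the return definition are all assumptions made for illustration; the listing does not state the thresholds or horizon the dataset actually uses.

```python
from bisect import bisect_right

# Hypothetical cut points on the fractional forward return.
# Five thresholds split the number line into six classes (0..5).
# The dataset's real thresholds are not given in the listing.
THRESHOLDS = [-0.05, -0.02, 0.0, 0.02, 0.05]


def forward_return(prices, i, horizon):
    """Fractional price change from index i to index i + horizon."""
    return prices[i + horizon] / prices[i] - 1.0


def label_return(r, thresholds=THRESHOLDS):
    """Return the class index for a forward return r.

    A value equal to a threshold falls into the class above it.
    """
    return bisect_right(thresholds, r)


if __name__ == "__main__":
    closes = [100.0, 101.0, 97.0, 104.0]  # made-up closing prices
    r = forward_return(closes, 0, 3)
    print(label_return(r))
```

Using a sorted threshold list with `bisect_right` keeps the class boundaries in one place, so a different set of cut points only requires changing `THRESHOLDS`.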