DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Computer Vision Datasets | DataSalon

Computer Vision

Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding

15,996 datasets

ASTER L1T: Terrain-Corrected Satellite Radiance Data in Cloud-Optimized GeoTIFFs

ASTER L1T data provides calibrated at-sensor radiance, geometrically corrected and rotated to a north-up UTM projection. The data is derived from the ASTER L1A product using a single resampling step and incorporates GLS2000 digital elevation data for precision terrain correction. Each granule is provided as three separate Cloud-Optimized GeoTIFF files corresponding to the VNIR (15 m), SWIR (30 m), and TIR (90 m) sensors, with metadata for converting digital numbers to radiance and reflectance.

Image, Geospatial, Sustainability, Satellite Imagery, Radiance, Computer Vision, Natural Resource, Earth Observation, Cog, Mining, +1

Solar Panel Images Enhanced with Generative Adversarial Networks

A dataset of solar panel images likely processed or generated using Generative Adversarial Networks (GANs). The dataset is hosted on Kaggle, but its exact size, creation date, and authorship are unknown. Columns and specific content details require verification after download.

Image, Generative Adversarial Networks, Computer Vision, Solar Energy, +1

DCLA Programs Funding: Annual Cultural Development Awards by Organization

DCLA Cultural Development Funding amounts are allocated per fiscal year to specific organizations. The data is hosted by data.cityofnewyork.us and was last updated on 2026-02-26. Columns suggest it tracks the Organization, Total Final Award, and Application # for each funded program.

Tabular, CSV, XML, JSON, Government Spending, Percent For Art, Project, Fiscal Year, Cultural Affairs, Arts Grants, Program, Percent, Cultural, Artist, Fund, Cultural Funding, Art, Funding, Dcla, CULTURE, +1

Dolma3 6T Unique: 1.26 Billion Deduplicated Documents for Language Model Training

1,258,453,709 unique documents form the Dolma3 6T training mix, selected by a Bloom filter built from 1.26B deduplicated IDs. The dataset is materialized from multiple sources including Common Crawl, Stack Exchange, and scientific PDFs, and was created by HCAI-Lab. It was last updated on March 14, 2026.

Text, Web Crawl, Language Model Training, Deduplication, Text Corpus, +1

Epstein Files OCR: Page-Level Text from Early Document Release

Epstein Files OCR — Datasets 1–8 (Early Release) contains page-level OCR output in Markdown format from a public release of documents related to the Jeffrey Epstein case. The dataset is designed for question answering, information retrieval, and text classification tasks. It was created by the author 'ishumilin' and last updated on March 17, 2026.

Text, Size Categories: 10K<n<100K, Island Visits, Task Categories: question-answering, Unsealed Documents, Epstein Files, License: cc0-1.0, Ocr Text, Epstein Case, Fbi Files, Epstein, Task Categories: text-retrieval, Passenger List, Court Documents, Legal Text, Region: us, Flight Logs, Legal, Task Categories: text-classification, Private Jet, Jeffrey Epstein, +1

Seattle Discrimination Case Closures by Month and Type, 2017-Present

Closed discrimination case investigations from the Seattle Office for Civil Rights (SOCR) are tracked monthly from 2017 onward. The dataset shows completed cases categorized by type, such as race or disability. It is published by the City of Seattle and was last updated in March 2026.

Tabular, Time Series, CSV, XML, JSON, Discrimination, Race, Civil Rights, Case Closures, Disability, OCR, Office Of Civil Rights, Discrimination Cases, Seattle, Harassment, +1

Finance Minister Survival and Economic Performance in Democracies and Autocracies

Replication data for a forthcoming study in Political Research Quarterly. The dataset likely contains records linking economic performance indicators to the tenure of finance ministers across different political regimes. It was authored by Jonas Willibald Schmid and is hosted on Harvard Dataverse, with a last update recorded on April 20, 2026.

Tabular, Economic Performance, Finance, Political Science, Minister Survival, Democracy Autocracy, +1

YoNAH: North Atlantic Humpback Whale Photographs, Genetics, and Behavior Data

A standardized 1992-1993 study of humpback whales across their North Atlantic breeding and feeding grounds. The YoNAH project collected photographs of natural markings, genetic samples, and behavior data. Undertaken by SCIOPS, it was a broad-ranging, intensive study of a marine mammal species.

Multimodal, Photographic Identification, Animal Behavior, Whale Research, Finance, Large Scale, Marine Biology, +1

YOLO Trained Weights for Multi-Target Multi-Camera Tracking

YOLO Trained Weights MTMCT is a dataset of pre-trained model weights for the YOLO object detection architecture, hosted on Kaggle. The weights are likely intended for tasks involving multi-target, multi-camera tracking scenarios. The dataset's specific content, size, and creation details require verification after download due to minimal provided metadata.

Image, Trained Weights, Yolo, Multi Target Tracking, Computer Vision, Object Detection, +1

Real Versus AI Image Classification Corpus

Binary classification dataset, tagged with the US region, for training AI-image detectors, built by Zitacron from 17 public Hugging Face sources. The corpus contains real and AI-generated images, all with commercial licenses.

Parquet, Library: polars, Library: dask, Size Categories: 1M<n<10M, Language: en, Modality: text, Ai Generated, Library: mlcroissant, Modality: image, Library: datasets, Binary Classification, License: cc-by-4.0, Task Categories: image-classification, Computer Vision, Ai Detection, Region: us, Real Images, +1

California NEVI Funding Project Regions by County

Regions for allocating federal electric vehicle charging infrastructure funding under California's NEVI program Rounds 4 and 5. The data, created by the State of California and last updated in March 2026, organizes the state into project regions by county.

Geospatial, ZIP, CSV, Text, Excel, Energy policy, United States, Electric Vehicle Infrastructure, Transportation Planning, +1

Garbage Detection Dataset: Municipal Waste and Litter Images

Garbage_Detection_Dataset is a collection of images for object detection tasks, published on Kaggle. The raw description indicates it is a YOLO-ready dataset, which suggests it contains bounding box annotations for municipal waste and litter. The dataset's specific size, origin, and update history are not detailed in the provided metadata.

Image, Yolo, Waste Management, Computer Vision, Object Detection, +1

Key Officers of U.S. Foreign Service Posts, 1966-2017

Quarterly observations from 1966 to 2017 track the names, locations, and key personnel of U.S. overseas diplomatic posts. David Lindsey compiled this micro-level dataset for the Measuring Diplomacy Project. It includes identifiers for top-ranking officers and post classifications, such as embassies, consulates, and missions.

Tabular, Time Series, Social Sciences, Diplomacy, Foreign Service, Government, +1

DAMO-YOLO Weights: Pre-trained Object Detection Model Parameters

A collection of pre-trained model weights for the DAMO-YOLO object detection architecture. Published on Kaggle, the dataset likely contains the parameter files necessary to initialize or fine-tune the model. The author, organization, and specific version or training details are unknown.

Image, Computer Vision, Object Detection, Model Weights, +1

monet-cyclegan-g-p2m: Monet Painting to Photo Translation Model

A dataset hosted on Kaggle containing a CycleGAN generator model. The title 'monet-cyclegan-g-p2m' suggests it is likely a pre-trained model for translating images from the style of Monet paintings to photographs. Its specific contents, such as the number of model parameters or training images, are not detailed in the provided metadata.

Image, Generative Adversarial Networks, Image To Image Translation, Computer Vision, Cyclegan, +1

POPs in Salmonids and Macroinvertebrates from Glacier Bay, Alaska

A 2007 pilot study by the University of Alaska Southeast determined baseline levels of persistent organic pollutants (POPs) and total mercury in juvenile coho salmon from streams near Glacier Bay National Park. Concentrations of POPs were relatively low (< 10 ng/g, wet weight), but salmon from streams with higher spawner density showed increased levels of banned chlorinated pesticides. A follow-up study in 2015 analyzed POPs in resident salmonids and benthic macroinvertebrates from five streams using gas chromatography/mass spectrometry.

Tabular, Macroinvertebrates, Environmental science, Instrument Not Applicable, Ddts, Salmonids, Docnoaanmfsnwfsc, Alaska, National Marine Fisheries Service, Northwest Fisheries Science Center, Benchmark, Pollutants, Obsolete, Noaa Us Department Of Commerce, Salmonid Recovery, Efs Environmental And Fisheries Sciences Division, Montlake, +1

North Atlantic and Pacific Ocean CTD Profiles for Water Chemistry, 1981-1994

North Atlantic Ocean and North Pacific Ocean physical and chemical data collected via bottle casts from NOAA Ship MALCOLM BALDRIGE and other platforms between June 15, 1981 and April 25, 1994. The data were submitted by Dr. Richard A. Feely of the Pacific Marine Environmental Laboratory (PMEL). The dataset includes measurements for temperature, salinity, oxygen, and various nutrients and carbon system parameters.

Tabular, Time Series, Oceanography, Noaa, North Atlantic, Ctd Profiles, Water Chemistry, +1

MEDIPROD_IV: Marine Chemistry Profiles from the Western Mediterranean Sea, 1981

Discrete profile data were collected from October 20 to November 15, 1981, during the R/V Jean Charcot MEDIPROD_IV cruise in the Western Mediterranean Sea. The dataset includes measurements of dissolved inorganic carbon, total alkalinity, temperature, salinity, dissolved oxygen, and nutrients. NOAA's National Centers for Environmental Information (NCEI) archives this data.

Tabular, Time Series, Marine Science, Hydrographic Profiles, Carbon cycle, Ocean Chemistry, Mediterranean Sea, +1

Scotian Shelf Geophysical and Oceanographic Data Record, 1964

1964 geophysical and oceanographic observations from a cruise in the North American Basin of the Atlantic Ocean. The data includes magnetic measurements, echo sounding traverses, seismic refraction equipment checks, and studies of heat production in sediments, organic matter in sea water, and sediment samples. The cruise also occupied eight stations of the seasonal Halifax oceanographic section for the Atlantic Oceanographic Group.

Tabular, Geospatial, Marine Science, Oceanographic data, Atlantic Ocean, Geophysics, Sediment Analysis, +1

Sediment Core Analysis from Antarctica's Vestfold Hills, 1985

1985 sediment coring activities at Davis Station, Antarctica, are documented in this scanned report. The report details the analysis of cores from the Vestfold Hills for water content, total organic content, and non-polar lipid content. It was produced by Lin Jian-ping and archived by the Australian Antarctic Data Centre.

Text, Antarctica, Sediment core, Polar Research, Paleoenvironment, Geochemistry, +1

Page 400 of 798