DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Computer Vision Datasets | DataSalon

Computer Vision

Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding

15,846 datasets

Ar Pusht Image: Arabic-Translated Vision-Language-Action Robotics Dataset

An Arabic-translated version of the PushT image dataset for robotics. The dataset provides high-quality Arabic instructions for complex manipulation tasks, enabling the training of localized robotics policies. It was created by hamzabouajila and last updated on 2026-04-21.

Image, Multimodal, Arabic Language, Manipulation Tasks, Robotics, Computer Vision, Vision Language Action, +1 more

OCR-Markdown-Dense-200x: Synthetic Dense Document Images with Structured Text

OCR-Markdown-Dense-200x is a synthetic dataset designed for dense document optical character recognition tasks. The dataset was created by author prithivMLmods and was last updated on April 21, 2026. It focuses on extracting structured HTML or Markdown representations from densely packed document pages.

Multimodal, Document Understanding, Optical Character Recognition, Computer Vision, Synthetic Data, Synthetic, +1 more

MiniMSD: Processed 2D Medical Image Segmentation Benchmark for 10 Organs

A processed and reduced medical image segmentation benchmark covering 10 human organs. The dataset is derived from the Medical Segmentation Decathlon by converting volumetric NIfTI scans into serialized 2D RGB images with segmentation masks. It is provided in multiple resolution variants (244, 512) for easier use and was last updated on 2026-04-19.

Image, Medical Imaging, Benchmark, Image Segmentation, Healthcare, Computer Vision, Human Organs, +1 more

BloodshotNet: Large-Scale Computer Vision Dataset for Blood Detection

BloodshotNet-Dataset is the official, large-scale aggregated dataset designed to train a YOLO-based blood detection model. The dataset was created by author 'petre-bit' and was last updated on Hugging Face in April 2026. It contains highly graphic and sensitive imagery, including simulated and real blood, serious injury, and surgical scenes.

Image, Yolo, Computer Vision, Large Scale, Medical Imagery, Blood Detection, Graphic Content, Synthetic, +1 more

Uganda Infrastructure: World Bank Indicators from IEA, ITU, and ICAO

Infrastructure indicators for Uganda, compiled by the World Bank Group from specialized international agencies. The dataset aggregates metrics across the transport, energy, and telecommunications sectors, with the most recent update recorded in March 2026. The data is delivered in CSV format and serves as a centralized resource for national development tracking.

Facilities Infrastructure, Indicators, +1 more

Uganda Health Indicators: World Bank Data from WHO, UNICEF, and UNAIDS

Health indicators for Uganda sourced from the World Bank, aggregating metrics from the UN Population Division, WHO, UNICEF, and UNAIDS. The dataset covers health systems, disease prevention, reproductive health, nutrition, and population dynamics, with the latest update recorded in March 2026. The data is delivered in CSV format to support public sector health analysis.

Indicators, Health, +1 more

Uganda: Environmental Indicators for Forests, Biodiversity, and Emissions

Uganda environmental indicators covering forests, biodiversity, emissions, and pollution, curated by the World Bank Group. These tabular records provide a localized view of natural and man-made resource metrics, with the most recent update recorded in March 2026.

Indicators, Environment, +1 more

Uganda Energy and Mining: World Bank Development Indicators

This dataset tracks energy production, use, dependency, and efficiency indicators for Uganda, compiled by the World Bank Group. It aggregates data from the International Energy Agency and the Carbon Dioxide Information Analysis Center through March 2026. The records provide a time-series view of national energy and mining development.

Indicators, Development, Energy, +1 more

Uganda Aid Effectiveness: World Bank Indicators for Poverty and Health

World Bank indicators for aid effectiveness in Uganda, focusing on poverty reduction, health, and education metrics. Maintained by the World Bank Group, the data was last updated in March 2026 and is provided in CSV format. It tracks the impact of international aid on the achievement of Millennium Development Goals within the country.

Aid Effectiveness, Indicators, +1 more

OverlayDataset: 499,249 Images with Object Layouts and Scene Captions

OverlayDataset is a large-scale vision-language collection of 499,249 images paired with dense object-level annotations, local prompts for objects, and global scene captions. It was created by dsrivastavv and last updated on Hugging Face in April 2026. The dataset is designed for training controllable image generation systems.

Image, Multimodal, Vision Language, Scene Captioning, Computer Vision, Large Scale, Object Annotation, +1 more

Video Question Answering Benchmark with 3200 Annotations

Annotation data for the Video-MME-v2 benchmark, containing 800 1080p MP4 video files and 3200 question-answer pairs stored in a Parquet file. The dataset was created by MME-Benchmarks and the repository was last updated in April 2026.

Tabular, Video, Multimodal Ai, Benchmarking, Benchmark, Question Answering, Video Understanding, +1 more

Simulated Zinc Copper Manganese Requirements for Bovine Life Stages

Four datasets contain simulated records for mineral requirements in different bovine groups. Each dataset holds 5,000 individual records for growing heifers, pregnant nulliparous cows, pregnant parous cows, and lactating cows. The data was created by Jean-Baptiste Daniel and published on figshare in April 2026.

Tabular, Excel, Animal Nutrition, Agricultural Science, Mineral Requirements, Bovine Health, Synthetic, +1 more

Marathi Handwritten Sentence Images for OCR, 4,648 Annotations

4,648 annotated images of handwritten Marathi sentences in Devanagari script. The dataset is hosted on Kaggle and likely contains samples for training optical character recognition models. Its specific origin, collection method, and update history are not detailed in the provided description.

Image, Handwritten Text, Marathi, Image Dataset, OCR, Devanagari Script, +1 more

Marathi Printed OCR Dataset with 4,648 Annotated Sentence Images

4,648 annotated Marathi printed sentence images in Devanagari script. The dataset is hosted on Kaggle. The author, organization, and last update date are unknown.

Image, Printed Text, Computer Vision, Marathi Language, OCR, Devanagari Script, +1 more

UV OCR Scripts: 14 Vision-Language Models for Markdown Extraction

14 OCR models ranging from 0.9B to 8B parameters provided by uv-scripts as of March 2026. These scripts facilitate the conversion of image-based datasets into markdown format using HuggingFace Jobs and the UV package manager.

Uv Script, Vision Language Model, Arxiv260313032, Hf Jobs, Regionus, OCR, Document Processing, +1 more

Atha Text Dataset: Indonesian Sentiment Classification for NLP Experiments

Atha Text Dataset is a sentiment classification resource for the Indonesian language, containing three sentiment classes. The dataset is authored by Bangkah and was last updated on April 13, 2026. Its intended purpose is for learning NLP pipelines and establishing experimental baselines, not for production benchmarking.

Text, Sentiment Analysis, Benchmark, Text Classification, Natural Language Processing, +1 more

Mid-Atlantic Shelf Water Chemistry and Benthic Organisms, 1976-1977

Mid-Atlantic continental shelf data collected from November 1976 to September 1977. The dataset contains water column physical and chemical measurements, including temperature, salinity, and dissolved oxygen, alongside benthic organism surveys with species abundance and biomass. Data were submitted by the Virginia Institute of Marine Science and processed by the National Oceanographic Data Center (NODC) into standard formats F014 and F132.

Tabular, Oceanography, Mid Atlantic, Benthic Organisms, Water Chemistry, Physical Oceanography, +1 more

Gulf of Mexico Oceanographic and Biological Time Series from Moored Instruments

Moored instrument data from the Gulf of Mexico captures time-series measurements of ocean currents, water chemistry, phytoplankton, and zooplankton. The dataset was submitted by Texas A&M University as part of the Brine Disposal project, with collection occurring from 1979-08-30 to 1981-08-01. Data were processed by the National Oceanographic Data Center into standard formats including F005 for current meters and F028 for phytoplankton.

Tabular, Time Series, Gulf Of Mexico, Water Chemistry, Moored Instruments, Ocean Currents, Marine Biology, +1 more

Gulf of Mexico Brine Disposal Project Oceanographic and Benthic Data 1977-1979

October 1977 to August 1979 data from the Gulf of Mexico Brine Disposal project includes current direction, chemical parameters, benthic organisms, and wind wave spectra. Data were collected via moored current meter casts and other instruments by Texas A&M University and processed by the National Oceanographic Data Center (NODC) into standard formats.

Tabular, Time Series, Oceanography, Water Chemistry, Physical Oceanography, Coastal Engineering, Marine Biology, +1 more

Gulf of Mexico Benthic and Oceanographic Survey Data from 1981-1982

Gulf of Mexico data from January 1981 to July 1982, including current direction, water chemistry, and benthic organism measurements from moored instruments. Data were submitted by Texas A&M University for the Brine Disposal project and processed by the National Oceanographic Data Center into standard formats. The dataset contains time-series current meter data, physical and chemical water column parameters, and species-level benthic survey information.

Tabular, Time Series, Oceanography, Benthic Organisms, Current Meter, Gulf Of Mexico, Water Chemistry, Current Measurement, Marine Biology, +1 more
