DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Computer Graphics & Simulation Datasets | DataSalon

Computer Graphics & Simulation

3D models, rendered datasets, physics simulation, digital twins, synthetic data generation, game engine data

1,034 datasets

GR1 Robot Tabletop Simulation 3D Assets

NVIDIA's PhysicalAI DigitalCousin Assets provide a collection of 3D meshes, textures, and object metadata for simulated tabletop manipulation environments. These digital assets, including mugs, bottles, bowls, and containers, populate virtual scenes for the GR1 robot. The dataset was published by NVIDIA and last updated in June 2025.

Multimodal, 3D Assets, Digital Twin, Robotics Simulation, Tabletop Manipulation, Synthetic, +1

VR vs Web Serious Game Scores and Questionnaire Results for 289 Computer Science Students

Data from 289 computer science students supports the article 'Are Virtual Reality Serious Video Games More Effective Than Web Video Games?'. The dataset includes pre-test and post-test scores and questionnaire results for two groups: 110 students using VR and 179 using a web format. Daniel López Fernández published the data via the e-cienciaDatos Harvested Dataverse; it was last updated in October 2025.

Tabular, Virtual Reality, Serious Games, Computer Science Students, +1

Transparent Fragment Images and Masks with HDRI Lighting

TransFrag27K is a large-scale dataset containing 27,000 images and masks at 640×480 resolution. It covers fragments of common everyday glassware, incorporating over 150 background textures and 100 HDRI environment lightings. The dataset was created by chenbr7 and last updated in August 2025.

Image, Size Categories: 10K<n<100K, Material Properties, Computer Vision, Task Categories: Image Segmentation, License: CC BY-NC 4.0, Region: US, Large Scale, Transparent Objects, Synthetic Data, Synthetic, +1

Human-Authored 3D Interior Scenes with 211 Synthetic Environments

The Habitat Synthetic Scenes Dataset (HSSD) contains 211 human-authored 3D scenes and over 18,000 models of real-world objects. It is designed to mirror real interiors more closely than prior synthetic datasets for consumption in the Habitat simulation platform.

Language: en, License: CC BY-NC 4.0, Region: US, Embodied AI, +1

Html Eval: HTML Generation Performance Across Web, 3D, and Game Scenarios

An evaluation dataset for HTML generation models, covering diverse web development scenarios. The dataset was created by nex-agi and last updated on 2025-11-19. It includes examples for landing pages, e-commerce sites, responsive layouts, financial dashboards, WebGL scenes, Three.js applications, and browser-based games.

Text, 3d-scenes, Model Evaluation, Benchmark, HTML Generation, Web Development, Game Design, Finance, +1

Point-Cache: 3D Point Cloud Datasets for Robustness Analysis

Datasets used in the paper 'Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis'. The repository includes ModelNet and SONN datasets, organized into subfolders for different corruption types. The dataset was uploaded by author 'auniquesun' and last updated on 2025-06-19.

Point Cloud, 3D Objects, Computer Vision, SONN, ModelNet, +1

Aria Everyday Activities: Egocentric Multimodal Recordings of Daily Life

Project Aria's Aria Everyday Activities (AEA) dataset provides recordings of daily activities from a first-person perspective. According to its description, it includes high-frequency 6DoF trajectories, observed point clouds, and synchronized RGB and monochrome camera views. The dataset was last updated on Hugging Face by projectaria on September 17, 2024.

Point Cloud, Multimodal, Everyday Activities, Multimodal Sensing, Egocentric Vision, Trajectory Tracking, +1

Arctic Ocean Zooplankton Samples from Amundsen and Nansen Basins

Vertical and stratified zooplankton sampling was conducted in July and August 2001 aboard the Swedish icebreaker Oden. The FAMIZ program studied zooplankton distribution and abundance in the Amundsen and Nansen basins. The data was collected by SCIOPS.

Geospatial, Ocean Sampling, Zooplankton, Arctic Ocean, Marine Biology, +1

Synthetic Instruction Dataset for LLM Finetuning

SmolTalk is a synthetic dataset containing 1 million samples created for supervised finetuning of large language models. It was developed by HuggingFaceTB to address performance gaps with public SFT datasets and was used to build the SmolLM2-Instruct model family. The dataset's methodology and details are documented in a research paper.

Parquet, Library: polars, arXiv: 2502.02737, Library: dask, Size Categories: 1M<n<10M, Language: en, Modality: text, Modality: tabular, Library: mlcroissant, Library: datasets, Region: US, Synthetic, +1

URDF: 500 Textured and Untextured 3D Models for Robotics Simulation

500 3D models in URDF format, split into 235 textured and 265 untextured versions, are provided by Behavision. The dataset is designed to support research in robotics simulation, grasping, and physics simulation. It was last updated on August 7, 2025.

Multimodal, Grasping, Physics Simulation, Robotics Simulation, 3d-models, URDF, +1

KodCode V1 SFT R1: Synthetic Programming Problems with Verifiable Solutions

KodCode is a fully-synthetic open-source dataset for coding tasks, created by KodCode and last updated on March 17, 2025. It contains 12 distinct subsets spanning domains from algorithmic to package-specific knowledge and difficulty levels from basic exercises to competitive programming. The dataset is designed for supervised fine-tuning and RL tuning.

Text, Algorithmic Problems, Code Generation, Software Testing, Synthetic Data, Programming Challenges, Synthetic, +1

SUM Parts: Part-Level Semantic Segmentation of Urban Textured Meshes

SUM Parts is a benchmark dataset for part-level semantic segmentation of urban textured meshes. It covers 2.5 square kilometers and includes annotations for 21 classes such as terrain, vegetation, water, and building components. The dataset was created by author gwxgrxhyz and last updated on June 21, 2025.

Geospatial, Semantic Segmentation, Computer Vision, Urban Meshes, +1

MultiCamVideo: Synchronized Multi-Camera UE5 Videos and Trajectories

MultiCamVideo is a synthetic dataset of synchronized multi-camera videos and corresponding camera trajectories rendered in Unreal Engine 5. Created by KlingTeam and released in 2025 alongside the ReCamMaster paper, it provides ground-truth spatial data for multi-view video research.

Region: US, arXiv: 2503.11647, License: Apache 2.0, +1

PartNet Shape Archive with Part Annotations

PartNet Archive contains 3D object data with part-level annotations, derived from the ShapeNet repository. The prerelease v0 from March 2019 includes meshes, point clouds, and HDF5 files for semantic and instance segmentation tasks. ShapeNet assembled this collection for research in fine-grained 3D shape understanding.

Point Cloud, Multimodal, Part Segmentation, 3D Shapes, License: other, Language: en, Computer Vision, Region: US, Geometric Data, arXiv: 1512.03012, 3D shapes, +1

Ling-Coder-SFT: Supervised Fine-Tuning Data for Code Generation

More than 5 million samples of supervised fine-tuning data used to train the Ling-Coder Lite model. The dataset is part of a larger collection that also includes DPO and synthetic QA subsets, created by inclusionAI and last updated on March 27, 2025.

Text, AI Training, Code Generation, Large Scale, Synthetic Data, Synthetic, +1

Analog Electronics Question and Answer Pairs, 2,516 Synthetic Examples

2,516 synthetic question and answer sets focused on analog electronics, created from seven different perspectives. The dataset was generated by the author 'theprint' using a prompt designed to provide useful, inspiring, and appropriately detailed assistance. It was last updated on Hugging Face on August 12, 2025.

Text, Analog Electronics, Question Answering, Electronics Education, Synthetic Data, Synthetic, +1

Salmonid Stomach Contents and Oceanographic Data from Northeastern Pacific, 1958

From May 10 to August 24, 1958, Canadian exploratory fishing vessels collected data on salmonids caught via gillnetting in the Northeastern Pacific Ocean. The dataset likely contains tabulated records for each fish, including length, weight, sex, maturity, and stomach contents, alongside fishing position data showing gear, depth, surface temperature, salinity, and oceanographic domain. The data was gathered by NOAA_NCEI.

Tabular, Fisheries, Oceanography, Salmonids, Stomach Content, Marine Biology, +1

Objaverse-Rand6View: 1024px Multi-View Renders with Depth and Normal Maps

Objaverse-Rand6View provides 1024x1024 multi-view renders including RGB, depth, and normal maps derived from a high-quality subset of the Objaverse repository. Created by huanngzh and released in late 2024, the collection features randomized orthographic and perspective views designed for 3D generative modeling.

Point Cloud, Modality: 3D, Task Categories: Image-to-3D, Language: en, arXiv: 2412.03632, Objaverse, Task Categories: Text-to-Image, Task Categories: Text-to-3D, Region: US, Task Categories: Image-to-Image, High Quality, License: MIT, +1

Rainbow Trout Tail Fin Embryonic Development Images and Analysis

Data from Christine Mayer's study applies a Geometric Morphometric Image Analysis (GMIA) method to larval and juvenile rainbow trout tail fin images. The dataset enables the joint quantitative analysis of embryo shape and spatial patterns of cellular activity. It was last updated in June 2020.

Oncorhynchus mykiss, Evo Devo, Image Analysis, Rainbow Trout, +1

KodCode Light RL 10K Synthetic Programming Tasks

KodCode is a fully-synthetic open-source dataset providing verifiable solutions and tests for coding tasks. It contains 12 distinct subsets spanning domains from algorithmic to package-specific knowledge and difficulty levels from basic exercises to competitive programming. The dataset was created by KodCode and last updated in April 2025.

Text, AI Training, Synthetic Data, Programming Challenges, Synthetic, +1
