3D models, rendered datasets, physics simulation, digital twins, synthetic data generation, game engine data
1,020 datasets
Reflect3R is a synthetic dataset for training models to perform 3D stereo reconstruction from a single camera view, aided by mirror reflections. The data was created by researchers from the University of Oxford and is associated with a paper for the 3DV 2026 conference. It was last updated on the Hugging Face platform in March 2026.
A synthetic dataset designed for predicting video game purchase behavior using machine learning. The dataset's author, size, and specific features are unknown. It was sourced from Kaggle, but its last update date and other metadata are not provided.
A synthetic dataset for predicting streaming subscription renewal behavior. It was sourced from Kaggle, but the author, organization, and specific creation date are unknown. The dataset's size, number of rows, and column-level details are not provided in the available metadata.
Synthetic data is a common resource for training and testing machine learning models. This dataset is hosted on Kaggle, a popular platform for data science competitions and projects. The specific content, size, and generation method are not detailed in the available metadata.
Synthetic data generated to model supply chain operations without using real proprietary information. The dataset is hosted on Kaggle, a popular platform for data science competitions and projects. Its specific origin, size, and creation method are not detailed in the available metadata.
A synthetic dataset of 200,000 records for analyzing ChatGPT usage, behavior, and engagement trends. The dataset is hosted on Kaggle, but its author, organization, and license are unknown. The last update date and specific column details are also unavailable.
Modotte's MathX-5M dataset is part of a lineup focused on providing high-quality data for model training and fine-tuning. It is curated from public sources and enhanced with synthetic data from both closed and open-source models. The dataset serves as a foundation for instruction-based model tuning and was last updated on February 10, —.
Rule-based synthetic data designed for used car price prediction and resale analysis. The dataset was created for Kaggle, though the specific author and creation date are unknown. Its size, specific features, and row count are not detailed.
ShapeNet is a large-scale repository of 3D CAD models representing objects. It contains models from numerous semantic categories organized under the WordNet taxonomy and provides annotations like consistent rigid alignments, parts, and bilateral symmetry planes. The repository was presented by Angel X. Chang and collaborators.
ShapeNetCar_preprocessed is a dataset of 3D car models, likely derived from the ShapeNet repository. The data has been preprocessed, suggesting it is formatted for machine learning tasks. It is hosted on Kaggle, but specific details on the number of models, features, and creation date are unavailable.
A synthetic dataset exploring relationships between AI usage, career stress, burnout, and student readiness. The data was sourced from Kaggle, but the author, organization, and specific size are unknown. The dataset's last update date is also not provided.
WMGStereo-150k provides 150,000 synthetic stereo image pairs and disparity maps generated by the Princeton Vision & Learning Lab. Released in 2025, the collection uses procedural generation to create indoor, nature, and dense "flying" scenes for depth estimation tasks.
7,200 square kilometers of Shizuoka prefecture are covered by this high-precision 3D point cloud data. The dataset is the result of a multi-year effort using aerial laser survey, airborne laser bathymetry, and mobile mapping systems, and is provided by AIGID under a CC-BY-4.0 license. It is intended for visualization and analysis in infrastructure, disaster prevention, and autonomous driving.
High-precision 3D point cloud data covering the entire Kanagawa prefecture in Japan. The data was produced over many years through aerial laser survey, airborne laser bathymetry, and mobile mapping systems. It is published by AIGID under a CC-BY-4.0 license and hosted on AWS Open Data.
Netflix Synthetic Dataset Fairness Data Quality likely contains artificial data designed for evaluating algorithmic fairness and data quality metrics. Published on Kaggle, its specific content, size, and creation details require verification after download. The dataset's primary purpose appears to be testing and benchmarking fairness-aware machine learning models.
Mesh The Guti is a dataset published on Kaggle. The title suggests it contains 3D mesh data, likely for computer graphics or simulation applications. The dataset's specific content, size, and origin require verification after download.
NASA's Ames Research Center provides a synthetic dataset created to demonstrate the MKAD algorithm's effectiveness. The data is designed to test anomaly detection in both continuous numerical and binary discrete data types. It was last updated on March 13, 2026.
Microsoft created this synthetic dataset from the H&M Personalized Fashion Recommendations competition data. The dataset is derived from a Kaggle competition and a related Hugging Face dataset. It was last updated on February 11, 2026.
A dataset of realistic synthetic data published on Kaggle. The specific content, size, and creator are unknown from the provided metadata. Its intended use is likely for training or testing machine learning models.
NeuroData provides multiple neuroimaging datasets stored as Neuroglancer Precomputed Volumes. The collection spans multiple modalities and scales, from nanoscale electron microscopy to mesoscale structural and functional MRI. Many datasets include segmentations and meshes.