Description
This dataset contains 17,477 instruction-response pairs compiled from Unity documentation, Stack Overflow, and GitHub for training AI models. It is split into 16,604 training and 873 test examples, was created by vishnuOI, and was last updated in April 2026. Topics include C# scripting, XR/VR development, physics, and performance optimization.
Use Cases
Fine-tune a language model to generate C# scripts for Unity given an instruction like 'create a player movement script'.
Train a model to answer Unity-specific questions on topics like physics simulation or UI Toolkit based on the instruction-response pairs.
Build a code assistant that provides performance optimization tips for Unity projects using the curated examples.
Develop an AI tutor for XR/VR development in Unity by learning from the structured instructions and solutions.
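For the fine-tuning use cases above, each instruction-response pair usually has to be rendered into a single training string. The sketch below shows one common way to do that. The field names `instruction` and `response`, the prompt template, and the sample response are all hypothetical; the actual column names and formatting should be taken from the dataset itself.

```python
from typing import Dict

# Hypothetical prompt layout; not a format prescribed by the dataset.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)


def format_pair(example: Dict[str, str]) -> str:
    """Render one instruction-response pair as a single training string.

    Assumes the record exposes 'instruction' and 'response' keys
    (hypothetical names, not confirmed by the dataset description).
    """
    return PROMPT_TEMPLATE.format(
        instruction=example["instruction"].strip(),
        response=example["response"].strip(),
    )


# Illustrative record: the instruction comes from the use case above,
# the response is a made-up placeholder rather than real dataset content.
sample = {
    "instruction": "create a player movement script",
    "response": "using UnityEngine;\n\npublic class PlayerMovement : MonoBehaviour { }",
}

print(format_pair(sample))
```

Keeping the template in one constant makes it easy to swap in whatever chat or instruction format the target model expects.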
Strengths
17,477 total examples provide a substantial corpus for model training.
Data is sourced from 8,403 Unity documentation entries, 6,709 Stack Overflow posts, and 2,365 GitHub examples, offering diverse practical contexts.
Explicit train/test split of 16,604 and 873 examples facilitates model evaluation.
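The counts quoted above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated on this page; the variable names are illustrative.

```python
# Figures as stated in the dataset description.
SOURCE_COUNTS = {
    "Unity documentation": 8403,
    "Stack Overflow": 6709,
    "GitHub": 2365,
}
TRAIN_SIZE = 16604
TEST_SIZE = 873

total = sum(SOURCE_COUNTS.values())

# Both breakdowns should add up to the same overall example count.
assert total == TRAIN_SIZE + TEST_SIZE == 17477

# Share of each source, as a percentage of all examples.
source_shares = {
    name: round(100 * count / total, 1) for name, count in SOURCE_COUNTS.items()
}
test_share = round(100 * TEST_SIZE / total, 1)

for name, share in source_shares.items():
    print(f"{name}: {share}%")
print(f"Test split: {test_share}% of all examples")
```

The source counts and the train/test split both sum to 17,477, so the two breakdowns are mutually consistent.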
Limitations
The test set of 873 examples (about 5% of the data) is relatively small for robust model evaluation.
Official Unity documentation supplies 8,403 of the 17,477 examples (about 48%), against 2,365 (about 14%) from GitHub, so the data may be skewed towards documentation style over community code, which may limit stylistic diversity.
No information on the temporal range of source data, which could affect relevance for newer Unity engine versions.
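To give a rough sense of what the small test set means in practice, the sketch below estimates the sampling margin of error for a score measured on 873 examples. It assumes a binary pass/fail metric per example and uses the normal approximation to the binomial; neither assumption comes from the dataset description, and other metrics would need a different treatment.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n), which assumes
    independent pass/fail outcomes (an assumption, not a dataset property).
    """
    return z * math.sqrt(p * (1.0 - p) / n)


TEST_SIZE = 873  # test split size stated in the description

# p = 0.5 maximizes p * (1 - p), so it gives the widest interval.
worst_case = margin_of_error(0.5, TEST_SIZE)
print(f"Worst-case margin: +/- {100 * worst_case:.1f} percentage points")
```

Under these assumptions the worst-case margin is roughly 3.3 percentage points, so differences between models smaller than that are hard to distinguish on this test split alone.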
Provenance
Source
Aggregated from Unity documentation, Stack Overflow, and GitHub.
Collection Method
Instruction-response pairs compiled from the three source platforms.
Freshness
Last updated in April 2026.
The license is unknown; verify the terms before commercial use. Consult the dataset page on Hugging Face for the full description and any updates.