This dataset comprises 39.37 million tokens of curated Luau source code across 29,215 unique files drawn from 457 GitHub repositories. Developed by khtsly and updated in March 2026, it focuses on type-safe, functional architecture for code-specific model training.
The source code belongs to the original authors; users must comply with the individual licenses of the 457 source repositories. The dataset is specifically optimized for the training stages of LLMs.