This dataset contains 1.664 million cleaned and labeled source code samples across 16 programming languages, curated for language identification tasks. It was created by kaushik-harsh-99 and is hosted on Hugging Face. It was last updated on May 30, 2026.
The license is unknown; verify the terms of use before using the dataset.