CLI-1M is a dataset of 975,933 natural-language to shell-command pairs created by carosh. It covers 18 industries, 6 shell types, and 13 languages, and is released under an Apache-2.0 license. The dataset was last updated on 2026-05-12 and is described as the most industry-diverse public corpus for NL→shell generation.
The specific sources of the original command pairs are not detailed, however, which may affect redistribution or commercial use despite the permissive license.