Description
An enhanced version of the NL2SH-ALFA dataset, this collection pairs natural language instructions with the corresponding Bash commands for the NL2Bash translation task. It was created by dilkushsingh and last updated on 2026-06-24. The data was produced by combining, deduplicating, and filtering multiple source datasets, and the natural language instructions were revised to reduce ambiguity.
Use Cases
Train sequence-to-sequence models for natural language to Bash command translation based on the instruction-command pairs.
Benchmark the performance of code generation models on the specific task of translating English instructions to Bash.
Fine-tune large language models for shell command synthesis using the provided natural language prompts.
Study ambiguity and clarity in natural language instructions for programming tasks, using the disambiguated instructions in this dataset as material.
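For the benchmarking use case, one simple starting metric is exact-match accuracy over shell tokens. The sketch below is an illustration only: the function names are hypothetical, and exact match is a crude proxy, since two different commands can be functionally equivalent. It tokenizes with shell quoting rules so that spacing differences alone do not count as errors.

```python
import shlex


def shell_tokens(command):
    """Split a command with POSIX shell quoting rules; fall back to whitespace."""
    try:
        return tuple(shlex.split(command))
    except ValueError:
        # Unbalanced quotes or a dangling escape: use plain whitespace splitting.
        return tuple(command.split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose token sequence equals the reference's."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        shell_tokens(pred) == shell_tokens(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    preds = ["ls  -la", "rm -r build"]
    refs = ["ls -la", "rm -rf build"]
    print(exact_match_accuracy(preds, refs))
```

Comparing token tuples rather than re-joined strings keeps a quoted argument such as `"a b"` distinct from the two separate arguments `a` and `b`.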
Strengths
Dataset is an enhanced version of NL2SH-ALFA with improved natural language instructions to reduce ambiguity.
Source data was produced by combining, deduplicating, and filtering multiple datasets, suggesting a degree of curation.
Limitations
Description metadata is limited; actual data quality, size, and structure require manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license information are unknown, which may limit suitability assessment.
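Because row count and column semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes the records have already been loaded into a list of dictionaries by whatever reader matches the actual file format; the field names `nl` and `bash` in the demo are hypothetical placeholders, not the dataset's confirmed columns.

```python
from collections import Counter


def summarize_records(records):
    """Count rows and, per field, how often it is non-empty and which types occur."""
    present = Counter()
    types = {}
    rows = 0
    for record in records:
        rows += 1
        for field, value in record.items():
            if value is None or value == "":
                continue  # treat missing and empty values as absent
            present[field] += 1
            types.setdefault(field, set()).add(type(value).__name__)
    fields = {
        name: {"present": present[name], "types": sorted(types[name])}
        for name in present
    }
    return {"rows": rows, "fields": fields}


if __name__ == "__main__":
    # Hypothetical sample; real field names must be read from the downloaded files.
    sample = [
        {"nl": "list files", "bash": "ls"},
        {"nl": "show the date", "bash": ""},
    ]
    print(summarize_records(sample))
```

A field whose `present` count is well below `rows`, or that reports more than one type, is a good candidate for closer manual review.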
Provenance
Source
Hugging Face user dilkushsingh, based on an enhanced version of the NL2SH-ALFA dataset.
Collection Method
Combining, deduplicating, and filtering multiple source datasets.
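The author's actual pipeline is not documented, so the following is only a generic sketch of what a combine, deduplicate, and filter pass can look like. The field names `nl` and `bash`, the `min_words` threshold, and the choice of deduplication key are all assumptions made for illustration.

```python
from itertools import chain


def collapse_whitespace(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


def merge_sources(*sources, min_words=2):
    """Combine record lists, drop unusable pairs, and keep the first copy of each pair."""
    seen = set()
    merged = []
    for record in chain.from_iterable(sources):
        nl = collapse_whitespace(record.get("nl", ""))
        # Only trim the command: inner whitespace may be significant inside quotes.
        bash = record.get("bash", "").strip()
        # Filter: skip pairs with no command or an instruction that is too short.
        if not bash or len(nl.split()) < min_words:
            continue
        # Deduplicate: instructions compare case-insensitively, commands exactly.
        key = (nl.lower(), bash)
        if key in seen:
            continue
        seen.add(key)
        merged.append({"nl": nl, "bash": bash})
    return merged


if __name__ == "__main__":
    source_a = [{"nl": "List all  files", "bash": "ls -a"}]
    source_b = [{"nl": "list all files", "bash": "ls -a "}]
    print(merge_sources(source_a, source_b))
```

Keeping the first occurrence preserves the order of the earlier sources, which makes it easy to prioritize a more trusted dataset by passing it first.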
Freshness
Last updated 2026-06-24 17:30:23; confirm the current revision date on the source page before relying on it.
License is unknown; users must verify permissions before use.