Description
This dataset contains 24,800 structured text prompts that systematically vary genre, tempo, instrument, and mood, intended for studying controllability in text-to-music models such as MusicGen. Created by bodhisattamaiti, it consists only of prompts in CSV format, with no accompanying audio files. It was last updated on December 11, 2025.
Use Cases
Training or fine-tuning text-to-music models on prompts structured by genre, tempo, instrument, and mood.
Benchmarking model controllability using the systematic prompt variations.
Studying the effect of prompt paraphrasing on music generation, using the two paraphrase forms provided per combination.
Analyzing the relationship between prompt structure and generated audio features across the 8 structural variants.
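For the benchmarking use case, one possible starting point is to group prompts that share every attribute except the one under study, so that differences in generated audio can be attributed to that attribute. The sketch below is illustrative only: the column names (`genre`, `bpm`, `instrument`, `mood`) and the sample rows are assumptions, since the dataset's actual schema is not documented here.

```python
from collections import defaultdict

# Assumed attribute names; the real CSV columns may be named differently.
ATTRIBUTES = ("genre", "bpm", "instrument", "mood")

def group_by_all_but(rows, varied):
    """Group rows that agree on every attribute except `varied`."""
    fixed = [a for a in ATTRIBUTES if a != varied]
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in fixed)
        groups[key].append(row)
    # Keep only groups in which the varied attribute takes more than one value.
    return {k: g for k, g in groups.items() if len({r[varied] for r in g}) > 1}

# Hypothetical rows, not taken from the dataset.
rows = [
    {"genre": "jazz", "bpm": 90, "instrument": "piano", "mood": "calm"},
    {"genre": "jazz", "bpm": 120, "instrument": "piano", "mood": "calm"},
    {"genre": "rock", "bpm": 90, "instrument": "guitar", "mood": "calm"},
]

tempo_groups = group_by_all_but(rows, "bpm")
for key, group in tempo_groups.items():
    print(key, [r["bpm"] for r in group])
```

Each resulting group holds prompts that differ only in the chosen attribute, which is the comparison a controllability study typically needs.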
Strengths
Large scale with 24,800 structured prompts.
Systematic variation across four key musical dimensions: genre, tempo (BPM), instrument, and mood.
Includes 8 structural variants and 2 paraphrase forms for each combination, enabling detailed controllability studies.
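To illustrate what systematic variation across four dimensions can look like, the sketch below builds a small factorial grid of prompts. It is not the author's generation method: the attribute values, the two templates, and the field names are all hypothetical stand-ins, and the real dataset's vocabularies and variants may differ.

```python
from itertools import product

# Hypothetical attribute values; the dataset's real vocabularies are not documented here.
GENRES = ["jazz", "techno"]
TEMPOS_BPM = [90, 120]
INSTRUMENTS = ["piano", "synth"]
MOODS = ["calm", "energetic"]

# Hypothetical templates standing in for structural variants or paraphrase forms.
TEMPLATES = [
    "A {mood} {genre} track at {bpm} BPM featuring {instrument}.",
    "{genre} music, {instrument}, {bpm} BPM, {mood} mood.",
]

def build_prompts():
    """Return one row per (attribute combination, template) pair."""
    rows = []
    for genre, bpm, instrument, mood in product(GENRES, TEMPOS_BPM, INSTRUMENTS, MOODS):
        for template_id, template in enumerate(TEMPLATES):
            rows.append({
                "genre": genre,
                "bpm": bpm,
                "instrument": instrument,
                "mood": mood,
                "template_id": template_id,
                "prompt": template.format(
                    genre=genre, bpm=bpm, instrument=instrument, mood=mood
                ),
            })
    return rows

prompts = build_prompts()
print(len(prompts))          # 16 combinations x 2 templates
print(prompts[0]["prompt"])
```

Keeping the attribute values alongside each prompt string makes it straightforward to slice results by any single dimension later.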
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Contains only text prompts; no audio files are included for direct model output comparison.
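Because column-level documentation is absent, a quick first pass after download might summarize each column's distinct values to infer field semantics. The following is a minimal sketch using only the Python standard library; the inline `SAMPLE` text and its column names are made up for demonstration and do not reflect the dataset's actual contents.

```python
import csv
import io
from collections import defaultdict

def summarize_columns(csv_file, max_examples=3):
    """Return the row count and, per column, the distinct-value count and a few examples."""
    reader = csv.DictReader(csv_file)
    distinct = defaultdict(set)
    n_rows = 0
    for row in reader:
        n_rows += 1
        for column, value in row.items():
            distinct[column].add(value)
    summary = {
        column: {"n_distinct": len(values), "examples": sorted(values)[:max_examples]}
        for column, values in distinct.items()
    }
    return n_rows, summary

# Made-up sample standing in for the downloaded CSV.
SAMPLE = """genre,bpm,prompt
jazz,90,A calm jazz track at 90 BPM.
jazz,120,An upbeat jazz track at 120 BPM.
"""

n_rows, summary = summarize_columns(io.StringIO(SAMPLE))
print(n_rows)
for column, info in summary.items():
    print(column, info)
```

With the real file, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. Columns with few distinct values are likely the controlled attributes, while a column with nearly as many distinct values as rows is likely the prompt text.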
Provenance
Source
Hugging Face
Collection Method
Likely manually or programmatically constructed for research purposes.
Time Range
Not specified
Freshness
Last updated 2025-12-11 23:35:07; freshness should be verified.
Geography
Not specified
License is unknown; users should verify permissions before use. A companion dataset (Prompt2MusicLibrary) is referenced but not described here.