This dataset offers LLM-generated pseudo music captions derived from three multi-label tag datasets for audio-language tasks. It provides music-to-caption pairs across four distinct tag-to-caption generation tasks to support training of text-to-music and music-to-text models.
Use Cases
- Train text-to-music generative models using the provided music-to-caption pairs
- Fine-tune music captioning systems by mapping audio signals to the pseudo-caption strings
- Analyze the relationship between structured tags and natural language using the caption_attribute_prediction field
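The use cases above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's actual loading code: only `caption_attribute_prediction` is named in this description, while the `track_id` and `tag` keys, the sample rows, and the `tag_coverage` helper are hypothetical stand-ins chosen for the example.

```python
# Hypothetical rows: only `caption_attribute_prediction` comes from the
# dataset description; `track_id` and `tag` are assumed names.
rows = [
    {
        "track_id": "a1",
        "tag": ["rock", "guitar"],
        "caption_attribute_prediction": "An energetic rock track driven by electric guitar.",
    },
    {
        "track_id": "b2",
        "tag": ["piano", "calm"],
        "caption_attribute_prediction": "A gentle solo piano piece.",
    },
]


def tag_coverage(row):
    """Return the fraction of a row's tags that appear verbatim in its caption."""
    caption = row["caption_attribute_prediction"].lower()
    tags = row["tag"]
    if not tags:
        return 0.0
    hits = [t for t in tags if t.lower() in caption]
    return len(hits) / len(tags)


# (identifier, caption) pairs, usable as text conditioning for text-to-music
# training or as target text for music captioning.
pairs = [(r["track_id"], r["caption_attribute_prediction"]) for r in rows]

# A crude tag-vs-caption comparison: how many tags survive into the caption.
coverage = {r["track_id"]: tag_coverage(r) for r in rows}
```

Verbatim substring matching is only a rough proxy for the tag-to-language relationship, since a caption may paraphrase a tag (for example, "gentle" for "calm") without repeating it.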
Strengths
- Includes pseudo-captions generated from three distinct multi-label music tag datasets
- Features a caption_attribute_prediction field for granular music attribute analysis
- Supports four specific tag-to-caption generation tasks
- Designed specifically for text-to-music and music-to-text cross-modal research