Description
CMI Pref Pseudo contains 56,000 music generations from 23 models and 165,000 pairwise comparisons for preference modeling research. The dataset was created by HaiwenXia and last updated on March 3, 2026. Prompts are compositional, including text, optional lyrics, and reference audio.
Use Cases
Training music preference models based on pairwise comparison data.
Evaluating music generation models based on human or AI-labeled preferences.
Studying multimodal prompt conditioning with compositional prompts that combine text, optional lyrics, and reference audio.
Strengths
Contains 165,000 pairwise comparisons, providing a substantial basis for preference learning.
Includes 56,000 music generations from 23 different music generation models.
115,000 labels were generated using the qwen3-omni model via a dedicated data-labeling pipeline.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is unknown, which makes suitability harder to assess before download.
Last updated 2026-03-03 15:48:05; freshness should be verified.
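Since column-level documentation is absent, a first pass after download could simply summarize each field's observed types, missing values, and a few example values. The sketch below assumes the rows have already been loaded as a list of dictionaries; the field names in the toy records (`prompt_text`, `lyrics`, `winner`, `score`) are hypothetical and are not known to exist in the dataset.

```python
from collections import Counter, defaultdict


def summarize_fields(records, max_examples=3):
    """Summarize observed types, missing counts, and examples per field."""
    all_keys = set()
    for record in records:
        all_keys.update(record)

    summary = defaultdict(
        lambda: {"types": Counter(), "missing": 0, "examples": []}
    )
    for record in records:
        for key in all_keys:
            info = summary[key]
            value = record.get(key)
            if value is None:
                # Absent keys and explicit nulls are both counted as missing.
                info["missing"] += 1
                continue
            info["types"][type(value).__name__] += 1
            if len(info["examples"]) < max_examples and value not in info["examples"]:
                info["examples"].append(value)
    return dict(summary)


# Hypothetical records for illustration only.
toy_records = [
    {"prompt_text": "upbeat synth pop", "lyrics": None, "winner": "a"},
    {"prompt_text": "slow piano ballad", "lyrics": "la la", "winner": "b", "score": 0.5},
]

for field, info in sorted(summarize_fields(toy_records).items()):
    print(field, dict(info["types"]), "missing:", info["missing"], info["examples"])
```

A summary like this does not recover field semantics on its own, but it narrows down which columns hold prompts, labels, or audio references before any modeling starts.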
Provenance
Source
Hugging Face
Collection Method
Generations from 23 music models, with labels from an AI pipeline using qwen3-omni.
Freshness
2026-03-03 15:48:05
License restrictions are unknown and should be verified before use.