Description
Over 37 hours of synchronized multimodal data for singing-driven 3D head motion, featuring motion subtitles and acoustic descriptions. The dataset, named SingMoSub, was created by ZikaiHuang and was last updated on March 1, 2026. It provides temporally aligned, region-level motion annotations for modeling expressive head and facial dynamics.
Use Cases
Train models for singing-driven 3D head motion generation based on synchronized audio and motion sequences.
Develop region-level facial animation controls based on the provided motion subtitle annotations.
Research multimodal alignment between audio features and expressive facial dynamics in singing scenarios.
Create systems for generating acoustic descriptions of motion based on the provided annotations.
Strengths
Over 37 hours of synchronized multimodal data, providing substantial material for training.
Presented as the first dataset to pair motion subtitles with singing-driven 3D head motion.
Provides temporally aligned, region-level motion annotations for detailed modeling.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it hard to judge whether the dataset suits a specific training need.
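Because field semantics have to be inferred after download, a first pass over the records can at least show which fields exist and what value types they hold. The following is a minimal sketch in plain Python, not the dataset's actual schema: the helper name summarize_fields and the sample field names (clip_id, duration_s, motion_subtitle) are hypothetical placeholders, and the records are assumed to be already loaded as a list of dictionaries.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical records: these field names and values are invented for
# illustration and are NOT taken from the SingMoSub schema.
sample = [
    {"clip_id": "a", "duration_s": 3.2, "motion_subtitle": "head tilts left"},
    {"clip_id": "b", "duration_s": 4, "motion_subtitle": None},
]

for name, types in summarize_fields(sample).items():
    print(f"{name}: {', '.join(types)}")
```

A field that reports more than one type, or that includes NoneType, is a good candidate for closer inspection before it is used as a model input.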
Provenance
Source
ZikaiHuang on Hugging Face
Collection Method
Likely involves motion capture and audio recording of singing performances, with subsequent annotation.
Time Range
Not specified
Freshness
Last updated 2026-03-01 12:07:12; freshness should be verified.
Geography
Not specified
License
Unknown; users must verify the license terms on the dataset page before use.