SpeakerCard-1M is a speaker-centric corpus built on the VoxCeleb1 and VoxCeleb2 datasets. It was created by JYP2024 using a tool-first, LLM-last pipeline where ten acoustic probes extract evidence for a structured schema. The dataset was last updated on June 3, 2026.
Use Cases
- Train speaker verification models based on structured speaker traits like gender and accent.
- Analyze correlations between acoustic probes and speaker states like emotion and speaking rate.
- Benchmark multi-factor speaker characterization systems against the schema, which separates stable traits from utterance-level states.
- Develop constrained language models for verbalizing acoustic evidence into structured fields.
Strengths
- Built on the widely used VoxCeleb1 and VoxCeleb2 datasets, giving it a familiar foundation.
- Implements a structured schema separating stable speaker traits from utterance-level states.
- Employs a defined pipeline with ten acoustic probes for evidence extraction.
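The split between stable speaker traits and utterance-level states can be pictured with a minimal Python sketch. The class names, field names, units, and sample values below are all hypothetical; the card names gender, accent, emotion, and speaking rate only as examples and does not publish the real schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class SpeakerTraits:
    """Stable attributes, assumed constant across a speaker's utterances."""
    gender: Optional[str] = None   # hypothetical field name
    accent: Optional[str] = None   # hypothetical field name


@dataclass
class UtteranceState:
    """Attributes that may change from one utterance to the next."""
    utterance_id: str
    emotion: Optional[str] = None          # hypothetical field name
    speaking_rate: Optional[float] = None  # unit assumed: syllables per second


@dataclass
class SpeakerCard:
    """One speaker: a single traits record plus many per-utterance states."""
    speaker_id: str
    traits: SpeakerTraits = field(default_factory=SpeakerTraits)
    states: List[UtteranceState] = field(default_factory=list)


def example_card() -> SpeakerCard:
    # Illustrative values only; not taken from the corpus.
    card = SpeakerCard(
        speaker_id="spk_example",
        traits=SpeakerTraits(gender="female", accent="Irish"),
    )
    card.states.append(UtteranceState("utt_0", emotion="neutral", speaking_rate=4.2))
    card.states.append(UtteranceState("utt_1", emotion="happy", speaking_rate=5.0))
    return card
```

Keeping traits on the speaker record and states in a per-utterance list means a trait is stored once, while state labels can vary freely; `asdict(example_card())` yields a nested dictionary suitable for JSON export.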
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported, which makes it harder to judge whether the corpus suits a given task.
- Description metadata is limited; actual data quality requires manual inspection after download.
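Because neither column documentation nor a row count is available, a first pass after download might simply tally what each field contains. The sketch below assumes the records have already been loaded as a list of dictionaries; the actual file format is not stated in this card, and `summarize_fields` is a hypothetical helper, not part of any dataset tooling.

```python
from collections import defaultdict


def summarize_fields(records, max_examples=3):
    """Collect observed types, non-null counts, and sample values per field."""
    summary = defaultdict(lambda: {"types": set(), "non_null": 0, "examples": []})
    for rec in records:
        for key, value in rec.items():
            info = summary[key]  # registers the field even if the value is None
            if value is None:
                continue
            info["types"].add(type(value).__name__)
            info["non_null"] += 1
            if len(info["examples"]) < max_examples and value not in info["examples"]:
                info["examples"].append(value)
    return dict(summary)


if __name__ == "__main__":
    # Toy rows standing in for downloaded records; keys are hypothetical.
    rows = [
        {"speaker_id": "spk_a", "emotion": "neutral", "speaking_rate": 4.2},
        {"speaker_id": "spk_b", "emotion": None, "speaking_rate": 5.0},
    ]
    print("rows:", len(rows))
    for name, info in summarize_fields(rows).items():
        print(name, sorted(info["types"]), info["non_null"], info["examples"])
```

The row count falls out of `len(records)`, and fields with few non-null values or mixed types are natural candidates for closer manual inspection.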
Provenance
- Source: VoxCeleb1 and VoxCeleb2 datasets.
- Collection Method: Created via a tool-first, LLM-last pipeline with acoustic probes and a constrained LLM.
- Time Range: Not specified.
- Freshness: Last updated 2026-06-03 11:22:43; freshness should be verified.
- Geography: Not specified.
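One way to act on the freshness note is to compute the age of the last-updated timestamp, read here as 2026-06-03 11:22:43. The sketch below is a hedged illustration: the card does not state a time zone, so UTC is assumed, and the 365-day staleness threshold is an arbitrary placeholder.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def age_in_days(last_updated: str, now: datetime) -> float:
    """Days elapsed between the recorded update time and `now`."""
    # Assumption: the timestamp is in UTC; the card gives no time zone.
    stamp = datetime.strptime(last_updated, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return (now - stamp).total_seconds() / 86400


def is_stale(last_updated: str, now: datetime, max_age_days: float = 365.0) -> bool:
    """True when the dataset is older than the (arbitrary) threshold."""
    return age_in_days(last_updated, now) > max_age_days


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(f"age: {age_in_days('2026-06-03 11:22:43', now):.1f} days")
    print("stale:", is_stale("2026-06-03 11:22:43", now))
```

Passing `now` explicitly keeps the functions deterministic and easy to test; a negative age would indicate that the recorded timestamp lies in the future relative to the clock used.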