SPGISpeech 2.0 is a dataset for speaker-tagged transcription in the financial domain, created by Kensho. It contains audio snippets and their corresponding fully formatted text transcriptions, suitable for end-to-end automatic speech recognition (ASR). Compared with the original SPGISpeech dataset, it supports a more diverse set of modeling tasks while keeping that dataset's core characteristics.
Use Cases
- Training end-to-end automatic speech recognition (ASR) models on the paired audio snippets and transcriptions
- Developing speaker-tagged transcription systems, the task the dataset is designed for
- Building domain-specific speech models for financial applications
- Exploring modeling tasks beyond those supported by the original SPGISpeech dataset
Strengths
- Contains audio snippets and corresponding fully formatted text transcriptions, usable for end-to-end ASR
- Specifically designed for speaker-tagged transcription in the financial domain
- Improves the diversity of applicable modeling tasks compared to the original SPGISpeech dataset
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- Row count is not stated, so it is hard to judge in advance whether the dataset is large enough for a given task
- Description metadata is limited; actual data quality requires manual inspection after download
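Since column documentation and row count are both missing, a first inspection pass after download might look like the minimal sketch below. It assumes the rows have already been loaded as Python dictionaries; the function name `summarize_records` and the field names `audio_path`, `transcript`, and `speaker` are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import Counter, defaultdict


def summarize_records(records):
    """Count rows and, per field, tally observed value types and missing values."""
    type_names = defaultdict(Counter)  # field -> Counter of Python type names
    missing = Counter()                # field -> number of None or empty-string values
    row_count = 0

    for record in records:
        row_count += 1
        for field, value in record.items():
            if value is None or value == "":
                missing[field] += 1
            else:
                type_names[field][type(value).__name__] += 1

    fields = sorted(set(type_names) | set(missing))
    return {
        "row_count": row_count,
        "fields": {
            name: {"types": dict(type_names[name]), "missing": missing[name]}
            for name in fields
        },
    }


# Hypothetical rows; the real field names must be read from the downloaded data.
sample = [
    {"audio_path": "a.wav", "transcript": "Good morning, everyone.", "speaker": "1"},
    {"audio_path": "b.wav", "transcript": "", "speaker": None},
]

summary = summarize_records(sample)
print("rows:", summary["row_count"])
for name, info in summary["fields"].items():
    print(name, info)
```

A summary like this only reveals field names, value types, and gaps. Judging transcription or speaker-tag quality still requires listening to a sample of the audio against its text.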
Provenance
- Source: Kensho
- Freshness: last updated 2026-04-28 15:45:25; freshness should be verified