SPGISpeech 2.0: Financial Domain Speech Transcription Dataset

Name: SPGISpeech 2.0: Financial Domain Speech Transcription Dataset
Creator: kensho
Published: 2026-04-20T15:32:58
Keywords: Financial Domain, Speaker Tagged, Audio, Finance, Speech Recognition, Multimodal

Description

SPGISpeech 2.0 is a dataset created by Kensho for speaker-tagged transcription in the financial domain. It pairs audio snippets with fully formatted text transcriptions, making it suitable for training end-to-end automatic speech recognition (ASR) models. Compared with the original SPGISpeech dataset, it supports a more diverse set of modeling tasks while keeping that dataset's core characteristics.

Use Cases

Training end-to-end automatic speech recognition (ASR) models on paired audio snippets and transcriptions
Developing speaker-tagged transcription systems
Building domain-specific speech models for financial applications
Exploring a more diverse set of modeling tasks than the original SPGISpeech dataset supported

Strengths

Contains audio snippets and corresponding fully formatted text transcriptions, usable for end-to-end ASR
Specifically designed for speaker-tagged transcription in the financial domain
Improves the diversity of applicable modeling tasks compared to the original SPGISpeech dataset

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which makes it harder to assess suitability before download
Description metadata is limited; actual data quality requires manual inspection after download
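
Because column documentation and row count are both missing, a first pass after download usually means counting records and looking at what each field holds. The following is a minimal sketch of that inspection, assuming the data can be read as an iterable of dictionaries. The helper `summarize_records` and the field names `audio_path`, `speaker`, and `transcript` are hypothetical illustrations, not the dataset's actual schema.

```python
from collections import Counter, defaultdict
from typing import Any, Iterable, Mapping


def summarize_records(records: Iterable[Mapping[str, Any]]) -> dict:
    """Count rows and collect, per field, the observed value types and one example."""
    row_count = 0
    type_counts = defaultdict(Counter)  # field name -> Counter of type names
    examples = {}                       # field name -> first value seen
    for record in records:
        row_count += 1
        for field, value in record.items():
            type_counts[field][type(value).__name__] += 1
            examples.setdefault(field, value)
    fields = {
        name: {
            "types": dict(counts),
            "present_in": sum(counts.values()),  # rows that contain this field
            "example": examples[name],
        }
        for name, counts in type_counts.items()
    }
    return {"row_count": row_count, "fields": fields}


# Hypothetical records: the real field names are not documented on this page.
sample = [
    {"audio_path": "clip_0001.wav", "speaker": "spk_1", "transcript": "Revenue grew 4% year over year."},
    {"audio_path": "clip_0002.wav", "speaker": "spk_2", "transcript": "Thank you. Next question, please."},
    {"audio_path": "clip_0003.wav", "speaker": None, "transcript": "Operating margin was 21.5%."},
]

summary = summarize_records(sample)
print(summary["row_count"])
for name, info in summary["fields"].items():
    print(name, info["types"], info["present_in"])
```

A field that shows more than one type, such as the missing speaker tag in the third sample record, is a prompt for the manual quality inspection mentioned above.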

Provenance

Source: Kensho
Freshness: Last updated 2026-04-28 15:45:25; freshness should be verified

Quality Score

D36

Description: 39
Source: 36
Reputation: 40
Access: 22

Community

37 downloads
1 like
0 views

Dataset Info

Author: kensho
Created: Apr 20, 2026
Updated: Apr 28, 2026
Last synced: Jul 23, 2026
