A multimodal dataset of 464 earnings conference calls from S&P 500 companies, with audio recordings aligned to text transcripts at the sentence level. Each call is paired with stock volatility labels, so the disclosures can be used to model market risk following the call.
Use Cases
- Train multimodal risk prediction models using aligned audio features and transcript text
- Analyze executive sentiment by correlating vocal pitch and energy with specific transcript sentences
- Evaluate speech recognition performance on domain-specific financial terminology and corporate jargon
- Study the impact of Q&A dynamics on market volatility using the segmented call structure
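As a rough illustration of the sentiment use case, the sketch below computes one simple vocal feature, RMS energy, for each aligned sentence. The `AlignedSentence` record and its fields are hypothetical; the dataset's actual file layout and field names are not described here, so treat this as a sketch under those assumptions rather than the dataset's loading code.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AlignedSentence:
    """Hypothetical record: one transcript sentence plus its audio samples."""
    text: str
    samples: List[float]  # assumed mono waveform, values in [-1.0, 1.0]


def rms_energy(samples: List[float]) -> float:
    """Root-mean-square energy of a waveform; 0.0 for an empty segment."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sentence_energies(sentences: List[AlignedSentence]) -> List[Tuple[str, float]]:
    """Pair each sentence's text with the RMS energy of its audio segment."""
    return [(s.text, rms_energy(s.samples)) for s in sentences]


if __name__ == "__main__":
    # Toy data standing in for real aligned segments.
    call = [
        AlignedSentence("Revenue grew this quarter.", [0.5, -0.5, 0.5, -0.5]),
        AlignedSentence("We expect headwinds.", [0.1, -0.1, 0.1, -0.1]),
    ]
    for text, energy in sentence_energies(call):
        print(f"{energy:.3f}  {text}")
```

Pitch extraction would need a signal-processing library and is left out; the same per-sentence loop applies once a pitch estimator is available.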
Strengths
- 464 earnings conference calls from S&P 500 firms with paired audio and text
- Sentence-level alignment between audio segments and transcript sentences
- Stock volatility labels calculated from post-event market performance
- Includes distinct segments for formal presentations and interactive Q&A sessions
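The exact formula and time window behind the volatility labels are not stated here. As a minimal sketch, one common definition is the natural log of the standard deviation of daily returns over a window of trading days after the call. The function name, the price input, and the window length are all assumptions for illustration.

```python
import math
from typing import List


def log_volatility(prices: List[float]) -> float:
    """Log of the standard deviation of simple daily returns.

    `prices` is an assumed list of daily closing prices, starting on the
    call date and covering the chosen post-call window.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices (two returns)")
    # Simple daily returns: r_i = P_i / P_{i-1} - 1
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    mean = sum(returns) / len(returns)
    # Population variance of the returns over the window.
    variance = sum((r - mean) ** 2 for r in returns) / len(returns)
    if variance == 0.0:
        raise ValueError("zero variance: log volatility is undefined")
    return math.log(math.sqrt(variance))


if __name__ == "__main__":
    # Toy closing prices for a hypothetical 3-day window.
    print(log_volatility([100.0, 110.0, 99.0]))
```

A longer window smooths out single-day moves, while a shorter one isolates the immediate reaction to the call, so labels computed over different windows are not directly comparable.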