This dataset contains RT's Russian-language news headlines on the Israeli-Palestinian conflict, published from October 7, 2023, to January 19, 2025. It includes 8,757 distinct headlines filtered by conflict-related keywords and annotated for grammatical case using an LLM, with a human-reviewed validation sample. The author, Tingting Lu, created the dataset to support a Keymorph Analysis study of narrative orientations in media coverage.
Use Cases
- Conducting Keymorph Analysis based on grammatical case annotations to identify narrative framing.
- Studying media representation of geopolitical conflicts based on a corpus of Russian-language headlines.
- Training or evaluating NLP models for morphologically rich languages based on LLM-assisted annotations.
- Analyzing statistical distributions of grammatical cases using the derived analytical data, such as Pearson residuals.
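As an illustration of the last use case, the following is a minimal sketch of how Pearson residuals can be computed from a keyword-by-case contingency table. The counts, the keyword rows, and the function name `pearson_residuals` are hypothetical and are not taken from the dataset; the dataset's own column layout must be inspected after download.

```python
from math import sqrt

def pearson_residuals(table):
    """Return Pearson residuals, (observed - expected) / sqrt(expected),
    for a contingency table given as a list of rows of counts.
    Assumes every row and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of keyword and case
            expected = row_totals[i] * col_totals[j] / grand_total
            residual_row.append((observed - expected) / sqrt(expected))
        residuals.append(residual_row)
    return residuals

# Hypothetical counts: rows are two keywords, columns are three cases
# (nominative, genitive, accusative). Not real data from the corpus.
counts = [
    [30, 10, 10],
    [10, 20, 20],
]
residuals = pearson_residuals(counts)
for row in residuals:
    print([round(value, 3) for value in row])
```

A positive residual indicates that a keyword appears in a case more often than independence would predict, and a negative residual indicates the opposite.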
Strengths
- 8,757 distinct headlines provide a substantial text corpus for analysis.
- Annotations were generated with LLM assistance, and a 20% sample was human-reviewed and corrected for quality assurance.
- Derived analytical data includes statistical summaries, standardized residual values, and log-likelihood ratios.
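For readers unfamiliar with the log-likelihood ratio mentioned above, here is a minimal sketch of the G² statistic for a 2×2 table. It assumes the common corpus-linguistics setup (a target keyword versus all other words, a given case versus all other cases); whether the dataset's values were computed exactly this way is not documented, and the function name and counts are hypothetical.

```python
from math import log

def log_likelihood_ratio(a, b, c, d):
    """G2 statistic for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, not from the dataset documentation):
      a = target keyword in the case of interest
      b = target keyword in all other cases
      c = other words in the case of interest
      d = other words in all other cases
    """
    total = a + b + c + d
    observed = [a, b, c, d]
    # Expected counts under independence of word and case
    expected = [
        (a + b) * (a + c) / total,
        (a + b) * (b + d) / total,
        (c + d) * (a + c) / total,
        (c + d) * (b + d) / total,
    ]
    # Cells with zero observed count contribute nothing to the sum
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical counts, not real data from the corpus
g2_balanced = log_likelihood_ratio(10, 10, 10, 10)
g2_skewed = log_likelihood_ratio(30, 20, 10, 40)
print(g2_balanced, g2_skewed)
```

Larger values indicate a stronger departure from independence; the statistic does not show the direction of the association, so it is usually read together with the residuals.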
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The dataset is sourced solely from RT, which may reflect a specific editorial perspective inherent to the source.
Provenance
- Source: Headlines collected from RT's official Russian-language news website.
- Collection Method: Headlines were filtered for keywords, and grammatical case annotations were generated using the ChatGPT-5 mini API.
- Time Range: 2023-10-07 to 2025-01-19
- Freshness: Last updated 2026-05-17 10:10:13.