Description
Speeches extracted from government institutions of the US, UK, and Canada. Each speech is a single partition and contains metadata such as speaker name, date, country, and source URL. The dataset was created by author 'hazylavender' for an ICML 2025 paper and was last updated on March 24, 2026.
Use Cases
Evaluate federated learning algorithms, treating each speech as one client partition.
Analyze political speech patterns by speaker, date, and country.
Generate synthetic text data for privacy-preserving research, in line with the dataset's stated purpose.
Compare government rhetoric across the US, UK, and Canada.
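A minimal sketch of the first two use cases, assuming each speech can be loaded as a record with speaker, date, country, URL, and text fields. The field names, the `Speech` class, both helper functions, and the toy records are hypothetical; the dataset's actual column names are undocumented and must be checked after download.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Speech:
    # Field names are assumptions; the dataset's real columns are undocumented.
    speaker: str
    date: str
    country: str
    url: str
    text: str


def to_client_partitions(speeches):
    """Assign each speech to its own client, one partition per speech."""
    return {f"client_{i}": speech for i, speech in enumerate(speeches)}


def clients_by_country(partitions):
    """Group client ids by the country recorded in each speech's metadata."""
    groups = defaultdict(list)
    for client_id, speech in partitions.items():
        groups[speech.country].append(client_id)
    return dict(groups)


# Invented placeholder records, not taken from the dataset.
toy_speeches = [
    Speech("Speaker A", "2020-01-01", "US", "https://example.com/a", "placeholder text one"),
    Speech("Speaker B", "2020-01-02", "UK", "https://example.com/b", "placeholder text two"),
    Speech("Speaker C", "2020-01-03", "US", "https://example.com/c", "placeholder text three"),
]

partitions = to_client_partitions(toy_speeches)
by_country = clients_by_country(partitions)
print(len(partitions), by_country)
```

Keeping the per-speech mapping separate from the country grouping lets a federated experiment sample clients either uniformly or stratified by country without re-reading the data.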
Strengths
Designed for a specific research purpose, linked to an ICML 2025 paper.
Includes structured metadata (speaker, date, country, URL) for each speech.
Explicitly structured for federated learning evaluation with speeches as partitions.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, so the dataset's size and suitability cannot be judged before download.
Provenance
Source
Government institutions of the US, UK, and Canada.
Collection Method
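Speeches were extracted from government sources, with the source URL recorded in each speech's metadata; the description gives no further detail on the extraction process.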
Freshness
Last updated 2026-03-24 18:29:53 (time zone not stated); verify freshness before relying on the data.
Geography
United States, United Kingdom, Canada
License is unknown; terms of use must be verified before the data is used.