A dataset derived from BBC News RSS feeds that likely contains text features for topic modeling. According to its description, Non-negative Matrix Factorization (NMF) was applied to the data. The author, publishing organization, and dataset size are unknown.
Use Cases
- Train topic models based on news article text features.
- Benchmark text feature extraction algorithms using a known news corpus.
- Analyze news content trends from RSS feed data.
Strengths
- Derived from a known news source (BBC News), providing a structured text corpus.
- The description states that NMF was applied, which suggests the data is already pre-processed for topic modeling.
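For readers unfamiliar with NMF, the following is a minimal sketch of the general technique, not the procedure used to build this dataset. The card does not say which NMF variant or library was applied; this example uses the classic multiplicative update rules on a tiny, made-up document-term matrix. The vocabulary, the counts, and the function names are all hypothetical.

```python
import random

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) into w (n x k) and h (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so no entry is stuck at zero.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

# Hypothetical toy data: 4 documents x 4 terms (e.g. election, vote, match, goal).
v = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 6, 2],
]
w, h = nmf(v, k=2)

# Squared reconstruction error between v and w @ h.
approx = matmul(w, h)
error = sum((v[i][j] - approx[i][j]) ** 2 for i in range(4) for j in range(4))
```

Each row of `h` can be read as a topic (a weighting over terms) and each row of `w` as a document's mix of topics. Whether the dataset ships the factor matrices, the input features, or both is not documented.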
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so suitability for a given task cannot be assessed before download.
- Last update date is unknown, so freshness cannot be verified.
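Because the columns and row count are undocumented, a first step after download is usually to inspect them directly. The sketch below assumes the data is delivered as a CSV file, which the card does not confirm. The in-memory sample and its column names are hypothetical stand-ins for the real file.

```python
import csv
import io

def summarize_csv(handle):
    """Return (column names, number of data rows) for an open CSV text stream."""
    reader = csv.reader(handle)
    header = next(reader, [])          # first row is assumed to be the header
    row_count = sum(1 for _ in reader) # remaining rows are data
    return header, row_count

# Hypothetical stand-in for the downloaded file; the real columns are unknown.
sample = io.StringIO("title,summary\nFirst headline,Short text\nSecond headline,More text\n")
columns, rows = summarize_csv(sample)
```

With a real download, the same function can be called on a file opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the file was saved.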
Provenance
- Source: BBC News RSS feeds.
- Collection Method: Data was gathered from RSS feeds and processed using Non-negative Matrix Factorization (NMF).
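To illustrate what gathering text from an RSS feed can look like, here is a minimal sketch that pulls the title and description out of each item in an RSS 2.0 document. It is an assumption about the general approach, not the author's actual collection code. The embedded feed is a made-up sample, and which fields the dataset retained is not documented.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 sample; a real feed would be fetched from its URL.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First headline</title>
      <description>Short summary of the first story.</description>
    </item>
    <item>
      <title>Second headline</title>
      <description>Short summary of the second story.</description>
    </item>
  </channel>
</rss>"""

def extract_items(xml_text):
    """Return a list of {'title', 'description'} dicts, one per <item>."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]

items = extract_items(SAMPLE_RSS)
```

Text collected this way would still need to be turned into a non-negative document-term matrix (for example, term counts or TF-IDF weights) before NMF can be applied.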