Available on 1 platform
The dataset contains 20,000 text samples compiled from three sources: Wikipedia, Project Gutenberg, and CNN/DailyMail. It was created by author 'brograrnmer' and last updated on May 5, 2026. Preprocessing consisted of regex cleaning that replaced certain patterns with whitespace.
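The regex cleaning step described above can be sketched as follows. This is a minimal illustration only: the listing does not say which patterns were replaced, so the three patterns in `PATTERNS` (HTML-like tags, URLs, bracketed citation markers) are assumptions chosen for demonstration, and the names `clean_text` and `normalize_whitespace` are hypothetical helpers, not part of the dataset.

```python
import re

# Hypothetical patterns: the dataset listing does not specify which
# patterns were actually replaced during preprocessing.
PATTERNS = [
    re.compile(r"<[^>]+>"),       # HTML-like tags
    re.compile(r"https?://\S+"),  # URLs
    re.compile(r"\[\d+\]"),       # bracketed citation markers such as [12]
]


def clean_text(text: str) -> str:
    """Replace every match of each pattern with a single space."""
    for pattern in PATTERNS:
        text = pattern.sub(" ", text)
    return text


def normalize_whitespace(text: str) -> str:
    """Optional extra step (an assumption, not stated in the listing):
    collapse runs of whitespace left behind by the replacements."""
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    sample = "See <b>this</b> [1]"
    print(repr(clean_text(sample)))
    print(repr(normalize_whitespace(clean_text(sample))))
```

Replacing matches with a space rather than deleting them keeps neighboring words from being fused together, at the cost of leaving runs of whitespace unless a normalization pass like the optional one above is applied.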
License information is unknown, which may restrict commercial use or redistribution.