This dataset is a preprocessed version of the Spanish Wikipedia dump from May 2026, totaling 8.4 GB. It was created by raj2708 and contains articles converted from wikitext to plain text. It is intended for large language model pretraining.
The dataset is licensed under CC BY-SA 4.0, which requires attribution and requires derivatives to be distributed under the same license.
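
The description says articles were parsed from wikitext to plain text but does not say how. As a rough illustration of what such a step can involve, here is a minimal sketch using only Python's standard `re` module. The function name `strip_wikitext` and the specific rules are assumptions made for this example, not the dataset author's actual pipeline, and real dumps contain markup (tables, files, parser functions) that this sketch ignores.

```python
import re


def strip_wikitext(text: str) -> str:
    """Very rough wikitext-to-plain-text conversion (illustrative only).

    Hypothetical helper: not the method used to build the dataset.
    """
    # Drop HTML comments and <ref> footnotes (self-closing ones first).
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)
    text = re.sub(r"<ref[^>/]*/>", "", text)
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)

    # Remove templates such as {{cita requerida}}; repeat so that nested
    # templates are peeled off from the innermost outwards.
    template = re.compile(r"\{\{[^{}]*\}\}")
    while template.search(text):
        text = template.sub("", text)

    # [[target|label]] -> label, [[target]] -> target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)

    # Strip bold/italic quote markup and heading markers (== Title ==).
    text = re.sub(r"'{2,}", "", text)
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text,
                  flags=re.MULTILINE)

    # Collapse runs of spaces or tabs left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    sample = (
        "'''Madrid''' es la [[capital (política)|capital]] de [[España]]."
        "{{cita requerida}}<ref>Fuente</ref>"
    )
    print(strip_wikitext(sample))
```

Regular expressions are used here only to keep the sketch dependency-free. For a full dump, a dedicated wikitext parser is usually a better fit than hand-written patterns.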