HRM-Text is a pre-built dataset for language model pretraining, created by applying data_io cleaning scripts to raw text data. The dataset was uploaded by the organization sapientinc and was last updated on May 21, 2026. It is associated with a research paper titled 'HRM-Text: Efficient Pretraining Beyond Scaling'.
License is unknown; users should verify terms of use before downloading.