This dataset contains approximately 700,000 English SMS messages for binary classification of spam and smishing. Created by notd5a and last updated in March 2026, it was iteratively refined through multiple rounds of error analysis to improve data quality for cybersecurity applications.
The dataset is licensed under CC BY-NC 4.0, which restricts use to non-commercial purposes. It can be processed with the Polars, Dask, and Hugging Face Datasets libraries.