A collection of synthetic phishing emails, likely intended for training and evaluating detection models. The dataset is hosted on Kaggle, and its platform tags suggest it contains text content for classification tasks. Volume, creation date, and authorship are not provided in the available metadata.
Use Cases
- Training a text classifier to distinguish phishing from legitimate emails (inferred from domain, verify after download)
- Benchmarking NLP models on synthetic security threat data (inferred from domain, verify after download)
- Studying linguistic patterns and features in phishing attempts (inferred from domain, verify after download)
Strengths
- Published on Kaggle, a platform with established data sharing and versioning.
- Platform tags indicate a clear focus on cybersecurity and text classification.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Row count, file formats, and column definitions are unknown.
- License and authorship details are absent, which may affect usage rights.
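Because the row count and column definitions are unknown, a first pass after download could simply report what the file contains. The following is a minimal sketch using only the Python standard library. It assumes the dataset ships as a single comma-separated file with a header row, which the metadata does not confirm; the function name `summarize_csv` is hypothetical and not part of the dataset.

```python
import csv
from collections import Counter


def summarize_csv(path, encoding="utf-8"):
    """Return column names, row count, and per-column empty-cell counts.

    Assumption: the file is comma-separated with a header row. The dataset's
    actual format is not documented, so adjust the reader if this fails.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        empty_cells = Counter()
        row_count = 0
        for row in reader:
            row_count += 1
            for name in columns:
                # Fields missing from a short row come back as None;
                # count those as empty alongside blank strings.
                if not (row.get(name) or "").strip():
                    empty_cells[name] += 1
    return {
        "columns": columns,
        "rows": row_count,
        "empty_cells": {name: empty_cells[name] for name in columns},
    }
```

Calling `summarize_csv("phishing_emails.csv")` (a placeholder filename) would return the header names, the number of data rows, and how many cells are blank in each column, which is enough to confirm whether a text column and a label column are actually present before any modeling.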
Provenance
- Source: Kaggle
- Collection Method: Synthetically generated, as indicated by the title.
- Time Range: Not specified
- Freshness: Last updated date is unknown; freshness unverified.
- Geography: Not specified