This Kaggle dataset is a collection of English-language emails labeled as phishing attempts. It likely contains the text of each email, intended for security analysis. The available metadata does not state its size, origin, or update history.
Use Cases
- Training a binary classifier to distinguish phishing from legitimate emails (inferred from the dataset's domain; verify after download that legitimate examples are included).
- Analyzing linguistic patterns and common tactics used in phishing campaigns (inferred from the dataset's domain; verify after download).
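As a rough illustration of the linguistic-pattern use case, the sketch below counts the most frequent word tokens across a list of email texts using only the Python standard library. The sample strings and the `top_terms` helper are hypothetical and are not taken from the dataset; the real email text would have to be loaded from whichever column holds it, once that is verified after download.

```python
import re
from collections import Counter

# Hypothetical sample texts, not taken from the dataset. Replace them with
# the email body column once its name has been verified after download.
SAMPLE_EMAILS = [
    "Urgent: verify your account now to avoid suspension",
    "Your account has been locked, click here to verify",
]

def top_terms(texts, n=5, min_len=4):
    """Return the n most frequent lowercase word tokens of at least min_len characters."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if len(t) >= min_len)
    return counts.most_common(n)

if __name__ == "__main__":
    for term, count in top_terms(SAMPLE_EMAILS, n=3):
        print(term, count)
```

A length filter is only a crude stand-in for stop-word removal; a real analysis would probably use a proper stop-word list and n-grams to surface phrases such as urgency cues or credential requests.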
Strengths
- Published on Kaggle, a platform with an established data science community.
- The title explicitly states the dataset's focus on English phishing emails.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so it is unclear before download whether the dataset is large enough for a given task.
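The gaps listed above can be partly closed with a quick inspection after download. The sketch below assumes the dataset ships as a CSV file with a header row, which the metadata does not confirm; the `summarize_csv` helper and the file name in the usage example are hypothetical.

```python
import csv

def summarize_csv(path, encoding="utf-8"):
    """Return (column_names, row_count) for a CSV file with a header row.

    Assumes a CSV layout; the actual file format must be checked after download.
    """
    # newline="" lets the csv module handle line breaks inside quoted fields,
    # which are common in email bodies, so records are counted rather than lines.
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count

if __name__ == "__main__":
    # "phishing_emails.csv" is a placeholder name, not the dataset's real file name.
    columns, rows = summarize_csv("phishing_emails.csv")
    print("columns:", columns)
    print("rows:", rows)
```

If the file uses a different encoding or contains very large fields, the `encoding` argument or `csv.field_size_limit` may need adjusting.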