PhiUSIIL_Phishing_URL_Dataset is a collection of phishing-related URLs, likely assembled for security research. It is hosted on Kaggle, but its size, features, and provenance are not specified, so its content and structure should be verified after download.
Use Cases
- Train a classifier to distinguish phishing URLs from legitimate ones (inferred from domain, verify after download)
- Analyze lexical patterns and features common in malicious web addresses (inferred from domain, verify after download)
- Benchmark the performance of new URL-based threat detection algorithms (inferred from domain, verify after download)
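As a rough illustration of the lexical-analysis use case, the sketch below computes a few simple features directly from a raw URL string using only the Python standard library. The function name `lexical_features` and the particular features chosen are assumptions made for illustration; they are not taken from the dataset, whose actual columns are undocumented and must be checked after download.

```python
from urllib.parse import urlsplit


def lexical_features(url):
    """Compute a few simple lexical features from a URL string.

    Hypothetical helper: the feature set is illustrative and is not
    derived from the dataset's (unverified) columns.
    """
    # urlsplit only recognizes the host when a scheme or a leading "//"
    # is present, so prefix bare URLs such as "example.com/path".
    parts = urlsplit(url if "://" in url else "//" + url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
    }


features = lexical_features("https://login.example.com/a1")
```

Features like these could serve as inputs to a classifier or as a baseline when benchmarking, but whether they are useful for this dataset can only be judged once its contents have been inspected.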
Strengths
- Published on Kaggle, a platform for sharing data science resources.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so suitability for a given task cannot be assessed before download.
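Since the row count and column names are unknown, a first step after download is to read them off the file itself. The following is a minimal sketch that assumes the dataset ships as a UTF-8 CSV file with a header row; the file format is not confirmed by the available metadata, and both the function name `summarize_csv` and the path in the usage comment are hypothetical.

```python
import csv


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file.

    Assumes a UTF-8 encoded CSV whose first row is a header; neither
    assumption is confirmed for this dataset.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])           # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # count the remaining data rows
    return header, row_count


# Hypothetical usage; replace the path with the actual downloaded file:
# columns, n_rows = summarize_csv("path/to/downloaded_file.csv")
# print(columns, n_rows)
```

Reading the file row by row keeps memory use low, which matters when the size of the dataset is not known in advance.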