Phishing_site_urls is a dataset published on Kaggle, a platform for data science competitions and projects. Judging by its name, it likely contains website URLs labeled as phishing or legitimate. The available metadata does not state the number of URLs, the collection timeframe, or the original author.
Use Cases
- Train a binary classifier to distinguish phishing URLs from legitimate ones (inferred from domain, verify after download)
- Engineer features for URL-based threat intelligence systems (inferred from domain, verify after download)
- Benchmark web security and anti-phishing algorithms (inferred from domain, verify after download)
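As an illustration of the feature-engineering use case, the sketch below derives a few simple lexical features from a raw URL string using only the Python standard library. The function name `url_features` and the chosen features are assumptions made for illustration; they are not taken from the dataset or its documentation, and whether they are useful depends on what the downloaded file actually contains.

```python
import re
from urllib.parse import urlsplit

# Matches a leading scheme such as "http://" or "ftp://".
_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")

def url_features(url: str) -> dict:
    """Return a few lexical features for one URL (hypothetical feature set)."""
    # urlsplit only recognizes the host when a scheme or a leading "//" is
    # present, so bare strings such as "example.com/a" get "//" prepended.
    candidate = url if _SCHEME.match(url) else "//" + url
    try:
        parts = urlsplit(candidate)
        host = parts.hostname or ""
        path = parts.path
    except ValueError:
        # Malformed input (for example, unbalanced IPv6 brackets).
        host, path = "", ""
    return {
        "length": len(url),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at": "@" in url,
        # Rough check for a dotted IPv4 host; octet ranges are not validated.
        "host_is_ip": host.count(".") == 3 and host.replace(".", "").isdigit(),
        "path_depth": len([seg for seg in path.split("/") if seg]),
    }
```

Calling `url_features` on each URL in the file would yield one feature dictionary per row, which could then feed a classifier once the label column has been identified.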
Strengths
- Published on Kaggle, a major platform for data science resources.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so the dataset's suitability for a given task cannot be assessed before download.
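The gaps listed above can be closed with a quick inspection once the file is downloaded. The following is a minimal sketch that assumes the dataset ships as a CSV file with a header row; the helper name `summarize_csv` and its `label_column` argument are hypothetical, and the real file name and column names must be checked against the actual download.

```python
import csv
from collections import Counter

def summarize_csv(path, label_column=None):
    """Report column names, row count, and (optionally) label frequencies."""
    # errors="replace" keeps the scan going if some URLs are not valid UTF-8.
    with open(path, newline="", encoding="utf-8", errors="replace") as fh:
        reader = csv.DictReader(fh)
        columns = list(reader.fieldnames or [])
        row_count = 0
        labels = Counter()
        for row in reader:
            row_count += 1
            if label_column is not None:
                # label_column is an assumption; pass the real name after
                # looking at the header returned in "columns".
                labels[row.get(label_column)] += 1
    return {
        "columns": columns,
        "row_count": row_count,
        "label_counts": dict(labels),
    }
```

A reasonable workflow is to call the function once without `label_column` to read the header, then call it again with the column that appears to hold the labels to check class balance.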