The Enhanced Breast Cancer Diagnostic Dataset contains about 5.5 thousand samples for breast cancer prediction. According to its description, it includes engineered features. It is hosted on Kaggle, but the original author and organization are not identified.
Use Cases
- Train binary classification models for cancer prediction based on engineered features.
- Benchmark feature engineering techniques for medical tabular data.
- Develop educational tutorials on medical machine learning using a publicly available dataset.
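As an illustration of the first use case, the following is a minimal sketch of a baseline binary classifier built with scikit-learn. Because the dataset's column names and label encoding are undocumented, the sketch runs on synthetic stand-in data from `make_classification`; the function name `fit_baseline` and all of its parameters are assumptions made for illustration, not part of the dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_baseline(X, y, seed=0):
    """Fit a scaled logistic regression and return (model, held-out ROC AUC).

    Hypothetical helper: X is a numeric feature matrix and y a 0/1 label
    vector. Neither layout is confirmed by the dataset description.
    """
    # Stratify so both classes appear in the train and test splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Scaling inside the pipeline keeps test data out of the scaler fit.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    return model, roc_auc_score(y_test, scores)


# Synthetic stand-in data, NOT the real dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model, auc = fit_baseline(X, y)
print(f"held-out ROC AUC: {auc:.3f}")
```

With the real file, `X` and `y` would come from the downloaded table once the label column has been identified by inspection. A linear baseline like this is only a starting point for comparing feature engineering approaches.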
Strengths
- Dataset contains about 5.5 thousand samples, a workable size for training and evaluating tabular classification models.
- Description mentions engineered features, so some feature construction may already be done, though the specific transformations are not documented.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Exact row count is not documented; the figure of about 5.5 thousand samples should be confirmed after download.
- Last update date is unknown; freshness unverified.
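Since the columns are undocumented, a first pass over the downloaded file can at least recover the column names, the row count, and the number of missing values per column. The following is a minimal sketch using only the Python standard library. It assumes the data ships as a CSV file with a header row, which the description does not confirm; `summarize_csv` and the toy column names are hypothetical.

```python
import csv
import io


def summarize_csv(handle):
    """Return row count, column names, and per-column empty-cell counts.

    Hypothetical helper: `handle` is any open text file object whose
    first line is a CSV header.
    """
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": row_count, "columns": columns, "missing": missing}


# Toy in-memory CSV, NOT the real dataset; column names are invented.
sample = io.StringIO("feature_a,feature_b,label\n1.2,0.4,1\n0.7,,0\n")
summary = summarize_csv(sample)
print(summary)
```

For the real file, pass a handle opened with `open(path, newline="")`, where `path` is wherever the download was saved.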