This dataset, dated 2026-05-05, shows how patients are distributed across the training, validation, and test sets for the five outer folds of a nested cross-validation procedure. It was authored by Alexy Tran-Dinh and published on figshare. Stratification by outcome preserves a 75%/25% survivor/non-survivor class distribution across all folds.
Use Cases
- Validate model performance stability across folds based on stratified patient distribution.
- Assess the impact of cross-validation partitioning on model training for survival prediction.
- Analyze the balance of survivor/non-survivor classes in training and test splits.
- Design reproducible machine learning experiments for clinical datasets using nested cross-validation.
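The class-balance use case above can be sketched as a small check over per-split counts. This is only an illustration: the dataset's column names are undocumented, so the field names `fold`, `split`, `survivors`, and `non_survivors` are hypothetical, the rows are synthetic values rather than figures from the dataset, and the 0.05 tolerance is an arbitrary choice.

```python
def class_balance(rows, expected=0.75, tol=0.05):
    """Return (fold, split, survivor_fraction, within_tolerance) for each row.

    `rows` is a list of dicts with hypothetical keys: fold, split,
    survivors, non_survivors. These names are assumptions, not the
    dataset's actual columns.
    """
    report = []
    for row in rows:
        total = row["survivors"] + row["non_survivors"]
        fraction = row["survivors"] / total
        within = abs(fraction - expected) <= tol
        report.append((row["fold"], row["split"], fraction, within))
    return report


# Synthetic example rows (not taken from the dataset).
rows = [
    {"fold": 1, "split": "train", "survivors": 90, "non_survivors": 30},
    {"fold": 1, "split": "test", "survivors": 31, "non_survivors": 9},
    {"fold": 2, "split": "test", "survivors": 20, "non_survivors": 20},
]

report = class_balance(rows)
for fold, split, fraction, within in report:
    status = "ok" if within else "off target"
    print(f"fold {fold} {split}: {fraction:.3f} survivors ({status})")
```

Once the real column names are known, only the dictionary keys need to change; the comparison against the expected 75% survivor share stays the same.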
Strengths
- Stratification by outcome is maintained across all folds, preserving a 75%/25% class distribution.
- The dataset is structured for a nested cross-validation procedure with five outer folds.
- It is published under a CC-BY-4.0 license, allowing for open use and sharing.
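The stratified five-fold structure described above can be illustrated with a minimal sketch using only the Python standard library. It is not the author's procedure: the labels are synthetic (150 survivors and 50 non-survivors, chosen to match the 75%/25% ratio), the number of inner folds (4) is an assumption, and only one inner fold per outer fold is used as validation, whereas a full nested procedure would iterate over all of them.

```python
import random


def stratified_folds(labels, n_folds, seed):
    """Split sample indices into n_folds lists, stratified by label."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def nested_splits(labels, n_outer=5, n_inner=4, seed=0):
    """Yield (train, val, test) index lists, one triple per outer fold."""
    outer = stratified_folds(labels, n_outer, seed)
    for k, test_idx in enumerate(outer):
        # Everything outside the held-out outer fold is re-stratified.
        rest = [i for j, fold in enumerate(outer) if j != k for i in fold]
        rest_labels = [labels[i] for i in rest]
        inner = stratified_folds(rest_labels, n_inner, seed + k + 1)
        # Simplification: use the first inner fold as the validation set.
        val_idx = [rest[i] for i in inner[0]]
        train_idx = [rest[i] for fold in inner[1:] for i in fold]
        yield train_idx, val_idx, test_idx


def survivor_fraction(labels, indices):
    """Share of samples labelled 1 (survivor) among the given indices."""
    return sum(labels[i] for i in indices) / len(indices)


# Synthetic labels: 1 = survivor, 0 = non-survivor (75% / 25%).
labels = [1] * 150 + [0] * 50
splits = list(nested_splits(labels))

for k, (train, val, test) in enumerate(splits, start=1):
    print(
        f"outer fold {k}: "
        f"train={len(train)} ({survivor_fraction(labels, train):.2f}), "
        f"val={len(val)} ({survivor_fraction(labels, val):.2f}), "
        f"test={len(test)} ({survivor_fraction(labels, test):.2f})"
    )
```

Because each class is dealt out separately, every training, validation, and test set in this synthetic example keeps the same survivor share. With class counts that do not divide evenly across folds, the shares would be close to, but not exactly, the overall ratio.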
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The row count is not reported, which makes suitability harder to assess before download.
- The dataset is 13.8 KB, indicating a very small scope.
Provenance
- Source: figshare
- Freshness: Last updated 2026-05-05 18:00:17; freshness should be verified.