Diabetes and Validation Datasets for NCR-EBRB Model Performance Comparison
by Shucheng Feng · Updated 1mo ago
286.6 KB · 1 file
Available on 1 platform
Description
Supporting information datasets from a study by Shucheng Feng, last updated May 5, 2026. The ZIP file contains processed experimental data for diabetes diagnosis and model validation, sourced from Medical City Hospital/Al-Kindy Teaching Hospital and the UCI Machine Learning Repository. It includes training and testing splits for diabetes and validation sets for Iris, Banknote, Ecoli, and Newthyroid, all restricted to two premise attributes.
Use Cases
Benchmarking diabetes diagnosis models on the provided training and test splits, which use test proportions of 0.2, 0.25, 0.3, 0.35, and 0.4.
Validating classification algorithms on standard datasets like Iris, Banknote, Ecoli, and Newthyroid with a 0.3 test split.
Comparing model performance without feature-selection effects, since the datasets are restricted to two premise attributes.
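The benchmarking use case above can be sketched as a loop over the five test proportions. This is a minimal illustration only: the file names and layout inside the ZIP are not documented, so the sketch assumes the splits have already been loaded into memory as (features, label) pairs. The names `Row`, `majority_baseline`, `accuracy`, and `benchmark` are hypothetical, and the majority-class baseline is a stand-in, not the NCR-EBRB model from the study.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed in-memory shape: (two premise attribute values, class label).
Row = Tuple[Sequence[float], str]

# Test proportions listed on the dataset page.
TEST_PROPORTIONS = (0.2, 0.25, 0.3, 0.35, 0.4)


def majority_baseline(train: List[Row]) -> Callable[[Sequence[float]], str]:
    """Fit a placeholder classifier that always predicts the most frequent label."""
    most_common_label = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common_label


def accuracy(predict: Callable[[Sequence[float]], str], test: List[Row]) -> float:
    """Fraction of test rows whose predicted label matches the true label."""
    if not test:
        raise ValueError("test split is empty")
    hits = sum(1 for features, label in test if predict(features) == label)
    return hits / len(test)


def benchmark(
    splits: Dict[float, Tuple[List[Row], List[Row]]],
    fit: Callable[[List[Row]], Callable[[Sequence[float]], str]] = majority_baseline,
) -> Dict[float, float]:
    """Train on each split's training rows and score on its test rows."""
    return {
        proportion: accuracy(fit(train), test)
        for proportion, (train, test) in sorted(splits.items())
    }
```

Passing a different `fit` function swaps in another model while keeping the splits fixed, so results stay comparable across the proportions in `TEST_PROPORTIONS`.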
Strengths
Includes multiple pre-defined training and testing splits for diabetes data (test proportions: 0.2, 0.25, 0.3, 0.35, 0.4).
Contains validation datasets for four established classification problems (Iris, Banknote, Ecoli, Newthyroid).
According to the description, every dataset is restricted to two attributes so that feature selection does not influence the comparison.
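To make the two-attribute restriction and the fixed test proportion concrete, here is a small sketch in plain Python. It is an assumption-laden illustration: the page does not say which two attributes were kept or how the author produced the splits, so the function `restrict_and_split`, its `keep` indices, and the seeded shuffle are all hypothetical choices.

```python
import random
from typing import List, Sequence, Tuple

# Assumed in-memory shape: (attribute values, class label).
Row = Tuple[Sequence[float], str]


def restrict_and_split(
    rows: List[Row],
    keep: Tuple[int, int],
    test_proportion: float,
    seed: int = 0,
) -> Tuple[List[Row], List[Row]]:
    """Keep two attribute columns, then split into (train, test) lists."""
    if not 0.0 < test_proportion < 1.0:
        raise ValueError("test_proportion must be strictly between 0 and 1")
    # Drop every attribute except the two selected indices.
    reduced = [(tuple(features[i] for i in keep), label) for features, label in rows]
    # A seeded shuffle makes the split reproducible (an assumption, not the author's method).
    rng = random.Random(seed)
    rng.shuffle(reduced)
    n_test = int(round(len(reduced) * test_proportion))
    return reduced[n_test:], reduced[:n_test]
```

For the validation sets, the page states a 0.3 test split, which would correspond to `test_proportion=0.3` here.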
Limitations
Row count is unknown, which may limit suitability assessment.
Column-level documentation is absent; field semantics must be inferred after download.
At 286.6 KB, the dataset is small, which suggests a limited scope.
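Because row counts and column documentation are missing, a quick inspection after download can fill in some of the gaps. The sketch below lists each file in the archive with its size and, for CSV members, the number of rows and the first row. It assumes the ZIP may contain UTF-8 CSV files, which the page does not confirm; `summarize_archive` is a hypothetical helper name, and other file types are reported by name and size only.

```python
import csv
import io
import zipfile
from typing import BinaryIO, Dict, List, Union


def summarize_archive(archive: Union[str, BinaryIO]) -> List[Dict[str, object]]:
    """Return name, uncompressed size, row count, and first row per archive member."""
    summaries: List[Dict[str, object]] = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            entry: Dict[str, object] = {
                "name": info.filename,
                "bytes": info.file_size,
                "rows": None,       # stays None for non-CSV members
                "first_row": None,  # may or may not be a header; check manually
            }
            if info.filename.lower().endswith(".csv"):
                with zf.open(info) as raw:
                    text = io.TextIOWrapper(
                        raw, encoding="utf-8", errors="replace", newline=""
                    )
                    rows = list(csv.reader(text))
                # Count includes a header row if the file has one.
                entry["rows"] = len(rows)
                entry["first_row"] = rows[0] if rows else None
            summaries.append(entry)
    return summaries
```

Calling `summarize_archive` with the path of the downloaded ZIP gives a first look at the contents without extracting anything to disk.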
Provenance
Source
Original data sourced from Medical City Hospital/Al-Kindy Teaching Hospital (Mendeley Data) and the UCI Machine Learning Repository.
Collection Method
Processed and organized by the author for a specific study.
Freshness
Last updated 2026-05-05 17:30:20; freshness should be verified.
Data is packaged in a ZIP file; contents require extraction. License is CC-BY-4.0.