Name: Diabetes and Validation Datasets for NCR-EBRB Model Performance Comparison
Creator: Shucheng Feng
Published: 2026-05-05T17:30:20
License: CC-BY-4.0
Keywords: Machine Learning, ZIP, Model validation, Healthcare, Tabular, Medical Data, Diabetes Diagnosis

Description

Supporting-information datasets from a study by Shucheng Feng, last updated May 5, 2026. The ZIP file contains processed experimental data for diabetes diagnosis and model validation. The source data come from Medical City Hospital/Al-Kindy Teaching Hospital (via Mendeley Data) and the UCI Machine Learning Repository. The archive includes training and testing splits for the diabetes data and validation sets for Iris, Banknote, Ecoli, and Newthyroid, all restricted to two premise attributes.

Use Cases

Benchmarking diabetes diagnosis models on the provided training and test splits, which use test proportions of 0.2, 0.25, 0.3, 0.35, and 0.4.
Validating classification algorithms on the standard Iris, Banknote, Ecoli, and Newthyroid datasets, each with a 0.3 test split.
Comparing model performance with feature-selection effects controlled, since every dataset is restricted to two premise attributes.
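
As a rough illustration of the setup these use cases describe, the sketch below keeps two attributes and holds out a fraction of rows for testing. It is only a sketch under assumptions: the helper name `split_two_attributes`, the column indices, the shuffling, the seed, and the toy rows are all hypothetical and not taken from the dataset. The study's actual splitting procedure is not documented here, so the pre-made splits in the archive should be preferred for benchmarking.

```python
import random


def split_two_attributes(rows, attr_indices, label_index, test_size, seed=0):
    """Keep two attribute columns plus the label, then split into train/test.

    Hypothetical helper; it does not reproduce the study's own procedure.
    """
    if len(attr_indices) != 2:
        raise ValueError("exactly two attribute indices are expected")
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")

    # Reduce every row to (two selected attributes, class label).
    reduced = [([row[i] for i in attr_indices], row[label_index]) for row in rows]

    # Shuffle with a fixed seed so the split is repeatable.
    rng = random.Random(seed)
    rng.shuffle(reduced)

    n_test = round(len(reduced) * test_size)
    return reduced[n_test:], reduced[:n_test]


# Toy rows, made up for illustration: four attributes followed by a class label.
toy_rows = [[i, i * 2, i % 3, i + 0.5, i % 2] for i in range(20)]

# The same test proportions the listing mentions for the diabetes splits.
for proportion in (0.2, 0.25, 0.3, 0.35, 0.4):
    train, test = split_two_attributes(toy_rows, (0, 2), 4, proportion)
    print(proportion, len(train), len(test))
```

The split here is a plain shuffle; whether the original splits were stratified by class is unknown.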

Strengths

Includes multiple pre-defined training and testing splits for diabetes data (test proportions: 0.2, 0.25, 0.3, 0.35, 0.4).
Contains validation datasets for four established classification problems (Iris, Banknote, Ecoli, Newthyroid).
Every dataset is described as restricted to two attributes, which removes the influence of feature selection and supports a fair comparison.

Limitations

Row counts are not stated, which makes it harder to assess suitability before download.
Column-level documentation is absent; field semantics must be inferred after download.
The archive is small (286.6 KB), which suggests a limited number of records.
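
Because row counts and column semantics are undocumented, a first step after download is to inspect the archive. The sketch below is a minimal example that assumes the members are comma-delimited text files (`.csv` or `.txt`) whose first row is a header; neither assumption is confirmed by the listing. The archive name `ncr_ebrb_data.zip` and the helper name `summarize_zip` are placeholders, and other formats (for example spreadsheets) would need a different reader.

```python
import csv
import io
import os
import zipfile


def summarize_zip(zip_path):
    """Return {member name: (first row, count of remaining rows)} for text tables.

    Assumes comma-delimited members with a header row, which may not hold
    for this dataset.
    """
    summary = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and anything that is not a .csv/.txt member.
            if info.is_dir() or not info.filename.lower().endswith((".csv", ".txt")):
                continue
            with archive.open(info) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = [row for row in csv.reader(text) if row]
            first_row = rows[0] if rows else []
            summary[info.filename] = (first_row, max(len(rows) - 1, 0))
    return summary


if __name__ == "__main__":
    path = "ncr_ebrb_data.zip"  # placeholder; use the real download path
    if os.path.exists(path):
        for name, (first_row, n_rows) in summarize_zip(path).items():
            print(name, first_row, n_rows)
```

If the files turn out to have no header row, the reported count is one short and the "first row" is simply the first record.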

Provenance

Source: Original data sourced from Medical City Hospital/Al-Kindy Teaching Hospital (Mendeley Data) and the UCI Machine Learning Repository.
Collection Method: Processed and organized by the author for a specific study.
Freshness: Last updated 2026-05-05 17:30:20; confirm against the source listing before use.

Data is packaged in a ZIP file; contents require extraction. License is CC-BY-4.0.
