This dataset contains multimodal data from 108 patients, collected for a study comparing machine learning and deep learning models for predicting radiation-induced oral mucositis. It covers CT imaging, dose distribution, and clinical features, and was published by Ling Li on figshare in April 2026. At 137.2 KB, it is a small-cohort dataset intended for radiomics and dosimetric analysis.
Use Cases
- Comparing traditional ML algorithms (e.g., Extra Trees, Logistic Regression) to deep learning architectures (e.g., 1D-CNN, 3D-CNN) for multi-class prediction based on multimodal clinical data.
- Developing models to predict radiation-induced oral mucositis severity based on dosiomic features and CT imaging.
- Studying model overfitting and mode collapse in high-dimensional architectures when applied to small-cohort medical datasets.
- Evaluating the performance of feature dimensionality reduction strategies combined with lightweight neural networks for limited data.
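With only 108 patients and several outcome classes, any model comparison depends on how the cohort is split. The following is a minimal sketch of stratified fold assignment using only the Python standard library. It is not the study's evaluation protocol: the label values (severity grades 0 to 3), the class proportions, and the choice of five folds are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold so class proportions stay similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped,
        # so that overall fold sizes also stay balanced.
        for i, idx in enumerate(indices):
            folds[(offset + i) % n_folds].append(idx)
        offset += len(indices)
    return folds

# Synthetic labels only: 108 patients with hypothetical severity grades 0-3.
labels = [0] * 30 + [1] * 40 + [2] * 28 + [3] * 10
folds = stratified_folds(labels, n_folds=5)
for k, fold in enumerate(folds):
    print(k, len(fold), dict(Counter(labels[i] for i in fold)))
```

Each fold can then serve once as the held-out set while the remaining folds are used for training, which keeps the rarest class represented in every evaluation split.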
Strengths
- The cohort size (108 patients) is stated explicitly, so the scale of the data is clear.
- Multimodal data includes CT imaging, dose distribution, and clinical features, as described in the study.
- Model performance metrics (AUC, accuracy, MCC) are reported with mean and standard deviation, offering benchmarks.
- Data is shared under a permissive CC-BY-4.0 license.
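To compare new results against the reported benchmarks, the same metrics have to be computed and summarized in the same way. The sketch below computes the multi-class Matthews correlation coefficient (MCC) in its standard confusion-matrix form and summarizes scores as mean and standard deviation. It assumes the reported spread comes from repeated folds or runs, which the description does not confirm, and the fold scores shown are hypothetical values, not results from the study.

```python
import math
import statistics
from collections import Counter

def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (Gorodkin's R_K form)."""
    s = len(y_true)                                    # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))    # correctly predicted samples
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    cov_tp = c * s - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = s * s - sum(true_counts[k] ** 2 for k in classes)
    cov_pp = s * s - sum(pred_counts[k] ** 2 for k in classes)
    denom = math.sqrt(cov_tt * cov_pp)
    # Return 0.0 when the coefficient is undefined, e.g. every prediction is one class.
    return cov_tp / denom if denom else 0.0

y_true = [0, 1, 2, 0, 1, 2]
perfect = multiclass_mcc(y_true, y_true)       # predictions identical to the labels
collapsed = multiclass_mcc(y_true, [0] * 6)    # model that always predicts class 0
print(perfect, collapsed)

fold_scores = [0.5, 0.7, 0.6]  # hypothetical per-fold MCC values
mean_mcc = statistics.mean(fold_scores)
std_mcc = statistics.stdev(fold_scores)        # sample standard deviation
print(f"{mean_mcc:.3f} +/- {std_mcc:.3f}")
```

MCC is a useful companion to accuracy here because a model that collapses onto a single class scores 0 regardless of how common that class is.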
Limitations
- Row count and column-level documentation are not provided, so suitability cannot be assessed without inspecting the files manually.
- At 137.2 KB, the dataset is very limited in scope and likely contains extracted features rather than raw images.
- Data may reflect bias inherent to the specific clinical cohort studied.
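Because the row count and columns are undocumented, a first inspection pass is needed before any modeling. The sketch below counts rows, lists columns, and tallies empty cells per column. It assumes the data is delivered as a CSV file, which is not confirmed by the description, and the column names in the sample are hypothetical placeholders.

```python
import csv
import io

def summarize_table(handle):
    """Return (row_count, column_names, empty_cell_counts) for a CSV stream."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            # Treat absent, empty, and whitespace-only cells as missing.
            if not (row.get(name) or "").strip():
                missing[name] += 1
    return rows, columns, missing

# Synthetic stand-in for the real file; column names are illustrative only.
sample = io.StringIO(
    "patient_id,mean_dose_gy,grade\n"
    "P001,52.1,2\n"
    "P002,,1\n"
    "P003,47.8,\n"
)
rows, columns, missing = summarize_table(sample)
print(rows, columns, missing)
```

For the downloaded file, the same function can be given a handle opened with `newline=""`. If the file holds one row per patient, the row count should match the stated cohort of 108.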
Provenance
- Source: Ling Li via figshare.
- Collection Method: Collected as part of a study evaluating predictive models for radiation-induced oral mucositis.
- Time Range: Not specified.
- Freshness: Last updated 2026-04-09 17:28:57; freshness should be verified.
- Geography: Not specified.