A dataset of 1,879 individuals used to investigate the integration of natural language processing (NLP) with clinical data for type 2 diabetes risk prediction. The data includes structured clinical variables and unstructured text entries processed with a BERT-based NLP pipeline. The study, authored by Yaoyan Lu and last updated in 2026, validated the integrated model on a post-2020 cohort of 939 individuals.
Data is provided as a PDF file (137.4 KB), which likely contains a description of the study and results rather than the raw, machine-readable dataset.