Thyroid Cancer Prognostic Risk Model Based on Immune Gene Expression
by Qi Qi · Updated 1mo ago
61.5 KB · 1 file
Available on 1 platform
Description
A retrospective study of 180 thyroid cancer patients treated between May 2022 and April 2025, used to construct a prognostic risk model based on immune gene expression. The dataset, authored by Qi Qi and shared under a CC-BY-4.0 license, includes analysis of genes such as CDK1, B3GNT7, S100A9, and MMP9 and of their correlation with immune cell infiltration and clinical outcomes.
Use Cases
Training a binary classification model to predict patient prognosis based on the expression levels of CDK1, B3GNT7, S100A9, and MMP9.
Analyzing correlations between specific immune gene expression and the infiltration abundance of B lymphocytes, CD4+ T lymphocytes, and CD8+ T lymphocytes.
Checking the prognostic performance of a logistic regression model against the average C-index (0.919) and AUC (0.880) reported in the study.
Investigating the relationship between high expression of S100A9 and MMP9 genes and advanced lymph node metastasis (pN stage) or distant metastasis (pM stage).
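As a rough illustration of the evaluation and correlation use cases above, the sketch below computes an AUC and a Spearman rank correlation using only the Python standard library. It is not the study's code: the function names (average_ranks, auc, spearman), the variable names, and all numbers are hypothetical and invented for illustration, since the actual file layout is unknown. For a binary outcome without censoring, the C-index reduces to the AUC, so the same function covers both in that simple case.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def auc(labels, scores):
    """AUC from the Mann-Whitney U statistic; labels must be 0 or 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    ranks = average_ranks(scores)
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Toy values only (NOT taken from the dataset).
poor_prognosis = [0, 0, 1, 0, 1, 1]             # hypothetical binary outcome
risk_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]    # hypothetical model output
gene_expr = [1, 2, 3, 4, 5]                     # hypothetical expression levels
infiltration = [2, 1, 4, 3, 5]                  # hypothetical infiltration scores

print(round(auc(poor_prognosis, risk_score), 3))    # 0.667
print(round(spearman(gene_expr, infiltration), 3))  # 0.8
```

With real data, the scores would come from a fitted model and the paired vectors from per-patient expression and infiltration columns, assuming the file turns out to contain patient-level values at all.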
Strengths
Includes a clearly defined cohort of 180 patients with thyroid cancer.
Reports specific model performance metrics, including an average C-index of 0.919 and an AUC of 0.880.
Identifies four specific immune genes (CDK1, B3GNT7, S100A9, MMP9) as independent risk factors with statistical significance (P < 0.05).
Limitations
Row count and column-level documentation are unknown, which limits suitability assessment.
The dataset is very small at 61.5 KB, suggesting limited scope or a summary document rather than raw patient-level data.
Description metadata is limited; actual data quality and structure require manual inspection after download.
Provenance
Source
figshare, author Qi Qi.
Collection Method
Retrospective clinical study of patients from a single hospital, supplemented with analysis from the TCGA-THCA database.
Time Range
Patient data from May 2022 to April 2025.
Freshness
Last updated 2026-04-22 22:01:10; freshness should be verified.
The primary file format is DOC, which is a word-processing document format rather than a structured data format; the file may contain a manuscript or report instead of a structured dataset.