05_HGNC_Mapped_Data contains standardized molecular data files produced by the TCGA Lower Grade Glioma Python pipeline. Aaliah Aly published this dataset on figshare in May 2026. The files include HGNC-mapped gene expression, copy number alteration (CNA), and mutation datasets.
Use Cases
- Integrating expression, CNA, and mutation data for multi-omics analysis by joining on standardized gene identifiers.
- Constructing graph databases or SQL tables from reliable gene-linked molecular records.
- Performing downstream cancer genomics analysis on datasets whose gene naming inconsistencies have already been resolved.
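The first two use cases can be sketched together: load each data type into its own SQL table and join on the gene identifier. The example below is a minimal illustration using Python's built-in sqlite3 module. The table names, column names (hgnc_symbol, sample_id, expr, copy_number, variant_class), and the toy rows are all assumptions made for this sketch; the real files' schemas are undocumented and must be checked after download.

```python
import sqlite3

# Toy placeholder rows, NOT values from the dataset.
# Column layout is hypothetical: (hgnc_symbol, sample_id, measurement).
EXPRESSION = [("IDH1", "S1", 8.2), ("TP53", "S1", 6.5)]
CNA = [("IDH1", "S1", 0), ("TP53", "S1", -1)]
MUTATION = [("IDH1", "S1", "Missense_Mutation")]


def build_db():
    """Create an in-memory database with one table per data type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE expression (hgnc_symbol TEXT, sample_id TEXT, expr REAL);
        CREATE TABLE cna (hgnc_symbol TEXT, sample_id TEXT, copy_number INTEGER);
        CREATE TABLE mutation (hgnc_symbol TEXT, sample_id TEXT, variant_class TEXT);
        """
    )
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", EXPRESSION)
    conn.executemany("INSERT INTO cna VALUES (?, ?, ?)", CNA)
    conn.executemany("INSERT INTO mutation VALUES (?, ?, ?)", MUTATION)
    return conn


def integrated_rows(conn):
    """Join the three tables on gene symbol and sample.

    LEFT JOINs keep every expression row even when a gene has no
    CNA or mutation record for that sample (those fields become NULL).
    """
    query = """
        SELECT e.hgnc_symbol, e.sample_id, e.expr, c.copy_number, m.variant_class
        FROM expression AS e
        LEFT JOIN cna AS c
            ON c.hgnc_symbol = e.hgnc_symbol AND c.sample_id = e.sample_id
        LEFT JOIN mutation AS m
            ON m.hgnc_symbol = e.hgnc_symbol AND m.sample_id = e.sample_id
        ORDER BY e.hgnc_symbol
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in integrated_rows(build_db()):
        print(row)
```

Joining on a standardized symbol only works because every table uses the same identifier for the same gene, which is the property the HGNC mapping step is meant to provide.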
Strengths
- Gene identifiers were standardized using official HGNC information, including approved symbols and aliases.
- Unmapped or invalid gene entries were filtered out to improve accuracy for downstream integration.
- The dataset is 98.7 MB in size and includes CSV and XLSX files ready for integration.
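The mapping and filtering described in the first two points above can be illustrated with a small sketch. This is not the logic of hgnc_mapping.py, which is not shown here; it is an assumed, simplified version of the general technique: keep approved symbols as they are, translate known aliases to their approved symbol, and drop anything that cannot be mapped. The lookup tables contain only a few illustrative entries, and the function names are hypothetical.

```python
# Illustrative lookups only. A real pipeline would build these from an
# official HGNC export rather than hard-coding them.
APPROVED = {"IDH1", "TP53", "ATRX"}
ALIAS_TO_APPROVED = {"p53": "TP53", "LFS1": "TP53"}


def map_symbol(raw):
    """Return the approved symbol for `raw`, or None if it cannot be mapped.

    Matching is exact after trimming whitespace; case is left untouched
    because some approved symbols contain lowercase letters.
    """
    key = raw.strip()
    if key in APPROVED:
        return key
    return ALIAS_TO_APPROVED.get(key)


def standardize(records):
    """Map each (symbol, value) pair and drop entries with no approved symbol."""
    mapped = []
    for symbol, value in records:
        approved = map_symbol(symbol)
        if approved is not None:
            mapped.append((approved, value))
    return mapped


if __name__ == "__main__":
    raw_records = [("IDH1", 1.0), (" p53 ", 2.0), ("NOT_A_GENE", 3.0)]
    print(standardize(raw_records))
```

Dropping unmapped entries trades coverage for consistency: downstream joins become reliable, but any gene that could not be resolved is absent from the output.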
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row counts are not reported, so whether the files are large enough for a given analysis cannot be judged before download.
- Last updated 2026-05-07 02:49:04; freshness should be verified.
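Because column documentation and row counts are missing, a first step after download might be to read each CSV's header and count its rows. The sketch below uses only the standard library; the function name summarize_csv and the sample content are assumptions for illustration, and the in-memory sample stands in for a real file opened with open(path, newline="").

```python
import csv
import io


def summarize_csv(handle):
    """Return (header, row_count) for an open CSV text stream.

    Rows are streamed rather than loaded at once, so this also works
    on large files.
    """
    reader = csv.reader(handle)
    header = next(reader, [])           # empty list if the file has no rows
    row_count = sum(1 for _ in reader)  # data rows, excluding the header
    return header, row_count


if __name__ == "__main__":
    # Stand-in content with hypothetical column names, not real data.
    sample = io.StringIO("hgnc_symbol,sample_1\nIDH1,8.2\nTP53,6.5\n")
    header, row_count = summarize_csv(sample)
    print("columns:", header)
    print("rows:", row_count)
```

The header only gives column names; what each field means still has to be inferred from the values or confirmed with the dataset author.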
Provenance
- Source
  - Generated from the TCGA Lower Grade Glioma Python pipeline.
- Collection Method
  - HGNC-based gene identifier mapping performed using the hgnc_mapping.py script.
- Freshness
  - Last updated 2026-05-07 02:49:04.