TCGA LGG: Original Source Files for Multi-Omics Database Construction
by Aaliah Aly · Updated 1mo ago
138.5 MB · 6 files
Available on 1 platform
Description
138.5 MB of original TCGA LGG source files used to build a multi-omics relational database. The unmodified TXT files include clinical information, survival outcomes, mutation data, copy number alterations, and mRNA expression data. Author Aaliah Aly uploaded these files to figshare in May 2026 to support transparency and reproducibility.
Use Cases
Verify database construction workflows based on the original source files.
Reproduce multi-omics analyses for low-grade glioma based on the clinical and genomic data.
Audit data cleaning and harmonization steps based on the unmodified input materials.
Validate SQL table generation and population processes based on the provided source data.
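One way to approach the SQL validation use case is to load a source file into a table and compare row counts on both sides. The sketch below is a minimal illustration using Python's built-in sqlite3 module, not the author's workflow. The table name, column names, tab delimiter, and sample rows are all hypothetical, since the listing does not document the file structure.

```python
import csv
import io
import sqlite3


def load_table(conn, table, text, delimiter="\t"):
    """Load delimited text into a new SQLite table; return the source row count."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


# Hypothetical sample content; the real files must be inspected after download.
SAMPLE = "patient_id\tos_months\tos_status\nP1\t12.5\t1\nP2\t40.1\t0\n"

conn = sqlite3.connect(":memory:")
source_rows = load_table(conn, "survival", SAMPLE)
table_rows = conn.execute('SELECT COUNT(*) FROM "survival"').fetchone()[0]

# A mismatch here would point to rows dropped or duplicated during population.
print(source_rows == table_rows)
```

Storing every column as TEXT keeps the comparison faithful to the raw input; type conversion can be checked as a separate step.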
Strengths
Files are the unmodified source materials, providing a direct audit trail for database construction.
Includes multiple data types: clinical information, survival outcomes, mutation data, copy number alterations, and mRNA expression.
138.5 MB of source data supports detailed reproducibility checks.
Licensed under CC-BY-4.0, allowing for open reuse and sharing.
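Because the files are described as unmodified, a checksum manifest is one way to confirm that a local copy still matches what was downloaded before it is used in an audit. The following is a minimal sketch using only the Python standard library. The file name and contents in the demonstration are made up, and the listing itself does not publish reference checksums.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, pattern="*.txt"):
    """Map each matching file name in the directory to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in sorted(Path(directory).glob(pattern))}


# Demonstration on a temporary file with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "example.txt").write_bytes(b"id\tvalue\n")
    manifest = build_manifest(tmp)

for name, digest in manifest.items():
    print(name, digest)
```

Saving the manifest next to the download and rebuilding it later shows whether any file has changed in the meantime.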
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row counts are not stated, so sample size and coverage cannot be assessed before download.
Data may carry biases inherent to the TCGA project, such as the period in which samples were collected and the sources that contributed them.
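The missing column documentation and row counts can be partly recovered by profiling each file after download. The sketch below guesses the delimiter from the header line, counts data rows, and flags columns whose values all parse as numbers. It assumes the first line is a header, which the listing does not confirm, and the sample column names and values are invented for illustration.

```python
import csv
import io


def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_text(text, candidates="\t,;|"):
    """Summarize delimited text: delimiter, column names, row count, numeric columns."""
    first_line = text.splitlines()[0]
    # Pick the candidate delimiter that occurs most often in the header line.
    delimiter = max(candidates, key=first_line.count)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], [r for r in rows[1:] if r]
    numeric = {}
    for i, name in enumerate(header):
        values = [r[i] for r in body if i < len(r) and r[i] != ""]
        numeric[name] = bool(values) and all(_is_number(v) for v in values)
    return {
        "delimiter": delimiter,
        "columns": header,
        "rows": len(body),
        "numeric": numeric,
    }


# Invented sample; real column names and values will differ.
SAMPLE = "sample_id\tgene\tvalue\nS1\tIDH1\t0.5\nS2\tTP53\t-1.2\n"
profile = profile_text(SAMPLE)
print(profile)
```

For a real file, the same function can be applied to the first few thousand lines read from disk; the numeric flags are only a starting point for inferring field semantics.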
Provenance
Source
TCGA (The Cancer Genome Atlas) LGG (Low-Grade Glioma) project.
Collection Method
Files were selected and used for data cleaning, harmonization, validation, SQL table generation, SQL population, and database implementation.
Freshness
Last updated 2026-05-06 15:51:03; freshness should be verified.
Files are in TXT format; the delimiter and column structure must be determined after download.