This collection contains TCGA LGG source files that were downloaded and reviewed but excluded from the final database construction. It comprises 506 MB of unmodified materials in MD, TXT, and SEG formats, provided by Aaliah Aly on figshare under the CC-BY-4.0 license. The files were last updated on May 6, 2026.
Use Cases
- Auditing database construction workflows by examining the source materials that were excluded
- Supporting reproducibility in cancer genomics projects through a transparent record of source file review
- Analyzing the file formats and structures used in TCGA data pipelines, using the MD, TXT, and SEG files as examples
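For the format-analysis use case, a first step might be a simple inventory of the downloaded files by extension. The following is a minimal sketch, not part of the dataset: the function name `inventory_by_extension` and the idea of a local download directory are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def inventory_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root.

    Extensions are lowercased so that ".SEG" and ".seg" are grouped together;
    files without an extension are reported under "(none)".
    """
    totals = defaultdict(lambda: [0, 0])
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += path.stat().st_size
    return {ext: (count, size) for ext, (count, size) in totals.items()}
```

Calling `inventory_by_extension("path/to/download")` (a hypothetical path) gives per-format counts and byte totals that can be compared against the stated 506 MB and the listed MD, TXT, and SEG formats.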
Strengths
- Files are unmodified and provided exactly as source materials, supporting direct inspection
- 506 MB of data offers a substantial volume of excluded materials for review
- CC-BY-4.0 license provides clear permissions for reuse
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download
- Column-level documentation is absent; field semantics must be inferred after download
- Row count is unknown, which makes it harder to assess suitability before download
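Because column documentation and row counts are not provided, both have to be recovered from the files themselves. The sketch below shows one way to do that for a single file. It assumes the file is tab-delimited with a header on its first line, which this description does not confirm and which should be checked per file; `summarize_delimited` is a hypothetical helper name.

```python
import csv


def summarize_delimited(path, delimiter="\t"):
    """Return (header_fields, data_row_count) for a delimited text file.

    Assumption: the first line is a header row. Blank lines are not counted.
    Undecodable bytes are replaced rather than raising an error.
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])
        row_count = sum(1 for row in reader if row)
    return header, row_count
```

The returned header gives the field names from which semantics must be inferred, and the row count fills in the figure missing from the metadata. If a file turns out to use a different delimiter, pass it through the `delimiter` argument.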
Provenance
- Source
  - TCGA (The Cancer Genome Atlas) LGG (Lower Grade Glioma) project
- Collection Method
  - Files were downloaded and reviewed during a project but not used in the final database construction workflow.
- Freshness
  - Last updated 2026-05-06 15:05:55; freshness should be verified