Telomere-to-Telomere Genome Assembly of Wild Soybean Glycine soja HAAS216
by Muhammad Asad · Updated 2mo ago
1.1 GB · 2 files
Available on 1 platform
Description
Muhammad Asad's dataset provides a high-quality telomere-to-telomere genome assembly and annotation for the wild soybean Glycine soja accession HAAS216. The assembly spans 1,018.17 Mb, integrates PacBio HiFi, ONT ultra-long reads, and Hi-C data, and contains 48,390 predicted protein-coding genes. It was last updated on April 14, 2026.
Use Cases
Comparative genomics between wild and cultivated soybean based on the high-quality assembly.
Structural variation detection based on the resolved repetitive and complex genomic regions.
Identification of genomic regions associated with stress tolerance and agronomic traits based on the functional gene annotations.
Strengths
Assembly spans 1,018.17 Mb, with repetitive sequences (55.84% of the assembly) resolved.
Contains 48,390 predicted protein-coding genes, with 98.30% functionally annotated.
Integrates multiple high-fidelity data sources (PacBio HiFi, ONT ultra-long reads, Hi-C) for chromosome-scale scaffolding.
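The reported assembly size can be sanity-checked after download by summing sequence lengths in the FASTA file. The following is a minimal sketch, not the author's pipeline: the function names are illustrative, the demo input is a made-up two-sequence FASTA, and it assumes "Mb" means 10^6 bases.

```python
import io


def fasta_lengths(handle):
    """Return {sequence_name: length} for a FASTA text stream."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths):
    """Summarize sequence count and total length (Mb assumed to be 1e6 bp)."""
    total = sum(lengths.values())
    return {"sequences": len(lengths), "total_bp": total, "total_mb": total / 1e6}


# Tiny in-memory demo; the sequences here are invented for illustration.
demo = io.StringIO(">chr1 demo\nACGTACGT\nACGT\n>chr2\nGGCC\n")
demo_lengths = fasta_lengths(demo)
demo_summary = summarize(demo_lengths)
print(demo_summary)
```

For the real assembly, pass an open file handle for the downloaded FASTA file instead of the demo stream; the file name is not given in this listing, so it has to be taken from the download itself.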
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Provenance
Source
Muhammad Asad via figshare.
Collection Method
Assembly integrates PacBio HiFi long reads, Oxford Nanopore Technologies ultra-long reads, and Hi-C data.
Time Range
Not specified
Freshness
Last updated 2026-04-14 08:29:24; freshness should be verified.
Geography
Not specified
Data are provided in GFF3 (annotation) and FASTA (.fa, sequence) formats, which require bioinformatics tools for analysis.
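As a rough illustration of working with the annotation file, the sketch below tallies feature types from the third column of a GFF3 stream using only the Python standard library. The function name and demo records are hypothetical, and whether the `gene` count matches the 48,390 protein-coding genes reported here depends on how this particular annotation labels its features, which the listing does not document.

```python
import io
from collections import Counter


def count_feature_types(handle):
    """Count GFF3 records by feature type (column 3)."""
    counts = Counter()
    for line in handle:
        # An embedded ##FASTA directive marks the end of the annotation records.
        if line.startswith("##FASTA"):
            break
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # GFF3 records have exactly nine tab-separated columns.
        if len(cols) != 9:
            continue
        counts[cols[2]] += 1
    return counts


# Tiny in-memory demo; these records are invented for illustration.
demo = io.StringIO(
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t1\t100\t.\t+\t.\tID=g1\n"
    "chr1\tdemo\tmRNA\t1\t100\t.\t+\t.\tID=g1.t1;Parent=g1\n"
    "chr1\tdemo\tgene\t200\t300\t.\t-\t.\tID=g2\n"
)
demo_counts = count_feature_types(demo)
print(demo_counts["gene"], demo_counts["mRNA"])
```

Streaming line by line keeps memory use low, which matters for an annotation of this size.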