ChromTransfer: Genomic Data for Chromatin State Prediction Models
by Yiman Wang · Updated 2mo ago
13.0 GB · 6 files
Available on 1 platform
Description
13.0 GB of genomic data files support preprocessing, training, and prediction for the ChromTransfer model. The collection includes TF co-binding signals, chromatin context, DNA sequences, and FUNCODE scores for mouse (mm10) and human (hg38) genomes. Yiman Wang published this dataset on figshare in April 2026.
Use Cases
Training ChromTransfer-Base models based on DNA sequences for genomic regions.
Training ChromTransfer-Cons models based on FUNCODE scores for genomic regions.
Training ChromTransfer-Reg models based on TF co-binding and chromatin context signals.
Identifying co-binding transcription factors based on TF interaction information.
Generating training, validation, and test datasets based on BED files defining genomic regions.
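The last use case, building splits from BED-defined regions, can be sketched as follows. This is a minimal illustration and not the ChromTransfer pipeline itself. It assumes a plain tab-separated BED file with chromosome, start, and end in the first three columns. The function names (parse_bed_line, split_bed_by_chrom) are hypothetical, and the held-out chromosomes (chr8 for validation, chr9 for test) are arbitrary choices made for the example.

```python
def parse_bed_line(line):
    """Parse one BED line into (chrom, start, end); return None for blank or header lines."""
    line = line.strip()
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def split_bed_by_chrom(lines, val_chroms=("chr8",), test_chroms=("chr9",)):
    """Assign each region to train, val, or test according to its chromosome.

    The default hold-out chromosomes are assumptions for illustration only.
    """
    splits = {"train": [], "val": [], "test": []}
    for line in lines:
        region = parse_bed_line(line)
        if region is None:
            continue
        chrom = region[0]
        if chrom in val_chroms:
            splits["val"].append(region)
        elif chrom in test_chroms:
            splits["test"].append(region)
        else:
            splits["train"].append(region)
    return splits


if __name__ == "__main__":
    # Tiny in-memory example; a real run would iterate over an opened BED file.
    demo = ["chr1\t100\t600", "chr8\t200\t700", "chr9\t300\t800", "# comment"]
    for name, regions in split_bed_by_chrom(demo).items():
        print(name, regions)
```

Splitting by whole chromosome is one common way to keep neighboring regions out of different splits. Whether the ChromTransfer project uses this scheme is not stated in the listing.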
Strengths
The 13.0 GB collection bundles the inputs for preprocessing, training, and prediction in one place.
Data is provided for two major reference genomes: mouse (mm10) and human (hg38).
Files are clearly organized by function (Regulatory, FUNCODE, DNA, regions, cobinding_TF_source, demo).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row counts are not reported, so coverage and suitability cannot be assessed before download.
Description metadata is limited; actual data quality requires manual inspection after download.
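Because row counts and column semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes the file under inspection is tab-delimited text (as BED files are); other formats in the collection may be binary and would need different tools. The function name summarize_table is hypothetical.

```python
import csv


def summarize_table(path, delimiter="\t", preview_rows=3):
    """Return the row count, the observed column counts, and the first few rows."""
    n_rows = 0
    widths = set()
    preview = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if not row:
                continue  # skip blank lines
            n_rows += 1
            widths.add(len(row))
            if len(preview) < preview_rows:
                preview.append(row)
    return {"rows": n_rows, "column_counts": sorted(widths), "preview": preview}
```

More than one value in column_counts suggests a header line or ragged rows, which is worth checking before treating the file as a uniform table.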
Provenance
Source
figshare
Collection Method
Likely compiled for the ChromTransfer model project.
Freshness
Last updated 2026-04-10 07:11:42; freshness should be verified.
Files are compressed as .tar.gz archives and must be extracted before use. Usage instructions are available in a linked GitHub repository.
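Extraction can be done with Python's standard tarfile module. The sketch below lists an archive's contents and then unpacks it. The archive and directory names in the usage note are hypothetical placeholders, since the listing does not give the actual file names, and the helper names (list_archive, extract_archive) are illustrative only.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return (name, size_in_bytes) for every regular file in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [(m.name, m.size) for m in tar.getmembers() if m.isfile()]


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction filter, available in newer Python releases.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest
```

For example, list_archive("demo.tar.gz") followed by extract_archive("demo.tar.gz", "chromtransfer_data") would show the contents first and then unpack them. Listing before extracting is a cheap way to check sizes when the full collection is 13.0 GB.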