Summary of CpG Sites and SNPs for Human, Chimpanzee, Rhesus Macaque, and Silkworm
by Sheel Chandra · Updated 12d ago
9.4 KB · 1 file
Available on 1 platform
Description
Sheel Chandra's dataset summarizes CpG sites and single nucleotide polymorphisms (SNPs) for human, chimpanzee, rhesus macaque, and silkworm, last updated on June 1, 2026. The analysis excludes genic regions using NCBI RefSeq annotations and, for human data, also excludes phylogenetically conserved regions defined by phastCons. The dataset is a 9.4 KB XLSX file licensed under CC-BY-4.0.
Use Cases
Comparative analysis of CpG site conservation across human, chimpanzee, rhesus macaque, and silkworm based on the described exclusion criteria.
Studying C>T SNPs, polarized by minor allele frequency as described in the dataset's methodology.
Investigating ancestral state inference in silkworm using the est-sfs method mentioned in the description.
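The polarization step mentioned above can be sketched as follows. This is a minimal illustration, not the author's pipeline: it assumes that the minor allele is the derived one, and the function names, the input tuple layout, and the choice to count G>A as a C>T change on the opposite strand are all hypothetical.

```python
def polarize_by_maf(ref, alt, alt_freq):
    """Return (ancestral, derived), assuming the minor allele is derived.

    Assumption for illustration: at exactly 0.5 the call is ambiguous,
    and this sketch arbitrarily treats the reference allele as ancestral.
    """
    if not 0.0 <= alt_freq <= 1.0:
        raise ValueError("alt_freq must be between 0 and 1")
    if alt_freq <= 0.5:
        return ref, alt  # alt is the minor allele, so treat it as derived
    return alt, ref      # ref is the minor allele, so treat it as derived


def is_c_to_t(ancestral, derived, both_strands=True):
    """True for C>T; optionally also G>A, its reverse-strand equivalent."""
    if (ancestral, derived) == ("C", "T"):
        return True
    return both_strands and (ancestral, derived) == ("G", "A")


# Hypothetical SNPs as (ref, alt, alt allele frequency).
snps = [("C", "T", 0.12), ("T", "C", 0.80), ("A", "G", 0.30)]
calls = [is_c_to_t(*polarize_by_maf(*snp)) for snp in snps]
```

In the second SNP the reference allele T is the minor allele, so the polarized change is C>T even though the record is written as T/C.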
Strengths
Dataset is small (9.4 KB), facilitating quick download and inspection.
Analysis methodology is described, including specific exclusion criteria for genic and conserved regions.
Data covers multiple species (human, chimpanzee, rhesus macaque, silkworm) for cross-species comparison.
Limitations
Row count and column-level documentation are unknown, limiting suitability assessment.
The dataset's small size (9.4 KB) suggests a very limited scope or summary-level data.
Description metadata is limited; actual data quality requires manual inspection after download.
Provenance
Source
Sheel Chandra via figshare
Collection Method
Genic regions (NCBI RefSeq annotations) were excluded and, for human data, phylogenetically conserved regions (phastCons) as well. SNPs were polarized using minor allele frequencies, and est-sfs was used to infer ancestral states in silkworm.
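As a rough sketch of how a region-exclusion step of this kind can work, the example below tests whether a position falls inside any excluded interval. It is an assumption-laden illustration rather than the method used for this dataset: the half-open, BED-style coordinates, the function names, and the example intervals are hypothetical, and real intervals would come from RefSeq or phastCons annotation files.

```python
from bisect import bisect_right


def build_index(intervals):
    """Merge half-open (chrom, start, end) intervals per chromosome."""
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    index = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        merged = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= merged[-1][1]:
                # Overlapping or adjacent: extend the current interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def is_excluded(index, chrom, pos):
    """True if pos lies in an excluded interval (start <= pos < end)."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


# Hypothetical excluded regions and candidate site positions.
excluded = build_index([("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)])
sites = [("chr1", 99), ("chr1", 250), ("chr2", 20)]
kept = [site for site in sites if not is_excluded(excluded, *site)]
```

Merging the intervals first keeps each lookup to a single binary search, which matters when many sites are checked against a large annotation set.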
Freshness
Last updated 2026-06-01 17:47:57; freshness should be verified.
Format
Data is provided in XLSX format; users will need compatible spreadsheet software or a library to read it.
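Because the sheet and column layout is undocumented, a first step after download is to see which worksheets the file contains. The sketch below does this with the Python standard library only, relying on the fact that an XLSX file is a ZIP archive whose sheet names are listed in `xl/workbook.xml`. The function name and the file name in the usage note are hypothetical. For loading the actual cell values, a library such as pandas (`pandas.read_excel`) or openpyxl is the more practical choice.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML workbook parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_sheet_names(source):
    """Return worksheet names from an XLSX file (path or binary file object)."""
    with zipfile.ZipFile(source) as archive:
        with archive.open("xl/workbook.xml") as workbook:
            root = ET.parse(workbook).getroot()
    # Each <sheet> element carries the worksheet's display name.
    return [sheet.get("name") for sheet in root.iter(f"{{{MAIN_NS}}}sheet")]
```

A call such as `list_sheet_names("cpg_snp_summary.xlsx")`, with the path replaced by the downloaded file's actual name, returns the sheet names in workbook order.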