Genomic Atlas of Gut Clostridia: Phylogeny and Short-Chain Fatty Acid Production
by Laura Sola · Updated 2mo ago
1.2 MB · 1 file
Available on 1 platform
Description
A 2026 study by Laura Sola provides a genomic reclassification of 1,897 Clostridia species from the Unified Human Gastrointestinal Genome (UHGG) database. The analysis identifies 519 potential butyrate producers, 257 potential propionate producers, and 77 species capable of producing both. Their abundance was assessed in 151 fecal metagenomes from healthy humans. The dataset includes taxonomic classifications, phylogenetic relationships, and pathway analysis for butyrate and propionate production.
Use Cases
Train taxonomic classification models based on the genomic and phylogenetic features of 1,897 species.
Analyze correlations between microbial phylogeny and metabolic potential for short-chain fatty acid production.
Benchmark metagenomic profiling tools using the abundance data from 151 healthy human fecal samples.
Study the distribution of butyrate and propionate biosynthesis pathways across the reclassified Clostridia genera and families.
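As a rough illustration of the last use case, the sketch below tallies predicted producers per family. The rows, the family names, and the field names (`family`, `butyrate`, `propionate`) are placeholders invented for this example; the dataset's actual columns are undocumented and would need to be mapped onto this shape after extraction.

```python
from collections import Counter, defaultdict

# Placeholder rows: NOT taken from the dataset. Field names are assumptions.
rows = [
    {"species": "sp_1", "family": "Family_A", "butyrate": True, "propionate": False},
    {"species": "sp_2", "family": "Family_A", "butyrate": True, "propionate": True},
    {"species": "sp_3", "family": "Family_B", "butyrate": False, "propionate": True},
    {"species": "sp_4", "family": "Family_B", "butyrate": False, "propionate": False},
]


def tally_by_family(records):
    """Count species and predicted producers for each family."""
    counts = defaultdict(Counter)
    for record in records:
        family = counts[record["family"]]
        family["species"] += 1
        # Booleans add as 0 or 1, so each flag increments its own counter.
        family["butyrate"] += record["butyrate"]
        family["propionate"] += record["propionate"]
        family["both"] += record["butyrate"] and record["propionate"]
    return {name: dict(counter) for name, counter in counts.items()}


for name, summary in sorted(tally_by_family(rows).items()):
    print(name, summary)
```

The same grouping could be keyed on genus instead of family by swapping the field used as the dictionary key.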
Strengths
Covers 1,897 Clostridia species, a substantial set for analysis.
Provides specific counts of potential butyrate (519) and propionate (257) producers.
Incorporates abundance data from 151 healthy human fecal metagenomes, in which butyrate producers account for an average of 28.0% of each microbiome.
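The counts above can be turned into shares of the 1,897 catalogued species with simple arithmetic. The description does not say whether the 77 dual producers are already included in the 519 and 257 figures, so this sketch computes the total number of producers under both readings rather than assuming one. The variable and function names are illustrative only.

```python
TOTAL_SPECIES = 1897
BUTYRATE = 519
PROPIONATE = 257
BOTH = 77


def share(count, total=TOTAL_SPECIES):
    """Return count as a percentage of the catalogued species."""
    return 100.0 * count / total


# Reading A (assumption): the 77 dual producers are counted in both groups,
# so they must be subtracted once to avoid double counting.
producers_if_overlapping = BUTYRATE + PROPIONATE - BOTH

# Reading B (assumption): the three groups are mutually exclusive.
producers_if_disjoint = BUTYRATE + PROPIONATE + BOTH

for label, count in [
    ("butyrate producers", BUTYRATE),
    ("propionate producers", PROPIONATE),
    ("dual producers", BOTH),
    ("any producer, overlapping reading", producers_if_overlapping),
    ("any producer, disjoint reading", producers_if_disjoint),
]:
    print(f"{label}: {count} species ({share(count):.1f}%)")
```

These are shares of species, not of microbial abundance, so they are not directly comparable to the 28.0% average abundance figure.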
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it harder to assess suitability before download.
The primary data file is a DOCX document, which may require extraction or conversion for computational analysis.
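One way to handle the DOCX limitation is to read the tables straight out of the file. A DOCX is a ZIP archive whose main body is stored in `word/document.xml`, so the sketch below needs only the Python standard library. It assumes the data of interest sits in ordinary Word tables, which has not been verified for this file, and the function name `extract_tables` is made up for this example.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def extract_tables(docx_source):
    """Return each table in a DOCX as a list of rows of cell strings.

    docx_source may be a file path or a binary file-like object.
    Nested tables are not handled specially: their text is folded into
    the enclosing cell and they also appear as separate tables.
    """
    with zipfile.ZipFile(docx_source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for table in root.iter(f"{{{W_NS}}}tbl"):
        rows = []
        for row in table.findall("w:tr", NS):
            cells = []
            for cell in row.findall("w:tc", NS):
                # Cell text is split across w:t runs; join them in order.
                text = "".join(t.text or "" for t in cell.iter(f"{{{W_NS}}}t"))
                cells.append(text.strip())
            rows.append(cells)
        tables.append(rows)
    return tables
```

Calling `extract_tables("dataset.docx")`, where the file name is a placeholder for the downloaded file, returns a list of tables whose rows can then be written out with the `csv` module. If the document stores its data as running text rather than tables, the result will be an empty list and a different extraction route is needed.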
Provenance
Source
figshare, authored by Laura Sola.
Collection Method
Genomes retrieved and reclassified using GTDB-Tk; phylogeny determined from 120 ubiquitous single-copy proteins; metabolic pathways investigated with gapseq.
Time Range
Study dated 2026; temporal coverage of source genomic data is not specified.
Freshness
Last updated 2026-04-10 05:59:15.
Geography
Geographic coverage is not specified, but metagenomic data is from healthy human subjects.
Data is provided as a DOCX file; users may need to extract tables or text for analysis. License: CC-BY-4.0.