List of Significant COGs: Gene Clusters Correlated with Ocean Nutrients
by Rika Anderson · Updated 24d ago
165.6 KB · 1 file
Description
A 2026 analysis by Rika Anderson identifies clusters of orthologous genes (COGs) whose abundances correlate with nutrient concentrations in the global ocean. The analysis draws on 139 Tara Oceans metagenomic samples and tests 4,787 COGs against environmental metadata including phosphate, nitrate/nitrite, oxygen, and modeled iron. Statistical models were applied to control for confounding effects from variables such as temperature, depth, and salinity.
Use Cases
Identify microbial genes associated with phosphate limitation based on correlation analysis.
Investigate relationships between nitrate/nitrite concentrations and specific orthologous gene clusters.
Control for environmental confounders like temperature and salinity when modeling gene abundance.
Study the distribution of functional gene groups across different ocean depth layers (surface, DCM).
Strengths
Analyzes a substantial set of 4,787 clusters of orthologous genes (COGs).
Leverages data from 139 globally distributed Tara Oceans metagenomic samples.
Statistical model incorporates multiple environmental controls, including log-transformed nutrient concentrations.
Applies a presence filter that discards COGs found in fewer than one-third of samples, so that correlations are not driven by rarely detected genes.
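The presence filter described above can be sketched as follows. This is a minimal illustration, not the author's actual code: the function name `presence_filter`, the toy COG IDs, and the abundance values are all made up, and "found in a sample" is assumed here to mean an abundance greater than zero.

```python
from fractions import Fraction
from typing import Dict, List


def presence_filter(
    abundance: Dict[str, List[float]],
    min_fraction: Fraction = Fraction(1, 3),
) -> Dict[str, List[float]]:
    """Keep COGs detected in at least `min_fraction` of samples.

    Assumption: a COG counts as "present" in a sample when its
    abundance is greater than zero.
    """
    kept = {}
    for cog, values in abundance.items():
        if not values:
            continue  # no samples, nothing to evaluate
        present = sum(1 for v in values if v > 0)
        # Exact rational comparison avoids floating-point edge cases
        # at the one-third boundary.
        if Fraction(present, len(values)) >= min_fraction:
            kept[cog] = values
    return kept


# Toy table: hypothetical COG IDs across 6 samples (made-up numbers).
toy = {
    "COG_A": [5, 0, 3, 2, 0, 1],  # present in 4 of 6 samples
    "COG_B": [0, 0, 0, 0, 7, 0],  # present in 1 of 6 samples
    "COG_C": [0, 2, 0, 0, 4, 0],  # present in 2 of 6, exactly one-third
}
filtered = presence_filter(toy)
print(sorted(filtered))
```

In this sketch a COG sitting exactly at one-third is kept, because only COGs below the threshold are discarded.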
Limitations
Iron concentration data is modeled (PISCES2) rather than directly measured, and is unavailable for mesopelagic zone samples.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it harder to assess suitability before download.
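Because neither the columns nor the row count are documented, a first step after download might be a quick structural summary of the file. The sketch below assumes the TXT file is a delimited table with a header row, which is not confirmed by the listing. The helper names and the sample text (including its column names and values) are hypothetical stand-ins, not the real file contents.

```python
import csv
import io


def guess_delimiter(header_line, candidates=("\t", ",", ";")):
    """Pick the candidate delimiter that occurs most often in the header."""
    counts = {d: header_line.count(d) for d in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def looks_numeric(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_table(text):
    """Report delimiter, column names, row count, and numeric columns."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("file is empty")
    # Fall back to tab if no candidate delimiter appears in the header.
    delimiter = guess_delimiter(lines[0]) or "\t"
    rows = list(csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    numeric = {}
    for i, name in enumerate(header):
        column = [r[i] for r in body if i < len(r)]
        numeric[name] = bool(column) and all(looks_numeric(v) for v in column)
    return {
        "delimiter": delimiter,
        "columns": header,
        "row_count": len(body),
        "numeric": numeric,
    }


# Made-up sample standing in for the downloaded file.
sample = (
    "feature\tmetadata\tcoef\tqval\n"
    "COG_A\tphosphate\t-0.42\t0.01\n"
    "COG_B\tnitrate\t0.17\t0.03\n"
)
summary = summarize_table(sample)
print(summary["columns"], summary["row_count"])
```

To run this on the real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `summarize_table`, where `path` is wherever the download was saved.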
Provenance
Source
Ocean Microbial Reference Catalog v2 (OM-RGC.v2) from the Tara Oceans Project.
Collection Method
Correlation analysis using compound Poisson linear models via MaAsLin2 software on metagenomic and environmental data.
Freshness
Last updated 2026-05-13 00:10:09; freshness should be verified.
Geography
Global ocean sampling locations from the Tara Oceans expedition.
File format is TXT. The dataset is small (165.6 KB), indicating it likely contains processed statistical results rather than raw sequence data.