A dataset of Gini importance scores for each site in a MAFFT alignment of Erg11 protein sequences from Saccharomycotina yeasts. The data, authored by Marie-Claire Harrison and last updated in March 2026, maps sites to Candida albicans residues and indicates prior observation in clinical isolates. The file is a 68.0 KB XLSX spreadsheet.
Use Cases
- Prioritizing functionally important residues for experimental validation based on Gini importance scores.
- Mapping sequence variation across yeast species to a reference Candida albicans structure.
- Identifying sites of known clinical relevance for surveillance of emerging drug resistance.
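The first and third use cases can be sketched as a simple ranking, assuming the spreadsheet has already been loaded into a list of row dictionaries. The column names below are hypothetical placeholders, since the card does not document the real headers, and the clinical-isolate flag is assumed to be boolean.

```python
import math
from typing import Any

# Hypothetical column names: replace with the actual headers in the spreadsheet.
SCORE_COL = "gini_importance"
RESIDUE_COL = "c_albicans_residue"
CLINICAL_COL = "observed_in_clinical_isolates"  # assumed to hold True/False


def top_sites(rows: list[dict[str, Any]], n: int = 10) -> list[dict[str, Any]]:
    """Return the n rows with the highest score, skipping missing or NaN scores."""
    scored = [
        r for r in rows
        if isinstance(r.get(SCORE_COL), (int, float)) and not math.isnan(r[SCORE_COL])
    ]
    return sorted(scored, key=lambda r: r[SCORE_COL], reverse=True)[:n]


def novel_candidates(rows: list[dict[str, Any]], n: int = 10) -> list[dict[str, Any]]:
    """Return high-scoring sites that are not flagged as seen in clinical isolates."""
    ranked = top_sites(rows, len(rows))
    return [r for r in ranked if not r.get(CLINICAL_COL)][:n]
```

If the flag is stored as text (for example "yes"/"no") rather than as a boolean, the filter in `novel_candidates` would need to compare against the actual values used in the file.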
Strengths
- Includes a direct mapping to Candida albicans residues, providing a reference framework.
- Flags sites previously observed in clinical isolates, linking computational scores to real-world evidence.
- Released under a permissive CC-BY-4.0 license for reuse.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported, so suitability is hard to assess before download.
- The file is small (68.0 KB), which suggests a narrow scope: per-site scores for a single protein alignment.
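Because neither the columns nor the row count are documented, a first pass after download might simply profile the table. The following is a minimal sketch, assuming the sheet has been read into a list of row dictionaries; the function name `summarize_table` is illustrative and not part of the dataset.

```python
from collections import Counter
from typing import Any


def _is_empty(value: Any) -> bool:
    """Treat None, empty strings, and NaN floats as missing."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and value != value  # NaN is not equal to itself


def summarize_table(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Report the row count and, per column, how many values of each type occur."""
    columns: dict[str, Counter] = {}
    for row in rows:
        for name, value in row.items():
            kind = "empty" if _is_empty(value) else type(value).__name__
            columns.setdefault(name, Counter())[kind] += 1
    return {
        "n_rows": len(rows),
        "columns": {name: dict(counts) for name, counts in columns.items()},
    }
```

One way to obtain such rows, if pandas and openpyxl are installed, is `pandas.read_excel(path).to_dict("records")`; the per-column type counts then give a starting point for inferring field semantics.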
Provenance
- Source: figshare, authored by Marie-Claire Harrison.
- Collection Method: Likely derived from a MAFFT sequence alignment followed by Gini importance analysis.
- Time Range: Not specified.
- Freshness: Last updated in March 2026 (recorded timestamp: 2026-03 17:32:54, day missing); freshness should be verified.
- Geography: Not specified.