Data Sheet 7_Integrative proteome-wide structural analysis and high-throughput docking ide
by Anderson Pereira Soares · Updated 1mo ago
18.6 MB · 1 file
Available on 1 platform
Description
Integrative proteome-wide virtual screening offers a powerful route to discover broad-spectrum antivirals against emerging flaviviruses. This dataset contains homology models of structural and nonstructural proteins from Zika, Yellow Fever, West Nile, Saint Louis Encephalitis, and Usutu viruses, along with docking results for a library of 160 natural product scaffolds and repurposed antivirals. It was authored by Anderson Pereira Soares and last updated on 2026-04-30.
Use Cases
Prioritizing lead compounds for experimental validation based on predicted binding energies and multitarget profiles.
Analyzing conserved binding sites across flavivirus proteins to guide panflaviviral therapeutic design.
Training machine learning models for virtual screening using features from the described structural and electrostatic analyses.
Comparing the docking performance of different natural product scaffolds against specific viral targets like the NS5 polymerase.
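The first use case, prioritizing compounds by predicted binding energy and multitarget profile, can be sketched as follows. This is a minimal illustration, not the authors' ranking method: the compound names, target labels, energy values, and the -7.0 kcal/mol threshold are all hypothetical placeholders, and the real column layout must be checked after download.

```python
from statistics import mean

# Hypothetical docking scores in kcal/mol (more negative = stronger predicted binding).
# These are illustrative placeholders, NOT values taken from the dataset.
scores = {
    "compound_A": {"NS3": -8.1, "NS5": -9.0, "E": -7.2},
    "compound_B": {"NS3": -6.5, "NS5": -9.4, "E": -5.9},
    "compound_C": {"NS3": -7.9, "NS5": -7.8, "E": -7.7},
}


def multitarget_profile(per_target, threshold=-7.0):
    """Summarize one compound: mean energy and number of targets at or below threshold."""
    energies = list(per_target.values())
    hits = sum(1 for energy in energies if energy <= threshold)
    return {"mean_energy": mean(energies), "targets_hit": hits}


def rank_compounds(all_scores, threshold=-7.0):
    """Rank by number of targets hit (descending), then by mean energy (ascending)."""
    profiles = {
        name: multitarget_profile(per_target, threshold)
        for name, per_target in all_scores.items()
    }
    return sorted(
        profiles.items(),
        key=lambda item: (-item[1]["targets_hit"], item[1]["mean_energy"]),
    )


ranking = rank_compounds(scores)
for name, profile in ranking:
    print(name, profile["targets_hit"], round(profile["mean_energy"], 2))
```

Sorting on the number of targets hit before the mean energy favors broad-spectrum candidates over compounds that bind a single target very strongly. Swapping the two sort keys would give the opposite emphasis.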
Strengths
The dataset is the product of a standardized pipeline combining sequence and structure-based pocket prediction, electrostatic profiling, and pharmacokinetic filtering.
Docking was performed exhaustively with 2,000 runs per pocket using AutoDock4/Vina, followed by clustering and ranking.
Comparative analyses (RMSD, PCA, RMSF) were used to confirm conserved core folds and guide grid definition.
The work identified 40 top-ranked scaffolds and reports estimated binding affinities for specific lead compounds such as myricetin and temoporfin.
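When comparing reported affinities, it can help to convert between a binding free energy and an estimated inhibition constant using the standard relation ΔG = RT ln(Ki). The sketch below assumes energies in kcal/mol and a temperature of 298.15 K. Whether this dataset reports energies, Ki values, or both is not documented, and the function names are illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.98719e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15          # assumed temperature; adjust if the source states otherwise


def energy_to_ki(delta_g_kcal, temperature=TEMPERATURE_K):
    """Estimated inhibition constant (mol/L) from a binding free energy (kcal/mol)."""
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature))


def ki_to_energy(ki_molar, temperature=TEMPERATURE_K):
    """Binding free energy (kcal/mol) from an inhibition constant (mol/L)."""
    return GAS_CONSTANT_KCAL * temperature * math.log(ki_molar)


# Example with a hypothetical energy of -8.0 kcal/mol (not a value from the dataset)
example_ki = energy_to_ki(-8.0)
print(f"Estimated Ki: {example_ki * 1e6:.2f} micromolar")
```

At this temperature, each additional -1.36 kcal/mol or so corresponds to roughly a tenfold tighter estimated Ki, so small differences in docking energy should be interpreted with caution.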
Limitations
At 18.6 MB, the archive is small for a proteome-wide study, which may limit the depth of structural data included.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment for large-scale analyses.
Provenance
Source
figshare
Collection Method
Data was generated through homology modeling of viral proteins and high-throughput molecular docking of a focused compound library.
Freshness
Last updated 2026-04-30 05:24:55; freshness should be verified.
The primary file format is a ZIP archive; contents and specific data formats require inspection after download. License is CC-BY-4.0.
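Since the archive contents are undocumented, a first step after download is to list the members and tally file types before extracting anything. The sketch below uses Python's standard `zipfile` module. The member names in the demo are made up, and the real archive's file name and layout are unknown.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(source):
    """Return (entries, extension_counts) for a ZIP given a path or file-like object."""
    with zipfile.ZipFile(source) as archive:
        entries = [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]
    extension_counts = Counter(
        PurePosixPath(name).suffix.lower() or "(none)" for name, _ in entries
    )
    return entries, extension_counts


# Demo on a tiny in-memory archive with hypothetical member names.
# For the real download, pass the path of the ZIP file instead of the buffer.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("models/example_model.pdb", "HEADER placeholder\n")
    demo.writestr("docking/example_scores.csv", "compound,energy\n")
buffer.seek(0)

entries, extension_counts = summarize_archive(buffer)
for name, size in entries:
    print(f"{size:>8d}  {name}")
print(dict(extension_counts))
```

The extension tally gives a quick indication of whether the archive holds structure files, tabular docking results, or both, which helps address the missing column-level documentation and unknown row count.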