Potential Energy Surface Data for Water and Phenol Clusters via Graph Neural Networks
by Siqi Chen · Updated 1mo ago
187.3 MB · 1 file
Available on 1 platform
Description
A 187.3 MB dataset supporting a transferable framework for predicting potential energy surfaces in hierarchically structured chemical systems. Siqi Chen developed the FB-GNN-MBE model, which integrates fragment-based graph neural networks with many-body expansion theory. The dataset includes benchmarks for water, phenol, and water-phenol mixture systems, on which two-body and three-body energy predictions reach chemical accuracy. The data was last updated in April 2026.
Use Cases
Train graph neural networks to predict two-body (2B) and three-body (3B) interaction energies from fragment-based molecular graphs.
Fine-tune pre-trained GNN models (student networks) on uniform-density water cluster data for transfer learning across system sizes.
Benchmark the accuracy of energy predictions against first-principles quantum mechanical models for water and phenol dimers.
Apply the many-body expansion (MBE) framework to decompose and predict total system energy from individual fragment energies and interactions.
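The many-body expansion use case above can be sketched in a few lines. This is a minimal illustration, not the FB-GNN-MBE implementation: the names `mbe_energy` and `toy_energy` are hypothetical, fragments are reduced to 1D positions, and `energy_fn` stands in for whatever evaluates a sub-cluster (a QM calculation for one-fragment energies, or a trained GNN for the 2B and 3B terms).

```python
from itertools import combinations


def mbe_energy(fragments, energy_fn, max_order=3):
    """Truncated many-body expansion:
    E ~ sum_i E_i + sum_{i<j} dE_ij + sum_{i<j<k} dE_ijk
    """
    cache = {}

    def e(idx):
        # Energy of the sub-cluster made of the fragments listed in idx.
        if idx not in cache:
            cache[idx] = energy_fn([fragments[i] for i in idx])
        return cache[idx]

    n = len(fragments)
    total = sum(e((i,)) for i in range(n))  # one-body terms

    two_body = {}
    if max_order >= 2:
        for i, j in combinations(range(n), 2):
            # 2B interaction: dimer energy minus its monomer energies.
            two_body[(i, j)] = e((i, j)) - e((i,)) - e((j,))
        total += sum(two_body.values())

    if max_order >= 3:
        for i, j, k in combinations(range(n), 3):
            # 3B interaction: trimer energy minus all lower-order terms.
            total += (
                e((i, j, k))
                - two_body[(i, j)] - two_body[(i, k)] - two_body[(j, k)]
                - e((i,)) - e((j,)) - e((k,))
            )
    return total


def toy_energy(frags):
    """Hypothetical energy model with 1-, 2-, and 3-body terms only."""
    energy = -1.0 * len(frags)
    energy += sum(-1.0 / abs(a - b) for a, b in combinations(frags, 2))
    energy += sum(0.01 * (a + b + c) for a, b, c in combinations(frags, 3))
    return energy


positions = [0.0, 1.0, 2.5, 4.0]  # illustrative values, not from the dataset
e_full = toy_energy(positions)
e_mbe2 = mbe_energy(positions, toy_energy, max_order=2)
e_mbe3 = mbe_energy(positions, toy_energy, max_order=3)
```

Because the toy model has no four-body term, the expansion truncated at third order reproduces the full energy, while the second-order truncation misses the three-body contribution.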
Strengths
The 187.3 MB archive suggests a substantial body of computed data for model training and validation.
Validated for chemical accuracy on benchmark systems including water, phenol, and their mixtures.
Includes a demonstrated transfer learning protocol between mixed-density and uniform-density water cluster ensembles.
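A check for chemical accuracy might look like the sketch below. It assumes the conventional threshold of 1 kcal/mol on the mean absolute error; the listing does not state which threshold or error metric the authors used, and the function names and numbers here are purely illustrative.

```python
# Conventional threshold; assumed here, not stated in the dataset description.
CHEMICAL_ACCURACY_KCAL_MOL = 1.0


def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equal-length energy lists."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one energy is required")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def within_chemical_accuracy(predicted, reference,
                             threshold=CHEMICAL_ACCURACY_KCAL_MOL):
    """True if the MAE (in kcal/mol) does not exceed the threshold."""
    return mean_absolute_error(predicted, reference) <= threshold


# Made-up 2B interaction energies in kcal/mol, not taken from the dataset.
reference_2b = [-5.0, -5.0, -0.5]
predicted_2b = [-4.9, -5.2, -0.4]
mae_2b = mean_absolute_error(predicted_2b, reference_2b)
passes = within_chemical_accuracy(predicted_2b, reference_2b)
```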
Limitations
The row count and the number of unique molecular configurations are not reported.
Data scope is limited to the studied systems (water, phenol, mixtures) and may not generalize to other molecules without adaptation.
Relies on the accuracy of the underlying first-principles QM models used for one-fragment energy evaluation.
Provenance
Source
Siqi Chen via figshare.
Collection Method
Generated using a fragment-based graph neural network (FB-GNN) integrated with many-body expansion (MBE) theory, with the network trained on reference energies from quantum mechanical calculations.
Freshness
Last updated in April 2026.
Data is packaged in a ZIP archive; specific internal file formats and structures are not detailed. License is CC-BY-4.0.
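Since the internal layout of the archive is not documented, a first step after downloading could be to list what it contains. The sketch below uses only the Python standard library; the function name `summarize_archive` and the file name in the usage note are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(path_or_file):
    """Map each file extension in a ZIP archive to
    (number of files, total uncompressed size in bytes)."""
    counts = Counter()
    sizes = Counter()
    with zipfile.ZipFile(path_or_file) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries
            ext = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += info.file_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Calling `summarize_archive("dataset.zip")` on the downloaded file (the actual file name may differ) would show which formats are present before any of them are extracted.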