A processed version of the PubChem-10M dataset, canonicalized using RDKit and split into training and validation sets. The dataset was created by user 'sagawa' and last updated on September 4, 2022. It contains molecular structures represented as canonical SMILES strings.
Processing the canonical SMILES strings requires RDKit or a similar cheminformatics library.
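The train/validation split mentioned in the description can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure the dataset author actually used: the function name `split_smiles`, the validation fraction, the seed, and the sample strings are all assumptions made for the example. It also assumes the SMILES strings have already been canonicalized (for example with RDKit), so that duplicates can be detected by plain string comparison.

```python
import random


def split_smiles(smiles, valid_fraction=0.1, seed=0):
    """Deduplicate SMILES strings and split them into (train, valid) lists.

    Hypothetical helper: the fraction and seed are illustrative defaults,
    not values documented for this dataset.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order;
    # blank entries are skipped.
    unique = list(dict.fromkeys(s.strip() for s in smiles if s.strip()))

    # A seeded generator makes the shuffle, and so the split, reproducible.
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_valid = int(len(unique) * valid_fraction)
    return unique[n_valid:], unique[:n_valid]


# Made-up sample records; "CCO" appears twice on purpose.
records = ["CCO", "c1ccccc1", "CC(=O)O", "CCO", "CN",
           "C1CCCCC1", "CCN", "O", "CCC", "CC", "N"]

train, valid = split_smiles(records, valid_fraction=0.2, seed=42)
print(len(train), len(valid))
```

Deduplicating before the split keeps the same molecule from landing in both sets, which only works reliably once every string is in canonical form.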