ChEMBL 36: Chemical and Bioactivity Representations
Available on 1 platform
Description
This dataset is drawn from release 36 of the ChEMBL database and likely contains molecular representations and associated bioactivity data for chemical compounds. It is hosted on Kaggle, but the metadata does not state its size, columns, or last update date. ChEMBL itself is a manually curated database of bioactive molecules.
Use Cases
Train a model to predict compound bioactivity from molecular fingerprints (inferred from domain, verify after download)
Perform similarity searches for lead compound optimization (inferred from domain, verify after download)
Build a knowledge graph linking chemical structures to biological targets (inferred from domain, verify after download)
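As an illustration of the similarity-search use case, the following is a minimal sketch of Tanimoto ranking over binary fingerprints. It assumes each fingerprint is available as a set of on-bit indices; the compound IDs, the toy fingerprints, and the helper names `tanimoto` and `rank_by_similarity` are hypothetical and do not come from the dataset, whose actual columns must be checked after download.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Set[int], library: Dict[str, Set[int]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k library entries most similar to the query."""
    scored = [(cid, tanimoto(query, fp)) for cid, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy fingerprints; real ones would be computed from the
# dataset's structure column (for example with a cheminformatics toolkit).
toy_library = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}
query_fp = {1, 4, 7}
hits = rank_by_similarity(query_fp, toy_library, top_k=2)
print(hits)
```

Representing fingerprints as sets keeps the sketch dependency-free; for large libraries, packed bit vectors would usually be faster.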
Strengths
Published on Kaggle, a major platform for data science.
Derived from ChEMBL, a well-known, manually curated database for bioactive molecules.
Limitations
Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and last update date are unknown, which makes it hard to assess suitability before download.
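One way to close these gaps after download is to profile the file before using it. The sketch below assumes the download includes at least one CSV file, which the metadata does not confirm; the column names in the sample and the helper name `summarize_csv` are hypothetical.

```python
import csv
import io
from typing import Dict, TextIO


def summarize_csv(handle: TextIO) -> Dict[str, object]:
    """Report column names, row count, and empty cells per column."""
    reader = csv.reader(handle)
    header = next(reader, [])
    row_count = 0
    empty_counts = [0] * len(header)
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        row_count += 1
        for i in range(len(header)):
            # A short row or a whitespace-only cell counts as empty.
            if i >= len(row) or row[i].strip() == "":
                empty_counts[i] += 1
    return {
        "columns": header,
        "rows": row_count,
        "empty_cells": dict(zip(header, empty_counts)),
    }


# Hypothetical in-memory sample standing in for a downloaded file.
sample = io.StringIO(
    "compound_id,smiles,activity\n"
    "X1,CCO,5.2\n"
    "X2,c1ccccc1,\n"
)
summary = summarize_csv(sample)
print(summary)
```

For a real file, the same function can be called on a handle opened with `open(path, newline="")`, where `path` points to whichever file the download actually contains.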
Provenance
Source
ChEMBL database
Collection Method
Likely an export or snapshot of the ChEMBL 36 release.
License
Unknown; users must verify the terms of use for the original ChEMBL data.