ALCHIMIA: Generated Molecules for CB2R and S1R Receptor Targets
by Domenico Alberga · Updated 1mo ago
2.7 MB · 1 file
Available on 1 platform
Description
ALCHIMIA, a hybrid reinforcement learning and genetic algorithm framework, generated molecules for two pharmacologically relevant targets: human cannabinoid receptor 2 (CB2R) and human sigma non-opioid intracellular receptor 1 (S1R). The dataset, created by Domenico Alberga and shared on figshare, contains results from three design scenarios: unconstrained hit identification, scaffold-constrained lead optimization, and the design of dual modulators. It was last updated on April 16, 2026.
Use Cases
Benchmarking generative molecular design algorithms based on synthetic accessibility and drug-likeness scores.
Training policy networks for reinforcement learning in chemistry based on sequences of medicinal chemistry transformations.
Optimizing multi-objective molecular properties like binding affinity and QED scores using a genetic algorithm framework.
Studying chemical lineages and diversity in generated molecular populations for specific protein targets.
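As a rough illustration of the multi-objective use case, the sketch below filters a set of candidates down to its Pareto front over two objectives. The `Candidate` type, its field names, and the demo values are hypothetical and are not taken from the dataset; the sketch also assumes that higher is better for both objectives, which would need adjusting if the affinity column turns out to be, for example, a docking energy.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    # Hypothetical record layout; the dataset's real columns are undocumented.
    name: str
    affinity: float  # assumed "higher is better" (e.g. a predicted pKi)
    qed: float       # drug-likeness score in [0, 1], higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.affinity >= b.affinity and a.qed >= b.qed
    better = a.affinity > b.affinity or a.qed > b.qed
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates (O(n^2) scan)."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up values for illustration only.
demo = [
    Candidate("mol_a", 5.0, 0.40),
    Candidate("mol_b", 6.5, 0.44),
    Candidate("mol_c", 6.0, 0.60),
    Candidate("mol_d", 5.5, 0.30),
]
front = pareto_front(demo)
print([c.name for c in front])
```

The quadratic scan is adequate for a file of this size; a sort-based sweep would be the usual choice for much larger populations.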
Strengths
Framework is built on a vocabulary of 33 molecular transformations inspired by medicinal chemistry.
Molecules were generated for two specific protein targets (CB2R and S1R) across three distinct design scenarios.
Dataset is associated with a freely available GitHub repository, providing transparency and reproducibility.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment for large-scale model training.
Data may reflect methodological bias inherent to the ALCHIMIA framework's specific transformations and optimization goals.
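Because the column semantics and row count are undocumented, a first pass after download might simply list the header, count the rows, and check how often each column is filled. The sketch below assumes the file is a delimited text table with a header row, which the listing does not confirm; `summarize_table` is a hypothetical helper, and the in-memory sample stands in for the real file.

```python
import csv
import io
from typing import Dict, TextIO


def summarize_table(handle: TextIO, delimiter: str = ",") -> Dict[str, object]:
    """Return the header, data-row count, and non-empty cell count per column."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # empty list if the file has no lines
    n_rows = 0
    non_empty = [0] * len(header)
    for row in reader:
        n_rows += 1
        # Ignore any cells beyond the header width in ragged rows.
        for i, value in enumerate(row[: len(header)]):
            if value.strip():
                non_empty[i] += 1
    return {
        "columns": header,
        "n_rows": n_rows,
        "non_empty": dict(zip(header, non_empty)),
    }


# Stand-in content; the column names here are invented for the example.
sample = io.StringIO("smiles,score\nCCO,0.5\nCCN,\n")
summary = summarize_table(sample)
print(summary)
```

For the downloaded file, the same function can be called inside `with open(path, newline="") as fh:`, where `path` is wherever the file was saved; if the delimiter turns out to be a tab or semicolon, pass it through the `delimiter` argument.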
Provenance
Source
figshare, author Domenico Alberga.
Collection Method
Generated by the ALCHIMIA computational framework combining reinforcement learning and a genetic algorithm.
Freshness
Last updated 2026-04-16 07:29:45; freshness should be verified.
The license is CC-BY-NC-4.0, which prohibits commercial use. The dataset is small: a single file of under 3 MB.