Life Cycle Thinking in Early Drug Discovery: Reducing Environmental Impact and Animal Test
by Jianing Xu · Updated 3d ago
139.7 KB · 1 file
Available on 1 platform
Description
A dataset of 1150 compounds used to develop and validate machine learning models for predicting acute oral toxicity across six drug scaffolds. The models, including QSAR, q-RASAR, and deep learning methods, were applied to virtually screen over 23,000 untested molecules. The work was authored by Jianing Xu and last updated on June 2, 2026.
Use Cases
Training interpretable QSAR models based on quantitative structure–activity relationships described in the study.
Benchmarking deep learning algorithms for global toxicity prediction across multiple drug scaffolds.
Prioritizing low-toxicity drug leads based on predicted acute oral toxicity values.
Analyzing structure–toxicity relationships to guide the de novo design of safer chemicals.
Strengths
Dataset contains 1150 compounds, providing a foundation for model development.
Reported external validation R² values range from 0.7674 to 0.8980, suggesting good predictive performance on compounds not used for training.
The computational workflow was applied to screen over 23,000 untested molecules, demonstrating scalability.
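The listing does not say which definition of R² the study used; QSAR work uses several variants. As a minimal sketch, assuming the ordinary coefficient of determination computed on a held-out set, an external validation score could be checked as follows. The function name and the numbers are illustrative and do not come from the dataset.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the plain R² definition; the study may have
    used a different external-validation variant.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R² is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical toxicity values (illustrative only, not from the dataset).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
score = r_squared(observed, predicted)
print(f"external R² = {score:.4f}")
```

Recomputing the score this way on the downloaded external set is one way to confirm that a reimplemented model falls in the reported range.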
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count of the downloadable file is not stated; the 1150-compound figure in the description should be checked against the file before assessing suitability.
The dataset is small in scale at 139.7 KB, indicating limited raw data volume.
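Because column names and row count are undocumented, a first step after download is to read them from the file itself. The sketch below assumes the file is a CSV with a header row, which the listing does not confirm; the column names in the sample are hypothetical placeholders.

```python
import csv
import io


def summarize_csv(handle):
    """Return (column_names, data_row_count) for a CSV with a header row."""
    reader = csv.reader(handle)
    header = next(reader, [])
    # Count non-empty data rows after the header.
    row_count = sum(1 for row in reader if row)
    return header, row_count


# Stand-in for the downloaded file; the columns are hypothetical.
sample = io.StringIO("smiles,toxicity\nCCO,1.2\nc1ccccc1,2.3\n")
header, row_count = summarize_csv(sample)
print(header, row_count)
```

For the real file, pass `open(path, newline="")` instead of the `io.StringIO` sample and compare the row count against the 1150 compounds cited in the description.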
Provenance
Source
figshare
Collection Method
Likely compiled for research on computational toxicity prediction in drug discovery.
Freshness
Last updated 2026-06-02 12:15:39; freshness should be verified.
License
CC-BY-NC-4.0, which prohibits commercial use.