QSAR-Biodeg: Molecular Properties for Biodegradability Prediction

Name: QSAR-Biodeg: Molecular Properties for Biodegradability Prediction
Creator: Eddie Bergman
License: us-pd
Keywords: QSAR, Tabular, Biodegradation, Environmental Chemistry, Synthetic, Molecular Properties

Format: arff

Description

A subsampled dataset for quantitative structure-activity relationship (QSAR) modeling of chemical biodegradability. Eddie Bergman created it from the original qsar-biodeg dataset on OpenML using a seeded random sampling procedure. The subsampling parameters are a seed of 1 and upper limits of 2000 rows, 100 columns, and 10 classes, with stratified sampling applied. These limits are caps, not the actual dimensions of the file; the actual row, column, and class counts are not stated in this description.
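
The row cap with stratification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the creator's actual procedure: the function name `stratified_subsample`, the proportional allocation rule, and the toy labels are all hypothetical. The column and class caps would be applied in an analogous way.

```python
import random
from collections import defaultdict

def stratified_subsample(labels, max_rows, seed=1):
    """Return sorted row indices, keeping class shares roughly intact.

    Assumes max_rows is at least the number of classes; with many tiny
    classes the one-row minimum can push the total slightly above max_rows.
    """
    n = len(labels)
    if n <= max_rows:
        return list(range(n))  # the cap does not bind, keep every row
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    chosen = []
    for label in sorted(by_class):
        idx = by_class[label]
        # Proportional allocation, rounded down, at least one row per class.
        k = max(1, (len(idx) * max_rows) // n)
        chosen.extend(rng.sample(idx, k))
    return sorted(chosen)

# Toy labels (hypothetical): 30% class "a", 70% class "b".
toy_labels = ["a"] * 300 + ["b"] * 700
toy_indices = stratified_subsample(toy_labels, max_rows=100, seed=1)
```

When the source has fewer rows than the cap, the function returns every row unchanged, which is why a 2000-row limit says nothing about the actual size of the file.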

Use Cases

Predicting chemical biodegradability based on molecular property features.
Training classification models to categorize chemicals into biodegradability classes (the subsampling cap allows at most 10).
Benchmarking feature selection algorithms on a dataset with up to 100 columns of molecular descriptors.
Developing QSAR models for environmental risk assessment of new compounds.
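
As a rough illustration of the classification use case, the sketch below fits a nearest-centroid baseline in plain Python. The technique is chosen here only for brevity and is not prescribed by the dataset; the descriptor values, the class names `degradable` and `persistent`, and the function names are hypothetical.

```python
def fit_centroids(X, y):
    """Compute the per-class mean of each feature column."""
    sums = {}
    counts = {}
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        for j, value in enumerate(row):
            sums[label][j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, row):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Toy data (hypothetical): two descriptors, two classes.
X = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
y = ["degradable", "degradable", "persistent", "persistent"]
centroids = fit_centroids(X, y)
prediction = predict(centroids, [1.1, 0.9])
```

A baseline like this gives a reference point before trying heavier models; descriptors on very different scales would need standardizing first, which the sketch omits.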

Strengths

Subsampling was performed with a fixed random seed (1), ensuring reproducibility.
Stratified sampling was used, which should approximately preserve the class distribution of the source dataset.
The dataset is derived from a known QSAR benchmark (qsar-biodeg) on OpenML.
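
Whether stratification actually preserved the class distribution can be checked once both label columns are available. The following is a minimal sketch; the function names and toy labels are hypothetical, and it assumes both label lists are non-empty.

```python
from collections import Counter

def class_proportions(labels):
    """Map each class label to its share of the rows."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

def max_proportion_shift(source_labels, sample_labels):
    """Largest absolute change in any class share between source and sample."""
    src = class_proportions(source_labels)
    sub = class_proportions(sample_labels)
    classes = set(src) | set(sub)
    return max(abs(src.get(c, 0.0) - sub.get(c, 0.0)) for c in classes)

# Toy labels (hypothetical): a 2:1 class ratio in both source and sample.
source = ["a"] * 600 + ["b"] * 300
sample = ["a"] * 200 + ["b"] * 100
shift = max_proportion_shift(source, sample)
```

A shift close to zero indicates the sample mirrors the source; a class missing from the sample shows up as a shift equal to its full source share.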

Limitations

Row count is not stated (only the 2000-row cap is known), so suitability for large-scale modeling cannot be assessed from the metadata alone.
Column-level documentation is absent; field semantics must be inferred after download.
Description metadata is limited; actual data quality requires manual inspection after download.
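
Because row count and column documentation are missing, a first inspection after download might look like the sketch below. It reads only the header and counts data rows of a dense ARFF file held in a string. It is deliberately simplified and assumes unquoted attribute names without spaces; the column names in the toy file are hypothetical, and a full parser such as `scipy.io.arff.loadarff` would be the more robust choice for real files.

```python
def summarize_arff(text):
    """Return (attributes, n_rows) for a dense ARFF document given as a string."""
    attributes = []  # (name, type) pairs in declaration order
    n_rows = 0
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            n_rows += 1  # every remaining non-comment line is one instance
            continue
        lower = line.lower()
        if lower.startswith("@attribute"):
            # "@attribute <name> <type>"; quoted names are not handled here.
            _, name, type_ = line.split(None, 2)
            attributes.append((name, type_))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, n_rows

# Toy file with hypothetical column names.
example = """\
% toy file
@relation toy
@attribute V1 numeric
@attribute V2 numeric
@attribute Class {1,2}
@data
3.9,2.6,2
4.2,3.1,1
"""
attrs, rows = summarize_arff(example)
```

The attribute list shows how many descriptor columns survived subsampling and which column holds the class, while the row count fills in the figure the metadata omits.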

Provenance

Source: OpenML dataset qsar-biodeg (ID 1494), subsampled by Eddie Bergman.
Collection Method: Algorithmic subsampling with random column and row selection, preserving class stratification.

License is listed as us-pd (public domain in the United States).

Quality Score

Overall: C45
Description: 58
Source: 43
Reputation: 18
Access: 52

Dataset Info

License: us-pd
Author: Eddie Bergman
Last synced: Apr 28, 2026
