Organic/inorganic chemistry, analytical chemistry, electrochemistry, molecular properties, chemical reactions
2,032 datasets
A glossary containing definitions and explanatory notes for more than 450 terms used in multidisciplinary research related to modern theoretical concepts and computational methods. It was created by Vladimir I. Minkin of Southern Federal University to provide guidance on terminology in theoretical organic chemistry. The aim is to contribute to the elimination of inconsistencies and ambiguities in the meanings of terms in this area.
A cheminformatics dataset from the UCI Machine Learning Repository for predicting the bioconcentration factor (BCF) of chemical compounds. It contains molecular descriptors as features and a continuous logBCF target variable for regression tasks. The dataset was contributed by authors from the Milano Chemometrics and QSAR Research Group.
A dataset for predicting the bioconcentration factor (BCF) of chemical compounds, a measure of how strongly a chemical accumulates in living organisms. It was created by Grisoni, F., Consonni, V., Villa, S., Vighi, M., & Todeschini, R. and is sourced from the UCI Machine Learning Repository and the Milano Chemometrics and QSAR Research Group. The dataset contains molecular descriptors and a categorical target variable for classification.
The QSAR Bioconcentration Classes Dataset originates from the UCI Machine Learning Repository and the Milano Chemometrics and QSAR Research Group. Authors Grisoni, F., Consonni, V., Villa, S., Vighi, M., & Todeschini, R. compiled molecular descriptors to predict the bioconcentration factor of chemical compounds. The dataset is used for classification tasks to categorize compounds into bioconcentration classes.
A well-known cheminformatics dataset from the UCI Machine Learning Repository, originally created by Grisoni et al. of the Milano Chemometrics and QSAR Research Group. Its primary objective is to predict the log-transformed bioconcentration factor (logBCF) of chemical compounds, a key measure of environmental toxicity. The dataset contains molecular descriptors describing chemical structure and properties.
Mass spectrometry (MS) – SomaScan (SS) concordance data provides evidence for biomarker panels in a chronic progressive disease study. The 5.5 KB Excel file, authored by Blake Hummer and last updated in March 2026, is licensed for open use under CC-BY-4.0. Platform tags suggest the data relates to a pilot study identifying 68 proteins and staging biomarker profiles.
QSAR_fish_toxicity contains 6 molecular descriptor attributes for 908 chemicals, used to predict acute aquatic toxicity for the fish Pimephales promelas (fathead minnow). The dataset was developed for quantitative regression QSAR models, using LC50 (the concentration lethal to 50% of test fish over 96 hours) as the target feature. It is licensed under CC-BY-4.0 and hosted on OpenML.
Geoscience Australia Data published a monograph on ocean margin systems, last updated in March 2026. The publication examines the dynamics of benthic life and its influence on biogeochemical reactions and fluxes in transitional zones between oceans and continents. The dataset is tagged with Earth sciences, Marine, and Geochemistry subjects.
Experimental data from a study of roll-to-roll coating methods for producing iridium oxide catalyst layers used in proton exchange membrane water electrolyzers. The research compares two coating methods, slot die and gravure, and analyzes their impact on film microstructure, electrolyzer performance, and durability. Row and column counts are unknown.
Raúl Acosta-Murillo's dataset describes identified cancer targets with ChEMBL and UniProt identifiers. The data is stored in a 9.5 KB Excel file and was last updated on March 17, 2026. Platform tags suggest the dataset was used for predicting cancer bioactivities using various chemical representations and machine learning models.
A phytochemical dataset integrates USDA botanical records with PubMed citations, ClinicalTrials.gov study counts, ChEMBL bioactivity scores, and USPTO patent density. It is provided in production-ready JSON and Parquet formats by the author wirthal1990-tech. The dataset was last updated in March 2026.
Figshare hosts Supplementary Table S4 containing mass spectrometry results from a study on BCOR mutations in retinoblastoma. The dataset, shared under CC BY license by Michelle G. Zhang, is a 137.0 KB Excel file. It supports analysis of deregulated cell cycle and hypoxic adaptation pathways.
Quantitative mass spectrometry data (701.4 KB) assessing the effect of λGRTS targeting on EGFP translation efficiency in mammalian cells, along with its off-target effects. The dataset was created by Junzhe Liu and last updated in March 2026. It is provided in XLS format.
NOAA/WDS Paleoclimatology archives bulk and compound-specific nitrogen isotope data from the Eastern Pacific Ocean's Santa Barbara Basin. The dataset contains parameters for paleoceanography studies, with a time period measured in calendar years before present. It is maintained by the NOAA National Centers for Environmental Information under the World Data Service for Paleoclimatology.
Over 350,000 chemical records from the National Library of Medicine provide structure and nomenclature authority files. More than 80,000 records include chemical structures. The database is maintained by the NLM's SCIOPS organization.
A proficiency testing panel for molecular diagnostics, consisting of seven swabs. The dataset details test items for pathogens like Treponema pallidum and Haemophilus ducreyi, relevant to yaws eradication campaigns. It was authored by Claudia Mueller and last updated in March 2026.
Maritime Archaeology Ltd compared wreck data from the UK's National Monuments Record and the UK Hydrographic Office. The project, commissioned by English Heritage, aimed to identify and resolve discrepancies between these two official maritime heritage datasets. The data was aggregated by the Marine Environmental Data & Information Network and last updated in March 2026.
SCIOPS produced a dataset on the concentrations of phenolic metabolites in the Antarctic lichen Umbilicaria antarctica. Densitometric analysis (HPTLC) was used to measure usnic acid, atranorin, and gyrophoric acid levels in thalli of different ages. The data was last updated on February 20, 1989.
Data collected over a 30-year period in Antarctica on the accumulation of usnic acid in two lichen species: Neuropogon aurantiaco-ater and Ramalina terebrata. The dataset, sourced from NASA EarthData and last updated in 1996, examines the relationship between this UV-B absorbing compound and critical levels of ozone depletion.