DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Mathematics & Statistics Datasets | DataSalon

📐

Mathematics & Statistics

Mathematical datasets, statistical benchmarks, probability, optimization, operations research

3,080 datasets

EP102: METTL3 Inhibitor Compound Data for Solid Tumors and AML

A dataset discloses lead optimization results for a series of METTL3 inhibitors, culminating in the discovery of compound EP102. The data, published by Guillaume Dutheuil on figshare in April 2026, includes information on structural modifications that improved oral bioavailability and decreased lipophilicity. EP102 has demonstrated efficacy in mouse tumor models and has entered clinical development for advanced solid tumors.

Tags: Tabular, CSV, Compound Optimization, Clinical Development, METTL3 Inhibitors, Oncology, Healthcare, Drug Discovery (+1 more)

Kisło et al 2026: Raw Data for Statistical Analysis

Raw data used for statistical analysis in the research by Kisło et al., published in 2026. The dataset is 38.5 KB in size and is available in CSV format under a CC-BY-4.0 license. It was last updated on April 28, 2026.

Tags: Tabular, CSV, Statistical Analysis, Raw Data, Research Data (+1 more)

Breakeven Complexity: Simulated PDE Trajectories for Neural Solver Benchmarking

20,000 training and 1,000 test trajectories for each of three partial differential equations (Navier-Stokes, Kuramoto-Sivashinsky, Gray-Scott) simulated via Exponax. The dataset was created by author 'yijingz' for the paper 'Breakeven complexity: A new perspective on neural partial differential equation solvers' and was last updated on Hugging Face in May 2026.

Tags: Time Series, Scientific Computing, Neural Solvers, Physics Simulation, Synthetic, Partial Differential Equations (+1 more)

Ontario Vital Statistics Registration Requests By Year

Ontario's Office of the Registrar General provides annual counts of completed proof of registration requests from 2010 onward. Data includes requests for certificates like birth or marriage records, covering both residents and non-residents. The Government of Ontario releases this information under the Vital Statistics Act.

Tags: Tabular, Time Series, Government Registry, Vital Statistics, Demographics, Public Services (+1 more)

SRA-Bench: A Benchmark for Skill-Retrieval-Augmented LLM Agents

SRA-Bench is a benchmark dataset for evaluating skill-retrieval-augmented large language model agents, created by WeihangSu and last updated on April 22, 2026. It contains 5,400 test instances and a skill library of 26,262 skills, of which 636 are gold skills and 25,626 are web-collected distractors. The dataset includes sub-benchmarks like TheoremQA and LogicBench for specific reasoning tasks.

Tags: Text, LLM Benchmark, Agent Testing, Benchmark, Skill Retrieval, Reasoning Evaluation (+1 more)

MSMARCO Annotation: Utility Labels for Training Dense Retrievers

Wanglanhuajiaofen annotated the MS MARCO dataset for utility using the Qwen3-32B model. The dataset supports research on multi-positive optimization objectives for dense retrieval. It was last updated on 2026-05-19.

Tags: Text, Dense Retrieval, Qwen, Natural Language Processing, MS MARCO, Information Retrieval (+1 more)

BRADSHAW: Automated Molecular Design for ERAP1 Inhibitor Optimization

Robert P. Law published a dataset on figshare in 2026 detailing a pilot project for automated molecular design. The 5.8 KB CSV file documents the application of the BRADSHAW platform to optimize inhibitors of Endoplasmic Reticulum Aminopeptidase 1 (ERAP1), a target in cancer immunotherapy and autoimmune diseases. The work involved four iterations of generative design, machine learning model refinement, and multiparameter optimization.

Tags: Tabular, CSV, Machine Learning, ERAP1, Medicinal Chemistry, Drug Discovery, Molecular Design (+1 more)

Summary of Treatment Sessions with Statistical Comparisons

A dataset summarizing treatment sessions, likely for medical or behavioral interventions. It was authored by Yucheng Li and last updated on May 8, 2026. The data is stored in a RAR archive with a size of 247.7 KB.

Tags: Tabular, Treatment Sessions, Medical Research, Statistical Comparisons (+1 more)

Statistical Significance Analysis Comparing a Novel Approach to State-of-the-Art Methods

A 5.5 KB Excel file containing p-values for statistical significance analysis. The data compares a specific approach against state-of-the-art methods across three distinct experimental scenarios. Authored by Mohammadamin Moragheb and last updated on 2026-05-08, it is shared under a CC-BY-4.0 license.

Tags: Tabular, Excel, Methodology, Statistical Significance, P Values (+1 more)

Statistical Parameters and Model Adequacy for Factorial Design

A 5.5 KB Excel file from figshare containing statistical parameters and model adequacy metrics for a factorial design. Author Nilima Thombre last updated the dataset on 2026-05-08. The data likely supports analysis of experimental design outcomes.

Tags: Tabular, Excel, Model Adequacy, Factorial design, Statistical Parameters (+1 more)

Pyrimidoindole Derivatives with Activity Against Drug-Resistant Bacteria

Compound 65, a pyrimido[4,5-b]indole derivative, demonstrated potent broad-spectrum activity against multidrug-resistant Gram-negative bacteria without detectable hERG liability. The dataset contains results from a structure-based design campaign to overcome pharmacokinetic limitations of earlier compounds. Yuzhi Liu published this data on figshare in April 2026.

Tags: Tabular, Antibacterial Compounds, Structure Activity Relationship, Medicinal Chemistry, Benchmark, Healthcare, Drug Discovery (+1 more)

Pyrimidoindole Derivative Structures for Antibacterial Drug Discovery

Published in 2026 by Yuzhi Liu, this dataset contains 3D molecular structure files for a series of pyrimido[4,5-b]indole derivatives designed to combat multidrug-resistant Gram-negative bacteria. It includes the structure of the lead compound, designated 65, which demonstrated potent antibacterial activity and improved pharmacokinetic properties.

Tags: Graph, Antibacterial Compounds, Benchmark, Healthcare, Structural Biology, Drug Discovery, Computational Chemistry (+1 more)

Optimal TCSC Allocation for Congestion Management in IEEE and Indian Power Networks

A research paper proposes a method for managing transmission line congestion in electricity markets. The method uses line flow sensitivity factors and particle swarm optimization to find the optimal location and parameter setting for a Thyristor Controlled Series Capacitor (TCSC). The proposed method is tested on the IEEE 30-bus system, IEEE 118-bus system, and a 33-bus Indian network.

Tags: Tabular, Power Systems, FACTS Devices, Congestion Management, IEEE Test Cases, Optimization (+1 more)

Zika Virus Polymerase Assay Data with 204 Natural Product Screenings

204 compounds from the Natural Product Set IV of the National Cancer Institute Developmental Therapeutics Program were screened for activity against the Zika virus RNA-dependent RNA polymerase. The dataset, authored by Vanessa Aitken and last updated in April 2026, contains results from an optimized malachite green colorimetric assay. It identifies two preliminary modulator compounds, purpurogallin and digallic acid.

Tags: Tabular, Enzyme Assay, Drug Screening, Natural products, Zika Virus, Virology, Healthcare (+1 more)

Fraud Detection Sensitivity in Simulated HbA1c Trial Data by Sample Size

A simulation study by Philippe P. Hujoel, last updated in April 2026, examines the sensitivity of a 3-sigma lnCVR statistical test for detecting simulated fraud in clinical trial data. The 5.5 KB Excel file contains results where the worst 50–90% of HbA1c scores in an intervention arm were replaced by a best-responder value. The analysis varies the sample size per trial arm to assess the test's detection power.

Tags: Tabular, Excel, Statistical Power, Simulation Study, Fraud Detection, Clinical Trials, Sample Size, Synthetic (+1 more)

Statistical Power for Detecting HbA1c Fraud in Clinical Trials

Philippe P. Hujoel authored a dataset analyzing the sensitivity of a statistical method for detecting simulated fraud in clinical trial data. The dataset, last updated on April 15, 2026, is a 5.5 KB Excel file. It models scenarios where 50% to 90% of the worst HbA1c scores in an intervention arm are replaced by a best-responder value.

Tags: Tabular, Excel, Statistical Power, HbA1c, Fraud Detection, Clinical Trials, Sample Size, Synthetic (+1 more)

Broad Sound Estuarine Sediment Geochemistry with Statistical Analysis

Analysis of estuarine sediments from Broad Sound, Queensland, applying Q-mode and R-mode factor analysis, discriminant analysis, and regression. The study identifies geochemical processes controlling concentrations of P2O5, Cu, Pb, and Zn in intertidal and supratidal zones. Research was published by the Australian Ocean Data Network, with a platform record last updated in April 2026.

Tags: Tabular, Audio, Earth sciences, Journal, Marine Science, Queensland, Marine, GA Publication, Estuarine Sediments, Published External, QLD, Geochemistry (+1 more)

Ontario Greenhouse Industry Statistics on Sales and Input Costs

Statistical data on the greenhouse industry for Ontario and Canada, last updated in April 2026. The dataset includes metrics such as square footage, sales, employee numbers, and selected input costs. It is published by the Government of Ontario.

Tags: Tabular, Excel, Ontario, Agriculture, Economic Statistics, Greenhouse Industry (+1 more)

Statistical Significance Analysis with Paired T-Test P-Values

Statistical significance analysis results from paired t-tests, with p-values indicating significant improvement where values are less than 0.05. The dataset is a 5.5 KB XLS file authored by Jianye Gu and last updated on 2026-05-07. It is shared under a CC-BY-4.0 license on the figshare platform.

Tags: Tabular, Excel, Significance Testing, P Values, Statistical Analysis, Paired T Test (+1 more)

Cohen's Kappa Statistical Analysis for All Models on Test Set

Ummer Shakeel published a statistical analysis of model performance on a test set in May 2026. The dataset contains Cohen's Kappa scores for multiple models, stored in a 5.5 KB Excel file. It is licensed for reuse under CC-BY-4.0.

Tags: Tabular, Excel, Cohen's Kappa, Statistical Analysis, Model Evaluation, Test Set (+1 more)
