General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
165,283 datasets
S1 File contains data for the study 'Association between pre-competition strength and sprint canoe/kayak performance: A mixed-effects analysis of professional Chinese athletes'. The dataset is 33.3 KB in size, stored as an XLSX file, and was authored by Zongwei Chen. It was last updated on 2026-05-28.
Property-related charge information by period, sourced from data.cityofnewyork.us. The dataset includes columns such as TAXYEAR, SUM_BAL, VALCLASS, and PARID, but a technical issue means 2023 data is missing and cannot be recreated from this snapshot. It was last updated on 2026-05-21.
Part16-part19 of the WildGUI dataset contain screenshot images, extending the main release at xwm/WildGUI. The dataset was introduced by Video2GUI and is hosted by author joker-112. The repository was last updated on 2026-06-14.
Replication data and code for a study analyzing the impact of natural disasters on corporate performance in China. The dataset, approximately 864 MB in size, likely contains firm-level financial and operational metrics linked to disaster events. It supports research into how environmental shocks affect business outcomes such as profitability and innovation.
As of 30 June 2019, almost 10 times as many light commercial vehicles as heavy freight vehicles were on the road in Queensland. The number of registered light commercial vehicles more than doubled between 30 June 2001 and 30 June 2019, while the number of heavy freight vehicles increased by 49% over the same period. This dataset is provided by the Queensland Department of Environment, Tourism, Science and Innovation.
Australia's Southeast Marine Region dataset from the Australian Ocean Data Network provides 3D images and descriptive text about the marine environment. The dataset was last updated on 2026-06-17. It is available in HTML and PDF formats.
Supplementary material 4 from a study on decadal seafloor geodesy along the Nankai Trough. The dataset contains the average of standard deviations for coefficients used in estimating slip deficit rates for two directions, labeled "02" and "03". It was authored by Yusuke Yokota and is shared under a CC-BY-4.0 license.
Australian Ocean Data Network provides a record of gravity and magnetic data sources covering the remote offshore Capel and Faust basins on the Lord Howe Rise. The documentation describes the processes applied to level the collected geophysical data. This dataset was last updated on 2026-06-17.
57% of 112 surveyed German healthcare professionals treating cardiology patients reported using telemedicine. This dataset contains predictors of telemedicine use identified via Bayesian Model Averaging and an XGBoost model achieving 0.88 AUROC, created by Pascal Petit and last updated in April 2026. It likely includes variables related to professional role, knowledge, attitudes, and demographics.
112 healthcare professionals from a German cross-sectional survey provide data on telemedicine use determinants. The dataset contains the performance metrics and predictor importance results from a final XGBoost model developed by Pascal Petit, last updated in April 2026. The model achieved an AUROC of 0.88 and 79% accuracy in predicting telemedicine adoption.
57% of 112 surveyed German healthcare professionals reported using telemedicine. This 5.5 KB Excel file contains the performance metrics and predictor importance analysis from an XGBoost model predicting telemedicine adoption, authored by Pascal Petit and last updated in April 2026. The model achieved an AUROC of 0.88 and 79% accuracy using nested cross-validation.
Property tax records for Guadalajara de Buga, Colombia, spanning over six decades from 1960. The dataset includes detailed information on land and property taxes paid by owners. It is hosted by datos.gov.co and was last updated in May 2026.
A 5.5 KB Excel dataset presents results from a unified framework for evaluating machine learning-based Intrusion Detection Systems (IDS). The framework harmonizes features from the NSL-KDD and CICIDS2017 datasets and benchmarks models including Random Forest, which achieved 98.0% accuracy and 97.0% F1-score. Authored by Shailendra Mishra and last updated on April 20, 2026, this work focuses on reproducibility and statistical validation in cybersecurity research.
Shailendra Mishra's evaluation metrics reporting summary, published on figshare in April 2026. The 5.5 KB XLS file contains results from a unified framework for evaluating Intrusion Detection Systems (IDS). The framework harmonized features from the NSL-KDD and CICIDS2017 datasets and benchmarked supervised, unsupervised, deep learning, and ensemble models.
5.5 KB of statistical test results from a framework evaluating machine learning models for network intrusion detection. The dataset, authored by Shailendra Mishra and last updated in April 2026, contains results from Wilcoxon signed-rank, McNemar's, and DeLong tests applied to models like Random Forest on harmonized NSL-KDD and CICIDS2017 datasets.
Shailendra Mishra's framework harmonizes features from the NSL-KDD and CICIDS2017 network intrusion datasets for evaluating machine learning models. The dataset, last updated in April 2026, is a 5.5 KB Excel file containing the harmonized data used in the study. Experimental results from the framework demonstrated a Random Forest model achieving 98.0% accuracy and 97.0% F1-score on this data.
A 5.5 KB dataset from figshare, last updated on 2026-04-20, containing results from an ablation study on machine learning models for intrusion detection. The work by Shailendra Mishra proposes a unified framework, harmonizing the NSL-KDD and CICIDS2017 datasets and benchmarking models including Random Forest, which achieved 98.0% accuracy and 97.0% F1-score.
A 5.5 KB Excel file containing harmonized features from two network intrusion datasets, NSL-KDD and CICIDS2017, for evaluating machine learning models. The dataset was created by Shailendra Mishra and last updated on April 20, 2026. It supports a framework for reproducible and statistically validated benchmarking of Intrusion Detection Systems.
Cross-validation results from a framework evaluating machine learning models for network intrusion detection. The dataset contains performance metrics from models like Random Forest, which achieved 98.0% accuracy and 97.0% F1-score on harmonized data. The work by Shailendra Mishra was last updated in April 2026.
A 5.5 KB Excel dataset created by Shailendra Mishra and last updated on April 20, 2026. It contains harmonized features from the NSL-KDD and CICIDS2017 network intrusion datasets, processed through a unified framework for evaluating machine learning-based Intrusion Detection Systems (IDS). The work includes results from benchmarking supervised, unsupervised, deep learning, and ensemble models.