General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
142,125 datasets
City Owned and Leased Properties (COLP) is a detailed inventory of all real estate owned or leased by New York City government agencies. The Department of City Planning (DCP) publishes this dataset as required by the City Charter, with data extracted from the Integrated Property Information System (IPIS) maintained by the Department of Citywide Administrative Services (DCAS). The current version was released on 2026-05-18.
NASA's SE-590 Reflectance Factors and Radiances Measured from a Helicopter Data Set provides intermediate-scale spectral characterization of land surfaces. Data were collected at 17 grid locations over 6 clear-sky days in 1989 during the FIFE IFC-5 campaign. The helicopter-borne SE-590 instrument was used to bridge measurements between ground radiometers and higher-altitude aircraft or satellite sensors.
The Licensees, TradingNames, and Dateexpiry columns suggest a daily-updated registry of occupational licenses issued by Access Canberra. The dataset likely contains records for agents, traders, and businesses in the Australian Capital Territory, tracking license issuance and expiration. Its presence on multiple government data platforms indicates that it serves as an official public record.
Bin Xie's dataset contains raw and supporting data for a study on discovering and engineering a serine protease to degrade polylactic acid (PLA). The 1.3 MB collection includes enzyme activity measurements, lactic acid release profiles, and depolymerization efficiency calculations. It was last updated on May 21, 2026, and supports the reproducibility of structure-based discovery and machine learning-aided protein engineering.
Fahim Nasir published a dataset on figshare on May 15, 2026, detailing a workflow for forecasting customer conversion in bank marketing. The dataset is 17.5 KB in size and is stored in an XLS file format. It was created to support a study proposing a two-stage quadrilateral evaluation framework for model selection under class imbalance.
A 5.5 KB Excel dataset supports a study proposing a two-stage quadrilateral evaluation framework for forecasting customer conversion in bank marketing. Authored by Fahim Nasir and last updated in May 2026, the data was used to test ensemble and deep learning models under five sampling strategies to address class imbalance. Results indicate the XGBoost model with Borderline2SMOTE sampling showed improved adaptability with reduced synthetic distortion.
Fahim Nasir created a dataset evaluating machine learning models on imbalanced class distributions for bank marketing customer conversion forecasting. The dataset, last updated on 2026-05-15, contains results from a two-stage evaluation framework testing ensemble and deep learning models with five sampling strategies. The framework assesses models across discrimination, calibration, computational cost, and explainability to select suitable model-sampling combinations.
13.5 KB of data in XLS format accompanies a study on forecasting customer conversion in bank marketing. The dataset, authored by Fahim Nasir and last updated in May 2026, is used to evaluate models and sampling techniques under a proposed quadrilateral evaluation framework. It supports research into responsible predictive analytics for Banking 4.0.
South Korean survey data from 349 adults aged 65 and older recruited from senior welfare centers and community facilities in the Seoul metropolitan area. The cross-sectional study, authored by Taeyeon Koo, measured ten constructs within an Augmented Technology Acceptance Model framework. Supplementary files were last updated on May 28, 2026.
NSW Department of Climate Change, Energy, the Environment and Water produced a report summarizing a literature review and gap analysis of the flood warning network. The report, last updated on 2026-05-22, aims to establish a prioritised list of recommended work packages to improve flood warning in the Richmond and Wilsons River Basins. Work packages may involve installing or upgrading rainfall/water level gauges and setting up flood forecast locations.
Data from 1,848 participants in the Healthy Weight Coaching program, a 12-month digital lifestyle intervention, collected from baseline through 12 months. The dataset includes self-reported weight, height, waist circumference, and RAND-36 health-related quality of life scores across eight domains. It was authored by figshare admin karger and last updated on 2026-05-07.
A 1.1 MB sample dataset by Mina Kobayashi, last updated 2026-05-27, for the CAFE-3D visualization environment. It integrates AI evaluation results for two oblique images with point-based geospatial information on ground strength (cone index). The dataset serves as supplementary material for a related manuscript, demonstrating integrated visualization of image and point data.
MixNet experimental results for multivariate time series forecasting are provided by Xinhan Wang. The dataset contains results from testing on seven benchmark datasets from primary domains, stored in an XLS file of 13.5 KB. The dataset was last updated on 2026-05-26.
A compilation from the 1970s to the present, this dataset provides point locations for sediment samples and observational data from the Vestfold Hills in East Antarctica. It incorporates data from published and unpublished sources to support analysis of physical and chemical properties, sedimentary processes, and glacial history. The compilation is presented by the Australian Ocean Data Network to make sample locations and types more readily available.
Performance and monitoring data collected from female Gaelic games athletes. The dataset includes measures of eccentric hamstring strength, inter-limb asymmetry, and other performance metrics. It was created by Donnacha Mulcahy and last updated in May 2026.
Annual mean color intensities from 1984 to 2023 were derived from Landsat satellite data for global rivers classified as brown-, green-, and yellow-dominant. The dataset includes a global 1°×1° grid vector file clipped to land polygons. It was authored by Nuoxiao Yan and is available under a CC-BY-4.0 license.
Experimental artifacts from an iterative usability study of Yēgatu, a gamified mobile application for autonomous learning of the indigenous Nheengatu language. The study involved two cycles of expert usability testing using quantitative and qualitative metrics. The repository was created by Fabiann Barbosa and last updated on 2026-06-01.
A high-resolution digital master copy of manuscript HC.MS.00441 from the Qatar National Library Heritage Collection. The manuscript is titled 'The Unique Jewel: Commentary on the Summary of Rulings and the Science of Monotheism, Part Two' and was authored by Muhammad ibn al-Hasan al-Wasiti (1317-1374). The dataset is provided as a 5.6 GB ZIP file under a CC0 1.0 public domain dedication.
A 1.5 GB high-resolution digital copy of manuscript HC.MS.00755 from the Qatar National Library Heritage Collection. The manuscript, authored by Abu al-Hasan Ali ibn Muhammad al-Tabari, is titled 'Interpretation of Problematic Verses and Narratives and Their Clarification with Proofs, Evidence, and Traditions'. It was last updated on the platform in June 2026.
A 1.6 GB high-resolution digital master copy of manuscript HC.MS.2018.0014 from the QNL Heritage Collection. The manuscript is the 'Al-Jami' al-Sahih' (The Authentic Collection) by Abu Abdullah Muhammad ibn Ismail al-Bukhari (810-870). It was published by Qatar National Library under a CC0-1.0 license and last updated on 2026-06-02.