General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
145,729 datasets
A composite FGFR inhibitor sensitivity score was derived from large-scale pharmacogenomic screening and baseline transcriptomes. The dataset includes a compact, interpretable transcriptional signature validated across multiple cholangiocarcinoma cohorts and an orthogonal resource (GDSC). Yading Xie authored this work, last updated on 2026-04-30.
A pharmacogenomic dataset integrating large-scale drug sensitivity screening with baseline transcriptomes for cholangiocarcinoma. The data includes a composite FGFR inhibitor sensitivity score derived from multiple compounds and an interpretable transcriptional program extracted from linear model coefficients. The dataset was created by Yading Xie and last updated on 2026-04-30.
Graduation records for students from the Colegio Mayor de Antioquia university institution. The dataset includes details on academic programs, the number of graduates per year, and basic graduate characteristics, allowing for historical analysis of graduate trends. It is published by www.datos.gov.co and was last updated on 2026-05-18.
Colombia's directory of educational facilities for preschool, basic, and secondary levels, with records from 2019. The dataset is sourced from the Integrated Enrollment System (SIMAT), the Unique Directory of Establishments (DUE), and the National Ministry of Education, with georeferenced data from DANE. It contains contact, location, and enrollment information for school headquarters and establishments.
A global meta-analysis documents seasonal and interannual variability in reef-based kelp communities. Temporal trends in kelp biomass, stipe density, percent cover, and density change rates are measured across ecoregions in global temperate zones. The data is supplied by the Australian Ocean Data Network in both raw taxon-level and aggregated site-level formats.
Geoscience Australia and CSIRO Marine & Atmospheric Research collected three years of continuous methane and carbon dioxide measurements at the 'Arcturus' monitoring station in the Bowen Basin, Australia. This dataset underpins a simulation study using the TAPM model to test the sensitivity of atmospheric detection techniques for fugitive emissions from a new coal seam gas field. The work establishes a baseline critical for distinguishing CSG emissions from other sources like cattle and landfill.
Operation IceBridge campaigns collected this experimental, time-sensitive geophysical product over the Arctic sea ice cover. Derived data from the Airborne Topographic Mapper, Snow Radar, Digital Mapping System, and KT19 pyrometer likely contains measurements of sea ice freeboard, snow depth, ice thickness, ice roughness, and sea ice elevation. This quick look dataset is designed for projects requiring rapid analysis, such as sea ice forecasting, and serves as a counterpart to the more processed IceBridge L4 dataset.
A retrospective cohort of 339 prostate cancer patients who underwent preoperative whole-prostate MRI followed by radical prostatectomy and extended pelvic lymph node dissection. The dataset, created by Yun Luo and last updated in 2026, was used to develop a composite machine learning model integrating MRI-derived radiomics, deep learning features, and clinical parameters.
Interventions carried out by the Montreal Fire Safety Service (SIM) are listed, including location and deployed units. Data originates from the computer-aided dispatch system (RAO), which is used for real-time management and reporting. The dataset is used for reports to the Ministry of Public Security and for compiling statistics for citizens, media, and insurers.
An inventory and classification of information assets for risk management and protection level determination, published by datos.gov.co. The dataset includes 28 columns detailing asset properties, legal classifications, and custodial information. It was last updated on 2026-05-18.
DATOS AREAS PROTEGIDAS RISARALDA provides detailed information on protected areas and strategic ecosystems within the Risaralda department. The data is broken down by municipality and includes specific metrics on ecosystem types such as tropical dry forest, wetlands, and páramos. The dataset originates from www.datos.gov.co and was last updated on 2026-05-18.
A dataset from the Socrata platform, last updated on 2026-05-18, describing a public works project in Valledupar, Colombia. The project involves road construction, public space rehabilitation, and utility network upgrades for a public transportation system. It includes columns such as Inversion (investment), Empleos Directos (direct jobs), and Personas Beneficiadas (benefited people).
Mass photometry data and histograms support a 2026 study on complement inhibition. The dataset includes CSV and PNG files totaling 34.3 MB, authored by Tereza Kadavá. It provides experimental measurements for visualizing distinct protein binding modes.
2014–2023 data on 307 potential fishing métiers, compiled by Pedro Leitão. The dataset includes main gear types, target species, and aggregated vessel, trip, and haul counts for the study period. It is a 19.7 KB CSV file shared under a CC-BY-4.0 license.
Beginning in academic year 2000, this dataset tracks the Aid for Part-Time Study (APTS) program, a grant for eligible part-time undergraduate students in New York State. It lists award recipients, dollar amounts, and average awards by college and sector group. The data is provided by data.ny.gov and was last updated in May 2026.
Locations associated with environmental authorities (EAs) in Queensland, derived from the state's public register. The dataset includes permits with statuses of 'Granted', 'Granted - Not Effective', or 'Suspended'. It was last updated on 2026-05-12 by the Department of Environment, Science and Innovation.
Nine convolutional neural network models achieved AUC scores between 0.921 and 0.967 for distinguishing malignant melanoma from other malignant skin lesions. The final XGBoost ensemble model, built by Jinyan Jiang, achieved an AUC of 0.988 on the test dataset. This 455.8 KB Excel file contains performance data for a model trained on the ISIC-2024 and HAM10000 datasets.
Performance metrics for a deep learning-ensemble model designed to differentiate malignant melanoma from other malignant skin lesions. The model was trained and tested using the ISIC-2024 and HAM10000 dermatoscopic datasets, achieving an AUC of 0.988 on the test set. The dataset was created by Jinyan Jiang and last updated on 2026-05-15.
Nine convolutional neural network models achieved AUCs between 0.921 and 0.967 for distinguishing malignant melanoma from other malignant skin lesions. An XGBoost ensemble model built on their outputs achieved an AUC of 0.988 on a test dataset, as documented in an Excel file by Jinyan Jiang last updated in May 2026. The model was trained and validated using the ISIC-2024 and HAM10000 dermatoscopic datasets.
Nine convolutional neural network models were evaluated for dermatoscopic differential diagnosis between malignant melanoma and other malignant skin lesions. The models, trained on the ISIC-2024 and HAM10000 datasets, achieved AUCs from 0.921 to 0.967, with an XGBoost ensemble model reaching an AUC of 0.988 on the test set. The dataset was created by Jinyan Jiang and last updated on 2026-05-15.