General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
147,466 datasets
Miksa M. Henkrich's dataset contains 2,463 patient-reported weight gain narratives, each labeled by a GPT-4.1 model into 12 thematic categories. The data includes patient demographics, treatment outcomes, and unsupervised clustering results. It was last updated on April 14, 2026.
2,463 patients with overweight or obesity provided unstructured narratives about the causes of their weight gain before starting a weight loss treatment. The narratives were automatically labeled into 12 thematic categories using a GPT-4.1 large language model, with a precision of 0.906 and a recall of 0.897. The dataset, authored by Miksa M. Henkrich and last updated in April 2026, supports analysis of associations between reported causes, demographics, and treatment outcomes.
2,463 patients with overweight or obesity provided open-ended narratives about the causes of their weight gain before starting a multidisciplinary weight loss treatment. The narratives were labeled into 12 thematic categories using a GPT-4.1 large language model, and associations with demographic factors and treatment outcomes were analyzed. The dataset was authored by Miksa M. Henkrich and last updated on April 14, 2026.
Revenue data for the State of Connecticut, covering 2012 through the present. The dataset includes columns for FISCAL_YEAR, REVENUE CATEGORY, FUND_TYPE, and ACTUAL_AMOUNT, allowing state revenue to be tracked by year, category, and fund type. It is used in the official Open Connecticut application for government transparency.
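A minimal sketch of how the columns named in the entry above might be used to total revenue per fiscal year. It assumes the data is exported as CSV with exactly those header names; the sample rows, category and fund labels, and the helper name `total_by_fiscal_year` are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration only; real values must come
# from the published dataset. Header names follow the catalog entry.
SAMPLE_CSV = """FISCAL_YEAR,REVENUE CATEGORY,FUND_TYPE,ACTUAL_AMOUNT
2012,Example Category A,Example Fund,100.50
2012,Example Category B,Example Fund,200.25
2013,Example Category A,Example Fund,150.00
"""


def total_by_fiscal_year(csv_text):
    """Sum ACTUAL_AMOUNT for each FISCAL_YEAR in CSV text."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["FISCAL_YEAR"]] += float(row["ACTUAL_AMOUNT"])
    return dict(totals)


totals = total_by_fiscal_year(SAMPLE_CSV)
```

Fiscal years are kept as strings here because that is how `csv.DictReader` returns them; convert them to integers if numeric sorting is needed.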
A high-resolution digital master copy of manuscript HC.MS.03223 from the QNL Heritage Collection. The dataset is a 112.6 MB ZIP file provided by Qatar National Library under a CC0 1.0 license. The record was last updated on June 1, 2026.
Qatar National Library provides a 285.2 MB high-resolution digital copy of manuscript HC.MS.03153 from its Heritage Collection. The dataset contains Kufic script folios and is available under a CC0 1.0 license. It was last updated on June 1, 2026.
A 274.0 MB high-resolution digital master copy of manuscript HC.MS.03152 from the Qatar National Library Heritage Collection. The dataset was authored by Qatar National Library and last updated on June 1, 2026. It is provided under a CC0 1.0 license as a ZIP file.
A high-resolution digital master copy of manuscript HC.MS.03166 from the Qatar National Library Heritage Collection. The dataset is a 99.2 MB ZIP file containing a Quranic fragment written in Kufic script, published under a CC0-1.0 license by Qatar National Library. The catalog record and digitized manuscript are accessible via provided links.
A 102.9 MB high-resolution digital master copy of manuscript HC.MS.03164 from the Qatar National Library Heritage Collection. The dataset is a ZIP file containing a digitized Quranic fragment written in Kufic script, made available under a CC0 1.0 license. The record was last updated on June 1, 2026.
A 93.1 MB high-resolution digital master copy of manuscript HC.MS.03165 from the Qatar National Library Heritage Collection. The dataset is a single ZIP file published by Qatar National Library under a CC0 1.0 license. The full catalog record and digitized manuscript are accessible via provided links.
A 73.4 MB high-resolution digital master copy of manuscript HC.MS.03150 from the QNL Heritage Collection. The dataset was published by Qatar National Library and is available under a CC0-1.0 license. The last update was recorded on 2026-06-01.
A high-resolution digital master copy of manuscript HC.MS.03148 from the QNL Heritage Collection. The 62.1 MB ZIP file provides a digital surrogate of a Quranic manuscript fragment. Qatar National Library published this dataset under a CC0 1.0 Public Domain Dedication license.
Qatar National Library provides a high-resolution digital master copy of manuscript HC.MS.03149, titled 'Quranic Fragment'. The 63.9 MB ZIP file is available under a CC0 1.0 license and was last updated on June 1, 2026.
A 1.0 GB high-resolution digital master copy of manuscript HC.MS.03154 from the Qatar National Library Heritage Collection. The dataset is provided under a CC0 1.0 license and was last updated on June 1, 2026. It offers direct access to the digitized manuscript pages via the QNL Digital Repository.
UK Biobank data from 50,021 participants without type 2 diabetes, recruited between 2006 and 2010 and followed for an average of about 12 years. The dataset includes over 1,400 proteins and more than 280 metabolites, analyzed as predictors of kidney, cardiovascular, and neurological complications. Author Ming Hao used LASSO Cox and LightGBM models to identify GDF15 as a predictive biomarker for kidney complications.
50,021 UK Biobank participants without type 2 diabetes were tracked for an average of 12 years to identify protein and metabolite predictors of complications. The research highlights plasma protein GDF15 as a strong predictor for kidney complications, outperforming traditional markers like blood glucose and HbA1c. Ming Hao published this analysis under a CC-BY-4.0 license in May 2026.
A 494.9 MB dataset formatted for compatibility with the LFADS analysis repository. The data is intended to produce results for the base case of a corresponding research paper. It was authored by Ankit Vishnubhotla and last updated on May 27, 2026.
A geospatial dataset representing key resource areas across Queensland, identified as containing extractive resources of state or regional significance. The data is provided by the Queensland Department of Natural Resources and Mines, Manufacturing and Regional and Rural Development and was last updated on 2026-05-29. It is available in multiple formats including SHP and WMS for mapping and analysis.
Corporación Autónoma Regional del Alto Magdalena (CAM) provides hydrometeorological data from its automated monitoring network. The dataset includes periodic measurements of precipitation, temperature, humidity, water level, solar radiation, and wind. The data is published as open information for public consultation and institutional transparency, with basic quality control applied.
Building and Safety Temporary Special Event (TSE) Permits from the City of Los Angeles document events requiring inspection and approval by the Department of Building and Safety. The dataset includes permit details, event names, dates, and precise location data for events held on public or vacant land. Its columns support analysis of event logistics, spatial distribution, and regulatory compliance.