General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
169,684 datasets
Raw and statistically transformed data supporting an analytical study on the indicator-based assessment of socioeconomic development in municipalities of Zamość County. The dataset was authored by Roman Berdo and last updated on May 18, 2026. It is a small dataset, 72.0 KB in size, available as an XLSX file.
Roman Berdo's dataset contains raw and statistically transformed data for an analytical study on the socioeconomic development of municipalities in Tarnów County, Poland. The dataset is stored in an XLSX file sized at 74.3 KB and was last updated on May 18, 2026. It is published under a CC-BY-4.0 license on figshare.
Raw and statistically transformed data for an analytical study on the indicator-based assessment of socioeconomic development in municipalities of Oświęcim County. The dataset was authored by Roman Berdo, is 37.2 KB in size, and was last updated on May 18, 2026. It is available in XLSX format under a CC-BY-4.0 license.
Moonee Valley City Council provides a dataset listing playgrounds within its municipal boundaries. The data is available in multiple geospatial formats including GeoJSON and ESRI Shapefile. The dataset was last updated on 2026-04-26.
Reintegros 2-2019 contains administrative records of student refunds processed by the Universidad Pedagógica y Tecnológica de Colombia (UPTC) for the first semester of 2019. The data covers students who left due to academic or non-academic desertion. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.
Vema cruise 33, leg 1, was conducted from 17 November to 17 December 1975 over the southeast Indian Ridge. The dataset is an observer's report published by the Australian Ocean Data Network. The report is available in HTML and PDF formats.
Evaluation and performance metrics of algorithms used in a machine learning context. The dataset is a 30.9 KB CSV file authored by Daryl Cruz and last updated on May 22, 2026. The specific algorithms and metrics evaluated are not detailed in the description.
A dataset listing variables for species selected using the Variance Inflation Factor (VIF) method. Daryl Cruz created this dataset, which was last updated on May 22, 2026. The data is stored in a 15.0 KB CSV file.
This dataset covers 800 community-dwelling adults aged 60 years and above in Kerala, India. The data was analyzed by Blessy Sarah Mathew using an integrated machine learning framework combining K-Means clustering, PCA, and Random Forest classification to identify multidimensional health risk profiles. The dataset was last updated on 2026-04-17.
Approximately 57 million scientific papers with raw full text, extracted from multiple large-scale academic paper collections. The dataset is provided by scientifi-papers and was last updated on 2026-05-19. It includes subsets such as papers-2 (~18.5M papers), papers-3 (~27.4M papers), and pes2o (~8.2M papers).
Produced and maintained since 2017 by geoBoundaries, this dataset provides political administrative boundaries for Yemen. It contains three levels of subnational divisions: ADM0 (country), ADM1, and ADM2. The data is part of a standardized, open-license global database of boundaries for every country.
Legacy product from the Australian Ocean Data Network with no abstract available. This observer's report documents Vema cruise 33, leg 3, conducted from 20 January to 19 February 1976 over the magnetic quiet zone south of Australia. The dataset is published on data_gov_au and was last updated in 2026.
Applicants registered for programs offered by the Universidad Pedagógica y Tecnológica de Colombia (UPTC) for the second semester of 2019. The dataset includes 16 columns specifying variables for each applicant, such as demographic and academic origin. It is hosted on the Colombian open data portal, datos.gov.co, and was last updated in May 2026.
geoBoundaries provides standardized administrative boundaries for Thailand at the ADM0 (country), ADM1, and ADM2 levels. The geoBoundaries Global Database is an open-license resource of political boundaries for every country, produced and maintained since 2017. This specific dataset for Thailand was last updated on 2026-04-29.
Public events in the City of Montreal as published on the official municipal calendar. The data likely contains event characteristics such as date, type, target audience, cost, and location, and is provided by the Government and Municipalities of Québec. The dataset was last updated on 2026-04-22.
Public charging stations for electric vehicles, likely containing location data. The dataset is provided by the Government and Municipalities of Québec and is available in multiple geospatial and tabular formats. It was last updated on April 22, 2026.
FedericaFabri1 created this derived text dataset in 2026. It contains affirmative sentences generated by merging questions and answers from the ITALIC multiple-choice dataset, transforming interrogative structures into declarative statements while preserving original semantics.
Over 2000 kilometres of high-frequency echo-sounder data collected between February and March 2000 on the George V Land shelf in East Antarctica. The dataset from Geoscience Australia describes seafloor morphology and acoustic facies, which are explained in terms of glacial and oceanographic influences since the Last Glacial Maximum.
12,709 notification records from Brazil's Information System for Notifiable Diseases (SINAN) between 2013 and 2020. The dataset includes 26 clinical symptom, comorbidity, and sociodemographic features for predicting laboratory-confirmed Chikungunya cases versus discarded suspected cases. It was created by Xinyu Lu and last updated in April 2026.
A catalog of studies and designs held by the Secretariat of Infrastructure. The dataset includes columns for project type, consultant, municipality, and date. It was last updated on 2026-05-18 18:48:24 and is provided by the Colombian open data portal www.datos.gov.co.