General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
147,466 datasets
Building and Safety Temporary Special Event (TSE) Permits from the City of Los Angeles document events requiring inspection and approval by the Department of Building and Safety. The dataset includes permit details, event names, dates, and precise location data for events held on public or vacant land. Columns suggest it supports analysis of event logistics, spatial distribution, and regulatory compliance.
Government of Yukon compiled two isotope datasets for the Yukon region. The first dataset includes whole-rock (Nd, Hf, Sr, Pb, S, O) and feldspar (Pb) isotope analyses; the second contains sulphide (S and Pb) isotope analyses. The data was last updated on 2026-05-20.
ACT Government Open Data provides a daily updated list tracking the number of Security Employee licences. The dataset is maintained by ACIntel and was last updated on June 1, 2026. It likely contains counts or statuses of licences issued to security personnel.
20.3 KB of audio data in EAF format, last updated on June 3, 2026. The dataset contains a request for help from Mauridi Omari Mpendu to Marie-Annick Moreau, translated by Moshi S. Bora, concerning the EMKP project in Rufiji, Tanzania. It was authored by Marie-Annick Moreau and is shared under a CC-BY-NC-SA 4.0 license.
Marie-Annick Moreau authored a 20.3 KB dataset published on figshare in June 2026. The data consists of a translated conversation in EAF format, where Mauridi Omari Mpendu requests help for people in Rufiji and Marie-Annick Moreau explains her limitations and goals for the EMKP project.
131.8 KB of survey data from a cross-sectional study on dysmenorrhea among postpartum women in Western Kenya. The dataset, authored by Madalitso Khwepeya, is available under a CC-BY-4.0 license. Its last update was recorded as 2026-05-26.
Manufactured Housing Program Certified Entities lists businesses and individuals licensed by New York State to manufacture, install, or service manufactured homes. The dataset originates from the state's program database of record, as defined by regulation 19 NYCRR 1210.2(h). It is published by data.ny.gov and was last updated in early April 2026.
A derived data and code package supporting a public FDA MRSP-detection boundary benchmark manuscript. The 7.5 MB ZIP file contains source-lock and endpoint-lock records, crosswalks, manifests, pangenome provenance, model artifacts, and workflow scripts. It was authored by Cleverson de Souza and last updated on June 3, 2026.
Florida and surrounding regions are the focus of this dataset containing plotted vertical profiles of convection from tropical cyclones. The NASA ER-2 high-altitude research aircraft collected the data using its Doppler radar during the CAMEX-3 field campaign from August 5 to September 27, 1998. Data is provided as daily browse files in GIF format, representing radar reflectivity and Doppler velocity measurements.
This 2026 dataset records the operational status of power provider feeds, including response codes and timestamps. It is designed for and consumed by the Maryland Emergency Management Agency (MEMA) Power Outage web application, and it retains only a limited historical record.
A benchmark dataset for evaluating deep learning splice prediction algorithms, converted and shared by Nathan Fortier. The dataset excludes 15 specific genetic variants due to formatting issues or complex variant types that could not be unambiguously resolved. It was last updated on May 13, 2026.
Vicmap Elevation DEM 10m is a raster representation of Victoria's elevation with a 10-meter spatial resolution. The product is constructed from multiple source datasets and has been hydrologically enforced to correctly define natural surface drainage. It was quality assured by the Victorian State Government and third-party consultants and last updated in April 2026.
CDMB's Air Quality Surveillance System (SVCA) operates four automatic monitoring stations, including one on the terrace of the Club Union in Bucaramanga. The stations measure atmospheric pollutants in real time, including particulate matter (PM₁₀ and PM₂.₅), sulfur dioxide (SO₂), nitrogen dioxide (NO₂), tropospheric ozone (O₃), and carbon monoxide (CO), in compliance with Colombian Ministry of Environment Resolution 2254 of 2017. The dataset has 17 columns, including pollutant concentrations, meteorological variables, and timestamps.
311 Service Requests - Austin Transportation and Public Works lists all public service requests assigned to the City of Austin's Transportation and Public Works department from the fiscal year beginning 10/01/2021 onward. The dataset is updated twice daily and includes detailed tracking of request status, location, and performance metrics. Its columns suggest it contains geospatial boundaries, departmental workflows, and timeliness indicators for urban service management.
López de Micay municipality's dataset records citizen Petitions, Complaints, Claims, Suggestions, and Reports (PQRSD) received through various channels. It includes the request type, reception date, entry channel, status, closure date, and responsible department. The data has been anonymized to comply with Colombian personal data protection and public information access laws.
Summary statistics from a genetic study comparing HLA profiles across three groups: patients with systemic sclerosis and interstitial lung disease (SSc-ILD+), patients with systemic sclerosis without ILD (SSc-ILD-), and controls. The dataset, authored by Carlos Rosa-Baez and last updated in May 2026, is a 7.1 MB TSV file containing the results of these three primary comparisons.
A 438.6 KB dataset from figshare, last updated June 3, 2026, characterizes Cladosporium fungal species from six caves in the Brazilian Cerrado savanna. Author Pedro Oliveira used an integrative approach combining morphology and multilocus phylogenetic analyses of ACT, ITS, RPB2, TEF1-α, and TUB genes. The dataset includes descriptions of six new species.
Data from the New York State Birth Defects Registry for 1992 through 2020 shows the occurrence of selected major birth defects. The dataset includes counts and prevalence per 10,000 live births, broken down by defect name, sex, residence, and birth year. It is published by health.data.ny.gov and was last updated in May 2026.
Navy Coastal Ocean Model output from the Sub-Mesoscale Ocean Dynamics Experiment captures daily ocean state variables across three field campaigns. Data includes salinity, sea water temperature, water depth, and surface wind stress to study vertical exchange processes. Files are provided in netCDF format for the pilot campaign in Fall 2021, IOP1 in Fall 2022, and IOP2 in Spring 2023.
Synthetic data designed for developing a fuzzy-semantic decision model for personalized hotel recommendations. The dataset contains 10 customer preference profiles and 5 hotel attribute profiles, generated through a controlled randomization process. It was created by Ditdit Nugeraha Utama and published on figshare in May 2026.