General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
166,763 datasets
Ground-based Leaf Area Index (LAI) data from ENON and Nakakawane, paired with corresponding cloud-free Sentinel-2 satellite imagery. The dataset was authored by Xuanwen Wang and last updated on June 4, 2026. It is a small dataset of 806.9 KB, stored in CSV format.
Scottish Water's Sewer Catchment Areas, also known as Drainage Operational Areas (DOAs), define the geographic zones where wastewater assets and surface water flow to a single Sewage Treatment Works or Public Septic Tank. The dataset is provided by the Scottish Government via SpatialData.gov.scot and was last updated on 2026-06-04.
Disbursement details for agricultural development credit from 2020 onward, hosted on Colombia's open data portal. The data includes loan values, interest rates, beneficiary characteristics, and financial inclusion indicators for producers accessing credit under FINAGRO conditions. Columns such as `tipo_productor`, `valor_credito`, and `sexo` provide insights into the recipients and terms of these loans.
96 information assets are cataloged in this metadata schema for the proactive disclosure of information by the Mayor's Office of Bucaramanga. The dataset describes planned and published information resources from 2021 to 2025, with columns for responsible parties, formats, and update frequency. It is published on the datos.gov.co platform via Socrata and was last updated on 2026-05-18.
Macro routes for street sweeping and cleaning services operated by EMVARIAS Grupo EPM. The dataset includes schedules for the corregimientos of San Cristóbal, Santa Elena, and San Antonio de Prado, with service frequencies of Monday to Saturday, every other day, or twice per week. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.
LeRobot created this dataset for robot learning. It contains 800 episodes and 20,000 frames of data, recorded at 15 frames per second. The dataset was last updated on June 8, 2026.
800 robot manipulation episodes generated using the LeRobot framework. The dataset contains 20,000 frames recorded at 15 frames per second, focusing on a single task. It was created by lerobot and last updated in June 2026.
60 episodes of robot manipulation data created using LeRobot, totaling 38,861 frames. The dataset is structured for training and contains video and structured data files for a single task performed by a Franka robot. It was last updated on June 15, 2026.
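The episode and frame counts quoted in the LeRobot entries above imply a rough episode length, which can help when sizing a training run. The following is a minimal sketch of that arithmetic only; the helper name `episode_stats` is hypothetical and not part of LeRobot, and it assumes frames are spread evenly across episodes, which the listings do not confirm. No frame rate is stated for the Franka dataset, so only its mean frame count is computed.

```python
def episode_stats(episodes, frames, fps=None):
    """Derive average per-episode figures from catalog-level counts.

    Assumes frames are distributed evenly across episodes (an
    assumption; the catalog entries give only totals).
    """
    stats = {"frames_per_episode": frames / episodes}
    if fps is not None:
        # Durations are only available when a frame rate is listed.
        stats["seconds_per_episode"] = frames / episodes / fps
        stats["total_minutes"] = frames / fps / 60
    return stats


# 800 episodes, 20,000 frames, 15 fps (figures from the listings)
small = episode_stats(800, 20_000, fps=15)

# 60 episodes, 38,861 frames, frame rate not listed
franka = episode_stats(60, 38_861)

print(small)
print(franka)
```

Under the even-distribution assumption, the 800-episode datasets average 25 frames per episode, or under two seconds each at 15 frames per second, while the Franka dataset averages roughly 648 frames per episode.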
Municipal-level inventory data for other livestock species in the Department of Valle del Cauca, Colombia. The dataset includes counts for species such as horses, donkeys, mules, buffalo, rabbits, guinea pigs, sheep, and goats. It is sourced from Urpa - Umatas municipales and was last updated on 2026-05-18.
Lake location and size data from the Colorado Department of Transportation (CDOT). The dataset includes geospatial geometry and attributes such as name and area. It was last updated on 2026-05-29 11:01:30 and is hosted by data.colorado.gov.
The dataset contains inventory information for computers, tablets, content servers, and innovation labs across educational institutions in the Boyacá department. It is sourced from the Colombian open data platform www.datos.gov.co and was last updated on May 18, 2026. The data includes school locations, enrollment figures, and technology counts from inventories conducted in 2022.
Personnel records from a Colombian social reintegration process over a one-year period, disaggregated by the type of economic benefit received. The data is provided by datos.gov.co and was last updated on 2026-05-18. Columns suggest information on benefit types, municipality and department of residence, and data cut-off dates.
Quarterly data reported by Approved Authorised Treatment Facilities and Approved Exporters about the amount of non-obligated Waste Electrical and Electronic Equipment received in the UK. Figures are broken down by 13 categories, with reports available from Q1 2010 onward. The dataset is produced by the Environment Agency and contains no company-specific information.
More than 30 everyday sounds are cataloged with their decibel ranges and associated hearing-damage risk. This reference table is compiled exclusively from publicly available figures published by the CDC, NIOSH, NIDCD, and ASHA. The dataset is maintained by APPSTACK LLC and was last updated in June 2026.
Postulados según situación penitenciaria (Nominees by Prison Status) contains data on demobilized individuals processed under Colombia's Law 975 of 2005 and its amendment, Law 1592 of 2012. The dataset is managed by the Transitional Justice Directorate of the Ministry of Justice and Law and is organized by place of demobilization. It was last updated on 2026-05-18.
Records of events that can affect the health of a community, including disease, risk factors, and other determinants. The dataset is hosted by the Colombian open data portal www.datos.gov.co and was last updated on May 18, 2026. Columns suggest it includes demographic details such as nationality, sex, and age for each recorded event.
Results from experiments (S17–S24) evaluating machine learning models for intrusion detection under domain shift. The 5.5 KB dataset, created by Dung Ha Thanh, was last updated in April 2026. It contains performance metrics from the TAN-IDS evaluation framework.
Cross-dataset evaluation results (S9–S16) from the TAN-IDS framework, a method for assessing NetFlow-based intrusion detection models. The dataset, created by Dung Ha Thanh and shared on figshare in April 2026, contains performance metrics from experiments testing model robustness across different network environments. It is a small dataset at 5.5 KB, stored in an XLS file.
Evaluation results for machine learning models across eight distinct scenarios (S1–S8) assessing robustness to domain shift in network intrusion detection. The data originates from the TAN-IDS evaluation framework research by Dung Ha Thanh, published on figshare in April 2026. This 5.5 KB XLS file contains comparative performance metrics.
A public transport route dataset for Dosquebradas, Colombia, last updated on 2026-05-18 16:38:24. It describes the 'Ruta 17 - Molivento' bus line, listing key stops for both inbound and outbound journeys. The data is hosted by the Colombian open data portal www.datos.gov.co on the Socrata platform.