General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
162,381 datasets
A public transport route dataset for Dosquebradas, Colombia, last updated on 2026-05-18 16:38:24. It describes the 'Ruta 17 - Molivento' bus line, listing key stops for both inbound and outbound journeys. The data is hosted by the Colombian open data portal www.datos.gov.co on the Socrata platform.
NASA's Parker Solar Probe SWEAP instrument provides Level 3 measurements of electron pitch angle distributions in the solar wind. The data is governed by specific 'Rules of the Road' requiring user collaboration with the principal investigator for scientific publication. This dataset is part of the mission's effort to study the Sun's corona and solar wind acceleration.
NASA's Parker Solar Probe SWEAP instrument provides Level 3 measurements of Electron Pitch Angle Distributions from the SPAN-B sensor. The data is structured in 13.981-second intervals and is governed by specific collaboration rules for scientific use. The dataset was last updated on March 13, 2026.
NASA's Parker Solar Probe mission provides electron pitch angle distribution data from the SPAN-Electron instrument. The dataset is part of the Solar Wind Electrons Alphas and Protons (SWEAP) instrument suite and is governed by specific collaboration rules. Data files are named with a versioned format and were last updated on 2026-03-13.
Measurements of gases from mobile sources circulating in the jurisdiction of Corantioquia, Colombia. Data was obtained from roadside operations and companies across different municipalities. The dataset includes columns for subtotals by fuel type, municipality, year, and total rejected results.
Barrios y Veredas del Municipio de Villavicencio contains information on the neighborhoods and rural districts of Villavicencio Municipality. The data was updated as of April 30 and provided by the Dirección de Ordenamiento Territorial - DOT. It is available via the www.datos.gov.co platform in CSV, JSON, XML, and RDF formats.
Fusagasugá municipality in Colombia monitors the behavior of its water sources, including surface and underground streams. The dataset includes columns for location coordinates (Este, Norte), source names, activity types, and measurement dates. It is hosted by the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.
Data on indigenous communities participating in environmental culture processes within the jurisdiction of Corantioquia, Colombia. The dataset includes information on location, indigenous community, reservation, ethnicity, legal acts or resolutions, and titled area in hectares. It was published on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.
A multi-domain reasoning dataset built to improve frontier models by revealing their failures and turning expert grading into training signal. The dataset pairs self-contained tasks with weighted rubrics across three domains — Computer Science, Data Science, and Chemistry. It was created by TuringEnterprises and last updated on 2026-06-16.
Historical data on teachers by academic level in the public sector for the urban and rural zones of the municipality of Sabaneta. The dataset includes columns for Sector, Year, Zone, Quantity, and Academic Level. It is hosted by www.datos.gov.co and was last updated on 2026-05-18.
Tiny-Ko-Stories is a dataset of 2,003,542 original Korean short stories, created by author psymon and last updated on June 13, 2026. Inspired by the English TinyStories dataset, it was generated from scratch in Korean to test if small models can demonstrate reasoning and creativity with limited, high-quality data. The dataset includes Korean-specific elements like native names, sentence rhythm, onomatopoeia, and small event structures.
Weekly updated registry of corporations and limited liability companies in Oregon designated as benefit companies. The dataset includes business names, official registry numbers, entity types, and the dates of their benefit designation. Columns suggest it provides details on companies that have committed to creating a public benefit alongside profit.
Colombian national and regional data on the educational level of individuals who entered the reintegration process, as of a specific cut-off date. The dataset is published by datos.gov.co and was last updated on 2026-05-18. It includes columns for municipality, department, process status, and educational level.
Xin-Rui released the ImagineTime benchmark in 2026 to evaluate image generation models. It contains 750 benchmark cases designed to test a model's ability to produce ordered 2x2 motion sheets with coherent entities and state transitions. The dataset was published with the paper 'Can Image Models Imagine Time?' and is hosted on Hugging Face.
Student enrollment data from the Digital University Institution of Antioquia for the 2024-02 academic period. The dataset includes 11 columns covering biological sex, place of birth, academic level, program, semester, and course load. It was published on the Socrata platform via datos.gov.co and last updated on May 18, 2026.
Administrative-career and provisional officials currently working in the various secretariats and offices of the municipal administration. The dataset includes columns for employee name, department, hire date, salary assignment, and job title. It was published on the Colombian open data portal, datos.gov.co, and was last updated on 2026-05-18.
University of Cauca maintains a directory of its current units and dependencies. The dataset includes names, descriptions, contact information, and geographic coordinates for each entry. It was last updated on May 18, 2026, and is hosted by the Colombian open data portal www.datos.gov.co.
A machine learning framework achieved test R² values from 0.942 to 0.963 for predicting missing Sonic logs and 0.927 to 0.930 for Gamma Ray logs. This dataset from the University of Kansas supports a study on using K-Nearest Neighbors regression to address gaps in geophysical well data. The workflow involves correlation-guided feature selection and min–max normalization on data from five wells.
Source data files for the manuscript "Gating Crosstalk in Potassium Channels". The 3.7 GB ZIP archive contains structures and parameters for molecular dynamics simulations, including input TPR files, initial and final structures, and MDP parameter files for each parallel simulation. The dataset was authored by GU and last updated on May 26, 2026.
Student enrollment records for the University of Valle disaggregated by campus, faculty, and academic program per semester. The dataset covers undergraduate and graduate students from 2000 to 2022. It is hosted by the Colombian open data portal www.datos.gov.co.