General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
161,677 datasets
5.5 KB of experimental results from a hybrid forecasting model applied to a 10 MW/20 MWh electrochemical energy storage power station. The dataset, authored by Lingzhi Xi and uploaded to figshare in April 2026, contains performance metrics including Mean Squared Error (MSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²) for day-ahead 24-hour power generation forecasts.
pre_data is a table of plume detection results, including time, latitude, longitude, emission flux, and 1σ uncertainty. The dataset was authored by Zunze Zhang and last updated on June 3, 2026. It is a small dataset, 13.6 KB in size, and is available in CSV format under a CC-BY-4.0 license.
Colombia's proactive information disclosure schema, published via datos.gov.co. The dataset catalogs published and planned information from public bodies, with columns for generation date, format, language, and update frequency. It was last updated on 2026-05-18.
Pranav Joshi's dataset contains qualitative research on digital transformation in the Indian advertising sector. The data is stored in a 1.1 MB PDF file and was last updated on June 4, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
A study proposing a dimensionless number for analyzing dynamic plastic deformation of clamped rectangular plates under underwater explosion loads. The dataset includes experimental data used to establish a correlation between the dimensionless plastic deformation and the proposed number. The 316.0 KB PDF file was authored by Weizheng Xu and last updated on 2026-05-18.
High capacity wells data from Prince Edward Island, Canada. The dataset is published by the Government of Prince Edward Island on the open_canada platform. It was last updated on 2026-06-10.
BMRS is a dataset of Bongard–Maximov problems for remote sensing, published on Preprints.org in June 2026. The dataset is authored by Nikita Firsov, Olga Terekhova, and colleagues. It was last updated on the Hugging Face platform on 2026-06-22.
Daily-updated dataset of arXiv papers from AI/ML and adjacent categories, enriched with LLM-derived signals. It includes a 0–100 importance score, topical/lab tags, a one-line takeaway, and dense full-page summaries for a selected subset. The dataset is published by author taesiri and was last updated on 2026-06-17.
curt is a machine-first programming language designed for AI agents with a focus on output-token cost. This dataset contains the complete evaluation record for language version 0.2, including benchmark suites, model-generated programs, and reference materials. The dataset was created by therikkening and was last updated on June 12, 2026.
Historical information on Colombian beneficiaries of international scholarship calls from 2018 to 2024, grouped by various demographic and program variables. The data is provided by www.datos.gov.co and was last updated on 2026-05-26. Its columns include MODALIDAD, GÉNERO, PAÍS DE DESTINO, and ESTRATO SOCIOECONOMICO DE RESIDENCIA.
Submissions and evaluation results for the CADGenBench leaderboard. The dataset contains one row per submitted and evaluated entry, as read by the leaderboard table. It was created by HuggingAI4Engineering and last updated on June 10, 2026.
GNS3 file exports were created as part of a master's thesis at NTNU. The files can be downloaded and imported into GNS3 to extract and run the network topology used in the thesis titled 'IPsec tunnels between end user devices behind NAT'. The dataset was authored by Sindre Revheim Svellingen and last updated on 2026-06-21.
35 million scanned archival documents from the Dutch National Archives, available as open data. The collection spans from medieval monastery records to archives of the Dutch East and West India Companies and documents on the decolonization of Indonesia (the former Dutch East Indies). The material is provided by the Ministerie van Binnenlandse Zaken en Koninkrijksrelaties and is accessible via an OAI-PMH API.
Geospatial data from the Digital Atlas of Colombian Coral Reefs details the location and classification of coral areas identified up to 2020. It includes columns for biotic, geomorphic, and ecological units, as well as sector and zone information. The data is provided by www.datos.gov.co and was last updated on 2026-05-18.
Geospatial points for mountains and elevations within the Valle del Cauca region of Colombia. The dataset includes columns for Cordillera, Montaña, Altitud_(msnm), and precise location via Longitud and Latitud. It was published by www.datos.gov.co and last updated on 2026-05-18.
The Australian Ocean Data Network provides a geospatial map resource detailing the locations of Australia's major ports. The data is available through WFS and WMS services and as PNG images, so it can be used for both mapping and analysis. It was last updated on June 4, 2026.
Santa Marta District Institute for Recreation and Sports (INRED) maintains this registry of its information assets. The dataset includes columns for asset name, type, description, confidentiality classification, legal basis, and responsible owner. It was last updated on 2026-05-18 18:59:42 and is published by www.datos.gov.co.
NYPD Officer Profile - Department Recognition tracks awards bestowed upon uniformed members of the New York City Police Department. The dataset includes the award type and the date it was given, linked to individual officer profiles. Its presence on multiple government data platforms indicates its use for official transparency and public accountability.
The municipal public services company of Piedecuesta, Colombia, maintains this inventory of trees under its care. The dataset includes columns for tree condition, scientific and common names, family, neighborhood location, and categorization. It was last updated on 2026-05-18 and is available via the Colombian open data portal.
Anonymized environmental complaints data from the Regional Autonomous Corporation of Cundinamarca (CAR), covering potential impacts on natural resources since January 1, 2009. The dataset includes columns for complaint type, municipality, environmental media affected, response status, and dates. It is published by www.datos.gov.co and was last updated in May 2026.