General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
168,428 datasets
A metadata catalog from the Colombian open data portal, detailing information published by public entities under Law 1712 of 2014. The schema includes columns for information category, responsible party, update frequency, and access location. It was last updated on the platform on 2026-05-18.
Records of registered applicants who formalized their enrollment, paid the required fee, and uploaded the required documents. The dataset includes columns for Territorial, Program Name, Enrollment Status, Circumscription, Methodological Strategy, Semester, Validity, Stratum, Gender, and Educational Level. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.
Enrollment records for undergraduate and graduate programs at the National School of Sport University Institution. The data covers six academic periods from 2023-01 to 2025-02, with columns for student origin, gender, and socioeconomic stratum. It was published on the Colombian open data portal www.datos.gov.co and last updated on May 18, 2026.
A Colombian government inventory of information assets, including hardware and software, managed via the Socrata platform. The dataset tracks assets through lifecycle stages like creation, processing, storage, and destruction, and includes security classifications. It was last updated on 2026-05-18.
Santa Barbara Channel field campaigns collect in-situ radiometric data to validate water-leaving reflectance products from the PACE OCI instrument. The dataset is produced by NASA and involves rapid-response measurements from small boats during optimal sea and sky conditions. Data is actively maintained, with a last recorded update in March 2026.
Text fragments and academic content associated with professors from the Department of Computing at the Federal University of Piauí. The dataset is authored by vickminari and was last updated on June 15, 2026. It is available in Parquet and JSON Lines formats.
Turkish speech recognition benchmark comparing 20 automatic speech recognition models on 1,060 audio clips totaling about 105 minutes. The benchmark was produced by Muhammed Kumcu and Yagmur Tuncer and draws from three established Turkish speech sources. It contains performance metrics but does not distribute the source audio, reference transcripts, or generated hypotheses.
A schema listing information sections on the municipal website of Villanueva La Guajira. It describes the format, update frequency, and responsible area for each published item. The dataset is hosted on the Colombian open data portal www.datos.gov.co and was last updated on May 18, 2026.
NASA public-domain source videos for the MIRCE benchmark, including documentary, Apollo mission, and Earth-from-space clips. The subset contains approximately 11 GB of video files. This dataset is paired with Lando0/mirce-meta for candidate-pool indexes.
Datos.gov.co provides a dataset on the Immediate Prize Collection Incentive (IPI) "Raspa&Listo" lottery. It tracks gross income, administrative expenses, and exploitation rights, broken down by Colombian department, year, and month. The data was last updated on 2026-05-18.
A unified matrix consolidating the information asset registry, publication scheme, and index of classified and reserved information for the Colombian Ministry of Agriculture and Rural Development. The dataset includes columns for information categories, formats, responsible parties, legal justifications, and publication status. It is hosted on the Colombian open data portal and was last updated on 2026-05-18.
A collection of 5,795 small molecules compiled from BindingDB to support machine learning models predicting acetylcholinesterase (AChE) inhibitory activity. The dataset was curated by R.U. Laskar and published on figshare in April 2026. It includes multiple molecular representations, such as physicochemical descriptors and graph-based structures, used to evaluate fifteen predictive models.
Eurostat provides quarterly balance of payments data for the European Union and the euro area, compiled according to the BPM6 standard. The dataset was last updated on April 15, 2026, and is available under a CC-BY-4.0 license. It covers international economic transactions between residents and non-residents.
Australian Ocean Data Network provides fossil records of archaeocyathan and radiocyathan fauna from Early Cambrian marine dolostones. The Todd River Dolomite and Mount Baldwin Formation contain species like Aldanocyathus greeni Kruse sp. nov. and Radiocyathus minor. This data correlates central Australian deposits with South Australian limestone formations and Siberian Atdabanian stages.
Matriz cuantitativa de cultivos permanentes de Sucre 2018 contains agricultural data for the department of Sucre, Colombia, for the year 2018. The dataset includes columns for product, price, production, yield, and area. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated in 2026.
A list of Colombian voting centers located abroad for the 2018 presidential election, with the number of citizens eligible to vote. The dataset includes columns for voting station, municipality, department, and counts of eligible men and women. It was published on datos.gov.co and last updated on 2026-05-18.
A structured catalog of information published by Colombian government entities, as mandated by Decree 103 of 2015. The dataset includes columns for information name, description, responsible parties, language, format, and access methods. It is hosted on the Colombian open data portal, www.datos.gov.co, and was last updated on May 18, 2026.
Registro de Activos de Información de la Corporación Autónoma Regional de Santander CAS is a metadata registry of information assets. The dataset is published on www.datos.gov.co and was last updated on 2026-05-18. It includes columns describing asset language, physical and electronic access points, format, and responsible office.
Global ground-based GNSS stations collect high-rate BeiDou broadcast ephemeris data, formatted as sub-hourly RINEX files. NASA's Crustal Dynamics Data Information System (CDDIS) archives and distributes this data, which includes signals from multiple global navigation systems. The dataset is actively maintained, with a last recorded update in March 2026.
A business registry listing small and medium enterprises (PYMES) in the Colombian archipelago of San Andrés, Providencia and Santa Catalina. The dataset is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18. It includes columns for company registration, size, and economic activity classification.