General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
143,034 datasets
Experimental data from destructive quasi-static tests on five-layer cross-laminated timber panels. The 93.0 MB dataset compares results from pine-only and eucalyptus-pine hybrid layups to values estimated using three analytical methods. Benedict Boulle published this study on figshare under a CC-BY-4.0 license, last updated in May 2026.
Russian forest fire locations from the 1998 and 1999 fire seasons are provided as vector point maps. The data originates from the National Forest Fire Center of Russia archive, collected by the Center of Remote Sensing in Irkutsk. It is derived from satellite imagery and formatted as ArcView shapefiles.
Global satellite data from the Nimbus-7 THIR instrument provides cloud information from October 31, 1978, to October 27, 1984. The dataset contains total cloud amounts, radiances at three altitude levels, and identifies cirrus and deep convective clouds. Data are averaged onto instrument footprints that vary from 50 km at nadir to 200 km at the edges.
A scout drilling campaign investigated approximately 200 acres in the Western Coalfield of New South Wales. The Bureau conducted the work to determine the suitability of the Lithgow and Irondale Seams for open cut exploitation. Results are published by the Australian Ocean Data Network.
Experimental data from two manipulative studies on marine predator-prey interactions. The dataset includes measurements of kelp blade weight before and after trials where predator activity and prey hunger state were varied. Author Nikita Sridhar published the data on figshare under a CC-BY-4.0 license, with a last update recorded in 2026.
Australian Ocean Data Network provides measurements of growth, secondary production, and egg production rates for pelagic copepods. Data were collected near Australia's North West Cape during the austral summers of 1997/1998 and 1998/1999. The dataset was created to assess zooplankton communities in that region.
The Fitzroy Contaminants Project studies nutrient and fine-sediment dynamics in the Fitzroy Estuary and Keppel Bay. It originates from the Australian Ocean Data Network and was last updated on 2026-06-04. The dataset likely contains measurements of materials discharged from the largest Queensland catchment into the Great Barrier Reef lagoon.
Some 2,500 million years of geological history are recorded in the granulite facies gneisses of the Stillwell Hills region. This dataset from the Australian Ocean Data Network describes Archaean to Proterozoic orthogneisses, charnockitic gneiss, mafic sills, dykes, and subordinate supracrustal rocks. The data was last updated on 2026-06-04.
A 102.3 KB Excel dataset published by Lena Szczuczko on May 22, 2026, presenting results from a computational framework for analyzing charge-transfer excitations. The data likely contains metrics from applying domain-based decomposition methods to correlated wave functions like EOM-CCSD and EOM-pCCD+S. It assesses the performance of two domain-accumulation strategies across various basis sets.
Sediments of the Late Palaeozoic Urana Formation include glaciomarine diamictite, fine-grained sediment, sandstone, and conglomerate facies. The facies assemblage is dominated by paratillites formed by ice-rafting and fine-grained sediments. Palaeontological and sedimentological evidence suggests these rocks were deposited towards the end of the major Late Palaeozoic glaciation of southeastern Australia.
Seabed sediment data extracted from Geoscience Australia's MARine Sediment database. The dataset includes percentages of carbonate, mud, sand, and gravel-sized material in sediment samples across the Australian Exclusive Economic Zone. Data grids were created using the ArcGIS Inverse Distance Squared Weighted methodology.
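The gridding step mentioned above can be illustrated with a minimal sketch of inverse-distance-squared weighting. This is not the ArcGIS implementation: the function name `idw_squared`, the sample tuples, and the choice to use every sample point (with no search radius or neighbor limit) are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, measured value), hypothetical layout


def idw_squared(samples: Sequence[Sample], x: float, y: float) -> float:
    """Estimate a value at (x, y) as a weighted mean of all samples,
    with each weight equal to 1 / distance**2 (assumed: no search radius)."""
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist_sq = (sx - x) ** 2 + (sy - y) ** 2
        if dist_sq == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        weight = 1.0 / dist_sq
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical mud percentages at two sample sites.
mud_samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint_estimate = idw_squared(mud_samples, 1.0, 0.0)
```

Because the weights fall off with the square of the distance, nearby samples dominate each grid cell. A query point equidistant from two samples receives their plain average.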
Villavicencio, Colombia, business registry data from the local Chamber of Commerce for small and medium enterprises (SMEs) that were active between 2012 and 2022. The dataset includes company details such as registration number, legal name, contact information, economic activity codes, and financial size indicators. It was published on the Colombian open data portal datos.gov.co and last updated in May 2026.
A dataset for short-term forecasting of the Air Quality Index (AQI) using a Transformer encoder model. Created by Kun-Chou Lee and last updated in May 2026, it contains multivariate time-series data where input features include AQI and other numerical variables. The dataset is preprocessed with forward-fill for missing values and Min-Max normalization, and supervised learning samples are constructed using a sliding-window scheme.
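The preprocessing described above (forward-fill, Min-Max normalization, sliding-window sample construction) can be sketched as follows. This is a simplified illustration rather than the author's pipeline: it handles a single AQI series instead of the multivariate input, and the function names, window length, and example values are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def forward_fill(values: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing value (None) with the last observed value."""
    filled: List[float] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        if last is None:
            raise ValueError("series starts with a missing value")
        filled.append(last)
    return filled


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]


def sliding_windows(
    series: Sequence[float], window: int, horizon: int = 1
) -> List[Tuple[List[float], float]]:
    """Build (input window, target) pairs; the target lies `horizon`
    steps after the last element of the window."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = list(series[start:start + window])
        y = series[start + window + horizon - 1]
        samples.append((x, y))
    return samples


# Hypothetical hourly AQI readings with two gaps.
raw_aqi = [50.0, None, 70.0, 90.0, None, 110.0]
filled = forward_fill(raw_aqi)
scaled = min_max_normalize(filled)
samples = sliding_windows(scaled, window=3)
```

Each sample pairs three consecutive normalized readings with the reading that follows them, which is the supervised form a sequence model such as a Transformer encoder would consume. In practice the Min-Max bounds would normally be computed on the training split only, to avoid leaking information from later data.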
A geospatial layer delineating the perimeter of the Special Intervention Zone (ZIS) established by decree to promote better flood zone management in Québec. The zone was created using data on flooded areas from 2017 and 2019 and was in force from June 10 to July 15, 2019. The zone delineated in this dataset, which is provided by the Government and Municipalities of Québec, is no longer in force following the adoption of Decree 817-2019.
Lobbying-related compensation received by registered lobbyists as reported in their quarterly reports to the Chicago Board of Ethics. The dataset includes details on lobbyist identities, their clients, and the amounts of compensation reported for specific time periods. It is available on multiple government data platforms, indicating its role in public transparency.
Graduate records from the Institución Universitaria de Envigado (IUE) span from 1995 to the present, covering undergraduate and postgraduate programs. The dataset includes 14 columns such as graduation year, program, gender, and residential location. Data is provided by the IUE's planning department and was last updated in May 2026.
MinYuan Zhang's dataset contains clinical and MRI-based morphological parameters for 608 patients who underwent arthroscopic knee surgery. It includes 281 patients with confirmed medial meniscus posterior root tears and 327 control patients without such tears. The data was used to train and evaluate 10 machine learning models, with results published on figshare in May 2026.
geoBoundaries provides standardized, open-license political administrative boundaries for Sierra Leone. The dataset includes boundaries from national (ADM0) down to local administrative levels (ADM4). It is produced and maintained by the geoBoundaries Global Database of Political Administrative Boundaries.
Mauritius administrative boundaries for ADM0 (country) and ADM1 (first subnational) levels are available in this dataset. The geoBoundaries Global Database of Political Administrative Boundaries produces and maintains this standardized, open-license resource. The dataset was last updated on 2026-05-23.
Cambodia's subnational administrative boundaries from country level (ADM0) down to the third subdivision (ADM3). The geoBoundaries Global Database of Political Administrative Boundaries produces and maintains this standardized, open-license resource. It was last updated on 2026-05-23.