Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,453 datasets
The Bureau of Mineral Resources compiled reconnaissance-style maps of surficial cover facies on the Great Barrier Reef in 1982. These maps exhibit only surficial facies and apply a simple bathymetric classification differentiating supratidal, intertidal, and subtidal zones. Map resolution is variable over the reef and generally decreases with increasing water depth.
Colombia's inventory of public information generated or controlled by government entities that has been classified as confidential or reserved. The dataset includes columns for legal justification, classification duration, and responsible parties. It is hosted on the datos.gov.co platform and was last updated on 2026-05-18.
A large dataset of PyTorch problems paired with candidate Triton-kernel implementations and their measured GPU runtimes. Each row contains a self-contained Python program defining a reference PyTorch model and a Triton-reimplemented version. The dataset was generated by MakoraGenerate and is hosted by makora-ai, with a last update timestamp of 2026-06-12.
Nepalese rice landraces were evaluated for nitrogen use efficiency. The dataset contains measurements for 20 landraces and twelve nitrogen-responsive traits under optimum and stress conditions over 28 days. Bibas B.K. published the data on figshare in April 2026.
Two agent trace sessions generated by the nex-agi/nex-n2-pro:free model using the teich tool. The dataset is prepared for supervised fine-tuning of AI models. It was created by user armand0e and last updated on June 19, 2026.
2026 data from the Quebec Tourism Information System lists tourist accommodation establishments of the hotel type. The dataset includes establishments holding a valid registration certificate under the Tourist Accommodation Act at the time of publication.
Quebec Tourism Information System provides a listing of tourist accommodation establishments with valid registration certificates. The dataset includes campgrounds, caravanning sites, and ready-to-camp sites governed by the Tourist Accommodation Act. Data is sourced from the official SIT Quebec system.
Quebec's dataset lists tourist accommodation establishments, such as bed and breakfasts, that held a valid registration certificate at the time of publication. The data originates from the Quebec Tourism Information System (SIT Quebec) and is governed by the Tourist Accommodation Act. Row and column counts are not specified in the provided metadata.
Sewer manholes on Montreal Island are placed at maximum intervals of 120 meters in straight sections. The dataset, published annually by municipal authorities, includes attributes such as installation date, material, status, and owner. Updates made during the year are not included, so the data should be validated before engineering use.
A theoretical descriptor (Ω) predicts whether a proton 'stays' or 'goes' via an excited-state intramolecular proton transfer (ESIPT) mechanism in imidazo[1,5-a]pyridine-3-yl phenols. The dataset, 190.4 KB in size, contains results from a combined experimental and theoretical study investigating the fluorescent behavior of these compounds. It was authored by Anita Cinco and last updated on May 6, 2026.
25.1 KB of anonymized data and R scripts supporting the article "The role of collaborative gaps in stakeholders' partnership evaluations in complex environmental governance systems." The dataset, named 'NetworkingActivity_Data_NamesRemoved', contains stakeholder activity records in which all individual and organizational names have been replaced with untraceable unique identifiers. The materials were authored by Harrison Fried and last updated on May 21, 2026.
51 potential energy curves for the nitrogen molecule detail its singlet, triplet, quintet, and septet electronic states. The dataset was authored by Chaima Hammami and published under a CC-BY-4.0 license on figshare. Its last update was recorded on 2026-05-26.
Registro de Activos de Información is the official inventory of public information generated or controlled by the Sogamoso Chamber of Commerce as a legally obligated entity. The dataset includes metadata columns such as NOMBRE O TITULO DE LA INFORMACIÓN, DESCRIPCIÓN DEL CONTENIDO DE LA INFORMACIÓN, and FORMATO. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on May 18, 2026.
Released on 2026-05-28, this dataset contains 100,000 real-world low-quality and high-quality image pairs generated by Multi-modal Foundation Models (MFMs). It was created by researchers from The Hong Kong Polytechnic University and OPPO Research Institute to expand generalization boundaries in image restoration tasks.
A 24.1 KB PDF document authored by Marie-Annick Moreau, describing a scene of collaborative construction. The description details Bumbo working inside a trap chamber while other men assist from the outside, positioning and tying stakes. The dataset was last updated on figshare in June 2026.
A 2026 report from Geoscience Australia details development work at the Adelaide River Uranium Mine. The report describes the Black Lode ore shoot, containing about 70 tons per foot depth of ore averaging about 0.5% U3O8, developed to a depth of 200 feet. About 3,500 tons of ore were treated at Rum Jungle, with 1,500 tons remaining broken in stopes, but no proved ore reserves are reported.
Monthly food ration counts for persons deprived of liberty in Colombia, likely covering 2023 and 2024. The data originates from www.datos.gov.co and was last updated in May 2026. It includes columns for each month, totals for 2023 and 2024, and details on the detention facility type and location.
A 2023 geospatial data product provides seabed morphology and geomorphology information for Flinders Reefs within the Coral Sea Marine Park, north-eastern Australia. The dataset is published by McNeil et al. and hosted by the Australian Ocean Data Network. It is intended for use by marine park managers, regulators, and the general public.
Geospatial seabed morphology and geomorphology information for Flinders Reefs within the Coral Sea Marine Park. The data product was published in McNeil et al. (2023) and is served via an OGC Web Map Service (WMS). It is intended for use by marine park managers, regulators, the general public, and other stakeholders.
Field measurements of microclimatic parameters for model calibration and validation. The data includes air temperature, relative humidity, wind speed, and black globe temperature, collected at pedestrian height in a sunken plaza-type entrance. The dataset was created by xiaochun hong and last updated in May 2026.