Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,563 datasets
Gold particle size distribution data from Sulphur and Dominion creek drainages in the Klondike District, Yukon. The dataset likely contains results from screening and bulk leaching by cyanidation of colluvium, virgin gravel, and tailings samples. It was published by the Government of Yukon and last updated on April 17, 2026.
Monitoring results for continuing care and supportive living accommodations in Alberta, assessing adherence to legislated accommodation and health service standards. The Government of Alberta publishes these reports, with data last updated in April 2026.
Data on pregnant mothers in the municipality of Tunja, Boyacá, Colombia, collected by the Secretariat for Women and Gender Equity for the years 2020, 2021, and 2022. The dataset likely contains health assessments, socioeconomic indicators, and prenatal care details.
General budget data for the Municipality of Fusagasugá in 2020, with a snapshot as of June 30, 2020. The dataset is published by www.datos.gov.co and was last updated on the platform in May 2026. It includes columns for initial estimates, budget line item names, and their corresponding codes.
General budget data for the Municipality of Fusagasugá, Colombia, with a snapshot as of June 30, 2020. The dataset is hosted on the Colombian open data portal, datos.gov.co, and was last updated in the platform's system in May 2026. Columns suggest a structured view of budget line items, including their initial estimates and classification codes.
Preferential attachment models generate graphs through a growth mechanism where new vertices connect to existing ones based on their current degree. The chapter discusses various formulations of this principle, noting that linear popularity functions can lead to power-law degree distributions. The dataset likely contains synthetic graphs or model parameters generated by these mechanisms.
A 2026 theoretical physics paper analyzes the one-dimensional Dirac equation with general relativistic contact interactions supported on two symmetric points. Authored by Carlos A. Bonin, the work investigates scattering and confining properties, focusing on parity-symmetric interactions. The dataset is a 177.1 KB PDF file available under a CC-BY-4.0 license.
Nine species of trilobites are recorded from three localities near the base of the Kayrunnera Group in western New South Wales. The trilobite assemblage, including species of Ammagnostus, Meteoraspis, and Blackwelderia, is early Late Cambrian (Mindyallan) and referable to the Glyptagnostus stolidotus zone. The dataset, hosted by the Australian Ocean Data Network, describes the Boshy Formation's shallow marine deposits and the geological context of an angular unconformity.
472 peptides from Apis mellifera venom were identified via de novo sequencing, with 103 high-confidence sequences. Ten novel peptides were identified as candidate TRPV1-targeting agents, and three synthesized peptides showed anti-inflammatory effects in macrophage assays. The dataset was published by Kai Wang on figshare in April 2026.
Phanerozoic Australia's buoyant cratonic platform is characterized by non-marine facies, contrasting with the marine facies of Laurasia. The dataset likely contains geological analysis discussing Pan-African mafic underplating and its role in creating a permanently buoyant lower crust. It is hosted by the Australian Ocean Data Network and was last updated on 2026-05-05.
Companies that received funding from the Critical Minerals Innovation Fund. The dataset is published by the Government of Ontario under the OGL-CA-2.0 license and was last updated on May 13, 2026.
MagicData's multi-stream conversation dataset captures each speaker's audio track separately, preserving natural conversational phenomena. The dataset is an ASR corpus of Chinese audio recorded on mobile devices at a 16 kHz sampling rate and 16-bit depth. It was created by MagicHub and last updated on the platform in June 2026.
Consolidated figures of victims by victimizing event at the national level for January 2019. The dataset includes columns for event type, disability status, ethnicity, sex, and life cycle stage. It is hosted by the Colombian open data portal, datos.gov.co, and was last updated in May 2026.
296 AI-generated human-centric scenes form a multimodal benchmark for computer vision. The collection includes 85 surveillance-style images and 128 occupation-by-gender portraits. It was created by Nima0Kamali and last updated on Hugging Face in May 2026.
Geospatial data from the Historic Environment Division (HED) defines guidance zones that indicate when planners should consult HED about proposals near heritage assets. The service is provided by HED, produced from HED sources, and updated monthly. Attribute values include a unique ID, source layer, and buffer value for each record.
The Escuela Tecnológica Instituto Técnico Central (ETITC) publishes its Registry of Information Assets. This inventory lists public information generated, obtained, acquired, transformed, or controlled by the institution. The dataset was last updated on 2026-05-18.
Sediment progradation rates in the central Great Barrier Reef lagoon decrease from 2.5 meters per year at the Burdekin River delta front to 0.1 meters per year north of Townsville. The dataset describes four distinct sedimentary assemblages forming the prograding shoreline: beach-ridge plain, chenier plain, mangrove-mud-flat plain, and barrier bar-lagoon complex. It models the distribution of fluvially derived mud and relict sands, suggesting terrigenous sediments dominate most of the lagoon shelf.
New South Wales geological data presents evidence for mineralogical and chemical zoning in seven deposits of the Cobar-Nymagee area. The dataset describes features typical of exhalative deposits in a non-volcanic environment, relating them to Devonian sedimentary facies and regional tectonics. It is provided by the Australian Ocean Data Network via data.gov.au.
Western Australia's Phanerozoic geology is described, focusing on sedimentary basins, rock types, and mineral deposits. The dataset likely contains information on basin ages, tectonic settings, and strata-bound mineralisation. It was published by the Australian Ocean Data Network and last updated in May 2026.
An official inventory of public information assets generated or controlled by the Municipal Government of Guadalajara de Buga. The dataset includes 24 columns detailing each asset's description, format, legal status, and responsible parties. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-18.