Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,784 datasets
A survey of 1,501 Ontarians aged 18+ conducted between April 29 and May 1, 2022, during the provincial election campaign. Ipsos Canada collected data online and by telephone on voting likelihood, party preferences, and leadership traits. The sample was weighted to reflect the adult population based on 2016 Census data.
The Bear Paw breccia zone is located at the Clear Creek property in the Tintina gold belt, central Yukon Territory. Gold mineralization occurs in hydrothermal breccias with stockwork quartz + potassium-feldspar + sulphide veins. Grades of up to 2.3 g/t gold over 31.8 m have been intersected in recent drilling.
2010 bedrock mapping and soil sampling data from the Mount Mervyn map sheet (106C/04) in central Yukon. The dataset was produced by the Government of Yukon as part of the South Wernecke mapping project, highlighting complex geology and new areas of nickel and gold potential.
Preliminary geological investigations, dated 2002, of emerald mineralization in the Regal Ridge area of southeastern Yukon. The report details a continuum of rock types from quartz monzonite to quartz-tourmaline veins and discusses the potential source of chromium. It was published by the Government of Yukon and last updated in April 2026.
The Len porphyry gold prospect is located 47 km north of Mayo, Yukon, in the Tombstone Suite intrusive belt. The Government of Yukon documents exploration from the 1960s to 1997, including soil geochemistry, trenching, and diamond drilling that encountered grades up to 2.22 g/t gold across 18.6 m.
Nova Scotia's mineral rights database contains layers for exploration licences, special licences, leases, hydrocarbon storage-area licences, and non-mineral registrations. The database is maintained by the Nova Scotia Department of Natural Resources and Renewables and is updated nightly around 2:00 AM. An extraction date-time field identifies when information was pulled from the source NovaROC system.
Southwestern Nova Scotia bedrock is classified into three acid rock drainage potential categories: high, moderate, and low. This digital product from the Government of Nova Scotia informs landowners, developers, and planners about geological hazards. It was last updated on April 17, 2026.
Historical disaster events in Alberta and their associated cost summaries show what the province has paid to date under the Disaster Recovery Program. The program provides financial assistance for uninsurable property losses or damages resulting from a disaster. The figures reflect total provincial costs incurred, notwithstanding any federal reimbursement.
A list of tracked and watched species occurrences in Alberta, Canada, generated by intersecting natural region maps with conservation data. The dataset is produced by the Alberta Conservation Information System (ACIMS) and includes only elements with mapped occurrences. It was last updated on April 17, 2026.
1998-2012 data on quality of life across all Dutch neighborhoods and municipalities, updated biennially. The Leefbaarometer provides scores reflecting the situation, developments, and backgrounds of neighborhoods. It is published by the Dutch Ministry of the Interior and Kingdom Relations under a CC0-1.0 license.
Headcount enrolment within the Alberta post-secondary education system provides statistics on unique learners by institution and academic year. The Government of Alberta publishes these data tables for enrolments in approved programs at publicly-funded institutions. Separate files focus on system-level totals or subsets for international and self-identified Indigenous learners.
Q2 2015 progress report on the Dutch Generic Digital Infrastructure (GDI), which supports digital government services. The data, published by the Ministerie van Binnenlandse Zaken en Koninkrijksrelaties, tracks the availability, connection status, and usage of GDI components. It is provided under a CC0-1.0 license in PDF and CSV formats.
An administrative dataset from Alberta's Workers' Compensation Board and the provincial government, used to evaluate the impact of occupational health and safety inspections on injury rates. The data is structured at the employer-industry-year-month level, combining workers' compensation claims with regulatory enforcement activity. It was created for the Partnership for Work, Health and Safety (PWHS) to analyze firm-level outcomes.
This dataset links survey data on human resource management practices at Alberta small and medium-sized enterprises, collected in 2016 and 2018, with archival organizational-level injury data from the Alberta Workers' Compensation Board covering 2014 to 2019. It was created to connect HR practices with injury data at the organizational level over time. Survey variables were removed prior to public posting; contact the researcher for more information.
Government of Alberta records track learner completions within the province's publicly-funded post-secondary system. The data is organized by institution and academic year, with separate files for system-level, international, and Indigenous learner subsets. It was last updated on 2026-04-17 under the OGL-CA-2.0 license.
Prompt To Gesture is a computer vision dataset built for training models to recognize deictic (pointing) gestures in human-robot interaction. It mixes real human recordings with image-to-video generated synthetic videos to address the scale and diversity limitations of traditional data collection. The dataset was created by author sano90 and last updated on Hugging Face in May 2026.
NYC Benefits Platform lists over 80 health and human services programs available to city residents. The NYC Opportunity Product team collaborates with more than 15 government agencies to collect and update the data, which includes application details and eligibility in eleven languages. The dataset was last updated in April 2026.
Hunter J. Ries published this dataset on 2026-05-07. It contains intra-host single-nucleotide variants (iSNVs) from two outlier samples, each with more than twenty iSNVs. The 23.7 KB XLSX file includes variants annotated with snpEff and antigenicity information.
MMAE is a benchmark for instruction-based audio editing, serving as a comprehensive evaluation testbed. It covers 7 distinct audio modalities, including sound, speech, and music. The dataset was created by BoJack and was last updated on Hugging Face in June 2026.
Moral-Circle-Alignment-Lab curated a dataset of writing that models compassionate moral reasoning about nonhuman sentient beings. The corpus is designed for pretraining and fine-tuning language models to reason more carefully about decisions affecting sentient life. The dataset was last updated on May 28, 2026.