Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
12,945 datasets
OTel-Embedding is a contrastive learning dataset for training bi-encoder retrieval models using Multiple Negatives Ranking Loss. It is part of the Open Telco AI project, curated by over 100 domain experts from industry and academia. Each sample consists of an anchor query, one positive passage, and five hard negative passages.
Great Britain is covered by a dataset assessing the potential for geological deposits to collapse and cause rapid subsidence. The British Geological Survey produced this version 8 dataset, generalizing detailed geological data to a hexagonal grid with a 5km side length. It was last updated on 2026-04-09.
The Collapsible deposits dataset (5km Hex-Grid) version 7 from the British Geological Survey (BGS) provides a generalized assessment of potential ground collapse across Great Britain. It maps susceptibility to rapid subsidence on a hexagonal grid with a 5km side length, rating areas as Low (1), Moderate (2), or Significant (3). The dataset was last updated on 2026-04-09 and is based on the BGS Digital Map (DiGMapGB-50) and expert geological knowledge.
13 Black Sesotho-speaking male adolescents in South Africa's Motheo District participated in semi-structured interviews about depression. Researcher Pheello Nyembe collected this qualitative data to explore conceptualizations of depression, its causes, symptoms, and help-seeking strategies. The dataset was last updated in April 2026.
ACTmapi provides annually updated geospatial data for school locations within the Australian Capital Territory. The dataset is sourced from the ACT Government Geospatial Data Catalogue and was last updated on April 4, 2026. It is available in multiple formats including CSV, GeoJSON, and KML.
A three-wave longitudinal dataset tracks 1,014 Chinese university faculty members over two years. It contains item-level responses and precomputed scores for the Perceived Physical Literacy Instrument and Shirom-Melamed Burnout Measure across baseline, one-year, and two-year follow-ups. The dataset was authored by Ma, Rui Si and is hosted on Harvard Dataverse.
Eleven experienced secondary school mathematics teachers from Catalonia participated in semi-structured interviews exploring student learning difficulties. The research identifies sources of complexity related to inherent knowledge intricacy and task design, using cognitive load theory as a framework. Interviews were conducted face-to-face, transcribed verbatim, and analyzed until empirically driven saturation was reached.
244 first- and second-year medical students participated in a 2020 survey assessing the impact of pre-medical scribing and assisting. The data includes responses to a researcher-designed survey and a standardized Professional Self Identity Questionnaire (PSIQ). Academic performance metrics were analyzed but are not included in the shared data.
Rhiannon Fleming's mixed-method research assesses the Research Experience for Undergraduates program in solar and astrophysics at Montana State University. The dataset includes interview transcripts, chain notes, paraphrasing documents, a journal, and a confidence spreadsheet from a 2024 cohort of seven students. Data were collected weekly over a 10-week program using surveys, interviews, and journals.
Three U.S. academic institutions in the Midwest, Southeast, and Mountain West contributed data from 40 semi-structured interviews conducted between October 2023 and April 2024. Krista Cooksey collected transcripts from principal investigators, research staff, IRB personnel, and community partners discussing visual templates for informed consent key information sections. The data was analyzed using the COM-B framework and thematic content analysis.
Synthetic documents created to train models for the research paper 'Negation Neglect: When models fail to learn negations in training'. The dataset was authored by HarryMayne and last updated on 2026-05-14. It contains all synthetic documents used for the claims in the associated paper.
Vertical profiles of conductivity, temperature, and pressure from shallow water surveys conducted by the National Oceanic and Atmospheric Administration. The data includes measurements from transmissometers and dissolved oxygen sensors, collected at a descent rate of 0.5 to 0.75 meters per second to a maximum depth of 30 meters. Profiles were processed using Sea-Bird Instrument's SeaSoft SBE Data Processing software.
Contact information for publicly funded schools in Ontario, including public, Catholic, hospital, provincial, summer, and night schools. The dataset contains fields for region, board name and type, school name and level, contact details, grade range, and date opened. Data is sourced from the Board School Identification Database and maintained by school boards.
Private school enrollment data for Ontario, Canada, from 2018-2019 onward, reported annually as of October 31. The dataset, provided by the Government of Ontario, contains counts of male and female students at elementary and secondary levels for each private school. Enrollment numbers are rounded to the nearest five, starting with the 2018-2019 academic year.
Ontario's Ministry of Education provides student enrolment figures for elementary and secondary schools, aggregated by gender and school board. Data is collected via the Ontario School Information System (OnSIS) October submissions and includes public and Catholic school types. The dataset was last updated in March 2026.
Enrolment numbers track students in grades 9 to 12 across ministry-defined secondary school courses in Ontario. Data is reported by public and Catholic schools to the Ontario School Information System. Privacy protection suppresses counts below 10 students, and numbers from 2018-2019 onward are rounded to the nearest five.
Public schools offering French as a Second Language (FSL) programs in Ontario for the 2023-2024 academic year. The dataset enumerates English-language schools with student enrolment in FSL programs across elementary and secondary panels for all publicly funded school boards. Data includes board and school identifiers, contact details, and counts for FSL Core, Extended, and Immersion program types.
In 2014/2015, WFP and UNHCR conducted a socio-economic categorization in select refugee camps in Chad. The dataset covers 12,643 households in the Gozamir and Belom refugee camps. It was published by UNHCR - The UN Refugee Agency.
12.5 KB Excel file containing supplementary results from a meta-analysis on cardiovascular health. The data, authored by Anand Ruban Agarvas, likely details associations between hemochromatosis-related HFE genotypes and carotid artery wall thickness in females from the UK Biobank. It was last updated on 2026-04-28.
A 3.9 MB PDF document authored by Hamdan Mansoor and last updated on 2026-04-11. The study develops an assessment tool to understand the current status of the University of Fallujah, supporting post-ISIS recovery and institutional quality improvement efforts.