Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,019 datasets
Microsoft AI for Good Lab mapped affected buildings in Toamasina, Madagascar, following Cyclone Gezani using Airbus satellite imagery and AI models. The dataset provides geospatial damage assessment data. It was last updated in April 2026.
Monthly spend on non-school agency staff published by the Government Digital Service. The data is updated and released on a monthly schedule, suggesting a regular time series. The dataset originates from the eu_open_data platform.
Government Digital Service data summarizes annual expenditure on non-school agency staff. The dataset is derived from more granular monthly spending records. It is published as open data on the EU platform.
The Ableton Live God-Level Beat Producer Dataset contains 9,000 examples designed to train large language models to act like professional producers in Ableton Live 12 Suite. It was created by author 11-47 and last updated on Hugging Face in May 2026. The dataset covers topics such as drum programming, bass sound design, mixing, and live performance setup.
A study assessing the impact of poor-quality dredged sediment samples on spatial predictions of seabed mud content in the Australian marine margin. The analysis uses 14,204 samples from the MARS database, focusing on two regions: the Southwest Region (407 samples) and the Petrel Region (534 samples). Predictions were made using Inverse Distance Weighting and Ordinary Kriging, with accuracy measured by relative mean absolute error.
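As a rough illustration of the two ideas named in this entry, the sketch below shows a basic Inverse Distance Weighting predictor and a relative mean absolute error score. It is not the study's code. The function names, the distance power of 2, and the definition of relative MAE as MAE divided by the mean of the observed values (expressed as a percentage) are all assumptions made for this example. Ordinary Kriging is not shown.

```python
import math


def idw_predict(points, values, target, power=2.0):
    """Predict a value at `target` as a distance-weighted mean of known samples.

    `points` is a list of (x, y) sample locations, `values` the measured
    quantity (e.g. mud content) at each location. The power of 2 is an
    assumed default, not a setting taken from the study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample, so return that sample's value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def relative_mae(observed, predicted):
    """Mean absolute error as a percentage of the mean observed value.

    This definition is an assumption for illustration.
    """
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * mae / mean_observed
```

For example, `idw_predict([(0, 0), (2, 0)], [10.0, 30.0], (1, 0))` weights both samples equally because they are equidistant from the target, so the prediction is their plain average.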
Newfoundland and Labrador's provincial government provides data on the number of active school buses. The dataset likely contains counts broken down by model year, region, and operator type (school board versus private). It was last updated on April 17, 2026.
300 videos from BiliBili, TikTok, and Kwai were assessed in February 2024 using the Global Quality Score (GQS) and modified DISCERN (mDISCERN) tools. The dataset likely contains scores and video characteristics like duration and engagement metrics. It was created by Shiqi Zhou and is licensed under CC-BY-4.0.
A 2026 expert survey comparing lesson plans generated by three large language models for teaching a basketball layup. Benedikt Meixner submitted a prompt to GPT-4o, Claude Sonnet, and Google Gemini, and teaching experts rated the plans on 28 predefined quality criteria using 5-point Likert scales. The dataset likely contains the expert ratings and statistical analysis results.
28 quality criteria were used to evaluate basketball layup lesson plans generated by three LLMs (GPT-4o, Claude Sonnet, and Google Gemini). Teaching experts rated the plans on 5-point Likert scales, with the most frequent median rating being 'acceptable'. Benedikt Meixner published this research on figshare in March 2026.
8,898 compounds with median lethal dose (LD50) values form a curated dataset for toxicity prediction. Tanuj Sharma authored this research, which compares four molecular representation learning approaches, including the ChemModernBERT model. The dataset was last updated on March 19, 2026.
TrustXQoE is a large-scale, human-annotated dataset for HTTP Live Streaming research, collected in an SDN-CDN edge environment. The dataset includes synchronized measurements from multiple layers of the video delivery chain, such as client playback, network QoS, server metrics, and MOS/QoE labels. It was authored by Abdelhak Heroucha and published on Harvard Dataverse in May 2026.
Suspension and removal reports from Dutch primary, special, and secondary education schools submitted to the Inspectorate since 2014. The data includes national counts by sector and education type, suspension durations, and the frequency of given reasons. The dataset is provided by the Ministry of the Interior and Kingdom Relations and is available in ODS format.
School weighting data used to assess learning outcomes in Dutch primary education. The Central Bureau of Statistics calculates a school's weighting based on parental education levels, maternal education averages, parental country of origin, mother's length of stay in the Netherlands, and parental debt remediation status. This model, from the Dutch Ministry of the Interior and Kingdom Relations, entered into force on 1 August 2020.
On the Books is a labeled training set from a collections-as-data project at UNC Chapel Hill Libraries. It contains expert-labeled chapter/section pairs from North Carolina session laws passed between 1866 and 1967, identifying them as Jim Crow laws or not. The dataset was created by author biglam and last updated on Hugging Face in April 2026.
A symposium held at the University of Wales, Swansea in July 2007 honoured the career of Professor Michael Collins. The event, organized by his former students, celebrated his contributions over 30 years as a scientist, teacher, and mentor. About 30 of his 50+ postgraduate students attended to discuss the various subjects and projects he supervised.
tw-highschool is a dataset of Traditional Chinese knowledge texts covering four major high school subjects in Taiwan: biology, chemistry, mathematics, and physics. Curated by Huang Liang Hsun, it likely contains explanatory paragraphs, formula derivations, example problems, and terminology glossaries. The dataset is intended as supplementary corpus material for training or fine-tuning models with Taiwanese high school subject knowledge.
From August 2017 to July 2020, audio diaries were collected from medical students progressing from clerkship through their first postgraduate year. The data captures challenges in applying knowledge, team integration, and role uncertainty, analyzed through Gruppen's Learning Environment Framework. This longitudinal design reveals how learners adapt to increasing clinical responsibility over time.
15 master's students participated in a Q-methodology study exploring causes of learning burnout in online environments. The research identifies four factor groups, including lack of self-management and challenges in adapting to the online classroom. The findings are presented in a 12.2 KB DOCX file.
Fifteen master's students participated in a Q-methodology study to identify causes of learning burnout in online environments. The research categorized burnout factors into four distinct groups, including self-management challenges and environmental distractions. The findings are intended to inform graduate course design and instructional practices.
145 studies published from 2015 to 2025 were analyzed in a systematic review following PRISMA guidelines. The review, authored by Xiyang Yin and shared on figshare, identifies four core themes in medical ethics education. It was last updated in March 2026 and is available as an 800.3 KB PDF.