Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,001 datasets
528 ChatML-formatted rows contain trajectories of code execution, rendered for instruction tuning with a 48,000-token budget. The dataset, created by user 'bytkim', was last updated on Hugging Face in May 2026. It includes prompt/completion expansions for supervised fine-tuning tooling alongside token-budget and lineage reports.
OCDEL provides a listing of open certified child care facilities and early learning programs in Pennsylvania, current as of the last day of each month. The dataset includes licensing, capacity, and geospatial information for each provider. Data is published by the Pennsylvania Office of Child Development and Early Learning (OCDEL) via data.pa.gov.
Legacy Massachusetts Comprehensive Assessment System (MCAS) results for grades 3-10 and high school science from 2007 to 2022. The dataset includes achievement levels, participation rates, average scaled scores, and median student growth percentiles. It is published by the Massachusetts Department of Elementary and Secondary Education on the educationtocareer.data.mass.gov platform.
Data on the number of pupils in primary special education, organized by administrative weight scheme and competent authority. The dataset originates from the Ministry of Education and is published via the eu_open_data platform under a CC0-1.0 license. Temporal coverage and last update date are unknown.
Dutch vocational education (MBO) competent authorities, specifically school boards, with their address details. The dataset includes branch and correspondence addresses, denomination, and administrative office numbers. It is provided by the Ministry of the Interior and Kingdom Relations under a CC-BY-4.0 license.
Supplementary material for "Exploring Optometrists’ Practice Patterns in Falls Prevention Management", a qualitative study of 23 Australian community optometrists. The dataset likely contains transcripts or analysis from four focus groups exploring knowledge, barriers, and enablers for implementing falls prevention in clinical practice. The study identifies a substantial evidence-to-practice gap, along with contributing factors such as perceived low knowledge and sporadic clinical integration.
Dalong Zhang's dataset compares competency outcomes for emergency medicine residents undergoing "Script Killing" immersive teaching versus traditional methods. The study involved 36 rotating residents at Zhengzhou People's Hospital from November 2024 to March 2025. It includes scores for theoretical knowledge, OSCE performance, triage accuracy, and 3-month knowledge retention.
This dataset describes the development and implementation of a MediaPipe-based AI program for health-oriented middle school physical education. The study used a participatory action research design over three months in one public middle school, involving PE teachers and 9th-grade students as co-developers. The program evolved through three cycles to recognize fitness movements and provide visual and auditory feedback.
10,996 samples, including 4,388 substrates and 2,880 nonsubstrates curated from 1,629 publications, plus 3,728 pseudonegative samples, for nine major CYP isoforms. The dataset was created by Yingjie Yang and published on figshare in March 2026. It supports the EviCYP framework, which integrates evidential deep learning and vector quantization to predict substrates with uncertainty estimates.
Yukon's GARDEd database contains over 350,000 surficial geochemical samples from assessment reports, primarily from the Dawson Range submitted since 2005. Samples are categorized as rock, soil, stream sediments (silt), or vegetation, with the initial release representing approximately 10% of Yukon's assessment report geochemical data.
20,750 students aged 5–18 years from Southern China were screened for myopia between 2018 and 2021. Ningfeng Li authored this serial cross-sectional study comparing two diagnostic criteria: spherical equivalent refraction alone versus combined with uncorrected visual acuity. The combined criterion consistently yielded lower prevalence estimates, revealing a 15–21% annual relative overestimation by the simpler method.
Dissolved CO2, CH4, and N2O data collected from intermittent and permanent streams in Te Muri Regional Park, Auckland/Tāmaki Makaurau, New Zealand/Aotearoa. The data was gathered for a doctoral dissertation by Julia Jakobsson at Waipapa Taumata Rau | University of Auckland, published in 2025.
1,146 cleaned ChatML rows of teacher-student code generation trajectories rendered with a 48,000-token budget. The dataset, created by bytkim, was last updated on May 7, 2026. It includes prompt/completion expansions for supervised fine-tuning tooling and detailed row-level provenance reports.
A 132.5 KB dataset of text documents describing the effects of the herb Valeriana jatamansi on learning and memory in rats. The dataset was authored by Lei An and last updated on April 21, 2026. It is available under a CC-BY-4.0 license.
A collection of 63 high-quality instruction-following datasets containing nearly 25 million samples. Each sample is scored across 30 dimensions, including IFD, PPL, and Deita_Quality. The dataset was created by OpenDataArena and last updated on April 27, 2026.
A dataset supporting a hybrid knowledge- and data-driven methodology for assessing digital maturity in higher education. It includes core data and MATLAB code for the proposed approach, such as an initial influence matrix and Kendall rank correlation coefficients for the key influencing factors. The dataset was authored by Yuqi Zhao and last updated on May 17, 2026.
Projected 30-year, 50-year, and 100-year sea level return levels for harbours in British Columbia for 2050 and 2100 under the SSP126 low emission scenario. The data combines present extreme levels from 1993-2020 simulations with IPCC-projected mean sea level rise, adjusted for local vertical land motion. It is provided by Fisheries and Oceans Canada.
A dataset by Ghulam Madni, published on figshare in May 2026. It is a small dataset (30.2 KB) in XLSX format, likely containing data related to the impact of AI agent personas in educational settings. The specific row count, column details, and geographic scope are not provided in the metadata.
Projected 30-year, 50-year, and 100-year return levels for harbours in British Columbia under the SSP585 high emission scenario for 2050 and 2100. The data combines present extreme sea levels derived from 1993-2020 hourly coastal sea level simulations with projected mean sea level rise from IPCC AR6, adjusted for local vertical land motion.
Fisheries and Oceans Canada provides projected 30-year, 50-year, and 100-year extreme sea level return levels for harbours in British Columbia. The dataset combines present extremes derived from 1993-2020 hourly coastal sea level simulations with projected mean sea level rise from the IPCC 6th Assessment Report under the SSP245 scenario. Projections are provided for the years 2050 and 2100, adjusted for local vertical land motion.