Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
12,989 datasets
An instruction dataset for language model fine-tuning, processed to remove semantically similar prompts. The dataset was created by author kaushik-harsh-99 and was last updated on 2026-05-16. It is a version of a prior dataset where near-identical instructions with different wording have been deduplicated.
Median earnings data for full-time workers aged 25-64 in Alberta, Canada, for the year 2010. The dataset is broken down by ten distinct levels of educational attainment and by sex. It is an official statistic published by the Government of Alberta.
A structured psychoeducation intervention dataset adapted from a 2012 study by Sharif et al. The dataset was uploaded by Adamu Kenea to figshare in April 2026. It is a small Excel file of 17.5 KB.
Compass is a collection of processed 16S and shotgun-derived microbiome tables formatted for machine learning. The dataset is organized into benchmark tasks, each with a specific configuration. The author is outpost-bio, and the dataset was last updated on May 12, 2026.
The 1996 TARFOX campaign sampled the US mid-Atlantic haze plume, revealing the unexpected importance of carbonaceous compounds and water condensed on aerosols. Coordinated measurements from four satellites and four aircraft captured gradients in aerosol optical thickness to isolate aerosol radiative effects. This dataset supports closure tests for satellite retrievals and provides in-situ measurements of aerosol optical depth, backscatter, and extinction.
School-level per-pupil expenditures by major functional categories and funding sources, starting from fiscal year 2019. The dataset also includes school-level enrollment, demographic indicators, teacher salary and staffing data, and MCAS performance metrics. It is published on the Massachusetts educationtocareer.data.mass.gov platform and is part of a suite of three related finance datasets.
Metadata scraped from the website of The New Centre for Research & Practice, a para-academic institution, describing its educational seminars. The dataset was created by the mlx-community and last updated on Hugging Face in April 2026. It likely contains titles and summarized descriptions for seminars listed as 'non-members available'.
New York City's Department of Citywide Administrative Services (DCAS) maintains a network of over 1,600 electric vehicle charging ports across more than 1,025 stations for city fleet use. The dataset includes station locations, charger types, and operational details for fleet operators. It was last updated on March 12, 2026.
A geospatial dataset delineating areas of catchments that drain to points on main tributaries immediately upstream of Victorian estuaries. It was derived by Deakin University for a project funded by the National Heritage Trust and the Department of Sustainability and Environment. The dataset is provided by the Department of Energy, Environment and Climate Action and was last updated in April 2026.
Synthetic business contact datasets are commonly used for CRM testing and marketing analytics workflows. This dataset provides a realistic example of business contact database structures, including fields like First Name, Last Name, Company, Job Title, Email, and Country. It was created by emailmarketingdataset and last updated on March 16, 2026.
350 university students in Guangzhou responded to an online questionnaire on sports dance appreciation. Principal Component Analysis and cluster analysis identified three core appreciation dimensions and three distinct appreciator typologies. The study, authored by Yuanmei He and last updated in March 2026, constructs a theoretical framework for understanding this hybrid art-sport form.
321 third-grade students from six Hong Kong primary schools participated in a study examining the relationship between arithmetic estimation and inhibitory control tasks with visual distractors. The dataset, authored by Kerry Lee and last updated in March 2026, includes results from math, inhibitory, updating, working memory, and processing speed tasks. Significant distraction effects were observed, but individual differences in inhibitory abilities did not explain variance in math performance.
A 2026 study collected survey data from 689 physical education and sport teachers affected by an earthquake. The data, published by Şıhmehmet Yiğit, includes measures of psychological resilience, life engagement, coping humor, and post-earthquake trauma levels. Statistical analyses were performed using SPSS and the PROCESS Macro.
Deakin University created this geospatial dataset delineating areas of Victorian coastal catchments that drain directly to estuaries. The work was part of the project 'Linking catchments to the sea: Understanding how human activities impact on Victorian estuaries', funded by the National Heritage Trust and the Department of Sustainability and Environment. The dataset was last updated on 2026-04-09.
A mixed-methods study in Kazakhstan explores digital STEM approaches to improve school-university continuity in mathematics education. The research includes an experimental study with two 10th-grade classes and a survey of 33 university participants. It reports a 15-16% improvement in student academic performance and an average positive attitude score of 3.85 from university respondents.
Alma Abylkassymova's mixed-methods study data explores digitalized STEM approaches for mathematics education continuity. The dataset contains results from an experimental study in two 10th-grade classes and a survey of 33 university participants, measuring academic performance gains and attitudes. Findings include a 15-16% student performance improvement and an average survey score of 3.85 on STEM-digital integration.
38 hours of speech from 120 unscripted telephone conversations between native Spanish speakers, developed by the Linguistic Data Consortium. This second edition combines original audio and transcripts with updated formats and revised transcriptions conforming to modern guidelines. Calls originated in North America to overseas locations, with participants speaking to family or friends on topics of their choice.
Approximately 49 hours of speech from 120 unscripted telephone conversations between native Japanese speakers, developed by the Linguistic Data Consortium (LDC). This second edition re-releases the original CALLHOME Japanese collection with updated transcripts, file formats, and documentation. The data was collected as part of the CALLHOME series to support research in speaker and language identification.
A 1989 Statewide Assessment Report provides the boundaries for Land Conservation Council Study Areas, depicted as line and polygon features. The dataset, published by the Department of Energy, Environment and Climate Action, includes information on historic studies but excludes boundaries for Special Investigations. It was last updated on April 8, 2026.
Motor performance, anthropometric, and subjective coach assessment data for 46 German youth 3x3 basketball players. The dataset supports a methodological comparison of logistic regression and fast-and-frugal tree models for predicting player selection. Nora Cermak authored the study, with data last updated in March 2026.