DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


Education Datasets | DataSalon


🎓

Education

Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics

13,019 datasets


Cyclone Gezani Building Damage Assessment for Toamasina

Microsoft AI for Good Lab mapped affected buildings in Toamasina, Madagascar, following Cyclone Gezani using Airbus satellite imagery and AI models. The dataset provides geospatial damage assessment data. It was last updated in April 2026.

Geospatial, Facilities Infrastructure, Geodata, Cyclones Hurricanes Typhoons, Cyclone Impact, Damage Assessment, Building Infrastructure, Natural Disasters, Disaster Response, +1


Monthly Non-School Agency Staff Expenditure from Government Digital Service

Monthly spend on non-school agency staff published by the Government Digital Service. The data is updated and released on a monthly schedule, suggesting a regular time series. The dataset originates from the eu_open_data platform.

Tabular, CSV, JSON, Government Spending, Education Staffing, Monthly Expenditure, +1


Annual Government Spending on Non-School Agency Staff by Financial Year

Government Digital Service data summarizes annual expenditure on non-school agency staff. The dataset is derived from more granular monthly spending records. It is published as open data on the EU platform.

Tabular, CSV, JSON, Government Spending, Finance, Public Sector, Agency staff, Financial Year, +1


Ableton God Producer Dataset: 9,000 Examples for AI Music Production Training

The Ableton Live God-Level Beat Producer Dataset contains 9,000 examples designed to train large language models to act like professional producers in Ableton Live 12 Suite. It was created by author 11-47 and last updated on Hugging Face in May 2026. The dataset covers topics such as drum programming, bass sound design, mixing, and live performance setup.

Text, Audio, Ableton Live, Sound Design, Ai Training, Large Scale, Beat Making, Music Production, +1


Assessment of the effects of the inclusion of poor quality sediment samples on spatial pre

A study assessing the impact of poor-quality dredged sediment samples on spatial predictions of seabed mud content in the Australian marine margin. The analysis uses 14,204 samples from the MARS database, focusing on two regions: the Southwest Region (407 samples) and the Petrel Region (534 samples). Predictions were made using Inverse Distance Weighting and Ordinary Kriging, with accuracy measured by relative mean absolute error.

Tabular, Geospatial, 🇦🇺 Australia, Benchmark, Spatial Interpolation, Finance, Data Quality, Marine Geology, Synthetic, +1


Active School Buses by Model Year and Operator in Newfoundland and Labrador

Newfoundland and Labrador's provincial government provides data on the number of active school buses. The dataset likely contains counts broken down by model year, region, and operator type (school board versus private). It was last updated on April 17, 2026.

Tabular, Vehicle Registry, Transportation, Education, Public Sector, School Buses, +1


Reliability and Quality Scores for Dietary Weight Loss Videos on Chinese Platforms

300 videos from BiliBili, TikTok, and Kwai were assessed in February 2024 using the Global Quality Score (GQS) and modified DISCERN (mDISCERN) tools. The dataset likely contains scores and video characteristics like duration and engagement metrics. It was created by Shiqi Zhou and is licensed under CC-BY-4.0.

Tabular, 🇨🇳 China, Internet Videos, Quality Assessment, Weight Loss, Dietary Intervention, Healthcare, Reliability Assessment, Large Scale, +1


Expert Ratings of LLM-Generated Basketball Layup Lesson Plans

A 2026 expert survey comparing lesson plans generated by three large language models for teaching a basketball layup. Benedikt Meixner submitted a prompt to GPT-4o, Claude Sonnet, and Google Gemini, and teaching experts rated the plans on 28 predefined quality criteria using 5-point Likert scales. The dataset likely contains the expert ratings and statistical analysis results.

Tabular, Basketball, Lesson Planning, Physical Education, Artificial Intelligence, Teaching, Large Language Model, Higher Education, Lesson Plan, Synthetic, +1


LLM-Generated Basketball Lesson Plans Evaluated by Experts

28 quality criteria were used to evaluate basketball layup lesson plans generated by three LLMs (GPT-4o, Claude Sonnet, and Google Gemini). Teaching experts rated the plans on 5-point Likert scales, with the most frequent median rating being 'acceptable'. Benedikt Meixner published this research on figshare in March 2026.

Tabular, Basketball, Lesson Planning, Physical Education, Artificial Intelligence, Teaching, Large Language Model, Higher Education, Lesson Plan, Synthetic, +1


ChemModernBERT: LD50 Predictions for 8,898 Compounds

8,898 compounds with median lethal dose (LD50) values form a curated dataset for toxicity prediction. Tanuj Sharma authored this research, which compares four molecular representation learning approaches, including the ChemModernBERT model. The dataset was last updated on March 19, 2026.

Tabular, Machine Learning, Benchmark, Oral Toxicity, Large Language Model, Chemical Compounds, Large Scale, Toxicity Prediction, Ld50 Prediction, Chem Modern Bert, +1


TrustXQoE: Cross-Layer HLS Edge QoE/QoS Dataset for Video Streaming

TrustXQoE is a large-scale, human-annotated dataset for HTTP Live Streaming research, collected in an SDN-CDN edge environment. The dataset includes synchronized measurements from multiple layers of the video delivery chain, such as client playback, network QoS, server metrics, and MOS/QoE labels. It was authored by Abdelhak Heroucha and published on Harvard Dataverse in May 2026.

Tabular, Time Series, Trust Aware Networking, Video Streaming, Edge Computing, Large Scale, Sdn Cdn, Quality Of Experience, +1


School Suspensions and Removals in the Netherlands, 2014 Onward

Suspension and removal reports from Dutch primary, special, and secondary education schools submitted to the Inspectorate since 2014. The data includes national counts by sector and education type, suspension durations, and the frequency of given reasons. The dataset is provided by the Ministry of the Interior and Kingdom Relations and is available in ODS format.

Tabular, Policy Monitoring, Secondary Education, School Suspensions, +1


School Weight PO: Socioeconomic Weighting for Primary School Assessment in the Netherlands

School weighting data used to assess learning outcomes in Dutch primary education. The Central Bureau of Statistics calculates a school's weighting based on parental education levels, maternal education averages, parental country of origin, mother's length of stay in the Netherlands, and parental debt remediation status. This model, from the Dutch Ministry of the Interior and Kingdom Relations, entered into force on 1 August 2020.

Tabular, School Performance, Education, Policy Assessment, Socioeconomic Factors, +1


On The Books: Jim Crow Law Classifications in North Carolina Session Laws, 1866–1967

On the Books is a labeled training set from a collections-as-data project at UNC Chapel Hill Libraries. It contains expert-labeled chapter/section pairs from North Carolina session laws passed between 1866 and 1967, identifying them as Jim Crow laws or not. The dataset was created by author biglam and last updated on Hugging Face in April 2026.

Tabular, Legal Documents, Historical Text, Machine Learning Training, Jim Crow Laws, North Carolina, +1


Symposium Proceedings Honouring Michael Collins in Sediment Dynamics, 2007

A symposium held at the University of Wales, Swansea in July 2007 honoured the career of Professor Michael Collins. The event, organized by his former students, celebrated his contributions over 30 years as a scientist, teacher, and mentor. About 30 of his 50+ postgraduate students attended to discuss the various subjects and projects he supervised.

Text, Oceanography, Tribute Publication, Academic Symposium, Sediment Dynamics, +1


Taiwan High School Subject Knowledge Texts in Traditional Chinese

tw-highschool is a dataset of Traditional Chinese knowledge texts covering four major high school subjects in Taiwan: biology, chemistry, mathematics, and physics. Curated by Huang Liang Hsun, it likely contains explanatory paragraphs, formula derivations, example problems, and terminology glossaries. The dataset is intended as supplementary corpus material for training or fine-tuning models with Taiwanese high school subject knowledge.

Text, Traditional Chinese, High School, Education, Natural Language Processing, Subject Knowledge, Text Corpus, +1


Longitudinal Audio Diaries on Medical Student Transition to Clinical Practice

From August 2017 to July 2020, audio diaries were collected from medical students progressing from clerkship through their first postgraduate year. The data captures challenges in applying knowledge, team integration, and role uncertainty, analyzed through Gruppen's Learning Environment Framework. This longitudinal design reveals how learners adapt to increasing clinical responsibility over time.

Text, Audio, Medical Students, Medical Education, Healthcare, Longitudinal Study, Medical Training Challenges, Audio Diaries, Clinical Transition, Gruppens Learning Environment Framework, Gruppen Framework, +1


Master's Student Online Learning Burnout Factors via Q-Methodology

15 master's students participated in a Q-methodology study exploring causes of learning burnout in online environments. The research identifies four factor groups, including lack of self-management and challenges in adapting to the online classroom. The findings are presented in a 12.2 KB DOCX file.

Q Methodology, Learning Burnout, Masters Students, Qualitative Study, Online Learning, +1


Master's Student Online Learning Burnout Causes via Q-Methodology

Fifteen master's students participated in a Q-methodology study to identify causes of learning burnout in online environments. The research categorized burnout factors into four distinct groups, including self-management challenges and environmental distractions. The findings are intended to inform graduate course design and instructional practices.

Q Methodology, Learning Burnout, Masters Students, Qualitative Study, Online Learning, +1


Systematic Review of Healthcare Ethics Education Trends from 2015 to 2025

145 studies published from 2015 to 2025 were analyzed in a systematic review following PRISMA guidelines. The review, authored by Xiyang Yin and shared on figshare, identifies four core themes in medical ethics education. It was last updated in March 2026 and is available as an 800.3 KB PDF.

Text, Systematic Review, Competency Based Programs, Medical Education, Ethics Education, Healthcare, Professionalism, Mutual Safety, Virtues And Character, Healthcare Professions, +1
