DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Education Datasets | DataSalon

🎓

Education

Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics

13,372 datasets

Education

Chicago School Probation Policy Study: External Support and Student Outcomes

Chicago Public Schools implemented a probation policy in 1996 for elementary schools where fewer than 15% of students scored at grade-level norms on the Iowa Test of Basic Skills in reading. The dataset likely contains results from a two-year study analyzing the design and implementation of this policy, including the external support provided to schools. The study was authored by Kara Finnigan and focuses on the policy's consequences and assistance mechanisms.

Tabular, Pedagogy, Medicine, Education Policy, Accountability, Medical Education, Student Performance, Psychology, Reading Process, Test Biology, Autonomy, School Accountability, Probation Support, Public Relations, Public Administration, Mathematics Education, Sanctions, Political Science, Curriculum

National Education Longitudinal Study: 1988 Eighth-Grade Cohort

A major longitudinal effort by the United States Department of Education to track the 1988 eighth-grade cohort through high school and beyond. The study collects policy-relevant data on educational processes, dropout predictors, and equal opportunity from four components: Parent, School Administrator, Student, and Teacher. This first stage provides foundational trend data on critical transitions from elementary school.

Tabular, Education Policy, Teacher Data, Mathematics, Psychology, Student Achievement, Longitudinal Study, Geography, Statistics, Mathematics Education, Political Science, Socioeconomic Factors

Ocean Station P: Zooplankton Depth Profiles, 1971

June to July 1971 data from the SUBARCTIC-A cruise of the R/V YAQUINA details the vertical distribution of zooplankton at Ocean Station 'P' in the Subarctic Pacific. The analog report contains tabular estimates of zooplankton displacement volumes and numerical densities for many species or categories at different depths. This archival dataset provides a snapshot of pelagic community structure from a historically significant oceanographic station.

Tabular, Time Series, Ocean Station P, Oceanography, Zooplankton, Subarctic Pacific, Vertical Distribution

tom_checkpoint_dropout_0.3: Machine Learning Model Checkpoint

A machine learning model checkpoint named 'tom_checkpoint_dropout_0.3' published on Kaggle. The dataset likely contains saved model parameters and architecture details. Its specific contents and creation date are unknown.

Tabular, Machine Learning, Model Checkpoint, Dropout

TOM_BEST_NO_DROPOUT: Student Performance Data Without Dropouts

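A dataset titled TOM_BEST_NO_DROPOUT, published on Kaggle. Its name suggests a focus on student performance or educational outcomes, specifically excluding dropout cases, though "dropout" may instead refer to the neural-network regularization technique, as in the 'tom_checkpoint_dropout_0.3' checkpoint listed on this page. The dataset's content, scale, and authorship are unknown from the provided metadata.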

Tabular, Student Performance, Dropout Prediction, Education

Lyrics For Learning: A Text Corpus for Language Tasks

Lyrics For Learning is a text dataset published on Kaggle. The title suggests it contains song lyrics intended for educational or analytical purposes. Metadata is minimal; actual content, size, and structure require verification after download.

Text, Audio, Language Learning, Lyrics, Text Data

Machine Learning Competition Data from Kaggle

Kaggle hosts a dataset from a machine learning competition. The specific topic and scale are not detailed in the provided metadata. The data is likely structured for predictive modeling tasks typical of the platform.

Tabular, Machine Learning, Competition, Modeling

Student Performance Prediction Dataset

Student Performance Prediction Dataset is hosted on Kaggle and likely contains information for predicting student academic outcomes. Specific details such as column definitions, size, and origin are not provided in the metadata.

Tabular, Student Performance, Education, Prediction

Student Exam Performance Dataset from Kaggle

A dataset concerning student exam performance, sourced from the Kaggle platform. The specific number of records, features, and temporal coverage are not provided in the available metadata. The dataset's content and structure require verification after download.

Tabular, Student Performance, Education, Examination

London School Air Quality Exposure by Parliamentary Constituency, 2013

The Greater London Authority provides a dataset of air quality exposure for educational institutions, broken down by parliamentary constituency. The data is based on the 2013 London Atmospheric Emissions Inventory and includes all educational establishments except private nurseries. The legal limit for NO₂ is an average annual concentration of 40 µg/m³.

Tabular, Geospatial, Air Quality, Environmental Exposure, Education, Public Health

Synthetic Multi-Turn Conversations for Browser Agent Tasks

This dataset contains 1,062 synthetic multi-turn conversations between a user and an AI assistant, focused on practical agentic tasks such as train ticket booking, dynamic form filling, and payment processing. Created by DataCreator AI, it covers diverse scenarios including successful execution, context retrieval, tool integration, and failure recovery.

JSON, Browser Tasks, Size Categories: 1K<n<10K, Task Categories: text-generation, Library: polars, Language: en, Tool Calling, Library: mlcroissant, Library: datasets, Library: pandas, License: CC BY 4.0, Function Calling, SFT, Region: us, LLM Training, Agentic Tasks, Synthetic Data

International Student Data on Metacognition, Strategies, and Achievement

A study analyzing factors influencing student learning across countries, using data from the Programme for International Student Assessment (PISA). The dataset, authored by Gregory J. Marchant, examines relationships between student demographics, academic strategies, metacognition, and achievement. The primary data source is PISA 2009, focusing on cognitive and control strategies used by students.

Tabular, Developmental Psychology, Cross Country, Psychology, Student Achievement, Education, Cognitive Psychology, Metacognition, Geography, Mathematics Education, Demography, Sociology, Demographics, Cognition

multiROC: Tools for ROC and PR Curve Calculation in Multi-Class Classification

Runmin Wei authored a software package for evaluating multi-class classification models. The package computes areas under ROC and PR curves using micro-averaging and macro-averaging methods. Its methodology references academic papers by Van Asch (2013) and Pedregosa et al. (2011).

Tabular, Class Philosophy, Machine Learning, Computer Science, Mathematics, Model Evaluation, Artificial Intelligence, PR Curves, Classification, Statistics, ROC Curves, Receiver Operating Characteristic

Instructional Quality Assessment: Classroom Observation Toolkit

A formal toolkit for rating instructional quality based primarily on classroom observation and student assignments. The toolkit was developed for reading comprehension and mathematics at the elementary school level. A large pilot study was conducted in Spring 2003 in two moderately large urban school districts.

Tabular, Engineering, Epistemology, Computer Science, Education Assessment, Quality Assessment, Reliability Engineering, Classroom Observation, Instructional Quality, Philosophy, Evaluation Methods, Quality Philosophy

Learning to Eat Soup With a Knife: Counterinsurgency Case Studies from Malaya and Vietnam

Lieutenant Colonel John A. Nagl's book compares counterinsurgency doctrine development in the Malayan Emergency (1948-1960) and the Vietnam War (1950-1975). The analysis uses archival sources and participant interviews to argue that organizational culture determines an army's ability to adapt. A new preface reflects on the author's combat experience in Iraq.

Text, Subject Documents, Library Science, History, Colonialism, Military History, Historical Analysis, Law, Vietnam War, Doctrine, Political Science, Organizational Learning, Counterinsurgency

Supervised Fine-Tuning Dataset For Language Model Training

SFT-Dataset is a collection for supervised fine-tuning of base language models like Qwen/Qwen3-4B-Base. The dataset is tagged for text generation, reasoning, math, science, and code tasks. It is authored by 96kevinli29 and was last updated in March 2026.

Parquet, Size Categories: 10K<n<100K, Task Categories: text-generation, License: other, Library: polars, Language: en, Modality: text, Code, Library: mlcroissant, Library: datasets, Library: pandas, SFT, Region: us, Reasoning, Science, Math, Supervised Fine Tuning

Teacher Data from Kaggle

Teacher data published on Kaggle. The dataset's specific content and scope are not detailed in the provided metadata. Columns, sample data, and other specifics are unknown.

Tabular, Schools, Education, Teachers

DigitalCorpora: Synthetic Digital Forensics Evidence for Research

DigitalCorpora provides disk images, memory dumps, and network packet captures for digital forensics research. The data is synthetic, created by students and faculty acting in persona, allowing use without prior authorization or IRB approval. This collection is hosted on AWS S3 and accessible via the digitalcorpora.org website.

Multimodal, Disk Images, Machine Learning, Machine Translation, Network Traffic, Text Analysis, Imaging, Internet, Cyber Security, Intrusion Detection, CSI, Information Retrieval, Synthetic, Computer Security, Image Processing, Digital Forensics, Computer Forensics

Machine Learning Model Performance on Plasmonic Sensor Metrics

Figshare hosts a 5.5 KB dataset analyzing three machine learning regression models. The data compares model performance on confinement loss and wavelength sensitivity metrics for enhanced plasmonic sensors. Author Sonia Akter contributed this analysis under a CC BY 4.0 license.

Machine Learning, Surface Plasmon Resonance, Confinement Loss, Achieving Ultrahigh Sensitivity, Random Forest Regression, Predicting Optical Responses, Surpassing Prior ML, Integrating Machine Learning, Critical Performance Metrics, Wavelength Sensitivity, Capturing Confinement Loss, Achieving Superior Accuracy, Record Wavelength Sensitivity, Algorithm Computational Framework, Enhanced Plasmonic Sensors, Machine Learning Interpretation, Hole Rings

Classroom Behavior Dataset: Student Attention and Teaching Data

Student Attention and Teaching Behavior Data published on Kaggle. The dataset likely contains observations linking student engagement metrics with instructor actions. Its specific size, collection method, and temporal coverage are not detailed in the available metadata.

Tabular, Education Analytics, Classroom Behavior, Student Attention
