Education Datasets | DataSalon

Education

Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics

13,406 datasets

Education

Orbit Wars MCTS Teacher Corpus

A corpus likely related to the Orbit Wars game environment and the Monte Carlo Tree Search (MCTS) algorithm. It was published on Kaggle, but the specific creator, size, and creation date are unknown. The content may include game state data or training logs intended for AI development.

Text, Game AI, Monte Carlo Tree Search, Reinforcement Learning, Natural Language Processing, Strategy Games

Education

Orbit Wars MCTS Teacher Output Weights

A dataset from Kaggle containing the output weights from a Monte Carlo Tree Search teacher model for the game 'Orbit Wars'. The dataset likely contains parameters or learned values from an AI training process. Its specific structure and size are not detailed in the available metadata.

Tabular, Machine Learning, Game AI, Monte Carlo Tree Search, Reinforcement Learning

Education

HLE-Verified: A Structured Revision Benchmark for AI Evaluation

HLE-Verified is a lightweight, evaluation-ready reformatting of the benchmark created by the Skylenage Team. The original work, 'HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam', was authored by Weiqi Zhai et al. and is associated with arXiv paper 2602.13964. The dataset was last updated on February 28, 2026.

Text, arXiv 2602.13964, AI Safety, Evaluation, Benchmark, License: CC BY 4.0, HLE, Source Datasets: Skylenage HLE-Verified, Region: US

Education

Municipal Property Assessment Subproperty Use Code Descriptions

Calgary's official list of subproperty use codes for property assessment. The dataset provides CODE and DESCRIPTION fields, published by data.calgary.ca. It was last updated in January 2026.

Tabular, CSV, XML, JSON, Calgary, Code, Subproperty, Tax Codes, Sub Property, Assessment, Property Assessment, Municipal Data

Education

Teacher Data from Kaggle

Teacher_new is a dataset hosted on the Kaggle platform. Its specific contents and structure are not detailed in the available metadata. The dataset's origin, size, and collection method are currently unknown.

Tabular, Education, Teachers, Kaggle

Education

UCSF Renal Mass CT Dataset: 831 3D Multiphase Exams with Annotations

This dataset contains 831 3D multiphase CT exams of renal masses, registered across phases and annotated to identify the masses. It is provided by the UCSF Larson Advanced Imaging Lab and hosted on AWS Open Data. It is licensed under CC-BY-4.0.

Image, Medicine, Medical Imaging, Life Sciences, Cancer, Renal, Radiology, Computed Tomography

Education

Exam Score Prediction Dataset

Kaggle hosts a dataset focused on predicting exam scores. The dataset's specific size, features, and collection methodology are not detailed in the provided metadata. Its origin and temporal coverage are also unknown.

Tabular, Exam Scores, Student Performance, Education, Prediction

Education

CO2 Frost Formation Data from Cryogenic Carbon Capture Experiment

A dataset from a 2020 University of Chester experiment that captures real-time CO2 frost formation in a vertical packed column. It includes Electrical Capacitance Tomography (ECT) capacitance measurements at 714 frames per second and temperature profiles recorded via thermocouples. The data supports analysis of frost formation dynamics for cryogenic carbon capture processes.

Carbon Capture And Storage, Tomography, NERC DDC, Carbon Dioxide, Cryogenics

Education

Ego4D: First-Person Video with Feature Extraction Tools

Ego4D is an egocentric video repository developed by Facebook Research that, as of March 2026, includes tools for feature extraction and visualization. It provides a framework for processing first-person video data for computer vision applications, though specific record counts are not listed in the repository metadata. The project includes example usage scripts to help researchers implement egocentric video models.

Video, Feature Extraction, Visualization, Computer Vision

Education

Financial Risk Assessment Data Preprocessed from Kaggle

This dataset was uploaded by PREETHAM GOUDA and is sourced from Kaggle. It is licensed as a US public domain work and is described as relating to financial risk and credit assessment.

Tabular, Machine Learning, Financial Risk, Banking, Credit Assessment, Finance

Education

Financial Risk Assessment Data from Kaggle

A dataset for financial risk assessment, originally sourced from Kaggle. The author is listed as PREETHAM GOUDA. The dataset is licensed under a US public domain license, but specific details on its size, features, and creation date are not provided in the metadata.

Tabular, Financial Risk, Tabular Data, Banking, Credit Assessment, Finance

Education

Global School Closures for COVID-19 from UNESCO Monitoring

UNESCO monitoring data tracks the impact of COVID-19 on education systems worldwide. Over 100 countries implemented nationwide closures, impacting over half of the world's student population. This data is compiled by UNESCO and distributed by HDX.

Tabular, Time Series, Public Policy, COVID-19, Global Education, School Closures

Education

Student Productivity Classification Dataset for Educational Machine Learning

A synthetic categorical dataset intended for educational machine learning classification tasks. The author, organization, and specific data volume are unknown. The dataset's last update date is also unknown.

Tabular, Education, Classification, Synthetic Data, Synthetic, Student Productivity

Education

XNomial: Exact Goodness-of-Fit Test for Multinomial Data

Bill Engels authored this statistical package for testing whether observed counts fit a given expected ratio. The description suggests it is used for applications like analyzing genetic cross data against expected frequencies. It provides exact tests via enumeration or Monte Carlo sampling, using statistics like the log-likelihood ratio or chi-square.

Tabular, Multinomial Distribution, Mathematics, Biology, Test Biology, Statistics, Goodness Of Fit, Applied Mathematics

Education

Kenyan Girls' Scholarship Program Evaluation with Exam Score Data

A randomized evaluation of a merit scholarship program for Kenyan girls, conducted by researcher Rebecca Thornton of Harvard University. The study measured exam score gains for scholarship recipients and tracked externalities for other students and teacher attendance. The data likely contains results from two districts, with heterogeneous program effects reported.

Tabular, Randomized Evaluation, Kenya, Liberation, Benchmark, Education, Chemistry, Political Science, Scholarship, Academic Performance

Education

doebioresearch: Analysis of Experimental Designs for Biological Research

Raj Popat created the doebioresearch package, which includes example datasets for analyzing seven common experimental designs in biological research. The analysis covers statistical methods like analysis of variance, normality tests, and multiple comparison tests. The package also provides functions for data transformation and yield conversion.

Tabular, Computer Science, DOE, Agricultural Trials, Biological Research, Experimental Design

Education

rENA: Epistemic Network Analysis for Discourse and Reasoning

A dataset for Epistemic Network Analysis (ENA), a method developed by Shaffer et al. for quantifying patterns in discourse or reasoning. The data likely contains coded connections for modeling networks, enabling both quantitative and qualitative comparisons. The dataset is authored by Cody Marquart and sourced from the Papers with Code platform.

Tabular, Epistemology, Psychology, Epistemic Network Analysis, Philosophy, Sociology, Discourse Analysis, Quantitative Ethnography

Education

Janitor: R Package for Data Cleaning and Tabulation

An R package built by Sam Firke for examining and cleaning data. It provides functions to format column names, create frequency tables and crosstabs, and explore duplicate records. The package follows tidyverse principles and is designed to be user-friendly for beginning and intermediate R users.

Tabular, R Package, Engineering, Epistemology, Computer Science, Data Science, Tidyverse, Simple Philosophy, Philosophy, Statistics, Biochemical Engineering, Data Cleaning, Computer Security

Education

Search Strategy for Intercultural Feedback Literacy Studies

J.P.C. Staaks created a documented search strategy for literature across five major academic databases: ERIC (Ovid), PsycInfo (Ovid), LLBA (ProQuest), Web of Science, and Scopus. The dataset likely contains the specific search terms and logic used to identify relevant studies on feedback literacy in intercultural educational contexts. Its primary function is to enable replication and systematic review of research in this domain.

Text, Secondary Education, Intercultural, Search Strategy, Feedback Literacy, Higher Education

Education

RAG-Per-Examples: Retrieval-Augmented Generation Demonstration Cases

A collection of example cases for Retrieval-Augmented Generation (RAG) systems, published on Kaggle. The dataset likely contains text-based prompts, contexts, and model responses intended to illustrate RAG workflows. Specific details on the number of examples, original authors, and creation date are unavailable.

Text, RAG Examples, Retrieval Augmented Generation, LLM Training, Natural Language Processing
