DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Education Datasets | DataSalon

🎓

Education

Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics

13,362 datasets

Synthetic Resume Credibility Assessment Dataset for NLP

CAD-S is the first openly available dataset for resume credibility assessment using NLP. It supports supervised learning for detecting inconsistencies between claimed skills and supporting evidence within resumes. The dataset was created by aselasperera and was last updated in March 2026.

CSV, Size Categories: 10K<n<100K, Library: polars, Resume, Language: en, Technical, Modality: text, Library: mlcroissant, Library: datasets, Library: pandas, License: CC BY 4.0, HR, Region: us, Task Categories: text classification, +1 more

Titanic Passenger Data for Machine Learning Classification

Titanic passenger data is a canonical benchmark for binary classification tasks in machine learning education. The dataset is published on Kaggle, a platform for data science competitions and projects. Its exact size, features, and provenance are unspecified in the provided metadata.

Tabular, Machine Learning, Titanic, Classification, Historical Data, +1 more

Education Autolabel: Machine-Learning Labeled Data

Education Autolabel is a dataset published on HuggingFace by author shyuni. The dataset likely contains labeled data for educational applications, inferred from its title. Its last update was recorded on 2026-05-01 12:58:12.

Text, Machine Learning, Autolabel, Education, +1 more

Qwen3.6 Plus High Reasoning 500X: A Knowledge Distillation Dataset

The dataset contains 1,739,249 tokens of text generated by the Qwen3.6-plus model for knowledge distillation. It covers topics including coding, mathematics, finance, medicine, and economics, with a maximum sequence length of 6,500 tokens per row. It was created by author 'ansulev' and last updated on April 8 (year not stated).

Text, Mathematics, Finance, LLM Training, Coding, Knowledge Distillation, +1 more

Supervised Learning Performance Metrics for Pediatric Diarrhea Prediction in Sierra Leone

2020 data from the Sierra Leone Demographic and Health Survey (SLDHS) provides performance metrics for machine learning models predicting diarrhea in children under five. The dataset, authored by Yahye Hassan Muse, contains model evaluation results in a 9.5 KB Excel file.

Tabular, Excel, Diarrhea Prediction, Sierra Leone, Pediatric Health, Machine Learning Metrics, Public Health, +1 more

Historical Audit Documents for the State of Oregon

This dataset contains historical audit documents published by the State of Oregon. The data populates the Audits Search tool on the Oregon Secretary of State website and was last updated on March 8, 2026. The records are available in multiple machine-readable formats, including XML, RDF, JSON, and CSV.

Text, CSV, XML, JSON, Historical Documents, Public Records, Oregon, Audit, CAFR, Government Audits, +1 more

Revenue Example Dataset

Revenue Example is a dataset hosted on Kaggle. Its specific content and scope must be verified after download, as detailed metadata is not provided. The author, organization, and data collection method are unknown.

Tabular, Example, Revenue, Business, +1 more

Personality and Enrichment Interventions for Garnett's Bushbabies

Lauren Highfill's research dataset examines the relationship between personality traits and environmental enrichment effectiveness in Garnett's bushbabies. It contains assessments of five personality factors and behavioral outcomes for ten subjects across five different enrichment interventions. The study aims to inform individualized animal management strategies based on personality differences.

Tabular, Environmental Enrichment, Psychology, Animal Welfare, Primate Behavior, Animal Personality, Social Psychology, +1 more

International Student Assessment Scores For 15-Year-Olds

PISA 2003 measured the capabilities of 15-year-old students in reading, mathematics, and science literacy across participating countries. The study, conducted by the Organisation for Economic Co-operation and Development (OECD), achieved an 83 percent response rate from students sampled in April-May 2003. Mathematics literacy was the primary subject area assessed in depth for this cycle.

Tabular, Pedagogy, Medicine, Education Assessment, Medical Education, Student Performance, Psychology, Literacy, Reading Process, International Study, Mathematics Education, Political Science, Grade Level, +1 more

University Enrolment Counts by Program and Student Origin

Enrolment data is aggregated by institution, credential type, level of study, and student origin. The dataset is provided by data.novascotia.ca and was last updated in February 2026. It includes fields for major field of study, province of residence, and registration status.

Tabular, CSV, XML, JSON, Post Secondary Education, Student, Education Statistics, Canadian Universities, University, Student Enrolment, Enrolment, +1 more

HPTN 068: Post-Intervention Health Assessment Data for Young Women

HPTN 068 post-intervention case report form data was collected by The Statistical and Data Management Center. The dataset contains follow-up assessments from young women participants who returned for a scheduled post-intervention visit, designated as Visit Code 701. The data was last updated on April 10, 2026.

Tabular, HIV Prevention, Post Intervention, Clinical Trials, Health Assessment, +1 more

Niger Delta University Nursing Faculty Student Registrations for 2026/2027

The dataset titled 'Faculty of Nursing Sciences, Niger Delta University, Amassoma 2026/2027 Registra' likely contains student registration records for the 2026/2027 academic year. The data appears to be sourced from a Nigerian university's nursing faculty. The dataset's exact structure and size are unknown.

Tabular, Nursing, Education, Nigeria, Student Registration, +1 more

Machine Learning Dataset from Kaggle

Kaggle hosts a dataset titled 'machine learning'. The dataset's specific content, size, and origin are not detailed in the provided metadata. Metadata is minimal; actual content requires verification after download.

Tabular, Machine Learning, Education, Artificial Intelligence, +1 more

exam-LoRa: Low-Rank Adaptation Training Data

exam-LoRa is a dataset published on Kaggle. The title suggests it contains data related to Low-Rank Adaptation (LoRA), a technique for fine-tuning large machine learning models. Its specific content, size, and origin are not detailed in the available metadata.

Tabular, Machine Learning, Fine Tuning, Low Rank Adaptation, +1 more

Satellite and In-Situ Aquatic Data for Temporal Mismatch Analysis

YSI sonde in-situ data supports research quantifying temporal mismatches with satellite observations in aquatic environments. The dataset is associated with a 2026 publication in Remote Sensing Letters. Specific row counts, column details, and file formats are not provided in the source description.

Tabular, Time Series, Geospatial, In Situ Measurement, Aquatic Environment, Satellite Data, +1 more

Afghanistan Multi-Sector Assessment 2019

This dataset contains 2019 data from a nationwide multi-sector assessment conducted by the REACH Initiative. It covers topics including Education, Health, Needs Assessment, and Socioeconomics. Specific details on row count, column count, and temporal coverage are unavailable.

Education, Health, Needs Assessment, Socioeconomics, +1 more

Afghanistan Multi-Sector Needs Assessment 2020

REACH Initiative's Whole of Afghanistan Assessment 2020 dataset documents multi-sectorial needs across the country. The assessment covers sectors including Facilities Infrastructure, Nutrition, and Health. It provides a yearly snapshot of conditions for humanitarian planning.

Facilities Infrastructure, Nutrition, Health, +1 more

Student Classroom Engagement Dataset with Behavioral Indicators

The Student Classroom Engagement Dataset contains behavioral, attention, participation, and learning efficiency indicators. It was sourced from Kaggle, but the author, organization, and specific collection details are unknown. The dataset's size, row count, and last update date are unspecified.

Tabular, Classroom Engagement, Student Behavior, Learning Efficiency, +1 more

Indian Used Car Price Prediction Data for Tata Motors

This is a machine-learning-ready dataset for Indian used car price prediction. It is sourced from Kaggle and focuses on Tata Motors vehicles. Specific details on size, authorship, and update frequency are not provided.

Tabular, 🇮🇳 India, Machine Learning, Car Prices, Automotive, +1 more

ATLAS-Higgs-Boson-Machine-Learning-Challenge-2014: Higgs Boson Classification Data

The 2014 Kaggle Higgs Boson Machine Learning Challenge dataset contains 800,000 simulated particle collision events from CERN. It was originally sourced from CERN's open data portal and prepared for a public machine learning competition. The data is provided under a CC0 1.0 public domain license.

Tabular, Machine Learning, Physics Simulation, CERN, Classification, Simulation, Particle Physics, +1 more
