Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,437 datasets
Example-SQL-1 is a dataset published on Kaggle. The title suggests it likely contains SQL-related examples, such as queries or database structures, for educational purposes. The dataset's specific content, size, and origin require verification after download.
A real-world dataset of used cars intended for machine learning, exploratory data analysis, and regression tasks. The dataset is hosted on Kaggle, but specific details about its origin, size, and creation date are not provided. Its description indicates a focus on practical applications for predicting vehicle prices.
'Learning Matrix' focuses on the process of realizing and recycling learning rather than simply finishing a course. The dataset was uploaded to Kaggle, but its author, size, and specific contents are not detailed. Its description suggests it may track learning cycles or outcomes.
University of Bergen and British Geological Survey conducted a marine geophysical survey in August 2000 aboard the RV Hakon Mosby. The data collection focused on the AFEN slide area of the Shetlands margin to assess continental slope stability for the EU COSTA project. Subsurface seismic data was acquired using BGS Deep-tow boomer and Halliburton sleeve gun systems.
A geological provenance study examines sediment sources, uplift, and erosion in the Faroe-Shetland Basin. The work, backed by the hydrocarbon industry, applies Ar-Ar dating of detrital micas to constrain sediment entry points and reservoir quality. The dataset originates from a scoping study by the British Geological Survey.
University of Utah baseball data published on Kaggle. The dataset likely contains performance statistics and records for the university's baseball team. Specific details on columns, size, and collection timeframe are not provided in the metadata.
Historical Rolling Needs Assessment data maintained by the City of Austin, last updated March 2026. The dataset covers humanitarian aid, emergency response, and social services needs. Available file formats include JSON, RDF, CSV, and XML.
AncientCivilizations_Archon_25k is a 25,000-example dataset created by WithinUsAI for training advanced reasoning in ancient history and archaeology. It focuses on methods, uncertainty, and research-grade interpretation, covering topics like Mesopotamian chronology and Egyptian state formation. The dataset was last updated on February 5, 2026.
AncientCivilizations_Archon_25k is a 25,000-example dataset created by author 11-47 for Within Us AI. It is designed to train master-scholar reasoning across ancient history, archaeology, and epigraphy, emphasizing research-grade interpretation over trivia. The dataset was last updated on Hugging Face on February 5, 2026.
Multi-Task learning ISIC challenge dataset from Kaggle. The dataset likely contains medical images for the ISIC (International Skin Imaging Collaboration) challenge, focusing on multi-task learning problems. Its specific contents, scale, and creation details require verification after download.
An End-to-End - Machine Learning is a dataset hosted on Kaggle. Its title suggests it is designed to demonstrate or support a complete machine learning workflow. The dataset's specific content, size, and origin are not detailed in the provided metadata.
The dataset provides 2014 national, state, and Continuum of Care-level Point-in-Time (PIT) and Housing Inventory Count (HIC) estimates of homelessness. It includes counts of chronically homeless persons, homeless veterans, and homeless children and youth, as compiled by the Department of Housing and Urban Development from January 2014 counts.
A longitudinal study of approximately 14,000 children born in the U.S. in 2001, tracking their health, development, care, and education from about 9 months old through kindergarten entry. Data was collected using multiple methods including interviews, questionnaires, direct observation, and child assessments from parents, caregivers, and teachers. The study is part of the Early Childhood Longitudinal Study program, with data available since the 1998-99 period.
1978 to 1997 data on teenage GED receipt and high school continuation rates across U.S. states. The dataset combines GED policy information from the GED Testing Service (GEDTS) with high school continuation ratios from the Common Core of Data (CCD). It was created by Duncan Chaplin to study potential unintended consequences of the GED program.
A series of school-based randomized trials in over 250 urban schools tests the impact of financial incentives on student achievement. The data, described by Roland G. Fryer, suggests incentives for educational inputs increase achievement, while output-based incentives are not effective. The study compares the cost-effectiveness of these incentive-based reforms against other popular education interventions.
Frédéric Docquier's dataset updates migration statistics by educational attainment. It provides emigration stocks and rates by schooling level and gender for 195 source countries in 1990 and 2000. The data can be used to analyze trends in women's skilled migration.
An evaluation of the Sherlock 2 intelligent tutoring system designed to foster the transfer of complex technical skills. The dataset likely contains results from coached apprenticeship activities and authentic problem-solving scenarios used to train diagnostic skills. The author is Sherrie P. Gott, and the data is sourced from the paperswithcode platform.
Two new datasets were used to explore the relationship between prior attainment and higher education participation among young people in England. The analysis found no conclusive gender gap after controlling for prior attainment, but found that young people from ethnic minority backgrounds were more likely to participate than White peers with the same attainment. The research was authored by Stijn Broecke.
A randomized controlled trial dataset from the Healthy Families New York (HFNY) program, authored by Kimberly DuMont. The data likely contains assessments, referrals, and home visit records for families. The dataset is sourced from the paperswithcode platform and is licensed as closed.
A research paper by Janine M. Zweig examining youth disconnection from developmental pathways and the role of alternative education. The paper discusses the needs of vulnerable youth and how alternative schools and programs can facilitate reconnection. The dataset appears to be the paper's text and associated metadata sourced from paperswithcode.