Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,431 datasets
Datasets_de_travail_examen_03 is a collection of data likely intended for educational or examination purposes. It is hosted on Kaggle, but its specific content, size, and structure are not described. The dataset's author, organization, and license details are unknown.
A GIS dataset from the British Geological Survey provides overall susceptibility scores (1-100) and factor scores (1-10) for property subsidence risk in England and Wales. It maps shrink-swell hazard to individual building polygons, considering geology, foundation depth, drainage, building type, and tree proximity. Postcode-level average scores are also available for aggregated analysis.
Mineral resource maps produced between 1971 and 1986 by the British Geological Survey's Industrial Minerals Assessment Unit for Great Britain. The collection consists mainly of sand and gravel resource maps, with some covering conglomerate, limestone, dolomite, and celestite in specific regions. Maps show locations of resource blocks, deposit categories, and boreholes with indicative logs.
A geochemical study, funded by NERC from 1998 to 2001, of heavy metal speciation and bioavailability for risk assessment. The project, conducted by Imperial College, University of Nottingham, and the British Geological Survey, integrates scanning electron microscopy, chemical extractions, and isotopic analyses. Results are available as GIS maps to support decision-making for brownfield site redevelopment.
Early Dropout Prediction: MOOCCube Subset is a dataset hosted on Kaggle. The title suggests it is a subset of the MOOCCube dataset, likely containing data related to predicting student dropout in Massive Open Online Courses. The dataset's specific content, size, and origin require verification after download.
Student_Exams_results is a dataset hosted on Kaggle. Its specific content, such as the number of records, subjects, or student demographics, is not detailed in the available metadata. The dataset likely contains tabular data related to student performance on examinations.
Kaggle hosts a collection of final project presentations for machine learning courses. The dataset likely contains materials submitted by students or practitioners to demonstrate their work. Specific details on the number of entries, authors, and creation dates are not provided in the available metadata.
Survey data from students attending U.S. law schools in 1991, obtained from the R package fairml. The response variable indicates whether a student's undergraduate GPA exceeded 3.0, and race information has been binarized into white and non-white categories.
A survey of students attending law school in the United States in 1991. The dataset was obtained from the R package fairml and has been modified for binary classification tasks. The response variable indicates whether a student's undergraduate GPA was greater than 3, and race information has been binarized into white and non-white categories.
A 1991 survey of students attending law school in the U.S., sourced from the R package fairml. The response variable indicates whether undergraduate GPA is greater than 3.0, and race data has been binarized into white and non-white categories.
A dataset created for exploring peak detection algorithms in the context of content summarization for Twitter networks. The data likely contains time-series signals for clustering tweets by topic-aware peak times. It is shared under a CC0-1.0 license on the OpenML platform.
Smart Learning Performance Evaluation Data published on Kaggle. The dataset likely contains metrics for analyzing teaching effectiveness in technology-enhanced learning environments. Its specific scale, origin, and update history are not detailed in the provided metadata.
A human-verified Lean 4 formalization of Statistical Learning Theory (SLT) grounded in empirical process theory, released by liminho123 in February 2026. It contains fewer than 1,000 records of formal mathematical proofs, including developments for Gaussian Lipschitz concentration that are missing from the standard Mathlib library.
47,000 hours of speech audio and 19 million fine-grained speaking-style captions, organized into splits such as FCaps-PSCBase and FCaps-Emilia. The dataset provides open-ended descriptions of vocal characteristics for large-scale speech modeling and synthesis.
Research authored by Thomas J. Kane of the University of California, Los Angeles, which likely contains evaluation data on after-school programs. The dataset is published on the paperswithcode platform. The number of rows and columns and the temporal coverage are unknown.
The Achenbach System of Empirically Based Assessment is a dataset for psychological and behavioral research, likely containing standardized assessment scores. It was authored by Thomas M. Achenbach of the University of Vermont and is hosted on the paperswithcode platform. The specific temporal coverage, scale, and column details are not provided in the metadata.
Cocoedit 40K contains 40,000 records for image editing tasks, published by wyh6666 in March 2026. The data is provided as split zip archives on Hugging Face that require manual concatenation before extraction.
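The manual concatenation step mentioned for Cocoedit 40K could be sketched as follows. This is a minimal illustration, not the publisher's documented procedure: it assumes the parts are plain byte-wise splits of a single zip file (as produced by a tool like `split`), and the helper names `join_parts` and `extract_joined` are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def join_parts(part_paths, combined_path):
    """Concatenate split archive parts, in the order given, into one file."""
    with open(combined_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream each part so large archives are never held in memory.
                shutil.copyfileobj(src, out)
    return Path(combined_path)


def extract_joined(part_paths, combined_path, dest_dir):
    """Join the parts, then extract the resulting zip into dest_dir.

    Assumption: the parts are raw byte splits of one zip archive.
    Returns the list of member names found in the archive.
    """
    combined = join_parts(part_paths, combined_path)
    with zipfile.ZipFile(combined) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()
```

The parts must be passed in the correct order. Sorting file names lexicographically works only if the part suffixes are zero-padded, so check the actual file names in the repository before relying on `sorted(...)`.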
A dataset focused on federated learning systems with privacy and performance targets. The data likely contains metrics related to distributed communication and model training. It is hosted on Kaggle, but specific details about its creation, size, and temporal coverage are unknown.
Transperineal ultrasound images intended for fetal head segmentation to assess labor progress. The dataset likely contains medical images collected for research in obstetrics and gynecology. The author, organization, and specific collection details are unknown.
A dataset titled 'Deep Learning 02' published on Kaggle. The content likely relates to machine learning education, potentially containing examples, exercises, or model outputs. Metadata is minimal; the specific subject, size, and origin are unknown.