Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
12,585 datasets
School exclusion data for all schools in Calderdale, including Academies, collected from May 2019 onwards. The data, sourced from the school census, includes fixed-term suspensions and permanent exclusions alongside the main reason for each exclusion. Calderdale Metropolitan Borough Council last updated this dataset on 2026-06-05.
Raw data and scripts supporting a manuscript on Odontaspididae shark body size changes after the Cretaceous-Paleogene mass extinction. The dataset includes R scripts and Excel files totaling 48.0 KB, created by Danillo Santos-Granzotti and last updated on 2026-05-31. It provides a counterexample to the Lilliput Effect and presents evidence for the Brobdingnag Effect in vertebrates.
A 5.8 MB repository from figshare, authored by Yiyun Fan and last updated in May 2026. It contains qualitative materials from a study on self-regulated learning in data science education, including data stories, datasets, student comprehension question responses, and discussion transcripts. The repository also includes a course syllabus, thematic analysis summaries, and a README file.
NASA's OPERA project provides a calibration and validation database for its Dynamic Surface Water Extent (DSWx-S1) product derived from Sentinel-1 radar data. The database is structured as an Amazon Web Services S3 bucket containing classification items and includes a reference document, validation results, and example Jupyter notebooks for access and querying. It is designed to support the evaluation and refinement of algorithms for detecting surface water from satellite imagery.
Reproduction artifacts for the BAGEL benchmark from the paper arXiv:2604.20570. The artifacts are hosted by GSI-Bench and require separate download from Hugging Face. The dataset page was last updated on 2026-06-03.
Australian Ocean Data Network provides a reference dataset of geomorphic habitat areas from near-pristine estuaries. The data, derived from GIS maps on OzEstuaries, aims to benchmark environmental characteristics and quantify habitat changes in modified coastal systems. The dataset was last updated on 2026-06-05.
Ontario Ministry data tracks students from Grade 9 entry into publicly funded secondary schools to their postsecondary destinations within seven years. The dataset integrates records from the Ontario School Information System, college and university enrolment reports, apprenticeship agreements, and student assistance programs from 2010 to 2022. To protect privacy, numbers are suppressed when fewer than 10 students are represented and all values are rounded to the nearest 5.
2011 National Household Survey data on highest educational attainment, broken down by age and sex. The dataset is provided by the Government of Nova Scotia and covers multiple geographies including provinces, counties, municipalities, and health authorities. It is archived and not maintained, with users directed to Statistics Canada for current information.
The 2011 National Household Survey provides data on respondents' major field of educational study. This archived dataset is sourced from the Government of Nova Scotia and covers multiple geographies including provinces, counties, municipalities, and electoral districts. The data is retained for archival purposes and has not been maintained since its release.
Jamie Davis provides a dataset of structured JSON objects pairing raw source code with optimized equivalents, bug fixes, and invalid syntax examples. The dataset includes pre-computed complexity scores, execution tracking, and input-output verification arrays. It was last updated on 2026-05-28 and is engineered to train automated program repair tools, parsers, and static analyzers.
The Indigenous Advisory Committee provides expert advice to the Impact Assessment Agency of Canada (IAAC). This advice supports the development of policies and guidance documents for the impact assessment system. The dataset is a PDF document titled 'Assessment of Potential Impacts on Rights: Operational Guidelines for Project Proponents'.
A statewide map of fire extent and severity for New South Wales, Australia, developed by the NSW Department of Climate Change, Energy, the Environment and Water in collaboration with the NSW Rural Fire Service. The data is produced through a semi-automated machine learning framework based on Sentinel-2 satellite imagery and classifies fire severity into standardized classes from unburnt to extreme. The dataset was last updated on 2026-04-27.
During the 2003-2004 Antarctic summer, a team from Geoscience Australia drilled through the Amery Ice Shelf to study the underlying ocean cavity. The resulting educational resource, last updated in 2026, provides real data from this expedition for student analysis. It includes teacher answers and is available in PDF and HTML formats.
A research paper describing Node-Sampling, a self-optimizing multi-agent method for improving the quality of multiple-choice questions generated by language models for medical education. The study was authored by Lilly Marie Düsterbeck and published on figshare in April 2026. The method was evaluated through expert review and showed significant improvement in question stem quality using a three-agent configuration requiring 33% of original resources.
A multicenter cohort of 616 patients diagnosed with Sjögren's syndrome, comprising 81 anti-centromere antibody (ACA)-positive and 535 ACA-negative cases. The dataset was used to develop a machine learning model for identifying the clinical signature of ACA-positive SS, achieving an AUC of 0.811 in validation. It was authored by Wenlong Zhu and last updated in April 2026.
616 patient records from a multicenter study that aims to identify the clinical signature of anti-centromere antibody-positive Sjögren’s syndrome. Wenlong Zhu published the data in 2026; the study applied machine learning models, including GBDT, which achieved an AUC of 0.811. The analysis highlights predictors including serological markers, age, and Raynaud’s phenomenon.
616 patients diagnosed with Sjögren's syndrome, comprising 81 anti-centromere antibody (ACA)-positive and 535 ACA-negative cases, were analyzed in a multicenter study. The dataset was used to develop a machine learning model, with a Gradient Boosted Decision Tree (GBDT) achieving an AUC of 0.811 in validation. The research was authored by Wenlong Zhu and last updated on the figshare platform in April 2026.
Wenlong Zhu's dataset contains clinical data from a multicenter cohort of 616 patients diagnosed with Sjögren's syndrome, including 81 anti-centromere antibody (ACA)-positive and 535 ACA-negative cases. The data was used to develop and validate machine learning models for identifying the clinical signature of ACA-positive Sjögren's syndrome. The dataset was last updated on April 23, 2026.
616 patient records from a multicenter study, comprising 81 anti-centromere antibody (ACA)-positive and 535 ACA-negative cases of Sjögren’s syndrome. The dataset was created by Wenlong Zhu and last updated in April 2026, and was used to develop a machine learning model for identifying the clinical signature of ACA-positive patients. The model achieved an AUC of 0.811 in the validation cohort.
42 towed-video stations captured 32 hours of seabed video and 6,229 photographs to characterize deep-sea biological assemblages on the Lord Howe Rise. The dataset includes 3,413 seabed characterizations of physical and biological variables, plus sediment and biological samples from 36 stations. Geoscience Australia Data collected this information to examine the use of physical data as surrogates for predicting biodiversity in deep-sea environments.