Loading...
Loading...
Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,333 datasets
Water samples from the Mediterranean, Antarctica, and North Atlantic provide measurements of bacterial degradation in organic matter, growth rates, and methane production. Data originates from multidisciplinary national programs of INSU and the European CORDIS/MAST program. The dataset focuses on the role of bacteria in carbon, nitrogen, and phosphorus fluxes.
S-wave refraction data collected along three roads crossing the Big Thompson River valley in Colorado were processed to model subsurface geology. The data provide estimated velocities and thicknesses for layered-earth models, resulting in three cross-sections of the valley. The work was summarized by the USGS and originates from the CEOS_EXTRA organization via NASA EarthData.
Nineteen seasons of colony-based data from Ross Island track Adélie penguin adaptation to expanding sea ice. This four-year NASA project analyzes migration, survival, and reproduction for known-age individuals. Research aims to understand carry-over effects from winter ice conditions to annual breeding success.
Mohsen Rakhshan's dataset contains data collected using the CAPSAS device to examine the limb position effect for myoelectric control. The data is stored in NPY format and totals approximately 3.58 GB in size. Its primary focus is on understanding how limb position influences signals used for prosthetic control.
Comprising 24-hour fleet depot charging load profiles with 15-minute average demand data and substation load integration assessment results. It was produced for a 2021 study on heavy-duty truck electrification impacts on electricity distribution systems. The associated code for generating the profiles is publicly available.
LocoreMind provides an example dataset for training agent models with tool calling capabilities. It is formatted for compatibility with the CoPaw-Flash-9B model and uses a standard three-role message structure. The dataset was last updated in April 2026.
277 non-regulatory air sensors deployed across Chicago measure fine particulate matter (PM2.5) and nitrogen dioxide (NO2), with daily means calculated for each UTC day. The project was founded by the Chicago Department of Public Health and the University of Illinois at Chicago School of Public Health, with sensors installed in Summer 2025. Data is limited to measurements that have passed automated quality control checks, and some sensors were intended to collect black carbon and weather data, though black carbon modules are offline.
Kaggle hosts a dataset related to school academic performance. The dataset's specific content, size, and origin are not detailed in the available metadata. Further details such as the number of rows, specific columns, and creation date require verification after accessing the data.
Higher education learning records related to teaching quality. The dataset is published on Kaggle. Its specific size, columns, and creation details are unknown.
The NYC School Survey is conducted annually for all parents, teachers, and students in New York City schools. Survey results provide insight into a school's learning environment and contribute a measure of diversification that goes beyond test scores. The data is provided by the City of New York and was last updated on the platform in March 2026.
Property assessment parcels for the Canadian province of New Brunswick. The dataset is maintained by the Government of New Brunswick and was last updated in March 2026. Row and column counts are unknown.
Loaded with high-quality solar resource and meteorological measurements from the SolarTAC test facility near Denver, Colorado. It was established by the National Renewable Energy Laboratory (NREL) in 2008 to support solar power project deployment and model development.
Synthetic interaction logs designed for continuous preference learning in AI coding agents. The dataset's origin, size, and temporal coverage are unspecified. It was sourced from Kaggle, but the author, organization, and license details are unknown.
28 Chinese university students provide qualitative interview data exploring the psychological experience of English learning environments. The study introduces an integrative 'Permeation–Internalization–Projection' framework based on these in-depth interviews. Author Xiangshan Ge published the findings in a 1.1 MB PDF document under a CC BY 4.0 license.
Geoscience Australia Data provides a research chapter evaluating a 'Build Back Better' disaster risk reduction campaign following the 2009 West Sumatra earthquake. The work analyzes community knowledge, resistance to change, and motivations for adopting earthquake-safe construction techniques. It is part of a wider World Bank publication on emerging best practices in natural disaster risk assessment.
Kaggle hosts this dataset titled 'Semi Supervised Learning 2'. The dataset likely contains data for practicing semi-supervised machine learning techniques. Its specific content, size, and origin are not detailed in the provided metadata.
Kaggle hosts a dataset designed for house price prediction and regression analysis using machine learning. The dataset likely contains features relevant to property valuation, such as size, location, and amenities. Its specific origin, size, and update history are not documented.
A dataset titled 'Machine_Learning' is hosted on Kaggle. The dataset's specific content, size, and structure are not described in the provided metadata. Its origin, author, and last update date are unknown.
RCD Mallorca 2025/26 Football Player Statistics provides player-level football data for exploratory analysis and machine learning. The dataset is hosted on Kaggle, but details on its size, structure, and authorship are not provided. Its specific creation date and update frequency are unknown.
SWE_Next_SFT_Trajectories.jsonl is a ShareGPT-style JSONL file containing 3,693 training examples for conversational AI. The dataset, created by TIGER-Lab, was last updated on April 2, 2026. Each example is a JSON object with a messages field containing conversation turns with roles such as system, user, assistant, and tool.