Student performance, MOOC logs, knowledge tracing, standardized tests, learning analytics
13,313 datasets
A large-scale collection of Telugu STEM textbook data created by InfoBayAI and last updated on April 10, 2026. The dataset is designed to support the development of advanced NLP systems and AI models for scientific understanding and problem-solving in Telugu.
NOAA/TIROS-N AVHRR sensors measure visible, near-infrared, and infrared radiation across 4 or 5 spectral bands with a ground resolution of about 1.1 km at nadir. The Level 1b data product contains quality-controlled raw data with appended sensor calibration and earth location information, stored on 6250 bpi tapes from NESDIS/SDSD and the University of Miami. Data is collected via HRPT (High Resolution Picture Transmission) and LAC (Local Area Coverage) modes from satellites including NOAA-6, 7, 8, 9, 10, and TIROS-N.
Marie Byrd Land in West Antarctica is covered by a 16,000 km² GIS-based 3-D geological map consolidating bedrock geology, airborne geophysics, structural data, and geochronology. The resource includes rock sample collections, predominantly migmatite gneisses and plutonic rocks, registered with IGSN numbers. The database is intended for publication as a dynamic GIS by the Antarctic Geospatial Information Center at the University of Minnesota.
Kenya's coastal zone is mapped in a 1:250,000-scale vector database containing international and administrative boundaries, along with school locations. The database was developed under the Eastern African Action Plan by UNEP/GRID-PAC, UNEP/OCA-PAC, and KEMFRI, integrating data from the Survey of Kenya, Landsat imagery, and socio-economic sources. Feature names and attributes are stored for points, lines, and polygons.
This dataset of adult illiteracy rates has a stated temporal coverage of 2005-2024. It likely contains country-level or regional statistics on literacy, a key socioeconomic indicator. The dataset is published on Kaggle, but its original source and compilation method are unknown.
MAGIC populations are ideal for learning complex models due to their high genetic recombination, diversity, and large sample size. This synthetic dataset of 2000 observations was generated from a Bayesian network model developed for a talk on multiple trait prediction in plant genetics. The model and data were created by Marco Scutari, Phil Howell, David J Balding, and Ian Mackay.
Paper Conclusion RL Training is a dataset for reinforcement learning training based on the EasyR1 (verl) framework. The training model is Qwen3-VL-8B-Thinking, using an external judge model (Qwen3-4B-Instruct-2507) to score predicted conclusions against a 235B teacher model's reference conclusions. The dataset was authored by SII-ChengqiLi and last updated on 2026-04-10.
A collection of papers published for the inaugural Great Barrier Reef Conference held at James Cook University in Townsville. The dataset is provided by Geoscience Australia and was last updated on the platform in April 2026. The legacy product has no abstract available, and the specific content and structure require verification after download.
A dataset containing application and registration form information for Madonna University. The data likely includes details submitted by prospective students for the 2026/2027 academic year. The specific fields, volume, and completeness are not detailed in the available metadata.
Igbinedion University Okada 2026/2027 application and registration form data is hosted on Kaggle. The dataset likely contains information submitted by prospective students for the 2026/2027 academic session. Its author, organization, and specific data structure are unknown.
Babcock University, Ilishan-Remo, has released its application and registration form for the 2026/2027 academic session. The dataset likely contains information submitted by prospective students during the university's admissions process. Specific details on data volume, structure, and collection method are not provided in the available metadata.
This dataset covers Ethiopia's literacy and education levels, derived from the 2016 Standard Demographic Health Survey (DHS). The data is provided by the Central Statistical Agency (CSA) of Ethiopia and published by the IGAD Climate Prediction and Applications Center (ICPAC). It is available as a GeoTIFF raster file with a spatial resolution of 0.05 pixels.
Social Security Administration data tracks employee hires, losses, and internal transfers within its Office of Systems. The database includes information on employee skills, job assignments, and transfer approvals. It was last updated on April 3, 2026.
This dataset contains First ISCCP Regional Experiment (FIRE) data from a tethered balloon campaign designed to improve cloud and radiation parameterizations in climate models. Provided by the National Aeronautics and Space Administration, it includes files in BIN, TAR, ISO, HTML, and PDF formats. It focuses on the life cycles and radiative properties of Arctic clouds to validate satellite data and general circulation models.
The First ISCCP Regional Experiments (FIRE) data, produced by the National Aeronautics and Space Administration, is intended to improve cloud and radiation models for climate prediction. The dataset includes measurements from the Arctic Cloud Experiment Utrecht University Tower, focusing on cirrus and marine stratocumulus cloud systems. The data was last updated on 2026-03-13.
MolmoWeb-SyntheticSkills is a dataset of synthetic web-navigation skills created by Allen Institute for AI (allenai). Each example pairs an instruction with a sequence of webpage screenshots and the corresponding low-level agent actions like clicks, typing, and scrolling. The dataset was last updated on March 24, 2026.
The 2023-24 academic year data on examination performance for Year 12 and Year 14 pupils in Northern Ireland. These data are gathered as part of the annual Summary of Annual Examination Results (SAER) exercise, which runs from May to December each year. The dataset is published by the Government Digital Service under the OGL-UK-3.0 license.
A list of child development centers provided by the District of Columbia's Office of the State Superintendent of Education. The dataset was last updated on March 25, 2026. It is available in multiple geospatial and tabular formats, including KML, GeoJSON, and CSV.
Global K-12 STEM, Robotics, AI & Engineering Education Dataset (Grades 1–12) is aggregated from Kaggle. Its specific size, source, and update frequency are not detailed in the available metadata. The dataset likely contains information on educational programs, resources, or outcomes related to STEM fields.
Synthetic supervised fine-tuning examples were generated by teacher models evaluated in the Polyglot Teachers paper. The dataset contains examples in Arabic, Czech, German, Indonesian, Japanese, Spanish, and Tagalog. It was created by ljvmiranda921 and last updated on April 5, 2026.