Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
40,756 datasets
NASA's NOAA-20 VIIRS Gap-Filled Lunar BRDF-Adjusted Nighttime Lights Daily L3 Global 15 arc-second Linear Lat Lon Grid product (VJ146A2) provides daily, atmospherically corrected observations of Earth's nighttime lights. It contains seven science data sets, including gap-filled radiance, lunar irradiance, and quality flags, at a 15 arc-second resolution. The current Collection 2.0 version features floating-point radiance data and coverage extended to both land and water surfaces.
322 question-and-answer pairs from the Dutch government's internal HR portal, P-Direkt. The dataset was created by the Ministry of the Interior and Kingdom Relations to test question-answering models for an internal HR environment. Its structure corresponds to the SQuAD 2.0 format, with answers sourced from intranet pages as of April 2021.
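For readers unfamiliar with the SQuAD 2.0 layout mentioned above, the following is a minimal sketch of how such a file is typically nested and read. The record contents, the `iter_qas` helper, and all IDs are hypothetical placeholders invented for illustration; none of it is taken from the P-Direkt dataset.

```python
# Hypothetical placeholder text; not drawn from the actual dataset.
CONTEXT = "Leave requests are submitted through the portal."
ANSWER = "through the portal"

# SQuAD 2.0 nesting: data -> paragraphs -> qas -> answers.
example = {
    "version": "v2.0",
    "data": [
        {
            "title": "placeholder-page",
            "paragraphs": [
                {
                    "context": CONTEXT,
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Where are leave requests submitted?",
                            "is_impossible": False,
                            # answer_start is a character offset into the context
                            "answers": [
                                {"text": ANSWER, "answer_start": CONTEXT.index(ANSWER)}
                            ],
                        },
                        {
                            # SQuAD 2.0 adds unanswerable questions with no answers
                            "id": "q2",
                            "question": "Who approves travel expenses?",
                            "is_impossible": True,
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def iter_qas(squad):
    """Yield (id, question, first answer text or None) for every QA pair."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                text = None if qa.get("is_impossible") or not answers else answers[0]["text"]
                yield qa["id"], qa["question"], text


pairs = list(iter_qas(example))
```

Unanswerable questions come back with `None` as the answer, which is the main difference from the SQuAD 1.1 layout.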
Hourly and daily updated global flood maps are generated from surface reflectance data captured by the NOAA-20 and NOAA-21 (JPSS) satellites. The product is processed in near real-time by NASA's LANCE system and provides 250-meter resolution data in a linear latitude-longitude grid, stored in EOS-HDF5 format. This dataset is currently in a Beta 1 status under VIIRS collection 2.
Simulation data supporting research on prescribed-time lag bipartite consensus control for nonlinear multi-agent systems. The dataset, 4.4 KB in size, was created by Jialong Tian and last updated on 2026-05-21. It likely contains numerical results from a simulation example validating a control scheme using a prescribed-time dynamic observer and an event-triggered mechanism.
Experimental results evaluating a Siamese BiLSTM model for sentence-level semantic similarity. The dataset likely contains performance metrics and feature-level analysis comparing the model to traditional methods like TF-IDF and cosine similarity. Author Weihong Zhao published the data on figshare under a CC-BY-4.0 license in May 2026.
A dataset from 2026 describing pitch and cabin configurations for autonomous aircraft cleaning robots. The data was created by Cong Hien Dinh and shared on figshare under a CC-BY-4.0 license. It supports a two-stage Coverage Path Planning framework that integrates a Genetic Algorithm for optimizing cleaning sequences.
Arun Kumar published simulation results for a novel Graph-Guided Adaptive Companding framework for Power-domain Non-Orthogonal Multiple Access systems on figshare in May 2026. The dataset likely contains metrics comparing the proposed method against conventional approaches. The ZIP file is 130.7 KB, indicating a relatively small scope.
A 15.6 KB dataset supporting a two-stage Coverage Path Planning framework for reconfigurable cleaning robots in aircraft cabins. The data likely contains parameters for a Genetic Algorithm optimizing the trade-off between cleaning time and energy consumption. Authored by Cong Hien Dinh and published on figshare in May 2026.
Datasets used in analyses comparing sociocognitive performance in selectively bred lines of Japanese quail (Coturnix japonica). The data, authored by Jeanne Seressia, includes results from gaze following, social buffering, and social discrimination learning tasks. The dataset was last updated on 2026-05-28.
Flux tower measurements of energy and mass exchange between the surface and atmosphere using eddy covariance techniques. Data were processed using PyFluxPro (v3.4.23) and include gap-filled Net Ecosystem Exchange partitioned into Gross Primary Productivity and Ecosystem Respiration. The site was established in July 2010 and is managed by the University of Adelaide's Landscape Futures Program.
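The partitioning mentioned above is usually summarized by the relation below. The sign convention shown (negative NEE meaning net uptake by the ecosystem) is a common one and is assumed here; the listing does not state which convention this dataset follows.

```latex
\mathrm{NEE} = \mathrm{ER} - \mathrm{GPP}
```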
MASTER Level 1B data products contain calibrated radiance imagery from 18 flights over Nevada, California, and Colorado in August-September 2006. The dataset includes 50 spectral bands from visible to thermal infrared wavelengths at approximately 10-meter spatial resolution. Supplementary files provide flight paths, instrument configuration, and summary information for each mission.
Integrated data from a study on the survival strategies of alpine Rhododendron cultivars under chronic heat stress. The dataset likely contains results from phenotypic screening, physiological profiling, transcriptomics, and biochemical analysis of the glutathione system. It was authored by Mei Zhou, shared on figshare under a CC-BY-4.0 license, and last updated in May 2026.
Global Affairs Canada's Historical Section created the 'Heads of Posts' database to identify individuals who have served as heads of Canadian diplomatic posts abroad and their dates of service. The dataset serves as a reference guide for departmental employees, academics, and the public interested in Canada's international relations. It is available on the Open Government site and was last updated on 2026-06-01.
Satellite observations from missions like TOPEX/Poseidon, Jason, and Sentinel-6 underpin this El Niño-Southern Oscillation indicator. The time series is derived from cyclostationary empirical orthogonal functions applied to gridded sea surface height anomalies from 1993 to 2019. It provides a continuously updated metric for tracking ENSO variability based on ocean topography.
Surficial geology maps of ice-free regions in Antarctica, specifically the Vestfold Hills. The data was used to inform management plans for Antarctic Specially Protected Areas (ASPAs) such as ASPA No. 143 Marine Plain. Results were presented at the 2024 Australian Antarctic Research Conference.
ISS-RapidScat Version 2.0 Level 1B data provides geo-located Sigma-0 (radar backscatter) measurements and detailed antenna pulse geometries. The dataset includes the complete pulse 'egg' footprint (approximately 25 km by 35 km) and its eight constituent 'slices' (approximately 25 km by 7 km each), derived from ephemeris and Level 1A data. This version represents a complete historical re-processing for consistent calibration and is the basis for deriving Version 2.0 wind vector products.
Evaluation data for the INAGQA question-answering system, developed by Jamal Al Qundus and last updated in May 2026. The dataset likely contains performance metrics from experiments comparing INAGQA against systems like BERT-KGQA on German financial queries. Results include an F1 score of 0.91 validated on 2,100 expert-annotated questions.
INAGQA is a novel QA system that achieves an F1 score of 0.91 on German financial queries, validated on 2,100 expert-annotated questions. The dataset likely contains performance metrics and experimental results comparing INAGQA to systems like BERT-KGQA (F1: 0.83) and template-based systems (F1: 0.79). Authored by Jamal Al Qundus and last updated in May 2026, this 5.5 KB Excel file contributes to information systems research with language-sensitive design principles.
2,100 expert-annotated German financial questions were used to validate the INAGQA question-answering system. The dataset likely contains the system's outputs, including parsed queries and linked knowledge graph entities. Authored by Jamal Al Qundus and last updated on 2026-05-04, it is shared under a CC-BY-4.0 license.
INAGQA, a novel QA system, achieved an F1 score of 0.91 on German financial queries using a hybrid disambiguation algorithm. The dataset likely contains accuracy metrics for headword-centric parsing, validated on 2,100 expert-annotated questions. Jamal Al Qundus published this 5.5 KB Excel file on figshare in May 2026.