Loading...
Loading...
Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
43,041 datasets
Nemotron-RL-QA-Abstention-v1 is a dataset for training abstention-aware question-answering models. It was created by NVIDIA, sourced from a hybrid of automated, manually collected, and synthetic methods, and last updated on June 4, 2026. The dataset focuses on factoid question answering across domains including software engineering, health, and law.
A family of synthesized bichromophoric molecular systems, where substituted naphthalimides are attached to heptamethine cyanines. The dataset likely contains photophysical characterization data, including results from steady-state spectroscopy and femtosecond transient absorption measurements. It was authored by Ali Jamjah and last updated on May 14, 2026.
A 3.0 MB dataset by Ali Jamjah contains crystallographic information files (CIF) for synthesized bichromophoric molecular systems. The data supports analysis of structure–property relationships in cyanine-based antenna constructs, with results last updated in May 2026. It includes results from single crystal X-ray diffraction, DFT calculations, and femtosecond transient absorption measurements.
A 1.3 MB CSV file released under CC-BY-4.0 license by Shoukat Ullah on figshare. The data supports research into enhancing product performance evaluation using hybrid DistilRoBERTa and BiGRU models. It was last updated on May 26, 2026.
Geoscience Australia Data produced a high-resolution geomorphic map of a submerged plateau on the northern Lord Howe Rise in the Tasman Sea. The map covers approximately 25,500 km² and details features like ridges, valleys, volcanic peaks, and an extensive network of polygonal furrows. It was last updated on 2026-05-14.
Four linked vocabularies developed by the International Seabed Geomorphology Mapping Working Group (ISGM-WG) to support a standardized global seabed classification scheme. The vocabularies cover method terms, physiography, morphology features, and geomorphic units. Geoscience Australia published the vocabularies with permission from ISGM-WG.
Geoscience Australia's GA-SaMMT comprises seven ArcGIS Pro Python toolboxes for mapping seabed features. The tools map ten bathymetric high and eight bathymetric low Morphology Features, plus three classes of Morphological surfaces. These tools have been applied to many study areas, with real-world applications referenced in Huang et al. (2023).
The Australian Ocean Data Network provides data on the geomorphology and sediment distribution of Keppel Bay, a shallow coastal embayment in Queensland. The dataset likely contains information on seabed morphology, sub-bottom profiles, and sediment cores, revealing the paleo-path of the Fitzroy River and Holocene sea-level changes. The data was last updated on 2026-05-05.
Geoscience Australia collected marine geological data from the Kenn Plateau off northeast Australia. The RV Southern Surveyor voyage gathered 3090 km of seismic data and 7584 km of bathymetric data, with limited rock sampling. The survey aimed to improve geological understanding of this frontier offshore region.
Nemotron-RL-Multichallenge-v1 is a text dataset for reinforcement learning and instruction-following tasks, created by NVIDIA. It is a hybrid collection of manually collected and synthetic data, with a size bin indicating fewer than 10,000 records. The dataset was last updated on June 4, 2026.
9.5 KB Excel file consolidates empirical performance indicators from a MATLAB simulation. The dataset contrasts the behavior of the ML-BAMS system with a conventional hiring model across six critical evaluation domains. It was authored by Baijia Song and last updated on June 3, 2026.
The Coral Coast HF ocean radar system measures surface currents along the Western Australia coast, an area influenced by the Leeuwin Current. It consists of two SeaSonde stations at Dongara and Green Head operating at 4.463 MHz with a 200 km range. The data is provided by the Australian Ocean Data Network and was last updated on 2026-05-05.
DORIS data provides precise orbit determination and high-accuracy ground beacon positioning using a dual-frequency Doppler system. The system was developed by the French space agency CNES and has been deployed on multiple satellite missions including TOPEX/Poseidon, SPOT series, Envisat, and Jason. This centralized uplink system collects Doppler shift measurements from ground beacons to orbiting receivers.
Leicester City Council provides waste diversion rates for Leicester City. The data includes the total percentage of waste diverted from landfill through recycling, reuse, composting, and refuse-derived fuel for energy recovery. A note indicates operational issues in the 2017/18 financial year affected compost production.
Early 2024 mapping of Joint Private Works Schemes in New South Wales provides spatial boundaries for Private Water Corporations and Private Water Trusts. NSW Public Works created this single GIS dataset using the February 2024 NSW cadastre and a GDA2020 datum. Collaboration involved DCCEEW (Water Group), NSW Public Works, and DPI, with DPI creating the spatial files and PDF documentation.
The Bureau of Mineral Resources published lithofacies maps of continental shelf sediments as part of a systematic reconnaissance geological survey program. Three map sheets covering Rowley Shoals, Scott Reef, and the Arafura Sea were printed by early 1974, with work on two further sheets for the east Australian shelf advanced. Users should refer to Bulletin 83 (GeoCat # 163) for interpretation guidance, as the maps do not distinguish between modern and relic sediments.
The Australian Ocean Data Network hosts a dataset describing six sedimentary cycles in the Surat Basin. The cycles, each hundreds of metres thick, are linked to nine global sea-level oscillations during the Jurassic and Cretaceous periods. The data was last updated on 2026-05-05.
NASA Operation IceBridge collected raw Inertial Measurement Unit readings over Antarctica using a Systron Donner MMQ-G unit. The dataset includes latitude, longitude, altitude, velocity, pitch, roll, and true heading measurements. Data collection was part of the NSF and NERC-funded ICECAP project investigating the central Antarctic plate.
Colombia's inventory of information generated, obtained, or controlled by a public entity that has been classified as confidential or reserved. The dataset includes a 'VIGENCIA ACTUALIZACIÓN' field identifying the year assets were recorded from 2021 to 2025. It is published on the datos.gov.co platform via Socrata and was last updated on May 26, 2026.
An article extending a p-value-based multiple testing procedure for scenarios where study success requires at least k out of m hypotheses to be rejected. The extension replaces an initial gatekeeping step with a Fixed-Sequence MTP, allowing inferences even if the gate is not passed. The work includes an R function for calculating adjusted p-values and is licensed under CC-BY-4.0.