Offline RL trajectories, game data, robot demonstrations, RLHF, multi-agent interaction
10,008 datasets
Supplementary mass spectrometric files in the .RAW format support a journal article on dual-functionality whole blood microsampling for LC-MS protein analysis. The data was contributed by authors Ago Mrsa, Bjarke Strøm Larsen, Trine Grønhaug Halvorsen, and Léon Reubsaet. The dataset was last updated on 2026-04-21.
A bilateral taxation agreement between Canada and the Cook Islands concerning the exchange of information on tax matters. The document establishes mechanisms for cooperation and transparency to support the administration and enforcement of tax laws. It is an archived publication from Global Affairs Canada, last updated on the platform in February 2026.
A collection of 45,879 training samples for instruction-following reinforcement learning (RL). It was curated by NVIDIA to train the Nemotron-Cascade-2-30B-A3B model and includes multi-domain RL, on-policy distillation, and software engineering RL data.
Open CEDA is a multi-regional Environmentally-Extended Input-Output (EEIO) model developed by Watershed Technology. It connects economic exchanges to greenhouse gas emissions, representing 95% of the world's GDP across 148 countries and 400 sectors. The registry contains CEDA 2025, which uses 2023 as its base year, and CEDA 2024.
IGAC's dataset contains 30,905,678 unique real estate transaction records from 2015 to 2023 across 1,105 Colombian municipalities. The data is constructed from property registration details, including property ID (MATRICULA), legal act codes (COD_NATUJUR), and transaction values (VALOR). It originates from the Colombian open data portal (www.datos.gov.co) and was last updated in February 2026.
Topographically-based catchment delineations cover all stream-reach segments of the National Hydrography Dataset across the conterminous United States. The dataset was produced by the USGS in cooperation with the USEPA and was intended for release as part of the NHDPlus project in 2006.
Twenty-eight estuaries of national significance along U.S. coasts and Puerto Rico are managed under this EPA program. The program, established by Congress in 1987, develops long-term Comprehensive Conservation and Management Plans (CCMPs) through local stakeholder conferences. Each plan contains sustained actions for protecting water quality and living resources.
NASA HEASARC maintains this official observing program for the Rossi X-Ray Timing Explorer (RXTE) satellite. The list contains targets recommended by review panels for Guest Observer proposals up to Cycle 15, including potential Targets of Opportunity. The database table was updated periodically by the HEASARC based on information from the RXTE Guest Observer Facility.
GRID-INPE, a cooperating center to the United Nations Environment Programme, holds a collection of integrated, spatially-referenced environmental datasets. The system is dedicated to making geo-referenced environmental information accessible for analysis and decision-making. The collection includes data acquired and disseminated to support the use of geographic information systems and satellite image processing.
This dataset compares user intention and actual usage of mental health and university support services between students with and without a chronic somatic disorder. It is a small dataset of 9.5 KB in XLS format, authored by Sarah-Lena Klemm and last updated in March 2026.
Version 2.3 of the Midcourse Space Experiment (MSX) Point Source Catalog (PSC) supersedes the 1999 release and contains over 100,000 more sources. The National Aeronautics and Space Administration produced this catalog, with photometry based on co-added image plates for improved sensitivity and reliability. Its astrometric accuracy is more than 1'' better than the previous version, and it includes data from the Small Magellanic Cloud, eight nearby galaxies, and several molecular clouds and star-forming regions.
This dataset contains behavioral summaries from participants in overt and covert naming tasks, collected for a study on neural oscillations and repetition. The dataset was authored by Adrian Gilmore and last updated on April 25, 2026. It originates from the research project "Repetition-related reductions in neural activity support improved behavior through increases in oscillatory power."
CERCLIS Version 2 is an inventory of abandoned or uncontrolled hazardous waste sites across the United States, managed by the U.S. Environmental Protection Agency. It contains entries for over 38,000 sites, including those listed on the National Priorities List for cleanup. The database is updated quarterly by the EPA.
U.S. Geological Survey researchers collected high-resolution measurements of waves, currents, water levels, temperature, salinity, and turbidity in Hanalei Bay, Kauai, during the summer of 2006. The data set was gathered using bottom-mounted instrument packages deployed in water depths under 10 meters, supplemented with vertical water column profiles. This work supports the USGS Coastal and Marine Geology Program's Pacific Coral Reef Project to understand particle transport in coral reef settings.
Supplementary materials for a study on iron nanoflakes include synthesis and transfer protocols, equipment specifications, and sample characterization data. The dataset contains magnetic force microscopy (MFM) results for truncated and hexagonal samples of varying thicknesses and details the process of controllable magnetic vortex switching. It is accompanied by corresponding micromagnetic simulation data.
Scanning Tunneling Microscopy topography data supports Figure 1 of a 2026 Nanoscale publication. The dataset includes large-area and high-resolution STM measurements of epitaxial bilayer graphene on SiC before and after Gd intercalation. All data were acquired and analyzed by Shen Chen.
Goldman Sachs 2026 Hiring Contest Dataset is a collection of data related to a hiring contest hosted by the financial institution Goldman Sachs. The dataset is published on Kaggle, but its specific contents, size, and structure are not detailed in the available metadata. Its intended use likely relates to recruitment analytics or data science competitions.
Ocean Drilling Program Hole 504B revealed a hydrothermal sulfur anomaly at the dyke-lava transition. This dataset contains sulfur concentration and isotope data from a 7.5 km section of the Macquarie Ridge at Macquarie Island, a 39-9.7 Ma slow-spreading setting, with background pyrite sulfur averaging 1845 ppm and fault zones averaging 5000-11000 ppm. Data were contributed by the Australian Antarctic Data Centre (AU_AADC) and last updated in March 1998.
A list of publicly available programs and services supports women and families affected by domestic violence in Nova Scotia. The Government of Nova Scotia maintains this resource, which was last updated in February 2026. It provides information for victims, their support networks, and service providers.
403 uncensored, multi-turn chat conversations form this synthetic dataset for fine-tuning companion chatbots. Created by author n0ctyx and last updated on March 30, 2026, it is structured in ChatML JSONL format with user and assistant message roles. The dataset explicitly contains adult content (NSFW) and is intended for research on uncensored models.