Offline RL trajectories, game data, robot demonstrations, RLHF, multi-agent interaction
10,046 datasets
A Latin text collection titled 'Farrago confusanearum et inter se dissidentium opinionum de Coena Domini:ex sacramentarioru[m] libris congesta'. The dataset is hosted on the Papers with Code platform, which primarily aggregates resources for computer science and machine learning. The original author, organization, and creation date are unknown.
'Construction and demolition waste: challenges and opportunities in a circular economy' is a dataset from the Papers with Code platform. It likely contains research or analysis related to waste management in the construction sector. Its specific content, size, and authorship are not detailed in the provided metadata.
A collection of publications from the Cambridge Bibliographical Society, aggregated on the Papers with Code platform. The dataset's specific content, size, and structure are not detailed in the available metadata. It is listed under a closed license, and the original author and organization are unknown.
'Transactions of the Historical Society of Ghana' is a collection of academic journal articles, aggregated from the Papers with Code platform. The title suggests a focus on historical and geographical research related to Ghana. Specific details on volume, authorship, and publication dates are not provided in the metadata.
A report from the United Nations Special Rapporteur on violence against women, its causes and consequences. The dataset likely contains textual analysis and findings on this human rights issue. It is hosted on the Papers with Code platform.
A document titled 'IASC Guidelines on Mental Health and Psychosocial Support in Emergency Settings', authored by Elizabeth Carll and hosted on the Papers with Code platform. The content likely provides structured recommendations and frameworks for mental health interventions in crisis situations. The dataset's specific format, size, and update history are not provided in the metadata.
Timely and granular datasets on consumer spending, job openings, and other economic indicators. The data is provided by Opportunity Insights. The dataset likely contains high-frequency economic metrics.
One million financial transactions labeled with seven distinct fraud types. The description suggests the data includes features for fraud rings and behavioral patterns, and notes the presence of class imbalance. The dataset is hosted on Kaggle, but author, organization, and license details are unknown.
A PopPov Research Brief authored by Mahesh Karra that examines the economics of reproductive health in Accra, Ghana. The dataset likely contains socioeconomic and health indicators related to fertility, family planning, and earnings. Specific details on data volume, temporal coverage, and collection methodology are not provided in the metadata.
A preprocessed collection of real estate transaction data from France. The dataset is intended for analysis of the French housing market, supporting tasks like visualization and exploratory data analysis. Specific details on the number of records and features are not provided.
A dataset of 81,808 prompts with associated metadata, designed for training reward models in reinforcement learning from human feedback (RLHF). Created by NVIDIA, it is a curated subset drawn from multiple sources and was last updated in December 2025. The dataset is explicitly noted as ready for commercial use.
NVIDIA's Nemotron-Cascade-RM-Training dataset provides 81,808 samples for training reward models in reinforcement learning from human feedback (RLHF). It contains prompts, data sources, and category information. The dataset was published by NVIDIA in December 2025.
A Kaggle dataset providing vector embeddings for a collection of sacred texts. The corpus likely spans multiple religious or spiritual traditions, enabling computational analysis. The specific texts, embedding model, and dataset scale are not detailed in the available metadata.
Master_support_indices.json is a dataset published on Kaggle. The title suggests it likely contains numerical indices or metrics related to support systems or performance. Its specific content, scale, and authorship are unknown from the provided metadata.
20,000 technical support tickets written in Spanish, sourced from Kaggle. The dataset is likely intended for natural language processing tasks. Its specific origin, creation date, and detailed structure are not provided in the available metadata.
Kaggle hosts a dataset titled 'flippo_img_dataset'. The dataset likely contains images, as suggested by its title. No further details on size, creator, or update date are available.
VICIdial/Asterisk data provides telephony logs and metrics from call center operations. The dataset's volume, creator, and temporal coverage are unspecified. It originates from the open-source VICIdial call center software platform.
The X2Edit Dataset is an image editing collection covering 14 diverse tasks, developed by OPPOer and hosted on Hugging Face. It was last updated on December 30, 2025. The dataset description claims it exhibits advantages over several existing open-source image editing datasets.
comoZ's Reasoning Dataset is a compiled collection for training reasoning models, containing RL and SFT subsets. The RL subset provides high-quality ground truth pairs with task_type and rubrics for reward modeling. The SFT subset offers instruction-following data with tags to model thinking processes.
Code, figures, and other data supporting the results of a physics paper on nonreciprocal perfect Coulomb drag and coherent exciton superflow in electron-hole bilayers. The data was authored by Jun-Xiao Hui and is hosted by Harvard Dataverse. The number of rows and columns and the file formats are unknown.