DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

ProductSearch Datasets Browse Topics Rankings Community API / MCP

ResourcesDocumentation Blog Changelog Status

LegalPrivacy Policy Terms of Service Cookie Policy

Refusal XL: 16,000 Synthetic Instruction-Refusal Pairs | DataSalon

Home Computer Graphics & SimulationRefusal XL: 16,000 Synthetic Instruction-Refusal Pairs

Computer Graphics & Simulation

Refusal XL: 16,000 Synthetic Instruction-Refusal Pairs

Name: Refusal XL: 16,000 Synthetic Instruction-Refusal Pairs
Creator: mrfakename
Published: 2024-04-26T21:15:23
Keywords: Instruction Response, Text, Nlp Training, Text, Refusal Generation, Synthetic Data, Synthetic

by mrfakename·Updated 2y ago

Available on 1 platform

Description

16,000 single-turn conversations form this synthetic dataset of instruction and refusal pairs. The dataset was created by author mrfakename and last updated on 2024-04 26. Human prompts are sourced from the Capybara dataset, with refusals generated synthetically.

Use Cases

Training models to generate appropriate refusals based on user instructions.
Fine-tuning language models for safety alignment using synthetic refusal data.
Benchmarking model refusal behavior against a large-scale synthetic dataset.
Studying patterns in AI refusal responses to various instruction types.

Strengths

Scaled to approximately 16,000 conversations, over 5 times larger than its predecessor.
Focuses on a specific, important NLP task: generating refusals to instructions.
Explicitly formatted in an input-output structure for direct model training.

Limitations

Contains only single-turn conversations, lacking multi-round dialogue complexity.
Data is synthetically generated, which may not fully capture the nuance of human interactions.
Column-level documentation is absent; field semantics must be inferred after download.

Provenance

Source: huggingface
Collection Method: Human prompts sourced from the Capybara dataset, with refusals synthetically generated.
Time Range: null
Freshness: Last updated 2024-04-26 23:28:55; freshness should be verified.
Geography: null

null

Text Instruction Response Nlp Training Refusal Generation Synthetic Data Synthetic

Related Datasets

Quality Score

D36

Description

Source

Reputation

Quality Score

D36

Description

Source

Reputation

Access

Community

57 downloads

6 likes

0 views

Dataset Info

Author: mrfakename
Created: Apr 26, 2024
Updated: Apr 26, 2024
Last synced: Apr 18, 2026

Access

Community

57 downloads

6 likes

0 views

Dataset Info

Author: mrfakename
Created: Apr 26, 2024
Updated: Apr 26, 2024
Last synced: Apr 18, 2026

Refusal XL: 16,000 Synthetic Instruction-Refusal Pairs

Description

Use Cases

Strengths

Limitations

Provenance

Related Topics

Related Datasets

Quality Score

Community

Dataset Info

Community

Dataset Info