Description
The dataset used to train the CoEdIT text editing models, as described in the paper 'CoEdIT: Text Editing by Task-Specific Instruction Tuning' by Vipul Raheja, Dhruv Kumar, Ryan Koo, and Dongyeop Kang. It is hosted on Hugging Face by Grammarly and was last updated on October 21, 2023.
Use Cases
Training text editing models based on task-specific instructions.
Fine-tuning language models for grammar and style correction tasks.
Benchmarking instruction-following capabilities in text-to-text generation.
Research on controlled text generation and editing via natural language commands.
Strengths
The dataset is directly linked to a published research paper, which provides academic context.
It is hosted on a major platform (Hugging Face) under a named author (Grammarly), with a recorded last-update timestamp (2023-10-21).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file size, and license are not listed, so suitability (including permitted use) cannot be assessed from this listing alone.
The description metadata is sparse; data quality and structure can only be judged by inspecting the data after download.
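Because the listing gives no column documentation or row count, a first inspection pass after download might look like the sketch below. It is a minimal illustration, not the authors' tooling: summarize_records and load_coedit_train are hypothetical helper names, and the repository id "grammarly/coedit" and the "train" split are assumptions that should be checked against the Hugging Face page.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_records(records: Iterable[Dict[str, Any]], sample_size: int = 3) -> Dict[str, Any]:
    """Count rows and collect, per column, the value types seen and a few sample values.

    Hypothetical helper for a first look at an undocumented dataset.
    """
    row_count = 0
    columns: Dict[str, Dict[str, Any]] = {}
    for row in records:
        row_count += 1
        for name, value in row.items():
            info = columns.setdefault(name, {"types": Counter(), "samples": []})
            # Track how often each Python type appears in this column.
            info["types"][type(value).__name__] += 1
            # Keep only the first few values as examples for manual review.
            if len(info["samples"]) < sample_size:
                info["samples"].append(value)
    return {"row_count": row_count, "columns": columns}


def load_coedit_train():
    """Load the training split from Hugging Face.

    Assumptions: the `datasets` package is installed, the repository id is
    "grammarly/coedit", and a split named "train" exists. None of these are
    stated in this listing.
    """
    from datasets import load_dataset  # imported lazily; requires network access

    return load_dataset("grammarly/coedit", split="train")
```

Calling summarize_records(load_coedit_train()) would then report the row count and, for each column, its observed types and sample values, which is enough to start inferring field semantics. License and file size still have to be read from the dataset page itself.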
Provenance
Source
Grammarly (author on Hugging Face), based on research by Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang.
Collection Method
Created for training the CoEdIT text editing models; full details are in the associated paper.
Freshness
Last updated 2023-10-21 01:49:43 (time zone not stated); check the Hugging Face page for later revisions before relying on this timestamp.
The full dataset description is on the Hugging Face page; users must visit the provided URL for complete details.