DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


InRhetoricalRoles: Corpus for Automatic Structuring of Legal Documents | DataSalon


Government & Legal


Name: InRhetoricalRoles: Corpus for Automatic Structuring of Legal Documents
Creator: opennyaiorg
Published: 2024-04-17T07:16:01
Keywords: Benchmark, Legal Text, Text, Natural Language Processing, Document Structuring, Rhetorical Roles


Available on 1 platform

Description

InRhetoricalRoles is a corpus for automatic structuring of legal documents, presented at the Language Resources and Evaluation Conference (LREC) in 2022. Its authors include Prathamesh Kalamkar, Aman Tiwari, Astha Agarwal, Saurabh Karn, Smita Gupta, Vivek Raghavan, and Ashutosh Modi. The copy hosted on Hugging Face was last updated on 2024-05-08.

Use Cases

Train models for rhetorical role classification based on the corpus's annotation scheme.
Develop systems for automatic segmentation and structuring of legal documents.
Benchmark NLP tools for legal text processing and summarization.
Analyze linguistic patterns and argumentation structures in legal texts.
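
For the classification and benchmarking use cases above, one common way to compare systems is macro-averaged F1 over the role labels. The following is a minimal sketch, not the corpus authors' evaluation protocol: the function name `macro_f1` and the labels `FACTS`, `ARGUMENT`, and `RULING` are hypothetical placeholders, since this listing does not document the actual label set.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Return (macro-averaged F1, per-label F1) for two equal-length label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labelled item is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong for this item
            fn[g] += 1  # gold label was missed for this item

    per_label = {}
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_label[label] = 2 * tp[label] / denom if denom else 0.0

    return sum(per_label.values()) / len(per_label), per_label


# Hypothetical labels, for illustration only.
gold = ["FACTS", "FACTS", "ARGUMENT", "RULING"]
pred = ["FACTS", "ARGUMENT", "ARGUMENT", "RULING"]
score, per_label = macro_f1(gold, pred)
```

Macro averaging weights every label equally, which can matter when some roles are much rarer than others; whether that matches the intended benchmark should be checked against the original paper.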

Strengths

Dataset is associated with a peer-reviewed conference paper (LREC 2022).
The corpus is specifically designed for the structured analysis of legal documents.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it hard to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
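
Because column documentation and row count are missing, a first pass after download might simply profile the records. The sketch below assumes the data has already been loaded into a list of dictionaries; the helper `summarize_fields` and the sample field names (`id`, `text`, `label`) are hypothetical and do not reflect the corpus's actual schema.

```python
from collections import Counter


def summarize_fields(records, max_examples=3):
    """Report row count plus, per field, the value types seen, missing count, and a few examples."""
    all_keys = set()
    for rec in records:
        all_keys.update(rec)

    fields = {}
    for key in sorted(all_keys):
        types, missing, examples = Counter(), 0, []
        for rec in records:
            value = rec.get(key)
            if value is None:  # absent key or explicit null both count as missing
                missing += 1
                continue
            types[type(value).__name__] += 1
            if len(examples) < max_examples and value not in examples:
                examples.append(value)
        fields[key] = {"types": dict(types), "missing": missing, "examples": examples}

    return {"row_count": len(records), "fields": fields}


# Hypothetical records, for illustration only.
sample = [
    {"id": 1, "text": "The appellant filed a suit.", "label": "FACTS"},
    {"id": 2, "text": "Counsel argued otherwise.", "label": None},
    {"id": 3, "text": "The appeal is dismissed."},
]
report = summarize_fields(sample)
```

A report like this gives the row count and a rough picture of each field's type and completeness, which is a starting point for the manual inspection the limitations call for.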

Provenance

Source: opennyaiorg
Freshness: Last updated 2024-05-08 06:28:21; check the source for a newer version before use.

License is unknown; terms of use must be verified before use.


Quality Score

D33


Community

122 downloads

4 likes

0 views

Dataset Info

Author: opennyaiorg
Created: Apr 17, 2024
Updated: May 8, 2024
Last synced: Jun 20, 2026

