PerCoR: Persian Commonsense Reasoning Benchmark with ~106K Multiple-Choice Examples

Name: PerCoR: Persian Commonsense Reasoning Benchmark with ~106K Multiple-Choice Examples
Creator: mina8113
Published: 2026-05-23T07:33:44
Keywords: Persian Language, Benchmark, Text, Commonsense Reasoning, Multiple Choice, Large Scale, Nlp Benchmark



Description

PerCoR is a large-scale Persian commonsense reasoning benchmark in a four-choice sentence-completion format. It contains approximately 106,000 examples sourced from over 40 Persian websites across domains such as news, culture, lifestyle, tech, religion, and travel. The dataset was created by mina8113 and was last updated on Hugging Face in May 2026.

Use Cases

Training language models for Persian commonsense reasoning using the multiple-choice sentence-completion format.
Benchmarking model performance on Persian natural language understanding using the prefix-and-completion structure.
Analyzing linguistic and cultural commonsense patterns in Persian text drawn from diverse websites.
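
The benchmarking use case above can be sketched as a simple accuracy loop over four-choice items. This is a minimal sketch under stated assumptions, not the dataset's official evaluation: the field names `prefix`, `choices`, and `label` are hypothetical (the actual columns are undocumented), and `overlap_score` is a toy stand-in for a real model score such as a completion log-likelihood.

```python
from typing import Callable, Sequence

# Hypothetical record layout (NOT confirmed by the dataset page):
#   {"prefix": str, "choices": [str, str, str, str], "label": int}
Scorer = Callable[[str, str], float]


def predict(example: dict, score: Scorer) -> int:
    """Return the index of the completion with the highest score."""
    scores = [score(example["prefix"], choice) for choice in example["choices"]]
    # On ties, max() returns the first index with the top score.
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(examples: Sequence[dict], score: Scorer) -> float:
    """Fraction of examples where the predicted index equals the gold label."""
    if not examples:
        raise ValueError("no examples to evaluate")
    correct = sum(predict(ex, score) == ex["label"] for ex in examples)
    return correct / len(examples)


def overlap_score(prefix: str, completion: str) -> float:
    """Toy scorer: count whitespace tokens shared by prefix and completion.

    A real evaluation would replace this with a model-based score.
    """
    return float(len(set(prefix.split()) & set(completion.split())))
```

Any scorer with the same `(prefix, completion) -> float` signature can be passed to `accuracy`, which keeps the evaluation loop independent of the model being benchmarked.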

Strengths

Contains approximately 106,000 examples, providing a substantial corpus for model training and evaluation.
Sourced from over 40 Persian websites, indicating diversity across domains like news, culture, lifestyle, tech, religion, and travel.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is not reported (only the approximate figure of 106,000 examples is given), which may limit suitability assessment.
Last updated 2026-05-23 07:33:44; freshness should be verified.
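
Because the columns are undocumented, a first step after download is to inspect the fields directly. The following is a minimal sketch, assuming the data has already been loaded into a list of dictionaries (one per example); the helper name `summarize_fields` is hypothetical and not part of any dataset tooling.

```python
from typing import Any, Iterable


def summarize_fields(records: Iterable[dict]) -> dict:
    """Summarize each field: value types seen, one sample value, and a count.

    Returns {field_name: {"types": set of type names, "sample": first value
    seen, "count": number of records containing the field}}.
    """
    summary: dict[str, dict[str, Any]] = {}
    for record in records:
        for key, value in record.items():
            # The sample is fixed to the first value seen for each field.
            entry = summary.setdefault(
                key, {"types": set(), "sample": value, "count": 0}
            )
            entry["types"].add(type(value).__name__)
            entry["count"] += 1
    return summary
```

Comparing each field's `count` against the total number of records shows which fields are always present, and `len(records)` on the loaded list gives the exact row count.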

Provenance

Source: Over 40 Persian websites across news, culture, lifestyle, tech, religion, travel, and more.
Freshness: Last updated 2026-05-23 07:33:44.
Geography: Persian-language content.

License is unknown; terms of use must be verified.


Quality Score

Overall: D36
Description: 39
Source: 36
Reputation: 35
Access: 26

Community

1 like

0 views

Dataset Info

Author: mina8113
Created: May 23, 2026
Updated: May 23, 2026
Last synced: May 26, 2026
