PersianPunc is a large-scale dataset for Persian punctuation restoration, containing 17 million token-level sequence labeling samples aggregated from 6 source corpora. It was created by MohammadJRanjbar and accepted at the EACL 2026 SilkRoad NLP Workshop.
License information is not stated here; users should verify licensing on the dataset page before use. The dataset is intended for token classification tasks in Persian.