The dataset contains handwritten word images from 12 authors, covering clean samples and 7 distinct cross-out types. It is designed for handwriting-related research tasks and includes a mixed subset that combines clean and crossed-out examples. It was uploaded to Hugging Face by wahlinski and was last updated in May 2026.
Use Cases
- Training binary classifiers to distinguish clean words from crossed-out words, treating all 7 cross-out types as a single positive class.
- Developing multi-class models to identify specific cross-out styles among the 7 distinct types.
- Benchmarking HTR system robustness on documents containing occlusions using the mixed subset.
- Studying writer-specific handwriting styles and degradation patterns across the 12 different authors.
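The binary and multi-class use cases above differ only in how targets are derived from the same annotation. The following is a minimal sketch, assuming each sample carries a label string in which a hypothetical value `"CLEAN"` marks an uncrossed word and any other value names a cross-out type. The real field names and label values are not documented and should be checked after download.

```python
# Hypothetical label for an uncrossed word; the real dataset may use a different value.
CLEAN_LABEL = "CLEAN"

def to_binary(label: str) -> int:
    """Return 0 for a clean word and 1 for any cross-out type."""
    return 0 if label == CLEAN_LABEL else 1

def build_class_index(labels):
    """Map each distinct label to an integer id, sorted for reproducibility."""
    return {name: idx for idx, name in enumerate(sorted(set(labels)))}

# Placeholder labels for illustration only, not the dataset's actual type names.
sample_labels = ["CLEAN", "type_a", "type_b", "CLEAN", "type_a"]

# Binary targets: clean versus crossed-out.
binary_targets = [to_binary(label) for label in sample_labels]

# Multi-class targets: one id per distinct label, including the clean class.
class_index = build_class_index(sample_labels)
multiclass_targets = [class_index[label] for label in sample_labels]
```

Keeping the clean class inside the multi-class index is a design choice; a model that only ranks cross-out styles could drop clean samples before building the index.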
Strengths
- Includes samples from 12 different authors, providing diversity in handwriting styles.
- Defines 7 distinct cross-out types, enabling detailed analysis of occlusion patterns.
- Contains both clean and crossed-out samples, supporting comparative and robustness studies.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not documented, so the dataset's size and suitability for a given task cannot be assessed before download.
- Last updated 2026-05-08 06:52:09; freshness should be verified.
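Because neither the columns nor the row count are documented, a first step after download is usually to inspect what the records contain. The sketch below assumes the rows have already been loaded as a list of dictionaries; the field names `image_path`, `writer`, and `label` are hypothetical placeholders and are not taken from the dataset.

```python
from collections import defaultdict

def summarize_fields(records):
    """Collect, for each field name, the sorted Python type names observed."""
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}

# Placeholder records standing in for downloaded rows; field names are hypothetical.
rows = [
    {"image_path": "a.png", "writer": 3, "label": "CLEAN"},
    {"image_path": "b.png", "writer": 7, "label": None},
]

row_count = len(rows)            # the undocumented row count, measured directly
summary = summarize_fields(rows)  # field name -> observed value types
```

A field that reports more than one type, such as a mix of strings and missing values, is a hint that its semantics need a closer look before training.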
Provenance
- Source: wahlinski on Hugging Face
- Freshness: Last updated 2026-05-08 06:52:09