DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Older Scots Part-of-Speech Tagger Using CLAWS7 Tagset

NLP & Text

Name: Older Scots Part-of-Speech Tagger Using CLAWS7 Tagset
Creator: Beattie, Beth
Published: 2026-06-25T11:46:21
Keywords: Historical Language, Older Scots, Text, Audio, NLP Model, Large Scale, Natural Language Processing, Part-of-Speech Tagging

by Beattie, Beth / Harvard Dataverse · Updated 5d ago

Available on 1 platform

Description

A trained spaCy part-of-speech tagging pipeline for sixteenth-century Older Scots, using the CLAWS7 tagset. The model was trained on pre-tagged data from Bushnell (2021), and the release includes Python scripts for applying it to a 1-million-word historical corpus. It was created by Beth Beattie and is hosted on Harvard Dataverse.

Use Cases

Tagging Older Scots texts for part of speech with the CLAWS7 tagset.
Applying the pre-trained model to a historical corpus with the provided Python scripts.
Analysing sixteenth-century linguistic features using the model's annotations.
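
As a rough illustration of the first use case, the sketch below loads a spaCy pipeline from disk and returns token/tag pairs. It is not the author's script: the function names and the model directory are hypothetical, the layout of the downloaded archive is not documented in this listing, and it assumes the CLAWS7 labels are written to spaCy's fine-grained `tag_` attribute.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def load_tagger(model_dir: str):
    """Load a trained spaCy pipeline from a local directory.

    `model_dir` is a hypothetical path; the listing does not describe
    how the downloaded files are organised.
    """
    import spacy  # imported lazily so the helpers below work without spaCy

    return spacy.load(model_dir)


def tag_text(nlp: Callable, text: str) -> List[Tuple[str, str]]:
    """Return (token, tag) pairs.

    Assumption: the pipeline stores its CLAWS7 labels in `token.tag_`.
    """
    return [(token.text, token.tag_) for token in nlp(text)]


def tag_frequencies(nlp: Callable, lines: Iterable[str]) -> Counter:
    """Count how often each tag occurs across an iterable of text lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(tag for _, tag in tag_text(nlp, line))
    return counts
```

With the model downloaded, `nlp = load_tagger("path/to/model")` followed by `tag_text(nlp, some_sentence)` would give the tagged tokens; the path here is a placeholder, not a location from the dataset.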

Strengths

Model trained on pre-tagged data from Bushnell (2021).
Includes Python scripts for applying the model to a 1-million-word corpus.
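
For a corpus of this size, batched processing is usually preferable to tagging one text at a time. The following is a minimal sketch of that idea using spaCy's `nlp.pipe`, not a reproduction of the bundled scripts. The directory layout, the `*.txt` pattern, the UTF-8 encoding, and the function names are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def iter_corpus_texts(corpus_dir: str, pattern: str = "*.txt") -> Iterator[str]:
    """Yield the contents of each matching file (assumes UTF-8 plain text)."""
    for path in sorted(Path(corpus_dir).glob(pattern)):
        yield path.read_text(encoding="utf-8")


def tag_corpus(
    nlp, texts: Iterable[str], batch_size: int = 64
) -> Iterator[Tuple[str, str]]:
    """Stream (token, tag) pairs using spaCy's batched `nlp.pipe`."""
    for doc in nlp.pipe(texts, batch_size=batch_size):
        for token in doc:
            yield token.text, token.tag_
```

Because both functions are generators, the corpus is never held in memory all at once; results can be written out or counted as they arrive.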

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.

Provenance

Source: Harvard Dataverse
Collection Method: Trained on pre-tagged data from Bushnell (2021).
Time Range: Sixteenth century
Freshness: Last updated 2026-06-25 11:46:21; freshness should be verified.

Quality Score

D35

Dataset Info

Author: Beattie, Beth
Org: Harvard Dataverse
Created: Jun 25, 2026
Updated: Jun 25, 2026
Last synced: Jul 1, 2026
