DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Pharmacology & Drug Discovery

Roman Urdu Text Data with Toxicity Labels

Available on 1 platform

Description

Roman Urdu text data annotated for toxic language, sourced from the Kaggle platform. The dataset likely contains text samples with labels indicating the presence of harmful or offensive content. Specific details on volume, author, and collection timeframe are not provided in the available metadata.

Use Cases

Training a classifier to detect toxic language in Roman Urdu text (inferred from domain, verify after download)
Benchmarking multilingual toxicity detection models (inferred from domain, verify after download)
Analyzing linguistic patterns of offensive speech in a low-resource language script (inferred from domain, verify after download)
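
As a rough illustration of the first use case, the sketch below trains a small character n-gram naive Bayes baseline using only the Python standard library. Character n-grams are a common choice for Roman Urdu because spelling is not standardized. This is a generic baseline, not a method associated with this dataset. The class name `NaiveBayesBaseline`, the helper `char_ngrams`, the 0/1 label scheme, and the toy phrases are all assumptions made for illustration; none of them come from the dataset.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Lowercase the text and return its overlapping character n-grams."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesBaseline:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def fit(self, texts, labels, n=3):
        self.n = n
        self.class_counts = Counter(labels)       # documents per class
        self.gram_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, n)
            self.gram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.gram_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(doc_count / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Invented toy phrases, NOT taken from the dataset; 1 = toxic is an assumed scheme.
    toy_texts = ["tum bohat achay ho", "yeh din acha hai",
                 "tum bohat buray ho", "yeh banda bura hai"]
    toy_labels = [0, 0, 1, 1]
    model = NaiveBayesBaseline().fit(toy_texts, toy_labels)
    print(model.predict("tum achay ho"))
```

With the real data, the texts and labels would come from whichever columns the downloaded file actually contains, and a held-out split would be needed before reading anything into the scores.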

Strengths

Published on Kaggle, a major platform for sharing ML datasets.
Focuses on Roman Urdu, a specific and potentially lower-resource script variant.

Limitations

Metadata is minimal; actual content requires verification after download.
Row count, column definitions, and license are unknown, limiting suitability assessment.
Data may reflect biases of its original collection source, which is not specified in the available metadata.
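
Since row count, column definitions, and label scheme are unknown, a first step after download could be a quick inspection of the file. The sketch below assumes the download is a single CSV with a header row, which the listing does not confirm. The function name `summarize_csv` and its parameters are hypothetical.

```python
import csv
from collections import Counter


def summarize_csv(path, label_column=None, encoding="utf-8"):
    """Return row count, column names, and (optionally) label counts for a CSV file."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        if label_column is not None and label_column not in columns:
            raise KeyError(f"column {label_column!r} not found; available: {columns}")
        row_count = 0
        label_counts = Counter()
        for row in reader:
            row_count += 1
            if label_column is not None:
                label_counts[row[label_column]] += 1
    return {"rows": row_count, "columns": list(columns), "labels": dict(label_counts)}
```

A call such as `summarize_csv("roman_urdu_toxicity.csv", label_column="label")` would report the row count and class balance. Both the file name and the column name here are placeholders to be replaced with the real ones once the file has been inspected.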

Provenance

Source: Kaggle

Text Toxicity Detection Urdu Language Health Natural Language Processing Text Data

Quality Score

D14

Description

Source

Reputation

Access

Community

0 views

Dataset Info

Last synced: Jul 23, 2026
