Multilingual Autism Discourse on Social Media Across Five Languages
by Patricia López-Resa · Updated 1mo ago
4.4 MB · 1 file
Available on 1 platform
Description
678 public social media posts about autism collected through stratified keyword sampling in English, Spanish, French, Norwegian, and Georgian. The dataset, created by Patricia López-Resa and last updated in April 2026, includes variables for theoretical frame, linguistic style, sentiment, and engagement metrics. Posts were normalized, anonymized, and classified using multilingual lexicons and sentence-embedding-assisted disambiguation.
Use Cases
Analyze the prevalence of medical versus neuroaffirmative framing across different languages based on the theoretical frame variable.
Study the correlation between linguistic style (identity-first vs. person-first) and engagement metrics.
Track longitudinal trends in neuroaffirmative discourse sentiment and frequency after 2023.
Compare cross-cultural patterns in autism discourse using the multilingual, stratified sample.
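As a rough illustration of the first use case, the sketch below tabulates the share of each theoretical frame within each language. Column-level documentation is absent, so the field names `language` and `frame`, the frame labels, and the sample rows are all assumptions made for illustration, not the dataset's actual schema or contents.

```python
from collections import Counter, defaultdict


def frame_shares(rows, lang_key="language", frame_key="frame"):
    """Return {language: {frame: share}} for an iterable of dict-like rows.

    lang_key and frame_key are hypothetical column names; replace them
    with the real ones once the downloaded file has been inspected.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lang_key]][row[frame_key]] += 1

    shares = {}
    for lang, frame_counts in counts.items():
        total = sum(frame_counts.values())
        shares[lang] = {frame: n / total for frame, n in frame_counts.items()}
    return shares


# Made-up rows for illustration only; not taken from the dataset.
sample = [
    {"language": "en", "frame": "medical"},
    {"language": "en", "frame": "neuroaffirmative"},
    {"language": "en", "frame": "neuroaffirmative"},
    {"language": "es", "frame": "medical"},
]
print(frame_shares(sample))
```

Shares are computed within each language rather than over the whole sample, so languages with few posts are not swamped by larger ones. With only 678 posts in total, per-language shares should be read with their small denominators in mind.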
Strengths
Includes 678 posts across five languages (English, Spanish, French, Norwegian, Georgian), enabling cross-linguistic comparison.
Posts are annotated with multiple variables: theoretical frame, linguistic style, sentiment scores (-5 to +5), and engagement metrics.
Data collection and classification methods are described, including multilingual lexicons and sentence-embedding-assisted disambiguation.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The number of posts per language is not reported, which makes it hard to judge suitability for per-language analyses.
The dataset is small (678 posts spread across five languages), which may limit statistical power for some analyses, particularly per-language comparisons.
Provenance
Source
figshare, author Patricia López-Resa.
Collection Method
Observational study using stratified keyword sampling of public social media posts, followed by normalization, anonymization, and classification.
Time Range
Data collection period is not specified; analysis notes trends after 2023.
Freshness
Last updated 2026-04-18 22:12:33; freshness should be verified.
Geography
Posts are in English, Spanish, French, Norwegian, and Georgian, suggesting international coverage.
Format
Primary data file is in DOCX format, which may require conversion for computational analysis.
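One possible conversion route, sketched with the Python standard library only: a DOCX file is a ZIP archive whose main content lives in `word/document.xml`, so any tables can be read out as rows of cell text. This assumes the posts are stored in one or more Word tables, which the listing does not confirm. The function name `docx_tables` and the file names in the usage comment are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_tables(path_or_file):
    """Return every table in a DOCX file as a list of rows of cell strings.

    Assumes the data of interest is stored in Word tables. Nested tables
    are not handled specially: their text is folded into the parent cell
    and they also appear as separate tables in the result.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            cells = []
            for tc in tr.findall(W + "tc"):
                # Join all text runs found inside the cell.
                cells.append("".join(t.text or "" for t in tc.iter(W + "t")))
            rows.append(cells)
        tables.append(rows)
    return tables


# Usage sketch (hypothetical file names; substitute the real ones):
#   import csv
#   with open("posts.csv", "w", newline="", encoding="utf-8") as fh:
#       csv.writer(fh).writerows(docx_tables("dataset.docx")[0])
```

If the file turns out to hold the posts as running paragraphs rather than tables, the same approach applies with `w:p` elements in place of `w:tbl`, though the field boundaries would then need to be inferred from the text.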