This dataset contains 581 manually labeled social media comments about COVID-19 vaccines, each annotated for the presence of conspiracy theories. It supports NLP research on automated vaccine misinformation detection and has been cited and reused by external researchers. It was authored by AminHasibul and last updated on March 23, 2026.
Use Cases
- Training classifiers to detect vaccine conspiracy theories based on annotated social media comments.
- Benchmarking misinformation detection algorithms on a manually labeled corpus.
- Studying the linguistic patterns of COVID-19 vaccine misinformation.
- Supporting public health research on online vaccine discourse.
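For the benchmarking use case, a minimal sketch of how detection quality could be scored is shown here. It assumes a binary label encoding in which 1 marks a comment containing a conspiracy theory; the dataset's actual label scheme is undocumented, so this encoding, the function name `binary_prf`, and the toy labels are all hypothetical.

```python
from typing import Dict, Sequence


def binary_prf(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Return precision, recall, and F1 for the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; they are not drawn from the dataset.
# Assumed encoding: 1 = conspiracy theory present, 0 = absent.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
scores = binary_prf(gold, pred)
print(scores)
```

Reporting precision and recall for the positive class, rather than accuracy alone, tends to be more informative when one class is much rarer than the other, which may or may not be the case here.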
Strengths
- 581 manually labeled comments provide a foundation for supervised learning.
- The dataset has been cited and reused by external researchers, which suggests some community uptake.
- The dataset focuses on a specific and critical public health challenge: COVID-19 vaccine misinformation.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is limited to 581, which may constrain model training.
- The original source of the comments and their geographic scope are not documented, so sampling bias cannot be ruled out.
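Because column-level documentation is absent, a first step after download is usually to profile each field and guess which one holds the labels. The following is a rough sketch under stated assumptions: it assumes the data can be read as CSV, and the column names `text` and `label` in the sample are hypothetical stand-ins, not the dataset's real fields. The `profile_columns` helper and its label heuristic are illustrative only.

```python
import csv
import io
from collections import Counter


def profile_columns(rows, max_distinct=10):
    """Summarize each column to help infer its meaning.

    A column is flagged as a likely label when it has a few distinct
    values that repeat across rows. This is a heuristic, not a rule.
    """
    rows = list(rows)
    profile = {}
    if not rows:
        return profile
    for name in rows[0].keys():
        values = [(row.get(name) or "").strip() for row in rows]
        counts = Counter(v for v in values if v)
        non_empty = sum(counts.values())
        distinct = len(counts)
        profile[name] = {
            "non_empty": non_empty,
            "distinct": distinct,
            "top_values": counts.most_common(max_distinct),
            "likely_label": 1 < distinct <= max_distinct and distinct < non_empty,
        }
    return profile


# Hypothetical stand-in data; replace with the downloaded file, e.g.
# open("path/to/file.csv", newline="", encoding="utf-8").
sample = io.StringIO(
    "text,label\n"
    "first example comment,0\n"
    "second example comment,1\n"
    "third example comment,0\n"
    "fourth example comment,1\n"
)
report = profile_columns(csv.DictReader(sample))
for column, stats in report.items():
    print(column, stats)
```

The value counts of the likely label column also show the class balance, which matters when only 581 rows are available for both training and evaluation.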
Provenance
- Source: Hugging Face
- Collection Method: Manually labeled social media comments.
- Freshness: Last updated 2026-03-23 17:58:41; freshness should be verified.