COVID-19 Sentiments-India: Tweets with Sentiment Labels from March-May 2020
arff
Available on 1 platform
Description
A collection of tweets related to COVID-19 in India, each labeled with a positive, neutral, or negative sentiment. Tweet IDs were sourced from IEEE Dataport and hydrated (expanded into full tweet records) with the Hydrator app; the tweets cover March 20 to May 31, 2020. The dataset has five columns: a unique text ID, the tweet text, the date, the location, and the assigned sentiment.
Use Cases
Train sentiment classification models based on labeled tweet text.
Analyze temporal trends in public sentiment about COVID-19 based on the date column.
Study geographic variations in discourse based on the location metadata.
Benchmark NLP models on a domain-specific (pandemic) Twitter corpus.
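As an illustration of the temporal-trend use case, the sketch below groups tweets by ISO week and computes the share of each sentiment label per week. It is a minimal sketch under assumptions: the keys `date` and `sentiment` are hypothetical column names, the dates are assumed to be ISO-formatted (`YYYY-MM-DD`), and the sample rows are made up for illustration. The actual column names and date format should be checked in the downloaded file.

```python
from collections import Counter, defaultdict
from datetime import date

# Made-up sample rows. The keys "date" and "sentiment" are hypothetical
# column names, and the ISO date format is an assumption.
rows = [
    {"date": "2020-03-20", "sentiment": "negative"},
    {"date": "2020-03-21", "sentiment": "neutral"},
    {"date": "2020-03-30", "sentiment": "positive"},
    {"date": "2020-03-31", "sentiment": "positive"},
]


def weekly_sentiment_shares(rows):
    """Return {(iso_year, iso_week): {label: share}} for the given rows."""
    counts = defaultdict(Counter)
    for row in rows:
        iso = date.fromisoformat(row["date"]).isocalendar()
        # iso[0] is the ISO year, iso[1] the ISO week number.
        counts[(iso[0], iso[1])][row["sentiment"]] += 1

    shares = {}
    for week, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[week] = {label: n / total for label, n in counter.items()}
    return shares


shares = weekly_sentiment_shares(rows)
for week, distribution in shares.items():
    print(week, distribution)
```

Weekly shares rather than raw counts are used here so that weeks with different tweet volumes remain comparable; daily or monthly grouping would follow the same pattern.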
Strengths
Each tweet has a manually defined sentiment label (positive, neutral, negative).
Includes temporal metadata (date) and optional geographic metadata (location).
Data provenance is documented, citing the original source (IEEE Dataport) and hydration tool.
Limitations
Row count is not reported, so sample size cannot be judged before download.
Description metadata is limited; actual data quality must be checked by inspecting the file after download.
The sentiment labels are described only at a high level; the underlying model or annotation process is not detailed.
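Because row count and data quality have to be checked by hand, a first inspection after download might look like the sketch below. It assumes the file has already been parsed into a list of dictionaries; the keys `text_id`, `location`, and `sentiment` are hypothetical names for the documented columns, and the sample rows are made up for illustration.

```python
from collections import Counter

# The three labels named in the dataset description.
EXPECTED_LABELS = {"positive", "neutral", "negative"}

# Made-up sample rows; the key names are hypothetical and should be
# replaced with the real column names from the downloaded file.
rows = [
    {"text_id": "1", "text": "Stay home, stay safe", "location": "Mumbai", "sentiment": "positive"},
    {"text_id": "2", "text": "Lockdown extended again", "location": "", "sentiment": "negative"},
    {"text_id": "2", "text": "Lockdown extended again", "location": None, "sentiment": "negative"},
    {"text_id": "3", "text": "Cases reported today", "location": "Delhi", "sentiment": "mixed"},
]


def summarize(rows):
    """Report basic size and quality indicators for parsed rows."""
    labels = Counter(row["sentiment"] for row in rows)
    return {
        "row_count": len(rows),
        "label_counts": dict(labels),
        # Labels outside the three documented classes.
        "unexpected_labels": sorted(set(labels) - EXPECTED_LABELS),
        # Location is optional, so count empty or missing values.
        "missing_location": sum(
            1 for row in rows if not (row.get("location") or "").strip()
        ),
        # IDs are described as unique; count rows that repeat an ID.
        "duplicate_ids": len(rows) - len({row["text_id"] for row in rows}),
    }


summary = summarize(rows)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A strongly skewed label distribution or a high share of missing locations would be worth knowing before using the data for classification or geographic analysis.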