Disaster Tweets: Over 11,000 Social Media Posts on Events Like Volcanoes and Fires
Format: ARFF
Description
Over 11,000 tweets associated with disaster keywords such as "crash", "quarantine", and "bush fires" were collected on Jan 14th, 2020. The data includes text, location, and keyword for each tweet, and was inherited from the 'Disasters on social media' source used in a Kaggle competition. Tweets cover topics such as the Taal Volcano eruption, Coronavirus, the Australian bushfires, and the downing of flight PS752 in Iran.
Use Cases
Classify tweets as disaster-related or not based on keyword and text content.
Analyze public discourse and sentiment around specific disaster events, such as the Taal Volcano eruption or the Australian bushfires.
Train or benchmark text classification models for crisis informatics using manually classified tweets.
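As a rough illustration of the classification use case, the sketch below trains a small multinomial Naive Bayes baseline on keyword plus text. It is only a sketch under stated assumptions: the rows in TOY_ROWS are invented for illustration and are not taken from the dataset, the label convention (1 = disaster-related, 0 = not) is assumed, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_naive_bayes(rows):
    """Count tokens per class from (keyword, text, label) rows."""
    word_counts = {}
    doc_counts = Counter()
    for keyword, text, label in rows:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(
            tokenize(keyword + " " + text)
        )
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, doc_counts, vocab


def predict(model, keyword, text):
    """Return the label with the highest log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        # Log prior from class frequencies.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(keyword + " " + text):
            if token not in vocab:
                continue  # ignore tokens never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented examples for illustration only; NOT rows from the dataset.
# Assumed label convention: 1 = disaster-related, 0 = not.
TOY_ROWS = [
    ("fire", "Wildfire spreading near the town, residents evacuated", 1),
    ("crash", "Plane crash reported, rescue teams on site", 1),
    ("fire", "This new album is fire, listening all day", 0),
    ("crash", "My laptop keeps crashing, so annoying", 0),
]

model = train_naive_bayes(TOY_ROWS)
```

Calling predict(model, "fire", "residents evacuated as wildfire spreads") scores a new tweet against both classes. On the real data, the rows would come from the downloaded file and a held-out split would be needed to measure accuracy.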
Strengths
Over 11,000 data points provide a substantial corpus for model training.
Data is manually classified, which likely improves label reliability for the intended task.
Includes multiple metadata fields such as location and keyword alongside the tweet text.
Limitations
The exact row count is not stated (only "over 11,000"), which may limit suitability assessment.
Column-level documentation is absent; field semantics must be inferred after download.
Data may reflect temporal bias inherent to the single collection date of Jan 14, 2020.
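Because column-level documentation is absent, one way to start inferring field semantics after download is to list the attribute declarations in the ARFF header. The sketch below uses only the standard library. SAMPLE_HEADER is a hypothetical header: its attribute names (id, keyword, location, text, target) are guesses based on the description above and may not match the actual file.

```python
import io


def read_arff_attributes(handle):
    """Return (name, type) pairs declared in the header of an ARFF file."""
    attributes = []
    for raw in handle:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lowered = line.lower()
        if lowered.startswith("@data"):
            break  # the header ends where the data section begins
        if lowered.startswith("@attribute"):
            rest = line[len("@attribute"):].strip()
            quote = rest[:1]
            end = rest.find(quote, 1) if quote in ("'", '"') else -1
            if end != -1:
                # Quoted attribute name, which may contain spaces.
                name, type_spec = rest[1:end], rest[end + 1:].strip()
            else:
                parts = rest.split(None, 1)
                name = parts[0] if parts else ""
                type_spec = parts[1] if len(parts) > 1 else ""
            attributes.append((name, type_spec))
    return attributes


# Hypothetical header for illustration; the real attribute names may differ.
SAMPLE_HEADER = """% hypothetical header, not copied from the dataset
@relation disaster_tweets
@attribute id numeric
@attribute keyword string
@attribute location string
@attribute text string
@attribute target {0,1}
@data
"""

attributes = read_arff_attributes(io.StringIO(SAMPLE_HEADER))
for name, type_spec in attributes:
    print(f"{name}: {type_spec}")
```

For the downloaded file, the same function can be given an open file handle in place of the io.StringIO wrapper.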
Provenance
Source
Inherited from 'Disasters on social media' source used in the 'Real or Not? NLP with Disaster Tweets' Kaggle competition.
Collection Method
Tweets were collected via keyword search and manually classified.
Time Range
Collected on Jan 14th, 2020.
Geography
Global, with tweets referencing specific locations like the Philippines, Australia, and Iran.
The dataset contains text that may be considered profane, vulgar, or offensive.