This dataset contains hotel reviews from a leading travel site, pairing user-written text with metadata. Its columns are a unique User_ID, the review Description, Browser_Used, Device_Used, and the target variable Is_Response. It is published on the OpenML platform under a CC0-1.0 license.
Use Cases
- Train sentiment classification models based on the Is_Response target variable.
- Analyze customer feedback patterns based on the review Description text.
- Study the relationship between user device metadata (Browser_Used, Device_Used) and review content.
- Perform exploratory data analysis on hotel review topics and user behavior.
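As a rough illustration of the device-metadata use case, the sketch below groups reviews by Device_Used and reports the review count, mean Description word count, and Is_Response label counts per device. It uses only the Python standard library. The toy rows, the label values "happy" and "not happy", and the helper name summarize_by_device are assumptions made for illustration; they are not taken from the dataset itself.

```python
from collections import Counter, defaultdict

# Toy rows that only mimic the documented column names.
# All values, including the label strings, are invented for illustration.
rows = [
    {"User_ID": "u1", "Description": "Great room and friendly staff",
     "Browser_Used": "Chrome", "Device_Used": "Mobile", "Is_Response": "happy"},
    {"User_ID": "u2", "Description": "Room was dirty",
     "Browser_Used": "Firefox", "Device_Used": "Mobile", "Is_Response": "not happy"},
    {"User_ID": "u3", "Description": "Clean quiet and close to the station",
     "Browser_Used": "Edge", "Device_Used": "Desktop", "Is_Response": "happy"},
]


def summarize_by_device(rows):
    """Return per-device review count, mean word count, and label counts."""
    counts = defaultdict(int)
    word_totals = defaultdict(int)
    labels = defaultdict(Counter)
    for row in rows:
        device = row["Device_Used"]
        counts[device] += 1
        # Whitespace tokenization is a crude proxy for review length.
        word_totals[device] += len(row["Description"].split())
        labels[device][row["Is_Response"]] += 1
    return {
        device: {
            "n": counts[device],
            "mean_words": word_totals[device] / counts[device],
            "labels": dict(labels[device]),
        }
        for device in counts
    }


summary = summarize_by_device(rows)
for device, stats in summary.items():
    print(device, stats)
```

On the real data, the same function could be fed rows read with csv.DictReader, provided the column names match those listed above.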
Strengths
- Includes a target variable (Is_Response) for supervised learning tasks.
- Contains multiple metadata fields (User_ID, Browser_Used, Device_Used) alongside the primary text data.
- Released under a permissive CC0-1.0 license for broad reuse.
Limitations
- Row count, file size, and last update date are unknown, which makes it hard to judge suitability before download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Data may reflect geographic or temporal bias inherent to the single, unspecified source platform.
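Because size and column semantics are undocumented, a quick profile after download can fill some of these gaps. The sketch below counts rows, checks that the five documented columns are present, and tallies empty cells and Is_Response values. It assumes the data is available as a CSV with a header row, which the source does not confirm; the helper name profile_reviews and the in-memory sample are hypothetical.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["User_ID", "Description", "Browser_Used",
                    "Device_Used", "Is_Response"]


def profile_reviews(handle):
    """Profile a CSV file-like object: rows, missing columns, empty cells, labels."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    row_count = 0
    empty_cells = Counter()
    label_counts = Counter()
    for row in reader:
        row_count += 1
        for col in columns:
            # DictReader yields None for fields missing from a short row.
            if not (row[col] or "").strip():
                empty_cells[col] += 1
        label_counts[row.get("Is_Response") or ""] += 1
    return {
        "rows": row_count,
        "missing_columns": [c for c in EXPECTED_COLUMNS if c not in columns],
        "empty_cells": dict(empty_cells),
        "label_counts": dict(label_counts),
    }


# Invented sample standing in for the downloaded file.
sample = io.StringIO(
    "User_ID,Description,Browser_Used,Device_Used,Is_Response\n"
    "u1,Lovely stay,Firefox,Desktop,happy\n"
    "u2,,Chrome,Mobile,not happy\n"
)
report = profile_reviews(sample)
print(report)
```

For a file on disk, the same function can be called with a handle from open(path, newline="", encoding="utf-8"), where path is wherever the download was saved.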
Provenance
- Source: A leading travel site (unspecified).
- Collection Method: Customer reviews provided by users of the site.
- Time Range: Not specified.
- Freshness: Last updated date is unknown; freshness unverified.
- Geography: Not specified.