This dataset contains 100,000 Amazon product reviews from the shoe category, each with a star rating from 1 to 5. It was created by juliensimon and is used to train a DistilBERT-based classification model. It was last updated on March 22, 2026.
Use Cases
- Train a star rating prediction model based on review text.
- Benchmark text classification algorithms on a large-scale review dataset.
- Analyze sentiment polarity in e-commerce product reviews.
- Fine-tune transformer models like DistilBERT for specific review classification tasks.
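The rating-prediction and sentiment-polarity use cases differ mainly in how a star rating is turned into a label. The following is a minimal sketch, not the dataset author's method. It assumes each record exposes the review text and an integer star rating under the hypothetical keys `text` and `stars`; the real column names are undocumented and must be checked after download. The polarity thresholds (1-2 negative, 3 neutral, 4-5 positive) are likewise an assumption, and the sample records are invented for illustration.

```python
from collections import Counter


def star_to_class(stars: int) -> int:
    """Map a 1-5 star rating to a zero-based class index (0-4)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    return stars - 1


def star_to_polarity(stars: int) -> str:
    """Collapse a star rating into a coarse polarity (assumed thresholds)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Hypothetical records; "text" and "stars" are assumed field names.
reviews = [
    {"text": "Fell apart after a week.", "stars": 1},
    {"text": "Fits fine, nothing special.", "stars": 3},
    {"text": "Most comfortable sneakers I own.", "stars": 5},
    {"text": "Great support for long runs.", "stars": 5},
]

labels = [star_to_class(r["stars"]) for r in reviews]
polarities = [star_to_polarity(r["stars"]) for r in reviews]

# Majority-class accuracy: the floor any benchmarked classifier should beat.
class_counts = Counter(labels)
majority_class, majority_count = class_counts.most_common(1)[0]
majority_accuracy = majority_count / len(labels)

print(labels, polarities, majority_accuracy)
```

Zero-based class indices are what most classification heads expect as targets, and the majority-class figure gives a simple reference point when benchmarking.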
Strengths
- Contains 100,000 samples, providing a substantial corpus for model training.
- Designed for a single, well-defined task: star rating prediction (1-5 stars).
- Sourced from a major e-commerce platform, likely reflecting real-world user language.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Verify freshness before use; the last update timestamp is March 22, 2026.
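Because the columns are undocumented, a quick inspection of a few downloaded rows can help identify which field holds the rating and which holds the review text. The heuristic below is only a sketch under stated assumptions: it treats a field whose values are all integers from 1 to 5 as the rating, and the string field with the longest average value as the text. The field names in `sample` (`review_id`, `body`, `rating`) are hypothetical, and any guess should be confirmed by reading the rows.

```python
def infer_columns(records):
    """Guess the rating and text fields from a list of row dictionaries.

    Heuristic (an assumption, not documented behavior of the dataset):
    - rating: first field whose values are all ints in 1..5
    - text: string field with the greatest mean length
    Returns (rating_field, text_field); either may be None.
    """
    rating_field = None
    text_field = None
    best_mean_len = -1.0
    for name in records[0].keys():
        values = [row[name] for row in records]
        is_rating = all(
            isinstance(v, int) and not isinstance(v, bool) and 1 <= v <= 5
            for v in values
        )
        if is_rating:
            if rating_field is None:
                rating_field = name
        elif all(isinstance(v, str) for v in values):
            mean_len = sum(len(v) for v in values) / len(values)
            if mean_len > best_mean_len:
                best_mean_len = mean_len
                text_field = name
    return rating_field, text_field


# Hypothetical rows standing in for the first few downloaded records.
sample = [
    {"review_id": "a1", "body": "Runs half a size small but very comfortable.", "rating": 4},
    {"review_id": "b2", "body": "Sole separated within a month.", "rating": 1},
]

rating_field, text_field = infer_columns(sample)
print(rating_field, text_field)
```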
Provenance
- Source: Amazon product reviews (shoe category)
- Freshness: Last updated 2026-03-22 21:49:34
- Geography: US