Description
An extension of the Amazon Reviews 2023 dataset, created by Google and last updated in January 2026. It includes product reviews and associated images across categories like Appliances, Clothing, Sports, and Video Games. The data was cleaned and augmented, specifically filtering for items with available images.
Use Cases
Train multimodal models to predict review sentiment or rating from combined review text and product images.
Analyze the relationship between product category (e.g., Appliances, Video_Games) and visual features extracted from product images.
Study consumer behavior by linking review text to visual product attributes within specific categories like Clothing_Shoes_and_Jewelry.
Perform data augmentation or benchmarking for models using the cleaned and extended review-image pairs provided.
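As a minimal sketch of the first use case, the snippet below turns review-image records into per-category training triples with a coarse sentiment label. The field names (`category`, `text`, `rating`, `image_url`), the sample records, and the label thresholds are all assumptions made for illustration; the actual schema is not documented here and should be checked on the dataset page.

```python
from collections import defaultdict

# Hypothetical records: the real column names are unknown, so these
# field names and values are illustrative only.
reviews = [
    {"category": "Appliances", "text": "Works great",
     "rating": 5.0, "image_url": "https://example.com/a.jpg"},
    {"category": "Video_Games", "text": "Disc arrived scratched",
     "rating": 1.0, "image_url": "https://example.com/b.jpg"},
    {"category": "Video_Games", "text": "It is okay",
     "rating": 3.0, "image_url": "https://example.com/c.jpg"},
]


def rating_to_label(rating):
    """Map a 1-5 star rating to a coarse sentiment label (assumed thresholds)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def build_examples(records):
    """Group (text, image_url, label) triples by product category."""
    by_category = defaultdict(list)
    for record in records:
        triple = (record["text"], record["image_url"],
                  rating_to_label(record["rating"]))
        by_category[record["category"]].append(triple)
    return dict(by_category)


examples = build_examples(reviews)
```

Grouping by category keeps the comparative analyses listed above straightforward, since each category's pairs can be sampled or evaluated separately.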
Strengths
Builds on the Amazon Reviews 2023 benchmark, so the underlying reviews come from an established, widely used source.
Explicitly filtered to include only items with images, ensuring multimodal completeness for those records.
Covers four distinct product categories (Appliances, Clothing_Shoes_and_Jewelry, Sports_and_Outdoors, Video_Games) for comparative analysis.
Limitations
Exact row count, column names, and total dataset size are unknown from the provided information.
Removing items without images introduces a selection bias: reviews of products that have no images are excluded entirely.
Specific geographic coverage and time range of the reviews are not detailed.
Provenance
Source
Extension of the Amazon Reviews 2023 dataset.
Collection Method
Data underwent cleaning and augmentation steps; items without images were removed.
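The image filter described above could look roughly like the following. This is an illustrative sketch, not the maintainers' actual pipeline: the `images` field, the `item_id` field, and the sample items are hypothetical. The function also reports the fraction of items dropped, which gives a rough sense of how strong the resulting selection bias might be.

```python
def drop_items_without_images(items):
    """Keep only items with at least one image; also return the dropped fraction.

    Assumes each item is a dict with an optional `images` list
    (a hypothetical field name, not confirmed by the dataset card).
    """
    kept = [item for item in items if item.get("images")]
    dropped_fraction = 1 - len(kept) / len(items) if items else 0.0
    return kept, dropped_fraction


# Hypothetical sample items for illustration only.
items = [
    {"item_id": "A1", "images": ["https://example.com/a1.jpg"]},
    {"item_id": "A2", "images": []},            # empty image list: dropped
    {"item_id": "A3"},                          # missing field: dropped
    {"item_id": "A4", "images": ["https://example.com/a4.jpg"]},
]

kept_items, dropped = drop_items_without_images(items)
```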
Freshness
Last updated on 2026-01-24, indicating recent maintenance.
The full description and specific data schema are available only on the Hugging Face dataset page. License information is unknown.