Gen-Review: Human and AI-Generated Paper Reviews for ICLR 2018-2025
by Anonymous / Harvard Dataverse · Updated 1d ago
Available on 1 platform
Description
A collection of over 92,000 text reviews for papers submitted to the International Conference on Learning Representations (ICLR). It includes more than 12,000 human-written reviews and more than 80,000 AI-generated reviews, covering submissions from 2018 to 2025. The dataset was uploaded to Harvard Dataverse by an anonymous author.
Use Cases
Train AI text detection classifiers based on the labeled human vs. AI-generated review text.
Analyze stylistic differences between human-written and machine-generated academic feedback.
Study the evolution of review content and quality for a major AI conference over an 8-year period.
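As a starting point for the stylistic-analysis use case, the sketch below computes a few surface-level statistics for one review text using only the Python standard library. The function name `style_features`, the chosen features, and the sample strings are illustrative assumptions, not part of the dataset or its documentation.

```python
import re
from statistics import mean

WORD_RE = re.compile(r"[A-Za-z']+")

def style_features(text: str) -> dict:
    """Compute simple surface statistics for one review (hypothetical helper)."""
    # Rough sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens (letters and apostrophes only)
    words = WORD_RE.findall(text.lower())
    return {
        "n_words": len(words),
        # Mean number of word tokens per sentence
        "avg_sentence_len": (
            mean(len(WORD_RE.findall(s)) for s in sentences) if sentences else 0.0
        ),
        # Distinct words divided by total words (lexical variety)
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
    }

# Made-up sample strings, not taken from the dataset
sample_a = "The idea is neat. But the experiments are thin, and I am not convinced."
sample_b = "The paper presents a novel approach. The paper presents strong results."

for name, text in [("sample_a", sample_a), ("sample_b", sample_b)]:
    print(name, style_features(text))
```

Comparing the distribution of such features between the two label groups is one simple way to begin; these statistics are coarse and would likely need to be supplemented before drawing conclusions.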
Strengths
Contains a large volume of text, with over 80,000 AI-generated reviews and over 12,000 human-written reviews.
Covers a significant 8-year time span (2018-2025) of submissions to a prominent AI conference.
Provides a clear binary label (human vs. AI) for the source of each review text.
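The counts quoted above imply a skewed label distribution, which matters when the binary label is used for training or evaluation. The sketch below works through the arithmetic using the approximate lower bounds from the description (12,000 human, 80,000 AI); the exact counts are not stated, so the resulting figures are estimates. The function names are hypothetical, and the weighting formula is the common "balanced" heuristic, total / (number of classes × class count).

```python
def majority_baseline_accuracy(counts: dict) -> float:
    """Accuracy of a classifier that always predicts the most frequent label."""
    return max(counts.values()) / sum(counts.values())

def balanced_class_weights(counts: dict) -> dict:
    """Per-class weights: total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

# Approximate lower bounds from the description, NOT exact dataset counts
approx_counts = {"human": 12_000, "ai": 80_000}

baseline = majority_baseline_accuracy(approx_counts)
weights = balanced_class_weights(approx_counts)

print(f"majority-class baseline accuracy: {baseline:.3f}")
for label, w in weights.items():
    print(f"weight[{label}] = {w:.3f}")
```

Under these assumed counts, a detector that always answers "AI" would already be right for roughly 87% of reviews, so plain accuracy is a weak metric here; per-class precision and recall, or class weighting as sketched, may be more informative.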
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The exact row count for the combined dataset is not stated; only approximate figures (over 92,000 reviews in total) are given, which may limit suitability assessment.
The source and generation method for the AI reviews are not detailed in the provided metadata.
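Because field semantics have to be inferred after download, a quick column summary can help. The sketch below assumes the data is available as CSV, which is not confirmed by the metadata; the column names in the in-memory demo (`review_text`, `label`, `year`) are hypothetical placeholders, not the dataset's actual schema.

```python
import csv
import io
from collections import Counter

def summarize_columns(reader, max_rows=1000):
    """Summarize up to max_rows rows: non-empty counts and sample values per column."""
    filled = Counter()
    samples = {}
    rows_seen = 0
    for row in reader:
        if rows_seen >= max_rows:
            break
        rows_seen += 1
        for col, val in row.items():
            if col is None:
                # Extra unnamed fields from malformed rows are skipped
                continue
            samples.setdefault(col, set())
            if val not in (None, ""):
                filled[col] += 1
                if len(samples[col]) < 5:
                    # Truncate long text so the summary stays readable
                    samples[col].add(val[:60])
    return {
        col: {"non_empty": filled[col], "rows_seen": rows_seen, "samples": sorted(vals)}
        for col, vals in samples.items()
    }

# In-memory demo with made-up rows and hypothetical column names
demo = io.StringIO(
    "review_text,label,year\n"
    "Solid work.,human,2019\n"
    "Clear paper.,ai,2024\n"
    ",ai,2025\n"
)
summary = summarize_columns(csv.DictReader(demo))
for col, info in summary.items():
    print(col, info)
```

For a downloaded file, the same function could be called on `csv.DictReader(open(path, newline="", encoding="utf-8"))`, where `path` points to wherever the file was saved.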
Provenance
Source
Harvard Dataverse, uploaded by an anonymous author.
Collection Method
Likely collected from ICLR conference submission systems, with AI-generated reviews added from an unspecified source.
Time Range
2018 to 2025
Freshness
Last updated 2026-06-17 17:53:22; freshness should be verified.
License
Unknown; must be verified before use.