Available on 1 platform
FAPO Critic is a constructed benchmark dataset for training a generative reward model. It was created by the author dyyyyyyyy for the FAPO research project and was last updated on the platform in October 2025. The data is sourced from ProcessBench and forms FlawedPositiveBench, which is used to train the FAPO-GenRM-4B model.
Its primary use is training the FAPO-GenRM-4B model. The license and detailed schema are unknown.