IFBench is a benchmark for evaluating reward models that assess instruction-following capabilities in AI agents. The dataset was created by the THU-KEG research group and published in March 2025 alongside their paper on agentic reward modeling. Each sample carries a unique identifier and a source annotation for structured evaluation.
Full dataset details, including all columns, sample data, and the license, are available on the Hugging Face dataset page. The platform tags suggest that the dataset is stored in JSON format and contains text data.
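As a rough illustration of how samples with identifiers and source annotations might be inspected once downloaded, the sketch below parses a few made-up records, checks that the identifiers are unique, and counts samples per source. The field names `id`, `source`, and `text`, the JSON Lines layout, and the sample values are all assumptions for illustration; the actual column names should be confirmed on the dataset page.

```python
import json
from collections import Counter

# Hypothetical records: the field names "id", "source", and "text" and the
# JSON Lines layout are assumptions, not the dataset's confirmed schema.
SAMPLE_JSONL = """\
{"id": "a1", "source": "set_a", "text": "example one"}
{"id": "a2", "source": "set_b", "text": "example two"}
{"id": "a3", "source": "set_a", "text": "example three"}
"""


def summarize(jsonl_text, id_key="id", source_key="source"):
    """Parse JSON Lines text, find duplicate IDs, and count samples per source."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    id_counts = Counter(record[id_key] for record in records)
    duplicates = sorted(key for key, n in id_counts.items() if n > 1)
    per_source = Counter(record[source_key] for record in records)
    return {
        "count": len(records),
        "duplicate_ids": duplicates,
        "per_source": dict(per_source),
    }


summary = summarize(SAMPLE_JSONL)
print(summary)
```

The `id_key` and `source_key` parameters are there so the same check can be pointed at whatever column names the real files turn out to use.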