ITBench-Lite is a systematic framework for benchmarking large language models and AI agents on real-world IT automation tasks. The dataset contains 65 scenarios across three domains, 35 of which cover Site Reliability Engineering. It was created by IBM Research and accompanies the research paper 'ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks'.
The description lists the license as Apache 2.0, but the dataset metadata marks it as 'unknown'; verify the license before use.