228 Korean-language questions designed to benchmark web agents on exhaustive enumeration tasks. Each task asks an agent to enumerate every member of a closed set and fill in every attribute cell of the resulting table. Gold answers, source URLs, set sizes, and the scoring pipeline are not part of the public release; they are withheld so that a leakage-aware evaluation can be run privately against held-out data.