Available on 1 platform
K-MetBench is a multi-dimensional benchmark for evaluating meteorology models across accuracy, reasoning quality, geo-cultural alignment, and fine-grained domain coverage. The dataset was created by soyeonbot and was last updated on Hugging Face in April 2026. Its public evaluation protocol uses an explicit advanced benchmark and an explicit reasoning benchmark followed by LLM-as-a-judge evaluation.
The license is unknown; verify the terms of use on the dataset page before using the data.