Available on 1 platform
OSWorld provides task examples, retrieval documents, and virtual machine snapshots for benchmarking multimodal agents performing open-ended tasks in real computer environments. The dataset was created by xlangai and last updated in October 2024. It supports evaluation on both x86 and arm64 machine architectures using VMware or VirtualBox.
Loading the provided snapshots requires virtualization software, either VMware or VirtualBox; which one applies depends on the user's machine architecture.
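
Before downloading the snapshots, it can help to check the host's architecture and which hypervisor is installed. The following is a minimal sketch, not part of OSWorld itself: the helper names `normalize_arch` and `available_hypervisors` and the alias table are assumptions made for illustration, and the check relies only on the `vmrun` (VMware) and `VBoxManage` (VirtualBox) command-line tools being on `PATH`. It does not decide which hypervisor goes with which architecture, since the description above does not state that mapping.

```python
import platform
import shutil

# Assumed mapping from platform.machine() strings to the two architecture
# families named in the description; the alias list is illustrative.
ARCH_ALIASES = {
    "x86_64": "x86",
    "amd64": "x86",
    "i386": "x86",
    "i686": "x86",
    "arm64": "arm64",
    "aarch64": "arm64",
}

# Command-line tools that ship with each hypervisor.
HYPERVISOR_CLIS = {"VMware": "vmrun", "VirtualBox": "VBoxManage"}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string to 'x86', 'arm64', or 'unknown'."""
    return ARCH_ALIASES.get(machine.lower(), "unknown")


def available_hypervisors(which=shutil.which) -> list[str]:
    """Return the hypervisors whose CLI is found on PATH."""
    return [name for name, cli in HYPERVISOR_CLIS.items() if which(cli)]


if __name__ == "__main__":
    print("architecture:", normalize_arch(platform.machine()))
    print("hypervisors found:", available_hypervisors() or "none")
```

The `which` parameter defaults to `shutil.which` and exists only so the lookup can be replaced in tests; a hypervisor installed outside `PATH` will not be detected by this check.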