Physical AI Benchmark For World Model Evaluation | DataSalon

Robotics & Autonomous Systems

Name: Physical AI Benchmark For World Model Evaluation
Creator: nvidia
Published: 2025-06-11T09:05:24
Keywords: Autonomous Vehicles, Benchmark, Robotics, World Models, Physical AI, Multimodal

Available on 1 platform

Description

This benchmark comprises 1044 samples of text prompts, conditioning images, and binary question-answer pairs for measuring world model progress. It covers six target domains: autonomous vehicle driving, robotics, smart spaces, physics, human, and common sense. The dataset was created by NVIDIA and last updated in June 2025.

Use Cases

Benchmark world model performance on binary question-answering tasks using the provided text prompts and QA pairs.
Evaluate multimodal reasoning by conditioning world models on the provided images before answering binary questions.
Test model generalization across physical AI domains like autonomous vehicle scenarios and robotics using the domain-specific prompts.
Assess common sense reasoning in physical contexts using the human and common sense question sets.
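
The binary question-answering use cases above come down to comparing a model's Yes/No reply with the reference answer and reporting accuracy per domain. The following is a minimal sketch of that scoring step, not the benchmark's official evaluation code. The record keys "domain" and "answer", and the helper names normalize_answer and score_by_domain, are assumptions made for illustration; the actual dataset schema may differ.

```python
from collections import defaultdict


def normalize_answer(text):
    """Map a free-form reply to 'yes' or 'no'; return None if it is neither."""
    token = text.strip().lower().rstrip(".!")
    return token if token in ("yes", "no") else None


def score_by_domain(records, predictions):
    """Return {domain: accuracy} for binary QA records.

    records: list of dicts with hypothetical keys 'domain' and 'answer'.
    predictions: list of model replies, aligned with records by position.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for record, reply in zip(records, predictions):
        domain = record["domain"]
        total[domain] += 1
        predicted = normalize_answer(reply)
        expected = normalize_answer(record["answer"])
        # An unparseable reply counts as wrong rather than being skipped.
        if predicted is not None and predicted == expected:
            correct[domain] += 1

    return {domain: correct[domain] / total[domain] for domain in total}


# Toy records for illustration only; these are not taken from the dataset.
sample_records = [
    {"domain": "robotics", "answer": "Yes"},
    {"domain": "robotics", "answer": "No"},
    {"domain": "physics", "answer": "No"},
]
sample_predictions = ["yes", "Yes.", "maybe"]

for domain, accuracy in sorted(score_by_domain(sample_records, sample_predictions).items()):
    print(f"{domain}: {accuracy:.2f}")
```

Counting unparseable replies as incorrect keeps the denominator fixed across models. Because every question is binary, a model that always answers the majority label can score 50% or higher, so per-domain accuracy is best read against that baseline.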

Strengths

A fixed set of 1044 samples gives the benchmark a well-defined, comparable evaluation size.
Covers 6 distinct Physical AI target domains for broad evaluation.

Limitations

Limited to binary (Yes/No) questions, restricting answer complexity.
Sample size of 1044 may be insufficient for training large models from scratch.
Potential for bias in the selection of prompts and conditioning images across domains.

Provenance

Source: NVIDIA
Collection Method: Not specified
Time Range: Not specified
Freshness: Last updated June 2025.
Geography: Not specified

Quality Score

C40

Community

24 downloads

17 likes

0 views

Dataset Info

Author: nvidia
Created: Jun 11, 2025
Updated: Jun 11, 2025
Last synced: May 8, 2026
