A collection of pre-rendered 3D multi-room environments for evaluating spatial reasoning in Vision Language Models. It provides structured visual scene data to support the Theory of Space benchmark, focusing on active exploration and the construction of spatial beliefs.
Use Cases
- Evaluate the spatial reasoning capabilities of Vision Language Models using the pre-rendered 3D environment sequences
- Develop and test active exploration strategies in foundation models using the provided multi-room visual transitions
- Benchmark how well models construct and maintain internal spatial maps from sequential visual inputs
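
As an illustration of the last use case, here is a minimal sketch of a step-by-step scoring loop. Everything in it is an assumption made for illustration and not part of the dataset or the Theory of Space benchmark: the `Frame` and `SpatialMap` types, the `update_belief` callable standing in for a model, and the room-assignment accuracy metric are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a "frame" stands in for one pre-rendered image
# (here just a file path), and a spatial map assigns each object name
# to a room name. Neither reflects the dataset's actual schema.
Frame = str
SpatialMap = Dict[str, str]


def map_accuracy(predicted: SpatialMap, truth: SpatialMap) -> float:
    """Fraction of ground-truth objects placed in the correct room."""
    if not truth:
        return 0.0
    correct = sum(1 for obj, room in truth.items() if predicted.get(obj) == room)
    return correct / len(truth)


def evaluate_sequence(
    update_belief: Callable[[SpatialMap, Frame], SpatialMap],
    frames: Sequence[Frame],
    truth: SpatialMap,
) -> List[float]:
    """Feed frames one at a time and score the belief after each step."""
    belief: SpatialMap = {}
    scores: List[float] = []
    for frame in frames:
        # The model (hypothetical callable) revises its belief from the new view.
        belief = update_belief(belief, frame)
        scores.append(map_accuracy(belief, truth))
    return scores
```

Scoring after every frame, not only at the end, shows whether a model's spatial belief improves as it sees more of the environment or degrades over a long sequence.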
Strengths
- Contains pre-rendered 3D multi-room environments for spatial navigation tasks
- Designed specifically to support the Theory of Space (ToS) benchmark for foundation models
- Provides visual scene data structured for active exploration and spatial reasoning evaluation