RoboPulse is a benchmark introduced in the paper 'PRM-as-a-Judge: A Dense Evaluation Paradigm for Fine-Grained Robotic Auditing'. This Hugging Face release, authored by yuheng2000 and last updated on 2026-03-30, contains the benchmark's hard subset of 1800 examples. Each example asks a vision-language judge to compare a BEFORE state and an AFTER state from the same task, with the task-start and task-end frames serving as reference anchors.
Use Cases
- Benchmarking vision-language models for fine-grained progress detection based on before/after state comparisons.
- Training or auditing robotic manipulation systems based on relative progress evaluation.
- Researching dense evaluation paradigms for robotics based on the described task-anchored framework.
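The benchmarking use case can be sketched as a simple accuracy loop over before/after pairs. This is a minimal sketch under stated assumptions, not the paper's evaluation protocol: the `PairExample` fields, the "progress"/"regress" verdict labels, and the `judge` callable are all hypothetical, since the release does not document its columns or its label format.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PairExample:
    # Every field name here is hypothetical; the release does not document its columns.
    task: str          # natural-language task description
    start_frame: str   # reference to the task-start anchor frame
    end_frame: str     # reference to the task-end anchor frame
    before: str        # reference to the BEFORE state
    after: str         # reference to the AFTER state
    label: str         # assumed gold verdict: "progress" or "regress"


def pairwise_accuracy(
    examples: Iterable[PairExample],
    judge: Callable[[PairExample], str],
) -> float:
    """Return the fraction of examples where the judge's verdict matches the label."""
    examples = list(examples)
    if not examples:
        raise ValueError("no examples to score")
    correct = sum(judge(ex) == ex.label for ex in examples)
    return correct / len(examples)


def always_progress(example: PairExample) -> str:
    """Trivial baseline judge standing in for a real vision-language model."""
    return "progress"


# Two made-up examples, used only to exercise the scoring function.
demo = [
    PairExample("stack the blocks", "s0.png", "s9.png", "s2.png", "s5.png", "progress"),
    PairExample("stack the blocks", "s0.png", "s9.png", "s6.png", "s3.png", "regress"),
]
score = pairwise_accuracy(demo, always_progress)
```

A constant baseline such as `always_progress` is a useful sanity check: any real judge should beat it on a subset whose labels are reasonably balanced.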
Strengths
- Contains 1800 examples in its hard subset.
- Designed for fine-grained evaluation of relative progress in physical manipulation.
- Anchors each comparison with task-start and task-end reference frames, which together span the full scope of the task.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The row count is known only for the hard subset; the size of the full benchmark is not stated.
- Description metadata is limited; actual data quality requires manual inspection after download.
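Because column-level documentation is absent, a first pass after download is usually to tabulate which columns exist and which value types they hold. The sketch below assumes the records are available as plain dictionaries (for example, by iterating over a split loaded with the Hugging Face `datasets` library); the column names and values in `sample_rows` are hypothetical placeholders, not the actual RoboPulse schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def summarize_columns(rows: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each column name to the sorted Python type names observed in it."""
    seen = defaultdict(set)
    for row in rows:
        for key, value in row.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical rows standing in for downloaded records.
sample_rows = [
    {"task": "stack the blocks", "before": "frame_012.png", "label": 1},
    {"task": "open the drawer", "before": "frame_040.png", "label": None},
]

summary = summarize_columns(sample_rows)
for column, types in summary.items():
    print(f"{column}: {', '.join(types)}")
```

A column that reports more than one type, such as a mix of `int` and `NoneType`, is a cue to check for missing labels before computing any metric.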
Provenance
- Source: Hugging Face dataset authored by yuheng2000.
- Collection Method: Introduced as a benchmark in the associated research paper.
- Time Range: Not specified.
- Freshness: Last updated 2026-03-30 15:35:40.
- Geography: Not specified.