Name: Chi-Bench: Long-Horizon Healthcare Workflow Agent Benchmark
Creator: actava
Published: 2026-05-03T06:52:20
Keywords: Prior Authorization, Clinical Simulation, Benchmark, AI Agent Benchmark, Healthcare, Text, Utilization Management, Healthcare Workflows

Description

χ-Bench (Chi-Bench) is a benchmark dataset for evaluating AI agents on end-to-end U.S. healthcare workflows. It was published by actava on Hugging Face on 2026-05-03 and last updated on 2026-05-19. The dataset provides task fixtures across three long-horizon domains: provider prior authorization, payer utilization management, and population care management.

Use Cases

Benchmarking AI agent performance on the clinical case tasks in provider prior authorization workflows.
Evaluating agent decision-making in policy-rich payer utilization management scenarios.
Testing long-horizon planning in end-to-end population care management workflow simulations.

Strengths

Focuses on three specific, complex healthcare domains: prior authorization, utilization management, and care management.
Simulates a high-fidelity environment of 20 healthcare applications exposed over the Model Context Protocol (MCP).
Benchmark tasks are designed for long-horizon, policy-rich agent evaluation.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes suitability harder to assess.
Last updated 2026-05-19 04:40:32; freshness should be verified.
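
Because column-level documentation is absent, a first pass over the downloaded task fixtures usually means tabulating which fields appear and what types they hold. The following is a minimal sketch of that step, assuming the fixtures can be read as a list of dictionaries. The function name summarize_fields and the sample field names (task_id, domain, steps) are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import defaultdict
from typing import Any, Iterable


def summarize_fields(records: Iterable[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Collect, per field, the Python type names seen and how many records contain it."""
    summary: dict[str, dict[str, Any]] = defaultdict(
        lambda: {"types": set(), "count": 0}
    )
    for record in records:
        for name, value in record.items():
            summary[name]["types"].add(type(value).__name__)
            summary[name]["count"] += 1
    return dict(summary)


# Hypothetical records for illustration only; the real field names are undocumented.
sample_records = [
    {"task_id": "example-1", "domain": "prior_authorization", "steps": 12},
    {"task_id": "example-2", "domain": "utilization_management", "steps": None},
]

field_summary = summarize_fields(sample_records)
for name, info in sorted(field_summary.items()):
    print(name, sorted(info["types"]), info["count"])
```

A field that reports more than one type, or a count lower than the number of records, is a candidate for closer inspection before relying on it in an evaluation harness.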

Provenance

Source: actava on Hugging Face
Collection Method: Likely created as a benchmark task suite for AI agent evaluation.
Freshness: 2026-05-19 04:40:32
Geography: U.S. healthcare workflows

License is unknown; terms of use must be verified before application.
