Reinforcement Learning

AgentRx: Root Cause Attribution for Multi-Agent LLM Failures

Name: AgentRx: Root Cause Attribution for Multi-Agent LLM Failures
Creator: microsoft
Published: 2026-02-23T21:27:42
Keywords: arXiv:2602.02475, JSON, modality: text, size category: n<1K, libraries: datasets, pandas, polars, mlcroissant, license: CC BY 4.0, region: US

Available on 1 platform

Description

Microsoft's AgentRx benchmark, updated in February 2026, provides under 1,000 annotated records of failed multi-agent LLM trajectories. It features step-level failure categories and designated root cause attributions across domains such as retail.

Use Cases

Training models to identify root cause failures in agentic workflows
Evaluating LLM debugging capabilities using step-level failure categories
Developing constraint-based supervision for multi-agent systems
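
As a rough illustration of the evaluation use case, the sketch below scores a model's root cause predictions against labeled root cause steps using exact-match accuracy. The function name, the use of integer step indices, and the sample values are all assumptions made for illustration; the listing does not describe the dataset's actual label format or any official metric.

```python
from typing import Sequence


def root_cause_accuracy(gold_steps: Sequence[int], predicted_steps: Sequence[int]) -> float:
    """Fraction of trajectories whose predicted root cause step equals the label.

    Hypothetical helper: assumes each trajectory has exactly one labeled
    root cause step, identified by an integer index.
    """
    if len(gold_steps) != len(predicted_steps):
        raise ValueError("gold and predicted lists must have the same length")
    if not gold_steps:
        return 0.0
    hits = sum(g == p for g, p in zip(gold_steps, predicted_steps))
    return hits / len(gold_steps)


# Made-up labels for four failed trajectories (not real dataset values).
gold = [2, 5, 1, 3]
pred = [2, 4, 1, 3]
accuracy = root_cause_accuracy(gold, pred)
```

Exact match is the strictest choice; a tolerance window around the labeled step could be substituted if near misses should receive credit.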

Strengths

Step-level failure annotations for granular debugging
Root cause attribution labels for diagnostic tasks
CC BY 4.0 open license for research use

Limitations

Small sample size of fewer than 1,000 records
Focus is restricted to failed trajectories rather than a balanced success/failure set

Provenance

Source: Microsoft (arXiv:2602.02475)
Collection Method: Annotated failed trajectories from multi-agent systems
Freshness: Last updated February 2026.
Geography: United States

The data is distributed as JSON and requires JSON processing; refer to arXiv paper 2602.02475 for methodology details.
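
A minimal sketch of the JSON processing involved, using only the Python standard library. It assumes one JSON object per line, and every field name here (`trajectory_id`, `domain`, `steps`, `failure_category`, `root_cause_step`) along with the category values is hypothetical; check the actual files and the paper for the real schema before adapting it.

```python
import json
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
SAMPLE_JSONL = """\
{"trajectory_id": "t1", "domain": "retail", "steps": [{"index": 0, "failure_category": null}, {"index": 1, "failure_category": "tool_misuse"}], "root_cause_step": 1}
{"trajectory_id": "t2", "domain": "retail", "steps": [{"index": 0, "failure_category": "bad_plan"}, {"index": 1, "failure_category": "tool_misuse"}], "root_cause_step": 0}
"""


def parse_records(text: str) -> list:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_failure_categories(records: list) -> Counter:
    """Count step-level failure categories, skipping steps with no failure."""
    counts = Counter()
    for record in records:
        for step in record.get("steps", []):
            category = step.get("failure_category")
            if category is not None:
                counts[category] += 1
    return counts


records = parse_records(SAMPLE_JSONL)
category_counts = count_failure_categories(records)
root_cause_steps = {r["trajectory_id"]: r["root_cause_step"] for r in records}
```

With fewer than 1,000 records, in-memory parsing like this should be sufficient; the pandas, polars, and datasets libraries named in the keywords are alternatives for tabular work.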

Quality Score

D37

Community

33 downloads

3 likes

0 views

Dataset Info

Author: microsoft
Created: Feb 23, 2026
Updated: Feb 26, 2026
Last synced: May 12, 2026
