AgentCollabBench: Diagnostic Benchmark for Multi-Agent LLM Systems

Name: AgentCollabBench: Diagnostic Benchmark for Multi-Agent LLM Systems
Creator: AgentCollabBench
Published: 2026-05-05T04:44:50
Keywords: LLM Benchmark, Multi-Agent Systems, Process Failures, Benchmark, Text, Diagnostic Benchmark

by AgentCollabBench, updated 1mo ago

Available on 1 platform

Description

AgentCollabBench is a diagnostic benchmark dataset for multi-agent LLM systems, created by AgentCollabBench and last updated on 2026-05-06. It targets process-level failures that single-agent benchmarks cannot expose, such as failures emerging from inter-agent communication.

Use Cases

Diagnosing constraint decay, where an agent abandons its constraints under peer pressure from other agents.
Measuring multi-hop information loss in agent communication chains.
Analyzing false-belief propagation between agents.
Testing for private context leakage in collaborative scenarios.
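To make the multi-hop information loss use case concrete, the sketch below tracks what fraction of a set of source facts survives at each hop of a relay chain. It is only an illustration: the function name retention_per_hop, the sample facts, and the case-insensitive substring matching are all assumptions made here, not the benchmark's own scoring method, which the listing does not describe.

```python
def retention_per_hop(source_facts, hop_messages):
    """Return, for each hop, the fraction of source facts still present.

    Hypothetical metric: a fact counts as retained when its text appears
    (case-insensitively) in the hop's message. This is a crude proxy and
    not the scoring used by AgentCollabBench.
    """
    facts = [fact.lower() for fact in source_facts]
    if not facts:
        raise ValueError("source_facts must not be empty")
    rates = []
    for message in hop_messages:
        text = message.lower()
        kept = sum(1 for fact in facts if fact in text)
        rates.append(kept / len(facts))
    return rates


# Made-up facts and a made-up three-hop relay chain.
facts = ["budget is 500", "deadline friday", "no external apis"]
chain = [
    "Budget is 500, deadline Friday, no external APIs.",
    "Deadline Friday, budget is 500.",
    "Please finish by the deadline Friday.",
]
rates = retention_per_hop(facts, chain)
print(rates)
```

A falling retention rate along the chain would indicate where information is dropped. Paraphrased facts would be missed by substring matching, so a real analysis would likely need a more tolerant comparison.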

Strengths

Specifically designed to expose failures unique to multi-agent collaboration.
Targets four process-level failure modes: constraint decay, information loss, false-belief propagation, and context leakage.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
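Because column documentation and row count are both missing, a first inspection pass after download might look like the sketch below. It assumes the data can be read as JSON Lines (one JSON object per line), which the listing does not confirm. The function name profile_records and the sample field names (scenario_id, failure_mode, turns) are hypothetical and chosen only for illustration.

```python
import json
from collections import Counter, defaultdict


def profile_records(lines):
    """Count rows and tally the value types seen for each field.

    Assumes JSON Lines input where every non-blank line is a JSON object.
    The actual file format of the dataset is not documented.
    """
    row_count = 0
    field_types = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        row_count += 1
        for key, value in record.items():
            field_types[key][type(value).__name__] += 1
    return row_count, {key: dict(counts) for key, counts in field_types.items()}


# Hypothetical records; the real field names are unknown.
sample = [
    '{"scenario_id": 1, "failure_mode": "constraint_decay", "turns": ["a", "b"]}',
    '{"scenario_id": 2, "failure_mode": "context_leakage", "turns": []}',
]
rows, fields = profile_records(sample)
print(rows)
print(fields)
```

With a real file, the same function could be called on an open file handle, for example profile_records(open(path, encoding="utf-8")), where path is wherever the download was saved. Fields whose type tallies do not add up to the row count are missing from some records.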

Provenance

Source: AgentCollabBench
Collection Method: Not documented; the dataset likely consists of constructed scenarios or tasks designed to diagnose specific multi-agent failure modes.
Time Range: Not specified
Freshness: Last updated 2026-05-06 07:10:06; confirm that this is still current before use.
Geography: Not specified

License is unknown; restrictions should be verified before use.

Quality Score

Overall: D36
Description: 39
Source: 36
Reputation: 38
Access: 26

Community

7 downloads

1 like

0 views

Dataset Info

Author: AgentCollabBench
Created: May 5, 2026
Updated: May 6, 2026
Last synced: May 12, 2026
