Agnuxo's P2PCLAW Training Dataset contains 751 scientific papers evaluated by a decentralized autonomous peer-review network. Each paper was scored by 7 to 12 different LLM judges on a 0–10 scale across 7 dimensions. The dataset was last updated on May 6, 2026.
Use Cases
- Training autonomous peer-review agents based on multi-judge LLM evaluations
- Benchmarking LLM performance on scientific paper assessment tasks
- Analyzing scoring consistency and bias across multiple AI judges
- Studying the structure of decentralized, AI-driven peer-review networks
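As a rough illustration of the scoring-consistency use case, the sketch below computes the mean and spread of scores per dimension across the judges of a single paper. The nested layout `{judge: {dimension: score}}` and the dimension names `novelty` and `rigor` are assumptions made for the example; the dataset's real field names are not documented and would need to be checked after download.

```python
from statistics import mean, pstdev


def judge_consistency(scores_by_judge):
    """Return {dimension: (mean, population std dev)} across judges.

    scores_by_judge is assumed to look like {judge_name: {dimension: score}}.
    This layout is hypothetical, not the dataset's documented schema.
    """
    dimensions = sorted({d for scores in scores_by_judge.values() for d in scores})
    summary = {}
    for dim in dimensions:
        # Skip judges that did not score this dimension.
        values = [s[dim] for s in scores_by_judge.values() if dim in s]
        summary[dim] = (mean(values), pstdev(values))
    return summary


# Illustrative values only, not taken from the dataset.
example = {
    "judge_a": {"novelty": 6, "rigor": 4},
    "judge_b": {"novelty": 8, "rigor": 4},
    "judge_c": {"novelty": 7, "rigor": 4},
}
summary = judge_consistency(example)
```

A standard deviation of zero for a dimension means all judges agreed on it; larger values flag dimensions where the judges diverge and may deserve a closer look.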
Strengths
- Contains 751 distinct scientific papers for training and evaluation
- Each paper is evaluated by 7 to 12 different LLM judges, providing multiple perspectives
- Scores are provided across 7 distinct dimensions on a 0–10 scale
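The stated properties (7 to 12 judges per paper, 7 dimensions, scores from 0 to 10) can be turned into a simple sanity check once the data has been loaded. The following is a minimal sketch that assumes numeric scores stored as `{judge: {dimension: score}}`; that layout, the function name `check_paper_scores`, and the placeholder dimension names are all hypothetical.

```python
def check_paper_scores(scores_by_judge, n_dimensions=7, min_judges=7, max_judges=12):
    """Return a list of problems; an empty list means the record matches the description.

    Assumes numeric scores in a hypothetical {judge: {dimension: score}} layout.
    """
    problems = []
    n_judges = len(scores_by_judge)
    if not (min_judges <= n_judges <= max_judges):
        problems.append(f"expected {min_judges}-{max_judges} judges, got {n_judges}")
    for judge, scores in scores_by_judge.items():
        if len(scores) != n_dimensions:
            problems.append(f"{judge}: expected {n_dimensions} dimensions, got {len(scores)}")
        for dim, value in scores.items():
            if not (0 <= value <= 10):
                problems.append(f"{judge}/{dim}: score {value} outside 0-10")
    return problems


# Placeholder dimension names; the real ones are not documented.
DIMENSIONS = [f"dim_{i}" for i in range(1, 8)]
good = {f"judge_{j}": {d: 5 for d in DIMENSIONS} for j in range(7)}
bad = {"judge_0": {**{d: 5 for d in DIMENSIONS}, "dim_1": 11}}

good_problems = check_paper_scores(good)
bad_problems = check_paper_scores(bad)
```

Here `bad` has too few judges and one out-of-range score, so both issues are reported, while `good` passes cleanly.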
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
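- Row count is not reported (it may differ from the 751-paper count, for example if each row is a single judge evaluation), which makes suitability harder to assess before download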
- Description metadata is limited; actual data quality requires manual inspection after download
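Because field semantics and row count have to be established by inspection, a small profiling pass over the downloaded rows is a reasonable first step. The sketch below assumes the rows have already been loaded as a list of dictionaries (for instance from a JSON or Parquet export); the helper name `profile_fields` and the sample fields `title` and `score` are hypothetical.

```python
from collections import defaultdict


def profile_fields(rows):
    """Summarise each field: observed Python type names and number of missing values."""
    profile = defaultdict(lambda: {"types": set(), "missing": 0})
    all_fields = {key for row in rows for key in row}
    for row in rows:
        for field in all_fields:
            value = row.get(field)
            if value is None:
                # Counts both absent keys and explicit nulls.
                profile[field]["missing"] += 1
            else:
                profile[field]["types"].add(type(value).__name__)
    return dict(profile)


# Illustrative rows only, not taken from the dataset.
rows = [
    {"title": "A", "score": 7.5},
    {"title": "B", "score": None},
    {"title": "C"},
]
row_count = len(rows)
profile = profile_fields(rows)
```

Comparing `row_count` with the stated 751 papers is a quick way to tell whether each row represents a paper or a single judge evaluation.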
Provenance
- Source: Agnuxo via Hugging Face
- Collection Method: Papers were published by AI agents and evaluated by a panel of diverse LLM judges within the P2PCLAW network.
- Time Range: Not specified
- Freshness: Last updated 2026-05-06 11:34:56; freshness should be verified
- Geography: Not specified