ResearchClawBench: Benchmark for Automated Scientific Research Agents

Name: ResearchClawBench: Benchmark for Automated Scientific Research Agents
Creator: InternScience
Published: 2026-03-27T04:00:28
Keywords: Publication Evaluation, Scientific Research, Benchmark, AI Agents, Text, Automated Research

Available on 1 platform

Description

ResearchClawBench is a benchmark that measures whether AI coding agents can independently conduct scientific research, from reading raw data to producing publication-quality reports. It evaluates agent outputs against real human-authored papers. The benchmark was created by InternScience and last updated on May 19, 2026.

Use Cases

Benchmarking how well AI agents read raw data and turn it into research reports.
Evaluating the quality of AI-generated scientific reports against human-authored papers.
Training AI agents for tasks spanning from literature re-discovery to new discovery.

Strengths

The benchmark is designed to evaluate a complete research workflow, from data reading to report generation.
Evaluation is performed against real human-authored papers, providing a concrete ground truth.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
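
Because the row count and field semantics are undocumented, a first inspection pass after download can help with suitability assessment. The following is a minimal sketch, assuming the dataset files have already been downloaded to a local directory; the function name `summarize_download` and the directory path are hypothetical and not part of the dataset or its tooling. It uses only the Python standard library and reports how many files of each extension exist and how large they are.

```python
from collections import Counter
from pathlib import Path


def summarize_download(root):
    """Return {extension: (file_count, total_bytes)} for all files under root.

    Hypothetical helper for a first look at an undocumented download;
    it makes no assumption about the dataset's actual layout.
    """
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize case so ".CSV" and ".csv" are grouped together.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in sorted(counts)}


if __name__ == "__main__":
    import sys

    # The directory is an assumption: pass wherever the files were saved.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for ext, (n, size) in summarize_download(root).items():
        print(f"{ext}: {n} file(s), {size} bytes")
```

The summary only shows which file types are present; field meanings and data quality still need manual review of the files themselves.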

Provenance

Source: InternScience on Hugging Face.
Collection Method: Likely compiled for benchmarking AI research agents; specific collection method is not detailed.
Freshness: Last updated 2026-05-19 11:59:57; freshness should be verified.

License is unknown; terms of use must be verified before application.

Quality Score: D38

Description: 39
Source: 36
Reputation: 49
Access: 26

Community

2.8K downloads
2 likes
0 views

Dataset Info

Author: InternScience
Created: Mar 27, 2026
Updated: May 19, 2026
Last synced: Jun 8, 2026
