SGI-Bench: Scientist-Aligned LLM Evaluation Across 10 Disciplines

Name: SGI-Bench: Scientist-Aligned LLM Evaluation Across 10 Disciplines
Creator: InternScience
Published: 2026-03-25T06:59:55
Keywords: Agentic Framework, Scientific Benchmark, Benchmark, LLM Evaluation, Text, Multidisciplinary, Expert Curated

Description

SGI-Bench is a scientist-aligned benchmark for evaluating Scientific General Intelligence in large language models across the full inquiry cycle of Deliberation, Conception, Action, and Perception. The benchmark spans 10 disciplines and contains more than 1,000 expert-curated samples inspired by Science's 125 Big Questions. It was created by InternScience and last updated on HuggingFace on 2 June 2026.

Use Cases

Benchmarking LLM performance on scientific deliberation tasks within the benchmark's inquiry cycle.
Evaluating model conception abilities on multidisciplinary problems across the 10 disciplines.
Testing agentic action and perception workflows in a scientific context.

Strengths

More than 1,000 expert-curated samples provide a substantial evaluation corpus.
Benchmark spans 10 distinct scientific disciplines for broad coverage.
Evaluation framework is structured around the scientist-aligned Deliberation, Conception, Action, and Perception cycle.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is not reported (the description states only "more than 1,000" samples), which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
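
Because column-level documentation is absent, a first pass after download is usually to tabulate which fields appear, what types they hold, and how often they are populated. The following is a minimal sketch of that step using only the Python standard library. The helper `summarize_fields` and the sample records are hypothetical and invented for illustration; they do not reflect the actual SGI-Bench schema, which must be checked against the downloaded files.

```python
from collections import defaultdict


def summarize_fields(records):
    """Summarize each field seen in a list of dict records.

    Returns {field: {"types": [...], "non_null": int, "example": value}}.
    """
    summary = defaultdict(lambda: {"types": set(), "non_null": 0, "example": None})
    for record in records:
        for field, value in record.items():
            info = summary[field]
            # Track every Python type observed for this field.
            info["types"].add(type(value).__name__)
            if value is not None:
                # Keep the first non-null value as a representative example.
                if info["non_null"] == 0:
                    info["example"] = value
                info["non_null"] += 1
    return {
        field: {
            "types": sorted(info["types"]),
            "non_null": info["non_null"],
            "example": info["example"],
        }
        for field, info in summary.items()
    }


# Hypothetical records, invented for illustration only;
# field names are assumptions, not the real SGI-Bench columns.
sample_records = [
    {"question": "placeholder question text", "discipline": "physics", "answer": None},
    {"question": "another placeholder", "discipline": "chemistry", "answer": "42"},
]

field_summary = summarize_fields(sample_records)
for field, info in field_summary.items():
    print(field, info)
```

Fields that show mixed types or a low non-null count are the ones most worth inspecting by hand before relying on them in an evaluation.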

Provenance

Source: InternScience on HuggingFace
Collection Method: Expert-curated samples inspired by Science's 125 Big Questions.
Freshness: Last updated 2026-06-02 12:05:10; freshness should be verified.

Quality Score

Overall: D38
Description: 42
Source: 39
Reputation: 41
Access: 22

Community

Downloads: 54
Likes: 1
Views: 0

Dataset Info

Author: InternScience
Created: Mar 25, 2026
Updated: Jun 2, 2026
Last synced: Jun 26, 2026
