SGI-Bench: Scientist-Aligned Benchmark for Evaluating LLMs Across 10 Disciplines

Name: SGI-Bench: Scientist-Aligned Benchmark for Evaluating LLMs Across 10 Disciplines
Creator: InternScience
Published: 2025-12-03T04:14:48
Keywords: Agentic Framework, Scientific Benchmark, Benchmark, LLM Evaluation, Text, Multidisciplinary

Description

SGI-Bench is a benchmark for evaluating Scientific General Intelligence in large language models. It contains more than 1,000 expert-curated samples spanning 10 scientific disciplines, aligned with the full scientific inquiry cycle of Deliberation, Conception, Action, and Perception. The dataset was created by InternScience and last updated on 2026-06-02.

Use Cases

Benchmarking LLM performance on scientific inquiry workflows across the Deliberation, Conception, Action, and Perception cycle.
Training or fine-tuning models for scientific reasoning using the expert-curated samples from the 10 disciplines.
Developing agentic evaluation frameworks for AI that follow the scientist-aligned methodology.

Strengths

More than 1,000 expert-curated samples provide a substantial evaluation corpus.
Benchmark spans 10 distinct scientific disciplines for broad coverage.
Evaluation framework is designed to assess the full agentic inquiry cycle.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is not reported (the description states only "more than 1,000" samples), which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
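
Because column documentation and an exact row count are missing, a first inspection pass after download could look something like the sketch below. It assumes the samples have already been loaded as a list of dictionaries; the field names in the example records (question, discipline, answer) are hypothetical placeholders, not the dataset's actual columns.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Count rows and report, per field, the observed value types and how often the field is present."""
    type_counts = defaultdict(Counter)
    total = 0
    for record in records:
        total += 1
        for field, value in record.items():
            type_counts[field][type(value).__name__] += 1

    summary = {}
    for field, counts in type_counts.items():
        present = sum(counts.values())
        summary[field] = {
            "types": dict(counts),
            "present_in": present,
            "missing_in": total - present,
        }
    return total, summary


# Hypothetical records standing in for downloaded samples (assumed field names).
sample = [
    {"question": "placeholder question A", "discipline": "physics", "answer": 3.2},
    {"question": "placeholder question B", "discipline": "chemistry"},
]

rows, fields = summarize_fields(sample)
print(rows)
for name, info in fields.items():
    print(name, info)
```

A summary like this only shows which fields exist and what types they hold; what each field means still has to be judged by reading individual samples.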

Provenance

Source: InternScience via Hugging Face.
Collection Method: Expert-curated samples inspired by Science's 125 Big Questions.
Freshness: Last updated 2026-06-02 12:01:45.

License is unknown; users should verify terms before use.

Quality Score

Overall: C40
Description: 42
Source: 39
Reputation: 50
Access: 22

Community

1.3K downloads
7 likes
0 views

Dataset Info

Author: InternScience
Created: Dec 3, 2025
Updated: Jun 2, 2026
Last synced: Jun 26, 2026
