Multimodal & LLM

Humanity's Last Exam: 2,500 Multi-Modal Frontier Knowledge Questions

Name: Humanity's Last Exam: 2,500 Multi-Modal Frontier Knowledge Questions
Creator: cais
Published: 2025-01-23T08:24:27
Keywords: Size category: 1K < n < 10K; Benchmark: official; Modalities: text, image; Libraries: datasets, pandas, polars, mlcroissant; Format: Parquet; Region: US; License: MIT

Description

Humanity's Last Exam (HLE) is a multi-modal benchmark of 2,500 questions spanning dozens of academic subjects, released by the Center for AI Safety and Scale AI in January 2025. It is a frontier-level evaluation suite of closed-ended questions, designed to test models at the limits of human knowledge.

Use Cases

Evaluating Large Language Model (LLM) performance on frontier-level academic questions
Testing multi-modal reasoning capabilities using the image-based question components
Benchmarking domain-specific expertise across dozens of specialized subjects
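
As a rough illustration of the evaluation use cases above, the sketch below scores model answers against closed-ended reference answers by normalized exact match, and reports accuracy separately for text-only and image-based questions. The field names ("id", "answer", "image"), the toy records, and the exact-match rule are all assumptions made for this example; the benchmark's actual schema and grading procedure may differ.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def score(records, predictions):
    """Return exact-match accuracy overall and per modality.

    records: list of dicts with hypothetical keys "id", "answer", "image"
             ("image" is None or empty for text-only questions).
    predictions: dict mapping a record id to the model's answer string.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        modality = "image" if rec.get("image") else "text"
        pred = predictions.get(rec["id"], "")  # a missing prediction counts as wrong
        hit = normalize(pred) == normalize(rec["answer"])
        for key in ("overall", modality):
            total[key] += 1
            correct[key] += int(hit)
    return {key: correct[key] / total[key] for key in total}


# Toy records for illustration only; these are not real benchmark questions.
records = [
    {"id": "q1", "question": "2 + 2 = ?", "answer": "4", "image": None},
    {"id": "q2", "question": "Capital of France?", "answer": "Paris", "image": None},
    {"id": "q3", "question": "What shape is shown?", "answer": "triangle", "image": "shape.png"},
]
predictions = {"q1": "4", "q2": " paris ", "q3": "square"}
result = score(records, predictions)
```

Exact match is the simplest rule that fits a closed-ended format. It will undercount answers that are correct but phrased differently, so a real evaluation would likely need a more tolerant comparison.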

Strengths

2,500 expert-level questions
Multi-modal support including both text and image modalities
Broad subject coverage across dozens of academic fields
MIT licensed (the authors separately ask that the data not be redistributed)

Limitations

Small sample size of 2,500 records limits its use for model training
Closed-ended format may not evaluate open-ended generative reasoning
High risk of benchmark contamination if redistributed against author requests

Provenance

Source: Center for AI Safety & Scale AI
Freshness: Last updated January 20, 2026.
Geography: United States

The authors explicitly request that users do not publicly share, re-upload, or distribute the dataset to protect benchmark integrity. It is provided in Parquet format.

Quality Score

C40

Community

43.7K downloads

756 likes

0 views

Dataset Info

Author: cais
Created: Jan 23, 2025
Updated: Jan 20, 2026
Last synced: Jul 24, 2026
