MAX-EVAL-11: 10,000 MIMIC-III Discharge Summaries with ICD-11 Annotations

Name: MAX-EVAL-11: 10,000 MIMIC-III Discharge Summaries with ICD-11 Annotations
Creator: mas-namtla
Published: 2026-05-17T06:30:27
Keywords: Benchmark, Healthcare, Text, Clinical Text, Medical Coding, Large Scale, ICD-11, MIMIC-III

Description

MAX-EVAL-11 is a large-scale benchmark for evaluating large language models (LLMs) on full-spectrum ICD-11 medical coding. It comprises 10,000 MIMIC-III discharge summaries with expert-validated ICD-11 annotations that cover 99.87% of ICD-11 diagnostic codes. The dataset was created by mas-namtla and was last updated on Hugging Face in May 2026.

Use Cases

Benchmarking LLM performance on ICD-11 coding tasks against the expert-validated annotations
Training models for automated medical code assignment based on clinical discharge summaries
Analyzing the coverage and distribution of ICD-11 diagnostic codes within a large clinical corpus
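
As an illustration of the benchmarking use case, the sketch below scores predicted code sets against reference code sets with micro-averaged precision, recall, and F1, a common choice for multi-label coding tasks. The dataset page does not specify an official metric, so this is an assumption, and the function name `micro_prf`, the document IDs, and the code strings are hypothetical placeholders rather than values from the dataset.

```python
from typing import Dict, Set


def micro_prf(gold: Dict[str, Set[str]], pred: Dict[str, Set[str]]) -> Dict[str, float]:
    """Micro-averaged precision/recall/F1 over per-document code sets."""
    tp = fp = fn = 0
    for doc_id, gold_codes in gold.items():
        # A document with no prediction counts as an empty prediction set.
        pred_codes = pred.get(doc_id, set())
        tp += len(gold_codes & pred_codes)   # codes predicted and correct
        fp += len(pred_codes - gold_codes)   # codes predicted but not in the reference
        fn += len(gold_codes - pred_codes)   # reference codes the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: IDs and code strings are placeholders for illustration only.
gold = {"doc1": {"5A11", "BA00"}, "doc2": {"CA40"}}
pred = {"doc1": {"5A11"}, "doc2": {"CA40", "BA00"}}
scores = micro_prf(gold, pred)
```

Micro-averaging pools counts across all documents, so frequent codes dominate the score; a macro average over codes would weight rare codes more heavily and may be worth reporting alongside it for a label space this large.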

Strengths

10,000 discharge summaries provide a substantial corpus for model evaluation
Expert-validated ICD-11 annotations ensure label quality
Annotations cover 99.87% of the ICD-11 diagnostic code space
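
A coverage figure like the one above can be checked as the share of a reference code list that appears at least once in the annotations. The sketch below shows that calculation under stated assumptions: the function name `code_coverage` is hypothetical, the reference code list would have to be obtained separately, and the toy values are placeholders. How the dataset authors actually computed 99.87% is not documented on this page.

```python
from typing import Iterable, Set


def code_coverage(annotated: Iterable[Iterable[str]], code_space: Set[str]) -> float:
    """Fraction of the reference code space seen at least once in the annotations."""
    seen: Set[str] = set()
    for codes in annotated:
        seen.update(codes)
    if not code_space:
        return 0.0
    # Codes outside the reference space (e.g. typos) are ignored by the intersection.
    return len(seen & code_space) / len(code_space)


# Hypothetical toy data: a four-code reference space and three annotated documents.
code_space = {"A", "B", "C", "D"}
annotations = [["A", "B"], ["B", "C"], ["Z"]]
coverage = code_coverage(annotations, code_space)
```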

Limitations

Dataset is restricted and not publicly available; access requires a request and agreement to PhysioNet MIMIC-III data use terms
Column-level documentation is absent; field semantics must be inferred after download
The row count (10,000) is known, but other structural details, such as file formats, are undocumented
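
Because field semantics have to be inferred after download, a first pass might simply tabulate which fields occur, which Python types they hold, and how often they are populated. The sketch below assumes the records have already been loaded as a list of dictionaries; the loader is omitted because the file format is unknown, and the field names `text` and `codes` are hypothetical stand-ins, not the dataset's actual columns.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Collect observed value types and non-null counts for every field name."""
    summary: Dict[str, Dict[str, Any]] = defaultdict(lambda: {"types": set(), "non_null": 0})
    for record in records:
        for name, value in record.items():
            entry = summary[name]  # creates the entry even when the value is None
            if value is not None:
                entry["types"].add(type(value).__name__)
                entry["non_null"] += 1
    return dict(summary)


# Hypothetical records; real field names must be read from the downloaded files.
records = [
    {"text": "Discharge summary ...", "codes": ["X1", "X2"]},
    {"text": "Another summary ...", "codes": None},
]
summary = summarize_fields(records)
```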

Provenance

Source: MIMIC-III database
Collection Method: Likely extracted and annotated from MIMIC-III discharge summaries
Freshness: Last updated 2026-05-17 07:17:47; freshness should be verified

Access is restricted; researchers must contact Ujjwal Singh at [email protected] and agree to PhysioNet MIMIC-III data use terms.

Quality Score

Overall: D40
Description: 51
Source: 36
Reputation: 35
Access: 26

Community

Likes: 1
Views: 0

Dataset Info

Author: mas-namtla
Created: May 17, 2026
Updated: May 17, 2026
Last synced: May 23, 2026
