Name: GPT-4 Annotated Severity Scores for Human Phenotype Ontology Abnormalities
Creator: Kitty B. Murphy
Published: 2026-05-21T05:44:10
License: CC-BY-4.0
Keywords: CSV, Rare Diseases, Human Phenotype Ontology, Bioinformatics, Healthcare, Tabular, Large Scale, Clinical Severity, LLM Annotation, Synthetic

Description

GPT-4 was used to annotate the clinical severity of more than 17,500 phenotypic abnormalities in the Human Phenotype Ontology (HPO) across nine clinical characteristics. Benchmarked against ground-truth labels, the annotations reached a mean true positive recall of 97%. The dataset, created by Kitty B. Murphy and last updated in May 2026, provides quantitative severity metrics for prioritizing therapeutic targets in rare diseases.

Use Cases

Ranking phenotypic abnormalities by health impact based on the integrated severity scoring system.
Benchmarking LLM performance on clinical annotation tasks using the provided ground-truth comparisons.
Prioritizing gene therapy targets for rare diseases based on automated severity metrics.
Extending the annotation framework to additional clinical dimensions beyond the nine characteristics annotated here.
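
As a rough illustration of the ranking use case, the sketch below folds per-characteristic frequency annotations into a single score and sorts phenotypes by it. The characteristic names, the four frequency labels, the equal weights, and the example rows are all hypothetical placeholders; they are not the dataset's schema or the authors' scoring formula.

```python
# Hypothetical numeric values for four frequency levels (labels are assumed).
FREQUENCY_VALUES = {"never": 0, "rarely": 1, "often": 2, "always": 3}

# Hypothetical equal weights for three placeholder characteristics; the real
# dataset uses nine characteristics whose names and weights are not given here.
WEIGHTS = {"characteristic_a": 1.0, "characteristic_b": 1.0, "characteristic_c": 1.0}


def severity_score(annotations, weights=WEIGHTS):
    """Return a 0-100 score: weighted frequency sum divided by its maximum."""
    max_value = max(FREQUENCY_VALUES.values())
    total = sum(weights[name] * FREQUENCY_VALUES[annotations[name]] for name in weights)
    return 100.0 * total / (max_value * sum(weights.values()))


def rank_by_severity(rows):
    """Sort (phenotype_id, annotations) pairs from most to least severe."""
    scored = [(phenotype_id, severity_score(ann)) for phenotype_id, ann in rows]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up rows for illustration only.
example_rows = [
    ("PHENOTYPE_1", {"characteristic_a": "rarely", "characteristic_b": "never", "characteristic_c": "never"}),
    ("PHENOTYPE_2", {"characteristic_a": "always", "characteristic_b": "often", "characteristic_c": "rarely"}),
]
ranking = rank_by_severity(example_rows)
```

Normalizing by the maximum attainable sum keeps scores comparable if the weights are later changed; other aggregation schemes are equally plausible.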

Strengths

Annotations cover over 17,500 phenotypic abnormalities across more than 8,600 rare diseases.
Benchmarking showed strong performance with true positive recall rates ranging from 89% to 100%.
Severity is operationalized using nine specific clinical characteristics and four frequency levels.
Dataset is openly licensed under CC-BY-4.0.
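
For readers who want to reproduce this kind of benchmark figure, true positive recall is conventionally TP / (TP + FN): the share of ground-truth positives that the annotator also marked positive. The sketch below computes it per characteristic and then takes an unweighted mean. Whether the reported 97% is an unweighted mean is an assumption, and the characteristic names and toy labels are invented.

```python
def true_positive_recall(ground_truth, predicted):
    """Fraction of ground-truth positives that were also predicted positive."""
    positives = [i for i, truth in enumerate(ground_truth) if truth]
    if not positives:
        raise ValueError("recall is undefined without ground-truth positives")
    hits = sum(1 for i in positives if predicted[i])
    return hits / len(positives)


def mean_recall(per_characteristic):
    """Unweighted mean of recall across characteristics (an assumed aggregation)."""
    recalls = [true_positive_recall(truth, pred) for truth, pred in per_characteristic.values()]
    return sum(recalls) / len(recalls)


# Toy labels: (ground truth, predicted) per hypothetical characteristic.
toy_labels = {
    "characteristic_a": ([True, True, False, True], [True, True, False, False]),
    "characteristic_b": ([True, False], [True, True]),
}
toy_mean = mean_recall(toy_labels)
```

Recall ignores false positives, so a high value here says nothing about how often the annotator over-assigns a characteristic.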

Limitations

Row count is not reported, so suitability for a given analysis cannot be judged before download.
Column-level documentation is absent; field semantics must be inferred after download.
The 1.8 KB file size suggests a very limited scope, likely containing summary scores rather than raw annotations.
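
Because the row count and column semantics are undocumented, a first step after download is to inspect the file directly. The following is a minimal sketch using only the Python standard library; the function name and the filename in the usage comment are placeholders, and it assumes the file is a comma-separated CSV with a header row.

```python
import csv
import io


def summarize_csv(text):
    """Return the row count, column names, and up to three sample values per column."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    samples = {name: [] for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            if len(samples[name]) < 3:
                samples[name].append(row[name])
    return {"rows": row_count, "columns": columns, "samples": samples}


# Usage with a downloaded file (the filename is a placeholder):
# with open("severity_scores.csv", newline="", encoding="utf-8") as handle:
#     print(summarize_csv(handle.read()))
```

Reading the whole file into memory is reasonable at 1.8 KB; a larger file would call for streaming the rows instead.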

Provenance

Source: figshare, authored by Kitty B. Murphy.
Collection Method: GPT-4 was employed to annotate phenotypic severity, with outputs benchmarked against ground-truth labels embedded in the HPO.
Freshness: Last updated 2026-05-21 05:44:10; check the figshare record for newer versions before use.
