Name: LLM Responses on Under-5 Mortality Queries Evaluated by Pediatricians
Creator: Yi Yang
Published: 2026-05-05T05:23:58
License: CC-BY-4.0
Keywords: Child Health, Evaluation, Medical Information, Benchmark, Healthcare, Text, Large Language Models, Public Health

Description

This figshare dataset by Yi Yang contains the evaluation results of four large language models' responses to 25 public queries about the top five causes of under-5 mortality. It includes scores for reliability, accuracy, completeness, comprehensibility, readability, and actionability, assessed with instruments such as DISCERN, Likert scales, and PEMAT-P. It was last updated on May 5, 2026.

Use Cases

Benchmark LLM performance on child health information based on DISCERN reliability scores.
Compare readability of AI-generated health advice based on Flesch-Kincaid Grade Level scores.
Analyze the actionability of LLM responses for public guidance based on PEMAT-P scores.
Identify strengths and weaknesses of specific LLMs (ChatGPT-4.0, Claude 3.5 Sonnet, Bing AI, Gemini) across multiple evaluation metrics.
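
For the readability use case, it can help to recompute a grade level on any response text you hold yourself. The listing does not say which tool produced the dataset's Flesch-Kincaid scores, so the following is only a minimal sketch of the standard formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names and the vowel-group syllable heuristic are assumptions made for illustration; dedicated readability tools count syllables more carefully and may give slightly different values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula applied to raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fk_grade(text: str) -> float:
    """Estimate the grade level of a text using simple regex tokenization."""
    # Sentences are split on runs of ., ! or ?; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)
```

Calling `fk_grade(response_text)` on each model's answer gives a number that can be compared across models, but because of the heuristic syllable count it should be treated as an approximation rather than a reproduction of the published scores.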

Strengths

Evaluations are based on 25 representative public queries derived from Google Trends.
Responses were independently scored by four pediatricians using established instruments.
Performance differences among the four LLMs were tested for statistical significance (p < 0.05).

Limitations

The dataset is a 24.2 KB DOCX file, suggesting limited scope and likely containing summary results rather than raw data.
Column-level documentation is absent; field semantics must be inferred after download.
The data reflects a specific evaluation study; its applicability to other health topics or LLMs is unknown.
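
Because the file is a DOCX without column-level documentation, a first step after download is usually to dump any tables it contains and inspect the headers by hand. The sketch below uses only the Python standard library and relies on the fact that a DOCX file is a ZIP archive whose main body lives in `word/document.xml`. Whether this particular file stores its results in tables at all is an assumption, and the function name `read_docx_tables` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_tables(path_or_file):
    """Return each table as a list of rows; each row is a list of cell strings.

    Note: tables nested inside other tables are not handled specially and
    would appear both inside their parent and as separate entries.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W_NS + "tbl"):
        rows = []
        for tr in tbl.iter(W_NS + "tr"):
            cells = []
            for tc in tr.iter(W_NS + "tc"):
                # Join all text runs found inside the cell.
                text = "".join(t.text or "" for t in tc.iter(W_NS + "t"))
                cells.append(text.strip())
            rows.append(cells)
        tables.append(rows)
    return tables
```

Pass the path of the downloaded file to `read_docx_tables` and print the first row of each table to see the column headers. If the function returns an empty list, the results are presumably stored as running text rather than tables and will need to be read manually.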

Provenance

Source: Yi Yang
Collection Method: LLM responses were collected and evaluated by pediatricians using standardized tools.
Freshness: Last updated 2026-05-05 05:23:58; check the source page for any later revisions.

License is CC-BY-4.0. The file format is DOCX, which requires a word processor such as Microsoft Word or LibreOffice Writer, or a DOCX-parsing library, to open.
