Description
This large-scale collection of CT scan reports with confirmed clinical findings comprises 114,961 patient records and 27,605,231 medical images. Created by InfoBayAI and last updated in June 2026, the dataset captures authentic imaging characteristics such as scanner variability and acquisition protocols. It is designed to support the development of healthcare AI and diagnostic systems.
Use Cases
Training diagnostic AI systems based on confirmed clinical findings.
Developing natural language processing models for clinical narrative understanding.
Studying imaging characteristics like scanner variability and patient positioning.
Benchmarking medical imaging models against a large-scale, multi-patient dataset.
Strengths
Contains data from 114,961 patients, indicating a substantial patient cohort.
Includes 27,605,231 medical images, providing a large volume of imaging data.
Reports contain confirmed clinical findings, which may enhance model training reliability.
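As a rough sense of scale, the two counts above imply a mean number of images per patient. The short sketch below only divides the published totals; how images are actually distributed across patients and studies is not documented, so the figure is an average and nothing more.

```python
# Totals taken from the dataset description.
patients = 114_961
images = 27_605_231

# Mean images per patient; the real per-patient distribution is unknown.
mean_images_per_patient = images / patients
print(f"{mean_images_per_patient:.1f}")  # about 240.1
```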
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The description metadata is limited; actual data quality requires manual inspection after download.
Data may reflect temporal or institutional bias inherent to its unspecified source collection.
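Because column-level documentation is absent, a first pass after download is usually a per-field profile: how often each field is filled, how many distinct values it has, and what typical values look like. The sketch below is a minimal illustration using only the Python standard library. It assumes the records can be loaded as dictionaries; the field names `patient_id`, `report_text`, and `scanner` are hypothetical placeholders, since the dataset's actual schema and file format are not documented.

```python
from collections import Counter


def profile_columns(rows, sample_size=3):
    """Summarize each field in an iterable of dict records."""
    stats = {}
    for row in rows:
        for name, value in row.items():
            col = stats.setdefault(
                name, {"non_empty": 0, "empty": 0, "values": Counter()}
            )
            # Treat None and whitespace-only strings as empty.
            text = "" if value is None else str(value).strip()
            if text:
                col["non_empty"] += 1
                col["values"][text] += 1
            else:
                col["empty"] += 1

    summary = {}
    for name, col in stats.items():
        summary[name] = {
            "non_empty": col["non_empty"],
            "empty": col["empty"],
            "distinct": len(col["values"]),
            # Most frequent values, as a hint to the field's meaning.
            "examples": [v for v, _ in col["values"].most_common(sample_size)],
        }
    return summary


# Hypothetical records; the real field names must be read from the download.
sample = [
    {"patient_id": "P001", "report_text": "No acute findings.", "scanner": "A"},
    {"patient_id": "P002", "report_text": "", "scanner": "A"},
    {"patient_id": "P003", "report_text": "Small nodule, right upper lobe.", "scanner": None},
]

summary = profile_columns(sample)
for name, info in summary.items():
    print(name, info)
```

If the download turns out to be a CSV file, `csv.DictReader` yields rows in the shape this function expects. For long free-text fields such as reports, counting every distinct value can use a lot of memory, so profiling a sample of rows first is a reasonable precaution.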
Provenance
Source
InfoBayAI
Freshness
Last updated 2026-06-02 05:44:48 (time zone not stated); freshness should be verified.
License
Unknown; usage restrictions must be verified before use.