Description
InfoBayAI created a large-scale collection of CT scan reports without clinical findings, containing data from 32,631 patients and 7,946,296 medical images. The dataset is designed to support the development of healthcare AI and diagnostic systems, capturing authentic imaging characteristics. It was last updated on June 2, 2026.
Use Cases
Train NLP models to parse structured and unstructured clinical text based on the report descriptions.
Develop medical imaging AI systems using the large collection of 7.9 million CT images.
Benchmark diagnostic AI models on data that captures scanner variability and acquisition protocols.
Study the characteristics of medical reports that lack clinical findings for anomaly detection research.
Strengths
Large scale with data from 32,631 individual patients.
Contains 7,946,296 medical images (about 244 per patient on average), providing substantial volume for model training.
Designed to capture authentic imaging characteristics such as scanner variability and patient positioning.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The description metadata is limited; actual data quality requires manual inspection after download.
The row count for the text reports is unknown, which makes suitability harder to assess before download.
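Because column documentation and row counts are missing, a first pass after download could be a simple column profile. The sketch below is one possible approach, not a documented workflow for this dataset: it assumes the reports arrive as a CSV file with a header row (the actual file format is unknown), and the column names in the demo (patient_id, report_text, slice_count) are hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict


def infer_type(value: str) -> str:
    """Guess a coarse type ('empty', 'int', 'float', 'text') for one raw cell."""
    text = value.strip()
    if text == "":
        return "empty"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "text"


def profile_columns(csv_file, max_samples=3):
    """Return (row_count, profile) for a CSV file object with a header row.

    profile maps each column name to a tally of inferred types and a few
    distinct non-empty sample values, as a starting point for guessing
    what each field means.
    """
    reader = csv.DictReader(csv_file)
    profile = {
        name: {"types": defaultdict(int), "samples": []}
        for name in (reader.fieldnames or [])
    }
    row_count = 0
    for row in reader:
        row_count += 1
        for name, info in profile.items():
            value = row.get(name) or ""  # short rows yield None
            kind = infer_type(value)
            info["types"][kind] += 1
            is_new = value not in info["samples"]
            if kind != "empty" and is_new and len(info["samples"]) < max_samples:
                info["samples"].append(value)
    return row_count, profile


# Demo on in-memory data; the column names here are hypothetical.
demo = io.StringIO(
    "patient_id,report_text,slice_count\n"
    "P001,No acute abnormality.,212\n"
    "P002,,305\n"
)
rows, profile = profile_columns(demo)
print("rows:", rows)
for name, info in profile.items():
    print(name, dict(info["types"]), info["samples"])
```

For a real file, the same function can be called on an open handle, for example open(path, newline="", encoding="utf-8"), where path is wherever the download was saved. A high share of empty cells in a column is one quick signal of the data-quality issues that otherwise need manual inspection.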
Provenance
Source
InfoBayAI
Freshness
Last updated 2026-06-02 05:45:20; freshness should be verified.
License is unknown; restrictions must be verified before use.