COVID-19 Case Surveillance: Patient-Level Clinical and Demographic Data
Updated 1y ago
Available on 1 platform
Description
U.S. COVID-19 case data shared with the CDC, containing 12 elements per patient record. The dataset includes demographics, exposure history, disease severity indicators, outcomes, and underlying medical conditions. Data collection from certain jurisdictions was discontinued in 2023-2024, and the dataset is no longer updated as of July 2024.
Use Cases
Analyze hospitalization and ICU admission rates using `hosp_yn` and `icu_yn` across `age_group` and `sex`.
Model mortality risk using `death_yn` as a target variable with predictors like `medcond_yn` and `age_group`.
Track case progression timelines by comparing `onset_dt`, `pos_spec_dt`, `cdc_case_earliest_dt`, and `cdc_report_dt`.
Examine demographic disparities in case outcomes using `race_ethnicity_combined`, `age_group`, and `current_status`.
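The first and third use cases above can be sketched with the standard library alone. This is a minimal illustration, not the CDC's own method. It assumes that the `_yn` columns are coded as the strings "Yes" and "No" (with anything else treated as uninformative) and that dates are ISO `YYYY-MM-DD` strings; neither is confirmed by this listing, so check the actual export first. The helper names `rate_by_group` and `days_between` are hypothetical, and the sample rows are synthetic.

```python
from collections import defaultdict
from datetime import date

# Assumed value coding (not confirmed by this listing): "Yes"/"No" are
# informative; anything else (e.g. "Unknown", "Missing", blank) is skipped.
YES, NO = "Yes", "No"


def rate_by_group(rows, flag_col, group_cols):
    """Share of 'Yes' among rows with a known Yes/No value, per group."""
    counts = defaultdict(lambda: [0, 0])  # group key -> [yes count, known count]
    for row in rows:
        value = row.get(flag_col)
        if value not in (YES, NO):
            continue  # exclude uninformative values from the denominator
        key = tuple(row.get(col) for col in group_cols)
        counts[key][1] += 1
        if value == YES:
            counts[key][0] += 1
    return {key: yes / known for key, (yes, known) in counts.items()}


def days_between(row, start_col, end_col):
    """Days from start_col to end_col, or None if a date is missing or unparseable."""
    try:
        start = date.fromisoformat(row[start_col])
        end = date.fromisoformat(row[end_col])
    except (KeyError, TypeError, ValueError):
        return None
    return (end - start).days


# Synthetic rows for illustration only; these are not real records.
rows = [
    {"age_group": "60 - 69 Years", "sex": "Female", "hosp_yn": "Yes",
     "onset_dt": "2020-03-01", "cdc_report_dt": "2020-03-05"},
    {"age_group": "60 - 69 Years", "sex": "Female", "hosp_yn": "No",
     "onset_dt": "2020-03-02", "cdc_report_dt": "2020-03-04"},
    {"age_group": "60 - 69 Years", "sex": "Female", "hosp_yn": "Missing",
     "onset_dt": "2020-03-03", "cdc_report_dt": "2020-03-10"},
    {"age_group": "20 - 29 Years", "sex": "Male", "hosp_yn": "No",
     "onset_dt": "", "cdc_report_dt": "2020-03-06"},
]

rates = rate_by_group(rows, "hosp_yn", ["age_group", "sex"])
delays = [days_between(r, "onset_dt", "cdc_report_dt") for r in rows]
```

Dropping uninformative values from the denominator is one choice among several. Counting them as "No" would lower every rate, so the chosen convention should be stated alongside any reported figure.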
Strengths
Contains 12 structured data elements per case, including clinical outcomes and demographics.
Data is sourced from the CDC's national case surveillance system.
Available in multiple machine-readable formats (CSV, JSON, XML, RDF).
Limitations
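Row count is not reported, so it is hard to judge in advance whether the dataset is large enough for a given analysis or small enough to load into memory.
Later records may be geographically skewed because several U.S. states discontinued reporting in 2023-2024; with no state or county field in this file, that skew cannot be measured from the data alone.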
Some data cells are suppressed to protect individual privacy.
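Before modeling, it can help to measure how much of each column is suppressed or otherwise uninformative. The sketch below assumes such cells appear as blanks or as the strings "NA", "Missing", or "Unknown"; this listing does not say how suppression is encoded, so adjust the set to match the actual file. The helper name `uninformative_share` is hypothetical, and the sample rows are synthetic.

```python
from collections import Counter

# Assumed markers for suppressed or uninformative cells (not confirmed
# by this listing); edit to match the values seen in the real export.
UNINFORMATIVE = {"", "NA", "Missing", "Unknown"}


def uninformative_share(rows, columns):
    """Fraction of rows per column whose value is absent or uninformative."""
    totals = Counter()
    blanks = Counter()
    for row in rows:
        for col in columns:
            totals[col] += 1
            value = row.get(col)
            if value is None or value.strip() in UNINFORMATIVE:
                blanks[col] += 1
    return {col: blanks[col] / totals[col] for col in columns if totals[col]}


# Synthetic rows for illustration only; these are not real records.
sample = [
    {"death_yn": "No", "medcond_yn": "Missing"},
    {"death_yn": "Unknown", "medcond_yn": ""},
    {"death_yn": "Yes", "medcond_yn": "Yes"},
    {"death_yn": "No"},  # medcond_yn absent entirely
]

shares = uninformative_share(sample, ["death_yn", "medcond_yn"])
```

A column with a high share is a weak predictor or target unless the missingness itself is modeled.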
Provenance
Source
data.cdc.gov
Collection Method
Aggregated from COVID-19 case reports submitted to the CDC by U.S. jurisdictions.
Freshness
Last modified 2025-02-23 22:52:14. The dataset is no longer updated as of July 2024, so this timestamp likely reflects a metadata change and not new case data.
Geography
United States (with noted jurisdictional reporting gaps)
No geographic data (state/county) is included in this specific public use dataset.