Description
Approximately 8.5 million 512x512 pixel JPEG cutouts of galaxies, each centered on its source, drawn from the DESI Legacy Survey Data Release 8. The dataset, created by Smith42, is split 98% training, 1% validation, and 1% test, and was last updated in September 2025. It also includes metadata describing galaxy properties.
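For capacity planning, the stated image size and count give a rough upper bound on the decoded (in-memory) footprint. The sketch below assumes 3 color channels at 8 bits each, which the card does not state. The compressed JPEG size on disk will be smaller and cannot be derived from these figures.

```python
# Back-of-the-envelope decoded size for the cutouts described above.
WIDTH = 512                 # pixels, from the description
HEIGHT = 512                # pixels, from the description
CHANNELS = 3                # ASSUMPTION: RGB; not stated in the card
BYTES_PER_SAMPLE = 1        # ASSUMPTION: 8-bit samples, typical for JPEG
APPROX_IMAGES = 8_500_000   # approximate count from the description

bytes_per_image = WIDTH * HEIGHT * CHANNELS * BYTES_PER_SAMPLE
total_bytes = bytes_per_image * APPROX_IMAGES

print(f"{bytes_per_image:,} bytes per decoded image")
print(f"{total_bytes / 1e12:.2f} TB decoded in total")
```

Under these assumptions a decoded image takes 786,432 bytes, so the full set would not fit in memory on typical hardware and would need to be streamed or processed in batches.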
Use Cases
Train galaxy classification models on the approximately 8.5 million image cutouts; any labels would have to be derived from the accompanying metadata.
Validate astronomical image processing algorithms using the dedicated 1% validation set.
Test model generalization on unseen galaxy images using the held-out 1% test set.
Conduct multi-modal analysis by combining the galaxy images with the provided metadata on galaxy properties.
Strengths
Large scale with approximately 8.5 million individual galaxy images.
Structured split with 98% for training, 1% for validation, and 1% for testing.
Includes accompanying metadata with galaxy properties for multi-modal analysis.
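To make the split concrete, the approximate number of images per partition can be worked out from the stated percentages. This is a small sketch using the approximate total of 8.5 million; the helper name `approx_split_sizes` is illustrative, and the real partition sizes may differ because the exact total is not given.

```python
def approx_split_sizes(total, percents):
    """Return approximate item counts per split from integer percentages."""
    return {name: total * pct // 100 for name, pct in percents.items()}


APPROX_TOTAL = 8_500_000  # approximate figure from the description
SPLIT_PERCENTS = {"train": 98, "validation": 1, "test": 1}

split_sizes = approx_split_sizes(APPROX_TOTAL, SPLIT_PERCENTS)
for name, size in split_sizes.items():
    print(f"{name}: ~{size:,} images")
```

Even at 1% each, the validation and test partitions hold on the order of 85,000 images, which is large enough for stable evaluation metrics in most settings.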
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is not stated; only the approximate figure of 8.5 million is given, which may limit suitability assessment.
Data may reflect observational bias inherent to the DESI Legacy Survey's instrumentation and footprint.
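Because column-level documentation is absent, a first pass after download could simply profile each metadata field. The sketch below is one way to do that with the standard library. The function name `summarize_columns` and the sample rows, including their column names, are invented for illustration and do not reflect the dataset's actual schema.

```python
from collections import Counter


def summarize_columns(rows):
    """Per column: counts of observed value types, number of None values, one example."""
    summary = {}
    for row in rows:
        for name, value in row.items():
            info = summary.setdefault(
                name, {"types": Counter(), "missing": 0, "example": None}
            )
            if value is None:
                info["missing"] += 1
                continue
            info["types"][type(value).__name__] += 1
            if info["example"] is None:
                info["example"] = value  # keep the first non-missing value seen
    return summary


# PLACEHOLDER rows: column names and values are hypothetical, not the real schema.
sample_rows = [
    {"object_id": 1, "ra": 150.1, "dec": 2.2, "redshift": 0.12},
    {"object_id": 2, "ra": 150.3, "dec": 2.4, "redshift": None},
]

for name, info in summarize_columns(sample_rows).items():
    print(name, dict(info["types"]), "missing:", info["missing"], "example:", info["example"])
```

In practice the rows would come from whatever format the metadata ships in. The summary only hints at field semantics (type, sparsity, a sample value); units and definitions still need to be confirmed against the survey's own documentation.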
Provenance
Source
DESI Legacy Survey Data Release 8
Collection Method
Galaxy cutouts, each centered on its source, likely extracted from the survey's imaging.
Freshness
Last updated 2025-09-19 02:13:58; freshness should be verified.
License is unknown; terms of use must be verified before application.