Description
Planktonzilla-17M is a large-scale dataset combining 17 million plankton images from all publicly available, labeled sources. It integrates imagery from systems including FlowCAM, ISIIS, UVP5/UVP6, ZooScan, ZooCAM, PlanktoScope, and IFCB. The dataset was created by project-oceania and was last updated on Hugging Face in May 2026.
Use Cases
Train plankton identification models on the 17 million labeled images.
Develop classification algorithms that stay robust across the imaging systems listed in the description.
Conduct comparative studies of plankton populations across varied oceanographic environments.
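One way to check robustness across imaging systems is to hold out one system at a time and evaluate on it. The sketch below is only an illustration under stated assumptions: it presumes each record is a dict carrying an `instrument` field, which is a hypothetical name because the dataset's columns are undocumented, and `leave_one_group_out` is a helper written for this example, not part of the dataset.

```python
from collections import defaultdict


def leave_one_group_out(records, key="instrument"):
    """Yield (held_out, train, test), holding out one group per round.

    `key` names the field that identifies the imaging system. The default
    "instrument" is an assumed field name, not a documented column.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    for held_out in sorted(groups):
        # Train on every other group; test on the held-out group only.
        train = [r for name in sorted(groups) if name != held_out
                 for r in groups[name]]
        yield held_out, train, groups[held_out]


# Hypothetical records for illustration; real field names may differ.
sample = [
    {"instrument": "ZooScan", "label": "copepod"},
    {"instrument": "IFCB", "label": "diatom"},
    {"instrument": "ZooScan", "label": "diatom"},
]
for held_out, train, test in leave_one_group_out(sample):
    print(held_out, len(train), len(test))
```

A model that scores well on its training systems but poorly on the held-out one is probably keying on instrument-specific artifacts rather than on the organisms.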
Strengths
Contains 17 million images, providing a large-scale resource for model training.
Integrates data from multiple imaging systems (e.g., FlowCAM, ISIIS, UVP5/UVP6), suggesting diversity in data capture.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The exact row count is not reported, which may limit suitability assessment; the 17 million figure comes from the description.
The dataset's license is unspecified, which could restrict usage.
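Because field semantics are undocumented, a first step after download could be to scan a small sample of records and list which fields exist and which value types they hold. This is a minimal sketch that assumes records arrive as an iterable of dicts; `summarize_fields` and the sample field names are hypothetical and do not describe the dataset's actual schema.

```python
from collections import defaultdict
from itertools import islice


def summarize_fields(records, limit=100):
    """Map each field name to the sorted type names seen in the first `limit` records."""
    seen = defaultdict(set)
    for record in islice(records, limit):
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical records; the real field names are undocumented and may differ.
sample = [
    {"image": b"\x89PNG", "label": "copepod", "source": "ZooScan"},
    {"image": b"\x89PNG", "label": None, "source": "IFCB"},
]
print(summarize_fields(sample))
```

A field that reports more than one type, such as a label that is sometimes missing, deserves a closer look before training.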
Provenance
Source
project-oceania on Hugging Face
Collection Method
Combined from all publicly available, labeled plankton datasets.
Freshness
Last updated 2026-05-27 17:45:56; freshness should be verified.
License is unknown; users must verify permissions before use.