Description
70,000 synthetic human face images generated by the stratum-hq tool. The dataset includes multiple annotation layers such as captions, depth maps, normals, pose, segmentation, and embeddings from models like DINOv3 and T5. It was created by author 'timlawrenz' and last updated on the platform in May 2026.
Use Cases
Training generative adversarial networks (GANs) on the 70,000 synthetic face images.
Developing 3D-aware models using the depth and normal map layers provided for each image.
Conducting pose estimation research based on the pose annotation layer for all 70,000 images.
Training or evaluating segmentation models using the semantic segmentation (seg) layer included.
Exploring multi-modal alignment using the paired image-caption data and T5/DINOv3 embeddings.
Strengths
Contains 70,000 images, providing a substantial scale for model training.
Offers seven distinct annotation layers (captions, depth, normals, pose, segmentation, and DINOv3 and T5 embeddings), enabling diverse analysis.
Every layer covers the full set of 70,000 images except the DINOv3 embeddings, which cover only 19,700.
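The coverage figures above can be turned into a quick sanity check. The sketch below uses only the counts stated in this card; the layer names are illustrative labels (only "seg" appears in the card itself), not confirmed column names.

```python
# Per-layer image counts as stated in this card.
# Layer names are illustrative labels, not confirmed column names.
TOTAL_IMAGES = 70_000
layer_counts = {
    "caption": 70_000,
    "depth": 70_000,
    "normal": 70_000,
    "pose": 70_000,
    "seg": 70_000,
    "t5": 70_000,
    "dinov3": 19_700,
}


def coverage(counts, total):
    """Return the fraction of images covered by each layer."""
    return {name: n / total for name, n in counts.items()}


def partial_layers(counts, total):
    """Return the sorted names of layers that do not cover every image."""
    return sorted(name for name, n in counts.items() if n < total)


cov = coverage(layer_counts, TOTAL_IMAGES)
for name, frac in cov.items():
    print(f"{name:8s} {frac:6.1%}")
print("partial:", partial_layers(layer_counts, TOTAL_IMAGES))
```

Experiments that depend on DINOv3 embeddings are therefore limited to roughly 28% of the images, so any split or sampling scheme should be built from that subset rather than from the full 70,000.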
Limitations
Description metadata is limited; actual data quality can only be assessed by manual inspection after download.
Column-level documentation is absent; field semantics must be inferred from file formats or external sources.
Data may reflect biases inherent in the generation model and in the source distribution of the original FFHQ dataset.
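Because column-level documentation is absent, a first inspection pass usually means summarizing the type and shape of each field in a sample record. The following is a minimal sketch of such a helper; the record layout and field names (`caption`, `pose`, `dinov3`) are hypothetical placeholders, since the actual schema is not documented in this card.

```python
def describe_value(value):
    """Return a short type/shape summary for one field value."""
    if isinstance(value, str):
        return f"str(len={len(value)})"
    if isinstance(value, (bytes, bytearray)):
        return f"bytes(len={len(value)})"
    if isinstance(value, (list, tuple)):
        # Walk down the first element at each level to collect dimensions.
        dims = []
        cur = value
        while isinstance(cur, (list, tuple)):
            dims.append(len(cur))
            if not cur:
                break
            cur = cur[0]
        leaf = "empty" if isinstance(cur, (list, tuple)) else type(cur).__name__
        return f"list[{'x'.join(map(str, dims))}] of {leaf}"
    return type(value).__name__


def summarize_fields(record):
    """Map each field name in a record (a dict) to its summary string."""
    return {name: describe_value(value) for name, value in record.items()}


# Hypothetical sample record; real field names and shapes may differ.
sample = {
    "caption": "a synthetic face",
    "pose": [[0.1, 0.2], [0.3, 0.4]],
    "dinov3": None,  # e.g. an image without a DINOv3 embedding
}

for name, summary in summarize_fields(sample).items():
    print(f"{name}: {summary}")
```

The helper only looks at the first element of each nested list, so it assumes rectangular arrays; ragged fields would need a fuller check.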
Provenance
Source
huggingface
Collection Method
Generated synthetically by the stratum-hq tool (v0.1.0).
Freshness
Last updated 2026-05-28 06:20:30; freshness should be verified.
License
Unknown, which may restrict commercial or research use.