This dataset of 12,000 data points is the first public benchmark for digital battery passport conformance. It was created by Oluwatosin Adewumi, who used ChatGPT-5.1 Thinking to generate synthetic samples from 6 pilot samples provided by the Global Battery Alliance (GBA). It was last updated on May 1, 2026, and is split into training, validation, and test sets in an 80:10:10 ratio.
Use Cases
- Train classification models to distinguish between conformant and nonconformant battery passport data.
- Benchmark model performance on the synthetic conformance classification task.
- Develop and validate automated systems for checking digital product passport adherence.
- Analyze patterns in synthetic nonconformant data generated from pilot samples.
Strengths
- Contains 12,000 data points, providing a substantial benchmark size.
- Includes a balanced split of 1,000 conformant and 1,000 nonconformant synthetic samples per original pilot sample.
- Features a predefined 80:10:10 split for training, validation, and testing.
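The figures above can be cross-checked with a little arithmetic. The sketch below recomputes the total from the per-pilot counts and derives the split sizes that an 80:10:10 ratio would imply. The card does not state exact per-split counts, so these derived numbers are an assumption to verify against the downloaded files, and the helper name `split_sizes` is hypothetical.

```python
# Figures taken from the card; the helper below is a hypothetical illustration.
PILOT_SAMPLES = 6
CONFORMANT_PER_PILOT = 1_000
NONCONFORMANT_PER_PILOT = 1_000
SPLIT_RATIO = {"train": 80, "validation": 10, "test": 10}


def split_sizes(total, ratio):
    """Return the row count each split would have for an integer ratio."""
    parts = sum(ratio.values())
    return {name: total * share // parts for name, share in ratio.items()}


total = PILOT_SAMPLES * (CONFORMANT_PER_PILOT + NONCONFORMANT_PER_PILOT)
sizes = split_sizes(total, SPLIT_RATIO)

print(total)  # 12000
print(sizes)  # {'train': 9600, 'validation': 1200, 'test': 1200}
```

If the actual files differ from these counts, the mismatch is worth noting before reporting benchmark results.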
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Data is synthetically generated, which may not fully capture real-world variance and edge cases.
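Because column-level documentation is absent, a first pass over the downloaded rows might simply profile each field. The sketch below is one possible approach using only the standard library. It assumes the rows have already been loaded as a list of dictionaries, and the field names in the placeholder rows are invented for illustration; they are not the dataset's actual columns.

```python
from collections import defaultdict


def profile_fields(records):
    """Summarise each field: observed value types, missing count, a few examples."""
    profile = defaultdict(lambda: {"types": set(), "missing": 0, "examples": []})
    all_fields = {key for record in records for key in record}
    for record in records:
        for field in all_fields:
            value = record.get(field)
            entry = profile[field]
            if value is None:
                # Count absent keys and explicit nulls as missing.
                entry["missing"] += 1
                continue
            entry["types"].add(type(value).__name__)
            # Keep up to three distinct example values per field.
            if value not in entry["examples"] and len(entry["examples"]) < 3:
                entry["examples"].append(value)
    return dict(profile)


# Placeholder rows: field names are hypothetical, not taken from the dataset.
rows = [
    {"passport_text": "example A", "label": 1},
    {"passport_text": "example B", "label": 0},
    {"passport_text": None, "label": 1},
]
summary = profile_fields(rows)
for field, info in sorted(summary.items()):
    print(field, sorted(info["types"]), info["missing"], info["examples"])
```

A summary like this only hints at field semantics; any inferred meaning should be treated as provisional until confirmed by the dataset author.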
Provenance
- Source: Generated from 6 pilot samples from the Global Battery Alliance (GBA).
- Collection Method: Synthetic data generation using ChatGPT-5.1 Thinking (Standard).
- Time Range: Not specified.
- Freshness: Last updated 2026-05-01 20:55:51; freshness should be verified.
- Geography: Not specified.