BatteryPass-12K: 12,000 Synthetic Samples for Digital Battery Passport Conformance
by Tosin Adewumi · Updated 1mo ago
23.6 MB · 1 file
Available on 1 platform
Description
BatteryPass-12K is the first public benchmark for digital battery passport conformance tasks. It contains 12,000 synthetic data points, generated with ChatGPT-5.1 Thinking from 6 pilot samples provided by the Global Battery Alliance (GBA), and is split into training, validation, and test sets. The dataset was authored by Tosin Adewumi and last updated in April 2026.
Use Cases
Training binary classifiers to distinguish conformant from nonconformant battery passport data.
Benchmarking generative AI models on structured data synthesis tasks for regulatory documents.
Developing validation pipelines for automated conformance checking systems.
Studying the characteristics of synthetic data generated from a small set of pilot samples.
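For the binary classification use case, held-out predictions are typically scored with accuracy, precision, recall, and F1. The following is a minimal sketch in plain Python. It assumes labels are encoded as 1 for conformant and 0 for nonconformant, which the listing does not confirm, and the example labels are invented for illustration rather than taken from the dataset.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Fall back to 0.0 when a denominator is zero (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = conformant, 0 = nonconformant (assumed encoding).
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters if the two classes turn out to be imbalanced, which the listing does not state either way.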
Strengths
Contains 12,000 data points, providing a substantial corpus for model training.
Includes a predefined 80:10:10 split for training, validation, and test sets.
Data is generated from 6 pilot samples provided by the authoritative Global Battery Alliance.
Released under a permissive CC-BY-4.0 license.
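As a quick sanity check on the split, the expected partition sizes can be computed from the stated total and ratio. This sketch assumes the 80:10:10 ratio applies exactly to the 12,000 data points; the listing does not report the per-split counts, so the result is an expectation to compare against the downloaded file, not a confirmed figure.

```python
def split_sizes(total, ratios=(80, 10, 10)):
    """Return integer split sizes for a ratio; any rounding remainder goes to the first split."""
    denom = sum(ratios)
    sizes = [total * r // denom for r in ratios]
    sizes[0] += total - sum(sizes)  # keep the sizes summing to the total
    return tuple(sizes)


train, val, test = split_sizes(12_000)
print(train, val, test)  # 9600 1200 1200
```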
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
All data is synthetically generated, which may not fully capture real-world variance and complexity.
Row count is not reported in the listing metadata; the 12,000 figure comes from the description and should be confirmed after download.
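Because column documentation is missing, a first pass after download is usually to count the records and list which fields appear and what value types they hold. The sketch below assumes the file has already been parsed into a list of dictionaries. The file format is not stated in the listing, and the field names in the example are hypothetical.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted type names observed for its values."""
    seen = defaultdict(set)
    for rec in records:
        for key, value in rec.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical records; these field names are invented for illustration.
sample = [
    {"passport_id": "A1", "capacity_kwh": 75.0, "label": 1},
    {"passport_id": "A2", "capacity_kwh": None, "label": 0},
]

print(len(sample))               # number of records loaded
print(summarize_fields(sample))  # field name -> observed value types
```

A field that shows more than one type, such as a float mixed with `NoneType`, is a hint that missing values need handling before training.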
Provenance
Source
Generated from 6 pilot samples by the Global Battery Alliance (GBA).
Collection Method
Synthetic data generation using ChatGPT-5.1 Thinking (Standard).
Time Range
Not specified.
Freshness
Last updated 2026-04-22 20:16:25; freshness should be verified.