Description
PianoCoRe is a large-scale piano MIDI dataset that unifies and refines major open-source piano corpora. It contains 250,046 performances of 5,625 pieces written by 483 composers, totaling 21,763 hours of performed music. The dataset was created by SyMuPe and was last updated on 2026-04-27.
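The headline figures imply some rough averages that help size the dataset. The sketch below derives them using only the numbers quoted above. These are simple means, and the description does not say how performances are actually distributed across pieces or composers.

```python
# Figures quoted in the dataset description.
performances = 250_046
pieces = 5_625
composers = 483
hours = 21_763

# Simple averages; the real distribution is not documented and may be skewed.
performances_per_piece = performances / pieces
pieces_per_composer = pieces / composers
minutes_per_performance = hours * 60 / performances

print(f"{performances_per_piece:.2f} performances per piece")    # 44.45
print(f"{pieces_per_composer:.2f} pieces per composer")          # 11.65
print(f"{minutes_per_performance:.2f} minutes per performance")  # 5.22
```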
Use Cases
Train generative models for piano music on the large collection of MIDI performances.
Analyze performance styles and score-performance alignments using the note-level alignment data.
Study composer-specific patterns using the annotated composer and composition metadata.
Develop music transcription or synthesis tools using the refined and deduplicated MIDI data.
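Any of these use cases starts with reading the MIDI data. As a minimal sketch, assuming the performances are distributed as Standard MIDI Files (the description does not confirm the file format), the header chunk can be inspected with the Python standard library alone. The function name `read_midi_header` is hypothetical and not part of any dataset tooling.

```python
import struct


def read_midi_header(data: bytes) -> dict:
    """Parse the 14-byte MThd chunk that opens a Standard MIDI File."""
    if len(data) < 14 or data[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    # Big-endian: 4-byte chunk length, then format, track count, division.
    length, fmt, ntracks, division = struct.unpack(">IHHH", data[4:14])
    if length < 6:
        raise ValueError("MThd chunk is too short")
    header = {"format": fmt, "tracks": ntracks}
    if division & 0x8000:
        # SMPTE timing: high byte is the negated frame rate, low byte is ticks per frame.
        header["smpte_fps"] = 256 - (division >> 8)
        header["ticks_per_frame"] = division & 0xFF
    else:
        # Metrical timing: ticks per quarter note.
        header["ticks_per_quarter"] = division
    return header


# Synthetic header for illustration: format 1, 2 tracks, 480 ticks per quarter note.
demo = b"MThd" + struct.pack(">IHHH", 6, 1, 2, 480)
print(read_midi_header(demo))
```

For real work, a dedicated MIDI library is the more practical choice. The header check is mainly useful as a quick sanity test on downloaded files.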
Strengths
Large in scale, with 250,046 performances.
Broad in repertoire, covering 5,625 pieces by 483 composers.
Includes 21,763 hours of performed music.
Provides metadata such as deduplication flags, MIDI quality labels, and note-level alignments.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Last updated 2026-04-27 13:13:07; freshness should be verified.
The provided description does not specify file formats or the exact data structure.
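Because the file formats are undocumented, a first step after download is to take an inventory of what actually arrived. The sketch below counts files by extension under a local directory. The function name `count_extensions` is hypothetical, and nothing here reflects a confirmed layout of the dataset.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root) -> Counter:
    """Count files under `root` (recursively) by lowercased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `count_extensions("path/to/download").most_common()` on the downloaded copy (the path is a placeholder) lists the dominant file types, which shows where to look for the MIDI data, the alignments, and the metadata tables.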
Provenance
Source
SyMuPe via Hugging Face.
Collection Method
Unifies and refines major open-source piano corpora.
Freshness
Last updated 2026-04-27 13:13:07.
License
Unknown; users should verify terms before use.