The dataset contains 106,574 tracks from 16,341 artists across 161 genre categories. The collection includes 917 GiB of audio data, pre-computed features such as MFCCs, and metadata tables linking tracks to albums and artists.
Use Cases
- Train multi-label genre classifiers using the 'genres_all' list and raw audio files
- Build music recommendation engines based on the 'artist_location' and 'track_genres' features
- Evaluate audio feature extraction algorithms using the 'track_duration' and 'bit_rate' technical metadata
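For the multi-label classification use case, each track's 'genres_all' list has to become a fixed-length target vector. The following is a minimal sketch of that encoding in plain Python. It assumes a 'genres_all' cell is either a list of integer genre IDs or the string form of one (such as "[21, 10]"); the helper names and the toy IDs are hypothetical and not taken from the dataset.

```python
import ast


def parse_genres(cell):
    """Return the genre IDs in one 'genres_all' cell as a list of ints.

    Assumption: the cell is a list, or a string such as "[21, 10]".
    """
    if isinstance(cell, str):
        cell = ast.literal_eval(cell)
    return [int(g) for g in cell]


def multi_hot(rows, genre_ids):
    """Encode each row as a 0/1 vector with one position per known genre ID."""
    index = {g: i for i, g in enumerate(sorted(genre_ids))}
    vectors = []
    for cell in rows:
        vec = [0] * len(index)
        for g in parse_genres(cell):
            if g in index:  # IDs outside the known set are ignored
                vec[index[g]] = 1
        vectors.append(vec)
    return vectors


# Toy values for illustration only, not real rows from the dataset.
toy_rows = ["[21, 10]", "[10]", []]
toy_genres = [10, 21, 38]
labels = multi_hot(toy_rows, toy_genres)
```

Vectors of this shape can serve as targets for a sigmoid-output classifier, where each position is an independent yes/no decision for one genre.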
Strengths
- 106,574 tracks with associated metadata including 'track_id', 'artist_name', and 'album_id'
- Hierarchical genre taxonomy featuring 161 categories with 'genre_id' and 'parent' mappings
- Pre-computed features for all tracks including 20 MFCCs, spectral centroid, and chromagram
- Four dataset subsets of increasing size (small, medium, large, full) to accommodate different hardware constraints
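The 'genre_id' and 'parent' mappings make it possible to collapse a fine-grained genre to its top-level ancestor, which helps when training on a coarser label set. Below is a minimal sketch, assuming the mapping has been loaded into a dictionary from 'genre_id' to 'parent' and that a root genre is marked with a parent value of 0. Both the root marker and the toy IDs are assumptions made for illustration.

```python
def top_level_genre(genre_id, parent_of, root_marker=0):
    """Follow 'parent' links upward until a root genre is reached.

    Assumption: a root genre has parent == root_marker.
    """
    seen = set()
    current = genre_id
    while parent_of[current] != root_marker:
        if current in seen:
            # Guard against malformed data that would loop forever.
            raise ValueError(f"cycle detected at genre {current}")
        seen.add(current)
        current = parent_of[current]
    return current


# Hypothetical taxonomy: 89 -> 25 -> 12 (root), and 15 is itself a root.
toy_parent_of = {12: 0, 25: 12, 89: 25, 15: 0}
root_of_89 = top_level_genre(89, toy_parent_of)
root_of_15 = top_level_genre(15, toy_parent_of)
```

Applying this function to every ID in a track's genre list and removing duplicates yields a top-level label set for that track.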