Wikipedia PT Categories is a Portuguese clustering evaluation dataset containing 2,873 articles from pt.wikipedia.org, each labeled with one of 15 broad topic categories. The dataset was created by tardellirs and serves as the source for the WikipediaPTCategoriesClusteringP2P task in the MTEB(por) benchmark. It was last updated on 2026-06-08.
The license is unknown; users should verify the terms of use before using the dataset.
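To illustrate how a clustering task over labeled articles like this one can be scored, here is a minimal sketch of the V-measure, which is the metric MTEB clustering tasks commonly report. That the WikipediaPTCategoriesClusteringP2P task uses exactly this metric is an assumption here, not something stated above. The helper names (`entropy`, `conditional_entropy`, `v_measure`) and the toy labels are hypothetical and are not drawn from the dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """Conditional entropy H(a | b) of label sequence a given b."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum(
        (c / n) * math.log(c / b_counts[b_val])
        for (_, b_val), c in joint.items()
    )


def v_measure(true_labels, predicted_clusters):
    """Harmonic mean of homogeneity and completeness, in [0, 1]."""
    if not true_labels or len(true_labels) != len(predicted_clusters):
        raise ValueError("inputs must be non-empty and of equal length")
    h_true = entropy(true_labels)
    h_pred = entropy(predicted_clusters)
    # Homogeneity: each cluster contains members of a single category.
    homogeneity = 1.0 if h_true == 0 else (
        1.0 - conditional_entropy(true_labels, predicted_clusters) / h_true
    )
    # Completeness: each category is assigned to a single cluster.
    completeness = 1.0 if h_pred == 0 else (
        1.0 - conditional_entropy(predicted_clusters, true_labels) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Toy example with placeholder category names (not the dataset's real labels).
toy_true = ["cat_a", "cat_a", "cat_b", "cat_b"]
toy_pred = [1, 1, 0, 0]  # cluster ids are arbitrary; only the grouping matters
score = v_measure(toy_true, toy_pred)
```

In an actual evaluation, `true_labels` would be the 15 topic categories and `predicted_clusters` would come from clustering the article embeddings produced by the model under test. The score does not depend on how cluster ids are numbered.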