This collection contains 700,000 Turkish legal documents from the Yargıtay (Court of Cassation) and Danıştay (Council of State), grouped using multiple embedding models and clustering algorithms. Decisions of these two courts are the primary sources of legal precedent in Turkey: Yargıtay for civil and criminal cases, Danıştay for administrative cases.
Use Cases
- Train Turkish legal NLP models using the 700,000 document texts
- Benchmark clustering algorithms using the pre-computed cluster assignments
- Perform semantic search on Turkish legal precedents using the embedding-based clusters
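For the clustering benchmark use case, one common way to compare two sets of cluster assignments over the same documents is the adjusted Rand index, which is 1.0 when the two partitions agree exactly and close to 0 when they agree no more than chance would predict. The sketch below is a minimal pure-Python version, not the method used to build the clusters. The label lists are hypothetical toy data, and the names `clusters_model_a`, `clusters_model_b`, and `clusters_model_c` are assumptions for illustration, not fields of the actual files.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster assignments of the same documents."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both assignments must cover the same documents")
    n = len(labels_a)

    # Contingency counts: documents sharing each (cluster_a, cluster_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of document pairs placed together under each view.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total_pairs = comb(n, 2)

    expected = sum_a * sum_b / total_pairs if total_pairs else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both assignments put everything in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical toy assignments for six documents (not real data).
clusters_model_a = [0, 0, 1, 1, 2, 2]
clusters_model_b = [1, 1, 0, 0, 2, 2]  # same partition, different label names
clusters_model_c = [0, 0, 0, 1, 1, 1]  # a coarser, partly different partition

print(adjusted_rand_index(clusters_model_a, clusters_model_b))
print(adjusted_rand_index(clusters_model_a, clusters_model_c))
```

Because the index depends only on which documents are grouped together, it ignores how cluster IDs are numbered, which matters when comparing outputs from different algorithms. For the full 700,000 documents, a library implementation such as scikit-learn's `adjusted_rand_score` would likely be more practical.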
Strengths
- 700,000 individual legal documents from Turkey's highest courts
- Sourced from Yargıtay, the supreme court of appeal for civil and criminal matters, and Danıştay, the highest administrative court
- Clustered using multiple embedding models and algorithms for comparative research