Vietnamese Legal Documents for Retrieval-Augmented Generation
Description
A collection of Vietnamese legal documents intended for use in Retrieval-Augmented Generation (RAG) systems. The dataset is hosted on Kaggle; its size, original source, and creation date are not documented. The content likely includes legal texts, statutes, or case law in Vietnamese.
Use Cases
Fine-tune a language model for Vietnamese legal text comprehension (inferred from domain, verify after download)
Build a retrieval system for legal document search (inferred from domain, verify after download)
Benchmark RAG pipelines on domain-specific, non-English text (inferred from domain, verify after download)
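As a starting point for the retrieval use case, the sketch below ranks documents against a query with TF-IDF weights and cosine similarity, using only the Python standard library. It is a baseline under stated assumptions, not a method taken from the dataset: the function names (`tokenize`, `build_index`, `search`) are hypothetical, documents are assumed to be already loaded as plain strings, and whitespace tokenization is assumed to be adequate because Vietnamese syllables are space-separated. A dedicated Vietnamese word segmenter or an embedding model would likely do better.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split is good enough for a baseline.
    return text.lower().split()


def build_index(docs):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF dict per document."""
    term_counts = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [{t: f * idf[t] for t, f in counts.items()} for counts in term_counts]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, idf, vectors, k=3):
    """Return up to k (doc_index, score) pairs with non-zero score, best first."""
    counts = Counter(tokenize(query))
    q = {t: f * idf[t] for t, f in counts.items() if t in idf}
    scored = sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)
    return [(i, s) for s, i in scored[:k] if s > 0]
```

Calling `build_index` once on the list of document strings and then `search` per query gives a simple lexical baseline against which a full RAG retriever can be compared.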
Strengths
Published on Kaggle, a platform with a large community of data scientists.
Focuses on a specific domain (legal) and language (Vietnamese), a combination for which few resources may exist.
Limitations
Metadata is minimal; actual content requires verification after download.
Row count, file formats, and column definitions are unknown, which makes it hard to assess suitability before download.
License and authorship details are absent, complicating usage rights verification.
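Because the file formats and sizes are undocumented, a first verification step after download could be a simple inventory of the extracted files. The sketch below uses only the Python standard library; the function name `inventory` and the `data` directory are hypothetical placeholders, since the actual layout of the dataset is unknown.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Summarize files under `root`: file count and total bytes per extension."""
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group case-insensitively; files without a suffix get their own bucket.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in sorted(counts)}


if __name__ == "__main__":
    # "data" is a placeholder for wherever the download was extracted.
    root = Path("data")
    if root.is_dir():
        for ext, stats in inventory(root).items():
            print(f"{ext}: {stats['files']} file(s), {stats['bytes']} bytes")
```

The per-extension summary indicates which loaders are needed (for example CSV, JSON, or plain text) before row counts and column definitions can be checked.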
Provenance
Geography
Vietnam (inferred from title)
License
Unknown; users must verify terms of use before commercial application.