Description
Vietnamese legal text data intended for training or evaluating large language models. The dataset is described as small, but its exact size and composition are unspecified. It originates from the Kaggle platform, but the author, organization, and license details are unknown.
Use Cases
Fine-tune a legal question-answering model based on Vietnamese legal text.
Benchmark LLM performance on Vietnamese legal reasoning tasks.
Augment training data for a multilingual legal text classifier.
Analyze linguistic patterns in Vietnamese legal documents.
Strengths
Focuses on Vietnamese legal text, a specialized domain for NLP.
Explicitly intended for use with large language models.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Row count is unknown, so suitability for a given task cannot be judged until the data is downloaded.
Column-level documentation is absent; field semantics must be inferred after download.
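Because the row count and field semantics are undocumented, a first inspection pass after download might look like the sketch below. It is only an illustration: the CSV file format, the column names `question` and `answer`, and the sample rows are assumptions made up for this example, not facts about the dataset.

```python
import csv
import io


def profile_columns(lines, sample_size=2):
    """Count rows and summarize each column of a CSV source.

    `lines` is any iterable of text lines, such as an open file.
    Returns (row_count, stats), where stats maps each column name to
    its non-empty count, total character count, and a few sample values.
    """
    reader = csv.DictReader(lines)
    # Reading fieldnames consumes the header row.
    stats = {
        name: {"non_empty": 0, "total_chars": 0, "samples": []}
        for name in (reader.fieldnames or [])
    }
    row_count = 0
    for row in reader:
        row_count += 1
        for name, col in stats.items():
            value = (row.get(name) or "").strip()
            if not value:
                continue  # empty cells hint at optional fields
            col["non_empty"] += 1
            col["total_chars"] += len(value)
            if len(col["samples"]) < sample_size:
                col["samples"].append(value[:80])  # truncate long legal text
    return row_count, stats


# Made-up rows standing in for the real file (hypothetical column names).
demo = io.StringIO("question,answer\nCau hoi 1,Tra loi 1\nCau hoi 2,\n")
rows, stats = profile_columns(demo)
print(rows, {name: col["non_empty"] for name, col in stats.items()})
```

For a real file, pass an open handle instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, where `path` is whatever file the download actually contains. Sample values and non-empty counts per column give a starting point for inferring what each field means.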
Provenance
Source
Kaggle
Geography
Vietnam
License
Unknown; terms of use must be verified before the data is used.