Name: Orpo Dpo Mix 40K: Portuguese Translation of a Preference Tuning Dataset
Creator: BornSaint
Published: 2025-04-17T21:15:45
Keywords: Size Categories10 Kn100 K, Orpo, Preference Tuning, Librarypolars, Rlhf, Modalitytext, Librarymlcroissant, Librarydatasets, Librarypandas, Text, Parquet, Text, Regionus, Preference, Portuguese, Languagept, Dpo

Description

A Portuguese translation of the 'mlabonne/orpo-dpo-mix-40k' dataset, created by user BornSaint and last updated on 2025-05 06. The dataset was translated using a quantized machine translation model over more than a week on a single GPU thread. The original dataset is likely used for preference tuning and reinforcement learning from human feedback.

Use Cases

Fine-tuning language models for Portuguese text generation based on translated preference pairs.
Training reward models for Portuguese dialogue systems based on the translated preference data.
Conducting comparative studies on preference tuning methodologies across different languages.
Developing instruction-following models for Portuguese based on translated instruction-response pairs.

Strengths

Dataset is a direct translation of a known preference-tuning dataset ('mlabonne/orpo-dpo-mix-40k').
Translation process is documented, specifying the model ('google/madlad400-3b-mt') and hardware used.
Last update timestamp (2025-05-06) is provided, indicating recent activity.

Limitations

Row count, column definitions, and sample data are unavailable, limiting suitability assessment.
License information is unknown, which may restrict usage.
Translation quality and potential biases introduced by the automated process are not evaluated.

Provenance

Source: Hugging Face user BornSaint, derived from 'mlabonne/orpo-dpo-mix-40k'.
Collection Method: Automated translation using a quantized ctranslate2 version of 'google/madlad400-3b-mt'.
Time Range: null
Freshness: Last updated 2025-05-06 00:18:02.
Geography: null

License is unknown; users must verify terms before use. The dataset page notes a link to a full description.

Text Portuguese Parquet Size Categories10 Kn100 K Orpo Preference Tuning Librarypolars Rlhf Modalitytext Librarymlcroissant Librarydatasets Librarypandas Regionus Preference Languagept Dpo

Orpo Dpo Mix 40K: Portuguese Translation of a Preference Tuning Dataset

Description

Use Cases

Strengths

Limitations

Provenance

Related Topics

Related Datasets

Quality Score

Community

Dataset Info

Community

Dataset Info