DataSalon

Vqagent Pairwise Preference: Human Feedback Data for RL | DataSalon

Multimodal & LLM

Name: Vqagent Pairwise Preference: Human Feedback Data for RL
Creator: qgfvadfuvads
Published: 2026-04-12T20:01:24
Keywords: Preference Learning, Pairwise Comparison, Tabular, Reinforcement Learning, Human Feedback

Available on 1 platform

Description

A dataset titled 'Vqagent Pairwise Preference' was published on the Hugging Face platform by the user 'qgfvadfuvads'. The title suggests it contains pairwise preference comparisons, likely used for training or evaluating reinforcement learning agents. The dataset was last updated on April 12, 2026.

Use Cases

Train a reward model using pairwise human preferences (inferred from domain, verify after download)
Benchmark preference alignment algorithms for AI agents (inferred from domain, verify after download)
Fine-tune language models using comparative feedback data (inferred from domain, verify after download)
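The first use case, training a reward model from pairwise preferences, is commonly framed with a Bradley-Terry style objective: the loss is the negative log-probability that the preferred item outscores the rejected one. The sketch below is only an illustration of that objective, not something documented for this dataset. The function names and the idea that each record yields one "chosen" and one "rejected" reward are assumptions to check against the actual columns after download.

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry style loss for one pair: -log(sigmoid(chosen - rejected)).

    Hypothetical helper: the rewards would come from whatever reward model
    is being trained; nothing here is specific to this dataset.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); branch on the sign
    # so math.exp never receives a large positive argument.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_pairwise_loss(pairs):
    """Average the loss over an iterable of (reward_chosen, reward_rejected) pairs."""
    losses = [pairwise_preference_loss(chosen, rejected) for chosen, rejected in pairs]
    return sum(losses) / len(losses)
```

The loss shrinks as the reward for the preferred item rises above the reward for the rejected one, and equals log 2 when the two rewards are tied.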

Strengths

Published on the Hugging Face platform, facilitating access for the ML community.
Last update timestamp (2026-04-12 20:25:03) is provided.

Limitations

Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license information are unknown, which may limit suitability assessment.
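Because column documentation and row count are missing, a first pass after download could simply tally what is there. The sketch below assumes the rows have already been loaded into a list of dictionaries by whatever means suits the actual file format. The column names in the sample ("prompt", "response_a", "response_b", "preferred") are hypothetical placeholders, not the dataset's real schema.

```python
from collections import Counter


def summarize_records(records):
    """Return the row count and, per column, observed value types and None counts.

    Only explicit None values are counted as missing; a column that is
    absent from a row is not counted for that row.
    """
    columns = {}
    for row in records:
        for name, value in row.items():
            info = columns.setdefault(name, {"types": Counter(), "missing": 0})
            if value is None:
                info["missing"] += 1
            else:
                info["types"][type(value).__name__] += 1
    return {"rows": len(records), "columns": columns}


# Hypothetical rows, used only to show the shape of the summary.
sample_rows = [
    {"prompt": "q1", "response_a": "x", "response_b": "y", "preferred": "a"},
    {"prompt": "q2", "response_a": "x", "response_b": "y", "preferred": None},
]

summary = summarize_records(sample_rows)
for column_name, info in summary["columns"].items():
    print(column_name, dict(info["types"]), "missing:", info["missing"])
```

A summary like this helps confirm whether a preference label column actually exists and how complete it is before any of the listed use cases are attempted.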

Provenance

Source: Hugging Face
Collection Method: Uploaded by user 'qgfvadfuvads'; original collection method is unknown.
Time Range: not specified
Freshness: Last updated 2026-04-12 20:25:03; freshness should be verified.
Geography: not specified

License is unknown; users must verify licensing terms before use.

Related Topics: Tabular, Preference Learning, Pairwise Comparison, Reinforcement Learning, Human Feedback

Quality Score

D26

Community

4 downloads

1 like

0 views

Dataset Info

Author: qgfvadfuvads
Created: Apr 12, 2026
Updated: Apr 12, 2026
Last synced: Apr 18, 2026
