Zhihu RLHF 3K: Human Preference Data for Text Alignment | DataSalon

Multimodal & LLM

Name: Zhihu RLHF 3K: Human Preference Data for Text Alignment
Creator: liyucheng
Published: 2023-04-15T17:03:54
Keywords: Text Generation, Preference Data, Text, Reinforcement Learning From Human Feedback, Zhihu

Description

This dataset appears to contain human preference data for Reinforcement Learning from Human Feedback (RLHF). It was published on the Hugging Face platform by liyucheng on April 15, 2023. The title suggests that the data is drawn from the Chinese Q&A platform Zhihu and that it contains roughly 3,000 entries.

Use Cases

Training a reward model to score text outputs (inferred from domain, verify after download)
Fine-tuning a language model to align with human preferences (inferred from domain, verify after download)
Benchmarking RLHF algorithms on Chinese-language data (inferred from domain, verify after download)
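
As an illustration of the reward-model use case, the sketch below computes a standard pairwise (Bradley-Terry style) preference loss from two scalar scores. It assumes the dataset pairs a preferred response with a less-preferred one, which this page does not confirm; the function name and the example scores are hypothetical and not taken from the dataset.

```python
import math


def pairwise_preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_chosen - score_rejected)).

    The loss is small when the reward model scores the preferred
    response well above the rejected one, and large when the order
    is reversed.
    """
    margin = score_chosen - score_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores, for illustration only.
loss_correct_order = pairwise_preference_loss(2.0, 0.0)
loss_wrong_order = pairwise_preference_loss(0.0, 2.0)
loss_tie = pairwise_preference_loss(0.0, 0.0)
```

In a real training loop the scores would come from a model and the loss would be averaged over a batch; whether this dataset supports that setup should be checked after download.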

Strengths

Published on the Hugging Face platform, facilitating integration with common ML tools.
Author and publication date (2023-04-15) are explicitly provided.

Limitations

Metadata is minimal; actual content, column definitions, and data quality require verification after download.
The exact row count (the title suggests about 3,000), file formats, and license are unknown, which limits suitability assessment.
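
One way to carry out the post-download verification is to summarize the records before using them. The sketch below is a minimal helper under stated assumptions: `summarize_records` is a hypothetical name, and the sample rows and their column names (`prompt`, `chosen`, `rejected`) are invented placeholders, since the actual schema is not documented on this page.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping


def summarize_records(records: Iterable[Mapping[str, Any]]) -> dict:
    """Count rows and report, per column, the value types seen and empty values."""
    row_count = 0
    types = defaultdict(set)
    empty = defaultdict(int)
    for record in records:
        row_count += 1
        for column, value in record.items():
            types[column].add(type(value).__name__)
            # Treat None and empty strings as missing.
            if value is None or value == "":
                empty[column] += 1
    return {
        "row_count": row_count,
        "columns": {
            column: {"types": sorted(types[column]), "empty": empty[column]}
            for column in types
        },
    }


# Placeholder rows; the real column names must be read from the downloaded data.
sample = [
    {"prompt": "q1", "chosen": "a", "rejected": "b"},
    {"prompt": "q2", "chosen": "", "rejected": None},
]
summary = summarize_records(sample)
```

The same function can be applied to any iterable of dict-like rows, for example the rows of a split loaded with the Hugging Face `datasets` library.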

Provenance

Source: Hugging Face (author: liyucheng)
Collection Method: Likely sourced from or related to the Zhihu platform.
Time Range: Publication date is 2023-04-15; temporal coverage of data is unknown.
Freshness: Last updated 2023-04-15 17:06:05; freshness should be verified.
Geography: Likely contains Chinese-language data, but specific geography is unknown.

License is unknown; usage rights must be verified before application.

Quality Score

D24

Community

190 downloads

94 likes

0 views

Dataset Info

Author: liyucheng
Created: Apr 15, 2023
Updated: Apr 15, 2023
Last synced: May 13, 2026
