DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Category: Multimodal & LLM

Openjudge

Name: Openjudge
Creator: agentscope-ai
Published: 2025-12-16T06:53:52
Keywords: AI Evaluation, Benchmark, Grader Development, Text, Preference Pairs, Multimodal

Available on 1 platform

Description

A benchmark dataset for evaluating graders across text, multimodal, and agent scenarios. It supports the OpenJudge framework with labeled preference pairs for quality-assured grader development. The dataset was created by agentscope-ai and last updated on March 4, 2026.

Use Cases

Benchmarking grader performance against the labeled preference pairs
Developing quality-assured graders for agent scenarios across the dataset's task categories
Evaluating multimodal model coherence with the image_coherence benchmark
Testing grader capabilities on tool-use tasks in the tool category
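
As a rough illustration of the first use case, the sketch below scores a grader by pairwise accuracy: the fraction of preference pairs where the grader rates the preferred response above the other one. The field names `prompt`, `chosen`, and `rejected`, the toy pairs, and `length_grader` are all hypothetical; the actual column names are not documented in this listing and must be checked after download.

```python
from typing import Callable, Iterable, Mapping

# A grader maps (prompt, response) to a numeric score; higher means better.
Grader = Callable[[str, str], float]


def pairwise_accuracy(pairs: Iterable[Mapping[str, str]], grader: Grader) -> float:
    """Return the fraction of pairs where the grader scores the preferred
    response strictly above the rejected one.

    The keys "prompt", "chosen", and "rejected" are assumed for illustration;
    they are not taken from the dataset's real schema.
    """
    total = 0
    correct = 0
    for pair in pairs:
        total += 1
        chosen_score = grader(pair["prompt"], pair["chosen"])
        rejected_score = grader(pair["prompt"], pair["rejected"])
        if chosen_score > rejected_score:
            correct += 1
    if total == 0:
        raise ValueError("no preference pairs supplied")
    return correct / total


def length_grader(prompt: str, response: str) -> float:
    """Deliberately naive stand-in grader: longer responses score higher."""
    return float(len(response))


# Hand-written toy pairs, not rows from the dataset.
toy_pairs = [
    {"prompt": "Capital of France?",
     "chosen": "Paris is the capital of France.",
     "rejected": "Paris."},
    {"prompt": "What is 2 + 2?",
     "chosen": "4",
     "rejected": "The answer is probably 5, I think."},
]

accuracy = pairwise_accuracy(toy_pairs, length_grader)
print(f"pairwise accuracy: {accuracy:.2f}")
```

Ties count as misses here, which is one possible convention; a real evaluation might treat them separately.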

Strengths

Includes 12 agent evaluation tasks with 166 samples
Contains 4 multimodal evaluation benchmarks with 80 samples
Provides labeled preference pairs for structured grader development

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Total row count is not reported, which makes suitability harder to assess before download
Description metadata is limited; actual data quality requires manual inspection after download

Provenance

Source: agentscope-ai
Collection Method: Likely curated for the OpenJudge framework.
Time Range: not specified
Freshness: Last updated 2026-03-04 12:41:38; freshness should be verified
Geography: not specified

License is unknown; terms of use must be verified before application.

Quality Score

C44

Community

142 downloads

1 like

0 views

Dataset Info

Author: agentscope-ai
Created: Dec 16, 2025
Updated: Mar 4, 2026
Last synced: Jul 3, 2026
