Name: CLIFT: A Structured Benchmark for Latent Rule Learning in LLMs
Creator: longarmd
Published: 2026-04-08T23:13:09
Keywords: LLM Benchmark, Benchmark, Text, Inference Testing, Synthetic Data, Synthetic, Contextual Learning

Description

CLIFT is a structured benchmark of 5,160 synthetic instances designed to stress-test whether models learn latent rules from context. It was created by the author 'longarmd' and last updated on Hugging Face in April 2026. The benchmark uses a full factorial design over task, format, application, and difficulty, with 10 independent draws per cell.
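As a quick consistency check on these figures, the number of design cells can be derived from the stated total and the draws per cell. The sketch below uses only those two numbers; the number of levels per factor is not stated here, so none is assumed.

```python
# Figures taken from the description above.
TOTAL_INSTANCES = 5160
DRAWS_PER_CELL = 10

# With a fixed number of draws per cell, total = cells * draws_per_cell.
cells, remainder = divmod(TOTAL_INSTANCES, DRAWS_PER_CELL)

print(cells)      # 516
print(remainder)  # 0
```

Under a full factorial design, that cell count equals the product of the level counts for task, format, application, and difficulty.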

Use Cases

Benchmarking model performance on latent rule learning across the full factorial design.
Analyzing how task format affects in-context learning, using the format dimension.
Studying how task difficulty affects model generalization, using the difficulty dimension.
Evaluating how well models transfer across applications, using the application dimension.
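The per-dimension analyses above amount to averaging scored results over one factor at a time. A minimal sketch follows, assuming each scored instance is a dict carrying its factor levels and a boolean `correct` flag; the field names, level names, and records are hypothetical, since the dataset's actual columns are not documented here.

```python
from collections import defaultdict

def accuracy_by(records, factor):
    """Return {level: mean correctness} for one design factor."""
    totals = defaultdict(lambda: [0, 0])  # level -> [num_correct, num_seen]
    for row in records:
        bucket = totals[row[factor]]
        bucket[0] += int(row["correct"])
        bucket[1] += 1
    return {level: hits / seen for level, (hits, seen) in totals.items()}

# Hypothetical scored results; field names and values are illustrative only.
records = [
    {"format": "A", "difficulty": "easy", "correct": True},
    {"format": "A", "difficulty": "hard", "correct": False},
    {"format": "B", "difficulty": "easy", "correct": True},
    {"format": "B", "difficulty": "hard", "correct": True},
]

print(accuracy_by(records, "format"))      # {'A': 0.5, 'B': 1.0}
print(accuracy_by(records, "difficulty"))  # {'easy': 1.0, 'hard': 0.5}
```

Because the design is balanced, each level of a factor is averaged over the same set of levels of the other factors, so these marginal means are directly comparable.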

Strengths

5,160 total instances provide a substantial testbed for evaluation.
Full factorial design over four variables (task, format, application, difficulty) allows for controlled, systematic analysis.
10 i.i.d. draws per cell allow results to be averaged over repeated samples rather than resting on a single instance per condition.
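To give a rough sense of what 10 draws per cell buys, the sketch below computes the binomial standard error of an accuracy estimate. It assumes each instance is scored as simply correct or incorrect and that draws are independent; the scoring scheme is an assumption, not something stated for this benchmark.

```python
import math

def binomial_standard_error(p, n):
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

DRAWS_PER_CELL = 10
TOTAL_INSTANCES = 5160

# p = 0.5 is the worst case (largest standard error).
se_cell = binomial_standard_error(0.5, DRAWS_PER_CELL)
se_total = binomial_standard_error(0.5, TOTAL_INSTANCES)

print(round(se_cell, 3))   # 0.158
print(round(se_total, 3))  # 0.007
```

Under these assumptions, single-cell accuracies remain noisy, while estimates pooled over many cells (for example, one level of one factor) are considerably tighter.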

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The hosted row count is not reported, so the stated total of 5,160 instances should be confirmed after download.
Data is synthetic, which may limit direct applicability to real-world scenarios.

Provenance

Source: Hugging Face dataset uploaded by author 'longarmd'.
Collection Method: Synthetically generated benchmark instances.
Freshness: Last updated 2026-04-09 12:51:46; check the Hugging Face page for later revisions.

License is unknown; users should verify terms before use.
