LLM Performance on Adversarial Physics Traps | DataSalon

Available on 1 platform

Description

The LLM Physics Law-Breaker Benchmark Results dataset records how 21 large language models perform on 34 adversarial physics-based reasoning traps. It contains benchmark scores that assess each model's robustness to logical inconsistencies and physical fallacies. The original author and creation date are unknown.

Use Cases

Compare performance scores across the 21 LLMs on physics reasoning tasks.
Analyze failure patterns across the 34 adversarial physics traps to identify model weaknesses.
Benchmark new LLM versions against established model scores from this evaluation suite.
Rank models based on aggregate scores from the physics law-breaker test suite.
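
The ranking and failure-pattern use cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset's real columns are unknown, so the `model`, `trap`, and `score` fields, the trap names, and every value below are hypothetical. If the dataset holds only aggregated scores per model, the per-trap part would not apply.

```python
from statistics import mean

# Hypothetical layout: one record per (model, trap) pair with a numeric score.
# The real dataset's columns are unknown; these names and values are made up.
records = [
    {"model": "model_a", "trap": "perpetual_motion", "score": 1.0},
    {"model": "model_a", "trap": "ftl_signal", "score": 0.0},
    {"model": "model_b", "trap": "perpetual_motion", "score": 1.0},
    {"model": "model_b", "trap": "ftl_signal", "score": 1.0},
    {"model": "model_c", "trap": "perpetual_motion", "score": 0.0},
    {"model": "model_c", "trap": "ftl_signal", "score": 0.0},
]


def group_mean(rows, key):
    """Average the 'score' field of rows grouped by the given key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row["score"])
    return {name: mean(scores) for name, scores in groups.items()}


def rank_models(rows):
    """Return (model, mean score) pairs, best first; ties broken by name."""
    means = group_mean(rows, "model")
    return sorted(means.items(), key=lambda kv: (-kv[1], kv[0]))


def hardest_traps(rows):
    """Return (trap, mean score) pairs, lowest mean (hardest) first."""
    means = group_mean(rows, "trap")
    return sorted(means.items(), key=lambda kv: (kv[1], kv[0]))


ranking = rank_models(records)
traps = hardest_traps(records)
```

A new model version could be benchmarked the same way by appending its records and re-running `rank_models`, provided it was scored on the same traps with the same rubric.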

Strengths

Benchmark covers 21 distinct models for comparative analysis.
Evaluation includes 34 different adversarial physics scenarios.

Limitations

Specific sample size, row count, and detailed scoring columns are unknown.
Lacks metadata on model versions, evaluation parameters, and raw response data.
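
Because the row count and columns are undocumented, a reasonable first step after downloading is to check the file against the counts stated in the description. The sketch below assumes the data arrives as CSV text and that model and trap identifiers live in columns named `model` and `trap`. Both are assumptions, and the sample rows are invented.

```python
import csv
import io

EXPECTED_MODELS = 21  # counts stated in the dataset description
EXPECTED_TRAPS = 34


def summarize(csv_text, model_col="model", trap_col="trap"):
    """Report columns, row count, and distinct model/trap counts for CSV text.

    The column names are hypothetical defaults; pass the real ones once known.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    models = {r[model_col] for r in rows} if model_col in columns else set()
    traps = {r[trap_col] for r in rows} if trap_col in columns else set()
    return {
        "columns": columns,
        "rows": len(rows),
        "models": len(models),
        "traps": len(traps),
        "matches_description": (
            len(models) == EXPECTED_MODELS and len(traps) == EXPECTED_TRAPS
        ),
    }


# Invented two-row sample, only to show the shape of the summary.
sample = "model,trap,score\nmodel_a,trap_1,1\nmodel_b,trap_1,0\n"
summary = summarize(sample)
```

If `matches_description` is false for the real file, the data may be aggregated per model (no trap column) or use different column names than assumed here.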

Provenance

Collection Method: Adversarial benchmark testing of pre-trained LLMs.

The dataset likely contains only aggregated benchmark results, not the underlying model prompts, responses, or fine-tuning data.

Tabular, Evaluation Benchmark, LLM Evaluation, Physics, Adversarial Testing, Physics Benchmark

Quality Score

D21

Dataset Info

Last synced: Apr 9, 2026
