DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

DeepSWE: 113 Long-Horizon Software Engineering Tasks for AI Agents | DataSalon

Software Engineering & Security

Name: DeepSWE: 113 Long-Horizon Software Engineering Tasks for AI Agents
Creator: datacurve
Published: 2026-06-01T21:58:03
Keywords: Software Engineering, Benchmark, AI Agents, Computer Vision, Text, Code Generation, Programming Languages

Available on 1 platform

Description

DeepSWE comprises 113 original software engineering tasks across TypeScript, Go, Python, JavaScript, and Rust, drawn from active open-source repositories. datacurve created the benchmark to measure frontier coding agents, using isolated environments and program-based verifiers. It was last updated on June 1, 2026.

Use Cases

Benchmarking AI code-generation agents on long-horizon tasks drawn from open-source repositories.
Comparing agent performance across the five included programming languages.
Testing program-based verification systems in the isolated task environments described.
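
The cross-language comparison above can be sketched as a per-language pass rate. This is a minimal illustration, not the benchmark's own scoring method: the function name `pass_rate_by_language`, the `(language, passed)` record shape, and the sample outcomes are all assumptions made for the example and do not come from the dataset.

```python
from collections import defaultdict


def pass_rate_by_language(results):
    """Return {language: fraction of tasks passed} from (language, passed) pairs.

    The (language, passed) record shape is a hypothetical stand-in for
    whatever a program-based verifier actually reports.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for language, ok in results:
        totals[language] += 1
        if ok:
            passed[language] += 1
    return {lang: passed[lang] / totals[lang] for lang in totals}


# Hypothetical verifier outcomes, invented for illustration only.
example_results = [
    ("Python", True),
    ("Python", False),
    ("Go", True),
    ("Rust", False),
]

rates = pass_rate_by_language(example_results)
for language, rate in sorted(rates.items()):
    print(f"{language}: {rate:.0%}")
```

With only 113 tasks split across five languages, each per-language rate rests on a small sample, so reporting the task count next to each rate is likely worthwhile.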

Strengths

113 tasks provide a defined scale for evaluation.
Tasks are drawn from active open-source repositories, suggesting real-world relevance.
Covers five distinct programming languages: TypeScript, Go, Python, JavaScript, and Rust.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count and dataset size are unknown, which may limit suitability assessment.
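
Since field semantics have to be inferred after download, a first pass might simply profile whatever records are obtained. The sketch below assumes the records can be loaded as a list of dictionaries; the helper `profile_fields` and the sample rows (including the field names `task_id`, `language`, and `notes`) are hypothetical and are not the dataset's actual schema.

```python
from collections import Counter, defaultdict


def profile_fields(rows):
    """Summarise each field: observed value types and count of None values.

    Keys that are absent from a row are not counted as missing; only
    explicit None values are.
    """
    types = defaultdict(Counter)
    missing = Counter()
    for row in rows:
        for key, value in row.items():
            if value is None:
                missing[key] += 1
            else:
                types[key][type(value).__name__] += 1
    fields = set(types) | set(missing)
    return {
        name: {"types": dict(types[name]), "missing": missing[name]}
        for name in sorted(fields)
    }


# Hypothetical rows, invented for illustration; not real DeepSWE records.
sample_rows = [
    {"task_id": "t1", "language": "Go", "notes": None},
    {"task_id": "t2", "language": "Rust", "notes": "example"},
]

profile = profile_fields(sample_rows)
for name, summary in profile.items():
    print(name, summary)
```

The same profile also yields the row count (`len(rows)`), which the listing does not report.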

Provenance

Source: datacurve on Hugging Face
Collection Method: Tasks drawn from active open-source repositories.
Freshness: Last updated 2026-06-01 23:15:04; verify against the source before use.

License is unknown; terms of use must be verified before use.

Quality Score

D36

Description

Source

Reputation

Access

Community

1 like

0 views

Dataset Info

Author: datacurve
Created: Jun 1, 2026
Updated: Jun 1, 2026
Last synced: Jul 23, 2026
