Structured Code Optimization, Bug-Fix, and AST Parsing Dataset for ML Training

Name: Structured Code Optimization, Bug-Fix, and AST Parsing Dataset for ML Training
Creator: Jamie Davis
Published: 2026-05-28T18:57:39
License: CC-BY-4.0
Keywords: Code Optimization, Program Repair, Text, Static Analysis, Bug Fix, AST Parsing

By Jamie Davis. Updated 11d ago.

4.5 KB, 1 file

Description

Jamie Davis provides a dataset of structured JSON objects pairing raw source code with optimized equivalents, bug fixes, and invalid syntax examples. The dataset includes pre-computed complexity scores, execution tracking, and input-output verification arrays. It was last updated on 2026-05-28 and is engineered to train automated program repair tools, parsers, and static analyzers.

Use Cases

Train automated program repair models on the paired buggy and fixed code examples.
Develop code parsers and static analyzers using the provided Abstract Syntax Tree (AST) parsing examples.
Benchmark code optimization algorithms against the structured input-output verification arrays.
Train models to detect and correct invalid syntax using the provided invalid syntax examples.
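
As a rough illustration of the repair, optimization, and syntax-detection use cases above, the sketch below checks one record for syntax validity and for agreement between the original and optimized code on its input-output pairs. The listing does not document the dataset's schema or the programming language of its snippets, so the field names (original, optimized, entry_point, io_pairs), the sample record, and the assumption that the code is Python are all hypothetical.

```python
import ast


def is_valid_python(source):
    """Return True if the source parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def run_function(source, func_name, args):
    """Define the code in a fresh namespace and call one function from it.

    exec runs arbitrary code: only use this on trusted or sandboxed input.
    """
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


def outputs_match(record):
    """Check both code variants against every (args, expected) pair."""
    for args, expected in record["io_pairs"]:
        for key in ("original", "optimized"):
            result = run_function(record[key], record["entry_point"], args)
            if result != expected:
                return False
    return True


# Hypothetical record; the real dataset's field names may differ.
record = {
    "original": (
        "def total(n):\n"
        "    s = 0\n"
        "    for i in range(n + 1):\n"
        "        s += i\n"
        "    return s\n"
    ),
    "optimized": "def total(n):\n    return n * (n + 1) // 2\n",
    "entry_point": "total",
    "io_pairs": [[[0], 0], [[4], 10], [[10], 55]],
}

syntax_ok = is_valid_python(record["original"])
broken_ok = is_valid_python("def total(n:\n    return n\n")
verified = outputs_match(record)
```

Here syntax_ok and verified are True and broken_ok is False. Running dataset code with exec is only reasonable after reviewing it or inside a sandbox.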

Strengths

Dataset is structured with pre-computed complexity scores and verification arrays.
Data is specifically engineered for training automated program repair tools and parsers.
The dataset is published under the open CC-BY-4.0 license.

Limitations

The dataset is very small at 4.5 KB, indicating limited scope.
Row count and column-level documentation are unknown, limiting suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
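
Because the row count and fields are undocumented, a first inspection pass might simply count records and tally which keys appear in them. The following is a minimal sketch that assumes the records have already been parsed into Python dictionaries; the sample records and their keys are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def summarize_records(records):
    """Return the record count and how often each top-level key appears."""
    key_counts = Counter()
    for record in records:
        # Skip anything that is not a JSON object (for example, a bare list).
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return {"rows": len(records), "key_counts": dict(key_counts)}


# Hypothetical records standing in for the parsed dataset.
sample_records = [
    {"id": 1, "original": "x = 1", "complexity": 1},
    {"id": 2, "original": "y = 2"},
]

summary = summarize_records(sample_records)
```

Keys that appear in fewer records than the row count point to optional or inconsistently populated fields, which is worth knowing before building a training pipeline.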

Provenance

Source: Jamie Davis via figshare
Freshness: Last updated 2026-05-28 18:57:39.

Data is provided in TXT format; users must parse the JSON objects contained within.
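
The listing does not say how the JSON objects are laid out inside the TXT file (one per line, concatenated, or mixed with other text). One possible approach, sketched below, scans the text for opening braces and lets the standard library's json.JSONDecoder.raw_decode consume one object at a time. The helper name iter_json_objects and the sample text are assumptions made for this example.

```python
import json


def iter_json_objects(text):
    """Yield each JSON object found in text, skipping any text between them."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            # raw_decode returns the parsed value and the index where it ended.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not a valid object at this brace; move on to the next one.
            pos = start + 1
            continue
        yield obj
        pos = end


# Hypothetical file content; the real file layout may differ.
sample_text = (
    'record 1: {"id": 1, "code": "x = 1"}\n'
    '{"id": 2, "tags": ["a", {"k": "v"}]} trailing text'
)

objects = list(iter_json_objects(sample_text))
```

For the real file, the text would come from something like open(path, encoding="utf-8").read(), with path pointing at the downloaded file. One caveat of this scan: if an outer object is malformed, a valid object nested inside it may be yielded on its own, so results are worth checking against the expected record count.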

Related Topics

Text, Code Optimization, Program Repair, Static Analysis, Bug Fix, AST Parsing

Quality Score

Overall: C (44)
Description: 45
Source: 38
Reputation: 35
Access: 72

Community

Views: 0

Dataset Info

License: CC-BY-4.0
Author: Jamie Davis
Files: 1
Created: May 28, 2026
Updated: May 28, 2026
DOI
Last synced: May 29, 2026
