DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


Code X Glue Cc Code Completion Line | DataSalon


Software Engineering & Security

Name: Code X Glue Cc Code Completion Line
Creator: google
Published: 2022-03-02T23:29:22
Keywords: Source Datasets: original, Size Categories: 10K<n<100K, Language: code, Task Categories: text-generation, Library: polars, Language Creators: found, Modality: text, License: c-uda, Library: mlcroissant, Library: datasets, Library: pandas, Task Categories: fill-mask, Parquet, Task IDs: slot-filling, Annotations Creators: found, Region: us, Multilinguality: monolingual

Available on 1 platform

Description

Line-level code completion datasets covering multiple programming languages, part of the CodeXGLUE benchmark. Each sample provides an unfinished line of code together with its preceding context, and model output is evaluated with the exact match and edit similarity metrics.
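To make the task framing concrete, here is a minimal sketch of how one line-level completion sample could be built from a source file: the lines before the target line become the context, the target line is cut at some column, and the remainder is what the model must predict. The function name make_sample and the keys context, unfinished_line, and target are hypothetical names chosen for illustration; the dataset's actual schema may differ.

```python
def make_sample(source: str, line_index: int, split_col: int) -> dict:
    """Split a source file into context, an unfinished line, and the text to predict.

    All names here are illustrative assumptions, not the dataset's real fields.
    """
    lines = source.splitlines()
    if not 0 <= line_index < len(lines):
        raise IndexError("line_index is outside the source file")
    target_line = lines[line_index]
    if not 0 <= split_col <= len(target_line):
        raise ValueError("split_col is outside the target line")
    return {
        # Everything before the line being completed.
        "context": "\n".join(lines[:line_index]),
        # The part of the line the model is allowed to see.
        "unfinished_line": target_line[:split_col],
        # The part of the line the model has to generate.
        "target": target_line[split_col:],
    }


snippet = "def add(a, b):\n    total = a + b\n    return total\n"
sample = make_sample(snippet, line_index=2, split_col=11)
print(sample)
```

A model would receive the context plus the unfinished line as input, and its output would be compared against the target.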

Use Cases

Train a sequence-to-sequence model to predict the remainder of a code line using the context and unfinished line features.
Benchmark the performance of code generation models using the exact match evaluation metric.
Measure how close autocompleted code is to the reference line using the edit similarity score.
Analyze model failure patterns in line-level completion compared to token-level prediction using the provided context.
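The two metrics named in the use cases above can be sketched in a few lines of standard-library Python. This is an illustrative approximation, not the benchmark's official evaluation script: edit similarity is assumed here to be 1 minus the Levenshtein distance divided by the longer string's length, and both metrics are assumed to ignore leading and trailing whitespace. The official scorer may normalize differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two lines are identical after trimming whitespace."""
    return prediction.strip() == reference.strip()


def edit_similarity(prediction: str, reference: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical (assumed normalization)."""
    pred, ref = prediction.strip(), reference.strip()
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / longest


def score(predictions: list, references: list) -> dict:
    """Average both metrics over paired predictions and references."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need equally sized, non-empty lists")
    pairs = list(zip(predictions, references))
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / len(pairs),
        "edit_similarity": sum(edit_similarity(p, r) for p, r in pairs) / len(pairs),
    }


print(score(["return x", "return y"], ["return x", "return z"]))
```

Exact match rewards only perfect completions, while edit similarity gives partial credit to near misses, so reporting both helps separate small slips from completions that are wrong throughout.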

Strengths

Focuses on line-level completion tasks to test model ability beyond token-level prediction.
Includes evaluation scripts for exact match and edit similarity metrics.
Sourced from the Microsoft CodeXGLUE benchmark for code-to-code intelligence.
Provides unfinished code lines paired with preceding context for sequence generation.

Quality Score

D34

Description

Source

Reputation

Access

Community

176 downloads

6 likes

0 views

Dataset Info

Author: google
Created: Mar 2, 2022
Updated: Jan 24, 2024
Last synced: Jun 8, 2026
