DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

IfGPT: Bulgarian Language Data for LLM Fine-tuning | DataSalon

NLP & Text

Name: IfGPT: Bulgarian Language Data for LLM Fine-tuning
Creator: DCL-IBL
Published: 2026-05-30T14:27:39
Keywords: Bulgarian Language, Text, Language Model, Fine Tuning, Open Data, Text Processing

Available on 1 platform

Description

The IfGPT Dataset is developed within the project IfGPT: Infrastructure for Fine-tuning Pre-trained Large Language Models. It aims to establish a freely accessible infrastructure for the selection and pre-processing of large datasets for Bulgarian as well as tailored data for particular industries. The dataset is authored by DCL-IBL and was last updated on Hugging Face in June 2026.

Use Cases

Fine-tuning large language models on Bulgarian-language data
Pre-processing text for industry-specific applications, drawing on the tailored data the description mentions
Building dataset-selection infrastructure for NLP tasks, in line with the project's stated objectives

Strengths

Dataset is part of a project with a clear objective to establish infrastructure for LLM fine-tuning
Focuses on Bulgarian language data, which may be a less common resource
Last updated date is explicitly provided: 2026-06-03

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which may limit suitability assessment
Description metadata is limited; actual data quality requires manual inspection after download
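
Because column documentation and row count are both missing, a first pass after download could simply count records and list the value types seen per field. The sketch below assumes the files are JSON Lines (one JSON object per line), which this page does not confirm; the function name `summarize_records` and the sample rows are hypothetical and stand in for the real, undocumented fields.

```python
import json
from collections import defaultdict


def summarize_records(lines):
    """Count rows and collect the value types seen for each field.

    `lines` is any iterable of JSON Lines strings (one JSON object per line).
    The JSON Lines format is an assumption, not something the listing states.
    """
    row_count = 0
    field_types = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        row_count += 1
        for field, value in record.items():
            field_types[field].add(type(value).__name__)
    # Sort the type names so the summary is stable between runs.
    return row_count, {name: sorted(types) for name, types in field_types.items()}


# Hypothetical sample rows; the real field names are not documented.
sample = [
    '{"id": 1, "text": "Добър ден"}',
    '{"id": 2, "text": "Здравей", "source": null}',
]
rows, schema = summarize_records(sample)
print(rows)
print(schema)
```

To inspect a downloaded file, an open file handle can be passed in place of `sample`, since iterating over a text file yields one line at a time.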

Provenance

Source: DCL-IBL
Freshness: Last updated 2026-06-03 10:36:29; freshness should be verified
Geography: Bulgarian language focus suggests primary geographic relevance to Bulgaria
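
One way to act on the freshness note is to compute the dataset's age from the listed update timestamp. This is a minimal sketch: the helper names `age_in_days` and `is_stale` are hypothetical, the 90-day threshold is an arbitrary assumption, and the comparison date used here is the page's "Last synced" date (Jun 9, 2026) rather than today's date.

```python
from datetime import date, datetime


def age_in_days(last_updated: str, as_of: date) -> int:
    """Return whole days between the update timestamp and the reference date."""
    updated = datetime.strptime(last_updated, "%Y-%m-%d %H:%M:%S").date()
    return (as_of - updated).days


def is_stale(last_updated: str, as_of: date, max_age_days: int = 90) -> bool:
    """Flag the dataset as stale when its age exceeds the (assumed) threshold."""
    return age_in_days(last_updated, as_of) > max_age_days


# Timestamp taken from the Freshness line; reference date from "Last synced".
days = age_in_days("2026-06-03 10:36:29", date(2026, 6, 9))
stale = is_stale("2026-06-03 10:36:29", date(2026, 6, 9))
print(days, stale)
```

In practice `as_of` would be `date.today()`, and the timestamp would be re-read from the source platform rather than from this listing.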

Quality Score

D37

Description

Source

Reputation

Access

Community

15 downloads

1 like

0 views

Dataset Info

Author: DCL-IBL
Created: May 30, 2026
Updated: Jun 3, 2026
Last synced: Jun 9, 2026
