Description
5,287 Chinese question-answer pairs form a training set for large language models in mining engineering. The dataset, created by a Hefei University of Technology student project, covers six modules including laws, specifications, and safety cases. An enhanced version includes chain-of-thought annotations to improve model reasoning.
Use Cases
Fine-tune language models on Chinese mining engineering concepts using the 'concept' module.
Evaluate model understanding of industrial safety regulations using the 'law' and 'safety' modules.
Benchmark model performance on domain-specific reasoning tasks using the chain-of-thought enhanced dataset.
Train models to generate technical specifications using the 'specifications' module.
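The module-based use cases above all start by selecting one module's records and reshaping them into training pairs. The following is a minimal sketch of that step, not the dataset's documented layout: the field names `module`, `question`, and `answer`, and the sample records, are assumptions for illustration, since the actual column names must be checked after download.

```python
def select_module(records, module, module_key="module"):
    """Return the records whose module field equals `module`.

    `module_key` is a hypothetical field name; adjust it to the real schema.
    """
    return [rec for rec in records if rec.get(module_key) == module]


def to_sft_pair(record, question_key="question", answer_key="answer"):
    """Map one record to a prompt/completion pair for supervised fine-tuning.

    `question_key` and `answer_key` are hypothetical field names.
    """
    return {"prompt": record[question_key], "completion": record[answer_key]}


# Made-up placeholder records, only to show the shape of the data flow.
records = [
    {"module": "concept", "question": "Q1", "answer": "A1"},
    {"module": "law", "question": "Q2", "answer": "A2"},
    {"module": "concept", "question": "Q3", "answer": "A3"},
]

concept_pairs = [to_sft_pair(rec) for rec in select_module(records, "concept")]
```

Keeping the field names as parameters means the same helpers still apply once the real column names are known.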
Strengths
Contains 5,287 question-answer pairs suitable for supervised fine-tuning.
Covers six distinct knowledge modules: laws, specifications, concepts, safety cases, forum experience, and synthesis.
Offers a separate chain-of-thought enhanced version designed to improve logical reasoning.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count for the evaluation set is unknown, which makes its suitability for benchmarking hard to assess.
Data may reflect the specific focus and source bias inherent to its academic project origin.
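Because column-level documentation is absent, a first pass after download is usually to list which fields appear, how often, and with what value types. The sketch below assumes the data has already been loaded as a list of dictionaries; the sample records and their keys are hypothetical and stand in for the real rows.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Summarize the keys found in a list of dict records.

    For each key, report how many records contain it, the value types
    observed, and the first value seen as a sample.
    """
    counts = Counter()
    types = defaultdict(set)
    samples = {}
    for rec in records:
        for key, value in rec.items():
            counts[key] += 1
            types[key].add(type(value).__name__)
            samples.setdefault(key, value)
    return {
        key: {
            "present_in": counts[key],
            "types": sorted(types[key]),
            "sample": samples[key],
        }
        for key in counts
    }


# Hypothetical records; the real field names are not documented.
sample_records = [
    {"question": "Q1", "answer": "A1", "module": "law"},
    {"question": "Q2", "answer": "A2"},
]

field_summary = summarize_fields(sample_records)
for name, info in field_summary.items():
    print(name, info)
```

Fields that appear in only some records, or with mixed types, are the ones whose semantics most need manual inspection.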
Provenance
Source
Hefei University of Technology student innovation project (author: acnul).
Collection Method
Constructed as part of a university-level innovation training program.
Freshness
Last updated 2025-07-24 00:02:53; check the source for later revisions.
License is unknown; verify the terms of use before using the data.