Name: Parallel Corpus of Huang Beijia's Novels Translated by Nicky Harman
Creator: He He
Published: 2026-06-02T17:38:42
License: CC-BY-4.0
Keywords: ZIP, Translator Style, Text, Translation Studies, Corpus Linguistics, Natural Language Processing, Chinese Literature, Children's Literature

Description

A bilingual corpus of 398,531 words supports a mixed-methods study of translator Nicky Harman's style in rendering Chinese children's literature for English readers. The dataset, created by He He and last updated in 2026, includes parallel and comparable texts from two contemporary Chinese novels translated in 2020 and 2023. Analysis focuses on lexical, syntactic, and textual choices, including the use of contracted forms and the translation of Chinese idioms.

Use Cases

Analyzing the translator's lexical style through keyword analysis and collocation patterns
Studying syntactic choices in translation, such as the use of contracted forms, alongside readability metrics
Investigating strategies for translating culture-specific elements, particularly Chinese four-character idioms
Examining multimodal interaction in translated children's texts, with attention to the young readership
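
As an illustration of the contracted-forms use case, the following is a minimal Python sketch that counts contracted forms per 1,000 word tokens in an English text. The regular expressions and the function name `contraction_rate` are assumptions made for this example and are not taken from the study. The pattern also matches possessive 's (as in "dog's"), so raw counts would need manual filtering or part-of-speech tagging before being reported.

```python
import re

# Illustrative pattern for English contracted forms ("don't", "she's", "we'll").
# This is an assumption for the sketch, not the study's own definition.
CONTRACTION_RE = re.compile(
    r"\b[A-Za-z]+(?:n't|'(?:s|re|ve|ll|d|m))\b", re.IGNORECASE
)
# A word token: letters, optionally followed by an apostrophe and more letters.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def contraction_rate(text: str) -> float:
    """Return the number of contracted forms per 1,000 word tokens."""
    # Normalise typographic apostrophes so both patterns see a plain "'".
    text = text.replace("\u2019", "'")
    words = WORD_RE.findall(text)
    if not words:
        return 0.0
    hits = CONTRACTION_RE.findall(text)
    return 1000 * len(hits) / len(words)


if __name__ == "__main__":
    sample = "I don't know if she's coming. We will wait."
    print(round(contraction_rate(sample), 1))
```

Normalising per 1,000 tokens keeps the figure comparable between texts of different lengths, for example between the two translated novels.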

Strengths

Corpus totals 398,531 words, enabling quantitative analysis
Includes both parallel and comparable text alignments for methodological flexibility
Covers two distinct novels rendered by a single named translator (Nicky Harman), which holds the translator constant across texts

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is not reported, which makes suitability harder to assess before download
The 15.1 MB size suggests a relatively small corpus for some statistical analyses

Provenance

Source: figshare
Collection Method: Self-built bilingual corpus compiled for a research study on translator's style.
Time Range: Source novels published 1996 and 2018; translations published 2020 and 2023.
Freshness: Last updated 2026-06-02 17:38:42; verify against the figshare record before use

Data is packaged as a ZIP archive; its internal file structure is not described.
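
Because the archive layout is undocumented, a first step after download could be to list its members and tally file extensions before extracting anything. The sketch below uses only the Python standard library. The function name `summarize_zip` and the archive filename in the usage line are hypothetical, since the actual filename is not given here.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_zip(path):
    """Return (file member names, count of files per extension) for a ZIP archive.

    `path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(path) as zf:
        # Skip directory entries; keep only real files.
        names = [info.filename for info in zf.infolist() if not info.is_dir()]
    # ZIP member names always use forward slashes, hence PurePosixPath.
    ext_counts = Counter(
        PurePosixPath(name).suffix.lower() or "(none)" for name in names
    )
    return names, ext_counts


if __name__ == "__main__":
    # "corpus.zip" is a placeholder; substitute the downloaded file's name.
    names, ext_counts = summarize_zip("corpus.zip")
    for ext, count in ext_counts.most_common():
        print(ext, count)
```

The extension tally gives a quick indication of whether the corpus is stored as plain text, spreadsheets, or an alignment format, which in turn determines how field semantics can be inferred.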
