Parallel Corpus of Huang Beijia's Novels Translated by Nicky Harman
by He He · Updated 5d ago
15.1 MB · 1 file
Available on 1 platform
Description
A bilingual corpus of 398,531 words supports a mixed-methods study of translator Nicky Harman's style in rendering Chinese children's literature for English readers. The dataset, created by He He and last updated in 2026, includes parallel and comparable texts from two contemporary Chinese novels translated in 2020 and 2023. Analysis focuses on lexical, syntactic, and textual choices, including the use of contracted forms and the translation of Chinese idioms.
Use Cases
Analyzing the translator's lexical style through keyword analysis and collocation patterns
Studying syntactic choices in translation, such as the use of contracted forms, alongside readability metrics
Investigating strategies for translating culture-specific elements, such as Chinese four-character idioms
Examining multimodal interaction in translated children's texts, with attention to the young readership
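As an illustration of the contracted-forms use case, the following is a minimal sketch of how contraction frequency per 1,000 words might be measured in an English target text. The function name `contraction_rate` and the regular expressions are assumptions made for this example and are not part of the dataset or the original study. The pattern does not separate possessive 's from contracted "is" or "has", so counts would need manual checking.

```python
import re

# Hypothetical pattern for common English contracted forms
# ("don't", "she's", "we'll", "I'm", ...). Assumption: possessive 's
# is NOT distinguished from contracted is/has, so it is counted too.
CONTRACTION = re.compile(r"\b\w+(?:n't|'(?:s|re|ve|ll|d|m))\b", re.IGNORECASE)

# Rough word tokenizer that keeps internal apostrophes together.
WORD = re.compile(r"\b[\w']+\b")


def contraction_rate(text):
    """Return (contractions per 1,000 words, list of matched forms)."""
    # Normalize typographic apostrophes so both styles are matched.
    text = text.replace("\u2019", "'")
    words = WORD.findall(text)
    hits = CONTRACTION.findall(text)
    if not words:
        return 0.0, hits
    return 1000 * len(hits) / len(words), hits
```

Calling `contraction_rate` on each translated chapter and on a comparable reference text would give normalized figures that can be compared across texts of different lengths.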
Strengths
Corpus totals 398,531 words, enabling quantitative analysis
Includes both parallel and comparable text alignments for methodological flexibility
Focuses on two distinct novels translated by a single, named translator (Nicky Harman) for stylistic consistency
Limitations
Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which may limit suitability assessment
At 398,531 words (15.1 MB), the corpus may be too small for some statistical analyses
Provenance
Source
figshare
Collection Method
Self-built bilingual corpus compiled for a research study on translator's style.
Time Range
Source novels published 1996 and 2018; translations published 2020 and 2023.
Freshness
Last updated 2026-06-02 17:38:42; freshness should be verified
Data is packaged as a single ZIP archive; its internal file structure is not described.
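Because the archive's contents are undocumented, a first step after download is usually to list what it contains before extracting anything. The sketch below uses only Python's standard `zipfile` module. The function name `summarize_archive` and the file name `corpus.zip` are hypothetical, since the listing does not give the actual archive name.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(path):
    """List files in a ZIP archive and count them by extension.

    `path` may be a file path or a binary file-like object.
    Nothing is extracted to disk.
    """
    with zipfile.ZipFile(path) as zf:
        # Skip directory entries; keep (name, uncompressed size in bytes).
        entries = [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if not info.is_dir()
        ]
    # ZIP member names always use forward slashes, hence PurePosixPath.
    ext_counts = Counter(
        PurePosixPath(name).suffix.lower() or "(none)" for name, _ in entries
    )
    return entries, ext_counts


# Example usage (hypothetical file name):
# entries, ext_counts = summarize_archive("corpus.zip")
# for name, size in entries:
#     print(f"{size:>10}  {name}")
# print(ext_counts)
```

The extension counts give a quick indication of whether the corpus is stored as plain text, spreadsheets, or another format, which helps in choosing tools for inferring the field semantics.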