A Study on Chinese-English Subtitle Translation of the Chinese Animation Ne Zha: The Devil Child Roars the Sea
1. Introduction
Subtitle translation is an important medium for cross-cultural communication. Its quality directly affects the viewing experience of domestic and foreign audiences and the effectiveness of cultural dissemination, and it plays a key role in helping audiences understand a work’s content and cultural background. As Generative Artificial Intelligence (GenAI) has evolved from early generative models to Large Language Models represented by ChatGPT (launched in 2022), the translation model has shifted from the traditional “Machine Translation + Post-Editing” (MTPE) to AI-driven Post-Editing (AIPE) [1]. Subsequently, domestic enterprises represented by DeepSeek (launched in 2025) have continued to promote the localization of GenAI technology, extending it to emerging fields such as Chinese animation subtitle translation [2]. Ne Zha: The Devil Child Roars the Sea (hereinafter referred to as Ne Zha 2), released in 2025, had grossed more than 15.44 billion yuan worldwide by June of the same year, ranking, as a non-English work, among the top five in global box office. The film integrates a large number of Taoist and mythological elements and is culturally representative. Moreover, because of its massive subtitle workload, tight schedule, and complex cultural background, the traditional MTPE model cannot meet its needs. How to use GenAI to improve the speed and quality of Chinese animation subtitle translation has therefore become an issue that urgently needs discussion.
Existing research on subtitle translation draws on translation theories such as Deconstructionism [3], the CEA framework [4], and Skopos Theory [5]. These studies primarily discuss how translation principles such as domestication, foreignization, functional equivalence, and multimodal compensation manifest in subtitles, and how they balance cultural transmission against audience acceptance. In the GenAI era, the application of the AIPE model to Chinese animation subtitle translation still needs further exploration. In the past, Chinese animation subtitles were mostly analyzed on the basis of speech theory, with a focus on the characteristics of their output. This paper takes Ne Zha 2 as its research object and discusses the advantages of the AIPE model in translating culture-loaded words, maintaining humor, improving translation efficiency, and ensuring translation quality. It addresses two questions: first, how different translation models convey the cultural connotations of the source language; second, what specific translation examples reveal when AI translation is combined with Chinese animation subtitle translation strategies.
2. Characteristics of the Chinese Animation Subtitle Translation under Different Models
Artificial intelligence technology is developing rapidly, and the AIPE translation model has emerged in response. To assess the innovation this model brings to Chinese animation subtitle translation, it is first necessary to clarify the functions and characteristics that translation models from earlier periods show when applied to such non-technical texts, so that the cultural connotations of Chinese animation can be better conveyed to viewers.
(i) To ensure the empirical validity of this study, a total of 1,200 subtitle lines from Ne Zha 2 were systematically analyzed. Examples were selected on three primary criteria: the presence of culture-loaded words (CLWs), mythological incantations, and character-defining humor. Examples were included if they presented a translation challenge that required cross-cultural negotiation and were excluded if they were purely functional dialogue (e.g., “Yes,” “No,” “Go”).
The evaluation of “quality” in this study is defined through four explicit dimensions: Accuracy (semantic fidelity), Fluency (naturalness in the target language), Cultural Retention (preservation of Taoist/mythological imagery), and Concision (adherence to subtitling space limits). Two independent translation experts performed the assessments. In cases of disagreement regarding the preferred translation, a third senior translator acted as an arbitrator to resolve the discrepancy using the Multidimensional Quality Metrics (MQM) framework adapted for audiovisual translation.
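The assessment procedure described above can be illustrated with a short sketch. It is not the study's actual scoring protocol: the 1 to 5 scale, the equal weighting of the four dimensions, and the names `Rating`, `preferred`, and `resolve` are all assumptions introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean

# The four quality dimensions named in the text.
DIMENSIONS = ("accuracy", "fluency", "cultural_retention", "concision")


@dataclass(frozen=True)
class Rating:
    """One rater's scores for one candidate translation (assumed 1-5 scale)."""
    accuracy: int
    fluency: int
    cultural_retention: int
    concision: int

    def overall(self) -> float:
        # Equal weighting of the four dimensions is an assumption.
        return mean(getattr(self, name) for name in DIMENSIONS)


def preferred(ratings: dict[str, Rating]) -> str:
    """Label of the candidate this rater scores highest (first one wins ties)."""
    return max(ratings, key=lambda label: ratings[label].overall())


def resolve(expert_a: dict[str, Rating],
            expert_b: dict[str, Rating],
            arbitrator: dict[str, Rating] | None = None) -> str:
    """Return the agreed preferred translation, or defer to the arbitrator."""
    choice_a, choice_b = preferred(expert_a), preferred(expert_b)
    if choice_a == choice_b:
        return choice_a
    if arbitrator is None:
        raise ValueError("experts disagree; an arbitrator rating is required")
    return preferred(arbitrator)
```

Each rater supplies a mapping such as `{"TT1": Rating(...), "TT2": Rating(...)}`. The third rater's scores are consulted only when the two experts prefer different candidates, mirroring the arbitration step.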
(ii) Inspirations and Reflections on Human Translation in the Chinese Animation Subtitle Translation
Human translation refers to translation completed manually by a translator, as distinct from machine translation. Translators need to understand the semantics, grammar, cultural background, and context of the original text and then express it accurately in the target language. This process emphasizes the overall function and communicative effect of the translation in the target culture, and it allows a comprehensive, in-depth analysis of the original text at the textual, linguistic, and cultural levels. Relying on their deep understanding of both source and target cultures, translators interpret the imagery in the source text and carry it into the target language through appropriate translation strategies and principles. For rich and subtle emotions, they can also draw on their own life experience and literary accomplishment to convey those emotions accurately, so that the target audience feels the emotional resonance of the original work.
In Chinese animation subtitle translation, human translators can flexibly adjust their strategies to the context and effectively maintain the formal characteristics of the original text, thereby conveying cultural connotations more accurately. However, because the film is long and its frames are dense, the volume of subtitle text is huge, and relying entirely on traditional human translation is time-consuming, laborious, and inefficient. Professional cultural fields such as Taoism and mythology add a further difficulty: even when translators possess the relevant knowledge and language ability, dividing the work among several translators easily leads to inconsistent terminology and uneven style. Therefore, to balance quality and efficiency, human translation needs to be combined with other translation models, which in turn promotes the innovation and optimization of practical methods.
(iii) Technical Innovation of the Chinese Animation Subtitle Translation under the CAT + MT + PE Model
Computer-Aided Translation (CAT), as a modern translation tool, helps translators improve efficiency and quality mainly through Translation Memories (TM) and Terminology Bases (TB). A translation memory stores previously translated segments and automatically recommends them when similar text recurs, thereby avoiding repetitive labor. CAT does not stifle the creativity of non-technical texts, fragment the text, or produce rigid translations. Instead, it provides translators with more convenient bilingual comparison and discourse coordination, leaving them more energy for the creative translation of the text [6]. Modern translation projects easily run to thousands or even tens of thousands of words, and the key to uniform translation quality lies in the consistency of terminology. CAT tools can standardize terminology and formats across different translators’ work. Given the volume of text involved, this applies not only to technical texts but also to non-technical texts [7].
In Chinese animation subtitle translation, machine translation can help translators efficiently complete the initial draft, especially for repetitive content and fixed expressions. When processing large numbers of simple dialogue lines without cultural particularity, translators can also rely on the translation memory to improve consistency and accuracy. However, machine translation shows obvious deficiencies with subtitles that carry deep cultural connotations and complex emotional expression. For a line such as “急急如律令”, for example, machine translation may fail to capture the cultural meaning it has in the Chinese context, so manual revision and editing are still required. Compared with purely human translation, the CAT + MT + PE model has greatly improved translation efficiency and quality. It still leaves room for improvement, however, which laid the foundation for the subsequent intelligent learning and development of AI.
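The translation-memory mechanism described above can be sketched as a fuzzy lookup over stored segments. This is a minimal illustration rather than the behavior of any particular CAT tool: the `TranslationMemory` class, the 0.75 similarity threshold, and the character-level matching via Python's standard `difflib.SequenceMatcher` are all assumptions. The stored pair is the post-edited line from Example 2 in Section 3; the query is a hypothetical one-character variant of that line.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple


class TranslationMemory:
    """Toy translation memory: stores segment pairs and suggests fuzzy matches."""

    def __init__(self, threshold: float = 0.75) -> None:
        # Minimum similarity (0.0 to 1.0) for a stored segment to be suggested;
        # the default value is an assumption, not an industry constant.
        self.threshold = threshold
        self.segments = {}

    def add(self, source: str, target: str) -> None:
        self.segments[source] = target

    def suggest(self, source: str) -> Optional[Tuple[str, str, float]]:
        """Return (stored source, stored target, similarity) or None."""
        best = None
        for stored_source, stored_target in self.segments.items():
            score = SequenceMatcher(None, source, stored_source).ratio()
            if score >= self.threshold and (best is None or score > best[2]):
                best = (stored_source, stored_target, score)
        return best


tm = TranslationMemory()
tm.add("这东海龙宫,岂是你说来就来?",
       "How dare you barge into the Donghai Dragon Palace so freely?")

# Hypothetical near-duplicate line: one character differs from the stored one.
hit = tm.suggest("这东海龙宫,岂是你想来就来?")
if hit is not None:
    print(f"{hit[2]:.0%} match, suggested translation: {hit[1]}")
```

A real translation memory would typically match on tokens and store metadata such as the translator and the scene, but the principle is the same: a near-duplicate line reuses an approved rendering instead of being translated again.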
(iv) Breakthroughs of Intelligent Agents for the Chinese Animation Subtitle Translation Model in the GenAI Era
Since the beginning of the 21st century, research has primarily revolved around the impact that successive stages of machine translation technology have had on translation models, with a particular focus on the MTPE model. Compared with MTPE, the AIPE model not only significantly improves translation efficiency but also marks a transition from simple “machine assistance” to deep “artificial intelligence collaboration,” which indicates that the translation industry is undergoing epoch-making change. The progression of translation models from human translation to CAT, then to MTPE, and then to AIPE has advanced the precision, accuracy, and speed of translation, thereby greatly improving translation efficiency and quality [8]. In concrete workflow terms, AIPE optimizes efficiency through 1) automated scene analysis to set the tone; 2) real-time terminology consistency checks during generation; and 3) hierarchical QA checkpoints at which humans intervene only for high-risk cultural metaphors. Preliminary tests indicate that AIPE can reduce the translation cycle for a 90-minute animation by approximately 40% compared with traditional MTPE. At the same time, although AI translation can stay consistent with the original text in syntactic structure, it lacks the subtle ideological distinctions and strategic use of language that human translation shows in specific contexts. AI translation therefore ultimately needs to be combined with manual post-editing to perfect the result [9].
Chinese animation subtitle translation requires translators to have not only a solid linguistic foundation but also deep cultural literacy and creativity. The combination of AI and PE is therefore particularly important when processing culture-loaded words, humorous expressions, and emotional content. Humorous expressions and lines with strong local color require translators to re-create and optimize the text so that the emotions and connotations of the subtitles come through. By giving the same instructions to two representative large language models, Doubao and DeepSeek, and comparing their output case by case, one can clearly see the similarities and differences between the translations they produce. The following are specific examples (where ST is the original text, TT1 is the Doubao translation, and TT2 is the DeepSeek translation):
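Steps 2) and 3) of the workflow above can be sketched as a simple triage pass over AI drafts. The sketch makes several assumptions: the term base holds renderings taken from the examples in this paper, the `HIGH_RISK` list is a hypothetical stand-in for whatever risk classification a real pipeline would use, and the names `SubtitleLine`, `term_issues`, and `triage` are illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Approved renderings, taken from the examples discussed in this paper.
TERMBASE = {
    "魔丸": "Demon Pill",
    "灵珠": "Spirit Pearl",
    "太乙真人": "Taiyi Zhenren",
}

# Hypothetical list of expressions treated as high-risk cultural metaphors.
HIGH_RISK = ("急急如律令", "留得青山在", "乾坤")


@dataclass
class SubtitleLine:
    source: str                      # original Chinese subtitle
    draft: str                       # AI-generated English draft
    issues: list[str] = field(default_factory=list)
    needs_human: bool = False


def term_issues(line: SubtitleLine) -> list[str]:
    """Terminology check: every term-base entry found in the source
    must appear in the draft with its approved rendering."""
    return [
        f"{src} should be rendered as {tgt!r}"
        for src, tgt in TERMBASE.items()
        if src in line.source and tgt.lower() not in line.draft.lower()
    ]


def triage(line: SubtitleLine) -> SubtitleLine:
    """QA checkpoint: record terminology issues for automatic correction and
    route only high-risk cultural expressions to a human post-editor."""
    line.issues = term_issues(line)
    line.needs_human = any(expr in line.source for expr in HIGH_RISK)
    return line
```

Under these assumptions, a line such as “急急如律令” would be flagged for human post-editing regardless of how fluent the draft looks, while a terminology slip in routine dialogue would only be logged for automatic correction.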
Example 1:
ST: 太好了,还剩一点点!留得青山在,不怕没柴烧。
TT1: As long as the green hills remain, there will be no shortage of wood.
TT2: The green hills stand—where life remains, hope’s roots run deep.
Translation 1 is faithful to the original text: “green hills” corresponds to “青山” and “wood” corresponds to “柴”. It suits scenes that call for the literal meaning. As a subtitle, however, it is less well adapted to the target culture and may not effectively convey the emotion the character expresses in the original. Translation 2 uses “life remains,” which is more philosophical and retains depth, and conveys the imagery of hope and vitality through “hope’s roots run deep.” This approach pays more attention to natural expression in the target language and can stay closer to the symbolic meaning of the original while remaining abstract. A rendering such as “as long as the mountains stand, there will always be hope” could better combine the symbolic meaning of “green hills” with the cultural connotation of “hope.”
Example 2:
ST: 若前方无路,我便踏出一条路!若天地不容,我便扭转这乾坤!
TT1: If there’s no path ahead, I’ll carve one out! If heaven and earth forbid it, I’ll turn this cosmos upside down!
TT2: Should paths ahead be barred, I’ll carve a way through steep. If heaven and earth defy, I’ll turn the tides they keep!
Translation 1 adopts literal translation to preserve the cultural heritage behind the line: “carve one out” corresponds to “踏出一条路” and “turn this cosmos upside down” corresponds to “扭转这乾坤”. Although literal translation preserves the boldness and determination of the original, “turn this cosmos upside down” may seem exaggerated in English and lacks fluency. Translation 2 transforms “无路” and “天地不容” into more idiomatic expressions, such as “carve a way through steep” and “turn the tides they keep,” conveying the imagery of determination and change and carrying the heroic tone of the original into the target language, but it also loses some of the cultural meaning of the source language.
Example 3:
ST: 我乃哪吒三太子,能降妖来会作诗。今日到此除奸恶,尔等妖魔快受死。
TT1: I am Prince Ne Zha, the Third. I can subdue demons and compose poems too. Today I come here to eliminate the wicked; you demons, quickly submit to your doom!
TT2: Behold, I am Prince Nezha, the Third Crown, A poet who can subdue a demon or fiend. I’m here to purge this evil place today, You hellspawn, prepare to be cleansed!
Translation 1 is straightforward and concise and conveys the original message clearly; for example, “到此除奸恶” is translated as “come here to eliminate the wicked.” Its overall style is slightly plain, however, and lacks a classical literary flavor. Literary devices such as rhyme or classical rhetoric could be added to give the translation a deeper grounding in Chinese culture. Translation 2 is more poetic, adopting rhyme and more vivid language, such as “hellspawn” and “purge.” This makes the translation more dramatic, fits Nezha’s confident and bold character, and makes the protagonist’s image more vivid. At the same time, poetry must be balanced against readability so that overly complex expressions do not hinder understanding.
From the above examples, it can be concluded that the translations produced by the two large language models each have their own characteristics. Translation 1 generally follows the principle of literal translation, matching source and target language one-to-one and converting each expression into its closest counterpart in the target language. This method is better at preserving the cultural core of the source language than at transmitting the connotations behind it, and it may lack idiomatic target-language expression, which can cause viewers to misunderstand the culture. Translation 2 adopts freer translation strategies and moves closer to target-language viewers. It mainly looks for corresponding target-language expressions at the level of meaning, helping viewers understand the cultural connotations of the source language and thereby better conveying the traditional Chinese culture contained in the film.
3. Case Analysis of the Chinese Animation Subtitle Translation Strategies
The complexity of Chinese animation subtitle translation lies not only in conversion at the linguistic level but also in cultural adaptation and the re-creation of information. Much of the content of Chinese animation carries strong Chinese cultural elements, and how to convey these elements accurately through subtitles while ensuring the understanding and resonance of the target audience is a key issue in the translation process. Chinese animation subtitles contain many culture-loaded words, that is, words, phrases, and idioms that are unique to a culture and carry the historical and cultural genes of a nation. Taking GenAI translations combined with manual post-editing as an example, the subtitle translation of Ne Zha 2 shows the following salient characteristics (where ST is the original text, TT1 is the DeepSeek translation, and TT2 is the manually post-edited version):
(i) Focusing on Literal Translation to Preserve Cultural Imagery
In the subtitle translation of Ne Zha 2, literal translation is widely applied, especially to words with Chinese cultural characteristics such as “哪吒” and “申公豹”, which are rendered as “Nezha” and “Shen Gongbao”. The literal translation method preserves the cultural imagery of the source language as far as possible, bringing target-language audiences into direct contact with Chinese cultural elements and preserving the linguistic charm of Chinese culture. At the same time, it may create obstacles to understanding. Viewers may not know, for example, the identity and origin of “Shen Gongbao”, his relationship with the protagonist, or how the characters relate to the progression of the plot; all of these need to be clarified through explanations in context or notes placed next to the characters.
Example 1:
ST: 不认命,就是哪吒的命。
TT1: Defying fate is Nezha’s very nature.
TT2: Not accepting fate—this is the essence of Nezha’s destiny.
Literal translation can ensure that the basic meaning is transmitted, but sometimes the cultural depth of the translation needs to be further enhanced. “哪吒的命” carries a traditional Chinese cultural background, in particular the struggle against fate and its philosophical relationship with fatalism. Replacing “nature” with “essence” and “destiny” helps target-language viewers understand the deeper cultural meaning of the character: Nezha’s fate is a destiny that cannot be changed, and his rebellion is not youthful ignorance but a more profound struggle with that fate. Manual post-editing can thus preserve the cultural imagery of the film and enhance its cultural connotation by using transliteration and expressions closer to the Chinese. The revision, although still a literal translation, preserves the multiple meanings of “命” and strengthens the philosophical sense of destiny through “essence,” highlighting Nezha’s unyielding personality. The negative construction “Not accepting fate” in the first half further reflects the protagonist’s resilient will and indirectly reflects the national spirit.
(ii) Transliteration Supplement to Preserve Cultural Charm
The literal translation method focuses on meaning and structure, while the transliteration method focuses on pronunciation. For concepts unique to Chinese culture, such as “太乙真人”, directly transliterated as “Taiyi Zhenren,” subtitle translation preserves the source-language pronunciation and thereby reflects the unique cultural charm of the vocabulary. This method is particularly effective when no corresponding word can be found in the target language, but it may increase the cognitive burden on the target-language audience. Transliteration therefore needs to be supported by notes next to the characters; the notes can use free translation in the target language to help viewers better understand the content.
Example 2:
ST: 这东海龙宫,岂是你说来就来?
TT1: How dare you barge into the East Sea Dragon Palace so freely?
TT2: How dare you barge into the Donghai Dragon Palace so freely?
For “东海” (East Sea), which has a strong background in traditional Chinese culture, transliteration helps to preserve the original cultural flavor. For place names, unique myths, and cultural symbols in particular, transliteration is often more appealing than literal or free translation and lets viewers better perceive the cultural connotations of the film. It also helps to create a distinctive cultural atmosphere and keeps viewers aware of the film’s cultural background. “东海龙宫” in particular is not only a place name but also a strong cultural symbol, and direct transliteration helps to maintain this layer of cultural charm. Post-editing therefore transliterates “东海” as “Donghai,” which directly conveys the Chinese geographical and cultural background to viewers, while “龙宫” keeps its literal translation, “Dragon Palace,” which lets foreign viewers grasp the actual meaning of the noun and conveys the fantastic color of Oriental mythology.
(iii) Free Translation as a Supplement to Simplify Cultural Connotations
The translation primarily adopts free translation, transforming culture-loaded words such as “魔丸” and “灵珠” into words that are easier to understand in the target language, such as “Demon Pill” and “Spirit Pearl.” In this phase, post-editors must consider core subtitling constraints such as segmentation and readability. Unlike literary prose, subtitles must be processed by the viewer in real time. AIPE decisions are therefore often shaped by the “6-second rule,” which limits the number of characters shown for the duration of a shot. Free translation simplifies the cultural connotation, lowers the difficulty of understanding, and yields a meaning that fits the target language better, thereby helping viewers understand what the film conveys. At the same time, this type of translation may sacrifice part of the cultural uniqueness, so that Chinese culture is transmitted and understood differently from the original.
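The subtitling constraints mentioned above can be made concrete with a small readability check. The figures here are assumptions, not parameters reported in this study: the sketch follows a common reading of the six-second rule, in which a full two-line subtitle of about 37 characters per line stays on screen for six seconds, giving a reading speed of roughly 12 characters per second. The function names and the two-second shot length are hypothetical; the sample lines are TT1 and TT2 from Example 4 below.

```python
import textwrap

MAX_CHARS_PER_LINE = 37        # assumed line length
MAX_LINES = 2                  # assumed maximum lines per subtitle
FULL_SUBTITLE_SECONDS = 6.0    # the "six seconds" of the rule

# Reading speed implied by the assumptions above (about 12.3 chars/second).
CHARS_PER_SECOND = MAX_CHARS_PER_LINE * MAX_LINES / FULL_SUBTITLE_SECONDS


def segment(text: str) -> list:
    """Break a subtitle into display lines at word boundaries."""
    return textwrap.wrap(text, width=MAX_CHARS_PER_LINE)


def min_duration(text: str) -> float:
    """Shortest time on screen (in seconds) that still allows comfortable reading."""
    return len(text) / CHARS_PER_SECOND


def fits(text: str, shot_seconds: float) -> bool:
    """True if the subtitle fits the line limit and can be read within the shot."""
    return len(segment(text)) <= MAX_LINES and min_duration(text) <= shot_seconds


for candidate in ("By divine decree, I strike with speed!", "By this swift edict!"):
    verdict = "fits" if fits(candidate, shot_seconds=2.0) else "too long"
    print(f"{candidate!r}: {len(segment(candidate))} line(s), "
          f"needs {min_duration(candidate):.1f}s, {verdict} for a 2.0s shot")
```

Under these assumptions the shorter post-edited line fits a two-second shot while the longer AI draft does not, which illustrates why concision can decide between two otherwise acceptable renderings.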
Example 3:
ST: 自诩照世明灯,干的却是恃强凌弱、祸乱人间的勾当,你们才是邪魔外道。
TT1: You call yourselves the Lamp that Illuminates the World, yet your deeds are those of darkness—bullying the weak and plaguing the mortal realm. You are the true evil.
TT2: You claim to be the Light of the World, but your actions are those of oppression and chaos. You are the true evil.
When free translation is emphasized, the focus is on conveying the meaning rather than being confined by the cultural background. Viewers may not know the specific meaning of “照世明灯,” a typical Chinese cultural symbol, so it can be rendered as “Light of the World.” This expresses the idea of “illuminating the world and bringing light to all of humanity,” carries a similar symbolic meaning in Western culture, and effectively conveys the original intention. The simplified sentence also avoids overly complex cultural background and rhetoric, making the translation easier to understand. To make the translation more concise and better adapted to the Western cultural context, “恃强凌弱、祸乱人间” can be condensed to “your actions are those of oppression and chaos.” In this way the translator ensures that the text does not clutter the screen and that viewers can keep their visual focus on the animation’s cinematography. The simplified version concentrates on evaluating the behavior, is concise, and is easier for target-language viewers to understand.
(iv) Image Assistance to Enhance Visual Understanding
The Chinese animation conveys the meaning of many culture-loaded words through visual images, such as “乾坤圈” and “混天绫,” which are translated as “the Universe Ring” and “Red Armillary Sash.” Specific object images give specific explanations to the words, and subtitle translation combines the image translation method to help target language audiences understand these cultural elements more intuitively. Image assistance, as one of the important ways for the Chinese animation to convey traditional Chinese cultural ideas, has strong visual effects, thereby reducing the language barriers between the source language and the target language and conveying the cultural connotations behind them (See Figure 1).
Source: https://mp.weixin.qq.com/s/dUPnPFECnw1fWDq61EZClA.
Figure 1. Example still from the Chinese animation Ne Zha: The Devil Child Roars the Sea.
(v) Borrowed Expressions for Cross-Cultural Fusion
When translating certain culture-loaded words, subtitle translation borrows expressions that already exist in the target language, such as translating “天劫” as “Heavenly Tribulation.” Another example is the incantations that appear in the film, such as “急急如律令,” which can be understood by analogy with the passwords or incantations found in Western fantasy works. This method achieves cross-cultural fusion, making the translation easier for the target-language audience to accept and enhancing the transmission of culture-loaded words, but it also weakens the cultural characteristics of the source language.
Example 4:
ST: 急急如律令
TT1: By divine decree, I strike with speed!
TT2: By this swift edict!
The AI translation, “By divine decree, I strike with speed!”, reads smoothly, but it does not effectively convey the cultural connotation of “急急如律令.” The phrase has a profound background in traditional Chinese culture, where it is often used to convey the seriousness and formality of a command. The length and rhythm of the subtitle must also be considered: it should be short and forceful. Manual editing therefore renders “急急” as “swift” (or “urgent”) and keeps the “decree/edict” imagery of “如律令,” making the line short and solemn and easier to synchronize with sound and image. “Edict” gives a classical tone, while “decree” is more neutral. “Divine” and “I strike” are left out so as not to change the original meaning. The incantations of the British classic Harry Potter series, such as “Expecto Patronum,” “Stupefy,” and “Avada Kedavra,” are vivid and fit the usage habits of the target language, and can likewise serve as a reference.
Generally speaking, for literary, non-technical texts of this kind, it is necessary to make full use of AI to improve translation efficiency and to combine it with manual post-editing to check the quality of the output, thereby ensuring the accuracy and cultural adaptability of the translation. In contrast to earlier human translation, the MTPE model pays more attention to the direct conversion and output of text, while the AIPE model builds on it by using the learning capability of large models. Because literary texts contain little repetition, handling culture-loaded words well and improving the consistency and precision of the translation requires attention, at each translation step, to terminology management in CAT, to the division of focus between AI and PE, and to the priority of each in translation and editing.
4. Conclusion
GenAI has brought Chinese animation subtitle translation an opportunity for technological innovation and efficiency improvement. This paper takes the Chinese animation Ne Zha: The Devil Child Roars the Sea as an example to discuss the application of the Artificial Intelligence Translation + Post-Editing (AIPE) model in subtitle translation. The analysis shows that, across translation models, combining manual review with current Generative Artificial Intelligence can convey the cultural connotations of the source language accurately and efficiently. In addition, the translation examples show that different translation methods and strategies can be combined to resolve, as far as possible, translation difficulties such as culture-loaded words. Future research can explore deeper integration of AI and human translation and develop more intelligent post-editing tools to deal with the challenges of contextual differences and linguistic habits.