Machine Translation Is More ‘Conservative’ Than Human Translation, Google Says
01
In a [paper](https://arxiv.org/pdf/2401.01419.pdf) published on January 2, 2024, Google researchers Jiaming Luo, Colin Cherry, and George Foster compared morphosyntactic divergence in machine translation (MT) with that in human translation (HT) and found that MT tends to be more “conservative” than HT.
The authors explained that translation divergences occur when translations differ structurally from their source sentences, whether because of inherent cross-lingual differences or the idiosyncratic preferences of translators. These divergences arise naturally in the translation process and can be readily found in human translations, including those used for training MT systems, they said.
They also highlighted that the existence of these divergences in HT has long been regarded as a key challenge for MT, and that recent studies have demonstrated the abundance of translation divergences in HT.
To examine this, they conducted experiments to assess how MT and HT differ in terms of morphosyntactic divergence, understand the source of this difference, and explore how translation divergences in HT affect MT quality. The experiments covered three language pairs (English-French, English-German, and English-Chinese) and used WMT datasets.
02
Conservative Machine Translation
The results revealed that MT is more “conservative” than HT, exhibiting less morphosyntactic diversity, more convergent patterns, and more one-to-one alignments. They also observed that MT tends to be less similar to HT when the source has less common structures.
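To make "one-to-one alignments" concrete, here is a minimal Python sketch. It is an illustration only, not the metric used in the paper: the function name `one_to_one_share` and the two toy alignments are hypothetical, and a word alignment is assumed to be a list of `(source_index, target_index)` links.

```python
from collections import Counter


def one_to_one_share(alignment):
    """Return the fraction of links whose source token and target token
    each take part in exactly one link (an assumed, simplified definition)."""
    if not alignment:
        return 0.0
    src_counts = Counter(s for s, _ in alignment)
    tgt_counts = Counter(t for _, t in alignment)
    one_to_one = sum(
        1 for s, t in alignment
        if src_counts[s] == 1 and tgt_counts[t] == 1
    )
    return one_to_one / len(alignment)


# Hypothetical alignments: a word-for-word rendering and a freer one.
literal = [(0, 0), (1, 1), (2, 2), (3, 3)]
freer = [(0, 0), (1, 1), (1, 2), (2, 4), (3, 4)]

print(one_to_one_share(literal))
print(one_to_one_share(freer))
```

Under this simplified definition, a translation that splits or merges words relative to the source scores lower, which is the sense in which a more "conservative" output has more one-to-one alignments.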
The authors attributed this discrepancy to the use of beam search, which biases MT towards more convergent patterns. This bias is most prominent when convergent patterns appear frequently, around 50% of the time, in the training data. “This could be because the model has seen the pattern enough to assign it substantial probability mass, but there is still enough uncertainty that humans will frequently choose other patterns,” said the authors.
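The intuition behind the 50% observation can be sketched with a toy simulation. This is not the authors' experiment. It assumes that the model's probability for a convergent pattern equals the pattern's frequency in the training data, and it stands in for beam search with a decoder that always returns the more probable of two options. The function `pattern_rate` and all numbers are hypothetical.

```python
import random


def pattern_rate(p_convergent, n_sentences, mode_seeking, seed=0):
    """Return how often the convergent pattern is chosen over n_sentences.

    mode_seeking=True  -> always pick the more probable pattern
                          (a crude stand-in for beam search).
    mode_seeking=False -> sample the pattern with probability p_convergent
                          (a crude stand-in for varied human choices).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sentences):
        if mode_seeking:
            chosen = p_convergent > 0.5
        else:
            chosen = rng.random() < p_convergent
        hits += chosen
    return hits / n_sentences


for p in (0.1, 0.55, 0.9):
    sampled = pattern_rate(p, 10_000, mode_seeking=False)
    decoded = pattern_rate(p, 10_000, mode_seeking=True)
    print(f"p={p:.2f}  sampled={sampled:.2f}  mode-seeking={decoded:.2f}")
```

In this toy setup, the gap between the mode-seeking rate and the sampled rate is widest when the pattern's probability is just above 0.5, and it shrinks as the probability approaches 1. The sketch does not reproduce the increase reported for patterns that are uncommon in HT.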
Moreover, frequencies of convergent patterns in MT increase even when they are uncommon in HT, suggesting perhaps a more inherent structural bias in current MT architectures.
Lastly, they investigated how the presence of morphosyntactic divergence in HT might affect MT quality and found that, for a majority of morphosyntactic divergences, their presence in HT is correlated with decreased MT performance, presenting a greater challenge for MT systems.
The authors emphasized that “this is the first work to present the comparative perspective of HT vs MT in such fine granularity covering thousands of morphosyntactic constructions,” and expressed interest in applying the same analysis to large language model (LLM)-based MT systems to see whether and how LLM translations differ from those produced by traditional MT models.