In a paper published on January 2, 2024, Google researchers Jiaming Luo, Colin Cherry, and George Foster compared morphosyntactic divergence in machine translation (MT) with that in human translation (HT) and found that MT tends to be more “conservative” than HT.


The authors explained that translation divergences occur when a translation differs structurally from its source sentence, whether because of inherent cross-lingual differences or the idiosyncratic preferences of translators. These divergences arise naturally in the translation process and can be readily found in human translations, including those used to train MT systems, they said.


They also highlighted that the existence of these divergences in HT has long been regarded as a key challenge for MT, and that recent studies have demonstrated how abundant translation divergences are in HT.


To that end, they conducted experiments to assess how MT and HT differ in terms of morphosyntactic divergence, understand the source of this difference, and explore how translation divergences in HT affect MT quality. The experiments were conducted in three language pairs: English-French, English-German, and English-Chinese using WMT datasets.



Conservative Machine 



The results revealed that MT is more “conservative” than HT, exhibiting less morphosyntactic diversity, more convergent patterns, and more one-to-one alignments. They also observed that MT tends to be less similar to HT when the source has less common structures. 


The authors attributed this discrepancy to the use of beam search, which biases MT towards more convergent patterns. This bias is most prominent when convergent patterns appear around 50% of the time in the training data. “This could be because the model has seen the pattern enough to assign it substantial probability mass, but there is still enough uncertainty that humans will frequently choose other patterns,” said the authors.
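The intuition behind that explanation can be illustrated with a toy sketch. It is not the authors' experimental setup: the pattern names and probabilities below are hypothetical, and beam search is idealized as always returning the single most probable pattern, while human choices are modeled as samples from the same distribution.

```python
import random
from collections import Counter

# Hypothetical distribution over structural patterns for one source
# construction; the convergent pattern holds about half the probability mass.
PATTERN_PROBS = {"convergent": 0.5, "divergent_a": 0.3, "divergent_b": 0.2}


def mode_seeking_choice(probs):
    """Return the most probable pattern (an idealized stand-in for beam search)."""
    return max(probs, key=probs.get)


def sampled_choice(probs, rng):
    """Draw one pattern in proportion to its probability (a stand-in for human variation)."""
    patterns = list(probs)
    weights = [probs[p] for p in patterns]
    return rng.choices(patterns, weights=weights, k=1)[0]


def pattern_frequencies(chooser, n):
    """Call chooser() n times and return the relative frequency of each pattern."""
    counts = Counter(chooser() for _ in range(n))
    return {p: counts[p] / n for p in PATTERN_PROBS}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10_000
mode_freqs = pattern_frequencies(lambda: mode_seeking_choice(PATTERN_PROBS), n)
sample_freqs = pattern_frequencies(lambda: sampled_choice(PATTERN_PROBS, rng), n)

print("mode-seeking:", mode_freqs)
print("sampled:     ", sample_freqs)
```

Under these assumptions, mode-seeking decoding selects the convergent pattern every time, whereas sampling selects it only about half the time. Real beam search operates over token sequences and is only approximated by this picture.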


Moreover, frequencies of convergent patterns in MT increase even when they are uncommon in HT, suggesting perhaps a more inherent structural bias in current MT architectures. 


Lastly, they investigated how the presence of morphosyntactic divergence in HT might affect MT quality and found that, for a majority of morphosyntactic divergences, their presence in HT is correlated with decreased MT performance, presenting a greater challenge for MT systems.


The authors emphasized that “this is the first work to present the comparative perspective of HT vs MT in such fine granularity covering thousands of morphosyntactic constructions,” and expressed interest in applying the same analysis to large language model (LLM)-based MT systems to see whether and how LLM translations differ from those produced by traditional MT models.


