Asymmetric Features of Human Generated Translation

  • Sauleh Eetemadi ,
  • Kristina Toutanova

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing |

Published by Association for Computational Linguistics

Distinct properties of translated text have been the subject of research in linguistics for many year (Baker, 1993). In recent years computational methods have been developed to empirically verify the linguistic theories about translated text (Baroni and Bernardini, 2006). While many characteristics of translated text are more apparent in comparison to the original text, most of the prior research has focused on monolingual features of translated and original text. The contribution of this work is introducing bilingual features that are capable of explaining differences in translation direction using localized linguistic phenomena at the phrase or sentence level, rather than using monolingual statistics at the document level. We show that these bilingual features outperform the monolingual features used in prior work (Kurokawa et al., 2009) for the task of classifying translation direction.