Japanese Pronunciation Prediction as Phrasal Statistical Machine Translation

Jun Hatori and Hisami Suzuki


This paper addresses the problem of predicting the pronunciation of Japanese text.

The difficulty of this task lies in the high degree of ambiguity in the pronunciation of Japanese characters and words. Previous approaches have either treated the task as a dictionary-based, word-level classification problem, which handles out-of-vocabulary (OOV) words poorly, or focused solely on predicting the pronunciation of OOV words without considering the contextual disambiguation of word pronunciations in text. In this paper, we propose a unified approach within the framework of phrasal statistical machine translation (SMT) that combines the strengths of the dictionary-based and substring-based approaches. Our approach is novel in that we combine word- and character-based pronunciations from a dictionary within an SMT framework: the former captures the idiosyncratic properties of word pronunciation, while the latter provides the flexibility to predict the pronunciation of OOV words. An extensive evaluation on various test sets shows that our model significantly outperforms the previous state-of-the-art systems, achieving around 90% accuracy in most domains.
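As a rough illustration of the idea of mixing word-level and character-level dictionary entries in one phrase table, the sketch below runs a monotone best-path search over a toy table. It is not the system described in the paper: the table entries, the probabilities, and the names `PHRASE_TABLE` and `predict` are all assumptions made for illustration, and a real phrasal SMT decoder would add further features such as a language model over the output.

```python
import math

# Hypothetical toy phrase table: source substring -> [(reading, log-probability)].
# All entries and probabilities are invented for illustration only.
PHRASE_TABLE = {
    # Word-level entry: captures an idiosyncratic whole-word reading.
    "今日": [("きょう", math.log(0.9))],
    # Character-level entries: allow unseen words to be composed.
    "今": [("こん", math.log(0.6)), ("いま", math.log(0.4))],
    "日": [("にち", math.log(0.6)), ("ひ", math.log(0.4))],
    "週": [("しゅう", math.log(1.0))],
}


def predict(text, max_len=4):
    """Return the highest-scoring reading of `text`, or None if no
    segmentation is covered by the table (monotone, no reordering)."""
    # best[i] holds (score, reading) for the best analysis of text[:i].
    best = [None] * (len(text) + 1)
    best[0] = (0.0, "")
    for i in range(len(text)):
        if best[i] is None:
            continue  # position i is unreachable
        for j in range(i + 1, min(len(text), i + max_len) + 1):
            for reading, logp in PHRASE_TABLE.get(text[i:j], []):
                cand = (best[i][0] + logp, best[i][1] + reading)
                if best[j] is None or cand[0] > best[j][0]:
                    best[j] = cand
    return best[-1][1] if best[-1] else None
```

With this toy table, `predict("今日")` selects the word-level reading きょう, because its score is higher than any combination of the character readings, while `predict("今週")`, which has no word-level entry, falls back to the character-level entries and yields こんしゅう.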


Publication type: Inproceedings
Published in: Proceedings of the 5th International Joint Conference on Natural Language Processing
Publisher: Asia Federation of Natural Language Processing