Umair Z Ahmed, Kalika Bali, Monojit Choudhury, and Sowmya V. B
Back-transliteration based Input Method Editors are very popular for Indian Languages. In this paper we evaluate two such Indic language systems to help understand the challenge of designing a back-transliteration based IME. Through a detailed error-analysis of Hindi, Bangla and Telugu data, we study the role of phonological features of Indian scripts that are reflected as variations and ambiguity in the transliteration. The impact of word-origin on back-transliteration is discussed in the context of code-switching. We also explore the role of word-level context to help overcome some of these challenges.
|Published in|Proceedings of IJCNLP Workshop on Advances in Text Input Methods|
|Publisher|Association for Computational Linguistics|