Resource Creation for Training and Testing of Transliteration Systems for Indian Languages
- Sowmya V. B
- Monojit Choudhury
- Kalika Bali
- Tirthankar Dasgupta
- Anupam Basu
Proceedings of the Language Resources and Evaluation Conference (LREC) 2010
Published by European Language Resources Association
Transliteration refers to the process of writing the text of one language using the script of another language such that the sound of the text is preserved as far as possible (Knight and Graehl, 1998). Transliteration can be classified into two types: forward and backward. Forward transliteration represents a word (in our context, an Indian language word) in a non-native script (in this case, the Roman script). For example, the Roman string “Sachin” might be generated by forward transliteration from the original Hindi word “सचिन”, which is in the Devanagari script. Backward transliteration, on the other hand, is the reverse process, which recovers the native script representation from the transliterated word. Thus, backward transliteration will generate the Devanagari string “सचिन” from the Roman string “Sachin”.
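The two directions can be illustrated with a minimal sketch. It assumes a toy lookup table holding only the “Sachin” pair from the example above; the names `FORWARD_LEXICON`, `forward_transliterate`, and `back_transliterate` are hypothetical and are not taken from the paper, whose systems are not described here.

```python
# Toy lexicon (hypothetical data): native-script word -> one Roman spelling.
FORWARD_LEXICON = {
    "सचिन": "Sachin",
}

# Inverse table for backward transliteration. Keys are lowercased so that
# the lookup ignores the capitalization of the Roman input.
BACKWARD_LEXICON = {
    roman.lower(): native for native, roman in FORWARD_LEXICON.items()
}


def forward_transliterate(native_word):
    """Return a Roman spelling for a native-script word, or None if unknown."""
    return FORWARD_LEXICON.get(native_word)


def back_transliterate(roman_word):
    """Return the native-script word for a Roman spelling, or None if unknown."""
    return BACKWARD_LEXICON.get(roman_word.lower())


if __name__ == "__main__":
    print(forward_transliterate("सचिन"))
    print(back_transliterate("Sachin"))
```

A table lookup only shows that the two directions are inverses of each other on known words. It says nothing about unseen words or about the several Roman spellings a single native word may have.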
Printed / Distributed with the permission of ELRA. This paper was published within the proceedings of the LREC'2010 Conference. © 2007 ELRA - European Language Resources Association. All rights reserved.