Resource Creation for Training and Testing of Transliteration Systems for Indian Languages

Transliteration is the process of writing the text of one language in the script of another while preserving the sound of the text as far as possible (Knight and Graehl, 1998). Transliteration can be classified into two types: forward and backward. Forward transliteration represents a word (in our context, an Indian-language word) in a non-native script (in this case, the Roman script). For example, the Roman string “Sachin” might be generated by forward transliteration from the original Hindi word “सचिन”, which is written in the Devanagari script. Backward transliteration is the reverse process, which recovers the native-script representation from the transliterated word. Thus, backward transliteration will generate the Devanagari string “सचिन” from the Roman string “Sachin”.

In  Proceedings of the Language Resource and Evaluation Conference (LREC) 2010

Publisher  European Language Resources Association
Printed / Distributed with the permission of ELRA. This paper was published within the proceedings of the LREC'2010 Conference. © 2007 ELRA - European Language Resources Association. All rights reserved.

Details

Type  Inproceedings
URL  http://www.lrec-conf.org/proceedings/lrec2010/pdf/182_Paper.pdf
Pages  2902-2907