From Pristine Tamil to Computational Tamil...: An Overview (கன்னித்தமிழிலிருந்து கணினித்தமிழுக்கு... : ஒரு கண்ணோட்டம்)

A Kumaran


Computational Linguistics deals with computational models for the analysis, synthesis, or transformation of text content expressed in natural languages, and forms the basis for most end-user technologies, such as language understanding, information retrieval and extraction, and machine translation. The critical need for such end-user technologies to process, efficiently and effectively, the exponentially growing natural language content on the Internet and social media underscores the need for significant research in Computational Linguistics in all the languages of the world. In this paper, we underscore the importance of computational tools and technologies for any language to truly tap into and leverage the Internet for information dissemination, assimilation and empowerment and, in essence, to be included in the Internet age. We introduce the statistical and machine-learning-based approaches used in most state-of-the-art computational models, and underscore the need for large, clean, annotated corpora and language resources for all types of Computational Linguistics research. In particular, we emphasize the need for corpora, basic tools and resources in Tamil, in order to ensure that the Tamil language is brought into the Internet age. It is imperative that every stakeholder (academia, industry, government and the community) come together to create a climate of consensus, coordination and collaboration to make sure that Tamil is taken successfully to the computational world.


Published in Proceedings of World Classical Tamil Conference, Coimbatore, India
