WikiBABEL

What is the WikiBABEL Project about?

 

The WikiBABEL project explores the community-driven, collaborative creation of linguistic data for research by specific language communities.

   

 

Approach

Our current focus is on collecting parallel data, which is vitally needed for Machine Translation research, using the largest community participatory site, Wikipedia. WikiBABEL exploits the uneven coverage of information across Wikipedia's language editions to provide rough initial content in a given target language, which the community can then correct to create high-quality content for the target-language Wikipedia. Given the large disparities in content between different Wikipedias, and the aspiration of many Wikipedia communities to improve their presence, there may be sufficient interest in using such a methodology to create new content.

 

As shown in the figure above, WikiBABEL sits as a thin, transparent edit layer (WikiBABEL CORE) on top of any wiki site, in particular Wikipedia. This layer integrates cloud-based services for the discovery, linguistic, and collaborative features that WikiBABEL supports. Modules may be designed for particular wiki systems, such as Wikipedia.
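To make the data-collection idea concrete, here is a minimal sketch of how community corrections to rough initial content could be turned into parallel sentence pairs. It is an illustration only, not the WikiBABEL implementation: the names `SegmentEdit`, `parallel_pairs`, and `was_corrected` are hypothetical, and the sketch assumes content is handled one sentence at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentEdit:
    """One sentence in a hypothetical correction workflow (not a WikiBABEL type)."""
    source: str       # sentence from the source-language article
    rough_draft: str  # rough initial target-language content shown to the contributor
    corrected: str    # contributor's final target-language sentence


def parallel_pairs(edits):
    """Return (source, corrected) pairs, skipping segments with an empty side."""
    pairs = []
    for edit in edits:
        src = edit.source.strip()
        tgt = edit.corrected.strip()
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def was_corrected(edit):
    """True if the contributor changed the rough draft before saving."""
    return edit.rough_draft.strip() != edit.corrected.strip()
```

Under these assumptions, each saved correction yields one source/target pair, and comparing the rough draft with the corrected text shows which segments the community actually changed.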

 

Current Status

Among the first deployments of WikiBABEL is MSDNwiki, a Microsoft site that hosts user-generated information for Microsoft developer communities, where it is used to create content for specific demographics.

 

WikiBhasha beta is released as an open-source MediaWiki extension at http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/WikiBhasha/, with the JavaScript files under the Apache 2 license and the PHP files under the GPL2 license.

 

For immediate use, WikiBhasha beta is available as an installable bookmarklet from the WikiBhasha homepage, and as a user script (WikiBhasha) from the WikiBhasha.MSR user.

 

People
Naren Datha

Ashwani Sharma

Joseph Joy

Sridhar Vedantham

Publications