Ryan Georgi, Fei Xia, and William Lewis
Recent studies have shown the potential benefits of leveraging resources for resource-rich languages to build tools for similar but resource-poor languages. We examine what constitutes “similarity” by comparing traditional phylogenetic language groups, which are motivated largely by genetic relationships, with language groupings formed by clustering methods using typological features only. Using data from the World Atlas of Language Structures (WALS), our preliminary experiments show that typologically based clusters look quite different from genetic groups, but perform as well as or better than genetic groups when used to predict feature values of member languages.
In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010)
Publisher: International Conference on Computational Linguistics