Fei Xia and William D. Lewis
For the majority of the world’s languages, the number of linguistic resources (e.g., annotated corpora and parallel data) is very limited. Consequently, supervised methods, as well as many unsupervised methods, cannot be applied directly, leaving these languages largely untouched and unnoticed. In this paper, we describe the construction of a resource that taps the large body of linguistically analyzed language data that has made its way to the Web, and propose using this resource to bootstrap NLP tool development.
In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP)
Publisher: Asian Federation of Natural Language Processing
Copyright 2007 by AFNLP