Ye-Yi Wang, John Lafferty, and Alex Waibel
1996
We introduce a word clustering algorithm which uses a bilingual, parallel corpus to group together words in the source and target language. Our method generalizes previous mutual information clustering algorithms for monolingual data by incorporating a statistical translation model. Preliminary experiments have shown that the algorithm can effectively employ the constraints implicit in bilingual data to extract classes which are well suited to machine translation tasks.
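As an illustration of the kind of objective that monolingual mutual information clustering optimizes, the following minimal sketch computes the average mutual information between the classes of adjacent words for a given class assignment. It covers only the monolingual case that the abstract says the method generalizes. The bilingual extension with a translation model is not described in this record and is not reproduced here. The function name `average_mutual_information`, the toy corpus, and the class labels are hypothetical choices made for this example.

```python
import math
from collections import Counter


def average_mutual_information(tokens, word_to_class):
    """Average mutual information (in bits) between the classes of adjacent tokens.

    Hypothetical helper for illustration: `tokens` is a list of words and
    `word_to_class` maps every word to a class label.
    """
    classes = [word_to_class[w] for w in tokens]
    pairs = list(zip(classes, classes[1:]))  # adjacent class bigrams
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)                   # counts of (left class, right class)
    left = Counter(c1 for c1, _ in pairs)    # marginal counts of the left class
    right = Counter(c2 for _, c2 in pairs)   # marginal counts of the right class
    ami = 0.0
    for (c1, c2), count in joint.items():
        p12 = count / n
        # p12 / (p1 * p2) simplifies to count * n / (left * right)
        ami += p12 * math.log2(count * n / (left[c1] * right[c2]))
    return ami


# Toy corpus (assumed data): alternating words, so the class of a word
# fully predicts the class of the next word.
tokens = ["a", "b", "a", "b", "a", "b"]
two_classes = {"a": "A", "b": "B"}
one_class = {"a": "X", "b": "X"}

ami_two = average_mutual_information(tokens, two_classes)
ami_one = average_mutual_information(tokens, one_class)
```

A greedy clustering procedure of this family would repeatedly merge the pair of classes whose merge loses the least of this quantity. In the toy data, putting both words in a single class drives the value to zero, while keeping them apart preserves the predictive structure.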
In: Fourth International Conference on Spoken Language Processing
Publisher: International Speech Communication Association
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.
| Type: | Inproceedings |
| Pages: | 2364 - 2367 |
| Volume: | 4 |
| Address: | Philadelphia, PA, USA |