Hang Li, Yunbo Cao, and Cong Li
A problem that has existed since the time of the Tower of Babel has become more serious in the internet era: how to read and write foreign languages. Figure 1 shows an estimate of the distribution of languages on the web. About three fourths of web pages are in English, and non-English speakers need to read these documents. Conversely, English speakers cannot read the roughly one fourth of web pages written in other languages. We are more challenged by language barriers than people in any other era. Our proposal is to use multilingual data on the web to help overcome the difficulties of reading foreign languages. Specifically, we describe how we use statistical machine learning techniques to provide intelligent foreign language reading assistance.