Engkoo Pinyin Redefines Chinese Input
Microsoft Research
May 13, 2013 3:51 PM PT

Computer keyboards, descended from typewriters, were designed to support alphabets, so how does one use a keyboard with a more complex writing system? Chinese users require solutions that enable them to input at least 4,000 distinct logograms called Hanzi, which represent the characters used to compose the words in their language. The answer is software known as input-method editors (IMEs), which enable users to type Chinese words by sounding them out in a Romanized form called Pinyin. But since Pinyin can be ambiguous, the task of the IME engine is to guess the most likely intended words and present the candidates onscreen for selection.
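To make the ambiguity concrete, here is a minimal sketch of candidate lookup in Python. The lexicon, its frequency counts, and the function name `candidates` are invented for illustration and are not taken from Engkoo Pinyin or any real IME; a production engine would rank whole phrases and sentences, not single syllables.

```python
# Toy Pinyin-to-Hanzi candidate lookup.
# The lexicon and its counts are made-up illustration data.
TOY_LEXICON = {
    "shi": {"是": 900, "时": 400, "事": 350, "十": 300},
    "ma": {"吗": 500, "妈": 300, "马": 250},
}


def candidates(pinyin, limit=5):
    """Return up to `limit` Hanzi for one Pinyin syllable, most frequent first."""
    entries = TOY_LEXICON.get(pinyin.lower(), {})
    # Sort by the (hypothetical) frequency count, highest first.
    ranked = sorted(entries.items(), key=lambda item: item[1], reverse=True)
    return [hanzi for hanzi, _count in ranked[:limit]]


print(candidates("shi"))  # ['是', '时', '事', '十']
```

A single syllable such as "shi" maps to several unrelated characters, which is why the engine has to rank candidates and let the user pick.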

“In IME technology, the challenge is optimal tradeoffs between efficiency, accuracy, and usability while allowing for the freshest of results—capturing new language trends automatically as they appear on the web,” says Matt Scott, senior development lead in the Innovation Engineering group at Microsoft Research Asia. “We knew we could innovate further with search technology and optimize the Pinyin IME user experience even more. The next generation of cloud-based IME represents an opportunity to be the entry point for 600 million Chinese Internet users.”

The project, initially code-named Kunlun, began with a bold premise: The team would approach IME as a fresh start, rather than make incremental improvements to the existing Microsoft Pinyin product. The result has been Engkoo Pinyin IME, recently launched as Bing IME, the most accurate Chinese Pinyin IME technology on the market today.

Describing the outcome solely in terms of accuracy, though, barely scratches the surface of Project Kunlun’s state-of-the-art achievements.

Redefining IME

Members of the Kunlun team knew they were facing a classic innovation dilemma. Sophisticated products with large user bases often end up with layers of technical complexity built over years of incremental improvements—to the point where major reinvention becomes expensive and risky. The enhancement effort becomes more of a “sustaining innovation.”

“We believe that today’s IMEs are, by and large, sustaining innovations,” Scott explains. “Going back to the drawing board to find a ‘disruptive innovation’ is the answer to such dilemmas. That’s what we were aiming for with Engkoo Pinyin.”

The team arrived at the novel approach of defining the problem as one of linguistic translation between the IME’s input and output characters. In other words, they would regard Pinyin and Hanzi as two different languages.

“By making it a machine-translation problem,” Scott says, “we were able to leverage research in Statistical Machine Translation (SMT), an area where our team members in the Natural Language Computing group of Microsoft Research Asia have over a decade of scientific achievement. Taking this approach gave us significant competitive advantages, because we could draw on some of the top experts in the world, right here at the lab.”
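The article does not say which model the team used. As a rough illustration only, the classic noisy-channel formulation of SMT would cast Pinyin-to-Hanzi conversion as follows, where $S$ is the typed Pinyin string and $H$ ranges over candidate Hanzi sequences (both symbols are introduced here, not taken from the source):

```latex
\hat{H} \;=\; \arg\max_{H} P(H \mid S) \;=\; \arg\max_{H} P(S \mid H)\, P(H)
```

Under this assumed framing, $P(S \mid H)$ plays the role of the translation model (how likely the Pinyin is given the characters) and $P(H)$ is a language model over Chinese text. The second equality holds because $P(S)$ is constant for a given input.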

Taking the SMT approach proved to be the right move, delivering impressive performance results when Scott’s team compared metrics against other Pinyin IMEs in the industry. Engkoo Pinyin is the most accurate Pinyin IME based on criteria such as: highest word-, phrase-, and sentence-based input accuracy; best-quality candidate matches against user expectations; and the lowest overall character error rate.
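The article does not describe how these metrics were computed. One common definition of character error rate, shown here as an assumption and not as the team's method, is the edit distance between the IME's output and the intended text, divided by the length of the intended text. The function names below are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings, counted in characters."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One wrong character out of six in the intended sentence.
print(character_error_rate("今天天气很好", "今天天汽很好"))
```

A lower value means fewer corrections for the user; a perfect conversion scores 0.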

Redefining ‘Candidate’

When Engkoo Pinyin suggests likely candidates to users, it goes beyond Chinese logograms. For one thing, it offers English.

English use is growing rapidly in China. Not only is it the nation’s most popular second language, but English terms, especially technology terms, are used frequently in everyday Chinese conversation. The team recognized that for an IME to be truly useful, it has to mirror language trends. Consequently, one of the key user-experience innovations was building in English-language assistance and the ability to mix English and Chinese input. For these features, the team integrated novel Engkoo language-assistance technology, productized as Bing Dictionary, directly into Engkoo Pinyin.

With Bing IME, users can enter Pinyin and get back suggestions for Chinese and English options.
“We surveyed the market and found English-language assistance technology in Chinese IMEs was sorely lacking,” Scott recalls, “but it turned out that IME is an ideal paradigm to utilize as an English-language assistant because it’s so familiar to Chinese users. As for mixed English and Chinese input, we’ve been able to implement this feature fluently and accurately for a modeless user experience.”

Scott championed another IME innovation for Engkoo Pinyin: seamless integration with web search to find “rich candidates,” media such as images, videos, music, or maps. Rather than restrict input candidates to Chinese logograms, Engkoo Pinyin also finds rich candidates through web search and enables users to insert these as quickly and naturally as any logogram into documents or email.

Today, web search for rich media means a user has to leave the authoring environment—a microblog or a document, for example—and lose concentration while typing a search query and then performing a cut-and-paste. Engkoo Pinyin eliminates the cognitive load of the contextual switch between the authoring environment and the browser by handling the search and intelligently delivering rich candidates alongside Chinese logograms and English. In this way, IME is treated as another kind of search problem, an abstraction that represents a breakthrough convergence: Every edit box in Windows also becomes a search box.

Web search returns rich media suggestions.
“We’re going to the cloud for candidates anyway,” Scott notes, “so why not offer the convenience of rich candidates by accessing web search at the same time? This allows users to go beyond just text, tap into the power of the web, and experience a totally new dimension to input productivity.”

Although the team was intent on innovation, it knew that speed was paramount to the IME user experience. Every millisecond counts, because any delay distracts from the core scenario of an IME—input. Therefore, performance via heavy optimization, clean implementation, and lightweight resource utilization was an absolute requirement.

“Everything has to be extremely fast, because at the end of the day, this is input productivity,” Scott says. “We first had to build the baseline legacy features that Pinyin IME users expect. And then we innovated on top of that base. But without speed, no matter how innovative the solution, it wouldn’t have been acceptable.”

Redefining R&D Methodologies

Product contributions from a research lab to product groups usually happen at the component level. Project Kunlun, though, handed Engkoo Pinyin to the Bing business group as an end-to-end contribution—a rare event. Scott credits the multidivision collaboration and cross-disciplinary team that drew talent from the Natural Language Computing and Innovation Engineering groups, partnering with Microsoft Office Division China, for the innovations and engineering quality that fast-tracked Engkoo Pinyin from the lab to the product group.

At any one time, a virtual team of 20 to 30 people participated in Project Kunlun. Not only were these team members responsible for technical innovation, they also implemented a methodology adapted from agile development dubbed “deployment-driven research.” Fast, iterative cycles of user feedback, research, and development enabled a solution with features that users wanted.

“Deployment-driven research is a term coined by Microsoft Research Asia,” Scott explains. “It’s a methodology that focuses on getting cutting-edge research out of the lab and into the hands of users as fast as possible for the underlying technology to be significantly influenced and improved by large-scale user feedback and data we collect and to actively tune our inventions to be more useful for people.”

Using Weibo, a Chinese microblogging site, and QQ, an instant-messaging site, the project team launched a grassroots social-media campaign that attracted a large, enthusiastic community of testers, all eager to discuss, critique, and offer enhancement suggestions.

“IME users in China have very high expectations on product quality and features,” Scott says. “The market is fiercely competitive, and the user base is highly experienced, so our offering has to be solid in the fundamentals, as well as deliver innovation. The point is, sometimes we have good ideas, sometimes bad ideas. How can we flush out what’s not helpful as fast as possible? How can we fail faster to get to success?”

The nature of IMEs meant the team had a unique advantage. Unlike other types of software, IMEs are constantly in use. Chinese users depend on an IME to interact with their computers. Scott and his colleagues found that any kind of dissatisfaction would generate discussion and valuable feedback almost immediately; the trick was to keep listening and build a process and infrastructure to do so efficiently.

Team members collected both explicit feedback and product-usage telemetry. They built a sophisticated dashboard and service-intelligence system that gave product managers anytime, anywhere visibility into Engkoo Pinyin usage. These metrics made it straightforward to prove the business case for a new Microsoft product to be born.