Min Chu
Researcher/Project Leader
Microsoft Research Asia

Min has led the text-to-speech (TTS) efforts in the speech group at Microsoft Research Asia since joining in March 2000. The group built a leading Mandarin TTS system in its first year and has recently extended it to English. Its latest system, Microsoft Mulan, is a bilingual TTS system that switches smoothly between Mandarin and English within the same engine.

Research Interests:

Speech synthesis, analysis and perception: concatenative/LPC/formant synthesis, prosody modeling, signal processing, stress in Mandarin, formant tracking, emotions in speech, speech perception.

Natural language processing: word/prosodic word segmentation, phrase/prosodic phrase parsing, letter/character-to-sound conversion, named entity recognition.

Background:

Before joining Microsoft, Min worked at the Intel China Research Center for a year, leading the TTS group there. Min received her Ph.D. from the Institute of Acoustics, Chinese Academy of Sciences, in 1995. During the Ph.D. program, she won the presidential graduate student fellowship of the Chinese Academy of Sciences in 1994 for remarkable research contributions. (The Mandarin text-to-speech system she developed at that time was rated the best in the competition sponsored by the national 863 program of China.) She then stayed at the same institute as an associate research professor until 1999, serving as the technical leader of several research projects sponsored by the national 863 program of China. In 1997, Min visited the Chinese University of Hong Kong for one year, where she worked on Cantonese TTS and on a hybrid speech synthesis scheme combining a sinusoidal model with TD-PSOLA.

Min received her B.S. degree from Northwestern Polytechnical University in 1990 and her M.S. degree from Harbin Shipbuilding Engineering Institute in 1992.

Publications:

In English:

[1]        Min Chu and Mingzhen Bao, Comparison of Sentential-Stress Allocation within Base Phrase among Different Reading Styles, Proc. of International Conference on Speech Prosody, Nara, 2004, pp. 111-114

[2]        LiJuan Wang, Yong Zhao, Min Chu, Jianlai Zhou and Zhigang Cao, Refining Segmental Boundaries for TTS Database Using Fine Contextual-Dependent Boundary Models, Proc. of ICASSP 2004, Montreal, pp. I-641~I-644.

[3]        Chao Huang, Yu Shi, Jianlai Zhou, Min Chu, Terry Wang and Eric Chang, Segmental Tonal Modeling for Phone Set Design in Mandarin LVCSR, Proc. of ICASSP 2004, Montreal, pp. I-901~I-904.

[4]        Ye Tian, Jianlai Zhou, Min Chu and Eric Chang, Tone Recognition with Fractionized Models and Outlined Features, Proc. of ICASSP 2004, Montreal, pp. I-105~I-108.

[5]        Min Chu, Yunjia Wang and Lin He, Labeling Stress in Continuous Mandarin Speech Perceptually, Proc. of the 15th International Congress of Phonetic Sciences, Barcelona, 2003.

[6]        Yunjia Wang, Min Chu and Lin He, Location of Sentence Stresses within Disyllabic Words in Mandarin, Proc. of the 15th International Congress of Phonetic Sciences, Barcelona, 2003.

[7]        Yong Zhao, Min Chu, Hu Peng and Eric Chang, Custom-Tailoring TTS Voice Font – Keeping the Naturalness When Reducing Database Size, Proc. of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), Geneva, 2003.

[8]        Yining Chen, Min Chu, Eric Chang, Jia Liu and Runsheng Liu, Voice Conversion with Smoothed GMM and MAP Adaptation, Proc. of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), Geneva, 2003.

[9]        Min Chu, Hu Peng, Yong Zhao, Zhengyu Niu and Eric Chang, Microsoft Mulan – a Bilingual TTS System, Proc. of ICASSP 2003, Hong Kong, 2003.

[10]     Yunjia Wang, Min Chu, Lin He and Yongqiang Feng, Stress Perception of Chinese Disyllabic Words in Utterance, Chinese Journal of Acoustics, Vol.22, No.1, 2003.

[11]     Min Chu, Chun Li, Hu Peng and Eric Chang, Domain Adaptation for TTS systems, Proc. of ICASSP 2002, Orlando, 2002.

[12]     Hu Peng, Yong Zhao and Min Chu, Perceptually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluation, Proc. of ICSLP2002, Denver, 2002.

[13]     Yu Shi, Eric Chang, Hu Peng and Min Chu, Power Spectral Density Based Channel Equalization of Large Speech Database for Concatenative TTS system, Proc. of ICSLP2002, Denver, 2002.

[14]     Zi-Rong Zhang, Min Chu and Eric Chang, An Efficient Way to Learn Rules for Grapheme-to-Phoneme Conversion in Chinese, ISCSLP 2002, Taipei.

[15]     Min Chu and Yao Qian, Locating Boundaries for Prosodic Constituents in Unrestricted Mandarin Texts, Journal of Computational Linguistics and Chinese Language Processing, Vol.6. No.1. Feb. 2001, pp. 61-82.

[16]     Min Chu and Hu Peng, An objective measure for estimating MOS of synthesized speech, Proc. of Eurospeech2001, Aalborg, 2001, pp.2087-2090. (Won the COCOSDA best paper award.)

[17]     Min Chu and Yong-Qiang Feng, Study on Factors Influencing Durations of Syllables in Mandarin, Proc. of Eurospeech2001, Aalborg, 2001, pp.927-930.

[18]     Min Chu, Hu Peng and Eric Chang, A concatenative Mandarin TTS system without prosody model and prosody modification, Proceedings of 4th ISCA workshop on speech synthesis, Scotland, 2001.

[19]     Min Chu, Hu Peng, Hong-Yun Yang and Eric Chang, Selecting non-uniform units from a very large corpus for concatenative speech synthesizer, Proc. of ICASSP2001, Salt Lake City, 2001.

[20]     Yao Qian, Min Chu and Hu Peng, Segmenting Unrestricted Chinese Text into Prosodic Words Instead of Lexical Words, Proc. of ICASSP2001, Salt Lake City, 2001.

[21]     Jieping Xu, Min Chu, Lin He and Shinan Lv, The Influence of Chinese Sentence Stress in Pitch and Duration, Chinese Journal of Acoustics, Vol. 19, No. 3, 2000, pp.270-277.

[22]     Min Chu, Difei Tang, Hongyan Si, Xuqing Tian and Shinan Lu, Research on Perception of Juncture Between Syllables in Chinese, Chinese Journal of Acoustics, Vol.17, No.2, 1998, pp. 143-152.

[23]     Min Chu and P. C. Ching, A Hybrid Approach to Synthesize High Quality Cantonese Speech, Proc. of ICASSP98, Seattle, 1998.

[24]     Dinghua Guan, Min Chu, Quan Zhang, Jian Liu and Xiangdong Zhang, The Research Project of Man-Computer Dialogue System in Chinese, Proc. of ICSLP98, Sydney, 1998.

[25]     Min Chu, Lin He, Jieping Xu and Shinan Lu, Voice Conversion Between Female and Male in a TD-PSOLA Based Chinese TTS system, Proc. of ISCSLP98, Singapore, 1998.

[26]     LU Shinan, CHU Min and SI Hongyan, Study on Chinese Text-to-Speech System, Proc. of ISSPR'98, Hong Kong, 1998.

[27]     Min Chu and Shinan Lu, Building up a Cantonese Prosody Model by Using Neural Network, Proc. of Conference on Phonetics of the Languages in China, Hong Kong, 1998.

[28]     Hongyan Si, Min Chu, and Shinan Lu, The perceptual properties of stable portion of consonants in Chinese, Proc. of the Conference on Phonetics of the Languages in China, Hong Kong, 1998.

[29]     Min Chu, Hongyan Si, Xuqing Tian, Shinan Lu and P.C. Ching, Research on Perception of Formant Transition Between Syllables in Chinese, Proc. of The Sixth Western Pacific Regional Acoustics Conference, Hong Kong, 1997, pp. 94-99.

[30]     Hongyan SI, Min CHU and Shinan LU, A novel waveform concatenation algorithm for Chinese PSOLA-based synthesizer, The Proceeding of The Sixth West/Pacific Regional Acoustics Congress, Hong Kong, 1997.

[31]     Min Chu, Difei Tang, Shinan Lu and Dinghua Guan, The prosody model for the Chinese TTS system Legend Voice, Proc. of the First China-Japan Workshop on Spoken Language Processing, 1997, pp. 111-116

[32]     Hongyan Si, Min Chu and ShiNan Lu, A novel waveform concatenation algorithm for Chinese PSOLA-based synthesizer, Proc. of the First China-Japan Workshop on Spoken Language Processing, 1997, pp. 135-140

[33]     Difei Tang, Min Chu, Shinan Lu and Lin He, Word segmentation for Chinese TTS system Legend Voice, Proc. of the First China-Japan Workshop on Spoken Language Processing, 1997, pp. 153-156

[34]     Min Chu and Shinan Lu, A Text-to-Speech System with High Intelligibility and High Naturalness for Chinese, Chinese Journal of Acoustics, Vol.15 No.1 , 1996, pp. 81-90.

[35]     Shinan Lu, Min Chu, Lin He, Yamin Lu, Xiaoguang Li and Jie Ma, The Design and Realization of a Spoken Chinese Output System, Proc. of ICMI'96.

[36]     Min Chu, Shinan Lu, Hongyan Si, Lin He and Dinghua Guan, The control of Juncture and prosody in Chinese TTS system, Proceeding of ICSP'96, Beijing, China,1996.

[37]     Min Chu and Shinan Lu, High Intelligibility and Naturalness Chinese TTS System and Prosodic Rules, Proc. of XIII International Congress of Phonetic Sciences, Stockholm, 1995, P.2:334-2:337.

[38]     Dinghua Guan, Min Chu and Shinan Lu, A Chinese Text-to-speech System with High Intelligibility and High Naturalness, Proc. of International Conference on Acoustics, Trondheim, Norway, pp.31-34.

 

In Chinese:

[39]     初敏,自然言语的韵律组织中的不确定性及其在语音合成中的应用,中文信息学报,Vol. 18, No. 4, 2004, pp.66-71.

[40]     初敏、王韫佳和包明真,普通话节律组织中的局部语法约束和长度约束,语言学论丛,第三十辑,2004,即将出版。

[41]     初敏,自然言语的韵律组织中的不确定性及其在语音合成中的应用,第七届全国语音通讯信号处理学术论文集,2003,厦门。

[42]     初敏、王韫佳和包明真,普通话节律组织中的局部语法约束和长度约束,第六届全国现代语音学学术会议论文集,2003,天津。

[43]     王韫佳,初敏和贺琳,语义重音分布的初步研究,第六届全国现代语音学学术会议论文集,2003,天津。

[44]     王韫佳,初敏和贺琳,汉语语句重音的分类和分布,心理学报,Vol. 35, No. 6, 2003, pp. 734-742

[45]        王韫佳,初敏,贺琳和冯勇强,连续话语中双音节韵律词的重音感知,声学学报,Vol.28, No.6, 2003, pp.534-539.

[46]        张子荣和初敏,解决多音字字音转换的一种统计学习方法,中文信息学报,Vol.16, No.3, 2002, pp. 39-45.

[47]        初敏,韵律研究与合成语音的自然度,第五届全国现代语音学学术会议论文集,pp.295-301,2001,北京。

[48]        冯勇强,初敏,贺琳,吕士楠,汉语话语音节时长统计分析,第五届全国现代语音学学术会议论文集,pp.66-69,2001,北京。

[49]        钱瑶,初敏,潘悟云,普通话韵律单元边界的声学分析,第五届全国现代语音学学术会议论文集,pp.70-74,2001,北京。

[50]        王韫佳,初敏,贺琳和冯勇强,语句中双音节韵律词重音感知的初步研究,第五届全国现代语音学学术会议论文集,pp.166-170,2001,北京。

[51]        贺琳,初敏,吕士楠,钱瑶和冯勇强,汉语合成语料库的韵律层级标注研究,第五届全国现代语音学学术会议论文集,pp.323-326,2001,北京。

[52]        许洁萍,初敏,贺琳和吕士楠,汉语语句重音对音高和音长的影响,声学学报,Vol. 25, No. 4, 2000, pp.335-339

[53]        初敏,吕士楠,一种将PSOLA算法与语音正弦模型结合的合成方法,第五届人机语音通讯学术会议论文集,1998,哈尔滨,pp.296-299

[54]     许洁萍,贺琳,陆亚民,吕士楠,汉语广播言语中音节时长变化初探,第五届人机语音通讯学术会议论文集,1998,哈尔滨,pp.42-45

[55]     初敏,唐涤飞,司宏岩,田旭青和吕士楠,汉语音节音联感知特性的研究,声学学报 Vol.22, No.2, 1997, pp.104-110

[56]     唐涤飞,贺琳、初敏和吕士楠,“联想佳音”汉语文语转换系统的应用,第八届语音图象通信信号处理学术会议论文集,1997,郑州。

[57]     初敏和吕士楠,一种高清晰度和高自然度的汉语文语转换系统,声学学报,Vol. 21, 4期增刊,1996,pp.639-647

[58]     吕士楠,初敏,贺琳,陆亚民和李晓光,计算机汉语口语输出系统的设计与实现,软件学报,1996,863专刊,pp.53-59

[59]     吕士楠,初敏,陆亚民,倪光南和李晓光,中文DOS平台语音系统,第三届全国计算机应用学术交流大会论文集,1995,北京,pp.1558-1561

[60]     陆亚民,吕士楠,初敏,贺琳和周同春,疑问句语调模型的研究,第三届全国语音通讯信号处理学术论文集,西安,1995,pp. 154-157

[61]     初敏,司宏岩,田旭青,吕士楠和孔江平,汉语音节间的协同发音在听觉感知中的作用,第七届全国语音图象通讯信号处理学术会议论文集,1995,西安,pp.349-353

[62]     吕士楠,初敏和李晓光,国内外语音合成技术的发展概况,第七届全国语音图象通讯信号处理学术会议论文集,1995,西安,pp.133-141

[63]     初敏,吕士楠和陆亚民,利用基音同步叠加技术合成汉语的研究,第三届全国人机语音通讯学术会议论文集,重庆,pp. 394-397

[64]     吕士楠,周同春,初敏和陆亚民, 汉语合成系统中音高和音长规则研究,第三届全国人机语音通讯学术会议论文集,1994, 重庆, pp.407-410

[65]     初敏、吕士楠和周同春,汉语轻声音节合成规则研究,第六届全国语音图象通讯信号处理学术会议论文集,1993年,四川,pp. B9.107- B9.109

 


©2008 Microsoft Corporation. All rights reserved.