Brief Biography
Dr. Dong Yu (俞栋) joined Microsoft Corporation in 1998 and the Microsoft Speech Research Group (now expanded to the Conversational Systems Research Center) in 2002, where he is currently a senior researcher. He holds a Ph.D. degree in computer science from the University of Idaho, an MS degree in computer science from Indiana University at Bloomington, an MS degree in electrical engineering from the Chinese Academy of Sciences, and a BS degree (with honors) in electrical engineering from Zhejiang University (China). His current research interests include speech processing, robust speech recognition, discriminative training, and machine learning. He has published over 120 papers in these areas and is the inventor/coinventor of more than 50 granted/pending patents.
His most recent work focuses on deep learning and its application to large vocabulary speech recognition. The context-dependent deep neural network hidden Markov model (CD-DNN-HMM) he co-proposed and developed has seriously challenged the dominant position of the conventional GMM-based system for large vocabulary speech recognition and has helped popularize deep learning.
Dr. Dong Yu is a senior member of IEEE, a member of ACM, and a member of ISCA. He is currently serving as a member of the IEEE Speech and Language Processing Technical Committee (2013-) and an associate editor of IEEE Transactions on Audio, Speech, and Language Processing (2011-). He has served as an associate editor of IEEE Signal Processing Magazine (2008-2011) and the lead guest editor of the IEEE Transactions on Audio, Speech, and Language Processing special issue on deep learning for speech and language processing (2010-2011).
News/Blogs Featuring My Recent Work
- DNN Research Improves Bing Voice Search
- Bing Makes Voice Recognition on Windows Phone More Accurate and Twice as Fast
- Microsoft revs speedier, smarter speech recognition for phones
- DNN and the Story Behind Microsoft's Simultaneous Interpretation System (in Chinese)
- Microsoft Research shows a promising new breakthrough in speech translation technology
- Deep-Neural-Network Speech Recognition Debuts
- Speech Recognition Leaps Forward
- Speech Recognition Technology Advances by Leaps and Bounds (in Chinese)
Equation Number in Office 2007 and 2010
Office 2007 and 2010 come with a very nice equation editor and bibliography manager. However, they do not support equation and theorem number management. To work around this problem, I have developed a set of macros. You can download them here.
Publications
Book Chapters
- Dong Yu, Li Deng, "Speech-Centric Multimodal User Interface Design in Mobile Technology", Handbook of Research on User Interface Design and Evaluation for Mobile Technology (Editor: Joanna Lumsden), Jan. 2008, IGI.
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero, "Voice Search (Ch.5)", in Language Understanding: Systems for Extracting Semantic Information from Speech (editors: Gokhan Tür & Renato De Mori), John Wiley & Sons, 2011.
Refereed Journals and Magazines
- Zhen-Hua Ling, Li Deng, Dong Yu, “Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis”, IEEE Transactions on Audio, Speech, and Language Processing (in press).
- Kaisheng Yao, Dong Yu, Li Deng, Yifan Gong, "A Fast Maximum Likelihood Nonlinear Feature Transformation Method for GMM-HMM Speaker Adaptation", Neurocomputing - special issue on extreme learning machine, 2013 (in press)
- Brian Hutchinson, Li Deng, Dong Yu, "Tensor Deep Stacking Networks", IEEE Trans. on Pattern Analysis and Machine Intelligence - special issue on learning deep architectures, 2013, http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.268 (in press).
- Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee, "Exploiting Deep Neural Networks for Detection-Based Speech Recognition", Neurocomputing, vol. 106, pp. 148-157, April 2013. http://dx.doi.org/10.1016/j.neucom.2012.11.008.
- Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee, "Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model", IEEE Signal Processing Letters, vol. 20, no. 3, pp. 201-204, March 2013. doi:10.1109/LSP.2013.2237901.
- Dong Yu, Li Deng, Frank Seide, "The Deep Tensor Neural Network with Applications to Large Vocabulary Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, Feb. 2013.
- Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury, “Deep Neural Networks for Acoustic Modeling in Speech Recognition”, IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, Nov. 2012. doi:10.1109/MSP.2012.2205597.
- Dong Yu, Li Deng, "Efficient and Effective Algorithms for Training Single-Hidden-Layer Neural Networks", Pattern Recognition Letters, vol. 33, no. 5, pp. 554-558. April 2012. DOI information: 10.1016/j.patrec.2011.12.002.
- George E. Dahl, Dong Yu, Li Deng, and Alex Acero, "Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing - Special Issue on Deep Learning for Speech and Language Processing, vol. 20, no. 1, pp. 33-42, Jan 2012. doi:10.1109/TASL.2011.2134090.
- Dong Yu, Jinyu Li, Li Deng, "Calibration of Confidence Measures in Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Nov. 2011, pp. 2461-2473. doi:10.1109/TASL.2011.2141988.
- Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Ye-Yi Wang, and Dong Yu, "In Car Media Search", IEEE Signal Processing Magazine, vol. 28, no. 4, pp. 50-60, June 2011, doi:10.1109/MSP.2011.941065.
- Dong Yu, Shizhen Wang, Li Deng, "Sequential Labeling Using Deep-Structured Conditional Random Fields", Journal of Selected Topics in Signal Processing - special issue on Statistical Learning Methods for Speech and Language Processing, vol. 4, no. 6, December 2010, pp. 965-973. doi:10.1109/JSTSP.2010.2075990.
- Dong Yu, Li Deng, Alex Acero, "Using Continuous Features in the Maximum Entropy Model", Pattern Recognition Letters. doi: 10.1016/j.patrec.2009.06.005. Vol. 30, Issue 14, pp. 1295-1300, October, 2009.
- Dong Yu, Li Deng, Yifan Gong, Alex Acero, "A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models", IEEE Transactions on Audio, Speech and Language Processing, vol 17, no. 7, pp. 1348-1360, September 2009. doi:10.1109/TASL.2009.2020890.
- Dong Yu, Balakrishnan Varadarajan, Li Deng, Alex Acero, "Active Learning and Semi-supervised Learning for Speech Recognition: A Unified Framework using the Global Entropy Reduction Maximization Criterion", Computer Speech and Language - Special Issue on Emergent Artificial Intelligence Approaches for Pattern Recognition in Speech and Language Processing, vol. 24, issue 3, pp. 433-444, July 2010. doi:10.1016/j.csl.2009.03.004.
- Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "A Unified Framework of HMM Adaptation with Joint Compensation of Additive and Convolutive Distortions", Computer Speech and Language, vol. 23, pp. 389-405, 2009. doi:10.1016/j.csl.2009.02.001.
- Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero, "An integrative and discriminative technique for spoken utterance classification", IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 6, Aug. 2008, pp. 1207-1215. doi: 10.1109/TASL.2008.2001106.
- Dong Yu, Li Deng, Xiaodong He, Alex Acero, "Large-Margin Minimum Classification Error Training: A Theoretical Risk Minimization Perspective", Computer Speech and Language, Vol. 22, No. 4, October 2008, pp. 415-429. doi:10.1016/j.csl.2008.03.002.
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero, "Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor", IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 5, pp. 1061-1070, July 2008. DOI: 10.1109/TASL.2008.921761.
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero, "An Introduction to Voice Search", IEEE Signal Processing Magazine (Special Issue on Spoken Language Technology), vol. 25, no. 3, pp. 28-38, May 2008. doi: 10.1109/MSP.2008.918411.
- Dong Yu, Deborah Frincke, "Improving the Quality of Alerts and Predicting Intruder's Next Goal with Hidden Colored Petri-Net", Computer Networks, Volume 51, Issue 3, 21 February 2007, Pages 632-654. doi: 10.1016/j.comnet.2006.05.008.
- Dong Yu, Li Deng, Alex Acero, "Speaker-Adaptive Learning of Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation", Computer Speech and Language, Vol. 21, 2007, pp. 72-87. doi:10.1016/j.csl.2005.12.002.
- Dong Yu, Li Deng, Alex Acero, "A Lattice Search Technique for a Long-Contextual-Span Hidden Trajectory Model of Speech", Speech Communication, Elsevier. Volume: 48 Issue: 9, Sep 2006. pp. 1214-1226. doi:10.1016/j.specom.2006.05.002.
- Li Deng, Dong Yu, and Alex Acero. "Structured Speech Modeling", IEEE Trans. on Audio, Speech and Language Processing. Volume: 14 Issue: 5, Sep 2006. pp. 1492-1504. doi: 10.1109/TASL.2006.878265.
- Li Deng, Dong Yu, and Alex Acero, "A Bidirectional Target-Filtering Model of Speech Coarticulation and Reduction: Two-Stage Implementation for Phonetic Recognition", IEEE Trans. Audio, Speech & Language Proc, vol. 14, No. 1, pp. 256-265, Jan 2006. doi: 10.1109/TSA.2005.854107.
- Dong Yu, Alex Acero, "Semiautomatic Improvements of System-Initiative Spoken Dialog Applications Using Interactive Clustering", IEEE Trans. Speech & Audio Proc (Special Issue on Data Mining of Speech, Audio and Dialog), Sept. 2005, vol. 13, no. 5, pp. 661-671. DOI: 10.1109/TSA.2005.851876.
- Li Deng, Dong Yu, "A Speech-Centric Perspective for Human-Computer Interface: A Case Study", Journal of VLSI Signal Processing Systems (Special Issue on Multimedia Signal Processing), Vol. 41, No. 3. pp. 255-269, November 2005. doi: 10.1007/s11265-005-4150-4.
Invited/Other Journals and Magazines
- Li Deng and Dong Yu, "Hidden Trajectory Modeling With Differential Cepstra For Speech Recognition", J. Acoust. Soc. Am. Vol. 131, no. 1, pp. 650-651, 2012.
- Dong Yu, Geoffrey Hinton, Nelson Morgan, Jen-Tzung Chien, and Shigeki Sagayama, "Introduction to the Special Section on Deep Learning for Speech and Language Processing", IEEE Transactions on Audio, Speech, and Language Processing, Jan 2012.
- Dong Yu, Li Deng, "Deep Learning and its Relevance to Signal and Information Processing", IEEE Signal Processing Magazine, vol. 28, No. 1, pp. 145-154, Jan 2011. doi:10.1109/MSP.2010.939038.
- Dong Yu, Li Deng, "Solving Nonlinear Estimation Problems Using Splines", IEEE Signal Processing Magazine, vol. 26, no. 4, pp. 86-90, July 2009.
- Dong Yu, Li Deng, "Teach-Ware: Signal Processing Resources at Connexions", IEEE Signal Processing Magazine, March 2009. doi: 10.1109/MSP.2008.931093.
Refereed Conferences
- Yan Huang, Dong Yu, Yifan Gong and Chaojun Liu, "Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration", Interspeech 2013.
- Ossama Abdel-Hamid, Li Deng and Dong Yu, "Exploring Convolutional Neural Network Structures and Optimization Techniques for Speech Recognition", Interspeech 2013.
- Ossama Abdel-Hamid, Li Deng, Dong Yu and Hui Jiang, "Deep Segmental Neural Network for Automatic Speech Recognition", Interspeech 2013.
- Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi and Dong Yu, "Recurrent Neural Networks for Language Understanding", Interspeech 2013.
- Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, Frank Seide, "Feature Learning in Deep Neural Networks - Studies on Speech Recognition Tasks", ICLR 2013.
- Zhen-Hua Ling, Li Deng, Dong Yu, "Modeling Spectral Envelopes Using Restricted Boltzmann Machines For Statistical Parametric Speech Synthesis", pp. 7825-7829, ICASSP 2013.
- Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide, "KL-Divergence Regularized Deep Neural Network Adaptation For Improved Large Vocabulary Speech Recognition", pp. 7893-7897, ICASSP 2013.
- Jui-Ting Huang, Jinyu Li, Dong Yu, Li Deng, Yifan Gong, "Cross-Language Knowledge Transfer Using Multilingual Deep Neural Network With Shared Hidden Layers", pp. 7304-7308, ICASSP 2013.
- George Dahl, Jack Stokes, Li Deng, Dong Yu, "Large-Scale Malware Classification Using Random Projections And Neural Networks", pp. 3422-3426, ICASSP 2013.
- Hang Su, Gang Li, Dong Yu, Frank Seide, "Error Back Propagation For Sequence Training Of Context-Dependent Deep Networks For Conversational Speech Transcription", pp. 6664-6668, ICASSP 2013.
- Li Deng, Jinyu Li, Jui-Ting Huang, Kaisheng Yao, Dong Yu, Frank Seide, Michael Seltzer, Geoff Zweig, Xiaodong He, Jason Williams, Yifan Gong, Alex Acero, "Recent Advances Of Deep Learning For Speech Research At Microsoft", pp. 8604-8608, ICASSP 2013.
- Li Deng, Ossama Abdel-Hamid, Dong Yu, "A Deep Convolutional Neural Network Using Heterogeneous Pooling For Trading Acoustic Invariance With Phonetic Confusion", pp. 6669-6673, ICASSP 2013.
- Michael Seltzer, Dong Yu, Yongqiang Wang, "An Investigation Of Deep Neural Networks For Noise Robust Speech Recognition", pp. 7398-7402, ICASSP 2013.
- Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong, "Improving Wideband Speech Recognition Using Mixed-Bandwidth Training Data In CD-DNN-HMM", SLT 2012.
- Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide, "Context-Dependent Deep Neural Networks For Audio Indexing Of Real-Life Data", SLT 2012.
- Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong, "Adaptation Of Context-Dependent Deep Neural Networks For Automatic Speech Recognition", pp. 366-369, SLT 2012.
- Xie Chen, Adam Eversole, Gang Li, Dong Yu, and Frank Seide, “Pipelined Back-Propagation for Context-Dependent Deep Neural Networks”, Interspeech 2012.
- Dong Yu, Li Deng, Frank Seide, “Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks”, Interspeech 2012.
- Li Deng, Brian Hutchinson, and Dong Yu, “Parallel Training of Deep Stacking Networks”, Interspeech 2012.
- Dong Yu, Frank Seide, Gang Li, Li Deng, "Exploiting Sparseness In Deep Neural Networks For Large Vocabulary Speech Recognition", ICASSP 2012, pp.4409-4412.
- Brian Hutchinson, Li Deng, Dong Yu, "A deep architecture with bilinear modeling of hidden representations: applications to phonetic recognition", ICASSP 2012, pp. 4805-4808.
- Dong Yu, Sabato Siniscalchi, Li Deng, Chin-Hui Lee, “Boosting Attribute And Phone Estimation Accuracies With Deep Neural Networks For Detection-Based Speech Recognition”, ICASSP 2012, pp. 4169-4172.
- Li Deng, Dong Yu, John Platt, "Scalable stacking and learning for building deep architectures", ICASSP 2012, pp. 2133-2136.
- Frank Seide, Gang Li, Xie Chen, Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription", ASRU 2011, pp. 24-29.
- Dong Yu and Michael L. Seltzer, "Improved Bottleneck Features Using Pretrained Deep Neural Networks", Interspeech 2011, pp. 237-240.
- Frank Seide, Gang Li and Dong Yu, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks", Interspeech 2011, pp. 437-440.
- Dong Yu and Li Deng, "Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition", Interspeech 2011, pp. 2281-2284.
- Li Deng and Dong Yu, "Deep Convex Network: A Scalable Architecture for Deep Learning", Interspeech 2011, pp. 2285-2288.
- George E. Dahl, Dong Yu, Li Deng, and Alex Acero, "Large Vocabulary Continuous Speech Recognition With Context-Dependent DBN-HMMS", ICASSP 2011, pp. 4688-4691.
- Li Deng, Mike Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, and Geoff Hinton, "Binary Coding of Speech Spectrograms Using a Deep Auto-encoder", in Interspeech 2010, pp. 1692-1695.
- Jinyu Li, Dong Yu, Yifan Gong, and Li Deng, "Unscented Transform with Online Distortion Estimation for HMM Adaptation", in Interspeech 2010, pp. 1660-1663.
- Dong Yu and Li Deng, "Deep-Structured Hidden Conditional Random Fields for Phonetic Recognition", in Interspeech 2010. pp.2986-2989.
- Abdel-rahman Mohamed, Dong Yu, and Li Deng, "Investigation of Full-Sequence Training of Deep Belief Networks for Speech Recognition", in Interspeech 2010. pp. 2846-2849.
- Dong Yu, Shizhen Wang, Jinyu Li, Li Deng, "Word Confidence Calibration Using a Maximum Entropy Model with Constraints on Confidence and Word Distributions", ICASSP 2010, pp. 4446-4449.
- Dong Yu, Li Deng, "Semantic Confidence Calibration for Spoken Dialog Applications", ICASSP 2010, pp. 4450-4453.
- Dong Yu, Shizhen Wang, Zahi Karam, Li Deng, "Language Recognition Using Deep-Structured Conditional Random Fields", ICASSP 2010, pp. 5030-5033.
- Dong Yu, Li Deng, Alex Acero, "Hidden Conditional Random Field with Distribution Constraints for Phone Classification", Interspeech, pp. 676-679, 2009.
- Oriol Vinyals, Li Deng, Alex Acero, Dong Yu, "Discriminative Pronunciation Learning Using Phonetic Decoder and Minimum-Classification-Error Criterion", ICASSP 2009, pp. 4445-4448.
- Hui Lin, Li Deng, Dong Yu, Yifan Gong, Alex Acero, Chin-Hui Lee, "A Study on Multilingual Acoustic Modeling For Large Vocabulary ASR", ICASSP 2009, pp. 4333-4336.
- Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero, "Cross-Lingual Speech Recognition under Runtime Resource Constraints", ICASSP 2009, pp. 4193-4196.
- Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero, "Maximizing Global Entropy Reduction for Active Learning In Speech Recognition", ICASSP 2009, pp. 4721-4724.
- Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero, "Using Collective Information in Semi-Supervised Learning for Speech Recognition", ICASSP 2009, pp. 4633-4636.
- Dong Yu, Li Deng, Jian Wu, Yifan Gong, Alex Acero, "Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition", ISCSLP 2008, Kunming, China.
- Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Discriminative Training of Variable-Parameter HMMs for Noise Robust Speech Recognition", Interspeech 2008, pp. 285-288.
- Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Parameter Clustering and Sharing in Variable-Parameter HMMs for Noise Robust Speech Recognition", Interspeech 2008, pp. 1253-1256.
- Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alex Acero, "Adaptation of Compressed HMM Parameters for Resource-Constrained Speech Recognition", ICASSP 2008, pp. 4333-4336.
- Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "HMM Adaptation Using a Phase-Sensitive Acoustic Distortion Model For Environment-Robust Speech Recognition", ICASSP 2008, pp. 4069-4072.
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero, "A Minimum-Mean-Square-Error Noise Reduction Algorithm on Mel-Frequency Cepstra for Robust Speech Recognition", ICASSP 2008, pp. 4041-4044.
- Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "High-Performance HMM Adaptation With Joint Compensation of Additive and Convolutive Distortions Via Vector Taylor Series", ASRU 2007, pp. 65-70.
- Dong Yu, Li Deng, "Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition", ICSC 2007, Irvine, CA (invited).
- Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero, "The Voice-Rate Dialog System for Consumer Ratings", Interspeech 2007.
- Dong Yu, Li Deng, Alex Acero, "Handling Phonetic Context and Speaker Variation in a Structure-Based Speech Recognizer", Interspeech 2007.
- J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero, "VoicePedia: Towards Speech-based Access to Unstructured Information", Interspeech 2007.
- Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero, "Automated Directory Assistance System - from Theory to Practice", Interspeech 2007.
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero, "Confidence Measures for Voice Search Applications", Interspeech 2007.
- Dong Yu, Li Deng, Xiaodong He, Alex Acero, "Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks", ICASSP 2007.
- Li Deng, Dong Yu, "Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition", ICASSP 2007.
- Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero, "A Discriminative Training Framework Using N-Best Speech Recognition Transcriptions and Scores for Spoken Utterance Classification", ICASSP 2007.
- Dong Yu, Li Deng, Xiaodong He, Alex Acero, "Use of Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition", Interspeech 2006.
- Xiaolong Li, Li Deng, Dong Yu, Alex Acero, "A Time-Synchronous Phonetic Decoder for a Long-Contextual-Span Hidden Trajectory Model", Interspeech 2006.
- Dong Yu, Yun Cheng Ju, Alex Acero, "An Effective and Efficient Utterance Verification Technology Using Word N-gram Filler Models", Interspeech 2006.
- Dong Yu, Yun Cheng Ju, Ye-Yi Wang, Alex Acero, "N-Gram Based Filler Model for Robust Grammar Authoring", ICASSP 2006.
- Li Deng, Dong Yu, and Alex Acero. "A Long-Contextual-Span Model of Resonance Dynamics for Speech Recognition: Parameter Learning and Recognizer Evaluation," in Proc. ASRU 2005.
- Dong Yu, Li Deng, and Alex Acero. "Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation and Reduction," Interspeech 2005.
- Dong Yu, Li Deng, and Alex Acero. "Evaluation of a Long-contextual-span Hidden Trajectory Model and Phonetic Recognizer Using A* Lattice Search," Interspeech 2005.
- Dong Yu, Deborah Frincke, "Alert Confidence Fusion in Intrusion Detection Systems with Extended Dempster-Shafer Theory", in the 43rd Annual ACM Southeast Conference, 2005.
- Dong Yu, Milind Mahajan, Peter Mau, and Alex Acero, "Maximum Entropy Based Generic Filter for Language Model Adaptation", ICASSP 2005.
- Li Deng, Xiang Li, Dong Yu, and Alex Acero, "A Hidden Trajectory Model with Bi-directional Target-Filtering: Cascaded vs. Integrated Implementation for Phonetic Recognition", ICASSP 2005.
- Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng, "Unsupervised Learning from Users' Error Correction in Speech Dictation", InterSpeech-ICSLP 2004.
- Li Deng, Dong Yu, and Alex Acero, "A Quantitative Model for Formant Dynamics and Contextually Assimilated Reduction in Fluent Speech", InterSpeech-ICSLP 2004.
- Dong Yu, Deborah Frincke, "A Novel Framework for Alert Correlation and Understanding" (© Springer-Verlag), Springer's LNCS series, vol. 3089. International Conference on Applied Cryptography and Network Security (ACNS) 2004.
- Dong Yu, Deborah Frincke, "Towards Survivable Intrusion Detection System", the 37th Hawaii International Conference On System Science (HICSS-37), Big Island, Hawaii, 2004.
- Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero, "Improved Name Recognition With User Modeling", EUROSPEECH 2003.
- Lei Yao, Dong Yu, and Taiyi Huang, "A unified spectral transformation adaptation approach for robust speech recognition", ICSLP 1996.
- Dong Yu and Taiyi Huang, "Canonical Correlation Based Compensation Approach for Robust Speech Recognition in Noisy Environment", EUROSPEECH 1995, pp. 477-480.
- Dong Yu and Taiyi Huang, "A New HMM/NN Hybrid Method for High Performance Speech Recognition", ICSLP 1994.
- Dong Yu and Taiyi Huang, "A New Time-alignment Approach for Robust Neural Network Based Speech Recognition", in Proceedings of IEEE International Conference in Signal Processing (ICSP-93), 1993.
Workshops and/or Short Papers:
- Dong Yu, Frank Seide, and Gang Li, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks", ICML 2012 (invited related area talk).
- Dong Yu, Xin Chen, and Li Deng, "Factorized deep neural networks for adaptive speech recognition", International workshop on statistical machine learning for speech processing, March 2012.
- Li Deng and Dong Yu, "Deep Convex Networks for Image and Speech Classification", ICML 2011 Workshop on Learning Architectures, Representations, and Optimization for Speech and Visual Information Processing.
- Jinyu Li, Li Deng, Dong Yu, and Yifan Gong, "Towards High-Accuracy Low-Cost Noisy Robust Speech Recognition Exploiting Structured Model", ICML 2011 Workshop on Learning Architectures, Representations, and Optimization for Speech and Visual Information Processing.
- Dong Yu, Li Deng, and George E. Dahl, "Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition", NIPS 2010 Workshop on Deep Learning and Unsupervised Feature Learning, Dec. 2010.
- Dong Yu, Li Deng, Shizhen Wang, "Learning in the Deep-Structured Conditional Random Fields", NIPS 2009 Workshop on Deep Learning for Speech Recognition and Related Applications, 2009.
- Hui Lin, Li Deng, Jasha Droppo, Dong Yu, Alex Acero, "Learning Methods in Multilingual Speech Recognition", NIPS 2008 workshops, Whistler, BC, Canada, 2008.
- Dong Yu, Li Deng, Alex Acero, "The Maximum Entropy Model with Continuous Features", NIPS 2008 workshops, Whistler, BC, Canada, 2008.
- Ivan Tashev, Michael Seltzer, Yun-Cheng Ju, Dong Yu and Alex Acero, "Commute UX: Telephone Dialog System for Location-based Services", SIGDIAL 2007, Antwerp, Belgium.
- Li Deng, Dong Yu, and Alex Acero. "A Generative Modeling Framework for Structured Hidden Speech Dynamics", in Proc. NIPS Workshop on Advances in Structured Learning for Text and Speech Processing 2005.
- Geoffrey Zweig, Y.C. Ju, Patrick Nguyen, Dong Yu, Ye-Yi Wang, Alex Acero, "Voice-Rate: A Dialog System for Consumer Ratings", NAACL-HLT 2007, Rochester, New York, USA, pp. 31-32.
- Li Deng, Xiang Li, Dong Yu, and Alex Acero, "Novel Acoustic Modeling with Structured Hidden Dynamics for Speech Coarticulation and Reduction", in Proc. of the DARPA RT04 Workshop. Palisades, New York, Nov 2004.
Patents (US and/or International)
Granted:
- "A Generic Framework for Large-Margin MCE Training in Speech Recognition", with Li Deng, Xiaodong He, Alex Acero (granted 2013, US patent #8423364)
- "Speech Recognition With Non-linear Noise Reduction on Mel-Frequency Cepstra", with Li Deng, Jasha Droppo, and Alex Acero (granted 2012, US patent #8306817)
- "Adapting a Compressed Model for Use in Speech Recognition", with Jinyu Li, Li Deng, Yifan Gong, and Alex Acero (granted 2012, US patent #8239195)
- "Phase Sensitive Model Adaptation for Noisy Speech Recognition", with Jinyu Li, Li Deng, Yifan Gong, and Alex Acero (granted 2012, US patent #8214215)
- "Speech-Centric Multimodal User Interface Design in Mobile Technology", with Li Deng (granted 2012, US patent #8219406)
- "High Performance HMM Adaptation With Joint Compensation of Additive and Convolutive Distortions", with Jinyu Li, Li Deng, Alex Acero (granted 2012, US patent #8180637)
- "Noise Suppressor for Robust Speech Recognition", with Li Deng, Yifan Gong, Jian Wu and Alex Acero (granted 2012, US patent #8185389)
- "Piecewise-Based Variable-Parameter Hidden Markov Models and the Training Thereof", with Li Deng, Yifan Gong, and Alex Acero (granted 2012, US patent #8160878)
- "Confidence Measure Generation for Speech Related Searching", with Ye-yi Wang and Yun-cheng Ju (granted 2012, US patent #8165877)
- "Classification Filter for Processing Data for Creating a Language Model", with Alex Acero, Julian J. Odell, Milind V. Mahajan, and Peter Mau (granted 2012, US patent #8165870)
- "Parameter Clustering and Sharing for Variable-Parameter Hidden Markov Models", with Li Deng, Yifan Gong, and Alex Acero (granted 2012, US patent #8145488)
- "Interactive Clustering Method for Identifying Problems in Speech Applications", with Alex Acero (granted 2012, US patent #8099279)
- "Detecting an Answering Machine Using Speech Recognition", with Yun Cheng Ju, Alex Acero, Craig M. Fisher, and Ye-yi Wang (granted 2011, US patent #8065146)
- "Automatic Speech Recognition Learning Using User Corrections", with Peter Mau, Mei-Yuh Hwang, and Alex Acero (granted 2011, US patent #8019602)
- "Parameter Learning in a Hidden Trajectory Model", with Li Deng, Xiaolong Li, Alex Acero (granted 2011, US patent #8010356)
- "Adapting a Language Model to Accommodate Inputs Not Found in a Directory Assistance Listing", with Yun Cheng Ju and Alex Acero (granted 2011, US patent #7912707)
- "Time Synchronous Decoding for Long-Span Hidden Trajectory Model", with Xiaolong Li, Li Deng, and Alex Acero (granted 2011, US patent #7877256)
- "Shareable Filler Model for Grammar Authoring", with Yun Cheng Ju, Alex Acero, and Ye-Yi Wang (granted 2011, US patent #7865357)
- "Integrated Speech Recognition and Semantic Classification", with Sibel Yaman, Li Deng, Ye-Yi Wang, and Alex Acero (granted 2011, US patent #7856351)
- "Compound Word Splitting for Directory Assistance Services", with Yun Cheng Ju and Alex Acero (granted 2011, US patent #7860707)
- "Efficient Capitalization through User Modeling", with Peter Mau (granted 2010, US patent #7827025)
- "Hidden Trajectory Modeling with Differential Cepstra for Speech Recognition", with Li Deng (granted 2010, US patent #7805308)
- "Time Asynchronous Decoding for Long-Span Trajectory Model", with Li Deng, Alex Acero (granted 2010, US patent #7734460)
- "Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model", with Li Deng, Alex Acero (granted 2010, US patent #7653535)
- "Method of Automatically Ranking Speech Dialog States and Transitions to Aid in Performance Analysis in Speech Applications", with Alex Acero (granted 2010, US patent #7643995)
- "System and Method for Identifying Semantic Intent from Acoustic Information", with Xiao Li, Asela J. Gunawardana, Alex Acero, and Milind Mahajan (granted 2009, US patent #7634406)
- "Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition", with Li Deng, Xiaodong He, Alex Acero (granted 2009, US patent #7617103)
- "Quantitative Model for Formant Dynamics and Contextually Assimilated Reduction in Fluent Speech", with Li Deng, and Alex Acero (granted 2009, US patent #7565292)
- "Indexing and Ranking Processes for Directory Assistance Services", with Yun Cheng Ju, Ye-yi Wang, and Alex Acero (granted 2009, US patent #7580942)
- "Acoustic Models with Structured Hidden Dynamics with Integration over Many Possible Hidden Trajectories", with Li Deng, and Alex Acero (granted 2009, US patent #7565284)
- "Speaker-adaptive Learning of Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation", with Li Deng, Alex Acero (granted 2009, US patent #7519531)
- "Two Stage Implementation for Phonetic Recognition Using a Bi-directional Target-filtering Model of Speech Co-articulation and Reduction", with Li Deng, and Alex Acero (granted 2008, US patent #7409346)
- "System and method for user modeling to enhance named entity recognition", with Peter Mau, Kuansan Wang, Milind Mahajan, and Alex Acero (granted 2007, US patent #7289956)
Filed/Pending:
- "Conservatively Adapting A Deep Neural Network In A Recognition System", with Kaisheng Yao, Frank Seide, Hang Su, and Gang Li (pending, filed 2013)
- "Exploiting Heterogeneous Data In Deep Neural Network-Based Speech Recognition Systems", with Jinyu Li, Jui-Ting Huang, and Yifan Gong (pending, filed 2013)
- "Multilingual Deep Neural Network", with Jui-Ting Huang, Jinyu Li, Yifan Gong, and Li Deng (pending, filed 2013)
- "Deep Neural Networks Training For Speech And Pattern Recognition", with Frank Seide, Xie Chen, Gang Li, Adam Eversole (pending, filed 2012)
- "Computer implemented deep tensor neural network", with Li Deng and Frank Seide (pending, filed 2012)
- "Tensor deep stacked neural network", with Li Deng, Brian Hutchinson (pending, filed 2012)
- "Exploiting Sparseness In Training Deep Neural Networks", with Frank Seide, Gang Li, and Li Deng (pending, filed 2011)
- "Discriminative Pretraining Of Deep Neural Networks", with Frank Seide, Gang Li, and Li Deng (pending, filed 2011)
- "Learning Processes For Single Hidden Layer Neural Networks With Linear Output Units", with Li Deng (pending, filed 2011)
- "Deep Convex Network With Joint Use Of Nonlinear Random Projection, Restricted Boltzmann Machine And Batch-Based Parallelizable Optimization", with Li Deng, and Alex Acero (pending, filed 2011)
- "Online Distorted Speech Estimation Within An Unscented Transformation Framework", with Jinyu Li, Li Deng, and Yifan Gong (pending, filed 2010)
- "Deep Belief Network For Large Vocabulary Continuous Speech Recognition", with George Dahl and Li Deng (pending, filed 2010)
- "Full-Sequence Training Of Deep Structures For Speech Recognition", with Abdel-rahman Mohamed and Li Deng (pending, filed 2010)
- "Deep-Structured Conditional Random Fields for Sequential Labeling and Classification", with Li Deng and Shizhen Wang (pending, filed 2010)
- "Confidence Calibration in Automatic Speech Recognition Systems", with Li Deng and Jinyu Li (pending, filed 2009)
- "MaxEnt model with continuous features", with Li Deng and Alex Acero (pending, filed 2009)
- "Searching Database of Listing", with Ye-Yi Wang, Yun-Cheng Ju, Alex Acero and Geoffrey Zweig (pending, filed 2007)
- "Configurable Grammar Templates", with Ye-Yi Wang, Yun-Cheng Ju, and Alex Acero (pending, filed 2005)
Invited Talks
- 2012 at University of Science and Technology of China, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
- 2012 at Northwestern Polytechnical University, China, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
- 2012 at I2R, A-Star, Singapore, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
- 2012 keynote at International Workshop on Spoken Language Translation (IWSLT 2012), Hong Kong, "Who Can Understand Your Speech Better -- Deep Neural Network or Gaussian Mixture Model?".
- 2012 at Cambridge University, "Why Deep Neural Networks Are Promising for Speech Recognition".
- 2012 at University of Edinburgh, "Why Deep Neural Networks Are Promising for Speech Recognition".
- 2012 at the International Conference on Machine Learning (ICML), invited related area talk, Edinburgh, UK, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks".
- 2012 keynote at International Workshop on Statistical Machine Learning for Speech Processing (IWSML 2012), Kyoto, Japan, "More Data + Deeper Model = Better Accuracy".
- 2012 at Zhejiang University, "Recent advances in automatic speech recognition".
- 2012 at Baidu, "Deep Neural Network based Speech Recognition – A New Paradigm".
- 2012 at Institute of Automation, Chinese Academy of Sciences, "Recent advances in automatic speech recognition".
- 2012 at University of Nevada at Reno, "Log-linear model: from shallow to deep".
- 2010 at University of Science and Technology of China, "Large vocabulary continuous speech recognition using context-dependent DNN-HMM model".
- 2010 at Tokyo Institute of Technology, "Large vocabulary continuous speech recognition using context-dependent DBN-HMM model".
- 2010 at University of Texas at Dallas, "Automatic speech recognition: challenges ahead".
- 2009 at Institute of Automation, Chinese Academy of Sciences, "A unified framework of variable-parameter hidden Markov models" and "The maximum entropy and hidden conditional random field model with distribution constraints".
- 2009 at National Cheng Kung University, "A unified framework of variable-parameter hidden Markov models" and "The maximum entropy and hidden conditional random field model with distribution constraints".
Invited Tutorials
- Dong Yu, "Deep Learning and Its Applications in Large Vocabulary Speech Recognition", Shanghai Jiaotong University, China, 2012.
- Dong Yu, "Large Vocabulary Speech Recognition Using Deep Neural Networks: Insights, Theory, and Practice", ISCSLP 2012, Hong Kong, China.
- Dong Yu, Li Deng, "Deep Learning and Its Applications in Signal Processing", ICASSP 2012, Kyoto, Japan.
Technical Services
- Grant reviewer/panelist: Austrian Science Fund (2013-), Romanian National Research Council (2012-), Research Grants Council (RGC) of Hong Kong (2008-)
- Associate editor: IEEE Transactions on Audio, Speech, and Language Processing (2011-), IEEE Signal Processing Magazine (2008-2011)
- Guest editor: IEEE Transactions on Audio, Speech, and Language Processing - special issue on deep learning for speech and language processing (2010).
- Technical/review/program committee member: IEEE SLTC (2013-), ICASSP 2004-now, INTERSPEECH 2004-now, NAACL-HLT 2009-now, ASRU 2009-now, ISCSLP 2008-now, ACL 2011-now, IEEE ChinaSIP (2013-), MLSLP 2012, ISIEA 2012, ICSC 2007-2008, EUSIPCO 2008, ACMSE 2005, NIPS workshop 2009.
- Session chair: ACNS 2004, ICSC 2007, ICASSP 2010, Interspeech 2010-2011, ELM 2012, ICASSP 2013.
- Referee for journals: IEEE Transactions on Audio, Speech, and Language Processing, J. Computer Speech and Language, J. Speech Communication, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Signal Processing, J. Pattern Recognition Letters, EURASIP J. on Audio Speech and Music Processing, J. Computer Security, J. Computer Networks, IEEE Transactions on Computers, J. Data & Knowledge Engineering.
Students/Interns Mentored/Co-mentored
- Andrew Maas, Stanford University (2012)
- Brian Hutchinson, University of Washington (2011, co-mentored)
- Xin Chen, University of Missouri (2011, now a speech scientist at Pearson)
- George Dahl, University of Toronto (2010)
- Abdel-rahman Mohamed, University of Toronto (2010, co-mentored)
- Shizhen Wang, UCLA (2009, now a scientist at Microsoft)
- Balakrishnan Varadarajan, Johns Hopkins University (2008, now a scientist at Google)
- Raghu Sampath Kumaran, University of Washington (2007)
- Jahanzeb Sherwani, Carnegie Mellon University (2006, now an adjunct faculty member at CMU)
- Nelson Lee, Stanford University (2006, now a product manager at Google)
Honors and Awards
- Microsoft Achievement Award (2013)
- Microsoft gold star award (2004, 2009, 2013)
- Best presentation award, ELM (2012)
- Microsoft Research technical transfer award (2009)
- Best paper award, ACMSE (2005)
- Microsoft patent awards (2002-now)
- Forensics workshop travel grant (2002)
- Indiana University graduate school fellowship (1995-1996)
- Chinese Academy of Sciences presidential award (1994)
- ICSLP-94 travel grant (1994)
- Chinese Academy of Sciences excellent graduate student award (1993)
- Graduate school of Academia Sinica excellent graduate student award (1992)
- Zhejiang province excellent graduate award (1991)
- Zhejiang University excellent graduate award (1991)
- Zhejiang University excellent student award (1987-1991)
