Dong Yu

PRINCIPAL RESEARCHER

Brief Biography

Dr. Dong Yu (俞栋) joined Microsoft Corporation in 1998 and the Microsoft Speech and Dialog Research Group in 2002, where he is currently a principal researcher. He holds a Ph.D. degree in computer science from the University of Idaho, an MS degree in computer science from Indiana University at Bloomington, an MS degree in electrical engineering from the Chinese Academy of Sciences, and a BS degree (with honors) in electrical engineering from Zhejiang University (China). His current research interests include speech processing, robust speech recognition, discriminative training, and machine learning. He has published over 130 papers in these areas and is the inventor/coinventor of more than 50 granted/pending patents.

His most recent work focuses on deep learning and its application to large vocabulary speech recognition. The context-dependent deep neural network hidden Markov model (CD-DNN-HMM) that he co-proposed and developed has seriously challenged the dominant position of conventional GMM-based systems for large vocabulary speech recognition and has helped popularize deep learning. This work was recognized by the IEEE SPS 2013 Best Paper Award.

Dr. Dong Yu is a senior member of IEEE, a member of ACM, and a member of ISCA. He is currently serving as a member of the IEEE Speech and Language Processing Technical Committee (2013-) and as an associate editor of IEEE Transactions on Audio, Speech, and Language Processing (2011-). He has served as an associate editor of IEEE Signal Processing Magazine (2008-2011) and as the lead guest editor of the IEEE Transactions on Audio, Speech, and Language Processing special issue on deep learning for speech and language processing (2010-2011).

Equation Number in Office 2007 and 2010

Office 2007 and 2010 come with a very nice equation editor and bibliography manager. However, they do not support equation and theorem number management. To work around this problem, I have developed a set of macros. You can download them here.

Publications

Books

  1. Li Deng and Dong Yu, Deep Learning: Methods and Applications, Now Publishers, 2014 (in press)

Book Chapters

  1. Dong Yu, Li Deng, "Speech-Centric Multimodal User Interface Design in Mobile Technology", Handbook of Research on User Interface Design and Evaluation for Mobile Technology (Editor: Joanna Lumsden), Jan. 2008, IGI.
  2. Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero, "Voice Search (Ch.5)", in Language Understanding: Systems for Extracting Semantic Information from Speech (editors: Gokhan Tür & Renato De Mori), John Wiley & Sons, 2011.

Refereed Journals and Magazines

  1. Zhen-Hua Ling, Li Deng, Dong Yu, “Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129 - 2139, Oct. 2013. 10.1109/TASL.2013.2269291
  2. Kaisheng Yao, Dong Yu, Li Deng, Yifan Gong, "A Fast Maximum Likelihood Nonlinear Feature Transformation Method for GMM-HMM Speaker Adaptation", Neurocomputing - special issue on extreme learning machine, vol. 128, pp. 145–152, March 2014. http://dx.doi.org/10.1016/j.neucom.2013.02.050
  3. Brian Hutchinson, Li Deng, Dong Yu, "Tensor Deep Stacking Networks", IEEE Trans. on Pattern Analysis and Machine Intelligence - special issue on learning deep architectures, vol. 35, no. 8, pp. 1944-1957, August 2013, http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.268.
  4. Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee, "Exploiting Deep Neural Networks for Detection-Based Speech Recognition", Neurocomputing, vol. 106, pp. 148-157, April 2013. http://dx.doi.org/10.1016/j.neucom.2012.11.008.
  5. Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee, "Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model", IEEE Signal Processing Letters, vol. 20, no. 3, pp. 201-204, March 2013. 10.1109/LSP.2013.2237901.
  6. Dong Yu, Li Deng, Frank Seide, "The Deep Tensor Neural Network with Applications to Large Vocabulary Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, Feb, 2013, http://dx.doi.org/10.1016/j.neucom.2012.11.008.
  7. Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury, “Deep Neural Networks for Acoustic Modeling in Speech Recognition”, IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97 Nov. 2012. doi:10.1109/MSP.2012.2205597.
  8. Dong Yu, Li Deng, "Efficient and Effective Algorithms for Training Single-Hidden-Layer Neural Networks", Pattern Recognition Letters, vol. 33, no. 5, pp. 554-558. April 2012. DOI information: 10.1016/j.patrec.2011.12.002.
  9. George E. Dahl, Dong Yu, Li Deng, and Alex Acero, "Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing - Special Issue on Deep Learning for Speech and Language Processing, vol. 20, no. 1, pp. 33-42, Jan 2012. doi:10.1109/TASL.2011.2134090. (2013 IEEE SPS Best Paper Award).
  10. Dong Yu, Jinyu Li, Li Deng, "Calibration of Confidence Measures in Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Nov. 2011, pp. 2461-2473. doi:10.1109/TASL.2011.2141988.
  11. Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Ye-Yi Wang, and Dong Yu, "In Car Media Search", IEEE Signal Processing Magazine, vol. 28, no. 4, pp. 50-60, June 2011, doi:10.1109/MSP.2011.941065.
  12. Dong Yu, Shizhen Wang, Li Deng, "Sequential Labeling Using Deep-Structured Conditional Random Fields", Journal of Selected Topics in Signal Processing - special issue on Statistical Learning Methods for Speech and Language Processing, vol 4, no 6, December 2010, pp. 965-973. doi:10.1109/JSTSP.2010.2075990.
  13. Dong Yu, Li Deng, Alex Acero, "Using Continuous Features in the Maximum Entropy Model", Pattern Recognition Letters. doi: 10.1016/j.patrec.2009.06.005. Vol. 30, Issue 14, pp. 1295-1300, October, 2009.
  14. Dong Yu, Li Deng, Yifan Gong, Alex Acero, "A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models", IEEE Transactions on Audio, Speech and Language Processing, vol 17, no. 7, pp. 1348-1360, September 2009. doi:10.1109/TASL.2009.2020890.
  15. Dong Yu, Balakrishnan Varadarajan, Li Deng, Alex Acero, "Active Learning and Semi-supervised Learning for Speech Recognition: A Unified Framework using the Global Entropy Reduction Maximization Criterion", Computer Speech and Language - Special Issue on Emergent Artificial Intelligence Approaches for Pattern Recognition in Speech and Language Processing, vol. 24, issue 3, pp. 433-444, July 2010. doi:10.1016/j.csl.2009.03.004.
  16. Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "A Unified Framework of HMM Adaptation with Joint Compensation of Additive and Convolutive Distortions", Computer Speech and Language, vol. 23, pp. 389-405, 2009. doi:10.1016/j.csl.2009.02.001.
  17. Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero, "An integrative and discriminative technique for spoken utterance classification", IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 6, pp. 1207-1215, Aug. 2008. doi: 10.1109/TASL.2008.2001106.
  18. Dong Yu, Li Deng, Xiaodong He, Alex Acero, "Large-Margin Minimum Classification Error Training: A Theoretical Risk Minimization Perspective", Computer Speech and Language, Vol. 22, No. 4, October 2008, pp. 415-429. doi:10.1016/j.csl.2008.03.002.
  19. Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero, "Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor", IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 5, pp. 1061-1070, July 2008. DOI: 10.1109/TASL.2008.921761.
  20. Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero, "An Introduction to Voice Search", IEEE Signal Processing Magazine (Special Issue on Spoken Language Technology), vol. 25, no. 3, pp. 28-38, May 2008. doi: 10.1109/MSP.2008.918411.
  21. Dong Yu, Deborah Frincke, "Improving the Quality of Alerts and Predicting Intruder's Next Goal with Hidden Colored Petri-Net", Computer Networks, Volume 51, Issue 3, 21 February 2007, Pages 632-654. doi: 10.1016/j.comnet.2006.05.008.
  22. Dong Yu, Li Deng, Alex Acero, "Speaker-Adaptive Learning of Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation", Computer Speech and Language, Vol. 21, 2007, pp. 72-87. doi:10.1016/j.csl.2005.12.002.
  23. Dong Yu, Li Deng, Alex Acero, "A Lattice Search Technique for a Long-Contextual-Span Hidden Trajectory Model of Speech", Speech Communication, Elsevier. Volume: 48 Issue: 9, Sep 2006. pp. 1214-1226. doi:10.1016/j.specom.2006.05.002.
  24. Li Deng, Dong Yu, and Alex Acero. "Structured Speech Modeling", IEEE Trans. on Audio, Speech and Language Processing. Volume: 14 Issue: 5, Sep 2006. pp. 1492- 1504. doi: 10.1109/TASL.2006.878265.
  25. Li Deng, Dong Yu, and Alex Acero, "A Bidirectional Target-Filtering Model of Speech Coarticulation and Reduction: Two-Stage Implementation for Phonetic Recognition", IEEE Trans. Audio, Speech & Language Proc, vol. 14, No. 1, pp 256-265, Jan 2006. doi: 10.1109/TSA.2005.854107.
  26. Dong Yu, Alex Acero, "Semiautomatic Improvements of System-Initiative Spoken Dialog Applications Using Interactive Clustering", IEEE Trans. Speech & Audio Proc (Special Issue on Data Mining of Speech, Audio and Dialog), Sept. 2005, vol. 13, no. 5, pp. 661-671. DOI: 10.1109/TSA.2005.851876.
  27. Li Deng, Dong Yu, "A Speech-Centric Perspective for Human-Computer Interface: A Case Study", Journal of VLSI Signal Processing Systems (Special Issue on Multimedia Signal Processing), Vol. 41, No. 3. pp. 255-269, November 2005. doi: 10.1007/s11265-005-4150-4.

Invited/Other Journals and Magazines

  1. Li Deng and Dong Yu, "Hidden Trajectory Modeling With Differential Cepstra For Speech Recognition", J. Acoust. Soc. Am. Vol. 131, no. 1, pp. 650-651, 2012
  2. Dong Yu, Geoffrey Hinton, Nelson Morgan, Jen-Tzung Chien, and Shigeki Sagayama, "Introduction to the Special Section on Deep Learning for Speech and Language Processing", IEEE Transactions on Audio, Speech, and Language Processing, Jan 2012
  3. Dong Yu, Li Deng, "Deep Learning and its Relevance to Signal and Information Processing", IEEE Signal Processing Magazine, vol. 28, No. 1, pp. 145-154, Jan 2011. doi:10.1109/MSP.2010.939038.
  4. Dong Yu, Li Deng, "Solving Nonlinear Estimation Problems Using Splines", IEEE Signal Processing Magazine, vol. 26, no. 4, pp. 86-90, July 2009.
  5. Dong Yu, Li Deng, "Teach-Ware: Signal Processing Resources at Connexions", IEEE Signal Processing Magazine, March 2009. doi: 10.1109/MSP.2008.931093.

Refereed Conferences

  1. Chao Weng, Dong Yu, Mike Seltzer and Jasha Droppo, "Single-channel Mixed Speech Recognition Using Deep Neural Networks", ICASSP 2014 (to appear)
  2. Chao Weng, Dong Yu, Shinji Watanabe and Fred Juang, "Recurrent Deep Neural Networks for Robust Speech Recognition", ICASSP 2014 (to appear)
  3. Nicolas Boulanger-Lewandowski, Jasha Droppo, Mike Seltzer and Dong Yu, "Phone sequence modeling with recurrent neural networks", ICASSP 2014 (to appear)
  4. Kaisheng Yao, Baolin Peng, Geoffrey Zweig, Dong Yu, Xiaolong Li and Feng Gao, "Recurrent Conditional Random Field for Language Understanding", ICASSP 2014 (to appear)
  5. Frank Seide, Hao Fu, Jasha Droppo, Gang Li, Dong Yu, "On Parallelizability of Stochastic Gradient Descent for Speech DNNs", ICASSP 2014 (to appear)
  6. Jian Xue, Jinyu Li, Dong Yu, Mike Seltzer and Yifan Gong, "Singular Value Decomposition Based Low-footprint Speaker Adaptation and Personalization for Deep Neural Network", ICASSP 2014 (to appear)
  7. Yan Huang, Dong Yu, Yifan Gong and Chaojun Liu, "Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration", Interspeech 2013, pp. 2360-2364.
  8. Ossama Abdel-Hamid, Li Deng and Dong Yu, "Exploring Convolutional Neural Network Structures and Optimization Techniques for Speech Recognition", Interspeech 2013, pp. 3366-3370.
  9. Ossama Abdel-Hamid, Li Deng, Dong Yu and Hui Jiang, "Deep Segmental Neural Network for Automatic Speech Recognition", Interspeech 2013, pp. 1849-1853.
  10. Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi and Dong Yu, "Recurrent Neural Networks for Language Understanding", Interspeech 2013, pp. 2524-2528.
  11. Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, Frank Seide, "Feature Learning in Deep Neural Networks - Studies on Speech Recognition Tasks", ICLR 2013.
  12. Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide, "KL-Divergence Regularized Deep Neural Network Adaptation For Improved Large Vocabulary Speech Recognition", pp. 7893-7897, ICASSP 2013.
  13. Jui-Ting Huang, Jinyu Li, Dong Yu, Li Deng, Yifan Gong, "Cross-Language Knowledge Transfer Using Multilingual Deep Neural Network With Shared Hidden Layers", pp. 7304-7308, ICASSP 2013.
  14. George Dahl, Jack Stokes, Li Deng, Dong Yu, "Large-Scale Malware Classification Using Random Projections And Neural Networks", pp. 3422-3426, ICASSP 2013.
  15. Li Deng, Jinyu Li, Jui-Ting Huang, Kaisheng Yao, Dong Yu, Frank Seide, Michael Seltzer, Geoff Zweig, Xiaodong He, Jason Williams, Yifan Gong, Alex Acero, "Recent Advances Of Deep Learning For Speech Research At Microsoft", pp. 8604-8608, ICASSP 2013.
  16. Michael Seltzer, Dong Yu, Yongqiang Wang, "An Investigation Of Deep Neural Networks For Noise Robust Speech Recognition", pp. 7398-7402, ICASSP 2013.
  17. Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong, "Improving Wideband Speech Recognition Using Mixed-Bandwidth Training Data In CD-DNN-HMM", SLT 2012.
  18. Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide, "Context-Dependent Deep Neural Networks For Audio Indexing Of Real-Life Data", SLT 2012.
  19. Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong, "Adaptation Of Context-Dependent Deep Neural Networks For Automatic Speech Recognition", pp. 366-369, SLT 2012.
  20. Xie Chen, Adam Eversole, Gang Li, Dong Yu, and Frank Seide, “Pipelined Back-Propagation for Context-Dependent Deep Neural Networks”, Interspeech 2012.
  21. Dong Yu, Li Deng, Frank Seide, “Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks”, Interspeech 2012.
  22. Li Deng, Brian Hutchinson, and Dong Yu, “Parallel Training of Deep Stacking Networks”, Interspeech 2012.
  23. Dong Yu, Frank Seide, Gang Li, Li Deng, "Exploiting Sparseness In Deep Neural Networks For Large Vocabulary Speech Recognition", ICASSP 2012, pp.4409-4412.
  24. Brian Hutchinson, Li Deng, Dong Yu, "A deep architecture with bilinear modeling of hidden representations: applications to phonetic recognition", ICASSP 2012, pp. 4805-4808.
  25. Dong Yu, Sabato Siniscalchi, Li Deng, Chin-Hui Lee, “Boosting Attribute And Phone Estimation Accuracies With Deep Neural Networks For Detection-Based Speech Recognition”, ICASSP 2012, pp. 4169-4172.
  26. Li Deng, Dong Yu, John Platt, "Scalable stacking and learning for building deep architectures", ICASSP 2012, pp. 2133-2136.
  27. Frank Seide, Gang Li, Xie Chen, Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription", ASRU 2011, pp. 24-29.
  28. Dong Yu and Michael L. Seltzer, "Improved Bottleneck Features Using Pretrained Deep Neural Networks", Interspeech 2011, pp. 237-240.
  29. Frank Seide, Gang Li and Dong Yu, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks", Interspeech 2011, pp. 437-440.
  30. Dong Yu and Li Deng, "Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition", Interspeech 2011, pp. 2281-2284.
  31. Li Deng and Dong Yu, "Deep Convex Network: A Scalable Architecture for Deep Learning", Interspeech 2011, pp. 2285-2288.
  32. George E. Dahl, Dong Yu, Li Deng, and Alex Acero, "Large Vocabulary Continuous Speech Recognition With Context-Dependent DBN-HMMS",  ICASSP 2011, pp. 4688-4691.
  33. Li Deng, Mike Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, and Geoff Hinton, "Binary Coding of Speech Spectrograms Using a Deep Auto-encoder", in Interspeech 2010, pp. 1692-1695.
  34. Jinyu Li, Dong Yu, Yifan Gong, and Li Deng, "Unscented Transform with Online Distortion Estimation for HMM Adaptation", in Interspeech 2010, pp. 1660-1663.
  35. Dong Yu and Li Deng, "Deep-Structured Hidden Conditional Random Fields for Phonetic Recognition", in Interspeech 2010. pp.2986-2989.
  36. Abdel-rahman Mohamed, Dong Yu, and Li Deng, "Investigation of Full-Sequence Training of Deep Belief Networks for Speech Recognition", in Interspeech 2010. pp. 2846-2849.
  37. Dong Yu, Shizhen Wang, Jinyu Li, Li Deng, "Word Confidence Calibration Using a Maximum Entropy Model with Constraints on Confidence and Word Distributions", ICASSP 2010, pp. 4446-4449.
  38. Dong Yu, Li Deng, "Semantic Confidence Calibration for Spoken Dialog Applications", ICASSP 2010, pp. 4450-4453.
  39. Dong Yu, Shizhen Wang, Zahi Karam, Li Deng, "Language Recognition Using Deep-Structured Conditional Random Fields", ICASSP 2010, pp. 5030-5033.
  40. Dong Yu, Li Deng, Alex Acero, "Hidden Conditional Random Field with Distribution Constraints for Phone Classification", Interspeech, pp. 676-679, 2009.
  41. Oriol Vinyals, Li Deng, Alex Acero, Dong Yu, "Discriminative Pronunciation Learning Using Phonetic Decoder and Minimum-Classification-Error Criterion", ICASSP 2009, pp. 4445-4448.
  42. Hui Lin, Li Deng, Dong Yu, Yifan Gong, Alex Acero, Chin-Hui Lee, "A Study on Multilingual Acoustic Modeling For Large Vocabulary ASR", ICASSP 2009, pp. 4333-4336.
  43. Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero, "Cross-Lingual Speech Recognition under Runtime Resource Constraints", ICASSP 2009, pp. 4193-4196.
  44. Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero, "Maximizing Global Entropy Reduction for Active Learning In Speech Recognition", ICASSP 2009, pp. 4721-4724. 
  45. Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero, "Using Collective Information in Semi-Supervised Learning for Speech Recognition", ICASSP 2009, pp. 4633-4636.
  46. Dong Yu, Li Deng, Jian Wu, Yifan Gong, Alex Acero, "Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition", ISCSLP 2008, Kunming, China.
  47. Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Discriminative Training of Variable-Parameter HMMs for Noise Robust Speech Recognition", Interspeech 2008, pp. 285-288.
  48. Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Parameter Clustering and Sharing in Variable-Parameter HMMs for Noise Robust Speech Recognition", Interspeech 2008, pp. 1253-1256.
  49. Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alex Acero, "Adaptation of Compressed HMM Parameters for Resource-Constrained Speech Recognition", ICASSP 2008, pp. 4333-4336.
  50. Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "HMM Adaptation Using a Phase-Sensitive Acoustic Distortion Model For Environment-Robust Speech Recognition", ICASSP 2008, pp. 4069-4072.
  51. Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero, "A Minimum-Mean-Square-Error Noise Reduction Algorithm on Mel-Frequency Cepstra for Robust Speech Recognition", ICASSP 2008, pp. 4041-4044.
  52. Dong Yu, Li Deng, "Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition", ICSC 2007, Irvine, CA (invited).
  53. Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero, "The Voice-Rate Dialog System for Consumer Ratings", Interspeech 2007.
  54. J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero, "VoicePedia: Towards Speech-based Access to Unstructured Information", Interspeech 2007.
  55. Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero, "Automated Directory Assistance System - from Theory to Practice", Interspeech 2007.
  56. Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero, "Confidence Measures for Voice Search Applications", Interspeech 2007.
  57. Dong Yu, Li Deng, Xiaodong He, Alex Acero, "Use of Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition", Interspeech 2006.
  58. Xiaolong Li, Li Deng, Dong Yu, Alex Acero, "A Time-Synchronous Phonetic Decoder for a Long-Contextual-Span Hidden Trajectory Model", Interspeech 2006.
  59. Dong Yu, Yun Cheng Ju, Ye-Yi Wang, Alex Acero, "N-Gram Based Filler Model for Robust Grammar Authoring", ICASSP 2006.
  60. Li Deng, Dong Yu, and Alex Acero. "A Long-Contextual-Span Model of Resonance Dynamics for Speech Recognition: Parameter Learning and Recognizer Evaluation," in Proc. ASRU 2005.
  61. Dong Yu, Deborah Frincke, "Alert Confidence Fusion in Intrusion Detection Systems with Extended Dempster-Shafer Theory", in the 43rd Annual ACM Southeast Conference, 2005.
  62. Dong Yu, Milind Mahajan, Peter Mau, and Alex Acero, "Maximum Entropy Based Generic Filter for Language Model Adaptation", ICASSP 2005.
  63. Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng, "Unsupervised Learning from Users' Error Correction in Speech Dictation", InterSpeech-ICSLP 2004.
  64. Dong Yu, Deborah Frincke, "A Novel Framework for Alert Correlation and Understanding" (© Springer-Verlag), Springer's LNCS series, vol 3089. International Conference on Applied Cryptography and Network Security (ACNS) 2004.
  65. Dong Yu, Deborah Frincke, "Towards Survivable Intrusion Detection System", the 37th Hawaii International Conference On System Science (HICSS-37), Big Island, Hawaii, 2004.
  66. Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero, "Improved Name Recognition With User Modeling", EUROSPEECH 2003.
  67. Dong Yu and Taiyi Huang, "A New HMM/NN Hybrid Method for High Performance Speech Recognition", ICSLP 1994.
  68. Dong Yu and Taiyi Huang, "A New Time-alignment Approach for Robust Neural Network Based Speech Recognition", in Proceedings of IEEE International Conference in Signal Processing (ICSP-93), 1993.

Workshops and/or Short Papers:

  1. Kaisheng Yao, Baolin Peng, Geoffrey Zweig, Dong Yu, Xiaolong Li, Feng Gao, "Recurrent Conditional Random Fields", NIPS 2013 Deep Learning Workshop.
  2. Brian Guenter, Dong Yu, Adam Eversole, Oleksii Kuchaiev, and Michael L. Seltzer, "Stochastic Gradient Descent Algorithm in the Computational Network Toolkit", OPT2013: NIPS 2013 Workshop on Optimization for Machine Learning
  3. Dong Yu, Frank Seide, and Gang Li, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks", ICML 2012 (invited related area talk).
  4. Dong Yu, Xin Chen, and Li Deng, "Factorized deep neural networks for adaptive speech recognition", International workshop on statistical machine learning for speech processing, March 2012.
  5. Li Deng and Dong Yu, "Deep Convex Networks for Image and Speech Classification", ICML 2011 Workshop on Learning Architectures, Representations, and Optimization for Speech and Visual Information Processing.
  6. Jinyu Li, Li Deng, Dong Yu, and Yifan Gong, "Towards High-Accuracy Low-Cost Noisy Robust Speech Recognition Exploiting Structured Model", ICML 2011 Workshop on Learning Architectures, Representations, and Optimization for Speech and Visual Information Processing.
  7. Dong Yu, Li Deng, and George E. Dahl, "Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition", NIPS 2010 Workshop on Deep Learning and Unsupervised Feature Learning, Dec. 2010.
  8. Dong Yu, Li Deng, Shizhen Wang, "Learning in the Deep-Structured Conditional Random Fields", NIPS 2009 Workshop on Deep Learning for Speech Recognition and Related Applications, 2009.
  9. Hui Lin, Li Deng, Jasha Droppo, Dong Yu, Alex Acero, "Learning Methods in Multilingual Speech Recognition", NIPS 2008 workshops, Whistler, BC, Canada, 2008.
  10. Dong Yu, Li Deng, Alex Acero, "The Maximum Entropy Model with Continuous Features", NIPS 2008 workshops, Whistler, BC, Canada, 2008.
  11. Ivan Tashev, Michael Seltzer, Yun-Cheng Ju, Dong Yu and Alex Acero, "Commute UX: Telephone Dialog System for Location-based Services", SIGDIAL 2007, Antwerp, Belgium.
  12. Li Deng, Dong Yu, and Alex Acero. "A Generative Modeling Framework for Structured Hidden Speech Dynamics", in Proc. NIPS Workshop on Advances in Structured Learning for Text and Speech Processing 2005.
  13. Geoffrey Zweig, Y.C. Ju, Patrick Nguyen, Dong Yu, Ye-Yi Wang, Alex Acero, "Voice-Rate: A Dialog System for Consumer Ratings", NAACL-HLT 2007, Rochester, New York, USA, pp. 31-32.
  14. Li Deng, Xiang Li, Dong Yu, and Alex Acero, "Novel Acoustic Modeling with Structured Hidden Dynamics for Speech Coarticulation and Reduction", in Proc. of the DARPA RT04 Workshop. Palisades, New York, Nov 2004.

Patents (US and/or International)

Granted:

  1. Li Deng, Dong Yu and Alex Acero, "Deep Convex Network With Joint Use Of Nonlinear Random Projection, Restricted Boltzmann Machine And Batch-Based Parallelizable Optimization" (granted 2013, US patent #8489529)
  2. Dong Yu, Li Deng and Shizhen Wang, "Deep-Structured Conditional Random Fields for Sequential Labeling and Classification" (granted 2013, US patent #8473430)
  3. Dong Yu, Li Deng, Xiaodong He, and Alex Acero, "Generic Framework for Large-Margin MCE Training in Speech Recognition" (granted 2013, US patent #8423364)
  4. Dong Yu, Alex Acero, Jasha Droppo, Li Deng, "Speech Recognition With Non-linear Noise Reduction on Mel-Frequency Cepstra" (granted 2012, US patent #8306817)
  5. Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alex Acero, "Adapting a Compressed Model for Use in Speech Recognition" (granted 2012, US patent #8239195)
  6. Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero, "Phase Sensitive Model Adaptation for Noisy Speech Recognition" (granted 2012, US patent #8214215)
  7. Dong Yu and Li Deng, "Speech-Centric Multimodal User Interface Design in Mobile Technology" (granted 2012, US patent #8219406)
  8. Dong Yu, Li Deng, Alex Acero, Yifan Gong, Jinyu Li, "High Performance HMM Adaptation With Joint Compensation of Additive and Convolutive Distortions" (granted 2012, US patent #8180637)
  9. Dong Yu, Li Deng, Yifan Gong, Jian Wu, Alex Acero, "Noise Suppressor for Robust Speech Recognition" (granted 2012, US patent #8185389)
  10. Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Piecewise-Based Variable-Parameter Hidden Markov Models and the Training Thereof" (granted 2012, US patent #8160878)
  11. Ye-Yi Wang, Yun-Cheng Ju, and Dong Yu, "Confidence Measure Generation for Speech Related Searching" (granted 2012, US patent #8165877)
  12. Alex Acero, Dong Yu, Julian J. Odell, Milind V. Mahajan, Peter Mau, "Classification Filter for Processing Data for Creating a Language Model" (granted 2012, US patent #8165870)
  13. Dong Yu, Li Deng, Yifan Gong, Alex Acero, "Parameter Clustering and Sharing for Variable-Parameter Hidden Markov Models" (granted 2012, US patent #8145488)
  14. Alex Acero and Dong Yu, "Interactive Clustering Method for Identifying Problems in Speech Applications" (granted 2012, US patent #8099279)
  15. Alex Acero, Craig M. Fisher, Dong Yu, Ye-Yi Wang, Yun Cheng Ju, "Detecting an Answering Machine Using Speech Recognition" (granted 2011, US patent #8065146)
  16. Dong Yu, Peter Mau, Mei-Yuh Hwang, Alex Acero, "Automatic Speech Recognition Learning Using User Corrections" (granted 2011, US patent #8019602)
  17. Li Deng, Dong Yu, Xiaolong Li, Alex Acero, "Parameter Learning in a Hidden Trajectory Model" (granted 2011, US patent #8010356)
  18. Dong Yu, Alex Acero, Yun Cheng Ju, "Adapting a Language Model to Accommodate Inputs Not Found in a Directory Assistance Listing" (granted 2011, US patent #7912707)
  19. Xiaolong Li, Li Deng, Dong Yu, Alex Acero, "Time Synchronous Decoding for Long-Span Hidden Trajectory Model" (granted 2011, US patent #7877256)
  20. Alex Acero, Dong Yu, Ye-Yi Wang, Yun Cheng Ju, "Shareable Filler Model for Grammar Authoring" (granted 2011, US patent #7865357)
  21. Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero, "Integrated Speech Recognition and Semantic Classification" (granted 2011, US patent #7856351)
  22. Dong Yu, Alex Acero, Yun Cheng Ju, "Compound Word Splitting for Directory Assistance Services" (granted 2011, US patent #7860707)
  23. Peter Mau and Dong Yu, "Efficient Capitalization through User Modeling" (granted 2010, US patent #7827025)
  24. Li Deng and Dong Yu, "Hidden Trajectory Modeling with Differential Cepstra for Speech Recognition" (granted 2010, US patent #7805308)
  25. Dong Yu, Li Deng, Alex Acero, "Time Asynchronous Decoding for Long-Span Trajectory Model" (granted 2010, US patent #7734460)
  26. Dong Yu, Li Deng, Alex Acero, "Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model" (granted 2010, US patent #7653535)
  27. Alex Acero and Dong Yu, "Method of Automatically Ranking Speech Dialog States and Transitions to Aid in Performance Analysis in Speech Applications" (granted 2010, US patent #7643995)
  28. Xiao Li, Asela J. Gunawardana, Alex Acero, Milind Mahajan, and Dong Yu, "System and Method for Identifying Semantic Intent from Acoustic Information" (granted 2009, US patent #7634406)
  29. Xiaodong He, Alex Acero, Dong Yu, Li Deng, "Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition" (granted 2009, US patent #7617103)
  30. Li Deng, Alex Acero, and Dong Yu, "Quantitative Model for Formant Dynamics and Contextually Assimilated Reduction in Fluent Speech" (granted 2009, US patent #7565292)
  31. Dong Yu, Alex Acero, Yun Cheng Ju, and Ye-Yi Wang, "Indexing and Ranking Processes for Directory Assistance Services" (granted 2009, US patent #7580942)
  32. Li Deng, Alex Acero, Dong Yu, Xiaolong Li, "Acoustic Models with Structured Hidden Dynamics with Integration over Many Possible Hidden Trajectories" (granted 2009, US patent #7565284)
  33. Alex Acero, Dong Yu, Li Deng, "Speaker-adaptive Learning of Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation" (granted 2009, US patent #7519531)
  34. Alex Acero, Dong Yu, Li Deng, "Two Stage Implementation for Phonetic Recognition Using a Bi-directional Target-filtering Model of Speech Co-articulation and Reduction" (granted 2008, US patent #7409346)
  35. Dong Yu, Peter Mau, Kuansan Wang, Milind Mahajan, Alex Acero, "System and method for user modeling to enhance named entity recognition" (granted 2007, US patent #7289956)
 

Filed/Pending:

  1. Dong Yu, Chao Weng, Mike Seltzer, Jasha Droppo, “Mixed Speech Recognition” (pending, filed 2014)
  2. Jian Xue, Jinyu Li, Dong Yu, Mike Seltzer, Yifan Gong, “Low-Footprint Adaptation and Personalization for A Deep Neural Network” (pending, filed 2014).
  3. Kaisheng Yao, Baolin Peng, Geoffrey Zweig, Dong Yu, Xiaolong Li, Feng Gao, “Recurrent Conditional Random Fields” (pending, filed 2014)
  4. Anoop Deoras, Kaisheng Yao, Xiaodong He, Geoff Zweig, Mei-Yuh Hwang, Ruhi Sarikaya, Gregoire Mesnil, Li Deng, Dong Yu, "Assignment Of Semantic Labels To A Sequence Of Words Using Neural Network Architectures" (pending, filed 2013)
  5. Jui-Ting Huang, Jinyu Li, Dong Yu, Yifan Gong, Li Deng, “Multilingual Deep Neural Network” (pending, filed 2013).
  6. Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide, “Conservatively Adapting a Deep Neural Network In a Recognition System” (pending, filed 2013).
  7. Jinyu Li, Dong Yu, Yifan Gong, “Exploiting Heterogeneous Data in Deep Neural Network Based Speech Recognition Systems” (pending, filed 2013)
  8. Dong Yu, Frank Seide, Gang Li, Adam Eversole, Xie Chen, "Deep Neural Networks Training For Speech Recognition" (pending, filed 2012)
  9. Dong Yu, Li Deng and Frank Seide, "Computer implemented deep tensor neural network" (pending, filed 2012, App Number 13/597268)
  10. Dong Yu, Li Deng, Brian Hutchinson, "Tensor deep stacked neural network" (pending, filed 2012, App Number 13/397580)
  11. Dong Yu, Li Deng, Frank Seide, and Gang Li, "Exploiting Sparseness In Training Deep Neural Networks" (pending, filed 2011, App Number 13/305741)
  12. Dong Yu, Li Deng, Frank Seide, and Gang Li, "Discriminative Pretraining Of Deep Neural Networks" (pending, filed 2011, App Number 13/304643)
  13. Li Deng and Dong Yu, "Learning Processes For Single Hidden Layer Neural Networks With Linear Output Units" (pending, filed 2011, App Number 13/113100)
  14. Li Deng, Jinyu Li, Dong Yu, and Yifan Gong, "Online Distorted Speech Estimation Within An Unscented Transformation Framework" (pending, filed 2010, App Number 12/948935)
  15. Li Deng, Dong Yu, and George Dahl, "Deep Belief Network For Large Vocabulary Continuous Speech Recognition" (pending, filed 2010, App Number 12/882233)
  16. Dong Yu, Li Deng, Abdel-rahman Mohamed, "Full-Sequence Training Of Deep Structures For Speech Recognition" (pending, filed 2010, App Number 12/886568)
  17. Dong Yu, Li Deng and Jinyu Li, "Confidence Calibration in Automatic Speech Recognition Systems" (pending, filed 2009, App Number 12/634744)
  18. Dong Yu, Li Deng and Alex Acero, "Maximum Entropy Model with Continuous Features" (pending, filed 2009, App Number 12/416161)
  19. Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero and Geoffrey Zweig, "Searching A Database of Listings" (pending, filed 2007, App Number 11/746847)
  20. Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, and Alex Acero, "Configurable Grammar Templates" (pending, filed 2005, App Number 11/259475)

Invited Talks

  • 2012 at University of Science and Technology of China, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
  • 2012 at Northwestern Polytechnical University, China, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
  • 2012 at I2R, A-Star, Singapore, "Recent Progress on Large Vocabulary Speech Recognition Using Deep Neural Networks".
  • 2012 keynote at International Workshop on Spoken Language Translation (IWSLT 2012), Hong Kong, "Who Can Understand Your Speech Better -- Deep Neural Network or Gaussian Mixture Model?".
  • 2012 at Cambridge University, "Why Deep Neural Networks Are Promising for Speech Recognition".
  • 2012 at University of Edinburgh, "Why Deep Neural Networks Are Promising for Speech Recognition".
  • 2012 at the International Conference on Machine Learning (ICML), invited related area talk, Edinburgh, UK, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks".
  • 2012 keynote at International Workshop on Statistical Machine Learning for Speech Processing (IWSML 2012), Kyoto, Japan, "More Data + Deeper Model = Better Accuracy".
  • 2012 at Zhejiang University, "Recent advances in automatic speech recognition".
  • 2012 at Baidu, "Deep Neural Network based Speech Recognition – A New Paradigm".
  • 2012 at Institute of Automation, Chinese Academy of Sciences, "Recent advances in automatic speech recognition".
  • 2012 at University of Nevada at Reno, "Log-linear model: from shallow to deep".
  • 2010 at University of Science and Technology of China, "Large vocabulary continuous speech recognition using context-dependent DNN-HMM model".
  • 2010 at Tokyo Institute of Technology, "Large vocabulary continuous speech recognition using context-dependent DBN-HMM model".
  • 2010 at University of Texas at Dallas, "Automatic speech recognition: challenges ahead".
  • 2009 at Institute of Automation, Chinese Academy of Sciences, "A unified framework of variable-parameter hidden Markov models" and "The maximum entropy and hidden conditional random field model with distribution constraints".
  • 2009 at National Cheng Kung University, "A unified framework of variable-parameter hidden Markov models" and "The maximum entropy and hidden conditional random field model with distribution constraints".

Invited Tutorials

  • Dong Yu, "Deep Learning and Its Applications in Large Vocabulary Speech Recognition", Shanghai Jiaotong University, China, 2012.
  • Dong Yu, "Large Vocabulary Speech Recognition Using Deep Neural Networks: Insights, Theory, and Practice", ISCSLP 2012, Hong Kong, China.
  • Dong Yu, Li Deng, "Deep Learning and Its Applications in Signal Processing", ICASSP 2012, Kyoto, Japan.

Technical Services

  • Grant reviewer/panelist: USA National Science Foundation (2014-), Austrian Science Fund (2013-), Romanian National Research Council (2012-), Research Grants Council (RGC) of Hong Kong (2008-)
  • Associate editor: IEEE Transactions on Audio, Speech, and Language Processing (2011-), IEEE Signal Processing Magazine (2008-2011)
  • Guest editor: IEEE Transactions on Audio, Speech, and Language Processing - special issue on deep learning for speech and language processing (2010).
  • Technical/review/program committee member: IEEE SLTC (2013-), ICASSP 2004-now, INTERSPEECH 2004-now, NAACL-HLT 2009-now, ASRU 2009-now, ISCSLP 2008-now, ACL 2011-now, IEEE ChinaSIP (2013-), MLSLP 2012, ISIEA 2012, ICSC 2007-2008, EUSIPCO 2008, ACMSE 2005, NIPS workshop 2009.
  • Organization committee member: MMSP 2006, ICSC 2008, NIPS workshop 2009, ICMI 2010, ICASSP 2013.
  • Session chair: ACNS 2004, ICSC 2007, ICASSP 2010, Interspeech 2010-2011, ELM 2012, ICASSP 2013.
  • Referee for journals: IEEE Transactions on Audio, Speech, and Language Processing, J. Computer Speech and Language, J. Speech Communication, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Signal Processing, J. Pattern Recognition Letters, EURASIP J. on Audio Speech and Music Processing, J. Computer Security, J. Computer Networks, IEEE Transactions on Computers, J. Data & Knowledge Engineering.

Students/Interns Mentored/Co-mentored

  • Chao Weng, Georgia Tech (2013)
  • Andrew Maas, Stanford University (2012)
  • Brian Hutchinson, University of Washington (2011, co-mentored, now an assistant professor at Western Washington University)
  • Xin Chen, University of Missouri (2011, now a speech scientist at Pearson)
  • George Dahl, University of Toronto (2010)
  • Abdel-rahman Mohamed, University of Toronto (2010, co-mentored)
  • Shizhen Wang, UCLA (2009, now a scientist at Microsoft)
  • Balakrishnan Varadarajan, Johns Hopkins University (2008, now a scientist at Google)
  • Raghu Sampath Kumaran, University of Washington (2007)
  • Jahanzeb Sherwani, Carnegie Mellon University (2006, now an adjunct faculty member at CMU)
  • Nelson Lee, Stanford University (2006, now a product manager at Google)

Honors and Awards

    • IEEE Signal Processing Society Best Paper Award (2013)
    • Microsoft Achievement Award (2013)
    • Microsoft Gold Star Award (2004, 2009, 2013)
    • Best presentation award, ELM (2012)
    • Microsoft Research technical transfer award (2009, 2014)
    • Best paper award, ACMSE (2005)
    • Microsoft patent awards (2002-now)