Li Deng

RESEARCH MGR/PRINCIPAL RES

Brief Biography

Li Deng (IEEE M'89; SM'92; F'04) received the Ph.D. degree from the University of Wisconsin-Madison. He was an Assistant Professor (1989-1992), tenured Associate Professor (1992-1996), and tenured Full Professor (1996-1999) at the University of Waterloo, Ontario, Canada. In 1999, he joined Microsoft Research, Redmond, WA, where he is currently Principal Research Manager of the Deep Learning Technology Center. Since 2000, he has also been an Affiliate Full Professor and graduate committee member at the University of Washington, Seattle, teaching a graduate course on Computer Speech Processing and serving on Ph.D. thesis committees. Prior to joining Microsoft, he conducted research and taught at the Massachusetts Institute of Technology, ATR Interpreting Telecom. Research Lab. (Kyoto, Japan), and HKUST. He has been granted over 70 US or international patents in acoustics/audio, speech/language technology, large-scale data analysis, and machine learning, with a recent focus on deep learning. He has received numerous awards and honors from IEEE, ISCA, ASA, Microsoft, and other organizations.

His current (and past) research activities include deep learning and machine intelligence applied to big text data and to speech, image and multimodal processing, computational neuroscience and information representation, deep/recurrent/dynamic neural networks, automatic speech and speaker recognition, spoken language identification and understanding, speech-to-speech translation, machine translation, language modeling, information retrieval and data mining, web search, neural information processing, dynamic systems, machine learning and optimization, parallel and distributed computing, probabilistic graphical models, audio and acoustic signal processing, image analysis and recognition, compressive sensing, statistical signal processing, digital communication, human speech production and perception, acoustic phonetics, auditory speech processing, auditory physiology and modeling, noise robust speech processing, speech synthesis and enhancement, multimedia signal processing, and multimodal human-computer interactions.

In the general areas of audio/speech/language technology and science, machine learning, signal/information processing, and other areas of computer science, he has published over 300 refereed papers in leading journals and conferences, and authored or co-authored five books, including most recently Deep Learning: Methods and Applications and Automatic Speech Recognition: A Deep-Learning Approach (Springer). He is a Fellow of the Acoustical Society of America, a Fellow of the IEEE, and a Fellow of the International Speech Communication Association. He served on the Board of Governors of the IEEE Signal Processing Society (2008-2010). More recently, he served as Editor-in-Chief for the IEEE Signal Processing Magazine (2009-2011), which earned the highest impact factor in 2010 and 2011 among all IEEE publications and for which he received the 2012 IEEE SPS Meritorious Service Award. He recently served as General Chair of the IEEE ICASSP-2013, and currently serves as Editor-in-Chief for the IEEE Transactions on Audio, Speech and Language Processing. His technical work since 2009 (when he initiated deep learning research and technology development at Microsoft with Geoff Hinton) and his leadership in industry-scale deep learning with colleagues have had a high impact on speech recognition and other areas of information processing. His work and that of the team he manages is used in major Microsoft speech products and in several text/data-related products, and has been recognized by the 2013 IEEE SPS Best Paper Award, Microsoft GoldStar Awards, and Technology Transfer Awards. His recent research interests and activities have focused on deep learning and machine intelligence applied to large-scale text analysis and to speech/language/image multimodal processing, advancing his earlier work on speech analysis/recognition using deep neural networks and deep generative models.

Book Chapters
Journal/Magazine Publications (and Editorials)


    Conference Publications


      • J. Sun and L. Deng. "Annotation and use of speech production corpus for building language-universal speech recognizers," Proceedings of the 2nd International Symposium on Chinese Spoken Language Processing (ISCSLP), Beijing, October 2000, Vol. 3, pp. 31-34.
      • J. Sun, R. Togneri and L. Deng. "A robust speech understanding system using conceptual relational grammar," Proceedings of the International Conference on Spoken Language Processing, October 2000, Vol. 2, pp. 879-882.
      • S. Dusan and L. Deng. "Acoustic-to-articulatory inversion using dynamical and phonological constraints," Proceedings of the 5th Speech Production Workshop: MODELS AND DATA, Kloster Seeon, Germany, May 1-4, 2000, pp. 237-240.
      • M. Naito, L. Deng, and Y. Sagisaka. "Speaker adaptation methods using vocal tract parameters," (in Japanese) Proceedings of the 1998 Spring Meeting of the Acoustical Society of Japan, Yokohama, Japan, March 17-19, 1998, pp. 55-56.
      • M. Naito, L. Deng, and Y. Sagisaka. "A study on speaker clustering methods using vocal tract parameters," (in Japanese) Proceedings of Japan Institute of Electronics, Information, and Communication Engineers (IEICE), Yokosuka, Japan, December 1997, Vol. 97, No. 441, pp. 35-40.
      • L. Deng (invited). "A dynamic, feature-based approach to speech modeling and recognition," Proceedings of the 1997 IEEE Workshop on Automatic Speech Recognition and Understanding, Santa Barbara, CA, December 14-17, 1997, pp. 107-114.
      • X. Shen, L. Deng, and A. Yasmin. "H-infinity filtering for speech enhancement," Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, October 3-6, 1996, pp. 873-876.
      • L. Deng, X. Shen, and D. Jamieson. "Simulation of disordered speech using a frequency-domain vocal tract model," Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, October 3-6, 1996, pp. 768-771.
      • D. Jamieson, L. Deng, M. Price, V. Parsa, and J. Till. "Interactions of speech disorders with speech coders: Effects on speech intelligibility," Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, October 3-6, 1996, pp. 737-740.
      • G. Ramsay and L. Deng. "Optimal filtering and smoothing for speech recognition using a stochastic target model," Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, October 3-6, 1996, pp. 1113-1116.
      • L. Deng, G. Ramsay, and D. Sun (invited). "Production models as a structural basis for automatic speech recognition," Proceedings of the Fourth European Speech Production Workshop, Autrans, France, May 24-27, 1996, pp. 69-80.
      • L. Deng. "Finite-state automata derived from overlapping articulatory features: A novel phonological construct for speech recognition," Proceedings of the Workshop on Computational Phonology in Speech Technology, (published by Association for Computational Linguistics), Santa Cruz, CA, June 28, 1996. pp. 37-45.
      • L. Deng and H. Sheikhzadeh. "Temporal and rate aspects of speech encoding in the auditory system: Simulation results on TIMIT data using a layered neural network interfaced with a cochlear model," Proceedings of European Speech Communication Association Tutorial and Research Workshop on the Auditory Basis of Speech Recognition, July 15 - 19, 1996, Keele University, United Kingdom, pp. 75-78.
      • C. Rathinavelu and L. Deng. "HMM-based speech recognition using state-dependent, discriminatively derived transforms on Mel-warped DFT features," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, Atlanta, Georgia, May 7-10, 1996, pp. 9-12.
      • L. Deng, G. Ramsay, and H. Sameti. "From modeling surface phenomena to modeling mechanisms: Towards a faithful model of the speech process aiming at speech recognition," Proceedings of the 1995 IEEE Workshop on Automatic Speech Recognition, December 10-13, 1995, Snowbird, Utah, pp. 183-184.
      • G. Ramsay and L. Deng. "Maximum-likelihood estimation for articulatory speech recognition using a stochastic target model," Proceedings of the 1995 European Conference on Speech Communication and Technology, Spain, September 18-21, 1995, pp. 1401-1404.
      • G. Ramsay and L. Deng. "Modal analysis of acoustic wave propagation in the vocal tract using a finite-difference method," Proceedings of the XII International Congress of Phonetic Sciences, Stockholm, Sweden, August 13-19, 1995, Vol. 2, pp. 338-341.
      • G. Ramsay and L. Deng. "Articulatory synthesis using a stochastic target model of speech production," Proceedings of the XII International Congress of Phonetic Sciences, Stockholm, Sweden, August 13-19, 1995, Vol. 2, pp. 478-481.
      • L. Deng, J. Wu, and H. Sameti. "Improved speech modeling and recognition using multi-dimensional articulatory states as primitive speech units," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, May 8-12, 1995, pp. 385-388.
      • D. Sun and L. Deng. "Analysis of acoustic-phonetic variations in fluent speech using TIMIT," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, May 8-12, 1995, pp. 201-204.
      • C. Rathinavelu and L. Deng. "Use of generalized dynamic feature parameters for speech recognition: Maximum likelihood and minimum classification error approaches," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, May 8-12, 1995, pp. 373-376.
      • S. Shen and L. Deng. "Discrete H-infinity filtering design with application to speech enhancement," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, May 8-12, 1995, pp. 1504-1507.
      • H. Sheikhzadeh, R. Brennan, L. Deng, and H. Sameti. "Real-time implementation of HMM-based MMSE algorithm for speech enhancement in hearing aid applications," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1995.
      • D. Sun and L. Deng. "Nonstationary-state hidden Markov model with state-dependent time warping: Application to speech recognition," Proceedings of the 1994 International Conference on Spoken Language Processing, Vol. 1, Yokohama, Japan, September 18-22, 1994, pp. 243-246.
      • L. Deng and H. Sameti. "Speech recognition using dynamically defined speech units," Proceedings of the 1994 International Conference on Spoken Language Processing, Vol. 4, pp. 2167-2170, Yokohama, Japan, September 18-22, 1994.
      • H. Sheikhzadeh and L. Deng. "Interval statistics from a cochlear model in response to speech sounds," Journal of the Acoustical Society of America, Vol. 95, No. 6, June 1994 (Abstract), pp. 2842. (The 127th Meeting of the Acoustical Society of America, June 4-8, 1994, Cambridge, MA.)
      • L. Deng and I. Kheirallah. "Stability analysis on finite-difference solution of a basilar-membrane vibration model with application to acoustic signal processing," Journal of the Acoustical Society of America, Vol. 95, No. 6, June 1994 (Abstract), pp. 2840. (The 127th Meeting of the Acoustical Society of America, June 4-8, 1994, Cambridge, MA.)
      • L. Deng and H. Sameti. "Articulatory phonology and speech recognition: A study on use of dynamically defined speech primitives," Journal of the Acoustical Society of America, Vol. 95, No. 6, June 1994 (Abstract), pp. 2870. (The 127th Meeting of the Acoustical Society of America, June 4-8, 1994, Cambridge, MA.)
      • G. Ramsay and L. Deng. "A stochastic framework for articulatory speech recognition," Journal of the Acoustical Society of America, Vol. 95, No. 6, June 1994 (Abstract), pp. 2871. (The 127th Meeting of the Acoustical Society of America, June 4-8, 1994, Cambridge, MA.)
      • K. Hassanein, L. Deng and M. Elmasry. "A neural predictive hidden Markov model for speaker recognition," Proceedings of the Workshop on Automatic Speaker Recognition, Identification and Verification, Martigny, Switzerland, April, 1994, pp. 115-118.
      • L. Deng and M. Aksmanovic. "HMMs with mixtures of trended functions for automatic speech recognition," IEEE International Conference on Speech, Image Processing and Neural Networks, April 13-15, 1994, Hong Kong, pp. 702-705.
      • L. Deng. "A theory on optimal construction of dynamic features for hidden Markov modeling of speech," IEEE International Conference on Speech, Image Processing and Neural Networks, April 13-15, 1994, Hong Kong, pp. 351-354.
      • L. Deng and D. Sun. "Phonetic classification and recognition using HMM representation of overlapping articulatory features for all classes of English sounds," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia, April 19-22, 1994, Vol. 1, pp. 45-48.
      • K. Hassanein, L. Deng and M. Elmasry. "Vowel classification using a neural predictive HMM: A discriminative training approach," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia, April 19-22, 1994, Vol. 2, pp. 665-668.
      • H. Sameti, H. Sheikhzadeh, L. Deng and R. Brennan. "Comparative performance of spectral subtraction and HMM-based speech enhancement strategies with application to hearing aid design," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia, April 19-22, 1994, Vol. 1, pp. 13-16.
      • L. Deng. "A computational model of phonology-phonetics integration for automatic speech recognition," Proceedings of the 1993 IEEE Workshop on Automatic Speech Recognition, December 12-15, 1993, Snowbird, Utah, pp. 83-84.
      • K. Hassanein, L. Deng and M. Elmasry. "A neural predictive hidden Markov model for speech and speaker recognition," Proceedings of the Fifth International Conference on Microelectronics, December 14-16, 1993, Dhahran, Saudi Arabia, pp. 108-111.
      • L. Deng and D. Sun. "Speech recognition using the atomic speech units constructed from overlapping articulatory features," Proceedings of the 1993 European Conference on Speech Communication and Technology, September 21-23, 1993, Berlin, Germany, Vol. III, pp. 1635-1638.
      • D. Zhang, L. Deng, and M. Elmasry. "Pipelined neural network architecture for speech recognition," Proceedings of the 1993 World Congress on Neural Networks, July 11-15, 1993, Portland, Oregon, Vol. III, pp. 55-58.
      • L. Deng (invited). "Design of a feature-based speech recognizer aiming at integration of auditory processing, signal modeling, and phonological structure of speech," Journal of the Acoustical Society of America, Vol. 93, No. 4, Pt. 2, pp. 2318, April 1993.
      • K. Hassanein, L. Deng, and M. Elmasry. "Maximal mutual information training of a neural predictive HMM speech recognition system," Proceedings of the 1992 IEEE Workshop on Neural Networks for Signal Processing, August 31-September 2, 1992, Copenhagen, Denmark, pp. 164-173.
      • K. Erler and L. Deng. "HMM representation of quantized articulatory features for recognition of highly confusible words," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, San Francisco, CA, March 1992, pp. 545-548.
      • L. Deng. "Speech modeling and recognition using a time series model containing trend functions with Markov modulated parameters," Proceedings of the 1991 IEEE Workshop on Automatic Speech Recognition, Arden House, New York, December, 1991, pp. 24-26.
      • L. Deng and K. Erler. "Microstructural speech units and their HMM representation for discrete utterance speech recognition," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Toronto, Ontario, Canada, May 1991, pp. 193-196.
      • P. Seitz, V. Gupta, M. Lennig, P. Kenny, L. Deng, D. O'Shaughnessy, and P. Mermelstein. "Phonological rule set complexity as a factor in the performance of a very large vocabulary word recognition system," Journal of the Acoustical Society of America, 87(1), May 1990, S108 (Abstract).
      • L. Deng, V. Gupta, M. Lennig, P. Kenny, and P. Mermelstein. "Acoustic recognition component of an 86,000-word speech recognizer," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, New Mexico, 1990, pp. 741-744.
      • L. Deng, P. Kenny, M. Lennig, V. Gupta and P. Mermelstein. "A locus model of coarticulation in a hidden-Markov-model-based speech recognizer," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Glasgow, Scotland, 1989, pp. 97-100.
      • L. Deng, P. Kenny, M. Lennig, V. Gupta and P. Mermelstein. "Large vocabulary word recognition based on phonetic representation by hidden Markov models," Proceedings of the Canadian Conference on Electrical and Computer Engineering, Vancouver, Canada, November 1988, pp. 131-134.
      • L. Deng, M. Lennig, and P. Mermelstein. "Modeling acoustic-phonetic detail in a hidden-Markov-model-based large vocabulary speech recognizer," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, New York, New York, Vol. 1, April 1988, pp. 509-512.

      Patents (awarded)

      • Exploiting Sparseness in Training Deep Neural Networks, filed 11/28/2011, US Patent #8700552, granted on 4/15/2014.
      • Online Distorted Speech Estimation Within An Unscented Transformation Framework, filed on 11/18/2010, US Patent #8731916, granted on 5/20/2014.
      • Deep Convex Network With Joint Use Of Nonlinear Random Projection, Restricted Boltzmann Machine And Batch-Based Parallelizable Optimization, filed 3/31/2011, US Patent #8489529, granted on 7/16/2013.
      • Deep structured conditional random fields for sequence labeling and classification, U.S. Patent; filed: 1/29/2010; granted on 6/25/2013, Patent #8,473,430.
      • Automatic reading feedback with parallel polarized language modeling (US Patent #8,433,576, granted on 4/30/2013)
      • Generic framework for large-margin MCE training in speech recognition (US Patent #8,423,364, granted on 4/16/2013)
      • Integrative and discriminative technique for spoken utterance translation (US Patent #8,407,041, granted on 3/26/2013)
      • Speech recognition with non-linear noise reduction on Mel-frequency cepstra (US Patent #8,306,817, granted Nov. 6, 2012)
      • Automatic Reading Tutoring (US Patent #8,306,822, granted Nov. 6, 2012)
      • Adapting A Compressed Model For Use In Speech Recognition (US Patent #8,239,195, granted August 3, 2012)
      • Phase Sensitive Model Adaptation For Noisy Speech Recognition (US Patent #8,214,215, granted July 3, 2012)
      • Minimum classification error training with growth transformation optimization (US Patent #8,301,449, granted Oct. 30, 2012)
      • Speech-centric multimodal user interface design in mobile technology (US Patent #8,219,406, granted July 10, 2012)
      • High performance HMM adaptation with joint compensation of additive and convolutive distortions (US Patent #8,180,637, granted May 15, 2012)
      • Piecewise-Based Variable-Parameter Hidden Markov Models and the Training Thereof (US Patent #8,160,878, granted April 17, 2012)
      • Noise Suppressor for Robust Speech Recognition (US Patent #8,185,389, granted May 22, 2012)
      • Parameter Clustering and Sharing for Variable-Parameter Hidden Markov Models, (US Patent #8,145,488, granted March 27, 2012)
      • Parameter Learning in Hidden Trajectory Model, (U.S. Patent #8,010,356, granted August 30, 2011)
      • Time Synchronous Decoding for Long-Span Hidden Trajectory Model, (US patent #7,877,256, granted 2011)
      • Integrated Speech Recognition and Semantic Classification (granted 2011, US patent #7,856,351)
      • Segment-Discriminating Minimum Classification Error Pattern Recognition, with X. He and Q. Fu (granted Jan 18, 2011, US patent #7,873,209)
      • Hidden trajectory modeling with differential cepstra for speech recognition, U.S. Patent No.: 7,805,308; granted on September 28, 2010
      • Time Asynchronous Decoding for Long-Span Trajectory Model, US patent No.: 7,734,460, granted on June 8, 2010
      • Method and Apparatus for Constructing a Speech Filter Using Estimates of Clean Speech and Noise, U.S. Patent No.: 7,725,314; granted on May 25, 2010
      • Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model, US patent #7653535, granted January 2010.
      • Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition, US patent #7617103, granted Sept 2009
      • Quantitative model for formant dynamics and contextually assimilated reduction in fluent speech, US patent No.: 7,565,292, granted on July 21, 2009
      • Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories, US patent No.: 7,565,284, granted on July 21, 2009
      • Speaker-adaptive Learning of Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation, US patent No.: 7,519,531, granted on April 14, 2009
      • Greedy algorithm for identifying values for vocal tract resonance vectors, U.S. Patent No.: 7,475,011; Granted on January 6, 2009
      • Method of Speech Recognition Using Multimodal Variational Inference with Switching State Space Models, U.S. Patent No.: 7,480,615; Granted on January 20, 2009
      • Method of Speech Recognition Using Variables Representing Dynamic Aspects of Speech, U.S. Patent No.: 7,346,510; Granted on March 18, 2008
      • Method of Noise Reduction Using Instantaneous Signal-to-Noise Ratio as the Principal Quantity for Optimal Estimation, U.S. Patent No.: 7,363,221; Granted on April 22, 2008
      • Method and Apparatus for Formant Tracking Using a Residual Model, U.S. Patent No.: 7,424,423; Granted on September 9, 2008
      • Multi-Sensory Speech Enhancement Using Synthesized Sensory Signal, U.S. Patent No.: 7,406,303; Granted on July 29, 2008
      • Two-stage implementation for phonetic recognition using a bi-directional target-directed model of speech co-articulation and reduction, U.S. Patent No.: 7,409,346; Granted on August 5, 2008
      • Removing noise from feature vectors, U.S. Patent No.: 7,310,599; Granted on December 18, 2007
      • Method of determining uncertainty associated with acoustic distortion-based noise reduction, U.S. Patent No. 7,289,955; Granted on October 30, 2007
      • Method and apparatus for identifying noise environments from noisy signals, U.S. Patent No. 7,266,494; Granted on September 4, 2007
      • Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech, U.S. Patent No. 7,254,536; Granted on August 7, 2007
      • Method of determining uncertainty in noise reduction, US and International Patents; U.S. Patent No.: 7,174,292; Granted on Feb. 6, 2007
      • Method of Noise Estimation Using Incremental Bayes Learning, U.S. Patent; Patent No.: 7,165,026; Granted on Jan. 16, 2007
      • Method of iterative noise estimation in a recursive framework, U.S. Patent; Patent No. 7,139,703; Granted on Nov. 21, 2006.
      • Method of noise reduction using correction vectors based on dynamic aspects of speech and noise normalization, United States Patent No. 7,117,148; Granted on October 3, 2006.
      • Method of noise reduction based on dynamic aspects of speech, United States Patent No. 7,107,210; Granted on Sept 12, 2006.
      • Method of pattern recognition using noise reduction uncertainty, United States Patent No. 7,103,540; Granted on Sept 5, 2006.
      • Microphone array signal enhancement using mixture models (jointly with Hagai Attias), United States Patent No. 7,103,541; Granted on Sept 5, 2006.
      • Efficient backward recursion for computing posterior probabilities, United States Patent No. 7,062,407; Granted on June 13, 2006.
      • Method of speech recognition using time-dependent interpolation and hidden dynamics, United States (and International) Patent No. 7,050,975; Granted on May 23, 2006.
      • Nonlinear observation models for removing noise from corrupted speech, United States (and International) Patent No. 7,047,047; Granted on May 16, 2006.
      • Method of Noise Reduction Using Correction and Scaling Vectors with Partitioning of the Acoustic Space in the Domain of Noisy Speech, United States Patent No. 7,003,455; Granted on February 21, 2006
      • Methods and Apparatus for Denoising and Dereverberation Using Variational Inference and Strong Speech Models, United States Patent No. 6,990,447; Granted on January 24, 2006
      • Method and Apparatus for Removing Noise from Feature Vectors, United States Patent No. 6,985,858; Granted on January 10, 2006
      • Methods for Including the Category of Environmental Noise When Processing Speech Signals, United States Patent No. 6,959,276; Granted on October 25, 2005
      • Method of iterative noise estimation in a recursive framework, United States Patent; Patent No. 6,944,590; Granted on September 13, 2005
      • Method of speech recognition using variational inference with switching state space models, United States Patent; Patent No. 6,931,374; Granted on August 16, 2005
      • Pattern Recognition Training Method and Apparatus Using Inserted Noise Followed by Noise Reduction, United States (and International) Patent; Patent No. 6,876,966; Granted on April 5, 2005
      • Apparatus for Speaker Clustering and for Speech Recognition, Patent No.: 2,965,537; Granted on Aug. 13, 1999; Countries of issue: United States and Japan.
      • Apparatus for Speaker Normalization Processor and for Voice Recognition Device, Patent No.: 2986792; Granted on Oct. 1, 1999; Countries of issue: United States and Japan.

      Patents (Pending awards)

      • Method of speech recognition using hidden trajectory hidden Markov models, U.S. Patent
      • Zero-variance model of acoustic environment for enhancing noisy speech features, U.S. Patent
      • Method and Apparatus for Multi-Sensory Speech Enhancement, International Patent
      • Method and apparatus for continuous valued vocal tract resonance tracking using piecewise linear approximation
      • Speech resonance target estimation using formant tracking results, U.S. Patent
      • Incrementally regulating discriminative margins in MCE training for speech recognition, U.S. Patent; filing date: 8/25/2006
      • Using a discretized, higher order representation of hidden dynamic variables for speech recognition, U.S. Patent; filing date: 8/21/2006
      • Integrated speech recognition and semantic classification, U.S. Patent; filing date: 1/19/2007
      • Segment-discriminating minimum classification error pattern recognition, U.S. Patent; filing date: 1/31/2007
      • Cross-lingual speech recognition with HMM using KL distance, U.S. Patent; filing date: April 2009
      • Maximum entropy model with continuous features, U.S. Patent; filing date: 4/1/2009
      • Confidence calibration in automatic speech recognition systems, U.S. Patent; filing date: 12/10/2009
      • Full sequence training of deep structures for speech recognition, U.S. Patent; filing date: 9/21/2010
      • Deep belief network for large vocabulary continuous speech recognition, U.S. Patent; filing date: 9/15/2010
      • Learning Processes For Single Hidden Layer Neural Networks With Linear Output Units, filed 5/23/2011
      • Discriminative learning of feature functions of generative type in speech translation, filed 10/28/2011
      • Discriminative pretraining of deep neural networks, filed 11/26/2011
      • Tensor Deep Stacking Networks, filed 2/15/2012.
      • Computer-Implemented Deep Tensor Neural Network, filed 8/29/2012
      • Multilingual Deep Neural Network, filed 3/11/2013
      • Assignment of semantic labels to a sequence of words using neural network architectures, filed 9/2/2013
      • Deep structured semantic model produced using click-through data, filed 9/6/2013.
      • Convolutional Latent Semantic Models and Their Applications, filed 4/1/2014
      • Context-Sensitive Search Using a Deep Learning Model, filed 4/14/2014
      • Modeling Interestingness with Deep Neural Networks, filed 6/13/2014

      Tech Reports, Special Reprints, etc.

      Downloads

      • IPAM05-MSR-VTR-Formants (This database was created jointly by MSR and UCLA (IPAM). See our ICASSP 2006 paper (contained in the download) for details. Note that this is a 20 MB download; we suggest saving it to disk before installing it. Note also that this is a database, although it appears as a program when you run and "install" it.)

      E-mail: deng at microsoft dot com
      U.S.Mail: Microsoft Research, One Microsoft Way, Redmond WA, 98052, USA
      Tel: (425) 706-2719
      Fax: (425) 706-7329 (This is the main MS FAX number so make sure to send documents to Li Deng's attention)

      Dissertations, etc.

      Leo J. Li:  Hidden Dynamic Models for Speech Processing Applications, Ph.D. Thesis, University of Waterloo, Canada, 2004 (supervisors: Li Deng and Paul Fieguth)

      News about My & Collaborators' Recent Technical Work, Deep Learning, Lecture Material, Presentation Slides, etc.

      Professional Activities & Honors/Awards

      • 2013 IEEE SPS Best Paper Award
      • Editor-In-Chief, IEEE Transactions on Audio, Speech & Language Processing (2012-2015)
      • Editor-In-Chief, IEEE Signal Processing Magazine (2009-2011); impact factor: 4.9 and 6.0
      • Technology Transfer Award (on the DNN work), 2014
      • IEEE Northwest Area Outstanding Engineer Award, 2014
      • GoldStar Award (on deep learning), 2013
      • 2011 IEEE SPS Meritorious Service Award
      • General Chair, 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, BC, Canada
      • Board of Governors, IEEE Signal Processing Society (Member elected, term 2008-2010)
      • Board of Governors, and VP Industrial Relations, Asia-Pacific Signal and Information Processing Association (APSIPA) (elected Sept. 2009)
      • Publications Board, IEEE Signal Processing Society (Member, 2009-2011, 2012-2014)
      • Keynote speaker: Interspeech, Sept 18, 2014.
      • Keynote speaker: The 12th China National Conference on Computational Linguistics, 2013.
      • Keynote Speaker: IEEE Odyssey Workshop, June 2012
      • Keynote Speaker: ROCLING Conf., August, 2012
      • Keynote Speaker: 2011 Workshop on ELM, Dec 2011
      • Lecturer, CMU, March 2014.
      • Tutorial Lecturer on Deep Learning for Natural Language Processing: Theory and Practice at CIKM'2014 at Shanghai, China, November 2014.
      • Tutorial Lecturer on Deep Learning: From speech analysis and recognition to language and multimodal processing, ICML, June 21, 2014.
      • Tutorial and Overview Lecturer on "Recent Advances in Deep Learning for Speech, Vision, and Language", IEEE MIIS Workshop, July 2014.
      • Trend/Overview Lecturer on "Deep Learning for Speech and Language", IEEE Conf. ChinaSIP, July 2014
      • Tutorial Lecturer on Deep Learning for Natural Language Processing and Related Applications, IEEE ICASSP, 2014
      • Tutorial Lecturer on Deep Learning for Speech and Information Processing: IEEE ICASSP, Kyoto, 2012
      • ISCA Distinguished Lecturer 2010-2011 (International Speech Comm Assoc.)
      • APSIPA Inaugural Distinguished Lecturer 2012-2013 (Asia-Pacific Signal and Information Processing Association)
      • Tutorial Lecturer on Deep Learning: APSIPA, Xi'an, Oct 2011
      • Tutorial Lecturer, ISCA Interspeech, Portland, Sept 2012 
      • Lecturer, CLSP, Johns Hopkins U., October, 2012
      • Guest Editor: IEEE Transactions on Pattern Analysis & Machine Intelligence, Special Issue, 2012
      • Editor, Computer Speech and Language (2009-2012)
      • GoldStar Awards, Microsoft Achievement Awards, MSR Tech Transfer Awards, etc., 2002-2013
      • Area Editor, IEEE Signal Processing Magazine (2006-2008)
      • General Chair, IEEE Workshop on Multimedia Signal Processing, Victoria, BC, Canada (2006)
      • Co-organizer, 2013 ICML Workshop: Deep Learning Architecture for Audio, Speech & Language Processing, Atlanta, June 2013
      • Lead Organizer, 2011 ICML Workshop: Learning Architecture, Representation & Optimization for Speech & Visual Information Processing, Bellevue, WA, July 2011
      • Co-Chair, NIPS Workshop: Speech and Language: Learning-Based Methods and Systems, Whistler, BC, Canada, 2008
      • Co-Chair, NIPS Workshop: Deep Learning for Speech Recognition and Related Applications, Whistler, BC, Canada, 2009
      • Guest Editor, IEEE Journal of Selected Topics in Signal Processing, Special Issue on Statistical Learning Methods for Speech and Language Processing, 2009
      • IEEE Signal Processing Society TC Review Committee (Member, term 2008-2009)
      • IEEE Signal Processing Society Long Range Planning & Implementation Committee (Member, term 2009-2010)
      • Member, Multimedia Signal Processing Technical Committee of the IEEE Signal Processing Society (2004-2008)
      • Member, Editorial Board, IEEE Signal Processing Letters (2007-2008)
      • Member, Editorial Board, IEEE Signal Processing Magazine (2005-2007)
      • Member, Editorial Board, J. Audio, Music, and Speech Processing (2005-present)
      • Founding Member, Education Committee, IEEE Signal Processing Society (1997-2000)
      • Member, Speech Processing Technical Committee, IEEE Signal Processing Society (1996-1999)
      • Associate Editor, IEEE Transactions on Speech and Audio Processing (2002-2005)
      • Principal Investigator, DARPA (US DoD) EARS Program, (2002-2005)
      • Technical Chair, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2004), Montreal, Quebec, Canada.
      • Co-Guest Editor, IEEE Signal Processing Magazine, Special Issue on Speech Technology and Systems in Human-Machine Communication (Sept 2005)
      • Co-Guest Editor, IEEE Trans. on Computers, Special Issue on Emergent Systems, Algorithms and Architectures for Speech-based Human-Machine Interaction (2006)
      • Member, IEEE Signal Processing Society Technical Directions Committee (2003-2005)
      • Member, IEEE International Conference on Multimedia and Expo Steering Committee (2004-2006)
      • Keynote speaker, IEEE 5th Workshop on Multimedia Signal Processing (IEEE Signal Processing Society), St. Thomas, US Virgin Islands (December 2002)
      • Organizer and speaker, AAAS (American Association for the Advancement of Science) Symposium on "Scientific Problems Facing Speech Recognition Today", Seattle, 2004
      • Gold Star Award, Microsoft Corp, 2002, 2013.
      • Invited Lecturer, NATO Advanced Study Institute
      • Invited Lecturer, European Speech Communication Association (ESCA) Tutorial and Research Workshops
      • Bell Canada Research Award, 1999
      • National Defense of Canada Research Award, 1997, 1999
      • Center of Information and Telecommunication Ontario Research Award, 1998
      • NSERC Industrial Oriented Research Award, 1991, 1994, 1997
      • NSERC Collaborative Research and Development Award, 1993, 1996, 1998
      • Nortel Technology Research Award, 1991, 1993, 1996, 1998
      • NSERC Strategic Grant Award, 1995
      • Ontario Information Technology Research Center of Excellence Grant Award, 1994, 1995
      • Ontario University-Industry Research Incentive Award, 1991, 1993
      • Natural Sciences and Engineering Research Council (NSERC, Canada) Presidential Award, 1991
      • The Sixth Jerzy E. Rose Award, 1986, for original and significant research in auditory science
      • The First Guo Mo-Ruo Award, for top-ranked graduates in science and engineering at the University of Science and Technology of China
      • Fellow, The Acoustical Society of America (The American Institute of Physics) (elected Dec. 2003)
      • Fellow, The IEEE (elected Dec. 2004)
      • Fellow, The ISCA (elected Aug. 2011)