I hope you've found this page because we are interested in the same things. This is a picture of me when I lived in a tiny condo during graduate school. That's where I learned to enjoy linear algebra, information theory, and machine learning, which led directly to a career in digital signal processing and automatic speech recognition. More information about my professional life appears below. For those interested in my personal life, navigate to my Droppo family web site.
main(k){for(k=22;k--;putchar(k["\nnqf2tgqvsrdkpDoqrrvdk"]-k%5));}
Research interests
- Feature transformation
- Speech and acoustic digital signal processing
- Noise-robust speech recognition
- Low-bandwidth robust cepstral transport
- Time-frequency representations
- Nonstationary signal modeling and classification
Biography
I have been with Microsoft Research since July, 2000. My primary task has been to explore different techniques to make ASR more robust to additive and channel noise. Other projects I've worked on include general speech signal enhancement, pitch tracking, multiple stream ASR, novel speech recognition features, MiPad multimodal interface, cepstral compression and transport, and the WITTY microphone.
The SPLICE project was successful in building a more robust speech recognition system. Working jointly with Alex Acero and Li Deng, we were able to get amazing results on the noisy Aurora2 corpus. But, there were fundamental problems with the approach.
The model-based feature enhancement project was meant to address the stereo-data requirement for SPLICE. The model describes how speech and noise (and noisy channels) interact to corrupt the speech features. It can be used to either enhance the features before recognition, or to adapt the recognizer's model at run-time.
Lately, I've moved to more general non-parametric warpings of the feature space. We're trying to learn transformations that improve recognition performance in both noisy and clean conditions.
I earned my Ph.D. in Electrical Engineering at the University of Washington's Interactive Systems Design Laboratory in June of 2000. Early in my studies, I helped to develop a discrete theory for time-frequency representations of non-stationary audio signals. The application of this theory to speech recognition was the core of my thesis, "Time-Frequency Representations for Speech Recognition." Other projects I worked on during this time included a GMM-based speaker verification system, subliminal audio message encoding, and non-linear signal morphing.
My MSEE was also earned at the University of Washington, in 1996. During this time, I worked on a project to develop and build an acoustic pyrometer. The device probes the fireball within a coal-fired electrical plant with several sound pressure waves, and determines a temperature profile based on acoustic time of flight measurements. My thesis described the algorithms and techniques developed to make such a device feasable.
I earned my BSEE from Gonzaga University in Spokane, in 1994. My final project consisted of building a control system for a high speed dot-matrix printer. Fuzzy logic was popular at the time, so we used it as the basis of our system. I learned two important lessons from the project, both of which were probably unintentional. First lesson: stay away from fads. Whereas a conventional design would have been well understood and easy to implement with a guaranteed minimum level of performance, our fuzzy controller needed a lot of background work and experimentation to get working correctly. Second lesson: embrace fads. I wrote a paper comparing and contrasting the behavior of fuzzy controlers to linear controllers, and recieved first prize in the region's IEEE paper contest.
Downloads
Switching Linear Dynamic Model
This code enables training and evaluation of a switching linear dynamic model for enhancing cepstral streams for automatic speech recognition, as described in our ICASSP 2004 paper, "Noise Robust Speech Recognition with a Switching Linear Dynamic Model."
Pitch and Voicing Estimates for Aurora 2
This archive consists of a set of pitch period and voicing estimates for utterances found in the Aurora 2 corpus[1] using the algorithm described in [2]. Currently, pitch estimates are available for test sets A and B, as well as the clean training data. [1] H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy condidions", in ISCA ITRW ASR2000 "Automatic Speech Recognition: Challenges for the Next Millennium", Paris, France, September 2000. [2] J. Droppo and A. Acero. Maximum a Posteriori Pitch Tracking, in Proc. of the Int. Conf. on Spoken Language Processing. Sydney, Australia. Dec 1998.
- Jasha Droppo and Alex Acero, Experimenting with a Global Decision Tree for State Clustering in Automatic Speech Recognition Systems, IEEE, April 2009
- Hui Lin, Li Deng, Jasha Droppo, Dong Yu, and Alex Acero, Learning Methods in Multilingual Speech Recognition, in NIPS Workshop, Whistler, BC, Canada, Microsoft, December 2008
- Jasha Droppo, Michael L. Seltzer, Alex Acero, and Y.-H. Chiu, Towards a non-parametric acoustic model: an acoustic decision tree for observation probability calculation, in Proceedings of Interspeech, International Speech Communication Association, Brisbane, Australia, September 2008
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, and Alex Acero, Robust speech recognition using cepstral minimum-mean-square-error noise suppressor, in IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 5, Institute of Electrical and Electronics Engineers, Inc., July 2008
- Luis Buera, Jasha Droppo, and Alex Acero, Speech Enhancement using a Pitch Predictive Model, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, and Alex Acero, A Minimum Mean-Square-Error Noise Reduction Algorithm on Mel-Frequency Cepstra for Robust Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Ivan Tashev, Jasha Droppo, Michael Seltzer, and Alex Acero, Robust Design of Wideband Loudspeaker Arrays, in Proc. of International Conference on Audio, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Las Vegas, USA, April 2008
- Jasha Droppo and Alex Acero, Environmental Robustness, in Handbook of Speech Processing, pp. 1176, Springer Verlag, 2008
- Jasha Droppo and Alex Acero, A Fine Pitch Model for Speech, in Proc. Interspeech Conference, International Speech Communication Association, August 2007
- Chris White, Jasha Droppo, Alex Acero, and Julian Odell, Maximum Entropy Confidence Estimation for Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Hawaii, April 2007
- Jasha Droppo and Alex Acero, Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Toulouse, France, May 2006
- Ivan Tashev, Jasha Droppo, and Alex Acero, Suppression Rule for Speech Recognition Friendly Noise Suppressors, in Proceedings of Eight International Conference Digital Signal Processing and Applications DSPA’06, Moscow, Russia, March 2006
- Jasha Droppo, Milind Mahajan, Asela Gunawardana, and Alex Acero, How to Train a Discriminative Front End with Stochastic Gradient Descent and Maximum Mutual Information, in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Puerto Rico, December 2005
- A. Subramanya, Z. Zhang, Z. Liu, Jasha Droppo, and Alex Acero, A Graphical Model for Multi-Sensory Speech Processing in Air-and-Bone Conductive Microphones, in Proc. of the Interspeech Conference, International Speech Communication Association, Lisbon, Portugal, September 2005
- Jasha Droppo and Alex Acero, Maximum Mutual Information SPLICE Transform for Seen and Unseen Conditions, in Proc. Interspeech Conference, International Speech Communication Association, September 2005
- Li Deng, J. Wu, Jasha Droppo, and Alex Acero, Analysis and Comparison of Two Speech Feature Extraction/Compensation Algorithms, in IEEE Signal Processing Letters, vol. 12, no. 6, pp. 477–480, Institute of Electrical and Electronics Engineers, Inc., June 2005
- Li Deng, Jian Wu, Jasha Droppo, and Alex Acero, Dynamic Compensation of HMM Variances Using the Feature Enhancement Uncertainty Computed From a Parametric Model of Speech Distortion, in IEEE Transactions on Speech and Audio Processing, vol. 13, no. 3, pp. 412–421, Institute of Electrical and Electronics Engineers, Inc., May 2005
- Z. Liu, A. Subramanya, Z. Zhang, Jasha Droppo, and Alex Acero, Leakage Model and Teeth Clack Removal for Air- and Bone-Conductive Integrated Microphones, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Philadelphia, March 2005
- M. L. Seltzer, Alex Acero, and Jasha Droppo, Robust Bandwidth Extension of Noise-corrupted Narrowband Speech, in Proc. Interspeech Conference, International Speech Communication Association, 2005
- Zicheng Liu, Zhengyou Zhang, Alex Acero, Jasha Droppo, and Xuedong Huang, Direct Filtering for Air- and Bone-Conductive Microphones, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Siena, Italy, September 2004
- Jasha Droppo and Alex Acero, Noise Robust Speech Recognition with a Switching Linear Dynamic Model, in Proc. ICASSP, IEEE, Montreal, Canada, May 2004
- Li Deng, Jasha Droppo, and Alex Acero, Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features, in IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 218–233, Institute of Electrical and Electronics Engineers, Inc., May 2004
- Li Deng, Jasha Droppo, and Alex Acero, Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise, in IEEE Transactions on Speech and Audio Processing, vol. 12, no. 2, pp. 133–143, Institute of Electrical and Electronics Engineers, Inc., March 2004
- Zhengyou Zhang, Z. Liu, M. Sinclair, A. Acero, Li Deng, J. Droppo, Xuedong Huang, and Yanli Zheng, Multisensory microphones for robust speech detection, enhancement, and recognition, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004, IEEE, 2004
- Li Deng, Ye-Yi Wang, Kuansan Wang, Alex Acero, Hsiao Hon, Jasha Droppo, C. Boulis, Derek Jacoby, Milind Mahajan, Ciprian Chelba, and Xuedong Huang, Speech and language processing for multimodal human-computer interaction (Invited Article) , in Journal of VLSI Signal Processing Systems (Special issue on Real-World Speech Processing), vol. 36, no. 2-3, pp. 161 - 187, Kluwer Academic , 2004
- Y. Zheng, Z. Liu, Z. Zhang, M. Sinclair, Jasha Droppo, Li Deng, Xuedong Huang, and Alex Acero, Air and Bone-Conductive Integrated Microphones for Robust Speech Detection and Enhancement, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., U.S. Virgin Islands, December 2003
- J. Wu, Jasha Droppo, Li Deng, and Alex Acero, A Noise-Robust ASR Front-End Using Wiener Filter Constructed from MMSE Estimation of Clean Speech and Noise, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., U.S. Virgin Islands, December 2003
- Li Deng, Jasha Droppo, and Alex Acero, Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition, in IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 568–580, Institute of Electrical and Electronics Engineers, Inc., November 2003
- Mike Seltzer, Jasha Droppo, and Alex Acero, A Harmonic-Model-Based Front End for Robust Speech Recognition, in Proc. Eurospeech Conference, International Speech Communication Association, Geneva, Switzerland, September 2003
- Jasha Droppo, Li Deng, and Alex Acero, A Comparison of Three Non-Linear Observation Models for Noisy Speech Features, in Proc. Eurospeech Conference, International Speech Communication Association, Geneva, Switzerland, September 2003
- Li Deng, Jasha Droppo, and Alex Acero, Incremental Bayes Learning with Prior Evolution for Tracking Non-Stationary Noise Statistics from Noisy Speech Data, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Hong Kong, April 2003
- Li Deng, Alex Acero, Ye-Yi Wang, Kuansan Wang, Hsiao-Wuen Hon, Jasha Droppo, Milind Mahajan, and XD Huang, A speech-centric perspective for human-computer interface, in Proc. of the IEEE Fifth Workshop on Multimedia Signal Processing, Institute of Electrical and Electronics Engineers, Inc., December 2002
- Li Deng, Jasha Droppo, and Alex Acero, Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Jasha Droppo, Li Deng, and Alex Acero, Evaluation of SPLICE on the Aurora 2 and 3 Tasks, in Proc. International Conference on Spoken Language Processing, International Speech Communication Association, Denver, Colorado, September 2002
- Li Deng, Jasha Droppo, and Alex Acero, Log-Domain Speech Feature Enhancement Using Sequential MAP Noise Estimation and a Phase-sensitive Model of the Acoustic Environment, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Jasha Droppo, Alex Acero, and Li Deng, A Nonlinear Observation Model for Removing Noise from Corrupted Speech Log Mel-Spectral Energies, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Jasha Droppo, Li Deng, and Alex Acero, Uncertainty Decoding with SPLICE for Noise Robust Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Florida, May 2002
- Li Deng, Jasha Droppo, and Alex Acero, A Bayesian Approach to Speech Feature Enhancement using the Dynamic Cepstral Prior, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Florida, May 2002
- Li Deng, Kuansan Wang, Alex Acero, Hsiao-Wuen Hon, Jasha Droppo, Constantinos Boulis, Ye-Yi Wang, Derek Jacoby, Milind Mahajan, Ciprian Chelba, and Xuedong D. Huang, Distributed Speech Processing in MiPad’s Multimodal User Interface, in IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, vol. 10, no. 8, pp. 605-619, Institute of Electrical and Electronics Engineers, Inc., 2002
- Li Deng, Jasha Droppo, and Alex Acero, Recursive Noise Estimation Using Iterative Stochastic Approximation for Stereo-based Robust Speech Recognition, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Madonna di Campliglio, Italy, December 2001
- Jasha Droppo, Alex Acero, and Li Deng, Evaluation of the SPLICE Algorithm on the Aurora 2 Database, in Proc. Eurospeech Conference, International Speech Communication Association, Aalbodk, Denmark, September 2001
- Li Deng, Alex Acero, L. Jiang, Jasha Droppo, and Xuedong Huang, High-Performance Robust Speech Recognition Using Stereo Training Data, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, May 2001
- Jasha Droppo, Alex Acero, and Li Deng, Efficient Online Acoustic Environment Estimation for FCDCN in a Continuous Speech Recognition System, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, May 2001
- Xuedong Huang, Alex Acero, C. Chelba, Li Deng, Jasha Droppo, D. Duchene, J. Goodman, Hsiao-Wuen Hon, D. Jacoby, L. Jiang, R. Loynd, Milind Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, M. Plumpe, K. Stery, G. Venolia, Kuansan Wang, and Ye-Yi Wang, MIPAD: A Multimodal Interactive Prototype, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, USA, 2001
- Jasha Droppo and Alex Acero, Maximum a Posteriori Pitch Tracking, in Proc. International Conference on Spoken Language Processing, International Speech Communication Association, Sydney, Australia, December 1998
Last modified: Thursday, February 24, 2005
E-mail: jdroppo at microsoft dot com
U.S.Mail: Microsoft Corporation, One Microsoft Way, Redmond WA, 98052, USA
Tel: (425) 703-7114
Fax: (425) 706-7329



