Publications Speech (Redmond)
Books
- Gokhan Tur and Renato DeMori, Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, John Wiley and Sons, New York, NY, 2011
- Ivan Tashev, Sound Capture and Processing: Practical Approaches, pp. 388, Wiley, July 2009
- Xiaodong He and Li Deng, Discriminative Learning for Speech Recognition: Theory and Practice, Morgan & Claypool, October 2008
- Li Deng, Dynamic Speech Models --- Theory, Algorithm, and Application; (book review in IEEE Trans. Neural Networks, Vol. March 2009), Morgan & Claypool, December 2006
- Li Deng and Doug O'Shaughnessy, Speech Processing --- A Dynamic and Optimization-Oriented Approach, Marcel Dekker Inc., June 2003
- Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing, pp. 1008, Prentice-Hall, May 2001
- Fuliang Weng and Ye-Yi Wang, Introduction to Computational Linguistics, China Social Science Press, 1998
- Alex Acero, Acoustical and Environmental Robustness in Automatic Speech Recognition, pp. 212, Kluwer Academic, 1993
Book chapters
- Ye-Yi Wang, L. Deng, and A. Acero, Semantic Frame Based Spoken Language Understanding, in Chapter 3, Tur and De Mori (eds) Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, pp. 35-80, Wiley, 2011
- Gokhan Tur and Li Deng, Intent Determination and Spoken Utterance Classification, in Chapter 4, Tur and De Mori (eds) Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, pp. 81-104, Wiley, 2011
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, and Alex Acero, Voice Search, in Tur & DeMori Eds. Spoken Language Understanding, Wiley, 2011
- Xuedong Huang and Li Deng, An Overview of Modern Speech Recognition, in Handbook of Natural Language Processing, Second Edition, Chapter 15 (ISBN: 1420085921), pp. 339-366, Chapman & Hall/CRC, 2010
- Dong Yu and Li Deng, Speech-Centric Multimodal User Interface Design in Mobile Technology, Chapter XVIII in Jo Lumsden (Ed.), Handbook of Research on User Interface Design and Evaluation for Mobile Technology, IGI Global, January 2008
- Jasha Droppo and Alex Acero, Environmental Robustness, in Benesty, Sondhi, Huang (eds) Handbook of Speech Processing, Springer, 2008
- Li Deng and Jianwu Dang, Speech Analysis: The Production-Perception Perspective, in Advances in Chinese Spoken Language Processing, pp. 2-32, World Scientific Publishing, 2007
- Li Deng and H. Sheikhzadeh, Use of Temporal Codes Computed from a Cochlear Model for Speech Recognition, in Chapter 15, S. Greenberg and W. Ainsworth (eds.) Listening to Speech: An Auditory Perspective, pp. 237-256, Lawrence Erlbaum Associates, Inc., 2006
- A. Avendano, Li Deng, H. Hermansky, and B. Gold, The Analysis and Representation of Speech, in Speech Processing in the Auditory System; Chapter 2; S. Greenberg, W. Ainsworth, A. Popper, and R. Fay (eds.), ISBN: 978-0-387-00590-4, Springer Verlag, 2004
- Li Deng, Switching Dynamic System Models for Speech Articulation and Acoustics, in Mathematical Foundations of Speech and Language Processing, vol. 138, pp. 115-134, Springer Verlag, 2003
- Li Deng, Articulatory Features and Associated Production Models in Statistical Speech Recognition, in Computational Models of Speech Pattern Processing, (NATO ASI Series), pp. 214-224, Springer Verlag, 1999
- Li Deng, Computational Models for Auditory Speech Processing, in Computational Models of Speech Pattern Processing, (NATO ASI Series), pp. 67-77, Springer Verlag, 1999
- Li Deng, A dynamic, feature-based approach to speech modeling and recognition, in S. Furui, F. Juang (eds.) Automatic Speech Recognition and Understanding, pp. 107-114, Institute of Electrical and Electronics Engineers, Inc., 1997
- X. D. Huang, Alex Acero, F. Alleva, M. Hwang, L. Jiang, and Milind Mahajan, From Sphinx-II to Whisper: Making Speech Recognition Usable, in Automatic Speech and Speaker Recognition, Advanced Topics, pp. 536, Kluwer Academic, 1996
- R. Stern, F. Liu, Y. Ohshima, and Alex Acero, Signal Processing for Robust Speech Recognition, in Automatic Speech and Speaker Recognition, Advanced Topics, Kluwer Academic, 1996
- Alex Acero, The Role of Phoneticians in Speech Technology, in European Studies in Phonetics and Speech Communication, European Language Resources Association, August 1995
- Don Sun and Li Deng, Nonstationary-State Hidden Markov Models for Speech Recognition, in Chapter 8, S. Levinson and L. Shepp (eds.): Image and Speech Models, Springer Verlag, 1995
- Alex Acero, Fil Alleva, Doug Beeferman, Xuedong Huang, Mei-Yuh Hwang, and Milind Mahajan, From CMU Sphinx-II to Microsoft Whisper: Making Speech Recognition Usable, in Automatic Speech and Speaker Recognition--Advanced Topics, no. MSR-TR-94-20, pp. 28, Kluwer Academic, September 1994
- D. Zhang, Li Deng, and M. Elmasry, Pipelined Neural Network Architecture For Speech Recognition, in VLSI Artificial Neural Networks Engineering, pp. 297-315, Kluwer Academic, 1994
- K. Hassanein, Li Deng, and M. Elmasry, Neural Predictive Hidden Markov Model Architecture For Speech And Speaker Recognition, in VLSI Artificial Neural Networks Engineering, pp. 316-336, Kluwer Academic, 1994
- Li Deng, K. Hassanein, and M. Elmasry, Neural-Network Architecture For Linear And Nonlinear Predictive Hidden Markov Models: Application To Speech Recognition, in B. H. Juang, S. Y. Kung, and C. A. Kamm, (eds.) Neural Networks for Signal Processing, Institute of Electrical and Electronics Engineers, Inc., 1991
Journal Articles
2013
- Gokhan Tur, Ye-Yi Wang, and Dilek Hakkani-Tur, TechWare: Spoken Language Understanding (SLU) Resources, in IEEE Signal Processing Magazine, May 2013
- Sabato Marco Siniscalchi, Dong Yu, Li Deng, and Chin-Hui Lee, Exploiting Deep Neural Networks for Detection-Based Speech Recognition, in Neurocomputing, Elsevier, April 2013
- Sabato Marco Siniscalchi, Dong Yu, Li Deng, and Chin-hui Lee, Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model, in IEEE Signal Processing Letters, IEEE, March 2013
- Kaisheng Yao, Dong Yu, Li Deng, and Yifan Gong, A Fast Maximum Likelihood Nonlinear Feature Transformation Method for GMM-HMM Speaker Adaptation, in Neurocomputing, 2013
- Steve Young, Milica Gasic, Blaise Thomson, and Jason Williams, POMDP-based Statistical Spoken Dialogue Systems: a Review, in Proceedings of the IEEE, vol. PP, no. 99, pp. 1-20, Proceedings of the IEEE, 2013
- Anoop Deoras, Gokhan Tur, Ruhi Sarikaya, and Dilek Hakkani-Tur, Joint Discriminative Decoding of Word and Semantic Tags for Spoken Language Understanding, in IEEE Transactions on Audio, Speech, and Language Processing, IEEE, 2013
- Dong Yu, Li Deng, and Frank Seide, The Deep Tensor Neural Network with Applications to Large Vocabulary Speech Recognition, in IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, IEEE, 2013
2012
- Brian Hutchinson, Li Deng, and Dong Yu, Tensor Deep Stacking Networks, in IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, December 2012
- Dong Yu, Geoffrey Hinton, Nelson Morgan, Jen-Tzung Chien, and Shigeki Sagayama, Introduction to the Special Section on Deep Learning for Speech and Language Processing, in IEEE Transactions on Audio, Speech, and Language Processing, IEEE SPS, January 2012
- George Dahl, Dong Yu, Li Deng, and Alex Acero, Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition, in IEEE Transactions on Audio, Speech, and Language Processing, Special Issue on Deep Learning for Speech and Language Processing, vol. 20, no. 1, pp. 30-42, January 2012
- Dong Yu and Li Deng, Efficient and Effective Algorithms for Training Single-Hidden-Layer Neural Networks, in Pattern Recognition Letters, Elsevier, 2012
2011
- Dong Yu, Jinyu Li, and Li Deng, Calibration of confidence measures in speech recognition, in IEEE Transactions on Audio, Speech, and Language Processing, IEEE SPS, November 2011
- Ivan Tashev, Recent Advances in Human-Machine Interfaces for Gaming and Entertainment, in International Journal on Information Technology and Security, vol. III, no. 3, pp. 69-76, Union of Scientists in Bulgaria, September 2011
- Dilek Hakkani-Tur, Gokhan Tur, and Larry Heck, Research Challenges and Opportunities in Mobile Applications, in IEEE Signal Processing Magazine, August 2011
- Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Ye-Yi Wang, and Dong Yu, In Car Media Search, in IEEE Signal Processing Magazine, IEEE SPS, June 2011
- Dong Yu and Li Deng, Deep Learning and Its Applications to Signal and Information Processing, in IEEE Signal Processing Magazine, IEEE, January 2011
2010
- Dong Yu, Shizhen Wang, and Li Deng, Sequential Labeling Using Deep-Structured Conditional Random Fields, in IEEE Journal of Selected Topics in Signal Processing, IEEE, December 2010
- Sven Nordholm, Thushara Abhayapala, Simon Doclo, Sharon Gannot, Patrick Naylor, and Ivan Tashev, Microphone Array Speech Processing, in EURASIP Journal on Advances in Signal Processing, HINDAWI, 16 September 2010
- Xiao Li, Ye-Yi Wang, Dou Shen, and Alex Acero, Learning with Click Graph for Query Intent Classification, in ACM Transactions on Information Systems, vol. 28, no. 3, Association for Computing Machinery, Inc., June 2010
- Patrick Nguyen and Geoffrey Zweig, Speech Recognition with Flat Direct Models, in IEEE Journal of Selected Topics in Signal Processing, IEEE, 2010
2009
- Dong Yu, Li Deng, and Alex Acero, Using continuous features in the maximum entropy model, in Pattern Recognition Letters, vol. 30, no. 8, pp. 1295-1300, Elsevier, October 2009
- Dong Yu, Li Deng, Yifan Gong, and Alex Acero, A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models, in IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 7, pp. 1348-1360, IEEE, September 2009
- J. Baker, Li Deng, S. Khudanpur, C.-H. Lee, J. Glass, and N. Morgan, Updated MINDS Report on Speech Recognition and Understanding, in IEEE Signal Processing Magazine, vol. 26, no. 4, July 2009
- Dong Yu and Li Deng, Solving nonlinear estimation problems using Splines, in IEEE Signal Processing Magazine, vol. 26, no. 4, pp. 86-90, IEEE, July 2009
- J. Baker, Li Deng, Jim Glass, S. Khudanpur, C.-H. Lee, N. Morgan, and D. O'Shaughnessy, Research Developments and Directions in Speech Recognition and Understanding, Part 1, in IEEE Signal Processing Magazine, vol. 26, no. 3, pp. 75-80, May 2009
- Dong Yu and Li Deng, Teach-Ware: Signal Processing Resources at Connexions, in IEEE Signal Processing Magazine, Institute of Electrical and Electronics Engineers, Inc., March 2009
- Jinyu Li, Dong Yu, Li Deng, Yifan Gong, and Alex Acero, A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions, in Computer Speech and Language, vol. 23, pp. 389-405, Elsevier, 2009
- Dong Yu, Balakrishnan Varadarajan, Li Deng, and Alex Acero, Active Learning and Semi-supervised Learning for Speech Recognition: A Unified Framework using the Global Entropy Reduction Maximization Criterion, in Computer Speech and Language - Special Issue on Emergent Artificial Intelligence Approaches for Pattern Recognition in Speech and Language Processing, Elsevier, 2009
- Li Deng, Embracing a New Golden Age of Signal Processing, in IEEE Signal Processing Magazine, January 2009
2008
- Dong Yu, Li Deng, Xiaodong He, and Alex Acero, Large-Margin Minimum Classification Error Training: A Theoretical Risk Minimization Perspective, in Computer Speech and Language, vol. 22, no. 4, pp. 415-429, Elsevier, October 2008
- Xiaodong He, Li Deng, and Wu Chou, Discriminative Learning in Sequential Pattern Recognition --- A Unifying Review for Optimization-Oriented Speech Recognition, in IEEE Signal Processing Magazine, vol. 25, no. 5, pp. 14-36, Institute of Electrical and Electronics Engineers, Inc., September 2008
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, and Alex Acero, Robust speech recognition using cepstral minimum-mean-square-error noise suppressor, in IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 5, Institute of Electrical and Electronics Engineers, Inc., July 2008
- Li Deng, Expanding the Scope of Signal Processing, in IEEE Signal Processing Magazine, vol. 25, no. 3, pp. 2-4, May 2008
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, and Alex Acero, An Introduction to Voice Search, in IEEE Signal Processing Magazine (Special Issue on Spoken Language Technology), Institute of Electrical and Electronics Engineers, Inc., May 2008
- Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, and Alex Acero, Multisensory processing for speech enhancement and magnitude-normalized spectra for speech modeling, in Speech Communication, vol. 50, pp. 228-243, Elsevier, March 2008
- Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, and Alex Acero, An integrative and discriminative technique for spoken utterance classification, in IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 6, pp. 1207-1214, Institute of Electrical and Electronics Engineers, Inc., 2008
2007
- Rodrigo Guido, Li Deng, and Shoji Makino, Guest Editors’ Introduction: Special Section on Emergent Systems, Algorithms, and Architectures for Speech-Based Human-Machine Interaction, in IEEE Transactions on Computers, vol. 56, no. 9, pp. 1153-1155, September 2007
- Ciprian Chelba, Jorge Silva, and Alex Acero, Soft Indexing of Speech Content for Search in Spoken Documents, in Computer Speech & Language, vol. 21, no. 3, pp. 423-578, Elsevier, July 2007
- Amarnag Subramanya, Michael Seltzer, and Alex Acero, Automatic Removal of Typed Keystrokes From Speech Signals, in IEEE Signal Processing Letters, vol. 14, no. 5, pp. 363-366, Institute of Electrical and Electronics Engineers, Inc., May 2007
- Li Deng, Write Feature Articles with a Lasting Impact, in IEEE Signal Processing Magazine, vol. 24, no. 2, March 2007
- Michael Seltzer and Alex Acero, Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition, in Trans. on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 235-245, Institute of Electrical and Electronics Engineers, Inc., January 2007
- Xiaodong He and Li Deng, A new look at discriminative learning for hidden Markov models, in Pattern Recognition Letters, vol. 28, pp. 1285-1294, 2007
- Dong Yu, Li Deng, and Alex Acero, Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation, in Computer Speech and Language, vol. 27, pp. 72-87, Elsevier, 2007
- Li Deng, Hagai Attias, Leo Lee, and Alex Acero, Adaptive Kalman smoothing for tracking vocal tract resonances using a continuous-valued hidden dynamic model, in IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 13-23, Institute of Electrical and Electronics Engineers, Inc., January 2007
2006
- R. Stern and Michael Seltzer, Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments, in IEEE Trans. on Audio, Speech and Language Processing, vol. 14, no. 6, pp. 2109-2121, Institute of Electrical and Electronics Engineers, Inc., November 2006
- Ciprian Chelba and Alex Acero, Adaptation of maximum entropy capitalizer: Little data can help a lot, in Computer Speech & Language, vol. 20, no. 4, pp. 382-399, Elsevier, October 2006
- Li Deng, Dong Yu, and Alex Acero, Structured Speech Modeling, in IEEE Trans. on Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1492-1504, Institute of Electrical and Electronics Engineers, Inc., September 2006
- Dong Yu, Alex Acero, and Li Deng, A Lattice Search Technique for Long-contextual-span Hidden Trajectory Model of Speech, in Speech Communication, vol. 48, no. 9, Elsevier, September 2006
- I. Bazzi, Li Deng, and Alex Acero, Tracking Vocal Tract Resonances Using a Quantized Nonlinear Function Embedded in a Temporal Constraint, in IEEE Trans. on Audio, Speech and Language Processing, vol. 14, no. 2, pp. 425-434, March 2006
- Alex Acero, Building Voice User Interfaces, in MSDN Magazine, February 2006
- Roberto Togneri and Li Deng, A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from Mel-cepstral coefficients, in Speech Communication, vol. 48, pp. 971-988, 2006
- Ye-Yi Wang and Alex Acero, Rapid development of spoken language understanding grammars, in Speech Communication, vol. 48, no. 3-4, pp. 390-416, Elsevier, 2006
- Li Deng, Dong Yu, and Alex Acero, A Bidirectional Target Filtering Model of Speech Coarticulation: two-stage Implementation for Phonetic Recognition, in IEEE Transactions on Audio and Speech Processing, vol. 14, no. 1, pp. 256-265, IEEE, January 2006
2005
- Asela Gunawardana and William Byrne, Convergence theorems for generalized alternating minimization procedures, in Journal of Machine Learning Research, MIT Press, December 2005
- Li Deng and Dong Yu, A Speech-Centric Perspective for Human-Computer Interface - A Case Study, in Journal of VLSI Signal Processing Systems (Special Issue on Multimedia Signal Processing), Springer Verlag, November 2005
- Li Deng, K. Wang, and Wu Chou, Speech Technology and Systems in Human-Machine Communication, in IEEE Signal Processing Magazine, vol. 22, no. 5, pp. 12-14, September 2005
- Dong Yu and Alex Acero, Semiautomatic Improvements of System-Initiative Spoken Dialog Applications Using Interactive Clustering, in IEEE Trans. Speech & Audio Proc (Special Issue on Data Mining of Speech, Audio and Dialog), IEEE, September 2005
- Li Deng, J. Wu, Jasha Droppo, and Alex Acero, Analysis and Comparison of Two Speech Feature Extraction/Compensation Algorithms, in IEEE Signal Processing Letters, vol. 12, no. 6, pp. 477–480, Institute of Electrical and Electronics Engineers, Inc., June 2005
- Li Deng, Jian Wu, Jasha Droppo, and Alex Acero, Dynamic Compensation of HMM Variances Using the Feature Enhancement Uncertainty Computed From a Parametric Model of Speech Distortion, in IEEE Transactions on Speech and Audio Processing, vol. 13, no. 3, pp. 412–421, Institute of Electrical and Electronics Engineers, Inc., May 2005
- Ye-Yi Wang, Li Deng, and Alex Acero, Spoken Language Understanding — An Introduction to the Statistical Framework, in IEEE Signal Processing Magazine, vol. 22, no. 5, pp. 16-31, Institute of Electrical and Electronics Engineers, Inc., 2005
2004
- Li Deng, Jasha Droppo, and Alex Acero, Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features, in IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 218–233, Institute of Electrical and Electronics Engineers, Inc., May 2004
- Li Deng, Jasha Droppo, and Alex Acero, Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise, in IEEE Transactions on Speech and Audio Processing, vol. 12, no. 2, pp. 133–143, Institute of Electrical and Electronics Engineers, Inc., March 2004
- Li Deng, Ye-Yi Wang, Kuansan Wang, Alex Acero, Hsiao Hon, Jasha Droppo, C. Boulis, Derek Jacoby, Milind Mahajan, Ciprian Chelba, and Xuedong Huang, Speech and language processing for multimodal human-computer interaction (Invited Article), in Journal of VLSI Signal Processing Systems (Special issue on Real-World Speech Processing), vol. 36, no. 2-3, pp. 161-187, Kluwer Academic, 2004
2003
- Li Deng, Jasha Droppo, and Alex Acero, Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition, in IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 568–580, Institute of Electrical and Electronics Engineers, Inc., November 2003
- J. Xin, Y. Qi, and Li Deng, Time Domain Computation of a Nonlinear Nonlocal Cochlear Model with Applications to Multitone Interactions in Hearing, in Communications in Mathematical Sciences, vol. 1, no. 2, pp. 211-227, 2003
2002
- Li Deng, Kuansan Wang, Alex Acero, Hsiao-Wuen Hon, Jasha Droppo, Constantinos Boulis, Ye-Yi Wang, Derek Jacoby, Milind Mahajan, Ciprian Chelba, and Xuedong D. Huang, Distributed Speech Processing in MiPad’s Multimodal User Interface, in IEEE Transactions on Speech and Audio Processing, vol. 10, no. 8, pp. 605-619, Institute of Electrical and Electronics Engineers, Inc., 2002
2000
- Y. Rui, A. Gupta, and Alex Acero, Automatically Extracting Highlights for TV Baseball Programs, in ACM Multimedia, pp. 105-115, 2000
1999
- X. Shen and Li Deng, A Dynamic System Approach to Speech Enhancement Using the H-inf Filtering Algorithm, in IEEE Trans. on Speech and Audio Processing, vol. 7, pp. 391-399, July 1999
1996
- Mei-Yuh Hwang, Xuedong Huang, and Fil Alleva, Predicting unseen triphones with senones, in IEEE Trans. on Speech and Audio Processing, vol. 4, no. 6, pp. 412-419, Institute of Electrical and Electronics Engineers, Inc., November 1996
1992
- Li Deng, P. Kenny, M. Lennig, and P. Mermelstein, Modeling acoustic transitions in speech by state-interpolation hidden Markov models, in Transactions on Signal Processing, vol. 40, no. 2, pp. 265-272, 1992
Conference Papers
2013
- Larry Heck, Dilek Hakkani-Tur, and Gokhan Tur, Leveraging Knowledge Graphs for Web-Scale Unsupervised Semantic Parsing, in Proceedings of Interspeech, International Speech Communication Association, August 2013
- Asli Celikyilmaz, Gokhan Tur, and Dilek Hakkani-Tur, IsNL? A Discriminative Approach to Detect Natural Language Like Queries for Conversational Understanding, Annual Conference of the International Speech Communication Association (Interspeech), August 2013
- Heeyoung Lee, Andreas Stolcke, and Elizabeth Shriberg, Using Out-of-Domain Data for Lexical Addressee Detection in Human-Human-Computer Dialog, in Proceedings NAACL, Association for Computational Linguistics, June 2013
- Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig, Linguistic Regularities in Continuous Space Word Representations, in Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013), Association for Computational Linguistics, 27 May 2013
- Dong Wang, Dilek Hakkani-Tur, and Gokhan Tur, Understanding Computer-Directed Utterances in Multi-User Dialog Systems, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- Dilek Hakkani-Tur, Larry Heck, and Gokhan Tur, Using a Knowledge Graph and Query Click Logs for Unsupervised Learning of Relation Detection, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- Dong Yu, Mike Seltzer, Jinyu Li, Jui-Ting Huang, and Frank Seide, Feature Learning in Deep Neural Networks - Studies on Speech Recognition, in International Conference on Learning Representations, May 2013
- Gokhan Tur, Asli Celikyilmaz, and Dilek Hakkani-Tur, Latent Semantic Modeling for Slot Filling in Conversational Understanding, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- Vikramjit Mitra, Wen Wang, Andreas Stolcke, Hosung Nam, Colleen Richey, Jiahong Yuan, and Mark Liberman, Articulatory features for large vocabulary speech recognition, in Proc. IEEE ICASSP, IEEE SPS, May 2013
- Aditya Bhargava, Asli Celikyilmaz, Dilek Hakkani-Tur, and Ruhi Sarikaya, Easy Contextual Intent Prediction and Slot Detection, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- Xiaodong He, Li Deng, Dilek Hakkani-Tur, and Gokhan Tur, Multi-Style Adaptive Training for Robust Cross-Lingual Spoken Language Understanding, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra, Using multiple versions of speech input in phone recognition, in Proceedings IEEE ICASSP, IEEE SPS, May 2013
- Edmund Lalor, Nima Mesgarani, Siddharth Rajaram, Adam O'Donovan, James Wright, Inyong Choi, Jonathan Brumberg, Nai Ding, Adrian KC Lee, Nils Peters, Sudarshan Ramenahalli, Jeffrey Pompe, Barbara Shinn-Cunningham, Malcolm Slaney, and Shihab Shamma, Decoding Auditory Attention (in Real Time) with EEG, in Proceedings of the 37th ARO MidWinter Meeting, Association for Research in Otolaryngology (ARO), 17 February 2013
- Ivan Tashev and Malcolm Slaney, Data Driven Suppression Rule for Speech Enhancement, in Information Theory and Applications Workshop, University of California - San Diego, 14 February 2013
- Geoffrey Zweig and Konstantin Makarychev, Speed Regularization and Optimality in Word Classing, in ICASSP, IEEE, 2013
- Asli Celikyilmaz, Dilek Hakkani-Tür, Gokhan Tur, and Ruhi Sarikaya, Semi-Supervised Semantic Tagging for Conversational Understanding Using Markov Topic Regression, Association for Computational Linguistics, 2013
2012
- Kenichi Kumatani, Takayuki Arakawa, Kazumasa Yamamoto, John McDonough, Bhiksha Raj, Rita Singh, and Ivan Tashev, Microphone Array Processing for Distant Speech Recognition: Towards Real-World Deployment, in APSIPA Annual Summit and Conference, Hollywood, CA, USA, 5 December 2012
- Jens Ahrens, Mark R.P. Thomas, and Ivan Tashev, HRTF Magnitude Modeling Using a Non-Regularized Least-Squares Fit of Spherical Harmonics Coefficients on Incomplete Data, in APSIPA Annual Summit and Conference, Hollywood, CA, USA, 4 December 2012
- Asli Celikyilmaz, Dilek Hakkani-Tur, and Gokhan Tur, Statistical Semantic Interpretation Modeling for Spoken Language Understanding with Enriched Semantic Features, IEEE Workshop on Spoken Language Technologies, December 2012
- Mark R. P. Thomas, Jens Ahrens, and Ivan Tashev, Beamformer Design Using Measured Microphone Directivity Patterns: Robustness to Modelling Error, in APSIPA Annual Summit and Conference, Hollywood, CA, USA, December 2012
- Li Deng, Gokhan Tur, Xiaodong He, and Dilek Hakkani-Tur, Use of Kernel Deep Convex Networks and End-To-End Learning for Spoken Language Understanding, IEEE Workshop on Spoken Language Technologies, December 2012
- Larry Heck, The Conversational Web, IEEE Workshop on Spoken Language Technology, December 2012
- Larry Heck and Dilek Hakkani Tur, Exploiting the Semantic Web for Unsupervised Spoken Language Understanding, IEEE Spoken Language Technology Workshop, December 2012
- Mark R. P. Thomas, Jens Ahrens, and Ivan Tashev, Optimal 3D Beamforming Using Measured Microphone Directivity Patterns, Proc. Intl. Workshop Acoust. Signal Enhancement (IWAENC), Aachen, Germany, 4 September 2012
- Dmytro Prylipko, Bogdan Vlasenko, Andreas Stolcke, and Andreas Wendemuth, Language Modeling of Nonverbal Vocalizations in Spontaneous Speech, in Text, Speech and Dialogue, 15th International Conference, Springer Verlag, September 2012
- Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, and Larry Heck, Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog, in Proceedings of Interspeech, International Speech Communication Association, September 2012
- Xie Chen, Adam Eversole, Gang Li, Dong Yu, and Frank Seide, Pipelined Back-Propagation for Context-Dependent Deep Neural Networks, in Interspeech, ISCA, September 2012
- Gokhan Tur, Minwoo Jeong, Ye-Yi Wang, Dilek Hakkani-Tur, and Larry Heck, Exploiting the Semantic Web for Unsupervised Natural Language Semantic Parsing, Annual Conference of the International Speech Communication Association (Interspeech), September 2012
- Li Deng, Brian Hutchinson, and Dong Yu, Parallel Training of Deep Stacking Networks, in Interspeech, ISCA, September 2012
- Dilek Hakkani-Tur, Gokhan Tur, Larry Heck, Ashley Fidler, and Asli Celikyilmaz, A Discriminative Classification-Based Approach to Information State Updates for a Multi-Domain Dialog System, Annual Conference of the International Speech Communication Association (Interspeech), September 2012
- Dong Yu, Li Deng, and Frank Seide, Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks, in Interspeech, ISCA, September 2012
- Scott Yih, Geoffrey Zweig, and John Platt, Polarity Inducing Latent Semantic Analysis, in Empirical Methods in Natural Language Processing 2012, ACL/SIGPARSE, July 2012
- Geoffrey Zweig, John C. Platt, Christopher Meek, Christopher J.C. Burges, Ainur Yessenalina, and Qiang Liu, Computational Approaches to Sentence Completion, in ACL 2012, ACL/SIGPARSE, July 2012
- Dilek Hakkani-Tur, Gokhan Tur, and Asli Celikyilmaz, Mining Search Query Logs for Spoken Language Understanding, in North American Association for Computational Linguistics NAACL-2012: Workshop on Future Directions and Needs in the Spoken Dialog Community: Tools and Data, June 2012
- Geoffrey Zweig and Chris J.C. Burges, A Challenge Set for Advancing Language Modeling, in Workshop on the Future of Language Modeling for HLT, NAACL-HLT 2012, ACL/SIGPARSE, June 2012
- Andreas Stolcke, Martin Graciarena, and Luciana Ferrer, Effects of audio and ASR quality on cepstral and high-level speaker verification systems, in Proceedings Odyssey Speaker and Language Recognition Workshop, International Speech Communication Association, June 2012
- Dong Yu, Frank Seide, and Gang Li, Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, in ICML 2012, June 2012
- Jason D. Williams, A belief tracking challenge task for spoken dialog systems, in NAACL HLT 2012 Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data, Association for Computational Linguistics, June 2012
- Ivan J. Tashev, Audio for Kinect: pushing it to the limit (invited talk), in CREST Symposium on Human-Harmonized Information Technology, University of Kyoto, 2 April 2012
- Ivan J. Tashev, Coherence Based Double Talk Detector with Soft Decision, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 27 March 2012
- Dong Yu, Sabato Siniscalchi, Li Deng, and Chin-Hui Lee, Boosting Attribute And Phone Estimation Accuracies With Deep Neural Networks For Detection-Based Speech Recognition, in ICASSP 2012, IEEE SPS, March 2012
- Dilek Hakkani-Tur, Gokhan Tur, Rukmini Iyer, and Larry Heck, Translating Natural Language Utterances to Search Queries for SLU Domain Detection Using Query Click Logs, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Dong Yu, Frank Seide, Gang Li, and Li Deng, Exploiting Sparseness In Deep Neural Networks For Large Vocabulary Speech Recognition, in ICASSP 2012, IEEE SPS, March 2012
- R. Prabhavalkar and Jasha Droppo, A Chunk-Based Phonetic Score for Mobile Voice Search, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Geoff Zweig, Classification and Recognition with Direct Segment Models, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Amittai Axelrod, Xiaodong He, Li Deng, Alex Acero, and Mei-Yuh Hwang, New Methods and Evaluation Experiments on Translating TED Talks in the IWSLT Benchmark, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Xiaodong He and Li Deng, Optimization in Speech-Centric Information Processing: Criteria and techniques, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Brian Hutchinson, Li Deng, and Dong Yu, A deep architecture with bilinear modeling of hidden representations: applications to phonetic recognition, in ICASSP 2012, IEEE SPS, March 2012
- Ngoc Thang Vu, Tanja Schultz, and Daniel Povey, Modeling Gender Dependency in the Subspace GMM Framework, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- K. Riedhammer, T. Bocklet, A. Ghoshal, and Daniel Povey, Revisiting Semi-Continuous Hidden Markov Models, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Oriol Vinyals, Suman Ravuri, and Daniel Povey, Revisiting Recurrent Neural Networks for Robust ASR, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Jinyu Li, Michael Seltzer, and Yifan Gong, Improvements to VTS Feature Enhancement, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Andreas Stolcke, Arindam Mandal, and Elizabeth Shriberg, Speaker recognition with region-constrained MLLR transforms, in Proceedings of IEEE ICASSP, IEEE SPS, March 2012
- Li Deng, Dong Yu, and John Platt, Scalable stacking and learning for building deep architectures, in ICASSP 2012, IEEE SPS, March 2012
- Dong Yu, Xin Chen, and Li Deng, Factorized Deep Neural Networks for Adaptive Speech Recognition, in IWSML 2012, March 2012
- Gokhan Tur, Li Deng, Dilek Hakkani-Tur, and Xiaodong He, Towards Deeper Understanding: Deep Convex Networks for Semantic Utterance Classification, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Daniel Povey, Mirko Hannemann, Gilles Boulianne, Lukas Burget, Arnab Ghoshal, Milos Janda, Martin Karafiat, Stefan Kombrink, Petr Motlicek, Yanmin Qian, Korbinian Riedhammer, Karel Vesely, and Ngoc Thang Vu, Generating Exact Lattices in the WFST Framework, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012
- Ivan J. Tashev, Optimizing Kinect: Audio and Acoustics, in Information Technologies and Applications Workshop, University of California - San Diego, 8 February 2012
- Ivan J. Tashev, Audio for Kinect: Nearly Impossible (invited talk), in IEEE International Conference on Emerging Signal Processing Applications, IEEE SPS, 14 January 2012
- Asli Celikyilmaz and Dilek Hakkani-Tur, Joint Model for Discovery of Aspects in Utterances, Association for Computational Linguistics, 2012
- Tomas Mikolov and Geoffrey Zweig, Context Dependent Recurrent Neural Network Language Model, in Spoken Language Technologies, IEEE, 2012
- Kornel Laskowski and Elizabeth Shriberg, Corpus-Independent History Compression for Stochastic Turn-Taking Models, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012
2011
- Frank Seide, Gang Li, Xie Chen, and Dong Yu, Feature engineering in context-dependent deep neural networks for conversational speech transcription, in ASRU 2011, IEEE, December 2011
- Asli Celikyilmaz, Dilek Hakkani-Tur, Gokhan Tur, Ashley Fidler, and Dustin Hillard, Exploiting Distance Based Similarity in Topic Models for User Intent Detection, IEEE Automatic Speech Recognition and Understanding Workshop, December 2011
- Andreas Stolcke, Jing Zheng, Wen Wang, and Victor Abrash, SRILM at Sixteen: Update and Outlook, in Proceedings IEEE Automatic Speech Recognition and Understanding Workshop, IEEE SPS, December 2011
- Dilek Hakkani-Tür, Gokhan Tur, Larry Heck, Asli Celikyilmaz, Ashley Fidler, Dustin Hillard, Rukmini Iyer, and S. Parthasarathy, Employing Web Search Query Click Logs for Multi-Domain Spoken Language Understanding, IEEE Automatic Speech Recognition and Understanding Workshop, December 2011
- Ivan Tashev, Coherence Based Double Talk Detector with Adaptive Threshold, in XX Scientific Conference ELECTRONICS ET2011, Technical University of Sofia Publishing House, 15 September 2011
- Yun-Cheng Ju and Jasha Droppo, Automatically Optimizing Utterance Classification Performance without Human in the Loop, in Interspeech, International Speech Communication Association, 28 August 2011
- Xiaodong He and Li Deng, Robust Speech Translation by Domain Adaptation, in Interspeech, International Speech Communication Association, August 2011
- Mike Seltzer and Alex Acero, Separating Speaker and Environmental Variability Using Factored Transforms, in Interspeech, International Speech Communication Association, August 2011
- Geoffrey Zweig and Shuangyu Chang, Personalizing Model M for Voice-search, in Interspeech, International Speech Communication Association, August 2011
- Dong Yu and Li Deng, Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition, in Interspeech, International Speech Communication Association, August 2011
- Dong Yu and Mike Seltzer, Improved Bottleneck Features Using Pretrained Deep Neural Networks, in Interspeech, International Speech Communication Association, August 2011
- Yanmin Qian, Daniel Povey, and Jia Liu, State-Level Data Borrowing for Low-Resource Speech Recognition based on Subspace GMMs, in Interspeech, International Speech Communication Association, August 2011
- Li Deng and Dong Yu, Deep Convex Network: A Scalable Architecture for Speech Pattern Classification, in Interspeech, International Speech Communication Association, August 2011
- Frank Seide, Gang Li, and Dong Yu, Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, in Interspeech 2011, International Speech Communication Association, August 2011
- Gokhan Tur, Dilek Hakkani-Tur, Dustin Hillard, and Asli Celikyilmaz, Towards Unsupervised Spoken Language Understanding: Exploiting Query Click Logs for Slot Filling, Annual Conference of the International Speech Communication Association (Interspeech), August 2011
- Asli Celikyilmaz, Dilek Hakkani-Tur, and Gokhan Tur, Multi-Domain Spoken Language Understanding with Approximate Inference, Annual Conference of the International Speech Communication Association (Interspeech), August 2011
- Xiao Li, Ye-Yi Wang, and Gokhan Tur, Multi-Task Learning for Spoken Language Understanding with Shared Slots, Annual Conference of the International Speech Communication Association (Interspeech), August 2011
- Dilek Hakkani-Tur, Gokhan Tur, Larry Heck, and Elizabeth Shriberg, Bootstrapping Domain Detection Using Query Click Logs for New Domains, August 2011
- Dustin Hillard, Asli Celikyilmaz, Gokhan Tur, and Dilek Hakkani Tur, Learning Weighted Entity Lists from Web Click Logs for Spoken Language Understanding, Annual Conference of the International Speech Communication Association (Interspeech), August 2011
- Asli Celikyilmaz, Gokhan Tur, and Dilek Hakkani-Tur, Leveraging Web Query Logs to Learn User Intent Via Bayesian Latent Variable Model, in ICML Workshop on Combining Learning Strategies to Reduce Label Cost, July 2011
- Jinyu Li, Dong Yu, Li Deng, and Yifan Gong, Towards High-Accuracy Low-Cost Noisy Robust Speech Recognition Exploiting Structured Model, in ICML Workshop 2011, June 2011
- Brian Hutchinson and Jasha Droppo, Learning Non-Parametric Models of Pronunciation, in Proceedings of ICASSP, IEEE SPS, 23 May 2011
- Hoang Do, Ivan Tashev, and Alex Acero, A New Speaker Identification Algorithm for Gaming Scenarios, in ICASSP, IEEE, May 2011
- Flávio Ribeiro, Dinei Florencio, Cha Zhang, and Michael Seltzer, CROWDMOS: An Approach for Crowdsourcing Mean Opinion Score Studies, in ICASSP, IEEE, May 2011
- Yaodong Zhang, Li Deng, Xiaodong He, and Alex Acero, A Novel Decision Function and the Associated Decision-Feedback Learning for Speech Translation, in ICASSP, IEEE, May 2011
- Daniel Povey and Kaisheng Yao, A Basis Method for Robust Estimation of Constrained MLLR, in ICASSP, IEEE, May 2011
- Jingjing Liu, Xiao Li, Alex Acero, and Ye-Yi Wang, Lexicon Modeling for Query Understanding, in ICASSP, IEEE, May 2011
- Dilek Hakkani-Tur, Larry Heck, and Gokhan Tur, Exploiting Query Click Logs for Utterance Domain Detection in Spoken Language Understanding, in Proceedings of the ICASSP, Prague, Czech Republic, May 2011
- Xiaodong He, Li Deng, and Alex Acero, Why Word Error Rate is not a Good Metric for Speech Recognizer Training for the Speech Translation Task?, in Proc. ICASSP, IEEE, May 2011
- Gokhan Tur, Dilek Hakkani-Tür, Larry Heck, and S. Parthasarathy, Sentence Simplification for Spoken Language Understanding, in IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE SPS, May 2011
- Xing Fan, Michael Seltzer, Jasha Droppo, Henrique Malvar, and Alex Acero, Joint Encoding of the Waveform and Speech Recognition Features Using a Transform Codec, in International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., May 2011
- Daniel Povey, Martin Karafiat, Arnab Ghoshal, and Petr Schwarz, A Symmetrization of the Subspace Gaussian Mixture Model, in ICASSP, IEEE, May 2011
- G. Dahl, Dong Yu, Li Deng, and Alex Acero, Large Vocabulary Continuous Speech Recognition With Context-Dependent DBN-HMMS, in Proc. ICASSP, Prague, IEEE, May 2011
- Marcel Kockmann, Luciana Ferrer, Lukas Burget, Elizabeth Shriberg, and Jan Cernocky, Recent Progress in Prosodic Speaker Verification, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011
- Justine Kao, Geoffrey Zweig, and Patrick Nguyen, Discriminative Duration Modeling for Speech Recognition with Segmental Conditional Random Fields, in ICASSP, IEEE, 2011
- Samuel Thomas, Patrick Nguyen, Geoffrey Zweig, and Hynek Hermansky, MLP Based Phoneme Detectors for Speech Recognition, in ICASSP, IEEE, 2011
- Geoffrey Zweig, Patrick Nguyen, Dirk Van Compernolle, Kris Demuynck, Les Atlas, Pascal Clark, Greg Sell, Meihong Wang, Fei Sha, Hynek Hermansky, Damianos Karakos, Aren Jansen, Samuel Thomas, G.S.V.S. Sivaram, Samuel Bowman, and Justine Kao, Speech Recognition with Segmental Conditional Random Fields: A Summary of the JHU CLSP 2010 Summer Workshop, in ICASSP 2011, IEEE, 2011
- Kris Demuynck, Dino Seppi, Dirk Van Compernolle, Patrick Nguyen, and Geoffrey Zweig, Integrating Meta-Information into Exemplar-Based Speech Recognition with Segmental Conditional Random Fields, in ICASSP, IEEE, 2011
2010
- Dong Yu, Li Deng, and George E. Dahl, Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition, in NIPS 2010 workshop on Deep Learning and Unsupervised Feature Learning, December 2010
- Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck, What's Left to Be Understood in ATIS?, IEEE Workshop on Spoken Language Technologies, December 2010
- Yun-Cheng Ju and Jasha Droppo, Spontaneous Mandarin speech understanding using utterance classification: a case study, in International Symposium on Chinese Spoken Language Processing, International Speech Communication Association, December 2010
- Ye-Yi Wang, Strategies for Statistical Spoken Language Understanding with Small Amount of Data – an Empirical Study, in Proc. of Interspeech, International Speech Communication Association, September 2010
- Jinyu Li, Dong Yu, Yifan Gong, and Li Deng, Unscented Transform with Online Distortion Estimation for HMM Adaptation, in Interspeech 2010, International Speech Communication Association, September 2010
- Ivan Tashev and Alex Acero, Statistical Modeling of the Speech Signal, in International Workshop on Acoustic, Echo, and Noise Control (IWAENC), Tel Aviv, Israel, 1 September 2010
- Mike Seltzer and Alex Acero, HMM Adaptation Using Linear Spline Interpolation with Integrated Spline Parameter Training for Robust Speech Recognition, in Interspeech, International Speech Communication Association, September 2010
- Li Deng, Mike Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, and Geoff Hinton, Binary Coding of Speech Spectrograms Using a Deep Auto-encoder, in Interspeech 2010, International Speech Communication Association, September 2010
- Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, and Alex Acero, Continuous Speech Recognition with a TF-IDF Acoustic Model, International Speech Communication Association, September 2010
- Abdel-rahman Mohamed, Dong Yu, and Li Deng, Investigation of Full-Sequence Training of Deep Belief Networks for Speech Recognition, in Interspeech 2010, International Speech Communication Association, September 2010
- Dong Yu and Li Deng, Deep-Structured Hidden Conditional Random Fields for Phonetic Recognition, in Interspeech 2010, International Speech Communication Association, September 2010
- Yun-Cheng Ju and Tim Paek, Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study, Association for Computational Linguistics, 11 July 2010
- Xiao Li, Understanding the Semantic Structure of Noun Phrase Queries, in ACL, Association for Computational Linguistics, July 2010
- Ivan Tashev, Andrew Lovitt, and Alex Acero, Dual stage probabilistic voice activity detector, in NOISE-CON 2010 and 159th Meeting of the Acoustical Society of America, Acoustical Society of America, 20 April 2010
- Lae-Hoon Kim, Ivan Tashev, and Alex Acero, Reverberated Speech Signal Separation Based on Regularized Subband Feedforward ICA and Instantaneous Direction of Arrival, in International Conference on Acoustics, Speech and Signal Processing, IEEE, 16 March 2010
- Dong Yu and Li Deng, Semantic Confidence Calibration for Spoken Dialog Applications, IEEE, March 2010
- Dong Yu, Shizhen Wang, Zahi Karam, and Li Deng, Language Recognition Using Deep-Structured Conditional Random Fields, IEEE, March 2010
- George Saon, Hagen Soltau, Upendra Chaudhari, Stephen Chu, Brian Kingsbury, Hong-Kwang Kuo, Lidia Mangu, and Daniel Povey, The IBM 2008 GALE Arabic Speech Transcription System, in ICASSP, IEEE, March 2010
- Wei Wu, Yun-Cheng Ju, Xiao Li, and Ye-Yi Wang, Paraphrase Detection on SMS Messages in Automobiles, in ICASSP, IEEE, March 2010
- Dong Yu, Shizhen Wang, Jinyu Li, and Li Deng, Word Confidence Calibration Using a Maximum Entropy Model with Constraints on Confidence and Word Distributions, IEEE, March 2010
- Xiaoqiang Xiao, Jasha Droppo, and Alex Acero, Information Retrieval Methods for Automatic Speech Recognition, in ICASSP, IEEE, March 2010
- Jui-Ting Huang, Xiao Li, and Alex Acero, Discriminative Training Methods for Language Models Using Conditional Entropy Criteria, in ICASSP, IEEE, March 2010
- Mike Seltzer, Alex Acero, and Kaustubh Kalgaonkar, Acoustic Model Adaptation via Linear Spline Interpolation for Robust Speech Recognition, in ICASSP, IEEE, March 2010
- Jasha Droppo and Alex Acero, Context Dependent Phonetic String Edit Distance for Automatic Speech Recognition, in ICASSP, IEEE, March 2010
- Amitav Das and Makarand Tapaswi, Direct Modeling of Spoken Passwords for Text-Dependent Speaker Recognition by Compressed Time-Feature Representations, in ICASSP, IEEE, March 2010
- Yun-Cheng Ju and Tim Paek, How to Safely Respond to SMS Messages in Automobiles, in 2nd Multimodal Interfaces for Automobile Applications (MIAA), Association for Computing Machinery, Inc., 7 February 2010
- Daniel Povey, Lukas Burget, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra Kumar Goel, Martin Karafiat, Ariya Rastrow, Richard C. Rose, Petr Schwarz, and Samuel Thomas, Subspace Gaussian Mixture Models for Speech Recognition, in ICASSP, 2010
- Stephen Chu and Daniel Povey, Speaking Rate Adaptation using Continuous Frame Rate Normalization, in ICASSP, 2010
- Shankar Shivappa, Patrick Nguyen, and Geoffrey Zweig, Discriminative Template Extraction for Direct Modeling, in ICASSP, IEEE, 2010
- Lukas Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra Goel, Martin Karafiat, Daniel Povey, Ariya Rastrow, Richard C. Rose, and Samuel Thomas, Multilingual Acoustic Modeling for Speech Recognition based on Subspace Gaussian Mixture Models, in ICASSP, 2010
- Arnab Ghoshal, Daniel Povey, Mohit Agarwal, Pinar Akyazi, Lukas Burget, Kai Feng, Ondrej Glembek, Nagendra Goel, Martin Karafiat, Ariya Rastrow, Richard C. Rose, Petr Schwarz, and Samuel Thomas, A Novel Estimation of Feature-space MLLR for Full Covariance Models, in ICASSP, 2010
- Geoffrey Zweig and Patrick Nguyen, SCARF: A Segmental Conditional Random Field Toolkit for Speech Recognition, International Speech Communication Association, 2010
- Nagendra Goel, Samuel Thomas, Mohit Agarwal, Pinar Akyazi, Lukas Burget, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Martin Karafiat, Daniel Povey, Ariya Rastrow, Richard C. Rose, and Petr Schwarz, Approaches to automatic lexicon learning with limited training examples, in ICASSP, 2010
- Haihua Xu, Daniel Povey, Lidia Mangu, and Jie Zhu, An Improved Consensus-Like method for Minimum Bayes Risk Decoding and Lattice Combination, in ICASSP, 2010
- Geoffrey Zweig and Patrick Nguyen, From Flat Direct Models to Segmental CRF Models, in ICASSP, IEEE, 2010
2009
- Dong Yu, Li Deng, and Shizhen Wang, Learning in the Deep-Structured Conditional Random Fields, in NIPS 2009 Workshop on Deep Learning for Speech Recognition and Related Applications, December 2009
- Ye-Yi Wang, Raphael Hoffmann, Xiao Li, and Jakub Szymanski, Semi-Supervised Learning of Semantic Classes for Query Understanding – from the Web and for the Web, in The 18th ACM Conference on Information and Knowledge Management, Association for Computing Machinery, Inc., November 2009
- Ivan Tashev, Michael L. Seltzer, and Yun-Cheng Ju, Speech and sound for in-car infotainment systems, in Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI 2009), Association for Computing Machinery, Inc., Essen, Germany, 22 September 2009
- Ivan Tashev, Michael Seltzer, Yun-Cheng Ju, Ye-Yi Wang, and Alex Acero, Commute UX: Voice Enabled In-car Infotainment System, in Mobile HCI '09: Workshop on Speech in Mobile and Pervasive Environments (SiMPE), Association for Computing Machinery, Inc., Bonn, Germany, 15 September 2009
- Yun-Cheng Ju and Tim Paek, A Voice Search Approach to Replying to SMS Messages in Automobiles, International Speech Communication Association, September 2009
- Geoffrey Zweig, New Methods for the Analysis of Repeated Utterances, in Interspeech 2009, International Speech Communication Association, September 2009
- Dong Yu, Li Deng, and Alex Acero, Hidden Conditional Random Field with Distribution Constraints for Phone Classification, in Interspeech 2009, International Speech Communication Association, September 2009
- Geoffrey Zweig and Patrick Nguyen, Maximum Mutual Information Multi-phone Units in Direct Modeling, in Interspeech 2009, International Speech Communication Association, September 2009
- Haihua Xu, Daniel Povey, Jie Zhu, and Guanyong Wu, Minimum Hypothesis Phone Error as a Decoding Method for Speech Recognition, in Interspeech 2009, International Speech Communication Association, September 2009
- Yun-Cheng Ju, Michael Seltzer, and Ivan Tashev, Improving Perceived Accuracy for In-Car Media Search, International Speech Communication Association, September 2009
- Ivan Tashev, Andrew Lovitt, and Alex Acero, Unified Framework for Single Channel Speech Enhancement, in 2009 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, IEEE, Victoria B.C., Canada, 24 August 2009
- Mehdi Hafezi Manshadi and Xiao Li, Semantic Tagging of Web Search Queries, in ACL, Association for Computational Linguistics, August 2009
- Hisami Suzuki, Xiao Li, and Jianfeng Gao, Discovery of Term Variation in Japanese Web Search Queries, in Proceedings of EMNLP, Association for Computational Linguistics, August 2009
- Xiao Li, On the Use of Virtual Evidence in Conditional Random Fields, in EMNLP, August 2009
- Jianfeng Gao, Jian-Yun Nie, Wei Yuan, Xiao Li, and Kefeng Deng, Smoothing Clickthrough Data for Web Search Ranking, in SIGIR, July 2009
- Xiao Li, Ye-Yi Wang, and Alex Acero, Extracting Structured Information from User Queries with Semi-Supervised Conditional Random Fields, in SIGIR, July 2009
- Ozlem Kalinli, Michael L. Seltzer, and Alex Acero, Noise Adaptive Training Using a Vector Taylor Series Approach for Robust Automatic Speech Recognition, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Taipei, Taiwan, April 2009
- Hui Lin, Li Deng, Dong Yu, Yifan Gong, Alex Acero, and Chin-Hui Lee, A Study on Multilingual Acoustic Modeling For Large Vocabulary ASR, in Proceedings of the ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2009
- Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, and Alex Acero, Cross-lingual speech recognition under run-time resource constraints, in Proceedings of the ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2009
- Balakrishnan Varadarajan, Dong Yu, Li Deng, and Alex Acero, Maximizing global entropy reduction for active learning in speech recognition, in Proceedings of the ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2009
- Xiao Li, Patrick Nguyen, Geoffrey Zweig, and Dan Bohus, Leveraging Multiple Query Logs to Improve Language Models for Spoken Query Recognition, in ICASSP, IEEE, April 2009
- Balakrishnan Varadarajan, Dong Yu, Li Deng, and Alex Acero, Using collective information in semi-supervised learning for speech recognition, in Proceedings of the ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2009
- Michael L. Seltzer and Lei Zhang, The data deluge: challenges and opportunities of unlimited data in statistical signal processing, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Taipei, Taiwan, April 2009
- Oriol Vinyals, Li Deng, Dong Yu, and Alex Acero, Discriminative pronunciation learning using phonetic decoder and minimum classification error criterion, in Proceedings of the ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2009
- Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Mike Seltzer, Ivan Tashev, and Alex Acero, Voice Search of Structured Media Data, in International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Taipei, Taiwan, April 2009
- Jasha Droppo and Alex Acero, Experimenting with a Global Decision Tree for State Clustering in Automatic Speech Recognition Systems, in ICASSP 2009, IEEE, April 2009
- Georg Heigold, Geoffrey Zweig, Xiao Li, and Patrick Nguyen, A Flat Direct Model for Speech Recognition, in ICASSP-2009, IEEE, 2009
- Geoffrey Zweig and Patrick Nguyen, A Segmental CRF Approach to Large Vocabulary Continuous Speech Recognition, in ASRU, IEEE, 2009
- Daniel Bolanos, Geoffrey Zweig, and Patrick Nguyen, Multi-scale Personalization for Voice Search Applications, in HLT-NAACL 2009, Association for Computational Linguistics, 2009
2008
- Dong Yu, Li Deng, Jian Wu, Yifan Gong, and Alex Acero, Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition, in ISCSLP, IEEE, December 2008
- Dong Yu, Li Deng, and Alex Acero, The Maximum Entropy Model with Continuous Features, in NIPS Workshop, Whistler, BC, Canada, Microsoft, December 2008
- Hui Lin, Li Deng, Jasha Droppo, Dong Yu, and Alex Acero, Learning Methods in Multilingual Speech Recognition, in NIPS Workshop, Whistler, BC, Canada, Microsoft, December 2008
- Zhengyou Zhang, Qin Cai, and Jack W. Stokes, Multichannel Acoustic Echo Cancellation in Multiparty Spatial Audio Conferencing with Constrained Kalman Filtering, in 11th International Workshop on Acoustic Echo and Noise Control, 14 September 2008
- Yun-Cheng Ju and Julian Odell, A Language-Modeling Approach to Inverse Text Normalization and Data Cleanup, International Speech Communication Association, September 2008
- Michael Seltzer and Ivan Tashev, A Log-MMSE Adaptive Filter Using a non-Linear Spatial Filter, in Proceedings of International Workshop on Acoustic, Echo and Noise Control IWAENC 2008, Seattle, USA, September 2008
- Ivan Tashev, Slavy Mihov, Tyler Gleghorn, and Alex Acero, Sound Capture System and Spatial Filter for Small Devices, in Proceedings of Interspeech 2008, International Speech Communication Association, Brisbane, Australia, September 2008
- Ivan Tashev and Michael Seltzer, Data Driven Beamformer Design for Binaural Headset, in Proceedings of International Workshop on Acoustic, Echo and Noise Control IWAENC 2008, Seattle, USA, September 2008
- Xiaolong Li, Li Deng, Yun-Cheng Ju, and Alex Acero, Automatic Children's Reading Tutor on Hand-Held Devices, in Proceedings of Interspeech, International Speech Communication Association, Brisbane, Australia, September 2008
- Dong Yu, Li Deng, Yifan Gong, and Alex Acero, Discriminative Training of Variable-Parameter HMMs for Noise Robust Speech Recognition, in Proceedings of the Interspeech, International Speech Communication Association, September 2008
- Dong Yu, Li Deng, Yifan Gong, and Alex Acero, Parameter Clustering and Sharing in Variable-Parameter HMMs for Noise Robust Speech Recognition, in Proc. of the Interspeech, International Speech Communication Association, September 2008
- Ye-Yi Wang, Xiao Li, and Alex Acero, Inductive and Example-Based Learning for Text Classification, in Interspeech, International Speech Communication Association, Brisbane, Australia, September 2008
- Xiao Li, Ye-Yi Wang, and Alex Acero, Learning Query Intent from Regularized Click Graphs, in SIGIR'08: the 31st Annual ACM SIGIR conference on Research and Development in Information Retrieval, Association for Computing Machinery, Inc., Singapore, Singapore, July 2008
- Slavy Mihov, Tyler Gleghorn, and Ivan Tashev, Enhanced Sound Capture System for Small Devices, in Proceedings of XLIII International Scientific Conference on Information, Communication, and Energy Systems and Technologies ICEST 2008, Nis, Serbia, June 2008
- Dan Bohus, Xiao Li, Patrick Nguyen, and Geoffrey Zweig, Learning N-Best Correction Models from Implicit User Feedback in a Multi-Modal Local Search Application, in Special Interest Group on Discourse and Dialogue (SIGdial), June 2008
- Michael L. Seltzer, Bridging the gap: towards a unified framework for hands-free speech recognition using microphone arrays, in Proceedings of the Workshop on Hands-Free Speech Communication and Microphone Arrays, Institute of Electrical and Electronics Engineers, Inc., Trento, Italy, May 2008
- Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, and Alex Acero, A Minimum Mean-Square-Error Noise Reduction Algorithm on Mel-Frequency Cepstra for Robust Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Graham Taylor, Michael Seltzer, and Alex Acero, Maximum a Posteriori ICA: Applying Prior Knowledge to the Separation of Acoustic Sources, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Alex Acero, Neal Bernstein, Rob Chambers, Yun-Cheng Ju, Xiao Li, Julian Odell, Patrick Nguyen, Oliver Scholtz, and Geoff Zweig, Live Search for Mobile: Web Services by Voice on the Cellphone, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Tsung-Hui Chang, Zhi-Quan Luo, Li Deng, and Chong-Yung Chi, A Convex Optimization Method for Joint Mean and Variance Parameter Estimation of Large-Margin CDHMM, in Proceedings of the ICASSP, April 2008
- Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, and Alex Acero, Adaptation of compressed HMM parameters for resource-constrained speech recognition, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Nilesh Madhu, Ivan Tashev, and Alex Acero, An EM-based Probabilistic Approach for Acoustic Echo Suppression, in Proceedings of International Conference on Audio, Speech and Signal Processing ICASSP 2008, Institute of Electrical and Electronics Engineers, Inc., Las Vegas, USA, April 2008
- Ivan Tashev, Jasha Droppo, Michael Seltzer, and Alex Acero, Robust Design of Wideband Loudspeaker Arrays, in Proc. of International Conference on Audio, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Las Vegas, USA, April 2008
- Luis Buera, Jasha Droppo, and Alex Acero, Speech Enhancement using a Pitch Predictive Model, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Jinyu Li, Li Deng, Dong Yu, Yifan Gong, and Alex Acero, HMM Adaptation Using a Phase-Sensitive Acoustic Distortion Model for Environment-Robust Speech Recognition, Institute of Electrical and Electronics Engineers, Inc., April 2008
- Xiao Li, Y.-C. Ju, Geoffrey Zweig, and Alex Acero, Language modeling for voice search: a machine translation approach, in ICASSP, March 2008
- G. Choueiter and Geoffrey Zweig, An Empirical Study of Automatic Accent Classification, in Proceedings of ICASSP, 2008
- Sumit Basu, Surabhi Gupta, Milind Mahajan, Patrick Nguyen, and John C. Platt, Scalable Summaries of Spoken Conversations, in IUI '08: Proceedings of the 13th international conference on Intelligent user interfaces, Association for Computing Machinery, Inc., January 2008
- Z. Li, Geoffrey Zweig, and Patrick Nguyen, Optimal Dialog in Consumer-Rating Systems using a POMDP Framework, in Proceedings of SIGdial, 2008
- G. Zweig, D. Bohus, X. Li, and P. Nguyen, Structured Models for Joint Decoding of Repeated Utterances, in Proceedings of Interspeech, 2008
- G. Zweig and J. Nedel, Empirical Properties of Multilingual Phone-to-Word Transduction, in Proceedings of ICASSP, 2008
- Tim Paek and Yun-Cheng Ju, Accommodating Explicit User Expressions of Uncertainty in Voice Search or Something Like That, International Speech Communication Association, 2008
- C. White, G. Zweig, L. Burget, P. Schwarz, and H. Hermansky, Confidence Estimation, OOV Detection and Language ID Using Phone-to-Word Transduction and Phone-Level Alignments, in Proceedings of ICASSP, 2008
2007
- Li Deng, Roles of high-fidelity acoustic modeling in robust speech recognition (invited), in Proceedings IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Institute of Electrical and Electronics Engineers, Inc., December 2007
- Xiao Li, Asela Gunawardana, and Alex Acero, Adapting grapheme-to-phoneme conversion for name recognition, in IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., December 2007
- Dong Yu and Li Deng, Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition (invited), in Proc. IEEE Intern. Conf. Semantic Computing, Irvine, CA, Institute of Electrical and Electronics Engineers, Inc., 17 September 2007
- Ivan Tashev, Michael Seltzer, Y. C. Ju, Dong Yu, and Alex Acero, Commute UX: Telephone Dialog System for Location-based Services, in Proceedings of SIGdial Workshop on Discourse and Dialogue 2007, Antwerp, Belgium, September 2007
- Roberto Togneri and Li Deng, A Structured Speech Model Parameterized by Recursive Dynamics and Neural Networks, in Proc. Interspeech, Antwerp, Belgium, 27 August 2007
- Jasha Droppo and Alex Acero, A Fine Pitch Model for Speech, in Proc. Interspeech Conference, International Speech Communication Association, August 2007
- Li Deng and H. Strik, Structure-Based and Template-Based Automatic Speech Recognition --- Comparing parametric and non-parametric approaches, in Proc. Interspeech, August 2007
- Michael Seltzer, Y. C. Ju, Ivan Tashev, and Alex Acero, Robust Location Understanding in Spoken Dialog Systems Using Intersections, in Proceedings of Interspeech 2007, Antwerp, Belgium, August 2007
- Qiang Fu, Xiaodong He, and Li Deng, Phone-Discriminating Minimum Classification Error (P-MCE) Training for Phonetic Recognition, in Proc. Interspeech, August 2007
- Dong Yu and Li Deng, Handling Phonetic Context and Speaker Variation in a Structure-Based Speech Recognizer, in Proc. Interspeech, International Speech Communication Association, August 2007
- Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, and Alex Acero, Automated Directory Assistance System - from Theory to Practice, in Proc. of Interspeech, International Speech Communication Association, Antwerp, Belgium, August 2007
- J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, and Alex Acero, VoicePedia: Towards Speech-based Access to Unstructured Information, in Interspeech, International Speech Communication Association, August 2007
- Amarnag Subramanya, Mike Seltzer, and Alex Acero, Removal of Typed Keystrokes from Speech Signals, in Proc. of the Interspeech Conference, International Speech Communication Association, May 2007
- Xiaolong Li, Yun-Cheng Ju, Li Deng, and Alex Acero, Efficient and Robust Language Modeling in an Automatic Children's Reading Tutor System, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Institute of Electrical and Electronics Engineers, Inc., 18 April 2007
- Dong Yu, Li Deng, Xiaodong He, and Alex Acero, Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks, in Proceedings of the ICASSP, Honolulu, Hawaii, IEEE, April 2007
- Ivan Tashev and Henrique Malvar, Stationary-Tones Interference Cancellation Using Adaptive Tracking, in International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Honolulu, USA, April 2007
- Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, and Alex Acero, A Discriminative Training Framework using N-Best Speech Recognition Transcriptions and Scores for Spoken Utterance Classification, in Proc. of the International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Honolulu, Hawaii, U.S.A., April 2007
- Jinyu Li, Li Deng, Dong Yu, Yifan Gong, and Alex Acero, High-Performance HMM Adaptation With Joint Compensation of Additive and Convolutive Distortions Via Vector Taylor Series, in Proceedings IEEE Workshop on ASRU, Institute of Electrical and Electronics Engineers, Inc., April 2007
- Amarnag Subramanya, Zhengyou Zhang, A.C. Surendran, Patrick Nguyen, Mukund Narasimhan, and Alex Acero, A Generative Discriminative Framework Using Ensemble Methods for Text-Dependent Speaker Verification, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2007
- Li Deng and Dong Yu, Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition, in Proceedings of the ICASSP, Honolulu, Hawaii, IEEE, April 2007
- Chris White, Jasha Droppo, Alex Acero, and Julian Odell, Maximum Entropy Confidence Estimation for Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Hawaii, April 2007
- Byung-Jun Yoon, Ivan Tashev, and Alex Acero, Robust Adaptive Beamforming Algorithm Using Instantaneous Direction of Arrival with Enhanced Noise Suppression Capability, in Proceedings of International Conference on Acoustics, Speech and Signal Processing ICASSP 2007, Honolulu, USA, April 2007
- Michael Seltzer, Ivan Tashev, and Alex Acero, Microphone Array Post-Filter Using Incremental Bayes Learning to Track the Spatial Distribution of Speech and Noise, in Proceedings of International Conference on Acoustics, Speech and Signal Processing ICASSP 2007, Honolulu, USA, April 2007
- Ye-Yi Wang, Voice search - Information access via voice queries (Invited Talk), in IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Kyoto, Japan, 2007
- Tim Paek, Yun-Cheng Ju, and Christopher Meek, People Watcher: A Game for Eliciting Human-Transcribed Data for Automated Directory Assistance, International Speech Communication Association, 2007
- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, and Alex Acero, Confidence Measures for Voice Search Applications, in 8th Annual Conference of the International Speech Communication Association, International Speech Communication Association, Antwerp, Belgium, 2007
- Geoffrey Zweig, Yun-Cheng Ju, Patrick Nguyen, Dong Yu, Ye-Yi Wang, and Alex Acero, Voice-Rate: A Dialog System for Consumer Ratings, in NAACL/HLT (Demonstration Program), Association for Computational Linguistics, Rochester, New York, USA, 2007
- Ye-Yi Wang and Alex Acero, Maximum Entropy Model Parameterization with Tf-Idf Weighted Vector Space Model, in IEEE Automatic Speech Recognition and Understanding Workshop, Institute of Electrical and Electronics Engineers, Inc., Kyoto, Japan, 2007
- Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, and Alex Acero, The Voice-Rate Dialog System for Consumer Ratings, in INTERSPEECH, International Speech Communication Association, Antwerp, Belgium, 2007
2006
- Xiaolong Li, Li Deng, Dong Yu, and Alex Acero, A Time-Synchronous Phonetic Decoder For A Long-Contextual-Span Hidden Trajectory Model, in Proceedings of International Conference on Speech Communication (InterSpeech), 2006, International Speech Communication Association, Pittsburgh, PA, 19 September 2006
- Ivan Tashev and Alex Acero, Microphone Array Post-Processor Using Instantaneous Direction of Arrival, in Proceedings of International Workshop on Acoustic, Echo and Noise Control IWAENC 2006, Paris, France, September 2006
- Dong Yu, Li Deng, Xiaodong He, and Alex Acero, Use of Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition, in Proc. of the Interspeech Conference, International Speech Communication Association, September 2006
- Dong Yu, Yun-Cheng Ju, and Alex Acero, An Effective and Efficient Utterance Verification Technology Using Word N-gram Filler Models, in Proc. of the Interspeech Conference, International Speech Communication Association, September 2006
- Amarnag Subramanya, Michael Seltzer, and Alex Acero, Removal of Typed Keystrokes from Speech Signals, in Proc. of the Interspeech Conference, International Speech Communication Association, September 2006
- Yong Rui, Eric Rudolph, Li-wei He, Rico Malvar, Michael Cohen, and Ivan Tashev, PING: A Group-to-Individual Distributed Meeting System, in Proceedings of International Conference Multimedia and Expo, Institute of Electrical and Electronics Engineers, Inc., Toronto, Canada, July 2006
- Li Deng, X. Cui, R. Pruvenok, J. Huang, S. Momen, Y. Chen, and A. Alwan, A Database of Vocal Tract Resonance Trajectories for Research in Speech Processing, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2006
- Milind Mahajan, Asela Gunawardana, and Alex Acero, Training algorithms for hidden conditional random fields, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., May 2006
- Jasha Droppo and Alex Acero, Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Toulouse, France, May 2006
- J. Silva, C. Chelba, and Alex Acero, Pruning Analysis for the Position Specific Posterior Lattices for Spoken Document Search, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2006
- Ivan Tashev, Jasha Droppo, and Alex Acero, Suppression Rule for Speech Recognition Friendly Noise Suppressors, in Proceedings of Eighth International Conference Digital Signal Processing and Applications DSPA’06, Moscow, Russia, March 2006
- Yun-Cheng Ju, Ye-Yi Wang, and Alex Acero, Call Analysis with Classification Using Speech and Non-Speech Features, in the International Conference on Spoken Language Processing, International Speech Communication Association, Pittsburgh, PA, USA, 2006
- Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, and Alex Acero, N-Gram Based Filler Model for Robust Grammar Authoring, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Toulouse, France, 2006
- Ye-Yi Wang, John Lee, Milind Mahajan, and Alex Acero, Combining Statistical and Knowledge-Based Spoken Language Understanding in Conditional Models, in COLING/ACL06, Association for Computational Linguistics, Sydney, Australia, 2006
- Ye-Yi Wang and Alex Acero, Discriminative Models for Spoken Language Understanding, in the International Conference on Spoken Language Processing, International Speech Communication Association, Pittsburgh, PA, USA, 2006
- Ye-Yi Wang, John Lee, and Alex Acero, Speech Utterance Classification Model Training without Manual Transcriptions, in IEEE International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Toulouse, France, 2006
2005
- Mike Seltzer and Alex Acero, An EM Algorithm for Training Wideband Acoustic Models from Mixed-Bandwidth Training Data, in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., December 2005
- Jasha Droppo, Milind Mahajan, Asela Gunawardana, and Alex Acero, How to Train a Discriminative Front End with Stochastic Gradient Descent and Maximum Mutual Information, in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Puerto Rico, December 2005
- Li Deng, Dong Yu, and Alex Acero, A Generative Modeling Framework for Structured Hidden Speech Dynamics, in NIPS Workshop on Advances in Structured Learning for Text and Speech Processing, Microsoft, December 2005
- Li Deng, Dong Yu, Xiaolong Li, and Alex Acero, A Long-Contextual-Span Model of Resonance Dynamics for Speech Recognition: Parameter Learning and Recognizer Evaluation, in Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Puerto Rico, November 2005
- Zicheng Liu, Michael Seltzer, Alex Acero, Ivan Tashev, Zhengyou Zhang, and Mike Sinclair, A Compact Multi-Sensor Headset for Hands-Free Communication, in Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, USA, October 2005
- A. Subramanya, Z. Zhang, Z. Liu, Jasha Droppo, and Alex Acero, A Graphical Model for Multi-Sensory Speech Processing in Air-and-Bone Conductive Microphones, in Proc. of the Interspeech Conference, International Speech Communication Association, Lisbon, Portugal, September 2005
- Ivan Tashev, Beamformer Sensitivity to Microphone Manufacturing Tolerances, in Proceedings of Nineteenth International Conference Systems for Automation of Engineering and Research SAER 2005, St. Konstantin Resort, Bulgaria, September 2005
- Li Deng, Xiaolong Li, Dong Yu, and Alex Acero, Evaluation of a Long-Contextual-Span Hidden Trajectory Model and Phonetic Recognizer Using A* Lattice Search, in Proc. of the Interspeech Conference, International Speech Communication Association, September 2005
- Asela Gunawardana, Milind Mahajan, Alex Acero, and John C. Platt, Hidden Conditional Random Fields for Phone Classification, in International Conference on Speech Communication and Technology, International Speech Communication Association, September 2005
- Ivan Tashev, Michael Seltzer, and Alex Acero, Microphone Array for Headset with Spatial Noise Suppressor, in Proceedings of Ninth International Workshop on Acoustic, Echo and Noise Control IWAENC 2005, Eindhoven, The Netherlands, September 2005
- Li Deng, Dong Yu, and Alex Acero, Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation and Reduction, in Proc. of the Interspeech Conference, International Speech Communication Association, September 2005
- Jasha Droppo and Alex Acero, Maximum Mutual Information SPLICE Transform for Seen and Unseen Conditions, in Proc. Interspeech Conference, International Speech Communication Association, September 2005
- C. Chelba and Alex Acero, Indexing Uncertainty for Spoken Document Search, in Proc. of the Interspeech Conference, September 2005
- Xiao Li, Asela Gunawardana, and Alex Acero, Unsupervised semantic intent discovery from call log acoustics, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., August 2005
- A. Subramanya, Li Deng, Z. Liu, and Z. Zhang, Multi-sensory speech processing: Incorporating automatically extracted hidden dynamic information, in Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), Amsterdam, July 2005
- C. Chelba and Alex Acero, Position Specific Posterior Lattices for Indexing Speech, in Proc. of the Association for Computational Linguistics, June 2005
- C. Chelba and Alex Acero, SPEECH OGLE: Indexing Uncertainty for Spoken Document Search, in Proc. of the Association for Computational Linguistics, June 2005
- Ivan Tashev and Daniel Allred, Reverberation reduction for improved speech recognition, in Proceedings of Hands-Free Communication and Microphone Arrays, Piscataway, USA, March 2005
- Michael Seltzer and Alex Acero, Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., March 2005
- Z. Liu, A. Subramanya, Z. Zhang, Jasha Droppo, and Alex Acero, Leakage Model and Teeth Clack Removal for Air- and Bone-Conductive Integrated Microphones, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Philadelphia, March 2005
- Li Deng, Xiang Li, Dong Yu, and Alex Acero, A Hidden Trajectory Model with Bi-Directional Target Filtering: Cascaded vs. Integrated Implementation for Phonetic Recognition, in Proc. of Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., March 2005
- Dong Yu, Milind Mahajan, P. Mau, and Alex Acero, Maximum Entropy Based Generic Filter for Language Model Adaptation, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, IEEE, March 2005
- Ivan Tashev and Henrique Malvar, A New Beamformer Design Algorithm for Microphone Arrays, in International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Philadelphia, USA, March 2005
- Ye-Yi Wang, John Lee, Milind Mahajan, and Alex Acero, Statistical Spoken Language Understanding: from Generative Model to Conditional Model, in NIPS Workshop: Advances in Structured Learning for Text and Speech Processing, Whistler, BC, Canada, 2005
- Ye-Yi Wang and Alex Acero, SGStudio: Rapid Semantic Grammar Development for Spoken Language Understanding, in 9th European Conference on Speech Communication and Technology, International Speech Communication Association, Lisbon, Portugal, 2005
- M. L. Seltzer, Alex Acero, and Jasha Droppo, Robust Bandwidth Extension of Noise-corrupted Narrowband Speech, in Proc. Interspeech Conference, International Speech Communication Association, 2005
2004
- Xuedong Huang, Enabling natural computing, in 2004 International Symposium on Chinese Spoken Language Processing, December 2004
- Li Deng, Xiaolong Li, Dong Yu, and Alex Acero, Novel Acoustic Modeling with Structured Hidden Dynamics for Speech Coarticulation and Reduction, in Proc. of the DARPA RT04 Workshop, November 2004
- Li Deng, Dong Yu, and Alex Acero, A Quantitative Model for Formant Dynamics and Contextually Assimilated Reduction in Fluent Speech, in Proc. Int. Conf. on Spoken Language Processing, International Speech Communication Association, October 2004
- Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, and Li Deng, Unsupervised Learning from Users’ Error Correction in Speech Dictation, in Proc. Int. Conf. on Spoken Language Processing, International Speech Communication Association, October 2004
- Michael Seltzer, B. Raj, and R. Stern, A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition, in Speech Communication, September 2004
- Li Deng, Zicheng Liu, Zhengyou Zhang, and Alex Acero, Nonlinear Information Fusion in Multi-Sensor Processing - Extracting and Exploiting Hidden Dynamics of Speech Captured by a Bone-Conductive Microphone, in Proc. of the IEEE Workshop on Multimedia Signal Processing, Institute of Electrical and Electronics Engineers, Inc., September 2004
- Zicheng Liu, Zhengyou Zhang, Alex Acero, Jasha Droppo, and Xuedong Huang, Direct Filtering for Air- and Bone-Conductive Microphones, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Siena, Italy, September 2004
- C. Chelba and Alex Acero, Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot, in Proc. of EMNLP, July 2004
- Ivan Tashev, Gain Self-Calibration Procedure for Microphone Arrays, in Proceedings of International Conference for Multimedia and Expo ICME 2004, Taipei, Taiwan, June 2004
- Li Deng, L. Lee, H. Attias, and Alex Acero, A Structured Speech Model with Continuous Hidden Dynamics and Prediction-Residual Training for Tracking Vocal Tract Resonances, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2004
- Jasha Droppo and Alex Acero, Noise Robust Speech Recognition with a Switching Linear Dynamic Model, in Proc. ICASSP, IEEE, Montreal, Canada, May 2004
- C. Chelba and Alex Acero, Conditional ML Estimation Using Rational Function Growth Transform, in Proc. of the Snowbird Learning Workshop, April 2004
- Kuansan Wang, Ye-Yi Wang, and Alex Acero, Use and Acquisition of Semantic Language Model, in NAACL/HLT (Short Paper), Association for Computational Linguistics, Boston, MA, 2004
- Ye-Yi Wang and Yun-Cheng Ju, Creating Speech Recognition Grammars from Regular Expressions for Alphanumeric Concepts, in International Conference on Spoken Language Processing, International Speech Communication Association, Jeju, Korea, 2004
- David Ollason, Yun-Cheng Ju, Siddharth Bhatia, Dan Herron, and Jackie Liu, MS Connect: A Fully Featured Auto-attendant: System Design, Implementation and Performance, International Speech Communication Association, 2004
- Zhengyou Zhang, Z. Liu, M. Sinclair, A. Acero, Li Deng, J. Droppo, Xuedong Huang, and Yanli Zheng, Multisensory microphones for robust speech detection, enhancement, and recognition, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004, IEEE, 2004
- Alex Acero, Ye-Yi Wang, and Kuansan Wang, A Semantically Structured Language Model, in Special Workshop in Maui, Maui, Hawaii, 2004
2003
- J. Wu, Jasha Droppo, Li Deng, and Alex Acero, A Noise-Robust ASR Front-End Using Wiener Filter Constructed from MMSE Estimation of Clean Speech and Noise, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., U.S. Virgin Islands, December 2003
- Ivan Tashev, Improving Meetings with Microphone Array Algorithms, in Machine Learning Meets the User Interface Workshop, Neural Information Processing Systems NIPS 2003, Whistler, Canada, December 2003
- Y. Zheng, Z. Liu, Z. Zhang, M. Sinclair, Jasha Droppo, Li Deng, Xuedong Huang, and Alex Acero, Air and Bone-Conductive Integrated Microphones for Robust Speech Detection and Enhancement, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., U.S. Virgin Islands, December 2003
- Ciprian Chelba and Alex Acero, Discriminative Training of N-gram Classifiers for Speech and Text Routing, in Proc. of the Eurospeech Conference, International Speech Communication Association, September 2003
- Mike Seltzer, Jasha Droppo, and Alex Acero, A Harmonic-Model-Based Front End for Robust Speech Recognition, in Proc. Eurospeech Conference, International Speech Communication Association, Geneva, Switzerland, September 2003
- Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, and Alex Acero, Improved Name Recognition With User Modeling, in Proc. of the Eurospeech Conference, International Speech Communication Association, September 2003
- Yongang Deng, Milind Mahajan, and Alex Acero, Estimating Speech Recognition Error Rate without Acoustic Test Data, in Proc. of the European Conference on Speech Communication, International Speech Communication Association, September 2003
- Li Deng, I. Bazzi, and Alex Acero, Tracking Vocal Tract Resonances Using an Analytical Nonlinear Predictor and a Target-guided Temporal Constraint, in Proc. of the Eurospeech Conference, Geneva, September 2003
- Asela Gunawardana and Alex Acero, Adapting acoustic models to new domains and conditions using untranscribed data, in International Conference on Speech Communication and Technology, International Speech Communication Association, September 2003
- Y. Deng, Milind Mahajan, and Alex Acero, Estimating Speech Recognition Error Rate without Acoustic Test Data, in Proc. of the Eurospeech Conference, September 2003
- Jasha Droppo, Li Deng, and Alex Acero, A Comparison of Three Non-Linear Observation Models for Noisy Speech Features, in Proc. Eurospeech Conference, International Speech Communication Association, Geneva, Switzerland, September 2003
- Issam Bazzi, Alex Acero, and Li Deng, An Expectation-Maximization Approach for Formant Tracking using a Parameter-free Nonlinear Predictor, in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 2003
- C. Chelba, Milind Mahajan, and Alex Acero, Speech Utterance Classification, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, April 2003
- Li Deng, Jasha Droppo, and Alex Acero, Incremental Bayes Learning with Prior Evolution for Tracking Non-Stationary Noise Statistics from Noisy Speech Data, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Hong Kong, April 2003
- Ye-Yi Wang and Alex Acero, Combination of CFG and N-gram Modeling in Semantic Grammar Learning, in Eurospeech 2003, International Speech Communication Association, Geneva, Switzerland, 2003
- Ye-Yi Wang and Alex Acero, Concept Acquisition in Example-Based Grammar Authoring, in IEEE International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Hong Kong, China, 2003
- Ye-Yi Wang and Alex Acero, Is Word Error Rate a Good Indicator for Spoken Language Understanding Accuracy, in IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., St. Thomas, US Virgin Islands, 2003
2002
- Ross Cutler, Yong Rui, Anoop Gupta, JJ Cadiz, Ivan Tashev, Li-wei He, Alex Colburn, Zhengyou Zhang, Zicheng Liu, and Steve Silverberg, Distributed Meetings: A Meeting Capture and Broadcasting System, in Proceedings of ACM Multimedia 2002, Nice, France, December 2002
- Li Deng, Alex Acero, Ye-Yi Wang, Kuansan Wang, Hsiao-Wuen Hon, Jasha Droppo, Milind Mahajan, and XD Huang, A speech-centric perspective for human-computer interface, in Proc. of the IEEE Fifth Workshop on Multimedia Signal Processing, Institute of Electrical and Electronics Engineers, Inc., December 2002
- Li Deng, Jasha Droppo, and Alex Acero, Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Jasha Droppo, Li Deng, and Alex Acero, Evaluation of SPLICE on the Aurora 2 and 3 Tasks, in Proc. International Conference on Spoken Language Processing, International Speech Communication Association, Denver, Colorado, September 2002
- Jasha Droppo, Alex Acero, and Li Deng, A Nonlinear Observation Model for Removing Noise from Corrupted Speech Log Mel-Spectral Energies, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Li Deng, Jasha Droppo, and Alex Acero, Log-Domain Speech Feature Enhancement Using Sequential MAP Noise Estimation and a Phase-sensitive Model of the Acoustic Environment, in Proc. International Conference on Spoken Language Processing, Denver, Colorado, September 2002
- Y. Xiang, Y. Hua, S. An, and Alex Acero, Separating Colored Signals Distorted by Convolutive Channels Using Diagonal Constrained Decorrelation, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2002
- Jasha Droppo, Li Deng, and Alex Acero, Uncertainty Decoding with SPLICE for Noise Robust Speech Recognition, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Florida, May 2002
- Li Deng, Jasha Droppo, and Alex Acero, A Bayesian Approach to Speech Feature Enhancement using the Dynamic Cepstral Prior, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Florida, May 2002
- Hagai Attias and Li Deng, A new approach to speech enhancement by a microphone array using EM and mixture models, in Proceedings of the International Conference on Spoken Language Processing, Denver CO, September 2002, 2002
- Ye-Yi Wang, Alex Acero, Ciprian Chelba, Brendan Frey, and Leon Wong, Combination of Statistical and Rule-Based Approaches for Spoken Language Understanding, in International Conference on Spoken Language Processing, International Speech Communication Association, Denver, Colorado, 2002
- Ye-Yi Wang and Alex Acero, Evaluation of Spoken Language Grammar Learning in ATIS Domain, in IEEE International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Orlando, Florida, 2002
2001
- Li Deng, Jasha Droppo, and Alex Acero, Recursive Noise Estimation Using Iterative Stochastic Approximation for Stereo-based Robust Speech Recognition, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Madonna di Campiglio, Italy, December 2001
- Jasha Droppo, Alex Acero, and Li Deng, Evaluation of the SPLICE Algorithm on the Aurora 2 Database, in Proc. Eurospeech Conference, International Speech Communication Association, Aalborg, Denmark, September 2001
- H. Attias, Li Deng, Alex Acero, and John Platt, A New Method for Speech Denoising and Robust Speech Recognition Using Probabilistic Models for Clean Speech and for Noise, in Proc. of the Eurospeech Conference, September 2001
- B. Frey, Li Deng, T. Kristjansson, and Alex Acero, ALGONQUIN: Iterating Laplace's Method to Remove Multiple Types of Acoustic Distortion for Robust Speech Recognition, in Proc. of the Eurospeech Conference, September 2001
- Y. Xiang, Y. Hua, S. An, and Alex Acero, Experimental Investigation of Delayed Instantaneous Demixer for Speech Enhancement, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2001
- L. Lee, P. Fieguth, and Li Deng, A Functional Articulatory Dynamic Model for Speech Production, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2001
- T. Kristjansson, B. Frey, Li Deng, and Alex Acero, Towards Non-Stationary Model-Based Noise Adaptation for Large Vocabulary Speech Recognition, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 2001
- Li Deng, Alex Acero, L. Jiang, Jasha Droppo, and Xuedong Huang, High-Performance Robust Speech Recognition Using Stereo Training Data, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, May 2001
- Jasha Droppo, Alex Acero, and Li Deng, Efficient Online Acoustic Environment Estimation for FCDCN in a Continuous Speech Recognition System, in Proc. ICASSP, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, May 2001
- Xuedong Huang, Alex Acero, C. Chelba, Li Deng, Jasha Droppo, D. Duchene, J. Goodman, Hsiao-Wuen Hon, D. Jacoby, L. Jiang, R. Loynd, Milind Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, M. Plumpe, K. Stery, G. Venolia, Kuansan Wang, and Ye-Yi Wang, MIPAD: A Multimodal Interactive Prototype, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., Salt Lake City, Utah, USA, 2001
- Ye-Yi Wang, Robust Spoken Language Understanding in MiPad, in Eurospeech, International Speech Communication Association, Aalborg, Denmark, 2001
- B. Frey, T. Kristjansson, Li Deng, and Alex Acero, Learning dynamic noise models from noisy speech for robust speech recognition, in Advances in Neural Information Processing Systems (NIPS), Vol. 14, Vancouver, Canada, 2001, pp. 101-108, 2001
- Ye-Yi Wang and Alex Acero, Grammar Learning for Spoken Language Understanding, in IEEE Workshop on Automatic Speech Recognition and Understanding, Institute of Electrical and Electronics Engineers, Inc., Madonna di Campiglio, Italy, 2001
2000
- H. Attias, J. Platt, Alex Acero, and Li Deng, Speech Denoising and Dereverberation Using Probabilistic Models, in NIPS, November 2000
- Alex Acero, S. Altschuler, and L. Wu, Speech/Noise Separation Using Two Microphones and a VQ Model of Speech Signals, in Proc. Int. Conf. on Spoken Language Processing, October 2000
- Alex Acero, Li Deng, T. Kristjansson, and J. Zhang, HMM Adaptation Using Vector Taylor Series for Noisy Speech Recognition, in Proc. Int. Conf. on Spoken Language Processing, October 2000
- Li Deng, Alex Acero, M. Plumpe, and Xuedong Huang, Large-Vocabulary Speech Recognition under Adverse Acoustic Environments, in Proc. Int. Conf. on Spoken Language Processing, October 2000
- Ye-Yi Wang, Milind Mahajan, and Xuedong Huang, A unified context-free grammar and n-gram model for spoken language processing, in Proc. of Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., June 2000
- Xuedong Huang, Alex Acero, Ciprian Chelba, Li Deng, Doug Duchene, Joshua Goodman, Hsiao-Wuen Hon, Derek Jacoby, Li Jiang, Ricky Loynd, Milind Mahajan, Peter Mau, Scott Meredith, Salman Mughal, Salvado Neto, Mike Plumpe, Kuansan Wang, and Ye-Yi Wang, MiPad: A Next Generation PDA Prototype, in International Conference on Spoken Language Processing, International Speech Communication Association, Beijing, China, 2000
1999
- Matthew Richardson, Mei-Yuh Hwang, Alex Acero, and Xuedong Huang, Improvements on Speech Recognition for Fast Talkers, in Proc. of the Eurospeech Conference, September 1999
- Alex Acero, Formant Analysis and Synthesis using Hidden Markov Models, in Proc. of the Eurospeech Conference, September 1999
- Ye-Yi Wang, A Robust Parser for Spoken Language Understanding, in Eurospeech, International Speech Communication Association, Budapest, Hungary, 1999
1998
- Alex Acero, A Mixed-Excitation Frequency Domain Model for Time-Scale Pitch-Scale Modification of Speech, in Proc. of the Int. Conf. on Spoken Language Processing, December 1998
- Hsiao-Wuen Hon, Alex Acero, Xuedong Huang, J. Liu, and M. Plumpe, Automatic Generation of Synthesis Units for Trainable Text-to-Speech Systems, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, December 1998
- M. Plumpe, Alex Acero, Hsiao-Wuen Hon, and Xuedong Huang, HMM-Based Smoothing for Concatenative Speech Synthesis, in Proc. of the Int. Conf. on Spoken Language Processing, December 1998
- Jasha Droppo and Alex Acero, Maximum a Posteriori Pitch Tracking, in Proc. International Conference on Spoken Language Processing, International Speech Communication Association, Sydney, Australia, December 1998
- Alex Acero, Source-Filter Models for Time-Scale Pitch-Scale Modification of Speech, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, May 1998
- Mei-Yuh Hwang and Xuedong Huang, Dynamically configurable acoustic models for speech recognition, in Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., May 1998
- Hsiao-Wuen Hon, Yun-Cheng Ju, and Keiko Otani, Japanese Large-Vocabulary Continuous Speech Recognition System Based on Microsoft Whisper, International Speech Communication Association, 1998
1997
- X. D. Huang, Alex Acero, Hsiao-Wuen Hon, Yun-Cheng Ju, J. Liu, S. Meredith, and M. Plumpe, Recent Improvements on Microsoft's Trainable Text-to-Speech System: Whistler, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., April 1997
1996
- Xuedong Huang, Alex Acero, J. Adcock, J. Goldsmith, and J. Liu, Whistler: A Trainable Text-to-Speech System, in Proc. of the Int. Conf. on Spoken Language Processing, International Speech Communication Association, October 1996
- Alex Acero and Xuedong Huang, Speaker and Gender Normalization for Continuous-Density Hidden Markov Models, in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, IEEE, May 1996
1995
- Alex Acero and Xuedong Huang, Augmented Cepstral Normalization for Robust Speech Recognition, in Proc. of the IEEE Workshop on Automatic Speech Recognition, December 1995
- Xuedong Huang, Alex Acero, Fil Alleva, Mei-Yuh Hwang, Li Jiang, and Milind Mahajan, Microsoft Windows Highly Intelligent Speech Recognizer: Whisper, in Proc. of the International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., May 1995
Tech Reports
- Jason D. Williams, Alan Black, Deepak Ramachandran, and Antoine Raux, Dialog State Tracking Challenge: information for prospective participants, 5 December 2012
- Tomas Mikolov and Geoffrey Zweig, Context Dependent Recurrent Neural Network Language Model, no. MSR-TR-2012-92, July 2012
- Geoffrey Zweig and Christopher J.C. Burges, The Microsoft Research Sentence Completion Challenge, no. MSR-TR-2011-129, December 2011
- Geoffrey Zweig, Patrick Nguyen, Dirk Van Compernolle, Kris Demuynck, Les Atlas, Pascal Clark, Greg Sell, Fei Sha, Meihong Wang, Aren Jansen, Hynek Hermansky, Damianos Karakos, Samuel Thomas, G.S.V.S. Sivaram, Keith Kintzley, Sam Bowman, and Justine Kao, Speech Recognition with Segmental Conditional Random Fields: Final Report from the 2010 JHU Summer Workshop, no. MSR-TR-2010-173, November 2010
- Patrick Nguyen and Geoffrey Zweig, Extensions to the SCARF framework, no. MSR-TR-2010-129, 22 September 2010
- Daniel Povey, Subspace Gaussian Mixture Models for Speech Recognition, no. MSR-TR-2009-64, 27 May 2009
- Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, and Alex Acero, Speech Modeling with Magnitude-Normalized Complex Spectra and its Application to Multisensory Speech Enhancement, no. MSR-TR-2005-126, September 2005
- Ya Chang, Ross Cutler, Zicheng Liu, Zhengyou Zhang, Alex Acero, and Matthew Turk, Automatic Head-Size Equalization in Panorama Images for Video Conferencing, no. MSR-TR-2005-48, May 2005
- Ciprian Chelba and Alex Acero, Conditional Maximum Likelihood Estimation of Naive Bayes Probability Models Using Rational Function Growth Transform, no. MSR-TR-2004-33, April 2004
