Alex Acero
Research Area Manager
Speech Technology
Natural Language Processing
Communication and Collaboration Systems
Interactive Visual Media
- Speech Signal Processing: robustness to noise, microphone arrays.
- Speech Recognition: acoustic modeling, discriminative training, rapid adaptation.
- Spoken Language Systems: rapid prototyping of speech understanding systems, data mining of speech.
- Natural Language Processing: machine translation, statistical language processing.
Before joining Microsoft in 1994, I worked
in the speech groups of Apple Computer and
Telefonica Investigacion y Desarrollo. I received a Ph.D. from
Carnegie Mellon University in 1990, a Master's from
Rice University in 1987 and an engineering degree from the
Universidad Politecnica de Madrid in 1985, all in Electrical
Engineering. I'm also an Affiliate Professor of Electrical Engineering at the
University of Washington.
Alex was born in Madrid, Spain. He's married to Donna and is the proud father of
Nicolas and Marcos. When he's not chasing Nicolas or Marcos around, he likes to play the
piano, play soccer, and sip a good wine, though not at the same time ;-)
Alex is the author of the books:
and has written invited chapters in 3 edited books:
- X. Huang, A. Acero, F. Alleva, M. Hwang, L. Jiang and M. Mahajan. "From
Sphinx-II to Whisper: Making Speech Recognition Usable." Published in
Automatic Speech and Speaker Recognition, Advanced Topics (Kluwer
Academic Publishers, 1996) Edited by C. Lee, F. Soong and K. Paliwal.
Norwell, MA, 1996.
- R. Stern, A. Acero, F. Liu and Y. Ohshima. "Signal Processing for Robust Speech
Recognition." Published in Automatic
Speech and Speaker Recognition, Advanced Topics (Kluwer
Academic Publishers, 1996) Edited by C. Lee, F. Soong and K. Paliwal.
Norwell, MA, 1996.
- A. Acero.
"The Role of Phoneticians in Speech Technology".
Published in European Studies in Phonetics and Speech Communication,
OTS Publications. Edited by G. Bloothooft, V. Hazan, D. Huber and J. Llisterri. August 1995.
Alex holds 29 US patents and has over 140 publications.
Here is a list of publications on noise robust speech recognition:
- I. Tashev, J. Droppo, and A. Acero.
Suppression Rule for Speech Recognition Friendly Noise Suppressors,
in Proc. of the 2006 Int. Conference Digital Signal Processing and Applications (DSPA). Moscow, Russia, Mar. 2006.
- M. Seltzer and A. Acero.
An EM Algorithm for Training Wideband Acoustic Models from Mixed-Bandwidth Training Data,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Puerto Rico, Dec, 2005.
- L. Deng, J. Wu, J. Droppo, and A. Acero.
Analysis and Comparison of Two Speech Feature Extraction/Compensation Algorithms,
in IEEE Signal Processing Letters. Volume: 12 Issue: 6, Jun 2005. pp. 477-480.
- L. Deng, J. Droppo, and A. Acero.
Dynamic Compensation of HMM Variances Using the Feature Enhancement Uncertainty Computed From a Parametric Model of Speech Distortion,
in IEEE Trans. on Speech and Audio Processing. Volume: 13 Issue: 3, May 2005. pp. 412-421.
- M. Seltzer and A. Acero.
Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Philadelphia, Mar, 2005.
- Z. Zhang, Z. Liu, M. Sinclair, A. Acero, L. Deng, J. Droppo, X. Huang, and Yanli Zheng.
Multi-Sensory Microphones for Robust Speech Detection, Enhancement and Recognition
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Montreal, May, 2004.
- J. Droppo and A. Acero.
Noise Robust Speech Recognition with a Switching Linear Dynamic Model
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Montreal, May, 2004.
- L. Deng, J. Droppo, and A. Acero.
Estimating Cepstrum of Speech Under the Presence of Noise Using a Joint Prior of Static and Dynamic Features,
in IEEE Trans. on Speech and Audio Processing. Volume: 12 Issue: 3, May 2004. pp. 218-233.
- L. Deng, J. Droppo, and A. Acero.
Enhancement of log Mel Power Spectra of Speech using a Phase-Sensitive Model of the Acoustic Environment and Sequential
Estimation of the Corrupting Noise,
in IEEE Trans. on Speech and Audio Processing. Volume: 12 Issue: 2, Mar 2004. pp. 133-143.
- J. Wu, J. Droppo, L. Deng and A. Acero.
A Noise-Robust ASR Front-End Using Wiener Filter Constructed from MMSE Estimation of Clean Speech and Noise,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Virgin Islands, Dec, 2003.
- L. Deng, J. Droppo, and A. Acero.
Recursive Estimation of Nonstationary Noise using Iterative Stochastic Approximation for Robust Speech Recognition,
in IEEE Trans. on Speech and Audio Processing. Volume: 11 Issue: 6, Nov 2003. pp. 568-580.
- M. Seltzer, J. Droppo, and A. Acero.
A Harmonic-Model-Based Front End for Robust Speech Recognition,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- J. Droppo, L. Deng and A. Acero.
A Comparison of Three Non-Linear Observation Models for Noisy Speech Features,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- L. Deng, J. Droppo and A. Acero.
Incremental Bayes Learning with Prior Evolution for Tracking Non-Stationary Noise Statistics from Noisy Speech Data,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Hong Kong, April 2003.
- J. Droppo, L. Deng, and A. Acero.
Evaluation of SPLICE on the Aurora 2 and 3 Tasks,
in Proc. Int. Conf. on Spoken Language Processing. Denver, Colorado, Sep, 2002.
- J. Droppo, A. Acero, and L. Deng.
A Nonlinear Observation Model for Removing Noise from Corrupted Speech Log Mel-Spectral Energies,
in Proc. Int. Conf. on Spoken Language Processing. Denver, Colorado, Sep, 2002.
- L. Deng, J. Droppo, and A. Acero.
Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion,
in Proc. Int. Conf. on Spoken Language Processing. Denver, Colorado, Sep, 2002.
- L. Deng, J. Droppo, and A. Acero.
Log-Domain Speech Feature Enhancement Using Sequential MAP Noise Estimation and a Phase-sensitive Model of the Acoustic Environment,
in Proc. Int. Conf. on Spoken Language Processing. Denver, Colorado, Sep, 2002.
- Y. Xiang, Y. Hua, S. An, A. Acero.
Separating Colored Signals Distorted by Convolutive Channels Using Diagonal Constrained Decorrelation,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Orlando, Florida, May, 2002.
- J. Droppo, L. Deng and A. Acero.
Uncertainty Decoding with SPLICE for Noise Robust Speech Recognition,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Orlando, Florida, May, 2002.
- L. Deng, J. Droppo and A. Acero.
A Bayesian Approach to Speech Feature Enhancement using the Dynamic Cepstral Prior,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Orlando, Florida, May, 2002.
- L. Deng, J. Droppo and A. Acero.
Recursive Noise Estimation Using Iterative Stochastic Approximation for Stereo-based Robust Speech Recognition,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Madonna di Campiglio, Italy, Dec, 2001.
- J. Droppo, A. Acero and L. Deng.
Evaluation of the SPLICE Algorithm on the Aurora2 Database,
in Proc. of the Eurospeech Conference. Aalborg, Denmark, Sep, 2001.
- H. Attias, L. Deng, A. Acero and J. Platt.
A New Method for Speech Denoising and Robust Speech Recognition Using Probabilistic Models for Clean Speech and for Noise,
in Proc. of the Eurospeech Conference. Aalborg, Denmark, Sep, 2001.
- B. Frey, L. Deng, A. Acero and T. Kristjansson.
ALGONQUIN: Iterating Laplace's Method to Remove Multiple Types of Acoustic Distortion for Robust Speech Recognition,
in Proc. of the Eurospeech Conference. Aalborg, Denmark, Sep, 2001.
- L. Deng, A. Acero, L. Jiang, J. Droppo and X. Huang
High-Performance Robust Speech Recognition Using Stereo Training Data,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, Utah, May, 2001.
- J. Droppo, A. Acero and L. Deng.
Efficient Online Acoustic Environment Estimation for FCDCN in a Continuous Speech Recognition System,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, Utah, May, 2001.
- T. Kristjansson, B. Frey, L. Deng and A. Acero.
Towards Non-Stationary Model-Based Noise Adaptation for Large Vocabulary Speech Recognition
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, Utah, May, 2001.
- Y. Xiang, Y. Hua, S. An and A. Acero.
Experimental Investigation of Delayed Instantaneous Demixer for Speech Enhancement
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, Utah, May, 2001.
- H. Attias, J. Platt, A. Acero and L. Deng.
Speech Denoising and Dereverberation Using Probabilistic Models,
in NIPS, Denver, Nov. 2000.
- A. Acero, L. Deng, T. Kristjansson and J. Zhang.
HMM Adaptation Using Vector Taylor Series for Noisy Speech Recognition,
in Proc. Int. Conf. on Spoken Language Processing. Beijing, China, Oct, 2000.
- A. Acero, S. Altschuler and L. Wu.
Speech/Noise Separation Using Two Microphones and a VQ Model of Speech Signals,
in Proc. Int. Conf. on Spoken Language Processing. Beijing, China, Oct, 2000.
- L. Deng, A. Acero, M. Plumpe and X. Huang.
Large-Vocabulary Speech Recognition under Adverse Acoustic Environments,
in Proc. Int. Conf. on Spoken Language Processing. Beijing, China, Oct, 2000.
- A. Acero and X. Huang.
Augmented Cepstral Normalization for Robust Speech Recognition,
in Proc. of the IEEE Workshop on Automatic Speech Recognition. Snowbird, UT. Dec 1995.
and publications on acoustic modeling for speech recognition:
- D. Yu, L. Deng, and A. Acero.
A Lattice Search Technique for Long-contextual-span Hidden Trajectory Model of Speech,
in Speech Communication, Elsevier. Volume: 48 Issue: 9, Sep 2006. pp. 1214-1226.
- X. Li, L. Deng, D. Yu and A. Acero.
A Time-Synchronous Phonetic Decoder for a Long-Contextual-Span Hidden Trajectory Model,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- D. Yu, L. Deng, X. He and A. Acero.
Use of Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- L. Deng, D. Yu, and A. Acero.
Structured Speech Modeling,
in IEEE Trans. on Audio, Speech and Language Processing. Volume: 14 Issue: 5, Sep 2006. pp. 1492-1504.
- M. Mahajan, A. Gunawardana and A. Acero.
Training Algorithms for Hidden Conditional Random Fields
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Toulouse, May, 2006.
- J. Droppo and A. Acero.
Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Toulouse, May, 2006.
- L. Deng, A. Acero, and I. Bazzi.
Tracking Vocal Tract Resonances Using a Quantized Nonlinear Function Embedded in a Temporal Constraint,
in IEEE Trans. on Audio, Speech and Language Processing. Volume: 14 Issue: 2, Mar 2006. pp. 425-434.
- L. Deng, D. Yu, and A. Acero.
A Bidirectional Target Filtering Model of Speech Coarticulation: two-stage Implementation for Phonetic Recognition,
in IEEE Trans. on Audio, Speech and Language Processing. Volume: 14 Issue: 1, Jan 2006. pp. 256-265.
- L. Deng, D. Yu, X. Li, and A. Acero.
A Long-Contextual-Span Model of Resonance Dynamics for Speech Recognition: Parameter Learning and Recognizer Evaluation,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Puerto Rico, Dec, 2005.
- J. Droppo, M. Mahajan, A. Gunawardana, and A. Acero.
How to Train a Discriminative Front End with Stochastic Gradient Descent and Maximum Mutual Information,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Puerto Rico, Dec, 2005.
- J. Droppo and A. Acero.
Maximum Mutual Information SPLICE Transform for Seen and Unseen Conditions,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- A. Gunawardana, M. Mahajan, A. Acero, and J. Platt.
Hidden Conditional Random Fields for Phone Classification,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- L. Deng, D. Yu, and A. Acero.
Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation and Reduction,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- D. Yu, L. Deng, and A. Acero.
Evaluation of a Long-Contextual-Span Hidden Trajectory Model and Phonetic Recognizer Using A* Lattice Search,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- L. Deng, X. Li, D. Yu, and A. Acero.
A Hidden Trajectory Model with Bi-directional Target-Filtering: Cascaded vs. Integrated Implementation for Phonetic Recognition
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Philadelphia, Mar, 2005.
- L. Deng, X. Li, D. Yu, and A. Acero.
Novel Acoustic Modeling with Structured Hidden Dynamics for Speech Coarticulation and Reduction,
in Proc. of the DARPA RT04 Workshop. Palisades, New York, Nov 2004.
- L. Deng, D. Yu, and A. Acero.
A Quantitative Model for Formant Dynamics and Contextually Assimilated Reduction in Fluent Speech,
in Proc. Int. Conf. on Spoken Language Processing. Jeju, South Korea, Oct, 2004.
- L. Deng, L. Lee, H. Attias, and A. Acero.
A Structured Speech Model with Continuous Hidden Dynamics and Prediction-Residual Training for Tracking Vocal Tract Resonances
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Montreal, May, 2004.
- A. Gunawardana and A. Acero.
Adapting Acoustic Models to New Domains and Conditions Using Untranscribed Data,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- L. Deng, I. Bazzi and A. Acero.
Tracking Vocal Tract Resonances Using an Analytical Nonlinear Predictor and a Target-guided Temporal Constraint,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- Y. Deng, M. Mahajan and A. Acero.
Estimating Speech Recognition Error Rate without Acoustic Test Data,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- D. Yu, K. Wang, M. Mahajan, P. Mau and A. Acero.
Improved Name Recognition With User Modeling,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- M. Richardson, M. Hwang, A. Acero and X. Huang.
Improvements on Speech Recognition for Fast Talkers,
in Proc. of the Eurospeech Conference. Budapest, Sep 1999.
- A. Acero and X. Huang.
Speaker and Gender Normalization for Continuous-Density Hidden Markov Models, in
Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing.
Atlanta, GA. May 1996.
- X. Huang, A. Acero, F. Alleva, M. Y. Hwang, L. Jiang and M. Mahajan.
"Microsoft Windows Highly Intelligent Speech Recognizer: Whisper"
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Detroit, MI. May 1995.
and publications on statistical language modeling:
- D. Yu, M. Mahajan, P. Mau, and A. Acero.
Maximum Entropy Based Generic Filter for Language Model Adaptation
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Philadelphia, Mar, 2005.
- C. Chelba and A. Acero.
Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot,
in Proc. of EMNLP. Barcelona, Spain, Jul. 2004.
- C. Chelba and A. Acero.
Conditional ML Estimation Using Rational Function Growth Transform,
in Snowbird Learning Workshop. Utah, Apr. 2004.
- A. Acero, Y. Wang and K. Wang.
A Semantically Structured Language Model,
in Special Workshop in Maui (SWIM), Jan 2004.
- C. Chelba and A. Acero.
Discriminative Training of N-gram Classifiers for Speech and Text Routing,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- C. Chelba, M. Mahajan and A. Acero.
Speech Utterance Classification,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Hong Kong, Apr, 2003.
and publications on spoken language systems:
- D. Yu, Y. Ju, and A. Acero.
An Effective and Efficient Utterance Verification Technology Using Word N-gram Filler Models,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- Y. Ju, Y. Wang, and A. Acero.
Call Analysis with Classification Using Speech and Non-Speech Features,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- Y. Wang and A. Acero.
Discriminative Models for Spoken Language Understanding,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- Y. Wang, J. Lee and A. Acero.
Speech Utterance Classification Model Training without Manual Transcriptions
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Toulouse, May, 2006.
- D. Yu, Y. Ju, Y. Wang, and A. Acero.
N-Gram Based Filler Model for Robust Grammar Authoring
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Toulouse, May, 2006.
- J. Silva, C. Chelba and A. Acero.
Pruning Analysis for the Position Specific Posterior Lattices for Spoken Document Search
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Toulouse, May, 2006.
- A. Acero.
Building Voice User Interfaces,
in MSDN Magazine. Feb. 2006.
- Y. Wang, L. Deng, and A. Acero.
Spoken Language Understanding,
in IEEE Signal Processing Magazine. Volume: 22 Issue: 5, Sep. 2005, pp. 16-31.
- D. Yu and A. Acero.
Semiautomatic Improvements of System-Initiative Spoken Dialog Applications Using Interactive Clustering,
in IEEE Trans. on Speech and Audio Processing. Volume: 13 Issue: 5, Sep. 2005, pp. 661-671.
- C. Chelba and A. Acero.
Indexing Uncertainty for Spoken Document Search,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- Y. Wang and A. Acero.
SGStudio: Rapid Semantic Grammar Development for Spoken Language Understanding,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- C. Chelba and A. Acero.
SPEECH OGLE: Indexing Uncertainty for Spoken Document Search
in Proc. of the Association for Computational Linguistics. Ann Arbor, June, 2005.
- C. Chelba and A. Acero.
Position Specific Posterior Lattices for Indexing Speech
in Proc. of the Association for Computational Linguistics. Ann Arbor, June, 2005.
- X. Li, A. Gunawardana, and A. Acero.
Unsupervised Semantic Intent Discovery from Call Log Acoustics
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Philadelphia, Mar, 2005.
- D. Yu, M. Hwang, P. Mau, A. Acero and L. Deng.
Unsupervised Learning from Users’ Error Correction in Speech Dictation,
in Proc. Int. Conf. on Spoken Language Processing. Jeju, South Korea, Oct, 2004.
- L. Deng, Y. Wang, K. Wang, A. Acero, H. Hon, J. Droppo, C. Boulis, D. Jacoby, M. Mahajan, C. Chelba, and X. Huang.
Speech and Language Processing for Multimodal
Human-Computer Interaction (invited),
in Journal of VLSI Signal Processing Systems (Special issue on Real-World Speech Processing),
Vol. 36, No. 2, February 2004, pp. 161-187.
- Y. Wang, A. Acero, and C. Chelba.
Is Word Error Rate a Good Indicator for Spoken Language Understanding Accuracy,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Virgin Islands, Dec, 2003.
- L. Deng, K. Wang, A. Acero, H. Hon, J. Droppo, C. Boulis, Y. Wang, D. Jacoby, M. Mahajan, C. Chelba, and X. D. Huang.
Distributed Speech Processing in MiPad's Multimodal User Interface,
in IEEE Trans. on Speech and Audio Processing. Volume: 10 Issue: 8, Nov 2002, pp. 605-619.
- Y. Wang and A. Acero.
Combination of CFG and N-gram Modeling in Semantic Grammar Learning,
in Proc. of the Eurospeech Conference. Geneva, Switzerland, Sep, 2003.
- Y. Wang and A. Acero.
Concept Acquisition in Example Based Grammar Authoring,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Hong Kong, Apr, 2003.
- Y. Wang, A. Acero, C. Chelba, B. Frey, and L. Wong.
Combination of Statistical and Rule-based Approaches for Spoken Language Understanding,
in Proc. Int. Conf. on Spoken Language Processing. Denver, Colorado, Sep, 2002.
- Y. Wang, A. Acero.
Evaluation of Spoken Language Grammar Learning in the ATIS Domain,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Orlando, Florida, May, 2002.
- Y. Wang and A. Acero.
Grammar Learning for Spoken Language Understanding,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Madonna di Campiglio, Italy, Dec, 2001.
- X. Huang, A. Acero, C. Chelba, L. Deng, J. Droppo, D. Duchene, J. Goodman,
H. Hon, D. Jacoby, L. Jiang, R. Loynd, M. Mahajan, P. Mau, S. Meredith, S.
Mughal, S. Neto, M. Plumpe, K. Stery, G. Venolia, K. Wang, Y. Wang.
MIPAD: A Multimodal Interaction Prototype,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, Utah, May, 2001.
- X. Huang, A. Acero, C. Chelba, L. Deng, D. Duchene, J. Goodman, H. Hon, D.
Jacoby, L. Jiang, R. Loynd, M. Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto,
M. Plumpe, K. Wang, Y. Wang.
MIPAD: A Next Generation PDA Prototype,
in Proc. of the Int. Conf. on Spoken Language Processing. Beijing, China, Oct, 2000.
- Y. Rui, A. Gupta, and A. Acero.
Automatically Extracting Highlights for TV Baseball Programs,
in ACM Multimedia, pp. 105-115, 2000.
and publications on speech synthesis:
- A. Acero.
Formant Analysis and Synthesis using Hidden Markov Models,
Proc. of the Eurospeech Conference. Budapest, Sep 1999.
- A. Acero.
A Mixed-Excitation Frequency Domain Model for Time-Scale Pitch-Scale Modification of Speech,
in Proc. of the Int. Conf. on Spoken Language Processing. Sydney, Australia. Dec 1998.
- M. Plumpe, A. Acero, H. Hon and X. Huang.
HMM-Based Smoothing for Concatenative Speech Synthesis,
in Proc. of the Int. Conf. on Spoken Language Processing. Sydney, Australia. Dec 1998.
- H. Hon, A. Acero, X. Huang, J. Liu and M. Plumpe.
Automatic Generation of Synthesis Units for Trainable Text-to-Speech Systems,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Seattle, WA. May 1998.
- A. Acero.
Source-Filter Models for Time-Scale Pitch-Scale Modification of Speech,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Seattle, USA. May 1998.
- X. Huang, A. Acero, H. Hon, Y. Ju, J. Liu, S. Meredith, M. Plumpe.
Recent Improvements on Microsoft's Trainable Text-to-Speech System: Whistler,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Munich, Germany. Apr. 1997.
- X. Huang, A. Acero, J. Adcock, H. Hon, J. Goldsmith, and J. Liu.
Whistler: A Trainable Text-to-Speech System,
in Proc. of the Int. Conf. on Spoken Language Processing. Philadelphia, PA. October 1996.
and publications on speech analysis:
- I. Bazzi, L. Deng and A. Acero.
An Expectation Maximization Approach for Formant Tracking Using a Parameter-Free Nonlinear Predictor,
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Hong Kong, Apr, 2003.
- J. Droppo and A. Acero.
Maximum a Posteriori Pitch Tracking,
in Proc. of the Int. Conf. on Spoken Language Processing. Sydney, Australia. Dec 1998.
and publications on speech enhancement:
- I. Tashev and A. Acero.
Microphone Array Post-Processor Using Instantaneous Direction of Arrival,
in Int. Workshop on Acoustic, Echo and Noise Control (IWAENC). Paris, France, Sep, 2006.
- A. Subramanya, M. Seltzer, and A. Acero.
Automatic Removal of Typed Keystrokes from Speech Signals,
in Proc. of the Interspeech Conference. Pittsburgh, Sep, 2006.
- Z. Liu, M. Seltzer, A. Acero, I. Tashev, Z. Zhang, and M. Sinclair.
A Compact Multi-Sensor Headset for Hands-Free Communication,
in Proc. of the Workshop on Applications of Signal Processing to Audio and Acoustics. New Paltz, NY, USA, Oct. 2005.
- A. Subramanya, Z. Zhang, Z. Liu, J. Droppo, and A. Acero.
A Graphical Model for Multi-Sensory Speech Processing in Air-and-Bone Conductive Microphones,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- M. Seltzer, A. Acero, and J. Droppo.
Robust Bandwidth Extension of Noise-corrupted Narrowband Speech,
in Proc. of the Interspeech Conference. Lisbon, Portugal, Sep, 2005.
- I. Tashev, M. Seltzer, and A. Acero.
Microphone Array for Headset with Spatial Noise Suppressor,
in Proc. of the Ninth Int. Workshop on Acoustic, Echo and Noise Control (IWAENC). Eindhoven, The Netherlands, Sep. 2005.
- Z. Liu, A. Subramanya, Z. Zhang, J. Droppo, and A. Acero.
Leakage Model and Teeth Clack Removal for Air- and Bone-Conductive Integrated Microphones
in Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing. Philadelphia, Mar, 2005.
- Z. Liu, Z. Zhang, A. Acero, J. Droppo and X. Huang.
Direct Filtering for Air- and Bone-Conductive Microphones,
in Proc. IEEE Int. Workshop on Multimedia Signal Processing. Siena, Italy. Sep, 2004.
- L. Deng, Z. Liu, Z. Zhang, and A. Acero.
Nonlinear Information Fusion in Multi-Sensor Processing - Extracting and Exploiting Hidden Dynamics of Speech Captured by a Bone-Conductive Microphone,
in Proc. IEEE Int. Workshop on Multimedia Signal Processing. Siena, Italy. Sep, 2004.
- Y. Zheng, Z. Liu, Z. Zhang, M. Sinclair, J. Droppo, L. Deng, A. Acero and X. Huang.
Air and Bone-Conductive Integrated Microphones for Robust Speech Detection and Enhancement,
in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding. Virgin Islands, Dec, 2003.
Here is a list of articles about us:
- T. Bishop.
Show and tell at Microsoft's annual research fest
(Seattle PI, 2004).
- D. Barker.
Microsoft Research
Spawns a New Era in Speech Technology
(PC AI Magazine, 2003).
- M. Kanellos.
Talking Computers Nearing Reality
(CNET News.com, 2003).
- M. Brooks.
No one understands me as well as my PC
(New Scientist, 2003).
Last updated: Oct. 30, 2006
E-mail: alexac at microsoft dot com
U.S. Mail: Microsoft Corporation, One Microsoft Way, Redmond WA,
98052-6399, USA
Tel: (425) 706-1597
Fax: (425) 706-7329 (This is the main MS fax number, so make sure to
address documents to my attention)