Zicheng Liu, Michael Seltzer, Alex Acero, Ivan Tashev, Zhengyou Zhang, and Mike Sinclair
The need for hands-free communication has made headsets increasingly popular for use with mobile phones. Comfort and portability concerns have created demand for headsets with a small form factor. Unfortunately, this size constraint typically requires that the microphone be placed farther from the user’s mouth, making it highly susceptible to environmental noise. One long-term goal of our work is to develop a headset that achieves the sound capture performance of a close-talking microphone located at the user’s mouth while maintaining the desired compact size. Toward this end, we have designed a headset consisting of three air microphones and a bone-conductive sensor. Speech enhancement is performed in two stages: a fixed beamformer followed by a single-channel adaptive post-filter. Unlike other techniques, the beamformer is calibrated in a purely data-driven manner. The bone sensor provides a robust speech activity detector for use in the post-filtering stage. We present preliminary experimental results using real data collected in multiple environments. The proposed approach yields significant improvements in both speech recognition accuracy and SNR.
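The two-stage structure described above can be illustrated with a small sketch. This is not the authors' implementation: the beamformer here is a plain weighted sum with weights assumed to be already calibrated, the post-filter is a simple per-frame Wiener-style gain, and the speech activity detector is an energy threshold on the bone-sensor signal. The function names, frame length, threshold, smoothing factor, and gain floor are all hypothetical choices made for illustration.

```python
import numpy as np


def fixed_beamformer(mics, weights):
    """Combine the air-microphone channels with fixed weights.

    mics:    array of shape (n_channels, n_samples)
    weights: array of shape (n_channels,), assumed to be pre-calibrated
    """
    return weights @ mics


def frame_energy(x, frame_len):
    """Split x into non-overlapping frames and return (frames, mean energy)."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return frames, np.mean(frames ** 2, axis=1)


def post_filter(beam, bone, frame_len=256, vad_threshold=1e-3,
                alpha=0.9, gain_floor=0.1):
    """Single-channel adaptive post-filter driven by a bone-sensor detector.

    Frames whose bone-sensor energy is below vad_threshold are treated as
    non-speech, and only those frames update the running noise estimate.
    """
    frames, beam_energy = frame_energy(beam, frame_len)
    _, bone_energy = frame_energy(bone, frame_len)
    speech = bone_energy > vad_threshold

    # Assumption: the recording starts with noise, so the first frame
    # gives a usable initial noise-energy estimate.
    noise = beam_energy[0]
    out = np.empty_like(frames)
    for i in range(len(frames)):
        if not speech[i]:
            # Exponential smoothing of the noise energy during non-speech.
            noise = alpha * noise + (1.0 - alpha) * beam_energy[i]
        # Wiener-style gain, floored to avoid fully muting a frame.
        gain = max(1.0 - noise / max(beam_energy[i], 1e-12), gain_floor)
        out[i] = gain * frames[i]
    return out.reshape(-1), speech
```

In this sketch the bone sensor matters only for the decision of when to adapt the noise estimate. Because it is largely insensitive to airborne noise, a simple energy threshold on it can stand in for a more elaborate detector on the air-microphone signal.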
|Published in||Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics|
|Address||New Paltz, USA|