Mark R. P. Thomas, Jens Ahrens, and Ivan Tashev
4 September 2012
The design of time-invariant beamformers is often posed as an optimization problem using practical design constraints. In many scenarios it is sufficient to assume that the microphones have an omnidirectional directivity pattern, a flat frequency response in the range of interest, and a 2D environment in which wavefronts propagate as a function of azimuth angle only. In this paper we consider a generalized solution for those cases in which one or more of these assumptions do not hold, yielding a beamformer that is optimized on measured directivity patterns as a function of azimuth, elevation and frequency. A comparative study is made with the 4-element cardioid microphone array employed in Microsoft Kinect, whose beamformer weights are calculated with directivity patterns using (a) 2D cardioid models, (b) 3D cardioid models and (c) 3D measurements. Results on a recorded noisy speech corpus show similar PESQ and speech recognition accuracy comparing (a) and (b), but a 50% relative improvement in word error rate using measured directivity patterns.
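To make the distinction between the 2D and 3D cardioid models concrete, the following is a minimal sketch assuming an idealized first-order cardioid with gain 0.5 + 0.5 cos θ, where θ is the angle between the source direction and the microphone axis. The abstract does not give the models actually used in the paper, and the function names `unit_vector`, `cardioid_2d`, and `cardioid_3d` are hypothetical, chosen only for illustration.

```python
import numpy as np

def unit_vector(azimuth, elevation):
    """Unit direction vector(s) for azimuth/elevation given in radians."""
    az, el = np.broadcast_arrays(azimuth, elevation)
    return np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)], axis=-1)

def cardioid_2d(azimuth, mic_azimuth=0.0):
    """Idealized cardioid gain when wavefronts depend on azimuth only."""
    return 0.5 + 0.5 * np.cos(np.asarray(azimuth) - mic_azimuth)

def cardioid_3d(azimuth, elevation, mic_azimuth=0.0, mic_elevation=0.0):
    """Idealized cardioid gain as a function of azimuth and elevation."""
    # cos(theta) is the dot product of the source and microphone-axis directions.
    cos_angle = unit_vector(azimuth, elevation) @ unit_vector(mic_azimuth, mic_elevation)
    return 0.5 + 0.5 * cos_angle

# Compare the two models for a microphone pointing along azimuth 0, elevation 0.
azimuths = np.linspace(-np.pi, np.pi, 37)
gain_2d = cardioid_2d(azimuths)
gain_3d_horizontal = cardioid_3d(azimuths, 0.0)
gain_3d_elevated = cardioid_3d(azimuths, np.deg2rad(60.0))
```

In this sketch the two models agree in the horizontal plane and diverge as the source elevation grows: for an on-axis source at 60° elevation the 3D gain is 0.75 rather than 1. A measured directivity pattern would replace these closed-form gains with per-frequency data sampled over azimuth and elevation.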
Publisher: Proc. Intl. Workshop Acoust. Signal Enhancement (IWAENC)