Cha Zhang, Dinei Florencio, and Zhengyou Zhang
30 March 2008
Among the many existing time difference of arrival (TDOA) based sound source localization (SSL) algorithms, the Phase Transform (PHAT) is extremely popular for its excellent performance in low-noise environments, even under relatively heavy reverberation. However, PHAT was developed as a heuristic approach, and its working principle has not been completely understood. In this paper, we present the relationship between PHAT and a maximum likelihood (ML) framework for multi-microphone sound source localization. We show that when the environmental noise approaches zero, PHAT is a special case of the ML algorithm, which explains its good performance in low-noise environments. In addition, we show that as long as the noise stays low, PHAT remains optimal in the ML sense even when the room reverberation is heavy, which explains its robustness to reverberation.
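For readers unfamiliar with PHAT, the following is a minimal sketch of its common two-microphone form, generalized cross-correlation with phase transform (GCC-PHAT). It is not the multi-microphone ML algorithm analyzed in the paper. The function name `gcc_phat`, its parameters, the small regularization constant, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay (in seconds) of x1 relative to x2 with GCC-PHAT.

    Hypothetical helper for illustration; a positive result means x1 lags x2.
    """
    n = len(x1) + len(x2)              # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)           # cross-power spectrum
    # PHAT weighting: discard magnitude, keep only phase.
    # The 1e-12 floor is an assumed guard against division by zero.
    cross = cross / np.maximum(np.abs(cross), 1e-12)
    cc = np.fft.irfft(cross, n=n)      # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs

# Synthetic, noise-free example: x1 is x2 delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delay_samples = 5
x2 = s
x1 = np.concatenate((np.zeros(delay_samples), s[:-delay_samples]))
tau = gcc_phat(x1, x2, fs)
```

Because the weighting keeps only the phase of the cross-power spectrum, every frequency bin contributes equally to the correlation peak, which is the property whose optimality at low noise the abstract refers to.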
In IEEE International Conference on Acoustics, Speech and Signal Processing
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. http://www.ieee.org/