Cha Zhang, Zhengyou Zhang, and Dinei Florencio
This paper presents a maximum likelihood (ML) framework for multimicrophone sound source localization (SSL). Besides deriving the framework, we connect and contrast the ML-based algorithm with popular steered response power (SRP) SSL algorithms such as SRP with phase transform (SRP-PHAT). We also show how challenging conditions such as directional microphone arrays and reverberation can be handled within the ML framework. The computational cost of our method is low, comparable to that of SRP-PHAT. The effectiveness of the proposed method is demonstrated on a large dataset of 99 real-world audio sequences recorded by directional circular microphone arrays in over 50 different meeting rooms.
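
As background on the SRP-PHAT baseline named above, the following is a minimal sketch of the phase transform applied to a single microphone pair (commonly called GCC-PHAT). It is not the ML method of this paper, nor a full SRP-PHAT implementation, which would combine such PHAT-weighted correlations over all microphone pairs and candidate source locations. The function name `gcc_phat_delay`, the `max_lag` parameter, and the synthetic white-noise signals are assumptions made purely for illustration.

```python
import numpy as np


def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the delay of x2 relative to x1, in samples.

    Illustrative two-microphone sketch: the cross-spectrum is normalized
    to unit magnitude (the phase transform) before being transformed back
    to the lag domain, so only phase information drives the peak.
    """
    n = len(x1) + len(x2)            # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting; epsilon guards against 0/0
    cc = np.fft.irfft(cross, n=n)    # PHAT-weighted cross-correlation
    # Reorder so the array covers lags -max_lag, ..., 0, ..., +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


# Synthetic check (assumed setup): microphone 2 hears the same white-noise
# source as microphone 1, but 5 samples later.
rng = np.random.default_rng(0)
source = rng.standard_normal(1029)
true_delay = 5
x1 = source[true_delay:]             # earlier arrival
x2 = source[:-true_delay]            # later arrival
estimated = gcc_phat_delay(x1, x2, max_lag=20)
```

In this noise-free, anechoic toy setting the correlation peak falls at the true delay; real meeting-room recordings with reverberation and directional microphones are the harder conditions that the ML framework targets.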