Particle Filter-based Acoustic Source Localisation algorithms track, online and in real time, the position of a sound source (typically a person speaking in a room) using the current data from a microphone array together with all previous data up to that point.
The first section of this thesis reviews previous research in this field and discusses the suitability of particle filters for solving this problem. Experiments are then described that examine the typical performance and behaviour of various instantaneous localisation functions.
In subsequent sections, algorithms are detailed which advance the state of the art. First, an orientation estimation algorithm is introduced which uses speaker directivity to infer head pose. Second, an algorithm for multi-target acoustic source tracking is introduced, based upon the Track Before Detect (TBD) methodology. This methodology avoids the need to identify a set of source measurements and allows for a large saving in computational power.
Finally, this algorithm is extended to allow for an unknown and time-varying number of speakers. By leveraging the frequency content of speech, it is shown that regions of the surveillance space can be monitored for activity with only a minor increase in overall computation. A variable dimension particle filter is then outlined which proposes newly active targets, maintains target tracks, and removes targets when they become inactive.