Enhanced Binaural Loudspeaker Audio System with Room Modeling

Myung-Suk Song, Cha Zhang, Dinei Florencio, and Hong-Goo Kang


For many years, spatial (3D) sound using headphones has been widely used in a number of applications. A rich spatial sensation is obtained by using head related transfer functions (HRTFs) and playing the appropriate sound through headphones. In theory, loudspeaker audio systems would be capable of rendering 3D sound fields almost as rich as headphones, as long as the room impulse responses (RIRs) between the loudspeakers and the ears are known. In practice, however, obtaining these RIRs is hard, and the performance of loudspeaker-based systems is far from perfect. New hope has recently been raised by a system that tracks the user's head position and orientation, and incorporates them into the RIR estimates in real time. That system made two simplifying assumptions: it used generic HRTFs, and it ignored room reverberation. In this paper we tackle the second problem: we incorporate a room reverberation estimate into the RIRs. Note that this is a nontrivial task: RIRs vary significantly with the listener's position, and even if one could measure them at a few points, they are notoriously hard to interpolate. Instead, we take an indirect approach: we model the room, and from that model we obtain an estimate of the main reflections. The positions and characteristics of the walls do not vary with the user's movement, yet they allow us to quickly compute an estimate of the RIR for each new user position. Of course, the key question is whether the estimates are good enough. We show an improvement in localization perception of up to 32% (i.e., reducing the average error from 23.5° to 15.9°).
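The room-modeling idea described above can be illustrated with a first-order image-source sketch: mirroring the source across each wall of a shoebox room yields the main early reflections, which can be placed into an impulse-train RIR for any listener position without re-measurement. This is a minimal sketch of the general technique only; the function name, parameters, and single-bounce simplification are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def image_source_rir(room_dim, src, mic, fs=16000, c=343.0,
                     absorption=0.3, rir_len=2048):
    """First-order image-source sketch: direct path plus one reflection
    per wall of a shoebox room with corner at the origin.
    Returns an impulse-train RIR of length `rir_len` samples."""
    room_dim = np.asarray(room_dim, float)
    src = np.asarray(src, float)
    mic = np.asarray(mic, float)

    # Direct source plus 6 first-order images (one per wall).
    images, gains = [src], [1.0]
    for axis in range(3):
        for wall in (0.0, room_dim[axis]):
            img = src.copy()
            img[axis] = 2.0 * wall - src[axis]   # mirror across the wall
            images.append(img)
            gains.append(1.0 - absorption)       # one bounce loses energy

    rir = np.zeros(rir_len)
    for img, g in zip(images, gains):
        d = np.linalg.norm(img - mic)
        n = int(round(d / c * fs))               # propagation delay in samples
        if n < rir_len:
            rir[n] += g / max(d, 1e-6)           # 1/r spreading loss
    return rir
```

Because only the source, listener, and wall geometry enter the computation, a new user position tracked by the system simply means re-running this cheap calculation with an updated `mic` argument.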


Publication type: Inproceedings
Published in: MMSP