C. Evers , A. H. Moore, and P. A. Naylor
Acoustic scene mapping creates a representation of positions of audio sources such as talkers within the surrounding environment of a microphone array. By allowing the array to move, the acoustic scene can be explored in order to improve the map. Furthermore, the spatial diversity of the kinematic array allows for estimation of the source-sensor distance in scenarios where source directions of arrival are measured. As sound source localization is performed relative to the array position, mapping of acoustic sources requires knowledge of the absolute position of the microphone array in the room. If the array is moving, its absolute position is unknown in practice. Hence, Simultaneous Localization and Mapping (SLAM) is required in order to localize the microphone array position and map the surrounding sound sources. In realistic environments, microphone arrays receive a convolutive mixture of direct-path speech signals, noise and reflections due to reverberation. A key challenge of Acoustic SLAM (a-SLAM) is robustness against reverberant clutter measurements and missing source detections. This paper proposes a novel bearing-only a-SLAM approach using a Single-Cluster Probability Hypothesis Density filter. Results demonstrate convergence to accurate estimates of the array trajectory and source positions.