Vib2Sound: Separation of Multimodal Sound Sources

Authors

Akahoshi, M.; Wang, Y.; Cheng, L.; Zai, A. T.; Hahnloser, R. R. H.

Abstract

Understanding animal social behaviors, including vocal communication, requires longitudinal observation of interacting individuals. However, isolating individual vocalizations in complex settings is challenging due to background noise and the frequent overlap of signals from multiple simultaneous vocalizers. A promising solution lies in multimodal recordings that combine traditional microphones with animal-borne sensors, such as accelerometers and directional microphones. These sensors, however, are constrained by strict limits on weight, size, and power consumption, and often yield noisy or unstable signals. In this work, we introduce a neural network-based system for sound source separation that leverages multi-channel microphone recordings and body-mounted accelerometer signals. Using a multimodal dataset of zebra finches recorded in a social setting, we demonstrate that our method outperforms conventional microphone array-based systems. By enabling the separation of overlapping vocalizations, our approach offers a valuable tool for studying animal communication in complex naturalistic environments.
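The abstract does not specify the network architecture, but a minimal sketch can illustrate the general idea of accelerometer-conditioned, mask-based separation. In the PyTorch example below, everything is an illustrative assumption rather than the paper's model: the class name AccelConditionedSeparator, the GRU encoders, the concatenation fusion, and all hyperparameters (n_mics, n_freq, hidden) are hypothetical.

```python
# A minimal sketch of accelerometer-conditioned mask-based separation.
# This is NOT the paper's architecture: all names, layer choices, and
# sizes below are illustrative assumptions.
import torch
import torch.nn as nn


class AccelConditionedSeparator(nn.Module):
    """Predict a time-frequency mask for one target bird from multi-channel
    microphone spectrograms, conditioned on that bird's accelerometer."""

    def __init__(self, n_mics: int = 4, n_freq: int = 257, hidden: int = 128):
        super().__init__()
        # Encode magnitude spectrograms of all microphone channels jointly.
        self.mic_enc = nn.GRU(
            input_size=n_mics * n_freq, hidden_size=hidden, batch_first=True
        )
        # Encode the 3-axis body-mounted accelerometer, framed to match the
        # spectrogram hop so both streams share the same time axis.
        self.acc_enc = nn.GRU(input_size=3, hidden_size=hidden, batch_first=True)
        # Fuse the two streams and predict a per-frame soft mask.
        self.mask_head = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_freq),
            nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, mic_spec: torch.Tensor, accel: torch.Tensor) -> torch.Tensor:
        # mic_spec: (batch, time, n_mics * n_freq) magnitude spectrograms
        # accel:    (batch, time, 3) per-frame accelerometer features
        h_mic, _ = self.mic_enc(mic_spec)
        h_acc, _ = self.acc_enc(accel)
        fused = torch.cat([h_mic, h_acc], dim=-1)
        return self.mask_head(fused)  # (batch, time, n_freq)


# Toy usage: a batch of 2-second clips at ~100 spectrogram frames/s.
model = AccelConditionedSeparator()
mic = torch.randn(8, 200, 4 * 257)
acc = torch.randn(8, 200, 3)
mask = model(mic, acc)
print(mask.shape)  # torch.Size([8, 200, 257])
# The target bird's spectrogram estimate is mask * reference_channel_spec,
# inverted back to a waveform with the mixture phase (e.g. torch.istft).
```

The intuition this sketch captures is why the multimodal setup helps: when two birds vocalize at once, a microphone array alone cannot tell which overlapping component belongs to which individual, whereas each bird's accelerometer picks up only its wearer's vocal vibrations and can therefore cue the network toward the correct source.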
