Author(s)
Guastamacchia Angela 1, Puglisi Giuseppina 2, Bottega Andrea 1, Shtrepi Louena 1, Riente Fabrizio 3, Astolfi Arianna 1
Affiliation
1 Department of Energy (DENERG), Politecnico di Torino
2 Campus Management, Logistics and Sustainability, Politecnico di Torino
3 Department of Electronics and Telecommunications, Politecnico di Torino
Publication date
2023
Abstract
Recent auditory research has stressed the importance of maximizing the ecological validity of the laboratory tests used to assess the degree of hearing loss, to improve hearing aids, and to perform their fitting. Accordingly, simulations and in-field recordings of spatialized audiovisual scenes that reproduce complex acoustic environments are now being integrated into these tests. However, because of the complex interplay between visual and auditory perception, capturing synchronized audiovisual scenarios through in-field recordings can be challenging. To obtain ecological, high-quality audio-video footage, audio-video coherence is essential, i.e., the origins of the visual and acoustical scenes must match. Yet most available recording devices embed high-resolution 360° stereoscopic cameras while supporting spatial audio recording only up to the 1st ambisonics order, which often requires placing two different devices on the same stand: one for video footage and one for higher-order ambisonics audio acquisition. This work proposes a method for choosing the best stand configuration, applied to a case study involving a Zylia ZM-1 3rd-order ambisonics microphone and Insta360 Pro or Insta360 ONE X2 cameras, based on the evaluation of (i) the influence of the camera on the captured sound field and of the microphone on the visual field, and (ii) the degree of quasi-coincidence between the ZM-1 and the camera.
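As an illustrative sketch only (not taken from the paper), the quasi-coincidence criterion in point (ii) could be quantified as the geometric offset between the microphone's acoustic centre and the camera's optical centre for each candidate stand configuration. All coordinates, configuration names, and the tolerance in the Python sketch below are hypothetical placeholders.

# Illustrative sketch: rank candidate stand configurations by the 3D offset
# between the microphone's acoustic centre and the camera's optical centre.
# Coordinates and tolerance are hypothetical, not values from the paper.
import math

def coincidence_offset(mic_centre, cam_centre):
    """Euclidean distance (m) between the two device centres."""
    return math.dist(mic_centre, cam_centre)

# Hypothetical stand configurations: (x, y, z) centres in metres.
configurations = {
    "camera_above_mic": ((0.0, 0.0, 1.60), (0.0, 0.0, 1.75)),
    "side_by_side":     ((-0.10, 0.0, 1.60), (0.10, 0.0, 1.60)),
}

tolerance_m = 0.20  # hypothetical acceptance threshold

for name, (mic, cam) in configurations.items():
    d = coincidence_offset(mic, cam)
    status = "within" if d <= tolerance_m else "exceeds"
    print(f"{name}: offset = {d:.3f} m ({status} tolerance)")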
Full paper
https://iris.polito.it/handle/11583/2980450
Keywords
audio-visual coherence, ecological validity, ambisonics recording, spatial hearing assessment, multimodal scene capture