This paper proposes a visuo-auditory substitution method to assist visually impaired people in scene understanding. Our approach focuses on person localisation in the user's vicinity in order to ease urban walking. Since real-time, low-latency processing is required in this context for the user's safety, we propose an embedded system. The processing is based on a lightweight convolutional neural network that performs efficient 2D person localisation. This measurement is enhanced with the corresponding person's depth information and is then transcribed into a stereophonic signal via a head-related transfer function. A GPU-based implementation is presented that reaches real-time processing at 23 frames/s on a 640×480 video stream. We show experimentally that this method allows for accurate real-time audio-based localisation.
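The pipeline described above (2D person position plus depth, transcribed into a spatialised stereo signal) could be sketched roughly as follows. A true HRTF convolves the source signal with measured left/right filter pairs; this sketch replaces that with simplified interaural time and level cues plus inverse-distance attenuation. All function names, the linear pixel-to-azimuth mapping, and the parameter values are illustrative assumptions, not the authors' implementation.

```python
import math

def spatial_cues(x_px, depth_m, img_width=640, max_itd_s=660e-6):
    """Map a detected person's horizontal pixel position and depth to
    simple interaural cues (a crude stand-in for a full HRTF).

    Returns (itd_seconds, left_gain, right_gain).
    """
    # Azimuth: map pixel column to [-90, +90] degrees
    # (assumed linear mapping across the image width).
    azimuth_deg = ((x_px / img_width) - 0.5) * 180.0
    theta = math.radians(azimuth_deg)

    # Interaural time difference: sine-law approximation,
    # scaled by a typical maximum ITD of ~660 microseconds.
    itd = max_itd_s * math.sin(theta)

    # Interaural level difference via constant-power panning:
    # pan = 0 is full left, pan = 1 is full right.
    pan = (math.sin(theta) + 1.0) / 2.0
    gain_l = math.cos(pan * math.pi / 2.0)
    gain_r = math.sin(pan * math.pi / 2.0)

    # Distance cue: inverse-distance attenuation, clamped below 1 m.
    atten = 1.0 / max(depth_m, 1.0)
    return itd, gain_l * atten, gain_r * atten
```

For example, a person detected at the image centre yields a zero ITD and balanced channel gains, while a detection at the right edge shifts both the delay and the level balance toward the right ear; depth only scales overall loudness.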
Publication
Year of publication: 2022
Type:
Conference proceedings
Authors:
Scalvini, F.
Bordeau, C.
Ambard, M.
Migniot, C.
& Dubois, J.
Collection title:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords:
auditory sensory substitution, people detection, wearable assistive device, real-time processing