About AMHAT 2023
Hearing loss affects 1.5 billion people globally and is associated with poorer health and social
outcomes. Recent technological advances have enabled low-latency, high data-rate wireless
solutions for in-ear hearing assistive devices, which are reshaping the direction of innovation
in the hearing industry.
Nevertheless, even sophisticated commercial hearing aids and cochlear-implant devices are based
on audio-only processing, and remain ineffective in restoring speech intelligibility in
overwhelmingly noisy environments. Human performance in such situations is known to depend on
input from both the aural and visual senses, which the brain combines through sophisticated
multi-level integration strategies. Thanks to advances in miniaturized sensors and embedded
low-power technology, we now have the potential to monitor not only sound but also additional
modalities, such as visual cues, to improve speech intelligibility. Creating future transformative
multimodal hearing assistive technologies that draw on the cognitive principles of normal
(visually-assisted) hearing raises a range of formidable technical, privacy and usability
challenges that need to be overcome holistically.
The AMHAT Workshop aims to provide an interdisciplinary forum for the wider speech signal
processing, artificial intelligence, wireless sensing and communications, and hearing technology
communities to discuss the latest advances in this emerging field and to stimulate innovative
research directions, including future challenges and opportunities.
Workshop Chairs
Amir Hussain, Edinburgh Napier University, UK
Mathini Sellathurai, Heriot-Watt University, UK
Peter Bell, University of Edinburgh, UK
Katherine August, Stevens Institute of Technology, USA
Steering Committee Chairs
John Hansen, University of Texas at Dallas, USA
Naomi Harte, Trinity College Dublin, Ireland
Michael Akeroyd, University of Nottingham, UK
Scientific Committee Chair
Yu Tsao, Academia Sinica, Taiwan
Scientific Committee
Peter Derleth, Sonova, Switzerland
Ben Milner, University of East Anglia, UK
Jennifer Williams, University of Southampton, UK
Emanuel Habets, University of Erlangen-Nuremberg, Germany
Chi-Chun Lee, National Tsing Hua University, Taiwan
Hadi Larijani, Glasgow Caledonian University, UK
Erfan Loweimi, University of Cambridge and Edinburgh Napier University, UK
Raza Varzandeh, University of Oldenburg, Germany
Jesper Jensen, Aalborg University, Denmark
Yong Xu, Tencent America, USA
Dong Yu, Tencent AI Lab, China
Daniel Michelsanti, Aalborg University, Denmark
Volker Hohmann, University of Oldenburg, Germany
Marc Delcroix, NTT Communication Science Laboratories, Japan
Zheng-Hua Tan, Aalborg University, Denmark
Harish Chandra Dubey, Microsoft, USA
Simon Doclo, University of Oldenburg, Germany
Kia Dashtipour, Edinburgh Napier University, UK
Hsin-Min Wang, Academia Sinica, Taiwan
Mandar Gogate, Edinburgh Napier University, UK
Jun-Cheng Chen, Academia Sinica, Taiwan
Ahsan Adeel, University of Wolverhampton, UK
Alex Casson, University of Manchester, UK
Tharm Ratnarajah, University of Edinburgh, UK
Jen-Cheng Hou, Academia Sinica, Taiwan
Tughrul Arslan, University of Edinburgh, UK
Shinji Watanabe, Carnegie Mellon University, USA
Nima Mesgarani, Columbia University, USA
Qiang Huang, University of Sunderland, UK
Bernd T. Meyer, University of Oldenburg, Germany
Topics of interest
The Workshop invites authors to submit papers presenting novel research related to all aspects of multi-modal hearing assistive technologies, including, but not limited to, the following:
- Novel explainable and privacy-preserving machine learning and statistical model based approaches to multi-modal speech-in-noise processing
- End-to-end real-time, low-latency and energy-efficient audio-visual speech enhancement and separation methods
- Human auditory-inspired models of multi-modal speech perception and enhancement
- Internet of things (IoT), 5G/6G and wireless sensing enabled approaches to multi-modal hearing assistive technologies
- Multi-modal speech enhancement and separation in AR/VR environments
- Innovative binaural and multi-microphone approaches, including MEMS antenna integration and multi-modal beamforming
- Cloud, Edge and System-on-Chip based software and hardware implementations
- New multi-modal speech intelligibility models for normal and hearing-impaired listeners
- Audio-visual speech quality and intelligibility assessment and prediction techniques for multi-modal hearing assistive technologies
- Live demonstrators of multi-modal speech-enabled hearing assistive technology use cases (e.g. multi-modal cochlear implants, listening and communication devices)
- Accessibility and human-centric factors in the design and evaluation of multi-modal hearing assistive technology, including public perceptions, ethics, standards, societal, economic and political impacts
- Contextual (e.g. user preference and cognitive load-aware) multi-modal hearing assistive technologies
- Innovative applications of multi-modal hearing assistive technologies (e.g. diagnostics, therapeutics, human-robot interaction, sign-language recognition for aided communication)
Important Dates
Workshop Paper Submission Deadline: 23 March 2023 (extended from 24 February 2023)
Workshop Paper Acceptance Notification: 14 April 2023
Workshop Camera Ready Paper Deadline: 28 April 2023
All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").
Paper Submission Guidelines
The AMHAT workshop accepts both short (2-page) and long (4-page) paper submissions on the topics highlighted under Topics of interest. Papers may be no longer than 5 pages, including all text, figures, and references; the 5th page may contain only references. All submissions should follow the ICASSP-2023 paper style and format (https://2023.ieeeicassp.org/paper-submission-guidelines/). Only long papers will be published in IEEE Xplore.
Paper submission now open!
Note: The peer-reviewing process will follow the main conference reviewing guidelines
and all accepted Workshop papers will be published in the IEEE Xplore Digital Library.
Note that, like the main conference, the AMHAT Workshop is fostering a return to an
in-person attendance experience. Accordingly, at least one author of each accepted
workshop paper must present it in person.
Program
Date and time: Saturday, 10 June 2023, 13:45-17:45
Venue: Salon Des Roses A
13:45-14:00 Welcome by Workshop Chairs
14:00-14:30 Keynote talk: Professor Yu Tsao, Academia Sinica, Taiwan
14:30-14:45 Live showcase Demo
14:45-15:30 Poster Session 1 - Poster Area WP-D (set up from 13:00-13:30; take down promptly at the start of the coffee break)
Paper ID 6970: Lightweight VisualVoice: Neural Network Quantization on Audio-Visual Speech Separation
Paper ID 7210: Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids
Paper ID 6949: A Vision-Assisted Hearing Aid System Based on Deep Learning
Paper ID 7057: Requirements for Mass Adoption of Assistive Listening Technology by the General Public
Paper ID 7192: Frequency-Domain Functional Links for Nonlinear Feedback Cancellation in Hearing Aids
15:30-15:45 Coffee & Networking Break (Poster Session 1/2 swap)
15:45-16:30 Poster Session 2 - Poster Area WP-D (set up during coffee break, take down at end of AMHAT) [Note: 15:45-16:00 will be during the remainder of the coffee break]
Paper ID 7194: Towards an FPGA implementation of an IoT-based Multimodal Hearing aid System
Paper ID 7199: Towards Individualised Speech Enhancement: An SNR Preference Learning System for Multi-modal Hearing Aids
Paper ID 7207: Socio-Technical Trust for Multi-Modal Hearing Assistive Technology
Paper ID 7213: Two-point neurons for efficient multimodal speech enhancement
Paper ID 6837: Audio-visual Speech Enhancement and Separation by utilising Multi-modal Self-supervised Embeddings
16:30-17:00 Keynote Talk: Dr Peter Derleth, Sonova AG, Switzerland
17:00-17:10 Introduction to AVSEC-2: the second international Audio-Visual Speech Enhancement Challenge
17:10-17:45 Open Panel Discussion
17:45 Close
Keynote speakers
Prof Yu Tsao, Academia Sinica, Taiwan
Towards Audio-visual Speech Enhancement in Real-World Scenarios
We propose a novel audio-visual speech enhancement (AVSE) algorithm, iLAVSE, for real-world scenarios. Compared to conventional AVSE systems, iLAVSE overcomes three common issues that can occur in real-world environments: the additional cost of processing visual data, audio-visual asynchronization, and low-quality visual data. To evaluate iLAVSE, we use a multimodal Taiwan-Mandarin speech-with-video dataset and compare iLAVSE with conventional AVSE systems. The results demonstrate that iLAVSE can effectively address the aforementioned issues and improve speech enhancement performance, making it suitable for real-world applications.
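To make the asynchronization issue mentioned above concrete, the short Python sketch below shows one generic way to time-align visual features (e.g. lip embeddings extracted at 25 fps) with audio spectrogram frames before fusing the two streams. This is an illustrative sketch only, not the iLAVSE architecture: the function name, frame rates, feature dimensions and the simple nearest-frame alignment are assumptions made for the example.

import numpy as np

def align_visual_to_audio(visual_feats, video_fps, num_audio_frames, hop_s):
    """Nearest-frame alignment of visual features (shape: T_video x D) to a
    sequence of audio spectrogram frames spaced hop_s seconds apart.
    Illustrative only; real systems may interpolate or learn the alignment."""
    audio_times = np.arange(num_audio_frames) * hop_s          # audio frame centres (s)
    video_idx = np.round(audio_times * video_fps).astype(int)  # closest video frame
    video_idx = np.clip(video_idx, 0, len(visual_feats) - 1)
    return visual_feats[video_idx]                             # shape: T_audio x D

# Toy example: a 2-second clip with 25 fps lip embeddings and a 10 ms STFT hop.
visual = np.random.randn(50, 64)    # 50 video frames, 64-dim embeddings (hypothetical)
audio = np.random.randn(200, 257)   # 200 audio frames, 257-bin magnitude spectra
aligned = align_visual_to_audio(visual, video_fps=25, num_audio_frames=200, hop_s=0.01)
fused = np.concatenate([audio, aligned], axis=1)   # simple early (concatenation) fusion
print(fused.shape)                                 # (200, 321)

In practice the fused features would feed an enhancement network that predicts, for example, a time-frequency mask; that part is omitted here.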
Dr Peter Derleth, Sonova AG
Technological and commercial aspects of assistive hearing solutions
Assistive hearing solutions come in a variety of form factors, are designed to serve various use cases, are targeted at different user groups, and are distributed to the market as consumer or medical products. Each of these aspects influences whether a technological or functional innovation reaches the respective market segment and gets the chance to improve the daily life of human listeners. The presentation will shed light on existing and near-future hearing aid technology and share anecdotal insights into alternative technical solutions that did not achieve a high market impact despite being technically advanced.