About AMHAT 2023


    Hearing loss affects 1.5 billion people globally and is associated with poorer health and social outcomes. Recent technological advances have enabled low-latency, high data-rate wireless solutions for in-ear hearing assistive devices, which have largely reshaped the innovation direction of the hearing industry.

    Nevertheless, even sophisticated commercial hearing aids and cochlear-implant devices rely on audio-only processing and remain ineffective at restoring speech intelligibility in very noisy environments. Human performance in such situations is known to depend on input from both the aural and visual senses, which the brain combines through sophisticated multi-level integration strategies. Thanks to advances in miniaturized sensors and embedded low-power technology, we can now monitor not only sound but also additional modalities, such as visual cues, to improve speech intelligibility. Creating future transformative multimodal hearing assistive technologies that draw on the cognitive principles of normal (visually-assisted) hearing raises a range of formidable technical, privacy and usability challenges that need to be addressed holistically.

    The AMHAT Workshop aims to provide an interdisciplinary forum for the wider speech signal processing, artificial intelligence, wireless sensing and communications, and hearing technology communities to discuss the latest advances in this emerging field and to stimulate innovative research directions, including future challenges and opportunities.



    Workshop Chairs

    Amir Hussain, Edinburgh Napier University, UK
    Mathini Sellathurai, Heriot-Watt University, UK
    Peter Bell, University of Edinburgh, UK
    Katherine August, Stevens Institute of Technology, USA


    Steering Committee Chairs

    John Hansen, University of Texas at Dallas, USA
    Naomi Harte, Trinity College Dublin, Ireland
    Michael Akeroyd, University of Nottingham, UK


    Scientific Committee Chair

    Yu Tsao, Academia Sinica, Taiwan


    Scientific Committee

    Peter Derleth, Sonova, Switzerland
    Ben Milner, University of East Anglia, UK
    Jennifer Williams, University of Southampton, UK
    Emanuel Habets, University of Erlangen-Nuremberg, Germany
    Chi-Chun Lee, National Tsing Hua University, Taiwan
    Hadi Larijani, Glasgow Caledonian University, UK
    Erfan Loweimi, University of Cambridge and Edinburgh Napier University, UK
    Reza Varzandeh, University of Oldenburg, Germany
    Jesper Jensen, Aalborg University, Denmark
    Yong Xu, Tencent America, USA
    Dong Yu, Tencent AI Lab, China
    Daniel Michelsanti, Aalborg University, Denmark
    Volker Hohmann, University of Oldenburg, Germany
    Marc Delcroix, NTT Communication Science Laboratories, Japan
    Zheng-Hua Tan, Aalborg University, Denmark
    Harish Chandra Dubey, Microsoft, USA
    Simon Doclo, University of Oldenburg, Germany
    Kia Dashtipour, Edinburgh Napier University, UK
    Hsin-Min Wang, Academia Sinica, Taiwan
    Mandar Gogate, Edinburgh Napier University, UK
    Jun-Cheng Chen, Academia Sinica, Taiwan
    Ahsan Adeel, University of Wolverhampton, UK
    Alex Casson, University of Manchester, UK
    Tharm Ratnarajah, University of Edinburgh, UK
    Jen-Cheng Hou, Academia Sinica, Taiwan
    Tughrul Arslan, University of Edinburgh, UK
    Shinji Watanabe, Carnegie Mellon University, USA
    Nima Mesgarani, Columbia University, USA
    Qiang Huang, University of Sunderland, UK
    Bernd T. Meyer, University of Oldenburg, Germany


    Topics of interest


    The Workshop invites authors to submit papers presenting novel research related to all aspects of multi-modal hearing assistive technologies, including, but not limited to the following:

    1. Novel explainable and privacy-preserving machine learning and statistical model based approaches to multi-modal speech-in-noise processing
    2. End-to-end real-time, low-latency and energy-efficient audio-visual speech enhancement and separation methods
    3. Human auditory-inspired models of multi-modal speech perception and enhancement
    4. Internet of Things (IoT), 5G/6G and wireless sensing enabled approaches to multi-modal hearing assistive technologies
    5. Multi-modal speech enhancement and separation in AR/VR environments
    6. Innovative binaural and multi-microphone approaches, including MEMS antenna integration and multi-modal beamforming
    7. Cloud, Edge and System-on-Chip based software and hardware implementations
    8. New multi-modal speech intelligibility models for normal and hearing-impaired listeners
    9. Audio-visual speech quality and intelligibility assessment and prediction techniques for multi-modal hearing assistive technologies
    10. Demonstrators of multi-modal speech-enabled hearing assistive technology use cases (e.g. multi-modal listening and communication devices)
    11. Accessibility and human-centric factors in the design and evaluation of multi-modal hearing assistive technology, including public perceptions, ethics, standards, societal, economic and political impacts
    12. Contextual (e.g. user preference and cognitive load-aware) multi-modal hearing assistive technologies
    13. Innovative applications of multi-modal hearing assistive technologies (e.g. diagnostics, therapeutics, human-robot interaction, sign-language recognition for aided communication)
    14. Live demonstrators of multi-modal speech-enabled hearing assistive technology use cases (e.g. multi-modal cochlear implants and listening and communication devices)


    Important Dates

    Workshop Paper Submission Deadline: 23 March 2023 (extended from 24 February 2023)
    Workshop Paper Acceptance Notification: 14 April 2023
    Workshop Camera Ready Paper Deadline: 28 April 2023
    All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").



    Paper Submission Guidelines



    The AMHAT workshop accepts both short (2-page) and long (4-page) paper submissions on the topics highlighted in Topics of interest. Papers may be no longer than 5 pages, including all text, figures, and references, and the 5th page may contain only references. All submissions should follow the ICASSP-2023 paper style and format (https://2023.ieeeicassp.org/paper-submission-guidelines/). Only long papers will be published in IEEE Xplore.

    Paper submission now open!

    Note: The peer-reviewing process will follow the main conference reviewing guidelines, and all accepted Workshop papers will be published in the IEEE Xplore Digital Library. Note that, like the main conference, the AMHAT Workshop is fostering a return to an in-person attendance experience. Accordingly, at least one author of each accepted workshop paper must present it in person.

    Program

    Date and time: Saturday, 10 June 2023, 13:45 - 17:45


    Venue: Salon Des Roses A

    13:45-14:00 Welcome by Workshop Chairs

    14:00-14:30 Keynote talk: Professor Yu Tsao, Academia Sinica, Taiwan

    14:30-14:45 Live showcase Demo

    14:45-15:30 Poster Session 1 - Poster Area WP-D (set up from 13:00-13:30, take down promptly at the start of the coffee break)

    Paper ID 6970: Lightweight VisualVoice: Neural Network Quantization on Audio-Visual Speech Separation
    Paper ID 7210: Towards Pose-Invariant Audio-Visual Speech Enhancement in the Wild for Next-Generation Multi-Modal Hearing Aids
    Paper ID 6949: A Vision-Assisted Hearing Aid System Based on Deep Learning
    Paper ID 7057: Requirements for Mass Adoption of Assistive Listening Technology by the General Public
    Paper ID 7192: Frequency-Domain Functional Links for Nonlinear Feedback Cancellation in Hearing Aids

    15:30-15:45 Coffee & Networking Break (Poster Session 1/2 swap)

    15:45-16:30 Poster Session 2 - Poster Area WP-D (set up during the coffee break, take down at the end of AMHAT) [Note: 15:45-16:00 overlaps with the remainder of the coffee break]

    Paper ID 7194: Towards an FPGA implementation of an IoT-based Multimodal Hearing aid System
    Paper ID 7199: Towards Individualised Speech Enhancement: An SNR Preference Learning System for Multi-modal Hearing Aids
    Paper ID 7207: Socio-Technical Trust for Multi-Modal Hearing Assistive Technology
    Paper ID 7213: Two-point neurons for efficient multimodal speech enhancement
    Paper ID 6837: Audio-visual Speech Enhancement and Separation by utilising Multi-modal Self-supervised Embeddings

    16:30-17:00 Keynote Talk: Dr Peter Derleth, Sonova AG, Switzerland

    17:00-17:10 Introduction to AVSEC-2: the second international Audio-Visual Speech Enhancement Challenge

    17:10-17:45 Open Panel Discussion

    17:45 Close

    Keynote speakers



    Prof Yu Tsao, Academia Sinica, Taiwan


    Towards Audio-visual Speech Enhancement in Real-World Scenarios

    We propose a novel audio-visual speech enhancement (AVSE) algorithm, iLAVSE, for real-world scenarios. Compared to conventional AVSE systems, iLAVSE overcomes three common issues that can occur in a real-world environment: the additional cost of processing visual data, audio-visual asynchronization, and low-quality visual data. To evaluate iLAVSE, we use a multimodal Taiwan Mandarin speech-with-video dataset and compare it with conventional AVSE systems. The results demonstrate that iLAVSE can effectively address the aforementioned issues and improve speech enhancement performance, making it suitable for real-world applications.



    Dr Peter Derleth, Sonova AG


    Technological and commercial aspects of assistive hearing solutions

    Assistive hearing solutions come in a variety of form factors, are designed to serve various use cases, are targeted at different user groups, and are distributed to the market as consumer or medical products. Each of these aspects influences whether a technological or functional innovation reaches the respective market segment and gets the chance to improve the daily lives of human listeners. The presentation will shed light on existing and near-future hearing aid technology and share anecdotal insights into alternative technical solutions that did not achieve high market impact despite being technically advanced.