Cross-lingual Audio-visual Speech Enhancement based on Deep Multimodal Learning
  Speech enhancement and separation techniques are often used to improve the quality and intelligibility of speech degraded by background distractions, including both speech and non-speech noise. We aim to change the current landscape of research and innovation in speech enhancement and separation by developing a novel multilingual audio-visual speech enhancement and separation framework based on English and Taiwanese Mandarin. Recently, there has been increasing interest in developing audio-only and audio-visual speech enhancement and separation models [1] that operate in real-world noisy environments, with an emphasis on English speakers. There have been studies in other languages, including Taiwanese Mandarin, but a number of formidable multilingual challenges have yet to be addressed [2, 3].
In this joint project, we will develop and practically evaluate a novel multilingual framework for speech enhancement and separation to reduce background noise and improve the performance of voice communication systems in real-world environments. We will consider challenging use cases, e.g. human-robot interaction in very noisy environments, and automotive applications, where a range of in-car noises (including music, phone ringing, navigation prompts, children's voices, air conditioning and car audio systems) is known to distract the driver and degrade the performance of hands-free voice communication and recognition systems. Our goal is to develop a first-of-its-kind multilingual speech enhancement and separation framework and to use quantitative and qualitative listening and comprehensibility tests to assess the resulting improvements against benchmark approaches.
This joint proposal will build on ongoing collaborative research between two world-leading research groups: at Edinburgh Napier University in Scotland, and at the Research Center for Information Technology Innovation at Academia Sinica, Taipei, Taiwan. The outcomes of this project will be made openly available to both national and global research communities.

  • Start Date:

    1 June 2023

  • End Date:

    31 May 2025

  • Activity Type:

    Externally Funded Research

  • Funder:

    Royal Society of Edinburgh

  • Value:


Project Team