Speech Acoustic Modelling from Raw Phase Spectrum

Research Output

Magnitude spectrum-based features are the most widely employed front-ends for acoustic modelling in automatic speech recognition (ASR) systems. In this paper, we investigate the feasibility and efficacy of acoustic modelling using the raw short-time phase spectrum. In particular, we study the usefulness of the raw wrapped, unwrapped and minimum-phase phase spectra, as well as the phase of the source and filter components, for acoustic modelling. Furthermore, we explore the effectiveness of deploying the vocal tract and excitation components of the raw phase spectrum simultaneously using multi-head CNNs, and investigate multiple information fusion schemes. This paves the way for developing effective phase-based multi-stream information processing systems for speech recognition. The performance, even for the wrapped phase with its noise-like shape, is comparable to or better than that of the classic magnitude-based features, and a WER as low as 4.8% is achieved on the WSJ (Eval-92) task.
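
To make the distinction between the wrapped and unwrapped phase spectra concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function names (dft, wrapped_phase, unwrap) and the toy frame are assumptions made for illustration, and a real front-end would window the frame and use an FFT rather than this naive transform.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def wrapped_phase(spectrum):
    """Principal-value phase, confined to [-pi, pi], for each frequency bin."""
    return [cmath.phase(c) for c in spectrum]


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive bins differ by at most pi."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        # Shift the step into [-pi, pi) before accumulating it.
        delta -= 2 * math.pi * math.floor((delta + math.pi) / (2 * math.pi))
        out.append(out[-1] + delta)
    return out


# Toy frame (an assumption for illustration): a unit impulse delayed by 3 samples.
# Its true phase is linear in the bin index k, with slope -2*pi*3/8 per bin.
frame = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
wrapped = wrapped_phase(dft(frame))
unwrapped = unwrap(wrapped)
```

For this toy frame the wrapped values jump between bins while the unwrapped values fall on a straight line, which is the kind of structure the wrapped representation hides behind its noise-like shape.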

Date: 13 May 2021

Publication Status: Published

Publisher: IEEE

DOI: 10.1109/icassp39728.2021.9413727

Funders: Engineering and Physical Sciences Research Council

http://researchrepository.napier.ac.uk/output/3585849

Citation

Loweimi, E., Cvetkovic, Z., Bell, P., & Renals, S. (2021). Speech Acoustic Modelling from Raw Phase Spectrum. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp39728.2021.9413727

Authors

Dr Erfan Loweimi

School of Computing, Engineering and the Built Environment

Keywords

Raw phase spectrum, phase-based source-filter separation, multi-head CNNs, acoustic modelling, ASR

Available Documents

Files are currently unavailable for download; please contact E.Loweimi@napier.ac.uk to request a copy.
