Robust Source-Filter Separation of Speech Signal in the Phase Domain

Research Output

In earlier work we proposed a framework for speech source-filter separation that employs phase-based signal processing. This paper presents a further theoretical investigation of the model, together with optimisations that make the filter and source representations less sensitive to noise and better matched to downstream processing. First, in computing the Hilbert transform, the log function is replaced by the generalised logarithmic function. This introduces a tuning parameter that adjusts both the dynamic range and the distribution of the phase-based representation. Second, when computing the group delay, a more robust estimate of the derivative is formed by applying a regression filter instead of using sample differences. The effectiveness of these modifications is evaluated in clean and noisy conditions by considering the accuracy of the fundamental frequency extracted from the estimated source, and the performance of speech recognition features extracted from the estimated filter. In particular, the proposed filter-based front-end reduces Aurora-2 word error rates (WERs) by 6.3% (average 0–20 dB) compared with previously reported results. Furthermore, when tested on an LVCSR task (Aurora-4), the new features yielded a 5.8% absolute WER reduction compared to MFCCs, without performance loss in the clean/matched condition.
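The two modifications described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes the generalised logarithm takes the Box-Cox form (x^alpha - 1)/alpha, which tends to log(x) as alpha approaches 0, and that the regression filter is the standard least-squares slope estimator used for delta features. The function names, the `alpha` and `half_width` parameters, and the edge-padding behaviour are all hypothetical choices made for illustration.

```python
import math


def generalised_log(x, alpha):
    """Assumed Box-Cox form of the generalised logarithm.

    Tends to math.log(x) as alpha -> 0; alpha acts as the tuning
    parameter that changes the dynamic range of the output.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(alpha) < 1e-12:
        return math.log(x)
    return (x ** alpha - 1.0) / alpha


def sample_difference(values):
    """Plain first difference, the baseline derivative estimate."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]


def regression_derivative(values, half_width=2):
    """Least-squares slope over a window of 2 * half_width + 1 samples.

    Assumed here to be the usual delta-feature regression formula;
    samples beyond the ends are replaced by the nearest edge value.
    """
    if half_width < 1:
        raise ValueError("half_width must be at least 1")
    n = len(values)
    denom = 2.0 * sum(k * k for k in range(1, half_width + 1))
    slopes = []
    for i in range(n):
        acc = 0.0
        for k in range(1, half_width + 1):
            ahead = values[min(i + k, n - 1)]
            behind = values[max(i - k, 0)]
            acc += k * (ahead - behind)
        slopes.append(acc / denom)
    return slopes


if __name__ == "__main__":
    # A unit-slope ramp with one perturbed sample, standing in for a noisy phase curve.
    ramp = [0.0, 1.0, 2.0, 3.5, 4.0, 5.0, 6.0]
    print(sample_difference(ramp))
    print(regression_derivative(ramp, half_width=2))
    print(generalised_log(10.0, 0.1), math.log(10.0))
```

Because the regression estimate averages over several neighbouring samples, a single perturbed sample shifts it less than it shifts the adjacent sample differences, which is the kind of robustness the abstract attributes to the regression filter.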

Date: 20 August 2017

Publication Status: Published

Publisher: ISCA

DOI: 10.21437/interspeech.2017-210

Funders: Engineering and Physical Sciences Research Council

http://researchrepository.napier.ac.uk/output/3586530

Citation

Loweimi, E., Barker, J., Torralba, O. S., & Hain, T. (2017). Robust Source-Filter Separation of Speech Signal in the Phase Domain. In Proc. Interspeech 2017 (pp. 414-418). https://doi.org/10.21437/interspeech.2017-210

Authors

Dr Erfan Loweimi

School of Computing Engineering and the Built Environment

Available Documents

Files are currently unavailable for download; please contact E.Loweimi@napier.ac.uk to request a copy.