Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition

Research Output

Designing good normalisation to counter the effect of environmental distortions is one of the major challenges for automatic speech recognition (ASR). The Vector Taylor Series (VTS) method is a powerful and mathematically well-principled technique that can be applied in both the feature and model domains to compensate for both additive and convolutional noise. One limitation of this approach, however, is that it is tied to MFCC (and log-filterbank) features and does not extend to other representations, such as PLP, PNCC and phase-based front-ends, that use a power transformation rather than log compression. This paper aims to broaden the scope of the VTS method by deriving a new formulation that assumes a power transformation is used as the non-linearity during feature extraction. It is shown that conventional VTS, in the log domain, is a special case of the new extended framework. In addition, the new formulation introduces one more degree of freedom, which makes it possible to tune the algorithm so that the data better fit the statistical requirements of the ASR back-end. On the Aurora-4 tasks, the proposed approach provides average absolute performance improvements of up to 12.2% and 2.0% compared with MFCC and conventional VTS, respectively.
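The abstract does not state the paper's exact formulation, so the following is only a minimal sketch of why a log non-linearity can appear as a special case of a power non-linearity. It assumes a Box-Cox-style normalised power transform, (x^alpha - 1) / alpha, which tends to log(x) as alpha approaches 0. The function name `generalised_log`, the exponent symbol `alpha`, and the sample energies are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def generalised_log(x, alpha):
    """Box-Cox-style power compression (an assumed form, not the paper's).

    For alpha == 0 the natural log is returned, which is the limit of
    (x**alpha - 1) / alpha as alpha tends to 0.
    """
    x = np.asarray(x, dtype=float)
    if alpha == 0.0:
        return np.log(x)
    return (x ** alpha - 1.0) / alpha


# Hypothetical positive filterbank energies, for illustration only.
energies = np.array([0.5, 1.0, 4.0, 20.0])

# The gap between the power compression and the log shrinks with alpha.
for alpha in (0.5, 0.1, 0.01, 0.001):
    gap = np.max(np.abs(generalised_log(energies, alpha) - np.log(energies)))
    print(f"alpha={alpha}: max |power - log| = {gap:.6f}")
```

Under this assumed form, the exponent plays the role of the extra degree of freedom mentioned in the abstract: small values behave like log compression, while larger values compress less aggressively.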

Date: 08 September 2016

Publication Status: Published

Publisher: ISCA

DOI: 10.21437/interspeech.2016-1028

Funders: Historic Funder (pre-Worktribe)

http://researchrepository.napier.ac.uk/output/3586541

Citation

Loweimi, E., Barker, J., & Hain, T. (2016). Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition. In Proc. Interspeech 2016 (3798-3802). https://doi.org/10.21437/interspeech.2016-1028

Authors

Dr Erfan Loweimi

School of Computing Engineering and the Built Environment

Available Documents

Files are currently unavailable for download; please contact E.Loweimi@napier.ac.uk to request a copy.
