Phonetic Error Analysis Beyond Phone Error Rate

Journal Article
Loweimi, E., Carmantini, A., Bell, P., Renals, S., & Cvetkovic, Z. (2023)
Phonetic Error Analysis Beyond Phone Error Rate. IEEE/ACM Transactions on Audio, Speech and Language Processing, 31, 3346-3361. https://doi.org/10.1109/taslp.2023.3313417
In this article, we analyse the performance of the TIMIT-based phone recognition systems beyond the overall phone error rate (PER) metric. We consider three broad phonetic cla...

Dysarthric Speech Recognition, Detection and Classification using Raw Phase and Magnitude Spectra

Conference Proceeding
Yue, Z., Loweimi, E., & Cvetkovic, Z. (2023)
Dysarthric Speech Recognition, Detection and Classification using Raw Phase and Magnitude Spectra. In Proc. Interspeech 2023 (pp. 1533-1537). https://doi.org/10.21437/interspeech.2023-222
In this paper, we explore the effectiveness of deploying the raw phase and magnitude spectra for dysarthric speech recognition, detection and classification. In particular, we...

Multi-Stream Acoustic Modelling Using Raw Real and Imaginary Parts of the Fourier Transform

Journal Article
Loweimi, E., Yue, Z., Bell, P., Renals, S., & Cvetkovic, Z. (2023)
Multi-Stream Acoustic Modelling Using Raw Real and Imaginary Parts of the Fourier Transform. IEEE/ACM Transactions on Audio, Speech and Language Processing, 31, 876-890. https://doi.org/10.1109/taslp.2023.3237167
In this paper, we investigate multi-stream acoustic modelling using the raw real and imaginary parts of the Fourier transform of speech signals. Using the raw magnitude spectr...

Acoustic Modelling From Raw Source and Filter Components for Dysarthric Speech Recognition

Journal Article
Yue, Z., Loweimi, E., Christensen, H., Barker, J., & Cvetkovic, Z. (2022)
Acoustic Modelling From Raw Source and Filter Components for Dysarthric Speech Recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, 2968-2980. https://doi.org/10.1109/taslp.2022.3205766
Acoustic modelling for automatic dysarthric speech recognition (ADSR) is a challenging task. Data deficiency is a major problem and substantial differences between typical and...

Dysarthric Speech Recognition From Raw Waveform with Parametric CNNs

Presentation / Conference
Yue, Z., Loweimi, E., Christensen, H., Barker, J., & Cvetkovic, Z. (2022, September)
Dysarthric Speech Recognition From Raw Waveform with Parametric CNNs. Paper presented at Interspeech 2022, Incheon, Korea
Raw waveform acoustic modelling has recently received increasing attention. Compared with the task-blind hand-crafted features which may discard useful information, representa...

RCT: Random consistency training for semi-supervised sound event detection

Presentation / Conference
Shao, N., Loweimi, E., & Li, X. (2022, September)
RCT: Random consistency training for semi-supervised sound event detection. Paper presented at Interspeech 2022, Incheon, Korea
Sound event detection (SED), as a core module of acoustic environmental analysis, suffers from the problem of data deficiency. The integration of semi-supervised learning (SSL...

Raw Source and Filter Modelling for Dysarthric Speech Recognition

Conference Proceeding
Yue, Z., Loweimi, E., & Cvetkovic, Z. (2022)
Raw Source and Filter Modelling for Dysarthric Speech Recognition. In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp43922.2022.9746553
Acoustic modelling for automatic dysarthric speech recognition (ADSR) is a challenging task. Data deficiency is a major problem and substantial differences between the typical...

Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition

Conference Proceeding
Yue, Z., Loweimi, E., Cvetkovic, Z., Christensen, H., & Barker, J. (2022)
Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition. In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp43922.2022.9746855
Building automatic speech recognition (ASR) systems for speakers with dysarthria is a very challenging task. Although multi-modal ASR has received increasing attention recentl...

Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models

Conference Proceeding
Zhang, S., Loweimi, E., Bell, P., & Renals, S. (2021)
Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models. In Proc. Interspeech 2021 (pp. 2541-2545). https://doi.org/10.21437/interspeech.2021-280
Recently, Transformer based models have shown competitive automatic speech recognition (ASR) performance. One key factor in the success of these models is the multi-head atten...

Speech Acoustic Modelling Using Raw Source and Filter Components

Conference Proceeding
Loweimi, E., Cvetkovic, Z., Bell, P., & Renals, S. (2021)
Speech Acoustic Modelling Using Raw Source and Filter Components. In Proc. Interspeech 2021 (pp. 276-280). https://doi.org/10.21437/interspeech.2021-53
Source-filter modelling is among the fundamental techniques in speech processing with a wide range of applications. In acoustic modelling, features such as MFCC and PLP which ...