Automatic Human Utility Evaluation of ASR Systems: Does WER Really Predict Performance?
We propose an alternative evaluation metric to Word Error Rate (WER) for the decision audit task on meeting recordings, which exemplifies how to evaluate speech recognition within a legitimate application context. Using machine learning trained on an initial seed of human-subject experimental data, our alternative metric handily outperforms WER, which correlates very poorly with human subjects' success in finding decisions given ASR transcripts spanning a range of WERs.

Citation

Favre, B., Cheung, K., Kazemian, S., Lee, A., Liu, Y., Munteanu, C., … Zeller, F. (2013). Automatic Human Utility Evaluation of ASR Systems: Does WER Really Predict Performance? In Proc. Interspeech 2013 (pp. 3463–3467). https://doi.org/10.21437/Interspeech.2013-610
