Speech Recognition Error Analysis on the English MALACH Corpus

  • Olivier Siohan
  • Bhuvana Ramabhadran
  • Geoffrey Zweig

In Proceedings of ICSLP

This paper presents an analysis of the word recognition error rate on an English subset of the MALACH corpus. The MALACH project is an NSF-funded research program on the development of multilingual access to large audio archives. The archive of interest is a large collection of testimonies from 52,000 survivors, liberators, rescuers and witnesses of the Nazi Holocaust, assembled by the Shoah Visual History Foundation. This data has characteristics that are unusual in the speech recognition community, such as elderly speech, noisy conditions, and heavily accented speech. Hence, it is a challenging task for automatic speech recognition (ASR). This paper attempts to identify the factors affecting ASR performance on that task. The signal-to-noise ratio and the syllable rate were found to be the two dominant factors in explaining the overall word error rate, while we observed no evidence that accent or speaker’s age affected recognition performance. Based on this evidence, noise compensation experiments were carried out and led to a 1.1% absolute reduction of the word error rate.
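
For readers unfamiliar with the metric, the word error rate referred to above is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the scoring tool used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, which is
    an assumption and not necessarily what the paper's scoring used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, an "absolute" reduction is a difference in percentage points between two word error rates, as opposed to a reduction expressed relative to the baseline rate.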