Modeling syllable-based pronunciation variation for accented Mandarin speech recognition
Abstract
Pronunciation variation is a natural and inevitable phenomenon in accented Mandarin speech recognition. In this paper, we integrate knowledge-based and data-driven approaches to syllable-based pronunciation variation modeling in order to improve the performance of a Mandarin speech recognition system for speakers with a Southern accent. First, a Chinese linguistics expert derives syllable-based pronunciation variation rules for the Southern accent from the training corpus. Second, the dictionary is augmented with multiple pronunciation variants, each with a pronunciation probability derived from forced-alignment statistics of the training data, and the acoustic models are retrained on the expanded dictionary. Finally, pronunciation variation adaptation is performed at the decoding stage to further fit the data by taking into account the distribution of variation-rule clusters in the test set. The experimental results show that the proposed method provides a flexible framework that effectively improves recognition performance for accented speech. © 2010 IEEE.
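The dictionary augmentation step can be illustrated with a minimal sketch. The abstract does not state how the pronunciation probabilities are estimated from the forced-alignment statistics, so the code below assumes a plain relative-frequency estimate with an optional pruning threshold. The function name `estimate_pronunciation_probs`, the `min_prob` parameter, and the syllable counts are hypothetical and serve only as illustration; they are not taken from the paper.

```python
from collections import Counter, defaultdict


def estimate_pronunciation_probs(aligned_pairs, min_prob=0.05):
    """Estimate P(surface variant | canonical syllable) by relative frequency.

    aligned_pairs: iterable of (canonical_syllable, surface_syllable) tuples,
    assumed to come from a forced alignment of the training data.
    min_prob: variants below this probability are pruned (an assumption,
    not a setting reported in the paper).
    """
    counts = defaultdict(Counter)
    for canonical, surface in aligned_pairs:
        counts[canonical][surface] += 1

    lexicon = {}
    for canonical, variants in counts.items():
        total = sum(variants.values())
        kept = {s: c / total for s, c in variants.items() if c / total >= min_prob}
        if not kept:
            # Always keep at least the most frequent variant.
            best, best_count = variants.most_common(1)[0]
            kept = {best: best_count / total}
        # Renormalise so the surviving variants sum to one.
        norm = sum(kept.values())
        lexicon[canonical] = {s: p / norm for s, p in kept.items()}
    return lexicon


# Made-up alignment counts: retroflex initials realised as dentals.
pairs = (
    [("zhi", "zhi")] * 6 + [("zhi", "zi")] * 4
    + [("shi", "shi")] * 9 + [("shi", "si")] * 1
)
lexicon = estimate_pronunciation_probs(pairs, min_prob=0.05)
for canonical, variants in lexicon.items():
    print(canonical, variants)
```

Each canonical syllable maps to its observed variants and their estimated probabilities, which is the information an augmented dictionary entry would carry. Raising `min_prob` trades coverage of rare variants against the added confusability of a larger dictionary.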