Pronunciation modeling for names of foreign origin
Abstract
The pronunciation of a proper name is influenced by both the speaker's native language and the language of origin of the name itself. Creating suitable sets of pronunciations for names in speech recognition applications is therefore extremely challenging. In this work, we investigate whether automatic language identification and grapheme-to-phoneme conversion algorithms can be effective for this task. We train grapheme-to-phoneme models for eight foreign languages and use automatic language identification to select the models with which to generate additional pronunciations for words in a baseline pronunciation dictionary. Compared with the baseline dictionary in a US name recognition task, we achieve a 25% reduction in sentence-error rate for foreign names spoken by native speakers of the language in question, and a 10% reduction in sentence-error rate for foreign names spoken by American speakers.
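To make the pipeline described above concrete, the following is a minimal sketch of how language identification might select grapheme-to-phoneme models for augmenting a dictionary. It is not the system evaluated in this work: the character-trigram classifier, the greedy rule-based converter, the toy training names, the phoneme symbols, and all identifiers (`NgramLanguageID`, `make_toy_g2p`, `augment_dictionary`, `top_k`) are assumptions introduced purely for illustration.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return character n-grams of a word padded with boundary markers."""
    padded = f"#{word.lower()}#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLanguageID:
    """Toy language identifier: add-one smoothed character n-gram scores."""

    def __init__(self, training_names, n=3):
        self.n = n
        self.counts = {
            lang: Counter(g for name in names for g in char_ngrams(name, n))
            for lang, names in training_names.items()
        }
        self.totals = {lang: sum(c.values()) for lang, c in self.counts.items()}
        self.vocab_size = len({g for c in self.counts.values() for g in c})

    def score(self, word, lang):
        """Log-probability of the word's n-grams under one language."""
        counts, total = self.counts[lang], self.totals[lang]
        return sum(
            math.log((counts[g] + 1) / (total + self.vocab_size))
            for g in char_ngrams(word, self.n)
        )

    def top_languages(self, word, k=1):
        """Return the k highest-scoring languages for the word."""
        ranked = sorted(self.counts, key=lambda lang: self.score(word, lang),
                        reverse=True)
        return ranked[:k]


def make_toy_g2p(rules):
    """Build a greedy longest-match converter from grapheme->phoneme rules.

    Hypothetical stand-in for a trained grapheme-to-phoneme model;
    letters without a matching rule are skipped.
    """
    keys = sorted(rules, key=len, reverse=True)

    def g2p(word):
        word = word.lower()
        phones, i = [], 0
        while i < len(word):
            for key in keys:
                if word.startswith(key, i):
                    phones.append(rules[key])
                    i += len(key)
                    break
            else:
                i += 1  # no rule for this letter
        return " ".join(phones)

    return g2p


def augment_dictionary(baseline, lang_id, g2p_models, top_k=1):
    """Add pronunciations from the G2P models of the top-k identified languages."""
    augmented = {word: list(prons) for word, prons in baseline.items()}
    for word, prons in augmented.items():
        for lang in lang_id.top_languages(word, k=top_k):
            candidate = g2p_models[lang](word)
            if candidate and candidate not in prons:
                prons.append(candidate)
    return augmented


# Toy data (assumed for illustration only).
training_names = {
    "german": ["schmidt", "schneider", "fischer"],
    "spanish": ["garcia", "rodriguez", "martinez"],
}
g2p_models = {
    "german": make_toy_g2p({"sch": "SH", "tz": "T S", "m": "M", "i": "IH"}),
    "spanish": make_toy_g2p({"z": "S", "m": "M", "i": "IY", "t": "T"}),
}
baseline = {"schmitz": ["S M IH T S"]}

lang_id = NgramLanguageID(training_names)
augmented = augment_dictionary(baseline, lang_id, g2p_models, top_k=1)
print(augmented)
```

In this sketch the baseline pronunciation is always retained and model-generated variants are appended, which mirrors the idea of generating additional pronunciations instead of replacing existing ones. Raising `top_k` would add variants from more than one candidate language of origin.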