Tree-guided transformation-based homograph disambiguation in Mandarin TTS system
Abstract
Homograph disambiguation is the core issue of grapheme-to-phoneme conversion in Mandarin text-to-speech (TTS) systems. This paper proposes a hybrid algorithm, tree-guided transformation-based learning (TTBL), which combines a decision tree with transformation-based learning (TBL) to resolve homograph ambiguity. TTBL generates templates automatically, avoiding the manual summarization of templates that makes conventional TBL time-consuming and laborious. In addition, the paper evaluates various keyword selection approaches in different domains. Comparative experiments show that, for homograph disambiguation, templates generated automatically by the decision tree achieve performance comparable to that of manually summarized templates, and that TTBL significantly outperforms the decision tree alone. ©2008 IEEE.
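
The abstract does not say how templates are read off the decision tree, so the following Python sketch is only one plausible reading, not the paper's method. It assumes that the set of features tested along each root-to-leaf path becomes one TBL template, and it then runs a plain greedy TBL loop over those templates. The toy tree, the feature names (`prev_char`, `next_pos`), the pinyin labels, and the training samples are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy tree for one homograph with readings "hang2" and "xing2".
# Internal nodes test a context feature; "*" stands for "any other value".
TREE = {
    "feature": "prev_char",
    "branches": {
        "yin2": {"leaf": "hang2"},
        "*": {
            "feature": "next_pos",
            "branches": {"v": {"leaf": "xing2"}, "*": {"leaf": "hang2"}},
        },
    },
}

def templates_from_tree(node, path=()):
    """Collect the feature tuple tested on each root-to-leaf path (assumed mapping)."""
    if "leaf" in node:
        return {path} if path else set()
    path = path + (node["feature"],)
    found = set()
    for child in node["branches"].values():
        found |= templates_from_tree(child, path)
    return found

def apply_rule(rule, sample, label):
    """Rule = (features, values, src, dst): relabel src as dst when the context matches."""
    feats, values, src, dst = rule
    if label == src and tuple(sample[f] for f in feats) == values:
        return dst
    return label

def learn_rules(samples, templates, min_gain=1):
    """Greedy TBL: start from the most frequent reading, then add the best rule each round."""
    baseline = Counter(s["gold"] for s in samples).most_common(1)[0][0]
    labels = [baseline] * len(samples)
    learned = []
    while True:
        # Instantiate every template on every currently mislabeled sample.
        candidates = {
            (t, tuple(s[f] for f in t), lab, s["gold"])
            for s, lab in zip(samples, labels) if lab != s["gold"]
            for t in templates
        }

        def gain(rule):
            # Net gain = errors fixed minus correct labels broken.
            return sum(
                (apply_rule(rule, s, lab) == s["gold"]) - (lab == s["gold"])
                for s, lab in zip(samples, labels)
            )

        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain, default=None)
        if best is None or gain(best) < min_gain:
            return baseline, learned
        learned.append(best)
        labels = [apply_rule(best, s, lab) for s, lab in zip(samples, labels)]

def predict(sample, baseline, rules):
    """Apply the learned rules in the order they were learned."""
    label = baseline
    for rule in rules:
        label = apply_rule(rule, sample, label)
    return label

# Hypothetical training samples (context features plus the gold reading).
SAMPLES = [
    {"prev_char": "yin2", "next_pos": "n", "gold": "hang2"},
    {"prev_char": "yin2", "next_pos": "v", "gold": "hang2"},
    {"prev_char": "yin2", "next_pos": "n", "gold": "hang2"},
    {"prev_char": "lv3", "next_pos": "n", "gold": "xing2"},
    {"prev_char": "bu4", "next_pos": "v", "gold": "xing2"},
]

TEMPLATES = templates_from_tree(TREE)
BASELINE, RULES = learn_rules(SAMPLES, TEMPLATES)
PREDICTIONS = [predict(s, BASELINE, RULES) for s in SAMPLES]
```

In this sketch the tree only supplies the feature combinations worth trying; the rule values and the order of the rules still come from the error-driven TBL loop, which is the division of labor the abstract describes at a high level.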