Abstract
We present a dynamic network decoder capable of using large cross-word context models and large n-gram histories. Our method for constructing the search network is designed to process large cross-word context models efficiently, and we address the optimization of the search network to minimize run-time overhead in the dynamic network decoder. The search procedure uses the full LM history for lookahead, and path recombination is performed as early as possible. In a systematic comparison to a static FSM-based decoder, we find that the dynamic decoder runs at a speed comparable to the static decoder when large language models are used, whereas the static decoder performs best for small language models. We discuss the use of very large vocabularies of up to 2.5 million words for both decoding approaches and analyze the effect of weak acoustic models on pruning.
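The early recombination and full-history lookahead mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation; the function names (`recombine`, `lookahead_score`) and the hypothesis fields are hypothetical, assuming a token-passing search where each hypothesis carries its network state, full LM history, and a negative-log score:

```python
from collections import namedtuple

# Hypothetical hypothesis record: network state, full n-gram LM history,
# and accumulated negative-log score (lower is better).
Hypothesis = namedtuple("Hypothesis", ["state", "lm_history", "score"])

def recombine(hypotheses):
    """Early path recombination: among hypotheses sharing the same
    network state and the same full LM history, only the best-scoring
    one can extend to the overall best path, so the rest are dropped."""
    best = {}
    for hyp in hypotheses:
        key = (hyp.state, hyp.lm_history)  # full history, not a truncated one
        if key not in best or hyp.score < best[key].score:
            best[key] = hyp
    return list(best.values())

def lookahead_score(lm_score_fn, lm_history, reachable_words):
    """LM lookahead with the full history: the best (lowest) LM score
    over all words still reachable from the current network node gives
    a tighter estimate for pruning than a unigram approximation."""
    return min(lm_score_fn(lm_history, w) for w in reachable_words)
```

Under these assumptions, recombination is applied each time hypotheses meet at a shared network state, and the lookahead score is added to each hypothesis before beam pruning.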