Efficient machine translation decoding with slow language models
Abstract
Efficient decoding has been a fundamental problem in machine translation research. A significant part of the computational cost typically lies in the language model computations. If slow language models, such as neural network or maximum-entropy models, are used, the computational cost can be so high as to render decoding impractical. In this paper we propose a method to efficiently integrate slow language models into machine translation decoding. Specifically, we employ neural network language models in a hierarchical phrase-based translation decoder and achieve a speed-up of more than 15 times over direct integration of the neural network models. The speed-up comes without any noticeable drop in translation output quality, as measured by automatic evaluation metrics. The proposed method is general enough to be applied to a wide variety of models and decoders.