Refactoring acoustic models using variational density approximation
Abstract
In model-based pattern recognition it is often useful to change the structure of, or refactor, a model. For example, we may wish to find a Gaussian mixture model (GMM) with fewer components that best approximates a reference model. One application arises in speech recognition, where different platforms impose different model size requirements. Since the target size may not be known a priori, one strategy is to train a complex model and subsequently derive models of lower complexity. We present methods for reducing model size without training data, following two strategies: GMM approximation and Gaussian clustering based on divergences. A variational expectation-maximization algorithm is derived that unifies these two approaches. The resulting algorithms reduce the model size by 50% with less than a 4% increase in error rate relative to a same-sized model trained on data. In fact, for up to a 35% reduction in size, the algorithms can improve accuracy relative to the baseline.
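For concreteness, the sketch below shows one standard variational approximation to the KL divergence between two GMMs, the kind of divergence on which the GMM approximation and clustering strategies above can be built. It is a minimal illustration, not the paper's exact algorithm: diagonal covariances are assumed, and all function names are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def kl_diag_gauss(mu0, var0, mu1, var1):
    """Closed-form KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) )."""
    return 0.5 * np.sum(
        var0 / var1 + (mu1 - mu0) ** 2 / var1 - 1.0 + np.log(var1 / var0)
    )

def variational_kl_gmm(pi, mu_f, var_f, omega, mu_g, var_g):
    """Variational approximation to KL(f || g) for two diagonal GMMs,
    f = sum_a pi[a] N(mu_f[a], var_f[a]),
    g = sum_b omega[b] N(mu_g[b], var_g[b]).
    Uses only pairwise Gaussian KLs, which have closed forms."""
    A, B = len(pi), len(omega)
    # Pairwise KLs: among f's own components, and from f's to g's components.
    kl_ff = np.array([[kl_diag_gauss(mu_f[a], var_f[a], mu_f[c], var_f[c])
                       for c in range(A)] for a in range(A)])
    kl_fg = np.array([[kl_diag_gauss(mu_f[a], var_f[a], mu_g[b], var_g[b])
                       for b in range(B)] for a in range(A)])
    # log sum_c pi[c] exp(-KL(f_a || f_c)) and the analogous term against g.
    log_num = logsumexp(np.log(pi)[None, :] - kl_ff, axis=1)
    log_den = logsumexp(np.log(omega)[None, :] - kl_fg, axis=1)
    return float(np.sum(pi * (log_num - log_den)))

# Example: divergence from a 4-component reference GMM to a 2-component one.
rng = np.random.default_rng(0)
d = 3
pi = np.array([0.3, 0.3, 0.2, 0.2])
mu_f, var_f = rng.normal(size=(4, d)), np.ones((4, d))
omega = np.array([0.5, 0.5])
mu_g, var_g = rng.normal(size=(2, d)), np.ones((2, d))
print(variational_kl_gmm(pi, mu_f, var_f, omega, mu_g, var_g))
```

A divergence of this form is cheap to evaluate because it never integrates over the data space; minimizing it over the parameters of the smaller GMM is one way to realize the data-free model reduction described in the abstract.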