Acoustic modeling with mixtures of subspace constrained exponential models
Abstract
Gaussian distributions are usually parameterized with their natural parameters: the mean μ and the covariance Σ. They can also be re-parameterized as exponential models with canonical parameters P = Σ⁻¹ and ψ = Pμ. In this paper we consider modeling acoustics with mixtures of Gaussians parameterized with canonical parameters, where the parameters are constrained to lie in a shared affine subspace. This class of models includes Gaussian models with various constraints on their parameters: diagonal covariances, MLLT models, and the recently proposed EMLLT and SPAM models. We describe how to perform maximum likelihood estimation of the subspace and of the parameters within a fixed subspace. In speech recognition experiments, we show that this model improves upon all of the above classes of models with roughly the same number of parameters and with little computational overhead. In particular, we obtain a 30-40% relative improvement over LDA+MLLT models when using roughly the same number of parameters.
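The re-parameterization mentioned above can be illustrated with a minimal sketch. It assumes NumPy; the function names and the numerical values are hypothetical and chosen only for illustration, not taken from the paper. The sketch converts (μ, Σ) to the canonical parameters (ψ, P) and checks that the log-density written in exponential-model form, -½ xᵀPx + ψᵀx - ½ ψᵀP⁻¹ψ + ½ log det P - (d/2) log 2π, agrees with the usual mean/covariance form.

```python
import numpy as np

def to_canonical(mu, sigma):
    """Map (mean, covariance) to canonical parameters (psi, P)."""
    P = np.linalg.inv(sigma)   # precision matrix P = Sigma^{-1}
    psi = P @ mu               # psi = P mu
    return psi, P

def from_canonical(psi, P):
    """Map canonical parameters (psi, P) back to (mean, covariance)."""
    sigma = np.linalg.inv(P)
    mu = sigma @ psi
    return mu, sigma

def log_density_canonical(x, psi, P):
    """Gaussian log-density written in terms of (psi, P)."""
    d = x.shape[0]
    _, logdet_P = np.linalg.slogdet(P)
    return (-0.5 * x @ P @ x
            + psi @ x
            - 0.5 * psi @ np.linalg.solve(P, psi)   # psi^T P^{-1} psi
            + 0.5 * logdet_P
            - 0.5 * d * np.log(2.0 * np.pi))

def log_density_standard(x, mu, sigma):
    """Gaussian log-density written in terms of (mu, Sigma)."""
    d = x.shape[0]
    diff = x - mu
    _, logdet_sigma = np.linalg.slogdet(sigma)
    return (-0.5 * diff @ np.linalg.solve(sigma, diff)
            - 0.5 * logdet_sigma
            - 0.5 * d * np.log(2.0 * np.pi))

# Illustrative values only (not from the paper).
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
x = np.array([0.5, -1.0])

psi, P = to_canonical(mu, sigma)
same = np.isclose(log_density_canonical(x, psi, P),
                  log_density_standard(x, mu, sigma))
```

In this form the log-density is linear in (ψ, P) apart from the normalizing terms, which is what makes a linear (affine subspace) constraint on the canonical parameters a natural one to impose. A diagonal covariance, for example, corresponds to restricting P to the span of the matrices with a single nonzero diagonal entry.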