Beyond linear transforms: Efficient non-linear dynamic adaptation for noise robust speech recognition
Abstract
In this paper, we present new theory and results that combine constrained Maximum Likelihood Linear Regression (MLLR), also known as feature-space MLLR (fMLLR), a state-of-the-art model adaptation technique, with Dynamic Noise Adaptation (DNA), a state-of-the-art noise adaptation algorithm. We explain how DNA implements a highly non-linear transform on speech model features, and why DNA is better suited than fMLLR to compensating for additive noise. Test results are presented as a function of SNR on the DNA + Aurora II framework, which is based upon a collection of challenging in-car noise recordings. The results demonstrate that DNA significantly outperforms block fMLLR on additive noise, and that DNA + fMLLR outperforms the ETSI advanced front-end (AFE) system + fMLLR by a significant margin (over 7% absolute). Copyright © 2008 ISCA.
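
The contrast between a linear and a non-linear transform can be illustrated with a minimal sketch. fMLLR applies an affine map to the features. The abstract does not state the interaction model DNA uses, so the sketch assumes the log-sum approximation that is commonly used for additive noise in the log-spectral domain, y = log(exp(x) + exp(n)). The function names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import math

def affine_transform(x, a, b):
    """fMLLR-style affine map on a single feature dimension: a * x + b."""
    return a * x + b

def log_sum_interaction(x, n):
    """Assumed additive-noise model in the log-power domain:
    y = log(exp(x) + exp(n)), evaluated in a numerically stable form."""
    m = max(x, n)
    return m + math.log1p(math.exp(-abs(x - n)))

# With the noise level fixed, sweep the clean speech level.
noise = 0.0
for speech in (-10.0, 0.0, 10.0):
    noisy = log_sum_interaction(speech, noise)
    print(f"speech={speech:6.1f}  noisy={noisy:8.4f}")
```

Under this assumed model the noisy feature follows the speech level when speech dominates and flattens to the noise level when noise dominates. No single scale and offset reproduces both regimes, which is one way to read the abstract's claim that a non-linear transform suits additive noise better than a linear one.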