Fast incremental adaptation using maximum likelihood regression and stochastic gradient descent
Abstract
Adaptation to a new speaker or environment is becoming increasingly important as speech recognition systems are deployed in unpredictable real-world situations. Constrained, or feature-space, Maximum Likelihood Linear Regression (fMLLR) [1] has proved especially effective for this purpose, particularly when used for incremental unsupervised adaptation [2]. Unfortunately, the standard implementation described in [1], and used by most authors since, collects statistics that cost O(n³) operations per frame and O(n³) space to store, and estimating the feature-transform matrix from these statistics requires O(n⁴) operations. This cost is unacceptable for most embedded speech recognition systems. In this paper we show that the fMLLR objective function can be optimized by stochastic gradient descent in a way that achieves almost the same results as the standard implementation, using an algorithm that needs only O(n²) operations per frame and O(n²) storage. This order-of-magnitude saving makes continuous adaptation feasible in most resource-constrained embedded speech recognition applications.
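To make the complexity claim concrete, a sketch of the standard per-frame fMLLR auxiliary function from [1] and its gradient follows; the notation here is assumed rather than taken from this paper: W = [A b] is the n×(n+1) feature transform applied to the extended observation ζ_t = [x_tᵀ 1]ᵀ, and (μ_t, Σ_t) is the Gaussian aligned to frame t.

\[
Q_t(W) \;=\; \log\lvert\det A\rvert
\;-\; \tfrac{1}{2}\,(W\zeta_t - \mu_t)^{\top}\,\Sigma_t^{-1}\,(W\zeta_t - \mu_t) \;+\; \text{const},
\]
\[
\frac{\partial Q_t}{\partial W} \;=\; \big[\,A^{-\top}\;\;\mathbf{0}\,\big]
\;-\; \Sigma_t^{-1}\,(W\zeta_t - \mu_t)\,\zeta_t^{\top}.
\]

With diagonal covariances, the data-dependent gradient term costs O(n²) per frame, so a stochastic gradient step W ← W + η ∂Q_t/∂W needs only O(n²) work and O(n²) storage, provided the log-determinant term (whose gradient is A⁻ᵀ) is handled without a full O(n³) inversion at every frame, for example by refreshing an estimate of A⁻ᵀ only periodically. That last provision is an assumption of this sketch, not a detail stated in the abstract.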