PROBABILISTIC MODELING WITH BAYESIAN NETWORKS FOR AUTOMATIC SPEECH RECOGNITION
Abstract
This paper describes the application of Bayesian networks to automatic speech recognition (ASR). Bayesian networks enable the construction of probabilistic models in which an arbitrary set of variables can be associated with each speech frame in order to explicitly model factors such as acoustic context, speaking rate, or articulator positions. Once the basic inference machinery is in place, a wide variety of models can be expressed and tested. We have implemented a Bayesian network system for isolated word recognition, and present experimental results on the PhoneBook database. These results indicate that performance improves when the observations are conditioned on an auxiliary variable modeling acoustic/articulatory context. The use of multivalued and multiple context variables further improves recognition accuracy.