Boosting speaker recognition performance with compact representations
Abstract
This paper describes a speaker recognition system combination approach in which the compact forms of MAP adapted GMM supervectors are used to boost the performance of a high-dimensional supervector-based system or a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation to emphasize those dimensions that describe significant speaker information and to deemphasize noisy dimensions. Scores obtained from these representations are then combined with the scores obtained from high-dimensional supervector representations. The transformation parameters and the combination weights are estimated by minimizing a discriminative training objective function that approximates a minimum detection cost function. We carried out experiments on two NIST 2008 Speaker Recognition Evaluation English telephony tasks to compare the proposed approach with direct score combination obtained from low-and high-dimensional supervector representations. We have found that the proposed approach yields up to 18% relative gain. Copyright © 2011 ISCA.