Audio-visual speaker recognition for video broadcast news

B. Maison; C. Neti; A.W. Senior

doi:10.1023/A:1011175531609

Publication

Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology

Abstract

Audio-based speaker identification degrades severely when training and test conditions are mismatched, whether because of channel differences or noise. In this paper, we explore various techniques for combining video-based speaker identification with audio-based speaker identification to improve performance under mismatched conditions. Specifically, we explore techniques to optimally determine the relative weights of the independent audio and video decisions to achieve the best combination. Experiments on video broadcast news data show that fusion yields significant improvements in acoustically degraded conditions.
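The abstract does not say how the weights are determined, so the following is only an illustrative sketch of one common form of decision fusion: a linear combination of per-speaker audio and video scores, with the audio weight chosen by grid search on held-out data. The names `normalize`, `fuse`, `pick_weight`, and `alpha`, the min-max scaling, and the grid search are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: speaker id -> score (higher means more likely).
Scores = Dict[str, float]


def normalize(scores: Scores) -> Scores:
    """Min-max scale scores to [0, 1] so audio and video are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    if span == 0:
        return {spk: 0.0 for spk in scores}
    return {spk: (s - lo) / span for spk, s in scores.items()}


def fuse(audio: Scores, video: Scores, alpha: float) -> str:
    """Return the speaker maximizing alpha * audio + (1 - alpha) * video."""
    a, v = normalize(audio), normalize(video)
    return max(a, key=lambda spk: alpha * a[spk] + (1 - alpha) * v[spk])


def pick_weight(heldout: List[Tuple[Scores, Scores, str]], steps: int = 10) -> float:
    """Grid-search the audio weight alpha in [0, 1] for best held-out accuracy.

    Each held-out item is (audio scores, video scores, true speaker).
    Ties are resolved in favor of the smallest alpha tried.
    """
    best_alpha, best_correct = 0.0, -1
    for i in range(steps + 1):
        alpha = i / steps
        correct = sum(fuse(a, v, alpha) == truth for a, v, truth in heldout)
        if correct > best_correct:
            best_alpha, best_correct = alpha, correct
    return best_alpha
```

In this sketch, `alpha = 1.0` relies on audio alone and `alpha = 0.0` on video alone; under acoustically degraded conditions a search of this kind would be expected to shift weight toward the video scores.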

Date

01 Aug 2001

Authors

IBM-affiliated at time of publication
