Speech reconstruction from mel frequency cepstral coefficients and pitch frequency

Dan Chazan; Ron Hoory; Gilad Cohen; Meir Zibulski

doi:10.1109/ICASSP.2000.861816

Publication

ICASSP 2000

Conference paper

Abstract

This paper presents a novel low-complexity, frequency-domain algorithm for reconstructing speech from the mel-frequency cepstral coefficients (MFCC) commonly used by speech recognition systems, together with the pitch frequency values. The reconstruction technique is based on the sinusoidal speech representation. A set of sine-wave frequencies is derived from the pitch frequency and voicing decisions, and a synthetic phase is then assigned to each sine wave. The sine-wave amplitudes are generated by sampling a linear combination of frequency-domain basis functions. The basis function gains are determined such that the mel-frequency binned spectrum of the reconstructed speech is similar to the mel-frequency binned spectrum obtained from the original MFCC vector by IDCT and antilog operations. This procedure yields natural-sounding, intelligible speech of good quality.
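Two of the steps named in the abstract can be illustrated with a minimal Python sketch, which is not the authors' implementation. It shows recovery of a mel-binned spectrum from an MFCC vector by IDCT and antilog, and sinusoidal synthesis of one voiced frame at harmonics of the pitch. All function names are hypothetical. The unnormalized DCT-II convention, the natural logarithm, and the zero-filling of truncated coefficients are assumptions, since the abstract does not specify them. The basis-function gain fitting and the synthetic phase model are not shown; amplitudes and phases are simply passed in.

```python
import math


def dct2(log_bins):
    """Unnormalized DCT-II of log mel-bin energies (assumed MFCC convention)."""
    K = len(log_bins)
    return [
        sum(x * math.cos(math.pi * n * (k + 0.5) / K) for k, x in enumerate(log_bins))
        for n in range(K)
    ]


def mfcc_to_mel_bins(mfcc, num_bins):
    """Invert the DCT-II above, then apply the antilog (exp).

    Coefficients beyond len(mfcc) are treated as zero, so a truncated
    MFCC vector yields a smoothed mel-binned spectrum.
    """
    used = min(len(mfcc), num_bins)
    bins = []
    for k in range(num_bins):
        acc = mfcc[0] / num_bins
        for n in range(1, used):
            acc += (2.0 / num_bins) * mfcc[n] * math.cos(
                math.pi * n * (k + 0.5) / num_bins
            )
        bins.append(math.exp(acc))
    return bins


def harmonic_frequencies(pitch_hz, sample_rate):
    """Sine-wave frequencies for a voiced frame: pitch multiples below Nyquist."""
    freqs = []
    h = 1
    while pitch_hz * h < sample_rate / 2.0:
        freqs.append(pitch_hz * h)
        h += 1
    return freqs


def synthesize_frame(freqs, amps, phases, sample_rate, num_samples):
    """Sum of sinusoids with the given frequencies, amplitudes, and phases."""
    return [
        sum(
            a * math.cos(2.0 * math.pi * f * t / sample_rate + p)
            for f, a, p in zip(freqs, amps, phases)
        )
        for t in range(num_samples)
    ]
```

With all coefficients retained, `mfcc_to_mel_bins(dct2(log_bins), len(log_bins))` returns the original bin values up to rounding. In the paper's setting only a truncated MFCC vector is available, so the recovered spectrum is a smoothed target to which the basis function gains are fitted.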

Date

05 Jun 2000

Authors

IBM-affiliated at time of publication
