Spoken Digit Recognition Using Vowel-Consonant Segmentation
Abstract
A procedure has been developed for recognition of spoken digits by means of digital computer simulation. Using power spectra computed at 10-msec intervals, each word is segmented into vowels and consonants. Each vowel is then classified into one of 11 categories by a multivariate statistical decision method operating on approximations of the measurements. Each consonant is classified into one of three categories by means of an empirically derived decision tree. Recognition is then performed by means of a dictionary search. When tested on a sample of 493 words spoken by 50 speakers, and with the internal dictionary adjusted for optimum results, 97% of the words were identified correctly. It appears that this procedure is more tolerant of interspeaker variations than previously reported procedures. © 1962, Acoustical Society of America. All rights reserved.
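
The final dictionary-search step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the dictionary contents nor the category labels, so the integer class codes, the entries in `TOY_DICTIONARY`, and the function name `lookup_word` are all hypothetical. The sketch assumes only that each word has already been reduced to a sequence of labeled segments, with vowel classes numbered 0 to 10 and consonant classes numbered 0 to 2, and that recognition succeeds when the observed sequence matches a stored pattern exactly.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Tuple

# A segment is ("V", vowel_class) or ("C", consonant_class).
# Class numbering is an assumption made for this illustration.
Segment = Tuple[str, int]
Pattern = Tuple[Segment, ...]

# Hypothetical dictionary: each word maps to the set of segment patterns
# accepted for it. These entries are invented, not taken from the paper.
TOY_DICTIONARY: Dict[str, FrozenSet[Pattern]] = {
    "six": frozenset({(("C", 0), ("V", 3), ("C", 0))}),
    "eight": frozenset({(("V", 5), ("C", 1)), (("V", 5),)}),
}


def lookup_word(
    segments: Sequence[Segment],
    dictionary: Dict[str, FrozenSet[Pattern]],
) -> Optional[str]:
    """Return the first word with a stored pattern equal to the observed
    segment sequence, or None when no pattern matches."""
    key = tuple(segments)
    for word, patterns in dictionary.items():
        if key in patterns:
            return word
    return None
```

Storing several patterns per word is one simple way to tolerate pronunciation differences between speakers, which is in the spirit of adjusting the internal dictionary as the abstract mentions; how the original dictionary was actually organized is not stated here.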