Augmented edit distance based temporal contiguity analysis for improved videotext recognition
Abstract
Videotext refers to text superimposed on video frames; it enables automatic content annotation and indexing of large video and image collections. Its importance is underscored by the recent adoption of a videotext-based Multimedia Description Scheme into the MPEG-7 standard. A survey of published work on automatic videotext extraction and recognition reveals that, despite recent interest, a reliable general-purpose video character recognition (VCR) system has yet to be developed. In developing a VCR system designed specifically to handle the low-resolution output of videotext extractors, we observed that the raw VCR accuracies obtained with various classifiers, including kernel-space methods such as SVMs, are inadequate for accurate video annotation. In this paper, we propose an intelligent postprocessing mechanism, grounded in general data characteristics of this domain, that improves VCR performance. We describe Temporal Contiguity Analysis, which operates independently of the underlying character recognition technique and works well even for moving videotext. This novel mechanism can easily be combined with VCR algorithms developed elsewhere to obtain the same performance gains. Experimental results on various video streams show notable improvements in recognition rates for our system, which incorporates an SVM-based recognition engine and temporal contiguity analysis.
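To illustrate the general idea behind exploiting temporal contiguity (a minimal sketch, not the paper's actual algorithm): videotext typically persists across many frames, so per-frame OCR outputs of the same text region can be aligned with edit distance and fused by per-character voting. The function names (`align`, `temporal_vote`) and the example strings are illustrative assumptions.

```python
# Sketch: fuse noisy per-frame OCR readings of the same videotext region.
# Assumption: readings come from temporally contiguous frames of one caption.
from collections import Counter

def align(ref, hyp):
    """Levenshtein DP alignment; returns, for each position of `ref`,
    the character of `hyp` aligned to it (or None if deleted)."""
    n, m = len(ref), len(hyp)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + cost)  # match/substitution
    aligned = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        cost = 0 if ref[i - 1] == hyp[j - 1] else 1
        if dp[i][j] == dp[i - 1][j - 1] + cost:
            aligned[i - 1] = hyp[j - 1]
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + 1:
            i -= 1  # ref character has no counterpart in hyp
        else:
            j -= 1  # extra hyp character, skip it
    return aligned

def temporal_vote(readings):
    """Majority-vote each character position across frame readings."""
    ref = max(readings, key=len)  # use the longest reading as the frame of reference
    votes = [Counter() for _ in ref]
    for r in readings:
        for pos, ch in enumerate(align(ref, r)):
            if ch is not None:
                votes[pos][ch] += 1
    return "".join(c.most_common(1)[0][0] for c in votes)

# Three of four frames read the caption correctly; voting recovers it.
readings = ["BREAKING NEWS", "BREAK1NG NEWS", "BREAKING NEW5", "BREAKING NEWS"]
print(temporal_vote(readings))  # -> BREAKING NEWS
```

The paper's augmented edit distance presumably refines this basic scheme; the sketch only shows why redundancy across contiguous frames can correct isolated per-frame recognition errors.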