Dominant and multiple motion estimation for video representation
Abstract
The major inhibitors of rapid access to on-line video data are the cost and management of capture and storage, the lack of high-speed real-time delivery, and the unavailability of intelligent content- and context-based search and indexing techniques. Solutions for capture, storage and delivery may be on the horizon; however, the lack of visual-content-based indexing of video and image information may still prevent this information modality from being used as widely as text or tabular data currently are. In this paper, we present techniques for compact visual representation of video data that will be useful for visual-content-based presentation and indexing. Video data comes in torrents (almost a megabyte every 30th of a second), but its content also changes relatively smoothly over time, which can be exploited. The techniques presented exploit the motion information across video frames to represent the underlying scene, as it is seen across many slowly varying frames in a video, in a compact visual form. Two classes of techniques are presented: (i) techniques based on dominant motion estimation, which exploit the fairly common situation in videos in which a mostly fixed background (scene) is imaged with or without independently moving objects, and (ii) simultaneous estimation of multiple motions and representation of motion video using layered representations.
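As a rough illustration of the idea behind class (i), the sketch below estimates a single dominant motion between two small grayscale frames. It is a toy stand-in and not the estimator used in this paper: it assumes the dominant motion is a pure integer translation, searches a small window exhaustively, and scores each candidate by the median absolute residual so that a minority of independently moving pixels does not affect the result. The function names `shift_frame` and `dominant_translation` and the parameter `max_shift` are hypothetical.

```python
from statistics import median


def shift_frame(frame, dx, dy, fill=0):
    """Return a copy of frame with its content moved by (dx, dy) pixels.

    Pixels that have no source inside the frame are set to `fill`.
    """
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = frame[sy][sx]
    return out


def dominant_translation(prev, curr, max_shift=3):
    """Estimate the integer shift (dx, dy) that aligns most of prev with curr.

    Assumption for this sketch: the dominant motion is a pure translation.
    Each candidate shift is scored by the median absolute difference over
    the overlapping region, so pixels belonging to a small independently
    moving object act as outliers and do not change the score.
    """
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            residuals = []
            for y in range(h):
                for x in range(w):
                    sx, sy = x - dx, y - dy
                    if 0 <= sx < w and 0 <= sy < h:
                        residuals.append(abs(curr[y][x] - prev[sy][sx]))
            score = median(residuals)
            # Keep the first candidate with the strictly lowest score.
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]
```

Once the dominant shift is known, frames can be warped into a common coordinate system and combined into a single background image, and pixels with large residuals can be flagged as candidate independently moving objects. A practical system would more likely use a parametric model (for example affine) with coarse-to-fine estimation rather than exhaustive integer search.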