(i) I will describe a new approach to alignment and integration of information across multiple video sequences, one that exploits all available spatio-temporal information within the sequences. By combining spatial and dynamic visual scene information within a single alignment framework, situations that are inherently ambiguous for traditional image-to-image alignment methods are uniquely resolved by sequence-to-sequence alignment. Moreover, coherent dynamic information can sometimes be used to align video sequences even in extreme cases where there is no common spatial information across the sequences (e.g., when the cameras' fields of view do not overlap, or when the cameras have different sensing modalities, such as an infrared (IR) camera and a visible-light camera).
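To make the idea of alignment from dynamics alone concrete, here is a toy sketch (not the method described above; the function name and setup are my own). It estimates the temporal offset between two sequences that need share no spatial content, by cross-correlating a purely dynamic signal: the per-frame mean intensity of each sequence.

```python
import numpy as np

def temporal_offset(seq_a, seq_b):
    """Estimate the temporal offset (in frames) between two sequences
    by cross-correlating their per-frame mean intensities.

    seq_a, seq_b: arrays of shape (T, H, W) holding grayscale frames.
    Returns the shift d such that seq_b[t] corresponds to seq_a[t + d].
    """
    # Collapse each frame to one global "dynamics" value; no spatial
    # correspondence between the two views is needed.
    sig_a = seq_a.reshape(len(seq_a), -1).mean(axis=1)
    sig_b = seq_b.reshape(len(seq_b), -1).mean(axis=1)
    sig_a = sig_a - sig_a.mean()
    sig_b = sig_b - sig_b.mean()
    corr = np.correlate(sig_a, sig_b, mode="full")
    # The index of the correlation peak gives the best temporal shift.
    return int(np.argmax(corr)) - (len(sig_b) - 1)
```

A shared scene event (a flash, a moving object changing overall brightness) shows up in both global signals, so the correlation peak recovers the frame offset even when the two cameras see disjoint parts of the scene.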
(ii) I will show how extended spatio-temporal scene representations can be used to view, browse, index into, edit, and enhance video data very efficiently. In raw video, the spatio-temporal scene information is implicitly and redundantly distributed across many frames, which makes access and manipulation of the data difficult. However, by analyzing the redundancy of visual information within the space-time data volume, the distributed scene information can be integrated into coherent and compact scene-based visual representations. These in turn lead to very efficient methods for accessing and manipulating the visual information in video data.
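As a minimal illustration of integrating redundant frames into one compact scene-based representation (a simplified sketch, not the representations described above; it assumes integer frame translations are already known), the following composites shifted frames into a single panorama, using a median over overlapping pixels:

```python
import numpy as np

def median_mosaic(frames, shifts):
    """Composite frames with known integer translations into one mosaic.

    frames: list of (H, W) grayscale arrays.
    shifts: list of (dy, dx) integer positions of each frame in the mosaic.
    Overlapping pixels are combined with a median, which also suppresses
    transient moving objects and keeps the static scene.
    """
    H, W = frames[0].shape
    ys = [dy for dy, _ in shifts]
    xs = [dx for _, dx in shifts]
    out_h = max(ys) - min(ys) + H
    out_w = max(xs) - min(xs) + W
    # Place each frame into its own layer; NaN marks uncovered pixels.
    stack = np.full((len(frames), out_h, out_w), np.nan)
    for layer, (frame, (dy, dx)) in enumerate(zip(frames, shifts)):
        y, x = dy - min(ys), dx - min(xs)
        stack[layer, y:y + H, x:x + W] = frame
    # Median over layers, ignoring uncovered (NaN) pixels.
    return np.nanmedian(stack, axis=0)
```

The resulting single image summarizes the scene content that was spread redundantly over many frames; indexing, editing, or enhancing can then be done once on the mosaic instead of frame by frame.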