Persistent Object Detection and Tracking Within and Across Multiple Cameras

Jinman Kang


Abstract

We address three key elements (i.e. detection, tracking and integration) of video-based activity analysis across various types of camera configurations. We present an approach to the detection and tracking of multiple objects from single or multiple uncalibrated cameras, and demonstrate a framework for combining multiple sources of information from various sensors.

The objective of detection is to extract moving regions in the scene. For video streams acquired by a moving camera, moving objects are detected by defining an adaptive background model that approximates the camera motion by an affine transformation. Detecting and compensating for parallax is needed for robust detection of independently moving objects in scenes with significant depth variation.
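
The idea can be sketched in a few lines of numpy: warp the previous frame by the estimated camera-motion affine transform, then threshold the difference against the current frame so only independently moving pixels survive. This is a toy illustration, not the thesis's implementation; the function names and the nearest-neighbor warp are assumptions, and in practice the affine parameters would be estimated from tracked background features.

```python
import numpy as np

def warp_affine(img, A):
    """Warp a grayscale image by the 2x3 affine matrix A
    (inverse mapping with nearest-neighbor sampling)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    Minv = np.linalg.inv(np.vstack([A, [0.0, 0.0, 1.0]]))
    src = Minv @ np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    sx = np.clip(np.rint(src[0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(src[1]).astype(int), 0, h - 1)
    return img[sy, sx].reshape(h, w)

def detect_moving(prev, curr, A, thresh=20):
    """Compensate camera motion with A, then threshold the
    frame difference to obtain a moving-object mask."""
    stabilized = warp_affine(prev, A)
    return np.abs(curr.astype(int) - stabilized.astype(int)) > thresh
```

Pixels whose residual difference exceeds the threshold after stabilization are the candidate moving regions; in the thesis these would additionally be screened for parallax.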

The objective of the tracking step is to establish correspondences between detected blobs for activity understanding. A Tensor Voting based tracking approach is proposed that reformulates the tracking of moving objects from a single camera as a perceptual grouping problem in 2D+t space.
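
To convey the grouping idea, here is a toy saliency measure in the spirit of tensor voting (a simplified sketch, not the thesis's voting fields): each detection, treated as a point (x, y, t), accumulates outer products of unit directions to its neighbors, weighted by a Gaussian of distance. Points lying on a smooth space-time trajectory receive an elongated (stick-like) tensor, so the eigenvalue gap lambda1 - lambda2 is high, while isolated false detections score near zero.

```python
import numpy as np

def curve_saliency(points, sigma=3.0):
    """Toy tensor-voting-style saliency for detections in (x, y, t).
    Returns lambda1 - lambda2 of each point's accumulated tensor:
    high for points on a smooth trajectory, low for outliers."""
    pts = np.asarray(points, float)
    sal = []
    for i, p in enumerate(pts):
        T = np.zeros((3, 3))
        for j, q in enumerate(pts):
            if i == j:
                continue
            d = q - p
            n = np.linalg.norm(d)
            u = d / n
            # distance-attenuated vote along the direction to the neighbor
            T += np.exp(-(n / sigma) ** 2) * np.outer(u, u)
        lam = np.sort(np.linalg.eigvalsh(T))[::-1]
        sal.append(lam[0] - lam[1])
    return np.array(sal)
```

Thresholding this saliency separates trajectory points from clutter; real tensor voting additionally encodes curvature in the vote and extracts the trajectories themselves.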

The use of multiple cameras requires integrating video streams from various vantage points and propagating information (i.e. detections and trajectories) across views. The registration of multiple stationary cameras is performed by a homography obtained from a ground reference plane. A spatio-temporal homography is proposed to solve both the registration of the trajectories of moving objects and the synchronization of the video streams. The integration of a stationary and a moving camera is performed by a combination of perspective and affine transformations. Registering non-overlapping views is performed by integrating a video stream, acquired by a moving camera, that spans the non-overlapping views. The proposed approach also integrates video streams acquired by multi-modal sensors (i.e. EO and IR).
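
A minimal sketch of what such a registration does, assuming the spatio-temporal mapping factors into a 3x3 ground-plane homography H for (x, y) and an affine time map t -> s*t + dt for synchronization (the thesis's spatio-temporal homography estimates these jointly; here they are applied separately for clarity, and the function names are my own):

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2D points through a 3x3 homography, with homogeneous divide."""
    pts = np.asarray(pts, float)
    ph = np.hstack([pts, np.ones((len(pts), 1))])
    out = ph @ H.T
    return out[:, :2] / out[:, 2:3]

def register_trajectory(H, traj, dt=0.0, s=1.0):
    """Warp a trajectory of (x, y, t) samples into another camera's
    ground-plane coordinates and synchronized time axis."""
    traj = np.asarray(traj, float)
    xy = apply_homography(H, traj[:, :2])
    t = s * traj[:, 2] + dt
    return np.column_stack([xy, t])
```

Once all trajectories live in one ground-plane frame with a common clock, correspondences across views reduce to matching nearby space-time curves.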

The tracking problem is addressed by separately modeling the motion and the appearance of the moving objects using two probabilistic models. A 2D invariant appearance model based on multiple color and edge distributions is proposed for describing the object being tracked. The motion model is obtained using a Kalman Filter (KF) process, which predicts the position of the moving object. Tracking of moving objects is performed by maximizing a joint probability model. We propose a spatio-temporal joint probability data association filter (JPDAF) to simultaneously integrate appearance, 2D and 3D information. It improves the accuracy of the tracking and allows us to track objects through occlusions of short duration.
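
The core of this data association can be sketched as follows: predict the object's position with a KF, score each candidate detection by a Gaussian motion likelihood around the prediction times an appearance similarity (Bhattacharyya coefficient between normalized color histograms), and keep the candidate maximizing the product. This is a single-hypothesis toy version under a constant-velocity model, not the full spatio-temporal JPDAF, and all names here are assumptions.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Kalman prediction: propagate state mean and covariance one step."""
    return F @ x, F @ P @ F.T + Q

def bhattacharyya(h1, h2):
    """Appearance similarity between two normalized histograms (1 = identical)."""
    return np.sum(np.sqrt(h1 * h2))

def best_match(x, P, F, Q, candidates, hists, ref_hist):
    """Index of the detection maximizing motion likelihood x appearance similarity."""
    xp, Pp = kf_predict(x, P, F, Q)
    S = Pp[:2, :2]          # position block of predicted covariance (toy measurement model)
    Sinv = np.linalg.inv(S)
    scores = []
    for c, h in zip(candidates, hists):
        d = np.asarray(c, float) - xp[:2]
        motion = np.exp(-0.5 * d @ Sinv @ d)   # Gaussian gate around the prediction
        scores.append(motion * bhattacharyya(np.asarray(h), np.asarray(ref_hist)))
    return int(np.argmax(scores))
```

A JPDAF generalizes this by weighting all gated candidates jointly across targets instead of committing to a single best match, which is what sustains tracks through short occlusions.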


Maintained by Philippos Mordohai