Part based Object Detection, Segmentation, and Tracking by Boosting Simple
Shape Feature based Weak Classifiers
Abstract
Detection, segmentation, and tracking of objects of a known class is
a fundamental problem in computer vision. For this task, we need to
first detect the objects of interest and segment them from the
background, and then track them across different frames while
maintaining the correct identities. The two principal sources of
difficulty in performing this task are: a) change in appearance of
the objects with viewpoint, illumination, and possible articulation,
and b) partial occlusion of objects of interest by other objects.
The objective of this work is to develop a system to automatically
detect, segment, and track multiple, possibly partially occluded
objects of a known class from a single camera. We take pedestrians,
which are important for many real-life applications, as the main
class of interest to demonstrate our approach. However, some
components of the method are also applied to the class of cars to
show the generality of our approach.
We represent an object as a hierarchy of parts. Using a part-based
model enables us to detect and track objects even when some of their
parts are not visible. We develop a new type of shape-oriented
feature, called the edgelet, to capture silhouette-based patterns.
We integrate the edgelet features with other existing shape
features, and learn tree-structured classifiers for object parts.
Part detection responses are combined jointly so that the spatial
relations, including possible occlusions, between multiple objects
are analyzed. For specific applications, an unsupervised, online
learning algorithm is used to improve the performance of the
detectors by adapting them to the particular environment. An object
segmentor, whose output is a pixel-level figure-ground segmentation,
is learned from local shape features. The object detection
and segmentation results provide the observations for tracking.
Trajectory initialization and termination are both automatic and
rely on the detection results. Two complementary techniques, data
association and mean-shift, are used to track an object.
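As a rough illustration of the mean-shift component only, the sketch
below moves a rectangular window toward the local mode of a 2-D weight
map. It is a generic mean-shift iteration under stated assumptions, not
the tracker described in this work: the weight map (for example, a
per-pixel object likelihood), the flat rectangular kernel, and the
names mean_shift, half_size, max_iter, and eps are all hypothetical.

```python
import math


def mean_shift(weights, center, half_size, max_iter=20, eps=0.5):
    """Shift a rectangular window toward the local mode of a weight map.

    weights   -- 2-D list of non-negative numbers, indexed [row][col]
                 (assumed input, e.g. a per-pixel object likelihood)
    center    -- (row, col) starting position of the window
    half_size -- (half_height, half_width) of the window in pixels
    """
    rows, cols = len(weights), len(weights[0])
    cy, cx = float(center[0]), float(center[1])
    hh, hw = half_size
    for _ in range(max_iter):
        # Clip the window to the image bounds.
        y0, y1 = max(round(cy) - hh, 0), min(round(cy) + hh + 1, rows)
        x0, x1 = max(round(cx) - hw, 0), min(round(cx) + hw + 1, cols)
        total = sum_y = sum_x = 0.0
        for y in range(y0, y1):
            for x in range(x0, x1):
                w = weights[y][x]
                total += w
                sum_y += w * y
                sum_x += w * x
        if total <= 0:
            break  # no evidence inside the window; keep the estimate
        # The weighted centroid of the window is the new center.
        new_cy, new_cx = sum_y / total, sum_x / total
        shift = math.hypot(new_cy - cy, new_cx - cx)
        cy, cx = new_cy, new_cx
        if shift < eps:
            break  # converged: the window barely moved
    return cy, cx
```

In a tracker of this kind, such an iteration would typically be started
from the object's position in the previous frame, and could serve as a
fallback when no detection response is associated with the trajectory.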
An automatic object detection and tracking system has been
implemented and evaluated on a number of images and videos. The
experimental results show that our method achieves
state-of-the-art performance.