University of Southern California
Object Tracking and Event Understanding
Video Analysis and Content Extraction
Project Description
We propose to develop tools for automated analysis of video
sequences. These will include methods for detecting and tracking
moving objects, resulting in a structured description of the
video. This representation will then be used to extract content
that supports understanding of the events occurring in the scene.
Such automated tools will be essential for analysts in the
intelligence community to cope with and make effective use of the
vast quantities of video data that are becoming increasingly
available.
A key issue in content extraction from videos is the availability and
use of spatial and mission context. Such context is more
likely to be available for observations at a specific site (such as
monitoring for security applications) but will be more difficult to
obtain for videos where little is known about the scene of activity or
where only generic context (for example, that the activity takes place in
some hotel lobby) is available. Our approach has several innovative and
unique characteristics:
- Detection and tracking of moving objects in video streams through the
characterization of each pixel’s path in the 2D + t space. Our
approach, based on the characterization of beams of paths, provides
robust layered segmentation of moving objects at various scales and
tracks them by studying the properties of the beams of
trajectories. It generalizes current methods to deformable or
articulated objects such as humans in motion.
- Structured representation of videos that captures the spatial and
temporal constructs produced by detection and tracking. A structured
video eliminates the disadvantages of the frame-based representation by
providing a description based on moving objects. The atomic elements of
the representation are the moving objects, which therefore provide
adequate information to support both query processing and reuse of
information.
- Understanding of events characterized by the interactions among the humans
and the objects in the environment. We propose a generic, hierarchical
representation for understanding events in a scene. Both
“single” and “multiple” threaded events are
considered. The technique bridges the gap between image-oriented
structured representations and higher-level semantic inferences. Our
method will handle uncertainties in the computations rigorously by
using Bayesian networks and stochastic finite automata. Our
representation will also allow new activity descriptions to be entered
easily through an Event Representation Language.
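As an illustration of the object-based structured representation described in the second item, the following is a minimal sketch only, not the project's actual data format. The class names `MovingObject` and `StructuredVideo`, the `label` field, and both query methods are hypothetical and introduced here for illustration. The sketch assumes a tracker has already produced one image position per object per frame.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MovingObject:
    """One tracked object: the atomic element of the description (hypothetical)."""
    object_id: int
    label: str  # assumed category name, e.g. "person"
    # frame index -> (x, y) image position of the object centroid
    positions: Dict[int, Tuple[float, float]] = field(default_factory=dict)


@dataclass
class StructuredVideo:
    """A video described by its moving objects instead of by its frames."""
    objects: List[MovingObject] = field(default_factory=list)

    def visible_between(self, start: int, end: int) -> List[int]:
        """Ids of objects observed in at least one frame of [start, end]."""
        return [o.object_id for o in self.objects
                if any(start <= f <= end for f in o.positions)]

    def co_occurring(self, a: int, b: int) -> List[int]:
        """Sorted frame indices in which objects a and b are both observed."""
        by_id = {o.object_id: o for o in self.objects}
        return sorted(set(by_id[a].positions) & set(by_id[b].positions))


# Toy description: a person seen in frames 0-2 and a car seen in frames 2-3.
video = StructuredVideo([
    MovingObject(1, "person", {0: (10.0, 50.0), 1: (12.0, 50.0), 2: (14.0, 51.0)}),
    MovingObject(2, "car", {2: (80.0, 40.0), 3: (75.0, 40.0)}),
])

print(video.visible_between(0, 1))
print(video.co_occurring(1, 2))
```

Because queries run over object tracks, a question such as "when were these two objects in view together" needs no per-frame image processing at query time, which is the kind of query support and reuse the structured representation is meant to provide.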
We expect that our detection and tracking methods will work on a wide variety
of videos with different content. Our event understanding methods will
initially be limited to situations where spatial and task context are readily
available. Extensions to cases where the context is available only in very
generic form, or not at all, will be addressed in future phases of this
research.
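To make the role of spatial context and uncertainty handling concrete, here is a minimal sketch of recognizing one event with a stochastic finite automaton. It is an assumption-laden illustration, not the project's method or its Event Representation Language. The event ("an object approaches and then enters a known zone"), the state names, the transition probabilities, and the likelihood values are all hypothetical. The input is assumed to be a per-frame signed distance from a tracked object to the zone, which is only computable when the site layout is known.

```python
from typing import Dict, List

STATES = ["far", "near", "inside"]

# Left-to-right transition probabilities between event stages (illustrative values).
TRANSITIONS: Dict[str, Dict[str, float]] = {
    "far": {"far": 0.7, "near": 0.3},
    "near": {"near": 0.6, "inside": 0.4},
    "inside": {"inside": 1.0},
}


def observation_likelihood(state: str, distance: float) -> float:
    """P(measurement | state), using crude hand-picked distance bands."""
    matches = {
        "far": distance > 5.0,
        "near": 0.0 < distance <= 5.0,
        "inside": distance <= 0.0,
    }
    # Noisy tracking: a mismatching measurement is unlikely but not impossible.
    return 0.8 if matches[state] else 0.1


def event_probability(distances: List[float]) -> float:
    """Probability that the automaton is in its final state after all measurements."""
    belief = {"far": 1.0, "near": 0.0, "inside": 0.0}
    for d in distances:
        # Predict: push the current belief through the transition model.
        predicted = {s: 0.0 for s in STATES}
        for src, p in belief.items():
            for dst, t in TRANSITIONS[src].items():
                predicted[dst] += p * t
        # Update: weight each state by the measurement likelihood, then normalize.
        weighted = {s: predicted[s] * observation_likelihood(s, d) for s in STATES}
        total = sum(weighted.values())
        belief = {s: w / total for s, w in weighted.items()}
    return belief["inside"]


entering = [9.0, 7.0, 3.0, 1.0, -1.0, -2.0]  # object walks into the zone
loitering = [9.0, 8.0, 9.0, 8.0]             # object stays far away

print(event_probability(entering))
print(event_probability(loitering))
```

The recursive predict-and-update step keeps a probability for every stage of the event instead of a hard decision, so a single noisy measurement does not derail recognition. Without the zone geometry there would be no distance measurement to feed in, which is one reason event understanding is harder when context is generic or missing.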
People
Research Topics
Publications
- S. Hongeng and R. Nevatia. Multi-agent event
recognition. In Proceedings of the IEEE
International Conference on Computer Vision, 2001.
- T. Zhao, R. Nevatia and F. Lv. Segmentation and Tracking of
Multiple Humans in Complex Situations.
In Proceedings of the Conference on Computer Vision and Pattern
Recognition, Kauai, December 2001.
- Fengjun Lv, Tao Zhao and Ram Nevatia. Self-Calibration of a
camera from video of a walking human.
In International Conference on Pattern Recognition, 2002.
- Tao Zhao and Ram Nevatia. 3D Tracking of Human Locomotion: A
Tracking as Recognition Approach.
In International Conference on Pattern Recognition, 2002.
- Elaine Kang, Isaac Cohen and Gerard Medioni. Robust Affine
Motion Estimation in Joint Image Space using Tensor Voting.
In International Conference on Pattern Recognition, 2002.
- Jinman Kang, Isaac Cohen and Gerard Medioni. Continuous
multi-view tracking using tensor voting. In IEEE Workshop on
Motion and Video Computing, Orlando, Florida, December 2002.
- Tao Zhao and Ram Nevatia. Stochastic Human Segmentation
from a static camera. In IEEE Workshop on
Motion and Video Computing, Orlando, Florida, December 2002.
Data and Formats