We first describe a system for model-based segmentation and tracking that uses a simple shape model. Segmentation relies on direct image features. Multi-human tracking is factored into matching people one at a time, in the depth order inferred from the scene geometry. The result is a real-time system that handles both temporary severe occlusion and persistent occlusion within small groups of people.
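As an illustration only, the depth-ordered matching idea can be sketched as follows. The sketch assumes, hypothetically, that a person whose feet appear lower in the image (larger y) is closer to the camera, and that each person is modeled by an axis-aligned box; the names `depth_order`, `assign_pixels`, `foot_y`, and `box` are invented for this example and are not taken from the system itself.

```python
def depth_order(people):
    """Sort people front to back.

    Assumption (not from the source): larger foot_y means closer to the camera.
    """
    return sorted(people, key=lambda p: p["foot_y"], reverse=True)


def assign_pixels(people, foreground):
    """Match people one at a time, closest first.

    Each person claims the still-unclaimed foreground pixels inside their box,
    so an occluded person only receives pixels not explained by someone in front.
    """
    unclaimed = set(foreground)
    assignment = {}
    for person in depth_order(people):
        x0, y0, x1, y1 = person["box"]  # half-open box: [x0, x1) x [y0, y1)
        claimed = {(x, y) for (x, y) in unclaimed
                   if x0 <= x < x1 and y0 <= y < y1}
        assignment[person["name"]] = claimed
        unclaimed -= claimed
    return assignment
```

Processing people front to back means each match is a single-object problem, which is what keeps this kind of scheme cheap enough for real-time use.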
This simple approach may become less effective as the number of people and the amount of occlusion increase. We therefore formulate the model-based segmentation and tracking problem within a Bayesian framework. The optimal solution is defined explicitly in terms of the Bayesian posterior probability in a joint-object space. The solution in this complex, high-dimensional space is computed by a Markov chain Monte Carlo (MCMC)-based method. The computational approach also takes advantage of domain knowledge, in the form of importance proposal probabilities, to direct the Markov chain intelligently and obtain significantly faster convergence. The new formulation is more general and also applies to larger groups of people moving together.
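A minimal sketch of the general idea, assuming a Metropolis-Hastings sampler whose proposal mixes a local random walk with an independent "importance" proposal centered on a data-driven hint. The one-dimensional toy posterior, the hint location, and all parameter values are hypothetical stand-ins chosen for illustration; the actual state space, posterior, and proposals of the method are not reproduced here.

```python
import math
import random


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_posterior(x):
    # Toy unnormalized log-posterior (hypothetical): a single peak at x = 5.
    return -0.5 * ((x - 5.0) / 0.5) ** 2


def proposal_pdf(x_new, x_old, w, hint_mean, hint_std, step):
    # Density of the mixture proposal: importance proposal + random walk.
    return (w * normal_pdf(x_new, hint_mean, hint_std)
            + (1.0 - w) * normal_pdf(x_new, x_old, step))


def metropolis_hastings(n_steps, x0, rng, w=0.3,
                        hint_mean=4.5, hint_std=1.0, step=0.3):
    x = x0
    samples = []
    for _ in range(n_steps):
        # Draw from the mixture: a data-driven jump or a small local move.
        if rng.random() < w:
            x_new = rng.gauss(hint_mean, hint_std)
        else:
            x_new = rng.gauss(x, step)
        # Metropolis-Hastings ratio with the full mixture density both ways.
        forward = proposal_pdf(x_new, x, w, hint_mean, hint_std, step)
        backward = proposal_pdf(x, x_new, w, hint_mean, hint_std, step)
        log_alpha = (log_posterior(x_new) - log_posterior(x)
                     + math.log(backward) - math.log(forward))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = x_new
        samples.append(x)
    return samples
```

The importance component lets the chain jump directly to regions suggested by the data rather than reaching them through many small random-walk steps, which is the intuition behind the faster convergence.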
We also propose a "tracking as recognition" approach in which body postures are estimated by recognizing the motion in a locomotion model. This approach yields robust performance on very challenging data.