Estimating the mode (walking, running, or standing) and phase of human locomotion is important for video understanding. We present a new "tracking as recognition" approach. A hierarchical finite state machine constructed from 3D motion capture data serves as the prior motion model, and motion templates serve as the observation model. Robustness comes from performing inference in the prior motion model, which resolves short-term ambiguities in the observations that can cause a regular tracking formulation to fail. Experiments show promising results on some difficult sequences.
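To illustrate how inference in a prior motion model can override a momentarily misleading observation, here is a minimal sketch. It is not the authors' implementation: it uses a flat state machine decoded with the standard Viterbi algorithm as a stand-in for the hierarchical model, and the state names, transition probabilities, and per-frame likelihood values are all hypothetical.

```python
import math

# Hypothetical (mode, phase) states; a real model would have many more phases.
STATES = ["stand", "walk_0", "walk_1", "run_0", "run_1"]

# Assumed transition probabilities of a toy state machine (not from the paper).
# Missing entries mean the transition is not allowed.
TRANS = {
    "stand":  {"stand": 0.8, "walk_0": 0.1, "run_0": 0.1},
    "walk_0": {"walk_0": 0.3, "walk_1": 0.6, "stand": 0.1},
    "walk_1": {"walk_1": 0.3, "walk_0": 0.6, "run_0": 0.1},
    "run_0":  {"run_0": 0.3, "run_1": 0.6, "stand": 0.1},
    "run_1":  {"run_1": 0.3, "run_0": 0.6, "walk_0": 0.1},
}


def _log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def frame(**overrides):
    """Build per-state likelihoods for one frame (stand-in for template matching)."""
    scores = {s: 0.05 for s in STATES}
    scores.update(overrides)
    return scores


def viterbi(obs):
    """Return the most likely state sequence given per-frame state likelihoods."""
    # Uniform prior over the initial state.
    delta = {s: _log(1.0 / len(STATES)) + _log(obs[0][s]) for s in STATES}
    backpointers = []
    for t in range(1, len(obs)):
        new_delta, ptr = {}, {}
        for s in STATES:
            # Best predecessor of state s under the transition model.
            best_prev = max(STATES, key=lambda p: delta[p] + _log(TRANS[p].get(s, 0.0)))
            ptr[s] = best_prev
            new_delta[s] = (delta[best_prev]
                            + _log(TRANS[best_prev].get(s, 0.0))
                            + _log(obs[t][s]))
        delta = new_delta
        backpointers.append(ptr)
    # Backtrack from the best final state.
    path = [max(STATES, key=lambda s: delta[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def per_frame_argmax(obs):
    """Baseline: pick the best-scoring state independently in each frame."""
    return [max(STATES, key=lambda s: f[s]) for f in obs]


if __name__ == "__main__":
    # Third frame is ambiguous: "run_1" scores slightly higher than "walk_0".
    obs = [frame(walk_0=0.8), frame(walk_1=0.8),
           frame(walk_0=0.40, run_1=0.45), frame(walk_1=0.8)]
    print(per_frame_argmax(obs))
    print(viterbi(obs))
```

In this toy sequence the frame-by-frame baseline labels the third frame `run_1`, while the decoded path keeps `walk_0`, because the assumed state machine has no transition from `walk_1` into `run_1`.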
[Figure: Result B]
[Figure: Pose rendered from new viewpoints]