Inference of Surfaces, Curves and Junctions from One, Two or More Images using Tensor Voting

Philippos Mordohai


Abstract

This work extends the tensor voting framework for perceptual organization and addresses the problem of inferring descriptions from images. The most important contributions are the addition of first-order representation and voting, a full evaluation of all steps associated with stereo reconstruction, and a novel contour completion mechanism. The desired descriptions are in terms of surfaces, curves and junctions. The input is in the form of images, from which the tokens are derived. Computer vision problems are then formulated as the organization of these tokens into salient perceptual structures.

We propose methods for generating tokens from images and show how they can be naturally incorporated as modules in the framework. If two or more images are available, the tokens correspond to pixel matches which can be reconstructed in a three-dimensional metric or projective space. Scene surfaces generate salient surfaces in this space, and thus can be inferred as salient, coherent groupings of tokens. We have achieved results in stereo that compare favorably with a large class of methodologies on standardized datasets.

The fact that such a large number of diverse algorithms achieve very similar performance confirms our belief that an inherent limit has been reached on the amount of information that can be derived from binocular cues alone. To advance the state of the art in image interpretation, one should investigate the monocular case more thoroughly. The final part of this research addresses the analysis of single images, leading to the inference of considerably richer descriptions than before. To facilitate the inference of structure terminations, such as the endpoints of curves, and the labeling of junctions, first-order information has been integrated into the tensor voting framework.


Maintained by Philippos Mordohai