Integration Architecture
Our goal is not to develop specific algorithms. Most computer vision algorithms work very well in specific situations but are prone to failure in others, so a system that depends on any single one of them is fragile and unreliable. To solve this problem, we propose a system that ties several of these fragile algorithms together at the symbolic level. This allows the construction of a world view that can recover from erroneous results produced by the individual algorithms.
Stevi's code consists primarily of three separate modules. These are:
- Self-localization using the omnidirectional camera
- Detection and tracking of people using the stereo camera at long range
- Gesture recognition using the stereo camera at short range
Stevi's overall design can easily be expressed using SAI notation:

Within each of these objects, all data is converted to symbolic form as early as possible. In the modules that use stereo vision, stereo fusion of the input streams is delayed as long as possible so that it takes place on the symbolic elements.
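To make the idea of late stereo fusion concrete, the following is a minimal sketch and not Stevi's actual code. It assumes that each camera image has already been reduced independently to a list of symbolic elements (here a hypothetical `Blob` with a centroid and an area), that the cameras are rectified so matching elements lie on nearly the same image row, and that depth follows the standard pinhole relation Z = f * B / d. The names `Blob`, `fuse_stereo`, `focal_px`, `baseline_m`, and `max_row_diff` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blob:
    """Hypothetical symbolic element extracted from a single camera image."""
    x: float     # centroid column, in pixels
    y: float     # centroid row, in pixels
    area: float  # blob size, in pixels


def fuse_stereo(left, right, focal_px, baseline_m, max_row_diff=2.0):
    """Match left and right blobs and triangulate depth from the symbols alone.

    Assumes rectified cameras. Returns a list of (left_blob, right_blob,
    depth_m) tuples. Matching is greedy: each left blob takes the unused
    right blob on a similar row whose area is closest to its own.
    """
    matches = []
    used = set()
    for lb in left:
        best_index = None
        best_cost = None
        for i, rb in enumerate(right):
            if i in used:
                continue
            if abs(lb.y - rb.y) > max_row_diff:
                continue  # not on (nearly) the same image row
            if lb.x - rb.x <= 0:
                continue  # non-positive disparity is not physically valid
            cost = abs(lb.area - rb.area)
            if best_cost is None or cost < best_cost:
                best_index, best_cost = i, cost
        if best_index is not None:
            used.add(best_index)
            rb = right[best_index]
            disparity = lb.x - rb.x
            depth_m = focal_px * baseline_m / disparity
            matches.append((lb, rb, depth_m))
    return matches


if __name__ == "__main__":
    left_blobs = [Blob(320.0, 100.0, 50.0), Blob(200.0, 240.0, 400.0)]
    right_blobs = [Blob(180.0, 241.0, 390.0), Blob(300.0, 100.0, 52.0)]
    for lb, rb, z in fuse_stereo(left_blobs, right_blobs, 500.0, 0.1):
        print(f"left x={lb.x:.0f}  right x={rb.x:.0f}  depth={z:.2f} m")
```

Because fusion here operates on a handful of symbols per frame rather than on every pixel, a missed or spurious detection in one camera simply produces no match instead of corrupting a dense disparity map, which is in the spirit of the recovery behavior described above.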


