Detection of Changes in Man-Made Structures from Aerial Views

Andres Huertas and Ramakant Nevatia*

Institute for Robotics and Intelligent Systems

University of Southern California

Los Angeles, California 90089-0273



An important application of machine vision is to monitor a scene over a period of time and report changes. We have developed a change detection system for this purpose in support of the RADIUS program. The process consists of several steps: model-to-image registration brings the site model and the image into close correspondence; model validation confirms the presence of model objects in the image; preliminary change detection resolves matching problems and provides an early report of possibly changed structures. Currently, the system is able to detect changes such as missing buildings or changes in building dimensions.

Keywords: Registration, matching, model-based change detection, photo-interpretation, model updating.

1  Introduction

This paper deals with change detection in a complex and cluttered environment: aerial views of natural and cultural features. The volume of imagery acquired from aerial and space vehicles is vast and calls for automated and semi-automated systems, with applications in photo-interpretation, mapping, planning, surveillance and guidance. The task of change detection in this context consists of finding significant differences between the new data and a model derived from the older data. These differences may be due to a multitude of sources, many of which are irrelevant. The significance of the differences may be task specific, although in most cases man-made changes are more important than those caused by factors such as seasonal, illumination and viewpoint changes. In this work we are only interested in image changes that arise from changes in the site itself rather than from changes in imaging conditions.

To deal with the development and integration of machine vision systems to support the task of image analysts, the RADIUS program [Gerson and Wood, 1994] relies on the concept of model-supported exploitation (MSE). In this concept a site model is constructed from a number of images of a site [Huang et al., 1994]. The model consists of terrain elevation, surface features such as roads and rivers, functional areas and 3D building structures. Significant progress in the development of techniques for interactive and automated construction of building models has been reported in the literature [Fua, 1996, Chung and Nevatia, 1992, Lin et al., 1994, Huertas and Nevatia, 1988, Irving and McKeown, 1989, Mohan and Nevatia, 1989, Jaynes et al., 1994, McGlone and Shufelt, 1994]. Figure 1 shows an example of a site model corresponding to a modelboard site constructed for RADIUS experiments.
Extracted pic [1] Figure 1 A typical site model.

Change detection involves comparing a new image (or a collection of images) of a site to the information associated with that site in a site folder, which consists of a site model, one or more previous images, and the results of analyses on these images. This problem is similar in some ways to industrial inspection (see for example [Khalaj et al., 1992]). However, in our case site models are very incomplete, illumination is uncontrolled, and many image changes are not relevant to changes in the site. In this paper we concentrate on building structures only. For related work dealing with mobile objects using techniques similar to those reported here, see [Huertas et al., 1995b]. In all cases we assume that a site model of suitable resolution and complexity is available.

Figure 2 shows a flowchart of the complete change detection process. Its major steps are site model registration, model validation, change detection and model updating; the following sections describe our work on each of these tasks.

Extracted pic [2] Figure 2 The change detection process.

We show examples that help illustrate the current capabilities, using imagery supplied to us by the RADIUS program. The system is written in LISP and runs under the RADIUS Common Development Environment (RCDE) [Strat et al., 1992].

2  Site Model to Image Registration

The first step is to register the site model to an image. Our system currently has the capability to correct for translational errors. Our registration method [Huertas et al., 1995a] consists of two tasks: estimating the global registration offset by matching, and establishing model-to-image feature correspondences.

The first task is carried out by a matching technique [Medioni et al., 1991] that uses the visible line segments derived from the projected site model objects, and line segments approximated from the edges extracted [Canny, 1986] from the image. Candidate segment matches contribute a "vote", derived from the attributes and positions of the segments, into a parameter space accumulator. The space denotes translational offsets, and the main peak gives the global misregistration error.
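The voting scheme above can be sketched as follows in Python (the system itself is written in LISP). This is a minimal illustration, not the published matcher: vote weighting by segment attributes such as length and contrast is omitted, and the segment representation is an assumption.

```python
import numpy as np

def estimate_offset(model_segments, image_segments, search=50):
    """Accumulate votes over candidate (dx, dy) translations.

    Each segment is (x1, y1, x2, y2). For every model/image pair with
    similar orientation, the midpoint displacement casts one vote; the
    accumulator peak gives the global misregistration. Attribute-based
    vote weighting and orientation wrap-around are omitted for brevity.
    """
    acc = np.zeros((2 * search + 1, 2 * search + 1))
    for mx1, my1, mx2, my2 in model_segments:
        m_ang = np.arctan2(my2 - my1, mx2 - mx1) % np.pi
        m_mid = ((mx1 + mx2) / 2, (my1 + my2) / 2)
        for ix1, iy1, ix2, iy2 in image_segments:
            i_ang = np.arctan2(iy2 - iy1, ix2 - ix1) % np.pi
            if abs(m_ang - i_ang) > 0.1:        # orientations must agree
                continue
            dx = (ix1 + ix2) / 2 - m_mid[0]
            dy = (iy1 + iy2) / 2 - m_mid[1]
            if abs(dx) <= search and abs(dy) <= search:
                acc[int(round(dy)) + search, int(round(dx)) + search] += 1
    iy, ix = np.unravel_index(acc.argmax(), acc.shape)
    return ix - search, iy - search   # (dx, dy) at the peak

# A model square whose image counterpart is offset by (12, -7):
model = [(0, 0, 10, 0), (10, 0, 10, 10), (10, 10, 0, 10), (0, 10, 0, 0)]
image = [(x1 + 12, y1 - 7, x2 + 12, y2 - 7) for x1, y1, x2, y2 in model]
print(estimate_offset(model, image))   # (12, -7)
```

Even though each model segment also votes for a spurious offset against the parallel segment on the opposite side of the building, the correct offset accumulates the most votes and dominates the peak.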

The second task uses the recovered registration offset in a second pass of the matcher to select the matching pairs (model segment, image segment) that place model segments in correspondence with image segments. Since the model segments are grouped into objects by the model itself, we also obtain a correspondence between model objects and image features. For additional details see [Huertas et al., 1995a].
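The second pass can be sketched as follows (a simplified illustration in Python; the system is in LISP). The tolerances and the segment representation are assumptions, and the overlap test along the segment axis used by the full matcher is omitted.

```python
import math

def correspond(model_segments, image_segments, offset, tol=3.0, ang_tol=0.1):
    """Second matcher pass: after shifting the projected model by the
    recovered offset, pair each model segment with every image segment
    that is nearly parallel and lies within a perpendicular-distance
    tolerance of it. Segments are (x1, y1, x2, y2)."""
    dx, dy = offset
    pairs = []
    for mi, (x1, y1, x2, y2) in enumerate(model_segments):
        x1, y1, x2, y2 = x1 + dx, y1 + dy, x2 + dx, y2 + dy
        ang = math.atan2(y2 - y1, x2 - x1) % math.pi
        length = math.hypot(x2 - x1, y2 - y1)
        nx, ny = (y1 - y2) / length, (x2 - x1) / length   # unit normal
        for ii, (a1, b1, a2, b2) in enumerate(image_segments):
            i_ang = math.atan2(b2 - b1, a2 - a1) % math.pi
            diff = abs(ang - i_ang)
            if min(diff, math.pi - diff) > ang_tol:
                continue
            # perpendicular distance of the image segment's midpoint
            d = abs(nx * ((a1 + a2) / 2 - x1) + ny * ((b1 + b2) / 2 - y1))
            if d <= tol:
                pairs.append((mi, ii))
    return pairs

print(correspond([(0, 0, 10, 0)], [(12, -7, 22, -7)], (12, -7)))   # [(0, 0)]
```

Because the pairs are indexed by model segment, and model segments are grouped into objects, object-level correspondences follow directly from this output.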

3  Site Model Validation

The analysis of changes of permanent structures, such as buildings, proceeds at the object level. The purpose of model validation is to verify whether model objects are present in the image and whether the objects remain unchanged. The system uses the object correspondences established in the registration step to calculate and assign a confidence value (see below) to each object in the model.

3.1  Missing Features

To validate a model accurately we need to study the sources of missing model-to-image correspondences. Some missing image features are due to viewing conditions such as self-occlusion, occlusion by other objects, self shadows and shadows cast by nearby objects. These, however, can be predicted and explained from the site model itself. Other missing correspondences may be due to over- or under-modeling of objects (Figure 3 and Figure 4) and are more difficult to predict from the model. The confidence associated with over- or under-modeled objects may thus be underestimated or difficult to calculate.
Extracted pic [3] Figure 3 The thick lines in the building model (b) do not correspond to actual physical boundaries.
Extracted pic [4] Figure 4 Some buildings may be under-modeled.

Over-modeling is due to the use of modeling primitives that introduce elements that do not correspond to actual physical elements or boundaries. Figure 3 shows a building that has been modeled by two rectangular parallelepipeds. The thick lines represent portions of the model elements that do not correspond to physical boundaries. These cannot be matched, and the missing correspondences result in lower confidence.

Figure 4 shows two buildings that are likely to be under-modeled (i.e. modeled by simpler shapes) due to their complexity. These require additional search strategies designed to look for additional and possibly fragmented evidence, such as a large number of vertical or horizontal edge elements. Our system is not currently capable of determining these conditions, and thus the confidence values may be underestimated. It is assumed that some of these conditions may require annotations in the site model to help the system process these appropriately.

3.2  Ambiguities in Matching

There are several ambiguities inherent to the matching process that need to be resolved during validation and change detection. The system currently deals with two of these. The first deals with multiple or missing matches between the site model features and the image. The second deals with coincidental alignments due to viewpoint, illumination direction, or to adjacent structures.

3.2.1  Multiple Matches

The model-to-image matcher in the system matches each model element to one or more image elements. This is necessary to deal with expected fragmentation of the image elements. Fragmentation is due to inadequacies in the feature extraction process and to actual image content, such as occluding trees, road boundaries and shadows. This may result in one-to-many correspondences (Figure 5), possibly involving more than one object. If a model segment matches multiple collinear image segments, all the image segments are considered to represent image support. If a model segment matches multiple parallel image segments, the overlap among these is considered to represent image support.
Extracted pic [5] Figure 5 One to many correspondences.
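The support accounting for one-to-many matches can be sketched as follows (Python, for illustration; the interval representation projected onto the model segment is an assumption). Collinear fragments each contribute their length, while overlapping extents are counted only once.

```python
def image_support(model_len, matched_intervals):
    """Support for one model segment matched to several image segments.

    Each matched image segment is projected onto the model segment's
    axis and given as an (start, end) interval in [0, model_len].
    Disjoint (collinear) fragments each contribute; overlapping
    (parallel) fragments are merged so their common extent counts once.
    Returns the covered fraction of the model segment.
    """
    ivs = sorted((min(a, b), max(a, b)) for a, b in matched_intervals)
    covered, cur_s, cur_e = 0.0, None, None
    for s, e in ivs:
        s, e = max(s, 0.0), min(e, model_len)
        if e <= s:
            continue
        if cur_e is None or s > cur_e:          # disjoint fragment
            if cur_e is not None:
                covered += cur_e - cur_s
            cur_s, cur_e = s, e
        else:                                    # overlap: merge
            cur_e = max(cur_e, e)
    if cur_e is not None:
        covered += cur_e - cur_s
    return covered / model_len

# Two collinear fragments plus one overlapping parallel segment:
print(image_support(10.0, [(0, 4), (6, 9), (3, 5)]))   # 0.8
```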

3.2.2  Coincidental Alignments

Some multiple matches are due to coincidental alignments of buildings with other structures (Figure 6), including roads and adjacent objects. Nearby objects and shadows sometimes result in image features that have a larger extent than that predicted by the model features. These are explained by examining nearby shadows, with knowledge of the direction of illumination, and by examining adjacent structures. Coincidental alignments due to nearby and adjacent structures are resolved by looking for adjacent structures that help explain the alignment or a possible change in horizontal dimensions.
Extracted pic [6] Figure 6 Coincidental alignments

3.3 Validation Confidence

The confidence values derived take into account only visible elements from the particular viewpoint of the image. Self-occlusion and occlusion by other objects (determined using a range image derived from the model itself) are also taken into account. Confidence is based on the following measures (see Figure 7 and Figure 8):

Let x be a model object defined by a set of vertices and a set of edges. For each object, x, we wish to calculate a confidence value C(x) as a contribution of the following terms, weighted by wp, wv, ws, wj, and wm:

Object Visibility: V(x) is defined as the ratio of the number of model edges that are visible from the particular viewpoint and included in the field of view to the total number of edges in the model object. The current system does not penalize partial visibility and thus wv = 1.0.

Object Presence: P(x) is defined as the ratio of the number of visible model edges that are matched to image edges, over the number of visible model edges. In the example shown in Figure 7a, all nine visible edges (dashed lines) have correspondences in the image (solid lines), giving a P value of 1.0. An object that is only 50% visible, but whose visible edges all have image correspondences, also has a P value of 1.0. P is calculated separately for roof elements, vertical wall elements and base wall elements to allow us to assign different ("ad hoc") weights reflecting the relative importance of these groups. Currently wp = 7, 5 and 3 for roof, vertical wall and wall base elements, respectively.

Object Coverage: M(x) is defined as the ratio of the sum of the lengths of the image segments that are corresponded to visible model edges, over the sum of the lengths of the visible model edges. Figure 7a shows an object with all model edges (dashed) corresponded (good presence) by small image (solid) edge supports (poor coverage). Figure 7b shows the opposite: a few model edges (poor presence) corresponded to large image edge support (good coverage).
Extracted pic [7] Figure 7 Presence and Coverage
M(x) is also calculated separately for roof and wall elements using the same weights as for P, thus wm=wp. M(x) is penalized by F(x), where F(x), fragmentation, is defined as the ratio of the number of image segments corresponded to model edges over the number of model edges.
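Presence, coverage and fragmentation for one edge group can be sketched as follows (Python, for illustration; the dictionary representation is an assumption, and the overlap merging of parallel fragments described earlier is omitted here for brevity):

```python
def presence_coverage(visible_edges, matches):
    """Presence P, coverage M and fragmentation F for one edge group
    (roof, vertical wall or wall base) of a model object.

    visible_edges : {edge_id: model_edge_length}
    matches       : {edge_id: [lengths of image segments matched to it]}
    """
    n_edges = len(visible_edges)
    matched = [e for e in visible_edges if matches.get(e)]
    P = len(matched) / n_edges                       # presence
    model_len = sum(visible_edges.values())
    image_len = sum(sum(matches[e]) for e in matched)
    M = image_len / model_len                        # coverage
    n_segments = sum(len(matches[e]) for e in matched)
    F = n_segments / n_edges                         # fragmentation
    return P, M, F

# Four visible roof edges of length 10; three are matched, one of
# them by two collinear fragments:
edges = {0: 10, 1: 10, 2: 10, 3: 10}
m = {0: [9], 1: [4, 5], 2: [8]}
P, M, F = presence_coverage(edges, m)
print(P, M, F)   # 0.75 0.65 1.0
```

In this example the fragmented support on edge 1 raises the segment count, so F stays at 1.0 despite an unmatched edge; a heavily fragmented match set would push F above 1.0 and, per the text, penalize M.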

Shadow Presence: S(x) is defined as the ratio of the number of potential shadow boundaries and junctions extracted from the image over the number of visible shadow elements (boundaries and junctions) derived from the model (Figure 8). The image segments are labeled as potential shadow segments by noting the consistency of the "dark" side of the segment with respect to the direction of illumination. Segments oriented parallel to the projection of the direction of illumination correspond to possible shadow lines cast by vertical edges. The L-junctions formed (allowing for gaps) by potential shadow lines are labeled potential shadow junctions. Details on shadow evidence extraction may be found in [Lin et al., 1995].
Extracted pic [8] Figure 8 Typical shadows cast by a cubic building with no surrounding
S(x) is currently assigned a weight ws=3.0.
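The labeling of potential shadow lines can be sketched as follows (Python, for illustration). The per-side intensity attributes and thresholds here are assumptions standing in for the richer attribute set used by the system:

```python
import math

def potential_shadow_lines(segments, sun_dir, ang_tol=0.15, contrast=20):
    """Flag segments that could be shadow lines cast by vertical edges.

    Each segment is (x1, y1, x2, y2, left_mean, right_mean): the mean
    image intensity on either side of the line. A candidate must be
    nearly parallel to the projected illumination direction and have a
    clearly darker side (the shadow region).
    """
    sun_ang = math.atan2(sun_dir[1], sun_dir[0]) % math.pi
    flagged = []
    for i, (x1, y1, x2, y2, left, right) in enumerate(segments):
        ang = math.atan2(y2 - y1, x2 - x1) % math.pi
        diff = abs(ang - sun_ang)
        if min(diff, math.pi - diff) > ang_tol:   # not parallel to sun
            continue
        if abs(left - right) < contrast:          # no clear dark side
            continue
        flagged.append(i)
    return flagged

# Sun along +x: one parallel segment with a dark side, one
# perpendicular low-contrast segment.
segs = [(0, 0, 10, 0, 40, 160),   # parallel, dark left side
        (0, 0, 0, 10, 90, 100)]   # perpendicular, low contrast
print(potential_shadow_lines(segs, (1, 0)))   # [0]
```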

Junction Presence: J(x) is defined as the ratio of the number of image L-junctions at locations predicted by the model ( Figure 8) to the number of visible model vertices. Image junctions are extracted from the image from the line segments used for matching. Currently J(x) is assigned a weight wj=3.0.
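The junction-presence measure can be sketched as follows (Python, for illustration; the pixel tolerance is an assumption):

```python
import math

def junction_presence(model_vertices, image_junctions, tol=4.0):
    """J(x): fraction of visible projected model vertices that have an
    image L-junction within a small pixel tolerance."""
    hit = 0
    for vx, vy in model_vertices:
        if any(math.hypot(vx - jx, vy - jy) <= tol
               for jx, jy in image_junctions):
            hit += 1
    return hit / len(model_vertices)

# Four projected roof corners; junctions found near two of them:
verts = [(0, 0), (10, 0), (10, 10), (0, 10)]
juncs = [(0.5, -1), (9, 10.5), (30, 30)]
print(junction_presence(verts, juncs))   # 0.5
```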

The above measures are combined to give a confidence value, C, from which a confidence level is established, as follows:
Extracted pic [9] Equation 1

High confidence values indicate good image support while low values denote poor image support. Low values may signify change, as lack of image support may be due to missing buildings or to buildings that have undergone significant change with respect to their current model. Model buildings that have strong image support may have changed as well: additions to structures, such as a new wing, may not significantly affect the appearance of the previously modeled portions.
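The combination can be sketched as follows (Python; the system is in LISP). The exact form of Equation 1 is not reproduced here: this assumes a weight-normalized linear combination of the measures, with coverage penalized by fragmentation, and is an illustration rather than the published formula.

```python
def confidence(V, groups, S, J, ws=3.0, wj=3.0,
               group_w={'roof': 7, 'vwall': 5, 'base': 3}):
    """One plausible combined confidence C(x): a weight-normalized
    linear combination of per-group presence/coverage with shadow and
    junction evidence, scaled by visibility V (wv = 1.0).

    groups: {name: (P, M, F)} for roof, vertical-wall and wall-base
    elements; coverage M is divided by fragmentation F when F > 1.
    """
    num = ws * S + wj * J
    den = ws + wj
    for name, (P, M, F) in groups.items():
        w = group_w[name]
        num += w * P + w * (M / max(F, 1.0))   # wm = wp per group
        den += 2 * w
    return V * num / den

c = confidence(V=1.0,
               groups={'roof': (1.0, 0.9, 1.0),
                       'vwall': (0.8, 0.7, 1.5),
                       'base': (0.5, 0.4, 1.0)},
               S=0.6, J=0.75)
print(round(c, 3))   # 0.733
```

Note how the heavier roof weight makes roof evidence dominate, matching the stated relative importance of roof, vertical-wall and wall-base groups.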

Figure 9 shows an example of the registration/validation step applied to one of the modelboard images where all ambiguities are resolved successfully and there are no changes reported. The colors indicate the confidence level coded to 5 levels: very high (green), high (blue), medium (yellow), low (salmon) and very low (red).
Extracted pic [10] Figure 9 Validation result and confidence levels

4  Change Detection

The previous step (validation) makes available information that is used to start analyses to determine changes. The indication of changes in the site currently comes in two forms: model objects with very low validation confidence, suggesting missing or drastically changed structures, and matched objects whose image dimensions disagree with the model.

Our system is currently able to detect changes in the dimensions of structures and changes due to missing buildings. In our experiments and examples below we altered the site model to test these conditions.

4.1  Missing Buildings

Model buildings having very low confidence values denote poor image support. The possible causes for this condition are that the model is incorrect, that the structure is occluded, or that the building has been removed or destroyed (assuming that the images are of sufficient quality). Resolving these ambiguities may require examination of this location in other images. Examples are shown below in Figure 11.

4.2  Changed Buildings

Apparent changes in the dimensions of the structures in the image that do not seem to be due to errors or coincidental alignment are taken to signify real changes. The changes in dimensions detected by the current system are reported but not explicitly modeled. A full description of the changes requires that the entire object geometry be analyzed, possibly requiring the use of more than one view. This is a subject for future work.

Figure 10 shows a building wing that has been added to an existing structure. The portion of the building in the model is correctly registered to the image by the system. The two thick white lines denote the extent of the match. Because the object presence measure for the roof of this structure indicates that all four sides of the current model were matched, the change is labeled as an "added" wing.
Extracted pic [11] Figure 10 Added wing is reported in this case.

4.3  New Buildings

One important type of site change is the introduction of new structures. We have capabilities to construct building models automatically and therefore we can suggest new additions to the site model. These techniques are applied to areas of interest, currently designated in the site model as "functional areas", using one or more images, if available. The site model is used to indicate already modeled areas. The camera and terrain models associated with the images are also used by these systems to derive viewpoint and illumination parameters automatically. An example of this task is shown later in Figure 12.

5 Experiments and Results

We have tested the system extensively with model board imagery (comprising more than 40 images) and with real imagery of Fort Hood (Texas). An example from the Fort Hood imagery set supplied by the RADIUS program is shown below to demonstrate the current capabilities. The ability to detect change in the form of new structures (not in the model) is also demonstrated with an example. The processed image size is 7775x7720 pixels, and the 3-D site model contains 79 objects representing building structures. In this example, registering the image and the model allows processing to be restricted to the locations where model buildings exist rather than covering the entire image. Processing time is about 15 seconds per structure on a Sun SPARC-10 workstation running under the RCDE. In the graphical results given in Figure 11 we show three levels of confidence: high (green), medium (yellow) and low (red). Only small portions of the image are shown for lack of space.

The statistics and results are summarized in Table 1. The left part of the table simply shows the number of buildings visible in the image and the distribution of validation confidence values. These are for information purposes only, as they primarily reflect image content. Notice, however, the correlation between confidence level and the number of buildings changed, not changed or missing in the rest of the table. All matching ambiguities, with one exception, are correctly handled. Fourteen buildings actually had changes; thirteen of these are found to be changed. Some of these are shown in Figure 11 with an orange dot on top. One was not found "changed" due to lack of evidence, a false negative. Of the 54 non-changed buildings, one is found to be changed, a false alarm. This case involves an alignment with a ground feature not present in the model, a situation not currently handled by the system (not shown).

Buildings that change considerably or are missing have poor image support, resulting in low validation confidence (the red buildings in Figure 11). There are 12 of these, 11 of which were added by hand to test the "missing building" detection capability. The remaining one represents a significantly changed building (not shown). All of these are labeled correctly as changed or missing.

Figure 12 shows the result of applying a monocular building detection system [Lin et al., 1995] to look for change in the form of new buildings (shown with cyan outlines). The areas modeled are ignored. Typically the system would be instructed to locate new buildings in designated areas that are of interest, such as functional areas. The three buildings shown in cyan outlines are detected automatically and added to the site model.

6  Future Work

Change detection is a tedious task as it requires careful comparison of images taken at different times under possibly varying conditions. Even partial automation of this task will greatly increase image analyst productivity and possibly also enhance the reliability of the results.

We have developed such a system and tested it extensively with modelboard and real imagery. The system has been ported to industrial and Government sites for evaluation and testing.

The current system operates in the 2-D domain of model structures projected onto the image viewpoint and can be easily extended to incorporate other 2-D features such as the transportation network. Detailed 3-D description of change is expected to require the use of more than one image viewpoint or images from a range sensor. Multiple images allow for 3-D matching and verification of changes, although the processing may continue to incorporate 2-D techniques for simplicity and to keep the required processing times practical.

Table 1:  Summary of Results

Columns: Image; Visible Buildings; Validation Confidence (High [green], Medium [yellow], Low [red]); Non-changed Buildings (Number, Reported non-changed, Reported changed); Changed Buildings (Number, Reported changed, Reported non-changed); Missing Buildings (Number, Reported missing, Validated).

  1. [Canny, 1986] Canny, J. A Computational Approach to Edge Detection. IEEE transactions on Pattern Analysis and Machine Intelligence 8(6), November, pp 679-698
  2. [Chung and Nevatia, 1992] Chung, C. and Nevatia, R., "Recovering Building Structures from Stereo," IEEE Proceedings of Workshop on Applications of Computer Vision, Palm Springs, California, December, pp 64-73.
  3. [Gerson and Wood, 1994] Gerson, D. and Wood, S. RADIUS Phase II. The RADIUS Testbed System, Proceedings of the Image Understanding Workshop, Vol 1, Monterey, California, Morgan Kaufman, Publisher, November, pp 231-237.
  4. [Fua, 1996] Fua, P. Cartographic Applications of Model-Based Optimization. Proceedings of the Image Understanding Workshop, Vol 1, Palm Springs, California, Morgan Kaufman, Publisher, February, pp 409-419.
  5. [Huang et al., 1994] Huang, C., Mundy, J. and Rothwell, C. Model Supported Exploitation: Quick look, Detection and Counting, and Change Detection, Proceedings of the Second IEEE Workshop on Applications of Computer Vision, Sarasota, Florida, pp 144-151.
  6. [Huertas et al., 1995a] Huertas, A., Bejanin, M. and Nevatia, R. Model Registration and Validation, In Automatic Extraction of Man-Made Objects from Aerial and Space Images, Gruen, A., Kuebler, O., Agouris, P. Editors. Birkhauser Verlag, Switzerland, pp 33-42.
  7. [Huertas et al., 1995b] Huertas, A., Mourani, S., and Medioni, G. Model-Based Aircraft Recognition in Perspective Aerial Imagery. IEEE Computer Vision Symposium, Coral Gables, Florida, November, pp. 371-376.
  8. [Huertas and Nevatia, 1996] Huertas, A. and Nevatia, R., Detecting Changes in Aerial Views of Man-Made Structures, in Proceedings of the ARPA Image Understanding Workshop, Palm Springs, California, February, pp 381-388.
  9. [Huertas and Nevatia, 1988] A. Huertas and R. Nevatia, Detecting Buildings in Aerial Images, Computer Vision, Graphics and Image Processing, 41(2): 131-152, February.
  10. [Irving & McKeown, 1989] R. Irving and D. McKeown, Methods for exploiting the Relationship Between Buildings and their Shadows in Aerial Imagery, IEEE Transactions on Systems, Man and Cybernetics, 19(6): 1564-1575, Nov/Dec.
  11. [Jaynes, et al., 1994] C. Jaynes, F. Stolle, and R. Collins, Task Driven Perceptual Organization for Extraction of Rooftop Polygons, Proceedings of the 1994 ARPA Image Understanding Workshop, 359-365.
  12. [Khalaj et al., 1992] Khalaj, B, Aghajan, H and Kailath, T. Automated Direct Patterned Wafer Inspection Proceedings of the IEEE Workshop on Applications of Computer Vision, Palm Springs, California, pp. 266-273.
  13. [Lin et al., 1994] Lin C., Huertas A. and Nevatia R. Detection of Buildings using Perceptual Grouping and Shadows. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, pp 62-69.
  14. [Lin et al., 1995] Lin, C, Huertas, A. and Nevatia, R., Detecting Buildings from Monocular Images. In Automatic Extraction of Man-Made Objects from Aerial and Space Images, Gruen, A., Kuebler, O., Agouris, P. Editors. Birkhauser Verlag, Switzerland, pp 125-134.
  15. [McGlone & Shufelt, 1994] J. McGlone and J. Shufelt, Projective and Object Space Geometry for Monocular Building Extraction, IEEE Proceedings of Computer Vision and Pattern Recognition, 54-61.
  16. [Mohan & Nevatia, 1989] R. Mohan and R. Nevatia, Using Perceptual Organization to Extract 3-D Structures, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(11): 1121-1139, November.
  17. [Medioni et al., 1991] Médioni G., Huertas A. and Wilson M. Automatic Registration of Color Separation Films, Machine Vision and Applications, Springer-Verlag, New York, Vol. 4, pp 33-51.
  18. [Strat et al, 1992] Strat, T. et al., The RADIUS Common Development Environment, Proceedings of the DARPA Image Understanding Workshop, San Diego, California, Morgan Kaufman, Publisher, January, pp 215-226.

Extracted pic [12] Figure 11 Line segments extracted from the image
Extracted pic [13] Figure 12 Change detection: new buildings not previously modeled are detected automatically
