An important application of machine vision is to provide a means to monitor a scene over a period of time and report changes. We have developed a change detection system for this purpose, in support of the RADIUS program. This process consists of several steps. Model to image registration aims to bring the site model and the images into close correspondence; model validation seeks to confirm the presence of model objects in the image; preliminary change detection seeks to resolve matching problems and provide an early report of possibly changed structures. Currently, the system is able to detect changes such as missing buildings or changes in their dimensions.
Keywords: Registration, matching, model-based change detection, photo-interpretation, model updating.
This paper deals with change detection in a complex and cluttered environment: aerial views of natural and cultural features. The imagery acquired from aerial and space vehicles is vast and requires automated and semi-automated systems with applications in photo-interpretation, mapping, planning, surveillance and guidance. The task of change detection in this context consists of finding significant differences between the new data and a model derived from the older data. These differences may be due to a multitude of sources, many of which are irrelevant. The significance of the differences may be task specific although in most cases man-made changes are more important than those caused by factors such as seasonal, illumination and viewpoint changes. In this work we are only interested in those changes in the image that come from some changes in the site rather than from changes in imaging conditions.
To deal with the development and integration of machine vision
systems to support the task of image analysts, the RADIUS program
[Gerson and Wood, 1994] relies on the concept of model-supported
exploitation (MSE). In this concept a site model is constructed from a
number of images of a site [Huang et al., 1994]. The model consists of
terrain elevation, surface features such as roads and rivers,
functional areas and 3D building structures. Significant progress in
development of techniques for interactive and automated construction
of model buildings has been reported in the literature [Fua, 1996,
Chung and Nevatia, 1992, Lin et al., 1994,
Huertas and Nevatia, 1988,
Irving and McKeown, 1989, Mohan and Nevatia, 1989, Jaynes et al., 1994,
McGlone and Shufelt, 1994]. Figure 1 shows an example of a site model corresponding to a modelboard
site constructed for RADIUS experiments.
Figure 1
A typical site model.
Change detection involves comparing a new image (or a collection of images) of a site to the information associated with that site in a site folder, which consists of a site model, one or more previous images, and the results of analyses on these images. This problem is similar in some ways to industrial inspection (see for example [Khalaj et al., 1992]). However, in our case site models are very incomplete, illumination is uncontrolled and many image changes are not relevant to changes in the site. In this paper we concentrate on building structures only. For related work dealing with mobile objects using techniques similar to those reported here, see [Huertas et al., 1995b]. In all cases we assume that a site model of suitable resolution and complexity is available.
Figure 2 shows a flowchart of the complete change detection process. It contains five major steps:
Figure 2
Below we describe work on the tasks needed to achieve a full change detection system: site model registration, model validation, change detection and model updating. We show examples that help illustrate the current capabilities using imagery supplied to us by the RADIUS program. The system is written in LISP and runs under the RADIUS Common Development Environment (RCDE) [Strat et al., 1992].
The first step is to register the site model to an image. Our system has the capability to correct for translational errors. Our registration method [Huertas et al., 1995a] consists of the following tasks:
The first task is carried out by a matching technique [Medioni et al., 1990] that uses the visible line segments derived from the projected site model objects, and line segments approximated from the edges extracted [Canny, 1986] from the image. Each candidate segment match casts a "vote", derived from the attributes and positions of the segments, into a parameter space accumulator. The space represents translational offsets, and the main peak gives the global misregistration error.
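The voting idea can be illustrated with a minimal sketch. It is not the matcher of [Medioni et al., 1990]: the function name `vote_translation`, the use of segment midpoints, the orientation tolerance, the bin size and the length-based vote weight are all assumptions made for illustration.

```python
import math
from collections import Counter


def _orientation(seg):
    """Undirected orientation of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def _length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def _midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def vote_translation(model_segs, image_segs, max_angle_deg=10.0, bin_size=2.0):
    """Return the (dx, dy) offset at the accumulator peak, or None if no votes.

    Assumptions (hypothetical, not from the paper): a pair is a candidate
    match when the orientations agree within max_angle_deg; the offset is
    measured between midpoints; the vote weight is the shorter length.
    """
    acc = Counter()
    max_angle = math.radians(max_angle_deg)
    for m in model_segs:
        for s in image_segs:
            d = abs(_orientation(m) - _orientation(s))
            d = min(d, math.pi - d)  # orientations wrap around at pi
            if d > max_angle:
                continue
            (mx, my), (sx, sy) = _midpoint(m), _midpoint(s)
            key = (round((sx - mx) / bin_size), round((sy - my) / bin_size))
            acc[key] += min(_length(m), _length(s))
    if not acc:
        return None
    (bx, by), _ = acc.most_common(1)[0]
    return (bx * bin_size, by * bin_size)


# Projected model: a 10x10 square. Image: the same square shifted by (4, -6),
# plus one unrelated clutter segment.
model = [((0, 0), (10, 0)), ((10, 0), (10, 10)),
         ((10, 10), (0, 10)), ((0, 10), (0, 0))]
image = [((x1 + 4, y1 - 6), (x2 + 4, y2 - 6)) for (x1, y1), (x2, y2) in model]
image.append(((20, 20), (25, 33)))
offset = vote_translation(model, image)
```

Wrong pairings (for example, the bottom model edge with the top image edge) scatter their votes over different bins, while the correct pairings all vote for the same offset, which is why the peak is a robust estimate of the global translation.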
The second task uses the registration offsets in a second pass of the matcher to select the matching pairs (model segment, image segment) that put segments in the model in correspondence with segments in the image. Since the model segments are grouped into objects by the model itself, we also obtain a correspondence of the model objects to the image features. For additional details see [Huertas et al., 1995a].
The analysis of changes of permanent structures, such as buildings, proceeds at the object level. The purpose of model validation is to verify whether model objects are present in the image and whether the objects remain unchanged. The system uses the object correspondences established in the registration step to calculate and assign a confidence value (see below) to each object in the model.
To validate a model accurately we need to study the source of
missing model-to-image correspondences. Some missing image features
will be due to viewing conditions such as self-occlusion, occlusion by
other objects, self shadows and shadows cast by nearby objects. These,
however, can be predicted and explained from the site model
itself. Missing correspondences may also be due to over- or under-modeling
of objects (Figure 3 and Figure 4), and these are more difficult to
predict from the model. The confidence associated with over- or
under-modeled objects may thus be underestimated or difficult to
calculate.
Figure 3
The thick lines in the building model b do not correspond to actual
Figure 4
Some buildings may be under modelled
Over-modeling is due to the use of modeling primitives that introduce elements that do not correspond to actual physical elements or boundaries. Figure 3 shows a building that has been modeled by two rectangular parallelepipeds. The thick lines represent portions of the elements of the building model that do not correspond to physical boundaries. These cannot be matched, and the missing correspondences result in lower confidence.
Figure 4 shows two buildings that are likely to be under-modeled (i.e. modeled by simpler shapes) due to their complexity. These require additional search strategies designed to look for additional and possibly fragmented evidence, such as a large number of vertical or horizontal edge elements. Our system is not currently capable of determining these conditions, and thus the confidence values may be underestimated. It is assumed that some of these conditions may require annotations in the site model to help the system process these appropriately.
There are several ambiguities inherent to the matching process that need to be resolved during validation and change detection. The system currently deals with two of these. The first deals with multiple or missing matches between the site model features and the image. The second deals with coincidental alignments due to viewpoint, illumination direction, or to adjacent structures.
The model-to-image matcher in the system associates each model
element with one or more image elements. This is necessary to deal
with expected fragmentation in the image elements. Fragmentation is
due to inadequacies in the feature extraction process and to
actual image content, such as occluding trees, road boundaries and
shadows. This may result in one-to-many correspondences
(Figure 5), possibly involving more than one
object. If a model segment matches multiple collinear image segments,
all the image segments are considered to represent image support. If a
model segment matches multiple parallel image segments, the overlap
among these is considered to represent image support.
Figure 5
One to many correspondences.
Some multiple matches are due to coincidental alignments of
buildings with other structures (Figure 6), such as roads and
adjacent objects. Nearby
objects and shadows sometimes result in image features that have a
larger extent than that predicted by the model features. These are
explained by examining nearby shadows with knowledge of the direction
of illumination, and by examining adjacent structures. Coincidental
alignments due to nearby and adjacent structures are determined by
looking for adjacent structures that help explain the alignment or a
possible change in horizontal dimensions.
Figure 6
Coincidental alignments
The confidence values derived take into account only visible elements from the particular viewpoint of the image. Self-occlusion and occlusion by other objects (determined using a range image derived from the model itself) are also taken into account. Confidence is based on the following measures (see Figure 7 and Figure 8):
Let x be a model object defined by a set of vertices and a set of edges. For each object, x, we wish to calculate a confidence value C(x) as a contribution of the following terms, weighted by wp, wv, ws, wj, and wm:
Object Visibility: V(x) is defined as the ratio of the number of edges that are visible from the particular viewpoint and included in the field of view, over the number of edges in the model object. The current system does not penalize partial visibility and thus wv=1.0.
Object Presence: P(x) is defined as the ratio of the number of visible model edges that are matched to image edges, over the number of visible model edges. In the example shown in Figure 7a, all nine visible edges (dashed lines) have correspondences in the image (solid lines), giving a P value of 1.0. An object that is only 50% visible, but whose visible 50% is fully corresponded to image edges, also has a P value of 1.0. P is calculated separately for roof elements, vertical wall elements and wall base elements to allow us to assign different ("ad hoc") weights that reflect the relative importance of these groups. Currently wp = 7, 5 and 3 for roof, vertical wall and wall base elements, respectively.
Object Coverage: M(x) is defined as the ratio of the sum of the
lengths of the image segments that are corresponded to visible model
edges, over the sum of the lengths of the visible model edges. Figure 7a shows an object with all model edges
(dashed) corresponded (good presence) to small image (solid) edge supports (poor coverage).
Figure 7b shows the opposite: a few model edges (poor presence)
corresponded to large image edge support (good coverage).
Figure 7
Presence and Coverage
M(x) is also calculated separately for roof and wall elements using the same weights as for P, thus wm=wp. M(x) is penalized by F(x), where F(x), fragmentation, is defined as the ratio of the number of image segments corresponded to model edges over the number of model edges.
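The three ratios defined above can be computed directly from the match results. The following is a minimal sketch for one group of elements (for example, the roof edges); the input layout and the name `validation_measures` are assumptions for illustration, and F is taken here over the visible model edges, which is one reading of the definition.

```python
def validation_measures(edges):
    """Compute (presence P, coverage M, fragmentation F) for one edge group.

    edges: one entry per visible model edge, each a pair
           (model_edge_length, [lengths of the image segments matched to it]).
    This data layout is hypothetical; the paper does not specify one.
    Returns None when there is nothing visible to measure.
    """
    n_visible = len(edges)
    model_len = sum(length for length, _ in edges)
    if n_visible == 0 or model_len == 0:
        return None
    n_matched = sum(1 for _, segs in edges if segs)     # edges with support
    n_segments = sum(len(segs) for _, segs in edges)    # all matched pieces
    support_len = sum(sum(segs) for _, segs in edges)   # total matched length

    presence = n_matched / n_visible
    coverage = support_len / model_len
    fragmentation = n_segments / n_visible  # assumption: per visible edge
    return presence, coverage, fragmentation


# Like Figure 7a: all nine edges matched, each by one short image segment.
fig7a = [(10.0, [2.0])] * 9
# Like Figure 7b: only three of nine edges matched, each along its full length.
fig7b = [(10.0, [10.0])] * 3 + [(10.0, [])] * 6

p_a, m_a, f_a = validation_measures(fig7a)
p_b, m_b, f_b = validation_measures(fig7b)
```

In the first case presence is 1.0 while coverage is only 0.2. In the second case both are one third, but every matched edge is fully supported, which is why the two measures are kept separate.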
Shadow Presence: S(x) is defined as the ratio of the number of
potential shadow boundaries and junctions extracted from the image
over the number of visible shadow elements (boundaries and junctions)
derived from the model (Figure 8). The image
segments are labelled as potential shadow segments
by noting the consistency of the "dark" side of the segment with
respect to the direction of illumination. Segments oriented parallel
to the projection of the direction of illumination correspond to
possible shadow lines cast by vertical edges. The L-junctions formed
(allowing for gaps) by potential shadow lines are labeled potential
shadow junctions. Details on shadow evidence extraction may be found
in [Lin et al., 1995].
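The orientation test for shadow lines cast by vertical edges can be sketched as follows. This covers only the parallelism check; the "dark side" consistency test needs image intensities and is omitted. The function name and the angular tolerance are assumptions, not values from [Lin et al., 1995].

```python
import math


def is_potential_vertical_shadow_line(seg, illum_dir, tol_deg=5.0):
    """True if seg is parallel, within tol_deg, to the projected illumination.

    seg:       ((x1, y1), (x2, y2)) image segment.
    illum_dir: (lx, ly) projection of the illumination direction onto the image.
    tol_deg:   hypothetical angular tolerance in degrees.
    """
    (x1, y1), (x2, y2) = seg
    sx, sy = x2 - x1, y2 - y1
    lx, ly = illum_dir
    if (sx == 0 and sy == 0) or (lx == 0 and ly == 0):
        return False  # a degenerate segment or direction has no orientation
    cross = sx * ly - sy * lx
    dot = sx * lx + sy * ly
    # Angle between two undirected lines, in [0, pi/2].
    angle = math.atan2(abs(cross), abs(dot))
    return angle <= math.radians(tol_deg)
```

Taking absolute values of the cross and dot products makes the test independent of the direction in which the segment endpoints are listed.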
Figure 8
Typical shadows cast by a cubic building with no surrounding
S(x) is currently assigned a weight ws=3.0.
Junction Presence: J(x) is defined as the ratio of the number of image L-junctions at locations predicted by the model (Figure 8) to the number of visible model vertices. Image junctions are derived from the line segments used for matching. Currently J(x) is assigned a weight wj=3.0.
The above calculations are combined to give a confidence value, C, from
which a confidence level is established, as follows:
Equation 1
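Since the equation itself is not reproduced in this text, the following is only an illustrative sketch of how the weighted terms might be combined. It assumes a normalized weighted average, which may differ from the actual Equation 1, and it does not model how F(x) penalizes M(x). The weights are those stated above; the dictionary keys and the function name are hypothetical.

```python
# Weights stated in the text: wv = 1, wp = wm = 7/5/3 for roof, vertical wall
# and wall base elements, ws = 3, wj = 3. The key names are illustrative.
WEIGHTS = {
    "V": 1.0,
    "P_roof": 7.0, "P_wall": 5.0, "P_base": 3.0,
    "M_roof": 7.0, "M_wall": 5.0, "M_base": 3.0,
    "S": 3.0,
    "J": 3.0,
}


def confidence(terms):
    """Combine per-term ratios in [0, 1] into a single value in [0, 1].

    ASSUMPTION: a normalized weighted average. The paper's Equation 1 is not
    available here and may use a different form.
    """
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * terms[name] for name in WEIGHTS) / total_weight


full_support = {name: 1.0 for name in WEIGHTS}
no_support = {name: 0.0 for name in WEIGHTS}
only_visible = dict(no_support, V=1.0)

c_full = confidence(full_support)
c_none = confidence(no_support)
c_visible = confidence(only_visible)
```

With this form, an object that is fully visible but has no image support at all receives a value of 1/37, so visibility alone contributes very little compared with the roof presence and coverage terms.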
High confidence values indicate good image support while low values denote low image support. Low values may signify change, as lack of image support may be due to missing buildings, or to buildings that have undergone significant change with respect to their current model. Model buildings that have strong image support may also have changed: additions to structures, such as new wings, may not significantly affect the appearance of the previously modeled portions.
Figure 9 shows an example of the
registration/validation step applied to one of the modelboard images
where all ambiguities are resolved successfully and there are no
changes reported. The colors indicate the confidence level coded to 5
levels: very high (green), high (blue), medium (yellow), low (salmon)
and very low (red).
Figure 9
Validation result and confidence levels
The previous step (validation) makes available information that is used to start analyses to determine changes. The indication of changes in the site currently comes in two forms:
Our system currently is able to detect changes in the dimensions of the structures and changes due to missing buildings. In our experiments and examples below we altered the site model to test these conditions.
Model buildings having very low confidence values denote poor image support. The possible causes for this condition are that the model is incorrect, that the structure is occluded, or that the building has been removed or destroyed (assuming that images are of sufficient quality). Resolving these ambiguities may require examination of this location in other images. Examples are shown below in Figure 11.
Apparent changes in the dimensions of the structures in the image that do not seem to be due to errors or coincidental alignment are taken to signify real changes. The changes in dimensions detected by the current system are reported but not explicitly modeled. A full description of the changes requires that the entire object geometry be analyzed, possibly requiring the use of more than one view. This is a subject for future work.
Figure 10 shows a building wing that has been added to an existing
structure. The portion of the building in the model is correctly
registered to the image by the system. The two thick white lines
denote the extent of the match. Because the object presence measure
for the roof of this structure indicates that all four sides of the
current model were matched, the change is labeled "added wing".
Figure 10
Added wing is reported in this case.
One important type of site change is the introduction of new structures. We have capabilities to construct models automatically and therefore we can suggest new additions to the site model. These techniques are applied to areas of interest, currently designated in the site model as "functional areas", using one or more images, if available. The site model is used to indicate already modeled areas. The camera models and terrain models associated with the images are also used by these systems to derive viewpoint and illumination parameters automatically. An example of this task is shown later in Figure 12.
We have tested the system extensively with modelboard imagery (comprising more than 40 images), and with real imagery of Fort Hood (Texas). An example from the Fort Hood imagery set supplied by the RADIUS program is shown below to demonstrate the current capabilities. The ability to detect change in the form of new structures (not in the model) is also demonstrated with an example. The processed image size is 7775x7720 pixels, and the 3-D site model contains 79 objects representing building structures. In this example, the image and the model are registered, so the image is processed only at locations where model buildings exist. Processing time is about 15 seconds per structure on a Sun sparc-10 workstation, running under the RCDE. In the graphical results given in Figure 11 we show three levels of confidence: high (green), medium (yellow) and low (red). Only a small portion of the image is shown for lack of space.
The statistics and results are described in Table 1. The left part of the table simply shows the number of buildings visible in the image and the distribution of validation confidence values. These are for information purposes only as they primarily reflect image content. Notice, however, the correlation between confidence level and the number of buildings changed, not changed or missing in the rest of the table. All matching ambiguities, with one exception, are correctly handled. Fourteen buildings actually had changes. Thirteen of these are found to be changed; some are shown in Figure 11 with an orange dot on top. The remaining one was not found "changed" due to lack of evidence, a false negative. Of the 54 non-changed buildings, one is found to be changed, a false alarm. This case (not shown) involves an alignment with a ground feature not present in the model, a situation not currently handled by the system.
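The counts reported above translate into simple rates. The short sketch below only restates that arithmetic; the function name and the returned keys are illustrative.

```python
def detection_rates(n_changed, n_detected, n_unchanged, n_false_alarms):
    """Summarize change detection results as fractions.

    n_changed:      buildings that actually changed
    n_detected:     changed buildings reported as changed
    n_unchanged:    buildings that did not change
    n_false_alarms: unchanged buildings reported as changed
    """
    return {
        "detection_rate": n_detected / n_changed,
        "miss_rate": (n_changed - n_detected) / n_changed,
        "false_alarm_rate": n_false_alarms / n_unchanged,
    }


# Counts from the Fort Hood example: 13 of 14 changed buildings found,
# and 1 of 54 unchanged buildings reported as changed.
rates = detection_rates(n_changed=14, n_detected=13,
                        n_unchanged=54, n_false_alarms=1)
```

For this example the detection rate is 13/14 (about 93%) and the false alarm rate is 1/54 (about 1.9%).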
Buildings that change considerably or are missing have poor image support, resulting in low validation confidence (the red buildings in Figure 11). There are 12 of these, 11 of which were added by hand to test the "missing building" detection capability. The remaining one represents a significantly changed building (not shown). All of these are labeled correctly as changed or missing.
Figure 12 shows the result of applying a monocular building detection system [Lin et al., 1995] to look for change in the form of new buildings (shown with cyan outlines). The areas modeled are ignored. Typically the system would be instructed to locate new buildings in designated areas that are of interest, such as functional areas. The three buildings shown in cyan outlines are detected automatically and added to the site model.
Change detection is a tedious task as it requires careful comparison of images taken at different times under possibly varying conditions. Even partial automation of this task will greatly increase image analyst productivity and possibly also enhance the reliability of the results.
We have developed a system for this purpose and tested it extensively with modelboard and real imagery. The system has been ported to industrial and Government sites for evaluation and testing.
The current system operates in the 2-D domain, with model structures projected onto the image viewpoint, and can be easily extended to incorporate 2-D features such as the transportation network. A detailed 3-D description of change is expected to require the use of more than one image viewpoint or images from a range sensor. Multiple images allow for 3-D matching and verification of changes, although the system may continue to incorporate 2-D processing for its simplicity and lower processing times.
Figure 11
Line segments extracted from the image
Figure 12
Change detection: new buildings not previously modeled are detected automatically.