«An Autonomous Vision-Guided Helicopter Omead Amidi August 1996 Department of Electrical and Computer Engineering Carnegie Mellon University ...»
To analyze the tracking algorithm’s position sensing, let us examine a situation where the helicopter moves from ground frame position $P_0$ to $P_1$, as depicted in Figure 2-7. (Note that the associated vectors for each position have the corresponding 0 or 1 subscripts.) From the current estimated position, $P_0$, the algorithm estimates the new position, $P_1$, by first localizing the ground target position, $P_T$, and sensing the target view vectors $V_0$ and $V_1$ and the camera vectors $C_0$ and $C_1$ in the ground coordinate frame. By vector arithmetic, the new position is computed from (2-5). The superscripts denote the coordinate frame in which vectors are represented.
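The vector arithmetic behind this update can be sketched as follows. The sketch assumes that a camera vector points from the helicopter position to the camera and that a view vector points from the camera to the target, all expressed in the ground frame, so that the target lies at P_T = P_0 + C_0 + V_0 and the new position is P_1 = P_T - C_1 - V_1. These sign conventions and the function names are assumptions made for illustration; they are not taken from (2-5) or (2-6).

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def add(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise sum of two ground-frame vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def sub(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise difference of two ground-frame vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def localize_target(p0: Vec3, c0: Vec3, v0: Vec3) -> Vec3:
    """Target position from the old helicopter position (assumed convention)."""
    return add(add(p0, c0), v0)


def update_position(p_target: Vec3, c1: Vec3, v1: Vec3) -> Vec3:
    """New helicopter position from the fixed target and the new vectors."""
    return sub(sub(p_target, c1), v1)
```

Because the target is fixed on the ground, any change in the sensed view vector between the two instants maps directly to helicopter displacement once the camera vector is accounted for.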
The algorithm commences with the secondary thread's localization of a ground target taken from the image center using (2-6). This localization requires sensing the ground target location in the camera frame which defines the view vector, V t, in (2-9) and transforming the camera translation vector
center and the secondary thread estimates the target’s range by stereo image processing. The details of the range detection are presented later in Section 2.4.3.
sured helicopter attitude.
To determine the view vector, $V^G$, the primary thread senses the target location in the camera frame by image processing and transforms its location vector to the ground frame using (2-7). The image processing steps to locate the target include locating the target in the image by template matching and detecting the target range by stereo image processing. Sections 2.4.4 and 2.4.5 present the details of target image location and range detection.
The primary thread positions the helicopter by this method while the target is in view and trackable. If the primary thread must switch to a new target, it acquires the new target’s position from the secondary thread and follows the same procedure for uninterrupted helicopter positioning.
2.3.3 Velocity Sensing
Along with helicopter position estimation, the visual odometer’s tracking algorithm estimates helicopter velocity. While the primary thread is estimating current helicopter position, the secondary thread estimates the pixel velocity, referred to as optical flow, at the image center to estimate helicopter velocity.
which represents current helicopter position in terms of the sensed view vector and camera translation vector. Differentiating (2-12) yields:
where $z$ is the ground target's range to the camera. Substitution of the image coordinate expressions from (2-4) into (2-13) yields:
Equation (2-14) describes how the secondary thread estimates helicopter velocity by sensing the
The odometer estimates these quantities by image template matching and synchronized helicopter attitude measurement. The following subsections examine the odometer image processing algorithms for estimating these quantities and the template matching method upon which these algorithms are built.
2.4.1 Image Pixel Coordinates of the Target Template
The odometer initially acquires a target template from the image center and thereafter tracks it to maintain helicopter position. Therefore, the odometer’s helicopter positioning accuracy is directly affected by the accuracy and robustness of this tracking operation. Templates must be tracked consistently as images rotate with changing helicopter attitude, vary in size as the helicopter ascends or descends, and vary in intensity with changing lighting conditions. Therefore, for robust and accurate matches, the odometer must calibrate the target template before matching by rotating, scaling, and normalizing intensity.
To aid in calibration of templates, the odometer tracks an auxiliary template in parallel with the main target template. The auxiliary template provides another anchor point in the image to measure image rotation and scaling. The change in angle of the baseline between the templates measures image rotation, and the change in the length of this baseline measures image scaling.
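As a minimal sketch of this measurement, the rotation and scale can be read off the pixel coordinates of the two matched templates. The function name, the argument layout, and the choice of measuring the angle with `atan2` relative to the image x axis are assumptions made for illustration; the 20 pixel default reflects the nominal initial offset between the templates.

```python
import math
from typing import Tuple


def baseline_rotation_and_scale(
    main_xy: Tuple[float, float],
    aux_xy: Tuple[float, float],
    nominal_length: float = 20.0,
) -> Tuple[float, float]:
    """Return (rotation in radians, scale factor) from the template baseline.

    The baseline runs from the main template to the auxiliary template.
    A horizontal baseline of nominal length gives zero rotation and unit scale.
    """
    dx = aux_xy[0] - main_xy[0]
    dy = aux_xy[1] - main_xy[1]
    rotation = math.atan2(dy, dx)                # angle of the baseline
    scale = math.hypot(dx, dy) / nominal_length  # length relative to nominal
    return rotation, scale
```

For example, a baseline that has turned to point along the image y axis and doubled in length yields a rotation of π/2 and a scale of 2.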
Chapter 2. Vision-Based Helicopter Positioning for Autonomous Control
Visual Odometer Image Processing
Figure 2-8 depicts the odometer’s image processing steps for detecting template pixel coordinates. The odometer acquires an auxiliary template near the main template at the image center. The auxiliary template is offset by a nominal (20 pixel) distance in the x direction and has the same y pixel coordinate as the main template. This x direction offset provides the initial horizontal baseline representing zero image rotation.
The odometer stores the two templates from the initial image as locked-on targets and commences matching them to incoming images. The templates are calibrated using the baseline and image intensities of the previous match. The first match does not require template calibration, since the image rotation, scaling, and intensity variation in one cycle are assumed to be insignificant at the algorithm’s tracking frequency. The validity of this assumption is later demonstrated experimentally.
To accommodate changes in template intensity, the template pixel intensities are normalized to correspond to the most recent image match. In addition, a scale factor is determined by comparing the intensity within the calibrated templates and the intensity within the matched areas. Finally, the stored templates are rotated by the baseline angle before locating the templates in the next image. The primary thread of the algorithm then employs the pixel coordinates of the main template to estimate helicopter position.
The templates leave the image from time to time with helicopter motion and must be reacquired from the image center. The secondary thread always captures new templates, and the primary thread locks on to them under the following conditions:
Either one of the current templates is about to go out of view. As presented in Section 3.4.3, templates are searched for in the neighboring area of the last successful match. A template is replaced if the neighboring search area reaches an image border.
The current template baseline length, D, is above a threshold value. A large baseline indicates a significant reduction in helicopter altitude or large attitude changes, which degrade the resolution of calibrated templates and lead to poor matches. Large template separations also signal a potential mismatch of one of the templates; switching both templates improves the matching for upcoming images.
The current baseline length, D, is below a threshold value. A short baseline results from a significant gain in altitude or a large change in helicopter attitude. The angular resolution of the image rotation angle decreases as the baseline shortens, and new templates are necessary to restore this resolution.
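The three replacement conditions above can be sketched as a single predicate. The window representation, the function names, and the threshold parameters are hypothetical; the source does not give the threshold values.

```python
from typing import Sequence, Tuple

# Search window as (x_min, y_min, x_max, y_max) in pixel coordinates.
Window = Tuple[int, int, int, int]


def window_touches_border(window: Window, width: int, height: int) -> bool:
    """True if the search window reaches any edge of the image."""
    x_min, y_min, x_max, y_max = window
    return x_min <= 0 or y_min <= 0 or x_max >= width - 1 or y_max >= height - 1


def needs_new_templates(
    search_windows: Sequence[Window],
    image_size: Tuple[int, int],
    baseline: float,
    min_baseline: float,
    max_baseline: float,
) -> bool:
    """Decide whether the primary thread should lock on to new templates."""
    width, height = image_size
    # Condition 1: either template's search area has reached an image border.
    if any(window_touches_border(w, width, height) for w in search_windows):
        return True
    # Conditions 2 and 3: baseline length D outside its allowed band.
    return baseline > max_baseline or baseline < min_baseline
```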
2.4.2 Range Measurement
Helicopter position and velocity estimation by object tracking requires knowledge of the object range to the camera. Object range and lens focal length are the two variables necessary to transform object image coordinates to the camera frame, as shown by (2-4). The odometer only needs to estimate object range during flight, since lens focal length can be determined off-line using one of a number of lens models from the lens calibration literature.
The odometer measures object range by stereo image processing. It interpolates one range estimate at the image center to detect the range of objects appearing in different parts of the image. The odometer interpolates the one range estimate using current helicopter attitude by assuming the ground below the helicopter is locally level. This approach guarantees range detection for a predetermined helicopter altitude range by choosing the camera baseline so that the center template of one camera is always visible by the other. Matching individual templates as they move through images is more accurate, but significantly limits the allowable template travel region to guarantee visibility by both cameras.
2.4.2.1 Image Center Range Estimation
The odometer estimates the image center range by locating the new potential templates captured from the main camera image center in the rear camera image. The two observations of the same object along with known camera baseline and focal lengths allow the odometer to solve for camera range.
Figure 2-9 shows the processing steps and the associated variables of this process.
The odometer normalizes the main camera template intensity to match the intensity of the rear camera template before the matching operation. The intensity of the previous successful match is used for this normalization. The normalized main camera template is then located in the rear camera image. Because the cameras are assumed to be accurately aligned, the match is centered vertically (y coordinate is zero) and offset in the x direction.
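For two accurately aligned cameras, the range follows from the horizontal offset (disparity) between the two observations. The sketch below uses the standard pinhole stereo relation, range = focal length × baseline / disparity, and assumes both cameras share one focal length expressed in pixels. The function and parameter names are illustrative, and the source's own formulation may differ.

```python
def stereo_range(
    focal_length_px: float,
    baseline_m: float,
    x_main_px: float,
    x_rear_px: float,
) -> float:
    """Range to the target from its x coordinates in two aligned cameras.

    focal_length_px: focal length in pixels (assumed equal for both cameras)
    baseline_m:      distance between the camera centers in meters
    x_main_px:       x pixel coordinate of the template in the main camera
    x_rear_px:       x pixel coordinate of the match in the rear camera
    """
    disparity = abs(x_main_px - x_rear_px)
    if disparity == 0:
        raise ValueError("zero disparity: range cannot be resolved")
    return focal_length_px * baseline_m / disparity
```

Since the disparity shrinks as range grows, the camera baseline limits the altitude band over which a usable range estimate is available.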
2.4.2.2 Range Interpolation
Relying on a locally flat ground assumption, the odometer interpolates the image center range to estimate the range of templates in arbitrary image locations by using measured helicopter attitude. Figure 2-10 depicts the relevant variables and vectors used for this range interpolation.
Referring to Figure 2-10, the odometer interpolates the image center range, R, to find the template range, T, using current helicopter attitude in the ground frame. Using similar triangles, we find the following relationship between T and R:
and W are unit vectors pointing to the current template object position and along the main camera z axis, respectively, and are defined as follows:
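One reading of this similar-triangles relationship, under the locally flat ground assumption, is sketched below. It treats R and T as distances along the center ray and the template ray, takes both rays as unit vectors in the ground frame, and assumes the ground frame z axis points down. These conventions, the name `u` for the template direction, and the function names are assumptions for illustration and may not match the exact form of the source equations.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def interpolate_range(
    center_range: float,
    u: Sequence[float],
    w: Sequence[float],
    down: Sequence[float] = (0.0, 0.0, 1.0),
) -> float:
    """Template range T from image center range R over flat ground.

    u: unit vector toward the template object (ground frame)
    w: unit vector along the main camera z axis (ground frame)
    Both rays reach the same flat ground, so they share one camera height:
    R * (w . down) = T * (u . down).
    """
    cos_center = dot(w, down)
    cos_template = dot(u, down)
    if cos_center <= 0.0 or cos_template <= 0.0:
        raise ValueError("ray does not intersect the ground plane")
    height = center_range * cos_center
    return height / cos_template
```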
2.4.3 Pixel Velocity at the Image Center
The odometer senses the pixel velocity, or optical flow, at the image center to estimate helicopter velocity. It senses the optical flow by locating the center template of the current image, which is continuously captured to serve as a potential new template, in the successive image. This operation is outlined in Figure 2-11.
The odometer matches the previous center template to the current image. The found location, $(x_p^{im}, y_p^{im})$, indicates the image displacement in one algorithm period, which determines pixel velocity.
2.4.4 Template Matching
The visual odometer tracking algorithm is built upon image displacement measurement by template matching; therefore, positioning accuracy and robustness depend directly on the odometer’s template matching capability.
2.4.4.1 Matching Criteria
The visual odometer employs the Sum of the Squared Differences (SSD) matching criterion to locate templates in incoming camera images. The SSD criterion, one of a large class of image comparison strategies, is the traditional choice because of its proven effectiveness in many object tracking applications. In each image, the odometer searches for the location yielding the minimum SSD of image and template pixels to locate a matching area. Therefore, the odometer must compute the following:
\[ SSD(D_x, D_y) = \sum_{x=1}^{n} \sum_{y=1}^{m} \left[ I(x + D_x,\, y + D_y) - T(x, y) \right]^2 \qquad \text{(2-21)} \]
for each examined image location, where $I(x, y)$ represents the image intensities, $T(x, y)$ represents the template intensities, $(D_x, D_y)$ represents the image location being examined, and $(n, m)$ represents the template dimensions in pixels.
Evaluating the SSD criterion is computationally expensive. Examining each image location requires nm multiplications and subtractions. The computational cost of finding the template can be reduced by restricting the search area to a small window around the previous successful template match; the cost grows linearly with the area of this window.
The odometer examines a search area around templates based on an experimentally determined maximum change in template location within one processing period. As the helicopter altitude decreases, the same translational motion causes a larger displacement in the image, and the entire image may need to be searched to locate the template match candidates. The odometer’s search implementation is described in Section 3.6.
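A minimal sketch of SSD matching restricted to a window around the previous match is given below. Images and templates are plain row-major lists of intensities, and the `radius` parameter stands in for the experimentally determined maximum displacement. The names and the data layout are assumptions for illustration, not the odometer's implementation.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def ssd(image: Image, template: Image, dx: int, dy: int) -> int:
    """Sum of squared differences with the template's corner at (dx, dy)."""
    total = 0
    for y, row in enumerate(template):
        for x, t in enumerate(row):
            diff = image[dy + y][dx + x] - t
            total += diff * diff
    return total


def best_match(
    image: Image, template: Image, center: Tuple[int, int], radius: int
) -> Optional[Tuple[int, int, int]]:
    """Return (score, dx, dy) of the minimum SSD within the search window.

    The window spans +/- radius pixels around `center` (the previous match)
    and is clipped so the template always lies inside the image.
    """
    m, n = len(template), len(template[0])
    h, w = len(image), len(image[0])
    cx, cy = center
    best = None
    for dy in range(max(0, cy - radius), min(h - m, cy + radius) + 1):
        for dx in range(max(0, cx - radius), min(w - n, cx + radius) + 1):
            score = ssd(image, template, dx, dy)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best
```

Each candidate costs nm multiply-subtract operations, so the total work is proportional to the number of candidate locations, which is (2·radius + 1)² for an unclipped window.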
2.4.4.2 Coarse to Fine Search
The odometer employs a coarse-to-fine strategy to further reduce the processing cost. Initially, the template and image are subsampled by calculating the SSD of every fourth pixel to narrow the search to a 9x9 pixel area, as shown in Figure 2-12. The subsampled match is then improved by computing the SSD at the unexamined pixels within the subsampling neighborhood.
Image subsampling can be susceptible to mismatches, especially in images with highly contrasting intensities. Typically, an image pyramid is constructed by interpolating adjacent pixels for multi-resolution searches. The interpolation is computationally expensive, but necessary for consistent template matching in high-contrast images. For the helicopter application, however, images must be filtered because of the significant noise from the power plant and on-board electronics; this filtering significantly lowers image contrast and eliminates the need for pixel interpolation. In fact, by smoothing images with an 8x8 Gaussian convolution mask, the odometer produced consistent matches of highly contrasting natural vegetation by subsampling alone.
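The two-stage search can be sketched as follows. This is one interpretation of the description above: the coarse stage visits every fourth candidate location and sums over every fourth template pixel, and the fine stage evaluates the full SSD over the 9x9 neighborhood (±4 pixels) of the coarse winner. The function names and the `stride` parameter are assumptions, and no image smoothing is included.

```python
from typing import Iterable, List, Tuple

Image = List[List[int]]


def ssd(image: Image, template: Image, dx: int, dy: int, step: int = 1) -> int:
    """SSD at offset (dx, dy), using every `step`-th template pixel."""
    total = 0
    for y in range(0, len(template), step):
        for x in range(0, len(template[0]), step):
            diff = image[dy + y][dx + x] - template[y][x]
            total += diff * diff
    return total


def search(
    image: Image, template: Image, xs: Iterable[int], ys: Iterable[int], step: int
) -> Tuple[int, int, int]:
    """Return (score, dx, dy) of the best candidate among xs × ys."""
    best = None
    xs = list(xs)
    for dy in ys:
        for dx in xs:
            score = ssd(image, template, dx, dy, step)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best


def coarse_to_fine_match(
    image: Image, template: Image, stride: int = 4
) -> Tuple[int, int, int]:
    """Coarse subsampled search followed by a full search around its winner."""
    max_dx = len(image[0]) - len(template[0])
    max_dy = len(image) - len(template)
    # Coarse stage: every `stride`-th location and template pixel.
    _, cx, cy = search(
        image, template,
        range(0, max_dx + 1, stride), range(0, max_dy + 1, stride), stride,
    )
    # Fine stage: all locations within +/- stride of the coarse winner.
    xs = range(max(0, cx - stride), min(max_dx, cx + stride) + 1)
    ys = range(max(0, cy - stride), min(max_dy, cy + stride) + 1)
    return search(image, template, xs, ys, 1)
```

The coarse stage only narrows the search reliably when the SSD surface varies smoothly between sampled locations, which is the role the image smoothing plays in the text above.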
where A is an n x 6 matrix with each row representing one of n match candidates and e is a vector of SSD errors for the corresponding pixels. Each row of A consists of the parabola variables,
a constant provided the subsampled match candidates are always translated to the image center before fitting the parabola. With this approach, the right hand side of (2-23) can be calculated by one matrix vector multiplication at run-time. Figure 2-13 shows an example of a fitted parabola to an (8x8) pixel grid of match candidates.
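A minimal sketch of such a fit is given below, assuming the six parabola variables in each row of A are ordered x², y², xy, x, y, 1 and that the fit is an ordinary least-squares solution through the pseudo-inverse (AᵀA)⁻¹Aᵀ. Because the candidate grid is always centered at the origin, this pseudo-inverse depends only on the grid and can be computed once; each run-time fit is then a single matrix-vector product. The row ordering, the function names, and the pure-Python matrix inversion are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def invert(m: Matrix) -> Matrix:
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def design_matrix(points: Sequence[Tuple[float, float]]) -> Matrix:
    """One row of parabola variables per match candidate (assumed ordering)."""
    return [[x * x, y * y, x * y, x, y, 1.0] for x, y in points]


def pseudo_inverse(a: Matrix) -> Matrix:
    """(A^T A)^-1 A^T, computed once for a fixed, centered candidate grid."""
    cols = len(a[0])
    ata = [[sum(r[i] * r[j] for r in a) for j in range(cols)] for i in range(cols)]
    ata_inv = invert(ata)
    return [[sum(ata_inv[i][k] * row[k] for k in range(cols)) for row in a]
            for i in range(cols)]


def subpixel_minimum(pinv: Matrix, errors: Sequence[float]) -> Tuple[float, float]:
    """Fit the parabola with one matrix-vector product and return its minimum."""
    a, b, c, d, f, _ = [sum(p * e for p, e in zip(row, errors)) for row in pinv]
    # Set the gradient of a*x^2 + b*y^2 + c*x*y + d*x + f*y + g to zero.
    det = 4.0 * a * b - c * c
    return (c * f - 2.0 * b * d) / det, (c * d - 2.0 * a * f) / det
```

The returned coordinates are relative to the grid center, so they are added to the integer match location to obtain a subpixel estimate.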