THE DESIGN OF GEOREFERENCING TECHNIQUES FOR UNMANNED AUTONOMOUS
AERIAL VEHICLE VIDEO FOR USE WITH WILDLIFE INVENTORY SURVEYS: A CASE
STUDY OF THE NATIONAL BISON RANGE, MONTANA
BENJAMIN EVERETT WILKINSON
A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
© 2007 Benjamin Everett Wilkinson
To my parents and Zhao.
My fiancee, Zhao, somehow always finds time to make me ridiculously happy while first-
authoring papers in prestigious virology journals, volunteering, and working crazy hours. I
cannot believe I get to marry such an incredible woman. I also want to thank Zhao's and my
parents, for being unconditionally proud of me, faithful, and always encouraging. I want to
thank my dad for suggesting that I go to Reed Lab and talk to Dr. Gibson about the geomatics
program, and I want to thank my mom for being the strongest person I know.
My supervisory committee members were very helpful in the preparation of this thesis, and
I would like to thank them for their time, knowledge, and patience. I look forward to working
again with Drs. Bon Dewitt, Franklin Percival, and Scot Smith in the coming years as I pursue a
Ph.D. at UF. I would especially like to thank my supervisory committee chair, Dr. Dewitt, for
always having his door open even though he knows I could be on my way to bug him about
something at any moment. He is a great teacher and friend, and I feel lucky to be able to work with him.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

1 INTRODUCTION
    Background on the National Bison Range
    Wildlife Inventory Surveys
    Unmanned Autonomous Aerial Vehicles
    The University of Florida's Wildlife UAV Program
    Georeferencing and Orthophoto Mosaicking of UAV Images

2 UAV VIDEOGRAPHY GEOREFERENCING PROCESS
    Introduction
    The Dataset
    Tie and Pass Points, and the Bundle Adjustment
    Orthorectification
    The Developed Method
        Step 1: Decompilation of Video Data into Individual Frames
        Step 2: Conversion of GPS and INS Navigation Data to a Local Vertical Coordinate System and Photogrammetric Standard Definitions, Respectively
        Step 3: Interpolation of Resulting Navigation Data and Linking of the Data to Individual Images
        Step 4: Determination of Pass-points for a Strip of Photos using Rigorous Point-matching Techniques
        Step 5: Determination of Interior Orientation Parameters and Relative Orientation for Subsets of Photos in a Strip using Arbitrary Coordinate Systems
        Step 6: Attaching Relatively Oriented Subsets Using Multiple 3D Coordinate Transformations from Subsets along the Entire Strip
        Step 7: Transforming the Resulting Points into 3D Ground Coordinates using Navigation Data
        Step 8: Performing a Bundle Adjustment on the Entire Strip Using the Results of the Previous Transformation as Initial Approximations
        Step 9: Attaching Each Strip (Found from Steps 4-8) from the Flight to Each Other
        Step 10: Performing a Final Bundle Adjustment Using All Strip Observations
        Step 11: Creation of a DTM from the Resulting Points
        Step 12: Using the DTM and Absolute Orientation Parameters, Creating a Digital Orthophoto Mosaic of the Project Area

3 DESIGN OF AN ADAPTIVE POINT MATCHING ALGORITHM
    Introduction
    The Algorithm
        Calibration
        Allocation and Initialization
        Iterations of Adaptive Point Matching
            Normalized cross-correlation
            Least squares image matching
            Position prediction and blunder detection by conformal transformation
            Adaptation
            Last chance for unmatched points
            General least-squares projective transformation check
            Establishing new points
        Formatting and output

4 RELATIVE ORIENTATION
    Introduction
    Self-Calibration
    A Relative Orientation Method for the NBR UAV Dataset
    Singular Value Decomposition

5 CAMERA CALIBRATION
    Introduction
    DOQ Scale Comparison Method
    Airborne Calibration (Ground Control Method)

6 CONCLUSION

APPENDIX: MATCHED IMAGE SUBSET
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
1-1. Navigation System Specifications.
LIST OF FIGURES
1-1. Location of Flathead Indian Reservation, Montana (in pink). The NBR project area is located within the red square.
1-2. Navigation points from the four UAV flights over the National Bison Range.
1-3. 3D map of navigation points from the four UAV flights over the National Bison Range.
2-1. Flowchart of the developed overall process for georeferencing UAV videography.
3-1. Image pair matching epoch flowchart.
3-2. Image coordinates of point 1 for 116 consecutive images from flight 2 of the NBR mission.
3-3. Image coordinates for point 1 for the first 20 consecutive images illustrating retrograde motion of the point coordinates in the x direction on the 11th epoch of image pair matching.
3-4. Left image: grid spacing of 64, template dimension of 64. Right image: grid spacing of 32, template dimension of 16.
3-5. Poorly matched image pair 1.
3-6. Poorly matched image pair 2.
3-7. Well-matched image pair.
3-8. Original (right) vs. resampled arrays (left) of a point for the 1st, 20th, 30th, and 90th photo pair, from bottom to top.
3-9. New point locations using the anticipation method (a) vs. the random method (b).
4-1. Lines 2 and 6 of the 2nd flight over the NBR.
4-2. Comparison of base-height ratios of 0.6 (a) and 0.4 (b).
4-3. Increase in pitch at the end of line 2 associated with anticipatory turning of the UAV (a), and the plotted section of line 2 highlighted in red (b).
4-4. Effects of divergent photography on base-height ratio.
4-5. Statistics for the only converging solution found in line 6 (a) and a solution from line 2 (b).
4-6. Plan view of the digital terrain model created from relative 3D coordinates from a relative orientation in line 2 (a), and 3D view of the same model (b).
5-1. Field calibration target point (a) and the control point cage (b) at the USGS EROS Data Center, Sioux Falls, SD.
5-2. Camera calibration team at EROS performing a calibration of a digital camera using the CPC method.
5-3. Wide-area (a) and close-up (b) DOQ imagery of the National Bison Range's Summit Road.
5-4. DOQ imagery (a) and UAV imagery (b) of Summit Road.
5-5. Plot of GPS altitude (green) and barometric altitude (blue) and their differences.
5-6. Plot of GPS altitude (green) and barometric altitude (blue) with a 13-second shift applied.
5-7. Plot of GPS altitude (green) and barometric altitude (blue) with shifts accounting for all data gaps greater than 2 seconds applied.
5-8. Planned (blue) versus actual (red) calibration flight lines illustrating the exaggerated autopilot anticipation of turns.
5-9. Second flight of the NBR mission.
A-1. Images 1-6 of the matched image set.
A-2. Images 7-14 of the matched image set.
A-3. Images 15-22 of the matched image set.
A-4. Images 23-30 of the matched image set.
A-5. Images 30-37 of the matched image set.
A-6. Images 38-45 of the matched image set.
A-7. Images 46-53 of the matched image set.
A-8. Images 54-61 of the matched image set.
A-9. Images 62-69 of the matched image set.
A-10. Images 70-77 of the matched image set.
A-11. Images 78-85 of the matched image set.
A-12. Images 86-92 of the matched image set.
Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science
THE DESIGN OF GEOREFERENCING TECHNIQUES FOR UNMANNED AUTONOMOUS
AERIAL VEHICLE VIDEO FOR USE WITH WILDLIFE INVENTORY SURVEYS: A CASE
STUDY OF THE NATIONAL BISON RANGE, MONTANA
Benjamin Everett Wilkinson
Chair: Bon Dewitt
Major: Forest Resources and Conservation
Unmanned Autonomous Aerial Vehicles (UAVs) have great potential as tools for
collecting wildlife data. UAVs offer solutions to logistical, safety, statistical, and cost concerns
associated with collecting wildlife population data using traditional aerial methods. The
University of Florida recently designed and built a UAV, "The Tadpole," specifically for wildlife
applications. The Tadpole successfully flew several missions including four over the National
Bison Range in Montana collecting navigational and video data.
Aerial digital video data alone can be helpful when used for estimating populations of
wildlife. However, because the flight of the Tadpole was generally less stable than that of larger fixed-
wing aircraft and because of the low altitude of the flights (around 200m), the camera frame
tended to continuously move spasmodically along the flight path. This made the analysis of the
data problematic. Thus, it was determined that georeferencing and mosaicking of the video data
would serve as an important processing step preceding any population analysis of the collected data.
This thesis presents the design of georeferencing techniques for the UAV's videography,
including an adaptive image point matching algorithm and self-calibrating analytical
photogrammetric operations for videogrammetric data. Due to the highly dynamic nature of the
UAV, some of the images had geometry too poor for analytical photogrammetric solutions
without a priori camera calibration. However, the designed techniques proved successful in the
cases having a suitable geometric configuration. Suggestions for future research in order of
importance are: the addition of more plan waypoints to minimize the dynamic pitch and roll of
the UAV; the use of a camera with a wider field of view; and the use of more precise attitude and
position measurement devices.
Background on the National Bison Range
The National Bison Range (NBR) was established in 1908 near Moiese, Montana.
According to the NBR website, it is one of three reserves established between 1907 and 1909 by
President Theodore Roosevelt to preserve the American bison, Bison bison, following the
formation of the American Bison Society and public outcry over the sale of the last large
American bison herd to Canada. The NBR is home to between 350 and 500 bison, among other large wildlife including elk, white-tailed and mule deer, pronghorn antelope, bighorn sheep, and black bear. Spread out over 19,000 acres of prairie, forest, wetlands, and streams, the Range
is administered by the U.S. Fish and Wildlife Service. The website, maintained by the U.S. Fish
and Wildlife Service, also states "To keep the herd in balance with their habitat, surplus bison are
donated and/or sold live each year. Donations of animals are made to start or supplement other
public herds. Bison have also been donated to the InterTribal Bison Cooperative, which places
animals on tribal lands. Others are sold live and have been the nucleus of a number of private
herds located throughout the country" (U.S. Fish and Wildlife Service). The Range is
surrounded by the Flathead Indian Reservation (Figure 1-1), and was originally purchased from
tribal members by the U.S. government. Recently, tribal leaders--who claim that the sale of the NBR land was forced--have petitioned to be involved in the management of the Range and other
federal holdings within the Reservation citing the Self-Governance Act which allows federally
recognized tribes to apply for management of Interior Department functions that are of "special
geographic, historical and cultural significance" to a tribe (Robbins, 2003). The USFWS opposes
allowing the tribes to manage the Range for fear that, as sovereign entities, tribes will not be held
accountable for mismanagement.
Wildlife Inventory Surveys
Acquiring accurate population data is one of the main concerns in the development of
effective wildlife management strategies. Clearly, one needs to know the number of animals so
that differential target populations and population distributions can be established before
planning and implementing any harvesting strategies, conservation policies and/or management
protocols. Williams, Nichols, and Conroy (2001) state that:
On initiating an investigation, the first questions confronting a researcher concern the
number of individuals in the population and where they are found. Beyond a first look at
size and range, studies often focus on the investigation of biological relationships that are
sensitive, or at least potentially sensitive, to the number of organisms participating in
them. Without some idea of the size and spatial distribution of a population, it is
effectively impossible to investigate such size-dependent or density-dependent relationships.
Accurate population counts are needed in order to evaluate the success of management
policies and modify them in an adaptive management scheme, and according to Buckland,
Goudie, and Borchers (2000), "Despite the desires of many conservationists, active
management of wildlife populations and of the environments they inhabit is likely to increase."
Also, population model performance may be assessed by abundance estimates because "model
behavior that faithfully tracks changes in population status suggests that the model incorporates
the key biological factors affecting change." (Williams, Nichols, and Conroy, 2001)
Traditional inventory survey methods tend to be expensive when robust and are therefore often altered in ways that compromise the sampling design and the validity of the collected data (Roberts et al., 2001). Specifically, aerial surveys can be costly when one takes into account
mobilization, flying time and data processing, and are therefore often replaced with highly biased
"convenience" methods such as roadside counts. Owing to the use of aerial imagery, the NBR is
an example of the rare situation where a theoretically complete count of the population may be
possible; the bison are confined to a specific area, are conspicuous in the imagery, and their
population is small. Even terrestrial counting techniques using detectability models and other
empirical methods of reducing bias in population density estimates such as the distance sampling
method discussed in Rosenstock (2002) cannot achieve the impartial results of well planned
aerial photography/video interpretation.
Unmanned Autonomous Aerial Vehicles
Unmanned Autonomous Aerial Vehicles, or UAVs, have great potential as a tool for
collecting wildlife data. UAVs offer solutions to logistical, safety, statistical, and cost concerns
associated with collecting wildlife population data using traditional aerial methods. The
traditional manned aerial surveys have long been considered an invaluable tool when collecting
large scale biological data such as vegetation coverage, nesting censuses, and numerous wildlife surveys.
However, there are a host of problems associated with manned aerial surveys. About
66% of deaths of wildlife biologists in the field are related to aircraft crashes (Sasse, 2003). In
2003, 3 biologists and the pilot of a Cessna 337 Skymaster died in a crash off the Florida coast
while conducting a right whale census survey (St. Petersburg Times, 2003). The main reason
these flights are so dangerous is that the plane or helicopter usually has to fly at low altitude to
facilitate identification and assessment of the subject. Excluding mobilization costs, fixed-wing
aircraft pilots usually charge between $100 and $400 per flight hour. Owning and maintaining
an aircraft can cost tens of thousands of dollars or more. There are also many logistical problems
with wildlife surveys using manned aircraft. One example is that the mission area is often far
from the nearest airport. This can greatly influence the time and fuel spent for merely getting
into position to perform the survey.
The University of Florida's Wildlife UAV Program
Researchers at the University of Florida have recently designed and built a UAV ("The
Tadpole") specifically for wildlife studies. The system can be transported and operated by one
person, needing only about 100m of suitable terrain for takeoff and landing. Also, since the
UAV has a relatively quiet electric motor, animals are less-apt to be disturbed by the vehicle's
presence, providing an arguably less-biased sampling. The 2m wingspan UAV uses a
Procerus Kestrel autopilot system, which includes a micro GPS receiver, a three-axis solid-
state accelerometer inertial navigation system (INS) and a pressure altimeter. The positional
accuracy of the GPS system is estimated to be about 15m horizontally and the INS measures
angles with a precision of about 10° (see Table 1-1). Using pre-planned positional waypoints,
the UAV navigated along defined transects for long distances, recording video data over specific
areas. The UAV successfully flew several missions including four over the National Bison
Range (Figures 1-2 and 1-3), collecting navigational and video data. For the NBR mission, an
off-the-shelf digital camcorder (Canon Elura 20MC) was fixed to the hull of the UAV, which recorded digital video on tape while transmitting real-time footage to the base station. The entire system cost was approximately $35,000 (Lee, 2004; Jones, 2003).
Georeferencing and Orthophoto Mosaicking of UAV Images
The flight of the Tadpole is generally less stable than that of larger fixed-wing aircraft, and because of the low altitude of the flights (approximately 200m), the camera frame tended to continuously
"jerk" along the flight path. This, coupled with the disorientation caused by long term viewing
of the videos, makes the analysis of the data problematic. Thus, georeferencing and mosaicking
of the video data could serve as an important post-processing step preceding the analysis of the
collected data. Orthophoto mosaics, as opposed to video footage, also offer the advantage of
allowing for the simple determination of area. This is useful when estimating the spatial
distribution of the target subject. An ideal algorithm would use the UF UAV's navigation data and would not rely on ground control points, due to difficulties in defining ground control for dynamic rural areas such as the NBR. The method should be fast, repeatable, and adaptable.
UAV imagery has become a popular research area for applications such as homeland
security and emergency response. Because of the nature of such applications, the research
emphasis has been on speed of image processing and georeferencing, and the precision of
resultant images. For example, in their paper on UAV video georeferencing, Zhou, Li, and
Cheng (2005) discuss developing a real-time videogrammetric georeferencing scheme to be used
for forest-fire mapping. Their methods rely on auto-identification of tie points and ground
control points which can be problematic in areas with few buildings and roads (causing a lack of
distinct candidate point-areas), and tall trees (causing occlusion of potential ground patterns, and
higher relief-displacement between images). Also, since their method uses digital orthophoto quadrangles (DOQs) for registration of the UAV imagery, natural processes and land-use change
over time may hinder the identification of suitable ground control due to differences between the
UAV and DOQ images.
In applications where precision of the rectified image must be very high, such as when
one is calculating the area of a forest fire or its proximity to homes, ground control can help
ensure correct image positions. However, when one is using the images to find discrete quanta
of specific features such as the number of animals or trees, the precision of rectified images is
less important, and methods that do not employ ground control may be considered.
Figure 1-1. Location of Flathead Indian Reservation, Montana (in pink). The NBR project area is located within the red square.
Figure 1-2. Navigation points from the four UAV flights over the National Bison Range.
Figure 1-3. 3D map of navigation points from the four UAV flights over the National Bison Range. (Data use subject to license. © 2004 DeLorme, Topo USA 5.0.)
Table 1-1. Navigation System Specifications.
GPS: Furuno GH-80(81) Series GPS receiver; accuracy approximately 15 m horizontal.
INS: Kestrel Autopilot v2.2; attitude estimation error (roll and pitch): 5° in level flight, 10° during turns.
Altimeter: Kestrel Autopilot v2.2; altitude resolution: 0.116 m (0.249 m for v2.22).
UAV VIDEOGRAPHY GEOREFERENCING PROCESS
Introduction
There are five main steps in the process of creating a ground-adjusted
mosaicked/resampled image from video data. These are: (1) interpolation of navigation data and
decompilation of video into individual frames; (2) finding the interior orientation parameters,
the focal length and lens distortion parameters for the camera system; (3) finding the relative
exterior orientation parameters, the relative 3D position and angular attitude of the frames with
respect to each other; (4) finding the absolute exterior orientation parameters, the 3D position
and angular attitude of the frames in a mapping or ground coordinate system; and (5) resampling
the images to a ground coordinate system.
The designed method was implemented only partway through step 5 due to
several problems with the dataset discussed throughout this thesis. The method, however,
remains viable using the same system with some adjustments as discussed in the Conclusion.
The Dataset
In their description of appropriate sensors for intelligent video surveillance, Kumar et al.
(2001) note that cameras using interlaced format introduce aliasing that leads to poor image
matching and resampling, and thus progressive scan images should be used to ensure that the
decompiled images are continuous. The camera, a Canon Elura 20MC digital camcorder, captured images using the progressive scan method. Commercial software is readily available
that decompiles video into discrete images. Since the Tadpole system does not have a direct
time-stamping system for the image data, the images must be registered with the navigational
data specifically by capturing video of a synchronized clock directly before the flight. Because
the frequency of navigation data measurements is less than that of the video frames, the
navigation data (both GPS points and IMU angles) were interpolated over the time of the flight
using a cubic spline. Next the images were assigned times based on the synchronized clock and
nominal video frame rate and positional values based on the splined navigation data. As noted
by Baumker and Heimes (2001) in their calibration method for direct georeferencing using INS
data, for conventional photogrammetric equations to be used, INS data defined by ARINC-standard roll, pitch, and yaw (φ, θ, ψ) must be transformed into the photogrammetric orientation system omega, phi, and kappa (ω, φ, κ). In addition, an arbitrary vertical ground system must be
defined by converting the GPS-output geodetic coordinates.
Tie and Pass Points, and the Bundle Adjustment
Finding pass and tie points between images is a fundamental step in the
orientation/georeferencing process. Pass points are defined as conjugate points found on two
images in the direction of the flight of the aircraft. Conversely, tie points are conjugate points
found on images overlapping from different flight lines, usually flown in opposite directions.
Area based matching techniques may be used to automatically search for image tie and pass
points. Specifically a combination of normalized cross-correlation and least-squares matching
for refinement may be used to find these points with high precision (Wolf and Dewitt, 2000).
The middle three (orientation) steps are solved in a single iterative least-squares solution
called a bundle adjustment. However, since the least-squares collinearity-derived observation
equations are non-linear, initial approximations of the solutions must be found in order to
implement the Taylor's series iterative solution. Thus, a priori solutions must be found for the
absolute orientation of the images to ground. A solution to this prerequisite is included in the
method. Since, for convenience and speed, few or no ground control points should be used in
adjusting the images, GPS and INS data must be used to solve for the location of image points to
ground. Although the precision of the GPS and INS data used for navigation is relatively poor,
the high redundancy of image and point measurements statistically compensates for this
shortcoming, providing higher overall precision in the final solution. At 30 frames per second, a
flying height of 200m, flight speed of about 15m/s, images with dimensions 480x720 pixels, and
an estimated focal length of about 1000 pixels, we can expect decompiled images to "move" at
approximately 3 pixel rows per frame. Potentially, this can result in more than 140 observations
per pass point if one uses a 32x32 pixel template for matching. Normalized cross correlation and
least-squares matching have proven to be effective in automatically matching points in images,
and can certainly be applied to videogrammetric data. In other words, the unique advantage of
using video data is that the small change in image between frames facilitates the design of pass
point searching algorithms that need not rely on measured position data; thus, despite low-precision positional and attitudinal data, a high count of pass points may still be observed.
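As a quick check on these figures (a back-of-the-envelope calculation using the nominal values just given, not taken from the thesis):

```latex
\frac{15\ \text{m/s}}{30\ \text{frames/s}} = 0.5\ \text{m/frame},
\qquad
\text{GSD} \approx \frac{H}{f} = \frac{200\ \text{m}}{1000\ \text{pixels}} = 0.2\ \text{m/pixel}
```

```latex
\frac{0.5\ \text{m/frame}}{0.2\ \text{m/pixel}} = 2.5 \approx 3\ \text{pixel rows/frame},
\qquad
\frac{480 - 32}{3} \approx 149 > 140\ \text{observations per pass point}
```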
The final bundle adjustment solution will produce orientation parameters for all the
photos included which may then be used to construct an orthophoto mosaic. A useful feature of
the bundle adjustment solution is the determination of ground point locations for all pass and tie
points. These densely spaced 3D point coordinates may then be used as the framework for
creating a digital terrain model (DTM), a requirement for orthorectification and therefore for orthophoto mosaic production (Wolf and Dewitt, 2000).
The Developed Method
In general, although revisitation and refinement at some steps may apply, the overall
method may be considered linear. Each step of this method is dependent on the previous; thus, should one step fail or prove too complex in the context of the project, a substitute must be found
that will give results satisfactory for the next step. A flowchart of the process is shown in Figure
2-1. The designed steps involved in producing digital orthophoto mosaics from the given data
are the following.
Step 1: Decompilation of Video Data into Individual Frames
There are numerous commercial software programs available to decompile video into
individual frames. Bitmap (BMP) format is used as it is simpler to work with, and it doesn't
contain the aliasing artifacts found in compressed images. To facilitate the design and
implementation of the point matching algorithm and to reduce storage requirements, the images
are converted to 8-bit (grayscale). If necessary, the 24-bit (color) images may be used in the final orthophoto mosaic production step to allow easier recognition of target subjects.
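The thesis used commercial software for this step; purely as an illustration, the same decompilation can be sketched with OpenCV (the file names here are hypothetical):

```python
# Sketch of Step 1: decompile a video into 8-bit grayscale BMP frames.
import cv2

def decompile_video(video_path="flight2.avi", out_pattern="frame_%05d.bmp"):
    """Write each video frame as an uncompressed grayscale BMP."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()          # frame is a 24-bit BGR array
        if not ok:
            break                       # end of video
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # 24-bit -> 8-bit
        cv2.imwrite(out_pattern % index, gray)          # BMP inferred from name
        index += 1
    cap.release()
    return index                        # number of frames written
```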
Step 2: Conversion of GPS and INS Navigation Data to a Local Vertical Coordinate System and Photogrammetric Standard Definitions, Respectively
The coordinate system conversion from latitude, longitude, and height to local vertical X, Y, Z is performed using methods described in Wolf and Dewitt (2000). The observed navigation rotation angles are then converted to the photogrammetric system using methods similar to those described in Baumker (2001), a common but typically hidden procedure found in proprietary software. Roll, pitch, and yaw (φ, θ, ψ) are usually (but not always) defined as independent (non-sequential) rotations about fixed axes, while the photogrammetric standards omega, phi, and kappa (ω, φ, κ) are sequential rotations about the x, the once-rotated y, and the twice-rotated z axes. Since yaw is defined as a left-handed rotation about the stationary z axis, even if the differences between roll and omega, and between pitch and phi, are negligible, the rotation about the z axis, kappa, would still need to be changed--that is, reversed--in order to account for the hand-rule difference between yaw and kappa. Since this project uses imagery from a small, and therefore rotationally dynamic, aircraft, differences between the two coordinate systems are non-trivial, and thus a full conversion is required.
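As an illustration, a minimal sketch of this conversion is given below. It assumes right-handed roll and pitch, a left-handed yaw about the fixed z axis, and a z-y-x composition order; the exact composition depends on the autopilot's conventions, so this is an assumption rather than the thesis's implementation. The angle extraction follows the (ω, φ, κ) parameterization in Wolf and Dewitt (2000).

```python
import numpy as np

def rpy_to_opk(roll, pitch, yaw):
    """Convert navigation angles (radians) to photogrammetric omega, phi, kappa."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(-yaw), np.sin(-yaw)   # sign reversed: yaw is left-handed
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    # One common choice of ground-to-image matrix; verify against the
    # autopilot's actual angle definitions before relying on it.
    M = (Rz @ Ry @ Rx).T
    phi = np.arcsin(M[2, 0])              # m31 = sin(phi)
    omega = np.arctan2(-M[2, 1], M[2, 2]) # m32 = -sin(omega)cos(phi)
    kappa = np.arctan2(-M[1, 0], M[0, 0]) # m21 = -cos(phi)sin(kappa)
    return omega, phi, kappa
```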
Step 3: Interpolation of Resulting Navigation Data and Linking of the Data to Individual Images
In this step, a cubic spline is used as an interpolant for the data along given flight lines. The INS data are splined as well, with results satisfactory for the required precision. Using before and after video shots of a clock and thereby mapping the individual frames to time, the navigation data file, which includes a time entry, is used to map each image to a given X, Y, Z and (ω, φ, κ), as well as height above ground measured using the barometric altimeter. The spline works best on straight, fast trajectories. However, on turns, when the UAV is flying slower, the GPS data sometimes indicate a reverse movement of the plane, resulting in a "zigzagging" of the trajectory. This may be due to actual movement of the plane or to loss of lock on GPS satellites due to banking of the UAV. Either way, the spline interpolant does not do well in this situation, so when planning flights, care must be taken to ensure that flight lines over the project area are straight--extending past the area if possible--noting that these straight flight lines are also important to ensure complete coverage of the project area.
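A minimal sketch of this interpolation with SciPy; the navigation rate and array names are illustrative, not from the thesis:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_navigation(nav_times, nav_values, frame_times):
    """Interpolate navigation records (X, Y, Z or omega, phi, kappa columns)
    at the time of each decompiled video frame."""
    spline = CubicSpline(nav_times, nav_values)   # fits each column
    return spline(frame_times)

# Example: 4 Hz navigation data interpolated at the 30 fps frame times.
nav_t = np.arange(0.0, 10.0, 0.25)
nav_xyz = np.column_stack([nav_t * 15.0,              # ~15 m/s along X
                           np.zeros_like(nav_t),
                           200.0 + np.sin(nav_t)])    # ~200 m altitude
frame_t = np.arange(0.0, 9.75, 1.0 / 30.0)
frame_xyz = spline_navigation(nav_t, nav_xyz, frame_t)
```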
Step 4: Determination of Pass-points for a Strip of Photos using Rigorous Point-matching Techniques
A rigorous method for generating numerous, precise point-matching observations is then
used to find pass points along individual flight lines. Due to the dynamics of the system, the
algorithm is adaptive with respect to the changing attitude and speed of the UAV. Also, in order
to retain proper geometry, and therefore proper conditioning of bundle adjustment equations, the
points must be well distributed throughout each image. Beginning with a nominal 480x720 pixel
image, the first frame is divided into 77 grid squares with a 32x32 pixel point template located in
the center of each. The algorithm then checks the same location in the next contiguous photo for
each point and finds the location of the point therein using normalized cross correlation and a
user-specified minimum correlation coefficient.
Next, if cross correlation matching is successful, the point location is refined using least
squares image matching and the resulting resampled template array is stored for use in the next
image. After finding more than five common points between images, a 2D conformal coordinate
transformation is used to estimate the beginning location of the search for each of the remaining
points. This coordinate transformation also serves as the first check on the correctness of the
location of the matched points, and points having a high standard deviation are flagged. After
iterating through all points, any points that were not matched satisfactorily using normalized
cross correlation or were flagged in the location estimation transformation are given replacement
point locations based on a conformal transformation of all successfully matched points, their
newly-defined positions subsequently refined by least squares matching.
Finally, using a general least squares solution to a proj ective coordinate transformation of
points from the template photo to the current photo, the points are all checked for validity. If a
user-defined minimum number of valid tie-points is not found in the current photo iteration, the
matching is restarted on the current photo with either a relaxed correlation coefficient minimum
or a dilated search area. Should any points not be matched at this stage, new points are
established for the current image. The process is repeated for all images in a specified strip of
frames, the result being the photo coordinates of pass points between all photos. This algorithm
is described in more detail in Chapter 3.
Step 5: Determination of Interior Orientation Parameters and Relative Orientation for
Subsets of Photos in a Strip using Arbitrary Coordinate Systems
Using an arbitrary coordinate system defined by treating the initial estimation for the
focal length as the estimate for flying height above ground (and therefore scaling found ground
point coordinates to image scale, i.e. 1 pixel units), and estimating the photo bases by using the
found coefficients from an affine coordinate transformation solution utilizing image pass points,
a relative orientation solution is computed for photos in a subset. Image subsets consist of a
group of contiguous photos where the first and last photos have a nominal number of pass-points
in common. Since a self-calibrating method is used for the relative orientation, the solution can
include interior orientation parameters, although focal length cannot be rigorously determined
because of its dependence on the photo base estimates. This method depends greatly on the
geometry of the photography and estimations of the photo base distances in order to achieve the
best solution. The results are the relative position and orientation of all photos in terms of the
first photo in the chunk and 3D coordinates for all pass points at photo scale. This step is
described in more detail in Chapter 4.
Step 6: Attaching Relatively Oriented Subsets Using Multiple 3D Coordinate
Transformations from Subsets along the Entire Strip
A number of sequential 3D conformal coordinate transformations are then used to
reference the solved-for points (including the photo station coordinates) for each subset into a
single system, resulting in the relative orientations and positions of all photos in a strip in terms
of the first photo of the first chunk. The solutions also include precisions of the transformations,
and will help detect additional outliers at this stage in the overall process. Note: Steps 5 and 6
are done on all strips for the entire flight.
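As an illustration of one way to compute such a transformation, the sketch below uses a closed-form SVD (Horn/Umeyama-style) solution rather than the iterative least squares form; both recover the scale, rotation, and translation implied by the points two subsets share. Array names are illustrative.

```python
import numpy as np

def conformal_3d(src, dst):
    """Fit dst ~ s * R @ src + t from corresponding 3D points (n x 3 arrays)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - src_c, dst - dst_c                 # centered coordinates
    U, S, Vt = np.linalg.svd(B.T @ A)               # cross-covariance SVD
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt                                  # proper rotation matrix
    s = np.trace(np.diag(S) @ D) / (A ** 2).sum()   # least squares scale
    t = dst_c - s * R @ src_c
    return s, R, t

# Chain a new subset into the first subset's system through shared points:
# s, R, t = conformal_3d(pts_subset2_common, pts_subset1_common)
# pts_in_system1 = (s * (R @ pts_subset2.T)).T + t
```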
Step 7: Transforming the Resulting Points into 3D Ground Coordinates using Navigation Data
Since, in addition to the arbitrary 3D coordinates of the pass-points, the arbitrary 3D positions of the photo stations are determined, navigation observations of the UAV can be used to
transform all the arbitrary system points' coordinates into the defined local vertical coordinate
system. Again, a 3D conformal coordinate transformation is used. This step relies on the
precision of the GPS observations, which we know to be poor. However, by constraining certain parameters of the transformation, specifically ω (rotation about the x axis), enhancement of the
precisions of the pass points involved is possible because of the high redundancy of
observations. The results serve as initial approximations for a bundle adjustment in the ground
(navigation/control) coordinate system in the next step.
Step 8: Performing a Bundle Adjustment on the Entire Strip Using the Results of the
Previous Transformation as Initial Approximations
This time, the entire strip is included in the solution. At this point, a more efficient
method for solving the bundle adjustment is needed due to the high number of observations. The
inclusion of the observations of all chunks in a strip will greatly increase the time and space
needed compared to computation of individual chunks due to the quadratic space and cubic time
complexity of the bundle adjustment solution. This may be avoided through decimation of the
data by eliminating photos and points from the solution. The possibility for this option exists
because of the high redundancy of observations, however, one must take care as to not remove
too many observations and greatly diminish the advantages of having a large number of
observations--practically the only advantage the dataset has over conventional photogrammetric
systems. The best solution, however, is still to use an algorithm well suited for solving large
sparse positive-definite systems.
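One possible realization, sketched with SciPy's trust-region solver: the Jacobian sparsity pattern below encodes the fact that each image measurement touches only one photo's six exterior orientation parameters and one ground point's three coordinates. The residual function reprojection_residuals is hypothetical and not shown.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.optimize import least_squares

def sparsity_pattern(n_photos, n_points, photo_idx, point_idx):
    """Build the 0/1 Jacobian structure for one x and one y residual per ray;
    photo_idx and point_idx are integer arrays, one entry per measurement."""
    m = 2 * len(photo_idx)
    n = 6 * n_photos + 3 * n_points
    A = lil_matrix((m, n), dtype=int)
    i = np.arange(len(photo_idx))
    for k in range(6):                    # photo's exterior orientation block
        A[2 * i, 6 * photo_idx + k] = 1
        A[2 * i + 1, 6 * photo_idx + k] = 1
    for k in range(3):                    # ground point block
        A[2 * i, 6 * n_photos + 3 * point_idx + k] = 1
        A[2 * i + 1, 6 * n_photos + 3 * point_idx + k] = 1
    return A

# res = least_squares(reprojection_residuals, x0, jac_sparsity=A,
#                     method='trf', tr_solver='lsmr')
```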
Step 9: Attaching Each Strip (Found from Steps 4-8) from the Flight to Each Other
This will be done by finding tie points between strips. In order to reduce search time,
a method to estimate the location of points before utilizing a matching scheme is needed. A 2D
or 3D Voronoi proximity algorithm may be used to search for close points. Once close points
have been established, either a manual or automatic method similar to the pass-point matching
algorithm is used to find tie points between photos on different strips.
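A k-d tree query, sketched below, provides the same close-point information as the Voronoi proximity approach mentioned above; the distance threshold is illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def candidate_tie_points(points_strip_a, points_strip_b, max_dist=5.0):
    """Return index pairs (i, j) of points from two strips lying within
    max_dist of each other in the shared ground coordinate system."""
    tree = cKDTree(points_strip_b)
    pairs = []
    for i, p in enumerate(points_strip_a):
        dist, j = tree.query(p)          # nearest point in strip b
        if dist <= max_dist:
            pairs.append((i, j))
    return pairs
```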
Step 10: Performing a Final Bundle Adjustment Using all Strip Observations
Once all strips are attached, adjustment of all the images in a final single solution is
performed. This step will provide the final values for (ω, φ, κ) and (X, Y, Z) needed to resample
the images into the ground coordinate system. Also it will provide the ground coordinates for
each tie point. As in Step 8, a more efficient algorithm for the bundle adjustment solution is
critical to this step.
Step 11: Creation of a DTM from the Resulting Points
Although more involved methods for finding the DTM used in orthorectification are usually required, especially in urban areas or places with sharp relief, the NBR's topographic
relief should be modeled well by the 3D points found in the final bundle adjustment solution.
There are numerous methods for creating a continuous 3D surface using discrete points.
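As one minimal choice among those methods, a sketch of Delaunay-based linear interpolation of the adjusted points onto a regular grid (the cell size is an assumed value):

```python
import numpy as np
from scipy.interpolate import griddata

def build_dtm(xyz, cell=1.0):
    """Grid scattered (X, Y, Z) points into a regular DTM by linear
    interpolation over their Delaunay triangulation."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    xi = np.arange(x.min(), x.max(), cell)
    yi = np.arange(y.min(), y.max(), cell)
    XI, YI = np.meshgrid(xi, yi)
    ZI = griddata((x, y), z, (XI, YI), method='linear')  # NaN outside hull
    return XI, YI, ZI
```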
Step 12: Using the DTM and Absolute Orientation Parameters, Creating a Digital
Orthophoto Mosaic of the Project Area
Using the collinearity equations, the DTM, and the solved (ω, φ, κ) and (X, Y, Z) position of the photos, we can map groundel (ground element) values for any region of the project area to photo pixel values. The result is an orthophoto mosaic in our local vertical system, which can
be, if needed, converted to geodetic coordinates.
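A sketch of the groundel-to-pixel mapping via the collinearity equations (Wolf and Dewitt, 2000); principal-point offsets and lens distortion are omitted for brevity, and M and f are assumed to come from the adjustment, with f in pixel units:

```python
import numpy as np

def ground_to_image(X, Y, Z, XL, YL, ZL, M, f):
    """Project a DTM groundel into photo coordinates; M is the 3x3
    ground-to-image rotation matrix built from omega, phi, kappa."""
    dX, dY, dZ = X - XL, Y - YL, Z - ZL
    denom = M[2, 0] * dX + M[2, 1] * dY + M[2, 2] * dZ
    x = -f * (M[0, 0] * dX + M[0, 1] * dY + M[0, 2] * dZ) / denom
    y = -f * (M[1, 0] * dX + M[1, 1] * dY + M[1, 2] * dZ) / denom
    return x, y   # sample the photo here to fill the orthophoto cell
```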
Figure 2-1. Flowchart of the developed overall process for georeferencing UAV videography
DESIGN OF AN ADAPTIVE POINT MATCHING ALGORITHM
Introduction
Tie and pass point observations are employed in several steps of the overall process of
georeferencing the NBR UAV imagery. The flow chart in Figure 2-1 illustrates that they are
used for estimating the photo bases for the relative orientation step, and that they serve as the
only directly measured observations therein. They are also vital observations for both the bundle
adjustment to form each individual flight strip, and the bundle adjustment of the entire flight
performed after finding tie points between the flight strips. Because of the utmost importance of
these observations, it follows that the algorithm used to find the points should be rigorous in its design.
The point matching algorithm developed for this project (Figure 3-1) exploits the fact that
video imagery is being used, and therefore the resulting images are very close to each other in
terms of the field captured in the frames. In other words, any point found on a given image will
have photo coordinates in close proximity to the coordinates of the same point found on images
directly preceding and succeeding. Thus, instead of searching an entire image or a large portion
of an image for a conjugate point, the search area can be reduced to a subset of the image based
on the point's coordinates in the previous image. This advantage is significant because of the
weak position and attitude data measured by the navigation system of the UAV; if the system
determined more accurate angular attitude and position of the camera, these navigational
observations could be used to estimate the position of points on subsequent images and may not
need the high number of frames resulting from the use of videography.
Although the flight paths are nominally straight, due to the small size of the aircraft, the
attitude and position of the camera is very dynamic. This can have a dramatic effect on the
relative position of conjugate points in adjacent images. For example, in the ideal case of
perfectly straight, horizontal and non-crabbing flight in a nominal x direction of the plane,
sequential conjugate points will be found in positions with the same y photo coordinate and with
uniformly decreasing x photo coordinates. Using the same example and introducing moderate
amounts of dynamic pitch (rotation about the y axis), the points will still have essentially the
same y photo coordinates, but the x photo coordinates will no longer change uniformly. In fact,
depending on the speed, direction, and magnitude of the change in pitch relative to the speed,
flying height, and focal length of the camera, the changing x photo coordinates of conjugate
points may even reverse direction. Figures 3-2 and 3-3 illustrate this effect of attitude change on
the positions of instances of a point for 116 images along a nominally straight flight path. Figure
3-2 shows the overall variability of the photo coordinates, and Figure 3-3 is a close-up of the first few photo coordinate positions showing an example of retrograde motion of points in the x direction. Note that the plot of the point moves temporally from right to left, as the x photo coordinate decreases while the camera moves in the X direction. Because the six camera position and attitude
parameters are constantly changing, determining a trend in the change of photo coordinates for points is not computationally feasible, relative to its implementation cost, for estimating the position of sequential pass points.
This generally unpredictable movement also has an effect on the aforementioned search
area for pass points. A decision must be made about how much of the image is to be searched
after determining a starting point for an attempt to match a point. It is not efficient to search the
entire image for a matching array of pixels when it is very unlikely that a point will be more than
a nominal distance from the decided starting point. In contrast, if a search area size is chosen
that is not sufficiently large for matching points on all the images along a flight path, not enough
points will be acquired to adequately connect the photos and continuity of the flight line will be
lost unless the point search area size is adaptive. Thus, a nominal search area size should be
selected that is adequately large for a high percentage of photo pairs to be matched along a flight
path and has been made adaptive to account for disparities in the required search areas. The
developed point matching algorithm takes these considerations into account, and produces pass
point photo coordinates for the length of each input flight line. The input consists of a list of
images, the grid spacing, the point template dimensions, the search area dimensions, and the
correlation threshold for the normalized cross correlation matching.
The Algorithm
The four overall steps in the algorithm consist of calibration, allocation and initialization,
iterations of adaptive point matching, and formatting and output. Each image pair matching
epoch, determination of all findable pass-points between two images, consists of iterations
through the cycle illustrated in Figure 3-1. Each initial search for pass points is represented by a
counter-clockwise traverse through the flowchart starting from the upper-left corner. When all
points have been cycled through, the search area is dilated, or an attempt is made to find pass
points that were missed by the initial search. Then, the next epoch begins by redefining the
candidate image as the template image and the next image as the candidate image. The process
repeats until all pass-points for all images have been found.
Calibration
Due to varying terrain texture and environmental conditions like fog and wind, the point
matching conditions may be different from flight line to flight line or within individual flight
lines themselves. Thus, it is important to optimize the point matching iterations in terms of the
input parameters: grid spacing; template dimensions; search area; and correlation threshold.
Incorporated into the program's front end is the option to edit each of these parameters and run
test correlations on pairs or subsets of images in order to gauge their effectiveness, maximizing
the number of points matched between images, and to ensure the precision of the resulting point
image coordinates. One may also select a specific point patch on the calibration images to test
different textures on the ground for their suitability in matching at different parameter settings.
The grid spacing parameter determines the distance of the points on the first template
image from one another and effectively determines the maximum number of points on each
image. For example, a 480 x 720 image with a selected grid spacing of 64 pixels will result in
the following equations.
\text{Rows} = \left\lfloor \frac{\text{ImageHeight}}{\text{GridSpacing}} \right\rfloor = \left\lfloor \frac{480}{64} \right\rfloor = \lfloor 7.5 \rfloor = 7 \qquad (3\text{-}1)

\text{Columns} = \left\lfloor \frac{\text{ImageWidth}}{\text{GridSpacing}} \right\rfloor = \left\lfloor \frac{720}{64} \right\rfloor = \lfloor 11.25 \rfloor = 11 \qquad (3\text{-}2)

\text{Number of Points} = \text{Rows} \times \text{Columns} = 7 \times 11 = 77 \qquad (3\text{-}3)
The template dimensions determine the size of the pixel arrays to be matched. A
template size that is too small will result in a less valid solution of the coordinates for the
matched point, and a template size that is too large will result in not only a considerably larger
amount of time spent on matching the points, but also may result in distortions that are not
modeled by the affine transformation used in the least squares matching phase. So, one must
determine a compromise value that results in an acceptable precision of point coordinates and a
manageable execution time. Examples of different grid spacing and template dimension sizes are
illustrated in Figure 3-4.
As previously mentioned, the search area parameter should be chosen so that it is adequate
for matching a large majority, about 90%, of the image pairs along a flight line. It can be
established by testing different values on sample photo pairs along the length of the flight line.
Similarly, the correlation threshold should be chosen so that a large percentage of points are
found by the normalized cross correlation initial matching phase (discussed later in this chapter).
In the algorithm, for a single photo pair matching operation to be considered successful and thus
maintain continuity of points along the flight line, more than 5 points must be initially matched.
These points are used in a conformal coordinate transformation that establishes starting points
for the least squares matching routines on all unmatched points in a photo pair. Although a mathematical minimum of 2 points is needed for the 2D conformal coordinate transformation, a commodious minimum of 6 points was chosen to reinforce the transformation solution with redundancy.
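A sketch of this conformal transformation fit by least squares is shown below (array shapes and names are illustrative); the residuals provide the blunder flags, and the solved parameters predict starting positions for unmatched points:

```python
import numpy as np

def fit_conformal_2d(src, dst):
    """Solve dst = [a -b; b a] @ src + [tx, ty] for a, b, tx, ty (n >= 2);
    src and dst are n x 2 arrays of matched photo coordinates."""
    n = len(src)
    A = np.zeros((2 * n, 4))
    L = dst.reshape(-1)                          # interleaved x', y' values
    A[0::2] = np.column_stack([src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)])
    A[1::2] = np.column_stack([src[:, 1],  src[:, 0], np.zeros(n), np.ones(n)])
    params, *_ = np.linalg.lstsq(A, L, rcond=None)
    residuals = A @ params - L                   # large values flag blunders
    return params, residuals

def predict(params, pts):
    """Estimate starting search positions for still-unmatched points."""
    a, b, tx, ty = params
    R = np.array([[a, -b], [b, a]])
    return pts @ R.T + np.array([tx, ty])
```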
An example matching calibration is illustrated in Figures 3-5, 3-6, and 3-7, all of which
have a grid spacing of 64 pixels and template dimensions of 32x32 pixels. In Figure 3-5, the
correlation threshold was set at .95 and the search area set at 64. In Figure 3-6, the correlation
threshold was dropped to .90, the search area kept the same. In Figure 3-7, the correlation
threshold of .90 was maintained and the search area was increased to 80 pixels. The black
squares in the right image represent pixel arrays that had initial correlation coefficients above the
threshold and the white squares represent points that were found using a conformal coordinate
transformation of all originally well-matched points and refinement by least squares matching.
Note in Figure 3-7 how areas with coarse texture and sharp edges were more readily matched by
cross correlation. Also note the false matches in the first two figures. Careful inspection reveals
that in Figure 3-5 the uppermost pixel array in the right image was falsely matched. In Figure 3-
6, note that the same array was falsely matched as well as the array below it which was
successfully matched in Figure 3-5. In the well matched pair, Figure 3-7, these same false
matches were detected by the conformal transformation blunder detector and remedied,
illustrating the importance of matching the minimum number of points used in the coordinate transformation.
Allocation and Initialization
Once the parameters for point matching are established through the calibration procedure,
one can begin to run the point matching algorithm on whole flight lines. An optimum condition
would be to match points for all flight lines continuously, including the turns. However, since
the UAV has a high banking angle on turns from one line to the next--and, less severely, along
the flight lines themselves--it is impossible to maintain continuity between flight lines (this is
discussed more thoroughly in the challenges chapter). One must then determine the video-
decompiled image sets for each flight line by using the navigational data, and create a list of
images for them. Each list is then loaded into the program along with the previously determined
optimal point matching parameters. The initial point locations are defined on the first image and
memory is allocated for the various data structures based on the number of images, grid spacing,
and template dimensions.
Iterations of Adaptive Point Matching
The program cycles through each point on the template (first) image and the conjugate
points are established in the matched (next) image by using normalized cross-correlation and
least squares matching.
Normalized cross-correlation
The correlation between two variables is defined as the covariance divided by the product of the standard deviations, as shown in Equation 3-4.

\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \qquad (3\text{-}4)

The definitions of standard deviation and covariance are shown in Equations 3-5, 3-6, and 3-7.

\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2} \qquad (3\text{-}5)

\sigma_y = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \qquad (3\text{-}6)

\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) \qquad (3\text{-}7)

These equations can be used to form Equation 3-8.

\rho = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \qquad (3\text{-}8)

By substituting the x, y pairs with the digital numbers A_{ij} and B_{ij} of identically sized m \times n pixel arrays A and B, with \bar{A} and \bar{B} being the averages of the digital numbers within the arrays, Equation 3-9 results.

c = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}\left(A_{ij} - \bar{A}\right)\left(B_{ij} - \bar{B}\right)}{\sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}\left(A_{ij} - \bar{A}\right)^2 \sum_{i=1}^{m}\sum_{j=1}^{n}\left(B_{ij} - \bar{B}\right)^2}} \qquad (3\text{-}9)

In Equation 3-9, c is the coefficient obtained from normalized cross correlation. The cross-correlation coefficient can be considered an indicator of the strength of a linear relationship between two random variables, in this case, pixels on two images. The linear relationship is defined by the slope, \beta, and intercept, \alpha, in Equations 3-10 and 3-11.

\beta = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}\left(A_{ij} - \bar{A}\right)\left(B_{ij} - \bar{B}\right)}{\sum_{i=1}^{m}\sum_{j=1}^{n}\left(A_{ij} - \bar{A}\right)^2} \qquad (3\text{-}10)

\alpha = \bar{B} - \beta\bar{A} \qquad (3\text{-}11)
The values of slope and y-intercept can be considered differential radiometric scale and radiometric shift, respectively. This means that a different overall brightness and brightness gradient between the two images will not affect the correlation. Thus, the correlation is termed normalized.
The pixel array from the template image is iteratively compared to each sub-array of the
search area using the coordinates of the point in the template image--and later a conformal
transformation--as the starting search point. The correlation coefficient is found at each position
until its value is greater than the chosen correlation threshold, at which point the position--more
specifically the position change of the point from the template image to the searched candidate
image--is refined by least squares matching. If the threshold is not reached, then the point is
flagged as "not-found."
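A minimal sketch of this search, assuming 8-bit grayscale arrays converted to float; unlike the first-exceedance strategy described above, this version scores every position and keeps the best one:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient (Equation 3-9) of two
    equal-size arrays."""
    da, db = a - a.mean(), b - b.mean()
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

def correlate_search(template, search_area, threshold):
    """Slide the template over the search area; return the (row, col) offset
    of the best position if its coefficient exceeds the threshold."""
    th, tw = template.shape
    best, best_rc = -1.0, None
    for r in range(search_area.shape[0] - th + 1):
        for c in range(search_area.shape[1] - tw + 1):
            coeff = ncc(template, search_area[r:r + th, c:c + tw])
            if coeff > best:
                best, best_rc = coeff, (r, c)
    return best_rc if best >= threshold else None   # None -> "not-found"
```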
Least squares image matching
According to Gruen (1985):
Cross-correlation cannot respond appropriately to a number of facts that are inseparably
related to stereo images of three-dimensional and sometimes even two-dimensional
objects. The conjugate images created under the laws of perspective projection might differ
considerably from each other. Terrain slope, height differences, and positional and attitude
differences of the sensors give cause to geometrical distortions. (p. 175)
Least-squares image matching is similar to normalized cross-correlation in that it uses the
strength of a linear relationship between pixels in different image patches, but it also accounts for
some of the geometrical distortions Gruen mentions. It achieves this by incorporating an affine
coordinate transformation from the pixel locations in the template image to the candidate image
using the relationships in Equations 3-12 and 3-13.
x' = a_0 + a_1 x + a_2 y    (3-12)

y' = b_0 + b_1 x + b_2 y    (3-13)

In these equations, a_0, a_1, a_2, b_0, b_1, b_2 are the affine transformation parameters; x, y are the pixel
coordinates in the template image; and x', y' are the transformed coordinates in the candidate
image. The overall least squares matching observation equations are of the form:

A(x, y) = h_0 + h_1 B(x', y')    (3-14)
A(x, y) and B(x', y') are the digital numbers of the pixels at x, y in the template image and the
digital number at x', y' in the candidate image, respectively. The parameters h_0 and h_1 are the
radiometric shift and scale. Since Equation 3-14 is nonlinear, Taylor's series is used to form the
effective linearized observation equations, the result of each iteration being the corrections to the
previous values of the unknowns h_0, h_1, a_0, a_1, a_2, b_0, b_1, and b_2. Also, before the iterations
begin, initial approximations of the unknowns are found by using the results of the normalized
cross-correlation such that h_0 = \alpha, h_1 = \beta, a_0 = \Delta x, a_1 = 1, a_2 = 0, b_0 = \Delta y, b_1 = 0, and b_2 = 1. Here,
\beta and \alpha are the same as those in Equations 3-10 and 3-11, and \Delta x and \Delta y are the differences
in image coordinates of the point in A and B found from normalized cross-correlation.
Iterations begin by resampling the B array using the affine parameters in Equations 3-12
and 3-13. Note that the transformed values x' and y' are not necessarily integers, and that the
resampling may use pixels outside of B; if pixels are requested that are outside of the image, the
point is said to have "fallen off." If a pixel has "fallen off," the search for that point is discontinued and
a new point may need to be inserted at the end of the current image pair matching epoch. After
resampling, an observation Equation 3-14 is formed for each pixel from the updated B array and
the system is solved by least squares. The resulting corrections are applied to the unknowns and
the process repeats until the corrections are negligible, that is when the standard deviation of unit
weight (the square root of the sum of the residuals squared divided by the number of degrees of
freedom) has changed insignificantly from the previous to the current iteration. Alternatively, if
the solution diverges, the point is treated the same as a "fallen off" point. The center coordinates
of a successfully matched point are then defined by the solved affine parameters in Equations
3-12 and 3-13, and the resampled array of B is stored for use in the next photo pair matching
epoch.
By saving the resampled array, as opposed to the original array of pixels from the
candidate image, the position of the matched points is more precisely and more easily found in
subsequent images. Changes in perspective due to the motion of the camera during the flight
cause the image to distort, and the actual center of the object point is not as well represented in a
non-resampled image. Also, by using the resampled matched array instead of the array from the
first instance of the point, there is a better chance of matching the point in the next image due to
the minimization of perspective change. Figure 3-8 illustrates the evolution of a resampled point
array for a line of photos. The arrays in the right column are the pixels from the candidate
images in the 1st, 20th, 30th, and 90th epochs, from bottom to top, centered at the solved
coordinates of a point. The left column consists of the least squares-resampled arrays from the
same image points. Note the difference between the 1st epoch and the 90th epoch. During later
epochs, the perspective change from the first image increases, creating the need to resample into
a very different image, and a less distinct array. Note also in the 90th epoch pair how the affine
transformation has rotated the pixel array to more closely match the array in the first epoch.
Position prediction and blunder detection by conformal transformation
Each successfully matched point is flagged, and when more than 5 points have been
successfully matched, a 2D conformal transformation (simple scale, rotation, and translation
change) is used to estimate positions of the remaining points on the photo by mapping the
positions of points on the template image to the matched image, and thus serving as starting
points for the normalized cross-correlation phase. The equations for a 2D conformal coordinate
transformation are given in Equations 3-16 and 3-17, where x and y are the input coordinates (in
our case, coordinates from the template image), x' and y' are the transformed coordinates (from
the candidate image), a and b are the scale and rotation coefficients, T_x is the x translation, and
T_y is the y translation.

x' = ax - by + T_x    (3-16)

y' = ay + bx + T_y    (3-17)
The template window's position is then spiraled outward from the starting point until the
correlation threshold is exceeded or the window falls outside of the search area. As more points
are added to the conformal transformation, it can be used to detect blunders by finding points
with high residuals. This is achieved by searching for points whose residuals are greater than a
tolerance level of twice the value of the standard deviation of unit weight. Of these points, that
with the highest value is removed permanently from the transformation and is flagged as not-
found. The check is repeated until no points have residuals higher than the tolerance or there are
no longer enough points to perform the transformation.
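A sketch of the conformal fit and the residual-based blunder check might look like the following (illustrative Python/NumPy; the 2 * s0 tolerance follows the text, while the function names and the minimum-point guard are assumptions):

    import numpy as np

    def fit_conformal(src, dst):
        # Least squares fit of Equations 3-16 and 3-17; src and dst are
        # (n, 2) arrays of template and matched image coordinates.
        n = src.shape[0]
        A = np.zeros((2 * n, 4))           # unknowns: [a, b, Tx, Ty]
        A[0::2] = np.column_stack([src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)])
        A[1::2] = np.column_stack([src[:, 1],  src[:, 0], np.zeros(n), np.ones(n)])
        L = dst.reshape(-1)
        params, *_ = np.linalg.lstsq(A, L, rcond=None)
        v = (A @ params - L).reshape(-1, 2)
        s0 = np.sqrt((v.ravel() @ v.ravel()) / (2 * n - 4))
        return params, v, s0

    def remove_blunders(src, dst):
        # Iteratively drop the worst point whose residual exceeds 2 * s0.
        keep = np.arange(src.shape[0])
        params, v, s0 = fit_conformal(src, dst)
        while keep.size > 3:
            norms = np.hypot(v[:, 0], v[:, 1])
            worst = norms.argmax()
            if norms[worst] <= 2 * s0:
                break
            keep = np.delete(keep, worst)  # flagged as "not-found"
            params, v, s0 = fit_conformal(src[keep], dst[keep])
        return keep, params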
If, after all the points on the template image have been initially processed, fewer than 6
have been successfully matched, the process is repeated on the same image pair, this time with a
dilated search window. If the repeat process fails to find enough matched points again, the
search window is dilated again. This sequence is repeated until a match is found or the search
window reaches a maximum size at which point the program is aborted.
The assumption is that, because a calibration of the search area size was performed before
execution, the need for a larger search window is an anomaly particular to this image pair.
Therefore, after a suitable size is found, the search dimensions return to the user-defined default.
Last chance for unmatched points
After all the points have been processed and more than 5 points have been successfully
matched, an attempt is made to match the points flagged "not-found." A conformal coordinate
transformation composed of all the successfully matched points is used to provide initial
approximations for the unmatched points, and a final least-squares matching effort is carried out
for each. Those that are not successfully matched must be replaced in the next image pair
matching epoch.
General least-squares projective transformation check
Finally, a check is made on all of the points that are successfully matched. A projective
transformation is used because it will more adequately model the distortions found in the
imagery than the conformal or affine models. The 2D projective transformation equations are Equations
3-18 and 3-19, where x and y are the input coordinates, x' and y' are the transformed
coordinates, and a_1, a_2, a_3, b_1, b_2, b_3, c_1, and c_2 are the transformation parameters.

x' = \frac{a_1 x + b_1 y + c_1}{a_3 x + b_3 y + 1}    (3-18)

y' = \frac{a_2 x + b_2 y + c_2}{a_3 x + b_3 y + 1}    (3-19)
Since both the x', y' and the x, y photo coordinates are measurements, it is best to use the
generalized form of least squares to account for errors on both sides of the equations. The
generalized forms of the 2D projective equations are Equations 3-20 and 3-21. Notice that these
equations are the same as Equations 3-18 and 3-19 but with residual variables (the v's) to account
for the measurement errors on both sets of coordinate measurements.
F(x, y, x', y') = (x' - v_{x'}) - \frac{a_1 (x + v_x) + b_1 (y + v_y) + c_1}{a_3 (x + v_x) + b_3 (y + v_y) + 1} = 0    (3-20)

G(x, y, x', y') = (y' - v_{y'}) - \frac{a_2 (x + v_x) + b_2 (y + v_y) + c_2}{a_3 (x + v_x) + b_3 (y + v_y) + 1} = 0    (3-21)
After Equations 3-20 and 3-21 are linearized and iteratively solved using least squares, a
rigorous test is executed to find any outliers. Similar to the conformal transformation test, if
blunders are detected, the most probable outlier is removed (deemed "not-found") and the
process is repeated until no blunders are detected.
Usually, at this point in the image-matching routine, there are very few or no outliers after
the first iteration of this test. However, this final check is significant in that it filters out any
points that were erroneously matched in the least-squares matching routine described in the Last
Chance for Unmatched Points section.
Establishing new points
All of the unmatched points must be replaced by new points. Originally, a method was
devised that took into account where an unmatched point was located on the template image by
establishing the location of the new point on the corresponding opposite side of the candidate
image. The rationale behind this was that the motion of the frames will continue in generally the
same direction leaving room for a new point on the trailing end, and thus helping to maximize
the number of frames in which the new point will be matched. This method resulted in poor
geometry because points tended to cluster in a linear form on one side of the image, and seemed
to have little effect on the retention of points because of the irregular motion of the frame with
respect to the flight line. A method that established new points at random positions was
implemented and resulted in better distribution of the points and a retention rate on par with the
previous approach. Figure 3-9 illustrates the effects of the anticipation approach and the random
point insertion method. Note how the anticipation method results in a general linear pattern of
matched points along the lower-left to upper-right diagonal and has a large area where no points
were matched. Conversely, the random method results in a wider distribution of points with only
a few small, sparsely populated areas; there are no large areas without any pass points. The
distribution of points that the random method provides creates stronger geometry when solving
for the orientation parameters of the camera stations.
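The random insertion itself is simple; a sketch follows (illustrative only; the margin rule is an assumption that keeps the full template window inside the frame):

    import numpy as np

    rng = np.random.default_rng()

    def random_new_points(n_new, frame_width, frame_height, template_dim):
        # Replacement points at random positions (the random method of
        # Figure 3-9b), kept far enough from the border that the whole
        # template array fits inside the image.
        m = template_dim // 2
        x = rng.uniform(m, frame_width - m, n_new)
        y = rng.uniform(m, frame_height - m, n_new)
        return np.column_stack([x, y])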
Formatting and output
Once a set of points along the entire flight line has been matched, the point positions
must be reformatted to facilitate photogrammetric operations. The image coordinates of the
points are converted from the image left-handed axes system to a right-handed system, and
swapped to align the x axis with the direction of flight. Finally, the points are saved to a file
containing the image names, point names, and image coordinates of the points which may be
used to create input files for the subset relative orientations.
Appendix A consists of a subset of matched images for line 2 of the second flight of the
NBR mission. The images show where the points were matched, using black squares the
size of the template pixel arrays, and illustrate the nature of the algorithm. Note that there is an
unidentified obstruction in the lower right corner of many of the images that seems to be waving
in the wind. This object prevents many points from being matched in that sector of the imagery,
although no false matches due to its presence are apparent.
Figure 3-1. Image pair matching epoch flowchart.
Figure 3-2. Image coordinates of point 1 for 116 consecutive images from flight 2 of the NBR
mission.
Figure 3-3. Image coordinates for point 1 for the first 20 consecutive images, illustrating
retrograde motion of the point coordinates in the x direction on the 11th epoch of
image pair matching.
Figure 3-4. Left image: Grid spacing of 64, template dimension of 64. Right image: Grid
spacing of 32, template dimension of 16.
Figure 3-5. Poorly matched image pair 1.
Figure 3-6. Poorly matched image pair 2.
Figure 3-7. Well matched image pair.
Figure 3-8. Original (right) vs. resampled (left) arrays of a point for the 1st, 20th, 30th, and 90th
photo pairs, from bottom to top.
Figure 3-9. New point locations using the anticipation method (a) vs. the random method (b).
Absolute orientation parameters for the aerial photos covering the project area are needed
in order to resample images into an orthorectified mosaic. The best method for obtaining these
parameters for the NBR UAV imagery is a bundle adjustment, a simultaneous least squares
solution of all the orientation parameters for all photos in all flight lines. Because the equations
for this solution are nonlinear, a Taylor's series is employed, and therefore initial
approximations are required for the absolute (map) coordinates of the tie and pass points as well
as for the absolute orientation parameters for all of the photos. There are many methods to find
these initial approximations. For example, if the navigation data were of high
precision/frequency and a digital terrain model was available for the project area, one could use
the orientation data from the navigational unit to estimate the camera orientation for each image,
and could estimate the point positions by mathematically projecting rays from the image to the
ground, finding the map coordinates of the tie and pass points. The most commonly used
method of finding the initial approximations, however, is preliminary strip adjustment. In this
method, each flight strip is adjusted independently by connecting pairs or subsets of relatively
oriented photos along each flight line and transforming the results to ground coordinates by using
control points. A modified version of this method must be used for the NBR UAV imagery.
The first step in conventional preliminary strip adjustment is the relative orientation of
pairs or larger subsets of photos along the flight line. Relative orientation consists of finding the
angular attitude of an image or images with respect to another image that is held fixed--typically
the first image--in a pair or set of images. This is achieved by setting the Z coordinate of the
fixed image to the focal length; the X and Y coordinates of the fixed image to zero; the attitudinal
parameters of the fixed image, \omega, \varphi, and \kappa, to zero; and the X coordinate of the last image equal to
the photo base--the relative positional displacement at image scale between the images. Pass
point coordinates are then used as observations in a least-squares solution for the remaining
unknown relative orientation parameters. The results can then be used to attach the relatively
oriented image sets (typically two-image stereomodels) to each other for an entire flight line by
using 3D coordinate transformations. The entire line is then adjusted to absolute coordinates
using control points in a final transformation. This method, however, assumes availability of
assets that are not included in the NBR UAV dataset. Specifically and most importantly, the
focal length and other interior orientation parameters are unknown and there is no ground control
in the imagery. A method was devised to minimize these deficiencies using self-calibration and
the navigation data.
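As an illustration of the constraints just described, the initial parameter layout for one relatively oriented subset might be set up as follows (a sketch only; the array layout, function name, and the equal spacing of the intermediate X values are assumptions, not the thesis implementation):

    import numpy as np

    def init_relative_orientation(n_images, f, photo_base):
        # Columns: omega, phi, kappa, X, Y, Z for each image.
        eo = np.zeros((n_images, 6))
        eo[:, 5] = f                     # every Z starts at the focal length
        # The first image is held completely fixed (angles and X, Y zero,
        # Z = f); the last image's X is constrained by the photo base, and
        # intermediate images are simply spaced evenly as starting values.
        eo[:, 3] = np.linspace(0.0, photo_base * (n_images - 1), n_images)
        return eo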
The fundamental relationship used in analytical photogrammetry is the collinearity
condition. The collinearity equations specify the relationship between image coordinates of
points, the exterior orientation parameters of the camera at exposure, and the ground coordinates
of points. The following is from Wolf and Dewitt (2000):
[Collinearity] is the condition that the exposure station, any object point, and its photo
image all lie along a straight line in three-dimensional space...
Two equations express the collinearity condition for any point on a photo: one equation for
the x photo coordinate and another for the y photo coordinate. (p.234)
The collinearity equations are Equations 4-1 and 4-2.
x_a = x_o - f \, \frac{m_{11}(X_A - X_L) + m_{12}(Y_A - Y_L) + m_{13}(Z_A - Z_L)}{m_{31}(X_A - X_L) + m_{32}(Y_A - Y_L) + m_{33}(Z_A - Z_L)}    (4-1)

y_a = y_o - f \, \frac{m_{21}(X_A - X_L) + m_{22}(Y_A - Y_L) + m_{23}(Z_A - Z_L)}{m_{31}(X_A - X_L) + m_{32}(Y_A - Y_L) + m_{33}(Z_A - Z_L)}    (4-2)

In these equations, x_a and y_a are the "photo" coordinates of image point a; X_A, Y_A, and
Z_A are object space coordinates of point A; X_L, Y_L, and Z_L are object space coordinates of the
exposure station; f is the camera focal length; x_o and y_o are the coordinates of the principal
point; and the m's are functions of the three rotation angles of the image.
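For reference, evaluating Equations 4-1 and 4-2 is straightforward once the rotation matrix is formed from the three angles. A sketch (Python/NumPy; the omega-phi-kappa rotation convention follows Wolf and Dewitt (2000), and the function names are illustrative):

    import numpy as np

    def rotation_matrix(omega, phi, kappa):
        # M = M_kappa @ M_phi @ M_omega, angles in radians.
        co, so = np.cos(omega), np.sin(omega)
        cp, sp = np.cos(phi), np.sin(phi)
        ck, sk = np.cos(kappa), np.sin(kappa)
        Mo = np.array([[1, 0, 0], [0, co, so], [0, -so, co]])
        Mp = np.array([[cp, 0, -sp], [0, 1, 0], [sp, 0, cp]])
        Mk = np.array([[ck, sk, 0], [-sk, ck, 0], [0, 0, 1]])
        return Mk @ Mp @ Mo

    def collinearity(point_A, station_L, angles, f, x0=0.0, y0=0.0):
        # Photo coordinates of ground point A from Equations 4-1 and 4-2.
        m = rotation_matrix(*angles)
        u, v, w = m @ (np.asarray(point_A) - np.asarray(station_L))
        return x0 - f * u / w, y0 - f * v / w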
In analytical self-calibration, these equations are augmented with additional terms to be
included in the solution. The augmented collinearity equations are the same as the above
formulas, but with the following terms in Equations 4-3 and 4-4 added to the right hand sides of
Equations 4-1 and 4-2 respectively.
\Delta x = -\bar{x}(k_1 r^2 + k_2 r^4 + k_3 r^6) - (1 + p_3 r^2)\left[ p_1 (3\bar{x}^2 + \bar{y}^2) + 2 p_2 \bar{x}\bar{y} \right]    (4-3)

\Delta y = -\bar{y}(k_1 r^2 + k_2 r^4 + k_3 r^6) - (1 + p_3 r^2)\left[ 2 p_1 \bar{x}\bar{y} + p_2 (\bar{x}^2 + 3\bar{y}^2) \right]    (4-4)

In Equations 4-3 and 4-4:
r^2 = \bar{x}^2 + \bar{y}^2, where \bar{x} = x_a - x_o and \bar{y} = y_a - y_o
k_1, k_2, k_3 = symmetric radial lens distortion coefficients
p_1, p_2, p_3 = decentering distortion coefficients
Thus the solution for f, x_o, y_o, k_1, k_2, k_3, p_1, p_2, and p_3 can be found when the exterior
orientation parameters are computed. However, analytical self-calibration relies on control for
the solution, be it ground points or camera stations. In addition, there are commonly strong correlations
between some of the interior parameters, which can lead to ambiguity in their solution. For
example: the principal point offsets are often highly correlated with the decentering lens
distortion coefficients; and k_2 and k_3 are correlated with k_1. Thus, some constraints, achieved by
weighting the parameters, must be used to avoid over-parameterization. Poor geometry may
lead to correlations between the calibration parameters and the exterior orientation parameters as
well. Such is the case with x_o, y_o, and f, which are correlated respectively with X_L, Y_L, and Z_L,
and the rotation angle \omega (rotation about the x-axis, the direction of flight), which is correlated
with the y principal point offset. Typically, interior orientation parameters are constrained with
regard to a previously determined calibration value usually found by performing a self-
calibrating bundle adjustment under optimal circumstances with regard to camera position
geometry and tie-point distribution.
A Relative Orientation Method for the NBR UAV Dataset
A method was developed to relatively orient subsets of images along flight lines. The
number of points connecting the last photo in the group to the first, fixed image determined how
many images were in each group. A minimum threshold of 11 points was chosen to sufficiently
connect the images to one another. The number of images per group ranged from about 20 to
140, or about 1 to 4 seconds worth of video. Since a camera calibration was not performed prior
to the NBR flights and the camera was subsequently damaged before the georeferencing process
began, estimates of the interior orientation parameters were made by methods described in
Chapter 5. The Z_L coordinate, or height of the camera, was fixed at the same value as the focal
length, and the photo bases were estimated by using the translation coefficients found using affine
coordinate transformations. The affine was found to be the best transformation model for
estimating the photo bases after testing both the conformal and projective models. There were,
however, cases when the affine transformation performed poorly in finding accurate values.
When this happened, the navigation data were used with the estimated focal length and flying
height to obtain photo base values. By using these values, initial approximations prerequisite to
the relative orientation solution could be easily found by using the image coordinates as the
horizontal "ground" coordinates of the pass points and an initial value of 0 could be used for the
Z component of the points' "ground" coordinates. This technique is also useful in that the
residuals from the solution are in pixel units, which can be converted to ground coordinates using
the focal length and flying height.
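The pixel-to-ground conversion mentioned is a single scale factor; for example (a sketch with illustrative round numbers, not the solution's own values):

    # Residuals in pixels scale to ground units by flying height over focal
    # length; with H' = 200 m and f = 1000 pixels, one pixel covers 0.2 m.
    def pixels_to_ground(sigma_px, flying_height_m=200.0, focal_length_px=1000.0):
        return sigma_px * flying_height_m / focal_length_px

With these round numbers, a 2.3-pixel standard deviation scales to about 0.46 m, the same order as the 0.44 m X figure reported for the best solution below (which uses that solution's own focal length and estimated flying height).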
The method was tested on the second and sixth flight line of the second NBR flight, seen
in Figure 4-1. These lines were selected because of their dissimilar terrain relief and texture.
For example, line two covers an area with steeply decreasing then increasing elevation from
north to south with very little middle frequency texture for point matching, and line six covers a
gently sloping area which includes coverage of a road. Point matching was successfully
executed on both lines resulting in continuous pass points attaching the images in each. The
lines were then divided into subsets using the 11 minimum point requirement, and a relative
orientation process was performed on each subset.
The results of the relative orientations suggest several problems with the NBR dataset.
After attempting relative orientations on both lines, the data from line 2 resulted in successful
solutions for 9 out of 15 image subsets. Only one of the subsets for line 6 resulted in a solution.
The most likely cause of the poor and divergent solutions is the geometry of the imagery,
specifically the base-height ratio--the ratio of the distance between camera stations to the flying
height above ground.
The base-height ratio is important because it determines the magnitude of the parallactic
angles. A good way of understanding this relationship is to consider the distance between your
eyes as a photo base and the distance to the object you are looking at as the height. Your brain
can easily judge the distance from the chair across the room to your face by plus or minus 5 feet.
However, if you increase the distance, or height, of an object to about 300 feet from your face--
greatly reducing the base-height ratio--your brain has a much harder time precisely determining
distances because the parallactic angle of your eyes is approaching zero. Conventional
photogrammetric projects typically will use a nominal base-height ratio of about 0.6 for images
with 60% overlap. In this project, images with 60% overlap have a base-height ratio averaging
around 0.1 when the UAV is not turning. This leads to a higher ambiguity of the computed
orientation parameters and 3D coordinates of the pass points. The effect of base-height ratio on
the Z component of the point coordinates is illustrated in Figure 4-2.
In Figure 4-2 both stereo-image pairs have the same focal length and flying height. There
are two rays from the left image running linearly from the focal point, through the image plane,
to the intersection of the rays from the conjugate image on the right. The slight angle between
the two rays on each individual image plane represents the error in pass point measurement.
Note how, since the air base in the right pair is smaller than in the left pair--and thus the base-
height ratio is reduced--the determination of the Z component of the object point is more
ambiguous in the right pair than in the left. To a lesser degree, the same principles can be applied
to the X and Y components of the object point coordinate determination. While Figure 4-2 does
not show the striking difference for the extremely low base-height ratios found in the NBR data
(as low as 0.04), the same principles apply. It is also important to note that for vertical
photography, the base-height ratio is constant for a given overlap. For example, the base-height
ratio for 60% overlap using vertical photography is expressed in Equation 4-5, where f is the focal
length of the camera and w is the width of the camera frame in the direction of flight.

\frac{B}{H} = (1 - 60\%) \, \frac{w}{f}    (4-5)
This value, however, can be greatly reduced or increased when the images are tilted by attitude
change such as that illustrated in Figure 4-3. The effect of divergence is illustrated in Figure 4-4.
In the figure, both stereo pairs have the same focal length and format, and both images overlap at
60%. However, the right stereo pair photos are each diverging 5 degrees. In order to keep the
same overlap, the air base must be reduced for the right pair, decreasing the base-height ratio and
therefore weakening the geometry. In the NBR imagery, the nominal vertical base to height ratio
at 60% endlap was calculated as 0.192.
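The 0.192 figure follows directly from Equation 4-5. A quick check (assuming the video frame is 480 pixels in the flight direction, a value consistent with the quoted result and the roughly 1000 pixel focal length estimated in Chapter 5):

    # Nominal vertical base-height ratio at 60% endlap (Equation 4-5).
    w = 480.0    # frame width in the flight direction, pixels (assumed)
    f = 1000.0   # focal length, pixels (rough estimate from Chapter 5)
    b_h = (1.0 - 0.60) * w / f
    print(b_h)   # 0.192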
The most problematic image subsets in line 2 were found at the end of the flight line, when
the aircraft was beginning its turn into the next line. At the end of each line, as the UAV
prepares to turn, it increases its altitude by increasing its pitch (see Figure 4-3). Pitch angle
varies along each line depending on environmental conditions such as wind or updraft. As the
pitch angle increases, the field of view of the camera pans forward in the direction of flight. This
results in divergent photography and a severely reduced base-height ratio just prior to the end of
the flight line. The average base-height ratio for the subsets prior to the anticipatory turning in
line 2 is about 0.1. When the UAV prepares to turn, the ratio is reduced to 0.04. The same
problem was found in line 6 data. Also, in line 6, geometry is weakened due to the dynamic roll
of the UAV. Notice in Figures 4-1 and 4-3b how the flight line oscillates along its planned path.
The roll of the aircraft causes the field of view of the camera to pan laterally, reducing the
number of pass points retained. This causes an early termination of the image subset and thus a
reduced base-height ratio. Also notice in Figure 4-1 the gap in GPS navigation points recorded
in the middle of line 6. The absence of these points makes the alternate estimation of the photo
base impossible. That is, if the affine transformation does not find a good solution to the photo
base (there is a good chance of this happening because of complex distortions caused by the
constant banking), it cannot be calculated. Data from only one out of 19 subsets in line 6
resulted in a non-diverging relative orientation solution, and it was noticeably weaker than the
solutions found in line 2. A side by side comparison of the solution statistics from line 6 and a
solution from line 2 can be found in Figure 4-5.
In the figure, first notice that the standard deviations of unit weight are comparable to each
other at about 0.8. Also, note that the computed focal lengths are different, but are similar to
the estimated value found from calibration. The remaining interior orientation parameters show
that there is a significant amount of distortion in the camera; in particular, the radial lens distortion,
or k, coefficients are significantly different from zero in both solutions. The most distinct feature
in this comparison is the difference in standard deviations of the ground coordinate results,
which are much lower in the line 2 solution.
Figures 4-6a and 4-6b show a digital terrain model created from the relative 3D
coordinates of pass points in pixel units produced by the relative orientation of images near the
center of line 2. Note that the terrain apparently slopes from the lower left side to the upper right
side of the area. Since the ground at this section was nominally level, this effect is mostly caused
by the pitch and roll of the fixed image (image 11616) of the subset, with values of 6° and -11°
respectively in the interpolated navigation file. An absolute orientation of these images will
provide these same points in the ground coordinate system, and the DTM created from them can
be used to create an orthorectified image mosaic.
An analysis of the iterations of the bundle adjustment solutions substantiates the assertion
that a poor base-height ratio is the main problem with the relative orientation of the NBR
imagery. The Z solutions have an average standard deviation about 7 times larger than the X
solutions, and the Y solutions are about twice as large as the X solutions. In the best relative
orientation solution, the average standard deviation of the X, Y and Z relative coordinate
solutions are 2.3, 2.9, and 13 pixels respectively. Given the solution's focal length and estimated
flying height, these numbers scale to about 0.44m in X, 0.55m in Y, and 2.46m in Z on the
ground. Most of the image subset relative orientations, however, failed to converge to a solution.
For geometrical purposes, the focal length of the camera was too large relative to the size
of the image format in the direction of flight, given the magnitude of angular attitude variation found
in certain flight lines. This weak geometry, coupled with the absence of control, created poorly
conditioned equations used for the solution. Traditionally, normal equations are used in
photogrammetric computations to solve least squares problems. Using the normal equations, the
conditioning of the linear systems used is not taken into account in the solution. This is usually
not an issue because conventional photogrammetric mapping configurations have strong
geometry due to a wide angular field of view. Another method, however--singular value
decomposition--can be used to solve least squares problems in a way that takes into account, and
sometimes compensates for, poor geometry--that is, the ill-conditioning of linear systems.
Singular Value Decomposition
From Forsythe, Malcolm and Moler (1977):
The singular value decomposition--the SVD--is a powerful computational tool
for analyzing matrices and problems involving matrices which has applications in many
fields. (p. 201)
One specific powerful application of the SVD is the analysis of the condition of redundant
least-squares observation matrices and the elimination of weak components from the solution.
SVD is performed by decomposing the m x n observation matrix A of a linear system Ax = b into
three matrices U, \Sigma, and V, where A = U \Sigma V^T. U and V are orthogonal rotation matrices and \Sigma is a
diagonal matrix containing the singular values of A. These singular values can be used, first, to
measure the overall conditioning--that is, the independence, or strength of geometry, of a linear
system--by computing the condition number of the A matrix by Equation 4-6.
cond(A) = \frac{\sigma_{max}}{\sigma_{min}}    (4-6)

In this equation, \sigma_{max} is the largest singular value of A and \sigma_{min} is the smallest singular value of
A. Again, from Forsythe, Malcolm and Moler (1977):
Clearly, cond(A) > 1. If cond(A) is close to 1, then the columns of A are very independent.
If cond(A) is large, then the columns of A are nearly dependent...
The singular values [by themselves] can be interpreted geometrically. The matrix A maps
the unit sphere, which is the set of vectors x for which ||x|| = 1, onto a set of vectors b = Ax
which have varying lengths. The image set is actually a k-dimensional [where k is the rank
of A, the number of independent columns] ellipsoid embedded in m-dimensional space...
The singular values are the lengths of the varying axes of the ellipsoid. The extreme
singular values \sigma_{max} and \sigma_{min} are the lengths of the major and minor axes of the ellipsoid.
The condition number is related to the eccentricity of the ellipsoid, with large condition
numbers corresponding to very eccentric ellipsoids. (p. 206)
Singular values that are very small compared to the largest singular value of matrix A
result in arbitrary solutions for their corresponding least squares coefficients. An attempt can be
made, however, to edit the E matrix by setting to zero singular values smaller than a threshold,
thereby collapsing the aforementioned ellipsoid in the direction of that singular value. This
results in a lower (closer to 1) condition number, although the residuals of the solution may
increase in size.
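In practice, the truncation just described amounts to building a pseudo-inverse from the edited \Sigma matrix. A sketch (Python/NumPy; the relative tolerance is an illustrative choice):

    import numpy as np

    def svd_solve(A, b, rel_tol=1e-6):
        # Least squares solution of Ax = b via SVD, zeroing singular values
        # smaller than rel_tol times the largest one (collapsing the weakest
        # axes of the ellipsoid described above).
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        cond = s[0] / s[-1] if s[-1] > 0 else np.inf   # Equation 4-6
        keep = s > rel_tol * s[0]
        s_inv = np.zeros_like(s)
        s_inv[keep] = 1.0 / s[keep]
        x = Vt.T @ (s_inv * (U.T @ b))
        return x, cond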
The relative orientation observation matrix was decomposed using the SVD method and
was found to have a high condition number. In fact, several of the singular value components
were very close to zero compared with the largest singular value. This suggests that the
observation matrices used for relative orientation were grossly ill-conditioned, and therefore
even convergent solutions found using the normal equations method were of poor precision.
Removal of singular values was pointless: there were so many below a reasonable
threshold that removing them would have made the solution impossible. Because the tie-point acquisition method
was very rigorous, the culprit of the ill-conditioning was most likely the geometry of the camera
stations relative to the ground. Singular value decomposition remains, however, a good
alternative to the normal equations method for photogrammetric solutions, especially for smaller
adjustments such as relative orientation. SVD, however, is costly when it comes to
computational time and memory, and it does not facilitate the determination of solution statistics
as the normal equations method does.
Figure 4-1. Lines 2 and 6 of the 2nd flight over the NBR.
Figure 4-2. Comparison of base-height ratios of 0.6 (a) and 0.4 (b).
Figure 4-3. The increase in pitch attitude at the end of line 2 associated with anticipatory turning of the
UAV (a), and the latter section of line 2 highlighted in red (b).
Figure 4-4. The effects of divergent photography on base-height ratio. The base-height ratio in
the left image is about 0.32; in the right, it is about 0.12. Both images have the same
format, focal length and flying height.
[Figure content: two program-output listings of relative orientation solution statistics, each
reporting the standard deviation of unit weight, calibration parameter results (f, x_p, y_p, k_1,
k_2, k_3, p_1, p_2 with standard deviations and t-statistics), photo coordinate residual
summaries, and ground coordinate results including average sigmas and approximate spherical
errors.]
Figure 4-5. Statistics for the only converging solution found in line 6 (a) and a solution from line 2 (b).
Figure 4-6. Plan view of the digital terrain model created from relative 3D coordinates from a
relative orientation in line 2 (a), and 3D view of the same model (b).
The goal of camera calibration is to find the unique elements of interior orientation for a
given camera. These elements are geometric parameters (e.g. focal length) that are needed to
make accurate spatial determinations from photographs. There are several procedures for
determining these values including laboratory, field, and stellar methods. Government and
private agencies have historically used field and laboratory methods to calibrate cameras,
including the use of multicollimators, devices that use individual collimators mounted in precise
angular arrays which project images onto glass plate negatives, thus allowing lens distortion to
be determined. Recently, because of the emerging shift from film to CCD arrays and the
subsequent need for digital camera standards, there is an increased use of laboratory methods
like the control point cage, or CPC, for calibrating digital cameras. The CPC method employs a free-
network, self-calibrating bundle adjustment to solve for interior orientation parameters. Field
methods involve acquiring images from an aircraft over an area with a highly redundant network
of ground control points. A major advantage of the field method is that it also provides boresight
calibration, which determines the position and angular attitude of the camera relative to the GPS
and inertial sensors in addition to the interior orientation parameters. Figures 5-1 and 5-2 show
systems used by the USGS to calibrate digital and film-based cameras.
Due to the absence of a preflight NBR camera calibration, determination of the focal
length and lens distortion parameters had to be made using the existing video imagery. Although
self-calibrating bundle adjustments may be used to solve for some of these missing parameters,
most importantly the focal length, the position of the camera at exposure or the use of ground
control points is needed to achieve suitable accuracy from this method. Unfortunately, data from
the navigation unit had low precision compared with typical mapping systems, had time gaps--
periods of no data of up to 14 seconds--and a host of other issues. Three methods were
attempted, two of them using a different camera of the same model (Canon Elura), with results
leading to the best estimate for the focal length of the mission camera. They were a scale
comparison to ortho-rectified imagery, a field calibration, and a baseline test.
DOQ Scale Comparison Method
Digital ortho-quadrangles are orthophotos; the effects of tilt and relief have been removed
from them and their scale has been normalized. One can therefore make direct measurements
from the DOQ as if it is a map. Using features easily recognizable in both images,
measurements were made on both a 1-meter resolution DOQ and the NBR imagery. Doing so
gave a rough estimate of the focal length of the camera. The result is of low precision because
the actual altitude of the camera above ground was unknown, the non-orthogonality (and
effectively unknown attitude) of the imagery results in distorted measurements on the NBR
imagery, and changes in the terrain between image acquisitions may have decreased the accuracy
of the image measurements. Nevertheless, a rough approximation of the actual value can be
deduced by comparing the measurements. Ignoring the effects of tilt on the measurements, one
can produce estimated maximum and minimum values of the focal length by estimating the
maximum and minimum possible flying heights of the UAV. With an estimated maximum
flying height of 250m and a minimum of 200m, the potential range in the focal length was
computed to be between 900 and 1125 pixels. Equation 5-1 was used to make these estimates:
f = H' \, \frac{d_{image}}{d_{ground}}    (5-1)
The ground distance in meters was found using DOQ imagery, the image distance in pixels
was measured using the NBR imagery, and H' is the estimated flying height above ground in
meters. Figures 5-3 and 5-4 show the areas that were used. The most static feature in the
imagery was a dirt road that weaves through the NBR, upon which the measurements were made.
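The arithmetic behind the quoted bounds is simply Equation 5-1 evaluated at the two flying-height estimates (the 4.5 pixels-per-meter ratio below is an illustrative value consistent with the reported bounds, not a measurement taken from the thesis):

    # Focal length bounds from Equation 5-1: f = H' * d_image / d_ground.
    ratio = 4.5               # measured image/ground distance ratio, px per m
    f_min = 200.0 * ratio     # minimum flying height 200 m  ->  900 pixels
    f_max = 250.0 * ratio     # maximum flying height 250 m  -> 1125 pixels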
During the implementation of this procedure, a discrepancy between the barometric
altimeter and the GPS height was observed. The autopilot data file logs at varying intervals of
about one second, necessitating interpolation of the data, although gaps greater than 4
seconds can be found in the data file, including a 14 second gap in flight 2 of the NBR set. That
gap is an equivalent loss of information of about 400 images at 30 fps, or 210 meters at an air
velocity of 15 m/s, which equates to a distance of 1050 pixels, or 2.2 frames in the flight
direction, assuming a focal length of 1000 pixels and a flying height of 200 m. Figure 5-5
highlights the area of the 14 second gap. Also noticeable in Figure 5-5: there seems to be a shift in the recordation time of
the GPS altitude and barometric altitude in the data file. Applying a shift in the GPS data of
about 13 seconds results in Figure 5-6. This suggests there is a lag in the recordation of all of the
GPS data including latitude and longitude in addition to height above the ellipsoid. The cause or
an explanation of this shift/lag has not been found, although it may be related to the difference
between Coordinated Universal Time (UTC) and GPS time, where GPS time is about 14 seconds
ahead. At first, a shift accounting for all data gaps was applied to the data which resulted in
Figure 5-7; however, a comparison of these plots indicates that there is about a 13 second
difference in the data written for GPS and barometric altitudes for all values in the entire flight,
not several shifts resulting from the data gaps.
Airborne Calibration (Ground Control Method)
An airborne field calibration mission was planned and flown to find camera calibration
parameters for a different camera of the same model as the one used in the NBR data. Using
targets placed on ground control points, a self-calibrating bundle adjustment was used to solve
for the interior orientation parameters. Because the calibration flight was planned at an altitude
of 200m, close to the same as that used for the NBR flights, the issue of poor base to height ratio,
discussed in the Relative Orientation chapter, made for less than optimal adjustment geometry. A
solution of about 1030 pixels was found for the focal length with a standard deviation of about
60 pixels. Two additional discoveries, however, were realized in the analysis of this field test.
The radial lens distortion parameters were found to have significantly high values, as
demonstrated by their t-statistics. Since there is a large amount of radial distortion in the lens of
the calibrated camera, one can assume there was a large amount of distortion in the mission
camera as well, and this must be accounted for when adjusting the imagery. Also, a
striking observation was made about the nature of the flight of the UAV.
The autopilot of the UAV anticipates turns from one waypoint to another. In other words,
the UAV optimizes its flight path to come as close to all of the points as possible. Instead of
flying in a straight line between GPS waypoints, the aircraft banks in anticipation of reaching
them. Because the calibration site was small, this action is magnified. Upon inspecting the NBR
flights after discovering the anticipatory turning, we find the same tendency although not as
exaggerated. Figure 5-8 is a plot of the planned flight versus the calibration flight path. The
corners of the planned flight (blue) are the locations of waypoints programmed into the autopilot,
while the red trail shows the actual path traveled. In Figure 5-9, the path of the second flight of
the NBR mission displays some of the same characteristic anticipatory turning.
Some of the effects of this pre-turning and subsequent banking are that the target
coverage area may not be acquired by the camera, especially with a poor base-height ratio, and
the point matching algorithm will have a harder time locating pass points due to the distortion
caused by the extreme attitude change.
Baseline Test
The final calibration procedure was a simple "baseline" test. Images were taken at
several distances, capturing a target of known size. Using the ratio of the image size to the
object size multiplied by the distance to the object, an estimated focal length of about 950 to
1050 pixels was calculated. The three methods employed gave a consistent, albeit low precision,
value for the focal length of roughly 1000 pixels.
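The baseline computation is a similar-triangles relation; with illustrative numbers (not the actual measurements):

    # Baseline focal length estimate: the target's image size relates to its
    # true size as the focal length relates to the distance to the target.
    distance_m = 10.0                  # taped camera-to-target distance
    target_px = 100.0                  # target width measured on the frame
    target_m = 1.0                     # known target width
    f_px = distance_m * target_px / target_m   # -> 1000 pixels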
Figure 5-1: Field calibration target point (a) and the control point cage (b) at the USGS EROS
Data Center, Sioux Falls, SD.
Figure 5-2: The camera calibration team at EROS performing a calibration of a digital camera
using the CPC method.
Figure 5-3. Wide area (a) and close up (b) DOQ imagery of the National Bison Range's Summit
Road.
Figure 5-4. DOQ imagery (a) and UAV imagery (b) of Summit Road.
Figure 5-5. Plot of GPS altitude (green) and barometric altitude (blue) and their differences.
Figure 5-6. Plot of GPS altitude (green) and barometric altitude (blue) with 13 second shift.
Figure 5-7. Plot of GPS altitude (green) and barometric altitude (blue) with shifts accounting for
all data gaps greater than 2 seconds applied.
Figure 5-8. Planned (blue) versus actual (red) calibration flight lines illustrating the exaggerated
autopilot anticipation of turns.
Figure 5-9. The second flight of the NBR mission.
Small unmanned autonomous aerial vehicles provide very useful platforms for collecting
imagery for a broad range of applications. The use of UAVs will continue to grow as the
technology becomes inevitably smaller and less expensive, and the development of techniques
for processing the data collected using these aircraft will evolve with changing acquisition
methods and applications. In this thesis, due to the highly dynamic nature of the UAV, some of
the images had geometry too poor for analytical photogrammetric solutions without a priori
camera calibration. However, the designed techniques proved successful in the cases having a
suitable geometric configuration, and the developed adaptive point matching algorithm can serve
as an excellent tool for matching UAV aerial videography and imagery collected using
numerous other types of platforms.
There were several impediments that hindered the success of the designed process. Many
of them, however, can easily be avoided in future missions. The two problems that caused the
most difficulty were the absence of camera calibration, and the poor geometry of the imagery
due to the dynamic attitude of the UAV. The process of calibrating the camera before flying or
even planning a mission is essential to georeferencing aerial imagery, particularly when little or
no ground control is used. Field calibrations are especially useful in that they allow the whole
system, including the UAV platform, to be tested. When performing a camera
calibration, the camera should be mounted in exactly the same way it will be when the mission is
flown. One should mount the camera at the same relative angular attitude and distance from the
GPS antenna and INS as the mission flight to eliminate changes in boresight/lever-arm offsets,
which will have to be accounted for otherwise. The calibration of the camera can then serve as
the foundation, in addition to the application parameters, of the planned flight lines.
Using the camera calibration, one can then create 3D waypoints that optimize the base to
height ratio, overlap, and endlap of the images. This was the major downfall of the NBR data.
Certainly, more precise INS and GPS units would have helped the orientation solutions by
adding control to the system; however, the poor geometry caused by the extreme banking and
pitch angles coupled with the low base to height ratio would still have made it difficult to process
the data. A camera with a high format-to-focal-length ratio should be used, and the camera zoom
should be set to its widest field of view. The planned lines should be designed to be flown as
straight as possible over the project area. Since the current system uses navigation waypoints to
determine the path of the UAV, additional points linear to the planned lines can be used to prevent
the aircraft from anticipating turns and banking while over the project area.
Further recommendations include: using more precise navigational (INS and GPS) devices;
solving the navigation data "gap" problem; making sure no obstructions block the camera's field
of view; and finding a better method of syncing the imagery with the navigation data.
A MATCHED IMAGE SUBSET
This appendix contains a sequence of images decompiled from video with matched point
pixel arrays outlined in black squares. The order is from left to right and top to bottom.
Figure A-1. Images 1-6 of the matched image set.
Figure A-2. Images 7-14 of the matched image set.
Figure A-3. Images 15-22 of the matched image set.
Figure A-4. Images 23-30 of the matched image set.
Figure A-5. Images 30-37 of the matched image set.
Figure A-6. Images 38-45 of the matched image set.
Figure A-7. Images 46-53 of the matched image set.
Figure A-8. Images 54-61 of the matched image set.
Figure A-9. Images 62-69 of the matched image set.
Figure A-10. Images 70-77 of the matched image set.
Figure A-11. Images 78-85 of the matched image set.
Figure A-12. Images 86-92 of the matched image set.
LIST OF REFERENCES
Baumker, M., Heimes F.J., 2001. New Calibration and Computing Method for Direct
Georeferencing of Image and Scanner Data Using the Position and Angular Data of an
Hybrid Inertial Navigation System, OEEPE Workshop, Integrated Sensor Orientation.
Buckland, S.T., Goudie, B.J., Borchers, D.L., 2000. Wildlife Population Assessment: Past
Developments and Future Directions, Biometrics 56:1-12.
Forsythe, G.E., Malcolm, M.A., Moler, C.B., 1977. Computer Methods for Mathematical
Computations. Prentice Hall, Inc., Englewood Cliffs, N.J., 259 p.
Gruen, A.W., 1985. Adaptive Least Squares Correlation: A Powerful Image Matching
Technique, South African Journal of Photogrammetry, Remote Sensing and Cartography,
14(3):175-187.
Jones, G.P. IV, 2003. The Feasibility of Using Small Unmanned Aerial Vehicles for Wildlife
Research, Master's Thesis, University of Florida.
Kumar, R., Sawhney, H., Samarasekera, S., Hsu, S., Hai Tao, Yanlin Guo, Hanna, K., Pope, A.,
Wildes, R., Hirvonen, D., Hansen, M., Burt, P., 2001. Aerial Video Surveillance and
Exploitation, Proceedings of the IEEE, 89(10):1518-1539.
Lee, K., 2004. Development of Unmanned Aerial Vehicle for Wildlife Surveillance. Master's
Thesis, University of Florida.
Roberts, C. W., Pierce, B.L., Braden, A.W., Lopez, R.R., Silvy, N.J., Frank, P.A., Ransom Jr.,
D., 2006. Comparison of Camera and Road Survey Estimates for White-Tailed Deer, The
Journal of Wildlife Management, 70(1):263.
Robbins, J., 2003. "Bison Roam on the Refuge, But That's on the Reservation," New York
Times, p. A16, Nov. 24.
Rosenstock, Steven S., Anderson, David R., Giesen, Kenneth M., 2002. Landbird Counting
Techniques: Current Practices and an Alternative, The Auk, 119(1):46-53.
Sasse, D.B., 2003. Job-related mortality of wildlife workers in the United States, 1937-2000. In:
Wildlife Society Bulletin, 31(4):1015-1020.
St. Petersburg Times Staff (Author not listed), 2003. "1 Killed in Plane Crash; 3 Missing," St.
Petersburg Times, Jan. 28.
Williams, B.K., Nichols, J.D., Conroy, M.J., 2001. Analysis and Management of Animal
Populations. Academic Press, San Diego, California. 1040 p.
Wolf, P.R., and Dewitt, B.A., 2000. Elements of Photogrammetry with Applications in GIS,
McGraw-Hill Science/Engineering/Math, New York, N.Y. 608 p.
U. S. Fish and Wildlife Service, National Bison Range Information Website,
http://www.fws.gov/bisonrange/nbr/, Accessed October 2006.
Zhou, G., Li, C., Cheng, P., 2005. Unmanned Aerial Vehicle (UAV) Real-time Video
Registration for Forest Fire Monitoring, Proceedings of the International Geoscience and
Remote Sensing Symposium, 3:1803-1806.
Ben Wilkinson was born in Gainesville, Florida, in 1980, while his father was attending the
University of Florida. He grew up in Tallahassee. His first involvement in the mapping
sciences was a summer job at the Florida Resources and Environmental Analysis Center at FSU.
He found his niche, and received his bachelor's degree in geomatics with a minor in computer
and information sciences and engineering from the University of Florida in 2004 after bouncing
around a few different majors. After working as an airborne LiDAR operator and data processor
for the National Center for Airborne Laser Mapping, a job which took him to every corner of the
US, he began graduate school at UF in 2005. He was awarded the Alumni Graduate Fellowship
from the School of Forest Resources and Conservation, and will begin pursuing a Ph.D. in the
fall of 2007 at the University of Florida.