Citation |

- Permanent Link:
- http://ufdc.ufl.edu/UFE0041556/00001
## Material Information- Title:
- Three-dimensional Citrus Grove Reconstruction Based on Empirical Measurement
- Creator:
- Han, Sanghoon
- Publisher:
- University of Florida
- Publication Date:
- 2010
- Language:
- English
## Thesis/Dissertation Information- Degree:
- Doctorate ( Ph.D.)
- Degree Grantor:
- University of Florida
- Degree Disciplines:
- Agricultural and Biological Engineering
- Committee Chair:
- Burks, Thomas F.
- Committee Members:
- Lee, Won Suk
Beck, Howard W. Dixon, Warren E. Dankel, Douglas D. - Graduation Date:
- 4/29/2010
## Subjects- Subjects / Keywords:
- Cameras ( jstor )
Canopy ( jstor ) Coordinate systems ( jstor ) Geometric lines ( jstor ) Image filters ( jstor ) Image reconstruction ( jstor ) Images ( jstor ) Sensors ( jstor ) Simulations ( jstor ) Three dimensional modeling ( jstor ) 3d, activemesh, agricultural, autonomous, canopy, citrus, detection, feature, grove, harvesting, image, ladar, leaf, manipulator, matching, mosaicing, plucker, reconstruction, robot, scouting, surf, tracking, unmanned, vision - Genre:
- Unknown ( sobekcm )
## Notes- General Note:
- My study contributes to the development of unmanned and automated techniques associated with two applications for citrus groves: ground-based scouting autonomous vehicles and harvesting robot manipulators. The duty of ground-based scouting autonomous vehicles is to collect global information, threading along alleyways in a grove, and harvesting robot systems explore and map local fruit positions on a canopy using vision systems. Two major techniques considered in my study are a global image mosaicing technique for the ground-based scouting autonomous vehicle and a local 3D reconstruction for the harvesting robot manipulator. Visionbased techniques follow common procedures related to features: feature detection, feature tracking and feature matching. Since it must be assumed that there are no artificial landmarks in canopy scenes, it is necessary to extract features from leaves. A leaf detection algorithm was developed by combinations of morphological operations designed to look for the center of the leaf. Leaf detection can preserve the viability of dominant interest points by means of segmenting individual. To be robust against the disturbance and disappearance of interest points, an active mesh method is introduced. It was originally designed to detect shapes of objects, however, the method is used in tracking with an assumption that objects are not deformable during consecutive frames. A multilayered active tree was invented to be more robust to noises by managing interest points hierarchically. The multilayered active tree is not only stable, but it determines the dominant features. Some of the matched sets are used in the estimation of optical flow and reconstruction. For the horizontal movement, a parallax which causes discords between images occurs. A sequential image mosaicing was implemented by cropping and stitching the overlapped area between images. Difficulties of an 8 point algorithm were pointed out and a Plu umlautcker coordinates system which provides simpler and intuitive computations was suggested for the 3D canopy reconstruction. An alternative way to reconstruct canopies was also conducted through a LADAR and a GPS receiver.
## Record Information- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- Copyright Han, Sanghoon. Permission granted to University of Florida to digitize and display this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
- Embargo Date:
- 5/31/2012
- Resource Identifier:
- 814306390 ( OCLC )
## UFDC Membership |

Downloads |

## This item has the following downloads: |

Full Text |

PAGE 1 1 3D CITRUS GROVE RECONSTRUCTION BASED ON EMPIRICAL MEASUR E MENT By S ANGHOON HAN A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2010 PAGE 2 2 2010 SangHoon Han PAGE 3 3 To all the people who believed in me PAGE 4 4 ACKNOWLEDGMENTS I would like to express my sincere thanks to Dr. Thomas F. Burks, my supervisory committee chair and advisor, for introducing m e to the subject of robot vision and for his continuous faith in me throughout this work. His advice and support enabled me to successfully complete all my graduate studies at the University of Florida. I would also like to thank to Dr. Lee, who gave me sp ecial advice. I am grateful to my other committee members: Dr. Beck, Dr. Dankel and Dr. Dixon for being a part of my dissertation committee. Their valuable insight and guidance contributed to the completion of this research work. I am thankful to all my fr iends in our lab Agricultural Robotics and Mechatronics group for discussing and helping me with all of the experiments. I express my special gratitude to Greg Pugh and Mike Zingaro for their invaluable technical support and building system. I thank the pe rsonnel in the Department of Agricultural and Biological Engineering for providing flexibility. I thank my family and all the people who kept me encouraged for long time. Special thanks go to Ms. Sarah Baybordi for proofreading this document and sharing a special friendship. Lastly, I gratefully acknowledge the United States Department of Agriculture, Florida Department of Citrus and Florida Le gislative Initiative for Citrus Harvesting for providing the necessary funds to carry on this research. PAGE 5 5 TABLE OF C ONTENTS page ACKNOWLEDGMENTS ...............................................................................................................4 LIST OF TABLES .........................................................................................................................10 LIST OF FIGURES .......................................................................................................................11 ABSTRACT ...................................................................................................................................16 CHAPTER 1 INTRODUCTION ..................................................................................................................18 Citrus Groves in Florida .........................................................................................................18 Unmanned System ..................................................................................................................18 Autonomous Vehicles for Citrus Groves ........................................................................19 Robotics for Citrus Groves ..............................................................................................19 2 LITERATURE REVIEW .......................................................................................................20 Agricultural Applications using Computer Vision System ....................................................20 A utonomous V ehicle .......................................................................................................20 Harvesting Robot .............................................................................................................20 Requirements of Vision B ased 3D R econstruction .........................................................21 FeatureB ased Image Processing ............................................................................................21 Feature M atching .............................................................................................................21 Area based m atching ................................................................................................22 Featurebased m atching ............................................................................................22 Color based m ethod .................................................................................................24 Descriptor based m e thod ..........................................................................................24 Feature Tracking ..............................................................................................................26 Motion e stimation ....................................................................................................27 Linear t racking m odel ..............................................................................................27 Non l inear t racking m odel .......................................................................................29 Stochastic t racking m odel ........................................................................................29 Geometric t racking m odel ........................................................................................29 Structure and Motion Estimation ............................................................................................31 Camera Model and Geometry .........................................................................................31 Pinhole camera model ..............................................................................................31 Intrinsic p arameters ..................................................................................................31 Extrinsic parameters .................................................................................................32 Image Mosaic ..................................................................................................................33 Motion t ype ..............................................................................................................33 Estimation of m otion ................................................................................................33 PAGE 6 6 Composition .............................................................................................................34 Applied t echniques ...................................................................................................35 3D Reconstruction ...........................................................................................................36 Estimation of s tructure and m otion ..........................................................................36 Estimation of e ssential m atrix ..................................................................................36 Estimation of m otion ................................................................................................37 Estimation of s tructure .............................................................................................40 Applied t echniques ...................................................................................................42 Different a pproaches ................................................................................................43 3 OBJECTIVES .........................................................................................................................52 Objective Statement ................................................................................................................52 Scope .......................................................................................................................................52 Sub objectives .........................................................................................................................52 Generating a mosaic image of a grove scene ..................................................................52 3D canopy mode l s ...........................................................................................................53 4 METHODS AND PROCEDURES ........................................................................................54 Introduction .............................................................................................................................54 Application Sc enarios ......................................................................................................54 Scouting autonomous vehicle system ......................................................................54 Harvesting robot .......................................................................................................54 Assumptions and Limitations ..........................................................................................55 Citrus canopy scene ..................................................................................................55 Video frame image ...................................................................................................55 Single vision system .................................................................................................55 Motion model and large objects than view ..............................................................56 Methods and Procedures .........................................................................................................56 Conclusions .............................................................................................................................57 5 FEATURE DETECTION .......................................................................................................59 Introduction .............................................................................................................................59 Methods and Procedures .........................................................................................................59 Image Enhancement ........................................................................................................60 Leaf Detection .................................................................................................................60 Results .....................................................................................................................................62 Fruit Detection .................................................................................................................62 Size Problem ....................................................................................................................62 Conclusions .............................................................................................................................62 Future Work ............................................................................................................................63 6 FEATURE TRACKING .........................................................................................................67 Introduction .............................................................................................................................67 Methods and Procedures .........................................................................................................67 PAGE 7 7 Active Mesh .....................................................................................................................68 External force ...........................................................................................................68 Internal f orce ............................................................................................................69 Feature management ................................................................................................69 Simula tion Results ....................................................................................................70 Multilayered Active Tree ................................................................................................70 Generation hierarchical structure .............................................................................70 External force and internal force ..............................................................................71 Simulation and results ..............................................................................................72 Conclusions .............................................................................................................................73 Future Work ............................................................................................................................73 7 SEQUENTIAL IMAGE MOSAICING ..................................................................................81 Introduction .............................................................................................................................81 Methods and Procedures .........................................................................................................82 Projection Model .............................................................................................................82 Alignment ........................................................................................................................83 Filename N umbering .......................................................................................................84 Experiment ..............................................................................................................................84 Configuration ...................................................................................................................84 Simulation ........................................................................................................................84 Results .....................................................................................................................................85 Conclusion ..............................................................................................................................86 8 VISION BA SED 3D RECONSTRUCTION ..........................................................................92 Introduction .............................................................................................................................92 Methods and Procedures .........................................................................................................93 3D Reconstruction using an 8 Point Algorithm ..............................................................93 Markerless approach ................................................................................................93 Initial base vertex model ..........................................................................................94 Interest point manager ..............................................................................................94 Implementation .........................................................................................................96 Sensitivity to roundoff error ....................................................................................97 Discord index ...........................................................................................................97 3D Reconstruction using a Plcker Coordinates System ................................................98 Ex periment ............................................................................................................................100 Configuration .................................................................................................................100 Robot motion generation ........................................................................................101 Robot motion control and capture image ...............................................................101 Calibrations for Camera Parameters ..............................................................................101 Rectification for lens distortion effect ....................................................................101 Accuracy of camera motion estimation ..................................................................102 Feature Matching ...........................................................................................................103 I ndoor E xperiments .......................................................................................................103 Outdoor Experiment ......................................................................................................104 PAGE 8 8 Results ...................................................................................................................................104 Conclusion ............................................................................................................................105 Future Work ..........................................................................................................................106 9 RANGE SENSOR BASED 3D RECONSTRUCTION .......................................................121 Intr oduction ...........................................................................................................................121 Methods and Procedures .......................................................................................................121 Coordinates of LADAR and Vehicle ............................................................................121 Scanning model of LADAR ...................................................................................121 Motion model of vehicle ........................................................................................122 Simulation ......................................................................................................................122 Virtual canopy model .............................................................................................122 LADAR simulation ................................................................................................123 Motion s imulation ..................................................................................................123 Simulated distances ................................................................................................124 Results in reconstruction ........................................................................................124 Experiment ............................................................................................................................125 Configuration of equipments .........................................................................................125 Data acquisition software ..............................................................................................125 Calibration of LADAR ..................................................................................................125 Obtained raw data ..........................................................................................................126 Adjustment of position data ...........................................................................................126 Results ...................................................................................................................................126 Reconstruction 3D space data .......................................................................................126 Calculation of volume of canopy ..................................................................................127 Representation on GIS software ....................................................................................127 Conclusion ............................................................................................................................127 Future Work ..........................................................................................................................128 10 CONCLUSIONS AND FUTURE WORK ...........................................................................135 Summary and Conclusions ...................................................................................................135 Future Work ..........................................................................................................................137 APPENDIX A GEOMETRY ........................................................................................................................139 Plcker Coordinates System .................................................................................................139 Definitions .....................................................................................................................139 Point .......................................................................................................................139 Line .........................................................................................................................139 Plane .......................................................................................................................140 PAGE 9 9 Angle .............................................................................................................................140 A sign between two lines which meet each other ..................................................140 An angle between two lines which meet each other ..............................................140 An angle between two planes .................................................................................140 Point ...............................................................................................................................140 A point where a line passes though a plane ...........................................................140 A point on a plane which is closest to the point .....................................................141 A point on a line which is closest to the point .......................................................141 A point on a line which is closest to a line .............................................................141 Line ................................................................................................................................142 A line passing through two lines w ith the shortest distance ..................................142 A line made by two planes .....................................................................................142 Plane ..............................................................................................................................142 A plane made by a point and a line ........................................................................142 A plane made by two lines .....................................................................................142 Basic Functions .....................................................................................................................142 Triangle Functions .........................................................................................................142 Angles in triangle when edges are known ..............................................................142 Volume of Tetrahedron ..........................................................................................143 B CRITICAL SOURCE CODES OF ALGORITHM ..............................................................144 Leaf Detection ......................................................................................................................144 Multilay ered A ctive T ree Tracking ......................................................................................145 Active Mesh ...................................................................................................................145 Hierarchical Structure Generation .................................................................................148 Multilayered Active Tree ..............................................................................................148 Sequential Image Mosaicing .................................................................................................150 3D Reconstruction with Images ............................................................................................153 3D Reconstruction with Range Data ....................................................................................153 C DEVELOPMENT TOOLS ...................................................................................................155 LIST OF REFERENC ES .............................................................................................................156 BIOGRAPHICAL SKETCH .......................................................................................................159 PAGE 10 10 LIST OF TABLES Table page 51 Comparison detected number of fruit. ...............................................................................64 71 Intrinsic parameters for simulation ...................................................................................87 81 Calibrated Intrinsic camera parameters. ..........................................................................107 91 Estimated volume and height of canopy from reconstructed 3D space data. ..................129 101 Contribution techniques. ..................................................................................................138 C 1 Softwares. .........................................................................................................................155 PAGE 11 11 LIST OF FIGURES Figure page 11 C oncept design of an unmanned robot harvesting and autonomous scout ing system. ......19 21 Vehicle applications. ..........................................................................................................45 22 Robotics applications. ........................................................................................................45 23 SIFT feature descriptor. .....................................................................................................45 24 SURF 64 descriptor vectors at a n interest point. ...............................................................45 25 Active contour model ........................................................................................................46 26 Active mesh model ...........................................................................................................46 27 P inhole camera model .......................................................................................................46 28 Intrin sic camera parameters ..............................................................................................46 29 Panorama from pure rotation. ............................................................................................47 210 Homography .....................................................................................................................47 211 Warping.. ............................................................................................................................47 212 Color correction process. ...................................................................................................48 213 Time warping. ....................................................................................................................48 214 Epipolar geometry. .............................................................................................................48 215 F our possible cases of reconstruction. ...............................................................................49 216 Stereo vision system. .........................................................................................................49 217 3D reconstruction of tree model .......................................................................................49 218 Parallel perspective stereo mosaic. ....................................................................................50 219 Augmented reality .............................................................................................................50 220 Augmented reality with SLAM. ........................................................................................50 221 Measurement of canopy. ....................................................................................................51 41 V ehicle system ..................................................................................................................58 PAGE 12 12 42 Robotics sytem. ..................................................................................................................58 43 Overall procedures. ............................................................................................................58 51 Histogram equalization. .....................................................................................................64 52 Embossing filter effect. ......................................................................................................64 53 Leaf segmentation. .............................................................................................................65 54 T hreshold a djustment ........................................................................................................65 55 Detect the center of segment. .............................................................................................65 56 Morphological operations of leaf detection. ......................................................................65 57 Fruit segmentation. ............................................................................................................66 58 Segmentation size. .............................................................................................................66 61 Life time of each feature for during sequence. ..................................................................74 62 Sum of forces acting on each node ...................................................................................74 63 Virtual objects and 2D sinusoidal camera motion. ............................................................74 64 Optical flow estimation. .....................................................................................................75 65 Nodes generation from multiple images. ...........................................................................75 66 Voronoi segmentation applied to each image. ...................................................................76 68 MATLAB code of recursive groupi ng function. ...............................................................77 69 Automatically generated hierarchical connection based on an image. ..............................77 610 Determined d ominant features. ..........................................................................................78 611 F orce s acting on the considering node. ..............................................................................78 612 Procedure to determine subnodes. .....................................................................................78 613 Camera motion model in the test for multilayered active tree. ..........................................79 614 Simulation for robustness to noise. ....................................................................................79 615 Simulation for robustness to occlusion. .............................................................................80 71 Parallax problem occurring with nonplanar objects which have depth. ...........................87 PAGE 13 13 72 Basic steps of image mosaicing. ........................................................................................87 73 D iagram of algorithm structure. .........................................................................................88 74 Translation projection model. ............................................................................................88 75 Alignment. .........................................................................................................................89 76 File numbering. ..................................................................................................................89 77 Configuation of image mosa icing simulation. ...................................................................89 78 Evaluation of positioning error or camera ........................................................................90 79 Cropping edge mosaicing effect. .......................................................................................90 710 Mosaicing with sinusoid motion. .......................................................................................90 711 Mosaicing result from video clip recorded far enough away from citrus grove ...............90 712 Mosaicing result from 5 times zoomed in video clip. ........................................................91 713 Mosaicing result from 10 times zoomed in video clip ......................................................91 81 Loop among projection, motion estimation and reconstruction. .....................................107 82 Icosahedron base vertex model. .......................................................................................107 83 Flow chart of an 8 point algorithm. .................................................................................108 84 Interest point selection. ....................................................................................................108 85 Interest points around edge ar e discarded. .......................................................................109 86 Re projection error of unstable Interest points. ...............................................................109 87 Stability of vertex.. ...........................................................................................................109 88 Average statistical quantity .............................................................................................110 89 3D reconstruction program using an 8 point algorithm in real time. ..............................110 810 Increment of accumulated reprojection error. ................................................................110 811 Sensitivity test of an 8 point algorithm with 4 pixel roundoff error. ..............................111 812 Reconstruction simulation ..............................................................................................111 813 Mismatched index problem..............................................................................................112 PAGE 14 14 814 Plck er coordinates system. .............................................................................................112 815 Determination of an object vertex....................................................................................112 816 Simulation of reconstruction using a Plcker coordinate system. ...................................113 817 C onfiguration of camera on R1207 robot manipulator. ...................................................113 818 Robot motion simulation..................................................................................................113 819 Robot control software. ....................................................................................................114 820 Camera calibration. ..........................................................................................................114 821 Detemination of camera motion ......................................................................................115 822 Feature matching. ............................................................................................................115 823 Reconstruction of icosahedrons. ......................................................................................116 824 Reconstruction of a robot manipulator ...........................................................................117 825 Reconstruction comparison. .............................................................................................117 826 Outdoor experiment. ........................................................................................................118 827 Captured sequential images. ............................................................................................118 828 Matched correspondences between sequence images. .....................................................118 829 Reconstruction results. .....................................................................................................119 830 Reconstruction confirmation. ...........................................................................................120 91 Procedure diagram. ..........................................................................................................129 92 Coordinates of LADAR system ......................................................................................129 93 Path model of a vehicle in a virtual grove. ......................................................................130 94 Virtual canopy model. ......................................................................................................130 95 S earch for a facet which a laser beam passes .................................................................130 96 Coordinates of LAD AR during simulation. .....................................................................131 97 Fully scanned visual simulation. ......................................................................................131 98 Distances measured from simulation ..............................................................................131 PAGE 15 15 99 Reconstruction results. .....................................................................................................131 910 Configuration of LADAR and GPS system.. ...................................................................132 911 LADAR calibration. .........................................................................................................132 912 Raw data. ..........................................................................................................................132 913 Adjusted control points for path. .....................................................................................133 914 Reconstructed 3D space data in the 4th row ....................................................................133 915 Created tetrahedrons corresponding to canopy ...............................................................133 916 Reconstructed entire grove .............................................................................................134 917 Results applied to GIS software. ......................................................................................134 101 Virtual robot harvesting simulation. ................................................................................138 A 2 Angles in triangle when edges are known. ......................................................................143 A 3 Volume of a tetrahedron. .................................................................................................143 PAGE 16 16 Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy 3D CITRUS GROVE RECONSTRUCTION BASED ON EMPIRICAL MEASUR E MENT By SangHoon Han April 2010 Chair: Thomas F. Burks Major: Agricultural and Biological Engineering My study contribute s to the development of unmanned and automated techniques associated with two applications for citrus groves: ground based s couting autonomous vehicles and harvesting robot manipulators. The duty of groundbased s couting autonomous vehicles is to collect global information, threading along alleyways in a grove, and harvesting robot systems explore and map local fruit positions on a canopy using vision sys tems. Two major techniques considered in my study are a global image mosaicing technique for the groundbased s couting autonomous vehicle and a local 3D reconstruction for the harvesting robot manipulator Vision based techniques follow common procedures r elated to features : feature detection, feature tracking and feature matching. Since it must be assumed that there are no artificial landmark s in canopy scenes it is necessary to extract features from leaves A leaf detection algorithm was developed by com binations of morphological operations designed to look for the center of the leaf Leaf detection can preserve the viability of dominant interest points by means of segment ing individual To be robust against the disturbance and disappearance of interest p oints, an active mesh method is introduced. It was originally designed to detect shapes of object s however, the method is used in tracking with an assumption that object s are not deformable during PAGE 17 17 consecutive frames. A multilayered active tree wa s invente d to be more robust to noises by managing interest points hiera r chically. The multilayered active tree is not only stable, but it determines the dominant features. S ome of the matched sets are used in the estimation of optical flow and reconstruction. F or the horizontal movement a parallax which causes discords between images occurs. A sequential image mosaicing was implemented by cropping and stitching the overlapped area between images. Difficulties of an 8 point algorithm were pointed out and a Plcker coordinates system which provides simpler and intuitive computations was suggested for the 3D canopy reconstruction. A n alternative way to reconstruct canopies was also conducted through a LADAR and a GPS receiver PAGE 18 18 CHAPTER 1 INTRODUCTION The objective o f chapter is to describe the motivation why this research starts Citrus Groves in Florida C ultivation such as pomiculture is a work that needs a bunch of labors and care. I ncreasing l abor costs is becom ing a considerable part of reasons for increasing th e production cost and weakening competitiveness. Citrus is a huge industry of Florida, producing more than 70% of USs supply of citrus. Present citrus groves face with many issues. A unit grove is becom ing large scale by uniting small groves for years. Th is means work capacity per person increases. Compared with increment of work duration, number of labor is decreasing. In practice, imported labors are replacing scarce labor power. It is required for farm owners to equip temporary accommodations and facili ties on near to groves for labors for harvesting season. Furthermore, employing nonUS citizens has potentially immigration problems. Exposing to chemicals such as spraying is also an obstacle which makes labors scruple at working. T hese challenges are bei ng the pressing reasons that demand unmanned systems in applications for citrus groves. Unmanned System A f ully automated unmanned system should be able to detect and analyze the environment by itself. Unmanned harvesting techniques do not only strengthen the price competition of farm products on international markets, but also contribute to the technical development of industry. Figure 1 1 shows a concept of a robot harvesting vehicle system. Two major applications can be briefly divided autonomous vehic le system and harvesting robot system. PAGE 19 19 Autonomous Vehicles for Citrus Groves The main function of autonomous vehicle systems requires a capability of navigating alleyways by itself. There are many trials not only agricultural applications, but also urban applications. My study focuses on only vision applications rather than automatic navigation. Robotics for Citrus Groves In practice, Florida oranges are being harvested by using mass product type machines because 95% of oranges are used to produce juice r ather than raw fruits. However individual fruit harvesting techniques are worth to develop for the fresh fruit market. A robot harvesting system which automatically maps fruit positions and harvest them can be one of agricultural applications about unmanned management. Figure 1 1. C oncept design of an unmanned robot harvesting and autonomous scouting system. PAGE 20 20 CHAPTER 2 LITERATURE REVIEW This chapter reviews the previous research conducted in the area of agricultural applications and general concept s a ssociated with the visionbased 3D reconstruction. Agricultural Applications using Computer Vision System Machine vision system s are a common acquisition device in various applications from industry to everyday life. It is also coming to play a critical pa rt in agricultural application s associated with unmanned system s Briefly, two kinds of platform can be considered in my work. A utonomous V ehicle Since most agricultural application s often cover a spacious field area, it is essential to include machine vis ion in the autonomous vehicle s. Younse (2005) applied a machine vision system to an autonomous greenhouse sprayer, which tracks groundbased artificial landmarks as shown in Figure 21 A ) Subramanian (2006) developed a steering control system of an unmanned tractor which combines with a camera and a laser range sensor as shown in F igure 2 1 B) Harvesting Robot Harvesting robot in Figure 22 A) is one of the most complicated agricultural application s at present. In the case of detecting fruit s spectrum an alysis or shape detection technique s can be employed (Hannan 2004). Mehta (2006) studied a visual servo system which controls the positioning of a grabber. A v ision system together with a range sensor (Rovira Mas 2006) can be applied not only as a sensor t o control, but also as a device to measure 3D space density maps (Rovira Mas 2003; Kise 2005). PAGE 21 21 Requirements of VisionB ased 3D R econstruction When rendering a large or long 3D model using successive image s the whole image is aligned with subimages taken from different views. To form the continuous image, an image mosaic technique can be ad opted. The visionbased 3D reconstruction method consists of several procedures as follows. Matching correspondences Feature Extraction, Feature Tracking Image m osaici ng Homography, Registration, 3D r econstruction Depth Information, Texture Mapping, Virtual Reality The following will be a series of detailed literature review s Due to the enormous amount of studies in the computer vision areas, only selected works will be cited, with a stronger focus on techniques considered and related with the issues of this dissertation. FeatureB ased Image Processing Since the feature extraction depends on conditions or situation s of an image, there is no unique way to apply it. It is necessary to take into account various methods before applying feature extraction The first step in processing, when dealing with sequen tial or multiple images establish es relationship s between the different images. Feature M atching F eature matching searches for identical characteristics in multiple images oftentimes referred to as correspondences The feature correspondences which should be known in order to achieve a mosaic, can be roughly divided into two classes : areabased and featurebas ed. Since the pixel based method has weakness es such as illumination, shape, occlusion, appearance or local minimum, the featurebased method is normally preferred. In the feature extraction various features such as point s (or color s ), line s areas or content s can be employed. For a planar image, an edge detector or a corner detector are used in most cases PAGE 22 22 Area based m atching Area based matching or Template based matching is a rudimentary method which assumes that there exist similar regions between images. There are two different approaches : Normalized Cross Correlation (NCC) and Sum of Squared Differences (SSD). NCC measures the similarity, and SSD measures the dissimilarity between two patch images. Normalized Cross Correlation : To observe similarity b etween patch images, Normalized Cross Correlation (NCC) is computed by subtracting the mean and dividing by the standard deviation. The formula can be represented as, ) (2 ) ( ) ( 2 ) ( ) ()) ( ( )) ( ( )) ( ))( ( ( ) (j iI j i j i W W j i j iI E I W E W I E I W E W j i CC (2 1) where W is a subimage and I is a template image. The s ubimage should be larger than the template image. Sum of Squared Differences : Sum of Squared Differences (SSD) is based on inferential statistics. A d istance from a point to the mean of an image is a deviation. If the deviation is squared, then we have the sum of squares for the image. i i i i i SSDe x I x x I u E2 2 0 1)) ( ) ( ( ) ( (2 2) w here v u u is the displacement, ie called the residual error. When a template patch image is placed on a similar a rea, the result of computation has a pick point over the image. The performance between NCC and SSD is the same. Featurebased m atching F eature extraction is a process that determines a relationship between images in a local image. A variety of features su ch as color s line s curve s patterns or multiple combinations can PAGE 23 23 be chosen, but mostly the point feature invariant to transformations and directions is used in sequen tial image s in which there is camera motion. The f eature extraction technique is used fo r motion detection, object tracking, feature matching, image mosaicing, panorama, 3D reconstruction and object recognition. Interest point (Harris corner detection) Nave : The corner detection devised by Harris and Stephens is a universal approach used to detect point feature s and many enhanced detection techniques have been proposed based on the corner detection. A c orner is defined with an intersection between two edges, while an interest point is a n invariable position to detect. Since the detector does not detect only actual corners, an interest point in general is referred to as a feature more than a corner In the literature, c orner i nterest point and feature are referred to interchangeably. One important assumptions of the corner detect ion is that it can be deri ved from neighborhood points. The neighborhood of a feature should be different enough from another neighborhood after moving. The approximation of the second derivative between a patch window w and a shifted window ) ( dy dx w is given by following the equation M dy dx D ) ( where dy dx y x w My I x I y I x I, ) ( (2 3) M should have two large eigen values with respect to the interest point. If the eigen value has a positive lar ge number, it would be an edge. I f both eigen values have zeros, it can be inferred there is no feature. Since the calculation of the eigen value is computationally expensive, it can be achieved by minimizing the fol lowing corner response function: (source: ) ) ( trace ) det( ) (2 2 2 1 2 1M M R (2 4) PAGE 24 24 where k is a tunable sensitivity parameter Since the corner detection itself is not very robust in general, it frequently needs expert supervisions or redundancies introduced to prevent the effect of i ndividual error. Color based m ethod In the case of autonomous mobile robot s which work indoor s or in the industry, artificial landmarks such as color s or specified patterns are employed to detect the location. Yoon (2001) proposed an artificial landmark re sistant to change caused by altering intensities or shape s The designed color checkered pattern uses a global/local histogram and an HSV color model which is less affected by illumination. Since the histogram is invariable within the pattern, it is imperv ious to noise. Descriptor based m ethod It is not enough for a single method to cope with incorrect matches caused by a variety of conditions such as change of scale, rotation and illumination. To avoid these varieties, particularly designed approaches have been suggested. The descriptor based method is becoming the most promising approach. S cale i nvariant f eature t ransform : A S cale Invariant Feature Transform (SIFT) which is devised by Lowe (2004) is one of the advanced feature matching methods. The SIFT br iefly consists of a key point localization and an identified descriptor around the key point as shown in Figure 2 3. The key point is determined by covering invariance to varieties. To detect a stable location of a key point, a Difference of Gaussian (DoG) approach is used over the scalespace representation of an original image. This convolution makes it resistant to scale variation s After the localization, the key points orientation is assigned in order to cover rotational invariance. The descriptor com posed of 128 normalized vectors is computed by local gradient orientations and magnitude s around the key point. The SIFT features have a lot of advantages such that they PAGE 25 25 are invariant to scale and rotation, highly distinctive, resistant to distortion s occ lusions noise s and change of illumination. Due to its merits, many researchers are presently trying to apply it increas ing present ly (Lepetit et al. 2004; Heikkil et al. 2005; Gr ., 2005; Mortensen et al. 2005; Moreels et al. 2005). Ke et al. (2004) proposed a PCA SIFT algorithm in order to reduce dimensions of descriptors. They applied Principle Component Analysis (PCA) to the normalized gradient patch instead of smoothly weighted histograms. Thus t hey insist that it should be faster and more resistant than the original SIFT. Speeded up robust features: Speeded Up Robust Features (SURF) devised by Bay et al (2006) is a descriptor based feature matching method like SIFT. The descriptor illustrates distribution of intensity such as the gradient around an interest point. Unlike SIFT, SURF is designed to enhance computational speed by utilizing responses of the first order of Haar w avelet. SURF uses a Fast Hessian detector for key point detection and Haar w avelet for descriptor. The response of Ha ar w avelet is fundamentally invariant to illumination changes like shadow as well as contrast invariant by taking the unit vector. Fast Hessian Detector Fast Hessian is based on an approximation of a Hessian m atrix. Given a point ) ( y x x in an image, the H essian matrix ) ( s x H in x at scale s is defined as follows ) ( ) ( ) ( ) ( ) ( s x L s x L s x L s x L s x Hyy xy xy xx (2 5) where ) ( s x Lxx is the convolution of the Gaussian second order derivative SU RF Descriptor SURF explores 64 sub areas while SIFT explores all pixels within the frame. This simplification enhances not only computation speed but resistance because it is less sensitive to PAGE 26 26 noise. The first step for a descriptor is to determine a reproducible orientation from the circular neighborhood around the interest point. The next step is to extract a rectangular area along the orientation and then determine the descriptor with respect to this area. Orientation Consideration of the circular neighborhood yields invariance to rotation. First it computes the response of the Haar wavelet at the circular neighborhood followed by determining the dominant orientation based on distributed responses from the weighted Gaussian center. To keep consistency of orientation, it is necessary to ascertain the appropriate size of the window through empirical means. Descriptor Descriptor vectors are computed within the rectangular area. T he rectangular area angled and centered on the interest point is extracted along the orientation determined in the previous step. The extracted rectangle is divided again into subareas with a 4x4 grid and then a Haar wavelet is performed at the sampling points distributed on the grid. Each subarea contains 4 descriptor vectors as follows. y x y x j id d d d,v (2 6) where xd and yd are sums of the responses, and xd and yd are sum s of absolute values of responses with respect to the X and Y axis Therefore the descriptor with respect to an interest point has 64 vectors as shown in Figure 24. Feature Tracking F eature tracking is for matching correspondences in sequential image s The disparity between features gives us depth information to compute a homogra phy. A s uccess of registration and reconstruction depends on how well the features are tracked. PAGE 27 27 Motion e stimation Optical f low : An optical flow is defined as a distinct motion of image intensities. There are two major assumption s in the motion estimation. First, the brightness is steadily changed. Second, there is no object moving individually in the scene. When an object moves x y in the scene while t the first assumption can be represented w ith T aylor series expansion ) ( ) ( ) ( t z y x O t t t I t z z I t y y I t x x I t z y x I t t z z y y x x I (2 7) According to the second assumption that terms are constant, derivatives are zero. t z x y x x xI V I V I V I t t t I t z z I t y y I t x x I 0 (2 8) B AV I I I v v v I I I I I I I I I I V In n n nt t t z y x z y x y y x x y x t 2 1 2 2 1 1 1 1 (2 9) This equation is called optical constraint. Since this equation has many solutions, more constraints are needed To determine the optical flow, many methods such as a Lucas Kanade method and a Horn Schunck method are proposed. Linear t racking m odel LK tracker: A Lukas Kanade method is one popular optical flow estimation. In this differential method, it is assumed that information based on intensities and color s or objects is not changed significantly. Assuming the optical flow of every pixel in the small window is the same, the computation can be simplifie d with solving a linear equation. Although a solution is PAGE 28 28 obtained with two points in a nonsingular system, more point s are more effective. For this over determined system, a least square approximation is used in common. For points, a weight function such as Gaussian is used. y x t x y x y x y x x y x t y y x y y x y xI I y x w v I I y x w u I y x w I I y x w v I y x w u I I y x w, 2 2 ,) ( ) ( ) ( ) ( ) ( ) ( (2 10) The i teration also yields a better result. Yielded offset s in the computation are used to determine a new window. B AV I I I v v v I I I I I I I I I I V In n n nt t t z y x z y x y y x x y x t 2 1 2 2 1 1 1 1 (2 11) To solve the t aking pseudo inverse, i i i i i i i i i i i i i i i i i i i i it z t y t x z z y z x z y y y x z x y x x z y x T TI I I I I I I I I I I I I I I I I I I I I v v v B A A A V1 2 2 2 1) ( ) ( (2 12) Lukas Kanade method is usually carried out in a coarse to fine iterative manner as SIFT. T his algorithm does not yield a very high density of flow vectors. T he optical flow fades out quickly across motion boundaries and the inner parts of larg e homogenous areas show little motion. Therefore it is comparatively resistant against the presence of noise. Rav Acha et al. (2004) use s a modification of the Lucas Kanade for a direct 2D alignment. Kalman f ilter : A Kalman f ilter that originated from a c ontrol theory is a traditional tracking method, which estimate s coefficients of linear equation s It is applied to tracking PAGE 29 29 moving objects. Davison et al. (2003) emphasize d that an E xtended Kalman F ilter (EKF) shows a great success in estimating locations of the robot with a single camera. Non l inear t racking m odel There are many applications that linear model s cannot cover. Horn Schunck tracker : A Horn Schunck method is designed to solve an aperture problem. A f low density is around a boundary on the other hand, it is sensitive to noise. Entended Kalman f ilter : An extended K alman f ilter is for a non linear version of the l inear Kalman f ilter. In some research, these techniques are combined to take each advantage. Stochastic t racking m odel C on ditional de ns ity propagation: Isard et al. (1998) devised a CONditional DENSity propagATION (CONDENSATION) algorithm based on a stochastic tracking for an active contour. In this algorithm each position of feature is estimated based on the probability in the previous stage. The CONDENSATION algorithm applies factored sampl es iteratively to compute the a posteriori density. In the iterative step, it starts with a sample set 1s which represents a posteriori density from the previous step 1 t Applied factored sampling, a set 3 s formed from 2s which describes the a priori density ) | (1 t tZ x p is represented as follows ) | ( ) | ( ) | (1 t t t t t tZ x p x z p Z x p (2 13) While one formulation in the Kalma n f ilter tracks only one feature, CONDESATION can expand the number of feature s I t is resistant against occlusion s and temporary disappear ances Meier et al. (1999) applied t his algorithm to the content based tracking for mobile robots. Geometric t racking m odel T racking models, as reviewed before, are based only on extract ed features from image s Geometric tracking models such as an a ctive c ontour and an a ctive m esh refer to geometric PAGE 30 30 relationship s among the features (nodes) as well as image based features These terms are defined as energy equation s and controlled by weights. The a ctive contour method is effective in detecting shapes or tracking geometric deformation. Active c ontour : An a ctive contour is an iterative framework delineating an object profile by minimizing energy which is defined with image properties. The energy is categorized by two aspects : internal and external energy. The external e nergy is minimal when a contour stays at the boundary of an object. It is measured by taking gradient between pixels. The i nternal e nergy is minimal when a contour matches a desired shape. It is based on an elongation (elastic force) or a curvat ure (bending force) deformation. This approach that quantifies morphological constraints with energy has a potentiality which extends to other approaches like shape detection as well as tracking. Figure 2 5 shows an example result from an active contour. Active m esh : Molly, D. (2000) proposed an a ctive mesh method which extends from an a ctive contour. While an a ctive con tour method defines a one dimensional relationship among nodes, an active mesh method defines multiple relationships among the closest nodes as shown in Figure 2 6. A D elaunay t riangulation is one of typical grouping methods for nearby nodes. The a ctive me sh is effective in the background of a scene which is not deformed or extracting moving objects from the background. Thereby it is applied to video compression as well as tracking. While feature extraction s in the active contour ha ve been conducted every frame, Griffin, (2001) suggested a content based active mesh method. This approach finds node positions that minimize errors between the patch image and the reference image. Since it is assumed that each patch image is a planar ( a ffine transformation), it does not seem to be effective in a sudden occlusion. PAGE 31 31 Structure and Motion Estimation Once correspondences are determined through various image processing methods mentioned in the previous section then the camera extrinsic parameters and the depth of feat ures based on the correspondences can be estimated This reconstruction is often called Structure F rom Motion (SFM). In this section, the basic theory behind the procedure and some applications which expand those concepts is discussed A c amera is a device which projects a 3D scene into a 2D image. Therefore, a camera can be modeled as a projection matrix (source: Ma et al. 2004) Camera Model and Geometry Pinhole camera model Throughout my study, an ideal pinhole camera model is used, which is referred to generally. As shown in F igure 2 7, the projection center C is placed as far as focal length f on the principle axis perpendicular to the center c of an image plane (retin al plane). An i mage point 3 m which intersects the image plane and a line connecting the projection center C and an object 4 M in the scene, can be simply modeled by a proj ection matrix 4 3 0 as M m0 (2 14) Intrinsic p arameters For an input data 3 m from a CCD sensor to indicate projected point 3 m a transformation matrix 3 3 int K between them, which is called the intrinsic parameter s of the camera, is needed The intrinsic parameter s are specified with several fixed properties of the camera. PAGE 32 32 1 1 1inty x o f o s f y xy y x xmK m (2 15) where xf yf are focal length with respect to x y axis. xo yo are the center of an image axis. s is a skew angle as shown in Figure 28. Extrinsic p arameters A 3D point M can be represented with a reference point M by a pose matrix with respect to the base camera. M 0 T R M 1, (2 16) w here R is a rotation matrix and T is a translatio n vector. If a camera pose has changed, the reference point is simply given by taking an inverse of the pose, which is called the extrinsic parameter s of a camera 4 4 ext K M K M M 0 T R R M ext1T T (2 17) Camera p rojection matrix : Combining t hose specified parameter matri ces a generalized camera projection matrix can be obtained : M T R R K m M K K m 0 T R RT T T T y y x xZ Y X o f o s f y x int ext 0 int1 1 0 1 0 0 0 0 1 0 0 0 0 1 1 1 (2 18) PAGE 33 33 Image Mosaic Since cameras ha ve a limited view, it is necessary to merge multiple images to be a wide single like Figure 2 9. I mage mosaicing is a framework that builds a high resolution image with multiple subimages. Features extracted from the duplicated area are used to estimate camera motions Appropriate warping transformation is carried out based on the estimated camera motion T he w arped image can be folded exactly. Approach depends on the camera motion type. Motion t ype Parallax: A p arallax is a phenomenon that occurs when the background of an object is changed with respect to view point s Since this effect is based on a distanc e from the object, the optical f low is also used to measure the distance of objects. This effect does not happen during a pure rotation. Pure r otation : An i mage mosaic uses either a cylind rical type or a spher ical type of projection model in general. These projection models have a single common camera center. Since no parallax happen s with the models it is simple to achieve a mosaicing with pixelbased matching methods If images are taken with translational motion, the distance of objects and the estimation of camera motion should be considered. Estimation of m otion To combine images, there should be an overlapped area between images. For the estimation of camera motions a homography is widely used. Homography: A homography in computer vision is defined as a collineation which maps between sets of correspondences obtained from different plane s This relationship can be represented with a transformation matrix iH with respect to plane i as shown in Figure 210. Ther e are a lot of applications using homography and it provides ways of composing images or video frames. PAGE 34 34 The features may be obtained from a plane such as an artificial structure on occasion. In this degenerat ive configuration, an 8 point algorithm would not give a unique solution. Thus different approaches are needed A plane equation can be represented as follows d z n y n x nT 3 2 1M N 1 1 M NTd (2 19) where d is the distance from the origin to the plane. Substituting t his to the rigid body equation, we obtain 1 1 1 1 1 21 1 HM M TN R M N T RM T RM M T Td d Td TN R H 1 (2 20) where 3 3 H is called a collineation or a homography. It is a 3x3 homogeneous matrix whose rank is 2. However there is no information that extracted features are obtained from a plane. If a camera moves with pure rotation, i.e. 0 T or a camera is infinitely far away from objects i.e. d the homography becomes a function of only R E ach depth of feature s d and planar structure N to determine the homography do not matter Thus many image mosaicing methods ha ve been carried out with image set s taken from purely rotating camera. 0 2 1 Rm m (2 21) Composition Warping: W arping is a process which deforms an image to fit to a projection model. Since images taken with different view point do not have the same projection plane, the overlapped areas between images are not identical. Deform ing image s based on the base camera motion images can be combined with seams as shown in Figure 211 Blending : When mosaicing with images taken from outdoors, the boundary edges between images aligned might be salient due to changes of illumina tion As s hown in Figure 212, Hasler et al. (2004) proposed a method to correct for color differences between images stitched together. PAGE 35 35 The color correction consists in retrieving linearized relative scene s referred to data from uncalibrated images by estimating th e optoelectronic conversion function (OECF) and by correcting for exposure, white point, and vignetting variations between the individual pictures. Applied t echniques A m osaicing technique is used in various applications such as topological terrain maps satellite imagery, high resolution medi c al image s video compression and so on. Manifold : Jun J.C. (2004) achieved an image mosaicing and 3D reconstruction by using a manifold projection. The manifold projection contains a n individual projection plane every frame. He assumed that the path of a camera is two dimensional and the extrinsic parameters are estimated by measuring the optical flow with three images. However incongruity on the projection plane is incur red as much as the difference of depth. Eve n t hough it cut off a discord boundary with a curve estimated with the same pixels, errors are incur red in the distinct nonplanar scene b ecause it approximates the boundary of the projection with compensation between intersections. Time warping : Rav Acha et al. (2004) estimated depth information by means of time warping, which expands a time axis from an image plane. To simplify the computation, a camera moves in a one dimensional direction with a constant speed and the camera view is orthogonal to the pat h. Therefore no transformation is used. Since the assumed constant moving speed forms a line on the time axis, the depth of each feature can be estimated as much as the line is declined. That is, point s that are too far away become almost a vertical line and the closer point s become the declined line as shown in Fig 2 13. The feature patches are divided by a Voronoi diagram and set the width of pa tch as much as the depth. Since the depth information is represented by a line on the time axis, it can be estimated even though features disappear while tracking sometimes PAGE 36 36 3D Reconstruction Correspondences do not generally come from only a plane. Thus a homography cannot be applied in terms of arbitrary motion s of the camera. This section induces equations whic h represent the general relationship between correspondences from a pair of images based on an epipolar geometry. (source: Ma et al. 2004) Estimation of s tructure and m otion Depth information with a camera is estimated by using correspondences taken from o ver two images. For these correspondences to reflect actual features of an object, those should basically satisfy an epipolar constraint. The e pipolar geometry can be described through several ways. If neither intrinsic nor extrinsic parameters of a camer a are known, the epipolar geometry is represented by a Fundamental m atrix. If only intrinsic parameters of a camera are known, the epipolar geometry is represented by an Essential m atrix. If both intrinsic and extrinsic parameters of a camera are known, the epipolar geometry is represented by a Projection m atrix. The relative motion (rotation and translation) of a camera can be estimated with the epipolar constraint. Once the camera extrinsic parameters are determined, spatial position s of features are a lso determined with a transformation matrix easily. Estimation of e ssential m atrix Epipolar constraint : As shown in F igure 214, when calibrated correspondences x on each image plane reflect a spatial point X the relation between them can be represented with a multiplication of matrix as M m0 (2 22) PAGE 37 37 where is a scalar and 00I is a camera projection matrix. Setting one of two cameras as a reference frame, a spatial point 3 2 M with respect to the other camera 2o can also be represented with a relative pose R T and an identical point 3 1 M with respect to the reference frame 1o as T RM M 1 2 (2 23) Since i i im M where 3 m is homogenous coordinates, the relationship of correspond ences is represented as follows once again. T m R m 1 1 2 2 (2 24) To remove scalar s i which denote physical depths applying a skew matrix T of the vector T to both sides, it becomes 1 1 2 2 m R T m T (2 25) where 2 2 m T m T Since 2 x T is perpendicular to 2x 0 2 2 m T mT Multiplying T2x to both sides again, the correspondences have the following relationship regardless of scalars i 0 1 2 Rm T mT or 01 2 Em mT (2 26) which is called an epipolar constraint and 3 3 E is called an essential matrix. Since 0 )det( E it is singular and a homogenous quantity. It has only five degrees of freedom: 3 in rotation and 2 in translation up to scalar. Since E contains only R and T information, solving the E that satisf ies the equation with given correspondences, we can determine R and T Estimation of m otion To solve E the epipolar constraint equation is modified into a linear form. 0 s TE A (2 27) PAGE 38 38 where 9 33 23 13 32 22 12 31 21 11 T se e e e e e e e e E is referred to as the stacked version of E 9 2 1 n Tm m A and operator denotes a K ronecker product. For example given Tz y x1 1 1 1 m ,2 2 2 2 Tz y x m 2 1 2 1 2 1 2 1 2 1 2 1 2 12 1 2 1 2 1z z y z x z z y y y x y z xy x x x m m A solution of E satisfying the equation is an eigenvector of A which corresponds to an eigenvalue equal to zero Applying Singular Value Decomposition (SVD) to the coefficient matrix A then the last column of V which corresponds to zero of the eigenvalue becomes E R T V U UR V UR V U E 2 2 2 2 T z T z T z z T (2 28) where T z T zV R U UR T2 2, T here is now a pair of motion s H owever, in practice the eigenvalue would not be zero exactly. Therefore, E is determined by minimizing the norm of difference between the estimated solution Tdiag V U E ) ( ~3 2 1 and the exact solution Tdiag V U E ) 0 (2 1 0 2 0~ min E E (2 29) Notice that E also satisfies the same set of the ep ipolar constraint resulted from E Thus there are four possible cases and three of them from B) to D) are unrealistic as shown in F igure 215. A realistic one A) can be chosen by inspecting positive depth. R and om sa mple c onsensus: As s een before, the estimation of the relationship between correspondences is carried out with the assumption that the measurement of feature points is exact. In practice, however, the correspondences are exposed enough to noises, and a little PAGE 39 39 contamination of noise s can make the final result s useless. The problem is that it is difficult to segment the set of input data into inliers and outliers before the exact solution is obtained. One solution is RANdom SAmple Consensus (RANSAC), which is a popular algorithm in the computer vision area. This algorithm is an iterative method to estimate a mathematical model from a set of data and it can be applied to various problems. RANSAC also assumes that in a given set of inliers there exists a process which describes the data or estimates an appropriate model. The objective of this algorithm is to segment observed input data into inliers fitting to parameters of a model and outliers such as noise. The outlier can be caused by amplified noises, coarse measurements or incor rect hypotheses. The segmented inliers are used not only to refine the solution, but will be used to reconstruct the structure of a scene in the continuous frame in which it is important for feature s to survive as long as possible. RANSAC is iteratively c arried out by randomly selecting a subset of data. The subset of data is hypothetical inliers and they are tested as the following steps: 1. take a subset of hypothetical inliers from observed input data and estimate hypothetical parameters of a model. 2. count the number of data which fit to the model by applying other data to the model. 3. after repeating these step s choose a hypothetical model which has large number of data. 4. segment observed data into inliers and outliers by applying data to the chosen model. I n the case of a Homography as seen before, when all feature points are not extracted from a plane, the homography induced from the plane creates a virtual parallax. Since a homography assumes that points originate from a plan, it is effective for RANSAC to segment feature points into the points originated from a plane, i.e. inliers and the others. RANSAC is designed so that it can just estimate one model for a specific set of data. In case that one is multiply ing a homography, therefore RANSAC should be applied for each homography individually. PAGE 40 40 One advantage of RANSAC is the ability to robustly estimate parameters of a model with a high accuracy even in the presence a significant number of outliers. In contrast, one disadvantage is that it requires threshol ds specified for each problem. There is no time limit to repeat the iteration. When one stops at some iteration, there is no guarantee that it would be an optimal solution. Bundle adjustment : A b undle a djustment is one kind of optimization techniques. It has increasingly been used for the last step of feature based 3D reconstruction in the computer vision area. This method is used to minimize the distance between observed feature points and re projected feature points in terms of estimated parameters of a camera model. n i m j ij i j ijx X x P d v1 1 2) ), ( ( min (2 30) where ) (i jX x P is the evaluated projection of point iX on image jx and ) ( y x d denotes the Euclidean distance between image coordinates. Whe n the minimization problem is solved, the equation may have a sparse block due to irrelevances between points and camera. Thus a Levenberg Marquardt algorithm is also used in the optimization problem, which robustly converges even if the initial guesses be gin far away from the final minima. Estimation of s tructure A s seen before, given more than eight points, an 8 point algorithm can determine the relative R and T between cameras accompanied with a scale factor If 1 the obtained pose is equal to a translation with unit length. Now that the motion of a camera is determined the structure can be reconstructed based on the pose and the correspondences used for an 8 point al gorithm (source: Ma et al, 2004). Rewriting the basic rigid body equation (2 24) in terms of these parameters, PAGE 41 41 n jj j j j, 2 1 1 1 2 2 T Rx x (2 31) Either 1 or 2 is redundant. To eliminate 2 modif ying the basic rigid body equation by multiplying 2 x to both sides, we obtain n jj j j, 2 1 0 2 1 2 T x Rx x (2 32) which can be represent ed with a linear equation as 0 1 2 1 2 j j j j j jT x x R x M (2 33) where 2 3 jM 2 j for all n j 2 1 To give a unique solution, jM should have rank 1, which is not the case that points lie on the base line between the center of the cameras 1o 2o i.e. 0 2 T x To simplify the equation, defining ) 1 ( 3 n nM 1 n for j as 01 1 n j (2 34) T x Rx x T x Rx x T x Rx x T x Rx x Mn n n n n n 2 1 2 1 2 1 1 1 2 2 2 2 1 2 2 1 2 1 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (2 35) We obtain the same linear equation 0 M (2 36) where a vector which represents the depth of each correspondence up to a single universal scale is an eigenvector of M M T corresponding to the minimum eigenvalue. To obtain a nontrivial solution of M should be zero. As we have done at the e ssential m atrix, the solution is also obtained by applying SVD in which the column of M corresponding to the minimum eigenvalue become s the structure position. Since these spatial positions are solved PAGE 42 42 with the assumption that the translation was normalized, actual positions are determined by multiplying the actual distance between two cameras. Applied t echniques Since 3D reconstruction is essentially bas ed on a pair of images, a stereo vision system is widely used as shown in Figure 216. However, a moving single vision system can also provide multiple view, some researchers try to utilize it for 3D reconstruction. Multiple c amera : D epth is estimated by measuring a disparity between images. Since a stereo vision system provides two images with known configurations a reconstruction can be achieved with stationary motion. It is the best way to be robust against potential errors. Kise et al (2006) applied a stereo system to a weed detection and a furrow reconstruction. They assume d the camera was fixed on a vehicle which moves straight forward. T hree or more cameras can be also used for measur ing depth (Kim 2003; Klechenov et al. 2002) Shlyakhter et al (20 01) rendered a 3D vegetation model based on multiple real images. Each of input image is segmented into a tree and a background. S ilhouettes are used to construct a visual hull. Figure 217 shows that their system constructs a tree skeleton. This research assumes the extrinsic parameters are known. The surface of the model is formed based on V oronoi approximation. Single Camera : The cost of a stereo system is expensive and it is complicated to implement Many researchers are trying to overcome such potenti al errors which occur with a single camera. The issue in a single camera is mainly associated with the estimation of a camera motion. Dornaika, F. (2001) tried to overcome a parallax problem which makes occlusion s during translational motion s Nevertheless this parallax problem also provides the depth information. Zhu (2003) estimated depth information through stereo mosaic images as before. Taking images of nonplanar scene with translational motion, the left and right sides of an image with respect to PAGE 43 43 mo ving direction show the different lateral face of a structure. The stereo mosaic images generated by the left and right strip provide disparity information about correspondences as shown Figure 2 18. Augmented r eality : An A ugmented R eality (AR) is a risin g field of computer research which is associated with a combination of generated 3D graphics and real image as shown in Figure 2 19. Mostly fiducial markers are used to make it easy to estimate the camera motion. Once particular markers are detected, 3D mo dels are mapped based on the coordinates of markers. Simultaneous l ocalization and m apping : A Simultaneous Localization and Mapping (SLAM) is a technique mostly used by autonomous vehicle systems which navigate in the unknown environment. This kind of mac hine requires abilities to keep tracking its location and to generate mapp ed environment s accurately. Klein et al. (2007) showed state of the art advances in terms of markerless tracking and mapping. Markerless means that they keep generating new markers f rom new features, not particular fixed marker s They used e nhanced Kalman f ilter SLAM for predicting features. Their approach keeps many key frames as ground truth. Figure 220 shows features which are tracked and mapped on a real image. Different approa ches Various range sensors such as laser range sensor o r an ultrasonic sensor can be used to measure a depth in remote sensing. Groundbased remote sensing techniques ha ve been studied using laser range sensors or ultrasonic sensor s for the last half decad e. Laser Detection And Ranging ( L ADAR) : Even though aerial based remote sensing has a limitation to measure a depth (height of canopy), Meron et al. (2000) studied measurement of the shape and volume of the canopy with aerial photogrammetry. However a r an ge sensor is more popular with a tree specific management. Wei et al. (2004) developed a system measuring the PAGE 44 44 volume of the citrus canopy using multiple laser scanners As shown in Figure 221 A), their system scans like a semicircle, but the beam source m oves along the guide frame which is fixed on the ground. They designed the system for only one canopy. Ultrasonic s ensor : An u ltrasonic sensor can also be considered to measure the volume of the canopy. Zaman et al. (2005) developed a measurement system w ith an ultrasonic sensor and a DGPS receiver Several pieces of ultrasonic sensor s are attached to a pole on a moving trailer instead of a moving the direction of source s as shown in Figure 221 B) Tumbo et al. (2001) compared correlation s among laser, ul trasonic and manual measurements. They concluded that a laser measurement can provide a good estimation of canopy volumes especially in a grove where there are significant numbers of partially defoliated trees or small replants. The other advantage of the laser measurement over an ultrasonic system may be the high speed of data acquisition with the former. So far, s uccessful papers have been reviewed. These papers are achieved under their assumptions. It is important to take into account whether or not the ir assumptions are suitable enough to the field. PAGE 45 45 A B Figure 2 1. Vehicle applications A) Six wheel greenhouse sprayer ( Younse 2005) B) Autonomous tractor ( Subramanian 2005) A B Figure 2 2. Robotics applications. A) Vision based harvesting robot (Hannan et al. 2004) B) Vision system for aiming fruit ( Hannan et al. 2004) Figure 2 3. SIFT feature descriptor (Lowe, 2004). Figure 2 4. SURF 64 descriptor vectors at a n interest point ( Bay et al ., 2006) PAGE 46 46 Figure 2 5. Active contour m odel ( Molly 2001) Figure 2 6. Active mesh model ( Molly 2000) Figure 2 7. P inhole camera model Figure 28. Intrinsic camera parameters PAGE 47 47 Figure 2 9. Panorama from pure rotation. Figure 2 10. Homography (source: Ma, 2004) Camera path Base line Transformation A B Figure 2 11. Warping. A) Warping model. B) Registration after warping ( Kanazawa, 2004) PAGE 48 48 Figure 2 12. Color correction process ( Hasler et al., 2004) A) Original mosaic. B) Exposure compensation by extrinsic parameter. C) Exposure compensation and white balance with polynomial OECF. D) Exposure compensation and white balance with L aguerre OECF A B Figure 2 13. Time warping ( Rav Acha. et al. 2004) A) Features movement in time axis B) Patches divided by Voronoi diagram Figure 2 14. Epipolar geometry ( source: Ma et al., 2004) PAGE 49 49 Figure 2 15. F our possible cases of reconstruction (source: Ma, 2004) Figure 2 16. Stereo vision system ( Kise et al., 2006) Figure 2 17. 3D reconstruction of tree model ( Shlyakhter et al., 2001 ) PAGE 50 50 A B Figure 2 18. Parallel perspective stereo mosaic ( Zhu et al., 2003). A) mosaic model B) Depth map obtained from stereo mosaic. a ) Left Mosaic. b) Right Mosaic. c ) Depth map from stereo mosaics. Figure 2 19. Augmented r eality Figure 2 20. Augmented r eality with SLAM (Klein et al., 2007) PAGE 51 51 A B Figure 2 21. Measurement of canopy Measurement of canopy with A ) laser sensor s (Wei et al., 2004) and with B ) ultrasonic sensor s ( Zaman et al. 2005) PAGE 52 52 CHAPTER 3 OBJECTIVES Objective Statement The overall objective of this dissertation is to develop techniques required to generate a 3D canopy model base d on an actual citrus grove. Scope Th ese techniques can be used for scouting autonomous vehicle systems to inspect disease or growth status of plants throughout a global grove or for a harvesting robot systems to track and map fruit positions within a local canopy. Moreover, the 3D canopy model can provide farmers with a visual interface to manipulate unmanned systems remotely. In an effort to generate 3D models, range s ensor s could be used as well H owever, vision system s such as cameras are more commonly used. In my study, the primary focus will be on vision system s Feature points obtained from image s are approximated to vertices which forms a surface of 3D canopy mode ls Since a single image cannot cover the whole canopy a wide panoram ic image should be prepared to map the image on the surface of 3D model s Therefore image mosaicing approach is developed to make a panoram ic image, so that 3D models can be rendered. T o achieve these goals additional subobjectives are determined as follows: Sub o bjectives Generating a mosaic image of a grove scene Develop a feature extracting method customized for a grove scene, including image enhancement s Develop a feature tracking method which is robust against a parallax problem. F eatures tracked contribute to both a determination of the optical flow and registration for mosaic ing Generate a wide panorama image of a grove scene, including blending which can be useful for a s cou ting vehicle system PAGE 53 53 3D canopy model s R econstruct the surface of an actual canopy based on empirical measurements taken from both a vision system and a range sensor system. Motion models of the vision system and the range sensor system are associated with a s couting vehicle system and a h arvesting robot system respectively. PAGE 54 54 CHAPTER 4 METHODS AND PROCEDUR ES The objective of this chapter is to describe the technical procedures to meet each subobjective. Introduction Before describing technical procedures, it is necessarily to review application scenarios and limitations under which techniques were designed. My study can be applied to two different agricultural applications in a citrus grove: 1) scouting system and 2) a harvesting system. Each scenario is b riefly discussed in following subsections. Application Scenarios Scouting autonomous vehicle system It is not important what type of vehicle is used in terms of gathering information. A frontal steering type shown in Figure 41 is assumed in my study. Whe n a frontal steering vehicle tries to change directions, the rear axle has the least variation in coordinate position with respect to the lateral side of body. Therefore, it is assumed that sensors such as cameras or range sensors will be mounted on near t he rear axle of vehicle. As the vehicle gathers information from the canopy, moving along alleyways in a grove, it is assumed that the vehicle moves in a straight line motion and the camera is fixed facing the canopy in direction Z, orthogonal to the movi ng direction X as shown in Figure 41 B). Therefore it is also assumed that lateral variations are ignored even though the vehicle may oscillate upward and downward. Harvesting r obot It is not critical what type of robot is used for harvesting. An R1207 w hich has a series of axial rotary joints is used for my study as shown in Figure 42 A). Various devices such as a PAGE 55 55 gripper or sensors can be mounted on the endeffector of robot. The mounting position of these devices on the end effector is not important, as long as all extrinsic parameters are properly accounted for. A robot gathers images of the canopy, passing through various positions around the canopy as shown in Figure 4 2 B). It is assumed that the base trailer, which supports a robot system, is sta tionary while the robot is acquiring data. All techniques in my study were designed by assuming following environments and conditions. Assumptions and Limitations Citrus canopy scene Only citrus canopies and a citrus grove are considered as a target object. It is also assumed that there are only stationary citrus leaves without any artificial structures or moving objects in the scene. Other severe outdoor conditions such as direct backlights and windy conditions are also ignored. Video frame image Image q uality does not matter as long as features are discernable. H owever image filtering techniques were designed for the RGB color images to enhance feature extraction. Since most image processing techniques developed in my study are designed for sequential i mages and implementations are considered to work in real time applications, it is assumed that vision systems can record video clips. Single vision system Since a reconstruction associated with vision systems are typically carried out using on a two view geometry, where two or more images are required. As a result, stereo vision systems are widely used for a reconstruction. Since a stereo vision system can carry out reconstruction under stationary conditions, it could be appropriate for mobile vehicles sco uting. The PAGE 56 56 performance of reconstruction using a stereo vision system is strongly influenced by our ability to match image features. To insure high performance in matching, both cameras must be able to provide similar quality images. When considering robot ic manipulator applications, one must consider whether the weigh t size or volume of the stereo camera system will hinder robot maneuverability. As the robot harvest the fruit, it will continuously be moving in and out of the canopy T herefore a light wei ght low profile sensor systems is the best. Therefore my study uses a single camera vision system. Parts of frames are saved as base frames to get multiple views during exploring. Base frames are used until correspondences remain enough between frames. M otion m odel and l arge objects than view Since alleyways in an ordinary citrus grove are very long with respect to camera field of view, a single camera cannot capture the whole scene with sufficient detail. Therefore, we need to develop techniques that can reconstruct the whole canopy from successive images. Most literature which deals with 3D reconstruction has demonstrated their results in scenes where the whole objects are shown in every view. In this case, there is no problem to reconstruct objects with successful matching features. H owever, in the case of a canopy scene, since initial features extracted from the base frame are supposed to disappear at other frames taken from different views, it is necessary to continuously track features along the frame s. This is especially true for the base features, which are used for the estimation of extrinsic camera parameters, must be propagated precisely. Methods and Procedures The process using an image can be briefly categorized into two parts : image processing and geometric calculation. Image processing deals with features in three steps : Feature detection, Feature tracking and Feature matching. PAGE 57 57 A feature detection algorithm (Chap.5) extracts meaningful features from an image obtained from vision systems. Feat ure tracking (Chap.6) monitors and retains the change of feature position in the image coordinates during sequential images. It is also used to measure optical flow. Feature matching is used to match indices of features between images. Once an index of fea tures is arranged, then geometric applications such as image mosaicing (Chap.7) or 3D reconstruction (Chap.8) can be conducted. Sequential image mosaicing creates a wide panoramic image by means of stitching sequential images based on the optical flow. The wide image can be used for texture mapping images on virtual canopy models. A 3D reconstruction of a canopy surface is carried out with correspondences from feature matching. In addition, alternative 3D reconstruction using a LADAR & GPS (Chap.9) is conducted. The relationship among these p rocedures is shown in Figure 43. The 3D reconstruction can then be implemented to work with the harvesting robot model, and sequential image mosaicing and 3D reconstruction based on a LADAR & GPS for the autonomous scouting vehicle model. Conclusions This chapter has briefly reviewed the applications and assumptions considered in my work Technical details of these topics and conclusions are discussed in next all chapters. PAGE 58 58 A B Figure 4 1. V ehicle system A) Scouting autonomous vehicle ( Subramanian, V., 2006) B) Motion model of vehicle. A B Figure 4 2. Robotics sytem. A) R1207 robot manipulator. B) Motion model of robot. Figure 4 3. Overall procedures. PAGE 59 59 CHAPTER 5 FEATURE DETECTION The objective of this chapte r is extracting features from canopy scenes. Introduction It is more difficult to extract feature s from a natural landmark scene than an artificial well structure d scene Most feature based image processing methods have tried to find and extract features from salient artificial landmarks such as buildings, roads, etc. Natural landmarks like trees were considered more difficult objects. In my study, it cannot be assumed that there will always be artificial landmarks in a grove scene. Consequently, feature d etection should be able to extract features from a canopy scene which consists only of leaves or fruit. A ripe piece of fruit could be a good feature because it can be segmented by color. However, image processing in my study cannot depend solely on harves t seasons. Methods and Procedures In my study, a feature denotes a point feature. A corner is a feature used commonly because it is invariant to directions. A Harris corner detection method is basically used for feature detection. Initial feature points used for tracking are selected by using Harris corner detection. Harris corner detection used here allows us to set the number of corners and the minimum distance among corner points. The output of this detector has different results based on the region of interest because its output depends on a threshold which has to be applied in a gradient based on the given image. However, the image quality influences the corner detection directly. Therefore, it is necessary to maintain uniform image quality through t he use of image filters. PAGE 60 60 Image Enhancement Change in illumination which occurs outdoors influences the uniformity of image quality. Ordinary cameras are designed to adjust brightness and contrast automatically. In the case of an image obtained under strong sunlight or backlight, the contrast of the image becomes strong, or the histogram distribution is prone to be biased. Therefore, features may not be detected uniformly or sufficiently. Image filtering is used to equalize and enhance image quality acquired from a camera. To make feature extraction clear, several image enhancement techniques such as histogram equalization and s harp e ning can be applied. The simplest way to be invariant to illumination is to convert RGB into HSV color space, which is a linear transformation. The other way is to equalize histogram distribution. Figure 51 shows that the shadowed area becomes enhanced though histogram equalization. Another method is to use an embossing filter which regulates contrast and emphasizes edges. An ex ample of the effect of the embossing filter is shown in Figure 5 2. Figure 5 2 A) shows corners detected from a backlit image. Few corners were detected in the shadowed area. On the other side, corners were detected uniformly from an embossing filtered ima ge as shown in Figure 5 2 B). Results from the optical flow based on the corner detection were shown in Figure 5 2 C) and D). Figure 52 D) shows that an embossing filter gives more chances to detect features in a backlit image. These filtering combination s are helpful to keep a uniform image even with backlit scenes. Leaf Detection As long as a camera moves, features are prone to disappear. Therefore, robust features are needed that are able to withstand this disappearance. The aim of this approach is to f ind out the best combination of morphological operations at the initial step so that the best feature point is PAGE 61 61 chosen and it will last as long as possible. Features such as the tip and junction of a leaf could be promising features, but it seems that it is not only unstable but also expensive to calculate during the sequence. After enhancing image quality, an opening operation is applied, which is a combination of erosion and dilation. An opening operation can be expressed with B B A B A ) ( (5 1) where and are an erosion and a dilation operation respectively. An opening operation has the effect of simplifying complex textures into a simple level of intensity as shown in Figure 5 3 B). From there, a watershed segmentation is applied with results illustrated in Figure 5 3 C ) With a large mask for opening operation inadequately small segment can be avoided. Even though segmented, all patches might not be features. Since a patch in each segment has a d ifferent level of intensity from other patches shadowed leaves could disappear easily. To filter patches by suitableness a threshold is applied to the patches. Figure 5 4 illustrates an intensity patch in a segment. T he threshold increases from zero unti l an area of the binary image becomes a quarter of the area of the segment. When it reaches this condition, the center point of the binary image is recorded This algorithm is also implemented automatically to ignore thinner segments by computing the ratio of area. Once a patch over the threshold is selected, compute the center of the patch as shown in Figure 5 5 A) At this moment, to reflect the actual image, an other opening operation is used again which has smaller mask like Figure 5 5 B). After the thr eshold is overcome and the segments are selected, c onsidering all of the patches, Figure 5 5 C) is finally obtained. These feature points provide the initial position for feature matching. In summary, the entire procedure is shown in Figure 56. PAGE 62 62 Results Fr uit Detection Even though leaf detection is designed to detect the center of leaves, it can also detect the center of fruit. Figure 5 7 A) and B) shows the result of using a ripe citrus fruits scene and an unripe citrus fruit scene, respectively. Most frui ts are pointed regardless of their color. There is a problem separating fruits and leaves in terms of fruit detection. However, it would be useful if fruit candidates were chosen regardless of color because the color based fruit detection approach cannot be applicable to green fruit such as limes. Counted number of fruit was shown in Table 51. Mature fruit was dete cted more than immature fruit. Size Problem Since a segment size cannot fit to a leaf size automatically, it is necessary to adjust the size fa ctor appropriately depending on the image status. Figure 58 shows the results of applying inappropriate size factors. It is not necessary to fit a segment to each leaf. However, segmentation could vary and features may be unstable. Conclusions Outdoor sce nes are exposed to unbalanced illumination such as shadow or backlight. Therefore, it is necessary to apply image filters in order to enhance image quality. An embossing filter detects and shows features uniformly. A leaf detection method was devised to provide m ore stable features than corner detection The key point of leaf detection is to segment leaf areas through the combination of morphological operation s Once segmented, the center of a leaf in each segment is set as a feature point. Leaf detection was designed to apply discriminating thresholds to each segment to secure as many features as possible. However, it is not used in every frame. Feature detection is PAGE 63 63 carried out only when new features are needed because the detection may not always give the same points repeatedly. Future Work A segment size is a critical factor in leaf detection. Therefore, it is necessary to develop a method to adjust a segment size automatically. It is expected to find out appropriate factors by varying the size of segmen t iteratively. PAGE 64 64 Table 5 1. Comparison detected number of fruit. Figure 5 7 A) Figure 5 7 B) Number of fruit 15 14 Detected number of fruit 13 8 Detection rate 86.6% 57.14% 0 0.5 1 1.5 2 2.5 3 3.5 4 x 104 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 A 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 x 104 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 B Figure 5 1. Histogram equalization. A) Original image and its histog ram B) Equalized image and its histogram. A B C D Figure 5 2. Embossing filter effect. A) Detected features with an original image and. B) Detected features with an e mbossing filtered image and a mosaicing image based on the optical flow. C) Mosaicing image based on the features from an original image. D) Mosaicing image based on the features from an embossing filtered image. PAGE 65 65 A B C Figure 53. Leaf segmentation. A) Original Image B) An o pening with large mask applied to the image C ) A watershed segmentation applied to the image segment threshold intensity Figure 54. T hreshold a djustment A B C Figur e 55. Detect the center of segment. A) Segmented patch. B) I mage applied an opening with small mask. C) D etected feature points Image Sharpness Opening (Large Mask) Watershed Segmentation Selecting segment by threshold Opening (Small Mask) compute Center of segment. Figure 5 6. Morphological operations of leaf detection. PAGE 66 66 A B Figure 5 7. Fruit segmentation. Fruit segmentation A) wit h a ripe fruit image B) with an unripe fruit image. A B Figure 5 8. Segmentation size. Fruit segmentation A) with a zoomedin scene. B) with a zoomedout scene. PAGE 67 67 CHAPTER 6 FEATURE TRACKING The objective of this chapter is tracking features between fr ames. Introduction T emplate images obtained near every feature point are used to compare how the feature points are moved during tracking features. Tracking is a process to measure the motion vector of a feature point. Although a great number of methods have been suggested, existing methods that extract feature points based on specific image s have limitation s that apply to various type s of image s in general. An ambiguous scene or occlusion can cause tracking failure. To keep tracking robustly, an active mesh method was used in my study. The basic concept of active mesh in terms of tracking is that the tracking does not depend only on the image, but also focuses on the geometric relationship among connected feature points. This approach makes it more robust to track in the case that only the camera is moving. Since f eatures in outdoor scenes are exposed to a variety of disturbances, they are prone to blink or disappear as shown in Figure 6 1. When they lose their capacity as a feature, new features must be de tected again. Such short lived features lead to accumulated errors Therefore, it is necessary for features to survive as long as possible. Methods and Procedures Mesh based tracking is an extended application of an active contour. An a ctive contour define s linear relationships between nodes while an active mesh takes care of multiple connections. Meshes can fit to gradual deformations and are regarded as the surface of an object. Blinking features which repeatedly appear and disappear can be sustained through developing these meshes. Molly (2001) proposed a basic idea about an active mesh method and suggested various potential features to be forces. In my study, part of the equations and features were employed for force equations. PAGE 68 68 Active Mesh In order to be robust to sudden changes of features it is necessary to reduce the effects of individual features. An active mesh considers the movement of overall features by connecting feature points. To define the relationship with neighboring feature points (nodes), meshes are generated by Delaunay triangulation. Optical flow is regarded as a force vector in this scenario. The new position ) ( y x P of feature i at frame thk is estimated as follows: i k i k iS P P ) 1 ( ) ( (6 1) where iS is the sum of the resultant forces and is the weight factor. n j k j i iF S1 ) ( (6 2) S F Fk j i k j i ) 1 ( ) ( (6 3) where ) ( k j iF which is a resultant force of neigh boring node j at frame thk Internal j External j k j ibF aF F, ) ( (6 4) where External jF, is affected by image intensity, and Internal jF, is based on the relationship between nodes. Fixed ratios a and b are adjusted experimentally ( a =0.85, b =0.15) with the sum of resultant force iS reflected on the center feature ) ( k iP as shown Figure 6 2. External force External forces External jF, are defined by the difference of neighboring node jP between frames thk and 1 thk ) 1 ( ) ( ) ( k j k j k j External jP P F (6 5) PAGE 69 69 where ) ( k j i s the weight factor of neighboring node j at frame thk The variable is the inverse relationship determining that the farther the node, the less influential it will be on the calculations. How ever, for this to be a reasonable assumption, the elevation difference between the center and neighbor node cannot be too large. n l k l k jd dk j1 ) ( ) (1) ( (6 6) ) ( ) ( ) ( k i k j k jP P d (6 7) where ) ( k jd is distance between current node ) (k iP and neighboring node ) (k jP Tracked features give us the degree of correlation and the position ) (kP Internal f orce Internal forces Internal jF, are defined by an elastic coefficient which represents the relationship between feature points. ) ( ) ( ) ( k i k j k j Internal jP P F (6 8) where ) ( k j is the elastic coefficient which is a ratio of difference between previous ) 1 ( k jd and current distance ) ( k jd to distance ) ( k jd ) ( ) ( ) 1 ( ) ( k j k j k j k jd d d (6 9) Internal force not only makes the calculations less sensitive to the high frequency motion, but sustains the previous force when features disappear temporarily. Feature management Features are expected to disappear when the camera moves, some features, go out of the field of view boundary, while others disappear due to occlusion or are overlapped with other PAGE 70 70 features. These features should be deleted immediately so that new feature points can be added to maintain the total number of points. New features are selected among candidates that are the farthest from the existing feature points. Simulation Results T he performance of the active mesh was examined using a virtual object and 2D sinusoidal camera motion. The testing code was im plemented with M ATLAB While the camera moves, some features are intentionally discarded and overall optical flow is estimated. When absent features are ignored the motion estimation appears turbulent. On the other hand, active mesh tracking follows original motion with sustainable features as shown in Figure 64. Each axis indicates 2D motion coordinates in the plot. When an occlusion of over 10% occurs, the estimation diverges as illustrated in Figure 64 B) because too many occlusions violate the assumption of an active mesh. Figure 6 4 C ) shows that an active mesh estimates the input optical flow better than a normal estimation. This experiment examined only the effect of occlusion, and does not represent local acc uracy. Multilayered Active Tree An active mesh method has several disadvantages. Because it searches all nodes exhaustively, computation speed is slow Furthermore wrong estimation can still influence normal neighboring nodes. To enhance the shortcomings of an active mesh method, a multilayered active tree method was invented in my study. Multilayered active tree expanding from a 2D mesh to a 3D hierarchical structure as the active mesh extended from an 1D line to a 2D mesh Generation hierarchical structure Scale invariant methods use coarseto fine multiple images referred to as pyramids. The basic idea of a multilayered active tree is to connect nodes between multiple images. F irst PAGE 71 71 multiple coarse to fine images are prepared by applying Gaussian filtering as shown in Figure 65 A) Feature detection is carried out at each image as shown in Figure 6 5 B) Then a Voronoi diagram is applied based on features at each layer as shown in Figure 66 C). Features within a segment of an upper layer are grouped a nd belong to the segment as shown in Figure 67. This procedure is repeatedly conducted for every segment and every layer. Therefore, this loop can be implemented by a recursive function as show in Figure 6 8. Features form a hierarchical structure which i s automatically generated as shown in Figure 69. Since exploring is based on hierarchical paths, it is faster and less influenced by neighboring nodes than an exhaustive active mesh. Common features between layers are regarded as parent nodes. Features detected in the coarse image can be regarded as dominant features. Figure 6 10 shows dominant features. External force and internal force Once the hierarchical structure is built, the active mesh method is appl ied at each layer Basic formulas are the same as those of an active mesh; however, the exploring path is different. An active mesh considers neighboring nodes, while a multilayered active tree considers subnodes. Figure 6 11 shows the difference between an active mesh and a multilayered active tree in terms of forces. Thick arrows in Figure 6 11 B) denote a large weight factor. Since all nodes in an active mesh are equally handled, unnecessary evaluations are repeatedly conducted on the same node. In addition, some nodes that are erroneously estimated can influence other nodes. On the other hand, because it is possible to assign discriminating weight factors on dominant or high level nodes in a multilayered active tree the effect of unstable nodes can be controlled. Figure 612 shows how forces are applied to sub nodes. In this example, 3rd node is a top dominant node. Computation starts from the top node. Once new top nodes (N3) are determined by an external force, they become a reference (R3) for PAGE 72 72 their subnodes. New nub nodes (N2) are determined by b oth an external force and an internal force with altering weight factors. In the equation (6 4), weight factors are determined by an angle between optical flow vectors of upper and lower nodes. The sum of weight factors is equal to 1. When a noise feature makes the difference, weight factors enable to refer to the internal forces more than the external force. Simulation and r esults This experiment tests a multilayered active tree s robustness against to the effect of noise and occlusions by means of estima ting the optical flow. The camera motion model wa s the same with the active mesh. Three layers were generated for the hierarchical structure and 19 feature frames were used. It was assumed that features were consistently detected for the noise experiment. An image obtained through a Gaussian filter possesses the relatively small number of features yet the features are prone to be less sensitive to noise. Therefore, smaller noises are assigned to higher nodes. Noises are added to all frames with random varia tion. Figure 614 shows the result in optical flow estimation. Since noise could influence neighboring nodes in the active mesh, errors could be accumulated as process goes. However, in the multilayered active tree, since the lower nodes noise does not af fect neighboring nodes or higher nodes, errors were not accumulated as shown in Figure 614 A) Figure 6 14 B) shows that t he norm error of a multilayered active tree was smaller than that of noise added features. An occlusion case was simulated by random ly removing some nodes in a certain frame. A multilayered active tree holds occluded nodes during occlusion frames. Figure 615 shows that occlusion did not affect its optical flow estimation after occlusion occurs Therefore, multilayered active tree tracking method can be regarded less sensitive to the sudden change of optical flow. PAGE 73 73 Conclusions To be unaffected by sudden changes in feature points, an active mesh feature tracking method was employed. Basically, an active mesh consists of two forces, an ex ternal force and an internal force. The external force refers to feature points with a weight factor, and the internal force is determined by an elastic coefficient. The ratio between forces is experimentally adjusted. A simulation for occlusion was conduc ted with a virtual object and a sinusoidal camera motion. The effect of occlusion was tested by estimating the overall optical flow with different number of absent feature points. The results in simulation showed that it is effective with a small amount of occlusions such as blinking and disappearing features. However, even though an active mesh is resistant to the sudden changes, it requires exhaustive computation and it is possible for erroneous features to disturb normal features. Therefore, a multilaye red active tree method was invented, which expands a 2D mesh into a 3D mesh. The creation of multiple layers is based on coarse to fine images by means of Gaussian filtering. Features extracted from each image recursively form hierarchical structures by co nnecting subfeatures within an area which is segmented by a Voronoi diagram. Since multilayered meshes can evaluate the strength of features, it can control the disturbance from unnecessary features. Subsistence of features was evaluated with the comparis on of optical flow. The optical flow through these methods resulted in resistance to noise. The computation speed of was faster over ten times than that of an active mesh. Future Work The next step is to apply the algorithm to actual sequential images. Sin ce this algorithm works with several assumptions, it is necessary to verify how appropriate those assumptions are on the actual sequential images. To compare the accuracy of optical flow estimation, the camera motion must be known. PAGE 74 74 50 100 150 200 250 0 20 40 60 80 100 120 140 160 0 20 40 60 80 100 120 140 160 180 X Figure 61. Life ti me of each feature for during sequence SiSidjFi,jFj,int Fj,ext ) (k jP ) ( k iP Figure 62. Sum of forces acting on each node -15 -10 -5 0 5 10 1 0 2 4 6 8 X X Y Y X X Z Z Y Y X X Z Z Y Y X X Z Z Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y X X Z Z Y Y X X Z Z Y Y X X Z Z Y Y Z Z Z Figure 63. V irtual objects and 2D sinusoidal camera motion. PAGE 75 75 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 -1.5 -1 -0.5 0 0.5 1 1.5 origianl estimated normal A -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 2 origianl estimated normal B -1.5 -1 -0.5 0 0.5 1 1.5 -1 -0.5 0 0.5 1 origianl estimated normal -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1 -0.9 -0.8 -0.7 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 origianl estimated normal C Figure 6 4. Optical flow estimation. A) 30.54% occlusion (5025/16384) B) 18.99% occlu sion (3112/16384) C) 10.53% occlusion (1725/16384). Original Size=5, Sig=5 Size=10, Sig=10 Size=15, Sig=15 A 50 100 150 200 250 300 0 50 100 150 200 50 100 150 200 250 50 100 150 200 50 100 150 200 250 20 40 60 80 100 120 140 160 180 50 100 150 200 250 40 60 80 100 120 140 160 180 200 B Figure 6 5. Nodes generation from multiple image s A) Generated coarse images using Gaussian filtering. B) Detected corners based on each image. PAGE 76 76 0 50 100 150 200 250 300 350 0 50 100 150 200 250 Figure 6 6. Voronoi segmentation applied to each image. 0 50 100 150 200 250 300 20 40 60 80 100 120 140 160 180 200 220 40 60 80 100 120 140 160 180 40 50 60 70 80 90 100 110 120 130 Figure 6 7. Grouped features in a segment of an upper layer. PAGE 77 77 function [ Nds_r ]= nds_vpts ( vPts_, lv_, pIdx_ ) if lv_<2 return; end; vPt = vPts_{lv_}; vPt_1 = vPts_{lv_ 1}; for i=1:size(pIdx_,2) Pt_ = vPt_1.Pt; polyPt_ = vPt.vP ts{pIdx_(i)}; pIdx = inpolygon(Pt_(1,:),Pt_(2,:), polyPt_(1,:),polyPt_(2,:)); pPt = Pt_(:,Idx_r); if isempty(pPt) continue; end; Idx = find(pIdx==1); Nds_r{i}.Idx = Idx; Nds_r{i}.Pt = pPt; Nds_r{i}.Nd = nds_vpts ( vPts_, lv_ 1, Idx ); % recursive calling End Figure 6 8. MATLAB code of recursive grouping function. 50 100 150 200 250 -100 -50 0 50 100 150 200 250 300 50 100 150 200 250 20 40 60 80 100 120 140 160 180 200 220 50 100 150 Figure 6 9. Automatically generated hierarchical connection based on an image. PAGE 78 78 Figure 6 10. Determined d ominant features. Image Considering Node Neighbor Node External Force Internal Force Internal Force Neighbor Node External Force External Force A Image External Force Internal Force Neighboring Node Sub Node Sub Node Internal Force External Force External Force External Force External Force Considering Node Dominant Node Internal Force Internal Force B Figure 611. F orce s acting on the considering node. A) Active mesh. B) Multilayered active tree Feature N3 P3 R3 P2 FExternal P1 N2 R3 R2 N1 FInternal FInternal FExternal FExternalprevious current Feature Feature Top level Bottom level Figure 6 12. Procedure to determine sub nodes. PAGE 79 79 5 -4 -3 -2 -1 0 1 2 3 4 5 -0.5 0 0.5 1 -1 0 1 2 3 4 X Z Y X X X Y Y Z Z Z Figure 6 13. Camera motion model in the test for multilayered active tree 10 20 30 40 50 60 70 80 90 10 0 10 x axis [pixel]y axis [pixel]Optical Flow Estimation Original Noise Estimated Norm Errors A 0 2 4 6 8 10 12 14 16 18 20 0 1 2 3 FrameNorm Error [pixel] Noise Estimated B Figure 6 14. S imulation for robustness to noise A) Optical flow esitmations B) Norm errors between noisy optical flow and compensated optical flow. PAGE 80 80 0 20 40 60 80 0 10 20 30 x axis [pixel]y axis [pixel]Optical Flow Estimation Original Occlusion Estimated Norm Errors A 0 2 4 6 8 10 12 14 16 18 20 0 10 20 30 40 FrameNorm Error [pixel] Noise Estimated B Figure 6 15. S imulation for robustness to occlusion. A) Optical flow esitmations B) Norm errors between occluded optical flow and compensated optical flow. PAGE 81 81 CHAPTER 7 SEQUENTIAL IMAGE MOSAICING The objective of this chapter is to create a panoramic image with sequential images. Introduction A groundbased scouting system which inspects growing states of plant s or detects spreading diseases early can b e one visionbased agricultural application associated with unmanned management. An aerial based or satellite based scouting system takes only the top view of canopies, whereas a groundbased scouting system can take the lateral view of canopies closer. Th erefore, g round based scouting system s are a more effective way to gather indispensable information of canopies Machine vision systems are also popular devices used in remote sensing. However since cameras have limited viewing angles, a huge number of i mages may be need ed to cover a whole grove. To reduce saving capacities and increase the effectiveness, image mosaicing technique s are commonly used to stitch images taken from different viewpoints. Most literature dealing with image mosaicing ha s assumed a planar scene in which the shape s of even nonplanar object s are not changed. Unchanged shapes make it easy to stitch images together. This assumption can be achieved only when a camera is purely panning or is far enough away from object s as a satellite. Otherwise the mosaicing is prone to failure because some features can be occluded or deformed, which is called a parallax problem. In two images taken from different positions as shown in Figure 71, for instance, the right side of an image at the left po sition and the left side of the other image at the right position of a camera could be different Maintaining shapes in the scene is important to find out correspondences. Since the deformation of a feature point occurs less between video frames, losing th e correspondences can be minimized by tracking the feature points. PAGE 82 82 Methods and Procedures A mosaicing procedure is briefly divided into six stages: Image Acquisition, Feature Detection, Feature Matching, Estimation Homography, Warping and Blending. The bas ic process of mosaicing is to determine and align correspondences between two given images. Therefore, those two images should have enough area that overlaps Once correspondences are determined, the extrinsic parameters of a camera can be also estimated Then warping and aligning target images are achieved based on a homography between the images. Figure 7 2 shows a typical image mosaicing procedure. The structure of the algorithm consists of four essential functions: Initialization, Filtering, Tracking an d mosaicing. Figure 73 shows a flow chart of the algorithm. Projection M odel Most image mosaicing methods conform to either spherical or cylindrical projection models. Both projection models have a single center position of a camera. The image mosaicing using these models is carried out by the pure rotation of a camera. Since there is no translational motion, a homography becomes simply a rotation matrix. Td TN RH 1 ) 0 ( T R H (7 1) In the pure rotational motion, no parallax occ urs with a nonplanar scene. Therefore, warping and matching are conducted simply. However t hese projection models are limited because they cannot be applied any longer if a camera moves far enough away. In a groundbased scouting system, the center of a camera moves along a grove alleyway. Assuming that cameras are placed on the rear wheel axis of the vehicle and the path of cameras is almost a straight line, only variations in the X and Y axis can be considered without PAGE 83 83 perspective warping computations. Since there are small variations in the z axis between successive frames, the optical flow indicates camera motions directly. Alignment An alignment is a process to register images with a distance along the motion vector between successive images. An alignment approach in my work was designed to remove the partially overlapped area of an image in order to register it along an edge perpendicular to a line which connects the center of two images. Since a parallax may cause incongruity wh ile mosaicing, this cropping edge approach tries to minimize the parallax by leaving a partial image around the center. This approach assumes that a parallax increases as a camera moves farther away from the center of an image plane as shown in Figure 7 5 A). This assumption can be applied only when the camera direction is orthogonal to lengthy objects. Therefore, the alignment can be approximated by cropping into a part of the image. Figures 75 B) and C) show the process that determines the edge cropped o verlapped area. Only case A) was used within my work. To remove the overlapped area along the cropped edge, an alpha channel was used. An alpha channel is a gray mask image which controls transparency between a given image and background. First, you look f or intersecting points between frame edges and a line perpendicular to another line between the centers of frames. Once the cropping edge is determined, it should be decided which side would be removed. The partial area to be removed should be located betw een the centers. Then, you can make boundaries based on these edges and apply a floodfill function to the area on the alpha channel. By controlling the alpha channel, the overlapped area can be blended smoothly Regardless of which direction the camera moves, the partial area between centers would be removed. PAGE 84 84 Filename N umbering If the registered image gets too big, it may be a problem to process. Thus big image s should be cut with a fixed size whenever it surpasses twice the regular size. Precisely cropped images can be joined together whenever we want without any additional computation later. The file names are named with an index number similar to a matrix. Figure 7 6 A) shows an example of the numbering of divided mosaic images. Experiment Configuration T o be compatible with coordinates of VRML later, a negative focal length was employed like that of OpenGL. C onfiguration is also shown in Figure 77 B). A vehicle to which a camera is mounted on the real axis moves along the alley way in a grove. Figure 77 A) shows a situation when camera moves in a sinusoidal path The variation along the Y axis is due to vibrations of a vehicle while moving. Parameters assumed for simulation are shown in Table 1. Simulation Before applying the algorithm to an actual grove scene, the estimated camera path was compared with a known input sinusoidal path as shown in Figure 78. The current coordinates are based on the previous frame image. Since the image coordinates are discrete integer s roundoff errors accumulate with resp ect to the camera path as the process continues Cumulative errors can be corrected by using other positional sensors such as a DGPS receiver This algorithm was examin ed with pre recorded video clips rather than real time video input. To observe the performance of this algorithm, it was compared with a panorama demo program built in videoprocessing toolbox of M ATLAB This video clip was taken from a camera moving horizontally rightward To make it easy to track feature points, salient points were scattered artificially and a tetrahedron was adhered in order to form a non planar object. PAGE 85 85 Since the panorama demo program leaves behind the left part of the image, Figure 7 9 A) shows that the result ing image is skewed favoring the right side of the object, where as the developed algorithm leaves a direct frontal view as shown in Figure 79 B) Figure 7 1 0 shows the result when the position A) and direction B) of the camera moves with a sinusoid pattern Notice how images have been cropped and stitched. Results Real grove scene video clips used in this experiment were recorded in the UF grove on May 23rd 2007. A camera was fixed on a tripod and a cart that feigns a vehicle, and pulled by hands with a speed of less than 1mile/h. M osaicing result s for the grove scene are shown in Figure 7 11 through Figure 713. First, video clips were recorded far enough away from the citrus grove. In this case, the vibration of the vehicle had little effect Since this scene can be regarded as a planar object, the mosaicing is condu cted quite well. Figure 7 12 shows a mosaicing result which was recorded closer to the grove so that leaves can be observe d in detail. This result looks blurred because of wind s It is very difficult to solve this problem without information about the glob al location of the vehicle. In my work, the wind waving problem was not considered. Another problem found wa s a gradual deviation. The m osaicing image is very lengthy. If the horizontal axis of an image and the direction of a camera motion are not parallel, the mosaic ing image will be out of the upper or lower boundary before long as shown in Figure 713. Therefore, the angle of camera should be aligned horizontally before recording. It was designed for l arge size images to be automatically divided into fixed sizes when the width exceeded the twice width of the fixed size as shown in Figure 7 13. PAGE 86 86 Conclusion In my work, video image mosaicing for a nonplanar object was studied. The developed algorithm estimates the motion vector of the camera. When a camera moves parallel to objects, cropping edge approach can approximate mosaicing of even nonplanar objects such as a citrus grove. This video mosaicing was achieved in spite of moving irregularly. This result shows that the cropping edge approach can be approp riate to mosaicing. Any well designed image processing algorithm does not always yield successful results in every case. Since the algorithm developed in my work assumes that successive images are similar, that is optical flow can be measured, it makes an unexpected error when excessive changes of view occur. A few errors can affect the success of the entire mosaicing. Therefore, it is most important to track feature points robustly and capture video stably. The successful result of image mosaicing depends on how accurately feature points are tracked. Since the image coordinates are discrete integer s the estimation errors in terms of camera path may be accumulated as the process goes. A noticeable problem from the results is a deviation from the horizontal image axis. Figure 73 shows that the camera motion and the horizontal axis of an image are not parallelized. Since the mosaicked image is very lengthy, it may not be within the margins of the upper or lower boundaries after long unless the horizontal axi s of the image and the direction of the camera motion are substantially parallel or corrected using some appropriated sensors. PAGE 87 87 Table 71. Intrinsic parameters for simulation V ideo image M osaic image F ocal length Metric C onversion N umber of frame 320 x240 640x480 460 300 dpi 64 Overlapped Scene camera Figure 7 1. Parallax problem occurring with nonplanar objects which have depth. Camera Image sequence Feature Detection, Feature matching Registration Blending Homography, Warping Regulation Figure 7 2. Basic steps of image mosaicing PAGE 88 88 Feature points Mosaic image avi points image patches points points patches points image Adjusted image createTempletes filterImage detectPoints Initializing patches matchingTempletes applyingForce Feature Tracking modifyingFeatures determiningExtrinsic Homography image image warpImage cutImage Image Mosaicing overlapImage divideImage Mosaic image Mosaic image While loop Video png Initial points filterImage image tri_nodes tri_nodes Mosaic Image Mosaic Image Figure 7 3. D iagram of a lgorithm structure Base line Camera PathImage Planes Transformation Figure 7 4. Translation projection model PAGE 89 89 Camera Planar Distorted Object Image Plane Camera Planar Distorted Object Image Plane A Planar B C Figure 7 5. Alignment. A) Planar assumption. Registration with B) plane image and C) projective warped image 11.png 42.png 36.png Path A Y Z camera B Figure 7 6. File numbering. A) Numbering of divided mosaic image B) Groundbased scouting system -15 -10 -5 0 5 10 1 0 2 4 6 8 X X Y Y X X Z Z Y Y X X Z Z Y Y X X Z Z Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y Z Z X X Y Y X X Z Z Y Y X X Z Z Y Y X X Z Z Y Y Z Z Z A B Figure 7 7. Configua tion of image mosaicing simulation. A) V irtual nonplanar m odel B) Coordinates of camera. PAGE 90 90 0 500 1000 1500 100 120 140 160 180 200 220 240 Path of Camera X axis [pixel]Y axis [pixel] 0 20 40 60 80 0 5 10 15 20 25 30 35 40 45 50 Norm Errors Frame [fps]Error [pixel] detected position input position Figure 7 8. Evaluation of positioning error or camera A B Figure 7 9. Cropping edge mosaicing effect. Comparison between mosaicing results from A) M ATLAB an d B) developed algorithm A B Figure 7 10. Mosaicing with sinusoid motion. A) Translational sinusoid motion and B) Rotational sinusoid motion. Figure 7 11. Mosaicing result from video clip recorded far enough away from citrus grove PAGE 91 91 Figure 712. Mosaicing result f rom 5 times zoomed in video clip. Figure 7 13. Mosaicing result from 10 times zoomed in video clip. PAGE 92 92 CHAPTER 8 VISION BASED 3D RECONSTRUCTION The objective of this chapter is to reconstruct the surface of a citrus canopy s o that the robot harvesting system can map and navigate fruit positions later. Introduction A 3 D reconstruction of the canopy can be not only used to measure the volume of the canopy, but also to visualize the canopy in virtual space. To reconstruct the su rface of the canopy, various sensors could be used V ision systems are representative devices used for detectin g the features of an object F eatures extracted from the canopy become vertices of the surface of the canopy The depth of a vertex is determined based on two view geometry. To reconstruct a structure, two sets of matching points and camera motion are needed, where matching points are projections of the actual 3D surface. When both the 3D position and 2D projection are given, camera motion can be e stimated. The inputs of all st eps require the others outputs Therefore, it forms a circular loop. Figure 8 1 shows this relationship and parameters among projection, motion estimation, and reconstruction. This loop is prone to diverge due to the accumula tion of errors unless groundtruth is given. An 8 point algorithm based on twoview geometry is widely used in reconstruction. The s olution of twoview geometry is determined by Singular Value Decomposition (SVD) or i nverse m atrix operation as introduced in Chapter 2. Since correspondences are coupled in the 8 point algorithm, errors that occur at some of the correspondences influence the existing reconstruction. Additionally, re projection errors continue to increase. Since the robot manipulator used in my study provides reliable motion of the end effector, a Plcker coordinates system was applied to the 3D reconstruction. This approach calculates correspondences individually with a given camera motion. A pair of over 8 points is not required, PAGE 93 93 and local errors do not affect the existing reconstruction anymore. Because the Plcker coordinate s system mainly uses the dot product and the cross product, its expression s about 3D computation is considerably simpler than ordinary triangular equations It does not require any complicat ed matrix operations including SVD or i nverse matrix In addition, computation cost s can be less expensive in certain computing systems which are specified for the matrix operation s Methods and Procedures This section shows how to ma nage the quality of interest points and discusses known problems with an 8 point algorithm. The alternative approach using a Plcker coordinates system is proposed as a counterproposal to the reconstruction. Simulation and indoor experiment are carried out to confirm results through re projection error. 3D Reconstruction using an 8 Point Algorithm The basic procedure of an 8 point algorithm is to estimate a relative camera motion, and then reconstruct the vertices based on the estimated camera motion. Since the estimated camera motion is relative, the origin coordinates must always be given Therefore, base vertices are considered for the initial origin coordinates. Markerless approach The base v ertex is a known basis, called marker to estimate the origin coordinates. M arkerless means that 3D reconstruction is carried out without known markers. Since the harvesting robot motion model assumes that the frames get out of the initial frame, it loses the origin coordinates eventually. One possible way to achi eve markerless, reconstruction is to continue adapting stable vertices as base vertices from reconstructed vertices. PAGE 94 94 Initial base vertex model Since an icosahedron has many vertices which are not placed on a plane, and it is easy to make precisely, icosah edron base vertex models like Figure 82 were made to provide the initial known base vertex and compare the accuracy of reconstruction. Interest point manager The aim of interest point manager is to judge and manage the quality of Interest points Two vie w geometry assumes that given sets of correspondences are exact projections of identical vertices. In practice, input sets of correspondences are contaminated by a variety of noises or outliers. During the processes, even one outlier can cause the failure of the whole reconstruction eventually in an 8 point algorithm Therefore, it is required that the interest point manager is capable of measuring and managing the quality of i nterest point quite strictly. Figure 8 3 shows a flow chart of the interest point manager. Since a list of interest points on each frame does not have an index, an additional index list is assigned where it adds and deletes elements of the list when events happen. There are three sets of lists for interest points. Interest point set (C F*) is used to track and reconstruct correspondences. Base point set (BM*) is used to estimate the camera motion. Candidate set (BE*) is used to judge which interest point is good or bad to be a base vertex. When a camera moves, four events occur i nteres t point selection, interest point removal, base vertex selection and base vertex removal. Interest point selection: The rule of adding a new interest point is to choose new interest points in empty area farthest from existing Interest points as show in Fi gure 8 4. This rule allows interest points to survive longer while the scene moves, as well as to avoid concentrating on one spot. The steps are as follows. 1. Extract an adequate number of candidate interest points. 2. Compute norm of difference between each ne w point and existing points. 3. Sort the list of norm. PAGE 95 95 4. Take as many new points as needed, which have the largest norm in the rest of the list. Figure 8 4 A) illustrates a case that a camera moves leftward. The leftmost interest points will be picked up firs t, then the next middle points will be repeatedly picked up. Interest point removal: As the camera moves, interest points are also supposed to change. When some of interest points are overlapped or go outside of frame edges, they must be removed in the index list. Since interest points not picked up as a base point have no information to track anymore, they should be discarded as soon as they disappear. Once the Lucas Kanade tracking function in OpenCV library, for example loses some of interest points, it places them into the nearest corners rather than discards them. This behavior is quite unnecessary for 3D reconstruction. This phenomenon always occurs around the frame edges. One solution is to force to discard them before they reach the frame edges. The optical f lows direction determines which edge is chosen, and its maximum speed establishes how thick the area will be. E ight pixels were selected for the edge thickness in this experiment Occasionally, some of the i nterest points are not placed on corne rs. In this case, they are prone to fluctuate around their initial position. Those altering movements lead to the wrong reconstruction. One of the solutions to these unstable interest points is to measure the difference between the input position and the r e projected position with respect to reconstructed vertices, which is called reprojection error. Figure 8 6 shows that some difference between input position ( projected position (x) occurs when an interest point is placed on a noncorner. When a re projection error of an interest point exceeds a threshold, it is discarded. Through repeated experiments, t he threshold was set at 0.015 [in]. Base vertex selection and removal: Reconstructed vertices which have small re projection errors or converge s tably are picked up as new base vertices. When stable interest PAGE 96 96 points between frames converge into constant positions, the reconstructed vertices corresponding to them become candidates for the base vertices. A base frame provides the second view to comput e reconstruction. When base vertices are chosen, the base frame is also captured and is held until the next selection event. To estimate convergence, differentiation and standard deviation of re projection error is observed between frames. If the reconstruction is adequate, differentiation and standard deviation of re projection error will be almost zero. Figure 8 7 shows examples of a stable vertex and an unstable vertex. Unstable Interest points are prone to continue fluctuating as shown Figure 8 7 B). In a way, it is necessary to limit the number of samples to compute statistic quantity, thus minimizing the effect by momentarily wrong measurement. Figure 8 8 shows the statistical quantity of each vertex by using different sizes of circles. Large statistic al quantity means instability, as a perfect measurement leads it to zero. The minimum error in this experiment was 0.01 [in]. The threshold for base vertices was set 0.02 [in] as shown in Figure 88 B). Implementation A real time 8 point algorithm was impl emented with OpenCV and C++Builder. C++Builder takes charge of the display and user interface, and OpenCV carries out image processing and connectivity to a camera. The implemented software was also designed to process video clips as well as real time came ra input. The farther the camera motion from the base position, the more accurately an estimation is calculated. Figure 8 9 shows successful reconstruction while base points show up on both the left base frame and the right current frame. However, the bas e points used for motion estimation must continually be provided so that the camera keeps moving. Even though the re projection error is small, accumulated errors eventually contribute to the failure of reconstruction. Figure 810 shows that the re project ion PAGE 97 97 error continues to increase, as the number of base vertices increases. The X axis and Y axis denote the number of frames and pixels, respectively. This effect is because it loses the global ground truth to maintain the origin coordinates as the scene m oves away from the first initial view. Sensitivity to round off error 3D reconstruction through an 8 point algorithm is considerably more accurate with floating type interest points, and re projection errors are virtually zero. In terms of pixel units of a real CCD image sensor, roundoff error could occur because of integer type interest points. To examine the sensitivity of an 8 point algorithm, noises within 4 pixels, 0.0124 [in] (0.315 mm) on CCD were added purposely as shown in Figure 811 A). 4 pixels is not a big error for fluctuated interest points in the image, and is not noticeable as shown in Figure 811 A). Nevertheless, Figure 8 11 B) shows that the reconstruction deviates with big differences between the original and reconstructed vertices. The norm error of reconstruction was 6.3264 [in] when cameras were placed at 15 [in] away from the object in this test. The effect of roundoff error in an 8 point algorithm was simulated for two different types of vision systems as shown in Figure 813. The dashed red lines are reconstructed vertices. Figure 8 12 A) and C) show that the reconstruction was perfectly achieved in both systems when floating type interest points are used. However, in the case of integer type interest points, the reconstruction wa s unacceptable in both systems. Since stereo vision could secure many more correspondences, the reconstruction errors were relatively smaller as shown in Figure 812 D). Discord index Since interest points are just a collection gathered, indices are assign ed to manage (add / delete) interest points, as well as to provide correspondent information. Even though there is nothing wrong with the index manager, mismatching error happens on occasion while running PAGE 98 98 the software. Figure 8 13 shows a mismatching case of interest points between frames. The black solid lines represent appropriate matching, and the green dotted lines represent mismatching. Since this reconstruction software assumes that correspondences are always correct, and the program is not designed t o examine the adequacy of matching again, the whole reconstruction is prone to collapse, once mismatching happens. 3D Reconstruction using a Plcker Coordinates System An 8 point algorithm is widely referred to as a 3D reconstruction. Because correspondences are coupled in an 8 point algorithm, even a few outliers can influence the whole reconstruction. In addition, accumulated errors keep increasing. Therefore, many methods such as RANSAC, bundle a djustment and Marque Levenberg are employed to discard out liers. However, according to the circular relationship shown in Figure 81, no method without ground truth is likely to resolve the accumulation of error. The robot manipulator used in my study provides measurable and repeatable motion of the end effector. If the motion information is given as a groundtruth, a Plcker coordinates systems can be simply applied to a 3D reconstruction instead of an 8 point algorithm. This approach is intuitive and independently computes correspondences with given camera motions. Computation is conducted about a single correspondence; therefore, it is not necessary to take a set of over 8 points. Furthermore, since any individual error does not influence the existing reconstruction any longer, it is not necessary to check outli ers during reconstruction. This approach starts with defining a line in a space (source: Crane III 2006, screw theory lecture note). A line in Plcker coordinates is defined as follows. L 0; S S L (8 1) PAGE 99 99 where S is a norm alized ve c tor between two points, and L 0S is a moment with respect to an origin as shown in Figure 814 A) S and L 0S are determined by given points P and T S P S T P T P S L 0, (8 2) where T c c cz y x T is the center of the camera, and T w w wz y x P is the correspondent point on the image plane with respect to world coordinates. The P is determined by the extrinsic camera parameters Figure 8 14 B) represents T Rp P (8 3) where R is the orientation of the camera which has a 3x3 rotation matrix, T im imy x p is a correspondent point with respect to image coordinates. Assuming the correspondent points 1P and 2P are projection s of a vertex Q of an object the line s 1L and 2L are on the same plane as shown in Figure 815 A) In other words two non parallel lines on the plane must intersect at Q Given the two lines 1L and 2L the intersection point Q can be determine d by using the following equation. 2 1 L 0 12 1 2 L 0 12sin sin S S S S S S Q (8 4) where is an angle between lines, 12 S denotes normalized cross product between 1S and 2S ) ( cos2 1 1S S 2 1 2 1 12S S S S S (8 5) If lines 1L and 2L are not on the same plane, there can still exist another line 12L orthogonal to both 1L and 2L as shown in Figure 815 B) PAGE 100 100 Giv e n the noncoplanar lines 1L and 2L the intersection point 1Q and 2Q can be determine d by using the following equation. 1 L 0 12 2 12 2 L 0 12 1) ( S S S S S S Q d 12 1 2S Q Q d (8 6) where d is the distance between 1Q and 2 Q. sin2 0 1 1 L 0 2 LdS S S S (8 7) Due to interference interest points on the image may not be an accurate enough projection. Th erefore, the two lines are unlikely to be on the same plane in reality. In this case, Q can be approximated by taking the average between 1Q and 2Q ( Refer to Appendix A to see more inducing procedures.) 22 1Q Q Q (8 8) To validate the equations, a reconstruction simulation using a Plcker coordinates system was conducted with a virtual tetrahedron model as shown Figure 8 16. The red crosses are input correspondences and the cyan dots are re projections. The re projection errors were almost zero in this simulation. Therefore, the idea that the equations work precisely was confirmed. Experiment Configuration Figure 8 17 shows that a camera is attached on the end effector of a robot manipulator. A camera does not need to be in the center of the endeffector. The robot manipulator used is R obotics Research 1207 which has 7 DOF and provides the position and orientation of the endeffector. PAGE 101 101 Robot motion generation Since the inverse kinematics formula of the robot manipulator used in this experiment was unknown, the robot was moved manually to create particular motions. To save resources, the motion simulation software was implemented as shown in Figure 8 18 A). This software allows humans to jog the virtual robot m anipulator in virtual space with respect to global coordinates. Once base motions are determined, intermediate motions are generated by fitting methods such as Spline. Figure 8 18 B) shows each joint angle was generated automatically. Robot motion control and capture image To make it easy to capture images, fully automated and integrated robot control software was i mplemented as shown in Figure 819 A). This software was implemented to load and display motion data, send commands to the robot manipulator a nd capture and save images from a camera. Since it is unlikely to take clear shots due to the vibrations caused by inertia it was designed to stop at each position for 3 seconds. Distance between motions wa s set within 10 [in] to secure a sufficiently ove rlapped area between sequential images. Figure 8 19 B) shows the captured images along the path of motion. Calibrations for Camera Parameters Every real camera has its own intrinsic parameters such as lens distortion, focal length, principal point, scaling factors and skew angels. One of cameras used in my study has a very small lens. The smaller focal length a camera has, the more lens distortion occurs. Since two view geometry assumes an ideal flat image plane, it is necessary to compensate for lens disto rtion effect. Rectification for lens distortion effect Two types of lens distortion are typically considered. A tangential distortion represents the degree of skewing between axes. A radial distortion illustrates either barrel or pincushion effect. PAGE 102 102 For obj ects close to a camera, a barrel distortion occurs. These distortions can be compensated for with the following equations, which are introduced by Camera Calibration Toolbox for MATLAB of Caltech Vision lab (2008). Tangential distortion (skew effect): xy k y r k x r k xy k k k xy y r x r xy 2 2 2 2 2 2 2 24 2 2 3 2 2 4 3 4 3 2 2 2 2 tangentiald (8 9) Radial distortion (barrel or pincushion effect): 1 1d d p px x KK y x where tangential 6 5 4 2 2 11 d y x r k r k r k y xd d (8 10) where 2 2 2y x r Coefficient k s are determined through the camera calibration. Si nce these equations are reversible, it is possible to generate virtual correspondences for simulation. Figure 820 B) shows the undistorted image through camera calibration. The intrinsic parameters are shown in Table 81 and the coefficient k determined through the calibration was 0 0148 0 0047 0 8389 0 6717 0 k Accuracy of camera motion estimation Figure 8 21 A) shows the extrinsic camera parameters estimated by camera calibration. The reprojection errors were within 2 pixels. That means the estimated camera motions were acceptably accurate. When these camera motions were transformed into the motion information from the robot manipulator, the target checkerboard is supposed to coincide if the camera motions are accurate. However the target checkerboard did not coincide as shown in Figure 821 B). Therefore, it is regarded that the motion information from the robot manipulator has some errors. The maximum difference between samples was about 3 [in]. PAGE 103 103 Feature Matching For matching correspondenc es, the performance of SIFT and SURF feature matching methods were examined with several pairs of indoor scenes. Figure 8 22 shows matching results from several pairs of images which are taken with about 2 [in] distance. The dashed line s indicate optical f low vector. The similar optical f low vectors are prone to denote similar depth of vertices. F igure 822 A) shows several mismatched correspondences whose vectors are twisted Optical f low vectors that differ greatly from the dominant optical f low vector ar e excluded. This rule was applied to discard mismatched correspondences F igure 8 22 B) shows most of the outliers were discarded Through the test, i t is regarded that SURF shows adequate matching performance and high speed relatively more than SIFT. Aver age computation time for a pair was 3 times faster than SIFT. I ndoor Experiments The 3D reconstruction algorithm was examined on an indoor scene before applying it to an actual canopy. Since it is difficult and inaccurate to measure the size of an actual c anopy, it is necessary to confirm the algorithm with known sizes of objects. To confirm the performance of the algorithm the reconstruction of two icosahedrons was conducted at the two pair s of correspondences as shown in Figure 823 A). Two sets of inter est points were selected manually. The difference between input correspondences and reprojected points are adequately the same as shown in Figure 8 23 B), which means reconstruction is locally correct. However since the motion information from the robot manipulator is not quite accurate, reconstructed icosahedrons at each motion did not coincide globally as shown in Figure 823 C). This discordant result represents errors between two different matching sets. This offset can be compensated by calibrations between motion information and physical position. PAGE 104 104 Figure 824 shows a result in reconstruction of another object using the same procedure. Interest points were selected manually (by hand) and automatically (by SURF) and both results were plotted in Figu re 824 B) Even though there are inevitable errors during automatic selection, the entire reconstruction maintained a considerable deal of approximation. To confirm the effect of input errors, the result of the Plcker coordinates system was compared wit h that of the 8 point algorithm. Figure 8 26 A) is the same plot with Figure 8 24 C). T he reconstruction by an 8 point algorithm shown in Figure 825 B) looks impractical. Therefore it is regarded that a Plcker coordinates system using a single point to compute, yields better performance than an 8 point algorithm us ing a set of coupled points for 3D reconstruction. Outdoor Experiment T he algorithm was eventually examined with an actual canopy in Ocala, Florida The experiment was carried out at P ine A cre grove in Florida. Figure 8 26 A) shows the operation on an actual canopy. Since canop ies are much bigger than the workspace of the robot manipulator in general, the reconstructed surface of a canopy is just part of it. Figure 8 27 shows 75 sequence images sampled with a zigzag camera motion shown in Figure 826 B). Results Since the image quality was comparatively low and the overlapping area wa s not wide enough, many of the interest points were discarded unlike the indoor experiment In addition, i nterest points were not distributed uniformly, and only a few correspondences were obtained at each pair of images Figure 8 28 shows one of matching results. T here were still about 10% of mismatch ing correspondences; however, those mismatching correspondences are unlikely to influence the final reconstruction. Feature tracking could contribute to holding an unbiased distribution of interest points, even though it is not in bad condition. PAGE 105 105 Figure 829 A ) shows the final reconstruct ion of the canopy based on empirical measurement. This part of c anopy s urface shown in Figure 829 B) was approximated using 2D grid interpolation. The 2D grid interpolation can not only reduce the entire number of vert ices, but also regulate biased vertices. In addition, .it can compensate wrong vertices too. If adequately many vertices were reconstructed, estimated vertices t he surface could look more realistic. Figure 829 B ) shows reconstructed vertices and camera motions Since fruit detection was not applied to my study, some of the fruit position s were manually picked up, so that we can confirm the reconstruction was appropriately carried out. The circle s in Figure 8 29 B ) indicate fruit positions selected manually. The r ed dots in Figure 830 A ) indicate re projection from reconstruct ion. T he re projection errors between input fruit positions and re projections were adequately small Therefore as we have seen in the indoor experiment, it is regarded that the surface reconstruction of a real canopy is conducted well. T he reconstructed surface can be used for visualization and measurement of the volume Conclusion In my study, 3D r econstruction of the citrus c anopy was studied. An 8 point algorithm is an adequate approach for the reconstruction of objects which are entirely covered in t he frame of view. However i t is hypersensitive to minor errors due to coupled input points. Furthermore, when objects are larger than the view frame like the canopy case, reconstruction errors are likely to accumulate. To match correspondences between seq uen tial images, the SURF matching method was employed. A Plcker coordinates approach to reconstruct the 3D surface of canopy worked out adequately The Plcker coordinates systems expression for reconstruction is intuitive and computed individually. Ther efore, any individual error does not destroy the existing reconstruction. Since the camera motion obtained from the robot manipulator could be PAGE 106 106 inconsistent with the physical pose, it is necessary to take as many samples as possible. The successful result o f reconstruction depends on how accurately feature points can be matched. Future Work Since the objective of my study was the reconstruction of the canopy, some of the automation processes were not implemented in my study In future work, fruit detection will be include d. Once fruit position was mapped on the surface of the canopy, the actual fruit position will be proved by positioning the end effector of the robot manipulator to the mapped position from the reconstruction. PAGE 107 107 Table 8 1. Calibrated Intrins ic camera parameters Coordinates Image Resolution F orcal Length Image Center x axis 640 747.01 276.89 0 y axis 480 742.74 240.18 0 Position Orientation 3d Vertex 2D Correspondences Reconstruction Motion Estimation Projection Figure 8 1. Loop among projection, motion estimation and reconstruction. -0.5 0 0.5 1 -0.5 0 0.5 1 0 0.5 1 Y X Z Figure 8 2. Icosahedron base vertex model. PAGE 108 108 Figure 8 3. Flow chart of an 8 point algorithm. Existing Features Center of Existing Features A Candidate Features Selected Features B Figure 8 4. Interest point selection. Basis to pick select new I nterest points at A) previous frame and B) next frame. PAGE 109 109 Figure 8 5. Interest points around edge are disc arded. Figure 8 6. Re projection error of u nstable I nterest points. A B Figure 8 7. Stability of vertex. A) C onverging stable vertex. B) F luctuating unstable vertex. PAGE 110 110 A B Figure 8 8. Average statistical quantity A) Plot in 3D view B) 2D plot wi th a threshold. Figure 8 9. 3D reconstruction program using an 8 point algorithm in real time. Figure 8 10. Increment of accumulated re projection error. PAGE 111 111 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 A -1 0 1 2 3 4 5 -1 0 1 0 5 10 15 Y Z B Figure 8 11. Sensitivity test of an 8 point algorithm with 4 pixel roundoff error. A B C D Figure 8 12. Reconstruction simulation. Simulation with A) floating type pixels in single vision, B) integer type pixels in single vision, C) floating type pixel in stereo vision and D) integer type pixels in stereo vision. PAGE 112 112 0 100 200 300 400 500 600 0 50 100 150 200 250 300 350 400 450 Figure 8 13. M ismatched index problem. A T P L p B Figure 8 14. Plck er coordinates system A) A l ine definition B) I ntersection between a line and a plan T1T2P2P1L2L1Q p1p2 A T1T2P2P1L2L1L12Q1Q2p1p2d Q B Figure 8 15. Determination of an object vertex A) Two coplanar lines B) Common line orthogonal with two lin es PAGE 113 113 Figure 8 16. Simulation of reconstruction using a Plcker coordinate system. Figure 8 17. C onfiguration of camera on R1207 robot manipulator. A 0 2 4 6 8 10 12 -150 -100 -50 0 50 100 150 200 250 300 B Figure 8 18. Robot motion simulation. A) Base r obot motion. B) Joint angle generation based on the base motion. PAGE 114 114 A B Figure 8 19. Robot control software. A) Robot motion simulation. B) I mage acquisition A B Figure 8 20. Camera calibration. A) Original image. B) Undistorted image PAGE 115 115 A B Figure 8 21. Detemination of camera mot ion. A) Camera motions estimated by camera calibration. B) Transformed camera motion based on motion information from robot manipulator. A B Figure 8 22. Feature matching. A) Matching by SIFT. B) Matching by SURF. PAGE 116 116 A) B) C) Figure 8 23. Recons truction of i cosahedrons. A) I cosahedron objects B) D ifference between input points and re projected point C) R econstructed icosahedrons PAGE 117 117 A B C D Figure 8 24. Reconstruction of a robot manipulator A) I nterest points picked up manually B) View point of the object. C) V iewpoint of the object. D) Reconstructed result. A B Figure 8 25. Reconstruction comparison. R econstruction by A) a Plcker coordinates and B) an 8 point algorithm PAGE 118 118 A B Figure 8 26. Outdoor experiment. A) S canning using robot manipulator with B) z igzag camera motion s Figure 8 27. C aptured sequential i mages Figure 8 28. M atched correspondences between sequence images. PAGE 119 119 0 200 400 600 800 -1500 -1000 -500 0 500 1000 1500 X Y Z Z A B Figure 8 29. Reconstruction results. A) Approximated surface of citrus canopy B) R econstru ct ed vertic es. PAGE 120 120 A B Figure 8 30. Reconstruction confirmation A) F ruit positioning B) Re projection error. PAGE 121 121 CHAPTER 9 RANGE SENSOR BASED 3D RECONSTRUCTION The objective of this chapter is to globally reconstruct canopies by combining an active range sensor ( Laser Detection A nd Ranging LADAR) and a Global Positioning System ( GPS ) receiver Introduction In precision agriculture, to treat fields specifically, it is important to observe the variability. Remote sensing is one monitoring method used for wide areas. There are a couple of approaches in remote sensing. As compared with aerial or satellite based remotesensing systems, ground based remotesensing systems give more effective benefits such as detailed vision based information about distributions of status like diseases or vigor observed closer at hand. H owever, this chapter discusses a technique that approximately measures the volume of canopies. A range sensor (LADAR) and a GPS receiver is used for measurement, and estimated data can be ma pped on a global map of the geographic information system (GIS). Methods and Procedures Before conducting experiments, simulation software was developed. The simulation step is not only necessary to verify the algorithm, but helpful to analyze errors with actual field data. The entire procedure is shown in Figure 91. Coordinates of LADAR and Vehicle Scanning model of LADAR A LADAR is an active range sensor which measures distances to objects by emitting laser beam pulses and scanning for reflected pulses in a counter clockwise planar semicircle (180 ) as shown in Figure 92 A) and B) Therefore, scanned data do not indicate 3D space information without manipulation. Furthermore, if the position or orientation of a laser source is consistently PAGE 122 122 altering dyna mically, it makes estimation more difficult. A GPS receiver can provide only position information of a LADAR. To determine the orientation of a LADAR, it is assumed that both the position and the orientation of a LADAR are fixed on the rear axle of a vehic le, and it scans in the lateral direction with respect to the forward direction of a vehicle. Since the Y axis of a LADAR can be regarded as the normal vector of the path curve as shown in Figure 92 C), the orientation can be determined based on the path obtained from a GPS receiver Motion model of vehicle Since a LADAR is capable of scanning the canopy on both sides of a vehicle, the vehicle moves along every two rows in spiral as shown in Figure 93. Since the speed of a vehicle is not very fast, the r esolution of position information taken from a GPS receiver unit is too unde r populated compared with a LADAR Therefore, the intermediate motions of a LADAR are interpolated by a s pline curve fitting method with the assumption the position information received from a GPS receiver is accurate. Simulation On the assumption that measured data are ideally accurate, a simulation for range sensor based reconstruction was conducted. This simulation has several steps. A path generation and distance measurement sim ulate GPS and LADAR information respectively. A 3D reconstruction is achieved based on these data. Virtual canopy model For simulation, a virtual canopy was modeled using a mixed Gaussian. The mesh of the model was generated by Delaunay triangulation. Vi rtual canopies having different height are uniformly aligned in a grid. The pitch size between canopies is 4 [m]. The created virtual canopy model is illustrated in Figure 9 4. PAGE 123 123 LADAR simulation A virtual canopy model consists of a number of triangular face ts. When a laser beam passes through a facet, the distance between the beam source and intersecting point on the facet can be measured. According to Plcker coordinates system (see Appendix A), an intersection point p that a line 1 0 1;LS S passes through a plane 2 2; d S is determined by 21 1 2 1 0 2S S S S S p dL (9 1) Therefore the distance between the laser beam source and facet is p The intersecting point can also be placed on the outs ide of a given facet as shown in Figure 9 5. To discard these invalid points, the summation among these three angles between the intersecting point and three vertices of a facet is computed. When the intersecting point is located inside the facet, the summ ation of angles is equal to 360. An angle between two lines 1 0 1;LS S and 2 0 2;LS S is solved by 2 1 2 1 1cos S SS S (9 2) There could be several facets which a laser beam passes through. The closest facet from the laser so urce is regarded as an actual reflecting one. Since it is timeconsuming to compute all of the triangles every time, facet which are placed within 25 upper and lower bounds with respect to scanning plane are selected as candidates to check in order to red uce the number of checking facets during iteration. Motion s imulation Since a laser beam source is a moving part, its coordinates must be transformed. To set this configuration, the YXZ rotation transformation is used. PAGE 124 124 zx yR R R R (9 3) where cos sin 0 sin cos 0 0 0 1xR cos 0 sin 0 1 0 sin 0 cosyR 1 0 0 0 cos sin 0 sin cos zR The direction of the beam PathP at each position is determined by the follow ing transformation. LADAR PathRTP P (9 4) where T is a tran slation vector and LADARP is a laser beam vector. Figure 9 6 shows the coordinates at a scanning. A real LADAR considered has 180 scanning angle, with 180 individual laser beams emitting from the source distributed evenly over the 180. Th e LADAR has the capability of 5 scans within a second, however, this frequency is too high to reasonably calculate. A more manageable resolution and frequency for computation in this simulation was set to have a scanning angle of 60 and frequency of 12 sc ans in 0.5 [s]. To simplify the calculation in this simulation, a path equivalent to only one row is tested. Figure 97 shows a fully simulated scene. Notice that the intersecting points along the surface of virtual grove model are observed in F igure 97 B) Simulated distance s The distance resulted in simulation is shown in Figure 98. X axis and Y axis represent the scanning angle and distance, respectively. Z axis in Figure 9 8 B) means time. The distance toward discarded points which is placed at no fa ce t is set 0. Results in reconstruction The process of simulation is similar to the inverse process of the reconstruction algorithm. The model reconstructed by using the distance and motion information is shown in Figure 99, as PAGE 125 125 well as another figure com bined with an original model is shown so as to observe how well it was reconstructed. The visual result shows that the algorithm works well. However, this result is obtained under the assumption that motion information is accurate. Experiment Configuration of equipments The configuration of equipment used in the experiment is shown in Figure 910 A) As mentioned before, the Y axis of a LADAR was placed parallel to the real axle Since the power generator sets up excessive vibration, it was place on the ot her cart to protect other devices from vibration. Data acquisition software To obtain data from equipments, a dataacquisition software was developed by referring to the previous simulation software T ime and position information from a GPS receiver is re corded at the first column to the third column, and then 180 number of LADAR data go after every second in order to synchronize data with respect to time. This software shown in Figure 910 displays receiving data and allows users to change several setting s. This software was compiled by MFC 6.0. Calibration of LADAR Raw d ata obtained from LADAR does not represent physical range as it is, a range calibration was conducted with six sample length s from one foot to six feet. X axis in Figure 911 B) denotes s ample length s and y axis denotes value s taken from LADAR. Data from a LADAR show ed almost linear variation. The metric unit was used in this experiment for compatibility with that of a GPS receiver PAGE 126 126 Obtained raw data A LADAR used in this experiment was abl e to measure up to 81 [m]. Since the maximum value means just empty space, those were discarded. Figure 9 12 B) shows position data obtained from a GPS receiver Even though the vehicle was moved straightly, there are large errors as shown in the figure. x axis in the plot comes under Longitude and y axis comes under Latitude. In the computation, the origin position was reset right and bottom side of data set. Adjustment of position data Since these position data contains large errors, I was not able to ma ke use of them as it is. So I manually reset the control points which fit to the curve smoothly. Regeneration of path and r econstruction was conducted at each separated row. The most right ward row in Figure 913 denotes the 1st row in this experiment Red dots in are regenerated positions of the 1st row by using a Spline curve fitting method Results Reconstruction 3D space data Figure 9 14 shows the result in reconstruction of 4th row. Acquiring time was 110 [s] for the 4th row. It was necessarily to separ ate vertices in a range in order to divide between the canopy and ground. P oints lower than 20 [ cm ] were colored with green regarding as the ground. Therefore, two canopie s become disti nctive Vertices around canopies are manually collected, and tetrahedro ns are created by using a 3D de launay triangular function as shown in Figure 915. A t etrahedron consist s of four vertices or four facet s. Figure 9 16 shows t he entire reconstruction of a grove. T he second canopy in the 3rd row looks like two trees. But th ere was no basis to separate adjacent canopies with obtained data. Even though the sa m pl ed grove was chosen for the purpose of salient result, there is little PAGE 127 127 gap between canopies in the general citrus grove. For automatic collection, additional informatio n such as stem positions must be provided. Calculation of volume of canopy Once a tetrahedron is determined from the vertices, then the volume of each tetrahedron can be solved by 6 | | 2 | | 3 1 3 1 c b a h b a Ah V (8 10) w here c b a are adjacent edge vectors of a vertex The entire volume of a canopy is summation of each volume of tetrahedron, and the height of a canopy is just the maximum Z axis value among vertices. The result s of each canopy are shown in Table 81. Representation on GIS softwar e To show an application about collected data t he estimated variabilities were mapped using a n Arcview GIS software as shown in Figure 9 17. Figure 917 A) shows the location and path on satellite image and B) shows the distribution of the estimated variabilit y map b y choosing an inverse distance interpolation option. I f more samples are used, more practical distribution will be obtained. Conclusion A remote sensing technique associated with a g roundbased scouting system was implemented. This system is de signed to measure variability such as volume and height of citrus canopy using an active range sensor and a GPS. A simulation software contributed to sav ing the time to develop and verify the algorithm before experiment A v ertical lateral scanning approach can successfully reconstruct 3D canopies and solve their volume and height. A LADAR is an effective range sensor to measure objects size. However, since the precision of result is sensitive PAGE 128 128 to motions a more accurate positioning system is needed. If th e center of a canopy was known on the map, the grouping vertices for volumetry could be done automatically Th i s system can also be applied to unmanned vehicle system s to automate the whole process of a groundbased remote sensing. Future Work A dditional devices may be needed when the boundary between canopies is ambiguous. For example, s ince range data do not tell between a canopy and other objects such as ground or empty space, vision information can be a refer ence to separate them. Basically, r ange infor mation can be projected on the vision image through a calibration If the boundary between a canopy and the other part is divided through the segmentation, range data can be grouped based on the segmen ts. Then vertices associated with a canopy can be collected automatically. PAGE 129 129 Table 91. Estimated volume and height of canopy from reconstructed 3D space data. 4 th row 3 rd row 2 nd row 1 st row Volume [m 3 ] Height [m] Volume [m 3 ] Height [m] Volume [m 3 ] Height [m] Volume [m 3 ] Height [m] 11.661 1.9476 11.079 1.5466 16.86 6 1.6399 14.288 2.0056 14.394 1.7248 64.874 2.73 13.279 2.5763 20.552 2.0231 16.504 2.7476 26.576 2.705 14.524 2.143 17.004 2.9033 Creating Model Generating Path Virtual Scanning Measuring Distance Actual Position Data from GPS Reconstructing Model Range Data Path Data Actual Range Data from LADAR Distance Position Figure 9 1. Procedure diagram. A X Y beam X Y beam B Z Y Z Y C Figure 92. Coordinates of LADAR system A) Coordinates of LADAR. B) Scanning direction. C) Coordinates of LADAR on a vehicle. PAGE 130 130 5 Figure 93. Path model of a vehicle in a virtual grove. Figure 9 4. Virtual canopy model. A) Unit virtual canopy B) V irtual grove. Inner point outter point Inner point outter point Figure 95. S earch for a facet which a laser beam passes Y X X Z Y X X Z PAGE 131 131 Figure 96. Coordinates of LADAR during simulation. Figure 9 7. Fully scanned visual simulation 0 2 4 6 8 10 12 0 1 2 3 4 5 6 7 8 9 10 2 4 6 8 10 12 20 40 60 80 0 1 2 3 4 5 6 7 8 9 Figure 9 8. Distances measured from simulation A B Figure 9 9. Reconstruction results. A) Reconstructed model B) reconstructed model fitting to the original virtual grove. PAGE 132 132 A B Figure 9 10. Configuration of LADAR and GPS system A) S ystem layout B) Data a cquisition software A Front y = 30.8x 1.8 R2 = 0.9997 0 20 40 60 80 100 120 140 160 180 200 1 2 3 4 5 6 Series1 Linear (Series1) B Figure 911. LADAR c alibration. A) Calibration of LADAR B) L inearity of range. A -16 -14 -12 -10 -8 -6 -4 -2 0 0 2 4 6 8 10 12 Four number of Rows Row-1 Row-2 Row-3 Row-4 B Figu re 912. Raw data. A) Raw range data of 4th row acquired from LADAR B) Row position data acquired from GPS PAGE 133 133 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 x y GPS Picked Points Generated by spline Figure 9 13. Adjusted control points for path. Figure 9 14. Reconstructed 3D space data in the 4th row Figure 9 15. Created tetrahedro ns corresponding to canopy PAGE 134 134 Figure 9 16. R econstructed entire grove A B Figure 9 17. Results applied to GIS software. A) Location. B) V ariability map PAGE 135 135 CHAPTER 10 CONCLUSIONS AND FUTURE WORK The objective of this chapter is to summarize and conclude my study. Summary and Conclusions Corners extracted from canopy scenes which have no artificial landmarks are unstable when adding a variety of disturbances such as backlight or illumination. Thus the longevity of features is prone to become short when the camera moves. To strengthen the role of features, a leaf detection and a multilayered tracking method were invented, which are designed specially for canopy scenes. The key point of leaf detection is to segment leaf areas through the combination of morphological operation s Once segmented, the center of a leaf in each segment is set as a feature point. Since shadowed leaves are not sustainable, l eaf detection was designed to apply discriminating thresholds to each segment to secure as many feature s as possible. In cases where the intensities of some leaves are similar, segment boundaries become ambiguous. Thus choosing the center of leaf varies among frames. To be insensitive to these sudden changes of the feature point, an active mesh feature tra cking method was employed. Even though an active mesh is insensitive to sudden changes, it requires exhaustive computation and it is possible for wrong features to disturb normal features. Therefore, a multilayered active tree method was invented which expands 2D meshes into 3D meshes. The creation of multiple layers is based on coarse to fine images by means of Gaussian filtering. Features extracted from each image recursively form hierarchical structures by connecting sub features within an area which is segmented by a Voronoi diagram. Since a multilayered active tree can evaluate the strength of a feature, it can control the disturbance from unnecessary features. Subsistence of features was PAGE 136 136 evaluated with the comparison of optical flow. The optical flow t hrough these methods results in insensitivity to noise. A parallax can be seen around the edges of an image. Furthermore, a parallax occurs in the horizontal movement model. Thus a cropping edge mosaicing method was suggested. It was designed to automatic ally determine the cropping area based on the direction of optical flow. Cropping area is blended by controlling an alpha channel. A cropping edge mosaicing method showed that it can minimize the discordance caused by the parallax effect. One critical point that must be considered in the lengthy mosaic is that a camera alignment must be parallel to the direction of movement. Otherwise, the mosaic will go past the upper or lower boundaries. For a 3D canopy model, local reconstruction was conducted by robot vision and global reconstruction was conducted by a range sensor vehicle. Since it is difficult to accurately measure dimensions of an unknown structure, the accuracy of reconstruction is evaluated with re projection error. In the visionbased reconstruction, an 8 point algorithm was tried first, which is widely known. However, since an 8 point algorithm requires over 8 input points, it was concluded that those coupled input points are quite sensitive to noise. In addition, since a reconstructing object is larger than the view in the local reconstruction, accumulation of error is inevitable without a ground truth. Instead, a Plcker coordinates system is employed for reconstruction, which employs a simpler computation method than an 8 point algorithm. Since a Plcker coordinates system independently conducts the reconstruction with one pair of correspondences with respect to given motion, error input does not influence the existing reconstruction. The robot used for reconstruction provides repeated constant motion information of the endeffector. Thus a Plcker coordinates system was valid. According to experiments, even though the motion information seems to have some errors, the re projection error was PAGE 137 137 adequately small. Therefore, reconstruction by a Plcker coordinates system was successful with the given motions. For reconstruction with a range sensor, a combination with LADAR and GPS was used. Since a LADAR scans only 2D, additional information is needed to achieve 3D space. A GPS is used to provide vertica l direction information orthogonal to the scan plane of the LADAR. Since the resolution of GPS information is underpopulated compared to that of LADAR, the sparse positions and orientations are interpolated with a Spline fitting method. The volume of reconstructed canopies were measured and represented on a global map. Contributions within my study about each topic are shown in Table 101. Future Work Leaf detection was carried out with an image. This approach will be applicable to multilayered successive images. Watershed segmentation could be also used to segregate fruits from leaves. For active meshes, fixed parameters were used. If parameters can be adjusted automatically, it will be a more intelligent tracking method. In the estimation of camera motion, accumulating errors could not be converged. For a motion model in the open system, since there is no ground truth, accumulation of error is essentially inevitable. To be within an allowable range, a high performance camera, a fast computing system and a l imited camera motion must be considered. A LADAR range sensor is quite accurate, but the range information does not tell which is ground or empty space. If range data are combined with vision data, it will yield more effective results. When applying a rang e sensor, precise positioning and calibration are needed. A vehicle system simulation will be able to test in the implemented virtual space. All topics in my study are tested independently. In the next work, all procedures will be designed to work at once automatically in realtime. Collected and processed information about the surface of the canopy and fruit positions can be represented in the virtual environment as shown in Figure 101 so that f armers can observe and control the real situation remotely PAGE 138 138 T able 1 01. Contribution techniques. Topic Main Technique Contribution Level Feature detection Leaf detection novel invention normal Feature tracking Multilayered active tree novel invention complex Image mosaicing Crop off composition novel trial normal 3D reconstruction (Vision) Plcker coordinates system novel trial normal 3D reconstruction (LADAR) Vertical scanning trial simple Figure 101. Virtual robot harvesting simulation. PAGE 139 139 APPENDIX A GEOMETRY Plcker Coordinates System Defin itions P oint A point in homogeneous expression is w z y x ; p (A 1) I n this dissertation, a point is regarded as z y x p i.e. 1 ; z y x p L ine A line in Plcker coordinates is defined as follows. L 0; S SL (A 2) where S is a normalized vertor between two points, and L 0S is a moment with respect to an origin. S and L 0S are determined by given points 1p an d 2 p. 1 2 1 2p p p p S S p S 2 0 L (A 3) Figure A 1. A l ine definit ion in Plck coordinates system PAGE 140 140 P lane A plane is defined by d ; S N (A 4) G iven three points 1 1 1 1, z y x p 2 2 2 2, z y x p 3 3 3 3, z y x p a plan e which a point and a line make s is 12 3 12 0S p S S l 12 0 3 ld S p (A 5) Angle A sign between two lines which meet each other G iven two lines 1 0 1 1;L S S L and 2 0 2 2;LS S L a sign of the angle between them is ) ( ) (1 0 2 2 0 1 L Lsign sign S S S S (A 6) An angle between two lines which meet each other G iven two lines 1 0 1 1;LS S L and 2 0 2 2;LS S L a n angle between them is 2 1 1cos ) ( S S sign (A 7) An angle betwe en two planes G iven two planes 1 1 1; d S N and 2 2 2; d S N an angle between them is 2 1 2 1 1cos S S S S (A 8) Point A point whe re a line passes though a plane G iven a line 1 0 1 1;L S S L and a plane 2 2 2; d S N a point where a line passes though a plane is PAGE 141 141 2 1 1 2 1 0 2S S S S S p dL (A 9) The angle between the line and the plane is. 2 cos2 1 2 1 1 S S S S (A 10) A point on a plane which is closest to the point G iven a point z y x 1 p and a plane 2 2 2 ; d S N a point on the plane closes t to the point is 1 2 2 2 2 1 2p S S S S p p d (A 11) A point on a line which is closest to the point G iven a point z y x ,1 p and a line 2 0 2 2;LS S L a point on the plane closes t to the point is 1 2 2 2 1 2 0 2p S S S p S S p L (A 12) A point on a line which is closest to a line G iven two lines 1 0 1 1;LS S L and 2 0 2 2;LS S L a point on each line which is closest to the other line is 1 2 1 2 1 0 2 2 2 1 2 0 1 0 1 2 1 1 1 0 2 2 0 1cos sin p S S p S S S S S S S p S S S S S S d d dL L L L L (A 13) W here d is a distance between two points. PAGE 142 142 Line A line passing through two lines with the shortest distance G iven two lines 1 0 1 1;LS S L and 1 0 1 1;LS S L a line passing through two lines with the shortest distanc e is 2 1 1 2 1 2 1; S S p S S S S L (A 14) W here 1p is a point on line 1L which is closest to the other line A line made by two planes G iven two point 1 1 1; d S N and 2 2 2; d S N a line made b y two planes is 1 02 2 01 2 1; S S S S L d d (A 15) Plane A plane made by a point and a line G iven a point z y x ,1 p and a line 2 0 2 2;LS S L a plain made by a point and a line is 2 0 1 2 1 2 0;L LS p S p S N (A 16) A plane made by two li nes G iven a point two lines 1 0 1 1;LS S L and 2 0 2 2;LS S L a plane made by two lines is 2 1 0 2 1; S S S S N L (A 17) Basic Functions Triangle Functions Angles in triangle when edges are known G iven edges il, a ngles are PAGE 143 143 3 1 3 1 1 3 12 sin 2 1k k k k i k kl l S S l S. (A 18) Figure A 2. Angles in triangle when edges are known Volume of Tetrahedron G iven edges il, the volume is 6 | | 2 | | 3 1 3 1 c b a h b a Ah V (A 19) Figure A 3. Volume of a tetrahedron. PAGE 144 144 APPENDIX B CRITICAL SOURCE CODES OF ALGORITHM Leaf Detection function [ Im_r ] = segment_detect( Im_ ) %>> [ Im_r ] = segment_detect( Im_ ) % %(c) copyright SangHoon Han, ARMg UFL 3/26/2008 %% --------------------------------------------------------------------------------------if nargin<1, return; end, [ h, w, c ] = size(Im_); Im_r = Im_; Im = Im_; %% --------------------------------------------------------------------------------------Im1 = im1_im(Im_); opIm = imnorm(imopen(Im_, strel('disk',8))); % imshow( opIm); oIm = imnorm(imshow_open( Im1,8 )); flIm = opIm; % flIm = imfilter(opIm, fspecial('disk',8), 'replicate'); % [ wIm, bIm, Ims ] = imshow_watershed( oIm); % imshow(Im_r); [ wIm, Ims ] = watershed_( flIm); bIm = double(wIm == 0); sgArs = []; for i=1:length(Ims), sgAr = regionprops(Ims{i},'Area'); sgArs = [sgArs sgAr.Area]; end msgAr = mean(sgArs); %% imPnt_r = []; i=55; th_r = []; Im_r = bIm; opIm1 = imnorm(imopen(Im_, strel('disk',4))); for i=1:length(Ims), %disp(i); sgImi = Ims{i}; % imshow(Ims{55}); sgAr = regionprops(sgImi,'Area'); % if sgAr.Area<(msgAr*0.2) continue; end; % if ~tf_ratio(sgImi, 0.2), continue; end; mlIm = immultiply(opIm, sgImi); % imshow(mlIm); mlIm = imshow_multiply(oIm, sgImi); % th = 0; sm = 0; while th<1 bwIm = double(im2bw(mlIm, th)); % imshow(bwIm); sm = sum(sum(bwIm))/sum(sum(sgImi)); if sm<.25 bwCnt = regionprops(bwIm,'Centroid'); % imshow(sgImi); if ~isempty(bwCnt), imPnt_r = [imPnt_r floor(bwCnt.Centroid')]; end; % [ bPnts, bdIm ] = bwboundaries(bwIm,'noholes'); % [L,num] = bwlabel(opIm1); % lb = Label2rgb(L); % if ~isempty(bPnts) % bPnt1 = bPnts{1}; % mnPnt = floor(mean(bPnt1)); break; % % % lbIm = double(label2rgb(bdIm, 'jet', 'w', 'shuffle')); PAGE 145 145 % % for j=1:length(lbIm) % % bPnti = bPnts{i}; % % end % % cPnt = floor(sgAr.Centroid'); % end end th = th+0.01; end Im_r = imadd(Im_r, bwIm); end Im_r(:,:,3) = imnorm(bIm); Im_r(:,:,2) = im_plot(bIm, imPnt_r, 'y+',1); % imshow(Im_r); % Im_r = im_plot(oIm,imPnt_r,'+',2); %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% functio n segment_detect_demo() % clear %% imf_ = 'garden1_cinepak_filtered0000.bmp'; Im_ = imshow_file(imf_); Im_ = im1_im(Im_); %% clc; clf; [ Im_r ] = segment_detect( Im_ ); imshow(Im_r); Multilayered A ctive T ree Tracking Active Mesh function [ ePnt2_r ] = mesh _track( oPnt2s_r ) % match images. % > Im %>> [ crPnt_r ] = track_mesh( prPnt_, crPnt_, Fct_ ) % %(c) copyright SangHoon Han, ARMg UFL 4/7/2007 %% --------------------------------------------------------------------------------------if nargin<1, return; end %% --------------------------------------------------------------------------------------Pnt21 = oPnt2s_r{1}; tri = delaunay(Pnt21(1,:), Pnt21(2,:)); trii = [tri tri(:,1)]; ngPnts = mesh_neighbor(tri); %% wrong input wPnt2s_r = oPnt2s_r; %% combin e forces clc; ngPnts = mesh_neighbor(tri); F_ = []; F_tot =[]; Pnt2_1=wPnt2s_r{1}; Pnt2=wPnt2s_r{1}; pTime=now; sz2 = size(oPnt2s_r,2); for k=2:sz2, pTime = datestr_(pTime, k, sz2); PAGE 146 146 Pnt2_k_1 = wPnt2s_r{k1}; Lprev = []; for i=1:size(Pnt2,2) % f or each node ngPnt = ngPnts{i}; for j=1:size(ngPnt,2), nj = ngPnt(j); %disp([k,i,nj]); Pnt2_k_1nj = Pnt2_k_1(:,nj); if norm(Pnt2_k_1nj)==inf, Pnt2_k_1nj = Pnt2_1(:,nj); end P nt2_k_1i = Pnt2_k_1(:,i); if norm(Pnt2_k_1i)==inf, Pnt2_k_1i = Pnt2_1(:,i); end dlen = Pnt2_k_1nj Pnt2_k_1i; Lprev(i,nj) = norm(dlen); end end Pnt2_k = wPnt2s_r{k}; for i=1:size(Pnt2 ,2) % for each node Pnt2_k_i = Pnt2_k(:,i); ngPnt = ngPnts{i}; slen = 0; for j=1:size(ngPnt,2), nj = ngPnt(j); slen = slen+Lprev(i,nj); end F_ = []; for j=1:size(ngPnt,2), nj = ngPnt(j); %disp([k,i,nj]); Pnt2_k_nj = Pnt2_k(:,nj); if norm(Pnt2_k_nj)==inf, % Pnt2_k_nj = F_tot{k 1}(:,nj); end Pnt2_k_nj = Pnt2(:,nj); end Pnt2_k_1nj = Pnt2_k_1(:,nj); if norm(Pnt2_k_1nj)==inf, Pnt2_k_1nj = Pnt2_1(:,nj); end Fext = Pnt2_k_nj Pnt2_k_1nj; % external force if norm(Pnt2_k_i)==inf, Pnt2_k_i = Pnt2(:,i); end dPnt2 = P nt2_k_nj Pnt2_k_i; % internal length Lcurr = norm(dPnt2); if Lcurr==0, Lcurr = 1; end w = (Lprev(i,nj) Lcurr)/Lcurr; Fint = w*dPnt2; if slen==0, slen = 1; end beta = (1 Lprev(i,nj)/slen)/(size(ngPnt,2)1); F_{k 1,i}(:,nj) = beta*(0.95*Fext 0.05*Fint); end sumF = sum(F_{k1,i},2); Pnt2_k_i = Pnt2_k(:,i); if norm(Pnt2_k_i)==inf, Pnt2_k_i = Pnt2(:,i); end Pnt2_k_1i = Pnt2_k_1(:,i); PAGE 147 147 if norm(Pnt2_k_1i)==inf, Pnt2_k_1i = F_tot{k 2}(:,i); end alph = 1; alph = norm(Pnt2_k_i Pnt2_k_1i)/norm(sumF); if alph<=1e5, alph=1; end F_tot{k 1}(:,i) = alph*sumF+Pnt2(:,i); end Pnt2_1 = Pnt2; Pnt2 = F_tot{k 1}; end %% F_ext{1,1} % clf; plot_(oPnt2s_r{3}(:,1),'b+'); hold on; % plot_(F_tot{1}(:,1),'ro'); hold off %% plot combine force nf = size(F_tot{1},2); np = size(F_tot, 2); tw = []; twp = []; err = []; c = 'kgrbc'; s= '.+so*'; figure(1); clf; hold on; pTime = now; owPnt2_r = []; ePnt2_r = []; wPnt2_r = []; for i=1:np, pTime = datestr_(pTime, i, np); tw = []; ow = []; er= []; ww = []; for j=1:nf tw = [tw [F_tot{i}(:,j); i]]; op2 = oPnt2s_r{i+1}(:,j); ow = [ow [op2; i]]; er = [er norm(F_tot{i}(:,j)op2)/norm(oPnt2s_r{i}(:,j) op2)]; wp2 = wPnt2s_r{i+1}(:,j); if wp2==inf % plot_(oPnt2s_r{i}(:,j),'s'); % plot_([oPnt2s_r{i}(:,j) ow(1:2,j)],'b:'); % plot_([ow(1:2,j) tw(1:2,j)],'.k '); else ww = [ww [wp2; i]]; end end Pnt2s_r{i} = tw; err = [err; er]; % plot_(tw(1:2,:),[c(2) s(2)]); % plo t_(ow(1:2,:),[c(5) s(5)]); owPnt2_r = [owPnt2_r mean(ow(1:2,:),2)]; ePnt2_r = [ePnt2_r mean(tw(1:2,:),2)]; wPnt2_r = [wPnt2_r mean(ww(1:2,:),2)]; end figure(3); clf; hold on; plot_(owPnt2_r,'ks'); plot_(ePnt2_r,'. '); plot_(wPnt2_r,'ro:'); l egend('origianal','Active mesh','Mean'); axis normal %% figure(2); clf; hold on; for j=1:nf k = mod(j,4)+1; plot(1:size(err,1),err(:,j)',[':' c(k) s(k)]); end PAGE 148 148 Hierarchical S tructure G eneration [ imPnt, desc1 ] = surfpoints( Im_ ); imPts_r{1} = im Pnt; % subplot(2,2,1); imshow(Im); title(['Original']); bIm=[]; for i=2:n_ Ims_r{i} = imfilter(Im_, fspecial('gaussian',5*i,5*i) ); [ imPnt, desc1 ] = surfpoints( Ims_r{i} ); imPts_r{i} = imPnt; % subplot(2,2,i); imshow(bIm{i}); title(['Size=' int2str(5*i) ', Sig=' int2str(5*i)]); End for i=1:size(Pts_,2) vPts_r{i}.Pt=Pts_{i}; [ vIdx, vPts ]= idx_voronoi ( Pts_{i} ); vPts_r{i}.vPts=vPts; end if lv_<2 return; end; vPt = vPts_{lv_}; vPt_1 = vPts_{lv_ 1}; for i=1:size(pIdx_,2) [ pIdx, pPt ]= idx_inpolygon( vPt_1.Pt, vPt.vPts{pIdx_(i)} ); if isempty(pPt) continue; end; Idx = find(pIdx==1); Nds_r{i}.Idx = Idx; Nds_r{i}.Pt = pPt; Nds_r{i}.Nd = nds_vpts ( vPts_, lv_1, Idx ); e nd Multilayered A ctive T ree function [ Ns_r ]= multimesh_mv ( Ns, imPts_, Pth_r ) % --------------------------------------------------------------------------------------------%% Ns_r=Ns; if nargin<1|isempty(Ns), multimesh_mv_demo; return; end %% % --------------------------------------------------------------------------------------------k=2; for k=2:size(Pth_r,2); disp(['k: int2str(k)]); %size(Pth_r,2) l3=3; l2=2; l1=1; dexFrc3=[]; for n3=1:size(Ns,1), disp(['n3: int2str(n3)]); Nd3=Ns_r{n3,k 1}; if isemp ty(Nd3)|isempty(Nd3.Pt), continue; end; imPtk3 = imPts_{l3}.Pts{k}(:,n3); if isinf(imPtk3) dexFrc31 = Pt_ref3; else dexFrc31 = imPtk3Nd3.Pt; % currentExternal previous end dexFrc3 = [dexFrc3 dexFrc31]; Ns_r{n3,k}.Pt = Nd3.Pt+dexFrc31; end; Pt_ref3 = mean(dexFrc3,2); PAGE 149 149 for n3=1:size(Ns,1), disp(['n3: int2str(n3)]); Nd3=Ns_r{n3,k 1}; if isempty(Nd3)|isempty(Nd3.Pt), continue; end; imPtk3 = imPts_{l3}. Pts{k}(:,n3); if isinf(imPtk3) dexFrc31 = Pt_ref3; else dexFrc31 = imPtk3Nd3.Pt; % currentExternal previous end Ns_r{n3,k}.Pt = Nd3.Pt+dexFrc31; imPtk2 = imPts_{l2}.Pts{k}(:,Nd3.Idx); dexFrc2 = imPtk2 Nd3.subPt; while sum(sum(isinf(dexFrc2)))>0, sn = round(size(dexFrc2(dexFrc2==Inf),1)/2); if sn>1 dexFrc2(dexFrc2==Inf) = Pt_ref2*ones(1,sn); else dexFrc2(d exFrc2==Inf) = Pt_ref2; end end Pt_ref2 = mean(dexFrc2,2); Nd2=Nd3.Nd; if isempty(Nd2), continue; end; for n2=1:size(Nd3.subPt,2), Pt2 = Nd3.subPt(:,n2); b2 = .1; [ ang len] = ang_pt2( Pt2, Pt_ref3 ); b2 = abs(ang)/pi; % [ang len] = ang_pt2( Pt2, dexFrc31 ); b2 = abs(ang)/pi; a2=1 b2; dexFrc23 = a2*dexFrc2(:,n2) + b2*dexFrc31; % dexFrc23 = dexFrc2(:,n2); N s_r{n3,k}.subPt(:,n2) = Pt2 + dexFrc23; end % Ns_r{n3,k}.subPt = Nd3.subPt + (a2*dexFrc2 + b2*dexFrc31*ncol(dexFrc2)); for n1=1:size(Nd2,2), disp(['n1: int2str(n1)]); Nd1=Nd2{n1}; if isempty(Nd1), continue ; end; imPtk1 = imPts_{l1}.Pts{k}(:,Nd1.Idx); dexFrc1 = imPtk1 Nd1.subPt; % currentExternal previous while sum(sum(isinf(dexFrc1)))>0, sn = round(size(dexFrc1(dexFrc1==Inf),1)/2); if sn>1 dexFrc1(dexFrc1==Inf) = Pt_ref1*ones(1,sn); else dexFrc1(dexFrc1==Inf) = Pt_ref1; end end Pt_ref1 = mean(dexFrc1,2); a1=.9; b1=.075; c1=.015; for n11=1:size(Nd1.subPt,2), Pt1 = Nd1.subPt(:,n11); [ang len] = ang_pt2( Pt_ref2, Pt_ref3 ); c1 = (abs(ang)/pi)*1/4; [ang len] = ang_pt2( Pt1 Pt_ref3 ); b1 = (abs(ang)/pi)*3/4; b 23 = b1+c1; a1 = 1 b23; PAGE 150 150 dexFrc23 = a1*dexFrc1(:,n11) + b1*dexFrc2(:,n1) + c1*dexFrc31; % dexFrc23 = dexFrc1(:,n11); Ns_r{n3,k}.Nd{n1}.subPt(:,n11) = Pt1 + dexFrc23; end % Ns_r{n3,k}.Nd{n1}.subPt = Nd1.subPt + ( a1*dexFrc1 + b1*dexFrc2(:,n1)*ncol(dexFrc1) + c1*dexFrc31*ncol(dexFrc1)); end end end Sequential Image Mosaicing function [ Img_r ] = imgcut( Img_, offPnt_, c_, thk_ ) %plot 3D points on 2D image w ith viewpoint. %>> Img_r = imcreate_( CData_, bAlpha_ ) % %(c) copyright SangHoon Han, ARMg UFL 2/8/2007 %% --------------------------------------------------------------------------------------if nargin<1, return; end h = Img_.h; w = Img_.w; c = Img_.c; cp =round([w h]'/2); if nargin<2, offPnt_ = cp; end if nargin<3, c_ = [1 1 0]; end if nargin<4, thk_ = 0; end %% --------------------------------------------------------------------------------------xy = [1 w; 1 h]; Pnts = [xy( :,1) [xy(1,1) xy(2,2)]' xy(:,2) [xy(1,2) xy(2,1)]' xy(:,1)]; Img_r = Img_; if thk_>0, Img_r = imglines(Img_r, Pnts, c_); end Img_.gIm = zeros(h, w); %% get angle % dp = cp offPnt_; Ang = pi/2+atan2(dp(2),dp(1)); %Ang = 90*pi/180; %% get sp, ep % Ang=10 *pi/180; a = tan(Ang); x = offPnt_(1); %y = Img_r.h np(2); y = offPnt_(2); sy = 1; if a==0 sx = x; sy = y; else sx = x+(syy)/a; if sx<1, sx = 1; sy = y+(sx x)*a; elseif sx>w sx = w; sy = y+(sx x)*a; PAGE 151 151 end end sp = round([sx sy]'); ey = h; if a==0 ex = x; ey = y; else ex = x+(ey y)/a; if ex<1, ex = 1; ey = y+(ex x)*a; elseif ex>w ex = w; ey = y+(ex x)*a; end end ep = round([ex ey]'); Img_r = img line(Img_r, [sp, ep], c_, thk_); if Ang<0 sPnt = ep; ePnt = sp; else sPnt = sp; ePnt = ep; end % [sPnt, ePnt] %% pntlist % sPnt=[1,100]'; ePnt=[320,50]'; sx = sPnt(1); ex = ePnt(1); sy = sPnt(2); ey = ePnt(2); eg = []; switch sx case 1, eg = 1; if (1 PAGE 152 152 Pnts = [ePnt Pnts]; end end case w if (1 PAGE 153 153 3D Reconstruction with Images function [ L n_r ]= ln_pttr ( Pt_, Extr_,Intr_ ) Pth = cpth_extr(Extr_); Pnt1 = Pth(1:3); [ cfrm3_r, Pnt2 ] = cfrm3tr( Extr_,Intr_,[], Pt_ ); if isempty(Pnt2), return; end; Ln_r = ln_pnt2(Pnt1, Pnt2); % ================================================================================ function [ Pnt_r1, Pnt_r2, len_r, Ln_r, Ang_r ]= pnt2_ln2 ( Ln_1, Ln_2 ) [S1, S0L1] = s_ln(Ln_1); [S2, S0L2] = s_ln(Ln_2); % S1,S2 is unit vector. dtS12 = dot(S1,S2); crS12 = cross(S1,S2); Ang_r = acos(dtS12); sinAng = sin(Ang_r); if sin Ang==0, sinAng=1; end; S12 = crS12/norm(crS12); dtS2L1 = dot(S2,S0L1); dtS1L2 = dot(S1,S0L2); len_r = (dtS1L2+dtS2L1) /sinAng; % length if (dtS2L1==0), % parallel or coplanar Pnt1 = dot(S0L2,S12)/sinAng; Pnt2 = dot(S0L1,S12)/sinAng; Pnt_r1 = Pnt1*S 1+Pnt2*S2; else % SL2_cS12S2 = S0L2len_r*cross(S12,S2); Pnt_r1 = cross(S0L1,SL2_cS12S2)/dtS2L1; end Pnt_r2 = Pnt_r1+len_r*S12; S0L12 = cross(Pnt_r1, S12); Ln_r = ln_s(S12, S0L12); % ================================================================================ function [ Pnt_r ]= pnt_pttr2 ( Pt_1,Pt_2, Extr_1,Extr_2, Intr_ ) Ln1 = ln_pttr( Pt_1, Extr_1, Intr_ ); if isempty(Ln1), return; end; Ln2 = ln_pttr( Pt_2, Extr_2, Intr_ ); if isempty(Ln2), return; end; [ Pnt1, Pnt2 ] = pnt2_ln2( Ln1, Ln2 ); Pn t_r = mean([Pnt1 Pnt2],2); 3D Reconstruction with Range Data function [ rVrt ] = reconstructObject(scn_, Path_, Dst_, viewp_); %scanObject Summary of this function goes here % Detailed explanation goes here %02.scan object %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%% %% if nargin<1, scn_.rng = 90; scn_.smpNum = 12; scn_.t = 1; scn_.scnNum = 41; scn_.len =10; end if nargin<2, paths_ = load('paths.txt'); end if nargin<3, tri_ = load('tri_.mat'); end if nargin<4, viewp_ = [.5 1 .5]; end %% c reate the resolution of beam with respect to time %%% sN = scn_.smpNum; sT = scn_.t; PAGE 154 154 sR = (scn_.rng) pi/180; if sN == 1, sS = pi/2; sE = pi/2; sD = sR; dt = sT; s = pi/2; else sS = (pi sR)/2; sE = sS + sR; sD = sR/(sN 1); s = s S:sD:sE; %[rad] end % dt = sT/sN; % tm = 0:dt:(sT*scn_.scnNum dt); Tr = Path_(:,2:4)'; Rt = Path_(:,5:7)'; %% measure distances %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Dst = []; Dst(scn_.scnNum, sN).Pnt=[0 0 0]'; i=1; Dst(scn_.scnNum, sN).dst = scn_. len; rDst = []; t = 1; rFrm = []; rVrt=[]; %% % cPnt = .25*[ .7071 .7071 0; 0 1 0; .7071 .7071 0]'; for i = 1:(scn_.scnNum) 1, %disp(i);%[i scn_.scnNum]%i 5[s] sPnt = Path_(i,2:4)';% +org_; prevPnt = []; fsPnt = []; fePnt = []; %% for j = 1:s N, %[i j t] fsPnt = [fsPnt sPnt]; Dst_ij = Dst_(i,j); if Dst_ij~=0, Pnt = Dst_ij*[cos(s(j)) sin(s(j)) 0]'; trPnt = trans_yxz(Pnt, sPnt, Path_(i,5:7)'); ePnt = trPnt(1:3); rVrt = [rVrt; ePnt']; end end end PAGE 155 155 APPENDIX C DEVELOPMENT TOOLS Table C 1. Softwares. Development Tool Application MATLAB 2009 F eature detection, Feature tracking, Simulation OpenCV 2.0 Capturing, Image processing Visual C++ 2005 Robot controller, GPS C++Builder 2006 R eal time 3D reconstruction Robot controller user interface Delphi 2006 Virtual grove GLScene Virtual grove ARToolkit Augmented Reality PAGE 156 156 LIST OF REFERENCES Bay H ., A Ess, T Tuytelaars, and L V Gool 2008. Speeded Up Ro bust Features (SURF) Computer Vision and Image Understanding 110(3): 346359. Crane III, C. D., and J. Duffy 2008. Kinematic Analysis of Robot Manipulators New York, N.Y.: Cambridge University Press. Crane III, C. D 2010. Screw Theory f or Spatial Robot Manipulators New York, N.Y.: Cambridge University Press. Computer Vision Research Group. 2008. Camera Calibration Toolbox for MATLAB Pasadena, C alifornia: California Institute of Technology Available at: http://www.vision.caltech.edu/bouguetj/calib_doc/index.html Accessed 19 February 2010. Davison, A J. and D. W. Murray 2002. S imultaneous localization and map building using active vision IEEE Transactions on Pattern Analysi s and Machine Intelligence 24(7) : 865880. Dornaika F., and R. Chung 2004. Mosaicking images with parallax Signal Pr ocessing: Image Communication 19(8): 771786. Griffin, A., and J. Kittler 2002. An active mesh based tracker for improved feature corres pondences. Pattern Recognition Letters 23(4): 443449. Hasler D ., and S. S sstrunk. 2004. Mapping colour in image stitching applications Journal of Visual Communication and Image Representation 15(1): 6590. Hartley R ., and A Zisserman 2004. Multiple View Geometry in Computer Vision, 2nd edition New York, New York: Cambridge University Press. Isard M ., and A Blake 1998. CONDENSATION: C onditional density propagation for visual tracking International Journal of Computer Vision 29(1): 528. Kanazawa, Y ., and K. K anatani. 2004. Image mosaicing by strati Image and Vision Computing 22: 93 103. Ke Y ., and R Sukthankar. 2004. PCA SIFT: A more distinctive representation for local image descriptors IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2: 506513. Kise M., and Q. Zhang. 2006. Reconstruction of a virtual 3D field scene from groundbased multispectral stereo imagery. ASABE Meeting Presentation Paper No.063098. Portland, Oregon: ASABE Klien, G., and D. W. Murray 2007. P arallel tracking and mapping for small AR workspaces In P roc of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality : 110. PAGE 157 157 Lowe, D. G. 2004. Distinctive image feature s from scaleinvariant keypoints. International Journal of Computer Vision 60(2): 91110. Ma, Y ., S Soatto J Kosecka and S. S Sastry 2004. An Invitation to 3D Vision: From Images to Geometric Models New York, New York: Springer M ehta, S S. 2007. Vision based control for autonomous robotic ci trus Harvesting MS thesis. University of Florida, Department of Agricultural and Biological Engineering. Meier, E. B., and F. Ade. 1999. Using the condensation algorithm to implement tracking for mobile robots. In Proc. of the Third European Workshop on A dvanced Mobile Robots eurobot 99: 7380. Molloy, D. 2000. Active meshes for motion tracking PhD diss, Dublin, Ireland: Dublin City University, Department of Engineering and Design. OpenCV. 2001. Open Source Computer Vision Library Reference Manual Ver1. 0. Santa Clara, California: Intel Corporation. Rav Acha A ., Y Shor and S Peleg 2004. Mosaicing with parallax using time warping In P roc Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop 11: 164172. Shlyakhter I ., M Rozenoer J Dorsey and S Teller 2001. R econstructing 3D tree models from instrumented photographs IEEE Computer Graphics and Applications 21(3): 5361. Subramanian, V. 2005. Autonomous vehicle guidance using machine vision and laser radar for ag ricultural applications. MS thesis. Gainesville, Florida: University of Florida, Department of Agricultural and Biological Engineering. S ubramanian V 2008. Autonomous vehicle guidance f or citrus grove navigation. PhD Diss. University of Florida, Departme nt of Agricultural and Biological Engineering. Szeliski, R. 2004. Image alignment and stitching : A tutorial. MSR TR 200492. Redmond, W ashington: Microsoft Research Trucco, E., and A. Verri. 1998. Introductory Techniques for 3D Computer Vision. New York, New York: Prentice Hall Tumbo, S. D., M. Salyani, J. Whitney, T. Wheaton, and W. Miller. 2001. Laser, ultrasonic and manual measurements of citrus tree canopy volume ASAE Meeting Presentation Paper No. 011068. Sacramento, California: ASAE Wei J ., and M Salyani 2004. Development of a laser scanner for measuring tree canopy characteristics. ASAE Meeting Presentation Paper No.041168. Ottawa,Canada: A S AE Yoon, K ., G Jang, S Kim, I Kweon 2001. Color landmark based self localization for indoor mobile r obots Journal of Control, Automation, and Systems Engineering 7(9): 749757. PAGE 158 158 Y ounse P 2005. Intersection detection and navigation for an autonomous greenhouse sprayer using machine vision. MS thesis. University of Florida, Department of Agricultural and Biological Engineering. Zaman, Q. U., A. W. Schumann, and H. K. Hostler. 2005. Quantifying sources of error in ultrasonic measurements of citrus orchards. ASAE Meeting Presentation Paper No. 051123. Tampa, Florida: ASAE. Zhu Z ., E M. Riseman, and A R. H anson. 2001. Parallel perspective stereo mosaics The E ighth IEEE International Conference on Computer Vision. Vancouver, Canada : ICCV PAGE 159 159 BIOGRAPHICAL SKETCH SangHoon Han w as born in Pusan, South Korea 1972. He received his Bachelor of Science degree and Master of Science degree in mechanical engineering design at College of Engineering at Korea Aviation University, Korea, in February 1996 and February 1999, respectively. He attended the University of Florida in Jan, 2004 for the Doctor of Philosophy degree in agricultural and biological engineering. During his PhD program, he worked as a graduate research assistant with Dr. Thomas Burks. |