FAST PHYSICS-BASED METHODS FOR WIDEBAND ELECTROMAGNETIC INDUCTION DATA ANALYSIS

By

GANESAN RAMACHANDRAN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2010

© 2010 Ganesan Ramachandran

ACKNOWLEDGMENTS

I thank my mother, JothiSree Venkataraman; my father, Ramachandran Sethunarayanan; and my wife, Swetha Seetharaman, for their relentless love and support. I thank my advisor, Paul Gader, for his guidance and encouragement throughout my tenure at the University of Florida. I thank my committee, Paul Gader, Joseph Wilson, Arunava Banerjee, and John Harris, for their insight and guidance, which steered my research and bettered the resulting contributions. I thank my lab mates for their support and am thankful for their ability to endure my shenanigans. I thank Andres Mendez-Vasquez, Seniha Esen Yuksel, Xuping Zhang, and Gyeongyong Heo for their encouragement, suggestions, and aid in my research. I thank my colleagues Jim Keller, Hichem Frigui, Peter Torrione, and Dominic Ho for their collaboration on a variety of research projects. I thank Russell Harmon of the Army Research Office (ARO), Richard Weaver and Mark Locke of the Night Vision and Electronic Sensors Directorate (NVESD), and Waymond Scott of Georgia Tech for their support of my research.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
1 INTRODUCTION
    Wideband Electromagnetic Induction Data and Analysis
    Statement of Problem
    Overview of Research

2 LITERATURE REVIEW
    Induced Field of a Homogeneous Sphere
    Spectral Models
        Bell-Miller Models
            First-order Cole-Cole model
            Nonlinear least squares optimization
            Bishay's circle fitting method
            Xiang's direct inversion method
            Bayesian inversion
        Discrete Spectrum of Relaxation Frequencies
        Kth-Order Cole-Cole

3 TECHNICAL APPROACH
    Argand Diagrams of WEMI Responses
    Prototype Angle Matching
    Gradient Angle Model Algorithm
        Stability of Lookup Table Step in GRANMA
        Classifier Design
    Gradient Angles in Parts Algorithm
        Filter Design
        Classifier Design
    Dielectric Relaxation Estimation Using Sparse Models
        Nonnegative Least Squares Optimization
        Convex Optimization with Sparsity Constraint
            Quadratic programming
            Linear programming
    Joint Sparse Estimation of Dielectric Relaxations
        Gradient Newton Methods
        Lp,q Regularized Optimization
        Iterative Reweighted Optimization
        L1 Optimization
        Dictionary Learning Using Adaptive Kernels
        Double Dictionary Search

4 RESULTS
    Comparison on Synthetic Data
        Experiment 1: Data from first-order model
        Observations
        Experiment 2: Comparison of joint sparse methods on data from second-order model
        Experiment 3: Comparison of Cole-Cole and DSRF dictionaries
        Experiment 4: Dictionary update
        Experiment 5: Double dictionary search, proof of concept
        Experiment 6: Dictionary correlation analysis
    Comparison on Landmine Data
        Data Sets
        Data Filtering
        Experiment 6: Classification of objects using first-order model
            Experiment setup
            Experiments and observations
            PRAM analysis
            GRANMA analysis
            Bishay method and Xiang inversion analysis
        Feature Selection
        Observations
        Experiment 7: Analysis of GRANPA on landmine data
        Experiment 8: Analysis of dictionary methods on landmine data

5 CONCLUSIONS
LIST OF REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

4-1 Comparison of convergence statistics: Part (1)
4-2 Comparison of convergence statistics: Part (2)
4-3 Nomenclature and proportion

LIST OF FIGURES

2-1 Pictorial representation of the data collection setup
2-2 Geometry terms g_n(r, d_T, d_S, R_T, R_S) for n from 1 to 4
2-3 Shape terms χ_n(kr) for n from 1 to 4
2-4 Effect of varying τ and c
2-5 Real vs. imaginary parts of data from a first-order Cole-Cole model
2-6 Effect of pole spread on DSRF coefficients
3-1 Argand plots of WEMI response
3-2 Angle plots of WEMI response
3-3 Plot of gradient angle θ_m in degrees
3-4 A) ∂θ_m/∂τ and B) ∂θ_m/∂c of the proposed model
3-5 Argand diagram of a low-metal anti-personnel mine in the near field
3-7 Estimated relaxations: A) λ = 10^-5, B) λ = 10^-4, and C) λ = 10^-3.4
4-1 Real vs. imaginary part of the induced response of a sample
4-2 Fitting error vs. signal energy for different parameter estimation methods
4-3 Actual vs. estimated parameter values for the noise-free case
4-4 Actual vs. estimated parameter values for noise level 10^-4
4-5 Histogram image of number of terms vs. fitting error
4-6 Convergence diagnostics for dictionary search and update
4-8 Two nonzero rows of the weight matrix W^T
4-9 Correlation regions of a Cole-Cole dictionary region
4-10 Downtrack filter template
4-11 Plot showing the relationship between τ and E_L for different mine types
4-12 τ and c for different object types
4-13 ROC curves of different landmine detection algorithms
4-14 Change in Argand diagram with distance
4-15 Change in amplitude with distance
4-18 Parameter values estimated using joint sparse DSRF along downtrack
4-19 Relaxations estimated using a Cole-Cole dictionary
4-20 Usage of dictionary elements by the landmine data
4-21 Entropy of estimated relaxations for 28 different mine types

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

FAST PHYSICS-BASED METHODS FOR WIDEBAND ELECTROMAGNETIC INDUCTION DATA ANALYSIS

By Ganesan Ramachandran
May 2010
Chair: Paul Gader
Major: Electrical and Computer Engineering

Three methods of object recognition using wideband electromagnetic induction data are described. A fourth method, which extends an existing algorithm to extract features using a dictionary-based search, was developed to analyze objects. Emphasis was placed on speed of execution, as our interest is in real-time performance.

Wideband electromagnetic induction data may span a wide range of frequencies, from a few Hz to a few hundred thousand Hz. In addition to the object, the data usually carry information about the sensor geometry, the orientation of the data collection setup, and the medium in which the object lies.

The first method, the Prototype Angle Matching (PRAM) algorithm, takes a nonparametric approach using the gradient angle between the real and imaginary components of the data. It classifies using distance from prototypes whose gradient angles have been measured. It is fast and makes no assumptions about the data.

The second method, the Gradient Angle Model Algorithm (GRANMA), is based on a novel analytical derivation of the gradient angle using a first-order Cole-Cole model. The analytical derivation reduces the number of parameters from four to two, enabling a fast lookup approach to nearest-neighbor classification schemes. Furthermore, the other two parameters can easily be estimated from the first two. The method is demonstrated to be much faster and more robust than existing methods.
The third method, the Gradient Angles in Parts Algorithm (GRANPA), uses a piecewise Cole-Cole modeling approach to estimate the parameters of higher-order models. It identifies, in an automated way, the frequency segments in the data that follow the Cole-Cole model and then uses the same setup as GRANMA to extract the parameters.

The fourth set of methods, collectively referred to as SPARse DIelectric Relaxation estimation (SPARDIR), uses a model that generalizes the Discrete Spectrum of Relaxation Frequencies (DSRF) model. The SPARDIR algorithms assume that the data are formed by a weighted combination of Cole-Cole models and use a gradient Newton framework to search for parameters. A variety of combinations of L1 and L2 norm-based objective functions and constraints are investigated to seek sparse, physically meaningful parameter estimates. Furthermore, SPARDIR algorithms are devised that perform joint sparse estimation of parameters over a set of measurements, and these are compared to SPARDIR and DSRF algorithms that perform pointwise sparse estimation.

Classification and parameter estimation results are given on sets of real, measured data as well as synthetic data sets for which the true parameter values are known. In these experiments, GRANMA performed better, with greater robustness and higher classification rates, than several existing algorithms, and it is faster than all but one. The Joint SPARDIR algorithms estimated the true underlying model parameters more accurately, and for more general models, than previous work. In addition, the Joint SPARDIR algorithms are general; they are not specific to sensor type.

CHAPTER 1
INTRODUCTION

Wideband Electromagnetic Induction Data and Analysis

Wideband electromagnetic induction (WEMI) sensors have been used in a wide variety of applications, ranging from finding mineral ore deposits [1] to monitoring blood glucose levels [2].
They are mainly divided into time domain and frequency domain sensors based on the domain of interpretation. Time domain sensors operate on the principle that when a time-varying electrical current from a primary transmitter is injected through a dielectric medium, part of the electrical power is stored in the medium. When the current is stopped, the stored energy dissipates. This energy dissipation creates a secondary electric field, which can be measured by a receiver. The rate at which the energy dissipates depends on the dielectric properties of the medium. This phenomenon is known as induced polarization (IP) [3].

Frequency domain sensors use a discrete number of sinusoidal signals. The secondary, or induced, field differs from the primary field in amplitude and phase. This can be represented as the result of a change in the complex impedance of the sensor [4]. By having the primary field span a wide range of frequencies, we can obtain a complex spectral signature of the object under investigation. The main appeal of electromagnetic induction is that different materials have different spectral signatures that can be used to detect and identify them. Different materials also have different bandwidths, i.e., frequencies at which they produce their maximum induction response.

There exists a multitude of electromagnetic induction sensors, more commonly known as metal detectors. Some commercial handheld metal detectors use a small range of frequencies and serve applications such as treasure hunting or airport security. Their designs vary in the bandwidth of interest for the minerals they are tuned for. Most of them use the energy in the secondary field for detection and seldom have the capability to discriminate between different materials within the same bandwidth. Wideband EMI sensors, as their name indicates, use a wide range of frequencies to identify different liquids or solids.
Their frequencies of operation can range from a few hundred Hz to a few kHz. They are used in a variety of applications such as mineral prospecting, biological tissue analysis, and landmine detection.

Statement of Problem

Parametric models offer a framework for the characterization and classification of different objects in a physically interpretable way. Most parametric models for wideband EMI data are based on simplifications of Maxwell's equations to characterize the secondary field, but they often suffer from one or more of the following assumptions:

- The object to be identified is suspended in vacuum.
- The object comprises a single homogeneous material.
- The object has a simple geometric shape (sphere, cylinder, etc.).
- The surrounding medium is magnetically transparent.
- Non-targets, or clutter, can be modeled.

Even though most parametric models make such assumptions, they are still highly nonlinear, and parameter estimation is a nontrivial problem in most cases. Moreover, most algorithms use iterative methods that require good knowledge of search ranges and starting values.

We need models that fit the observed data with reasonable accuracy and whose estimated parameters are consistent for identical objects, so as to enable classification. We also need parameter estimation methods that are fast enough to be employed in practical applications.

Overview of Research

This research involved the development and analysis of both parametric and nonparametric models to detect, analyze, and identify different metallic objects using frequency domain wideband EMI sensors. The primary objective was to develop fast, physics-based algorithms for landmine detection and to extend them to other applications. We first developed models to characterize spectral shapes at a given location and then modeled the variation in the parameters with respect to the distance and orientation of the sensor relative to the object.
The first method uses only the shape information in the spectral signature within a prototype matching framework. It is nonparametric and hence makes no assumptions about the data distribution. It provides a fast anomaly detection framework for landmine detection.

The second method uses a novel gradient angle approach to solving the first-order Cole-Cole equation. The Cole-Cole equation has been used for over half a century in the characterization of the dielectric properties of different minerals. The Gradient Angle Model Algorithm (GRANMA) uses a fast lookup table method to avoid local optima and is able to model most signatures with acceptable accuracy.

The third method, the Gradient Angles in Parts Algorithm (GRANPA), models more complicated spectral shapes by extending the GRANMA algorithm to a piecewise modeling approach. It uses the same framework as the GRANMA algorithm and hence is much faster than existing methods.

The fourth method, SPARse DIelectric Relaxation estimation (SPARDIR), uses a dictionary of dielectric relaxations to analyze the underlying dielectric properties of the object. It separates the properties of the object from those of the sensor setup.

CHAPTER 2
LITERATURE REVIEW

The following is a review of current literature pertinent to modeling and processing WEMI data. Most methods attempt to solve one or both of two problems: some try to characterize the spectral shape of the observed data at a given location with respect to the object, and others try to characterize the spatial pattern of energy around the object at a given frequency. Unified methods that can characterize both shape and spatial energy pattern are rare, except for a few basic shapes such as the sphere and the cylinder. First, therefore, the analytical solution for a homogeneous sphere in the field of a coaxial coil is presented to discuss the different factors that influence the induced response. Next, more complicated models are reviewed.
Finally, a brief review of WEMI models and parameter estimation methods in the context of unexploded ordnance (UXO) detection is given.

Induced Field of a Homogeneous Sphere

Figure 2-1. Pictorial representation of the data collection setup. The transmit coil (radius R_T, at distance d_T) and the receive coil (radius R_S, at distance d_S) are coaxial with the sphere of radius r.

The simplest and most used analytical model in the WEMI literature is that of a homogeneous sphere in the field of a coaxial coil [4]. Let r, σ, and μ denote the radius, conductivity, and permeability of the sphere, respectively. Let R_T and R_S denote the radii of the transmit and receive coils, located at distances d_T and d_S, respectively, from the sphere's center. Let P_n^1(x) denote the associated Legendre polynomials and J_n(x) the modified Bessel functions of the first kind. Then the response V(ω) of the sphere is given by

    V(ω) = (jωμ_0 π I R_T² R_S² / 2) Σ_{n=1}^{∞} g_n(r, d_T, d_S, R_T, R_S) χ_n(kr),    (2-1)

where

    g_n(r, d_T, d_S, R_T, R_S) = [ r^(2n+1) / (2n(n+1)) ] P_n^1( d_T/[d_T² + R_T²]^(1/2) ) P_n^1( d_S/[d_S² + R_S²]^(1/2) ) / ( [d_T² + R_T²]^((n+2)/2) [d_S² + R_S²]^((n+2)/2) ),

known as the geometry term, and

    χ_n(kr) = I_n(ω) + jQ_n(ω) = (2n/(n+1)) [ μ_0 kr J_(n-1/2)(kr) − (μ(n+1) + μ_0 n) J_(n+1/2)(kr) ] / [ n(μ − μ_0) J_(n+1/2)(kr) + μ_0 kr J_(n-1/2)(kr) ],

known as the shape term, with k² = jωμσ, where μ_0 = 4π × 10^-7 is the permeability of free space and δ is the skin depth of the material, defined as δ = (2/(ωμσ))^(1/2).

The geometry term g_n(r, d_T, d_S, R_T, R_S) controls the relative importance of the different order terms and also the magnitude of the response. The shape term χ_n(kr) defines the shape of the spectral response. The first-order term (n = 1) is known as the dipole response. Most often only the first-order approximation is used, as the higher-order terms fall off quickly for large distances and small spheres. In the example shown in Figure 2-2 and Figure 2-3, for the case with R_S = R_T = 0.3 m and r = 0.05 m that is typical of landmine detection sensors, only the n = 1 term needs to be accounted for.

Figure 2-2. Geometry terms g_n(r, d_T, d_S, R_T, R_S) for n from 1 to 4. The terms are shown for d_T = d_S in the range of 0 to 0.3 m, R_T = R_S = 0.2 m, and r = 0.05 m. The n > 1 terms are negligible compared to the n = 1 term.

Figure 2-3.
Shape terms χ_n(kr) for n from 1 to 4. The terms are shown for σ = 5.8 × 10^7 S/m, with A) μ = μ_0 and B) μ = 100μ_0. The x and y axes represent the real and imaginary parts, respectively.

Since most of the objects of interest are neither spheres nor homogeneous, a complete mathematical analysis is not possible. Therefore, there have been a number of attempts to model the induced response using parametric models, some of which are discussed in the following section.

Spectral Models

One of the most common approaches in the study of the wideband EMI response of objects is to model the spectral shape. Most shape modeling approaches either make assumptions on the shape and use a parametric model, or use a set of basis functions to represent it nonparametrically.

Bell-Miller Models

Miller et al. [5] showed that for most objects, the analytical solution for the homogeneous sphere can be approximated by the first-order term. For n = 1, the Bessel function terms simplify to hyperbolic sines and cosines, and the shape term in Equation 2-1 becomes

    χ_1(kr) = [ (2μ + μ_0 + μ_0 k²r²) sinh(kr) − (2μ + μ_0) kr cosh(kr) ] / [ (μ − μ_0) kr cosh(kr) + (μ_0 k²r² + μ_0 − μ) sinh(kr) ].    (2-2)

They show that in the case of highly permeable objects (μ >> μ_0), this formula can be simplified into a 3-parameter model to characterize more general but compact shapes as

    χ_1(kr) ≈ a [ s + ((jωτ)^(1/2) − 2) / ((jωτ)^(1/2) + 1) ],    (2-3)

with a, s, and τ known as the amplitude, shift, and relaxation time, respectively. For a compact object, the EMI response can be approximated by the dipole moment m (3x1 vector) induced in the target by the primary field h_0 (3x1 vector) created by the transmitter coil [6]. The approximation neglects higher-order multipole contributions to the response and is valid if the distance from the sensor to the object is large compared to the dimensions of the object. For a harmonic field oscillating at frequency ω,

    m e^(jωt) = V P h_0 e^(jωt),    (2-4)

where V is the volume of the object (a scalar) and P is the magnetic polarizability tensor [7] (a 3x3 matrix for each frequency ω). P fully characterizes the EMI dipole response. The elements P_ij(ω) of P are functions of ω and depend on the object's electrical properties, its shape, and its orientation in the primary field. They are complex numbers corresponding to the frequency-dependent phase shift between the primary and induced fields. Since the induced field is causal, P has the property of being symmetric. This means that it can be diagonalized and represented by its eigenvalues λ_i and eigenvectors u_i as P = U Λ U^T, where U = [u_1, u_2, u_3] and Λ = diag(λ_1, λ_2, λ_3).

At any given location and angle of the measurement setup with respect to the object, what we observe is a linear combination of the eigenvalues. If a_i(d) denotes the strength of the induced field at distance d along eigendirection i, then the response of a compact object according to the 3-parameter Bell-Miller model is given by

    z(ω, d) = I(ω, d) + jQ(ω, d) = Σ_{i=1}^{3} a_i(d) λ_i(ω) = Σ_{i=1}^{3} a_i(d) [ s_i + ((jωτ_i)^(1/2) − 2) / ((jωτ_i)^(1/2) + 1) ].    (2-5)

The model assumes that the value λ_i remains the same along a given eigendirection within the influence of the object. For a sphere, all the eigenvalues are equal and the response is identical in any direction.

The rest of this document deals with the influence of orientation and distance separately. The initial work assumes that the primary and secondary coils are equidistant from the object and that the object's response can be approximated by its dipole approximation, unless mentioned otherwise.

By replacing the exponent 1/2 with c, the above model can be used to characterize noncompact shapes. The 4-parameter model is given by

    z_4p(ω) = I(ω) + jQ(ω) = a [ s + ((jωτ)^c − 2) / ((jωτ)^c + 1) ].    (2-6)

The factor c controls the width of the quadrature part and hence the bandwidth of the material. The bandwidth in this context represents the range of frequencies where the object under study has a significant imaginary part.

First-order Cole-Cole model

Cole et al. [8] proposed a four-parameter model in the context of mineral prospecting. Denoting the WEMI responses at infinite and zero frequency by z_∞ and z_0, respectively, it is given as

    z(ω) = z_∞ + (z_0 − z_∞) / (1 + (jωτ)^c).    (2-7)

Since both the first-order Cole-Cole model and the four-parameter Bell-Miller model have the same number of parameters and are of similar form, they warrant further analysis. From Equation 2-6, we can write the zero and infinite frequency responses of the four-parameter Bell-Miller model as

    z(0) = a(s − 2)  and  z(∞) = a(s + 1).    (2-8)

With simple rearranging of terms, the amplitude and shift can be shown to be

    a = (z_∞ − z_0)/3  and  s = (z_0 + 2z_∞)/(z_∞ − z_0).

Substituting these expressions for a and s, the four-parameter model becomes

    z(ω) = z_∞ + (z_0 − z_∞)/(1 + (jωτ)^c),

which is the first-order Cole-Cole model. Hence, the 4-parameter model proposed by Miller et al. reduces to the first-order Cole-Cole model [8].

Figure 2-4. Effect of varying τ and c.

Figure 2-4 shows the effect of varying τ and c on the shape of the curve in the in-phase vs. quadrature-phase plot, also known as the Argand diagram. As τ goes outside a certain range of values, only part of the shape is visible in the Argand diagram due to the finite bandwidth of operation. Therefore, the range of τ that can be estimated is limited by the bandwidth of the sensor.

In most unexploded ordnance (UXO) objects, there are soft metal rings near the tail of the projectile, known as driving bands, that allow the projectile to be fired from a gun. This can be modeled by extending the 4-parameter model into a 5-parameter model as

    z_5p(ω) = I(ω) + jQ(ω) = a [ s + ((jωτ)^c − 2)/((jωτ)^c + 1) ] + b [ ((jωτ_Loop)^(c_Loop) − 2)/((jωτ_Loop)^(c_Loop) + 1) ],    (2-9)

where τ_Loop = 10^1.295 and c_Loop = 0.943 are fixed empirically. This can be shown to be a special case of the second-order Cole-Cole model.

While the Bell-Miller models provide a framework for analyzing electromagnetic induction spectra, they do not provide a fast way of estimating the parameters. There have been many attempts at finding exact or approximate solutions to the first-order Cole-Cole model, some of which are described in the following sections.

Nonlinear least squares optimization

The most straightforward, albeit computationally intensive, way to estimate the parameters is an iterative least squares curve fitting approach. If ω_1, ω_2, ..., ω_{N_f} are the frequencies at which measurements are taken, then the parameters are given by

    {â, ŝ, τ̂, ĉ} = argmin_{a,s,τ,c} Σ_{k=1}^{N_f} | z(ω_k) − a [ s + ((jω_k τ)^c − 2)/((jω_k τ)^c + 1) ] |²,    (2-10)

with the constraint that the parameters are real. This is usually achieved with an iterative least squares curve fitting algorithm. However, it requires a good initial estimate to converge to the global, or at least a good, optimum. In practice, this requires the algorithm to be run multiple times with different random initial conditions to find the optimal solution. The ways of selecting the initial conditions and the number of epochs required are usually heuristic. It is also common practice to first divide the data by its total norm and then rescale the estimated amplitude at the end.

Bishay's circle fitting method

Claim 2-1: If the spectral signature follows the Cole-Cole model, then the plot of I(ω) vs. Q(ω) is a segment of a circle.

Proof: Assume that the points from a first-order Cole-Cole model indeed form a segment of a circle. Let the center of the circle be at (a_c, b_c); the points from the model intersect the x axis at (z_0, 0) and (z_∞, 0). From Figure 2-5, the I(ω) coordinate of the center lies where Q(ω) reaches its maximum.

Figure 2-5. Real vs. imaginary parts of data from a first-order Cole-Cole model.

Taking the derivative of Q(ω) with respect to ω and ignoring the cases ω = 0, ω = ∞, and c = 0, the derivative is zero, and hence the maximum occurs, at ω = 1/τ. Denoting the maximum by Q_max, we get

    Q_max = Q(ω = 1/τ) = (3a/2) tan(cπ/4).

The x coordinate of the center is given by a_c = I(ω = 1/τ) = a(s − 1/2). From the figure, the radius is given by r = b_c + Q_max. Using the constraint (I(ω) − a_c)² + (Q(ω) + b_c)² = r², we get (I(ω) − a(s − 1/2))² + (Q(ω) + r − Q_max)² = r². Substituting for I(ω), Q(ω), and Q_max and solving for r, we get

    r = (3a/2) cosec(cπ/2)  and  b_c = (3a/2) cot(cπ/2).

Therefore, the first-order Cole-Cole equation can be written as

    (I(ω) − a(s − 1/2))² + (Q(ω) + (3a/2) cot(cπ/2))² = ((3a/2) cosec(cπ/2))².    (2-11)

The cases where c > 1 correspond to the segment being greater than a semicircle, and vice versa. If N_f denotes the number of frequencies at which measurements are made, then in cases where z(0) and z(∞) can be measured, the values of c and τ can be estimated as [9]

    c = (2/(π N_f)) Σ_{k=1}^{N_f} [ tan^-1( Q(ω_k)/(I(ω_k) − z(0)) ) + tan^-1( Q(ω_k)/(z(∞) − I(ω_k)) ) ]    (2-12)

and

    τ = (1/N_f) Σ_{k=1}^{N_f} (1/ω_k) [ ( (I(ω_k) − z(0))² + Q(ω_k)² ) / ( (I(ω_k) − z(∞))² + Q(ω_k)² ) ]^(1/(2c)).    (2-13)

Since in most practical applications z(0) and z(∞) cannot be directly measured, they have to be estimated. If the measured data include the maximum of Q(ω) at frequency ω_m, then 2 I(ω_m) ≈ z(0) + z(∞). Since z(0) and z(∞) are the I(ω)-axis intercepts, they are given by z(0) = a_c − (r² − b_c²)^(1/2) and z(∞) = a_c + (r² − b_c²)^(1/2). The value of b_c is positive if c > 1 and negative if c < 1. The values of a_c, b_c, and r are found by a least squares search with the constraint (I(ω) − a_c)² + (Q(ω) − b_c)² = r². This approach works well for spectral shapes that fall on a circle but is not robust against noise and deviations from the assumed shape. Also, in cases where z(ω) has flat regions, r → ∞, which makes the estimate unstable.

Xiang's direct inversion method

Pelton et al. [10] formulated the Cole-Cole model in terms of induced polarization (IP) parameters in the form

    z(ω) = z_0 [ 1 − m ( 1 − 1/(1 + (jωτ)^c) ) ],  with  m = 1 − z_∞/z_0.    (2-14)

The Xiang inversion method [11] converts the parameter estimation into a one-dimensional problem by elimination of variables to estimate the parameters of the Pelton model, and hence is more computationally efficient, but it restricts the values of m and c to lie between 0 and 1. This proves to be a handicap in modeling certain low-metal mines, as shown in the experiments.

Bayesian inversion

Since direct inversion of the first-order Cole-Cole model may lead to suboptimal solutions, a Bayesian approach [1] can be used to find the global optimum. z(0) and τ are characterized using Jeffreys priors or, alternately, log10(z(0)) and log10(τ) by uniform probability densities. Also, by making a change of variable from m to m' using m' = log10(m/(1 − m)), m' can be represented by a uniform pdf. Finally, characterizing c by a uniform pdf, the Bayesian a posteriori probability distribution is given by

    p(θ | z) ∝ p(θ) exp[ −0.5 (z − z_r(θ))^T C_dd^-1 (z − z_r(θ)) ],    (2-15)

where p(θ) ∝ 1/(z(0) τ m(1 − m)) is the a priori probability density of the model parameters implied by the choices above, z_r(θ) is the reconstructed data, and C_dd is the NxN data covariance matrix. The individual parameters are found from their respective marginal pdfs. Though this method can produce accurate estimates for the parameters with the right choice of priors, it is computationally intensive, as the posteriors do not have closed-form solutions and must be estimated by an n-dimensional grid search.

Discrete Spectrum of Relaxation Frequencies

The EMI response of a target can be modeled as a sum of real exponentials, which in the frequency domain can be represented as poles in the spectrum [12]:

    z(ω) = c_0 + Σ_{k=1}^{K} c_k / (1 + jω/ζ_k),  with  ζ_k ≥ 0 and c_k ≥ 0.    (2-16)

The coefficients c_k are estimated by searching over the relaxation frequencies ζ_k = 1/τ_k in the region from 10^-2 to 10^2. Since the c_k are restricted to be real and positive, the linear system implied by Equation 2-16 is solved using nonnegative least squares on the real and imaginary components in parallel.
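Once a grid of candidate relaxation frequencies is fixed, the DSRF fit just described reduces to a linear nonnegative least squares problem. The following Python sketch illustrates the idea on a synthetic single-pole response; the grid spacing, frequency range, and all names are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np
from scipy.optimize import nnls

def dsrf_design_matrix(omega, zeta):
    # Each column is one relaxation term 1/(1 + j*omega/zeta_k);
    # a leading column of ones models the constant offset c0.
    A = 1.0 / (1.0 + 1j * omega[:, None] / zeta[None, :])
    A = np.hstack([np.ones((omega.size, 1)), A])
    # Stack real and imaginary parts so a single real-valued NNLS
    # fits both components simultaneously with shared weights.
    return np.vstack([A.real, A.imag])

# Synthetic measurement: a constant offset plus one Debye pole at
# zeta = 5 (a first-order Cole-Cole response with c = 1).
omega = np.logspace(-2, 2, 60)               # angular frequencies
z = 2.0 + 3.0 / (1.0 + 1j * omega / 5.0)

zeta_grid = np.logspace(-2, 2, 81)           # candidate relaxation frequencies
A = dsrf_design_matrix(omega, zeta_grid)
b = np.concatenate([z.real, z.imag])

coeffs, residual = nnls(A, b)                # c_k >= 0 enforced by NNLS
c0, c = coeffs[0], coeffs[1:]
peak = zeta_grid[np.argmax(c)]
print(f"c0 ~ {c0:.2f}, dominant relaxation near zeta ~ {peak:.2f}")
```

Stacking the real and imaginary parts into one real-valued system is what enforces a single nonnegative weight vector across both components, i.e., the "in parallel" solve described above; the estimated weights concentrate on grid points nearest the true pole.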
The approxim ation neglects higher order multipole contri butions to the response, and is valid if the PAGE 18 18 distance from the sensor to the object is large compared to the dimensions of the object. For a harmonic field oscillating at frequency m ej t = V Ph0 ej t (24) where V is the volume of the object (scalar) and P is the magnetic polarizability tensor [7] (3x3 m atrix for each frequency ). P fully characterizes the EMI dipole response. The elements Pij() of P are a function of and depend on the objects electri cal properties, shape and on its orientation in the primary field. They are complex numbers corresponding to the frequencydependent phase shift between primary and induced field. Since the induced field is causal, P has the property of being symmetric. This means that it can be diagonalized and be represented by its Eigen values i and Eigen vectors u i as P = U UT, where U = [u 1, u 2, u 3] and = 3 2 100 00 00 At any given location and angle of the measurement setup with respect to the object, wh at we observe is a linear comb ination of the Eigen values. If ai( d ) denotes the strength of th e induced field at distance d along the Eigen direction i then the response of a compact object according to the 3parameter BellMiller model is given by, 1 2 )()(),(),(),(2/1 2/13i i ii p i i ij j dsdadjd d Q Iz (25) The model assumes that the value i remains the same along a given Eigen direction around the influence of the object. For a sphere, all the Eige n values are equal and th e response is identical in any direction. The rest of this document deals with the infl uence of orientation and distance separately. The initial work assumes that the primary and s econdary coils are equidistant from the object and PAGE 19 19 the objects response can be approximated by its dipole approximation unless mentioned otherwise. By replacing the exponent with instead of the above model can be used to characterize noncompact shapes. 
The 4parameter model is given by, 1 2 )()()(4 j j sa jpQIz (26) The factor controls the width of th e quadrature part and hence controls the bandwidth of the material. The bandwidth in this context represents the range of frequencies where the object under study has a significant imaginary part. First order ColeCole model Col e et al. [8], proposed a four parameter model in the context of m i neral prospecting. Denoting the WEMI response at infinite and zero frequencie s respectively as z and z0, it is given as, j zz z 1 )(0z (27) Since both the first order ColeCole model and the four para meter BellMiller model have the same number of parameters and are of sim ilar form, they warrant further analysis. From Equation 26, we can write the ze ro and infinite frequency respons es of the four parameter BellMiller model as, 1)( 2)0( sa sa z z (28) With simple rearranging of terms, the amplitude and shift can be shown to be, 30zz a and PAGE 20 20 0 02 zz zz s Substituting the notations for a and s, the fourparameter model becomes, Cole.Coleorderfirst 1 )(0 j zz z z Hence, the 4parameter model proposed by Miller et al. defaults to 1st order ColeCole model [8]. Figure 24. Effect of varying and Figure 24 shows the effect of varying and in the shape of the cu rve in the Inphase vs. Quadraturephase plot, also known as the Argand diagram. As goes outside a boundary of values, only part of the shape is visible in th e Argand diagram due to the finite bandwidth of operation. Therefore the range of is limited by the bandwidth of the sensor. In most unexploded ordnance (UXO) objects, there are soft metal rings near the tail of the projectile, known as driving bands to facilitate being fired by a gun. This can be modeled by extending the 4parameter model into a 5parameter model as, PAGE 21 21 1 2 1 2 )()()(5 Loop Loop pj j b j j sa jQIz (29) where Loop = 101.295 and cLoop = 0.943 are fixed empirically. This can be shown as a special case of 2nd order ColeCole model. 
While BellMiller models provide a framework for analyzing the electromagnetic induction spectra, they dont provide a fast way of estimating the parameters. There have been many attempts at finding exact or approximate solu tions to the first order ColeCole model, some of which are described in the following sections. Nonlinear least squares optimization The most straightforward albeit computat ionally intensive method to estimate the parameters is to use iterative least squares curve fitting approach. If 1, 2 N f are the frequencies at which measurements are taken, then the parameters are given by, fN k k k k csaj j sa csa1 2 ,,,1 2 )(minarg} { z (210) with the constraint that the parameters are real This is usually achieved by using an iterative least squares curve fitting algorithm. However, it requires a good initial estimate for converging to the global or at least a good optimum. In pract ice, this requires the algorithm to be run multiple times with different random initial conditi ons to find the optimal solution. The ways of selecting the initial conditions and the number of epochs required ar e usually heuristic. It is also a common practice to first divide the data by its total norm and then finally rescale the estimated amplitude. Bishays circle fitting method Claim 21: I f the spectral signature follows the ColeCole model, then the plot of I() vs. Q() will be a segment of a circle. PAGE 22 22 Proof: Let us assume that the points from a first order ColeCole model indeed form a segment of a circle. Let the cen ter of the circle be at ( ac,bc), then the points from the model intersect the x axis at (z0,0) and (z,0) From Figure 25, the I () coordinate of the center lies where Q() reaches its maximum. Taking the derivative Q() of with respect to Figure 25. Real vs. imaginary parts of da ta from a first order ColeCole model. At the maximum of Q(), the derivative is zero. 
Ignoring the cases ω = 0, ω = ∞ and c = 0, the maximum occurs at ω = 1/τ. Denoting the maximum by Q_max, we get

Q_max = Q(1/τ) = (3a/2) tan(cπ/4).

The I(ω) coordinate of the center is a_c = I(ω = 1/τ) = a(s − 1/2). From the figure, the radius satisfies r = Q_max − b_c. Using the constraint (I(ω) − a_c)² + (Q(ω) − b_c)² = r², substituting for I(ω), Q(ω) and Q_max and solving for r, we get

r = (3a/2) cosec(cπ/2)  and  b_c = −(3a/2) cot(cπ/2).

Therefore the first order Cole-Cole equation can be written as

( I(ω) − a(s − 1/2) )² + ( Q(ω) + (3a/2) cot(cπ/2) )² = ( (3a/2) cosec(cπ/2) )²    (2-11)

The cases where c > 1 correspond to the segment being greater than a semicircle, and vice versa. If N_f denotes the number of frequencies at which measurements are made, then in cases where z(0) and z(∞) can be measured, the values of c and τ can be estimated as [9]

ĉ = (2 / (π N_f)) Σ_{k=1}^{N_f} [ tan⁻¹( Q(ω_k) / (I(ω_k) − z(0)) ) + tan⁻¹( Q(ω_k) / (z(∞) − I(ω_k)) ) ]    (2-12)

and

τ̂ = (1/N_f) Σ_{k=1}^{N_f} (1/ω_k) [ ( (I(ω_k) − z(0))² + Q(ω_k)² ) / ( (z(∞) − I(ω_k))² + Q(ω_k)² ) ]^{1/(2ĉ)}    (2-13)

Since in most practical applications z(0) and z(∞) cannot be measured directly, they have to be estimated. If the measured data includes the maximum of Q(ω) at frequency ω_m, then 2 I(ω_m) = z(0) + z(∞). Since z(0) and z(∞) are the I(ω)-axis intercepts, they are given by z(0) = a_c − (r² − b_c²)^{1/2} and z(∞) = a_c + (r² − b_c²)^{1/2}. The value of b_c is positive if c > 1 and negative if c < 1. The values of a_c, b_c and r are found by a least squares search under the constraint (I(ω) − a_c)² + (Q(ω) − b_c)² = r². This approach works well for spectral shapes that fall on a circle but is not robust against noise and deviations from the assumed shape. Also, in cases where z(ω) has flat regions, r → ∞, which makes the estimate unstable.

Xiang's direct inversion method
Pelton et al. [10] formulated the Cole-Cole model in terms of induced polarization (IP) parameters in the form

z(ω) = z_0 [ 1 − m ( 1 − 1 / (1 + (jωτ)^c) ) ],  with  m = 1 − z_∞ / z_0.
(2-14)

The Xiang inversion method [11] converts the parameter estimation into a one-dimensional problem by elimination of variables to estimate the parameters of the Pelton model, and hence is more computationally efficient, but it restricts the values of m and c to lie between 0 and 1. This proves to be a handicap in modeling certain low metal mines, as shown in the experiments.

Bayesian inversion
Since direct inversion of the first order Cole-Cole model may lead to suboptimal solutions, a Bayesian approach [1] can be used to find the global optimum. z(0) and τ are characterized using Jeffreys priors or, equivalently, log10(z(0)) and log10(τ) by uniform probability densities. Also, by the change of variable from m to m' = log10( m / (1 − m) ), m can be represented by a uniform pdf. Finally, characterizing c by a uniform pdf, the Bayesian a posteriori probability distribution of the parameters Ω = (z(0), m, τ, c) is

p(Ω | z) ∝ p(Ω) exp[ −0.5 (z − z_r(Ω))^T C_dd^{−1} (z − z_r(Ω)) ]    (2-15)

where p(Ω) represents the a priori probability density of the model parameters, z_r(Ω) the reconstructed data, and C_dd the N×N data covariance matrix. The individual parameters are found from their respective marginal pdfs. Though this method can produce accurate estimates for the parameters with the right choice of priors, it is computationally intensive, as the posteriors do not have closed form solutions and hence need to be estimated by an n-dimensional grid search.

Discrete Spectrum of Relaxation Frequencies
The EMI response of a target can be modeled as a sum of real exponentials which, in the frequency domain, can be represented as poles in the spectrum [12]. They are represented as

z(ω) = c_0 + Σ_{k=1}^{K} c_k / (1 + jω/ζ_k),  with ζ_k ≥ 0 and c_k ≥ 0.    (2-16)

The values c_k are estimated by searching over a grid of relaxation frequencies ζ_k = 1/τ_k spanning the region from 10⁻² to 10². Since the c_k are restricted to be real and positive, the matrix system shown in Equation 2-17 is solved using non-negative least squares on the real and imaginary components in parallel.
The system is

[ z(ω_1) ]   [ 1  1/(1 + jω_1/ζ_1)  ...  1/(1 + jω_1/ζ_M) ] [ c_0 ]
[ z(ω_2) ] = [ 1  1/(1 + jω_2/ζ_1)  ...  1/(1 + jω_2/ζ_M) ] [ c_1 ]    (2-17)
[   ...  ]   [ ...       ...        ...          ...       ] [ ... ]
[ z(ω_N) ]   [ 1  1/(1 + jω_N/ζ_1)  ...  1/(1 + jω_N/ζ_M) ] [ c_M ]

Figure 2-6. Effect of pole spread on DSRF coefficients. A) Argand diagram of the original and reconstructed signals. B) DSRF pole locations and their strengths. Since DSRF assumes the pole spread to be unity, it splits a single pole into many for low values of c.

If the data is in fact generated from a Cole-Cole model with c < 1, the DSRF splits a single pole into multiple poles to make an accurate fit. In other words, DSRF tries to model the curve in the Argand diagram, which is a segment of a circle smaller than a semicircle, by a weighted sum of semicircles. This makes the characterization and classification of objects difficult.

Kth Order Cole-Cole
In the analysis of the dielectric properties of biological tissues, the spectral shapes are more complicated than those of the first order Cole-Cole model [13]. By generalizing the DSRF, we may be able to model complicated shapes with a smaller number of parameters. This leads to a generalized version of the Cole-Cole equation:

z(ω) = c_0 + Σ_{k=1}^{K} c_k / (1 + (jωτ_k)^{α_k}),  with c_k ≥ 0.    (2-18)

This form of Kth order Cole-Cole model with K = 4 is widely used in breast tumor detection and other wideband EMI applications in medicine and biology. Direct inversion of a Kth order Cole-Cole model is mathematically intractable. There exist, however, a few methods that attempt to find approximate solutions. If the order K is known, then an iterative least squares method can be used to find the c_k's and τ_k's. The finite difference time domain method is also widely employed by the biological tissue analysis community.
Both these methods, however, require a good estimate of the initial conditions and are computationally inefficient.

CHAPTER 3
TECHNICAL APPROACH

The focus of this research was on developing tools for detecting, analyzing and identifying objects from sets of spectral measurements obtained from a wideband EMI sensor. Three approaches based on the gradient angle between the real and imaginary components of the data, namely Prototype Angle Matching (PRAM), the Gradient Angle Model Algorithm (GRANMA) and the Gradient Angle in Parts Algorithm (GRANPA), were developed, each building on the previous one and in order of increasing complexity of the spectral signatures they handle. GRANMA and GRANPA express the data in terms of its dielectric properties by extracting features related to the underlying physical phenomena. PRAM and GRANMA identify objects based on the distance to one or more nearest prototypes. PRAM is a fast prototype matching method based on gradient angles calculated numerically from the spectral signatures. GRANMA uses the gradient of an analytical model and reduces the number of parameters to be estimated simultaneously from four to two; it significantly reduces the complexity of the search using a two-stage approach. GRANPA uses piecewise curve fitting to model more complicated spectra; it provides unique solutions for modeling near field effects and is faster than existing methods. A fourth method, SPARse DIelectric Relaxation estimation (SPARDIR), was introduced; it looks for a jointly sparse solution to provide the most compact representation of the data, and it separates the underlying physical phenomena from the sensor setup. Two solutions to this problem, SPARDIR2 and SPARDIR1, were developed, based on minimizing quadratic and absolute error respectively. Although both methods provide sparse solutions and approximate the data well, the first is faster whereas the second provides solutions closer to the true underlying physics.
In the following sections, we briefly review the Argand diagram and explain the four methods: PRAM, GRANMA, GRANPA and SPARDIR.

Argand Diagrams of WEMI Responses
Argand diagrams are among the most widely used tools for studying complex-valued data. They are also known as Nyquist plots in the context of control theory and Cole-Cole plots in electrochemistry. An Argand diagram is a plot of the in-phase component I(ω) against the quadrature-phase component Q(ω). Figure 3-1 shows that it provides a consistent framework for identifying specific mine types: mines with similar metal content have similar signatures (which may not be true in the case of Ground Penetrating Radar). This property can be exploited in landmine detection and classification.

Figure 3-1. Argand plots of the WEMI response for A) a boxed AP mine and B) a circular AT mine at different locations and depths. Different points on each curve represent the WEMI response at different frequencies.

Figure 3-1 also shows that, though the shapes are similar, there is variability in the amplitude and in the shift of the in-phase component between different candidates of the same mine type. The slope between the points in the Argand diagram is a powerful tool for discrimination. Figure 3-2 shows that the variability between candidates is reduced, as the gradient angle is independent of amplitude and shift, two of the most difficult and unreliable values to estimate (Fails, et al., 2007). Here the angles are calculated numerically using a two-sided gradient,

∇I(ω) = I(ω + Δ) − I(ω − Δ)  and  ∇Q(ω) = Q(ω + Δ) − Q(ω − Δ),

where Δ is the spacing to the adjacent measured frequency. The gradient angle is defined as

θ_m(ω) = atan2( ∇Q(ω), ∇I(ω) ).    (3-1)

Using the MATLAB definition of atan2, the equation for the gradient angle is

θ_m(ω) = 2 tan⁻¹( ∇Q(ω) / ( ( ∇I(ω)² + ∇Q(ω)² )^{1/2} + ∇I(ω) ) ).    (3-2)

Prototype Angle Matching
The Prototype Angle Matching method is a nearest prototype classifier. The prototype of each object class is defined to be the pointwise median.
More precisely, if T_i represents the set of indices of all candidates of object type i, then the corresponding set of gradient angles is M_i(ω) = { θ_k(ω) : k ∈ T_i } and the prototype is

P_i(ω) = median_{k ∈ T_i} { θ_k(ω) }    (3-3)

If z is a test vector, then the confidence that z is of the target class t is defined by

conf(z) = 1 / ( 1 + e^{ β ( d_t(z) − γ ) } )    (3-4)

where d_t(z) = min_i d(z, P_i) is the distance between z and the nearest prototype of the target class, and β and γ are the rate and bias parameters of the logistic function, to be estimated by cross validation [15].

Figure 3-2. Angle plots of the WEMI response for A) a circular AT mine and B) a boxed AP mine at different locations and depths.

Gradient Angle Model Algorithm
Parameters of the Cole-Cole model have been used successfully [14] to model and classify metallic objects. The most widely used parameter estimation methods rely on nonlinear optimization, which is slow due to the presence of local minima. The simpler methods are faster but not robust against noise and deviations from the model. In this section it is shown that, by analytically differentiating the Cole-Cole model, a fast lookup algorithm can be formulated. This is one of the contributions of the present research. Separating the I(ω) and Q(ω) components using MATHEMATICA (verified numerically and with the MATLAB symbolic math toolbox), the four-parameter model becomes

I(ω) = a( s + 1 − 3 (1 + (ωτ)^c cos(cπ/2)) / (1 + 2 (ωτ)^c cos(cπ/2) + (ωτ)^{2c}) )    (3-5)

and

Q(ω) = 3 a (ωτ)^c sin(cπ/2) / (1 + 2 (ωτ)^c cos(cπ/2) + (ωτ)^{2c})    (3-6)

Using the same framework as Equation 3-2, the equation for the gradient angle of the four-parameter model (assuming a > 0) is

θ_m(ω) = 2 tan⁻¹( tan(cπ/4) (1 − (ωτ)^c) / (1 + (ωτ)^c) )    (3-7)

The range of θ_m(ω) is given by evaluating it at the extrema of ω:

θ_m(0) = cπ/2  and  θ_m(∞) = −cπ/2.

These equations show that the angle varies between −cπ/2 and cπ/2, which is adequate to model all angles when c ∈ [0, 2]. They also show that the three-parameter Bell-Miller model (Miller, et al., 2001), in which c is restricted to equal 0.5, can only characterize angles between −π/4 and π/4.
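The closed-form gradient angle of Equation 3-7 can be checked against the numerically differenced angle of Equations 3-1 and 3-2. The sketch below uses illustrative parameter values and a dense frequency grid (NumPy; not the original MATLAB implementation):

```python
import numpy as np

a, s, tau, c = 1.0, 2.0, 1e-4, 0.8                    # hypothetical parameters
w = np.logspace(-2, 9, 4001)                           # dense grid of angular frequencies
x = (w * tau) ** c
z = a * (s + 1 - 3.0 / (1 + x * np.exp(1j * np.pi * c / 2)))   # Cole-Cole form of Eq. 2-6

# closed-form gradient angle (Equation 3-7)
theta = 2 * np.arctan(np.tan(np.pi * c / 4) * (1 - x) / (1 + x))

# numerical gradient angle via two-sided differences (Equations 3-1 and 3-2)
dI = np.real(z)[2:] - np.real(z)[:-2]
dQ = np.imag(z)[2:] - np.imag(z)[:-2]
theta_num = np.arctan2(dQ, dI)

assert np.max(np.abs(theta_num - theta[1:-1])) < 1e-2  # closed form matches numerics
assert abs(theta[0] - c * np.pi / 2) < 1e-3            # low-frequency limit c*pi/2
assert abs(theta[-1] + c * np.pi / 2) < 1e-3           # high-frequency limit -c*pi/2
```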
Analysis of empirical data shows the latter range to be inadequate. Since the equation for the gradient angle involves only two variables, a lookup table was created for θ_m(ω) over a range of values of τ and c. The elements of the table are denoted Θ[ω, τ, c]. Values of τ and c were found by calculating the best match between θ_m(ω) and the table values using the least mean absolute error

E(τ, c) = Σ_ω | θ_m(ω) − Θ[ω, τ, c] |.    (3-8)

The lookup table error E_L is defined as follows:

E_L = E(τ̂, ĉ),  where (τ̂, ĉ) = argmin_{τ,c} E(τ, c).    (3-9)

The amplitude a was found by substituting the estimates τ̂ and ĉ into Equation 3-6 and finding the maximum likelihood estimate

â = (1/N_f) Σ_{k=1}^{N_f} Q(ω_k) / Im( −3 / (1 + (jω_k τ̂)^{ĉ}) )    (3-10)

where τ̂ and ĉ are the lookup table estimates and N_f is the number of measured frequencies. Finally, the shift s was found by substituting the estimates of â, τ̂ and ĉ into Equation 3-5:

ŝ = (1/N_f) Σ_{k=1}^{N_f} [ I(ω_k)/â − 1 − Re( −3 / (1 + (jω_k τ̂)^{ĉ}) ) ].    (3-11)

The goodness of fit for this method was measured as the normalized error between the actual data and the fit:

E_F = Σ_{k=1}^{N_f} | z(ω_k) − â( ŝ + ((jω_k τ̂)^{ĉ} − 2)/(1 + (jω_k τ̂)^{ĉ}) ) |² / Σ_{k=1}^{N_f} | z(ω_k) |².    (3-12)

Stability of Lookup Table Step in GRANMA
All the parameter estimates depend on the accuracy of the gradient and on the resolution of the table. Therefore it was first necessary to see how a small error in the (τ, c) parameter estimate, due to the finite size of the lookup table, affects the angle estimate. Figure 3-3 shows the dependence of θ_m on τ and c for a fixed ω. It shows two regions of interest, one near c = 0 and another near c = 2. Figure 3-4 shows that near regions where ωτ = 1 and c is close to 2 there is a large change in θ_m for small changes in τ and c, due to the fact that the denominator 1 + (jωτ)^c in the Cole-Cole equation becomes zero there. Therefore regions too close to c = 2 were avoided.

Figure 3-3. Plot of the gradient angle θ_m in degrees vs. τ and c. It shows a discontinuity in θ_m when c = 2 and ωτ = 1.

Figure 3-4. A) ∂θ_m/∂τ and B) ∂θ_m/∂c of the proposed model, computed numerically, shown at 330 Hz.
Figure 3-4 shows the pattern that, near regions where ωτ ≈ 1 and c ≈ 2, there is a big difference in the angles for a small change in τ or c.

Next, it was necessary to analyze the robustness of the estimated parameters with respect to noise in the data and in the gradient estimates. Since the noise in the parameters is nonlinearly linked to the noise in the data, it is difficult to derive expressions linking the two. Therefore, the stability of the method with respect to noise in the data was analyzed in the context of classifier design.

Classifier Design
The classification was done on six features using a soft K nearest neighbor method. The features are â, tan⁻¹(ŝ), log10(τ̂), ĉ, E_L and E_F, defined in Equations 3-9 to 3-12. The value of K was chosen as the one that gave minimum variance without sacrificing performance. The confidence value for object i is

conf(i) = Σ_{j=1}^{K} n_j^i / ( Σ_{j=1}^{K} n_j^i + Σ_{j=1}^{K} m_j^i )    (3-13)

where n_j^i is the distance to the jth nearest non-mine neighbor and m_j^i is the distance to the jth nearest mine neighbor. The contribution here is the estimation of the Cole-Cole parameters, not the classifier design; any suitable classifier could be used.

Gradient Angles in Parts Algorithm
The first order Cole-Cole model together with GRANMA is useful for modeling objects that show a single relaxation in their spectral signatures. This, however, may not be enough to model objects that have more complicated spectral shapes. The Kth order Cole-Cole model has found many applications, especially in electromagnetic studies in biology [13]. However, there are no direct inversion methods available to estimate its parameters, and existing methods are too slow to be used in a real-time environment like landmine detection [13].
If the order K is known and the relaxations express themselves in frequency bands with little overlap, then a piecewise curve fitting approach can be used to estimate the Kth order Cole-Cole parameters.

Figure 3-5. Argand diagram of a low metal anti-personnel mine in the near field. It shows that a single pole model is not enough to capture the information contained. The red and magenta circles represent the outputs of two first order Cole-Cole models and the black asterisks represent their sum.

Figure 3-5 shows an example of a spectrum with two relaxations that dominate in different frequency bands. Therefore, if the frequency bands are known, we can characterize each band individually using a single Cole-Cole model and estimate the parameters using GRANMA:

z(ω) = h_1(ω) z_1(ω) + (1 − h_1(ω)) z_2(ω)    (3-14)

where h_1(ω) is a low pass filter with frequency response

h_1(ω) = 1 for ω ∈ [0, ω_0),  h_1(ω) = 0 for ω ≥ ω_0,

with ω_0 as the cutoff frequency. This approach can be extended to the general Kth order case as a bank of bandpass filters:

z(ω) = Σ_{k=1}^{K} h_k(ω) z_k(ω),  with Σ_{k=1}^{K} h_k(ω) = 1 for all ω.    (3-15)

Now the problem of estimating a Kth order Cole-Cole model simplifies to finding the bandpass filters so that the individual bands can each be modeled using a first order Cole-Cole model.

Filter Design
Figure 3-6. A) Argand diagram of a low metal anti-personnel mine in the near field. B) Gradient angle at different frequencies. C) Slope of the gradient angle.

There are many ways of finding the optimal cutoff frequency in the two-filter case. One of the easiest is to search exhaustively over the frequencies of interest (usually around the middle frequency) for the least lookup table error. Another method is to find the location of the gradient angle minimum around the center frequency, as shown in Figure 3-6 B. A third option is to choose the frequency where the slope of the gradient angle crosses from negative to positive around the center frequency, as shown in Figure 3-6 C.
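The third option can be sketched as follows. The helper name, band restriction and synthetic test curve are hypothetical, chosen only to illustrate the rule, and are not part of the fielded system:

```python
import numpy as np

def cutoff_by_zero_crossing(freqs, theta, band=(0.2, 0.8)):
    """Return the frequency where the slope of the gradient-angle curve
    crosses from negative to positive, searching the middle band only.
    freqs is assumed log-spaced; theta is sampled at freqs."""
    slope = np.gradient(theta, np.log10(freqs))    # slope vs. log-frequency
    lo, hi = int(len(freqs) * band[0]), int(len(freqs) * band[1])
    for k in range(lo, hi):
        if slope[k] < 0 <= slope[k + 1]:           # negative-to-positive crossing
            return freqs[k + 1]
    return None                                     # no crossing in the band

# usage on a synthetic angle curve with a minimum near 1 kHz:
freqs = np.logspace(1, 5, 401)
theta = (np.log10(freqs) - 3.0) ** 2
cut = cutoff_by_zero_crossing(freqs, theta)
```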
We found the zero crossing to be a reliable estimate of the cutoff frequency for landmine data, though experimental results demonstrated that a less heuristic method should be pursued. One merit should be mentioned here: GRANPA provides a simple and fast way of estimating near field effects.

Classifier Design
A soft kNN classifier similar to the one used in the GRANMA case could be used. Relevance Vector Machines (RVMs) also show promise in the case of GRANMA features and could be used here. This method is applicable only when the order K is known and the relaxations are bandwidth limited.

Dielectric Relaxation Estimation Using Sparse Models
We can use a framework similar to that used by DSRF to model such curves. By extending the matrix in Equation 2-17 to include different values of τ and c, with L values of the exponent for each of M values of τ, we obtain one dictionary column per (τ, c) pair:

z(ω_n) = Σ_i c_i Φ_{n,i},  with Φ_{n,i} = 1 / (1 + (jω_n τ_m)^{α_l}),  l = 1 + mod(i − 1, L),  m = 1 + Integer[(i − 1)/L].    (3-16)

The values of c_k can be found using many different approaches, some of which are described in the following sections. Since the c_k's are restricted to be real, whereas Φ and z are complex, the optimization is done over the stacked real and imaginary components:

[ Re(z) ; Im(z) ] = [ Re(Φ) ; Im(Φ) ] c.    (3-17)

Nonnegative Least Squares Optimization
This uses the same approach as DSRF. It has the advantage of being able to estimate the c_k's and the order K simultaneously. However, it almost always overestimates K and needs to be modified with a sparsity constraint on the c_k's, as described in the next method.

Convex Optimization with Sparsity Constraint
Both DSRF and SPARDIR can be written in matrix form as z = Φc, with Φ being the dictionary of dielectric relaxations of interest.
Each column of Φ corresponds to a realization of a first order Cole-Cole or Debye model, normalized to have unit L2 norm. In general, the dimension of z is smaller than the number of columns in Φ, and the columns themselves are correlated. This leads to an overcomplete problem with multiple solutions. However, if we assume that only a few c_k's are nonzero, then we are essentially looking for the smallest order Cole-Cole or Debye model that fits our data. This transforms the problem of estimating the dielectric relaxations into the constrained optimization problem

(P0) Minimize ‖c‖_0 subject to z = Φc    (3-18)

with ‖c‖_0 being the L0 norm, i.e. the number of nonzero elements in c. (Even though the L0 norm does not satisfy the requirements for a norm, it is still referred to as one.) The constrained optimization problem in Equation 3-18 is NP hard [16], and there is no direct method of optimizing with the L0 norm constraint. However, there are three alternate approaches to solving such problems, which are desirable because they can converge to the solution of P0 under certain conditions. Letting u be the parameter controlling the trade-off between the fitting error and the sparsity of the final solution:

(P1) L1 constrained optimization: minimize u ‖z − Φc‖_2² + (1 − u) Σ_{k=1}^{L} |c_k|.    (3-19)

(P2) Lp constrained optimization: minimize u ‖z − Φc‖_2² + (1 − u) Σ_{k=1}^{L} |c_k|^p, with 0 < p < 1.    (3-20)

(P3) Iterative reweighted optimization: minimize u ‖z − Φc‖_2² + (1 − u) Σ_{k=1}^{L} |c_k| / ( |c_k^{(n−1)}|^{1−p} + ε ), with 0 < p < 1 and ε ≥ 0.    (3-21)

There are multiple ways of solving the above three optimization problems; among the best known are quadratic programming, linear programming, projected gradient and gradient descent.
The uniqueness of the solutions to problems P1-P3 depends on the correlation between the columns of Φ. Let μ = max_{i,j ∈ [1,L], i≠j} | φ_i^T φ_j | denote the maximum correlation between any two columns of the dictionary. If the number of nonzero coefficients, or model order, K of the true L0 solution satisfies

K = ‖c‖_0 < 0.5 (1 + 1/μ)    (3-22)

then the L1 norm based solution of P1 is the same as the L0 norm based solution of Equation 3-18 [16]. The Lp norm based solution of P2 can equal the L0 norm based solution of P0 under more relaxed conditions. In general, the conditions depend on the Restricted Isometry Property (RIP) of the matrix Φ [18], which states that to recover K-sparse solutions the matrix has to satisfy

(1 − δ_K) ‖z‖_2² ≤ ‖Φz‖_2² ≤ (1 + δ_K) ‖z‖_2²  for all K-sparse z,    (3-23)

with δ_K < 1. For the L1 norm solution to reach the L0 norm solution, the condition is δ_{2K} < √2 − 1. For an Lp norm solution to give a unique L0 solution, the condition relaxes as p decreases [18]: when p is close to zero, the admissible δ_K is close to 1, so the interval [1 − δ_K, 1 + δ_K] is larger, yielding more relaxed conditions. This makes Lp optimization give better solutions than p = 1. In other words, as p decreases, the condition for reaching the L0 solution becomes easier to satisfy. However, when p < 1 the problem is no longer convex, and hence achieving a global optimum depends on the algorithm and its initial conditions.

Quadratic programming
The optimization problems shown in Equations 3-19 and 3-21 can be solved by quadratic programming, which minimizes the function 0.5 x^T A x + b^T x with respect to x subject to x ≥ 0. For a simple x ≥ 0 constrained least squares problem, A = Φ^T Φ and b = −Φ^T z. When the L1 penalty is added, the inputs become A = Φ^T Φ and b = −Φ^T z + λ 1, where 1 is a vector of ones of appropriate length and λ is the sparsity weight implied by u. For iterative reweighted optimization with a given p, the input A remains the same while b = −Φ^T z + Λ, with Λ being the vector containing the λ_k's. We used MATLAB's quadprog.m function to implement this method.
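For readers without MATLAB's quadprog, the non-negative problem P1 can equivalently be solved with any bound-constrained smooth solver, since on the feasible set c ≥ 0 the L1 term is linear. The following SciPy sketch uses a small hypothetical Debye dictionary and a 2-sparse synthetic signal (illustrative values only, not the dissertation's data):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical dictionary of Debye relaxations at five candidate frequencies.
w = 2 * np.pi * np.logspace(2, 6, 30)                 # measurement angular frequencies
zetas = np.logspace(2, 6, 5)                          # candidate relaxation frequencies
Phi_c = 1.0 / (1 + 1j * w[:, None] / zetas[None, :])
Phi = np.vstack([Phi_c.real, Phi_c.imag])             # stacked real/imag rows (Eq. 3-17)

c_true = np.array([0.0, 1.0, 0.0, 0.5, 0.0])          # 2-sparse ground truth
z = Phi @ c_true

u = 0.99                                              # fit-vs-sparsity trade-off of P1
def obj(c):                                           # on the box c >= 0, |c| sums to c.sum()
    r = Phi @ c - z
    return u * (r @ r) + (1 - u) * c.sum()
def grad(c):
    return 2 * u * Phi.T @ (Phi @ c - z) + (1 - u) * np.ones_like(c)

res = minimize(obj, np.ones(5), jac=grad, method="L-BFGS-B",
               bounds=[(0, None)] * 5)                # enforce c >= 0
c_hat = res.x
```

With a small sparsity weight the fit dominates, so the recovered vector stays close to the true 2-sparse coefficients while remaining non-negative.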
This iterative schedule is summarized in the following pseudocode:

Q_IR
1. A ← Φ^T Φ
2. λ_k ← 1 for all k
3. Scale z for unit norm and positive sign
4. b ← −Φ^T z + Λ
5. Iteration ← 1
6. while PreviousObjectiveFunctionValue − ObjectiveFunctionValue > ChangeThreshold
   a. PreviousObjectiveFunctionValue ← ObjectiveFunctionValue
   b. Compute c to minimize 0.5 c^T A c + b^T c subject to c ≥ 0
   c. Update λ_k ← 1/|c_k|
   d. if λ_k > SparsityParameterThreshold then
      i. Remove the kth column of Φ
      ii. Remove λ_k from Λ
      iii. Update A
      end if
   e. Update b ← −Φ^T z + Λ
   f. Update ObjectiveFunctionValue ← 0.5 c^T A c + b^T c
   g. Iteration ← Iteration + 1
7. end while
8. Rescale c with the original scale and sign of z

Linear programming
The linear programming approaches minimize functions of the form f^T c subject to z = Φc and c ≥ 0. For L1 constrained optimization, f is a vector of ones. For iterative reweighted optimization, f = Λ, with the λ_k's updated iteratively. This makes the approach a sequence of linear programming solutions, all with the same region of feasibility. We used MATLAB's linprog.m function to implement this method. The iterative schedule is summarized in the following pseudocode:

L_IR
1. λ_k ← 1 for all k
2. Scale z for unit norm and positive sign
3. f ← Λ
4. Iteration ← 1
5. while PreviousObjectiveFunctionValue − ObjectiveFunctionValue > ChangeThreshold
   a. PreviousObjectiveFunctionValue ← ObjectiveFunctionValue
   b. Compute c to minimize f^T c subject to z = Φc and c ≥ 0
   c. Update λ_k ← 1/|c_k|
   d. if λ_k > SparsityParameterThreshold then
      i. Remove the kth column of Φ
      end if
   e. Update f ← Λ
   f. Update ObjectiveFunctionValue ← f^T c
   g. Iteration ← Iteration + 1
6. end while
7. Rescale c with the original scale and sign of z

There are two common ways to implement linear programming, namely the simplex method and the interior point method; their explanation is beyond the scope of this thesis.

Joint Sparse Estimation of Dielectric Relaxations
All the methods mentioned so far use a single observation to identify objects.
In practical use, multiple measurements are most often taken around an object with varying orientations and distances. If we assume the observed data is a weighted sum of parts, each due to a different component, then multiple measurements around an object should contain the same set of parts, albeit with different sets of weights. For example, if φ_1 and φ_2 are responses due to two sources, then the observations are z_1 = w_1 φ_1 + w_2 φ_2 and z_2 = w_3 φ_1 + w_4 φ_2, with w_1 to w_4 being the appropriate weights. It is therefore preferable to use all the responses related to an object together to get a more stable estimate. The optimization now becomes a joint sparse estimation, with the group of observations coming from the same small subset of the dictionary. If H denotes the Heaviside step function, then the optimization problem for N_O observations is

(JP0) Minimize K_O = Σ_{k=1}^{L} H( max_{j ∈ [1, N_O]} |c_{kj}| ) subject to Z = [z_1 z_2 ... z_{N_O}] = Φ [c_1 c_2 ... c_{N_O}] = ΦC.    (3-24)

Joint sparsity implies that the matrix C is row sparse, i.e. only a few rows have nonzero elements. The problem in Equation 3-24 is NP hard; therefore approximations similar to Equations 3-18 to 3-21 can be extended to the joint case:

(JP1) Lp,q constrained optimization: minimize u ‖Z − ΦC‖_2² + (1 − u) Σ_{k=1}^{L} ( Σ_{j=1}^{N_O} |c_{kj}|^q )^p,    (3-25)

with 0 < p < 1 and q > 1.

(JP2) Iterative reweighted optimization: minimize u ‖Z − ΦC‖_2² + (1 − u) Σ_{k=1}^{L} ( Σ_{j=1}^{N_O} |c_{kj}|^q ) / ( ( Σ_{j=1}^{N_O} |c_{kj}^{(n−1)}|^q )^{1−p} + ε ),    (3-26)

with 0 < p < 1, q ≥ 1 and ε ≥ 0.

(JP3) L1 optimization: minimize u ‖Z − ΦC‖_1 + (1 − u) Σ_{k=1}^{L} Σ_{j=1}^{N_O} |c_{kj}|.    (3-27)

The advantage of such joint sparse optimization over optimizing each observation individually is that we can uniquely estimate a K-sparse solution with K < 0.5 (1/μ + Rank(Z)), as opposed to K < 0.5 (1 + 1/μ) [17]. This enables us to estimate higher order models for a given dictionary, or to design dictionaries with more elements, which in turn produce more accurate answers.
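A minimal NumPy sketch of the iterative reweighting idea applied to JP2 with q = 2, using a hypothetical Debye dictionary and a closed-form ridge solve for each reweighted quadratic step (one standard realization of the reweighting; dimensions and values are illustrative only):

```python
import numpy as np

w = 2 * np.pi * np.logspace(2, 6, 40)
zetas = np.logspace(2, 6, 5)                       # hypothetical candidate relaxations
Phi_c = 1.0 / (1 + 1j * w[:, None] / zetas[None, :])
Phi = np.vstack([Phi_c.real, Phi_c.imag])          # 80 x 5 stacked real dictionary

C_true = np.zeros((5, 3))                          # three observations, shared support
C_true[1] = [1.0, 0.6, -0.4]                       # row-sparse: only rows 1 and 3 active,
C_true[3] = [0.5, -0.2, 0.8]                       # with weights varying per observation
Z = Phi @ C_true

u, eps = 0.99, 1e-8
lam = np.ones(5)                                   # one sparsity weight per dictionary row
for _ in range(30):
    G = np.diag(lam)
    # reweighted quadratic step: C = u (u Phi^T Phi + (1-u) G)^-1 Phi^T Z
    C = u * np.linalg.solve(u * Phi.T @ Phi + (1 - u) * G, Phi.T @ Z)
    lam = 1.0 / ((C ** 2).sum(axis=1) + eps)       # reweight by row energy

row_energy = (C ** 2).sum(axis=1)
```

Rows with little energy receive ever-larger weights and are driven toward zero, so the shared two-row support is recovered across all three observations.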
As in the single observation case, the quadratic programming or linear programming approaches can be used to solve Equation 3-24. The A matrix remains the same, while b^{(j)} = −Φ^T z_j + Λ for the jth observation. The entries of the vector Λ now correspond to sums along the rows of C rather than to the individual elements in each of the columns. This definition of Λ enforces joint sparsity by using the same sparsity parameter λ_k along all columns of a given row of C.

Gradient Newton Methods
The quadratic and linear programming methods work well when all the weights are nonnegative. However, this may not be the case with all WEMI data. For example, in landmine detection using a dipole transmitter antenna and a quadrupole receiver antenna, each of the relaxations can have positive or negative weights. In the case of multiple observations, the magnitude and sign of the weights vary with the relative distance and orientation between the object and the antenna setup. In such cases, a gradient descent approach can be used to solve Equations 3-25 to 3-27. For objective functions of the form uE + (1 − u)R, the Newton update for time step t + 1 is

C_{t+1} = C_t − γ ΔC_t

where γ denotes the step size, and the update ΔC_t is given by

vec(ΔC_t) = [ u ∂²E/∂v² + (1 − u) ∂²R/∂v² ]^{−1} [ u ∂E/∂v + (1 − u) ∂R/∂v ],

with E denoting the quadratic or linear fitting error term, R the regularization term, v the vectorized version of C, and vec(·) the vectorization operator.

Lp,q Regularized Optimization
To solve Equation 3-25, denote the quadratic fitting error and the regularization terms as

E = ‖Z − ΦC‖_2²  and  R = Σ_{k=1}^{L} R_k^p,  with R_k = Σ_{j=1}^{N_O} |c_{kj}|^q.

Taking partial derivatives, and using ∘ to denote elementwise multiplication, for q = 2:

∂E/∂C = 2 Φ^T (ΦC − Z)  and  ∂R/∂C = 2p ( [R_1^{p−1}, ..., R_L^{p−1}]^T [1 ... 1] ) ∘ C.

To derive the Hessian matrix, first we need to vectorize the weight matrix.
Let v = vec(C). Then

∂²E/∂v² = 2 blockdiag( Φ^T Φ, ..., Φ^T Φ ).    (3-28)

If v_k corresponds to the entry c_{i1,j1} of C and v_m to the entry c_{i2,j2}, then the Hessian of the regularization term is

∂²R/(∂v_k ∂v_m) = 4p(p − 1) R_{i1}^{p−2} v_k v_m + 2p R_{i1}^{p−1},  if k = m;
                  4p(p − 1) R_{i1}^{p−2} v_k v_m,                    if k ≠ m and i1 = i2;
                  0,                                                  otherwise.    (3-29)

The Hessian matrix of the gradient descent controls the speed of convergence along each of the weights, and it needs to be positive definite for the solution to converge. The problem with using a Newton approach directly on the Lp,q regularized optimization is that the second order gradient matrix has a p(p − 1) term on the diagonal. Since p < 1, this may lead to negative diagonal entries, which can cause the solution to diverge.

Iterative Reweighted Optimization
We can extend the same treatment to JP2 as we did for JP1. Let

G = diag( λ_1, λ_2, ..., λ_L ).    (3-30)

For q = 2, using a similar approach as in the Lp,q optimization, the equations simplify to

C = u ( u Φ^T Φ + (1 − u) G )^{−1} Φ^T Z.    (3-31)

This setup provides a simple equation that improves the solution by iteratively updating G and C. The iterative schedule is summarized in the following pseudocode:

L2_IR
1. λ_k ← 1 for all k
2. Scale all columns of Z for positive sign
3. Compute G according to Equation 3-30
4. Compute C according to Equation 3-31
5. Iteration ← 1
6. while PreviousObjectiveFunctionValue − ObjectiveFunctionValue > ChangeThreshold
   a. PreviousObjectiveFunctionValue ← ObjectiveFunctionValue
   b. Update λ_k ← 1 / Σ_{j=1}^{N_O} c_{kj}²
   c. if λ_k > SparsityParameterThreshold then
      i. Remove the kth column of Φ
      ii. Remove λ_k from Λ
      iii. Update G according to Equation 3-30
      iv. Update C according to Equation 3-31
      end if
   d. Update ObjectiveFunctionValue according to Equation 3-26
   e. Iteration ← Iteration + 1
7. end while
8. Rescale C with the original sign of Z

L1 Optimization
To solve Equation 3-27, let

E = ‖Z − ΦC‖_1  and  R = Σ_{k=1}^{L} λ_k Σ_{j=1}^{N_O} |c_{kj}|.
Directly applying a second derivative approach is not possible, as the derivative of the L1 norm is the sign, or signum, function, which is discontinuous at 0 and has derivative equal to 0 elsewhere. However, the hyperbolic tangent with a scaling constant β can be used as a smooth approximation to the signum function; as β increases, the approximation becomes better. Using the approximation and taking partial derivatives, we obtain

∂E/∂C = −Φ^T sign(Z − ΦC) ≈ −Φ^T tanh( β (Z − ΦC) )    (3-32)

and

∂R/∂C = Λ [1 ... 1] ∘ sign(C) ≈ Λ [1 ... 1] ∘ tanh(βC),    (3-33)

where Λ [1 ... 1] replicates the column of λ_k's across the N_O columns of C. Using the subscript j to denote entries along the jth column, the update equation can be written column-wise as

C_j^{t+1} = C_j^t − γ [ u H_E^j + (1 − u) H_R^j ]^{−1} ( u ∂E/∂C_j + (1 − u) ∂R/∂C_j )    (3-34)

with H_R^j = G R_j, where G is the diagonal matrix with the λ_k's as its entries, as defined in the previous method. Let

R_j = β diag( sech²(β c_{1,j}), sech²(β c_{2,j}), ..., sech²(β c_{L,j}) ).    (3-35)

Using e_j = (Z − ΦC)_j to denote the jth column of the fitting error matrix, the Hessian matrix for the L1 fitting error is

H_E^j = β Φ^T diag( sech²(β e_{1,j}), sech²(β e_{2,j}), ..., sech²(β e_{N,j}) ) Φ.    (3-36)

This approximation has the property that it bridges the gap between the L1 and L2 methods. When β is near one, the approximation is close to the L2 method, because tanh(x) ≈ x for small values of x; as β increases, the results become closer and closer to the L1 approach. The iterative schedule is summarized in the following pseudocode:

L1_IR
1. λ_k ← 1 for all k
2. β ← 1
3. Scale all columns of Z for positive sign
4. Compute G according to Equation 3-30
5. Compute C according to Equation 3-31
6. Iteration ← 1
7. while PreviousObjectiveFunctionValue − ObjectiveFunctionValue > ChangeThreshold
   a. PreviousObjectiveFunctionValue ← ObjectiveFunctionValue
   b. Compute ∂E/∂C according to Equation 3-32
   c. Compute ∂R/∂C according to Equation 3-33
   d. Compute R_j according to Equation 3-35
   e. Compute the Hessian of the regularization term, H_R^j = G R_j
   f. Compute the Hessian of the absolute error term according to Equation 3-36
   g. Update C according to Equation 3-34
Update λk ← 1 / Σ_{j=1..N_O} |c_{k,j}|
   i. if λk > SparsityParameterThreshold then
      i. Remove kth column of Ψ
      ii. Remove λk from G
      iii. Update G according to Equation 3-30
      iv. Update C according to Equation 3-34
      end if
   j. Update ObjectiveFunctionValue according to Equation 3-27
   k. Iteration ← Iteration + 1
8. end while
9. Rescale C with original sign of Z

Dictionary Learning using Adaptive Kernels

The set of methods discussed so far uses a fixed dictionary. In most parametric modeling approaches, the dictionary is populated using a specific model whose parameters are varied to create dictionary elements. In the present case the model is a first-order Cole-Cole model. The task of creating a dictionary dense enough to represent data of interest, while keeping the correlation between the elements small, is not trivial. For example, the Cole-Cole dictionary for frequencies between 300 Hz and 90000 Hz, equispaced in log scale, can have at most 11 elements for a correlation threshold of 0.9. This is highly restrictive, and the correlation is still too high. One solution to alleviate this problem is to tune the dictionary to the data.

Let \(\psi_k(\omega) = \dfrac{1}{1 + (i\omega\tau_k)^{\alpha_k}}\) denote the kth dielectric relaxation. Extending the same gradient descent approach to the regularized error function defined in Equation 3-26, we get, for the L2 error,

\[ \frac{\partial E}{\partial \log\tau_k} = -2\left(\frac{\partial \boldsymbol{\psi}_k}{\partial \log\tau_k}\right)^{\!T} (\mathbf{Z} - \boldsymbol{\Psi}\mathbf{C})\, \mathbf{c}_k^{T}, \qquad \frac{\partial R}{\partial \log\tau_k} = 0, \tag{3-37} \]

with

\[ \frac{\partial \psi_k}{\partial \log\tau_k} = -\frac{\alpha_k (i\omega\tau_k)^{\alpha_k}}{\big(1 + (i\omega\tau_k)^{\alpha_k}\big)^2}, \tag{3-38} \]

and

\[ \frac{\partial E}{\partial \alpha_k} = -2\left(\frac{\partial \boldsymbol{\psi}_k}{\partial \alpha_k}\right)^{\!T} (\mathbf{Z} - \boldsymbol{\Psi}\mathbf{C})\, \mathbf{c}_k^{T}, \qquad \frac{\partial R}{\partial \alpha_k} = 0, \tag{3-39} \]

with

\[ \frac{\partial \psi_k}{\partial \alpha_k} = -\frac{(i\omega\tau_k)^{\alpha_k} \log(i\omega\tau_k)}{\big(1 + (i\omega\tau_k)^{\alpha_k}\big)^2}. \tag{3-40} \]

Note that since \(\tau\) varies on an exponential scale, the update is done using \(\log_e(\tau)\). The corresponding Hessians follow by differentiating Equations 3-38 and 3-40 once more with respect to \(\log\tau_k\) and \(\alpha_k\). Since the diagonal elements of the Hessian cannot be guaranteed to be positive, only a first-order update is used. Also, since the parameters are restricted to be real, only the real part of the update is used.
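These kernel derivatives can be sanity-checked numerically. The sketch below (Python/NumPy; the original experiments were implemented in MATLAB) evaluates the relaxation kernel and verifies the analytic log-τ derivative of Equation 3-38 against a central finite difference:

```python
import numpy as np

def relaxation(omega, tau, alpha):
    """First-order Cole-Cole dielectric relaxation psi_k(omega)."""
    return 1.0 / (1.0 + (1j * omega * tau) ** alpha)

def d_relaxation_d_logtau(omega, tau, alpha):
    """Analytic derivative of psi_k with respect to log(tau) (Equation 3-38)."""
    z = (1j * omega * tau) ** alpha
    return -alpha * z / (1.0 + z) ** 2

# 21 log-spaced frequencies matching the band used in the experiments.
omega = 2 * np.pi * np.logspace(np.log10(330), np.log10(90300), 21)
tau, alpha = 1e-4, 0.7

# Central finite difference in log(tau) agrees with the analytic form.
h = 1e-6
fd = (relaxation(omega, tau * np.exp(h), alpha)
      - relaxation(omega, tau * np.exp(-h), alpha)) / (2 * h)
assert np.allclose(fd, d_relaxation_d_logtau(omega, tau, alpha), atol=1e-6)
```

The same finite-difference check applies to the α derivative of Equation 3-40.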
Using the same approach as in Equations 3-37 to 3-40, the update equations for the L1 error are given by

\[ \frac{\partial E}{\partial \log\tau_k} = -\left(\frac{\partial \boldsymbol{\psi}_k}{\partial \log\tau_k}\right)^{\!T} \tanh\!\big(\beta(\mathbf{Z} - \boldsymbol{\Psi}\mathbf{C})\big)\, \mathbf{c}_k^{T}, \qquad \frac{\partial R}{\partial \log\tau_k} = 0, \tag{3-41} \]

and

\[ \frac{\partial E}{\partial \alpha_k} = -\left(\frac{\partial \boldsymbol{\psi}_k}{\partial \alpha_k}\right)^{\!T} \tanh\!\big(\beta(\mathbf{Z} - \boldsymbol{\Psi}\mathbf{C})\big)\, \mathbf{c}_k^{T}, \qquad \frac{\partial R}{\partial \alpha_k} = 0. \tag{3-42} \]

The dictionary elements are adapted along with their corresponding weights as described in the following pseudocode.

AKL1_IR
1. λk ← 1 for all k
2. β ← 1
3. Scale all columns of Z for positive sign
4. Compute G according to Equation 3-30
5. Compute C according to Equation 3-31
6. Iteration ← 1
7. while |ObjectiveFunctionValue − PreviousObjectiveFunctionValue| > ChangeThreshold
   a. β ← 1.005 β
   b. PreviousObjectiveFunctionValue ← ObjectiveFunctionValue
   c. Compute ∂E/∂C according to Equation 3-32
   d. Compute ∂R/∂C according to Equation 3-33
   e. Compute Rj according to Equation 3-35
   f. Compute the Hessian of the regularization term, H_R^j = G Rj
   g. Compute the Hessian of the absolute error term according to Equation 3-36
   h. Update C according to Equation 3-34
   i. Update λk ← 1 / Σ_{j=1..N_O} |c_{k,j}|
   j. if λk > SparsityParameterThreshold then
      i. Remove kth column of Ψ
      ii. Remove λk from G
      iii. Update G according to Equation 3-30
      iv. Update C according to Equation 3-34
      v. β ← 1
      end if
   k. Update τk according to Equation 3-41
   l. Update αk according to Equation 3-42
   m. Update ObjectiveFunctionValue according to Equation 3-27
   n. Iteration ← Iteration + 1
8. end while
9. Rescale C with original sign of Z

Double Dictionary Search

In most practical cases the joint sparse estimation is done over observations generated by varying the distance and orientation of the sensor setup with respect to the object. The observations are then related to one another by the response function of the transmitter and the receiver. For example, Figure 3-7 shows the weights of individual relaxations for a high-metal antitank mine along a path known as the downtrack. We can see that the weights themselves follow a pattern that is dependent on the dipole transmitter and quadrupole receiver configuration.
We can create another dictionary of such shapes and constrain the weights to be a linear combination of them as \(\mathbf{C} = (\mathbf{H}\mathbf{W})^T\), with H being the dictionary of downtrack shapes. The matrix W now serves as a row-sparse weight matrix for the dielectric relaxations and picks the downtrack shapes by the column values along such rows. By designing an appropriate dictionary, we can get accurate estimates from a small number of such shape elements. Each column of W is sparse, but not necessarily jointly sparse along multiple rows. The optimization uses

\[ E = \left\| \mathbf{Z} - \boldsymbol{\Psi}\mathbf{W}^T\mathbf{H}^T \right\|_2^2 \quad\text{and the regularization}\quad R_{\text{shapes}} = \sum_{j=1}^{L} \lambda_j \sum_{i=1}^{N} w_{i,j}^2. \tag{3-43} \]

Figure 3-7. Estimated relaxations A) τ = 10⁻⁵, B) τ = 10⁻⁴ and C) τ = 10⁻³·⁴, and their weights for a high-metal antitank mine along a given direction across the object.

The update is done as a two-step process: first C is optimized with respect to the data to be a row-sparse matrix, and then W is optimized to be generally sparse with respect to the estimate of C. The weights are initialized using the regularized least-squares pseudoinverse solution

\[ \mathbf{W} = \left(\mathbf{H}^T\mathbf{H} + \lambda\mathbf{I}\right)^{-1} \mathbf{H}^T \mathbf{Z}^T \boldsymbol{\Psi} \left(\boldsymbol{\Psi}^T\boldsymbol{\Psi} + \lambda\mathbf{I}\right)^{-1}, \]

with \(\lambda\) controlling the ill-posedness of the matrix inverse operation.

CHAPTER 4
RESULTS

We implemented and tested the four algorithms and their variants discussed in the previous chapter. In the following sections we discuss the test results of our proposed methods. First, to illustrate the effectiveness of one method against another, synthetic data sets are created with varying levels of complexity in the spectral shapes. Then, the methods are compared on a set of landmine data collected at two different locations.

Comparison on Synthetic Data

Experiment 1: Data from First Order Model

Figure 4-1. Real vs. imaginary part (Argand diagrams) of the induced response of a sample generated using the Cole-Cole model with increasing levels of additive noise.
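The regularized pseudoinverse initialization of the double-dictionary weights W described earlier can be sketched as follows (Python/NumPy; the dimensions and the value of λ are illustrative, and the random dictionaries stand in for the Cole-Cole and downtrack-shape dictionaries):

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, L, S = 21, 49, 8, 5   # frequencies, observations, relaxations, shapes
Psi = rng.standard_normal((M, L))   # relaxation dictionary (columns = elements)
H = rng.standard_normal((N, S))     # downtrack shape dictionary
W_true = rng.standard_normal((S, L))
Z = Psi @ (H @ W_true).T            # noise-free data, Z = Psi C with C = (H W)^T

lam = 1e-6  # Tikhonov parameter controlling the ill-posedness of the inverses
W0 = (np.linalg.solve(H.T @ H + lam * np.eye(S), H.T)
      @ Z.T @ Psi
      @ np.linalg.inv(Psi.T @ Psi + lam * np.eye(L)))

# With well-conditioned dictionaries and a tiny lam, W0 recovers W_true.
assert np.allclose(W0, W_true, atol=1e-3)
```

In practice λ trades off recovery accuracy against stability when the dictionaries are nearly collinear.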
To compare the different algorithms that estimate the parameters of the first-order Cole-Cole model, 100 instances of data were generated using the first-order Cole-Cole model, with the four parameters A, s, τ and α randomly sampled from uniform distributions over the intervals (10⁻⁸, 1), (−10, 10), (10⁻⁷, 10⁻¹) and [0.1, 1.4] respectively, at 21 frequencies ranging from 330 Hz to 90300 Hz in log scale. These were typical values seen in the landmine detection framework. The experiments were repeated with zero-mean Gaussian noise, whose variance scales with the signal energy, added at each frequency. More precisely, the noisy versions of the signal are given by

\[ \tilde z_k = z_k + \eta_k, \qquad \eta_k \sim \mathcal{N}\!\left(0,\ \frac{\sigma^2 \|\mathbf{z}\|_2^2}{N_f}\right), \quad k = 1, \ldots, N_f, \]

with σ ∈ {10⁻⁴, 10⁻³, 10⁻²}. The effect of additive noise on the signal is shown by the Argand diagrams in Figure 4-1; we see that σ = 10⁻³ was the point where noise started dominating the signal.

Observations

Figure 4-2 shows the fitting error as a function of the signal energy for the different parameter estimation methods. The circle fitting method by Bishay [9] worked well and was the fastest method in the noise-free case; however, its performance degraded quickly in the presence of noise. Figure 4-3 and Figure 4-4 compare the estimated values on the y axis to the true values on the x axis, so a good estimate should lie on the line y = x. Figure 4-3 shows that even for a low fitting error, the estimated parameters may be far from the true underlying parameters. It also shows that, even in the noise-free case, the estimation of the τ parameter was the least reliable. GRANMA produced the most robust results for noise levels up to σ = 10⁻³. When σ = 10⁻², all methods fail most of the time.
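The data-generation procedure above can be sketched as follows (Python/NumPy; the (A, s) parameterization of the first-order response is one common convention and stands in for the exact model of Chapter 3, and the uniform draws follow the intervals quoted above):

```python
import numpy as np

rng = np.random.default_rng(42)

def cole_cole(freq_hz, amp, shift, tau, alpha):
    """First-order Cole-Cole response; (amp, shift) is one common convention."""
    omega = 2 * np.pi * freq_hz
    return amp * (shift + 1.0 / (1.0 + (1j * omega * tau) ** alpha))

freqs = np.logspace(np.log10(330), np.log10(90300), 21)  # 21 log-spaced frequencies

def noisy_sample(sigma):
    """Draw one random parameter set and add energy-relative Gaussian noise."""
    amp = rng.uniform(1e-8, 1.0)      # A in (1e-8, 1)
    shift = rng.uniform(-10.0, 10.0)  # s in (-10, 10)
    tau = rng.uniform(1e-7, 1e-1)     # tau in (1e-7, 1e-1)
    alpha = rng.uniform(0.1, 1.4)     # alpha in [0.1, 1.4]
    z = cole_cole(freqs, amp, shift, tau, alpha)
    # Complex Gaussian noise with per-frequency scale sigma * ||z|| / sqrt(Nf).
    scale = sigma * np.linalg.norm(z) / np.sqrt(len(z))
    noise = scale * (rng.standard_normal(len(z)) + 1j * rng.standard_normal(len(z)))
    return z, z + noise

z, z_noisy = noisy_sample(1e-3)
```

Repeating `noisy_sample` 100 times per σ reproduces the structure of the Experiment 1 data sets.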
The experiments were carried out in MATLAB on a Linux-based server with 10 GB of memory and a quad-core processor, with each core operating at 2 GHz. Bishay's circle fitting method was the fastest, about 10x faster than GRANMA, which was in turn 10x to 20x faster than the iterative least squares method. The Xiang inversion method varied in speed from 2x to 10x slower than GRANMA, depending on the number of iterations it required to converge.

Figure 4-2. Fitting error vs. signal energy for the different Cole-Cole parameter estimation methods at different noise levels.

Figure 4-3. Actual vs. estimated parameter values for the different Cole-Cole parameter estimation methods in the noise-free case.

Figure 4-4. Actual vs. estimated parameter values for the different Cole-Cole parameter estimation methods when σ = 10⁻⁴. GRANMA produced the most robust estimates, while the Bishay and Xiang methods started to degrade in performance.

Experiment 2: Comparison of Joint Sparse Methods on Data from Second Order Model

The methods discussed in the previous experiment assume that the data come from a first-order Cole-Cole model. In most practical applications, the spectral shape of the data follows higher-order models. For example, we observed that most low-metal antipersonnel mines in our data set exhibit behavior best described by a second- or third-order Cole-Cole model. In this experiment, we compared the different dictionary search methods, which simultaneously estimate the order of the model along with its parameters.
The objective functions for the different algorithms seek to produce models that accurately represent a set of observations with a small number of dictionary elements. To test this property, 100 different synthetic data sets were created using the same procedure. A dictionary was created by varying τ and α over the ranges (10⁻⁹, 10) and (0, 1) respectively. The dictionary was pruned so that no two elements had correlation greater than 0.96, and this dictionary was used for all 100 trials. In each trial, a two-step process was used to generate synthetic data. First, two random dictionary elements were selected from an (almost) uniform distribution over the pruned dictionary; they were restricted to have correlation less than 0.3. Then, a data set of 25 spatially sequential data vectors was generated by varying the weights of the two relaxations using two differentiated Gaussian shapes in the downtrack direction. The parameter estimation algorithms were then applied to the data for 50,000 iterations, with high weight given to the sparsity terms of the objective functions.

Figure 4-5. Histogram image of number of terms vs. fitting error for different dictionary search methods. From the top left to the bottom right: A) L2 error with iterative reweighted L2 regularization, B) approximate L1 error with iterative reweighted approximate L1 regularization, C) quadratic programming with iterative reweighted L1 regularization, D) linear programming with iterative reweighted L1 regularization, E) quadratic programming with fixed L1 regularization, F) linear programming with fixed L1 regularization.

To compare the effectiveness of the algorithms, the histograms of fitting errors for different numbers of nonzero terms are plotted in Figure 4-5.
The plots are created by first plotting parametric curves for each trial. Each curve plots the number of nonzero rows of the weight matrix vs. the fitting error, with the iteration index as the parameter. Histograms are then created by overlaying all the parametric curves in one plot and counting how many times each cell is covered. Here we observed that our L1- and L2-based methods outperformed the off-the-shelf quadratic and linear programming methods.

Experiment 3: Comparison of Cole-Cole and DSRF Dictionaries

In this experiment, the effect of choosing the right model to form the dictionary was studied. Two dictionaries were designed, based on the DSRF (α = 1) and Cole-Cole (α not necessarily 1) models respectively. We refer to these two dictionaries as the DSRF and Cole-Cole dictionaries. The maximum correlation between elements was restricted to 0.96, and both had 11 elements each. This experiment had two parts: (1) synthetic data were generated from the Cole-Cole model with α not close to 1, and (2) synthetic data were generated from the DSRF model. In each part, we compare the two dictionaries by analyzing their convergence (using the L1_IR method) to the correct number of parameters and the closeness to the true parameters. As before, each part of the experiment consisted of 100 trials.

In part (1), data were generated for each trial by sampling the number of relaxations from a uniform distribution over {1, 2, 3}. Once the number of relaxations was selected, that number of relaxations was sampled by drawing τ and α uniformly over the intervals (10⁻⁷, 1) and (0.2, 0.8) respectively.
The relaxations were combined using a preset combination of weights to generate 25 data vectors, as in the previous experiment. Gaussian noise with 10% of the energy of the signal was added. Since both the DSRF and Cole-Cole dictionaries used the same number of elements, this experiment compares the effect of offsets in the τ and α spaces respectively. The convergence statistics are given in Table 4-1.

Table 4-1. Comparison of convergence statistics: Part (1).

Dictionary | Converged to true number of relaxations | Converged to the closest relaxations in the parameter space | Converged to relatively closer distance in the parameter space | Performed the same
Cole-Cole  | 47% | 0% | 38% | 53%
DSRF       | 9%  | 0% | 9%  |

The meanings of columns two and three seem clear; the fourth and fifth columns of Table 4-1 require some discussion. The spacing between dictionary elements in the parameter space varies nonuniformly, because the parameter quantization depends on the correlation between the elements. This spacing must be taken into account when calculating the distance between the true and estimated parameters, since no algorithm can overcome the inherent quantization error. More discussion of this is given later. The fourth column measures the percentage of trials for which each algorithm was closer to the true parameter set. The fifth column is the percentage of cases for which both were the same distance from the true parameters. The DSRF dictionary based method converged to the true number of relaxations only 9% of the time, compared to 47% for the Cole-Cole dictionary based method.

In the second part, it was required that α = 1. Data were then generated in each trial with one, two or three relaxations, with τ uniformly sampled over the interval (10⁻⁷, 1) as described in part (1). As can be seen in Table 4-2, both dictionaries produce the correct number of relaxations, but the DSRF errors are smaller more often.
This should be expected, because the quantization along the line α = 1 is coarser for the Cole-Cole dictionary than for the DSRF dictionary.

Table 4-2. Comparison of convergence statistics: Part (2).

Dictionary | Converged to true number of relaxations | Converged to the closest relaxations in the parameter space | Converged to relatively closer distance in the parameter space | Performed the same
Cole-Cole  | 100% | 5%  | 15% | 10%
DSRF       | 100% | 21% | 69% |

Experiment 4: Dictionary Update

The previous experiment assumed that the sample z is truly a linear combination of one or more of the dictionary elements. This, however, is not practical, as the correlation between elements dictates the size of the dictionary and hence its resolution in the parameter space. Therefore, in this experiment the dictionary is adapted to the data using the adaptive kernel matching pursuit approach described in the technical approach. Two relaxations were used for every trial, selected by sampling τ and α uniformly over the intervals (10⁻⁷, 1) and (0, 1) respectively. Figure 4-6(f) is interesting: each different colored curve represents the distance of a particular dictionary element from the nearest true relaxation in the parameter space. At the iteration at which a dictionary element is pruned, the curve ceases, since that element is no longer part of the solution. By the end of the iterations, only two curves remain, each of which has a distance value of almost zero. Thus, the algorithm converges to the true relaxations and only the true relaxations.

Figure 4-6. Convergence diagnostics for dictionary search and update. From the top left to the bottom right, plotted against iteration number: (a) L1 error, (b) iterative reweighted L2 regularization, (c) loss function with error and regularization terms combined, (d) total change in weights, (e) rate parameter of the hyperbolic tangent controlling the L1 approximation, (f) distance between true relaxations and dictionary elements in parameter space.
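The correlation-based pruning used to build the dictionaries in Experiments 2 to 4 can be sketched as follows (Python/NumPy; the greedy scan order and grid densities are illustrative choices, not necessarily those used in the dissertation):

```python
import numpy as np

def build_pruned_dictionary(freq_hz, taus, alphas, max_corr=0.96):
    """Greedily keep unit-norm Cole-Cole relaxations whose correlation with
    every element already kept stays below max_corr."""
    omega = 2 * np.pi * freq_hz
    kept, params = [], []
    for tau in taus:
        for alpha in alphas:
            psi = 1.0 / (1.0 + (1j * omega * tau) ** alpha)
            psi = psi / np.linalg.norm(psi)
            if all(abs(np.vdot(psi, q)) < max_corr for q in kept):
                kept.append(psi)
                params.append((tau, alpha))
    return np.column_stack(kept), params

freqs = np.logspace(np.log10(330), np.log10(90300), 21)
taus = np.logspace(-9, 1, 60)
alphas = np.linspace(0.05, 1.0, 20)
Psi, params = build_pruned_dictionary(freqs, taus, alphas)

# Every pair in the surviving dictionary respects the correlation threshold.
gram = abs(Psi.conj().T @ Psi)
assert np.all(gram - np.eye(gram.shape[0]) < 0.96)
```

Tightening `max_corr` shrinks the dictionary and coarsens its resolution in the (τ, α) space, which is exactly the quantization effect discussed above.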
Experiment 5: Double Dictionary Search

Proof of concept

To prove the concept of using double dictionary search, we generated data with two relaxations taken from a dictionary of 23 elements. A downtrack shape dictionary of 100 Gaussian pulses with different means and variances was designed, with each pulse of duration 49 samples. This led to a total of 49 observations, with their relaxations changing according to two opposing Gaussian-pulse-shaped signals with means at locations 22 and 32 respectively. Zero-mean Gaussian random noise with 10% energy relative to the signal was added to the data to test the robustness of the estimates.

Figure 4-7. Weights of individual relaxations using double dictionary search. The estimated values are noisy versions of the original shape.

Figure 4-8. Two nonzero rows of the weight matrix Wᵀ, showing row sparsity and nonzero values in the columns corresponding to the downtrack shape element.

Figure 4-7 and Figure 4-8 show that the results are accurate. This shows that a dictionary of downtrack shapes for objects of interest can be designed for a given sensor configuration. Since this method is highly specific to data sets with particular sensor configurations, it is reserved for future work.

Experiment 6: Dictionary Correlation Analysis

The effectiveness of any dictionary based search depends on the correlation between its elements, because most dictionary search methods use the L2 norm of the fitting error. Since the data and the elements are normalized to have unit L2 norm, the fitting error becomes dependent only on the correlation. Therefore, in an L2 norm based solution, two elements with high correlation between them are almost indistinguishable.

Figure 4-9. Correlation regions of a Cole-Cole dictionary. The black asterisks represent the DSRF dictionary elements chosen.
Each of the different colors represents a region having 0.96 correlation or more with at least one of the dictionary elements; no dictionary elements can be chosen from these regions. Figure 4-9 shows the regions in the Cole-Cole parameter space that are correlated with the DSRF dictionary elements. This plot demonstrates that the Cole-Cole dictionary must be coarser along the line α = 1, or else no dictionary elements with α ≠ 1 can be used.

Comparison on Landmine Data

Data Sets

The experiments were performed on two data sets. Each data set contains measurements over buried mines, buried clutter objects and blanks. The first set contains 62 different types of objects, including 26 different types of mines, collected over 11 adjoining lanes divided into 220 grid cells. The second set contains 24 different types of objects, including 12 different types of mines, collected over 12 adjoining lanes divided into 225 cells. Table 4-3 shows the distribution of objects in the data. The data were collected in two opposite directions over the same set of grid locations, using a wideband EMI sensor developed by W. Scott [14]. The object is assumed to be at the center of the grid cell, and position errors were fixed manually.

Table 4-3. Nomenclature and proportion.

Notation | Meaning                         | Set 1  | Set 2  | Total
HMAP     | High Metal AntiPersonnel mine   | 10%    | 3.6%   | 6.7%
LMAP     | Low Metal AntiPersonnel mine    | 21.4%  | 8.4%   | 14.8%
HMAT     | High Metal AntiTank mine        | 3.6%   | 1.3%   | 2.4%
LMAT     | Low Metal AntiTank mine         | 15.9%  | 6.2%   | 11%
HMC      | High Metal Clutter              | 15.5%  | 24.9%  | 20.2%
MMC      | Medium Metal Clutter            | 0%     | 12.4%  | 6.3%
LMC      | Low Metal Clutter               | 0%     | 12.4%  | 6.3%
NMC      | NonMetallic Clutter or Blank    | 33.6%  | 30.7%  | 32.1%

In the first data set, each grid cell is of dimension 1.5 m x 1.5 m; in the second, they are 1 m x 1 m. The WEMI sensor collects complex responses at 21 frequencies, at 1 cm intervals, in 3 channels.
This results in a 21 x 3 x 150 or 21 x 3 x 100 complex data matrix for each cell.

Data Filtering

Figure 4-10. Downtrack filter template.

The measured data are filtered in the downtrack direction by convolving with the zero-mean template shown in Figure 4-10. This removes ground response and sensor drift and also improves the signal-to-noise ratio. Filtered data are used in our experiments for classification of objects modeled with a first-order Cole-Cole model. Only the data from the center channel and at the center of the cell are used. This makes the first data set of size 220 grid cells x 2 directions and the second data set of size 225 x 2. Unfiltered data are used in the dictionary based feature extraction methods, as they represent the underlying physical phenomenon better.

Experiment 6: Classification of Objects Using First Order Model

Experiment setup

A 10-fold cross-validation was performed. Care was taken to avoid having the same cell represented in both the training and test sets simultaneously. To get fair representation of all objects in the training data, object stratification was used to create the training and testing folds. This method groups objects by type and splits them equally across all folds.

Experiments and observations

The experiments were carried out for the different algorithms in increasing order of complexity.

PRAM analysis

In this approach, the gradient angles were first computed for each grid cell. Since there are 28 different mines in the data, one prototype per mine type was created from the training data as per Equation 3-2. The nonmines were considered outliers and were not given prototypes. This led to a one-class classification approach, with a test sample belonging to the mine class with confidence given by Equation 3-3.

GRANMA analysis

Figure 4-11. Plot showing the relationship between τ, α and EL for different mine types.
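The downtrack filtering step described under Data Filtering amounts to a 1-D convolution of each frequency channel with a zero-mean template. A minimal sketch (Python/NumPy, using a differenced Gaussian as an illustrative stand-in for the actual template of Figure 4-10):

```python
import numpy as np

def downtrack_filter(data, template):
    """Convolve each frequency channel (rows) with a zero-mean template along
    the downtrack axis (columns); mode='same' keeps one output per observation."""
    template = template - template.mean()  # zero mean: DC (ground/drift) is rejected
    return np.array([np.convolve(row, template, mode="same") for row in data])

# Differenced Gaussian as an illustrative zero-mean template (25 taps).
x = np.arange(25) - 12.0
template = -x * np.exp(-x**2 / 18.0)

# A constant offset per frequency (ground response / sensor drift) is removed
# exactly wherever the template fully overlaps the data.
data = np.ones((21, 150)) * 5.0
filtered = downtrack_filter(data, template)
assert np.allclose(filtered[:, 20:-20], 0.0)
```

Because the template integrates to zero, any slowly varying baseline is strongly attenuated while localized target responses pass through.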
In this approach, the landmine data were assumed to follow first-order Cole-Cole models, and the parameters were extracted using GRANMA. The lookup table estimates of τ and α for these data sets are shown in Figure 4-11. We have already seen from Figure 3-3 and the corresponding section that the error surface and the update gradients are highly nonlinear, especially near regions where α = 1 and 2. However, Figure 4-12 shows that searching for α between 0.1 and 1.4 (in linear scale) and for τ in the range 10⁻⁶ to 1 (in log scale) was sufficient for the data in the present experiments. This justifies the choice of a lookup table method. Future research may be targeted toward alternatives to a lookup table, such as regression and direct inversion.

Figure 4-12. τ and α for different object types.

The features are log₁₀(A), s, α, log₁₀(τ), log₁₀(EL) and EF, making a feature set of size 220 x 2 for the first data set and of size 225 x 2 for the second data set:

\[ \text{Feature vector} = \left[ \frac{\log_{10}(A)}{\log_{10}(A_{\max})},\ \frac{s}{s_{\max}},\ \frac{\alpha}{\alpha_{\max}},\ \frac{\log_{10}(\tau)}{\log_{10}(\tau_{\max})},\ \frac{\log_{10}(E_L)}{\log_{10}(E_{\max})},\ E_F \right], \tag{4-1} \]

where \(A_{\max}\), \(s_{\max}\), \(\alpha_{\max}\), \(\tau_{\max}\) and \(E_{\max}\) are scaling parameters. Figure 4-12 shows that the different object types form their own clusters in the (τ, α) space, whereas nonmetallic clutter (NMC) is spread all over, because there is no specific region in the lookup table that matches NMC.

Bishay method and Xiang inversion analysis

In this approach, the landmine data were assumed to follow a first-order Cole-Cole model, the parameters were extracted using the Xiang inversion and the Bishay method, and the classification was done using the same soft kNN method described for GRANMA.
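The normalized feature construction of Equation 4-1 can be sketched as follows (Python/NumPy; the scaling constants here are illustrative placeholders, not the dissertation's values):

```python
import numpy as np

def granma_features(A, s, alpha, tau, E_L, E_F):
    """Six-element normalized feature vector in the spirit of Equation 4-1.
    The *_max scaling constants below are placeholders for illustration."""
    A_max, s_max, alpha_max, tau_max, E_max = 1e-8, 10.0, 1.4, 1e-6, 1e10
    return np.array([
        np.log10(A) / np.log10(A_max),
        s / s_max,
        alpha / alpha_max,
        np.log10(tau) / np.log10(tau_max),
        np.log10(E_L) / np.log10(E_max),
        E_F,
    ])

f = granma_features(A=1e-4, s=2.0, alpha=0.7, tau=1e-4, E_L=1e3, E_F=0.5)
assert f.shape == (6,)
```

Scaling every entry into a comparable range keeps any single feature from dominating the distance computation in the soft kNN classifier.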
Feature Selection

Since the Xiang inversion and the Bishay method each produced 5 features and GRANMA produced 6 features, each algorithm was evaluated for all of its feature subsets. This makes 2⁶ − 1 = 63 classification experiments for GRANMA and 2⁵ − 1 = 31 experiments each for the Bishay method and the Xiang inversion method. The results are compared using the best ROC curves.

Observations

GRANMA outperformed the other Cole-Cole feature extraction methods in terms of classification accuracy, while the Bishay method was the fastest. Even though all methods estimated the same features and used the same type of classifier, the consistency of the parameter estimates accounted for the difference in performance.

Figure 4-13. ROC curves of different landmine detection algorithms with their corresponding best features. ErrLookup and ErrFit denote the lookup table error and the fitting error respectively, as defined in Equation 3-8 and Equations 3-11.

Experiment 7: Analysis of GRANPA on Landmine Data

Figure 4-14. Change in Argand diagram with distance for a low-metal antipersonnel mine buried at 1.

Figure 4-15. Change in amplitude with distance for a low-metal antipersonnel mine buried at 1. The location nearest to the object is noted by the blue ellipse. The x axis denotes observation number along the downtrack.

Figure 4-14 shows the variation in the spectral shape of unfiltered data as measurements are taken at increasing distance from the object. GRANMA and other first-order approximation methods model the far field well and produce consistent parameter estimates for different instances of the same type of object. However, in the near field, the data look more piecewise first order. Figure 4-15 shows the variation in amplitude with distance.
It shows that the amplitude crosses zero closest to the object, and the observation closest to the zero crossing shows the maximum near-field effect. Therefore GRANPA was used to extract parameters from the observation closest to the zero crossing, and these were used to initialize an iterative search procedure that connects the parameters of the near and far fields. This iterative schedule is summarized in the following pseudocode.

GRANPA_ILS
1. Find zero crossing location d0
2. Choose either the positive or the negative amplitude side
3. Estimate order and parameters using GRANPA, and store them as S0
4. CurrentObservation ← Start of positive amplitude side
5. Sstart ← S0
6. while CurrentObservation < EndOfPositiveAmplitudeSide
   a. CurrentObservation ← Move to next observation
   b. Use iterative least squares with initial conditions Sstart
   c. Sstart ← New solution
7. end while
8. CurrentObservation ← Start of negative amplitude side
9. Sstart ← S0
10. while CurrentObservation > BeginningOfNegativeAmplitudeSide
    a. CurrentObservation ← Move to previous observation
    b. Use iterative least squares with initial conditions Sstart
    c. Sstart ← New solution
11. end while

Figure 4-16. Change in parameter values for a second-order approximation estimated using GRANPA. The x axis denotes observation number along the downtrack. The amplitude zero crossing happens at observation number 90.

Figure 4-16 shows that the parameter values change as the distance from the object increases. While it is expected that the amplitude and the shift can change with distance, the values of τ and α should not, as they are dielectric properties of the object under study.

Experiment 8: Analysis of Dictionary Methods on Landmine Data

Figure 4-17. Parameter values estimated using DSRF along the downtrack.
The x axis denotes distance along the downtrack. The amplitude zero crossing happens at the location 0.8 m from the start. The horizontal lines correspond to parameters estimated using joint sparse optimization.

Figure 4-17 shows the difference between individual and joint sparse optimization using DSRF. Without the joint sparsity constraint, the dictionary elements selected at different locations are free to change.

Figure 4-18. Parameter values estimated using joint sparse DSRF along the downtrack. The x axis denotes observation number along the downtrack; the y axes denote the weight of the corresponding dielectric relaxations. The amplitude zero crossing happens at observation number 35 from the start of the segment.

Figure 4-18 shows DSRF parameters estimated on a segment of the same data using a joint sparsity constraint. This constrains the τ values to be constant along the downtrack, and only the amplitudes of the relaxations change.

Figure 4-19. Relaxations estimated using a Cole-Cole dictionary (left) vs. those estimated using a DSRF dictionary (right). The x axis corresponds to different cell locations for the same mine type.

Figure 4-20. Usage of dictionary elements by the landmine data. We want similar mines to use the same dictionary elements, and hence fewer of them.
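The consistency measure behind the entropy comparison of estimated relaxations, the entropy of dictionary-element usage per mine type, can be computed as follows (Python/NumPy; the usage counts shown are illustrative, not taken from the data):

```python
import numpy as np

def usage_entropy(counts):
    """Shannon entropy (nats) of how often each dictionary element is selected;
    lower entropy means the same few elements are reused consistently."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                  # 0 * log(0) is taken as 0
    return -np.sum(p * np.log(p))

# Illustrative usage counts for one mine type under two dictionaries.
cole_cole_counts = [120, 10, 5, 0, 0, 0]  # concentrated on a few elements
dsrf_counts = [40, 35, 30, 15, 10, 5]     # spread over many elements

assert usage_entropy(cole_cole_counts) < usage_entropy(dsrf_counts)
```

A dictionary that yields the same sparse support for every instance of a mine type produces a peaked usage histogram and hence low entropy.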
Figure 4-19 shows parameters estimated using the joint sparse dictionary search method, with a Cole-Cole model dictionary on the left and the Debye model based dictionary used by DSRF on the right. The number of elements in each dictionary is fixed, so that the comparison is of the model used rather than of the resolution of the dictionary. The x axis shows different ground locations of the same landmine type. Figure 4-20 shows the number of times each dictionary element was selected for the entire landmine data set. Since it is desirable to have not only sparse estimates but also the same estimates under different conditions for the same object, the entropy of the dictionary element usage is shown in Figure 4-21. The Cole-Cole model is not only able to produce a sparser representation than the DSRF, but also a more consistent one.

Figure 4-21. Entropy of estimated relaxations for 28 different mine types.

CHAPTER 5
CONCLUSIONS

We have developed four algorithms to detect, analyze and identify objects using wideband electromagnetic induction data.
The first and simplest method, PRAM, is quick and makes no assumptions about the data. It removes the variability in amplitude and the drifts in the data, and has been found useful in designing a prescreener that marks interesting sections of ground for further analysis. The second method, GRANMA, gives a robust way of estimating the Cole-Cole parameters and provides consistent features that can be used for object classification. The third method, GRANPA, uses the same framework as GRANMA and is a quick way of modeling near-field effects. The fourth method, SPARDIR, separates the sensor setup from the dielectric properties of objects and provides a framework in which to study them.

Future work may be targeted at speeding up GRANMA by searching only a section of the lookup table using the range of the gradient angles. More statistical analysis of the convergence of the developed L1 version of SPARDIR remains to be done. Another promising direction is using SPARDIR to extract features that identify objects.

LIST OF REFERENCES

[1] H. Huang and I. J. Won, "Identification of Mineral Deposits using Airborne Electromagnetic Spectra," Expanded Abstracts, Society of Exploration Geophysics, pp. 5-8, 2002.

[2] M. Gourzi, A. Rouane, R. Guelaz, M. Nadi, M. Alavi, and M. McHugh, "Non-invasive Glycaemia Blood Measurements by Electromagnetic Sensor: Study in Static and Dynamic Blood Circulation," Journal of Medical Engineering and Technology, vol. 27(1), pp. 22-26, 2005.

[3] A. Ghorbani, C. Camerlynck, N. Florsch, P. Cosenza, and A. Revil, "Bayesian Inference of the Cole-Cole Parameters from Time and Frequency-domain Induced Polarization," Geophysical Prospecting, vol. 55(4), p. 589, doi:10.1111/j.1365-2478.2007.00627.x.

[4] C. Bruschini, "A Multidisciplinary Analysis of Frequency Domain Metal Detectors for Humanitarian Demining," Ph.D. thesis, Vrije Universiteit Brussel (VUB), September 2002.

[5] J. Miller, T. H. Bell, J. Soukup, and D.
Keiswetter, "Simple Phenomenological Models for Wideband Frequency-Domain Electromagnetic Induction," IEEE Trans. on Geoscience and Remote Sensing, vol. 39, pp. 1294-1298, June 2001.

[6] T. H. Bell, B. Barrow, J. Miller, and D. Keiswetter, "Time and Frequency Domain Electromagnetic Induction Signatures of Unexploded Ordnance," Subsurface Sensing Technologies and Applications, vol. 2(3), pp. 153-175, 2001.

[7] C. E. Baum, Ed., Detection and Identification of Visually Obscured Targets, New York: Taylor & Francis, 1999.

[8] K. S. Cole and R. H. Cole, "Dispersion and Absorption in Dielectrics, I. Alternating Current Characteristics," J. Chem. Phys., vol. 1, pp. 341-351, 1941.

[9] S. T. Bishay, "Numerical Methods for the Calculation of Cole-Cole Parameters," Egypt. J. Solids, vol. 23, pp. 179-188, 2000.

[10] W. H. Pelton, P. G. Ward, W. R. Sill, and P. H. Nelson, "Mineral Discrimination and Removal of Inductive Coupling with Multifrequency IP," Geophysics, vol. 43, pp. 588-609, 1978.

[11] J. Xiang, N. B. Jones, D. Cheng, and F. S. Schlindwein, "Direct Inversion of the Apparent Complex-resistivity Spectrum," Geophysics, vol. 66, pp. 1399-1404, 2001.

[12] M. Wei, W. R. Scott, and J. H. McClellan, "Robust Estimation of the Discrete Spectrum of Relaxations for Electromagnetic Induction Responses," IEEE Trans. on Geoscience and Remote Sensing, to be published.

[13] S. Gabriel, R. W. Lau, and C. Gabriel, "The Dielectric Properties of Biological Tissue: III. Parametric Models for the Dielectric Spectrum of Tissues," Phys. Med. Biol., vol. 41, pp. 2271-2293, 1996.

[14] E. Fails, P. Torrione, W. Scott, and L. Collins, "Performance of a Four Parameter Model for Landmine Signatures in Frequency Domain Wideband Electromagnetic Induction Detection Systems," Proc. SPIE, Detection and Remediation Technologies for Mines and Minelike Targets XII, R. S. Harmon, J. T. Broach, and J. H. Holloway, Jr., Eds., vol. 6553, 2007.

[15] S. E. Yuksel, G. Ramachandran, P. Gader, J. Wilson, D. Ho, and G.
Heo, "Hierarchical Methods for Landmine Detection with Wideband Electromagnetic Induction and Ground Penetrating Radar Multisensor Systems," Int'l Geoscience and Remote Sensing Symposium (IGARSS), vol. 2, pp. II-177 to II-180, July 2008.

[16] D. L. Donoho and M. Elad, "Optimally Sparse Representation in General (Nonorthogonal) Dictionaries via l1 Minimization," Proc. Nat'l Academy of Sciences of the United States of America, vol. 100(5), pp. 2197-2202, March 2003.

[17] J. Chen and X. Huo, "Theoretical Results on Sparse Representations of Multiple-measurement Vectors," IEEE Trans. on Signal Processing, vol. 54(12), pp. 4634-4643, December 2006.

[18] S. Foucart and M.-J. Lai, "Sparsest Solutions of Underdetermined Linear Systems via lq-Minimization for 0 < q <= 1," Applied and Computational Harmonic Analysis, vol. 26(3), pp. 395-407, 2009.

BIOGRAPHICAL SKETCH

Ganesan Ramachandran received the Bachelor of Engineering degree in Electronics and Communication Engineering from the Regional Engineering College, Tiruchirappalli, Tamil Nadu, India, in May 1999. He received his Master of Science and Doctor of Philosophy degrees in Electrical and Computer Engineering from the University of Florida in August 2002 and May 2010, respectively. Currently, he is a research assistant in the Computational Science and Intelligence Lab in the Computer and Information Science and Engineering Department at the University of Florida. His research includes the development of algorithms, methodologies, and models with a solid mathematical and/or statistical basis, with applications to landmine detection. His current and previous research applies these methods to a variety of data, including electromagnetic induction, radar, weather, and biomedical data.