


PROBABILISTIC ANALYSIS AND RESULTS OF COMBINATORIAL PROBLEMS WITH MILITARY APPLICATIONS

By

DON A. GRUNDEL

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2004

Copyright 2004 by Don A. Grundel

I dedicate this work to Bonnie, Andrew and Erin.

ACKNOWLEDGMENTS

I wish to express my heartfelt thanks to Professor Panos Pardalos for his guidance and support. His extraordinarily energetic personality inspires all those around him. What I appreciate most about Professor Pardalos is that he sets high goals for himself and his students and then tirelessly strives to reach those goals.

I am grateful to the United States Air Force for its financial support and for allowing me to pursue my lifelong goal. Within the Air Force, I owe a debt of gratitude to Dr. David Jeffcoat for his counsel and assistance throughout my PhD efforts. My appreciation also goes to my committee members Stan Uryasev, Joseph Geunes, and William Hager for their time and thoughtful guidance. I would like to thank my collaborators Anthony Okafor, Carlos Oliveira, Pavlo Krokhmal, and Lewis Pasiliao.

Finally, to my family, Bonnie, Andrew and Erin, who have been extremely supportive: I could not have completed this work without their love and understanding.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

1 INTRODUCTION
1.1 Probabilistic Analysis of Combinatorial Problems
1.2 Main Contributions and Organization of the Dissertation

2 SURVEY OF THE MULTIDIMENSIONAL ASSIGNMENT PROBLEM
2.1 Formulations
2.2 Complexity
2.3 Applications
2.3.1 Weapon Target Assignment Problem
2.3.2 Considering Weapon Costs in the Weapon Target Assignment Problem
2.4 Summary

3 CHARACTERISTICS OF THE MEAN OPTIMAL SOLUTION TO THE MAP
3.1 Introduction
3.1.1 Basic Definitions and Results
3.1.2 Motivation
3.1.3 Asymptotic Studies and Results
3.1.4 Chapter Organization
3.2 Mean Optimal Costs for a Special Case of the MAP
3.3 Branch and Bound Algorithm
3.3.1 Procedure
3.3.2 Sorting
3.3.3 Local Search
3.4 Computational Experiments
3.4.1 Experimental Procedures
3.4.2 Mean Optimal Solution Costs
3.4.3 Curve Fitting
3.5 Algorithm Improvement Using Numerical Models
3.5.1 Improvement of B&B
3.5.2 Comparison of B&B Implementations
3.6 Remarks

4 PROOFS OF ASYMPTOTIC CHARACTERISTICS OF THE MAP
4.1 Introduction
4.2 Greedy Algorithms
4.2.1 Greedy Algorithm 1
4.2.2 Greedy Algorithm 2
4.3 Mean Optimal Costs of Exponentially and Uniformly Distributed Random MAPs
4.4 Mean Optimal Costs of Normal-Distributed Random MAPs
4.5 Remarks on Further Research

5 PROBABILISTIC APPROACH TO SOLVING THE MULTISENSOR MULTITARGET TRACKING PROBLEM
5.1 Introduction
5.2 Data Association Formulated as an MAP
5.3 Minimum Subset of Cost Coefficients
5.4 GRASP for a Sparse MAP
5.4.1 GRASP Complexity
5.4.2 Search Tree Data Structure
5.4.3 GRASP vs Sparse GRASP
5.5 Conclusion

6 EXPECTED NUMBER OF LOCAL MINIMA FOR THE MAP
6.1 Introduction
6.2 Some Characteristics of Local Neighborhoods
6.3 Experimentally Determined Number of Local Minima
6.4 Expected Number of Local Minima for n = 2
6.5 Expected Number of Local Minima for n >= 3
6.6 Number of Local Minima Effects on Solution Algorithms
6.6.1 Random Local Search
6.6.2 GRASP
6.6.3 Simulated Annealing
6.6.4 Results
6.7 Conclusions

7 MAP TEST PROBLEM GENERATOR
7.1 Introduction
7.1.1 Test Problem Generators
7.1.2 Test Problem Libraries
7.2 Test Problem Generator
7.2.1 Proposed Algorithm
7.2.2 Proof of Unique Optimum
7.2.3 Complexity
7.3 MAP Test Problem Quality
7.3.1 Distribution of Assignment Costs
7.3.2 Relative Difficulty of Solving Test Problems
7.4 Test Problem Library
7.5 Remarks

8 CONCLUSIONS

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 Mean optimal solution costs obtained from the closed-form equation for MAPs of sizes n = 2, 3 <= d <= 10 and with cost coefficients that are independent exponentially distributed with mean one.
3-2 Number of runs for each experiment with uniform or exponential assignment costs.
3-3 Number of runs for each experiment with standard normal assignment costs.
3-4 Mean optimal costs for different sizes of MAPs with independent assignment costs that are uniform in [0, 1].
3-5 Mean optimal costs for different sizes of MAPs with independent assignment costs that are exponential with mean 1.
3-6 Mean optimal costs for different sizes of MAPs with independent assignment costs that are standard normal.
3-7 Curve-fitting results for fitting the form (An + B)^c to the mean optimal costs for MAPs with uniform assignment costs.
3-8 Curve-fitting results for fitting the form (An + B)^c to the mean optimal costs for MAPs with exponential assignment costs.
3-9 Curve-fitting results for fitting the form A(n + B)^c to the mean optimal costs for MAPs with standard normal assignment costs.
3-10 Estimated and actual mean optimal costs from ten runs for variously sized MAPs developed from different distributions. Included are the average difference and largest difference between estimated mean optimal cost and optimal cost.
3-11 Comparisons between three primal heuristics and the numerical estimate of optimal cost for several problem sizes and types. Shown are the average feasible solution costs from 50 runs of each primal heuristic on random instances.
3-12 Average time to solution in seconds of solving each of five randomly generated problems of various sizes and types. The experiment involved using the B&B solution algorithm with starting upper bounds developed in three different ways.
5-1 Comparisons of the number of cost coefficients of the original MAP to that in A.
5-2 Experimental results comparing solution quality and time to solution for GRASP in solving fully dense and reduced simulated MSMTT problems. Five runs of each algorithm were conducted against each problem.
6-1 Average number of local minima (2-exchange neighborhood) for different sizes of MAPs with independent assignment costs.
6-2 Average number of local minima (3-exchange neighborhood) for different sizes of MAPs with i.i.d. standard normal assignment costs.
6-3 Proportion of local minima to total number of feasible solutions for different sizes of MAPs with i.i.d. standard normal costs.
7-1 Timed results of producing test problems of various sizes.
7-2 Chi-square goodness-of-fit test for normal distribution of assignment costs for six randomly selected 5x5x5 test problems.
7-3 Number of discrete local minima per 10^6 feasible solutions. The range is a 95-percent confidence interval based on proportionate sampling.
7-4 Comparison of solution times in seconds using an exact solution algorithm of the branch-and-bound variety.
7-5 Comparison of solution results using a GRASP algorithm.

LIST OF FIGURES

3-1 Branch and bound on the index tree.
3-2 Plots of mean optimal costs for four different sized MAPs with exponential assignment costs.
3-3 Surface plots of mean optimal costs for MAPs of sizes 3 <= d <= 10 and 2 <= n <= 10 with exponential assignment costs.
3-4 Plots of mean optimal costs for four different sized MAPs with standard normal assignment costs.
3-5 Plots of standard deviation of mean optimal costs for four different sized MAPs with exponential assignment costs.
3-6 Plots of standard deviation of mean optimal costs for four different sized MAPs with standard normal assignment costs.
3-7 Three-dimensional MAP with exponential assignment costs. Plot includes both observed mean optimal cost values and fitted values. The two lines are nearly indistinguishable.
3-8 Plots of fitted and mean optimal costs from ten runs of variously sized MAPs developed from the uniform distribution on [10, 20]. Note that the observed data and fitted data are nearly indistinguishable.
3-9 Plots of fitted and mean optimal costs from ten runs of variously sized MAPs developed from the exponential distribution with mean three.
3-10 Plots of fitted and mean optimal costs from ten runs of variously sized MAPs developed from a normal distribution, N(mu = 5, sigma = 2).
3-11 Branch and bound on the index tree.
5-1 Example of noisy sensor measurements of target locations.
5-2 Example of noisy sensor measurements of close targets. In this case there are false detections and missed targets.
5-3 Search tree data structure used to find a cost coefficient or determine that a cost coefficient does not exist.
5-4 Search tree example of a sparse MAP.
6-1 Proportion of feasible solutions that are local minima when considering the 2-exchange neighborhood and where costs are i.i.d. standard normal.
6-2 Plots of solution quality versus number of local minima when using the 2-exchange neighborhood. The MAP has a size of d = 4, n = 5 with cost coefficients that are i.i.d. standard normal.
6-3 Plots of solution quality versus number of local minima when using a 3-exchange neighborhood. The MAP has a size of d = 4, n = 5 with cost coefficients that are i.i.d. standard normal.
7-1 Tree graph for a 3x4x4 MAP.
7-2 Initial tree graph with assignment costs and lower bound path costs.
7-3 Tree graph with optimal path and costs.
7-4 Tree graph used to consider all feasible nodes at level 3 from the first node in level 2.
7-5 Final tree graph for a 3x4x4 MAP.
7-6 Typical normal probability plot for a 5x5x5 test problem.
7-7 Typical histogram of a 20x30x40 test problem.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

PROBABILISTIC ANALYSIS AND RESULTS OF COMBINATORIAL PROBLEMS WITH MILITARY APPLICATIONS

By Don A. Grundel

August 2004

Chair: Panagote M. Pardalos
Major Department: Industrial and Systems Engineering

The work in this dissertation examines combinatorial problems from a probabilistic approach in an effort to improve existing solution methods or find new algorithms that perform better. Applications addressed here are focused on military uses such as weapon-target assignment, path planning and multisensor multitarget tracking; however, these may be easily extended to the civilian environment. A probabilistic analysis of combinatorial problems is a very broad subject; the context here, however, is the study of input data and solution values.

We investigate characteristics of the mean optimal solution values for random multidimensional assignment problems (MAPs) with axial constraints. Cost coefficients are taken from three different random distributions: uniform, exponential and standard normal. In the cases where cost coefficients are independent uniform or exponential random variables, experimental data indicate that the average optimal value of the MAP converges to zero as the MAP size increases. We give a short proof of this result for the case of exponentially distributed costs when the number of elements in each dimension is restricted to two. In the case of standard normal costs, experimental data indicate the average optimal value of the MAP goes to negative infinity as the MAP size increases. Using curve-fitting techniques, we develop numerical estimates of the mean optimal value for various sized problems. The experiments indicate that numerical estimates are quite accurate in predicting the optimal solution value of a random instance of the MAP.
Using a novel probabilistic approach, we provide generalized proofs of the asymptotic characteristics of the mean optimal costs of MAPs. The probabilistic approach is then used to improve the efficiency of the popular greedy randomized adaptive search procedure.

As many solution approaches to combinatorial problems rely, at least partly, on local neighborhood searches, it is widely assumed that the number of local minima has implications for solution difficulty. We investigate the expected number of local minima for random instances of the MAP. We report empirical findings that the expected number of local minima does impact the effectiveness of three different solution algorithms that rely on local neighborhood searches.

A probabilistic approach is used to develop an MAP test problem generator that creates difficult problems with known unique solutions.

CHAPTER 1
INTRODUCTION

Combinatorial optimization problems are found in everyday life. They are particularly important in military applications, as they most often concern the management and efficient use of scarce resources. Applications of combinatorial problems are in a period of rapid development, which follows from the widespread use of computers and the data available from information systems. Although computers have allowed expanded combinatorial applications, most of these problems remain very hard to solve.

The purpose of the work in this dissertation is to examine combinatorial problems from a probabilistic approach in an effort to improve existing solution methods or find new algorithms that perform better. Most applications addressed here are focused on military settings; however, most may be easily extended to the civilian environment.

1.1 Probabilistic Analysis of Combinatorial Problems

In general, probabilistic analysis of combinatorial problems is a very broad subject; the context used here, however, is the study of problem input values and solution values of combinatorial problems.
An obvious goal is to determine if parameters (e.g., mean, standard deviation, etc.) of these values can be used to improve the efficiency of a solution algorithm. Alternatively, parameters of these values may be useful in selecting an appropriate solution algorithm. Although problem instance size is directly correlated with the difficulty of determining a solution, we often face problems of similar size that have far different computing times. One can conclude from this that characteristics of the problem data are significant factors.

An example of the study of solution values is by Barvinok and Stephen [13], where the authors obtain a number of results regarding the distribution of solution values of the quadratic assignment problem. In the paper, the authors consider questions such as: how well does the optimum of a sample of random permutations approximate the true optimum? They explore an interesting approach in which they consider the "kth sphere" around the true optimum. The kth sphere, in simple terms, quantifies the nearness of permutations to the optimum permutation. By letting the true optimum represent a bullseye, the authors observe that as the kth sphere contracts to the optimal permutation, the average solution value of a sample of permutations steadily improves. Another study of the quadratic assignment problem (QAP) is found in work by Abreu et al. [1], where the authors consider using the average and variance of solution costs to establish the difficulty of a particular instance.

Sanchis and Schnabl [103] study the "landscape" of the traveling salesman problem. Considered are the number of local minima and autocorrelation functions. The concept of landscape was introduced by Wright [111] and can be thought of as a map of solution values with peaks and valleys. Landscape roughness can give an indication of problem difficulty.

In a study of cost inputs, Reilly [94] suggests that the degree of correlation among input data may influence the difficulty of finding a solution. An extreme level of correlation, it is argued, can produce very challenging problems.

In this dissertation, we use a probabilistic approach to consider how input costs affect solution values in an important class of problems called the multidimensional assignment problem. We also consider the mean optimal costs of various problem instances, including some asymptotic characteristics. We include another interesting probabilistic analysis, our study of local minima and how the number of local minima affects solution methods. Finally, we use a probabilistic approach to design and analyze a test problem generator.

1.2 Main Contributions and Organization of the Dissertation

The main contributions and organization of this dissertation are briefly discussed in the following paragraphs.

Survey of the multidimensional assignment problem. A brief survey of the multidimensional assignment problem (MAP) is provided in Chapter 2. In this chapter, we provide alternative formulations and applications for this important and difficult problem.

Mean optimal solution values of the MAP. In Chapter 3 we report experimentally determined values of the mean optimal solution costs of MAPs with cost coefficients that are independent random variables that are uniformly, exponentially or normally distributed. Using the experimental data, we then find curve-fitting models that can be used to accurately determine their mean optimal solution costs. Finally, we show how the numerical estimates can be used to improve at least two solution methods for the MAP.

Proof of asymptotic characteristics of the MAP. In Chapter 4 we prove some asymptotic characteristics of the mean optimal costs using a novel probabilistic approach.

Probabilistic approach to solving the data association problem.
Using the probabilistic approach introduced in Chapter 4, we extend the approach in Chapter 5 to more efficiently solve the data association problem that results from the multisensor multitarget tracking problem. In the multisensor multitarget problem, noisy measurements are made with an arbitrary number of spatially diverse sensors regarding an arbitrary number of targets, with the goal of estimating the trajectories of all the targets present. Furthermore, the number of targets may change as targets move into and out of detection range. The problem involves a data association of sensor measurements to targets and estimates the current state of each target. The combinatorial nature of the problem results from the data association problem; that is, how do we optimally partition the entire set of measurements so that each measurement is attributed to no more than one target and each sensor detects a target no more than once?

Expected number of local minima for the MAP. The number of local minima in a problem may provide insight into more appropriate solution methods. Chapter 6 explores the number of local minima in the MAP and then considers the impact of the number of local minima on three solution methods.

MAP test problem generator. As examined in the first five chapters, probabilistic analysis can be used to develop a priori knowledge of problem instance hardness. In Chapter 7 we develop an MAP test problem generator and use some probabilistic analyses to determine the generator's effectiveness in creating difficult test problems with known unique optimal solutions. Also included is a brief survey of sources of combinatorial test problems.

CHAPTER 2
SURVEY OF THE MULTIDIMENSIONAL ASSIGNMENT PROBLEM

The MAP is a higher dimensional version of the standard (two-dimensional, or linear) assignment problem. The MAP is stated as follows: given d n-sets A_1, A_2, ..., A_d, there is a cost for each d-tuple[1] in A_1 x A_2 x ... x A_d.
The problem is to minimize the total cost of n d-tuples chosen such that each element in A_1 ∪ A_2 ∪ ... ∪ A_d is in exactly one tuple. The problem was first introduced by Pierskalla [86]. Solution methods have included branch and bound [87, 10, 84], the Greedy Randomized Adaptive Search Procedure (GRASP) [4, 74], Lagrangian relaxation [90, 85], a genetic algorithm based heuristic [25], and simulated annealing [27].

[1] A tuple is an abstraction of the sequence: single, double, triple, ..., d-tuple. "Tuple" is used to denote a point in a multidimensional coordinate system.

2.1 Formulations

A well-known instance of the MAP is the three-dimensional assignment problem (3DAP). An example of the 3DAP consists of minimizing the total cost of assigning n_i items to n_j locations at n_k points in time. The three-dimensional MAP can be formulated as

\[
\begin{aligned}
\min \quad & \sum_{i=1}^{n_i} \sum_{j=1}^{n_j} \sum_{k=1}^{n_k} c_{ijk} x_{ijk} \\
\text{s.t.} \quad & \sum_{j=1}^{n_j} \sum_{k=1}^{n_k} x_{ijk} = 1 \quad \text{for all } i = 1, 2, \ldots, n_i, \\
& \sum_{i=1}^{n_i} \sum_{k=1}^{n_k} x_{ijk} \le 1 \quad \text{for all } j = 1, 2, \ldots, n_j, \\
& \sum_{i=1}^{n_i} \sum_{j=1}^{n_j} x_{ijk} \le 1 \quad \text{for all } k = 1, 2, \ldots, n_k, \\
& x_{ijk} \in \{0, 1\} \quad \text{for all } i, j, k,
\end{aligned}
\]

where c_{ijk} is the cost of assigning item i to location j at time k. In this formulation, the variable x_{ijk} is equal to 1 if and only if the ith item is assigned to the jth location at time k, and zero otherwise. If we consider additional dimensions for this problem, the formulation can be similarly extended in the following way:

\[
\begin{aligned}
\min \quad & \sum_{i_1=1}^{n_1} \cdots \sum_{i_d=1}^{n_d} c_{i_1 \cdots i_d}\, x_{i_1 \cdots i_d} \\
\text{s.t.} \quad & \sum_{i_2=1}^{n_2} \cdots \sum_{i_d=1}^{n_d} x_{i_1 \cdots i_d} = 1 \quad \text{for all } i_1 = 1, 2, \ldots, n_1, \\
& \sum_{i_1=1}^{n_1} \cdots \sum_{i_{k-1}=1}^{n_{k-1}} \sum_{i_{k+1}=1}^{n_{k+1}} \cdots \sum_{i_d=1}^{n_d} x_{i_1 \cdots i_d} \le 1 \quad \text{for all } k = 2, \ldots, d-1 \text{ and } i_k = 1, 2, \ldots, n_k, \\
& \sum_{i_1=1}^{n_1} \cdots \sum_{i_{d-1}=1}^{n_{d-1}} x_{i_1 \cdots i_d} \le 1 \quad \text{for all } i_d = 1, 2, \ldots, n_d, \\
& x_{i_1 \cdots i_d} \in \{0, 1\} \quad \text{for all } i_1, i_2, \ldots, i_d, \qquad n_1 \le n_2 \le \cdots \le n_d,
\end{aligned}
\]

where d is the dimension of the MAP. If we allow n_1 = n_2 = ... = n_d = n, an equivalent formulation states the MAP in terms of permutations δ_1, ..., δ_{d-1} of the numbers 1 to n. Using this notation, the MAP is equivalent to

\[
\min_{\delta_1, \ldots, \delta_{d-1} \in \Pi^n} \; \sum_{i=1}^{n} c_{i, \delta_1(i), \ldots, \delta_{d-1}(i)},
\]

where \Pi^n is the set of all permutations of {1, ..., n}.
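The permutation formulation above can be checked directly by exhaustive enumeration. The following sketch (the function name and nested-list cost representation are my own, and brute force is only practical for tiny instances) solves a small random three-dimensional MAP by trying every choice of d-1 permutations:

```python
import itertools
import random

def solve_map_brute_force(c):
    """Exhaustively solve a d-dimensional axial MAP with n elements per
    dimension, where c is a nested list indexed as c[i1][i2]...[id].
    Returns the minimum total cost over all (d-1)-tuples of permutations."""
    # infer n and d from the nested cost array
    n = len(c)
    d = 1
    inner = c
    while isinstance(inner[0], list):
        d += 1
        inner = inner[0]
    best = float("inf")
    # one permutation per dimension beyond the first
    for perms in itertools.product(itertools.permutations(range(n)), repeat=d - 1):
        total = 0.0
        for i in range(n):
            entry = c[i]
            for p in perms:
                entry = entry[p[i]]  # follow delta_1(i), ..., delta_{d-1}(i)
            total += entry
        best = min(best, total)
    return best

random.seed(1)
n, d = 3, 3
# random 3x3x3 cost array with uniform [0, 1] costs
c = [[[random.random() for _ in range(n)] for _ in range(n)] for _ in range(n)]
print(solve_map_brute_force(c))
```

Since there are (n!)^(d-1) feasible solutions, this is workable only for roughly n <= 4 and d = 3, which is precisely why the branch-and-bound and GRASP methods discussed in this survey matter.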
2.2 Complexity

Solving even moderate sized instances of the MAP is a difficult task. A linear increase in the number of dimensions brings an exponential increase in the number of cost coefficients in the problem, and the number of feasible solutions, N, is given by the relation

\[
N = \prod_{i=2}^{d} \frac{n_i!}{(n_i - n_1)!}.
\]

In general, the MAP is known to be NP-hard, a fact which follows from results of Garey and Johnson [44]. Even in the case when costs take on a special structure of triangle inequalities, Crama and Spieksma [31] prove the three-dimensional problem remains NP-hard. However, special cases that are not NP-hard do exist. Burkard, Rudolf, and Woeginger [23] investigate three-dimensional problems with decomposable cost coefficients. Given three n-element sequences a_i, b_i and c_i, i = 1, ..., n, a cost coefficient d_{ijk} is decomposable when d_{ijk} = a_i b_j c_k. Burkard et al. [23] find that the minimization and maximization versions of the three-dimensional assignment problem have different complexities. While the maximization problem is solvable in polynomial time, the minimization problem remains NP-hard. On the other hand, Burkard et al. [23] identify several structures where the minimization problem is polynomially solvable.

A polynomially solvable case of the MAP occurs when the cost coefficients are taken from a Monge matrix [22]. An m x n matrix C is called a Monge matrix if

\[
c_{ij} + c_{rs} \le c_{is} + c_{rj} \quad \text{for all } 1 \le i < r \le m, \; 1 \le j < s \le n.
\]

Another way to describe the Monge array is to again consider the matrix C. Any two rows and two columns intersect at exactly four elements. The rows and columns satisfy the Monge property if the sum of the upper-left and lower-right elements is at most the sum of the upper-right and lower-left elements. This can easily be extended to higher dimensions. Because of the special structure of the Monge matrix, the MAP becomes polynomially solvable with a lexicographical greedy algorithm, and the identity permutation is an optimal solution.
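The feasible-solution count and the two-dimensional Monge condition can both be made concrete with a short sketch (helper names are illustrative; the count uses the relation given above, and is_monge checks every pair of rows against every pair of columns):

```python
from math import factorial

def num_feasible_solutions(dims):
    """Number of feasible assignments of an axial MAP with dimension
    sizes dims = [n1, n2, ..., nd], n1 <= n2 <= ... <= nd:
    N = prod_{i=2}^{d} n_i! / (n_i - n_1)!"""
    n1 = dims[0]
    N = 1
    for ni in dims[1:]:
        N *= factorial(ni) // factorial(ni - n1)
    return N

def is_monge(C):
    """Check the Monge property for a 2-D cost matrix C:
    C[i][j] + C[r][s] <= C[i][s] + C[r][j] for all i < r, j < s."""
    m, n = len(C), len(C[0])
    return all(C[i][j] + C[r][s] <= C[i][s] + C[r][j]
               for i in range(m) for r in range(i + 1, m)
               for j in range(n) for s in range(j + 1, n))

print(num_feasible_solutions([3, 3, 3]))  # (3!)^2 = 36
print(is_monge([[0, 1], [1, 2]]))         # C[i][j] = i + j is Monge: True
```

With equal dimension sizes the count collapses to (n!)^(d-1), which shows directly why even n = 10, d = 3 instances (over 13 trillion feasible solutions) are out of reach for enumeration.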
2.3 Applications

The MAP has applications in numerous areas such as data association [8], scheduling teaching practices [42], production of printed circuit boards [30], placement of distribution warehouses [87], multisensor multitarget problems [74, 91], tracking elementary particles [92] and multiagent path planning [84]. More examples and extended discussion of the subject can be found in two extensive surveys [81, 19]. A particular military application of the MAP is the Weapon Target Assignment problem, which is discussed in the following subsection.

2.3.1 Weapon Target Assignment Problem

The target-based Weapon Target Assignment (WTA) problem [81] considers optimally assigning W weapons to T targets so that the total expected damage to the targets is maximized. The term target-based is used to distinguish these problems from the asset-based or defense-based problems, where the goal is to assign weapons to incoming missiles to maximize the surviving assets. The target-based problems primarily apply to offensive strategies.

Assume at a particular instant in time the number and location of weapons and targets are known with certainty. Then a single assignment may be made at that instant. Consider W weapons and T targets and define x_{ij}, i = 1, 2, ..., W, j = 1, 2, ..., T, as

\[
x_{ij} =
\begin{cases}
1 & \text{if weapon } i \text{ is assigned to target } j, \\
0 & \text{otherwise.}
\end{cases}
\]

Given that weapon i engages target j, the outcome is random:

\[
P(\text{target } j \text{ is destroyed by weapon } i) = p_{ij},
\qquad
P(\text{target } j \text{ is not destroyed by weapon } i) = 1 - p_{ij}.
\]

If one assumes that each weapon engagement is independent of every other engagement, then the outcomes of the engagements are independent and Bernoulli distributed. Note that we let q_{ij} = 1 - p_{ij}, which is the probability that target j survives an encounter with weapon i. Now assign V_j to indicate a value for each target j.
The objective is to maximize the damage to the targets or, equivalently, to minimize the surviving value of the targets, which may be formulated as

\[
\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{T} V_j \prod_{i=1}^{W} q_{ij}^{x_{ij}} \qquad (2.1) \\
\text{subject to} \quad & \sum_{j=1}^{T} x_{ij} = 1, \quad i = 1, 2, \ldots, W, \\
& x_{ij} \in \{0, 1\}.
\end{aligned}
\]

This is a nonlinear assignment problem and is known to be NP-complete. Notice a few characteristics of the above problem. Since there is no cost for employing a weapon, all weapons will be used. The solution may result in some targets not being targeted because they are relatively worthless and/or because they are very difficult to defeat.

A transformation of this formulation to an MAP may be accomplished. Using a two weapon, two target example, the transformation follows. First observe that the objective function of (2.1) may be written as

\[
\text{minimize} \quad V_1 \left[ q_{11}^{x_{11}} q_{21}^{x_{21}} \right] + V_2 \left[ q_{12}^{x_{12}} q_{22}^{x_{22}} \right]. \qquad (2.2)
\]

Obviously, the individual probabilities of survival, q_{ij}, go to one if weapon i does not engage target j. Therefore, using the first term of the objective function in equation (2.2) as an example, the first term becomes V_1[q_{11} q_{21}] if x_{11} = 1 and x_{21} = 1, or V_1[q_{11}] if x_{11} = 1 and x_{21} = 0, or V_1[q_{21}] if x_{11} = 0 and x_{21} = 1, or V_1 if x_{11} = 0 and x_{21} = 0. Notice these terms are now constant cost values.

A different decision variable, p_{αβj}, may be introduced that represents the status of engaging the different weapons on target j. Here α ∈ {1, 2} represents weapon 1's status of engagement on target j, where α = 1 means weapon 1 engages target j and α = 2 otherwise. Similarly, β ∈ {1, 2} represents weapon 2's status of engagement on target j. For example,

\[
p_{11j} =
\begin{cases}
1 & \text{if both the first and second weapon engage target } j, \\
0 & \text{otherwise,}
\end{cases}
\]

and

\[
p_{12j} =
\begin{cases}
1 & \text{if the first but not the second weapon engages target } j, \\
0 & \text{otherwise.}
\end{cases}
\]

The cost values may now be represented by c_{αβj}. For example, c_{111} = V_1[q_{11} q_{21}] and c_{121} = V_1[q_{11}]. Using these representations, the first term of objective function (2.2) becomes

\[
c_{111} p_{111} + c_{121} p_{121} + c_{211} p_{211} + c_{221} p_{221}.
\]

For the two weapon, two target scenario, (2.1) may be reformulated as a three-dimensional MAP as follows.
\[
\begin{aligned}
\min \quad & \sum_{\alpha=1}^{2} \sum_{\beta=1}^{2} \sum_{j=1}^{2} c_{\alpha\beta j}\, p_{\alpha\beta j} \\
\text{s.t.} \quad & \sum_{\beta=1}^{2} \sum_{j=1}^{2} p_{\alpha\beta j} = 1 \quad \forall\, \alpha = 1, 2, \\
& \sum_{\alpha=1}^{2} \sum_{j=1}^{2} p_{\alpha\beta j} = 1 \quad \forall\, \beta = 1, 2, \\
& \sum_{\alpha=1}^{2} \sum_{\beta=1}^{2} p_{\alpha\beta j} = 1 \quad \forall\, j = 1, 2, \\
& p_{\alpha\beta j} \in \{0, 1\} \quad \forall\, \alpha, \beta, j.
\end{aligned}
\]

In general, reformulation of (2.1) will result in a (W + 1)-dimensional MAP in which each index takes T values. As mentioned above, weapon costs are not considered in this formulation, which results in all weapons being assigned. A more realistic formulation that considers weapon costs is developed in the next subsection.

2.3.2 Considering Weapon Costs in the Weapon Target Assignment Problem

The formulation in the previous subsection excludes weapon costs, which can result in overkill or poor use of expensive weapons on low-valued targets. A more realistic formulation includes weapon costs. Let C_i be the cost of the ith weapon and let j = T + 1 be a dummy target. We may now reformulate (2.1) as

\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{W} \sum_{j=1}^{T} C_i x_{ij} - \sum_{i=1}^{W} C_i x_{i,T+1} + \sum_{j=1}^{T} V_j \prod_{i=1}^{W} q_{ij}^{x_{ij}} \qquad (2.3) \\
\text{subject to} \quad & \sum_{j=1}^{T+1} x_{ij} = 1, \quad i = 1, 2, \ldots, W, \\
& x_{ij} \in \{0, 1\}.
\end{aligned}
\]

The first summation term considers the costs of weapons assigned to actual targets. The second summation term considers the savings from applying weapons to the dummy target.

Following a similar development as in the previous subsection, we obtain a generalized MAP formulation that incorporates weapon costs:

\[
\begin{aligned}
\min \quad & \sum_{w_1=1}^{T+1} \cdots \sum_{w_W=1}^{T+1} \sum_{j=1}^{T+1} c_{w_1 w_2 \cdots w_W j}\, p_{w_1 w_2 \cdots w_W j} \\
\text{s.t.} \quad & \sum_{w_2=1}^{T+1} \cdots \sum_{w_W=1}^{T+1} \sum_{j=1}^{T+1} p_{w_1 w_2 \cdots w_W j} = 1 \quad \forall\, w_1 = 1, 2, \ldots, T+1, \\
& \sum_{w_1=1}^{T+1} \cdots \sum_{w_{k-1}=1}^{T+1} \sum_{w_{k+1}=1}^{T+1} \cdots \sum_{w_W=1}^{T+1} \sum_{j=1}^{T+1} p_{w_1 w_2 \cdots w_W j} = 1 \\
& \qquad \forall\, k = 2, \ldots, W \text{ and } w_k = 1, 2, \ldots, T+1, \\
& \sum_{w_1=1}^{T+1} \cdots \sum_{w_W=1}^{T+1} p_{w_1 w_2 \cdots w_W j} = 1 \quad \forall\, j = 1, 2, \ldots, T+1, \\
& p_{w_1 w_2 \cdots w_W j} \in \{0, 1\} \quad \forall\, w_1, w_2, \ldots, w_W, j.
\end{aligned}
\]

This formulation results in a (W + 1)-dimensional MAP with T + 1 elements in each dimension.

2.4 Summary

The MAP has been studied extensively in the last couple of decades, and its applications in both military and civilian arenas have been rapidly expanding. The difficult nature of the problem requires researchers to continuously consider novel solution methods, and a probabilistic approach provides needed insight for developing them.
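To make the target-based WTA objective (2.1) from Section 2.3.1 concrete, here is a small sketch with made-up data (the target values V and survival probabilities q are hypothetical, and the helper names are my own); it evaluates the expected surviving target value for a given assignment and brute-forces a tiny instance:

```python
import itertools

def surviving_value(V, q, assign):
    """WTA objective (2.1): total expected surviving target value,
    sum_j V[j] * prod over weapons i assigned to j of q[i][j].
    assign[i] is the target index that weapon i engages."""
    total = 0.0
    for j, vj in enumerate(V):
        prob_survive = 1.0
        for i, tgt in enumerate(assign):
            if tgt == j:
                prob_survive *= q[i][j]
        total += vj * prob_survive
    return total

def solve_wta(V, q):
    """Brute force over all T^W assignments; every weapon engages exactly
    one target, matching the constraint sum_j x_ij = 1."""
    W, T = len(q), len(V)
    return min(
        (surviving_value(V, q, a), a)
        for a in itertools.product(range(T), repeat=W)
    )

# hypothetical data: 2 weapons, 2 targets
V = [10.0, 5.0]               # target values
q = [[0.3, 0.5], [0.4, 0.6]]  # q[i][j] = P(target j survives weapon i)
best_cost, best_assign = solve_wta(V, q)
print(best_cost, best_assign)
```

In this instance, splitting the weapons across the two targets beats concentrating both on the high-value target, illustrating the trade-off that the MAP reformulation encodes in its constant cost coefficients c_{αβj}.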
CHAPTER 3
CHARACTERISTICS OF THE MEAN OPTIMAL SOLUTION TO THE MAP

In this chapter, we investigate characteristics of the mean optimal solution values for random MAPs with axial constraints. Throughout the study, we consider cost coefficients taken from three different random distributions: uniform, exponential and standard normal. In the cases of uniform and exponential costs, experimental data indicate that the mean optimal value converges to zero as the problem size increases. We give a short proof of this result for the case of exponentially distributed costs when the number of elements in each dimension is restricted to two. In the case of standard normal costs, experimental data indicate the mean optimal value goes to negative infinity with increasing problem size. Using curve-fitting techniques, we develop numerical estimates of the mean optimal value for various sized problems. The experiments indicate that the numerical estimates are quite accurate in predicting the optimal solution value of a random instance of the MAP.

3.1 Introduction

NP-hard problems present important challenges to the experimental researcher in the field of algorithms. Because such problems are difficult to solve in general, careful restrictions must be applied to a combinatorial optimization problem in order to solve some of its instances. However, it is also difficult to create instances that are representative of the problem, suitable for the technique or algorithm being used, and at the same time interesting from the practical point of view. One of the simplest and, in some cases, most useful ways of creating problem instances consists of drawing values from a random distribution. Using this procedure, one wishes to create a problem that is difficult "on average," but that can also appear as the outcome of some natural process.
Thus, one of the questions that arises is how a random problem will behave in terms of solution value, given some distribution function and parameters from which values are taken. This question turns out to be very difficult to answer in general. As an example, for the Linear Assignment Problem (LAP), results have not been easy to prove, despite intense research in this field [5, 28, 29, 55, 82]. In this chapter we perform a computational study of the asymptotic behavior for instances of the MAP.

3.1.1 Basic Definitions and Results

The MAP is an NP-hard combinatorial optimization problem, which extends the Linear Assignment Problem (LAP) by adding more sets to be matched. The number d of sets corresponds to the dimension of the MAP. In the special case of the LAP, we have d = 2. Chapter 2 provides an overview of the MAP, including formulations and applications. Let z(I) be the value of the optimum solution for an instance I of the MAP. We denote by z* the expected value of z(I), over all instances I constructed from a random distribution (the context will make clear what specific distribution we are talking about). In the problem instances considered in this chapter, we have n_1 = n_2 = ... = n_d = n. Our main contribution in this chapter is the development of numerical estimates of the mean optimal costs for randomly generated instances of the MAP. The experiments performed show that for uniform [0, 1] and exponentially distributed costs, the optimum value converges to zero as the problem size increases. These results are not surprising for an increase in d, since the number of cost coefficients increases exponentially with d. However, convergence to zero for increasing n is not as obvious, since the objective function is the sum of n cost coefficients. Experiments with standard normally distributed costs show that the optimum value goes to −∞ as the problem size increases. More interestingly, the experiments show convergence even for small values of n and d.
The three distributions (exponential, uniform and normal) were chosen for analysis as they are very familiar to most practitioners. Although we would not expect real-world problems to have cost coefficients that follow exactly these distributions, we believe that our results may be extended to other cost coefficient distributions.

3.1.2 Motivation

The study of asymptotic values for MAPs has important motivations arising from theory and from practical applications. First, there are few theoretical results on this subject, and therefore practical experiments are a good method for determining how MAPs behave for instances with random values. Determining asymptotic values for such problems is a major open question in combinatorics, which can be made clearer by careful experimentation. Another motivation for this work has been the possible use of asymptotic results in the practical setting of heuristic algorithms. When working with MAPs, one of the greatest difficulties is the need to cope with a large number of entries in the multidimensional vector of costs. For example, in an instance with d dimensions and minimum dimension size n, there are n^d cost elements that must be considered for the optimum assignment. Solving an MAP can become very hard when all elements of the cost vector must be read and considered during the algorithm execution. This happens because the time needed to read n^d values makes the algorithm exponential in d. A possible use of the results shown in this chapter allows one, having good estimates of the expected value of an optimal solution and the distribution of costs, to discard a large number of entries in the cost vector that have low probability of being part of the solution. By doing this, we can improve the running time of most algorithms for the MAP.
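As a sketch of this screening idea (our own illustration, not the dissertation's implementation): with nonnegative costs, no single coefficient larger than an upper bound on the optimal value can appear in any solution whose total cost is within that bound, so an estimate of z* can be used as a cutoff. The value `z_bar = 0.5` below is a hypothetical estimate chosen only for the demonstration.

```python
import random

def screen_costs(costs, z_bar):
    """Keep only entries that could appear in a solution of total cost
    <= z_bar: with nonnegative costs, any entry above z_bar is excluded."""
    return {key: v for key, v in costs.items() if v <= z_bar}

rng = random.Random(2)
n = 10
# Exponential(1) costs for a 10 x 10 x 10 three-dimensional instance.
costs = {(i, j, k): rng.expovariate(1.0)
         for i in range(n) for j in range(n) for k in range(n)}
z_bar = 0.5  # hypothetical upper-bound estimate on the optimal value
kept = screen_costs(costs, z_bar)
frac = len(kept) / len(costs)
# Expected fraction kept is 1 - exp(-z_bar), about 0.39 for exponential(1).
```

The screening is exact only when `z_bar` really is an upper bound on the optimal value; with a statistical estimate, a safety margin would be added in practice.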
Finally, while some computational studies have been performed for the random LAP, such as by Pardalos and Ramakrishnan [82], there are limited practical and theoretical results for the random MAP. In this chapter we try to improve on this situation by presenting extensive results of computational experiments for the MAP.

3.1.3 Asymptotic Studies and Results

Asymptotic studies of random combinatorial problems can be traced back to the work of Beardwood, Halton and Hammersley [14] on the traveling salesman problem (TSP). Other work includes studies of the minimum spanning tree [41, 105], the Quadratic Assignment Problem (QAP) [21] and, most notably, studies of the Linear Assignment Problem (LAP) [5, 28, 55, 64, 83, 76, 82, 109]. A more general analysis was made on random graphs by Lueker [69]. In the case of the TSP, the problem is to let X_i, i = 1, ..., n, be independent random variables uniformly distributed on the unit square [0, 1]^2, and let L_n denote the length of the shortest closed path (usual Euclidean distance) which connects each element of {X_1, X_2, ..., X_n}. The classic result proved by Beardwood et al. [14] is

\lim_{n \to \infty} L_n / \sqrt{n} = \beta

with probability one, for a finite constant β. This becomes significant, as addressed by Steele [104], because it is key to Karp's algorithm [54] for solving the TSP. Karp uses a cellular dissection algorithm for the approximate solution. The above result may be summarized as implying that the optimal tour through n points is sharply predictable when n is large, and the dissection method tends to give near-optimal solutions when n is large. This points to the idea of using asymptotic results to develop effective solution algorithms. In the minimum spanning tree problem, consider an undirected graph G = (N, A) defined by the set N of n nodes and a set A of m arcs, with a length c_{ij} associated with each arc (i, j) ∈ A.
The problem is to find a spanning tree of G, called a minimum spanning tree (MST), that has the smallest total length, L_MST, of its constituent arcs [3]. If we let each arc length c_{ij} be an independent random variable drawn from the uniform distribution on [0, 1], Frieze [41] showed that

E[L_MST] \to \zeta(3) \approx 1.202 \quad \text{as } n \to \infty.

This was followed by Steele [105], where the Tutte polynomial for a connected graph is used to develop an exact formula for the expected value of L_MST for a finite graph with uniformly distributed arc costs. Additional work concerning the directed minimum spanning tree is also available [17]. For the Steiner tree problem, which is an NP-hard variant of the MST, Bollobás et al. [18] proved that with high probability the weight of the Steiner tree is (1 + o(1))(k − 1)(log n − log k)/n when k = O(n) and n → ∞, where n is the number of vertices in a complete graph with edge weights chosen as i.i.d. random variables distributed as exponential with mean one. In the problem, k is the number of vertices contained in the Steiner tree. A famous result that some call the Burkard–Fincke condition relates to the QAP. The QAP was introduced by Koopmans and Beckmann [60] in 1957 as a model for the location of a set of indivisible economical activities. QAP applications, extensions and solution methods are well covered in work by Horst et al. [51]. The Burkard–Fincke condition [21] is that the ratio between the best and worst solution values approaches one as the size of the problem increases. Another way to think of this is that, for a large problem, any permutation is close to optimal. According to Burkard and Fincke [21], this condition applies to all problems in the class of combinatorial optimization problems with sum and bottleneck objective functions. The Linear Ordering Problem (LOP) [26] falls into this category as well. Burkard and Fincke suggest that this result means that very simple heuristic algorithms can yield good solutions for very large problems.
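Frieze's ζ(3) limit is easy to probe numerically. The sketch below (our own code, not from the dissertation) runs Prim's algorithm on a complete graph with i.i.d. U[0, 1] edge weights, generating each weight lazily the first time its earlier endpoint joins the tree, and averages the MST length over a few instances.

```python
import random

def random_mst_length(n, rng):
    """Prim's algorithm on the complete graph K_n with i.i.d. U[0,1]
    edge weights, generated on the fly. Returns the total MST length."""
    # best[v] = cheapest edge seen so far connecting v to the growing tree
    best = [rng.random() for _ in range(n)]   # edges from vertex 0
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        # pick the cheapest fringe vertex and attach it
        v = min((u for u in range(n) if not in_tree[u]), key=best.__getitem__)
        total += best[v]
        in_tree[v] = True
        for u in range(n):
            if not in_tree[u]:
                w = rng.random()  # fresh U[0,1] weight for edge (v, u)
                if w < best[u]:
                    best[u] = w
    return total

rng = random.Random(1)
avg = sum(random_mst_length(300, rng) for _ in range(20)) / 20
# Already for n = 300 the average is close to zeta(3) ~ 1.202.
```

Each edge weight is sampled exactly once (when its first endpoint enters the tree), so the lazy generation is equivalent to drawing all n(n−1)/2 weights up front.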
Recent work by Aldous and Steele [6] provides part survey, part tutorial on the objective method for understanding asymptotic characteristics of combinatorial problems. They provide some concrete examples of the approach and point out some unavoidable limitations. In terms of the asymptotic nature of combinatorial problems, the most explored problem has been the LAP. In the LAP we are given a matrix C^{n×n} with coefficients c_{ij}. The objective is to find a minimum cost assignment; i.e., n elements c_{1j_1}, ..., c_{nj_n}, such that j_p ≠ j_q for all p ≠ q, with j_i ∈ {1, ..., n}, such that \sum_{i=1}^{n} c_{ij_i} is minimum. A well-known conjecture by Mézard and Parisi [71, 72] states that the optimal solution for instances where costs c_{ij} are drawn from an exponential or uniform distribution approaches π²/6 when n (the size of the instance) approaches infinity. Pardalos and Ramakrishnan [82] provide additional empirical evidence that the conjecture is indeed valid. The conjecture was expanded by Parisi [83], where in the case of costs drawn from an exponential distribution the expected value of the optimal solution of an instance of size n is given by

z_n^* = \sum_{i=1}^{n} \frac{1}{i^2}. \qquad (3.1)

Moreover,

\sum_{i=1}^{n} \frac{1}{i^2} \to \frac{\pi^2}{6} \quad \text{as } n \to \infty.

This conjecture has been further strengthened by Coppersmith and Sorkin [28]. The authors conjecture that the expected value of the optimum k-assignment, for a fixed matrix of size n × m, is given by

\sum_{i, j \ge 0,\; i + j < k} \frac{1}{(m - i)(n - j)}.

They also presented proofs of this conjecture for small values of n, m and k. The conjecture is consistent with previous work [71, 83], since it can be proved that for m = n = k this is simply the expression in (3.1). Although until recently the proofs of these conjectures had eluded many researchers, there has been progress in the determination of upper and lower bounds. Walkup [109] proved an upper bound of 3 on the asymptotic value of the objective function as the problem size increases. This was improved by Karp [55], who showed that the limit is at most 2.
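The Parisi expression (3.1) is simple to evaluate; a quick numeric check (our own illustration) shows the partial sums approaching π²/6 ≈ 1.6449:

```python
import math

def parisi_expected_lap(n):
    """Parisi's conjectured expected optimal LAP value for exponential(1)
    costs: sum_{i=1}^{n} 1/i^2, which tends to pi^2/6 as n grows."""
    return sum(1.0 / (i * i) for i in range(1, n + 1))

vals = {n: parisi_expected_lap(n) for n in (1, 10, 100, 10000)}
# vals[1] is exactly 1.0 (a 1x1 LAP picks a single exponential(1) cost),
# and vals[10000] is within 1e-4 of pi^2 / 6.
```

The tail of the series is about 1/n, which matches the roughly O(1/n) gap between finite-n expectations and the π²/6 limit.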
On the other hand, Lazarus [64] proved a lower bound of 1 + 1/e ≈ 1.3679. More recently this result was improved by Olin [76] to the tighter lower bound value of 1.51. Finally, recent papers by Linusson and Wästlund [67] and Nair et al. [75] have solved the conjectures of Mézard and Parisi, and of Coppersmith and Sorkin. Concerning the MAP, not many results are known about the asymptotic behavior of the optimum solution for random instances. However, one example of recent work is that by Huang et al. [52]. In this work the authors consider the complete d-partite graph with n vertices in each of d sets. If all edges in this graph are assigned independent weights that are uniformly distributed on [0, 1], then the expected minimum weight of a perfect d-dimensional matching is at least n^{1−2/d}. They also describe a randomized algorithm to solve this problem where the expected solution has weight at most 5d³ n^{1−2/d} + d^{15} for all d ≥ 3. However, note that for even a moderate size of d, this upper bound is not tight.

3.1.4 Chapter Organization

This chapter is organized as follows. In the next section, we give a closed form result on the mean optimal costs for a special case of the MAP when the number of elements in each dimension is equal to 2. The method used to solve the MAP employs a branch-and-bound algorithm, described in Section 3.3, to find exact solutions to the problem. Then, in Section 3.4 we present the computational results and curve fitting models to estimate the mean optimal costs. Following this, we provide some methods to use the numerical models to improve the efficiency of two solution algorithms. Finally, concluding remarks and future research directions are presented in Section 3.6.

3.2 Mean Optimal Costs for a Special Case of the MAP

In this section we present a result regarding the asymptotic behavior of z* in the special case of the MAP where n = 2, d ≥ 3, and the cost elements are independent exponentially distributed with mean one.
This is done to give a flavor of how these results can be obtained. For proofs of a generalization of this theorem, including normally distributed costs, refer to Chapter 4. Initially, we employ the property stated in the following proposition.

Proposition 3.1 In an instance of the MAP with n = 2 and i.i.d. exponential cost coefficients with mean 1, the cost of each feasible solution is an independent gamma distributed random variable with parameters α = 2 and λ = 1.

Proof: Let I be an instance of the MAP with n = 2. Each feasible solution for I is an assignment a_1 = c_{1, δ_1(1), ..., δ_{d−1}(1)}, a_2 = c_{2, δ_1(2), ..., δ_{d−1}(2)}, with cost z = a_1 + a_2. The important feature of such assignments is that for each fixed entry c_{1, δ_1(1), ..., δ_{d−1}(1)}, there is just one remaining possibility, namely c_{2, δ_1(2), ..., δ_{d−1}(2)}, since each dimension has only two elements. This implies that different assignments cannot share elements in the cost vector, and therefore different assignments have independent costs z. Now, a_1 and a_2 are independent exponential random variables with parameter 1. Thus z = a_1 + a_2 is a Gamma(α, λ) random variable, with parameters α = 2 and λ = 1.

According to the proof above, it is clear why instances with n ≥ 3 do not have the same property. Different feasible solutions share elements of the cost vector, and therefore the feasible solutions are not independent of each other. For example, consider a problem of size d = 3, n = 3. A feasible solution to this problem is c_{111}, c_{232}, and c_{323}. Another feasible solution is c_{111}, c_{223}, and c_{332}. Note that both solutions share the cost coefficient c_{111} and are not independent. Suppose that X_1, X_2, ..., X_k are k independent gamma distributed variables. Let X_{(i)} be the ith smallest of these.
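A quick Monte Carlo check of Proposition 3.1 (our own sketch): for n = 2, d = 3 there are 2^{d−1} = 4 feasible solutions, each pairing two disjoint exponential coefficients, and the simulated mean optimal cost should match the value 0.804 reported for d = 3 in Table 3-1.

```python
import random

def mean_optimal_n2_d3(reps, rng):
    """Monte Carlo mean optimal value of a 2x2x2 axial MAP with i.i.d.
    exponential(1) costs. The 4 feasible solutions pair coefficient
    (1, j, k) with (2, 3-j, 3-k), so each solution cost is Gamma(2, 1)."""
    total = 0.0
    for _ in range(reps):
        c = {(i, j, k): rng.expovariate(1.0)
             for i in (1, 2) for j in (1, 2) for k in (1, 2)}
        solutions = [c[(1, j, k)] + c[(2, 3 - j, 3 - k)]
                     for j in (1, 2) for k in (1, 2)]
        total += min(solutions)
    return total / reps

rng = random.Random(7)
est = mean_optimal_n2_d3(100_000, rng)
# est is close to 0.8047, the d = 3 entry of Table 3-1.
```

Because the four solution costs are built from disjoint coefficients, they are independent Gamma(2, 1) variables, exactly as the proposition claims.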
Applying order statistics [33], we have the following expression for the expected minimum value of k independent identically distributed random variables:

E[X_{(1)}] = \int_0^\infty k \, x \, f(x) \, (1 - F(x))^{k-1} \, dx,

where f(x) and F(x) are, respectively, the density and distribution functions of the gamma random variable. The problem of finding z* for the special case when n = 2 and d ≥ 3 corresponds to finding the expected minimum cost E[X_{(1)}] for k = 2^{d−1} independent gamma distributed feasible solution costs, with parameters α = 2 and λ = 1 (note that k is the number of feasible solutions). Through some routine calculus, and noting a resulting pattern as k is increased, we find the following relationship:

z_k^* = \sum_{j=0}^{k-1} \binom{k-1}{j} \frac{(j+2)!}{k^{j+2}}.

The above equation can be used to prove the asymptotic characteristics of the mean optimal cost of the MAP as d increases. We also note that this special result for the MAP follows directly from Lemma 2(ii) by Szpankowski [106]. As an alternative approach, we use the above equation to prove the following theorem.

Theorem 3.2 For the MAP with n = 2 and i.i.d. exponential cost coefficients with mean one, z* → 0 as d → ∞.

Proof: When d → ∞, then k = 2^{d−1} → ∞ as well. So we prove the result when k → ∞. We have

z_k^* = \sum_{j=0}^{k-1} \binom{k-1}{j} \frac{(j+2)!}{k^{j+2}} \qquad (3.2)

= \sum_{j=0}^{k-1} \frac{(k-1)!}{(k-1-j)!} \cdot \frac{(j+2)(j+1)}{k^{j+2}} \qquad (3.3)

= \sum_{j=0}^{k-1} \frac{(k-1)!}{j!} \cdot \frac{(k-j)(k-j+1)}{k^{k-j+1}}. \qquad (3.4)

Equality (3.4) is found by the change of variable j → k − 1 − j. Using Stirling's approximation n! ≈ (n/e)^n \sqrt{2\pi n}, we have

z_k^* \approx \sum_{j=0}^{k-1} \frac{((k-1)/e)^{k-1} \sqrt{2\pi(k-1)}}{j!} \cdot \frac{(k-j)(k-j+1)}{k^{k-j+1}} \qquad (3.5)

= \frac{(k-1)^{k-1} \sqrt{2\pi(k-1)}}{e^{k-1} \, k^{k+1}} \sum_{j=0}^{k-1} \frac{(k-j)(k-j+1) \, k^j}{j!} \qquad (3.6)

\le \frac{e \, (k-1)^{k-1} \sqrt{2\pi(k-1)}}{k^{k+1}} \sum_{j=0}^{\infty} (k-j)(k-j+1) \frac{k^j e^{-k}}{j!} \qquad (3.7)

(the extension of the sum to infinity is valid since (k − j)(k − j + 1) ≥ 0 for all integers j). Note that the summation in Formula (3.7) is exactly E[(k − J)(k − J + 1)] for a Poisson random variable J with parameter k, which therefore has value k. Thus

z_k^* \le e \sqrt{2\pi} \, \frac{(k-1)^{k-1/2}}{k^{k}}, \qquad (3.8)

and as (k − 1)^{k−1/2} / k^k → 0 when k → ∞, the theorem is proved.

As will be shown in Section 3.4, experimental results support these conclusions, even for relatively small values of d. Table 3-1 provides the value of z* for MAPs of sizes n = 2, 3 ≤ d ≤ 10.
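The closed-form sum is easy to evaluate numerically (our own sketch; the function name is illustrative). For k = 2^{d−1} it reproduces the values listed in Table 3-1:

```python
from math import comb, factorial

def mean_optimal_n2(d):
    """Closed-form mean optimal cost for the n = 2 MAP with i.i.d.
    exponential(1) costs: sum over j of C(k-1, j) (j+2)! / k^(j+2),
    where k = 2^(d-1) is the number of independent feasible solutions."""
    k = 2 ** (d - 1)
    return sum(comb(k - 1, j) * factorial(j + 2) / k ** (j + 2)
               for j in range(k))

table = {d: mean_optimal_n2(d) for d in range(3, 11)}
# table[3] is about 0.8047 and table[10] about 0.056,
# matching the d = 3 and d = 10 entries of Table 3-1.
```

Python's exact integer arithmetic keeps the huge binomials and factorials for d = 10 (k = 512) from overflowing before the final division.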
We note that a similar approach and similar results may be obtained for other distributions of cost coefficients. For example, we have similar results if the cost coefficients are independent gamma distributed random variables, since the sum of gamma random variables is again a gamma random variable.

Table 3-1: Mean optimal solution costs obtained from the closed form equation for MAPs of sizes n = 2, 3 ≤ d ≤ 10, with cost coefficients that are independent exponentially distributed with mean one.

d (n = 2)    z*
3            0.804
4            0.530
5            0.356
6            0.242
7            0.167
8            0.116
9            0.080
10           0.056

3.3 Branch and Bound Algorithm

This section describes the branch and bound (B&B) algorithm used in the experiments to optimally solve the MAPs. Branch and bound is essentially an implicit enumeration algorithm. The worst-case scenario for the algorithm is to have to calculate every single feasible solution. However, by using a bounding technique, the algorithm is typically able to find an optimal solution by only searching a limited number of solutions. The index-based B&B is an extension of the three-dimensional B&B proposed by Pierskalla [87], where an index tree data structure is used to represent the cost coefficients. There are n levels in the index tree with n^{d−1} nodes on each level, for a total of n^d nodes. Each level of the index tree has the same value in the first index. A feasible solution can be constructed by first starting at the top level of the tree. The partial solution is developed by moving down the tree one level at a time and adding a node that is feasible with the partial solution. The number of nodes that are feasible with a partial solution developed at level i, for i = 1, 2, ..., n, is (n − i)^{d−1}. A complete feasible solution is obtained upon reaching the bottom, or nth, level of the tree. Deeper MAP tree representations provide more opportunities for B&B algorithms to eliminate branches.
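The count (n − i)^{d−1} can be checked by direct enumeration (an illustrative sketch of ours): after fixing a partial solution through level i, a node on the next level is feasible iff it shares no index value with the partial solution in any of the last d − 1 dimensions.

```python
from itertools import product

def feasible_nodes_after_diagonal(n, d, i):
    """Number of next-level nodes compatible with the partial solution
    that picks the 'diagonal' node (t, t, ..., t) on levels t = 1..i."""
    used = set(range(1, i + 1))  # index values consumed in each trailing dim
    count = 0
    for idx in product(range(1, n + 1), repeat=d - 1):
        if all(v not in used for v in idx):
            count += 1
    return count

# For n = 4, d = 3 the count equals (4 - i)^2 at every level i.
checks = [feasible_nodes_after_diagonal(4, 3, i) == (4 - i) ** 2
          for i in range(5)]
```

Each of the d − 1 trailing dimensions independently has n − i values left, which is exactly where the (n − i)^{d−1} in the text comes from.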
Therefore, we would expect the index-based B&B to be more effective for a larger number of elements in each dimension.

3.3.1 Procedure

The B&B approach proposed here finds the optimal solution by moving through the index-based tree representation of the MAP. The algorithm avoids having to check every feasible solution by eliminating branches with lower bounds that are greater than the best-known solution. The approach is presented as pseudocode in Figure 3-1.

procedure IndexBB(L)
1    for i = 1, ..., n do
2        k_i <- 0
3    S <- {}; i <- 1
4    while i > 0 do
5        if k_i = |L_i| then
6            S <- S \ {s_i}
7            k_i <- 0
8            i <- i - 1
9        else
10           k_i <- k_i + 1
11           if Feasible(S, L_{i,k_i}) then
12               S <- S + {L_{i,k_i}}
13               if LB(S) < z* then
14                   if i = n then
15                       S* <- S
16                       z* <- Objective(S)
17                   else
18                       i <- i + 1
19               else
20                   S <- S \ {s_i}
21   end while
22   return (S*, z*)
end IndexBB

Figure 3-1: Branch and bound on the index tree.

The algorithm initializes the tree level markers k_i, the solution set S, and the current tree level i in Steps 1–3. The value of the best-known solution set S* is denoted by z*. Level markers are used to track the location of cost coefficients on the tree levels, and L_i is the set of coefficients at each level i. The solution set S contains the cost coefficients taken from the different tree levels. Steps 4–21 perform an implicit enumeration of every feasible path in the index-based tree. The procedure investigates every possible path below a given node before moving on to the next node in the same tree level. Once all the nodes in a given level are searched or eliminated from consideration through the use of upper and lower bounds, the algorithm moves up to the previous level and moves to the next node in the new level. Step 11 checks if a given cost coefficient L_{i,k_i}, which is the k_i-th node on level i, is feasible with the partial solution set. If the cost coefficient is feasible and if its inclusion does not cause the lower bound of the objective function to surpass the best-known solution, then the coefficient is kept in the solution set.
Otherwise, it is removed from S in Step 20. A lower bound that may be implemented to try to remove some of the tree branches is given by

LB(S) = \sum_{i=1}^{r} S_i + \sum_{i=r+1}^{n} \min_{j_2, \ldots, j_d} c_{i j_2 \ldots j_d},

where r = |S| is the size of the partial solution and S_i is the cost coefficient selected from level i of the index-based MAP representation. This expression finds a lower bound by summing the values of all the cost coefficients that are already in the partial solution and the minimum cost coefficient at each of the tree levels underneath the last level searched. The lower bound consists of n elements, one from each level. If a cost coefficient from a given level is in the partial solution, then that coefficient is used in the calculation of the lower bound. If none of the coefficients from a given level is found in the partial solution, then the smallest coefficient from that level is used. Before starting the algorithm, an initial feasible solution is needed for an upper bound. A natural selection would be the diagonal assignment

S = \{ c_{i, j_2, j_3, \ldots, j_d} \mid j_m = i \text{ for } m = 2, 3, \ldots, d; \; i = 1, 2, \ldots, n \}.

The algorithm initially partitions the cost array into n groups or tree levels with respect to the value of their first index. The first coefficient to be analyzed is the node furthest to the left at level i = 1. If the lower bound of the partial solution that includes that node is lower than the initial solution, the partial solution is kept. It then moves to the next level with i = 2 and again analyzes the node furthest to the left. The algorithm keeps moving down the tree until it either reaches the bottom or finds a node that results in a partial solution having a lower bound value higher than the initial solution. If it does reach the bottom, a feasible solution has been found. If the new solution has a lower objective value than the initial solution, the latest solution is kept as the current best-known solution.
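A minimal runnable version of this index-tree search (our own sketch, using recursion in place of the explicit level markers of Figure 3-1, with the same lower bound and the same diagonal initial solution) is:

```python
from itertools import product, permutations

def index_bb(costs, n, d):
    """Depth-first branch and bound over the index tree of an axial MAP.
    costs maps 0-based d-tuples to values; level i fixes first index i."""
    levels = [[key for key in costs if key[0] == i] for i in range(n)]
    level_min = [min(costs[key] for key in lv) for lv in levels]
    # Initial upper bound: the diagonal assignment c[i, i, ..., i].
    best = [sum(costs[(i,) * d] for i in range(n))]

    def feasible(node, partial):
        # A node clashes if it repeats any trailing index value already used.
        return all(node[m] != s[m] for s in partial for m in range(1, d))

    def search(i, partial, value):
        if i == n:
            best[0] = min(best[0], value)
            return
        for node in levels[i]:
            if feasible(node, partial):
                # LB = cost so far + this node + per-level minima below.
                lb = value + costs[node] + sum(level_min[i + 1:])
                if lb < best[0]:  # prune branches that cannot improve
                    search(i + 1, partial + [node], value + costs[node])

    search(0, [], 0.0)
    return best[0]

# Sanity check against exhaustive enumeration on a random 3x3x3 instance.
import random
rng = random.Random(3)
costs = {key: rng.random() for key in product(range(3), repeat=3)}
brute = min(sum(costs[(i, p[i], q[i])] for i in range(3))
            for p in permutations(range(3)) for q in permutations(range(3)))
```

Pruning with a strict inequality, as in Step 13 of the pseudocode, is safe: a branch whose bound merely ties the incumbent cannot produce a strictly better solution.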
On the other hand, if the algorithm does encounter a node whose lower bound is greater than the best-known solution, then that node and all the nodes underneath it are eliminated from the search. The algorithm then analyzes the next node to the right of the node that did not meet the lower bound criterion. Once all nodes at a given level have been analyzed, the algorithm moves up to the previous level and begins searching at the next node to the right of the last node analyzed on that level. We discuss different modifications that may be implemented on the original B&B algorithm to help increase the rate of convergence. The B&B algorithm's performance is directly related to the tightness of the upper and lower bounds. The rest of this section addresses the problem of obtaining a tighter upper bound. The objective is to obtain a good solution as early as possible. By having a low upper bound early in the procedure, we are able to eliminate more branches and guarantee an optimal solution in a shorter amount of time. The modifications that we introduce are sorting the nodes in all the tree levels and performing a local search algorithm that guarantees local optimality.

3.3.2 Sorting

There are two ways to sort the index-based tree. The first is to sort every level of the tree once before the branch and bound algorithm begins. By using this implementation, the sorting complexity is minimized. However, the drawback is that infeasible cost coefficients are mixed in with the feasible ones. The algorithm would have to perform a large number of feasibility checks whenever a new coefficient is needed from each level. The second way to sort the tree is to perform a sort procedure every time a cost coefficient is chosen. At a given tree level, a set of coefficients that are still feasible with the partial solution is created and sorted. Finding the coefficients that are feasible is computationally much less demanding than checking whether a particular coefficient is still feasible.
The drawback with the second method is the high number of sorting procedures that need to be performed. For our test problems, we have chosen to implement the first approach, which is to perform a single initial sorting of the coefficients for each tree level. This choice was made because the first method performed best in practice for the instances we tested.

3.3.3 Local Search

The local search procedure improves upon the best-known solution by searching within a predefined neighborhood of the current solution to see if a better solution can be found. If an improvement is found, this solution is then stored as the current solution and a new neighborhood is searched. When no better solution can be found, the search is terminated and a local minimum is returned. Because an optimal solution in one neighborhood definition is not usually optimal in other neighborhoods, we implement a variable neighborhood approach. A description of this metaheuristic and its applications to different combinatorial optimization problems is given by Hansen and Mladenović [47]. Variable neighborhood search works by exploring multiple neighborhoods one at a time. For our branch and bound algorithm, we implement the intra-permutation 2- and n-exchanges and the inter-permutation 2-exchange presented by Pasiliao [84]. Starting from an initial solution, we define and search the first neighborhood to find a local minimum. From that local minimum, we redefine and search a new neighborhood to find an even better solution. The metaheuristic continues until all neighborhoods have been explored.

3.4 Computational Experiments

In this section, the computational experiments performed are explained. In the first subsection, we describe the experimental procedures employed. Then, in the latter subsections, the results from the experiments are presented and discussed. The results include mean optimal costs and their standard deviations, for each type of problem and size.
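As an illustration of the intra-permutation 2-exchange (a sketch of ours, not Pasiliao's implementation): represent a feasible MAP solution as d − 1 permutations and repeatedly swap two entries within one permutation whenever the swap lowers the cost.

```python
from itertools import product
import random

def two_exchange_local_search(costs, n, d, perms):
    """Intra-permutation 2-exchange: swap two positions within one of the
    d-1 permutations while the objective improves. Returns a local minimum."""
    perms = [list(p) for p in perms]

    def value():
        return sum(costs[(i,) + tuple(p[i] for p in perms)] for i in range(n))

    improved = True
    while improved:
        improved = False
        for p in perms:
            for a in range(n):
                for b in range(a + 1, n):
                    old = value()
                    p[a], p[b] = p[b], p[a]
                    if value() < old:
                        improved = True
                    else:
                        p[a], p[b] = p[b], p[a]  # undo non-improving swap
    return perms, value()

rng = random.Random(5)
n, d = 4, 3
costs = {key: rng.random() for key in product(range(n), repeat=d)}
start = [list(range(n)) for _ in range(d - 1)]        # diagonal solution
start_value = sum(costs[(i, i, i)] for i in range(n))
perms, local_value = two_exchange_local_search(costs, n, d, start)
```

Every accepted swap strictly decreases the objective over a finite solution set, so the loop terminates at a 2-exchange local minimum no worse than the starting solution.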
In the last subsection we present some interesting results, based on curve fitting models. 3.4.1 Experimental Procedures The experimental procedures involved creating and exactly solving MAPs using the B&B algorithm described in the preceding section. There were at least 15 runs for each experiment where the number of runs was selected based on the practical amount of time to complete the experiment. Generally, as the size of the problem increased, the number of runs in the experiment had to be decreased. Also, as the dimension, d, of the MAP increased, the maximum number elements, n, decreased. Tables 32 and 33 provide a summary of the size of each experiment for the various types of problems. The time taken by an experiment ranged from as low as a few seconds to as high as 20 hours on a 2.2 GHz Pentium 4 processor. We observed that problem instances with standard normal assignment costs took considerably longer time to solve; there fore, problem sizes and number of runs per experiment are smaller. The assignment costs cl...id for each problem instance were drawn from one of three distributions. The Table 32: Number of runs for each experiment with uniform or exponential assign ment costs. n\d 3 4 5 6 7 8 9 10 2 1000 1000 1000 1000 1000 1000 1000 500 3 1000 1000 1000 1000 1000 1000 1000 100 4 1000 1000 1000 1000 1000 500 500 50 5 1000 1000 1000 500 200 200 100 20 6 1000 1000 500 500 100 50 20 7 1000 1000 500 200 50 20 15 8 1000 1000 200 50 20 15 9 1000 1000 50 20 15 10 1000 1000 20 15 11 500 500 20 12 500 500 20 13 200 200 15 14 100 50 15 100 16 50 17 50 18 30 19 20 20 15 first distribution of assignment costs used was the uniform U[O, 1]. The next distribu tion used was the exponential with mean one, being determined by cil... = In U. Finally, the third distribution used was the standard normal, N(0, 1), with values determined by the polar method [63] as follows: 1. Generate U1 and U2, for U1, U2 ~ U[0, 1]. 2. Let V1 = 2U1 1,V2 2U2 1, and W V12 + V22. 3. 
3. If W > 1, go back to step 1; else c_{i_1...i_d} = V_1 \sqrt{-2 \ln W / W}.

3.4.2 Mean Optimal Solution Costs

A summary of results for MAPs is provided in Tables 3-4, 3-5, and 3-6. We observe that in all cases the mean optimal cost gets smaller as the size of the MAP increases. Figure 3-2 shows the plots for problems with dimension d = 3, d = 5, d = 7 and d = 10, as examples for the exponential case (plots for the uniform case are similar). We observe that plots for higher dimensional problems converge to zero for smaller values of n. This is emphasized in the surface plot, Figure 3-3, of a subset of the data. Figure 3-4 shows plots for problems with the same numbers of dimensions as the problems in Figure 3-2, but for the standard normal case. Different from the uniform and exponential cases, the mean optimal solution costs appear to approach −∞ with increasing n.

Table 3-3: Number of runs for each experiment with standard normal assignment costs.

n\d     3     4     5     6     7     8     9    10
2    1000  1000  1000  1000  1000  1000  1000   500
3    1000  1000  1000  1000  1000  1000  1000   500
4    1000  1000  1000  1000  1000   500    50    15
5    1000  1000  1000   500   200    20    15
6    1000  1000   500   100    50    15
7    1000  1000    50    20    15
8    1000  1000    20    15
9    1000   100    15
10   1000    20    15
11    500
12     50
13     15
14     15

Figure 3-2: Plots of mean optimal costs for four different sized MAPs (3-, 5-, 7- and 10-dimensional) with exponential assignment costs, plotted against n, the number of elements.

We observe that in the uniform and exponential cases the standard deviation of optimal costs converges to zero as the size of the MAP gets larger. Clearly, this just confirms the asymptotic characteristic of the results. However, a trend is difficult to detect for the standard deviation of optimal costs in the standard normal case. Figure 3-5 shows the plots for the three-, five-, seven- and ten-dimensional problems, as examples, for the exponential case (plots for the uniform case are similar). No clear trend is seen in Figure 3-6, which shows the plots for the same dimensional problems but for the standard normal case.

Table 3-4: Mean optimal costs for different sizes of MAPs with independent assignment costs that are uniform on [0, 1].

n\d      3          4          5          6
3    0.54078    0.295578   0.155189   0.0853185
4    0.480825   0.209272   0.0884739  0.0386061
5    0.41716    0.151543   0.0549055  0.019867
6    0.374046   0.114551   0.0357428  0.011516
7    0.334805   0.0897928  0.02465    0.0067875
8    0.30329    0.0724017  0.0175195  0.0041348
9    0.277139   0.0587219  0.012982   0.00253
10   0.252156   0.0486118  0.00961    0.00168
11   0.237884   0.0419366  0.00762
12   0.216287   0.035617   0.006025
13   0.205552   0.0310225
14   0.185769   0.02696
15   0.180002
16   0.16832
17   0.162104
18   0.14787
19   0.14583
20   0.129913

Table 3-5: Mean optimal costs for different sizes of MAPs with independent assignment costs that are exponential with mean 1.

n\d      3          4           5          6
3    0.63959    0.319188    0.165122   0.0829287
4    0.531126   0.212984    0.0903552  0.0391171
5    0.454308   0.155833    0.0548337  0.020256
6    0.396976   0.116469    0.0355034  0.0116512
7    0.349543   0.0909123   0.0251856  0.0068995
8    0.310489   0.0723551   0.0175745  0.0044
9    0.28393    0.0595148   0.01233    0.002665
10   0.263187   0.0449353   0.009165   0.0019
11   0.238954   0.041809    0.007305
12   0.218666   0.03546241  0.005385
13   0.203397   0.030967    0.0044667
14   0.193867   0.0279
15   0.181644
16   0.172359
17   0.161126
18   0.15081
19   0.144787
20   0.134107

Table 3-6: Mean optimal costs for different sizes of MAPs with independent assignment costs that are standard normal.

n\d      3          4          5          6
3    −3.41537   −4.59134   −5.57906   −6.44952
4    −5.6486    −7.52175   −9.05299   −10.3701
5    −8.00522   −10.6145   −12.6924   −14.5221
6    −10.6307   −13.9336   −16.5947   −18.7402
7    −13.2918   −17.2931   −20.6462   −23.4246
8    −16.1144   −20.8944   −24.7095   −28.1166
9    −18.9297   −24.5215   −28.7188
10   −21.7916   −28.6479
11   −24.7175   −31.9681
12   −27.9675
13   −30.9362
14   −34.4204

Figure 3-3: Surface plot of mean optimal costs for MAPs of sizes 3 ≤ d ≤ 10 and 2 ≤ n ≤ 10 with exponential assignment costs.

Figure 3-4: Plots of mean optimal costs for four different sized MAPs with standard normal assignment costs.

Figure 3-5: Plots of standard deviation of mean optimal costs for four different sized MAPs with exponential assignment costs.

Figure 3-6: Plots of standard deviation of mean optimal costs for four different sized MAPs with standard normal assignment costs.

3.4.3 Curve Fitting

Curve fits for the mean optimal solution costs were performed for the three types of problems using a least squares approach. The solver tool in Microsoft Excel was used to minimize the sum of squares. Several nonlinear models were tested for the purpose of developing a model to estimate the mean optimal cost, z*. The tested models include the following:

Power Fit: z* = A n^B
Shifted Power Fit: z* = A (n + B)^C
Scaled Power Fit: z* = (A n + B)^C
Exponential: z* = A e^{Bn}
Reciprocal Quadratic: z* = 1 / (A + Bn + Cn²)

In each case the fit was calculated by fixing d and varying n. For the uniform and exponential problems, the Scaled Power Fit was found to be the best model. For the standard normal problems, the Shifted Power Fit was used. The results of curve fitting are shown in Tables 3-7, 3-8, and 3-9. We observe that the curves fit surprisingly well to the collected data. Figure 3-7 is a plot of the curve fitting model and observed data for the exponential case where d = 3. Note that the curves are nearly indistinguishable. This is typical for most problem sizes.
A closer analysis of the curve fitting parameters for both uniform and exponential type problems indicates that as the dimension of the MAP increases, the curve fitting parameter C approaches -(d - 2). A heuristic argument of why this is so is given in the following. Consider the case of uniformly distributed cost coefficients. For each level of the index tree representation of the MAP, the expected value of the minimum order statistic is given by E[X(1)] = 1/(n^(d-1) + 1), as there are n^(d-1) coefficients on each level of the tree. And as there is one coefficient from each of the n levels in a feasible solution, we may expect z̄* = O(n · n^(-(d-1))) = O(n^(-(d-2))). The same argument can be made for the exponential case, where E[X(1)] = 1/n^(d-1).

Table 3-7: Curve fitting results for fitting the form (An + B)^C to the mean optimal costs for MAPs with uniform assignment costs.

d    A      B      C       Sum of Squares
3    0.102  1.133  -1.764  8.80E-04
4    0.183  —      -2.932  7.74E-05
5    0.319  0.782  -3.359  —
6    0.300  0.776  -4.773  5.77E-07
7    0.408  0.627  -4.997  —
8    0.408  0.621  -6.000  7.91E-07
9    0.408  0.621  -7.000  3.44E-07
10   0.408  0.621  -8.000  9.50E-07

Table 3-8: Curve fitting results for fitting the form (An + B)^C to the mean optimal costs for MAPs with exponential assignment costs.

d    A      B      C       Sum of Squares
3    0.300  0.631  -1.045  5.26E-05
4    0.418  0.550  -1.930  —
5    0.406  0.601  -3.009  2.40E-06
6    0.420  0.594  -3.942  8.39E-08
7    0.414  0.601  -5.001  9.42E-07
8    0.413  0.617  -5.999  9.45E-07
9    0.418  0.600  -7.000  1.94E-07
10   0.414  0.607  -8.000  6.68E-07

Again using a least squares approach, if we rebuild the curve fitting models for the uniform and exponential cases by fixing C = 2 - d, we find, as expected, that the lower dimension models result in higher sums of squares. The worst fitting model is that of the uniform case with d = 3. In this case the sum of squares increases from 8.80E-04 to 3.32E-03, and the difference between the model estimate and the actual results for n = 3 increases from roughly 2% to 5%.
Although we believe fixing C = 2 - d can provide adequate fitting models, in the remainder of this chapter we continue to use the more accurate models (where C is not fixed to C = 2 - d); the higher dimension problems are, of course, unaffected.

Table 3-9: Curve fitting results for fitting the form A(n + B)^C to the mean optimal costs for MAPs with standard normal assignment costs.

d    A       B       C      Sum of Squares
3    -1.453  -0.980  1.232  7.27E-02
4    -1.976  -0.986  1.211  1.54E-01
5    -2.580  -1.053  1.164  2.85E-02
6    -2.662  -0.915  1.204  1.68E-02
7    -3.124  -0.956  1.174  1.20E-03
8    -3.230  -0.882  1.194  3.13E-03
9    -3.307  -0.819  1.218  1.71E-03
10   -3.734  -0.874  1.187  1.52E-04

Figure 3-7: Three dimensional MAP with exponential assignment costs. Plot includes both observed mean optimal cost values and fitted values. The two lines are nearly indistinguishable.

An obvious question to ask is what happens with variations of the distribution parameters. For example, what is the numerical estimation of z̄* when the cost coefficients are distributed as uniform on [a, b] or exponential with mean λ? We propose, without proof, the following numerical models to estimate z̄*. For cost coefficients that are uniform on [a, b], the curve fit or numerical estimation is z̄* ≈ ẑ* = an + (b - a)(An + B)^C, using the curve fit parameters for the uniform case on [0, 1] found in Table 3-7. For cost coefficients that are exponential with mean λ, the curve fit is z̄* ≈ ẑ* = λ(An + B)^C, using the curve fit parameters for the exponential case with mean 1 found in Table 3-8.
Table 3-10: Estimated and actual mean optimal costs from ten runs for variously sized MAPs developed from different distributions. Included are the average difference and largest difference between the estimated mean optimal cost and the optimal cost.

d    n    Distribution            ẑ*        z̄*        Ave Δ     Max Δ
3    12   Uniform on [5,10]       61.1      61.1      0.143     0.428
3    20   Expo, λ = 3             0.415     0.404     0.0618    0.154
5    9    N(μ = 0, σ = 3)         -86.4     -86.5     1.62      3.48
5    12   Uniform on [-1,1]       -12       -12       1.65E-03  3.16E-03
7    5    N(μ = 5, σ = 2)         -7.24     -7.27     0.448     0.73
7    7    Expo, λ = 10            1.90E-02  —         2.62E-03  5.47E-03
8    6    Uniform on [10,20]      60        60        0.003     0.008
8    8    Expo, λ = 1.5           4.13E-04  3.07E-04  1.15E-04  2.30E-04
9    5    N(μ = -5, σ = 2)        -62.8     -63.2     0.944     2.26
9    7    Uniform on [-10,-5]     -70       -70       3.60E-04  6.70E-04
10   4    N(μ = 1, σ = 4)         -53.8     -53.3     0.831     2.12
10   5    Expo, λ = 2             7.57E-04  8.00E-04  1.10E-04  4.03E-04

The situation is just a bit more involved for the normal case. Consider when the mean of the standard normal is changed from 0 by an amount μ and the standard deviation is changed by a factor σ; that is, the cost coefficients have the distribution N(μ, σ). Then z̄* ≈ ẑ* = nμ + σA(n + B)^C, using the curve fit parameters found in Table 3-9.

To assist in validating the numerical estimation models discussed above, experiments were conducted to compare the numerical estimates of the mean optimal costs with the results of solved problems. The experiments created ten instances of different problem sizes and of different distributions and solved them to optimality. A variety of parameters were used for each distribution in an effort to exercise the estimation models. In the first experiment, we report the mean optimal solution, the estimated mean optimal solution, the max Δ, and the mean Δ, where Δ = |ẑ* - z*(I)|; that is, Δ for a problem instance I is the difference between the estimated mean optimal cost and the actual optimal cost. Results of these experiments are provided in Table 3-10. We observe that the numerical estimates of the mean optimal costs are quite close to the actual results. Similar to Figure 3-7, Figures 3-8, 3-9 and 3-10 plot results of z̄* and ẑ* (fitted data) for random instances of different sized problems. As in the above experiments, the number of runs is limited to ten for each problem size.
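The estimation models above reduce to one-line formulas. The following sketch (our own, not the dissertation's code) plugs the d = 3 parameters as we read them from Tables 3-7 and 3-8 (with C negative) into the uniform and exponential models, reproducing the first two estimates of Table 3-10.

```python
def est_uniform(n, a, b, A, B, C):
    # z ≈ a*n + (b - a)*(A*n + B)**C, parameters from the uniform-[0,1] fit
    return a * n + (b - a) * (A * n + B) ** C

def est_expo(n, lam, A, B, C):
    # z ≈ lam*(A*n + B)**C, parameters from the mean-1 exponential fit
    return lam * (A * n + B) ** C

# Table 3-7, d = 3 (uniform): A = 0.102, B = 1.133, C = -1.764
z_unif = est_uniform(12, 5, 10, 0.102, 1.133, -1.764)
# Table 3-8, d = 3 (exponential): A = 0.300, B = 0.631, C = -1.045
z_expo = est_expo(20, 3, 0.300, 0.631, -1.045)
```

With these parameters, z_unif comes out near 61.1 and z_expo near 0.415, matching the corresponding rows of Table 3-10.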
As the plots of z̄* and ẑ* are close to each other, this further validates the numerical models for estimating z̄*.

Figure 3-8: Plots of fitted and mean optimal costs from ten runs of variously sized MAPs (d = 4 and d = 8) developed from the uniform distribution on [10, 20]. Note that the observed data and fitted data are nearly indistinguishable.

Figure 3-9: Plots of fitted and mean optimal costs from ten runs of variously sized MAPs (d = 4 and d = 8) developed from the exponential distribution with mean three.

Figure 3-10: Plots of fitted and mean optimal costs from ten runs of variously sized MAPs (d = 4 and d = 8) developed from a normal distribution, N(μ = 5, σ = 2).

3.5 Algorithm Improvement Using Numerical Models

The numerical estimates of the mean optimal cost can be used to accurately predict the optimal solution cost of a random instance of an MAP that is constructed from a uniform, exponential or normal distribution. However, we still lack a solution. In this section, we investigate whether the numerical estimates can be used to improve a branch and bound (B&B) exact solution method.

3.5.1 Improvement of B&B

The B&B solution method under consideration is that described in Section 3.3 of this chapter. Recall that the B&B performs best when a tight upper bound is established early in the process. A tight upper bound allows significant pruning of the branches of the search tree.
We consider the use of the numerical estimates to set tighter upper bounds than would be available through other primal heuristics. An advantage of a primal heuristic is, of course, that a solution is at hand, whereas the numerical estimate is a bound only, with no solution. The heuristic used in Section 3.3 randomly selects a starting solution and then performs a variable local neighborhood search to find a local minimum. Alternatively, we also consider the global greedy and a variation of the maximum regret approaches as suggested by Balas and Saltzman [10]. In the global greedy approach, a starting solution is constructed step by step by selecting the smallest feasible cost coefficient; then a variable local neighborhood search is applied to find a local minimum. For maximum regret, a feasible solution is constructed as follows. The difference between the two smallest feasible costs associated with each level of the index tree is calculated. This difference is called the regret, as it represents the penalty for not choosing the smallest cost in the row.

Table 3-11: Results showing comparisons between three primal heuristics and the numerical estimate of the optimal cost for several problem sizes and types. Shown are the average feasible solution costs from 50 runs of each primal heuristic on random instances.
d    n    Distribution        Random  Greedy  Max Regret  Numerical Estimate
6    10   Uniform on [0,1]    0.530   0.216   0.165       0.00177
7    7    Uniform on [0,1]    0.433   0.201   0.182       0.00195
8    6    Uniform on [0,1]    0.429   0.186   0.168       0.0019
9    4    Uniform on [0,1]    0.320   0.218   0.214       0.00341
10   4    Uniform on [0,1]    0.283   0.219   0.216       0.00152
6    10   Expo, λ = 1         0.611   0.226   0.2426      0.00251
7    7    Expo, λ = 1         0.490   0.244   0.216       0.00190
8    6    Expo, λ = 1         0.430   0.217   0.175       0.00114
9    4    Expo, λ = 1         0.385   —       0.270       0.00318
10   4    Expo, λ = 1         0.320   0.224   0.215       0.00145
6    7    N(μ = 0, σ = 1)     -12.91  -21.29  -21.57      -23.40
7    6    N(μ = 0, σ = 1)     -12.91  -18.51  -18.97      -20.89
8    5    N(μ = 0, σ = 1)     -8.99   -15.77  -16.08      -17.51
9    4    N(μ = 0, σ = 1)     -6.99   -11.67  -11.88      -13.53
10   4    N(μ = 0, σ = 1)     -7.00   -12.60  -12.67      -14.44

The smallest feasible cost in the row with the largest regret is selected. This differs from the approach of Balas and Saltzman [10], where they consider every row in the multidimensional cost matrix, whereas we consider only the n rows in the index tree. We took this approach as a tradeoff between complexity and quality of the starting solution. Table 3-11 provides a comparison of starting solution cost values for the three primal heuristics described above, along with a comparison of the numerical estimate of the optimal cost for various problem sizes and distribution types. The table shows the average results over 50 randomly generated instances. In terms of an upper bound, the results of Table 3-11 indicate that, generally, the greedy primal heuristic is better than the random heuristic, and max regret is better than greedy. For the uniform and exponential cases, the numerical estimate of the optimal cost is clearly smaller than any of the results of the heuristics. In the normal cases, the numerical estimate is not significantly smaller. For the uniform and exponential cases, it appears much is to be gained by somehow incorporating the numerical estimate into an upper bound. We propose using a factor r > 1 of the numerical estimate as the upper bound.
If a feasible solution is found, the new solution serves as the upper bound. If a feasible solution is not found, then the estimated upper bound is incremented upwards until a feasible solution is found. This process guarantees an optimal solution will be found. Figure 3-11 is fundamentally the same as Figure 3-1 except for the outside loop, which increments the estimated upper bound upward until a feasible solution is found.

procedure IndexBB(L)
  solutionfound := false
  while solutionfound = false do
    z̄ := z̄ + δ
    for i = 1, ..., n do k_i := 0
    S := ∅
    i := 1
    while i > 0 do
      if k_i = |L_i| then
        S := S \ {s_i}
        k_i := 0
        i := i - 1
      else
        k_i := k_i + 1
        if Feasible(S, L_{i,k_i}) then
          S := S ∪ {L_{i,k_i}}
          if LB(S) < z̄ then
            if i = n then
              S* := S
              z̄ := Objective(S)
              solutionfound := true
            else
              i := i + 1
          else
            S := S \ {s_i}
    end
  end
  return (S*, z̄)
end IndexBB

Figure 3-11: Branch and bound on the index tree. The upper bound z̄ starts at the estimate scaled by r, and the outer loop raises it by the increment δ until a feasible solution is found.

The tradeoff which must be considered is that if the upper bound is estimated too low and incremented upwards too slowly, then it may take many iterations over the index tree before a feasible solution is found. However, no benefit is gained by setting the upper bound too high. We found through less-than-rigorous analysis that setting r to a value such that the upper bound is incremented upward by one standard deviation of the optimal cost (see Figures 3-5 and 3-6) is a nice compromise.

3.5.2 Comparison of B&B Implementations

Table 3-12 compares the performance of the B&B algorithm using the random primal heuristic for a starting upper bound, versus using the maximum regret heuristic, versus using a numerical estimate for the upper bound. The table shows the average times to solution of five runs on random instances of various problem sizes and distribution types. In the uniform and exponential cases, we observe that B&B using maximum regret generally does slightly better than using a random starting solution.
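The incrementing-upper-bound scheme of Figure 3-11 can be sketched in miniature. The following Python illustration is our own simplification, not the benchmarked implementation: it treats the 3-dimensional case, where a solution is a pair of permutations, uses the partial solution cost as the lower bound, and retries with a raised bound whenever the estimate admits no feasible solution. The starting bound ub and increment inc below are arbitrary.

```python
import itertools, random

def solve_bb(c, ub, inc):
    """Depth-first B&B over 3-dim MAP solutions x(i, p[i], q[i]); if the
    estimated upper bound ub admits no solution, raise it by inc and retry."""
    n = len(c)
    best = {"cost": None, "sol": None}
    ub_now = [ub]

    def dfs(i, rows, cols, cost, p, q):
        if cost >= ub_now[0]:        # prune: partial cost is a lower bound
            return
        if i == n:
            ub_now[0] = cost         # tighten the bound on every improvement
            best["cost"], best["sol"] = cost, (tuple(p), tuple(q))
            return
        for j in list(rows):
            for k in list(cols):
                rows.discard(j); cols.discard(k)
                dfs(i + 1, rows, cols, cost + c[i][j][k], p + [j], q + [k])
                rows.add(j); cols.add(k)

    while best["sol"] is None:       # outer loop of Figure 3-11
        dfs(0, set(range(n)), set(range(n)), 0.0, [], [])
        if best["sol"] is None:
            ub_now[0] += inc
    return best["cost"], best["sol"]

random.seed(1)
n = 4
c = [[[random.expovariate(1.0) for _ in range(n)] for _ in range(n)]
     for _ in range(n)]
# Exhaustive enumeration for comparison on this tiny instance.
brute = min(sum(c[i][p[i]][q[i]] for i in range(n))
            for p in itertools.permutations(range(n))
            for q in itertools.permutations(range(n)))
cost, _ = solve_bb(c, ub=0.5, inc=0.25)
```

Because the bound is only ever raised when no solution exists under it, the first pass that does find a solution still prunes exactly and returns the optimum, matching the brute-force value.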
We also observe that the approach of using a numerically estimated upper bound significantly outperforms the other approaches in solving problems with uniformly or exponentially distributed costs. However, there is no clear difference between the approaches when solving problems with normally distributed costs. This is explained by the small differences in the starting upper bounds for each approach.

3.6 Remarks

In this chapter we presented experimental results for the asymptotic value of the optimal solution for random instances of the MAP. The results lead to the following conjectures, which will be addressed in detail in Chapter 4.

Conjecture 3.3 Given a d-dimensional MAP with n elements in each dimension, if the n^d cost coefficients are independent exponentially distributed random variables with mean 1 or independent uniformly distributed random variables in [0,1], then z̄* → 0 as n → ∞ or d → ∞.

Table 3-12: Average time to solution in seconds of solving each of five randomly generated problems of various sizes and types. The experiment involved running the B&B solution algorithm with starting upper bounds developed in three ways: Random, Max Regret, and Numerical. The distributions tested were Uniform on [0,1], Expo with λ = 1, and N(μ = 0, σ = 1), with five problem sizes each.

Conjecture 3.4 Given a d-dimensional MAP with n elements in each dimension, if the n^d cost coefficients are independent standard normal random variables, then z̄* → -∞ as n → ∞ or d → ∞.
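Conjecture 3.3 can be probed on tiny instances by brute force. The following Monte Carlo experiment is our own illustration, not an experiment from the dissertation: it averages the exact optimal costs of random 3-dimensional MAPs with Exp(1) costs and shows the mean shrinking as n grows.

```python
import itertools, random

def mean_optimal(n, trials, rng):
    """Average brute-force optimal cost of 3-dim MAPs with Exp(1) costs."""
    perms = list(itertools.permutations(range(n)))
    total = 0.0
    for _ in range(trials):
        c = [[[rng.expovariate(1.0) for _ in range(n)] for _ in range(n)]
             for _ in range(n)]
        total += min(sum(c[i][p[i]][q[i]] for i in range(n))
                     for p in perms for q in perms)
    return total / trials

rng = random.Random(7)
z2 = mean_optimal(2, 200, rng)
z4 = mean_optimal(4, 200, rng)
# The estimate for n = 4 comes out clearly below the one for n = 2.
```

With more dimensions or elements the brute-force enumeration blows up factorially, which is exactly why Chapter 4 turns to a probabilistic argument instead.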
We also presented in this chapter curve fitting results to accurately estimate the mean optimal costs of variously sized problems constructed with cost coefficients independently drawn from the uniform, exponential or normal distributions. Of high interest, of course, is how numerical estimates of the mean optimal cost can be used to improve existing solution algorithms, or whether they can be used to find new solution algorithms. To this end, we have shown that using numerical estimates can significantly improve the performance of a B&B exact solution method.

CHAPTER 4
PROOFS OF ASYMPTOTIC CHARACTERISTICS OF THE MAP

4.1 Introduction

The experimental work detailed in Chapter 3 leads to conjectures concerning the asymptotic characteristics of the mean optimal costs of randomly generated instances of the MAP where costs are assigned independently to assignments. In this chapter, we provide proofs of more generalized instances of Conjecture 3.3 and prove Conjecture 3.4. The proofs are based on building an index tree [87] to represent the cost coefficients of the MAP and then selecting a minimum subset of cost coefficients such that at least one feasible solution can be expected from this subset. Then an upper bound on the cost of this feasible solution is established and used to complete the proofs. Throughout this chapter we consider MAPs with n elements in each of the d dimensions.

Before presenting the theorems and their proofs concerning the asymptotic nature of these problems, we first consider a naive approach [28] to establishing the asymptotic characteristics based on some greedy algorithms.

4.2 Greedy Algorithms

Consider the case of the MAP where cost coefficients are independent exponentially distributed random variables with mean 1. By Conjecture 3.3 the mean optimal costs are thought to go to zero with increasing problem size. Suppose we consider the solution from a greedy algorithm.
As the greedy solution serves as an upper bound on the optimal solution, we could try to prove the conjecture by showing that the mean of the suboptimal solutions goes to zero with increasing problem size. However, as will be shown, this is difficult with two common greedy algorithms.

4.2.1 Greedy Algorithm 1

The first algorithm that we consider uses the index tree data structure proposed by Pierskalla [87] to represent the cost coefficients of the MAP. There are n levels in the index tree with n^(d-1) nodes on each level, for a total of n^d nodes. Each level of the index tree has the same value in the first index. A feasible solution can be constructed by first starting at the top level of the tree. The partial solution is developed by moving down the tree one level at a time and adding a node that is feasible with the partial solution. The number of nodes at level i that are feasible to a partial solution developed from levels 1, 2, ..., i-1 is (n - i + 1)^(d-1). A complete feasible solution is obtained upon reaching the bottom, or nth, level of the tree. The proposed greedy algorithm is as follows:

Input: MAP of dimension d and n elements in each dimension in the form of an index tree.
Build a partial solution S_1, i = 1, by choosing the smallest cost coefficient from row 1 of the tree.
For i = 2, ..., n, continue to construct a solution by choosing the smallest cost coefficient in row i of the tree that is feasible with S_{i-1} constructed from rows 1, ..., i-1.

We wish to calculate the expected solution cost from this algorithm for the MAP constructed from i.i.d. exponential random variables with mean 1. Let the mean solution cost resulting from the algorithm be represented by z̄_G. Suppose that X_1, X_2, ..., X_k are k i.i.d. exponential random variables with mean 1. Let X_(i) be the ith smallest of these. Applying order statistics [33], we have the following expression for the expected minimum value of k independent identically distributed exponential random variables: E[X_(1)] = 1/k.
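The order-statistic fact just quoted (the minimum of k i.i.d. Exp(1) variables is itself exponential with rate k, hence mean 1/k) is easy to spot-check by simulation; this sketch is our own illustration.

```python
import random

def mean_min_exp(k, trials, rng):
    """Monte Carlo estimate of E[X_(1)] for k i.i.d. Exp(1) variables."""
    return sum(min(rng.expovariate(1.0) for _ in range(k))
               for _ in range(trials)) / trials

rng = random.Random(0)
est = mean_min_exp(10, 20000, rng)  # theory gives 1/k = 0.1
```

The standard error here is about 0.1/sqrt(20000), so the estimate lands within a percent or two of 1/k.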
We may now construct a feasible solution using the above greedy algorithm. We do so by recalling that the number of nodes at level i+1 that are feasible to a partial solution developed down to level i, for i = 1, 2, ..., n-1, is (n - i)^(d-1). Considering this and the fact that cost coefficients are independent, the expected solution cost of S_1 is 1/n^(d-1), the expected solution cost of S_2 is 1/n^(d-1) + 1/(n-1)^(d-1), and so forth. Therefore, we find

z̄_G = Σ_{i=1}^{n} (1/(n - i + 1))^(d-1)   (4.1)
    ≥ 1,                                  (4.2)

where inequality (4.2) holds because the nth term of equation (4.1) is one. Since z̄_G ≥ 1, we conclude this greedy approach cannot be used to prove Conjecture 3.3. However, maybe a more global approach will work.

4.2.2 Greedy Algorithm 2

The following algorithm is described by Balas and Saltzman [10] as the GREEDY heuristic. The algorithm is as follows:

Input: MAP of dimension d and n elements in each dimension as matrix A.
For i = 1, ..., n, construct the partial solution S_i by choosing the smallest element in matrix A and then excluding the d rows covered by this element.

Using this covering approach, we see the number of nodes that are feasible to a partial solution developed up to iteration i, for i = 1, 2, ..., n, is (n - i + 1)^d. For example, all n^d cost coefficients are considered in the first iteration. The next iteration has (n - 1)^d nodes for consideration. The expected solution cost of S_1 is 1/n^d. The expected solution cost of S_2 is 1/n^d + 1/n^d + 1/(n-1)^d. The extra 1/n^d appears in the expression because, in general, the expected minimum value of the uncovered nodes is at least as much as the expected minimum value found in the previous iteration. We could now develop the expression for z̄_G; however, we note that the algorithm's last iteration will consider only one cost coefficient. Therefore, again, we have the result that z̄_G ≥ 1 when using this algorithm. We conclude that these simple greedy approaches cannot be used to prove the conjectures concerning the asymptotic characteristics of the MAP.
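The right-hand side of (4.1) is easy to tabulate for Greedy Algorithm 1; a minimal sketch (using the formula as reconstructed above):

```python
def greedy_expected_cost(n, d):
    """Expected cost (4.1) of Greedy Algorithm 1: the sum over levels of the
    expected minimum of the (n - i + 1)**(d - 1) feasible Exp(1) coefficients."""
    return sum((1.0 / (n - i + 1)) ** (d - 1) for i in range(1, n + 1))

vals = [greedy_expected_cost(n, d) for n in (5, 10, 20) for d in (3, 5)]
```

However large n or d becomes, the last term of the sum is exactly one, so every tabulated value stays at or above one, in line with (4.2).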
In the next sections, we resort to a novel probabilistic approach.

4.3 Mean Optimal Costs of Exponentially and Uniformly Distributed Random MAPs

To find the asymptotic cost when the costs are uniformly or exponentially distributed, we use an argument based on the probabilistic method [7]. Basically, we show that, for a subset of the index tree, the expected value of the number of feasible paths in this subset is at least one. Thus, such a set must contain a feasible path, and this fact can be used to give an upper bound on the cost of the optimum solution. This is explained in the next proposition.

Proposition 4.1 Using an index tree to represent the cost coefficients of the MAP, randomly select a different nodes from each level of the tree and combine these nodes from each level into set A. A is expected to produce at least one feasible solution to the MAP when

a = ⌈ n^(d-1) / (n!)^((d-1)/n) ⌉   (4.3)

and |A| = na.

Proof: Consider that there are n^(d-1) cost coefficients on each of the n levels of the index tree representation of an MAP of dimension d with n elements in each dimension. There are (n^(d-1))^n paths (not necessarily feasible to the MAP) in the index tree from the top level to the bottom level. The number of feasible paths (or feasible solutions to the MAP) in the index tree is (n!)^(d-1). Therefore, the proportion Q of feasible paths to all paths in the entire index tree is

Q = (n!)^(d-1) / (n^(d-1))^n.   (4.4)

Create a set A of nodes to represent a reduced index tree by selecting a nodes randomly from each level of the overall index tree and placing them on a corresponding level in the reduced index tree. The number of nodes in A is obviously na. For this reduced index tree of A, there are a^n paths (not necessarily feasible to the MAP) from the top level to the bottom level. Since the nodes in A were selected randomly, we may now use Q to determine the expected number of feasible paths in A by simply multiplying Q by the number of all paths in the reduced tree of A.
That is, E[number of feasible paths in A] = Qa^n. We wish to ensure that the expected number of feasible paths in A is at least one. Thus, over all possible choices of the n subsets of a elements, there must be one choice such that there is at least one feasible path (in fact there may be many, since the expected value gives only the average over all possible selections). Therefore, Qa^n ≥ 1, which results in

a ≥ (1/Q)^(1/n).

Incorporating the value of Q from (4.4), we get

a ≥ n^(d-1) / (n!)^((d-1)/n).

Therefore, since a must be an integer, we get (4.3).

We now take a moment to discuss the concept of order statistics. For more complete information, refer to statistics texts such as that by David [33]. Suppose that X_1, X_2, ..., X_k are k independent identically distributed variables. Let X_(i) be the ith smallest of these. Then X_(i) is called the ith order statistic for the set {X_1, X_2, ..., X_k}. In the rest of the section, we will consider bounds for the value of the ath order statistic of i.i.d. variables drawn from a random distribution. This value will be used to derive an upper bound on the cost of the optimal solution for random instances, when n or d increases.

Note that, in some places (e.g., Equation (4.6)), we assume that a = n^(d-1)/(n!)^((d-1)/n). This is a good approximation in the following formulas because (a) if n is fixed and d → ∞, then a → ∞, and therefore there is no difference in the limit between a and n^(d-1)/(n!)^((d-1)/n); and (b) if d is fixed and n → ∞, then a → e^(d-1). This is not difficult to derive, since a = (n/(n!)^(1/n))^(d-1) and, by Stirling's approximation,

n! ≈ (n/e)^n (2πn)^(1/2), so that n/(n!)^(1/n) ≈ e/(2πn)^(1/(2n)).

But

(2πn)^(1/(2n)) = (2π)^(1/(2n)) e^(log n/(2n)),

and both factors on the right have limit equal to 1. However, e^(d-1) is a constant value, and will not change the limit of the whole formula as n → ∞.

Proposition 4.2 Let z̄_u = nE[X_(a)], where E[X_(a)] is the expected value of the ath order statistic for each level of the index tree representation of the MAP. Then z̄_u is an upper bound on the mean optimal solution cost of an instance of an MAP with independent identically distributed cost coefficients.
Proof: Consider any level j of the index tree and select the a elements with lowest cost on that level. Let A_j be the set composed of the selected elements. Since the cost coefficients are independent and identically distributed, the nodes in A_j are randomly distributed across level j. Now pick the maximum node v ∈ A_j, i.e., v = max{w | w ∈ A_j}. The expected value of v is the same as the expected value of the ath order statistic among the n^(d-1) cost values for this level of the tree. Since each level of the index tree has the same number of independent and identically distributed cost values, we may conclude that E[X_(a)] is the same for each level in the index tree. By randomly selecting a cost values from each of the n levels of the index tree, we expect to have at least one feasible solution to the MAP by Proposition 4.1. Thus, it is clear that an upper bound on the cost of this expected feasible solution is z̄_u = nE[X_(a)].

Theorem 4.3 Given a d-dimensional MAP with n elements in each dimension, if the n^d cost coefficients are independent exponentially distributed random variables with mean λ > 0, then z̄* → 0 as n → ∞ or d → ∞.

Proof: We first note that for independent exponentially distributed variables the expected value of the ath order statistic of k i.i.d. variables is given by

E[X_(a)] = Σ_{j=0}^{a-1} λ/(k - j).   (4.5)

Note that (4.5) has a terms and the term of largest magnitude is the last term. Using the last term, an upper bound on (4.5) is developed as

E[X_(a)] = Σ_{j=0}^{a-1} λ/(k - j) ≤ aλ/(k - a + 1).

Now, using Propositions 4.1 and 4.2, an upper bound for the mean optimal solution to the MAP with exponential costs may be developed as

z̄_u = nE[X_(a)] ≤ naλ/(k - a + 1) ≈ naλ/k,

where k = n^(d-1) is the number of cost elements on each level of the index tree. To prove z̄* → 0, we first substitute the values of k and a into this bound, which gives

z̄_u ≤ nλ/(n!)^((d-1)/n).   (4.6)

Let n = γ and n! = δ, where γ and δ are fixed numbers. Then (4.6) becomes

z̄_u ≤ γλ/δ^((d-1)/γ) → 0

as d gets large. Therefore, lim_{d→∞} z̄_u = 0.

Now let d - 1 = γ, where γ is some fixed number. Then (4.6) becomes

z̄_u ≤ nλ/(n!)^(γ/n).

Using Stirling's approximation n! ≈ (n/e)^n (2πn)^(1/2),

z̄_u ≤ nλ/((n/e)^γ (2πn)^(γ/(2n)))   (4.7)
    ≤ nλ e^γ/n^γ = λ e^γ n^(-(γ-1)),   (4.8)

where (4.8) holds because (2πn)^(γ/(2n)) approaches one from the right as n → ∞. Considering that λe^γ is a constant and that the exponent γ - 1 of 1/n is at least one for any γ ≥ 2, which holds because d ≥ 3, (4.8) approaches zero as n → ∞. Therefore, for the exponential case, lim_{n→∞} z̄_u = 0 and lim_{d→∞} z̄_u = 0 from above. Note that z̄* is bounded from below by zero because the lower bound of any cost coefficient is zero (a characteristic of the exponential random variable with λ > 0). Since 0 ≤ z̄* ≤ z̄_u, the proof is complete.

Theorem 4.4 Given a d-dimensional MAP with n elements in each dimension, if the n^d cost coefficients are independent uniformly distributed random variables in [0, 1], then z̄* → 0 as n → ∞ or d → ∞.

Proof: For the case of the uniform variable in [0, 1], the expected value of the ath order statistic of k i.i.d. variables is given by E[X_(a)] = a/(k + 1). Therefore, using Propositions 4.1 and 4.2, the upper bound on the mean optimal solution for an MAP with uniform costs in [0, 1] is

z̄_u = na/(k + 1) ≤ na/k,   (4.9)

where k = n^(d-1) is the number of cost elements on each level of the index tree. We now substitute the values of k and a into (4.9), which becomes

z̄_u ≤ n/(n!)^((d-1)/n).   (4.10)

Applying Stirling's approximation to (4.10), in the same way as used in Theorem 4.3, we see that z̄_u → 0 as n → ∞ or d → ∞. Note again that z̄* is bounded from below by zero because the lower bound of any cost coefficient is zero (a characteristic of the uniform random variable in [0, 1]). Since 0 ≤ z̄* ≤ z̄_u, this completes the proof.

Theorem 4.5 Given a d-dimensional MAP with n elements in each dimension, for some fixed n, if the n^d cost coefficients are independent uniformly distributed random variables in [a, b], then z̄* → na as d → ∞.
d Uc dc  d>oo d*oo d> Now, let d 1 = 7, where 7 is some fixed number. Then (4.6) becomes nA n\ z (n!)F 1 (n!) as n gets large. Using Stirling's approximation n! \ (n/e)"f/2n, n\ n\ nA ((n/e) (27n) ' nA < r(4.7) S )+(4.8) where (4.7) holds because (27r) approaches one from the right as n 0 o. Con sidering that (1)7 is a constant and that the exponent to n is greater than one for any 7 > 2, which holds because d > 3, then (4.8) will approach zero as n + oo. Therefore, for the exponential case lim z = 0 and lim z = 0 from above. n>oo d oo Note that z* is bounded from below by zero because the lower bound of any cost coefficient is zero (a characteristic of the exponential random variable with A > 0). Since 0 < z* < z*, the proof is complete. Theorem 4.4 Given a ddimensional MAP with n elements in each dimension, if the nd cost coefficients are independent unifoc 'in,,l distributed random variables in [0, 1], then z* + 0 as n oo or d + oo. Proof: For the case of the uniform variable in [0, 1], the expected value of the ath order statistic for k i.i.d. variables is given by E[Xa)] k + Therefore, using Propositions 4.1 and 4.2, the upper bound on the mean optimal solution for an MAP with uniform costs in [0, 1] is nao na  < (4.9) U k+1 k ' where k n d1 is the number of cost elements on each level of the index tree. We must now substitute the values of k and a into (4.9), which becomes I < _n! (4.10) Applying to (4.10) Stirling's approximation, in the same way as used in Theorem 4.3, we see that z*  0 as n  oo or d  oc. Note again that z* is bounded from below by zero because the lower bound of any cost coefficient is zero (a characteristic of the uniform random variable in [0, 1]). Since 0 < z* < z*, this completes the proof. Theorem 4.5 Given a ddimensional MAP with n elements in each dimension, for some fixed n, if the nd cost coefficients are independent, ; ,,'.: ,,,1 ..i distributed random variables in [a, b], then z* + na as d + oo. 
Proof: For the case of the uniform variable in [a, b], the expected value of the ath order statistic for k i.i.d. variables is given by David [33] (b a)a E[X(a)]= a+ k+1 Therefore, using Propositions 4.1 and 4.2, the upper bound on the mean optimal solution for an MAP with uniform costs in [a, b] is ( (b a)a ( (b a)a z = n a + k+ tn a+ k k+1 k (b a)na Sna+ k (4.11) k where k = n'1 is the number of cost elements on each level of the index tree. We must now substitute values of k and a into (4.11), which results S(b a)n z _< na+ n (4.12) It becomes immediately obvious from (4.12) that for a fixed n and as d oo, z*  na. As z* < z and na is an obvious lower bound for this instance of the MAP we conclude that, for fixed n, z* na as d  oo. 4.4 Mean Optimal Costs of NormalDistributed Random MAPs We want to now prove results similar to the theorems above, for the case where cost values are taken from a normal distribution. This will allow us to prove Conjec ture 3.4. A bound on the cost of the optimal solution for normal distributed random MAPs can be found, using a technique similar to the one used in the previous section. However, in this case a reasonable bound is given by the median order statistics, as described in the proof of the following theorem. Theorem 4.6 Given a ddimensional MAP, for a fixed d, with n elements in each dimension, if the nd cost coefficients are independent standard normal random vari ables, then z* oo as n oo. Proof: First note that for odd k = 2r + 1, X(r+1) is the median order statistic and for even k = 2r, we define the median as ((X(r) + X(r+1)). Obviously, the expected value of the median in both cases is zero. Let k nd1 and note that, as n or d get large, a < r for either odd or even case. Therefore we may immediately conclude E[X(a)] < 0. Using Propositions 4.1 and 4.2, we see that z* < z = nE[X(,)] and z* oo as n oo. 
Theorem 4.7 Given a d-dimensional MAP with n elements in each dimension, for a fixed n, if the n^d cost coefficients are independent standard normal random variables, then z̄* → -∞ as d → ∞.

Proof: We use the results of Cramér [32] to establish the expected value of the ith order statistic of k independent standard normal variables. With i ≤ k/2 we have

E[X_(i)] = -√(2 log k) + [log(log k) + log(4π) - 2(S_{i-1} - C)] / (2√(2 log k)) + O(1/log k),   (4.13)

where S_i = 1 + 1/2 + ... + 1/i and C denotes Euler's constant, C ≈ 0.57722. As d → ∞, k → ∞ and the last term of (4.13) may be dropped. In addition, a slight rearrangement of (4.13) is useful:

E[X_(i)] = -√(2 log k) + log(log k)/(2√(2 log k)) + log(4π)/(2√(2 log k)) - (S_{i-1} - C)/√(2 log k).   (4.14)

It is not difficult to see that as k → ∞, the sum of the first three terms of (4.14) goes to -∞. Therefore, we consider the last term of (4.14) as k → ∞. Since S_{i-1} = log(i - 1) + C + ε_{i-1}, where ε_{i-1} → 0 as i grows, we have

(S_{i-1} - C)/√(2 log k) = log(i - 1)/√(2 log k) + ε_{i-1}/√(2 log k),   (4.15)

and the second term of (4.15) goes to zero as k → ∞. Making the substitutions i = a = n^(d-1)/(n!)^((d-1)/n) and k = n^(d-1), the remaining term behaves as

log a/√(2 log k) = [(d - 1) log n - ((d - 1)/n) log(n!)] / √(2(d - 1) log n).   (4.16)

For a fixed n, the right-hand side of (4.16) is nonnegative, so the last term of (4.14) is nonpositive for large d, and E[X_(a)] is bounded above by the sum of the first three terms of (4.14), which tends to -∞ as d → ∞ (since k = n^(d-1) → ∞). Therefore, using Propositions 4.1 and 4.2, we have z̄* ≤ z̄_u = nE[X_(a)] → -∞ for fixed n as d → ∞. The proof is complete.

4.5 Remarks on Further Research

In this chapter, we proved some asymptotic characteristics of random instances of the MAP. This was accomplished using a probabilistic approach. An interesting direction of research is how the probabilistic approach can be used to improve the performance of existing solution algorithms. Chapter 5 applies the probabilistic approach to reduce the cardinality of the MAP which, in turn, is then solved by GRASP.
We show this process can result in better solutions in less time for the data association problem in the multisensor multitarget tracking problem.

CHAPTER 5
PROBABILISTIC APPROACH TO SOLVING THE MULTISENSOR MULTITARGET TRACKING PROBLEM

5.1 Introduction

The data association problem arising from multisensor multitarget tracking (MSMTT) can be formulated as an MAP. Although the MAP is considered a hard problem, a probabilistic approach to reducing problem cardinality may be used to accelerate the convergence rate. With the use of simulated MSMTT data sets, we show that the data association problem can be solved faster and with higher quality solutions due to these exploitations.

The MSMTT problem is a generalization of the single sensor single target tracking problem. In the MSMTT problem, noisy measurements are made from an arbitrary number of spatially diverse sensors (for example, cooperating remote agents) regarding an arbitrary number of targets, with the goal of estimating the state of all the targets present. See Figure 5-1 for a visual representation of the problem. Because of noise, measurements are imperfect. The problem is exacerbated with many close targets and noisy measurements. Furthermore, the number of targets may change as targets move into and out of detection range, and there are instances of false detections, as shown in Figure 5-2. The MSMTT solves a data association problem on the sensor measurements and estimates the current state of each target based on the data association problem for each sensor scan.
The combinatorial nature of the MSMTT problem results from the data association problem; that is, given d sensors with n target measurements each, how do we optimally partition the entire set of measurements so that each measurement is attributed to no more than one target and each sensor detects a target no more than once? The data association problem maximizes the likelihood that each measurement is assigned to the proper target.

Figure 5-1: Example of noisy sensor measurements of target locations. (The d sensors each have n measurements, not necessarily the same; x_{s,i} denotes measurement i from sensor s.)

Figure 5-2: Example of noisy sensor measurements of close targets. In this case there are false detections and missed targets.

In MSMTT, a scan is made at discrete, periodic moments in time. In practical instances, the data association problem should be solved in real time, particularly in the case of cooperating agents searching for and identifying targets. Combining data from more than one sensor with the goal of improving decision-making is termed sensor fusion. Solving even moderate-sized instances of the MAP has been a difficult task, since a linear increase in the number of dimensions (in this case, sensors) brings an exponential increase in the size of the problem. As such, several heuristic algorithms [74, 90] have been applied to this problem. However, due to the size and complexity of the problem, even the heuristics struggle to achieve solutions in real time. In this chapter we propose a systematic approach to reduce the size and complexity of the data association problem, yet achieve higher quality solutions in faster times. This chapter is organized as follows. We first give some background on data association for the MSMTT problem. We then introduce a technique that may be used to reduce the size of the problem.
Following that, we discuss the heuristic, the Greedy Randomized Adaptive Search Procedure (GRASP), and how GRASP can be modified to work effectively on a sparse problem. Finally, we provide some comparative results of these solution methods.

5.2 Data Association Formulated as an MAP

Data association is formulated as an MAP where the cost coefficients are derived from a computationally expensive negative log-likelihood function. The data association problem for the MSMTT problem is to match sensor measurements in such a way that no measurement is matched more than once and the overall matching is the most likely association of measurements to targets. In the MAP, elements from d disjoint sets are matched in such a way that the total cost associated with all matchings is minimized. It is an extension of the two-dimensional assignment problem, where there are only two disjoint sets. For sets of size n, the two-dimensional assignment problem has been demonstrated to be solvable in O(n^3) arithmetic operations using the Hungarian method [62], for example. However, the three-dimensional assignment problem is a generalization of the three-dimensional matching problem, which is shown by Garey and Johnson [44] to be NP-hard. A review of the multitarget multisensor problem formulation and algorithms is provided by Poore [89]. Bar-Shalom, Pattipati, and Yeddanapudi [11] also present a combined likelihood function in multisensor air traffic surveillance.

Suppose that we have S sensors observing an unknown number of targets T. The Cartesian coordinates of sensor s are known to be w_s = [x_s, y_s, z_s]', while the unknown position of target t is given by w_t = [x_t, y_t, z_t]'. Sensor s takes n_s measurements, z_{s,i_s}.
Since the measurements of target locations are noisy, we have the following expression for measurement i_s from sensor s:

z_{s,i_s} = h_s(w_t, w_s) + v_{s,i_s}   if measurement i_s is produced by target t,
z_{s,i_s} = v_{s,i_s}                   if measurement i_s is a false alarm.

The measurement noise, v_{s,i_s}, is assumed to be normally distributed with zero mean and covariance matrix R_s. The nonlinear transformation of measurements from the spherical to the Cartesian frame is given by h_s(w_t, w_s). Consider the S-tuple of measurements Z_{i_1,i_2,…,i_S}, each element of which is produced by a different sensor. Using dummy measurements z_{s,0} to make a complete assignment, the likelihood that each measurement originates from the same target t located at w_t is

Λ(Z_{i_1,i_2,…,i_S} | t) = ∏_{s=1}^{S} [P_{D_s} p(z_{s,i_s} | w_t)]^{δ_{s,i_s}} [1 − P_{D_s}]^{1−δ_{s,i_s}},   (5.1)

where

δ_{s,i_s} = 0 if i_s = 0 (dummy measurement),
δ_{s,i_s} = 1 if i_s > 0 (actual measurement),

and P_{D_s} ≤ 1 is the detection probability for sensor s. The likelihood that the set of measurements Z_{i_1,i_2,…,i_S} corresponds to a false alarm is as follows:

Λ(Z_{i_1,i_2,…,i_S} | t = 0) = ∏_{s=1}^{S} [P_{F_s}]^{δ_{s,i_s}},   (5.2)

where P_{F_s} > 0 is the probability of false alarm for sensor s. The cost of associating a set of measurements Z_{i_1,i_2,…,i_S} to a target t is given by the likelihood ratio

L_{i_1,i_2,…,i_S} = Λ(Z_{i_1,i_2,…,i_S} | t) / Λ(Z_{i_1,i_2,…,i_S} | t = 0).   (5.3)

This is the likelihood that Z_{i_1,i_2,…,i_S} corresponds to an actual target and not a false alarm. Multiplying a large set of small numbers leads to round-off errors as the product approaches zero. To avoid this problem, we apply the logarithm function on both sides. The cost of assigning a set of measurements Z_{i_1,i_2,…,i_S} to a target t is given by the negative logarithm of the likelihood ratio:

c_{i_1,i_2,…,i_S} = −ln [Λ(Z_{i_1,i_2,…,i_S} | t) / Λ(Z_{i_1,i_2,…,i_S} | t = 0)].   (5.4)

Instead of maximizing the likelihood function, we now try to minimize the negative log-likelihood ratio. A good association would, therefore, have a large negative cost. In practice, the actual location of target t is not known.
If it were, then obtaining measurements would be useless. We define an estimate of the target position as

ŵ_t = arg max_{w_t} Λ(Z_{i_1,i_2,…,i_S} | w_t).

The estimated target position maximizes the likelihood of a given set of measurements. The generalized likelihood ratio utilizes this estimated target position. Our negative log-likelihood ratio then takes the following form:

c_{i_1,i_2,…,i_S} = ∑_{s=1}^{S} δ_{s,i_s} [ −ln(P_{D_s}/P_{F_s}) + ½ ln((2π)^3 |R_s|) + ½ [z_{s,i_s} − h_s(ŵ_t, w_s)]' R_s^{−1} [z_{s,i_s} − h_s(ŵ_t, w_s)] ] − ∑_{s=1}^{S} (1 − δ_{s,i_s}) ln(1 − P_{D_s}).   (5.5)

We can do a type of gating¹ by simply dropping any association with c_{i_1,i_2,…,i_S} > 0. A feasible solution of the MSMTT problem assigns each measurement to no more than one S-tuple, or association, Z_{i_1,i_2,…,i_S}. In other words, each measurement may not be associated with more than one target. The result is a multidimensional assignment problem that chooses tuples of measurements minimizing the negative log-likelihood. This is formally given as a 0-1 integer program:

min ∑_{i_1,i_2,…,i_S} c_{i_1,i_2,…,i_S} ρ_{i_1,i_2,…,i_S}
s.t. ∑_{i_2,…,i_S} ρ_{i_1,i_2,…,i_S} ≤ 1,  ∀ i_1 = 1, 2, …, n_1,
     ∑_{i_1,…,i_{s−1},i_{s+1},…,i_S} ρ_{i_1,i_2,…,i_S} ≤ 1,  ∀ i_s = 1, 2, …, n_s; s = 2, 3, …, S − 1,
     ∑_{i_1,…,i_{S−1}} ρ_{i_1,i_2,…,i_S} ≤ 1,  ∀ i_S = 1, 2, …, n_S,
     ρ_{i_1,i_2,…,i_S} ∈ {0, 1},   (5.6)

where ρ_{i_1,i_2,…,i_S} = 1 if the tuple Z_{i_1,i_2,…,i_S} is assigned to the same target and 0 otherwise; Z_{i_1,i_2,…,i_S} = {z_{1,i_1}, z_{2,i_2}, …, z_{S,i_S}}; n_1 = min_s n_s, s = 1, 2, …, S; and z_{s,i_s} ∈ R³.

¹ Gating is a process of initially excluding some measurement-target assignments because of an arbitrarily large distance between the measurement and target.

The objective is to find n_1 measurement associations so that the sum of all the negative log-likelihood costs is minimized. Measurements are assigned to a maximum of one association, or S-tuple. We define the Boolean decision variable ρ_{i_1,i_2,…,i_S} to be zero when not all measurements {z_{1,i_1}, z_{2,i_2}, …, z_{S,i_S}} are assigned to the same target. The total number of possible partitions, W_M, of the ∑_{s=1}^{S} n_s measurements into T targets grows combinatorially in the n_s and S; the expression (5.7) for W_M distinguishes the cases n_S ≤ T and n_S > T, where n_S ≥ max{n_1, n_2, …, n_{S−1}}.
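The 0-1 program above can be solved exactly for tiny instances by enumerating all feasible assignments; with n elements per dimension and d dimensions there are (n!)^(d−1) of them. A minimal brute-force sketch (illustrative only, with a toy cost function; real data association instances are far too large for this):

```python
from itertools import permutations, product

def brute_force_map(cost, n, d):
    """Enumerate all (n!)^(d-1) feasible solutions of a complete n^d MAP.

    cost maps d-tuples (0-indexed) to real costs.
    Returns (best_cost, best_solution).
    """
    best_cost, best_sol = float("inf"), None
    # Fix the first index to 0..n-1; permute each remaining dimension.
    for perms in product(permutations(range(n)), repeat=d - 1):
        tuples = [tuple([i] + [p[i] for p in perms]) for i in range(n)]
        total = sum(cost[t] for t in tuples)
        if total < best_cost:
            best_cost, best_sol = total, tuples
    return best_cost, best_sol

# Toy 3x3x3 instance: cost grows with index disagreement, so the
# identity assignment {(0,0,0), (1,1,1), (2,2,2)} is optimal with cost 0.
n, d = 3, 3
cost = {t: abs(t[0] - t[1]) + abs(t[1] - t[2]) for t in product(range(n), repeat=d)}
print(brute_force_map(cost, n, d))   # (0, [(0, 0, 0), (1, 1, 1), (2, 2, 2)])
```

The loop visits (3!)^2 = 36 feasible solutions here; the count grows so quickly that heuristics such as GRASP are needed for realistic sizes.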
5.3 Minimum Subset of Cost Coefficients

Our objective is to preprocess the fully-dense data association problem by reducing the size of the problem to a smaller subset. We would expect advantages such as reduced storage requirements and less complexity for some algorithms. The development of a minimum subset of cost coefficients is based on the work in Chapter 4 (specifically Proposition 4.1), where we use the index tree representation of the MAP and randomly select α nodes from each level of the tree, where

α = ⌈ n^(d−1) / (n!)^((d−1)/n) ⌉.   (5.8)

When these α nodes from each level are combined into a set A, we can expect this set to contain at least one feasible solution to the MAP. For the generalized MAP with dimension d and n_i elements in each dimension i, i = 1, 2, …, d, with n_1 ≤ n_2 ≤ ⋯ ≤ n_d, we can easily extend equation (5.8) by noting that the number of feasible solutions is ∏_{i=2}^{d} n_i!/(n_i − n_1)!. Using this we find

α = ⌈ ∏_{i=2}^{d} n_i / ( ∏_{i=2}^{d} n_i!/(n_i − n_1)! )^{1/n_1} ⌉.   (5.9)

Consider an MAP where the cost coefficients of the index tree are sorted in nondecreasing order on each level of the tree. If the cost coefficients are independent identically distributed random variables, then the first α cost coefficients are from random locations at each level. Therefore, we may use Proposition 4.1 and conclude that we can expect at least one feasible solution in A. The cardinality of this set A is substantially smaller than that of the original MAP, which may result in faster solution times. Table 5-1 shows a comparison of the size of A to the size of three original problems. Since the reduced set is made up of the smallest cost coefficients, we expect good solution values.

Table 5-1: Comparison of the number of cost coefficients of the original MAP to that in A.

                    Number of Cost Coefficients
Problem          Original MAP        A
5x5x5                     125       20
10x10x10x10             10000      110
8x9x10x10x10            72000       72

Now consider an MAP where cost coefficients are not independent and identically distributed. In real-world applications, cost coefficients will most likely be dependent.
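The sizes in Table 5-1 follow directly from (5.8) and (5.9). A short computation (a sketch; it assumes the index tree has n_1 levels, so the cardinality of A is n_1 · α):

```python
from math import ceil, factorial, prod

def alpha_general(dims):
    """Nodes to select per level of the sorted index tree, eq. (5.9).

    dims: sorted list [n_1, ..., n_d]. Eq. (5.8) is the equal-n_i special case.
    """
    n1, rest = dims[0], dims[1:]
    level_size = prod(rest)                                   # nodes per level
    n_feasible = prod(factorial(m) // factorial(m - n1) for m in rest)
    return ceil(level_size / n_feasible ** (1.0 / n1))

for dims in ([5, 5, 5], [10, 10, 10, 10], [8, 9, 10, 10, 10]):
    total = prod(dims)
    size_A = dims[0] * alpha_general(dims)
    print(dims, total, size_A)   # matches Table 5-1: 125/20, 10000/110, 72000/72
```

Running this reproduces the three rows of Table 5-1 exactly, which is a useful sanity check on (5.8) and (5.9).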
Consider, for example, a multisensor multitarget tracking situation where a small target is tracked among other, larger targets. We can expect a higher noise-to-signal ratio for the smaller target. Thus, cost coefficients associated with measurements of the smaller target in the data association will be correlated with each other. In the case of dependent cost coefficients, Proposition 4.1 cannot be directly applied, because the α smallest cost coefficients will not be randomly distributed across each level of the index tree. However, using Proposition 4.1 as a starting point, consider selecting some multiple, r > 1, of the α cost coefficients from each level of a sorted index tree. For example, select the first rα cost coefficients from each of the sorted levels of the index tree to form a smaller index tree A. As r is increased, the cardinality of A obviously increases, but so does the opportunity that a feasible solution exists in A. The best value of r depends upon the particular MAP instance, but we can empirically determine a suitable estimate. In this chapter, we use a consistent value of r = 10 wherever the probabilistic approach is used.

5.4 GRASP for a Sparse MAP

A greedy randomized adaptive search procedure (GRASP) [36, 37, 38, 4] is a multistart or iterative process in which each GRASP iteration consists of two phases. In a construction phase, a random adaptive rule is used to build a feasible solution one component at a time. In a local search phase, a local optimum in the neighborhood of the constructed solution is sought. The best overall solution is kept as the result.

5.4.1 GRASP Complexity

It is easy to see that GRASP can benefit in terms of solution times for the MAP by reducing the size of the problem. This can be seen by noting that there are N cost coefficients in the complete MAP, where N = ∏_{i=1}^{d} n_i. As the complexity of the construction phase can be shown to be O(N) [4], a smaller N will directly reduce the time it takes for each construction phase.
While it is easy to see that reducing the problem size to something less than N helps in the construction phase, it remains to be seen how the local search phase is affected. The local search phase of GRASP for the MAP often relies on the 2-exchange neighborhood [74, 4]. A thorough examination of other neighborhoods for the MAP is provided in the work by Pasiliao [84]. The local search procedure is as follows. Starting from a current feasible solution, examine one neighbor at a time. If a lower cost is found, adopt the neighbor as the current solution and start the local search procedure again. Continue the process until no better solution is found. The size of the 2-exchange neighborhood is d · n(n − 1)/2. As the size of the neighborhood is not directly dependent upon N, there appears, at first, to be no advantage or disadvantage to reducing the number of cost coefficients in the problem. However, an obstacle surfaces in the local search procedure because, although the construction phase produces a feasible solution, we have no guarantee that a neighbor of this solution even exists in the sparse problem. A feasible solution consists of n_1 cost coefficients. A neighbor in the 2-exchange neighborhood has the same n_1 cost coefficients except for two. In a sparse MAP, most cost coefficients are entirely removed from the problem. Therefore, the local search phase first generates a potential neighbor and then must determine whether that neighbor exists. In a complete MAP, the procedure may access the cost matrix directly; however, the sparse problem cannot be accessed directly in the same way. A naive approach is to scan all cost coefficients in the sparse problem to find the two new cost coefficients or to determine that one does not exist. This is an expensive procedure. We propose a data structure which provides a convenient, inexpensive way of evaluating existing cost coefficients or determining that they do not exist.
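The local search phase just described can be sketched as follows; note the existence check that a sparse instance forces on every candidate neighbor (illustrative code, with the sparse costs held in a plain dictionary rather than the search tree proposed next):

```python
def local_search_2exchange(solution, cost):
    """First-improvement 2-exchange local search on a (possibly sparse) MAP.

    solution: list of n index tuples, one per assigned target.
    cost: dict mapping index tuples to costs; a missing key means the
    coefficient was removed, so that candidate neighbor is skipped.
    """
    n, d = len(solution), len(solution[0])
    improved = True
    while improved:
        improved = False
        for dim in range(d):
            for i in range(n):
                for j in range(i + 1, n):
                    a, b = list(solution[i]), list(solution[j])
                    a[dim], b[dim] = b[dim], a[dim]
                    ta, tb = tuple(a), tuple(b)
                    if ta not in cost or tb not in cost:
                        continue   # neighbor does not exist in the sparse problem
                    delta = (cost[ta] + cost[tb]
                             - cost[solution[i]] - cost[solution[j]])
                    if delta < 0:
                        solution[i], solution[j] = ta, tb
                        improved = True
    return solution, sum(cost[t] for t in solution)

# Complete 3x3x3 instance with optimum 0 at {(0,0,0), (1,1,1), (2,2,2)}.
cost = {(i, j, k): abs(i - j) + abs(j - k)
        for i in range(3) for j in range(3) for k in range(3)}
print(local_search_2exchange([(0, 0, 0), (1, 2, 2), (2, 1, 1)], cost))
```

On the complete instance the search reaches the optimum (cost 0); if the same starting solution is run against a dictionary containing only its own three coefficients, every candidate neighbor fails the existence check and the start is returned unchanged.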
5.4.2 Search Tree Data Structure

We propose to use a search tree data structure to find a particular cost coefficient or determine that one does not exist in the sparse problem. The search tree has d + 1 levels. The tree is constructed such that there are n_i branches extending from each node at level i, i = 1, 2, …, d. The bottom level, i = d + 1 (the leaves of the tree), contains each of the cost coefficients (if they exist). The maximum number of nodes in the tree, including the leaves, is equal to 1 + ∑_{i=1}^{d} ∏_{j=1}^{i} n_j, and therefore the time to construct the tree is O(N). An example of this search tree is given in Figure 5-3 for a complete 3x3x3 MAP. When searching for a particular cost coefficient, start at level i = 1 and traverse down branch y, where y is the element of the ith dimension of the cost coefficient. Continue this process until either level i = d + 1 is reached, in which case the cost coefficient exists, or a null pointer is reached, in which case we may conclude the cost coefficient does not exist. It is obvious that the search time is O(d).

Figure 5-3: Search tree data structure used to find a cost coefficient or determine that a cost coefficient does not exist. (Level i branches on dimension i; the 27 leaves are the cost coefficients 000 through 222.)

A search tree built from sparse data is shown in Figure 5-4. As an example of searching for cost coefficient (001), start at level 1 and traverse down the branch labelled "0" to the node at level 2. From level 2, traverse again down the branch labelled "0" to the node at level 3. From level 3, traverse down the branch labelled "1" to the cost coefficient. Another example is searching for cost coefficient (222). Start at level 1 and traverse down the branch labelled "2" to the node at level 2. From level 2, traverse again down the branch labelled "2" to find that it is a null pointer.
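A minimal sketch of such a search tree, using nested dictionaries (one level per dimension, with the absence of a key playing the role of the null pointer):

```python
def build_search_tree(sparse_costs):
    """Build a (d+1)-level search tree from {index tuple: cost}. O(N) build."""
    root = {}
    for index, c in sparse_costs.items():
        node = root
        for element in index[:-1]:
            node = node.setdefault(element, {})   # one branch per dimension element
        node[index[-1]] = c                        # leaf holds the cost coefficient
    return root

def lookup(tree, index):
    """Return the cost of a coefficient, or None if it does not exist. O(d)."""
    node = tree
    for element in index:
        if not isinstance(node, dict) or element not in node:
            return None                            # null pointer: coefficient absent
        node = node[element]
    return node

tree = build_search_tree({(0, 0, 1): 2.5, (1, 2, 0): -0.7})
print(lookup(tree, (0, 0, 1)))   # 2.5
print(lookup(tree, (2, 2, 2)))   # None
```

Each query walks at most d dictionary levels, matching the O(d) search time claimed above, while only the stored coefficients consume memory.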
The null pointer indicates that the cost coefficient does not exist in the sparse MAP.

Figure 5-4: Search tree example of a sparse MAP.

The GRASP algorithm can benefit from this search tree data structure if the problem is sparse. In a dense problem, it would be best to put the cost coefficients in a matrix which can be directly accessed; this would benefit the local search phase. However, in the sparse problem, completely eliminating cost coefficients reduces storage and benefits the construction phase. It remains a matter of experimentation and closer examination to find the level of sparseness at which the search tree data structure becomes more beneficial.

5.4.3 GRASP vs. Sparse GRASP

To compare the performance of GRASP in solving a fully dense problem against its performance in solving a sparse problem, we used simulated data from a multisensor multitarget tracking problem [74]. The problems ranged in size from five to seven sensors. Those with five sensors had five to nine targets. Problems with six and seven sensors had just five targets. Two problems of each size were tested. The problem size is indicated by the problem title. For example, "s5t6rm1" means problem one with five sensors and six targets. The experiment conducted five runs of each solution algorithm and reports the average time-to-solution, the average solution value, and the best solution value from the five runs. The solution times can be easily adjusted for each algorithm by simply adjusting the number of iterations. An obvious consequence is that as the number of iterations goes down, the solution quality generally gets worse. To create sparse instances of each problem, the probabilistic approach described above in Section 5.3 was used with r = 10. Table 5-2 shows the results of the experiment. Except for problems s5t8rm1 and s5t8rm2, reducing the problem size increased solution quality with less time-to-solution.
Table 5-2: Experimental results comparing solution quality and time-to-solution for GRASP in solving fully dense and reduced simulated MSMTT problems. Five runs of each algorithm were conducted against each problem.

                        Ordinary GRASP                    Sparse GRASP
Problem   Opt Sol   Ave Sol  Best Sol  Ave Time (sec)   Ave Sol  Best Sol  Ave Time (sec)
s5t5rm1     50       49.2      50        0.026            50       50        0.022
s5t5rm2     44       38        41        0.024            43.8     44        0.024
s5t6rm1     57       54        51.4      0.044            49.4     52        0.044
s5t6rm2     45       38.6      41        0.0462           45       45        0.04
s5t7rm1     63       52.6      59        0.0902           61.2     62        0.0962
s5t7rm2     66       59.2      62        0.0862           61.8     62        0.0822
s5t8rm1     74       64.8      67        0.1322           71.2     72        0.1262
s5t8rm2     33       20.6      32        0.1402           17       25        0.1542
s5t9rm1     84       74.6      78        1.7044           74.4     77        1.8326
s5t9rm2     65       59        61        1.6664           60.6     63        1.5702
s6t5rm1     48       44.4      48        0.9676           48       48        0.9194
s6t5rm2     45       42        42        0.9754           45       45        0.8392
s7t5rm1     51       41.6      44        1.378            50.4     51        1.0556
s7t5rm2     52       44.8      47        1.4804           52       52        1.0916

5.5 Conclusion

In this chapter, we implemented techniques to reduce the size of the data association problem that is linked to the MSMTT problem. Empirical results indicate that probabilistically reducing the cardinality generally increases the solution quality and decreases the time-to-solution for heuristics such as GRASP. We suggest that further research is needed to study this approach on problems that are sparse in the first place, a common occurrence in real-world problems. Additionally, we believe the probabilistic approach to reducing MAP size could be extended to other solution algorithms such as simulated annealing.

CHAPTER 6
EXPECTED NUMBER OF LOCAL MINIMA FOR THE MAP

As discussed in previous chapters, the MAP is an NP-hard combinatorial optimization problem occurring in many applications, such as data association. As many solution approaches to this problem rely, at least partly, on local neighborhood searches, it is widely assumed that the number of local minima has implications for solution difficulty.
In this chapter, we investigate the expected number of local minima for random instances of this problem. Both 2-exchange and 3-exchange neighborhoods are considered. We report on experimental findings that the expected number of local minima does impact the effectiveness of three different solution algorithms that rely on local neighborhood searches.

6.1 Introduction

In this chapter we develop relationships for the expected number of local minima. The 2-exchange local neighborhood appears to be the most commonly used neighborhood in metaheuristics, such as GRASP, that are applied to the MAP, as evidenced in several different works [4, 74, 27]. Although the 2-exchange is most common in the literature, we include in this chapter some analysis of the 3-exchange neighborhood for comparison purposes. The motivation of this chapter is that the number of distinct local minima of an MAP may have implications for heuristics that rely, at least partly, on repeated local searches in neighborhoods of feasible solutions [112]. In general, if the number of local minima is small, then we may expect better performance from metaheuristic algorithms that rely on local neighborhood searches. A solution landscape is considered to be rugged if the number of local minima is exponential with respect to the size of the problem [78]. Evidence by Angel and Zissimopoulos [9] showed that ruggedness of the solution landscape has a direct impact on the effectiveness of the simulated annealing heuristic in solving at least one other hard problem, the quadratic assignment problem.

The concept of solution landscapes was first introduced by Wright [111] as a non-mathematical way to describe the action during evolution of selection and variation [102]. The idea is to imagine the space in which evolution occurs as a landscape. In one dimension there is the genotype, and in another dimension there is a height, or fitness. Evolution can be viewed as the movement of the population, represented as a set of points (genotypes), towards higher (fitter) areas of the landscape. In an analogous way, a solution process for a combinatorial problem can be viewed as the movement from some feasible solution with its associated cost (fitness) towards better-cost (fitter) areas within the solution landscape. As pointed out by Smith et al. [102], the difficulty of searching in a given problem is related to the structure of the landscape; however, the exact relationship between different landscape features and the time taken to find good solutions is not clear. Two landscape features of interest are the number of local optima and the basins of attraction. Reidys and Stadler [93] describe some characteristics of landscapes and note that local optima play an important role, since they might be obstacles on the way to the optimal solution. From a minimization perspective, if x is a feasible solution of some optimization problem and f(x) is the solution cost, then x is a local minimum iff f(x) ≤ f(y) for all y in the neighborhood of x. Obviously, the size of the neighborhood depends upon the definition of the neighborhood. According to Reidys and Stadler [93], there is no simple way of computing the number of local minima without exhaustively generating the solution landscape. However, the number can be estimated, as done in some recent works [43, 45]. Rummukainen [98] describes some aspects of landscape theory which have been used to prove convergence of simulated annealing. Of particular interest are some results on the behavior of local optimization on a few different random landscape classes. For example, the expected number of local minima is given for the N-k landscape. Associated with a local minimum is a basin B(x), defined by means of a steepest descent algorithm [93]. Let f(x_i) be the cost of some feasible solution x_i. Starting from x_0, record the solution cost f(y) for all neighbors y of x_i.
Let x_{i+1} = y for the neighbor y where f(y) is the smallest over all neighbors and f(y) < f(x_i). Stop when x_i is a local minimum. It becomes apparent, however, that a basin may have more than one local minimum, because the definition of a local minimum is not strict. Basin sizes become important for simple metaheuristics. For example, consider selecting a set of feasible solutions that are uniformly distributed in the solution landscape and performing a steepest descent from each. A question is: what is the probability of starting in the basin containing the global minimum? This question is partially addressed by Garnier and Kallel [45]. Long and Williams [68] mention that problems are generally easier to solve when the number of local optima is small, but the difficulty can increase significantly when the number of local optima is large. The authors consider the quadratic 0-1 problem where instances are randomly generated over integers symmetric about 0. For such problems, the authors show that the expected number of local maxima increases exponentially with respect to n, the size of the problem. They also reconcile this result with Pardalos and Jha [80], who showed that when test data are generated from a normal distribution, the expected number of local maxima approaches 1 as n gets large. Angel and Zissimopoulos [9] introduce a ruggedness coefficient which measures the ruggedness of the QAP solution landscape. They conclude that because the QAP landscape is rather flat, this gives theoretical justification for the effectiveness of local search algorithms. The ruggedness coefficient is an extension of the autocorrelation coefficient introduced by Weinberger [110]. The larger the autocorrelation coefficient, the flatter the landscape and so, as postulated by the authors, the more suited the problem is for any local-search-based heuristic.
Angel and Zissimopoulos [9] calculate the autocorrelation coefficient for the QAP as being no smaller than n/4 and no larger than n/2, which is considered relatively large. They develop the parameter, the ruggedness coefficient ζ, which is independent of problem size and lies between 0 and 100. A value close to 100 means the landscape is very steep. They go on to show experimentally that increasing ζ for the same problem size results in higher relative solution error and a higher number of steps when using a simulated annealing algorithm by Johnson et al. [53]. The conclusion of Angel and Zissimopoulos [9] is that a relatively low ruggedness coefficient for the QAP gives theoretical justification for the effectiveness of local-search-based heuristics for the QAP.

This chapter will further investigate the assumption that the number of local minima impacts the effectiveness of algorithms such as simulated annealing in solving the MAP. The next section provides some additional background on the 2-exchange and 3-exchange local search neighborhoods. Then, in Section 6.3, we provide experimental results on the average number of local minima for variously sized problems with assignment costs independently drawn from different distributions. Section 6.4 derives the expected number of local minima for MAPs of size n = 2 and d ≥ 3, where the cost elements are independent identically distributed random variables from any probability distribution. Section 6.5 describes lower and upper bounds on the expected number of local minima for all sizes of MAPs where assignment costs are independent standard normal random variables. Then, in Section 6.6, we investigate whether the expected number of local minima impacts the performance of various algorithms that rely on local searches. Some concluding remarks are given in the last section.

6.2 Some Characteristics of Local Neighborhoods

A first step is to establish the definition of a neighborhood of a feasible solution.
Let N_p(k) be the p-exchange neighborhood of the kth feasible solution, k = 1, …, N, where N is the number of feasible solutions to the MAP. The p-exchange neighborhood consists of all p-or-fewer element exchange permutations in each dimension of the feasible solution. The neighborhood is developed from the work by Lin and Kernighan [66]. If z_k is the solution cost of the kth solution, then z_k is a discrete local minimum iff z_k ≤ z_j for all j ∈ N_p(k). As an example of a 2-exchange neighbor, consider the following feasible solution to an MAP with d = 3, n = 3: {111, 222, 333}. A neighbor is {111, 322, 233}. The solution {111, 222, 333} is a local minimum if its solution cost is less than or equal to all of its neighbors' solution costs.

The number of neighbors, J, of a feasible solution in the 2-exchange neighborhood of an MAP with dimension d and n elements in each dimension is as follows:

J = |N_2(k)| = d · n(n − 1)/2.   (6.1)

It is obvious that for a fixed n, J is linear in d. On the other hand, for a fixed d, J is quadratic in n. If we define a flat MAP as one with relatively small n, and a deep MAP as one with relatively large n, then we expect larger neighborhoods in deep problems. Similarly, for n > 2 the size of the 3-exchange neighborhood is as follows:

J = |N_3(k)| = d · [n(n − 1)/2 + 2 · n(n − 1)(n − 2)/6].   (6.2)

As for the 2-exchange, it is clear that J is linear with respect to d and cubic with respect to n. The minimum number of local minima for any instance is one: the global minimum. At the other extreme, the maximum number of local minima is (n!)^(d−1), which is the number of feasible solutions of an MAP. This occurs if all cost coefficients are equal.

6.3 Experimentally Determined Number of Local Minima

Studies were made of randomly produced instances of MAPs to empirically determine E[M]. The assignment costs c_{i_1,…,i_d} for each problem instance were drawn from one of three distributions. The first distribution of assignment costs used is the uniform, U[0, 1].
The next distribution used is the exponential with mean one, values being determined by c_{i_1,…,i_d} = −ln U. Finally, the third distribution used was the standard normal, N(0, 1), with values determined by the polar method [63]. Table 6-1 shows the average number of local minima for randomly generated instances of the MAP when considering a 2-exchange neighborhood. For small problems, the study was conducted by generating an instance of an MAP and counting the number of local minima through complete enumeration of the feasible solutions. The values in the tables are the average number of local minima over 100 problem instances. For larger problems (indicated by * in the table), the average number of local minima was found by examining a large number¹ of generated problem instances. For each instance of a problem, we randomly selected a feasible solution and determined whether it was a local minimum. This technique gives an estimate of the probability that any feasible solution is a local minimum. This relationship was then used to estimate the average number of local minima by multiplying the probability by the number of feasible solutions. This technique showed results consistent with the complete enumeration method mentioned above for small problems. Regardless of the distribution from which cost coefficients were drawn, standard deviations of 40 percent and 5 percent were observed for problems of sizes d = 3, n = 3 and d = 5, n = 5, respectively.

¹ The number examined depends on problem size. The number ranged from 10^6 to 10^7.

It is clear from the tables that E[M] is affected by the distribution from which assignment costs are drawn. For example, problems generated from the exponential distribution have more local minima than problems generated from the normal distribution. Table 6-2 shows similar results for the 3-exchange neighborhood when cost coefficients are i.i.d. standard normal.
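The sampling technique just described can be sketched as follows (an illustration, assuming i.i.d. uniform costs; the neighbor generation also reflects the 2-exchange count J = d·n(n − 1)/2 of Section 6.2):

```python
import math
import random

def neighbors_2exchange(solution):
    """All d*n*(n-1)/2 two-exchange neighbors of a feasible MAP solution."""
    n, d = len(solution), len(solution[0])
    out = []
    for dim in range(d):
        for i in range(n):
            for j in range(i + 1, n):
                new = [list(t) for t in solution]
                new[i][dim], new[j][dim] = new[j][dim], new[i][dim]
                out.append([tuple(t) for t in new])
    return out

def estimate_local_minima(n, d, trials=20000, seed=3):
    """Estimate E[M] = P(random solution is a local min) * (n!)^(d-1),
    for i.i.d. Uniform[0,1] assignment costs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cost = {}
        def c(sol):  # lazily draw each cost coefficient on first use
            return sum(cost.setdefault(t, rng.random()) for t in sol)
        perms = [rng.sample(range(n), n) for _ in range(d - 1)]
        sol = [tuple([i] + [p[i] for p in perms]) for i in range(n)]
        if all(c(sol) <= c(nb) for nb in neighbors_2exchange(sol)):
            hits += 1
    return hits / trials * math.factorial(n) ** (d - 1)

sol = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(len(neighbors_2exchange(sol)))   # 9 = d*n*(n-1)/2 for d = 3, n = 3
print(estimate_local_minima(3, 3))     # close to the Table 6-1 value of 2
```

Drawing each cost coefficient lazily keeps the per-trial work proportional to the number of coefficients actually touched, which is what makes the technique usable for the larger, starred table entries.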
We note that, as expected, E[M] is smaller for the 3-exchange neighborhood than for the 2-exchange neighborhood on problems of the same size.

Table 6-1: Average number of local minima (2-exchange neighborhood) for different sizes of MAPs with independent assignment costs.

Number of Local Minima, Uniform on [0,1]
n\d      3          4          5          6
2        1          1.68       2.66       4.56
3        2          7.69       33.5       159
4        5.60       77.8       1230       2.1E+4
5        21.0       1355       9.58E+4*   7.60E+6*
6        116        3.62E+4*   1.30E+7*   6.56E+9*

Number of Local Minima, Exponential, Mean One
n\d      3          4          5          6
2        1          1.54       2.66       4.71
3        2.06       7.84       35.8       165
4        5.47       80.6       1290       2.18E+4
5        22.7       1400       1.01E+5*   7.67E+6*
6        122        3.91E+4*   1.53E+7*   (. ; in*

Number of Local Minima, Standard Normal Costs
n\d      3          4          5          6
2        1          1.56       2.58       4.66
3        1.82       7.23       30.3       141
4        4.54       62.2       949        1.58E+4
5        16.3       939        6.5E+4*    4.75E+6*
6        75.6       2.36E+4*   7.90E+6*   3.48E+9*

Table 6-2: Average number of local minima (3-exchange neighborhood) for different sizes of MAPs with i.i.d. standard normal assignment costs.

n\d      3          4          5          6
3        1.55       5.98       26.0       124
4        3.27       43.1       670        1.11E+4
5        8.27       516        3.48E+4*   2.65E+6*
6        28.8       8710       3.06E+6*   1.22E+9*

Table 6-3 shows the average proportion of feasible solutions that are local minima for both the 2-exchange and 3-exchange neighborhoods where costs are i.i.d. standard normal random variables. The table is followed by Figure 6-1, which includes plots of the proportion of local minima to the number of feasible solutions. We observe that for either fixed dimension and increasing number of elements, or vice versa, the proportion of local minima approaches zero.

Table 6-3: Proportion of local minima to total number of feasible solutions for different sizes of MAPs with i.i.d. standard normal costs.
Proportion of local minima to feasible solutions, standard normal costs, 2-exchange
d\n      2          3          4          5          6
3        2.50E-1    4.88E-2    8.02E-3    1.13E-3    1.50E-4
4        2.00E-1    3.27E-2    4.50E-3    5.43E-4    6.32E-5
5        1.67E-1    2.37E-2    2.87E-3    3.14E-4    2.94E-5
6        1.43E-1    2.11E-2    2.81E-3    1.91E-4    1.80E-5

Proportion of local minima to feasible solutions, standard normal costs, 3-exchange
d\n      3          4          5          6
3        4.23E-2    5.60E-3    6.40E-4    5.94E-5
4        2.87E-2    3.13E-3    2.99E-4    2.33E-5
5        2.05E-2    2.03E-3    1.68E-4    1.14E-5
6        1.58E-2    1.40E-3    1.06E-4    6.31E-6

Figure 6-1: Proportion of feasible solutions that are local minima when considering the 2-exchange neighborhood and where costs are i.i.d. standard normal. [Two plots of the ratio of local minima to feasible solutions: one versus the number of elements n at fixed dimension d = 6, and one versus the dimension d at fixed number of elements n = 6.]

6.4 Expected Number of Local Minima for n = 2

In the special case of an MAP where n = 2, d >= 3, and the cost elements are independent identically distributed random variables from some continuous distribution with c.d.f. F(.), one can establish a closed-form expression for the expected number of local minima. To show this, we recall that the distribution F_{X+Y} of the sum of two independent random variables X and Y is determined by the convolution of the respective distribution functions, F_{X+Y} = F_X * F_Y. We now borrow from Proposition 3.1 to construct the following proposition.

Proposition 6.1 In an instance of the MAP with n = 2 and with cost coefficients that are i.i.d. random variables with continuous distribution F, the costs of all feasible solutions are independent random variables with distribution F * F.

Proof: Let I be an instance of the MAP with n = 2. Each feasible solution for I is an assignment a_1 = c_{1,delta_1(1),...,delta_d(1)}, a_2 = c_{2,delta_1(2),...,delta_d(2)}, with cost z = a_1 + a_2.
The important feature of such assignments is that once the entry c_{1,delta_1(1),...,delta_d(1)} is fixed, there is just one remaining possibility, namely c_{2,delta_1(2),...,delta_d(2)}, since each dimension has only two elements. This implies that different assignments cannot share elements of the cost array, and therefore different assignments have independent costs z. Now, a_1 and a_2 are independent variables from F. Thus z = a_1 + a_2 is a random variable with distribution F * F.

One other proposition is important to this development.

Proposition 6.2 Given j i.i.d. random variables with continuous distributions, the probability that the rth, r = 1,...,j, variable is the minimum value is 1/j.

Proof: Consider j i.i.d. random variables X_i, i = 1,...,j, with c.d.f. F(.) and p.d.f. f(.). Let X_{(j-1)} be the minimum of the j-1 variables other than X_r, X_{(j-1)} = min{X_i | i = 1,...,j, i != r}, whose c.d.f. and p.d.f. are computed trivially as

F_{(j-1)}(x) = 1 - (1 - F(x))^{j-1},   f_{(j-1)}(x) = (j-1)(1 - F(x))^{j-2} f(x).

Then the probability that the rth random variable is minimal among j i.i.d. continuous variables is

P[rth r.v. is minimal] = P[X_r < X_{(j-1)}] = P[Y < 0] = F_Y(0).   (6.3)

Here F_Y(.) is the c.d.f. of the random variable Y = X_r - X_{(j-1)} and, by the convolution rule, it equals

F_Y(x) = int_{-inf}^{+inf} F(x + y)(j-1)(1 - F(y))^{j-2} f(y) dy.

Hence, the probability (6.3) can immediately be calculated as

F_Y(0) = int_{-inf}^{+inf} F(y)(j-1)(1 - F(y))^{j-2} f(y) dy = (1/j) int_{-inf}^{+inf} j(1 - F(y))^{j-1} f(y) dy = 1/j,

where the middle equality follows by integration by parts, and the last equality holds since j(1 - F(y))^{j-1} f(y) is the p.d.f. of X_{(j)} = min{X_1,...,X_j}, so the last integral equals one. This yields the statement of the proposition.

The obvious consequence of Proposition 6.2 is that, given a sequence of independent random variables from a continuous distribution, the position of the minimum value is uniformly distributed within the sequence, regardless of the parent distribution. We are now ready to prove the main result of this section.

Theorem 6.3 In an MAP with cost coefficients that are i.i.d.
continuous random variables, where n = 2 and d >= 3, the expected number of local minima is given by

E[M] = 2^{d-1} / (d + 1).   (6.4)

Proof: Let N be the number of feasible solutions to an n = 2 MAP, N = 2^{d-1}. Introducing the indicator variables

Y_k = 1, if the kth solution, k = 1,...,N, is a local minimum; 0, otherwise,   (6.5)

one can write M as the sum of indicator variables:

M = sum_{k=1}^{N} Y_k.

From the elementary properties of expectation it follows that

E[M] = sum_{k=1}^{N} E[Y_k] = sum_{k=1}^{N} P[Y_k = 1],   (6.6)

where P[Y_k = 1] is the probability that the cost of the kth feasible solution does not exceed the cost of any of its neighbors. Any feasible solution has J = d * (2 choose 2) = d neighbors whose costs, by virtue of Proposition 6.1, are i.i.d. continuous random variables. Then Proposition 6.2 implies that the probability of the cost of the kth feasible solution being minimal among it and its neighbors is

P[Y_k = 1] = 1 / (d + 1),

which, upon substitution into (6.6), yields the statement of the theorem (6.4).

Remark 6.3.1 Equality (6.4) implies that the number of local minima in an n = 2, d >= 3 MAP is exponential in d when the cost coefficients are independently drawn from any continuous distribution.

Corollary 6.4 Relation (6.4) can be used to derive the expected ratio of the number of local minima M to the total number of feasible solutions N in an n = 2, d >= 3 MAP:

E[M/N] = E[M] / N = 1 / (d + 1).

This shows that the number of local minima in an n = 2 MAP becomes vanishingly small compared to the number of feasible solutions as the dimension d of the problem increases. This asymptotic behavior is reflected in the numerical data above and may be useful for the development of novel solution methods.

6.5 Expected Number of Local Minima for n >= 3

Our ability to derive the closed-form expression (6.4) for the expected number of local minima E[M] in the previous section relied on the independence of the costs of feasible solutions in an n = 2 MAP.
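Theorem 6.3 is easy to check numerically for small d: with n = 2 the 2^{d-1} feasible solutions can be enumerated outright (fix the first dimension; every other dimension is either the identity or the swap), and the local-minimum count averaged over random instances. A minimal sketch, with the solution representation and all names our own:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

def count_local_minima_n2(c, d):
    # Enumerate all 2^(d-1) feasible solutions of an n = 2 MAP and count
    # those whose cost does not exceed the cost of any of their d
    # 2-exchange neighbors (one swap per dimension).
    sols = [np.array([[0, 1]] + [list(p) for p in perms])
            for perms in product([(0, 1), (1, 0)], repeat=d - 1)]
    def cost(sol):
        return c[tuple(sol[:, 0])] + c[tuple(sol[:, 1])]
    def neighbors(sol):
        for q in range(d):
            nb = sol.copy()
            nb[q, 0], nb[q, 1] = nb[q, 1], nb[q, 0]
            yield nb
    return sum(all(cost(s) <= cost(nb) for nb in neighbors(s))
               for s in sols)

d, runs = 4, 2000
avg = np.mean([count_local_minima_n2(rng.standard_normal((2,) * d), d)
               for _ in range(runs)])
print(avg, 2 ** (d - 1) / (d + 1))   # empirical mean vs. 8/5 = 1.6
```

The empirical mean settles on 2^{d-1}/(d+1) for any continuous cost distribution, as the theorem predicts; swapping `standard_normal` for uniform or exponential draws leaves the result unchanged.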
As is easy to verify directly, in the case n >= 3 the costs of feasible solutions are generally not independent. This significantly complicates the analysis if an arbitrary continuous distribution for the assignment costs c_{i_1...i_d} is assumed. However, as we show below, one can derive upper and lower bounds for E[M] in the case when the cost coefficients of (2.1) are normally distributed random variables. First, we introduce a proposition, which follows a similar development by Beck [16].

Proposition 6.5 Consider an n >= 3, d >= 3 MAP whose costs are i.i.d. continuous random variables. Then the expected number of local minima can be represented as

E[M] = sum_{k=1}^{N} P[ intersection_{j in N_2(k)} {z_k <= z_j} ],   (6.7)

where N_2(k) is the 2-exchange neighborhood of the kth feasible solution, and z_i is the cost of the ith feasible solution.

Proof: As before, M can be written as the sum of the indicator variables Y_k (6.5), which consequently leads to

E[M] = sum_{k=1}^{N} E[Y_k] = sum_{k=1}^{N} P[Y_k = 1].   (6.8)

As Y_k = 1 means z_k <= z_j for all j in N_2(k), it is obvious that

P[Y_k = 1] = P[z_k - z_j <= 0 for all j in N_2(k)],

which proves the proposition.

If we take the n^d cost coefficients c_{i_1...i_d} ~ N(0, 1) of the MAP to be independent standard normal random variables, then for any two feasible solutions the difference of their costs Z_{ij} = z_i - z_j is a normal variable with mean zero. Without loss of generality, consider the k = 1 feasible solution to (2.1), whose cost is

z_1 = c_{1...1} + c_{2...2} + ... + c_{n...n}.   (6.9)

In the 2-exchange neighborhood N_2(1), the cost of a feasible solution differs from (6.9) by two cost coefficients, e.g., z_2 = c_{21...1} + c_{12...2} + c_{3...3} + ... + c_{n...n}. Generally, the difference z_1 - z_l between the cost (6.9) and that of any neighbor l in N_2(1) has the form

Z_{rsq} = c_{r...r} + c_{s...s} - c_{r..rsr..r} - c_{s..srs..s},   (6.10)

where the last two coefficients have "switched" indices in the qth position, q = 1,...,d. Observing that Z_{rsq} = Z_{srq} for r, s = 1,...,n and q = 1,...,d, consider the J-dimensional random vector

Z = (Z_{121},...,Z_{12d}, ..., Z_{rs1},...,Z_{rsd}, ..., Z_{(n-1)n1},...,Z_{(n-1)nd}),   r < s.   (6.11)

Vector Z has the normal distribution N(0, Sigma), with covariance matrix Sigma defined as

Cov(Z_{rsq}, Z_{ijk}) = 4, if i = r, j = s, q = k;
                        2, if i = r, j = s, q != k;
                        1, if (i = r, j != s) or (i != r, j = s);
                        0, if i != r, j != s.   (6.12)

For example, in the case n = 3, d = 3 the covariance matrix Sigma has the form

        [ 4 2 2  1 1 1  1 1 1 ]
        [ 2 4 2  1 1 1  1 1 1 ]
        [ 2 2 4  1 1 1  1 1 1 ]
        [ 1 1 1  4 2 2  1 1 1 ]
Sigma = [ 1 1 1  2 4 2  1 1 1 ]
        [ 1 1 1  2 2 4  1 1 1 ]
        [ 1 1 1  1 1 1  4 2 2 ]
        [ 1 1 1  1 1 1  2 4 2 ]
        [ 1 1 1  1 1 1  2 2 4 ]

Now, the probability in (6.7) can be expressed as

P[ intersection_{j in N_2(k)} {z_k - z_j <= 0} ] = F_Sigma(0),   (6.13)

where F_Sigma is the c.d.f. of the J-dimensional multivariate normal distribution N(0, Sigma). While the value of F_Sigma(0) in (6.13) is difficult to compute exactly for large d and n, lower and upper bounds can be constructed using the Slepian inequality [107]. To this end, let us introduce the covariance matrices Sigma_L = (sigma^L_{ij}) and Sigma_U = (sigma^U_{ij}) with

sigma^L_{ij} = 4, if i = j;  2, if i != j and (i-1) div d = (j-1) div d;  0, otherwise,   (6.14a)

sigma^U_{ij} = 4, if i = j;  2, otherwise,   (6.14b)

so that sigma^L_{ij} <= sigma_{ij} <= sigma^U_{ij} holds for all 1 <= i, j <= J, where sigma_{ij} are the components of the covariance matrix Sigma (6.12). Then the Slepian inequality claims that

F_{Sigma_L}(0) <= F_Sigma(0) <= F_{Sigma_U}(0),   (6.15)

where F_{Sigma_L}(0) and F_{Sigma_U}(0) are the c.d.f.'s of the random vectors X_L ~ N(0, Sigma_L) and X_U ~ N(0, Sigma_U), respectively. As the vector X_U is equicorrelated, the upper bound in (6.15) can be expressed by the one-dimensional integral (see, e.g., [107])

F_{Sigma_U}(0) = int_{-inf}^{+inf} [Phi(alpha z)]^J dPhi(z),   alpha = sqrt(rho / (1 - rho)),   (6.16)

where Phi(.) is the c.d.f. of the standard normal distribution,

Phi(z) = (1 / sqrt(2 pi)) int_{-inf}^{z} e^{-t^2/2} dt,

and rho = sigma_{ij} / sqrt(sigma_{ii} sigma_{jj}) is the correlation coefficient of distinct components of the corresponding random vector. The correlation coefficient of the components of X_U is rho = 2/4 = 1/2, so alpha = 1, which allows for a simple closed-form expression for the upper bound in (6.15):

F_{Sigma_U}(0) = int_{-inf}^{+inf} [Phi(z)]^J dPhi(z) = 1 / (J + 1).   (6.17)

This immediately yields an upper bound on the expected number of local minima E[M]:

E[M] <= sum_{k=1}^{N} F_{Sigma_U}(0) = N / (J + 1) = 2 (n!)^{d-1} / (n(n-1)d + 2).

Let us now calculate the lower bound for E[M] using (6.15).
According to the covariance matrix Sigma_L (6.14a), the vector X_L is comprised of n(n-1)/2 groups of variables, each consisting of d elements,

X_L = (X_1,...,X_d, ..., X_{(i-1)d+1},...,X_{id}, ..., X_{(n(n-1)/2-1)d+1},...,X_{(n(n-1)/2)d}),

such that variables from different groups are independent, whereas variables within a group have the d x d covariance matrix defined as in (6.14b). Thus, one can express the lower bound F_{Sigma_L}(0) in (6.15) as the product

F_{Sigma_L}(0) = prod_{i=1}^{n(n-1)/2} P[ X_{(i-1)d+1} <= 0, ..., X_{id} <= 0 ].

Since the variables X_{(i-1)d+1},...,X_{id} are equicorrelated with correlation coefficient rho = 1/2, each probability term in the last equality can be computed similarly to the evaluation of (6.17), i.e.,

F_{Sigma_L}(0) = ( int_{-inf}^{+inf} [Phi(z)]^d dPhi(z) )^{n(n-1)/2} = ( 1 / (d + 1) )^{n(n-1)/2}.

This finally leads to a lower bound on the expected number of local minima:

E[M] >= sum_{k=1}^{N} F_{Sigma_L}(0) = (n!)^{d-1} / (d + 1)^{n(n-1)/2}.

In such a way, we have proved the following theorem.

Theorem 6.6 In an n >= 3, d >= 3 MAP with i.i.d. standard normal cost coefficients, the expected number of local minima is bounded as

(n!)^{d-1} / (d + 1)^{n(n-1)/2} <= E[M] <= 2 (n!)^{d-1} / (n(n-1)d + 2).   (6.18)

Remark 6.6.1 From (6.18) it follows that for fixed n >= 3, the expected number of local minima is exponential in d.

Corollary 6.7 Similarly to the case n = 2, the developed lower and upper bounds (6.18) can be used to estimate the expected ratio of the number of local minima M to the total number of feasible solutions N in an n >= 3, d >= 3 MAP:

1 / (d + 1)^{n(n-1)/2} <= E[M/N] <= 2 / (n(n-1)d + 2).

6.6 Number of Local Minima Effects on Solution Algorithms

In this section, we examine the question of whether the number of local minima in the MAP has an impact on heuristics that rely, at least in part, on local neighborhood searches.
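The bounds of Theorem 6.6, which give a rough prediction of how many local minima the heuristics below must contend with, are simple to evaluate numerically. The sketch also includes a Monte Carlo check of the equicorrelated integral (6.17): substituting u = Phi(z) turns the integral into E[U^J] = 1/(J+1) for U uniform on [0,1]. The function names are ours.

```python
import numpy as np
from math import factorial

def e_m_bounds(d, n):
    # Bounds (6.18) on the expected number of local minima for an
    # n >= 3, d >= 3 MAP with i.i.d. standard normal costs.
    N = factorial(n) ** (d - 1)          # number of feasible solutions
    lower = N / (d + 1) ** (n * (n - 1) // 2)
    upper = 2 * N / (n * (n - 1) * d + 2)
    return lower, upper

lo, up = e_m_bounds(3, 3)
print(lo, up)   # 0.5625 and 3.6; Table 6-1 reports 1.82, in between

# Monte Carlo check of (6.17): integral of Phi(z)^J dPhi(z) = 1/(J+1).
rng = np.random.default_rng(2)
J = 9                                    # J = d*n*(n-1)/2 for d = n = 3
u = rng.random(1_000_000)
print(np.mean(u ** J), 1 / (J + 1))      # both near 0.1
```

For d = n = 3 the observed average of 1.82 local minima from Table 6-1 falls between the two bounds, as the theorem requires.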
We consider three heuristics:

- Random Local Search
- Greedy Randomized Adaptive Search Procedure (GRASP)
- Simulated Annealing

The heuristics described in the following three subsections were exercised against various sized problems that were randomly generated from the standard normal distribution.

6.6.1 Random Local Search

The random local search procedure simply steps through a given number of iterations. Each iteration begins by selecting a random starting solution. The algorithm then conducts a local search until no better solution can be found. The algorithm keeps the overall best solution and reports it after executing the maximum number of iterations. The following is a more detailed description of the steps involved.

1. Set the iteration number to zero, Iter = 0.
2. Randomly select a current solution, x_current.
3. While not all neighbors of x_current have been examined, select a neighbor, x_new, of the current solution. If z_{x_new} < z_{x_current}, then x_current <- x_new.
4. If z_{x_current} < z_{x_best}, then x_best <- x_current.
5. If Iter < Iter_max, increment Iter by one and go to Step 2. Otherwise, end.

6.6.2 GRASP

A greedy randomized adaptive search procedure (GRASP) [36, 37, 38] is a multistart or iterative process, in which each GRASP iteration consists of two phases. In a construction phase, a random adaptive rule is used to build a feasible solution one component at a time. In a local search phase, a local optimum in the neighborhood of the constructed solution is sought. The best overall solution is kept as the result. The neighborhood search is conducted similarly to that in the random local search algorithm above. That is, neighbors of a current solution are examined one at a time, and if an improving solution is found, it is adopted as the current solution and the local search starts again. The local search ends when no improving solution is found. GRASP has been used in many applications and specifically in solving the MAP [4, 74].
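Steps 1-5 of the random local search, whose neighborhood scan is also the local search phase shared with GRASP, can be sketched as follows. This is an illustrative reconstruction under our own solution representation (d permutations over a 2-exchange neighborhood), not the author's implementation.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

def cost(c, sol):
    # sol is a d x n array; column i is the index tuple of assignment i.
    return sum(c[tuple(sol[:, i])] for i in range(sol.shape[1]))

def local_search(c, sol):
    # Examine 2-exchange neighbors one at a time; adopt the first
    # improving neighbor and rescan; stop when no neighbor improves.
    z = cost(c, sol)
    improved = True
    while improved:
        improved = False
        for q in range(sol.shape[0]):
            for r, s in combinations(range(sol.shape[1]), 2):
                nb = sol.copy()
                nb[q, r], nb[q, s] = nb[q, s], nb[q, r]
                z_nb = cost(c, nb)
                if z_nb < z:
                    sol, z, improved = nb, z_nb, True
    return sol, z

def random_local_search(c, iters=100):
    # Steps 1-5: repeated local search from random starting solutions,
    # keeping the best local minimum found over all iterations.
    d, n = c.ndim, c.shape[0]
    best_sol, best_z = None, np.inf
    for _ in range(iters):
        sol = np.vstack([np.arange(n)]
                        + [rng.permutation(n) for _ in range(d - 1)])
        sol, z = local_search(c, sol)
        if z < best_z:
            best_sol, best_z = sol, z
    return best_sol, best_z

c = rng.standard_normal((4, 4, 4))
sol, z = random_local_search(c, iters=50)
print(local_search(c, sol)[1] == z)   # the result is a local minimum: True
```

By construction the returned solution is a 2-exchange local minimum; whether it is also the global minimum depends on the number of local minima, which is precisely the question the experiments below address.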
6.6.3 Simulated Annealing

Simulated annealing is a popular heuristic used in solving a variety of problems [57, 70]. Simulated annealing uses a local search procedure, but the process allows uphill moves. The probability of moving uphill is higher at high temperatures; as the process cools, there is less probability of moving uphill. The specific steps for simulated annealing used in this chapter are taken from work by Gosavi [46]. Simulated annealing was recently applied to the MAP by Clemmons et al. [27].

6.6.4 Results

The heuristic solution quality, Q, which is the relative difference from the optimal value, is reported and compared for same-sized problems with assignment costs independently drawn from the standard normal distribution. The purpose of our analysis is not to compare the efficiency of the heuristic algorithms, but to determine the extent to which the number of local minima affects their performance. Each run of an experiment involved the following general steps:

1. Generate a random MAP instance with cost coefficients that are i.i.d. standard normal random variables.
2. Obtain M by checking each feasible solution for being a local minimum.
3. Solve the generated MAP instance using each of the above heuristics 100 times and return the average solution quality, Q, for each heuristic method.

Problem sizes were chosen based on the desire to test a variety of sizes and the practical amount of time needed to determine M (counting M is the obvious bottleneck in the experiment). The four problem sizes chosen were d = 3, n = 6; d = 4, n = 5; d = 4, n = 6; and d = 6, n = 4. For problem size d = 4, n = 6, which has the largest N, a single run took approximately four hours on a 2.2 GHz Pentium 4 machine. The number of runs of each experiment varied with problem size, with fewer runs for larger problems. The numbers of runs were 100, 100, 30, and 50, respectively, for the problem sizes listed above.
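The quality measure can be computed as follows. The text defines Q only as "the relative difference from the optimal value", so the exact normalization below (gap divided by the absolute optimal value) is our assumption.

```python
def solution_quality(z_heuristic, z_optimal):
    # Q = relative difference of the heuristic cost from the optimum.
    # Normalizing by |z_optimal| is our assumption; with N(0,1) costs
    # the optimal value is generally negative, hence the abs().
    return (z_heuristic - z_optimal) / abs(z_optimal)

# A heuristic that finds the optimum scores Q = 0; Q grows with the gap.
print(solution_quality(-8.0, -10.0))   # 0.2
```

A lower Q is better, and averaging Q over 100 heuristic runs per instance, as in Step 3, smooths out the randomness of the individual restarts.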
Figure 6-2 displays plots of the average solution quality for each of the three heuristics versus the number of local minima. The plots are the results from problem size d = 4, n = 5 and are typical of the other problem sizes. Included in each plot is a best-fit linear least-squares line that provides some insight on the effect of M on solution quality. A close examination of the figures indicates that the solution