
Complex Network Assortment and Modeling


PAGE 1

COMPLEX NETWORK ASSORTMENT AND MODELING

By

ASHWIN ARULSELVAN

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2006

PAGE 2

Copyright 2006 by Ashwin Arulselvan

PAGE 3

ACKNOWLEDGMENTS

I would like to express my gratitude to my advisor, Dr. Panos M. Pardalos, for all the valuable guidance and immense support he gave me while I was working on this thesis. I am really thankful to him. I would also like to thank Dr. J. Cole Smith, member of my committee, for his remarks, criticisms, and advice, which improved the quality of this thesis in every possible way. I also thank my family for their moral support.

PAGE 4

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ... iii
LIST OF TABLES ... v
LIST OF FIGURES ... vi
ABSTRACT ... vii

CHAPTER

1 INTRODUCTION ... 1

2 IDENTIFYING CONNECTED COMPONENTS IN THE MARKET GRAPH ... 2
  Introduction ... 2
  Preliminaries of Graph Theory ... 6
  Motivation and Techniques for Finding Connected Component in the Market Graph ... 7
  Structure of Connected Components in the Market Graph ... 10
  Size of Connected Components in the Market Graph in the Context of Power-Law Model ... 11
  Structure of Connected Components in the Market Graph ... 13
  Concluding Remarks ... 14

3 EVOLUTION OF SOCIAL NETWORK ... 20
  Introduction ... 20
  Model ... 21
  Performance Analyses of the Model ... 23
  Results ... 24
  Conclusions ... 24

4 CONCLUSIONS ... 28

LIST OF REFERENCES ... 29
BIOGRAPHICAL SKETCH ... 32

PAGE 5

LIST OF TABLES

Table ... page

2-1 Arrangement of stocks into groups for the market graph with threshold of 0.5 ... 15
2-2 Dates and mean correlations corresponding to each 500-day shift ... 16
2-3 Stocks contained in largest size group for eleven time periods (1 being the oldest period, and 11 being the most recent) ... 17
3-1 Clustering coefficient, assortative mixing coefficient and average length of the giant component ... 25

PAGE 6

LIST OF FIGURES

Figure ... page

2-1 Largest group size by time period (A corresponds to the threshold value of 0.7, B corresponds to the threshold of 0.6, and C pertains to the market graph with threshold 0.5) ... 19
3-1 Degree distribution for n = 500 nodes (R-square = 0.9056) ... 25
3-2 Degree distribution for n = 800 nodes (R-square = 0.9563) ... 25
3-3 Degree distribution for n = 1000 nodes (R-square = 0.9535) ... 26
3-4 Degree distribution for n = 1200 nodes (R-square = 0.9511) ... 26
3-5 Degree distribution for n = 1500 nodes (R-square = 0.9549) ... 26
3-6 Degree distribution for n = 1700 nodes (R-square = 0.9535) ... 27
3-7 Degree distribution for n = 2000 nodes (R-square = 0.9541) ... 27

PAGE 7

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

COMPLEX NETWORK ASSORTMENT AND MODELING

By Ashwin Arulselvan

August 2006

Chair: Panos M. Pardalos
Major Department: Industrial and Systems Engineering

Most real-world networks have been observed to follow the power-law model and to possess highly clustered subgraphs and small diameters. Financial and social networks are no exception. In this thesis, we consider a recently introduced network-based representation of the U.S. stock market, which follows the power-law model. We propose a computationally efficient technique for identifying clusters of similar stocks in the market by partitioning the market graph into a set of connected components. It turns out that these groups have a specific structure, in which each cluster corresponds to certain industrial segments. Moreover, the size of these connected components is consistent with the theoretical properties of the power-law model. We then present a model that simulates the growth of a social network over time by considering weights for relationship strength and identities, features attributed to the individuals represented as nodes in the network that help in their hierarchical classification. Other factors that influence the evolution include mutual acquaintances between a pair of nodes and the time of last acquaintance between them. Our simulation

PAGE 8

resulted in a model having many interesting features that are desired in real-world networks, including high clustering and assortative mixing coefficients, a scale-free degree distribution, and the small-world phenomenon.

PAGE 9

CHAPTER 1
INTRODUCTION

Complex networks have attracted a lot of attention in recent years, as many real-world networks share the same intricate features, and the study of these features makes them mathematically interesting and helps us understand them better. Complex networks differ considerably from the random graph model presented by Erdős and Rényi [1]. For instance, random graphs follow a Poisson degree distribution, while complex networks follow a power-law distribution [2-3]. The power-law distribution is scale free, and for this reason complex networks are also referred to as scale-free networks. These networks are highly clustered compared to random graphs [4]. They have also been observed to exhibit the small-world phenomenon [4-6].

We take two such networks for the study presented in this thesis. In Chapter 2, after a brief introduction to finance graphs, we present an efficient technique to assort assets into highly correlated groups. In Chapter 3, we suggest a statistical model that simulates the growth of a social network and summarize the precision of the model.

PAGE 10

CHAPTER 2
IDENTIFYING CONNECTED COMPONENTS IN THE MARKET GRAPH

We consider a recently introduced network-based representation of the U.S. stock market, referred to as the market graph, which has been shown to follow the power-law model. We propose a computationally efficient technique for identifying clusters of similar stocks in the market by partitioning the market graph into a set of connected components. It turns out that these groups have a specific structure, where each cluster corresponds to certain industrial segments. Moreover, the size of these connected components is consistent with the theoretical properties of the power-law model.

Introduction

Given the huge amount of data generated by the stock market on a daily basis, the importance of discovering efficient ways to represent and analyze these data becomes apparent. Stock market data are generally illustrated by plots displaying the price of a certain stock during various time periods. However, as the number of stocks increases, the task of analyzing the information contained in such plots becomes more and more complicated.

In our study we adopted an alternative approach to exploring stock market data. Specifically, we applied a recently developed technique of representing stock market prices over time in the form of a network, with the stocks as the nodes and the edges induced by the relations between the prices of two different stocks. This network is called the market graph [7-9].
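The construction of the market graph, which this chapter goes on to formalize through logarithmic returns, a correlation coefficient, and a threshold, can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the implementation used in the study: the function names, the made-up toy prices, and the use of "greater than or equal to" in the threshold test are illustrative choices.

```python
import math
from itertools import combinations


def log_returns(prices):
    """Daily log returns R(t) = ln(P(t) / P(t-1)) for one price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def correlation(x, y):
    """Correlation coefficient of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(vx * vy)


def market_graph(prices_by_ticker, threshold):
    """Adjacency sets: an edge joins two tickers whose correlation is >= threshold."""
    returns = {s: log_returns(p) for s, p in prices_by_ticker.items()}
    graph = {s: set() for s in returns}
    for a, b in combinations(returns, 2):
        if correlation(returns[a], returns[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical tickers with made-up prices (not real market data).
toy_prices = {
    "AAA": [10.0, 10.5, 10.2, 10.9, 11.4],
    "BBB": [20.0, 21.1, 20.3, 21.9, 22.8],
    "CCC": [5.0, 4.8, 5.1, 4.7, 4.9],
}
graph = market_graph(toy_prices, threshold=0.5)
```

In this toy example, AAA and BBB move together and are joined by an edge, while CCC moves against both and stays isolated. Comparing every pair of stocks takes quadratic time in the number of stocks, which is one reason a dense representation is natural for this problem.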

PAGE 11

It is worthwhile to mention that the above approach to representing massive datasets is widely used in many different areas, such as the social sciences, finance, genomics, and protein folding [10-15]. This methodology can be applied to interpret large datasets arising in various applications as a graph, where the elements of the dataset are the vertices and the relationships between those elements are represented by the edges of the graph. In many cases, such network representations prove to be extremely useful and convenient for analyzing the information and elucidating hidden dependencies in the data.

A network representation of the stock market data is derived from the cross-correlation of price fluctuations over a certain time period. We construct the market graph as follows: each node in the graph corresponds to a particular stock, and two nodes are connected by an edge if the price correlation coefficient for the pair of associated stocks (computed over a specific period of time) exceeds a given threshold.

Let us now describe the procedure for constructing the market graph. Denote the price of the financial instrument $i$ on day $t$ by $P_i(t)$. The logarithm of the return on asset $i$ over the one-day period from $t-1$ to $t$ is given in equation 2.1.

$$R_i(t) = \ln \frac{P_i(t)}{P_i(t-1)} \qquad (2.1)$$

Then the correlation coefficient between instruments $i$ and $j$ can be computed as shown in equation 2.2.

$$C_{ij} = \frac{\langle R_i R_j \rangle - \langle R_i \rangle \langle R_j \rangle}{\sqrt{\left(\langle R_i^2 \rangle - \langle R_i \rangle^2\right)\left(\langle R_j^2 \rangle - \langle R_j \rangle^2\right)}} \qquad (2.2)$$

In equation 2.2, $\langle R_i \rangle$ denotes the average logarithmic return of asset $i$ over the $N$-day period [16] and is given in equation 2.3.

PAGE 12

$$\langle R_i \rangle = \frac{1}{N} \sum_{t=1}^{N} R_i(t) \qquad (2.3)$$

Fix a threshold $\theta \in [-1, 1]$. For each pair of stocks with $C_{ij} \geq \theta$ we add an edge between nodes $i$ and $j$ of the graph. This indicates that the two stocks display similar behavior over time. In particular, the degree of similarity is determined by the prescribed value of the threshold. It follows that analyzing the patterns exhibited by the market graph can provide useful insights into the inner structure of the stock market.

Interestingly, a previous study indicated that the degree distribution of the market graph can be described by the power-law model [11]. We say that a vertex has degree $k$ if there are $k$ edges incident to it. In accordance with the power-law model, the probability of a vertex having degree $k$ is given in equation 2.4,

$$P(k) \propto k^{-\gamma}, \qquad (2.4)$$

or, equivalently,

$$\log P(k) \propto -\gamma \log k. \qquad (2.5)$$

Equation 2.5 implies that the degree distribution plotted on a logarithmic scale is a straight line. Furthermore, the degree distribution of the graph is a key characteristic of the real-life dataset corresponding to this graph. It reveals the large-scale pattern of connections in the graph, which reflects the global properties of the dataset the graph represents.

Remarkably, aside from stock market data, the power-law model can be observed in many other practical areas including, but not limited to, biological networks, computer networks, and social networks [17-21]. This interesting discovery led to an

PAGE 13

introduction of the concept of the so-called “self-organized networks.” Moreover, it turned out that this phenomenon can also be found in finance.

In previous studies, the authors came up with a novel idea to relate certain correlation-induced characteristics of stock market prices to some combinatorial properties of the corresponding market graph [7-9]. Specifically, the problem of arranging stocks into groups of highly correlated assets was considered. This problem was solved by utilizing simple algorithms for finding cliques and independent sets in the market graphs resulting from different threshold values.

Our present study takes a different approach to analyzing the structure of the market graph. Specifically, in this paper we use a method for arranging closely related stocks into groups based on the maximum weighted path cover of the market graph. In general, the maximum weighted path cover problem is known to be NP-hard [22]. However, we can find the connected components of the market graph in polynomial time.

Interestingly, this study has shown that there is a certain degree of similarity between the market graph configuration obtained by our present method and the one obtained by applying the clique problem. The paper also gives interesting insights into relationships between different industries derived from the market graph structure.

PAGE 14

Preliminaries of Graph Theory

Let us first introduce some basic definitions and notation from graph theory that are used in the paper. Later we will give the appropriate interpretation of the introduced concepts as applied to data mining.

Let $G = (V, E)$ denote an undirected random graph with the set of nodes $V$, $|V| = n$, and the set of undirected arcs $E = \{e = (i, j) : i, j \in V\}$. We say that the graph $G$ is connected if one can find an undirected path between every two nodes of $V$. A graph that does not satisfy this property is called disconnected. Every disconnected graph can be decomposed into a number of connected subgraphs. Such subgraphs are called the connected components of the graph.

For any subset of nodes $V_1 \subseteq V$ in the graph, let $G(V_1)$ denote the subgraph of $G$ induced by $V_1$. A subset of nodes $C \subseteq V$ is said to be a clique if the induced subgraph $G(C)$ is complete, i.e., $G(C)$ contains all possible arcs. The problem of finding the largest clique in the graph is known as the maximum clique problem. The maximum clique problem has been proven to be NP-hard [22]. Many cases of this problem are also difficult to approximate. Arora and Safra [23] showed that for some $\epsilon > 0$ the approximation of the maximum clique problem within a factor of $n^{\epsilon}$ is NP-hard.

A path in a graph is an alternating sequence of vertices and edges such that from each of its vertices there is an arc to the descendant vertex. It is also assumed that a path has no cycles.

A weighted graph associates a value (weight or cost) with every edge in the graph. Clearly, the market graph can be viewed as a weighted graph with the cross-

PAGE 15

correlation between two stocks being the arc weights. The sum of the weights of the traversed edges in a weighted graph is called the weight of a path.

A path cover of the graph $G$ is a set of vertex-disjoint paths that together cover the vertices of $G$. In a weighted graph, a path cover of maximum weight is referred to as the maximum weight path cover. The maximum weight path covering problem (MWCP) can be formulated as the problem of finding a maximum weight path cover of a given graph. This problem has been shown to be NP-hard [24].

Motivation and Techniques for Finding Connected Component in the Market Graph

One can easily see from the construction of the market graph that two particular stocks are closely related if their corresponding nodes in the graph are connected by an edge. Consequently, we can deduce that there must be a certain degree of association between two specific stocks if their nodes in the market graph are connected by a path. Moreover, all the stocks represented by the nodes in the path can be combined into a distinct group of interdependent stocks.

In this view, it seems very natural to consider the MWCP for the market graph. In fact, finding the collection of vertex-disjoint paths that has the maximum possible weight can be perceived as the “best” arrangement of closely dependent stocks into separate groups.

In order to solve the MWCP, we applied a greedy algorithm proposed by Liao et al. [24]. The method is analogous to Kruskal's maximum spanning tree algorithm [25]. Precisely, Liao’s algorithm iteratively selects the edge with the greatest weight to be added to the path, while preserving the properties of the path. When two or more arcs are eligible to enter (i.e., they all have the same weight), the algorithm non-

PAGE 16

deterministically selects only one of them. The procedure terminates when either no additional edges can be added to the solution without violating the path properties, or there are no more edges.

The solution of the maximum weighted path covering problem for a constructed market graph showed no significant reduction in the number of stock groups compared with the maximum clique method we used in the previous studies. Moreover, as mentioned before, the MWCP is an NP-hard problem, and so it is computationally unattractive. Thus, the MWCP approach to grouping related stocks, though interesting, is not particularly better than the previous approach based on cliques and independent sets.

Notice that the solution of the MWCP does not include all the edges of the market graph. For market graphs with a high correlation threshold, the above approach results in a rather large number of groups with fewer stocks in each group. On the other hand, for high-threshold market graphs one might want to include all the edges of the market graph, since each of them corresponds to a significant level of correlation between stocks.

Essentially, this modified problem can be formulated as the problem of finding all connected components of the market graph. Clearly, this approach gives a very natural arrangement of all stocks of the market graph into separate groups of closely related stocks. Each connected component indicates a certain group of associated stocks. Although the method based on finding all the connected components would generally result in stocks in a group having somewhat weaker connections than in the MWCP grouping of the same market graph, this can easily be overcome by setting a higher correlation threshold in the graph.
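The greedy path-cover heuristic described earlier in this section can be sketched as follows. This is an illustrative sketch in the spirit of the Kruskal-like description above, not a reproduction of the algorithm of Liao et al. [24]: the function name, the union-find bookkeeping, and the toy weights are assumptions, and ties are broken by sort order rather than non-deterministically.

```python
def greedy_path_cover(n, weighted_edges):
    """Scan edges by decreasing weight and keep an edge only if the kept
    edges still form vertex-disjoint simple paths."""
    parent = list(range(n))  # union-find over path fragments
    degree = [0] * n         # degree of each vertex within the kept edges

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    chosen = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        # A vertex already interior to a path (degree 2) cannot take another
        # edge, and joining two vertices of one fragment would close a cycle.
        if degree[u] < 2 and degree[v] < 2 and find(u) != find(v):
            parent[find(u)] = find(v)
            degree[u] += 1
            degree[v] += 1
            chosen.append((u, v, w))
    return chosen


# Hypothetical correlations between four stocks, given as (weight, u, v).
edges = [(0.9, 0, 1), (0.8, 1, 2), (0.7, 0, 2), (0.6, 2, 3), (0.55, 1, 3)]
cover = greedy_path_cover(4, edges)
```

On this toy input the heuristic keeps the edges 0-1, 1-2, and 2-3, forming a single path. The edge 0-2 is rejected because it would close a cycle, and the edge 1-3 is rejected because vertex 1 already has two path neighbours. Being greedy, the procedure is a heuristic and is not guaranteed to return a path cover of maximum weight.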

PAGE 17

It is widely known that all the connected components of a graph can be obtained simply by using either depth-first search or breadth-first search. Notice that both procedures can be performed in polynomial time. Here we applied the depth-first search algorithm (DFS) to find all connected components of the market graph.

The DFS algorithm can be briefly described as follows. First, some vertex $v$ of the graph is randomly selected and added to a stack. Then, for each descendant $v_1$ of $v$ that has not been selected previously, we apply a recursive procedure by adding $v_1$ to the stack and examining all its descendants $v_2$ in a similar fashion. Precisely, if $v_2$ has not been selected yet, we add it to the stack and then examine all the descendants of $v_2$ by choosing a particular descendant and proceeding recursively. A node is taken out of the stack after all its adjacent vertices have been visited. After all the nodes that can be reached from the initial node by some path have been examined, we choose the next starting vertex from those vertices of the graph that have not been visited yet. The algorithm terminates when all the vertices in the graph have been visited.

The output of the DFS algorithm is a set of depth-first search trees, where each tree represents a connected component of the undirected graph. The number of connected components is equal to the number of depth-first search trees. Moreover, the outcome of the scheme does not depend on the choice of the initial vertex.

The main advantage of this approach is that the DFS algorithm runs in polynomial time. In particular, the procedure takes $O(n + |E|)$ time (with $|E|$ being the number of edges in the graph) if the input is represented by an adjacency list, while it takes $O(n^2)$ time if the input is given in the form of an adjacency matrix. Taking into consideration the size of the input for the market graph problem, an adjacency matrix representation is preferred.
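The search described above can be sketched as a stack-based depth-first traversal. This is a minimal sketch, not the code used in the study: it uses an explicit stack instead of recursion, scans start vertices in dictionary order instead of picking them at random, and assumes an adjacency-set input (a hypothetical `toy_graph`) for brevity.

```python
def connected_components(graph):
    """Connected components of an undirected graph given as
    {vertex: set_of_neighbours}, found by stack-based depth-first traversal."""
    visited = set()
    components = []
    for start in graph:
        if start in visited:
            continue  # already placed in an earlier component
        visited.add(start)
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in graph[v]:
                if w not in visited:
                    visited.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


# Hypothetical six-vertex graph with three components.
toy_graph = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B"},
    "D": {"E"}, "E": {"D"},
    "F": set(),
}
groups = connected_components(toy_graph)
largest = max(groups, key=len)
```

Each vertex is pushed at most once and each adjacency set is scanned once, so the running time is linear in the number of vertices plus edges for this representation. The start vertex affects only the order in which components are reported, not the components themselves.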

PAGE 18

Structure of Connected Components in the Market Graph

First, we applied the depth-first search algorithm to the market graph with a correlation threshold of 0.7. The resulting stock arrangement had a large number of clusters with very few stocks in each group. Clearly, the stocks in each group were strongly linked. Decreasing the number of groups formed would allow one to see a less pronounced pattern of connections between the stocks. This is achieved by decreasing the correlation threshold of the market graph, so that the resulting market graph also takes into account somewhat weaker connections.

Next, we applied the DFS algorithm to the market graph with the threshold set at 0.5. As expected, the algorithm produced a stock arrangement with a smaller number of groups and, in general, a larger number of stocks in each group. This follows from the trivial but important observation that all the connected components at a higher threshold stay connected at the lower threshold and may also become connected to other components (Table 2-1).

For the market graph with a threshold of 0.5, the largest group in the stock arrangement has a total of 269 stocks, which represent both the technology and finance industries. An important observation is that each connected component in the market graph corresponds to a distinct industry sector.

It should be noted that this approach provides a natural way of clustering stocks. Clustering is a well-known and challenging problem arising in data mining [26]. It deals with partitioning a dataset into sets (clusters) of elements grouped according to some similarity criterion. The main difficulty one encounters in solving the clustering problem on a given dataset is that the number of desired clusters of similar objects is usually not known a priori; moreover, an appropriate similarity criterion should be chosen before

PAGE 19

partitioning a dataset into clusters. Using the graph representation of the stock market, the clustering problem is treated as graph partitioning, where the subgraphs in the partition correspond to different clusters. The above results suggest that partitioning the market graph into distinct connected components is a reasonable approach to clustering stocks.

Size of Connected Components in the Market Graph in the Context of Power-Law Model

As mentioned above, the market graph follows the power-law model. The asymptotic properties of power-law random graphs, including the size of their connected components, have been studied theoretically. It is important to mention the existence of a giant connected component (the unique largest component in the graph when the average degree is greater than 1) in a power-law graph with $\gamma < \gamma_0 \approx 3.47875$, and the fact that a giant connected component does not exist otherwise. The emergence of a giant connected component at the point $\gamma_0 \approx 3.47875$ is referred to as the phase transition.

As found in [8], the values of $\gamma$ for the considered instances of the market graph were smaller than the aforementioned threshold value. Therefore, one would expect to find a large connected component in the market graph. The results presented in this section confirm this hypothesis.

Notice that the arrangement of stocks into groups for the market graph with the threshold of 0.5 (presented in Table 2-1) clearly shows the presence of a giant component (represented by the financial services/technology group). Observe also that the giant component is significantly larger than all the other groups.

One may pose the following logical question: How does the size of the largest component in the market graph change over a certain period of time? To answer this

PAGE 20

question, we constructed different market graphs with a given threshold value for 11 adjacent time periods. Specifically, in order to examine the dynamics of the market graph structure, we selected a time period of 1000 trading days in 1998–2002 and considered eleven 500-day shifts within this period. The starting dates of any two consecutive shifts are separated by a 50-day interval. In other words, each pair of successive shifts differs by 50 days and has the remaining 450 days in common. The time shifts considered in this paper are the same as the ones considered in the previous studies [9].

This method lets us capture the structural changes of the market graph using comparatively small intervals between shifts, while maintaining sample sizes of stock price data that are large enough to compute the cross-correlations for each time period. Also note that in our analysis we took into consideration only the stocks that were among those traded during the given 1000 trading days (i.e., for practical reasons we did not take into account stocks that had been withdrawn from the market).

We considered three different values of the correlation threshold: 0.7, 0.6, and 0.5. For each threshold value, the corresponding market graphs were constructed for all eleven time periods. For each of the eleven periods we ran the DFS algorithm to find all connected components in the associated market graph. The size of the largest group (i.e., the giant connected component) formed in each individual period was computed. Figure 2-1 shows the largest group sizes obtained in all 11 time periods for each value of the threshold.

It can be seen that for all three thresholds the size of the largest group of related stocks follows an overall increasing trend. Precisely, as a characteristic common

PAGE 21

for all three cases, the giant component size predominantly increases from the oldest time period (period 1) to the most recent one (period 11). This clearly visible overall increase in the largest group size can be explained by the globalization tendency in the market. Note that this fact was also mentioned in [9] in the context of the growth of the edge density and the maximum clique size in the market graph.

Structure of Connected Components in the Market Graph

Another issue related to the size dynamics that deserves special consideration is how the structure of a giant connected component in a market graph changes across time periods. To investigate this question, we set the threshold value at 0.7 and constructed the corresponding market graphs for all eleven time periods above. Using the DFS algorithm, we found a giant component along with the other connected components in each of the resulting market graphs. The giant connected components for all eleven time shifts are given in Table 2-3.

It appears that in most cases stocks that belong to a giant connected component during an earlier period are also included in the giant component in later periods. There are some other interesting observations about the composition of the largest group found for different time periods. Interestingly, all the giant connected components contain a large number of stocks of companies representing the “high-tech” industry sector. Furthermore, each giant component includes stocks of companies related to the semiconductor industry, and the number of these stocks in the largest group increases with time. All these facts imply that the corresponding branches of industry expanded during the considered period of time to form a major cluster in the market. Additionally, we detected that in the later periods (particularly, in the last 2 of the 11 periods) the giant

PAGE 22

connected components in the market graphs contain quite a significant number of exchange-traded funds (stocks reflecting the behavior of certain indices representing various groups of companies). It should be mentioned that all giant connected components include the Nasdaq 100 tracking stock (QQQ), which was also found to be the vertex with the highest degree (i.e., correlated with the most stocks) in the market graph [7-8].

Concluding Remarks

We extended the methodology of representing the stock market as a graph. We have shown that partitioning the market graph into a set of connected components provides reasonable results in the context of data mining, in particular for clustering stocks into groups with similar behavior.

Moreover, we observed similar patterns in the sizes of connected components in most instances of the market graph, with one large connected component and several small ones. Since the market graph follows a power law with a small parameter $\gamma$, this observation is consistent with theoretical results obtained for the power-law random graph model, which indicate the existence of a giant connected component in such graphs.

Our study confirmed that the recently introduced network-based approach is promising for studying stock market dynamics. We believe that this methodology can be further developed and generalized to take into account various factors affecting the market and to assist researchers and practitioners in making strategic decisions.

PAGE 23

Table 2-1. Arrangement of stocks into groups for the market graph with threshold of 0.5

Industry | Stocks

Basic Materials copper and aluminum | AA, AL, N, PD

Financial services/technology | AAPL, CSCO, ALTR, ADI, AMAT, AMCC, ANAD, ASML, ATML, CY, IDTI, INTC, CMGI, AMTD, AOL, AMZN, DCLK, ET, ELNK, INKT, CNET, NITE, NTBK, RNWK, NTAP, CHKP, QQQ, ADBE, MDY, ADCT, AFCI, BRCM, JDSU, BVSN, ELX, QLGC, KLAC, CMOS, KLIC, ASYT, HELX, LRCX, CYMI, LLTC, DIA, AIG, AXP, BAC, BBT, ASO, CMA, BK, BTO, ABK, HIG, CB, SPC, JP, LNC, TMK, MBI, AF, CF, GPT, FVB, CBSS, FBF, C, BSC, AGE, JPM, COF, HI, GSB, GDW, KEY, FITB, FRE, FNM, MEL, HBAN, NCC, PNC, NTRS, RF, CFR, CYN, HU, MI, MRBK, STI, NFB, ONE, WB, SPY, ADX, MIM, AMO, ATF, SBC, BLS, T, VZ, DJM, EWF, EWQ, ALA, STM, ERICY, EWD, MSCI, EWG, DT, COLT, CWP, FTE, KPN, TEF, EWI, BBV, EWP, EWN, ABN, AEG, AXA, STD, EWU, PHG, NOK, ITWO, SEBL, MXIM, LSCC, LSI, NSM, PMCS, MERQ, VRSN, SUNW, CMVT, NT, BCE, XLNX, MCHP, VTSS, RFMD, TQNT, SWKS, TXCC, TXN, MOT, MU, TER, RTS, MCRL, DELL, MSFT, YHOO, IBM, ORCL, VOD, TI, PT, LEH, LM, RJF, MER, MWD, EWH, APB, APF, RR, GCH, TCH, JFC, TDF, CHL, EWS, MLF, GE, SCH, STT, UPC, SNV, SOTR, TCB, WL, WM, WFC, USB, MXF, BZF, BZL, LAQ, FMX, EKT, EWW, LDF, MSF, UBB, ELP, TBH, TAR, BFR, IRS, TEO, TDP, TMX, TFONY, TV, KOF, MXE, TZA, TY, USA, AMGN, MEDI, CHIR, GENZ, HQH, ALKS, CEGE, HGSI, ABGX, PDLI, INCY, MEDX, MLNM, AFFX, HQL, LYNX, MYGN, GLGC, VRTX, ASG, CCU, KSU, NOVL, PAYX, TLAB, WABC, VLY, KRB, CCR, PVN, MMC, CIEN, DISH, SANM, BGEN, CREE, CTXS, FLEX, HLIT, IBIS, INTU, ISSX, NXTL, QCOM, SFE, SMTC, SPOT, SWS, SIEB, JBOH, MHMY, TFSM, BRKS, COHU

Gold ore Industries | ABX, AEM, ASA, AU, DROOY, HGMCY, GFI, NEM, KGC, PDG, ECO, GLG, TVX

Healthcare | ACV-A, ACV

Financial, Credit/Personal credit institutions | ADVNA, ADVNB

Utilities/services sectors | AEE, AEP, AYE, CEG, CIN, D, DUK, ED, DTE, DQE, IDA, AVA, VKL, LNT, NI, PEG, ETR, FPL, PNW, EXC, PSD, OGE, FE, PGN, PPL, REI, PCG, SO, TE, HE, WEC, WPS, TXU, XEL, ILA, DPL

Basic materials/energy sectors | AHC, APA, APC, BJS, ATW, DO, BHI, CAM, ESV, GLBL, GSF, HAL, HP, NBL, BR, UCL, MUR, PEO, DVN, NE, NBR, NOI, PDE, GW, PKD, PDS, PTEN, MVK, RDC, PGO, RIG, SII, SLB, TDW, TMAR, VRC, OII, WFT, VTS, OEI, PPP, EOG, KMG, COP, CVX, SC, BP, RD, TOT, XOM

Investment banking | AKOA, AKOB

Transportation sectors | ALK, CAL, AMR, DAL, LUV, UAL, NWAC

Healthcare/pharmaceutical preparations | AZN, GSK

PAGE 24

Table 2-1. Continued

Industry | Stocks
Basic materials/consumer goods sector | BCC, BOW, GP, IP, PCH, RYN, TIN, WY
Financial/banking sectors | BCM, BMO, RY, TD
Consumer goods, tires and inner tubes | BDG-A, BDG
Computers and banking sectors | DOCC, FLBK, IBCA, PBIX, SCAI, SOV
Pharmaceutical preparations | BMY, SGP, JNJ, MRK, PFE
Consumer goods/financial | EWJ, HIT, SNE, MTF, NTT, TM
Indian financial services | IFN, IGF, IIF, JFI
Korea technology/finance | KEF, KF, SKM
Media/technology | CYLK, HOLL, NAVR, SHRP
Plastics materials and resins | DD, DOW, PPG, ROH

Table 2-2. Dates and mean correlations corresponding to each 500-day shift

Period # | Starting date | Ending date
1 | 9/24/1998 | 9/15/2000
2 | 12/4/1998 | 11/27/2000
3 | 2/18/1999 | 2/8/2001
4 | 4/30/1999 | 4/23/2001
5 | 7/13/1999 | 7/3/2001
6 | 9/22/1999 | 9/19/2001
7 | 12/2/1999 | 11/29/2001
8 | 2/14/2000 | 2/12/2002
9 | 4/26/2000 | 4/25/2002
10 | 7/7/2000 | 7/8/2002
11 | 9/18/2000 | 9/17/2002

PAGE 25

Table 2-3. Stocks contained in largest size group for eleven time periods (1 being the oldest period, and 11 being the most recent).

Time | Size | Stocks
11 | 202 | ABGX, BBH, AMGN, CHIR, CRA, MLNM, HGSI, MEDX, PDLI, IJH, AGE, DIA, C, FBF, IYF, BAC, IYG, GE, IVE, EWG, ABN, EZU, AXA, AEG, ING, EWQ, BBV, EWP, TEF, DT, FTE, STD, EWD, EWI, SPY, BDH, ADI, ALTR, AMAT, AMCC, BRCM, IAH, ATML, CY, FCS, IYW, AMD, NVLS, ASML, IFX, PHG, ALA, STM, EPC, INTC, DELL, IVW, BHH, ARBA, IYV, BEAS, IIH, CHKP, MERQ, QQQ, ARMHY, BRCD, ELX, QLGC, CIEN, JDSU, XLK, CLS, FLEX, IWF, CSCO, JNPR, MXIM, IDTI, LLTC, IRF, KLAC, BRKS, CMOS, SMH, CYMI, LRCX, KLIC, LSCC, MCHP, TXN, IWO, HHH, AOL, EBAY, IJR, IVV, IWB, IWD, IWM, IWN, IWV, IWW, IYC, IWZ, IYJ, IYY, IYZ, ATF, SBC, TTH, VZ, MKH, MDY, MWD, BSC, GS, LEH, LM, MER, XLF, AIG, BK, JPM, RKH, BBT, UPC, FVB, MI, NCC, STI, RF, CMA, MEL, ONE, USB, WB, STT, WFC, XLV, VIA-B, VIA, XLI, XLY, MSFT, NOK, SBF, YHOO, SSTI, XLNX, LSI, PMCS, VTSS, TER, LTXX, VSEA, MCRL, MU, NEWP, NSM, NTAP, SEBL, EXTR, FDRY, VRTS, SMTC, SUNW, EMC, IBM, ORCL, JBL, SANM, CREE, NVDA, CNXT, ITWO, KEI, KOPN, MRVC, QCOM, RFMD, TQNT, SCMR, SNDK, VRSN, VSH, KEM, BVSN, CMRC, IWOV, MOT, NT, DCX, VRTX, DNA, GILD, IDPH, MEDI, MYGN
10 | 159 | ABGX, BBH, AMGN, CRA, MLNM, HGSI, MEDX, PDLI, IJH, BDH, ALA, PHG, ASML, AMAT, ALTR, ADI, CY, ATML, IAH, AMCC, BHH, ARBA, IYV, BEAS, IIH, BVSN, IYW, AMD, NVLS, CMOS, SMH, BRCM, IWF, CSCO, EMC, BRCD, ELX, QLGC, JNPR, CIEN, EXTR, FDRY, QQQ, AVNX, CHKP, SEBL, MERQ, VRTS, NTAP, XLK, CLS, FLEX, SANM, JBL, CREE, DELL, INTC, IVW, C, DIA, GE, IVE, IJR, IVV, HHH, AOL, EBAY, IWB, IWD, IWM, IWV, IYC, IYY, IYF, BAC, IYG, JPM, SPY, MDY, MWD, GS, LEH, BSC, LM, MER, XLF, FBF, RKH, PNC, SSTI, STM, EPC, IFX, KLAC, BRKS, LLTC, IDTI, LSCC, LRCX, LTXX, MXIM, IRF, LSI, XLNX, MCHP, PMCS, VTSS, TXN, SMTC, TER, NOK, XLI, XLV, DIS, VIA-B, VIA, XLY, YHOO, MSFT, NEWP, SUNW, JDSU, NVDA, ORCL, CMRC, GLW, ITWO, MRVC, QCOM, RFMD, TQNT, SWKS, SCMR, SNDK, VRSN, MU, NSM, KLIC, IWOV, IBM, EWG, BBV, EWP, EWQ, EWI, STD, TEF, DT, FTE, EWD, MOT, NT, VRTX, DNA, IDPH, MEDI
9 | 110 | ADI, ALTR, AMAT, AMCC, BDH, ATML, CY, IAH, BEAS, IIH, ARBA, BHH, CMRC, QQQ, ARMHY, ASML, IFX, PHG, ALA, STM, MDY, DIA, C, JPM, XLF, BAC, FBF, MWD, GS, LEH, BSC, MER, SPY, GE, HHH, AOL, EBAY, XLK, BRCM, PMCS, CSCO, JDSU, CIEN, JNPR, GLW, SUNW, EMC, BRCD, ELX, QLGC, NTAP, SEBL, CHKP, MERQ, VRTS, LLTC, KLAC, IRF, MXIM, LSCC, IDTI, LRCX, NVLS, CMOS, INTC, DELL, TXN, XLNX, LSI, MCHP, VTSS, TXCC, TER, CREE, EXTR, FLEX, CLS, SANM, MSFT, NEWP, NVDA, ORCL, SSTI, TQNT, RFMD, SWKS, YHOO, XLI, XLV, VIA-B, VIA, PNC, NCC, XLY, NOK, AVNX, BVSN, DIGL, ITWO, MRVC, SCMR, VRSN, IWOV, IBM, NSM, NT, MU

PAGE 26

Table 2-3. Continued

Time | Size | Stocks
8 | 92 | ADI, QQQ, ALTR, AMAT, AMCC, BRCM, PMCS, CSCO, EMC, SUNW, XLK, ASML, PHG, STM, ALA, ATML, CY, LSI, XLNX, INTC, DELL, NVLS, CMOS, KLAC, LLTC, LSCC, IDTI, LRCX, TXN, MXIM, VTSS, JDSU, CIEN, JNPR, BEAS, CHKP, MERQ, SEBL, ITWO, MDY, DIA, C, JPM, XLF, BAC, BK, FBF, PNC, NCC, SPY, GE, HHH, AOL, EBAY, VRSN, YHOO, XLI, XLV, VIA-B, VIA, NTAP, BRCD, ELX, QLGC, VRTS, ORCL, GLW, TXCC, TER, KLIC, MCHP, NSM, TQNT, RFMD, SWKS, NOK, CREE, EXTR, FLEX, CLS, SANM, MSFT, AVNX, BVSN, CMRC, ARBA, CNXT, DIGL, IRF, MRVC, NEWP, SCMR
7 | 95 | ABGX, MEDX, BBH, AMGN, DNA, HGSI, MLNM, PDLI, IDPH, MDY, ATML, AMAT, ALTR, LSCC, IDTI, QQQ, AMCC, BRCM, PMCS, CSCO, EMC, SUNW, XLK, ASML, PHG, STM, ALA, NOK, TXN, ADI, CY, LSI, XLNX, KLAC, LRCX, NVLS, KLIC, MXIM, LLTC, VTSS, TXCC, TER, MCHP, BEAS, JNPR, CIEN, MERQ, SEBL, CHKP, NTAP, QLGC, BRCD, ELX, VRTS, ORCL, CREE, DELL, FLEX, CLS, SANM, HHH, AMZN, AOL, EBAY, SPY, C, DIA, XLF, BAC, BK, FBF, JPM, PNC, NCC, XLI, XLV, VIA-B, VIA, GE, MIM, YHOO, INTC, JDSU, GLW, TQNT, RFMD, SWKS, VRSN, BVSN, CMRC, ARBA, EXTR, ITWO, SCMR, MEDI
6 | 71 | ALA, STM, ASML, PHG, QQQ, ALTR, LSCC, ATML, AMAT, AMCC, BRCM, JDSU, CSCO, SUNW, EMC, XLK, CIEN, CREE, FLEX, GLW, INTC, JNPR, KLAC, LRCX, NVLS, KLIC, TER, TXN, XLNX, LLTC, MXIM, VTSS, PMCS, TXCC, MDY, DIA, C, JPM, XLF, BAC, BK, FBF, PNC, SPY, MIM, XLI, XLV, VIA-B, VIA, NTAP, SEBL, VRTS, ORCL, QLGC, ELX, TQNT, RFMD, SWKS, VRSN, BEAS, MERQ, BRCD, BVSN, CHKP, CMGI, ICGE, CMRC, ARBA, ITWO, RBAK, NOK
5 | 63 | ALA, STM, ASML, PHG, QQQ, ALTR, XLK, AMAT, AMCC, BRCM, PMCS, JDSU, CSCO, SUNW, EMC, VTSS, TXCC, QLGC, ELX, KLAC, LRCX, NVLS, TER, MXIM, LLTC, XLNX, ATML, LSCC, TXN, CIEN, FLEX, INTC, MDY, SPY, DIA, XLF, BAC, C, JPM, FBF, PNC, XLI, MIM, MLF, XLV, NOK, NTAP, SEBL, VRTS, ORCL, TQNT, RFMD, SWKS, VRSN, BEAS, MERQ, BVSN, CHKP, CMGI, CMVT, GLW, ITWO, JNPR
4 | 57 | ALTR, QQQ, AMAT, KLAC, LRCX, NVLS, TER, XLK, AMCC, BRCM, PMCS, VTSS, ATML, XLNX, LLTC, MXIM, LSCC, TXN, CSCO, JDSU, SUNW, EMC, INTC, MDY, SPY, DIA, XLF, BAC, C, JPM, FBF, PNC, XLI, MIM, MLF, XLV, ORCL, QLGC, ELX, SEBL, STM, ASML, PHG, NOK, VRTS, BEAS, MERQ, CHKP, CIEN, CMGI, CMVT, ITWO, NTAP, TQNT, SWKS, TXCC, VRSN
3 | 35 | ALTR, XLNX, LLTC, MXIM, QQQ, AMAT, KLAC, LRCX, NVLS, TER, XLK, CSCO, EMC, SUNW, SPY, DIA, MDY, MIM, MLF, XLI, XLV, INTC, JDSU, SEBL, TXN, AMCC, PMCS, BRCM, CMGI, ITWO, NTAP, QLGC, VRSN, VRTS, VTSS
2 | 24 | ALTR, XLNX, QQQ, AMAT, KLAC, LRCX, NVLS, TER, AMCC, PMCS, CSCO, SPY, DIA, MDY, MIM, MLF, SUNW, EMC, INTC, JDSU, MXIM, LLTC, SEBL, VRTS
1 | 15 | ALTR, XLNX, QQQ, AMAT, KLAC, LRCX, NVLS, TER, CSCO, SPY, DIA, MDY, MIM, SUNW, INTC

PAGE 27

[Figure 2-1 consists of three panels, A, B and C, each plotting largest group size (vertical axis) against time period 1 to 11 (horizontal axis).]

Figure 2-1: Largest group size by time period (A corresponds to the threshold value of 0.7, B to the threshold of 0.6, and C to the market graph with threshold 0.5)
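The group sizes plotted in Figure 2-1 are sizes of connected components of the market graph at a given threshold. The following is a minimal sketch of that computation, assuming that two stocks are joined by an edge whenever their correlation is at least the threshold. The function name `largest_group`, the toy tickers and the correlation values are hypothetical and are not taken from the thesis data.

```python
from collections import defaultdict


def largest_group(tickers, corr, threshold):
    """Return the largest connected component of the thresholded graph.

    corr maps frozenset({a, b}) -> correlation; missing pairs mean no edge.
    """
    # Build adjacency sets: an edge exists when correlation >= threshold.
    adj = defaultdict(set)
    for pair, c in corr.items():
        if c >= threshold:
            a, b = tuple(pair)
            adj[a].add(b)
            adj[b].add(a)

    seen, best = set(), []
    for start in tickers:
        if start in seen:
            continue
        # Iterative depth-first search collects one component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(component) > len(best):
            best = component
    return sorted(best)


# Toy data (hypothetical values, for illustration only).
tickers = ["A", "B", "C", "D", "E"]
corr = {
    frozenset(("A", "B")): 0.75,
    frozenset(("B", "C")): 0.65,
    frozenset(("C", "D")): 0.55,
    frozenset(("D", "E")): 0.20,
}
sizes = {t: len(largest_group(tickers, corr, t)) for t in (0.7, 0.6, 0.5)}
```

Lowering the threshold can only add edges, so the largest group can only grow, which is the ordering the three panels suggest.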

PAGE 28

CHAPTER 3
EVOLUTION OF SOCIAL NETWORK

Social networks have attracted considerable scientific attention in recent years, largely because of their applications to social processes such as the spread of disease [27-29], urbanization [30-32], and the discovery of networks of innovators [33], among many other fields. Social scientists are also interested in studying evolving social networks as a dynamic process [34]. In this chapter, we present a model that dynamically simulates a social network under standard assumptions already established in the literature, and we show that the model closely mimics a real-world network.

Introduction

Social networks, like the web graph and the finance graph, evolve over time. In fact, the rate at which they change may be higher than that of other real-world networks. They exhibit network transitivity: two nodes tend to connect if they share a mutual neighbor [4, 35]. They also exhibit high degree correlation, or assortative mixing [36]: nodes of high degree tend to connect with other nodes of high degree. Individuals in a social network can have identities [35, 37, 38], that is, characteristics of a specific node that support a hierarchical classification of nodes. For instance, in a student community, two students classified in the same group have more opportunities to establish a relationship than two students who are strangers to each other. Such identities could be that the students take the same classes, play a common sport, or share the same musical interests. The small-world phenomenon is another feature of social networks [4-6]: nodes are separated by a distance of 6 or less

PAGE 29

on average. These features are also among the parameters that guide the simulation of our model.

Most complex networks exhibit preferential attachment and the addition of new vertices [2]. It is important to note that, unlike other complex real-world networks such as biological or technological networks, not all social networks exhibit preferential attachment and the addition of new vertices over time. The absence of these features, especially of preferential attachment, makes such networks follow a single-scale distribution rather than a power-law distribution [3]. Of course, social networks such as author collaboration and movie actor networks do possess these two characteristics. Social networks have a wide range of applications, such as studying diseases in a network, identifying innovators in an author collaboration network, analyzing patterns of migration among people, and investigating criminal behavior using financial flows.

Model

We propose a model in which we dynamically simulate a social network based on the strength of each relationship [39] and the identities of the individuals [37, 38] involved in it. Other factors influencing the growth of the network are the mutual acquaintances shared by the individuals [40] and the time of their last contact [34, 41]. The reasons for considering these parameters are intuitive. A stronger relationship takes longer to fade or dissolve. Relationships also last longer with frequent contact and fade away if individuals do not stay in touch; each additional contact strengthens the relationship. The mutual acquaintances shared by two individuals provide opportunities for them to meet [39].

At each time interval we pick two random nodes and add an edge between them with a probability given by $p \cdot f_x$, where

PAGE 30

$$p = 1 - e^{-m} \qquad (3.1)$$

$$f_x = 1 - (1 - x^a)^b \qquad (3.2)$$

Here $p$ is the probability that an edge will be added between two nodes with $m$ mutual acquaintances. It follows the cumulative distribution function of an exponential distribution, so $p$ increases toward 1 as the number of mutual acquaintances grows [39].

$f_x$ is the probability that an edge will be added between two nodes that have common identities. Each node is assigned a set of identities drawn from a global set according to a uniform distribution, and $x$ is a measure of the identities the two nodes have in common, with support from 0 to 1. $f_x$ follows the Kumaraswamy distribution [42], a double-bounded distribution that resembles the beta distribution and is well suited to simulation studies. The CDF $f_x$ increases monotonically with $x$.

If an edge is added, we also consider each non-mutual neighbor of one node, paired with the other node, as a potential relation and perform the edge additions with probability $p \cdot f_x$. If the nodes considered already share an edge, we still calculate the probabilities and increase the weight of the relationship by one, strengthening the relationship.

We then pick two random nodes again, and this time we consider them for deletion. The probability of deletion is equal to $t_x \cdot f'_x \cdot w_x$, where

$t_x$ is the probability of deletion with x as a function of the time of last contact,
$w_x$ is the probability of deletion with x as a function of the weight of the relationship, and
$f'_x$ is the probability of deletion with x as a function of mutual acquaintances.
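The edge-addition probability described above can be sketched as follows. This is a minimal illustration, assuming that p is the exponential CDF 1 - exp(-rate * m) and that x is the fraction of identities two nodes share. The function names, the `rate` argument, the shape values a = b = 2 and the Jaccard-style overlap measure are all assumptions made for the example; the thesis does not report them.

```python
import math


def mutual_probability(m, rate=1.0):
    """Exponential CDF in the number of mutual acquaintances m (rate is assumed)."""
    return 1.0 - math.exp(-rate * m)


def identity_probability(x, a=2.0, b=2.0):
    """Kumaraswamy CDF on [0, 1]; the shape values a and b are placeholders."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return 1.0 - (1.0 - x ** a) ** b


def identity_overlap(ids_u, ids_v):
    """Hypothetical overlap measure: shared identities / all identities."""
    union = ids_u | ids_v
    return len(ids_u & ids_v) / len(union) if union else 0.0


def add_probability(m, ids_u, ids_v):
    """Probability p * f_x of adding an edge between two nodes."""
    x = identity_overlap(ids_u, ids_v)
    return mutual_probability(m) * identity_probability(x)


# Two nodes with two mutual acquaintances that share one of three identities.
prob = add_probability(2, {"chess", "math"}, {"math", "rock"})
```

Under this assumed form, a pair with no mutual acquaintances gets probability 0, so a simulation built on it would need a non-empty starting graph or a baseline term; the source does not specify either.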

PAGE 31

Each of the three deletion probabilities $t_x$, $f'_x$ and $w_x$ follows a gamma distribution that decreases exponentially as x increases, with density given by

$$x^{k-1}\,\frac{e^{-x/\theta}}{\theta^{k}\,\Gamma(k)}$$

where $k$ is the shape parameter and $\theta$ is the scale parameter.

The simulation ends when the average degree of the network reaches 5. The number of edges in the network is therefore O(n), so the network is sparse.

One question that may arise is why nearest neighbors are considered when adding edges but not when deleting them. This asymmetry can still be justified as fair. Because the stopping condition is an average degree of 5, the addition-to-deletion consideration is 1 to 5 or less at every time interval. In addition, considering nearest neighbors when adding edges is a direct consequence of the highly clustered nature of social networks.

Performance Analyses of the Model

The key performance measures of the model used for simulation are

- Clustering coefficient
- Assortative mixing coefficient
- Average length of the giant component (small world phenomenon)
- Degree distribution (power law distribution)

$$\text{Clustering coefficient} = \frac{3 \times \text{Number of Triangles}}{\text{Number of Triples}} \qquad (3.3)$$

The clustering coefficient measures the tendency of two nodes that are connected by a path of length two to associate, which forms the triangles in the network. As mentioned earlier, this is the motivation for considering nearest neighbors while adding edges.
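Equation (3.3) can be evaluated directly from adjacency sets. The sketch below counts connected triples and the triples that close into triangles for an undirected simple graph. The function name and the toy graph are illustrative assumptions, not part of the thesis implementation.

```python
from itertools import combinations


def clustering_coefficient(adj):
    """Global clustering: 3 * triangles / connected triples.

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    """
    triples = 0   # paths of length two, counted once per centre node
    closed = 0    # triples whose two end nodes are also adjacent
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    # Each triangle closes three triples, so closed == 3 * triangles.
    return closed / triples if triples else 0.0


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cc = clustering_coefficient(graph)
```

In the toy graph there are five connected triples, three of which belong to the single triangle, so the coefficient is 3/5.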

PAGE 32

The assortative mixing coefficient [36] is given by

$$r = \frac{M^{-1}\sum_i j_i k_i - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^2}{M^{-1}\sum_i \tfrac{1}{2}(j_i^2 + k_i^2) - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^2} \qquad (3.4)$$

where $M$ is the number of edges and $j_i$ and $k_i$ are the degrees of the vertices at the two ends of edge $i$. This coefficient measures the tendency of nodes with high degree to connect with other nodes of high degree.

The average shortest distance within the giant component should be less than 6 for the social network to exhibit the small-world phenomenon. The degree distribution should follow a power law to ensure scale-free characteristics.

Results

Simulation results are tabulated in Table 3-1. Each result is the average of 10 test runs for each network size from 500 to 2000 nodes.

Conclusions

We saw that the simulated model has the same features that are desired in real-world networks, with high correlation. The R-square value reported from the regression analyses is as high as 95%, indicating scale-free characteristics. This strengthens our convictions about the social network assumptions on which the model was built. The model should also help in studying social networks that change over time. It could be extended to social networks in which new vertices are added over time, and this should not affect any of the characteristics already present in the model. Future work could also include more sophisticated ways of assigning identities to nodes. An implementation of one such assignment is already in progress: the nodes are distributed in a unit hypercube in which each dimension represents an identity, and the probability of adding an edge between two nodes is inversely proportional to the distance between them.
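The assortative mixing coefficient of Eq. (3.4) can be computed from an edge list. The following is a minimal sketch, assuming the formula of Newman [36] with j_i and k_i taken as the degrees at the two ends of edge i. The function name and the toy graphs are illustrative and do not come from the thesis.

```python
def assortativity(edges):
    """Newman's degree assortativity r for an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    m = len(edges)
    # j_i and k_i are the degrees of the two ends of edge i.
    mean_jk = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_half = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    mean_sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m

    denom = mean_sq - mean_half ** 2
    if denom == 0:
        raise ValueError("r is undefined when all edge-end degrees are equal")
    return (mean_jk - mean_half ** 2) / denom


# A star is perfectly disassortative: the hub only touches degree-1 leaves.
r_star = assortativity([(0, 1), (0, 2), (0, 3)])
# A path on four nodes mixes degree-1 ends with degree-2 interior nodes.
r_path = assortativity([(0, 1), (1, 2), (2, 3)])
```

Both toy graphs give negative values, in contrast to the positive coefficients in Table 3-1, which indicate that high-degree nodes in the simulated networks tend to attach to one another.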

PAGE 33

Table 3-1: Clustering coefficient (CC), assortative mixing coefficient (AM) and average length of the giant component

# | Nodes | CC | AM | Average Length | Giant Component
1 | 500 | 0.49 | 0.83 | 3.41 | 245
2 | 800 | 0.42 | 0.79 | 3.78 | 434
3 | 1000 | 0.42 | 0.81 | 4.13 | 538
4 | 1200 | 0.37 | 0.77 | 4.39 | 695
5 | 1500 | 0.34 | 0.76 | 4.19 | 875
6 | 1700 | 0.32 | 0.75 | 4.32 | 1047
7 | 2000 | 0.34 | 0.75 | 4.56 | 1309

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-1: Degree distribution for n = 500 nodes (R-square = 0.9056)

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-2: Degree distribution for n = 800 nodes (R-square = 0.9563)

PAGE 34

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-3: Degree distribution for n = 1000 nodes (R-square = 0.9535)

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-4: Degree distribution for n = 1200 nodes (R-square = 0.9511)

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-5: Degree distribution for n = 1500 nodes (R-square = 0.9549)

PAGE 35

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-6: Degree distribution for n = 1700 nodes (R-square = 0.9535)

[Log-log plot: number of vertices (vertical axis) against degree (horizontal axis).]
Figure 3-7: Degree distribution for n = 2000 nodes (R-square = 0.9541)
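The R-square values quoted in the captions of Figures 3-1 to 3-7 presumably come from a straight line fitted to the degree distribution on log-log axes. The thesis does not spell out the regression procedure, so the sketch below is one plausible reading: ordinary least squares on (log degree, log count). The function name and the synthetic degree sequence are hypothetical.

```python
import math
from collections import Counter


def loglog_fit(degrees):
    """Fit log10(count) = intercept + slope * log10(degree).

    Returns (slope, intercept, r_squared).
    """
    # Degree-0 nodes cannot be placed on a log axis, so they are skipped.
    counts = Counter(d for d in degrees if d > 0)
    xs = [math.log10(d) for d in counts]
    ys = [math.log10(c) for c in counts.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct positive degrees")

    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic sequence whose counts fall off exactly as degree^-2 (made-up data).
sample = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope, intercept, r2 = loglog_fit(sample)
```

For an exact power law the fitted slope equals the exponent and R-square equals 1; values near 0.95, as in the captions, indicate an approximately linear log-log relationship.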

PAGE 36

CHAPTER 4
CONCLUSIONS

In Chapter 2, we discussed a quick and efficient way to identify clusters of similar stocks in the market graph. From the results, we concluded that our assumption that connected nodes must be highly correlated is fair. In Chapter 3, we proposed a model that simulates the growth of social networks by considering weights for relationship strength and identities. Our simulation produced networks with many of the features desired in real-world social networks, including high clustering and assortative mixing coefficients, the small-world phenomenon, and a scale-free degree distribution. Based on these results, we concluded by justifying the assumptions made in modeling the growth.

Analyses of complex networks are a source of knowledge that helps us understand real-world systems and make better decisions. We hope that this thesis offers useful insights and motivates researchers to pursue work in the field of complex networks, especially social networks. We place special emphasis on social networks to highlight their increasing role in various social activities over the last decade.

PAGE 37

LIST OF REFERENCES

1. Erdős P, Rényi A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 1960;5:17-61.
2. Barabási AL, Albert R. Emergence of scaling in random networks. Science October 15, 1999;286:509-12.
3. Amaral LAN, Scala A, Barthelemy M, Stanley HE. Classes of small-world networks. Proc. Natl. Acad. Sci. 2000;97:11149-52.
4. Watts D, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature 1998;393:440-42.
5. Milgram S. The small world problem. Psychology Today 1967;2:60-67.
6. Strogatz SH. Exploring complex networks. Nature 2001;410:268-76.
7. Boginski V, Butenko S, Pardalos PM. On structural properties of the market graph. In: Nagurney A, editor. Innovations in financial and economic networks. Northampton, MA: Edward Elgar Publishers, 2003.
8. Boginski V, Butenko S, Pardalos PM. Statistical analysis of financial networks. Computational Statistics and Data Analysis 2005;48(2):431–43.
9. Boginski V, Butenko S, Pardalos PM. Mining market data: A network approach. Computers and Operations Research (in press).
10. Abello J, Pardalos PM, Resende MGC. On maximum clique problems in very large graphs. DIMACS Series, vol. 50. Providence, RI: American Mathematical Society; 1999. p. 119–30.
11. Aiello W, Chung F, Lu L. A random graph model for power-law graphs. Experimental Mathematics 2001;10:53–66.
12. Hayes B. Graph theory in practice. American Scientist 2000;88:9–13 (Part I), 104–9 (Part II).
13. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási A-L. The large-scale organization of metabolic networks. Nature 2000;407:651–4.
14. Watts D. Small worlds: the dynamics of networks between order and randomness. Princeton, NJ: Princeton University Press, 1999.

PAGE 38

15. Watts D, Strogatz S. Collective dynamics of ‘small-world’ networks. Nature 1998;393:440–2.
16. Mantegna RN, Stanley HE. An introduction to econophysics: correlations and complexity in finance. Cambridge: Cambridge University Press, 2000.
17. Albert R, Barabási AL. Statistical mechanics of complex networks. Reviews of Modern Physics 2002;74:47–97.
18. Barabási AL. Linked. Cambridge, MA: Perseus Publishing; 2002.
19. Boginski V, Butenko S, Pardalos PM. Modeling and optimization in massive graphs. In: Pardalos PM, Wolkowicz H, editors. Novel approaches to hard discrete optimization. Providence, RI: American Mathematical Society; 2003. p. 17–39.
20. Broder A, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A, Wiener J. Graph structure in the Web. Computer Networks 2000;33:309–20.
21. Faloutsos M, Faloutsos P, Faloutsos C. On power-law relationships of the Internet topology. Cambridge, MA: ACM SIGCOMM, 1999.
22. Garey MR, Johnson DS. Computers and intractability: a guide to the theory of NP-completeness. New York, NY: Freeman; 1979.
23. Arora S, Safra S. Approximating clique is NP-complete. Proceedings of the 33rd IEEE Symposium on Foundations of Computer Science, Pittsburgh, 1992. p. 2–13.
24. Liao S, Devadas S, Keutzer K, Tjiang S, Wang A. Storage assignment to decrease code size. In: Proceedings of the ACM SIGPLAN 1995 conference on Programming Language Design and Implementation, La Jolla (June 1995), ACM Press, pp. 186-95.
25. Aho A, Hopcroft J, Ullman J. The design and analysis of computer algorithms. Reading, MA: Addison Wesley, 1974.
26. Bradley PS, Fayyad UM, Mangasarian OL. Mathematical programming for data mining: formulations and challenges. INFORMS Journal on Computing 1999;11(3):217–38.
27. Rothenberg RB, Potterat JJ, Woodhouse DE, Muth SQ, Darrow WW, Klovdahl AS. Social network dynamics and HIV transmission. AIDS, August 20 1998;12(12):1529-36.
28. Aral SO. Sexual network patterns as determinants of STD rates. Sexually Transmitted Diseases, May 1999;26(5):262-64.
29. Moore C, Newman MEJ. Epidemics and percolation in small-world networks. Phys. Rev. E 61, 5678-82 (2000).

PAGE 39

30. Andersson C, Hellervik A, Lindgren K, Hagson A, Tornberg J. Urban economy as a scale-free network. Phys. Rev. E 68, 036124 (2003) (6 pages).
31. Boyd M. Family and Personal Networks in International Migration: Recent Developments and New Agendas. International Migration Review, Vol. 23, No. 3 (Autumn, 1989), pp. 638-670.
32. Andersson C, Frenken K, Hellervik A. A complex network approach to urban growth. Papers in Evolutionary Economic Geography (PEEG) 0505, Utrecht University, Section of Economic Geography, revised Feb 2005.
33. Yeung YY, Liu TCY, Ng PH. A social network analysis of research collaboration in physics education. American Journal of Physics 2005;73:145.
34. Doreian P, Stokman FN. Evolution of social networks. New York, NY: Gordon and Breach, 1997.
35. Jin EM, Girvan M, Newman MEJ. The structure of growing social networks. Phys. Rev. E 64, 046132 (2001).
36. Newman MEJ. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002).
37. White HC. Identity and Control. Princeton, NJ: Princeton University Press, 1992.
38. Watts D, Dodds PS, Newman MEJ. Identity and search in social networks. Science 2002;296:1302-05.
39. Ravasz E, Barabási AL. Hierarchical organization in complex networks. Phys. Rev. E 67, 026112 (2003) (7 pages).
40. Kossinets G, Watts D. Empirical analysis of an evolving social network. Science, Vol. 311, No. 5757 (6 January 2006), pp. 88-90.
41. Newman MEJ. Clustering and preferential attachment in growing networks. Phys. Rev. E 64, 025102 (2001) (4 pages).
42. Kumaraswamy P. A generalized probability density function for double-bounded random processes. Journal of Hydrology 1980;46:79–88.

PAGE 40

BIOGRAPHICAL SKETCH

The author of this thesis, Mr. Ashwin Arulselvan, is a master’s student at the University of Florida majoring in industrial and systems engineering with a focus on operations research.


74f959b91d32ff6a22280a123b74d1a6
7394684bd8c2128eacf550e7d36efc32a8de70f0
47002 F20110217_AACGCN arulselvan_a_Page_13.pro
03740551a5e15dbb1006782e31d47f49
1074432ac2820f1e884f10b7e98e54621aa38856
26804 F20110217_AACGHM arulselvan_a_Page_07.QC.jpg
0f86bf2073f0beb82bc25db2e525a02f
bf28960363d87900f5da9d4d727738ffe671ef63
F20110217_AACGCO arulselvan_a_Page_32.tif
fe79a2692b891417193e51558ea8d819
12ae83a1a76a8985027747e8abd244100706300b
61284 F20110217_AACGHN arulselvan_a_Page_09.jpg
ef8e80cac4bb889d176232183e9eb252
19c843adb3ddef5073b55d58200a53132bee1f7c
15273 F20110217_AACGCP arulselvan_a_Page_40.jpg
3d2ca830d715b955b1b4f07e2e4e480c
f6767e8273b1680b67c2afe703d37197e8bc20f4
85983 F20110217_AACGHO arulselvan_a_Page_11.jpg
31ac2c4cee7c75e2aff02050ec924cce
51ead3d92bfb6011f8bfd1de8fc0b4753f9b88ec
34226 F20110217_AACGCQ arulselvan_a_Page_21.QC.jpg
36c88b292c6ae3e1d16f28b45c1d1548
5b0c8b85a2aaa0575d51918a6634a543e7813187
97740 F20110217_AACGHP arulselvan_a_Page_15.jpg
8be845a3028dc9948fc8b2d81af59f74
92816f76ceeeebf6ab9876dd89cc4608cced65e6
40438 F20110217_AACGCR arulselvan_a_Page_30.pro
637c0a99c816f64b6f5a4ca626df650c
0f8d6eb223b4804dfaa3b284fff726c426cf8644
518 F20110217_AACGCS arulselvan_a_Page_03.txt
cd312ab721d76297dfb34298cd3e1d51
79e22a44689f43ffbd167ba6f2d71fa29055e1db
32224 F20110217_AACGHQ arulselvan_a_Page_16.QC.jpg
950ca93122f0e6747dc875746deb3113
a7be145eece4853ac8ad0960169c1b119a0d33c6
1624 F20110217_AACGCT arulselvan_a_Page_02.QC.jpg
0bde565cc09d99dd743bbab0534d0374
6529d3dcf3da700822c72ccd03f47b199c0703cb
101351 F20110217_AACGHR arulselvan_a_Page_19.jpg
70e789da856647db76aa1b5d9afed2d6
57ffc54f6c70dcb3caf9243b622a5b46530dab70
F20110217_AACGCU arulselvan_a_Page_02.tif
57c34d1887500520feb9ca4d5bfafc35
dfe557195e6f17418800d755fa87cc4555fba865
32840 F20110217_AACGHS arulselvan_a_Page_20.QC.jpg
d65ea05176c567716229feb0e43cef42
8a554c89af41785d58b2822458161525107a7530
274 F20110217_AACGCV arulselvan_a_Page_40.txt
7c00958c3b5535e1ea6f43585869331c
c82f093ae2bcefb500266563e997aa558ab8a4b5
26544 F20110217_AACGHT arulselvan_a_Page_22.QC.jpg
5a0ea8a13baab2e16a171e6cc372a263
8ce529eb39513022c020c0a2fd7ae4480a8ecb20
F20110217_AACGCW arulselvan_a_Page_09.tif
4f55cfccd21f18c3d50aec86f815ac83
15930c211bd93ca1547fec2de6f7617102f7a0b5
142218 F20110217_AACGHU arulselvan_a_Page_23.jpg
027adf61c568b20b76f621dd8c157742
0134d3bf7aa8b9dc42cadaeeea06ab512229e737
1016441 F20110217_AACGCX arulselvan_a_Page_39.jp2
5da6461e34fbb5a41f3971b34361ab7a
83cb5060c5155dd138932d491f9a61d75603b495
66635 F20110217_AACGHV arulselvan_a_Page_24.jpg
a29667ff75d6ca733f2bcff21d9d6a87
aac0d206d484ce6a18f617c6a00f29970508fe0f
7236 F20110217_AACGAA arulselvan_a_Page_11thm.jpg
4868d27f32f6b8cc902aaaf6726c6d60
3441c2ca4f08d912adc2f780e3eb6738ecc4d30f
8807 F20110217_AACGCY arulselvan_a_Page_25thm.jpg
21879083daf73d933e6bb0d3155745ed
70df63948062a69d44a81ed540016fd26476226a
721 F20110217_AACGAB arulselvan_a_Page_05.txt
1260bd8ab3af938ada83c087e2c58e42
29647915adac4961fd760bb64309df3021510e0e
619249 F20110217_AACGCZ arulselvan_a_Page_24.jp2
85330f552148f52bd70f661132ec2cf6
dd29f6a84c09484678594b3682742f22f3c49ca3
40231 F20110217_AACGHW arulselvan_a_Page_25.QC.jpg
d4140841efbca351eb874695202fb749
2c1a3b77803d3dfcff172db0f756ba95c101e46b
5207 F20110217_AACGAC arulselvan_a_Page_08.QC.jpg
e2f174525b2351432464dda4c271e68e
f85c9d7f0860c08c223bca10628cf5289af4b841
157105 F20110217_AACGHX arulselvan_a_Page_26.jpg
fc173bdac0a03cc9cecf8f79472cf9a0
0c352e0053eb2fa399ed6cca42de6288950000fe
65370 F20110217_AACGAD arulselvan_a_Page_36.jpg
78d591fff6636d4ef17b7d4110fb2dad
46ce1d1f6d7c3386d1c8a7319ae928b725f0780a
28470 F20110217_AACGFA arulselvan_a_Page_39.QC.jpg
3ea3fc120e79aa833b31ff2d8753d03c
104f2be6ffa11d27984d2f9ba9a0e4a99c00905f
41894 F20110217_AACGHY arulselvan_a_Page_26.QC.jpg
1171d3607010ff3c6b8c199b3d59a314
5a0a78eb2b7d19ebc33b281ceb01f25ccc5617b6
31012 F20110217_AACGAE arulselvan_a_Page_14.QC.jpg
2154e360edc7a30aa343acbcde6cfd4e
0df007f5babb1f7cc594f6b896e489ab9d9dd438
F20110217_AACGFB arulselvan_a_Page_22.tif
7bd43334622ca45314c6dbc0db668fba
b2b8957051c3d0930a2381b5f4a1134d84659b27
11577 F20110217_AACGHZ arulselvan_a_Page_27.QC.jpg
090028f6276ded7eab5c4641063b71e0
df6a47f3e806a0b613a98ad6fddf265db4363275
2357 F20110217_AACGAF arulselvan_a_Page_38.txt
c2648c7ebba9cbab9f597205731ad5c2
cb1d1a076b9fe5838e51354c3306f628d6b8ff92
1508 F20110217_AACGFC arulselvan_a_Page_40thm.jpg
989d741bf2f5cb413da0d13e8b406724
f8540ffcc5907b12a02b0db65c027e07a9ccb47d
5675 F20110217_AACGAG arulselvan_a_Page_08.pro
8bbabb7d76f5bc9950244b9f81aaf463
ec271518345e0cf72333a1b4416172cb46425741
16675 F20110217_AACGFD arulselvan_a_Page_05.pro
a72ab0c60cfc1692df76f13e3beaf910
70c187bcdff6edde3d44e7052f6562cf3d1ef84c
7660 F20110217_AACGAH arulselvan_a_Page_13thm.jpg
92400779a956c6ab5b58da960dd416f0
8169ca6b3de33d12dc9d5e33b876c7e09001f34d
F20110217_AACGFE arulselvan_a_Page_25.tif
5590380cb47cfd7bde040497d857fddd
1c9a5b9341beabb3f93d0e358cdc9a80498f90bb
F20110217_AACGAI arulselvan_a_Page_29.tif
2fe1d00d73889539e57ede85274adb00
f7b5fecada002e78ae8c656065ce0f21089fb1aa
2613 F20110217_AACGFF arulselvan_a_Page_23.txt
168a90e5c93efeb1d4107298662d8534
98447f70028a4fec44555e119c2fe0a613de8d09
43220 F20110217_AACGFG arulselvan_a_Page_33.jpg
2fe819c62d6bcb9d341bc953bb5def37
d23e256930c2bc607a8ef36fa63807bf4530925d
8357 F20110217_AACGFH arulselvan_a_Page_19thm.jpg
9aeb7633530747aa9611773b36fb3bc8
d6c284e6265bec0c9cca11e1f2f3d142c07519a5
1320 F20110217_AACGAJ arulselvan_a_Page_36.txt
166224d89592bca16756676c20e5c245
b058cce64cc6468937f900c9c9a80041418d9b02
1807 F20110217_AACGFI arulselvan_a_Page_12.txt
ba1d50bb262dbab0cc9836807302614f
6a74f754cff719a13787a0fc189d44c8a9817d13
F20110217_AACGAK arulselvan_a_Page_39.tif
3b7501f850c85fe6acf619f678420726
090e25e4325868aeacb07d4fab0fa8582d727b1e
2669 F20110217_AACGFJ arulselvan_a_Page_35thm.jpg
cfef9db87d1b38c3e0ed0167d7a62cd4
884a0dca36cd9d7afc78f1bb18e9e76456cd6705
272702 F20110217_AACGAL arulselvan_a_Page_34.jp2
e0636d0b54478abb18ba5a4681c323e5
58da1ed4381c129c95e4f6be2176a4412aa3d2d7
35273 F20110217_AACGFK arulselvan_a_Page_27.jpg
18c2e5ac2606b30934ce6b2765913ab6
8325167256a4365cf22a3b67158274a61a552127
96303 F20110217_AACGAM arulselvan_a_Page_13.jpg
cf3ce326732a07c7504b5b6bb27ea34a
58d7cc7c89a599af93da19ba9b08f07592c255f4
68376 F20110217_AACGFL UFE0014925_00001.xml
46ba09d4be2698b6efd78aa87ca022e6
643241c5b1f6f7f74210b5a8acc9b571cf0a9d26
6686 F20110217_AACGAN arulselvan_a_Page_07thm.jpg
87c948a4b5d0298f96ad1c3f41a1decb
e46627166f5d9329cd813b63f739a467dd49924e
5266 F20110217_AACGAO arulselvan_a_Page_36thm.jpg
a7dcd4f44f11c298c0c719d1a96d9523
03492bbf7c6c0418d1d435c25d1db4c3f9a506d5
30033 F20110217_AACGAP arulselvan_a_Page_32.QC.jpg
02402cac19b7a6f7972373a0fa5b85e4
31f8fdb8d289ed40d027e8c5aa2fcd52eb5c54f1
931 F20110217_AACGAQ arulselvan_a_Page_33.txt
7a7cd46d0ad7c3ac1f5073a0376f30a1
69e04d52d926b13c72000e8bafd1d8b85fe40ab4
F20110217_AACGFO arulselvan_a_Page_04.tif
123bb22a24f8d2b20860571b27fdb828
5b4fb03191f3251dc079e8b715d2b83132e48536
35793 F20110217_AACGAR arulselvan_a_Page_31.pro
e8fe254f141fef219a7b355e41c173a1
d3c4991cc5b4bc3b772116680a9d7afcaf7a2d76
F20110217_AACGFP arulselvan_a_Page_05.tif
cfa3722ed391e07a5e012ca04a998533
861b1c09a7568f80f01c76c51ab75acd25ff669a
F20110217_AACFYS arulselvan_a_Page_31.tif
fa92fbb956c5f34c2c3be85b8fd54f00
6610d35b43531783e5790420bab34ed73ebe1a62
44612 F20110217_AACGAS arulselvan_a_Page_28.pro
29d1f0e7d2b515ba9460c9379cb07b21
4c789b16a788ac0bd62a466bdc8b4e271f76e16f
F20110217_AACGFQ arulselvan_a_Page_12.tif
8381f96c6b55fed071307bf629df1d86
302cce14f0d6aaf2e9b039f566418e612840156d
F20110217_AACFYT arulselvan_a_Page_20.tif
9b4dcdf4cfe6e016e813ec1c35df05f8
4f96788f0d6a3dff64661b08c84e88a73d88a976
104906 F20110217_AACGAT arulselvan_a_Page_18.jpg
468ee23bb733ac06abd47c4eac68761f
90ac7535bfb4ab99339e2b4d594291a1e671769d
F20110217_AACGFR arulselvan_a_Page_13.tif
db15c271818b196b8dc819885286d192
c841276aab105313aabecfdf9ed728f05a33e566
28740 F20110217_AACFYU arulselvan_a_Page_02.jp2
a1e9f4f91f30537fa4843d206c2ed39f
b819aa0483c1eaa7a68b1e49478bd191126e42e6
F20110217_AACGAU arulselvan_a_Page_11.tif
d74e090b60ae8f5e4aa2cde37f20596b
2b55be79cb421fe23cc52c0d7f9bd5d4acccb109
F20110217_AACGFS arulselvan_a_Page_14.tif
0cf06cee7704a0afa78fda780a04106a
9fe371f2bf2ed56854fe58e5c529334a02cb7e50
33617 F20110217_AACFYV arulselvan_a_Page_17.QC.jpg
bac06b59c4090932618b6a136c1604a9
3444db5057ea3829a5009a49675e28586e60f1e0
388 F20110217_AACGAV arulselvan_a_Page_01.txt
191a7ca015e494e97649a1561c9053bf
a06e7577d18664412f7d79ee23437b62fdb6ddb8
F20110217_AACGFT arulselvan_a_Page_15.tif
e108ca3cc11bf1e450bb1f5be11982ae
f2b12ef7d53acbbcdb00ae86fc38c5fdcbbb0b74
71794 F20110217_AACFYW arulselvan_a_Page_25.pro
562c513d0c7c2d1f0027c9e49269cbd8
874c2cda0e601dfc86484f36f0ab4187d1a22b85
21707 F20110217_AACGAW arulselvan_a_Page_01.jpg
6a061f0e23f78feea8be4939212b0d38
b07c015b17f8dc0ef57e68d6076c679807e31024
F20110217_AACGFU arulselvan_a_Page_16.tif
821b3ad026d1bdd1bf17996833fd3e8a
72a559c21e1329a7446deb2aa78ae21c72ea71dd
7879 F20110217_AACFYX arulselvan_a_Page_29thm.jpg
74cfb67ddad420bb24f7612921caec3f
386fdbfc4f9e2f081b14343ca57dc9b870e17779
781472 F20110217_AACGAX arulselvan_a_Page_31.jp2
ba200ac83a62a4eabe379f7b91fa26fe
c48901aec5e2eda416eca8432eef11942cac8cb4
F20110217_AACGFV arulselvan_a_Page_17.tif
b60ebd62d96d8512ce9d513c841c4c81
0c95812cad63d3a92c09ba8f2bef71db8d3fe1a7
8083 F20110217_AACFYY arulselvan_a_Page_14thm.jpg
24eeaf649c68334d1545dedca4c2ab22
c5f2b4c73b8233322dc95b77365c5b1a23127094
44624 F20110217_AACGAY arulselvan_a_Page_39.pro
4620a14f934b549e28340c183d42f6ab
1e47991b2e7dd5ece2f5c35741db8bd7aca5c715
F20110217_AACGFW arulselvan_a_Page_18.tif
65d0f726b291b287387db5edded4e8de
05b6006d18046ceb1389c3458c0eec3bfb8481de
4821 F20110217_AACFYZ arulselvan_a_Page_02.jpg
90bf4cab9bb4e2cff393a5467e3733f8
0295a2c38db15377a5113d936238255ce43c20f2
269616 F20110217_AACGAZ arulselvan_a_Page_03.jp2
6ca7193203eb13591a708c68fda10706
e1b6cf7288c247425826327483c284b8b310d461
F20110217_AACGFX arulselvan_a_Page_21.tif
427f67c26fd672e959f1b6417dbcc923
8fd9ba3808b4c85956013052cffd5fb5d97a5bd1
F20110217_AACGDA arulselvan_a_Page_40.tif
4564b5420d77daeedb5f0b7df6be90a4
7e17896897d58c166d8e0a70201664391e08004b
F20110217_AACGFY arulselvan_a_Page_23.tif
986bfdcd8a9b1ea81d9178d22ef98be8
52e41652ff107a1e77b742a958ea15f1db00cf39
1051975 F20110217_AACGDB arulselvan_a_Page_17.jp2
7bb4def2d8224dc364f67fd1b4982d5a
ac94de137ba81409c2a440336c275f4b0734e355
F20110217_AACGFZ arulselvan_a_Page_26.tif
97831a7b68a9d5fdf6a1741f617bf39c
849d8633898a2ac53f04db1453a4b719588a5a02
399 F20110217_AACGDC arulselvan_a_Page_35.txt
15158c8b524c7620800a79baea639da3
c3fff03d30cb6f9f384a0217ef2d5c2ad209ac6d
1051986 F20110217_AACGDD arulselvan_a_Page_26.jp2
7d90568ce221efd780623b757c3d14cf
e43a788d30061ec68ed1e00d4d6cb0f177ffd60e
98521 F20110217_AACGIA arulselvan_a_Page_29.jpg
bfaefa27c2bdb972687a3cf6a8bb9539
879b916098baa7023457181d7eb2d191197b9016
F20110217_AACGDE arulselvan_a_Page_24.tif
429da32f3804f6057b0663f03418a524
c9ea392f1a369bd8360b24311addaf069709a3ce
80202 F20110217_AACGIB arulselvan_a_Page_30.jpg
25499e1cd6c55dba3a63c49219fddb9b
1d509319288415305b339e6ac56dde375871ce90
12967 F20110217_AACGDF arulselvan_a_Page_06.QC.jpg
56ac5f3fb3a85604187f0d743081da27
dd89d5d9e5df086585c1fa345b7ddc865e8e95a8
73453 F20110217_AACGIC arulselvan_a_Page_31.jpg
8451d69ef10f7fc3fb7e14ea92910ca0
1485b7b916b06ddf71ea71f43655896e572ea1c0
88226 F20110217_AACGID arulselvan_a_Page_32.jpg
d3c793d6f004488f9bae1f42c02bde8e
dc1dc326dee358472b8f343214bbb8d5808c417b
49300 F20110217_AACGDG arulselvan_a_Page_19.pro
c7b336a205794328a89b98625ed63349
d2671b30570f1e04f6feac8f8af18b0263cd0c32
29123 F20110217_AACGIE arulselvan_a_Page_34.jpg
0ccb07b86f856e148e4880e7ae513c7d
8c3fc715b6c5d23cfc9056db77b6036d7cc85268
29009 F20110217_AACGDH arulselvan_a_Page_28.QC.jpg
a91e37c834313c74a66dde112855544d
abd0d89d48ce93c6e8f42031dd43a23b440e7150
21165 F20110217_AACGIF arulselvan_a_Page_35.jpg
8cb7892d56e52af3b587c66e6e36ae76
10245e8068ec1acf8198ab37f100d96d34168135
8139 F20110217_AACGDI arulselvan_a_Page_20thm.jpg
b996d3113fae635c49a70e7c8737bd3a
c8855d36ada515916e57eb144da95083f7d607c3
93488 F20110217_AACGIG arulselvan_a_Page_37.jpg
d4a4e89fc2f8e9906d44a779b2d4db3a
3001c85240fa428f0efd386e77bcf8f1cea72709
41820 F20110217_AACGDJ arulselvan_a_Page_32.pro
a85dbe576f434e6ba3a86a0cee39a6bc
cb021ec29d954adc596be47170121d7024d4f1f7
29186 F20110217_AACGIH arulselvan_a_Page_37.QC.jpg
2cbbb4fde6d2e181301ddaf030786514
5ace1759a6dca46db55f3da37676f9223a98446a
911741 F20110217_AACGDK arulselvan_a_Page_07.jp2
757cb405e3d84cb0d90ea5812c16d122
ea5ea75b7d8d3c5a2626bc7be0e57ae6edc4d996
34951 F20110217_AACGII arulselvan_a_Page_38.QC.jpg
ba9cdb4acd99ffff4ad06e1d88751e15
304d2ac1612014e26a4711ea3f3567bbe66ebcf6
103905 F20110217_AACGDL arulselvan_a_Page_21.jpg
f0db618c881a4f46f38c69b87bbbd6de
faeea7ba2eb92e38ed936f93c8f180bd91eaea52
5103 F20110217_AACGIJ arulselvan_a_Page_40.QC.jpg
e5d8d4eaf51d4a29025a6ff19dec6b7c
697ade4f345a18905e8e91feb14b3cc079501e71
206444 F20110217_AACGIK arulselvan_a_Page_01.jp2
f5894e98e48afceca65dafdf5d6ccb07
b4dee20c9c97374f9b2b896528f9d038b40713b4
F20110217_AACGDM arulselvan_a_Page_08.tif
76e3e5fc841baf266854f0752b5b1020
599187b5e609e65312c373632db199fee799e1f7
129725 F20110217_AACGIL arulselvan_a_Page_08.jp2
3783d0e12baceeae627a3b492db709cc
eb72d4b249ab8a80ed7db82b09ed451c90a51cf9
31021 F20110217_AACGDN arulselvan_a_Page_13.QC.jpg
ebada50ca11f19a120de84357bcca1ca
f32e19a25ee8c6214c449aaba55e3d1038d2e29c
647322 F20110217_AACGIM arulselvan_a_Page_09.jp2
0b2a4ea6e50abe54641965384fb25f13
4b560040f9e9a541f9a5d140d22337a1bfb54566
3654 F20110217_AACGDO arulselvan_a_Page_34thm.jpg
4b2cf671871adfcc1fa72adaf7e4c047
134698b6bcc511591287c233f8bf432c8efe6a90
912682 F20110217_AACGIN arulselvan_a_Page_10.jp2
ef9a65fc341a044a5ee73a00d3537f22
6c9c64dfd362b0b0d7f17507760e08671b28ceb7
8167 F20110217_AACGDP arulselvan_a_Page_05.QC.jpg
0885b911f0eb2cde3617f666dd930414
504d93a72c555887b4776908e588b72bc708a678
876805 F20110217_AACGIO arulselvan_a_Page_12.jp2
455f2d2837a2d7bd19aa86a48f612beb
f8c38ea7b1695821b0b6a67f349ebf31200fe7a3
33365 F20110217_AACGDQ arulselvan_a_Page_19.QC.jpg
4a25bd9b5dfed0b9d6d49087fbeed176
6336532edf0d385ad21656b3a9f412c6e44e2484
1051977 F20110217_AACGIP arulselvan_a_Page_15.jp2
636d4bbb7d2a8328db72a5d9d013701b
1680482e77d3b78bd15e12eabacfcbfe13aefe9a
27301 F20110217_AACGDR arulselvan_a_Page_10.QC.jpg
6432a1ce645e83b521192a228c583b06
58537d65c270431a83f25ef40ecaf53eaaed5869
1051983 F20110217_AACGIQ arulselvan_a_Page_16.jp2
5ac9ff253ee8554995c1e2f8887ea2b2
377e8cf4a8be6136d9e313d1d76a63a19dba24ad
1051980 F20110217_AACGDS arulselvan_a_Page_38.jp2
92e79a1c179a7d71f48f534edd674fc8
bd615dfc6f05b4dd3c10d9b3c93898589d1c42b2
7622 F20110217_AACGDT arulselvan_a_Page_35.QC.jpg
2780f3aa8a03847c1a725583ebae75fd
d793b7bce9931b7edb7bb008b18f262d07cf4322
1051942 F20110217_AACGIR arulselvan_a_Page_18.jp2
20a6b87dc745eb21cfc167df52fd2761
3a8fd9a857b896aec87a3f4b89fb6e1e35708360
8339 F20110217_AACGDU arulselvan_a_Page_15thm.jpg
0b624a570e20d6047451dd9d47723002
7de225ebabb296d81f63a3c5424c73c0e6381e04
1051945 F20110217_AACGIS arulselvan_a_Page_19.jp2
681ad72d07899d46a87f52536c7dd22f
c7e99c31a5f8024c38eda67f0f78796c45817c41
721761 F20110217_AACGDV arulselvan_a_Page_04.jp2
1d619e3917da7ab6f5b9c7eb066ca807
28b471e713c2d9b352d4161515c58bf859c61922
1051960 F20110217_AACGIT arulselvan_a_Page_20.jp2
248af91dc368cea60756e113dda70308
376602b8052e40a4801804b174afa27edd83e007
269190 F20110217_AACGDW arulselvan_a_Page_05.jp2
9f292cd61f19d66c71cb9ac5257f8445
0964e7d1f378412899c7ce2b1d3f69a1848a82c1
1051976 F20110217_AACGIU arulselvan_a_Page_21.jp2
1ffc9716e184f486b26acc20035024c7
c2f00f735150ced5d1df86068d66f3a8000f4ac9
116769 F20110217_AACGDX arulselvan_a_Page_38.jpg
f18802e064f9d67c353fad896f52a9f7
f0589897c476fb3c603271d0d7b69156f28d6bfc
1051938 F20110217_AACGIV arulselvan_a_Page_23.jp2
b4c1c7f3d05953f7017dafdd310a7da9
c9ae35d55dd0c98f4f01e1325b4e52bc2d1c803a
1144 F20110217_AACFZA arulselvan_a_Page_06.txt
57fdc04653f2e2f804f1e11965199d50
9e53fab8fd0128dc7ff7b8eacb045c4f65acc65a
19144 F20110217_AACGBA arulselvan_a_Page_09.QC.jpg
f58938081bcfba1b063694ea022b2041
009825639a468105ffe6785bc4daddcb74057f1a
229 F20110217_AACGDY arulselvan_a_Page_08.txt
de627576e090e22e9f59b25e110e456f
e26abbebe22b10f940fcab42f220bdaad4a8f732
1051964 F20110217_AACGIW arulselvan_a_Page_25.jp2
dd47e83de662548be174f7614f512665
b793081d827adab8a0f070d5b3c9284ff38110aa
47203 F20110217_AACFZB arulselvan_a_Page_14.pro
25dfb0c8249487638b5212f9efe26a8a
880ae73dc26a17aeea31c0af55b6d323a3266491
133010 F20110217_AACGBB arulselvan_a_Page_40.jp2
24968da5769c4d0d67e9305a4bbbd5c4
e7a8083c5cdaaee94e95cb868340598468c449f1
33844 F20110217_AACGDZ arulselvan_a_Page_18.QC.jpg
94c02bef8e171a29532b3942451d6338
ea1c41c7ba2e1b8124a990f18bd8350be20fb28a
990090 F20110217_AACGIX arulselvan_a_Page_28.jp2
3d727ef3d0c5b56fe0038dfd1b719e2b
dca88d930d23a68d9550918f4107916c1a6c5f81
48919 F20110217_AACFZC arulselvan_a_Page_29.pro
16de760cf78464913b759bbe732f2202
f352a29546820551231c18e2eef0833ce2782928
2674 F20110217_AACGBC arulselvan_a_Page_04.txt
7d15ada4117628c78711c6344188f01d
96c7b550fa6baf0d8cdb168bad12a0488a68e97a
F20110217_AACGGA arulselvan_a_Page_28.tif
187f14071fc36314b226c86ea9d3118c
c43ab980b45e3e783375ebede3b87c6c2b122c4b
1051953 F20110217_AACGIY arulselvan_a_Page_29.jp2
0f2b735391cbd29357bbb0cc81dbaa2b
0342091393b7c3fc853fbc0cb6e31c5c95be1840
50303 F20110217_AACGBD arulselvan_a_Page_20.pro
7a028198dea639f5353748b95e2e58bd
9cb93fb7eaef65c1d96187fc4de8347567aca2a5
F20110217_AACGGB arulselvan_a_Page_33.tif
7f771afe8d8758149a0a64d9fa70228d
f0cc0b2182ef6d8c1b4461d01a56892cb59c1c3a
861056 F20110217_AACGIZ arulselvan_a_Page_30.jp2
64ff037ca093ffdf945049648fe88e10
08159c7ced95e12cb9676659caafb2eb8f88ee22
30820 F20110217_AACFZD arulselvan_a_Page_36.pro
85a736a34022dbcf6e25c0a684b97203
94b8215a0f7264f2cec52c1a8206d90a3ca50ee5
282119 F20110217_AACGBE arulselvan_a_Page_27.jp2
1c07a2f369e3812ee9c86649ecc00457
841d27a2116ad997b7bbe73c4484d0f4c2bb9a35
434021 F20110217_AACFZE arulselvan_a_Page_06.jp2
fa6d94cd5c9f369e46993449cb29b795
f5d36124af610ba07c8458bf210e81974f1c2c06
F20110217_AACGBF arulselvan_a_Page_01.tif
f6d6ea6d8c13acdd4f5c70e2d9b213b7
916fee8613e9e47ce72feefbba86e3137ee215ca
F20110217_AACGGC arulselvan_a_Page_34.tif
4f0253f620a36ac6df408721342968c7
f910d655d29f5fbb541ff4f44cf8878af5e65032
37306 F20110217_AACFZF arulselvan_a_Page_23.QC.jpg
ea4ed96b0a1573a9dc0978020ac7b78a
fa853defdea7b0a354a369fde1d12e1fcbb35375
99249 F20110217_AACGBG arulselvan_a_Page_16.jpg
892df1169dca1a19402c554a2f141605
fff28139a7449825b1d375d88f47796638313461
F20110217_AACGGD arulselvan_a_Page_38.tif
200336e214034bc2e36b276bf2aae1df
e0549bd2fea8843575f48bbc97740688852d222b
1786 F20110217_AACFZG arulselvan_a_Page_11.txt
bc6eb58d90c277b4570e49e69604d982
35d8f154f6e9ad122df629cddfbf3e5ae73b0b9b
80985 F20110217_AACGBH arulselvan_a_Page_22.jpg
d23df065280e930ae3e33bf046535b57
38f83b551e97702e80f2c54613a7d6a974f6fe72
115 F20110217_AACGGE arulselvan_a_Page_02.txt
3c1750a1c0da73e18f8a414b9fc0b74d
e7412f8479915b63c0a2425b8d6f4afbb3819ce5
6502 F20110217_AACFZH arulselvan_a_Page_35.pro
d63df66cdfcd5f95088ecc5b81a98b50
c073b76a32357e1c884d56c9c8ae8a4dfb7e219e
14449 F20110217_AACGBI arulselvan_a_Page_08.jpg
486b11607f935896badf70de55fe1609
ac575c2da081cf066fd1bfbff1416989f2a25aa5
1863 F20110217_AACGGF arulselvan_a_Page_13.txt
a441d64fbeb2c3b149ee74dfd88ec2a8
3f7428552f8d50ede6af1ee291e116d03152a42d
82284 F20110217_AACFZI arulselvan_a_Page_12.jpg
5f2641bee36651cae2d6ed0e4eb881ab
74386a2dce59f71e8ddac775c7773554b2745708
29315 F20110217_AACGBJ arulselvan_a_Page_11.QC.jpg
652fbe65a27f6c67185c43216643bf9a
7a9f673d0b254987920a328e66776bff37565dcc
1938 F20110217_AACGGG arulselvan_a_Page_15.txt
f816ce712902dc2c98b7f618644d81c1
a6c0dcb433cfa84dfca6732d371cbd46042e51b5
243397 F20110217_AACFZJ arulselvan_a.pdf
e0df874389422f38fc09b51161416516
77d90e08e64d2165b3bc2816af48d5341c5d0090
2038 F20110217_AACGGH arulselvan_a_Page_17.txt
be633ed712f8531220f17addbe504ebc
c1dbba0f4ffbdf786d4649f47437881657f5654c
8151 F20110217_AACFZK arulselvan_a_Page_18thm.jpg
fc832a77fbbdea06a316612b69c3a299
d7f204a95ecf970018f80c8ba6ecfe59c50ad5f8
47852 F20110217_AACGBK arulselvan_a_Page_15.pro
f23e33010ddf86607574f799a9d07d27
7646b731b8bb0ec838028bbae080a71436a991e8
2047 F20110217_AACGGI arulselvan_a_Page_18.txt
3eff84cd6006e723ab0feca7ecb096d2
f0ce7cca03e3e2f0fb915f871aa3c2bd1bef4fa8
64605 F20110217_AACFZL arulselvan_a_Page_23.pro
b73e0945fcd96fa46e48ef501d87461e
dcb32cd05bb85db61736d324efc4d7ab797db7b8
F20110217_AACGBL arulselvan_a_Page_35.tif
d527eb4c276fe2b736975952d62c3990
62487a3ad9cad7ee23a18eb5843482af25a145c2
1991 F20110217_AACGGJ arulselvan_a_Page_19.txt
0dad00942a3e860f391befc32a06f3b7
4ba1ec116760e7d3428f8f5e5abb95d17c5b691c
14401 F20110217_AACFZM arulselvan_a_Page_33.QC.jpg
00400db2aac0656c1e0c368fe72586dc
ce5938a057ef9b1bfb5a3bfa297928a19c4eb2f7
8287 F20110217_AACGBM arulselvan_a_Page_23thm.jpg
1020fc92ca159d04553a73852c8573d0
51f0252d0956c09569adfcc719d1af842974ea74
1989 F20110217_AACGGK arulselvan_a_Page_20.txt
1a586312bf0be01f5ac92e9d41ebb5b7
a05fa107e03260d7731237dd003194ffabe5dbea
52213 F20110217_AACFZN arulselvan_a_Page_21.pro
6948933d0b564409ebd2f127f9dae389
d46cdffda5c16705bfed5acf63b86f4a1b5e04d6
7403 F20110217_AACGBN arulselvan_a_Page_39thm.jpg
82c169e69de21e8949f5d6783010a432
afa1e8fd549740a974304a1a72dcac881be1c831
2061 F20110217_AACGGL arulselvan_a_Page_21.txt
35f5aa2f29fb83bc4bacff40295dc0ce
ba5de10111f89713a939d0ed854ee5e8fc58ccdd
2561 F20110217_AACFZO arulselvan_a_Page_03thm.jpg
b182d5aff15e7a3cc0d021fd9d4cd23d
0711e8b67910dc50fa4268d40db0f04f333c2e9d
94792 F20110217_AACGBO arulselvan_a_Page_39.jpg
218b8c0d477c5e8149cad62ed014d059
265ad2ce3a35456d39cd68f903c95a8ce2ce0ec0
1205 F20110217_AACGGM arulselvan_a_Page_24.txt
2485716bd982b29ece2932f81cddea9b
ec4ec78262e057738013f5fe717c436b8ca36152
F20110217_AACFZP arulselvan_a_Page_19.tif
dec329676e7537a55bf79906e2cfc418
10d220004b77e4e1f35339dffecd4991afa96c94
8465 F20110217_AACGBP arulselvan_a_Page_17thm.jpg
30051d9375efb1d14583cd73bee3bdaa
c1cc42b6b112d6477cd086a89caac1c726d5e54c
1858 F20110217_AACGGN arulselvan_a_Page_28.txt
e66a92ff8614d7facf9dd5ef3be2bae6
c6225c04998585bb45557750be0dbc870af7455b
2940 F20110217_AACFZQ arulselvan_a_Page_26.txt
c4a2cff065b0b76235901e99364da166
62dda12b1a0001f74b8565b094c312243f04d5e5
1225 F20110217_AACGBQ arulselvan_a_Page_09.txt
27f32e7c1f1f159aa88629f229a361ab
6b0a1887900b8e135340b6c20de2a69777d0eee5
1664 F20110217_AACGGO arulselvan_a_Page_31.txt
926aa0535fce8324df8b7d50b10bf4df
a87d36edea423b3e7fcedce8a941574d9ef723ce
1464 F20110217_AACFZR arulselvan_a_Page_08thm.jpg
7dbda7e45f9a445197fb6d660e58c461
0af5e837ce5d54661f8c0747a93a4b7a21fe41b1
84515 F20110217_AACGBR arulselvan_a_Page_10.jpg
a09b748c18a59a449495198ff6a394ea
747f8986e61007235a322de9cfe58b7e773c0039
27786 F20110217_AACFZS arulselvan_a_Page_24.pro
fcbf8068fea2b242c208a496c2306187
b7deb3cd2862f9fa92dc070510efd8bf05d7301b
102636 F20110217_AACGBS arulselvan_a_Page_17.jpg
eaa3c0153dba35fe18d069a6327ad37b
27cfd5c5834c842b30771040925443ec029b73c4
1864 F20110217_AACGGP arulselvan_a_Page_32.txt
a3821ba79dbca3a751535227fd405ff1
b5a1a6f6858a077731469f20fefa718cc77e2e3d
18359 F20110217_AACFZT arulselvan_a_Page_33.pro
71974f73a17c9bd8e86db687e32ad303
0e2adbfb1bdc925532751ac0e58257bb35dcb38c
1775 F20110217_AACGBT arulselvan_a_Page_07.txt
889f7db646a158a386d09487f1669974
fb62b41718028e3903be22efbcc602aecdb35871
966 F20110217_AACGGQ arulselvan_a_Page_34.txt
e37d1d26b2871010d12af37be8f0c9d9
a1fe67a10d48ef5c1979be5208009fa7b74315e8
153876 F20110217_AACFZU arulselvan_a_Page_25.jpg
9b292e2bb89428e27f62d85df539fd13
296f80e04c7b1a7524447e5da378ef1973c8dc8e
1906 F20110217_AACGBU arulselvan_a_Page_14.txt
293868dcdb37cb36b2a76692b2c52a48
ea8917cfd130102b561de9a91c82e0cb4931f732
1998 F20110217_AACGGR arulselvan_a_Page_37.txt
77f62e60e2faed317c6c855f97380967
4eab8c0d21c9913cb088c1fd6e01112e8f48fc9d


Permanent Link: http://ufdc.ufl.edu/UFE0014925/00001

Material Information

Title: Complex Network Assortment and Modeling
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0014925:00001

COMPLEX NETWORK ASSORTMENT AND MODELING


By

ASHWIN ARULSELVAN













A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2006

































Copyright 2006

by

Ashwin Arulselvan















ACKNOWLEDGMENTS

I would like to express my gratitude to my advisor Dr. Panos M. Pardalos for all the valuable guidance and immense support he gave me while doing this thesis. I am really thankful to him.

I would also like to thank Dr. J. Cole Smith, member of my committee, for his remarks, criticisms and advice for improving the quality of the thesis presented in every possible way. I also thank my family for their moral support.
















TABLE OF CONTENTS



ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION

2 IDENTIFYING CONNECTED COMPONENTS IN THE MARKET GRAPH

Introduction
Preliminaries of Graph Theory
Motivation and Techniques for Finding Connected Component in the Market Graph
Structure of Connected Components in the Market Graph
Size of Connected Components in the Market Graph in the Context of Power-Law Model
Structure of Connected Components in the Market Graph
Concluding Remarks

3 EVOLUTION OF SOCIAL NETWORK

Introduction
Model
Performance Analyses of the Model
Results
Conclusions

4 CONCLUSIONS

LIST OF REFERENCES

BIOGRAPHICAL SKETCH
















LIST OF TABLES


Table

2-1 Arrangement of stocks into groups for the market graph with threshold of 0.5

2-2 Dates and mean correlations corresponding to each 500-day shift

2-3 Stocks contained in largest size group for eleven time periods (1 being the oldest period, and 11 being the most recent)

3-1 Clustering coefficient, assortative mixing coefficient and Average length of the giant component















LIST OF FIGURES

Figure

2-1 Largest group size by time period (A corresponds to the threshold value of 0.7, B corresponds to the threshold of 0.6, and finally, C pertains to the market graph with threshold 0.5)

3-1 Degree distribution for n = 500 nodes (R-square = 0.9056)

3-2 Degree distribution for n = 800 nodes (R-square = 0.9563)

3-3 Degree distribution for n = 1000 nodes (R-square = 0.9535)

3-4 Degree distribution for n = 1200 nodes (R-square = 0.9511)

3-5 Degree distribution for n = 1500 nodes (R-square = 0.9549)

3-6 Degree distribution for n = 1700 nodes (R-square = 0.9535)

3-7 Degree distribution for n = 2000 nodes (R-square = 0.9541)















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

COMPLEX NETWORK ASSORTMENT AND MODELING

By

Ashwin Arulselvan

August 2006

Chair: Panos M. Pardalos
Major Department: Industrial and Systems Engineering

Most real-world networks have been observed to follow the power-law model and to

possess highly clustered subgraphs and small diameters. Financial and social networks are

no exception to these observations. In this thesis, we consider a recently introduced

network-based representation of the U.S. stock market, which follows the power-law

model. We propose a computationally efficient technique for identifying clusters of

similar stocks in the market by partitioning the market graph into a set of connected

components. It turns out that these groups have a specific structure, in which each cluster

corresponds to certain industrial segments. Moreover, the size of these connected

components is consistent with the theoretical properties of the power-law model. We then

present a model that simulates the growth of a social network over time by considering

weights for relationship strength and identities, features attributed to the individuals

represented as nodes in the network that help in their hierarchical classification. Other

factors that influence the evolution include the mutual acquaintances between a pair of

nodes and the time of the last acquaintance between them. Our simulation

resulted in a model having many interesting features that are desired in real-world

networks, including high clustering and assortative mixing coefficients, a scale-free

distribution and the small-world phenomenon.














CHAPTER 1
INTRODUCTION

Complex networks have attracted a lot of attention in recent years because many real-world

networks share the same intricate features; studying these features is mathematically

interesting and helps us understand the networks better. Complex networks differ

considerably from the random graph model presented by Erdős and Rényi [1]. For

instance, random graphs follow a Poisson degree distribution, while complex networks

follow a power-law distribution [2-3]. A power-law distribution is scale free, and for this

reason complex networks are also referred to as scale-free networks. These networks are

highly clustered compared to random graphs [4]. They have also been observed to exhibit

the small-world phenomenon [4-6].

We take two such networks for the study presented in this thesis. In

Chapter 2, after a brief introduction to finance graphs, we present an efficient technique

to assort assets into highly correlated groups. In Chapter 3, we suggest a statistical model

that simulates the growth of a social network and summarize the precision of the model.














CHAPTER 2
IDENTIFYING CONNECTED COMPONENTS IN THE MARKET GRAPH

We consider a recently introduced network-based representation of the U.S. stock

market referred to as the market graph, which has been shown to follow the power-law

model. We propose a computationally efficient technique for identifying clusters of

similar stocks in the market by partitioning the market graph into a set of connected

components. It turns out that these groups have specific structure, where each cluster

corresponds to certain industrial segments. Moreover, the size of these connected

components is consistent with the theoretical properties of the power-law model.

Introduction

Given the huge amount of data generated by the stock market on

a daily basis, the importance of discovering efficient ways to represent and analyze these

data becomes apparent. Stock market data are generally illustrated by plots

displaying the price of a certain stock during various time periods. Nevertheless, as the

number of stocks increases, the task of analyzing the information contained in the plots

becomes more and more complicated.

In our study we adopted an alternative approach to explore the stock market data.

Specifically, we applied a recently developed technique of representing the stock market

prices over time in the form of a network with the stocks as the nodes and the edges

induced by the relations between the prices of two different stocks. This network is called

the market graph [7-9].









It is worthwhile to mention that the above approach to representing massive

datasets is widely used in many different areas such as social sciences, finance, genomics,

and protein folding [10-15]. This methodology can be applied to interpret large datasets

arising in various applications as a graph, where the elements of the dataset are the

vertices, and the relationships between those elements are represented by the edges of the

graph. In many cases, such network representations prove to be extremely useful and

convenient for the information analysis and elucidation of the hidden dependencies in the

data.

A network representation of the stock market data is derived from the cross-

correlation of price fluctuations over a certain time period. We construct the market graph

as follows: each node in the graph corresponds to a particular stock, and two nodes are

connected by an edge if the price correlation coefficient for the pair of associated stocks

(computed over a specific period of time) exceeds a given threshold.

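Let us now describe the procedure for constructing the market graph. Denote the

price of the financial instrument $i$ on day $t$ by $P_i(t)$. The logarithm of return on the asset $i$

over the one-day period from $t-1$ to $t$ is given in equation 2.1.

$R_i(t) = \ln \frac{P_i(t)}{P_i(t-1)}$ (2.1)

Then the correlation coefficient between instruments $i$ and $j$ can be computed as

shown in equation 2.2.

$C_{ij} = \frac{\langle R_i R_j \rangle - \langle R_i \rangle \langle R_j \rangle}{\sqrt{\langle R_i^2 - \langle R_i \rangle^2 \rangle \langle R_j^2 - \langle R_j \rangle^2 \rangle}}$ (2.2)

In equation 2.2, $\langle R_i \rangle$ denotes the average logarithm return of the asset $i$ over the $N$-day

period [16] and is given in equation 2.3.

$\langle R_i \rangle = \frac{1}{N} \sum_{t=1}^{N} R_i(t)$ (2.3)

Fix a threshold $\theta$ [11]. For each pair of stocks with $C_{ij} > \theta$, we add an edge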

between nodes i and j of the graph. This indicates that the two stocks display a similar

behavior over time. In particular, the degree of similarity is determined by the prescribed

value of the threshold. From the above it follows that the analysis of the patterns

exhibited by the market graph can provide some useful insights into the inner structure of

the stock market.
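
The construction above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in this study: the function names (`log_returns`, `correlation`, `market_graph`) and the dictionary-of-price-lists input are hypothetical, and the correlation is taken in the usual Pearson form built from averages over the sample period.

```python
import math


def log_returns(prices):
    """One-day logarithmic returns R(t) = ln(P(t) / P(t-1))."""
    return [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]


def correlation(x, y):
    """Correlation coefficient of two equal-length return series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)


def market_graph(price_series, threshold):
    """Adjacency sets: stocks i and j are joined when C_ij > threshold.

    `price_series` maps a (hypothetical) ticker to its list of daily prices.
    """
    names = list(price_series)
    returns = {name: log_returns(price_series[name]) for name in names}
    graph = {name: set() for name in names}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            i, j = names[a], names[b]
            if correlation(returns[i], returns[j]) > threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph
```

The pairwise loop evaluates every pair of stocks once, so the cost grows quadratically with the number of stocks; a series with constant returns would make the denominator zero and is assumed not to occur here.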

Interestingly, the previous study indicated that the degree distribution of the market

graph can be described by the power-law model [11]. We say that a vertex has degree k if

there are k edges incident to it. In accordance with the power-law model, the probability

of a vertex having a degree k is given in equation 2.4.

$P(k) \propto k^{-\gamma}$, (2.4)

or, equivalently,

$\log P(k) \propto -\gamma \log k$. (2.5)

Equation 2.5 implies that the degree distribution plotted in the logarithmic scale

reproduces a straight line. Furthermore, the degree distribution of the graph is a key

characteristic that describes a real-life dataset corresponding to this graph. It reveals the

large-scale pattern of connections in the graph, which displays the global properties of the

dataset this graph represents.
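
As a rough illustration of the straight-line behavior on the logarithmic scale, the sketch below computes the empirical degree distribution of a graph and estimates the slope of log P(k) against log k by ordinary least squares. The function names and the adjacency-set representation are assumptions made for this example, and a least-squares fit is only one simple way to estimate the exponent; it is not claimed to be the fitting procedure used in the cited studies.

```python
import math
from collections import Counter


def degree_distribution(graph):
    """Map each degree k to the fraction of vertices having that degree."""
    counts = Counter(len(neighbors) for neighbors in graph.values())
    n = len(graph)
    return {k: c / n for k, c in counts.items()}


def loglog_slope(distribution):
    """Least-squares slope of log P(k) versus log k (vertices with k = 0 are skipped)."""
    points = [(math.log(k), math.log(p)) for k, p in distribution.items() if k > 0]
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx
```

For a distribution that follows the power law exactly, the returned slope is the negative of the exponent in the power law.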

Remarkably, aside from the stock market data, the power-law model can be

observed in many other practical areas including, but not limited to biological networks,

computer networks, and social networks [17-21]. This interesting discovery led to an









introduction of the concept of the so-called "self-organized networks." Moreover, it

turned out that this phenomenon can also be found in finance.

In the previous studies the authors came up with a novel idea to relate certain

correlation-induced characteristics of the stock market prices with some combinatorial

properties of the correspondent market graph [7-9]. Specifically, the problem of the stock

arrangement into groups of highly correlated assets was considered. This problem was

solved by utilizing simple algorithms for finding cliques / independent sets in the market

graphs resulted from different threshold values.

Our present study takes a different approach to analyzing the structure of the

market graph. Specifically, in this paper we utilized the method for arranging

closely related stocks into certain groups based on the maximum weighted path cover of the

market graph. Although in general the maximum weighted path cover problem is known

to be NP-hard [22], we can find the connected components of the market graph

in polynomial time.

Interestingly, this study has shown that there is a certain degree of similarity in the

market graph configuration obtained by our present method with the one obtained by the

application of the cliques problem. The paper also gives interesting insights into

relationships between different industries derived from the market graph structure.


Preliminaries of Graph Theory

Let us first introduce some basic definitions and notation from graph theory

that are used in the paper. Later we will give the appropriate interpretation of the

introduced concepts in application to data mining.

Let $G = (V, E)$ denote an undirected random graph with the set of nodes $V$, $|V| = n$,

and the set of undirected arcs $E = \{e = (i, j) : i, j \in V\}$.

We say that the graph G is connected if one can find an undirected path between

every two nodes of V. A graph that does not satisfy the aforementioned property is called

disconnected. Every disconnected graph can be decomposed into a number of connected

subgraphs. Such subgraphs are called the connected components of the graph.

For any subset of nodes $V_1 \subseteq V$, let $G(V_1)$ denote the subgraph of $G$

induced by $V_1$. A subset of nodes $C$ is said to be a clique if the induced subgraph $G(C)$ is

complete, i.e., $G(C)$ contains all possible arcs. The problem of finding the largest clique in

the graph is known as the maximum clique problem. It was proven that the maximum

clique problem is NP-hard [22]. Also, many cases of this problem are difficult to

approximate. Arora and Safra [23] showed that for some $\epsilon > 0$ the approximation of the

maximum clique problem within a factor of $n^{\epsilon}$ is NP-hard.

A path in a graph represents an alternating sequence of vertices and edges such that

from each of its vertices there is an arc to the descendant vertex. It is also assumed that a

path has no cycles. A weighted graph associates a value (weight or cost) with every edge

in the graph. Clearly, the market graph can be viewed as a weighted graph with the

cross-correlation between two stocks as the arc weight. The sum of the weights of the

traversed edges in a weighted graph is called the weight of a path.

A path cover of the graph G is a set of vertex-disjoint paths that together cover the

vertices of G. In a weighted graph, a path cover of the maximum weight is referred to as

the maximum weight path cover. The maximum weight path covering problem (MWCP)

is the problem of finding a maximum weight path cover of a given graph.

This problem has been shown to be NP-hard [24].

Motivation and Techniques for Finding Connected Component in the Market
Graph

One can easily see from the construction of the market graph that two particular

stocks are closely related if their correspondent nodes in the graph are connected by an

edge. Consequently, we can deduce that there must be a certain degree of association

between two specific stocks if their nodes in the market graph are connected by a path.

Moreover, all the stocks represented by the nodes in the path can be combined together in

a distinct group of interdependent stocks.

In this view, it seems very natural to consider the MWCP for the market graph. In

fact, finding the collection of vertex-disjoint paths that has the maximum possible weight

can be perceived as the "best" arrangement of closely dependent stocks into the separate

groups.

In order to solve the MWCP, we applied a greedy algorithm proposed by Liao et

al. [24]. The method is analogous to Kruskal's maximum spanning tree algorithm [25].

Precisely, Liao's algorithm iteratively selects the edge with the greatest weight to be

added to the path, while preserving the properties of the path. In the case when two or

more arcs are eligible to enter (i.e., they all have the same weight), the algorithm non-









deterministically selects only one of them. The procedure terminates when either no

additional edges can be added to the solution without violating the path properties, or

there are no more edges.
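
The greedy selection described above can be sketched as follows. This is a minimal illustration in the spirit of that description, not a reproduction of the algorithm of Liao et al.: the function name `greedy_path_cover`, the edge-triple input format, and the use of a union-find structure to detect cycles are assumptions of this sketch, and ties are broken by input order rather than non-deterministically.

```python
def greedy_path_cover(vertices, weighted_edges):
    """Greedily pick heavy edges while every component remains a simple path.

    `weighted_edges` is a list of (u, v, weight) triples. An edge is accepted
    only if both endpoints still have degree below 2 and lie in different
    components, so no vertex branches and no cycle is closed.
    """
    parent = {v: v for v in vertices}
    degree = {v: 0 for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    chosen = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        if degree[u] < 2 and degree[v] < 2 and find(u) != find(v):
            parent[find(u)] = find(v)
            degree[u] += 1
            degree[v] += 1
            chosen.append((u, v, w))
    return chosen
```

The accepted edges form vertex-disjoint paths; vertices that receive no edge are single-vertex paths of the cover. Because the selection is greedy, the result is a heuristic solution and is not guaranteed to have maximum weight.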

The solution of the maximum weighted path covering problem for a constructed

market graph showed no significant reduction in the number of stock groups

compared with the maximum clique method we used in the previous studies. Moreover,

as mentioned before, the MWCP is an NP-hard problem, and so it is computationally

unattractive. Thus, the MWCP approach to grouping related stocks, though

interesting, is not particularly better than the previous approach based on

cliques/independent sets.

Notice that the solution of the MWCP does not include all the edges of the market

graph. For the market graphs with a high value of correlation threshold, the above

approach results in a rather large number of the groups with fewer stocks in each group.

On the other hand, for the high threshold market graphs one might want to include all the

edges of the market graph, since each and every one of them corresponds to a significant

level of correlation between stocks. Essentially, this modified problem can be formulated

as the problem of finding all connected components of the market graph. Clearly, this

approach gives a very natural arrangement of all stocks of the market graph into separate

groups of closely related stocks. Each connected component indicates a certain group of

associated stocks. Although the application of the method based on finding all the

connected components would generally result in the stocks in a group having somewhat

weaker connections compared to the MWCP grouping of the same market graph, this can

be easily overcome by setting a higher correlation threshold in the graph.









It is a widely known fact that all the connected components in a graph can be

obtained simply by using either the depth-first search or the breadth-first search. Notice

that both procedures can be performed in polynomial time. Here we applied the depth-

first search algorithm (DFS) to find all connected components of the market graph.

The DFS algorithm can be briefly described as follows. First, some vertex $v$ of the

graph is randomly selected and added to a stack. Then for each node $v_1$ among the

descendants of $v$ that has not been selected previously, we apply a recursive procedure

by adding $v_1$ to the stack and examining all its descendants $v_2$ in a similar fashion.

Precisely, if $v_2$ has not been selected yet, we add it to the stack and then examine all the

descendants of $v_2$ by choosing a particular descendant and proceeding recursively. A

node is taken out of the stack after all its adjacent vertices have been visited. After all the nodes

that can be reached from the initial node $v$ by some path have been examined, we

choose the next $v$ from those vertices of the graph that have not been visited yet. The

algorithm terminates when all the vertices in the graph have been visited.

The output of the DFS algorithm is a set of depth-first search trees, where each tree

represents a connected component in the undirected graph. The number of connected

components is equal to the number of depth-first search trees. Moreover, the outcome of

the scheme does not depend on the choice of initial vertex.
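
A minimal sketch of this procedure is given below. It is an illustration rather than the code used in this study: the function name `connected_components` and the adjacency-set input are assumptions, and an explicit stack replaces the recursion, so the visiting order may differ from a recursive depth-first search while the resulting components are the same.

```python
def connected_components(graph):
    """Return the connected components of an undirected graph.

    `graph` maps each vertex to the set of its neighbors. Each component is
    returned as a list of vertices, one list per search tree.
    """
    visited = set()
    components = []
    for start in graph:
        if start in visited:
            continue
        # Start a new search tree from an unvisited vertex.
        visited.add(start)
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in graph[v]:
                if w not in visited:
                    visited.add(w)
                    stack.append(w)
        components.append(component)
    return components
```

Every vertex is pushed onto the stack once and every adjacency set is scanned once, which matches the linear running time of depth-first search on an adjacency-list input.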

The main advantage of this approach is that the DFS algorithm runs in polynomial

time. In particular, the procedure takes $O(n + |E|)$ time (with $|E|$ being the number of edges in the

graph) if the input is represented by an adjacency list, while it takes $O(n^2)$ time if the input is

given in the form of an adjacency matrix. Taking into consideration the size of the input for

the market graph problem, an adjacency matrix representation is preferred.









Structure of Connected Components in the Market Graph

First, we applied the depth-first search algorithm on the market graph with a

correlation threshold of 0.7. The obtained stock arrangement had a large number of

clusters with very few stocks in each group. Clearly, the stocks in each obtained group

were strongly linked. Decreasing the number of the groups formed would allow one to

see a less pronounced pattern of connections between the stocks. This is achieved by

decreasing the value of the correlation threshold of the market graph. Subsequently, the

resulting market graph also takes into account somewhat weaker connections.

Next, we applied the DFS algorithm on the market graph with the threshold value

set at 0.5. As expected, the algorithm produced a stock arrangement with a smaller

number of groups and, in general, a larger number of stocks in each group. This is because

of the trivial but important observation that all the connected components of a higher

threshold stay connected at the lower threshold and also may get connected to some other

connected components (Table 2-1).
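
The nesting of components across thresholds can be checked with a short sketch. The function names (`components_at`, `is_refinement`), the dictionary of pairwise correlations, and the union-find grouping are assumptions of this example; any method of finding connected components would serve equally well.

```python
def components_at(vertices, correlations, threshold):
    """Connected components using only pairs whose correlation exceeds `threshold`.

    `correlations` maps a pair (u, v) to its correlation coefficient.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v), c in correlations.items():
        if c > threshold:
            parent[find(u)] = find(v)

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def is_refinement(fine, coarse):
    """True when every group in `fine` lies inside some group in `coarse`."""
    return all(any(g <= h for h in coarse) for g in fine)
```

Since every edge present at the higher threshold is also present at the lower one, `is_refinement` applied to the groups at the higher threshold and the groups at the lower threshold is expected to hold for any input.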

For the market graph with threshold of 0.5, the largest group in the stock

arrangement has a total of 269 stocks, which represents both technology and finance

industries. An important observation is that each connected component in the market

graph corresponds to a distinct industry sector.

It should be noted that this approach provides a natural way for clustering stocks.

Clustering is a well-known challenging problem arising in data mining [26]. It deals with

partitioning a dataset into sets (clusters) of elements grouped according to some similarity

criterion. The main difficulty one encounters in solving the clustering problem on a

certain dataset is the fact that the number of desired clusters of similar objects is usually

not known a priori; moreover, an appropriate similarity criterion should be chosen before

partitioning a dataset into clusters. Using the graph representation of the stock market,

the clustering problem is treated as graph partitioning, where the subgraphs in the

partition correspond to different clusters. The above results suggest that partitioning the

market graph into distinct connected components is a reasonable approach in the

framework of clustering stocks.

Size of Connected Components in the Market Graph in the Context of Power-Law
Model

As mentioned above, the market graph follows the power-law model. The

asymptotic properties of power-law random graphs, including the size of their connected

components, have been studied theoretically. It is important to mention the existence of a

giant connected component (the unique largest component in the graph when the average

degree is greater than 1) in a power-law graph with $\gamma < \gamma_0 \approx 3.457875$, and the fact that a

giant connected component does not exist otherwise. The emergence of a giant connected

component at the point $\gamma_0 \approx 3.457875$ is referred to as the phase transition.

As it was found in [8], the values of $\gamma$ for the considered instances of the market

graph were smaller than the aforementioned threshold value. Therefore, one would

expect to find a large connected component in the market graph. The results presented in

this section confirm this hypothesis.

Notice that the arrangement of stocks into groups for the market graph with the

threshold of 0.5 (presented in Table 2-1) clearly shows the presence of a giant component

(represented by the financial services/technology group). Observe also that the size of the

giant component is significantly larger than the sizes of all the other groups.

One may pose the following logical question: How does the size of the largest

component in the market graph change over a certain period of time? To answer this









question, we constructed different market graphs with a given value of threshold for 11

adjacent time periods. Specifically, in order to examine the dynamics of the market graph

structure, we selected the time period of 1000 trading days in 1998-2002 and considered

eleven 500-day shifts within this period. The starting dates of any two consecutive shifts

are separated by a 50-day interval. In other words, each pair of successive shifts had 50

different days and the remaining 450 days in common. The time shifts considered in this paper

are the same as the ones considered in the previous studies [9].
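
The arithmetic of the shifts can be verified with a small sketch. The function name `sliding_windows` and the use of zero-based day indices with an exclusive end are assumptions made for illustration.

```python
def sliding_windows(total_days, window, step):
    """Return (start, end) day-index pairs, end exclusive, one pair per shift."""
    return [(s, s + window) for s in range(0, total_days - window + 1, step)]


# 1000 trading days, 500-day windows, starting dates 50 days apart.
shifts = sliding_windows(1000, 500, 50)
```

With these parameters the starting indices are 0, 50, ..., 500, which gives eleven shifts, and two consecutive shifts share 500 - 50 = 450 days.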

This method lets us capture the structural changes of the market graph using

comparatively small intervals between shifts, and at the same time allows us to maintain

sufficiently large sample sizes of the stock prices data in order to be able to compute the

cross-correlations for each time period. Also note that in our analysis we took into

consideration only the stocks, which were among those traded during the given 1000

trading days (i.e., for practical reasons we did not take into account stocks that had been

withdrawn from the market).

We considered three different values of the correlation threshold, precisely 0.7, 0.6,

and 0.5. For each given threshold value, the correspondent market graphs were

constructed for all eleven time periods. For each of the given eleven periods we ran the

DFS algorithm to find all connected components in the associated market graph. The size

of the largest group (i.e., giant connected component), formed in each individual period,

was computed. Figure 2-1 shows the largest group sizes obtained in all 11 time periods

for each particular value of the threshold.

It can be seen that for all three different thresholds the size of the largest group of

related stocks follows an overall increasing trend. Precisely, as a characteristic common









for all three cases, the giant component size predominantly increases from the oldest time

period (period 1) to the most recent one (period 11). Such clearly visible overall-

increasing dynamics exhibited by the largest group size can be well explained in the view

of the globalization tendency in the market. Note that this fact was also mentioned in [9]

in the context of the growth of the edge density and the maximum clique size in the

market graph.

Structure of Connected Components in the Market Graph

Another issue related to the size dynamics that deserves a special consideration is

how the structure of a giant connected component in a market graph transforms

throughout various time periods. To investigate this question, we set the threshold value

at 0.7 and constructed the correspondent market graphs for all eleven time periods above.

Using the DFS algorithm, we found a giant component along with the other connected

components in each of the obtained market graphs. The giant connected components for

all eleven time shifts are given in Table 2-3.

It appears that in most cases stocks that belong to a giant connected component

during an earlier period are also included in the giant component in later periods. There

are some other interesting observations about the stock structure of the largest size group

found for different time periods. Interestingly, all the giant connected components

contain a large number of stocks of the companies representing the "high-tech" industry

sector. Furthermore, each giant component includes stocks of the companies related to

the semiconductor industry, and the number of these stocks in the largest group increases

with time. All these facts imply that the corresponding branches of industry had expanded

during the considered period of time to form a major cluster in the market. Additionally,

we detected that in the later periods (particularly, in the last 2 of the 11 periods) the giant









connected components in the market graphs contain quite a significant number of

exchange traded funds (stocks reflecting the behavior of certain indices representing

various groups of companies). It should be mentioned that all giant connected

components include Nasdaq 100 tracking stock (QQQ), which was also found to be the

vertex with the highest degree (i.e., correlated with the most stocks) in the market graph

[7-8].
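
One simple way to quantify how much of a giant component carries over from one period to a later one is the fraction of its stocks that reappear. The sketch below is only an illustration: the function name `persistence` is hypothetical, and the two ticker sets are small toy subsets chosen for the example, not the full groups of Table 2-3.

```python
def persistence(earlier, later):
    """Fraction of the earlier group's stocks that also appear in the later group."""
    earlier, later = set(earlier), set(later)
    return len(earlier & later) / len(earlier)


# Toy subsets for illustration only (not the complete giant components).
toy_earlier = {"QQQ", "INTC", "CSCO", "CMGI"}
toy_later = {"QQQ", "INTC", "CSCO", "AMAT", "TXN"}
share = persistence(toy_earlier, toy_later)
```

A value close to 1 for consecutive periods would correspond to the observation that stocks in an earlier giant component tend to remain in later ones.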

Concluding Remarks

We extended the methodology of representing the stock market as a graph. We

have shown that partitioning the market graph into a set of connected components

provides reasonable results in the context of data mining, in particular, clustering stocks

into groups with similar behavior.

Moreover, we observed similar patterns of the sizes of connected components in

most instances of the market graph, with one large connected component and several

small ones. Since the market graph follows a power law with a small parameter $\gamma$, this

observation is consistent with theoretical results obtained for the power-law random

graph model, indicating the existence of a giant connected component in such graphs.

Our study confirmed that the recently introduced network-based approach is

promising for studying stock market dynamics. We believe that this methodology can be

further developed and generalized to take into account various factors affecting the

market and assist researchers and practitioners in making strategic decisions.










Table 2-1. Arrangement of stocks into groups for the market graph with threshold of 0.5
Industry: Stocks
Basic materials (copper and aluminum): AA, AL, N, PD
Financial services/technology:


AAPL, CSCO, ALTR, ADI, AMAT, AMCC, ANAD, ASML,
ATML, CY, IDTI, INTC, CMGI, AMTD, AOL, AMZN, DCLK, ET,
ELNK, INKT, CNET, NITE, NTBK, RNWK, NTAP, CHKP, QQQ,
ADBE, MDY, ADCT, AFCI, BRCM, JDSU, BVSN, ELX, QLGC,
KLAC, CMOS, KLIC, ASYT, HELX, LRCX, CYMI, LLTC, DIA,
AIG, AXP, BAC, BBT, ASO, CMA, BK, BTO, ABK, HIG, CB,
SPC, JP, LNC, TMK, MBI, AF, CF, GPT, FVB, CBSS, FBF, C,
BSC, AGE, JPM, COF, HI, GSB, GDW, KEY, FITB, FRE, FNM,
MEL, HBAN, NCC, PNC, NTRS, RF, CFR, CYN, HU, MI, MRBK,
STI, NFB, ONE,WB, SPY, ADX, MIM, AMO, ATF, SBC, BLS, T,
VZ, DJM, EWF, EWQ, ALA, STM, ERICY, EWD, MSCI,EWG,
DT, COLT, CWP, FTE, KPN, TEF, EWI, BBV, EWP, EWN, ABN,
AEG, AXA, STD,EWU, PHG, NOK, ITWO, SEBL,MXIM,
LSCC,LSI LSI, NSM, PMCS, MERQ, VRSN, SUNW, CMVT, NT,
BCE, XLNX, MCHP, VTSS, RFMD, TQNT, SWKS, TXCC, TXN,
MOT, MU, TER, RTS, MCRL, DELL, MSFT, YHOO, IBM, ORCL,
VODTI, PT, LEH, LM, RJF, MER, MWD, EWH, APB, APF, RR,
GCH, TCH, JFC, TDF, CHL, EWS, MLF, GE, SCH, STT, UPC,
SNV, SOTR, TCB, WL, WM, WFC, USB, MXF, BZF, BZL, LAQ,
FMX, EKT, EWW,LDF, MSF, UBB, ELP, TBH, TAR, BFR, IRS,
TEO, TDP, TMX, TFONY,TV, KOF,MXE, TZA, TY, USA, AMGN,
MEDI, CHIR, GENZ, HQH, ALKS, CEGE, HGSI, ABGX, PDLI,
INCY, MEDX, MLNM, AFFX, HQL, LYNX, MYGN, GLGC,
VRTX, ASG, CCU, KSU, NOVL, PAYX, TLAB, WABC, VLY,
KRB, CCR, PVN, MMC,CIEN, DISH, SANM, BGEN, CREE,
CTXS, FLEX, HLIT, IBIS, INTU, ISSX, NXTL, QCOM, SFE,
SMTC, SPOT, SWS, SIEB, JBOH, MHMY, TFSM, BRKS, COHU


Gold ore industries: ABX, AEM, ASA, AU, DROOY, HGMCY, GFI, NEM, KGC, PDG,
ECO, GLG, TVX
Healthcare: ACV-A, ACV
Financial, credit/personal credit institutions: ADVNA, ADVNB
Utilities/services sectors: AEE, AEP, AYE, CEG, CIN, D, DUK, ED, DTE, DQE, IDA, AVA,
VKL, LNT, NI, PEG, ETR, FPL, PNW, EXC, PSD, OGE, FE, PGN,
PPL, REI, PCG, SO, TE, HE, WEC, WPS, TXU, XEL, ILA, DPL
Basic materials/energy sectors: AHC, APA, APC, BJS, ATW, DO, BHI, CAM, ESV, GLBL, GSF,
HAL, HP, NBL, BR, UCL, MUR, PEO, DVN, NE, NBR, NOL, PDE,
GW, PKD, PDS, PTEN, MVK, RDC, PGO, RIG, SII, SLB, TDW,
TMAR, VRC, OII, WFT, VTS, OEI, PPP, EOG, KMG, COP, CVX,
SC, BP, RD, TOT, XOM
Investment banking: AKOA, AKOB
Transportation sectors: ALK, CAL, AMR, DAL, LUV, UAL, NWAC
Healthcare/pharmaceutical preparations: AZN, GSK










Basic materials/consumer goods sector: BCC, BOW, GP, IP, PCH, RYN, TIN, WY
Financial/banking sectors: BCM, BMO, RY, TD
Consumer goods, tires and inner tubes: BDG-A, BDG
Computers and banking sectors: DOCC, FLBK, IBCA, PBIX, SCAI, SOV
Pharmaceutical preparations: BMY, SGP, JNJ, MRK, PFE
Consumer goods/financial: EWJ, HIT, SNE, MTF, NTT, TM
Indian financial services: IFN, IGF, IIF, JFI
Korea technology/finance: KEF, KF, SKM
Media/technology: CYLK, HOLL, NAVR, SHRP
Plastics materials and resins: DD, DOW, PPG, ROH


Table 2-2. Dates and mean correlations corresponding to each 500-day shift
Period #   Starting date   Ending date
1          9/24/1998       9/15/2000
2          12/4/1998       11/27/2000
3          2/18/1999       2/8/2001
4          4/30/1999       4/23/2001
5          7/13/1999       7/3/2001
6          9/22/1999       9/19/2001
7          12/2/1999       11/29/2001
8          2/14/2000       2/12/2002
9          4/26/2000       4/25/2002
10         7/7/2000        7/8/2002
11         9/18/2000       9/17/2002










Table 2-3. Stocks contained in largest size group for eleven time periods (1 being the
oldest period, and 11 being the most recent).
Period (size): Stocks
Period 11 (size 202): ABGX, BBH, AMGN, CHIR, CRA, MLNM, HGSI, MEDX, PDLI,
IJH, AGE, DIA, C, FBF, IYF, BAC, IYG, GE, IVE, EWG, ABN,
EZU, AXA, AEG, ING, EWQ, BBV, EWP, TEF, DT, FTE, STD,
EWD, EWI, SPY, BDH, ADI, ALTR, AMAT, AMCC, BRCM, IAH,
ATML, CY, FCS, IYW, AMD, NVLS, ASML, IFX, PHG, ALA,
STM, EPC, INTC, DELL, IVW, BHH, ARBA, IYV, BEAS, IIH, CHKP,
MERQ, QQQ, ARMHY, BRCD, ELX, QLGC, CIEN, JDSU, XLK, CLS,
FLEX, IWF, CSCO, JNPR, MXIM, IDTI, LLTC, IRF, KLAC, BRKS,
CMOS, SMH, CYMI, LRCX, KLIC, LSCC, MCHP, TXN, IWO, HHH,
AOL, EBAY, IJR, IVV, IWB, IWD, IWM, IWN, IWV, IWW, IYC, IWZ,
IYJ, IYY, IYZ, ATF, SBC, TTH, VZ, MKH, MDY, MWD, BSC, GS, LEH,
LM, MER, XLF, AIG, BK, JPM, RKH, BBT, UPC, FVB, MI, NCC, STI,
RF, CMA, MEL, ONE, USB, WB, STT, WFC, XLV, VIA-B, VIA, XLI,
XLY, MSFT, NOK, SBF, YHOO, SSTI, XLNX, LSI, PMCS, VTSS, TER,
LTXX, VSEA, MCRL, MU, NEWP, NSM, NTAP, SEBL, EXTR, FDRY,
VRTS, SMTC, SUNW, EMC, IBM, ORCL, JBL, SANM, CREE, NVDA,
CNXT, ITWO, KEI, KOPN, MRVC, QCOM, RFMD, TQNT, SCMR,
SNDK, VRSN, VSH, KEM, BVSN, CMRC, IWOV, MOT, NT, DCX,
VRTX, DNA, GILD, IDPH, MEDIA, MYGN
Period 10 (size 159): ABGX, BBH, AMGN, CRA, MLNM, HGSI, MEDX, PDLI, IJH, BDH,
ALA, PHG, ASML, AMAT, ALTR, ADI, CY, ATML, IAH, AMCC, BHH,
ARBA, IYV, BEAS, IIH, BVSN, IYW, AMD, NVLS, CMOS, SMH,
BRCM, IWF, CSCO, EMC, BRCD, ELX, QLGC, JNPR, CIEN, EXTR,
FDRY, QQQ, AVNX, CHKP, SEBL, MERQ, VRTS, NTAP, XLK, CLS,
FLEX, SANM, JBL, CREE, DELL, INTC, IVW, C, DIA, GE, IVE, IJR,
IVV, HHH, AOL, EBAY, IWB, IWD, IWM, IWV, IYC, IYY, IYF, BAC,
IYG, JPM, SPY, MDY, MWD, GS, LEH, BSC, LM, MER, XLF, FBF,
RKH, PNC, SSTI, STM, EPC, IFX, KLAC, BRKS, LLTC, IDTI, LSCC,
LRCX, LTXX, MXIM, IRF, LSI, XLNX, MCHP, PMCS, VTSS, TXN,
SMTC, TER, NOK, XLI, XLV, DIS, VIA-B, VIA, XLY,YHOO, MSFT,
NEWP, SUNW, JDSU, NVDA, ORCL, CMRC, GLW, ITWO, MRVC,
QCOM, RFMD, TQNT, SWKS, SCMR, SNDK, VRSN, MU, NSM, KLIC,
IWOV, IBM, EWG, BBV, EWP, EWQ, EWI, STD, TEF, DT, FTE, EWD,
MOT, NT, VRTX, DNA, IDPH, MEDI

Period 9 (size 110): ADI, ALTR, AMAT, AMCC, BDH, ATML, CY, IAH, BEAS, IIH, ARBA,
BHH, CMRC, QQQ, ARMHY, ASML, IFX, PHG, ALA, STM, MDY,
DIA, C, JPM, XLF, BAC, FBF, MWD, GS, LEH, BSC, MER, SPY, GE,
HHH, AOL, EBAY, XLK, BRCM, PMCS, CSCO, JDSU, CIEN, JNPR,
GLW, SUNW, EMC, BRCD, ELX, QLGC, NTAP, SEBL, CHKP, MERQ,
VRTS, LLTC, KLAC, IRF, MXIM, LSCC, IDTI, LRCX, NVLS, CMOS,
INTC, DELL, TXN, XLNX, LSI, MCHP, VTSS, TXCC, TER, CREE,
EXTR, FLEX, CLS, SANM, MSFT, NEWP, NVDA, ORCL, SSTI, TQNT,
RFMD, SWKS, YHOO, XLI, XLV, VIA-B, VIA, PNC, NCC, XLY,
NOK, AVNX, BVSN, DIGL, ITWO, MRVC, SCMR, VRSN, IWOV, IBM,
NSM, NT, MU










Period 8 (size 92): ADI, QQQ, ALTR, AMAT, AMCC, BRCM, PMCS, CSCO, EMC, SUNW,
XLK, ASML, PHG, STM, ALA, ATML, CY, LSI, XLNX, INTC, DELL,
NVLS, CMOS, KLAC, LLTC, LSCC, IDTI, LRCX, TXN, MXIM, VTSS,
JDSU, CIEN, JNPR, BEAS, CHKP, MERQ, SEBL, ITWO, MDY, DIA, C,
JPM, XLF, BAC, BK, FBF, PNC, NCC, SPY, GE, HHH, AOL, EBAY,
VRSN, YHOO, XLI, XLV, VIA-B, VIA, NTAP, BRCD, ELX, QLGC,
VRTS, ORCL, GLW, TXCC, TER, KLIC, MCHP, NSM, TQNT, RFMD,
SWKS, NOK, CREE, EXTR, FLEX, CLS, SANM, MSFT, AVNX, BVSN,
CMRC, ARBA, CNXT, DIGL, IRF, MRVC, NEWP, SCMR
7 ABGX, MEDX, BBH, AMGN, DNA, HGSI, MLNM, PDLI, IDPH, MDY, 95
ATML, AMAT, ALTR, LSCC, IDTI, QQQ, AMCC, BRCM, PMCS,
CSCO, EMC, SUNW, XLK, ASML, PHG, STM, ALA, NOK, TXN, ADI,
CY, LSI, XLNX, KLAC, LRCX, NVLS, KLIC, MXIM, LLTC, VTSS,
TXCC, TER, MCHP, BEAS, JNPR, CIEN, MERQ, SEBL, CHKP, NTAP,
QLGC, BRCD, ELX, VRTS, ORCL, CREE, DELL, FLEX, CLS, SANM,
HHH, AMZN, AOL, EBAY, SPY, C, DIA, XLF, BAC, BK, FBF, JPM,
PNC, NCC, XLI, XLV, VIA-B, VIA, GE, MIM, YHOO, INTC, JDSU,
GLW, TQNT, RFMD, SWKS, VRSN, BVSN, CMRC, ARBA, EXTR,
ITWO, SCMR, MEDI
6 ALA, STM, ASML, PHG, QQQ, ALTR, LSCC, ATML, AMAT, AMCC, 71
BRCM, JDSU, CSCO, SUNW, EMC, XLK, CIEN, CREE, FLEX, GLW,
INTC, JNPR, KLAC, LRCX, NVLS, KLIC, TER, TXN, XLNX, LLTC,
MXIM, VTSS, PMCS, TXCC, MDY, DIA, C, JPM, XLF, BAC, BK, FBF,
PNC, SPY, MIM, XLI, XLV, VIA-B, VIA, NTAP, SEBL, VRTS, ORCL,
QLGC, ELX, TQNT, RFMD, SWKS, VRSN, BEAS, MERQ, BRCD,
BVSN, CHKP, CMGI, ICGE, CMRC, ARBA, ITWO, RBAK, NOK
5 ALA, STM, ASML, PHG, QQQ, ALTR, XLK, AMAT, AMCC, BRCM, 63
PMCS, JDSU, CSCO, SUNW, EMC, VTSS, TXCC, QLGC, ELX, KLAC,
LRCX, NVLS, TER, MXIM, LLTC, XLNX, ATML, LSCC, TXN, CIEN,
FLEX, INTC, MDY, SPY, DIA, XLF, BAC, C, JPM, FBF, PNC, XLI,
MIM, MLF, XLV, NOK, NTAP, SEBL, VRTS, ORCL, TQNT, RFMD,
SWKS, VRSN, BEAS, MERQ, BVSN, CHKP, CMGI, CMVT, GLW,
ITWO, JNPR
4 ALTR, QQQ, AMAT, KLAC, LRCX, NVLS, TER, XLK, AMCC, BRCM, 57
PMCS, VTSS, ATML, XLNX, LLTC, MXIM, LSCC, TXN, CSCO, JDSU,
SUNW, EMC, INTC, MDY, SPY, DIA, XLF, BAC, C, JPM, FBF, PNC,
XLI, MIM, MLF, XLV, ORCL, QLGC, ELX, SEBL, STM, ASML, PHG,
NOK, VRTS, BEAS, MERQ, CHKP, CIEN, CMGI, CMVT, ITWO, NTAP,
TQNT, SWKS, TXCC, VRSN
3 ALTR, XLNX, LLTC, MXIM, QQQ, AMAT, KLAC, LRCX, NVLS, TER, 35
XLK, CSCO, EMC, SUNW, SPY, DIA, MDY, MIM, MLF, XLI, XLV,
INTC, JDSU, SEBL, TXN, AMCC, PMCS, BRCM, CMGI, ITWO, NTAP,
QLGC, VRSN, VRTS, VTSS
2 ALTR, XLNX, QQQ, AMAT, KLAC, LRCX, NVLS, TER, AMCC, 24
PMCS, CSCO, SPY, DIA, MDY, MIM, MLF, SUNW, EMC, INTC, JDSU,
MXIM, LLTC, SEBL, VRTS
1 ALTR, XLNX, QQQ, AMAT, KLAC, LRCX, NVLS, TER, CSCO, SPY, 15
DIA, MDY, MIM, SUNW, INTC
















[Figure 2-1, panels A, B, and C: largest group size plotted against Time Period (1 to 11)]


Figure 2-1: Largest group size by time period. Panel A corresponds to a threshold value of
0.7, panel B to a threshold of 0.6, and panel C to the market graph with threshold 0.5.
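As an illustration of how group sizes like those plotted in Figure 2-1 could be obtained, the following Python sketch computes the size of the largest connected component of a graph in which two instruments are joined whenever their correlation meets a threshold. The function name `largest_component_size`, the list-of-lists correlation matrix, the "greater than or equal to" edge rule, and the toy numbers are all assumptions made for this example; they are not the data or the code used in this thesis.

```python
from collections import deque


def largest_component_size(corr, threshold):
    """Size of the largest connected component when an edge joins
    every pair (u, v) with corr[u][v] >= threshold (assumed edge rule)."""
    n = len(corr)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search from a vertex that has not been visited yet.
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in range(n):
                if v != u and not seen[v] and corr[u][v] >= threshold:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best


# Toy 4-instrument correlation matrix (made-up numbers, for illustration only).
toy = [
    [1.00, 0.80, 0.65, 0.10],
    [0.80, 1.00, 0.55, 0.20],
    [0.65, 0.55, 1.00, 0.30],
    [0.10, 0.20, 0.30, 1.00],
]
```

Lowering the threshold can only add edges, so the largest component can only grow or stay the same, which is in line with the larger vertical scales of panels B and C relative to panel A.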














CHAPTER 3
EVOLUTION OF SOCIAL NETWORK

Social networks have recently attracted the attention of many scientists, especially
because of their applications to social processes such as the spread of disease [27-29],
urbanization [30-32], and the discovery of networks of innovators [33], among many
other fields. Social scientists are also interested in studying evolving social networks as a
dynamic process [34]. In this chapter, we present a model that dynamically
simulates a social network under some standard assumptions already found in the
literature, and we show that the model accurately mimics a real-world network.

Introduction

Social networks, like the web graph and the financial graph, evolve over time. In fact, the
rate at which they change may be higher than that of other real-world
networks. They exhibit network transitivity: two nodes tend to connect if
they share a mutual neighbor [4, 35]. They also exhibit high degree correlation, or
assortative mixing [36]: nodes of high degree tend to connect with other nodes of
high degree. Individuals in a social network can have identities [35, 37, 38], that is,
characteristics of a specific node that help in a hierarchical classification of nodes.
For instance, in a student community, two students who are classified in the same group
have more opportunities to establish a relationship than two students who
are strangers to each other. Such identities could be that the students take the same
classes, play a common sport, or share the same musical interests. The small-world phenomenon
is another feature of social networks [4-6]: nodes are separated by a distance of 6 or less
on average. These features are also among the parameters used in simulating our
model.

Most complex networks exhibit preferential attachment and the addition of new vertices
[2]. It is important to note that, unlike other complex real-world networks such as
biological or technological networks, not all social networks exhibit preferential
attachment and the addition of new vertices over time. The absence of these features, especially
of preferential attachment, makes such networks follow a single-scale
distribution rather than a power-law distribution [3]. Of course, social networks such as author
collaboration and movie actor networks do possess these two characteristics. Social networks
have a wide range of applications, such as studying the spread of disease in a network, identifying
innovators in an author collaboration network, analyzing patterns of migration among
people, and investigating criminal behavior using financial flows.

Model

We propose a model in which we dynamically simulate a social network based on the
strength of the relationship [39] and the identities of the individuals [37, 38] involved in the
relationship. Other factors influencing the growth of the network are the mutual
acquaintances shared by the individuals [40] and the time of their last contact [34, 41]. The
reasons for considering these parameters are intuitive. The stronger a relationship is,
the longer it takes to fade or dissolve. Relationships also last
longer with frequent contact and fade away if individuals do not stay in touch, and a
relationship strengthens with each additional contact. Mutual acquaintances
provide opportunities for the individuals to meet [39].

At each time interval we pick two nodes at random and add an edge between them
with probability p * f_x, where









p =-pe- (3.1)

$$f_x = 1 - (1 - x^{a})^{b} \qquad (3.2)$$

The probability that an edge is added between two nodes with m mutual
acquaintances is p. It follows an exponential form, in which the
value of p increases exponentially with the number of mutual acquaintances [39].

The probability that an edge is added between two nodes that have common
identities is f_x. Each node is assigned a set of identities drawn from a global set according to a uniform
distribution, and x is a measure of the identities common to the two nodes, with
support from 0 to 1. f_x follows the Kumaraswamy distribution [42], a double-bounded
distribution that resembles the beta distribution and is well suited to simulation studies. The
CDF f_x increases monotonically with x.
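To make the identity term concrete, the sketch below evaluates the Kumaraswamy CDF of Eq. (3.2) for a pair of nodes. How x is measured is not spelled out here, so the Jaccard overlap used in `identity_overlap` is only one possible choice, and the shape parameters a = 2 and b = 3, the function names, and the example identities are likewise assumptions made for illustration.

```python
def kumaraswamy_cdf(x, a, b):
    """Kumaraswamy CDF on [0, 1]: F(x) = 1 - (1 - x**a)**b, with a, b > 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return 1.0 - (1.0 - x ** a) ** b


def identity_overlap(ids_u, ids_v):
    """ASSUMED measure x: shared identities divided by all identities (Jaccard)."""
    union = ids_u | ids_v
    return len(ids_u & ids_v) / len(union) if union else 0.0


# Two students who share two of four distinct identities, so x = 0.5.
x = identity_overlap({"tennis", "jazz", "algebra"}, {"jazz", "algebra", "chess"})
f_x = kumaraswamy_cdf(x, a=2.0, b=3.0)  # 1 - (1 - 0.25)**3 = 0.578125
```

Any measure that maps a pair of identity sets into [0, 1] could replace the Jaccard overlap without changing the rest of the calculation.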

In addition, if an edge is added, we treat each non-mutual neighbor of one node, paired
with the other node, as a potential relationship and perform the edge additions with
probability p * f_x.

If the nodes considered already share an edge, we still calculate the probabilities and
increase the weight of the relationship by one, strengthening the relationship.
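The addition step described above can be sketched as follows. This is a minimal illustration under several stated assumptions rather than the implementation used for the results: p is taken to be a saturating exponential 1 - p0 * exp(-alpha * m) with hypothetical constants p0 and alpha, x is taken to be the Jaccard overlap of the identity sets, the weight of an existing edge is increased only when the probabilistic test succeeds (one reading of the text), and the graph is stored as a dictionary of dictionaries of edge weights. All function and parameter names are introduced for this example.

```python
import math
import random


def add_prob(graph, ids, u, v, p0=0.9, alpha=0.5, a=2.0, b=3.0):
    """Edge-addition probability p * f_x for the pair (u, v)."""
    m = len(graph[u].keys() & graph[v].keys())  # number of mutual acquaintances
    # ASSUMPTION: p rises with m as a saturating exponential.
    p = 1.0 - p0 * math.exp(-alpha * m)
    # ASSUMPTION: x is the Jaccard overlap of the two identity sets.
    union = ids[u] | ids[v]
    x = len(ids[u] & ids[v]) / len(union) if union else 0.0
    f_x = 1.0 - (1.0 - x ** a) ** b  # Kumaraswamy CDF, Eq. (3.2)
    return p * f_x


def consider_pair(graph, ids, u, v, rng, **params):
    """With probability p * f_x, create edge (u, v) or add 1 to its weight.
    Returns True only when a new edge was created."""
    if rng.random() >= add_prob(graph, ids, u, v, **params):
        return False
    is_new = v not in graph[u]
    graph[u][v] = graph[v][u] = graph[u].get(v, 0) + 1
    return is_new


def addition_step(graph, ids, rng, **params):
    """Pick two random nodes; on a new edge, also try their non-mutual neighbors."""
    u, v = rng.sample(sorted(graph), 2)
    if consider_pair(graph, ids, u, v, rng, **params):
        for near, far in ((u, v), (v, u)):
            for w in list(graph[near]):
                # Neighbors of one endpoint that are not yet tied to the other.
                if w != far and w not in graph[far]:
                    consider_pair(graph, ids, w, far, rng, **params)
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible, which is convenient when results are averaged over several test runs.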

We again pick two random nodes, and this time we consider them for deletion. The
probability of deletion is equal to t_x * f_x * w_x, where

* t_x is the probability of deletion with x as a function of the time of last contact

* w_x is the probability of deletion with x as a function of the weight of the relationship

* f_x is the probability of deletion with x as a function of the mutual acquaintances









Each of the above probabilities follows a gamma distribution, decreasing exponentially
as x increases, given by

$$x^{k-1}\,\frac{e^{-x/\theta}}{\theta^{k}\,\Gamma(k)}$$
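The gamma form above can be evaluated directly with the standard library. In the sketch below, the shape k = 1 and scale theta = 2 are hypothetical values, the argument names are introduced for this example, and capping each factor at 1 is an added assumption (a density is not itself a probability and can exceed 1 for small theta).

```python
import math


def gamma_pdf(x, k, theta):
    """Gamma density x**(k-1) * exp(-x/theta) / (theta**k * Gamma(k)).
    Restricted to k >= 1 so that the value at x = 0 is finite."""
    if k < 1 or theta <= 0:
        raise ValueError("this sketch assumes k >= 1 and theta > 0")
    if x < 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / theta) / (theta ** k * math.gamma(k))


def deletion_probability(last_contact, weight, mutual, k=1.0, theta=2.0):
    """Product t_x * f_x * w_x, each factor a gamma density capped at 1."""
    prob = 1.0
    for x in (last_contact, mutual, weight):
        prob *= min(1.0, gamma_pdf(x, k, theta))
    return prob
```

With k = 1 the density reduces to a pure exponential decay, which matches the description of an exponential decrease in x; larger k would give a density that first rises and then decays.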

The simulation ends when the average degree of the network reaches 5. The
number of edges in the network is therefore O(n), so the network is sparse.
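The O(n) bound follows from the fact that every edge contributes to the degree of two nodes. Writing |E| for the number of edges and n for the number of nodes (notation introduced here for the illustration), the stopping condition gives:

```latex
\bar{k} \;=\; \frac{2\,|E|}{n} \;=\; 5
\quad\Longrightarrow\quad
|E| \;=\; \frac{5n}{2} \;=\; O(n)
```

For example, under this stopping rule a network with n = 1000 nodes stops growing at 2500 edges, far fewer than the roughly n^2/2 edges of a dense graph.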

One question that may arise is why nearest neighbors are considered when adding
edges but not when deleting them. This asymmetry can still be justified as
fair. The stopping condition is an average degree of 5, so the ratio of addition to deletion
considerations is 1 to 5 or less in every time interval. Moreover, the nearest-neighbor
consideration for adding edges is a direct consequence of the highly clustered nature of
social networks.

Performance Analyses of the Model

The key performance measures of the model used for simulation are

* Clustering coefficient

* Assortative mixing coefficient

* Average length of the giant component (small world phenomenon)

* Degree Distribution (Power law distribution)



$$\text{Clustering coefficient} = \frac{3 \times \text{Number of triangles}}{\text{Number of triples}} \qquad (3.3)$$



The clustering coefficient measures the tendency of two nodes that are connected by a path
of length two to associate, which is what forms the triangles in the network. As mentioned
earlier, this is the motivation for considering nearest neighbors while adding edges.
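Equation (3.3) can be computed by visiting every connected triple once and checking whether its two end nodes are also adjacent. The sketch below assumes the network is stored as a dictionary mapping each node to the set of its neighbors; the function name and this representation are choices made for the example.

```python
from itertools import combinations


def clustering_coefficient(adj):
    """3 * (number of triangles) / (number of connected triples), Eq. (3.3)."""
    triples = 0
    closed = 0  # connected triples whose two end nodes are also adjacent
    for center, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    # Every triangle is counted once per corner, so `closed` equals 3 * triangles.
    return closed / triples if triples else 0.0


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
```

A triangle gives a coefficient of 1 and a three-node path gives 0, the two extremes between which the values in a simulated network fall.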









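The assortative mixing coefficient [36] is given by

$$r = \frac{M^{-1}\sum_i j_i k_i - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^{2}}{M^{-1}\sum_i \tfrac{1}{2}\left(j_i^{2} + k_i^{2}\right) - \left[M^{-1}\sum_i \tfrac{1}{2}(j_i + k_i)\right]^{2}} \qquad (3.4)$$

where j_i and k_i are the degrees of the nodes at the two ends of the ith edge and M is the
number of edges.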

This coefficient measures the tendency of nodes with high degree to connect with other
nodes of high degree. The average shortest distance within the giant component should
be less than 6 for the social network to exhibit the small-world phenomenon. The degree
distribution should follow a power-law distribution to ensure scale-free characteristics.
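As a small worked instance of the assortative mixing coefficient of Newman [36], the sketch below evaluates it from the degrees at the two ends of every edge. It assumes an undirected graph stored as a dictionary of neighbor sets with mutually comparable node labels (integers here); the function name is introduced for this example.

```python
def assortativity(adj):
    """Degree assortativity r over the M edges of an undirected graph."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    # One (j, k) degree pair per undirected edge; u < v avoids double counting.
    pairs = [(deg[u], deg[v]) for u in adj for v in adj[u] if u < v]
    m = len(pairs)
    if m == 0:
        raise ValueError("the graph has no edges")
    mean_prod = sum(j * k for j, k in pairs) / m
    mean_half_sum = sum(0.5 * (j + k) for j, k in pairs) / m
    mean_half_sq = sum(0.5 * (j * j + k * k) for j, k in pairs) / m
    denom = mean_half_sq - mean_half_sum ** 2
    if denom == 0:
        raise ValueError("r is undefined when all edge-end degrees are equal")
    return (mean_prod - mean_half_sum ** 2) / denom


# A star is maximally disassortative: the hub only touches degree-1 leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

Positive values of r, such as those in the AM column of Table 3-1, indicate that high-degree nodes tend to attach to other high-degree nodes, whereas the star above yields r = -1.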

Results

Simulation results are tabulated in Table 3-1. Each result is the average of 10 test runs
for each network size, from 500 to 2000 nodes.
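The giant component and its average path length, two of the quantities reported in Table 3-1, could be computed with breadth-first search as sketched below. The dictionary-of-neighbor-sets representation and the function names are assumptions made for this example, and the all-pairs search shown is the simplest approach rather than the fastest.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start not in seen:
            component = set(bfs_distances(adj, start))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def average_path_length(adj):
    """Mean shortest distance over ordered node pairs of the giant component."""
    component = giant_component(adj)
    size = len(component)
    if size < 2:
        return 0.0
    total = sum(d for s in component for d in bfs_distances(adj, s).values())
    return total / (size * (size - 1))
```

Restricting the average to the giant component avoids the infinite distances that would otherwise arise between disconnected nodes.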

Conclusions

We saw that the simulated model has the features desired in a real-world
network, with high correlation. The R-square value reported from the regression
analyses is as high as 95%, indicating scale-free characteristics. This strengthens our
convictions about the social network assumptions on which the model was built. The model
should also help in studying social networks that change over time. It could
be extended to social networks in which new vertices are added over time, and this
should not affect any of the characteristics already present in the model. Future work
could also include more sophisticated ways of assigning identities to nodes. An
implementation of one such assignment, in which the nodes are distributed in a unit
hypercube with each dimension representing an identity and the probability of adding an
edge is inversely proportional to the distance between nodes, is already in progress.
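One common way to obtain an R-square value for a degree distribution is a least-squares line fitted on log-log axes. The sketch below shows that calculation; whether the reported regression analyses used exactly this procedure is an assumption, and the function name `loglog_fit` is introduced for this example.

```python
import math
from collections import Counter


def loglog_fit(degrees):
    """Fit log10(count) = slope * log10(degree) + c by least squares.
    Returns (slope, r_squared); nodes of degree 0 are skipped."""
    counts = Counter(d for d in degrees if d > 0)
    xs = [math.log10(d) for d in counts]
    ys = [math.log10(c) for c in counts.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct positive degrees")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("degenerate data: no variation to fit")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)
```

A slope that is clearly negative together with an R-square close to 1 is what a power-law degree distribution would produce under this fit.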











Table 3-1: Clustering coefficient (CC), assortative mixing coefficient (AM), and average length of the giant component
     Nodes   CC     AM     Average Length   Giant Component
1    500     0.49   0.83   3.41             245
2    800     0.42   0.79   3.78             434
3    1000    0.42   0.81   4.13             538
4    1200    0.37   0.77   4.39             695
5    1500    0.34   0.76   4.19             875
6    1700    0.32   0.75   4.32             1047
7    2000    0.34   0.75   4.56             1309


[Figures 3-1 through 3-7: log-log plots of the degree distribution; horizontal axis: Degree]

Figure 3-1: Degree distribution for n = 500 nodes (R-square = 0.9056)

Figure 3-2: Degree distribution for n = 800 nodes (R-square = 0.9563)

Figure 3-3: Degree distribution for n = 1000 nodes (R-square = 0.9535)

Figure 3-4: Degree distribution for n = 1200 nodes (R-square = 0.9511)

Figure 3-5: Degree distribution for n = 1500 nodes (R-square = 0.9549)

Figure 3-6: Degree distribution for n = 1700 nodes (R-square = 0.9535)

Figure 3-7: Degree distribution for n = 2000 nodes (R-square = 0.9541)














CHAPTER 4
CONCLUSIONS

In Chapter 2, we discussed a quick and efficient way to identify clusters of similar
stocks in the market graph. From the results we concluded that our assumption that
connected nodes must be highly correlated is fair. In Chapter 3, we proposed a
model that simulates the growth of social networks by considering weights for
relationship strength and identities. Our simulation produced social network models with
many interesting features that are desired in real-world social networks, including high
clustering and assortative mixing coefficients, the small-world phenomenon, and a scale-free
distribution. Based on these results, we concluded that the assumptions made
to model the growth are justified.

Analyses of complex networks are sources of knowledge that help us
become acquainted with real-world systems and make better decisions. We hope that this
thesis gives useful insights and motivates researchers to pursue research in the
field of complex networks, especially social networks. Special emphasis is placed on social
networks to highlight their increasing role in various social activities over the last
decade.
















LIST OF REFERENCES

1. Erdős P, Rényi A. On the evolution of random graphs. Publ. Math. Inst. Hung.
Acad. Sci. 1960;5:17-61.

2. Barabasi AL, Albert R. Emergence of scaling in random networks. Science October
15, 1999;286:509-12.

3. Amaral LAN, Scala A, Barthelemy M, Stanley HE. Classes of small-world
networks. Proc. Natl. Acad. Sci. 2000;97:11149-52.

4. Watts D, Strogatz SH. Collective dynamics of 'small-world' networks. Nature
1998;393:440-42.

5. Milgram S. The small world problem. Psychology Today 1967;2:60-67.

6. Strogatz SH. Exploring complex networks. Nature 2001;410:268-76

7. Boginski V, Butenko S, Pardalos PM. On structural properties of the market graph.
In: Nagurney A, editor. Innovations in financial and economic networks.
Northampton, MA: Edward Elgar Publishers, 2003.

8. Boginski V, Butenko S, Pardalos PM. Statistical analysis of financial networks.
Computational Statistics and Data Analysis 2005,48(2):431-43.

9. Boginski V, Butenko S, Pardalos PM. Mining market data: A network approach.
Computers and Operations Research (in press).

10. Abello J, Pardalos PM, Resende MGC. On maximum clique problems in very large
graphs. DIMACS Series, vol. 50. Providence, RI: American Mathematical Society;
1999. p. 119-30.

11. Aiello W, Chung F, Lu L. A random graph model for power-law graphs.
Experimental Mathematics 2001;10:53-66.

12. Hayes B. Graph theory in practice. American Scientist 2000;88:9-13 (Part I), 104-
9 (Part II).

13. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi A-L. The large-scale
organization of metabolic networks. Nature 2000;407:651-4.

14. Watts D. Small worlds: the dynamics of networks between order and randomness.
Princeton, NJ: Princeton University Press, 1999.











15. Watts D, Strogatz S. Collective dynamics of 'small-world' networks. Nature
1998;393:440-2.

16. Mantegna RN, Stanley HE. An introduction to econophysics: correlations and
complexity in finance. Cambridge: Cambridge University Press, 2000.

17. Albert R, Barabasi AL. Statistical mechanics of complex networks. Reviews of
Modern Physics 2002;74:47-97.

18. Barabasi AL. Linked. Cambridge, MA: Perseus Publishing; 2002.

19. Boginski V, Butenko S, Pardalos PM. Modeling and optimization in massive
graphs. In: Pardalos P.M, Wolkowicz H, editors. Novel approaches to hard discrete
optimization. Providence, RI: American Mathematical Society; 2003. p. 17-39.

20. Broder A, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A,
Wiener J. Graph structure in the Web. Computer Networks 2000;33:309-20.

21. Faloutsos M, Faloutsos P, Faloutsos C. On power-law relationships of the Internet
topology. Cambridge, MA: ACM SIGCOMM, 1999.

22. Garey MR, Johnson DS. Computers and intractability: a guide to the theory of NP-
completeness. New York, NY: Freeman; 1979.

23. Arora S, Safra S. Approximating clique is NP-complete. Proceedings of the 33rd
IEEE Symposium on Foundations on Computer Science, Pittsburgh, 1992. p. 2-13.

24. Liao S, Devadas S, Keutzer K, Tjiang S, Wang A. Storage assignment to decrease
code size. In Proceedings of the ACM SIGPLAN 1995 conference on
Programming Language Design and Implementation, La Jolla (June 1995), ACM
Press, pp.186 -95.

25. Aho A, Hopcroft J, Ullman J. The design and analysis of computer algorithms.
Reading, MA: Addison Wesley, 1974.

26. Bradley PS, Fayyad UM, Mangasarian OL. Mathematical programming for data
mining: formulations and challenges. INFORMS Journal on Computing
1999;11(3):217-38.

27. Rothenberg RB, Potterat JJ, Woodhouse DE, Muth SQ, Darrow WW, Klovdahl
AS. Social network dynamics and HIV transmission. AIDS, August 20
1998;12(12):1529-36.

28. Aral SO. Sexual network patterns as determinants of STD rates. Sexually
Transmitted Diseases, May 1999;26(5):262-64.

29. Moore C, Newman MEJ. Epidemics and percolation in small-world networks.
Phys. Rev. E 61, 5678-82 (2000).








30. Andersson C, Hellervik A, Lindgren K, Hagson A, Tornberg J. Urban economy as
a scale-free network. Phys. Rev. E 68, 036124 (2003) (6 pages).

31. Boyd M. Family and Personal Networks in International Migration: Recent
Developments and New Agendas. International Migration Review, Vol. 23, No. 3,
(Autumn, 1989) pp. 638-670

32. Andersson C, Frenken K, Hellervik A. A complex network approach to urban
growth. Papers in Evolutionary Economic Geography (PEEG) 0505, Utrecht
University, Section of Economic Geography, revised Feb 2005.

33. Yeung YY, Liu TCY, Ng PH. A social network analysis of research collaboration
in physics education. American Journal of Physics 2005;73:145.

34. Doreian P, Stokman FN. Evolution of social networks. New York, NY: Gordon and
Breach, 1997.

35. Jin EM, Girvan M, and Newman MEJ. The structure of growing social networks.
Phys. Rev. E 64, 046132 (2001).

36. Newman MEJ. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701
(2002).

37. White HC. Identity and Control. Princeton, NJ: Princeton University Press,
1992.

38. Watts D, Dodds PS, Newman MEJ. Identity and search in social networks. Science
2002;296:1302-05.

39. Ravasz E, Barabasi AL. Hierarchical organization in complex networks. Phys. Rev.
E 67, 026112 (2003) (7 pages).

40. Kossinets G, Watts D. Empirical analysis of an evolving social network. Science
2006;311(5757):88-90.

41. Newman MEJ. Clustering and preferential attachment in growing networks. Phys.
Rev. E 64, 025102 (2001) (4 pages).

42. Kumaraswamy P. A generalized probability density function for double-bounded
random processes. Journal of Hydrology 1980;46:79-88.
















BIOGRAPHICAL SKETCH

The author of this thesis, Mr. Ashwin Arulselvan, is a master's student at the
University of Florida majoring in industrial and systems engineering with a focus on
operations research.