
Citation 
 Permanent Link:
 http://ufdc.ufl.edu/AA00025734/00001
Material Information
 Title:
 Efficient algorithms and data structures for VLSI CAD
 Creator:
 Cho, Seonghun
 Publication Date:
 1996
 Language:
 English
 Physical Description:
 xii, 141 leaves : ill. ; 29 cm.
Subjects
 Subjects / Keywords:
 Algorithms ( jstor )
Data models ( jstor ) Datasets ( jstor ) Experimental results ( jstor ) Heuristics ( jstor ) Information search ( jstor ) Integers ( jstor ) Left wing politics ( jstor ) Radiocarbon ( jstor ) Run time ( jstor ) Computer and Information Science and Engineering thesis, Ph. D Dissertations, Academic  Computer and Information Science and Engineering  UF
 Genre:
 bibliography ( marcgt )
theses ( marcgt ) nonfiction ( marcgt )
Notes
 Thesis:
 Thesis (Ph. D.)--University of Florida, 1996.
 Bibliography:
 Includes bibliographical references (leaves 138-140).
 Additional Physical Form:
 Also available online.
 General Note:
 Typescript.
 General Note:
 Vita.
 Statement of Responsibility:
 by Seonghun Cho.
Record Information
 Source Institution:
 University of Florida
 Holding Location:
 University of Florida
 Rights Management:
 Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for nonprofit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
 Resource Identifier:
 023055091 ( ALEPH )
34968500 ( OCLC )

EFFICIENT ALGORITHMS AND DATA STRUCTURES FOR VLSI CAD
By
SEONGHUN CHO
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1996
ACKNOWLEDGMENTS
My heartfelt appreciation goes to my advisor Professor Sartaj Sahni for giving me continued guidance in my thesis work. I thank him for the help, patience and support he provided throughout my stay in the University of Florida. Weekly meetings and discussions with him have spawned many ideas, for which I am thankful.
I would like to thank the other members of my supervisory committee, Dr. Li-Min Fu, Dr. Theodore Johnson, Dr. Sanguthevar Rajasekaran, and Dr. Paul W. Chun, for their interest and comments.
Thanks go to Venkat Thanvantri for his willingness to discuss the general subject of algorithms.
Thanks go to my wife Joungyim for her love, encouragement and patience. Finally, I would like to thank my parents for the love and support, without which I could not have pursued my doctoral studies. To them I dedicate this work.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTERS
1 INTRODUCTION
1.1 Background
1.2 Dissertation Outline
2 MINIMUM AREA JOINING OF COMPACTED CELLS
2.1 Introduction
2.2 $l$-layer River Routing
2.3 Constraint Graph Representation
2.4 Heuristics to Minimize Area
2.4.1 Heuristic 1
2.4.2 Heuristic 2
2.4.3 Heuristic 3
2.5 Experimental Results
2.6 Conclusion
3 A NEW WEIGHT BALANCED BINARY SEARCH TREE
3.1 Introduction
3.2 Balanced Trees and Rotations
3.3 $\beta$-BBSTs
3.4 Search, Insert, and Delete in a $\beta$-BBST
3.4.1 Search
3.4.2 Insertion
3.4.3 Deletion
3.4.4 Enhancements
3.4.5 Top Down Algorithms
3.5 Simple $\beta$-BBSTs
3.6 BBSTs without Deletion
3.7 Experimental Results
3.8 Conclusion
4 WEIGHT BIASED LEFTIST TREES AND MODIFIED SKIP LISTS
4.1 Introduction
4.2 Weight Biased Leftist Trees
4.3 Modified Skip Lists
4.4 MSLs As Priority Queues
4.5 Experimental Results For Priority Queues
4.6 Conclusion
5 CONCLUSIONS
A ABBREVIATIONS
REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
2.1 Error rate (%) over optimal, $l = 1$
2.2 Improvement (%) over Fang, $l = 1$
2.3 Time taken, $l = 1$
2.4 Error rate (%) over optimal, $l = 2$
2.5 Improvement over $l = 1$ cases
2.6 Improvement over Fang, $l = 2$
2.7 Time taken, $l = 2$
3.1 The number of key comparisons on random inputs (version 1 code)
3.2 The number of key comparisons on ordered inputs (version 1 code)
3.3 Height of the trees on random inputs (version 1 code)
3.4 Height of the trees on ordered inputs (version 1 code)
3.5 The number of rotations on random inputs (version 1 code)
3.6 The number of rotations on ordered inputs (version 1 code)
3.7 Run time on random inputs using integer keys (version 1 code)
3.8 Run time on ordered inputs using integer keys (version 1 code)
3.9 Run time on random real inputs (version 1 code)
3.10 Run time on ordered real inputs (version 1 code)
3.11 The number of key comparisons on random inputs (version 1 code)
3.12 The number of key comparisons on ordered inputs (version 1 code)
3.13 Height of the trees on random inputs (version 1 code)
3.14 Height of the trees on ordered inputs (version 1 code)
3.15 The number of rotations on random inputs (version 1 code)
3.16 The number of rotations on ordered inputs (version 1 code)
3.17 Run time on random inputs using integer keys (version 1 code)
3.18 Run time on ordered inputs using integer keys (version 1 code)
3.19 Run time on random real inputs (version 1 code)
3.20 Run time on ordered real inputs (version 1 code)
3.21 The number of key comparisons on random inputs (version 2 code)
3.22 The number of key comparisons on ordered inputs (version 2 code)
3.23 Run time on random real inputs (version 2 code)
3.24 Run time on ordered real inputs (version 2 code)
4.1 The number of key comparisons
4.2 Number of levels
4.3 Run time
4.4 The number of key comparisons
4.5 Height/level of the structures
4.6 Run time using integer keys
4.7 The number of key comparisons
4.8 Height/level of the structures
4.9 Run time using integer keys
A.1 Abbreviations used in tables
LIST OF FIGURES
2.1 Cell joining
2.2 $l$-layer river routing
2.3 Round robin and greedy layer assignments
2.4 Minimizing the number of tracks or layers
2.5 Constraint graph representation
2.6 Merge in constraint graph
2.7 Heuristic 1
2.8 Heuristic 2
2.9 Heuristic 3
3.1 LL and RL rotations
3.2 A tree in WB(1/4) that is not Lbalanced
3.3 !balanced tree that is not a COST
3.4 Lb rotation for insertion
3.5 Substep (i) of insertion LR rotation
3.6 Case Lb, for LR(ii) rotation
3.7 Case LR for LR(ii) rotation
3.8 LL rotation for deletion
3.9 LR rotation for deletion
3.10 Restructuring procedure
3.11 Simple restructuring procedure for insertion
3.12 Simple restructuring procedure for deletion
3.13 Simple restructuring procedure without a $\beta$ value
3.14 Run time on real inputs (version 1 code)
3.15 Run time on random real inputs (version 1 code)
3.16 Run time on ordered real inputs (version 1 code)
3.17 Run time on random real inputs (version 2 code)
3.18 Run time on ordered real inputs (version 2 code)
4.1 Example minWBLTs
4.2 minWBLT Insert
4.3 minWBLT Deletemin
4.4 Skip Lists
4.5 Modified Skip Lists
4.6 MSL Search
4.7 MSL Insert
4.8 MSL Delete
4.9 Run time
4.10 TMSL Insert
4.11 TMSL Deletemin
4.12 TMSL Deletemax
4.13 Run time on random1
4.14 Run time on random2
4.15 Run time on random1
4.16 Run time on random2
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Doctor of Philosophy
EFFICIENT ALGORITHMS AND DATA STRUCTURES FOR VLSI CAD
By
Seonghun Cho
May 1996
Chairman: Dr. Sartaj Sahni
Major Department: Computer and Information Science and Engineering
In this dissertation, we develop efficient algorithms and data structures for problems that arise in electronic computer aided design (ECAD).
We consider the problem of joining a row of compacted cells so as to minimize the area occupied by the cells and the interconnects. The cell joining process includes cell stretching and river routing. We propose several heuristics to join a row of cells in such a way that area is minimized. The proposed heuristics are compared experimentally with the previously proposed heuristic.
We develop a new class of weight balanced binary search trees called $\beta$-balanced binary search trees ($\beta$-BBSTs). $\beta$-BBSTs are designed to have reduced internal path length. As a result, they are expected to exhibit good search time characteristics. Individual search, insert, and delete operations in an $n$ node $\beta$-BBST take $O(\log n)$ time for $0 < \beta \le \sqrt{2} - 1$. Experimental results comparing the performance of $\beta$-BBSTs, WB($\alpha$) trees, AVL-trees, red/black trees, treaps, deterministic skip lists and skip lists are presented. Two simplified versions of $\beta$-BBSTs are also developed.
We propose the weight biased leftist tree as an alternative to traditional leftist trees for the representation of mergeable priority queues. A modified version of skip lists that uses fixed size nodes is also proposed. Experimental results show our modified skip list structure is faster than the original skip list structure for the representation of dictionaries. Experimental results comparing weight biased leftist trees and competing priority queue structures as well as experimental results for double ended priority queues are presented.
CHAPTER 1
INTRODUCTION
1.1 Background
In VLSI layout, we are concerned with transforming a circuit from its logical design to a physical implementation. The layout problem for VLSI circuits is generally decomposed into smaller problems such as partitioning, floorplanning, placement, routing and compaction.
The partitioning process decomposes a large circuit/module into a collection of smaller subcircuits/modules. In floorplanning, logical components of a circuit are assigned relative positions on a chip. The physical realization of each component (i.e., its area and aspect ratio) is also selected. The objectives of floorplanning include overall area minimization, minimization of power consumption, etc. The precise locations for the components of a design are then determined during the placement process to optimize the area and the timing. After the components are placed, the pins are connected during the routing process. During the process of compaction, the components and interconnections are moved so as to further optimize the layout in terms of area and delay.
The routing process is usually divided into three smaller subproblems: global routing, detailed routing, and specialized routing. Global routing decomposes the complex routing problem into small and manageable subproblems. It assigns each net to a set of routing regions such as channel, switchbox, and planar routing regions so as to minimize a combination of criteria such as area, circuit delay, etc. Steiner trees and spanning trees are the commonly used approaches for net connection in global routing. Specialized routing is used to connect power/ground nets or clock nets. Detailed routing is either general or restricted, and restricted detailed routing comes in three types: channel, switchbox, and planar.
In channel routing, all terminals of nets are located in two parallel rows across a routing region called a channel. In switchbox routing, terminals of nets are located on the four sides of the routing region. Planar routing is a problem in which the interconnection topology of the nets is planar. That is, all connections can be realized on a single layer. Vias allow wires to change layers but the presence of vias reduces reliability and performance of a circuit. Single layer routing is not always possible.
River routing is a special case of planar routing in which all nets have exactly two terminals, one on each side of the channel, and the net sequence on each side of the channel is the same. River routing is used in PCB routing, particularly in dataflow architectures with multibit buses connecting a series of logic blocks, and symbolic IC design systems.
1.2 Dissertation Outline
This dissertation is divided into four chapters. In Chapter 2, we consider the problem of joining a row of compacted cells so as to minimize the area occupied by the cells and the interconnects. The cell joining process includes cell stretching and river routing. We propose several heuristics to join a row of cells in such a way that area is minimized. The proposed heuristics are compared experimentally with the previously proposed one.
VLSI Physical Design Automation is essentially the study of algorithms and data structures related to the physical design process. Specific data structures can be used to improve the performance of algorithms. For example, maze routing algorithms and line-probe algorithms in global routing [30] use search structures, retiming algorithms [6, 29] use priority queue structures, and layout compaction algorithms [13] use radix priority search trees.
Chapters 3 and 4 are about search and priority queue structures. As specific data structures could be used to produce better performance of algorithms in VLSI design automation, new search and priority queue structures are proposed and thoroughly compared with other data structures.
Finally, in the last chapter, we present conclusions of this work.
CHAPTER 2
MINIMUM AREA JOINING OF COMPACTED CELLS
2.1 Introduction
When designing circuits with compacted symbolic sticks basic cells, the circuit is realized by a collection of compacted cells that tile a two-dimensional area. The intercell interconnects are such that each interconnect connects two terminals that are on adjacent boundaries of neighboring cells. So, for example, if cells A and B (Figure 2.1(a)) are neighboring cells of the circuit, then the right boundary of A is adjacent to the left boundary of B. The number of terminals on each of these boundaries will be the same and the $i$'th terminal (from the bottom) on the right boundary of A is to be connected to the $i$'th terminal (from the bottom) on the left boundary of B.
Since the cells are available in compacted form, it is not possible to reduce the distance between any pair of terminals on any side of a cell. However, this distance can be increased by stretching the cell. In the example of Figure 2.1(a), we can stretch either cell vertically by defining a horizontal cut line at any position and pulling the two cell pieces apart by any desired amount (the cell can also be stretched horizontally by using a vertical cut line).
[Figure 2.1. Cell joining: (a) Horizontal adjacent cells; (b) Joining by stretching; (c) Joining by river routing; (d) Combination cell joining]
The required interconnects between cells A and B of Figure 2.1(a) can be accomplished by stretching cells A and B so that the terminals of A and B line up as in Figure 2.1(b). The broken lines in Figure 2.1(a) indicate the cut lines used for stretching. The stretching enables us to join cells A and B using no routing tracks (by "join" we mean make the interconnects between cells A and B). This method of joining cells is also called pitch matching.
Another way to join cells A and B is to river route the interconnects as in Figure 2.1(c). This uses routing tracks in a channel between cells A and B but does not increase cell height. The pitch matching and river routing approaches to cell joining have been studied in Boyer [5] and Weste [33]. Algorithms for single-layer river routing can be found in several works [15, 19, 23, 24] and those for multilayer river routing can be found in Baratz [3]. Single-layer gridless river routing is studied in Tompa [32]. Two applications of river routing are hybrid circuit design and structured design (DSP).
Cell stretching (or pitch matching) increases the height of the layout while river routing increases its width. Both affect the layout area. The layout of Figure 2.1(b) has area 150. To compute the area of the layout of Figure 2.1(c), we assume tracks have unit separation. So, the layout width is 14 and height is 11. The layout has area 154. Cheng and Despain [8] have proposed using a combination of cell stretching and river routing so as to obtain layouts with smaller area than possible when only one of these joining methods is used. Figure 2.1(d) shows the result of joining cells A and B using both stretching and river routing. The area of this layout is 144. This is minimum for the instance of Figure 2.1(a).
Cheng and Despain [8] have proposed a heuristic for single layer joining of compacted cells. At each step of their heuristic either a row or column of compacted cells is joined. Following this, the row or column of joined cells is replaced by a composite cell that represents the result of joining. Notice that when a row (column) of cells is joined, cells may be stretched vertically (horizontally) and river routing is done in a vertical (horizontal) channel. To join a row of cells, Cheng and Despain [8] bound the maximum height to which a cell may be stretched. This bound is
$$h_{max} + \frac{h_{avg}^2}{4 h_{max}}$$
where $h_{max}$ is the height of the tallest compacted cell being joined and $h_{avg}$ is the average height of the cells being joined.
Using this bound, cells are joined one at a time using a penalty/reward scheme to determine if a pair of terminals is to be joined by stretching or by river routing.
Lim, Cheng, and Sahni [17] have considered the case when only two cells are to be joined. They develop fast polynomial time algorithms to obtain the minimum area join of two cells. In addition, they are able to obtain, in low order polynomial time, minimum area joins that minimize the length of the longest wire or the total wire length. Lim [16] has proposed an $O(n(n/c)^{c-1})$ algorithm to find the minimum area join of $c$ cells having a total of $n$ terminals. This algorithm does an exhaustive search over all possible numbers of tracks in the $c - 1$ routing channels between adjacent cells. A constraint graph is used to determine the minimum height layout for each assignment of number of tracks to routing channels. The time required per track assignment is $O(n)$ and the worst case number of track assignments is $O((n/c)^{c-1})$. The algorithm of Lim [16] is flawed as it handles channels with zero routing tracks by joining the adjacent cells using minimum height cell stretching and then considers the joined cells as one. This problem is easily fixed, however, by combining, in the constraint graph, pairs of vertices that represent corresponding terminals of the two cells (i.e., the $i$'th terminals of each cell) with zero routing tracks in between.
In this chapter, we consider the case when $l \ge 1$ routing layers are available to river route the inter cell connections. Note that while multiple layers do not affect layout area when cell stretching alone is used, a reduction in area is possible when cell stretching is combined with river routing or when river routing alone is used. We assume that in each layer of each routing channel, the interconnects are to be accomplished using river routing. An alternative is to use HV routing when $l = 2$, HVH or VHV routing when $l = 3$, and extensions of HVH and VHV routing for $l > 3$. However, for river routing instances, using routing layers in this way has no advantage over river routing in each layer (see Theorem 5, Section 2.2). When the number of layers available for river routing is increased, one may see a dramatic reduction in the number of routing tracks needed per layer. Figure 2.2 shows an instance that needs $n$ tracks when routed in one layer but only one track/layer when routed in two layers.
[Figure 2.2. $l$-layer river routing: (a) 1-layer; (b) 2-layer]
We begin, in Section 2.2, by stating the necessary and sufficient conditions for a river routing instance to be routable in $l$ layers using at most $t$ tracks per layer and stating how to perform $l$-layer river routing when such a routing is possible. In this section, we also show that HV style routing has no advantage over river routing in each layer. In Section 2.3, we describe the constraint graph used to determine minimum height stretching of $c$ cells. Heuristics for the minimum area joining of $c$ cells are proposed in Section 2.4 and the results of experiments with these are provided in Section 2.5. Our conclusions appear in Section 2.6.
2.2 $l$-layer River Routing
Let $(A_i, B_i)$, $1 \le i \le m$, be a set of terminal pairs such that the $A_i$'s are on one side (say left or top) of a routing channel and the $B_i$'s are on the other (right or bottom) side. Terminal $A_i$ is to be connected to terminal $B_i$, $1 \le i \le m$. For this channel routing instance to be an instance of river routing, it must be the case that $a_1 < a_2 < \cdots < a_m$ and $b_1 < b_2 < \cdots < b_m$ where $a_i$ and $b_i$, respectively, give the positions of terminals $A_i$ and $B_i$, $1 \le i \le m$. We may assume an underlying grid with each terminal being at a grid position. In the case of a horizontal (vertical) channel the $a_i$'s and $b_i$'s are grid column (row) numbers. Leiserson and Pinter [15] have obtained the following necessary and sufficient condition for a river routing instance to be routable in a single layer using at most $t \ge 0$ tracks.
Theorem 1 [15] The river routing instance defined above is routable in a single layer using at most $t \ge 0$ tracks if and only if
(a) $a_{i+t} - b_i \ge t$ (b) $b_{i+t} - a_i \ge t$
for every $i \le m - t$.
For the general case of $l \ge 1$ layers, we obtain the necessary and sufficient condition of Theorem 2.
Theorem 2 The river routing instance defined above is routable in $l \ge 1$ layers (each layer routing whole nets) using at most $t \ge 0$ tracks per layer if and only if
(a) $a_{i+lt} - b_i \ge t$ (b) $b_{i+lt} - a_i \ge t$
for every $i \le m - lt$.
Proof: First, we establish that (a) and (b) are necessary for $l$ layer routing. Since the proofs for (a) and (b) are similar, we provide that for (a) only. Suppose that $a_{i+lt} - b_i < t$ for some $i$. Consider the $lt + 1$ terminal pairs $(A_j, B_j)$, $i \le j \le i + lt$. When routing these on $l$ layers, at least one layer has to be assigned $\ge t + 1$ terminal pairs. So, suppose that terminal pairs $(A'_1, B'_1), (A'_2, B'_2), \ldots, (A'_{t+1}, B'_{t+1}), \ldots$ are assigned to the same layer for river routing. We may assume that $a'_1 < a'_2 < \cdots < a'_{t+1}$ and $b'_1 < b'_2 < \cdots < b'_{t+1}$. Since $a'_{t+1} \le a_{i+lt}$ and $b'_1 \ge b_i$,
$$a'_{t+1} - b'_1 \le a_{i+lt} - b_i < t.$$
From Theorem 1, it follows that the terminal pairs $(A'_j, B'_j)$, $1 \le j \le t + 1$, cannot be river routed on a single layer. Hence, $(A_j, B_j)$, $i \le j \le i + lt$, cannot be river routed on $l$ layers. So, $(A_j, B_j)$, $1 \le j \le m$, cannot be river routed on $l$ layers. As a result, (a) is a necessary condition.
To show that (a) and (b) are sufficient conditions for routability, we present two algorithms (RoundRobin and Greedy) that assign the nets to layers in such a way that each layer is river routable when both (a) and (b) are satisfied. The correctness of these algorithms is established in Theorems 3 and 4, respectively. $\Box$
procedure RoundRobin;
{ Assign the m nets to l layers. }
begin
  for i := 1 to m do
    assign net (A_i, B_i) to layer (i mod l) + 1;
end;
procedure Greedy;
{ Assign the m nets to l layers. }
begin
  for i := 1 to m do
    assign net (A_i, B_i) to layer q such that q is the smallest integer <= l
    for which the conditions of Theorem 1 are not violated on layer q
    (if there is no such q, then fail)
end;
Figure 2.3. Round robin and greedy layer assignments
We later discovered that Baratz [3] has not only obtained the same condition but also proposed the same two algorithms for $l$-layer river routing. One assigns nets to layers in a round robin fashion and the other uses a greedy strategy. The corresponding procedures are given in Figure 2.3.
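The condition of Theorem 2 and the round robin assignment can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names (`routable`, `round_robin`, `split_layers`), the 0-indexed lists, and the six-net instance are not from the dissertation.

```python
def routable(a, b, l, t):
    """Theorem 2 test (Theorem 1 when l == 1).

    a, b: strictly increasing terminal positions, stored 0-indexed,
    so net i of the text is (a[i - 1], b[i - 1]).
    """
    m, k = len(a), l * t
    # Text index i runs over 1..m - lt; here i runs over 0..m - lt - 1.
    return all(a[i + k] - b[i] >= t and b[i + k] - a[i] >= t
               for i in range(m - k))


def round_robin(m, l):
    """Layer number (1..l) of nets 1..m, as in procedure RoundRobin."""
    return [(i % l) + 1 for i in range(1, m + 1)]


def split_layers(a, b, layers, l):
    """Group terminal positions by assigned layer (hypothetical helper)."""
    groups = [([], []) for _ in range(l)]
    for ai, bi, q in zip(a, b, layers):
        groups[q - 1][0].append(ai)
        groups[q - 1][1].append(bi)
    return groups


# Made-up instance: six nets, each shifted right by one grid column.
a = [1, 2, 3, 4, 5, 6]
b = [2, 3, 4, 5, 6, 7]
layers = round_robin(len(a), 2)
per_layer_ok = [routable(la, lb, 1, 1)
                for la, lb in split_layers(a, b, layers, 2)]
print(routable(a, b, 1, 5), routable(a, b, 2, 1), layers, per_layer_ok)
```

On this made-up instance the test fails for one layer with five tracks but succeeds for two layers with one track each, and each layer produced by the round robin assignment passes the single layer test of Theorem 1.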
Theorem 3 The layer assignment produced by the RoundRobin procedure is river routable if
(a) $a_{i+lt} - b_i \ge t$ (b) $b_{i+lt} - a_i \ge t$
for all $i \le m - lt$.
Proof: Let $(A'_1, B'_1), (A'_2, B'_2), \ldots = (A_j, B_j), (A_{j+l}, B_{j+l}), (A_{j+2l}, B_{j+2l}), \ldots$ be the nets assigned to layer $(j \bmod l) + 1$, $j \le l$. So, $a'_i = a_{j+(i-1)l}$ and $b'_i = b_{j+(i-1)l}$. Hence,
$$a'_{i+t} - b'_i = a_{j+(i+t-1)l} - b_{j+(i-1)l} = a_{j+(i-1)l+lt} - b_{j+(i-1)l} \ge t \quad \text{(from (a))}$$
Similarly, $b'_{i+t} - a'_i \ge t$. So, the layer assignment satisfies the conditions of Theorem 1 and is river routable using $t$ tracks. $\Box$
Theorem 4 If (a) $a_{i+lt} - b_i \ge t$ and (b) $b_{i+lt} - a_i \ge t$ for all $i \le m - lt$, then procedure Greedy assigns nets to layers such that the assignment to each layer is routable using $t$ tracks.
Proof: If procedure Greedy is able to assign each of the $m$ nets to a layer, then the layer assignments satisfy the conditions of Theorem 1 and so are routable using $t$ tracks. Suppose the algorithm fails while trying to assign net $(A_r, B_r)$ to a layer. At this time nets $(A_i, B_i)$, $1 \le i < r$, have been assigned to layers so as to satisfy the conditions of Theorem 1 and the assignment of net $(A_r, B_r)$ to each of these layers violates these conditions. Consider first those layers, $L_a$, on which condition (a) is violated. For a layer $s \in L_a$, suppose that the assigned nets are $\ldots, (A'_{j-t}, B'_{j-t}), \ldots, (A'_{j-1}, B'_{j-1})$. Let $(A'_j, B'_j) = (A_r, B_r)$. Since $s \in L_a$, we have $a'_j - b'_{j-t} < t$. Now, if $b_{r-lt} \ge b'_{j-t}$, then $a_r - b_{r-lt} < t$ which violates condition (a) of this theorem. So, $b_{r-lt} < b'_{j-t}$. Since $b_{r-lt} < b'_{j-t} < \cdots < b'_{j-2} < b'_{j-1}$, $t$ of the $lt - 1$ nets $(A_{r-lt+1}, B_{r-lt+1}), \ldots, (A_{r-1}, B_{r-1})$ have been assigned to layer $s \in L_a$. Consequently, the layers in $L_a$ account for $t|L_a|$ of these $lt - 1$ nets.
In a similar way, we can show that the remaining $l - |L_a|$ layers account for another $t(l - |L_a|)$ of these nets. This gives us a total of $tl$ nets, whereas we had only $tl - 1$. This contradiction implies that procedure Greedy cannot fail unless conditions (a) and (b) are not satisfied. $\Box$
Procedure RoundRobin is easily seen to have complexity $O(m)$. A straightforward implementation of procedure Greedy has complexity $O(ml)$. However, by using priority search trees [18] the complexity can be reduced to $O(m \log l)$. In practice, since $l$ is quite small, it is unlikely that the priority search tree implementation will run faster than the straightforward implementation in which the $l$ layers are checked in sequence. The actual routing for all $l$ layers can be done in $O(mt)$ time using the computed layer assignment and the single layer routing algorithm of Leiserson and Pinter [15].
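A minimal Python sketch of the straightforward $O(ml)$ implementation of procedure Greedy is given below. The function name `greedy`, the 0-indexed lists, and the restriction to $t \ge 1$ are assumptions of this sketch. It uses the fact that, when a net is appended to a layer that already satisfies Theorem 1, only the pair formed with the net $t$ positions earlier on that layer can introduce a violation.

```python
def greedy(a, b, l, t):
    """Assign each net to the lowest numbered layer that accepts it.

    Returns a list of layer numbers (1..l), or None if some net fits
    on no layer. Assumes t >= 1 (an assumption of this sketch).
    """
    layers = [([], []) for _ in range(l)]  # (a', b') positions per layer
    assignment = []
    for ai, bi in zip(a, b):
        for q, (la, lb) in enumerate(layers):
            n = len(la)
            # Only the newest net can create a violation: compare it
            # with the net t positions earlier on the same layer.
            if n < t or (ai - lb[n - t] >= t and bi - la[n - t] >= t):
                la.append(ai)
                lb.append(bi)
                assignment.append(q + 1)
                break
        else:
            return None  # no layer accepts this net
    return assignment


# Made-up instance: six nets, each shifted right by one grid column.
a = [1, 2, 3, 4, 5, 6]
b = [2, 3, 4, 5, 6, 7]
print(greedy(a, b, 2, 1))
print(greedy(a, b, 1, 1))
```

Each net is tested against at most $l$ layers with a constant amount of work per test, which matches the $O(ml)$ bound quoted above.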
Using Theorem 2, we can develop a linear time algorithm to determine the minimum number of tracks needed to route an instance in $l$ layers as well as to find the minimum number of layers needed for a $t$ track routing. The algorithm for the former is given in Figure 2.4. This figure also shows the changes needed in case $t$ is given and we wish to determine the minimum number of layers. The correctness of the algorithm follows from that of Theorem 2 and the fact that $t$ (or $l$) is increased only if the current $t$ ($l$) is found to be infeasible. The complexity is $O(m)$ as neither $i$ nor $t$ ($l$) can exceed $m$. So, neither clause of the if statement can be executed more than $m - 1$ times.
procedure MinimizeTracks; {or MinimizeLayers}
{ Determine the minimum number of tracks per layer (or minimum number
  of layers) needed for multilayer river routing }
begin
  i := 1;
  t := 0; {or l := 1}
  while (i <= m - lt) do
    if (a_{i+lt} - b_i < t) or (b_{i+lt} - a_i < t)
    then t := t + 1 {or l := l + 1}
    else i := i + 1;
end;
Figure 2.4. Minimizing the number of tracks or layers
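The procedure of Figure 2.4 translates almost line for line into Python. The sketch below is an illustration only: the function names, the 0-indexed lists, the explicit initialization of `i`, and the requirement $t \ge 1$ in `minimize_layers` (with $t = 0$ adding layers never changes the test) are assumptions of the sketch.

```python
def minimize_tracks(a, b, l):
    """Minimum tracks per layer for l-layer river routing (Theorem 2)."""
    m = len(a)
    i, t = 1, 0  # i is 1-indexed as in the text; lists are 0-indexed
    while i <= m - l * t:
        k = i + l * t
        if a[k - 1] - b[i - 1] < t or b[k - 1] - a[i - 1] < t:
            t += 1  # the current t is infeasible
        else:
            i += 1
    return t


def minimize_layers(a, b, t):
    """Minimum number of layers for a t-track routing; assumes t >= 1."""
    m = len(a)
    i, l = 1, 1
    while i <= m - l * t:
        k = i + l * t
        if a[k - 1] - b[i - 1] < t or b[k - 1] - a[i - 1] < t:
            l += 1  # the current l is infeasible
        else:
            i += 1
    return l


# Made-up instance: six nets, each shifted right by one grid column.
a = [1, 2, 3, 4, 5, 6]
b = [2, 3, 4, 5, 6, 7]
print(minimize_tracks(a, b, 1), minimize_tracks(a, b, 2),
      minimize_layers(a, b, 1))
```

Because the index `i` is never reset when `t` (or `l`) grows, the loop body runs a linear number of times, in line with the $O(m)$ bound stated above.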
Using the multilayer river routing results of Baratz [3], one can trivially extend all the results of Lim, Cheng and Sahni [17] to the case of multilayer joining of compacted cells. So, the multilayer minimum area join of two compacted cells with $m$ nets can be obtained in $O(m^2)$ time. If we wish to minimize the maximum wire length while keeping area minimum, the asymptotic time complexity is still $O(m^2)$. The total wire length can be minimized while keeping area minimum in $O(m^2 \log m)$ time.
In HV style routing, each routing layer is assigned a routing direction (either H or V). In an H (V) layer only horizontal (vertical) wire segments can be laid out. Horizontal segments on one layer connect to vertical segments, of the same net, on another layer by means of vias. In the case of river routing instances, one can see that there is no advantage to having more than two V-layers (i.e., two V-layers are sufficient to route all river routing instances).
15
Let RR(I, t) be the set of all river routing instances that can be routed in 1 layers, using t tracks per layer and using river routing in each layer. Let HV(I, t) be all river routing instances that can be routed using HV style routing, 1 layers, and t tracks per layer. Note that HV(I,t) includes instances routable with 0, 1, and 2 Vlayers. Let HVV(1, t) be all river routing instances using HV style routing, 1 2 Hlayers, and 2 Vlayers. Theorems 5 and 6 below hold for both the knockknee [25] and directional HV models. Theorem 7 holds only for the directional model. Theorem 5 HV(I,t) C RR(I,t) for every I > 1 and every t > 1. Proof: HV(l, t) RR(/, t) follows from a more general result obtained by Baratz [3]. Baratz [3] has shown that, for river routing instances, there is no advantage to using any routing scheme that wires a net on more than one layer. Since it is easy to construct river routing instances X such that X E RR(1, t) and X HV(l,t), it follows that HV(I, t) C RR(I, t).
We provide a simpler proof of HV(I, t) g RR(l, t). This proof will also establish our next result. We shall show that if X is a river routing instance such that X RR(/,t), then X 0 HV(/,t). Hence, HV(/,t) C RR(/,t).
Suppose that X RR(l, t). From Theorem 2, it follows that ai+t bi < t or b,+tj aj < t for some i. Suppose that a,+Ii bi < t (the proof is similar when bi+tt ai < t). So, ai+lt < bi t. Since X is a river routing instance, at least nets i + t,. ,i + It intersect a vertical cut line drawn at ai+lt. Hence, the density of X at aj+it is > i + It (i + t) + 1 = (I 1)t + 1. When HV style routing is used with
16
1, 1 > 1, layers, at most I 1 layers are available for horizontal routes. With t tracks per layer, densities of at most (I 1)t can be accommodated. So, X HV(/, t). 0 Theorem 6 HVV(l,t) C RR(l 1,t) for everyI > 2 and everyt > 1. Proof: As in Theorem 5, suppose that X 0 RR(I 1, t). Let i be such that ai+(1)t bi < t. The net density at a1+(11)t is > (1 2)t + 1. In HVV routing, two layers are Vlayers. So only I 2 layers are available for horizontal segments. This is not enough as the horizontal segment density is > (1 2)t + 1 at ai+(1)t. Hence, X HVV(!, t).
One may easily construct river routing instances that are in $RR(l - 1, t)$ but not in $HVV(l, t)$. $\Box$
Theorem 7 $RR(2, t) \not\subset HV(l, t) \cup HVV(l, t)$ for every $l > 1$ and every $t > 1$.
Proof: Consider the RR instance $(a_1, b_1) = (1, 2)$ and $(a_2, b_2) = (2, 3)$. This is in $RR(2, t)$ for every $t > 1$ but is not in $HV(l, t) \cup HVV(l, t)$ for any $l$. $\Box$
As remarked earlier, Theorem 7 holds only for the directional model. For the knock-knee model, one can show that $RR(l, t) \subset HV(l + 1, t)$ for every $l > 1$ and every $t > 1$.
2.3 Constraint Graph Representation
Lim [16] has proposed the use of a constraint graph to determine the terminal positions in a row of compacted cells. This is for the case when the number of tracks
in each routing channel is given and we wish to minimize the layout height. In the constraint graph, each cell is represented by a directed chain of vertices. Each cell terminal is represented by a vertex. The exception is when a compacted cell has terminals at the same y-position on both sides of the cell. In this case, the two terminals at the same y-position are represented by a single vertex. The vertex chain is linked in the direction of increasing y-position. The chain edges are labeled by the minimum allowable terminal separation. In addition, the constraint graph contains a source vertex that represents the bottom of the layout and a sink vertex that represents the layout top. The source vertex connects to the bottom of each chain and the top of each chain is connected to the sink vertex.
Figure 2.5(b) shows the chains (solid edges) for the four cell row of Figure 2.5(a). To complete the constraint graph, directed edges are added to introduce the channel routing constraints of Theorem 2. These are represented by the broken edges of Figure 2.5(b). Figure 2.5(b) is for the two layer case.
Lim [16] has shown that the constraint graph is acyclic provided the number of tracks in each routing channel is > 0. He has proposed handling channels with zero tracks by first finding the minimum area join of the adjacent cells (only cell stretching is permitted now) and then combining these two cells into one. I.e., the two cells are replaced by their minimum area join. This strategy can be shown to result in nonoptimality of the algorithm proposed in Lim [16]. To preserve optimality, it is necessary to merge the vertices that represent terminals that are the endpoints of
Figure 2.5. Constraint graph representation: (a) cell joining with 2 layers; (b) constraint graph representation
nets that are to be routed using no tracks as in Figure 2.6. The resultant constraint graph is also acyclic.
It is easy to see that the number of vertices and edges in the constraint graph is $O(n)$ where $n$ is the total number of terminals. Furthermore, the graph can be constructed in $O(n)$ time given the number of routing layers and the number of tracks in each channel. The constraint graph described by us is identical to that of Lim [16] except in the way channels with zero tracks are handled and in that our graph is defined for $l \geq 1$ routing layers while that of Lim [16] is only for $l = 1$.
Figure 2.6. Merge in constraint graph
The length of the longest path from the source vertex of the constraint graph to each of the remaining vertices can be computed in $O(n)$ time by doing this in topological order [14, Section 6.5]. It is easy to see that if each terminal is placed at a vertical position given by the longest path length from the source, then all nets can be routed in the given number of tracks (as the conditions of Theorem 2 are satisfied in each routing channel). Furthermore, Lim [16] has shown that such a positioning of terminals results in a stretched layout of minimum height for the given channel widths. As a result, when channel widths are known, cells can be stretched to minimize area in $O(n)$ time. The channel widths that result in minimum area can be determined in $O(n(n/c)^{c-1})$ time where $c$ is the number of cells by trying out all
procedure Heuristic1
begin
  for i := 1 to c - 1 do
  begin
    determine the minimum area join of each pair of adjacent cells;
    select the pair that has minimum area and replace it with its
    minimum area join;
  end;
end;
Figure 2.7. Heuristic 1
possible channel widths [16]. Since this is feasible only for small c, we propose several heuristics in the next section.
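The longest path computation of Section 2.3 can be illustrated with a minimal Python sketch. It assumes the constraint graph is already available as a list of weighted directed edges; the function name, the vertex numbering, and the toy graph in the usage note are illustrative assumptions and do not come from Lim [16] or from our implementation.

```python
from collections import deque

def longest_path_lengths(num_vertices, edges, source):
    """Longest path length from `source` to every vertex of a DAG.

    edges: iterable of (u, v, weight), vertices numbered 0..num_vertices-1.
    Unreachable vertices get None. Runs in O(V + E) time.
    """
    adj = [[] for _ in range(num_vertices)]
    indeg = [0] * num_vertices
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1

    # Kahn's algorithm produces a topological order.
    order = []
    queue = deque(v for v in range(num_vertices) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != num_vertices:
        raise ValueError("constraint graph has a cycle")

    # Relax edges in topological order, keeping the maximum.
    dist = [None] * num_vertices
    dist[source] = 0
    for u in order:
        if dist[u] is None:
            continue
        for v, w in adj[u]:
            if dist[v] is None or dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
    return dist
```

As a usage example, take a hypothetical graph with source 0, sink 5, two chains 1 -> 2 and 3 -> 4, and one cross edge 1 -> 4 standing in for a routing constraint: `longest_path_lengths(6, [(0, 1, 0), (0, 3, 0), (1, 2, 2), (3, 4, 3), (1, 4, 4), (2, 5, 1), (4, 5, 1)], 0)`. The entry for the sink is then the layout height for that toy instance.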
2.4 Heuristics to Minimize Area
We formulate three greedy heuristics to obtain the minimum area join of a row of c compacted cells that have a total of n terminals.
2.4.1 Heuristic 1
The heuristic is described in Figure 2.7.
In each iteration of the for loop we examine every pair of adjacent cells. For each pair, the minimum area join is found using the algorithm of Lim, Cheng and Sahni [17] extended to the multilayer case as discussed in Section 2.2. The pair which has the minimum area join is replaced by a single cell that represents this join. So following each iteration of the for loop the number of cells decreases by one. When the for loop terminates, we are left with a single cell that represents the join of all $c$ cells. The time needed to determine the minimum area join of a pair of cells with $n_i$ nets between them is $O(n_i^2)$. The time to do this for all pairs of adjacent cells is $O(\sum_i n_i^2) = O(n^2)$. So, the for loop iteration with $i = 1$ takes $O(n^2)$ time. On subsequent iterations, only the two pairs that include the cell introduced in the previous iteration need to have their minimum area join computed. Since each cell pair being considered includes at least one composite cell, the minimum area join is computed by considering the portion of the constraint graph that represents all the basic cells in the cell pair. Channel widths for channels within a composite cell are not changed while obtaining the minimum area join of the cell pair. However, as different channel widths for the channel between the two (composite) cells being joined are tried, the constraint graph is used to determine the minimum height of the combined cell. So, the time to combine two (composite) cells with $n_i$ terminals in the channel between them is $O(n n_i)$. Hence the time for the remaining $c - 2$ iterations is $O(n \sum_i n_i) = O(n^2)$. The overall complexity of Heuristic 1 is therefore $O(n^2)$. In case the terminals are uniformly distributed over the cells, $n_i = O(n/c)$ for all $i$. The time for the first iteration of the for loop is now $O(n^2/c)$ and that for each of the remaining iterations is $O(n^2/c)$. The overall time is $O(n^2)$.
2.4.2 Heuristic 2
In this heuristic, we begin by assigning each channel the number of tracks needed to route the channel with no cell stretching. This number can be determined in $O(n_i)$ time for a channel with $n_i$ nets as described in Section 2.2. The time taken to do this for all $c - 1$ channels is $O(n)$. The configuration obtained in this way is the maximum width layout. Starting from this configuration, we reduce the total number of tracks available across all $c - 1$ channels by one on each iteration. For this, the
procedure Heuristic2
begin
  for each channel determine the number of tracks, t_i,
    needed to route with no stretching, 1 <= i < c ;
  t = t_1 + t_2 + ... + t_{c-1} ;
  set up the constraint graph using t_i tracks in channel i, 1 <= i < c ;
  compute layout area, A ;
  for tracks := t downto 1 do {reduce by 1}
  begin
    for i := 1 to c - 1 do
    begin
      reduce the number of tracks in channel i by 1 ;
      modify the constraint graph to reflect this ;
      determine the length of the longest path in the graph
        and from this the layout area, a_i ;
    end ;
    select j such that a_j = min{ a_i } ;
    reduce the number of tracks in channel j by 1 ;
    A = min{ A, a_j } ;
  end
end ;
Figure 2.8. Heuristic 2
effect of a one track reduction is computed for each channel. The minimum layout
height is determined by computing the length of the longest path in the constraint
graph of Section 2.3. The track reduction is done in the channel that results in the
smallest layout height (hence the minimum area for the given number of tracks). The
algorithm is stated more formally in Figure 2.8.
When the algorithm terminates, $A$ is the area of the minimum area join found
by the heuristic. To reconstruct the layout, it is necessary to store the tracks per
channel each time $A$ is updated in the statement $A = \min\{A, a_j\}$.
For the time complexity, we see that the steps that precede the outer for loop take $O(n)$ time. Each iteration of the outer loop takes $O(nc)$ time. Hence this loop contributes a total of $O(nct)$ to the time. Since $t = O(n)$, the overall time complexity of Heuristic 2 is $O(n^2 c)$.
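The greedy loop of Heuristic 2 can be sketched in Python as follows. This is only an outline under stated assumptions: `layout_area` is a hypothetical callable that stands in for the constraint graph and longest path computation, and channels whose track count has already reached zero are skipped, a detail the figure does not spell out.

```python
def heuristic2(initial_tracks, layout_area):
    """Greedy track reduction in the spirit of Figure 2.8.

    initial_tracks: tracks per channel needed with no stretching.
    layout_area: hypothetical callable mapping a tuple of per-channel
        track counts to the layout area (in the text this value comes
        from the longest path in the constraint graph).
    Returns (best_area, best_tracks).
    """
    tracks = list(initial_tracks)
    best_area = layout_area(tuple(tracks))
    best_tracks = tuple(tracks)
    while sum(tracks) > 0:
        # Evaluate a one-track reduction in each channel that has a track.
        candidates = []
        for i, t in enumerate(tracks):
            if t > 0:
                trial = tracks[:i] + [t - 1] + tracks[i + 1:]
                candidates.append((layout_area(tuple(trial)), i))
        area, j = min(candidates)
        tracks[j] -= 1            # commit the reduction with the smallest area
        if area < best_area:      # store the tracks whenever A improves
            best_area, best_tracks = area, tuple(tracks)
    return best_area, best_tracks
```

With a toy area model such as `lambda tr: (4 + sum(tr)) * (10 + (2 - tr[0]) + 5 * (1 - tr[1]))` and initial tracks `(2, 1)`, the sketch walks through the totals 2, 1, 0 and reports the best configuration seen on the way.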
2.4.3 Heuristic 3
Unlike Heuristic 2 which attempts to minimize the layout height for each value of $t$, the total number of tracks, Heuristic 3 attempts to minimize the width (i.e., total number of tracks) for each choice of layout height. The heuristic begins with a layout height, $ht$, equal to the height of the tallest compacted cell. At each iteration, the next layout height to use is computed as described later. During each iteration, cells are combined in groups of at most $k$ ($k > 1$ is a parameter to the heuristic). Each group of combined cells is replaced by its minimum area join subject to the constraint that the height of the join does not exceed $ht$. This joining of $\leq k$ cells at a time continues until only one cell remains. Its area is computed and recorded. The minimum area obtained over all heights tried is then reported as the best. Heuristic 3 is given in Figure 2.9.
In our implementation of Heuristic 3, the minimum area join of $k$ cells is found by considering the portion of the constraint graph for all the basic cells included in these $k$ cells. So, for this purpose composite cells are not handled as single cells. Rather, as in Heuristic 1, the basic cells they are composed of are considered and channel widths previously assigned to the associated channels are not changed. Track assignment is done only for the $k - 1$ channels between the $k$ composite cells. We found this to give better results than when composite cells were regarded as atomic. For the case $k = 2$, the minimum area is determined by a binary search over the
procedure Heuristic3
begin
  ht := height of the tallest cell;
  repeat {minimize width subject to height <= ht}
    repeat {do this by combining k cells at a time}
      select k adjacent cells such that the minimum height cell is selected
        and the height of the tallest selected cell is minimum
        (if there are fewer than k cells, then select all of them);
      obtain the minimum area layout for the selected cells under
        the constraint that the layout height does not exceed ht;
      during the preceding step record the next value of ht
        that is possible for a layout;
    until one cell remains;
    compute the area of the remaining cell and record it
      if it is less than the minimum area found so far;
    if there is no next height then terminate;
    ht := next height;
  until false
end ;
Figure 2.9. Heuristic 3
number of tracks in the single channel. This takes $O(n \log n_i)$ time where $n_i$ is the
number of nets in channel $i$. Thus the time needed for the inner repeat loop when
$k = 2$ is $O(cn \log n)$ (for uniform terminal distribution it is $O(cn \log(n/c))$). During
the binary search, the heights corresponding to channel widths that require height
$> ht$ are recorded. The minimum of these heights yields the next value of $ht$.
When $k > 2$, all track combinations for the $k - 1$ channels are tried as in
Section 2.3. Again, each composite cell is broken up into its basic cells. As different
track combinations are tried, we record the minimum height $> ht$ that results from
any track combination. This gives the next value of $ht$. The time for the inner
repeat loop is $O((c/(k-1))\, n (n/k)^{k-1})$ (or $O((c/(k-1))\, n (n/c)^{k-1})$ when terminals
are uniformly distributed).
In all our experiments, the outer repeat loop was iterated fewer than $(k - 1)n$ times. To ensure that the number of iterations is $O(kn)$, one may adopt the following scheme. When the number of iterations first reaches $(k - 1)n$, compute a set of at most $n$ new heights by beginning with the current constraint graph. This uses the current assignment of number of tracks in each channel. Heuristic 2 is next used to reduce the total number of available tracks by one and determine the height needed to complete the routing with the reduced number of tracks. This process gives us at most $n$ new heights $h_1 < h_2 < \cdots < h_p$. Heuristic 3 is now resumed with $h_1$ as the next height. Only two iterations are performed. Then Heuristic 3 is resumed with $\max\{h_2, ht\}$ as the next height. Again two iterations of the outer repeat loop are done. Next the heuristic is resumed with $\max\{h_3, ht\}$ as the next height. This continues until we have gone through $p$ resumptions of the heuristic. With this scheme to limit the number of iterations, the complexity of Heuristic 3 becomes $O(cn^2 \log n)$ when $k = 2$ and $O((c/(k-1))\, k n^2 (n/k)^{k-1}) = O(cn^2 (n/k)^{k-1})$ when $k > 2$. For the case when the $n$ terminals are uniformly distributed over the $c$ cells, the complexity is $O(cn^2 \log(n/c))$ when $k = 2$ and $O((c/(k-1))\, k n^2 (n/c)^{k-1}) = O(cn^2 (n/c)^{k-1})$ when $k > 2$. One may verify that since Heuristic 3 tries the maximum useful height (i.e., the height needed when no routing tracks are available), it generates optimal solutions when $k = c$.
2.5 Experimental Results
We programmed our three heuristics as well as the heuristic of Fang [8] in C and ran tests on a single KSR processor. Optimal solutions for instances with up to nine
cells were obtained using the corrected version of the exhaustive search algorithm of Lim [16]. Our test set consisted of instances that had a number of cells, $c$, equal to one of the numbers in the set $\{3, \ldots, 9, 10, 20, 50, 100\}$. For each value of $c$, there were twenty instances and the results were averaged over these instances. An instance with $c$ cells had $c - 1$ routing channels. The number, $t$, of terminals on either side of each routing channel was equal to $c$ for $3 \leq c \leq 9$ and was 10 for the other values of $c$. In addition, when $c = 100$, we also had instances with 20 terminals on either side. In our experiments, we considered only single layer and two layer routing.
Table 2.1 gives the average percentage by which the area of the single layer solutions generated by each of the heuristics exceeded the area of the single layer optimal solution. As is evident, each of the heuristics proposed in this chapter gave noticeably better solutions than did Fang. This table is only for the cases $3 \leq c \leq 9$ as for $c > 9$ the optimal algorithm of Lim [16] required too much time to complete. When $k \geq c$, Heuristic 3 is guaranteed to generate an optimal solution. So, we did not run these cases.
In table 2.2, we have used the single layer solution produced by Fang as the benchmark against which the solutions obtained by our three heuristics are compared. This table gives the average percentage by which the area of the solutions produced by our heuristics is less than that of the solutions produced by Fang. Our solutions have area 9 to 18% less.
Table 2.3 compares the computing time requirements of the various algorithms for the case of one layer. The optimal algorithm is useful only for small values of c
Table 2.1. Error rate (%) over optimal, l = 1
cells   t    Fang   Heuristic 1   Heuristic 2   Heuristic 3
                                                k = 2   k = 3   k = 4
  3     3     5.9       0.5           0.2        0       *       *
  4     4    10.0       0.9           0.1        0       0       *
  5     5    11.0       3.7           0.3        0.3     0.1     0.1
  6     6    12.7       2.4           0.3        0.2     0.2     0.0
  7     7    16.1       3.5           0.3        0.1     0.1     0.1
  8     8    17.9       2.7           0.4        0.3     0.3     0.1
  9     9    18.8       3.5           0.3        0.4     0.3     0.
t = number of terminals on each side of each routing channel
* : k >= c
Table 2.2. Improvement (%) over Fang, l = 1
cells   t    Heuristic 1   Heuristic 2   Heuristic 3
                                         k = 2   k = 3   k = 4
  10    10       9.4          14.0       14.0    14.2    14.3
  20    10      10.5          15.4       15.4    15.6    15.6
  50    10      10.6          16.1       16.1    16.2     -
 100    10       9.1          16.5       16.5    16.6     -
 100    20       9.3          18.3       18.0     -       -
- : excessive run time
Table 2.3. Time taken, l = 1
cells   t    Fang   Heuristic 1   Heuristic 2   Heuristic 3             Optimal
                                                k = 2   k = 3   k = 4
  3     3    0.0       0.01          0.02       0.02    *       *       0.01
  4     4    0.0       0.02          0.02       0.04    0.09    *       0.05
  5     5    0.0       0.03          0.04       0.09    0.36    1.2     0.77
  6     6    0.0       0.03          0.08       0.21    0.66    3.7     13
  7     7    0.0       0.04          0.14       0.50    3.4     27      278
  8     8    0.0       0.07          0.24       0.84    4.4     34      1.8†
  9     9    0.0       0.09          0.42       1.8     16      72.3    47†
 10    10    0.0       0.14          0.78       3.1     19      334
 20    10    0.01      0.32          5.8        16      117     1196
 50    10    0.01      1.4           93         127     896
100    10    0.03      4.4           782        590     4873
100    20    0.06      11            3022       4087
Times are in seconds.
† : Times are in hours.
(say up to 7). While Fang is significantly faster than the heuristics proposed here, the quality of the solutions generated by our heuristics is superior.
Table 2.4 is the analog of table 2.1 for the case of two layers. Again, our heuristics performed considerably better than did Fang. Table 2.5 gives the improvement in area due to increasing the number of routing layers from one to two. This is influenced somewhat by the width of cells which in our case ranged from 5 to 30 times the track separation. With narrower cells, the impact of the second layer would have been greater and with wider cells, it would have been less. Also, the impact of the second layer is more when more routing tracks are needed. For the smaller instances of table 2.4, for example, the optimal solutions with $l = 2$ required, on average, only 1.8% less area than when $l = 1$.
Table 2.4. Error rate (%) over optimal, l = 2
cells   t    Fang   Heuristic 1   Heuristic 2   Heuristic 3
                                                k = 2   k = 3   k = 4
  3     3     6.3       0.5           0          0       *       *
  4     4     9.8       1.1           0.1        0       0       *
  5     5    11.0       2.7           0.1        0.2     0.1     0.0
  6     6    13.0       1.4           0.1        0.0     0.0     0.1
  7     7    15.9       2.3           0.1        0       0       0
  8     8    16.8       1.7           0.2        0       0.1     0.0
  9     9    18.4       2.2           0.1        0.1     0.1     0.0
Table 2.5. Improvement (%) over l = 1 cases
cells   t    Fang   Heuristic 1   Heuristic 2   Heuristic 3
                                                k = 2   k = 3   k = 4
  10    10    4.1       4.7           3.3        3.3     3.0     3.0
  20    10    4.1       5.8           3.4        3.4     3.3     3.2
  50    10    4.6       5.2           3.4        3.2     3.2
 100    10    4.7       6.3           3.5        3.4     3.4
 100    20    7.1      10.4           4.8        5.2
Table 2.6. Improvement (%) over Fang, l = 2
cells   t    Heuristic 1   Heuristic 2   Heuristic 3
                                         k = 2   k = 3   k = 4
  10    10      10.0          13.3       13.2    13.2    13.3
  20    10      12.2          14.8       14.8    14.9    14.8
  50    10      11.2          15.1       14.9    15.0
 100    10      10.7          15.4       15.4    15.4
 100    20      12.5          16.3       16.3
Table 2.6 is the analog of table 2.2 for the case of two layers. The results are similar to those in table 2.2. Table 2.7 gives the average computing times for the two layer instances. These are less than for the one layer case as the constraint graph has fewer edges.
For large c, we recommend the use of heuristic 2 or 3 (with k = 2) and for small c we recommend using heuristic 3 (with k = 3 or 4).
2.6 Conclusion
We have considered the problem of joining a row of compacted cells and developed heuristics to stretch cells and river-route the nets so that the layout area is minimized. Our proposed heuristics were compared, experimentally, with Fang [8] and found to produce layouts with less area. However, Fang is faster. We recommend the use of our Heuristic 3 with $k = 3$ or 4 in practice.
Table 2.7. Time taken, l = 2
cells   t    Fang   Heuristic 1   Heuristic 2   Heuristic 3             Optimal
                                                k = 2   k = 3   k = 4
  3     3    0.0       0.01          0.02       0.02    *       *       0.0
  4     4    0.0       0.02          0.02       0.03    0.06    *       0.04
  5     5    0.0       0.02          0.02       0.07    0.25    0.75    0.60
  6     6    0.0       0.02          0.04       0.13    0.38    2.0     11
  7     7    0.0       0.04          0.06       0.29    1.9     14      223
  8     8    0.0       0.06          0.12       0.53    2.5     19      1.5†
  9     9    0.0       0.06          0.20       0.96    7.5     34      39†
 10    10    0.01      0.09          0.32       1.7     9.9     153
 20    10    0.01      0.24          2.7        9.9     6.5     651
 50    10    0.01      1.1           42         85      586
100    10    0.02      3.7           357        472     3069
100    20    0.05      7.9           1377       3166
Times are in seconds.
† : Times are in hours.
CHAPTER 3
A NEW WEIGHT BALANCED BINARY SEARCH TREE
3.1 Introduction
A dictionary is a set of elements on which the operations of search, insert, and delete are performed. Many data structures have been proposed for the efficient representation of a dictionary [14]. These include direct addressing schemes such as hash tables and comparison schemes such as binary search trees, AVL-trees, red/black trees [12], trees of bounded balance [21], treaps [1], deterministic skip lists [20], and skip lists [26]. Of these schemes, AVL-trees, red/black trees, and trees of bounded balance (WB($\alpha$)) are balanced binary search trees. When representing a dictionary with $n$ elements, using one of these schemes, the corresponding binary search tree has height $O(\log n)$ and individual search, insert, and delete operations take $O(\log n)$ time. When (unbalanced) binary search trees, treaps, or skip lists are used, each operation has an expected complexity of $O(\log n)$ but the worst case complexity is $O(n)$. When hash tables are used, the expected complexity is $O(1)$ per operation. However, the worst case complexity is $O(n)$. So, in applications where a worst case complexity guarantee is critical, one of the balanced binary search tree schemes is to be preferred.
In this chapter, we develop a new balanced binary search tree called $\beta$-BBST ($\beta$-balanced binary search tree). Like WB($\alpha$) trees, this achieves balancing by controlling the relative number of nodes in each subtree. However, unlike WB($\alpha$) trees, during insert and delete operations, rotations are performed along the search path whenever they reduce the internal path length of the tree (rather than only when a subtree is out of balance). As a result, the constructed trees are expected to have a smaller internal path length than the corresponding WB($\alpha$) tree. Since the average search time is closely related to the internal path length, the time needed to search in a $\beta$-BBST is expected to be less than that in a WB($\alpha$) tree.
In Section 3.2, we define the total search cost of a binary search tree and show that the rebalancing rotations performed in AVL and red/black trees might increase this metric. We also show that while similar rotations in WB($\alpha$) trees do not increase this metric, insert and delete operations in WB($\alpha$) trees do not avail of all opportunities to reduce the metric. In Section 3.3, we define $\beta$-BBSTs and show their relationship to WB($\alpha$) trees. Search, insert, and delete algorithms for $\beta$-BBSTs are developed in Section 3.4. A simplified version of $\beta$-BBSTs is developed in Section 3.5. Search, insert and delete operations for this version also take $O(\log n)$ time each. An even simpler version of $\beta$-BBSTs is developed in Section 3.6. For this version, we show that the average cost of an insert and search operation is $O(\log n)$ provided no deletes are performed.
An experimental evaluation of $\beta$-BBSTs and competing schemes for dictionaries (AVL, red/black, skip lists, etc.) was done and the results of this are presented in Section 3.7. This section also compares the relative performance of $\beta$-BBSTs and the two simplified versions of Sections 3.5 and 3.6.
3.2 Balanced Trees and Rotations
Following an insert or delete operation in a balanced binary search tree (e.g., AVL, red/black, WB($\alpha$), etc.), it may be necessary to perform rotations to restore balance. The rotations are classified as LL, RR, LR, and RL [14]. LL and RR rotations as well as LR and RL rotations are symmetric. While the conditions under which the rotations are performed vary with the class of balanced tree considered, the node movement patterns are the same. Figure 3.1 shows the transformation performed by an LL and an LR rotation. In this figure, nodes whose subtrees have changed as a result of the rotation are designated by a prime. So, $p'$ is the original node $p$; however, its subtrees are different.
Let $h(x)$ be the height of the subtree with root $x$. Let $s(x)$ be the number of nodes in this subtree. When searching for an element $x$, $x$ is compared with one element at each of $l(x)$ levels, where $l(x)$ is the level at which $x$ is present (the root is at level 1). So, one measure of the "goodness" of the binary search tree, $T$, for search operations (assuming each element is searched for with equal probability) is its total search cost defined as:
$C(T) = \sum_{x \in T} l(x)$.
Figure 3.1. LL and LR rotations: (a) LL rotation; (b) LR rotation
Notice that C(T) = I(T) + n where I(T) is the internal path length of T and n is the number of elements/nodes in T. The cost of unsuccessful searches is equal to the external path length E(T). Since E(T) = I(T) + 2n, minimizing C(T) also minimizes E(T).
Total search cost is important as this is the dominant operation in a dictionary (note that insert can be modeled as an unsuccessful search followed by the insertion of a node at the point where the search terminated and deletion can be modeled by
a successful search followed by a physical deletion; both operations are then followed by a rebalancing/restructuring step).
Observe that in an actual implementation of the search operation in programming languages such as C++, C, and Pascal, the search for an $x$ at level $l(x)$ will involve up to two comparisons at each of the levels $1, 2, \ldots, l(x)$. If the code first checks $x = e_i$, where $e_i$ is the element at level $i$ to be compared, and then $x < e_i$ to decide whether to move to the left or right subtree, then the number of element comparisons is exactly $2l(x) - 1$. In this case, the total number of element comparisons is
$NC(T) = 2 \sum_{x \in T} l(x) - n = 2C(T) - n$
and minimizing $C(T)$ also minimizes $NC(T)$. If the code first checks $x < e_i$ and then $x = e_i$ (or $> e_i$), the number of element comparisons done to find $x$ is $l(x) + r(x) + 1$ where $r(x)$ is the number of right branches on the path from the root to $x$. The total number of comparisons is bounded by $2C(T)$. For simplicity, we use $C(T)$ to motivate our data structure.
In an AVL tree, when an LL rotation is performed, $h(q) = h(c) + 1 = h(d) + 1$ (see Figure 3.1(a)). At this time, the balance factor at $gp$ is $h(p) - h(d) = 2$. The rotation restores height balance which is necessary to guarantee $O(\log n)$ search, insert, delete operations in an $n$ node AVL tree. The rotation may, however, increase the total search cost. To see this, notice that an LL rotation affects the level numbers of only those nodes that are in the subtree with root $gp$ prior to the rotation. We see that $l(q') = l(q) - 1$, $l(p') = l(p) - 1$, $l(gp') = l(gp) + 1$, the total search cost of the subtree with root $a$ is decreased by $s(a)$ as a result of the rotation, etc. Hence, the increase in $C(T)$ due to the rotation is:
$l(p') - l(p) + l(q') - l(q) + l(gp') - l(gp) - s(a) - s(b) + s(d)$
$= -1 - 1 + 1 - s(q) + 1 + s(d) = s(d) - s(q)$.
A similar analysis shows that an LR rotation increases $C(T)$ by $s(d) - s(q)$.
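The $s(d) - s(q)$ expression can be checked on small trees. The following Python sketch is illustrative only: the `Node` class, the helper names, and the chain-shaped test subtrees are assumptions made for the example, and keys are omitted because only the tree shape affects $C(T)$.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def size(t):
    """s(t): number of nodes in the subtree rooted at t."""
    return 0 if t is None else 1 + size(t.left) + size(t.right)

def total_search_cost(t, level=1):
    """C(T): sum of the levels of all nodes, with the root at level 1."""
    if t is None:
        return 0
    return (level
            + total_search_cost(t.left, level + 1)
            + total_search_cost(t.right, level + 1))

def rotate_ll(gp):
    """LL rotation: p, the left child of gp, becomes the subtree root."""
    p = gp.left
    gp.left, p.right = p.right, gp
    return p

def chain(n):
    """A left-skewed chain with n nodes (a hypothetical test shape)."""
    t = None
    for _ in range(n):
        t = Node(left=t)
    return t

def ll_cost_change(q, c, d):
    """Change in C(T) when gp(p(q, c), d) is LL-rotated."""
    s_q, s_d = size(q), size(d)
    gp = Node(left=Node(left=q, right=c), right=d)
    before = total_search_cost(gp)
    after = total_search_cost(rotate_ll(gp))
    return after - before, s_d - s_q
```

For instance, `ll_cost_change(chain(3), None, chain(1))` returns the measured change together with $s(d) - s(q)$, and the two values agree.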
If the LL rotation was triggered by an insertion, $s(q)$ is at least one more than the minimum number of nodes in an AVL tree of height $t = h(q) - 1$. So, $s(q) \geq \phi^{t+2}/\sqrt{5}$ where $\phi = (1 + \sqrt{5})/2$. The maximum value for $s(d)$ is $2^t - 1$. So, an LL rotation has the potential of increasing total search cost by as much as
$2^t - 1 - \phi^{t+2}/\sqrt{5} \approx 2^t - 1 - 1.62^{t+2}/2.24$.
This is negative for $t \leq 2$ and positive for $t > 2$. When $t = 10$, for example, an LL rotation may increase total search cost by as much as 877. As $t$ gets larger, the potential increase in search cost gets much greater. This analysis is easily extended to the remaining rotations and also to red/black trees.
Definition (WB($\alpha$) [21]) The balance, $B(p)$, of a node $p$ in a binary tree is the ratio $(s(l) + 1)/(s(p) + 1)$ where $l$ is the left child of $p$. For $\alpha \in [0, 1/2]$, a binary tree $T$ is in WB($\alpha$) iff $\alpha \leq B(p) \leq 1 - \alpha$ for every node $p$ in $T$. By definition, the empty tree is in WB($\alpha$) for all $\alpha$.
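The bound on the search cost increase caused by an LL rotation in an AVL tree is easy to tabulate. The short Python sketch below evaluates both the exact expression and the variant with the rounded constants 1.62 and 2.24; the function names are assumptions made for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def max_ll_increase(t):
    """2^t - 1 - phi^(t+2)/sqrt(5), the bound on the increase in C(T)."""
    return 2 ** t - 1 - PHI ** (t + 2) / math.sqrt(5)

def max_ll_increase_rounded(t):
    """Same bound with phi ~ 1.62 and sqrt(5) ~ 2.24."""
    return 2 ** t - 1 - 1.62 ** (t + 2) / 2.24

if __name__ == "__main__":
    for t in range(0, 11):
        print(t, round(max_ll_increase(t), 2), round(max_ll_increase_rounded(t), 2))
```

The rounded constants account for the figure quoted for $t = 10$; the unrounded expression comes out slightly larger.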
Lemma 1 (1) The maximum height, $h_{max}(n)$, of an $n$ node tree in WB($\alpha$) is $\sim \log_{1/(1-\alpha)}(n + 1)$ [21].
(2) Inserts and deletes can be performed in an $n$ node tree in WB($\alpha$) in $O(\log n)$ time for $2/11 < \alpha \leq 1 - \sqrt{2}/2$ [4].
(3) Each search operation in an $n$ node tree in WB($\alpha$) takes $O(\log n)$ time [21].
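In the case of weight balanced trees WB($\alpha$), an LL rotation is performed when $B(gp) \approx 1 - \alpha$ and $B(p) \geq \alpha/(1 - \alpha)$ (see Figure 3.1(a)) [21]. So,
$1 - \alpha \approx \frac{s(p) + 1}{s(gp) + 1} = \frac{s(p) + 1}{s(p) + s(d) + 2}$
or
$s(d) \approx \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$
and
$\frac{\alpha}{1 - \alpha} \leq B(p) = \frac{s(q) + 1}{s(p) + 1}$
or
$s(q) \geq \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$.
Comparing the two expressions, $s(q)$ is at least (approximately) $s(d)$, so the change $s(d) - s(q)$ is not positive. So, LL rotations (and also RR) do not increase the search cost. For LR rotations [21], $B(gp) \approx 1 - \alpha$ and $B(p) < \alpha/(1 - \alpha)$. So, $s(d) \approx \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$ and with respect to Figure 3.1(b),
$\frac{\alpha}{1 - \alpha} > B(p) = \frac{s(p) - s(q)}{s(p) + 1}$
or
$s(q) > \frac{1 - 2\alpha}{1 - \alpha}\, s(p) - \frac{\alpha}{1 - \alpha}$.
For $\alpha \leq 1/3$, $s(q) > s(d)$ and LR (RL) rotations do not increase search cost. Thus, in the case of WB($\alpha$) trees, the rebalancing rotations do not increase search cost. This statement remains true if the conditions for LL and LR rotation are changed to those in Blum and Mehlhorn [4].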
While rotations do not increase the search cost of WB($\alpha$) trees, these trees miss performing some rotations that would reduce search cost. For example, it is possible to have $\alpha < B(gp) < 1 - \alpha$, $B(p) > \alpha$, and $s(q) > s(d)$. Since $B(gp)$ isn't high enough, an LL rotation isn't performed. Yet, performing such a rotation would reduce search cost.
3.3 $\beta$-BBSTs
Definition A cost optimized search tree (COST) is a binary search tree whose search cost cannot be reduced by performing a single LL, RR, LR, or RL rotation.
Theorem 8 If $T$ is a COST with $n$ nodes, its height is at most $\log_\phi(\sqrt{5}(n + 1)) - 2$.
Proof Let $N_h$ be the minimum number of nodes in a COST of height $h$. Clearly, $N_0 = 0$ and $N_1 = 1$. Consider a COST $Q$ of height $h \geq 2$ having the minimum number of nodes $N_h$. $Q$ has one subtree $R$ whose height is $h - 1$ and another, $S$, whose height is $\leq h - 1$. $R$ must be a minimal COST of height $h - 1$ and so has $N_{h-1}$ nodes. $R$, in turn, must have one subtree, $U$, of height $h - 2$ and another, $V$, of height $\leq h - 2$. Both $U$ and $V$ are COSTs as $R$ is a COST. Since $R$ is a minimal COST, $U$ is a minimal COST of height $h - 2$ and so has $N_{h-2}$ nodes. Since $Q$ is a COST, $|S| \geq \max\{|U|, |V|\}$. We may assume that $N_h$ is a nondecreasing function of $h$. So, $|S| \geq N_{h-2}$. Since $Q$ is a minimal COST of height $h$, $|S| = N_{h-2}$. So,
$N_h = N_{h-1} + N_{h-2} + 1$, $h \geq 2$
$N_0 = 0$, $N_1 = 1$.
This recurrence is the same as that for the minimum number of nodes in an AVL tree of height $h$. So, $N_h = F_{h+2} - 1$ where $F_i$ is the $i$'th Fibonacci number. Consequently, $N_h \approx \phi^{h+2}/\sqrt{5} - 1$ and $h \leq \log_\phi(\sqrt{5}(n + 1)) - 2$. $\Box$
Corollary 1 The maximum height of a COST with n nodes is the same as that of an AVL tree with this many nodes.
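The recurrence in the proof of Theorem 8 and its Fibonacci closed form can be checked numerically. The sketch below is a plain Python illustration with assumed function names; note that, because the closed form uses the approximation $F_i \approx \phi^i/\sqrt{5}$, the height expression evaluated at $n = N_h$ matches $h$ only up to a small rounding term.

```python
import math

def min_nodes(h):
    """N_h from N_h = N_{h-1} + N_{h-2} + 1 with N_0 = 0, N_1 = 1."""
    a, b = 0, 1  # N_0, N_1
    for _ in range(h):
        a, b = b, a + b + 1
    return a

def fib(i):
    """F_i with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a

def height_bound(n):
    """log_phi(sqrt(5) (n + 1)) - 2."""
    phi = (1 + math.sqrt(5)) / 2
    return math.log(math.sqrt(5) * (n + 1), phi) - 2

if __name__ == "__main__":
    for h in range(1, 11):
        n = min_nodes(h)
        print(h, n, fib(h + 2) - 1, round(height_bound(n), 3))
```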
Definition Let $a$ and $b$ be the roots of two binary trees. $a$ and $b$ are $\beta$-balanced, $0 \leq \beta \leq 1$, denoted $\beta(a, b)$, iff
(a) $\beta(s(a) - 1) \leq s(b)$
(b) $\beta(s(b) - 1) \leq s(a)$
A binary tree $T$ is $\beta$-balanced iff the children of every node in $T$ are $\beta$-balanced.
A full binary tree is 1-balanced and a binary tree whose height equals its size (i.e., number of nodes) is 0-balanced.
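The definition translates directly into a recursive check. The following Python sketch assumes a bare `Node` class without keys and uses exact rational arithmetic for $\beta$; the helper names and the two test shapes (a full tree and a chain) are illustrative assumptions.

```python
from fractions import Fraction

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def size(t):
    return 0 if t is None else 1 + size(t.left) + size(t.right)

def pair_balanced(a, b, beta):
    """beta(a, b): beta (s(a) - 1) <= s(b) and beta (s(b) - 1) <= s(a)."""
    sa, sb = size(a), size(b)
    return beta * (sa - 1) <= sb and beta * (sb - 1) <= sa

def is_beta_balanced(t, beta):
    """True iff the children of every node of t are beta-balanced."""
    if t is None:
        return True
    return (pair_balanced(t.left, t.right, beta)
            and is_beta_balanced(t.left, beta)
            and is_beta_balanced(t.right, beta))

def full_tree(h):
    """Full binary tree of height h."""
    return None if h == 0 else Node(full_tree(h - 1), full_tree(h - 1))

def chain(n):
    """Tree whose height equals its size n."""
    t = None
    for _ in range(n):
        t = Node(left=t)
    return t
```

For example, `is_beta_balanced(full_tree(4), 1)` and `is_beta_balanced(chain(5), 0)` both hold, while `is_beta_balanced(chain(5), Fraction(1, 2))` does not.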
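Lemma 2 If the binary tree $T$ is $\beta$-balanced, then it is $\gamma$-balanced for $0 \leq \gamma \leq \beta$.
Proof Follows from the definition of balance. $\Box$
Lemma 3 If the binary tree $T$ is $\beta$-balanced, $0 < \beta \leq 1/2$, then it is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$.
Proof Consider any node $p$ in $T$. Let $l$ and $r$ be node $p$'s left and right children.
$B(p) = \frac{s(l) + 1}{s(l) + s(r) + 2} = \frac{1}{1 + \frac{s(r) + 1}{s(l) + 1}}$
Since $T$ is $\beta$-balanced, $s(l) - 1 \leq s(r)/\beta$ or $s(l) + 1 \leq s(r)/\beta + 2$. So,
$\frac{s(l) + 1}{s(r) + 1} \leq \frac{1}{\beta} + \frac{2 - 1/\beta}{s(r) + 1} \leq \frac{1}{\beta}$
(the last step uses $\beta \leq 1/2$) or
$\frac{s(r) + 1}{s(l) + 1} \geq \beta$.
So, $B(p) \leq 1/(1 + \beta)$. Further, $s(r) - 1 \leq s(l)/\beta$. So,
$\frac{s(r) + 1}{s(l) + 1} \leq 1/\beta$.
And, $B(p) \geq 1/(1 + 1/\beta) = \beta/(1 + \beta)$. Hence $\beta/(1 + \beta) \leq B(p) \leq 1/(1 + \beta)$ for every $p$ in $T$. So, $T$ is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$. $\Box$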
Figure 3.2. A tree in WB(1/4) that is not 1/3-balanced
Remark While every $\beta$-balanced tree, $0 < \beta \leq 1/2$, is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$, there are trees in WB($\alpha$) that are not $\beta$-balanced. Figure 3.2 shows an example of a tree in WB(1/4) that is not 1/3-balanced.
Lemma 4 If $T$ is a COST then $T$ is $\frac{1}{2}$-balanced.
Proof If $T$ is a COST, then every subtree of $T$ is a COST. Consider any subtree with root $p$, left child $l$, and right child $r$. If neither $l$ nor $r$ exist, then $s(l) = s(r) = 0$ and $p$ is $\frac{1}{2}$-balanced. If $s(l) = 0$ and $s(r) > 1$, then $r$ has a nonempty subtree with root $t$ and $s(t) > s(l)$. So $p$ is not a COST. Hence, $s(r) \leq 1$ and $p$ is $\frac{1}{2}$-balanced. The same is true when $s(r) = 0$. So, assume $s(l) > 0$ and $s(r) > 0$.
If $s(l) = 1$, then $s(r) \leq 3$ as otherwise, one of the subtrees of $r$ has $m \geq 2$ nodes and $m > s(l)$ implies $p$ is not a COST. Since $s(r) \leq 3$, $\frac{1}{2}(s(r) - 1) \leq s(l)$ and $\frac{1}{2}(s(l) - 1) \leq s(r)$. So, $p$ is $\frac{1}{2}$-balanced. The same proof applies when $s(r) = 1$. When $s(l) > 1$ and $s(r) > 1$, let $a$ and $b$ be the roots of the left and right subtrees of $l$. Since $p$ is a COST, $s(a) \leq s(r)$ and $s(b) \leq s(r)$. So, $s(l) = s(a) + s(b) + 1 \leq 2s(r) + 1$ and $\frac{1}{2}(s(l) - 1) \leq s(r)$. Similarly, $\frac{1}{2}(s(r) - 1) \leq s(l)$. So, $\frac{1}{2}(l, r)$. Since this proof applies to every node in $T$, the children of every $p$ are $\frac{1}{2}$-balanced and $T$ is $\frac{1}{2}$-balanced. $\Box$
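Lemma 4 can be sanity checked by brute force on small trees. The Python sketch below enumerates all binary tree shapes with up to eight nodes. It characterizes a COST through the $s(d) - s(q)$ expression of Section 3.2, i.e., as a tree in which no grandchild subtree is larger than the sibling of its parent; this characterization, the tuple encoding of trees, and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def shapes(n):
    """All binary tree shapes with n nodes; a tree is None or (left, right)."""
    if n == 0:
        return (None,)
    return tuple((l, r)
                 for k in range(n)
                 for l in shapes(k)
                 for r in shapes(n - 1 - k))

def size(t):
    return 0 if t is None else 1 + size(t[0]) + size(t[1])

def is_cost(t):
    """No single LL, RR, LR or RL rotation lowers the total search cost."""
    if t is None:
        return True
    for child, uncle in ((t[0], t[1]), (t[1], t[0])):
        if child is not None:
            if max(size(child[0]), size(child[1])) > size(uncle):
                return False
    return is_cost(t[0]) and is_cost(t[1])

def is_half_balanced(t):
    """True iff the children of every node are 1/2-balanced."""
    if t is None:
        return True
    sl, sr = size(t[0]), size(t[1])
    half = Fraction(1, 2)
    return (half * (sl - 1) <= sr and half * (sr - 1) <= sl
            and is_half_balanced(t[0]) and is_half_balanced(t[1]))

def lemma4_holds_up_to(max_nodes):
    return all(is_half_balanced(t)
               for n in range(max_nodes + 1)
               for t in shapes(n) if is_cost(t))
```

Calling `lemma4_holds_up_to(8)` exercises the lemma on every shape of that size. The same enumeration also finds shapes that are 1/2-balanced without being COSTs, so the converse of the lemma fails.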
Figure 3.3. A $\frac{1}{2}$-balanced tree that is not a COST
Remark There are $\frac{1}{2}$-balanced trees that are not COSTs (see Figure 3.3).
Since a COST is in WB(1/3) and WB($\alpha$) trees can be maintained efficiently only for $2/11 < \alpha \le 1 - 1/\sqrt{2} \approx 0.293$, a COST is better balanced than WB($\alpha$) trees with $\alpha$ in the usable range. Unfortunately, we are unable to develop O(log n) insert/delete algorithms for a COST.
In the next section, we develop insert and delete algorithms for $\beta$-balanced binary search trees ($\beta$-BBSTs) for $0 \le \beta \le \sqrt{2} - 1$. Note that every $(\sqrt{2} - 1)$-BBST is in WB($\alpha$) for $\alpha = 1 - 1/\sqrt{2}$, which is the largest permissible $\alpha$. Since our insert and delete algorithms perform rotations along the search path whenever these result in improved search cost, $\beta$-BBSTs are expected to have better search performance than WB($\alpha$) trees (for $\alpha = \beta/(1+\beta)$).
Each node of a $\beta$-BBST has the fields LeftChild, Size, Data, and RightChild. Since every $\beta$-BBST, $\beta > 0$, is in WB($\alpha$) for some $\alpha > 0$, $\beta$-BBSTs have height that is logarithmic in n, the number of nodes (provided $\beta > 0$).
3.4 Search, Insert, and Delete in a $\beta$-BBST
To reduce notational clutter, in the rest of the chapter, we abbreviate s(a) by a (i.e., the node name denotes subtree size).
3.4.1 Search
This is done exactly as in any binary search tree. Its complexity is O(h) where h is the height of the tree. Notice that since each node has a size field, it is easy to perform a search based on index (i.e., find the 10th smallest key). Similarly, our insert and delete algorithms can be adapted to indexed insert and delete.
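The indexed search mentioned above can be sketched as follows. The `Node` class, the unbalanced `insert_plain` helper, and `kth_smallest` are hypothetical names introduced for illustration; only the idea of steering by the Size field comes from the text, and no rebalancing is performed here.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1          # number of nodes in the subtree rooted here

def size(t):
    return t.size if t is not None else 0

def insert_plain(root, key):
    """Unbalanced BST insert (distinct keys) that keeps size fields current."""
    if root is None:
        return Node(key)
    node = root
    while True:
        node.size += 1                      # the new key lands below this node
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def kth_smallest(root, k):
    """Return the k-th smallest key (k = 1 is the minimum)."""
    node = root
    while node is not None:
        left = size(node.left)
        if k <= left:
            node = node.left                # answer lies in the left subtree
        elif k == left + 1:
            return node.key                 # this node is the k-th key
        else:
            k -= left + 1                   # skip left subtree and this node
            node = node.right
    raise IndexError("k out of range")

keys = [50, 20, 70, 10, 30, 60, 80, 25]
root = None
for key in keys:
    root = insert_plain(root, key)
print(kth_smallest(root, 3))
```

Each step descends one level, so the indexed search costs O(h), the same as an ordinary search.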
3.4.2 Insertion
To insert a new element x into a $\beta$-BBST, we first search for x in the $\beta$-BBST. This search is unsuccessful (as x is not in the tree) and terminates by falling off the tree. A new node y containing x is inserted at the point where the search falls off the tree. Let p' be the parent (if any) of the newly inserted node. We now retrace the path from p' to the root performing rebalancing rotations.
There are four kinds of rotations: LL, LR, RL, and RR. LL and RR rotations are symmetric, and so are LR and RL rotations. The typical configuration before an LL rotation is performed is given in Figure 3.4(a). p' denotes the root of a subtree in which the insertion was made. Let p be the (size of the) subtree before the insertion. Then, since the tree was a $\beta$-BBST prior to the insertion, $\beta(p, d)$. Also, for the LL rotation to be performed, we require that $(q \ge c)$ and $(q > d)$. Note that q > d implies $q \ge 1$. We shall see that $\beta(q, c)$ follows from the fact that the insertion is made into a $\beta$-BBST and from properties of the rotation. Following an LL rotation, p' is updated to be the node p''.
Figure 3.4. LL rotation for insertion: (a) before, where gp has children p' and d, and p' has children q and c; (b) after, where p'' has children q and gp', and gp' has children c and d.
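Lemma 5 [LL insertion lemma] If $[\beta(p, d) \wedge \beta(q, c) \wedge (q \ge c) \wedge (q > d)]$ for $0 < \beta \le 1/2$ before the rotation, then $\beta(q, gp')$ and $\beta(c, d)$ after the rotation.
Proof Assume the before condition.
(a) $\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) < 2\beta q$ (as $\beta > 0$, $q \ge c$ and $q > d$) $\le q$ (as $\beta \le 1/2$). So, $\beta(q, gp')$.
(b) $d < q \Rightarrow d - 1 < q - 1 \Rightarrow \beta(d - 1) \le \beta(q - 1) \le c$ (as $\beta(q, c)$). Also, $\beta(c - 1) \le \beta(p - 1) \le d$ (as $c \le p$ and $\beta(p, d)$). So, $\beta(c, d)$. □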
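A minimal sketch of the LL rotation with Size maintenance is given below. The `Node` class and helper names are assumptions made for the example; the subtree labels q, c, and d follow the figure for the LL rotation. The small example also shows the drop in total search cost (sum of node depths) when the larger subtree q moves up.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def size(t):
    return t.size if t is not None else 0

def rotate_ll(gp):
    """LL rotation at gp; returns the new subtree root (the old left child)."""
    p = gp.left
    gp.left = p.right                 # subtree c becomes gp's left child
    p.right = gp                      # gp becomes the right child of p
    gp.size = 1 + size(gp.left) + size(gp.right)   # update the lower node first
    p.size = 1 + size(p.left) + size(p.right)
    return p

def search_cost(t, depth=1):
    """Sum of node depths, with the root at depth 1."""
    if t is None:
        return 0
    return depth + search_cost(t.left, depth + 1) + search_cost(t.right, depth + 1)

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

# Subtree sizes: q = 2, c = 1, d = 1, so q >= c and q > d hold.
q = Node(2, left=Node(1))
p = Node(4, left=q, right=Node(5))
gp = Node(6, left=p, right=Node(7))

before = search_cost(gp)
root = rotate_ll(gp)
print(before, search_cost(root), inorder(root))
```

The rotation lifts the q subtree and node p by one level and lowers gp and the d subtree by one level, so the cost changes by d − q, which is negative whenever q > d.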
In an LR rotation, the before configuration is as in Figure 3.4(a). However, this time q < c. Figure 3.4(a) is redrawn in Figure 3.5(a). In this, the node labeled c in Figure 3.4(a) has been labeled q and that labeled q in Figure 3.4(a) has been labeled a.
Figure 3.5. Substep (i) of insertion LR rotation: (a) before, where gp has children p' and d, p' has children a and q, and q has children b and c; (b) after substep (i), where q' has children p'' and gp', p'' has children a and b, and gp' has children c and d.
With respect to the labelings of Figure 3.5(a), rotation LR is applied when
$[(q > a) \wedge (q > d)]$.
The other conditions that apply when an LR rotation is performed are
$[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c)]$.
Here p denotes the (size of the) left subtree of gp prior to the insertion. An LR rotation is accomplished in two substeps (or two subrotations). The first of these is shown in Figure 3.5(b). Following an LR rotation, p' is updated to be node q'.
Lemma 6 [LR substep(i) insertion lemma] If $[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (q > a) \wedge (q > d)]$ for $0 < \beta \le 1/2$ before the subrotation, then $[\beta(p'', gp') \wedge \{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d)) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d))\}]$ after the subrotation.
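Proof Assume the before condition. First, we show that $\beta(p'', gp')$ after the rotation. Note that $\beta(p'' - 1) = \beta(a + b) = \beta(a + b + c + 1) - \beta(c + 1) = \beta(p' - 1) - \beta(c + 1) = \beta(p - 1) - \beta c \le d - \beta c \le d < gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le b + \beta + \beta d$ (as $\beta(b, c)$) $\le b + \beta q$ (as $q > d$) $\le b + a + \beta$ (as $\beta(a, q)$) $< p''$ (as $\beta \le 1/2$ and $p'' = a + b + 1$). So, $\beta(p'', gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b - 1) \le a$.
To see this, note that $\beta(b - 1) \le \beta(q - 1) \le a$ (as $\beta(a, q)$).
P2: $\beta(c - 1) \le d$.
For this, observe that $p' - 1 = a + q \ge \beta(q - 1) + q$ (as $\beta(a, q)$) $= (\beta + 1)(q - 1) + 1$. So, $q - 1 \le \frac{p' - 2}{1+\beta} = \frac{p - 1}{1+\beta}$. Similarly, $q - 1 = b + c \ge \beta(c - 1) + c$ (as $\beta(b, c)$) $= (\beta + 1)(c - 1) + 1$. So, $\beta(c - 1) \le \frac{\beta}{1+\beta}(q - 2) < \frac{\beta}{1+\beta}(q - 1) \le \frac{\beta}{(1+\beta)^2}(p - 1) \le \frac{1}{(1+\beta)^2} d$ (as $\beta(p, d)$) $\le d$.
To complete the proof of the lemma, we need to show
$\{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d)) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d))\}$.
We do this by considering the two cases $b \ge c$ and $b < c$.
Case $b \ge c$: Since $a < q = b + c + 1$, $\beta(a - 1) < \beta(b + c) \le 2\beta b \le b$. This and P1 imply $\beta(a, b)$. Also, $d < q = b + c + 1$. So, $\frac{\beta}{1+\beta}(d - 1) \le \frac{\beta}{1+\beta}(b + c - 1) = \frac{\beta}{1+\beta} c + \frac{\beta}{1+\beta}(b - 1) \le \frac{\beta}{1+\beta} c + \frac{1}{1+\beta} c$ (as $\beta(b, c)$) $= c$. This, together with P2, implies $\frac{\beta}{1+\beta}(c, d)$. So, $\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d)$.
Case $b < c$: Since $a < q = b + c + 1$, $a - 1 < b + c$. So, $a - 1 \le b + c - 1$ or
\[ \frac{\beta}{1+\beta}(a - 1) \le \frac{\beta}{1+\beta} b + \frac{\beta}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta} b + \frac{1}{1+\beta} b = b \]
(as $\beta(b, c)$). This and P1 imply $\frac{\beta}{1+\beta}(a, b)$. Also, $d - 1 \le q - 2 = b + c - 1$. So, $\beta(d - 1) \le \beta(b + c - 1) < \beta(2c - 1) \le c$. This, together with P2, implies $\beta(c, d)$. So, $\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d)$. □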
Since an LR(i) rotation can cause the tree to lose its $\beta$-balance property, it is necessary to follow this with another rotation that restores the $\beta$-balance property. It suffices to consider the two cases of Figures 3.6 and 3.7 for this follow up rotation. The remaining cases are symmetric to these. In Figures 3.6 and 3.7, p and d denote the nodes that do not satisfy $\beta(p, d)$. Note, however, that these nodes do satisfy $\frac{\beta}{1+\beta}(p, d)$.
Since the follow up rotation to LR(i) is done only when
$\neg\beta(p, d) \wedge \frac{\beta}{1+\beta}(p, d)$,
either $\beta(p - 1) > d$ or $\beta(d - 1) > p$. When $\beta(p - 1) > d$, the second substep rotation is one of the two given in Figures 3.6 and 3.7. When $\beta(d - 1) > p$, rotations symmetric to these are performed. In the following, we assume $\beta(p - 1) > d$. Further, we may assume d > 0, as d = 0 and $\frac{\beta}{1+\beta}(p, d)$ imply $p \le 1$ and hence $\beta(p, d)$. Also, d > 0 and $\beta(p - 1) > d$ imply p > 1.
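The LR(ii) LL rotation is done when the condition
$A = (q > d) \wedge (c \le (1+\beta)q + (1-\beta)) \wedge B$
holds, where
$B = \neg\beta(p, d) \wedge \frac{\beta}{1+\beta}(p, d) \wedge \beta(q, c) \wedge (\beta(p - 1) > d > 0)$.
Figure 3.6. Case LL for LR(ii) rotation: (a) before, where gp has children p and d, and p has children q and c; (b) after, where p' has children q and gp', and gp' has children c and d.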
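Lemma 7 [Case LR(ii) LL rotation] If A holds before the rotation of Figure 3.6, then $\beta(q, gp')$ and $\beta(c, d)$ after the rotation provided $0 < \beta \le \sqrt{2} - 1$.
Proof (a) $\beta(q, gp')$:
$\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le \beta((1+\beta)q + (1-\beta) + d) \le \beta(1+\beta)q + \beta(1-\beta) + \beta(q - 1)$ (as $q > d$) $= \beta(2+\beta)q - \beta^2 < q$ (as $\beta(2+\beta) \le 1$ for $0 < \beta \le \sqrt{2} - 1$). So, $\beta(q, gp')$.
(b) $\beta(c, d)$:
$\beta(d - 1) < \beta(q - 1) \le c$ (as $\beta(q, c)$). And, $\beta(c - 1) = \frac{\beta}{1+\beta}(c - 1) + \frac{\beta^2}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta}(c - 1) + \frac{\beta}{1+\beta} q$ (as $\beta(q, c)$) $= \frac{\beta}{1+\beta}(q + c - 1) = \frac{\beta}{1+\beta}(p - 2) < \frac{\beta}{1+\beta}(p - 1) \le d$ (as $\frac{\beta}{1+\beta}(p, d)$).
So, $\beta(c, d)$. □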
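Lemma 8 If $(c \le (1+\beta)q + (1-\beta)) \wedge (\beta(p - 1) > d)$ in Figure 3.6, then $d \le q$ provided $0 < \beta \le \sqrt{2} - 1$.
Proof Since $d < \beta(p - 1) = \beta(q + c) \le \beta(q + (1+\beta)q + 1 - \beta) = \beta(\beta + 2)q + \beta(1 - \beta) < q + 1$ (as $\beta(\beta + 2) \le 1$ and $\beta(1 - \beta) < 1$ for $0 < \beta \le \sqrt{2} - 1$). So, $d \le q$. □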
So, the only time an LR(ii) LL rotation is not done is when $C = (C1 \vee C2) \wedge B$ holds where
$C1 = (q = d) \wedge (c \le (1+\beta)q + 1 - \beta)$
$C2 = c > (1+\beta)q + (1 - \beta)$.
At this time, the LR rotation of Figure 3.7 is done. In terms of the notation of Figure 3.7, the condition C becomes $D = (D1 \vee D2) \wedge E$ where
$D1 = (a = d) \wedge (q \le (1+\beta)a + 1 - \beta)$
$D2 = q > (1+\beta)a + 1 - \beta$
$E = \neg\beta(p, d) \wedge \frac{\beta}{1+\beta}(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (\beta(p - 1) > d > 0)$.
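The choice between the two second-substep rotations can be written as a small predicate on subtree sizes. The sketch below assumes the labels of the LL case (q and c are the children of p, and d is p's sibling) and that the caller has already established condition B; the function names and the brute-force range are illustrative. It also spot-checks Lemma 8 over small sizes.

```python
from fractions import Fraction

def substep2_rotation(q, c, d, beta):
    """Pick the LR(ii) rotation from subtree sizes q, c (children of p) and d.

    Assumes the remaining conditions (B in the text) already hold.
    """
    if q > d and c <= (1 + beta) * q + (1 - beta):
        return "LL"
    return "LR"

def check_lemma8(beta, limit=40):
    """Brute-force Lemma 8: the two premises force d <= q."""
    for q in range(limit):
        for c in range(limit):
            for d in range(limit):
                p = q + c + 1
                if c <= (1 + beta) * q + (1 - beta) and beta * (p - 1) > d:
                    assert d <= q, (q, c, d)
    return True

beta = Fraction(207, 500)      # the value used later in the experiments
print(substep2_rotation(5, 3, 2, beta))
print(substep2_rotation(2, 2, 2, beta))
print(check_lemma8(beta))
```

Because Lemma 8 rules out d > q whenever the bound on c holds, the "LR" branch covers exactly the two situations C1 (q = d) and C2 (c too large).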
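Lemma 9 When an LR(ii) LR rotation is performed and $\beta \le \sqrt{2} - 1$, q > d and so search cost is reduced.
Proof If D1, then since $\beta(p - 1) > d$ and a = d, $\beta(d + q) > d$. So, $q > d/\beta - d \ge d$ as $\beta \le \sqrt{2} - 1$. If D2, then $a < q/(1+\beta)$ and $d < \beta(p - 1) = \beta(a + q) < \beta\left(\frac{q}{1+\beta} + q\right) = \frac{\beta(2+\beta)}{1+\beta} q < q$ (as $\beta \le \sqrt{2} - 1$). □
Figure 3.7. Case LR for LR(ii) rotation: (a) before, where gp has children p and d, p has children a and q, and q has children b and c; (b) after, where q' has children p' and gp', p' has children a and b, and gp' has children c and d.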
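Lemma 10 When $(d = a) \wedge \beta(b, c) \wedge (\beta(p - 1) > d) \wedge (\beta \le \sqrt{2} - 1)$ (see Figure 3.7), $\beta(a - 1) \le b$ and $\beta(d - 1) \le c$.
Proof Since $\beta(p - 1) > d$ and d = a, $\beta(p - 1) > a$ or $\beta(a + q) > a$ or $a(1 - \beta) < \beta q$.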
1(a 1) < 2(+b#,1fl /3
10l 10l
# + Il)b + #(#2 +# Ii+ p)
10l 10l
#_ +l I)b + 0(#2 2 1)
1f 1fl
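Since $\beta(c - 1) \le b$, $c \le b/\beta + 1$. So,
\[ \beta(a - 1) < \frac{\beta^2}{1-\beta}(b + c + 1) - \beta \le \frac{\beta^2}{1-\beta}\left(b + \frac{b}{\beta} + 2\right) - \beta = \frac{\beta(\beta + 1)}{1-\beta} b + \frac{3\beta^2 - \beta}{1-\beta}. \]
So,
\[ a - 1 < \frac{\beta + 1}{1-\beta} b + \frac{3\beta - 1}{1-\beta}. \]
However, since $\beta^2 + 2\beta - 1 \le 0$ for $\beta \le \sqrt{2} - 1$, $(1+\beta)/(1-\beta) \le 1/\beta$ and $(3\beta - 1)/(1-\beta) \le \beta$. So, $a - 1 < b/\beta + \beta$. If $a \ge c + 1$, then c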
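Lemma 11 [Case LR(ii) LR rotation] If D holds before the rotation of Figure 3.7, then $\beta(p', gp')$, $\beta(a, b)$, and $\beta(c, d)$ following the rotation provided $0 < \beta \le \sqrt{2} - 1$.
Proof (a) $\beta(p', gp')$:
$\beta(gp' - 1) = \beta(c + d) \le b + \beta + \beta d$ (as $\beta(b, c)$) $< b + \beta + \beta q$ (from Lemmas 9 and 10, q > d) $\le b + \beta + a + \beta = a + b + 2\beta < a + b + 1 = p'$. Also, since $\frac{\beta}{1+\beta}(p, d)$ and $q \ge d$, $\beta(p - 1) \le (1+\beta)d$ or $\beta(a + q) \le (\beta + 1)d$ or $a + q \le (1 + 1/\beta)d$ or $a \le (1 + 1/\beta)d - q \le (1 + 1/\beta)d - d = d/\beta$. So, $\beta(p' - 1) = \beta(a + b) \le d + \beta b \le d + c + \beta$ (as $\beta(b, c)$) $< d + c + 1 = gp'$.
(b) $\beta(a, b)$:
Since $b < q$ and $\beta(a, q)$, $\beta(b - 1) \le \beta(q - 1) \le a$. When D1, $\beta(a - 1) \le b$ was proved in Lemma 10. So, $\beta(a, b)$. When D2, $q > a(1+\beta) + 1 - \beta$. So,
\[ \beta(a - 1) < \frac{\beta}{1+\beta}(b + c - 1) \le \frac{\beta b + b}{1+\beta} = b \]
(as $\beta(b, c)$). So, $\beta(a, b)$.
(c) $\beta(c, d)$:
Note that $\beta(c - 1) < \beta(q - 1) < \frac{\beta}{1+\beta}(p - 1) \le d$, where the middle step uses $(1+\beta)(q - 1) \le a + q - 1$ (from $\beta(a, q)$) and $p - 1 = a + q$. When D1, $\beta(d - 1) \le c$ was proved in Lemma 10. So, $\beta(c, d)$. When D2, if $d \le b + 1$, then $d - 1 \le b$ and $\beta(d - 1) \le \beta(b - 1) + \beta \le c + \beta$; so assume $d > b + 1$. Now, $b < d - 1 < \beta(p - 1) - 1$. So,
\[ b < \beta(a + b + c + 1) - 1 < \beta\left(\frac{q - 1 + \beta}{1+\beta} + b + c + 1\right) - 1 = \frac{\beta}{1+\beta}\bigl(b + c + \beta + (1+\beta)(b + c + 1)\bigr) - 1. \]
Using $b \le c/\beta + 1$ (from $\beta(b, c)$) on the right-hand side,
\[ b < \frac{\beta}{1+\beta}\left(\frac{c}{\beta} + 1 + c + \beta + (1+\beta)\left(\frac{c}{\beta} + 1 + c + 1\right)\right) - 1 = (2+\beta)c + 3\beta - 1 < (2+\beta)c + \beta \le \frac{c}{\beta} + \beta \]
(as $\beta \le \sqrt{2} - 1$).
Also, from $d < \beta(p - 1)$ and the above derivation, we get
\[ d < \frac{\beta}{1+\beta}\bigl(b + c + \beta + (1+\beta)(b + c + 1)\bigr) < \frac{\beta}{1+\beta}\left(\frac{c}{\beta} + \beta + c + \beta + (1+\beta)\left(\frac{c}{\beta} + \beta + c + 1\right)\right) \]
\[ = (2+\beta)c + \frac{2\beta^2}{1+\beta} + \beta(1+\beta) = (2+\beta)c + \frac{\beta^3 + 4\beta^2 + \beta}{1+\beta} \le (2+\beta)c + 1 \]
(as $\beta^3 + 4\beta^2 + \beta \le 1 + \beta$ for $\beta \le \sqrt{2} - 1$).
So, $\beta(d - 1) < \beta(2+\beta)c \le c$ (as $\beta \le \sqrt{2} - 1$). So, $\beta(c, d)$. □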
Theorem 9 If T is $\beta$-balanced, $0 \le \beta \le \sqrt{2} - 1$, prior to insertion, it is so following the insertion.
Proof First note that since all binary search trees are $\beta$-balanced for $\beta = 0$, the rotations (while unnecessary) preserve 0-balance. So, assume $\beta > 0$. Consider the tree T' just after the new element has been inserted but before the backward restructuring pass begins.
If the newly inserted node, z, has no parent in T', then T was empty and T' is $\beta$-balanced. If z has a parent but no grandparent, then T has at most one nonempty subtree X. Since T is $\beta$-balanced, $\beta(|X| - 1) \le 0$. So, $|X| \le 1$. Following the insertion, T' has one subtree with at most one node and one with exactly one. So, T' is $\beta$-balanced. We may therefore assume that z has a grandparent in T'.
From the downward insertion path, it follows that all nodes u in T' that have children l and r for which $\neg\beta(l, r)$ must lie on the path from the root to z. During the backward restructuring pass, each node on this path (other than z and its parent) plays the role of gp in Figures 3.4 and 3.5. The $\beta$-property cannot be violated at z as z has no children. It cannot be violated at the parent, s, of z as s satisfied the $\beta$-property prior to insertion. As a result its other subtree has at most one element. So, following the insertion, s satisfies the $\beta$-property. As a result, each node in T' that might possibly violate the $\beta$-property becomes the gp node during the restructuring pass. Consider one such gp node. It has children in T' denoted by p' and d. Its children in T are p and d. Figures 3.4 and 3.5 show the case when d is the right subtree of gp in both T and T'. The cases RR and RL arise when d is the left subtree.
During the restructuring pass, gp begins at the grandparent of z and moves up to the root of T'. If z is at level r in T' (the root being at level 1), then gp takes on r - 2 values during the restructuring pass. We shall show that at each of these r - 2 positions either
(a) no rotation is performed and all descendants of gp satisfy the $\beta$-property or
(b) a rotation is performed and following this, all descendants of node p'' (Figure 3.4) or of node q' (Figure 3.5) satisfy the $\beta$-property.
As a result, following the rotation (if any) performed when gp becomes the root of T', the restructured tree is $\beta$-balanced. The proof is by induction on r. When r = 3 (recall, we assume z has a grandparent), gp begins at the root of T' and its descendants satisfy the $\beta$-property.
Without loss of generality, assume that the insertion took place in the left subtree of gp. With respect to Figure 3.4, we have three cases: (i) $q \ge c$ and $q > d$, (ii) $q < c$ and $c > d$, and (iii) $q \le d$ and $c \le d$. In case (i), all conditions for an LL rotation hold and such a rotation is performed. In case (ii), an LR rotation is performed. Following either rotation, T' is $\beta$-balanced. In case (iii), $\beta(p' - 1) = \beta(q + c) \le 2\beta d \le d$ (as $\beta \le \sqrt{2} - 1$). Also, $\beta(d - 1) \le p < p + 1 = p'$. So, $\beta(d - 1) < p'$. Hence, $\beta(p', d)$ and T' is $\beta$-balanced.
For the induction hypothesis, assume (a) and (b) whenever $r \le k$. In the induction step, we show (a) and (b) for trees T' with r = k + 1. The subtree in which the insertion is done has r = k. So, (a) and (b) hold for all gp locations in the subtree. We need to show (a) and (b) only when gp is at the root of T'. This follows from Lemmas 5, 6, 7, and 11.
The theorem now follows. □
Lemma 12 The time needed to do an insertion in an n node $\beta$-BBST is O(log n) provided $0 < \beta \le \sqrt{2} - 1$.
Proof Follows from the fact that insertion takes O(h) time where h is the tree height and h = O(log n) when $\beta > 0$ (Lemmas 1 and 3). □
3.4.3 Deletion
To delete element x from a $\beta$-BBST, we first use the unbalanced binary search tree deletion algorithm of Horowitz and Sahni [14] to delete x and then perform a series of rebalancing rotations. The steps are:
Step 1 [Locate x] Search the $\beta$-BBST for the node y that contains x. If there is no such node, terminate.
Step 2 [Delete x] If y is a leaf, set d' to nil, gp to the parent of y, and delete node y. If y has exactly one child, set d' to be this child; change the pointer from the parent (if any) of y to point to the child of y; delete node y; set gp to be the parent of d'. If y has two children, find the node z in the left subtree of y that has largest value; move this value into node y; set y = z; go to the start of Step 2. {note that the new y has either 0 or 1 child}
Step 3 [Rebalance] Retrace the path from d' to the root performing rebalancing rotations.
There are four rebalancing rotations: LL, LR, RR, and RL. Since LL and RR as well as LR and RL are symmetric rotations, we describe LL and LR only. The discussion is very similar to the case of insertion. The differences in proofs are due to the fact that a deletion reduces the size of encountered subtrees by 1 while an insertion increases it by 1. In an LL rotation, the configuration just before and after the rotation is shown in Figure 3.8. This rotation is performed when $q \ge c$ and $q > d'$. Following the rotation, d' is updated to the node p'.
Figure 3.8. LL rotation for deletion: (a) before, where gp has children p and d', and p has children q and c; (b) after, where p' has children q and gp', and gp' has children c and d'.
Let d denote the size of the right subtree of gp before the deletion. So, d = d' + 1. Since prior to the deletion the $\beta$-BBST was $\beta$-balanced, it follows that $\beta(p, d)$ and $\beta(q, c)$.
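Lemma 13 [LL deletion lemma] If $[\beta(p, d) \wedge \beta(q, c) \wedge (q \ge c) \wedge (q > d') \wedge (1/3 \le \beta \le 1/2)]$ before the rotation, then $[\beta(q, gp') \wedge \beta(c, d')]$ after the rotation.
Proof (a) $\beta(q, gp')$:
$\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d') < 2\beta q$ (as $c \le q$ and $d' < q$) $\le q$ (as $\beta \le 1/2$). So, $\beta(q, gp')$.
(b) $\beta(c, d')$:
$d' < q \Rightarrow d' - 1 < q - 1 \Rightarrow \beta(d' - 1) < \beta(q - 1) \le c$. Also, when $c \le 1$, $\beta(c - 1) \le 0 \le d'$ (as $d' \ge 0$). When $c > 1$, $q \ge c \Rightarrow q \ge 2$ and $p = q + c + 1 \ge c + 3$. So, $\beta(c - 1) \le \beta(p - 1) - 3\beta \le d - 3\beta$ (as $\beta(p, d)$) $\le d - 1$ (as $\beta \ge 1/3$) $= d'$. Hence, $\beta(c, d')$. □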
In an LR rotation, the before configuration is as in Figure 3.8(a). However, this time q < c. Figure 3.8(a) is redrawn in Figure 3.9(a). In this, the node labeled c in Figure 3.8(a) has been relabeled q and that labeled q in Figure 3.8(a) has been relabeled a. With respect to the labelings of Figure 3.9(a), rotation LR is applied when
$[(q > a) \wedge (q > d')]$.
The other conditions that apply when an LR rotation is performed are
$[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c)]$.
Here d denotes the (size of the) right subtree of gp prior to the deletion. As in the case of insertion, an LR rotation is accomplished in two substeps (or two subrotations). The first of these is shown in Figure 3.9. Following an LR rotation, d' is updated to node q'.
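Figure 3.9. LR rotation for deletion: (a) before, where gp has children p and d', p has children a and q, and q has children b and c; (b) after substep (i), where q' has children p' and gp', p' has children a and b, and gp' has children c and d'.
Lemma 14 [LR substep(i) deletion lemma] If $[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (q > a) \wedge (q > d')]$ before the subrotation LR(i), then $[\beta(p', gp') \wedge \{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d'))\}]$ after the subrotation provided $1/3 \le \beta \le 1/2$.
Proof Assume the before condition.
(a) If b = c = 0, then q = b + c + 1 = 1. Furthermore, (q > a) and (q > d') imply a = d' = 0. So, gp' = p' = 1. Hence, $[\beta(p', gp') \wedge \beta(a, b) \wedge \beta(c, d')]$.
(b) If b = 1 and c = 0, then q = 2, $a \le 1$, and $d' \le 1$. So, 1 …
(c) If b = 0 and c = 1, then q = 2, $a \le 1$, and $d' \le 1$. So, 1 … $b \ge 1$ and $c \ge 1$. So, $q \ge 3$, $a \ge 1$ (as $\beta(a, q) \Rightarrow \beta(q - 1) \le a$ or $a \ge 2\beta > 0$), $p = a + q + 1 \ge 5$, $d \ge 2$ (as $\beta(p, d) \Rightarrow \beta(p - 1) \le d$ and $\beta \ge 1/3$), and $d' = d - 1 \ge 1$.
First, we show that $\beta(p', gp')$. For this, note that a + b + c + 1 = p - 1. From $\beta(p, d)$, it follows that $\beta(a + b + c + 1) = \beta(p - 1) \le d$. So, $\beta(a + b) \le d - \beta c - \beta$. From Figure 3.9(b), we see that $\beta(p' - 1) = \beta(a + b)$. Hence, $\beta(p' - 1) \le d - \beta c - \beta = d' + 1 - \beta c - \beta \le d' + 1 - 2\beta < gp'$. Also,
$\beta(gp' - 1) = \beta(c + d') \le b + \beta + \beta d'$ (as $\beta(b, c)$) $< b + \beta q + \beta$ (as $q > d'$) $\le b + a + 2\beta$ (as $\beta(a, q)$) $\le p'$ (as $\beta \le 1/2$ and $p' = a + b + 1$).
So, $\beta(p', gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b - 1) \le a$.
To see this, note that $\beta(b - 1) < \beta(q - 1) \le a$ (as $\beta(a, q)$).
P2: $\beta(c - 1) \le d'$.
For this, observe that $\beta(c - 1) \le \beta(q - 2)$ (as $c \le q - 1$) $\le \beta(p - 4)$ (as $q = p - a - 1$ and $a \ge 1$) $= \beta(p - 1) - 3\beta \le d - 3\beta$ (as $\beta(p, d)$) $\le d - 1$ (as $\beta \ge 1/3$) $= d'$.
To complete the proof of the lemma, we need to show
$\{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d'))\}$.
For this, consider the two cases $b \ge c$ and $b < c$ (as in Lemma 6).
Case $b \ge c$: Since $a < q = b + c + 1$, $\beta(a - 1) < \beta(b + c) \le 2\beta b \le b$. This, together with P1, implies $\beta(a, b)$. Also, $d' < q = b + c + 1$. So, $\frac{\beta}{1+\beta}(d' - 1) \le \frac{\beta}{1+\beta}(b + c - 1) = \frac{\beta}{1+\beta} c + \frac{\beta}{1+\beta}(b - 1) \le \frac{\beta}{1+\beta} c + \frac{1}{1+\beta} c = c$. This, together with P2, implies $\frac{\beta}{1+\beta}(c, d')$. So, $\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')$.
Case $b < c$: Since $a < q = b + c + 1$, $a - 1 < b + c$. So, $a - 1 \le b + c - 1$ or
\[ \frac{\beta}{1+\beta}(a - 1) \le \frac{\beta}{1+\beta} b + \frac{\beta}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta} b + \frac{1}{1+\beta} b = b. \]
Also, $d' - 1 \le q - 2 = b + c - 1$. So, $\beta(d' - 1) \le \beta(b + c - 1) < \beta(2c - 1) \le c$. This and P2 imply $\beta(c, d')$. Hence, $\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d')$. □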
The substep(ii) rotations are the same as for insertion.
Theorem 10 If T is $\beta$-balanced, then following a deletion the resulting tree T' is also $\beta$-balanced provided $1/3 \le \beta \le \sqrt{2} - 1$.
Proof Similar to that of Theorem 9. □
When $0 < \beta < 1/3$, we need to augment the LL rotation by a transformation for the case d' = 0. When d' = 0, $\beta(p - 1) \le d = d' + 1 = 1$. So, $p \le 1/\beta + 1$ and $gp = p + d' + 1 \le 1/\beta + 2$. To $\beta$-balance at gp, the at most $1/\beta + 2$ nodes in gp are rearranged into any $\beta$-BBST in constant time (as $1/\beta + 2$ is a constant). When d' > 0, the proof of Lemma 13 part (b) can be changed to show $\beta(c - 1) \le d'$ for $0 < \beta \le \sqrt{2} - 1$. The new proof is: since $c \le q$, $c \le (p - 1)/2$ and $\beta(c - 1) \le \beta(p - 1)/2 - \beta \le d/2 - \beta = d - d/2 - \beta \le d - 1 - \beta < d'$.
The LR rotation needs to be augmented by a transformation for the case $d' = d - 1 < \frac{1}{\beta(2+\beta)} - 1$. At this time, $\beta(p - 1) \le d < \frac{1}{\beta(2+\beta)}$. So, $gp = p + d < \frac{1}{\beta^2(2+\beta)} + 1 + \frac{1}{\beta(2+\beta)}$. To $\beta$-balance at gp, we rearrange the fewer than $\frac{1}{\beta^2(2+\beta)} + 1 + \frac{1}{\beta(2+\beta)}$ nodes in the subtree, in constant time, into any $\beta$-balanced tree. When $d' \ge \frac{1}{\beta(2+\beta)} - 1$, the proof for $\beta(c - 1) \le d'$ in Lemma 14 needs to be changed to show that the LR substep(i) lemma holds. The new proof is:
\[ d \ge \beta(p - 1) = \beta(a + b + c + 1) \ge \beta(\beta(q - 1) + b + c + 1) = \beta(\beta(b + c) + b + c + 1) \]
\[ \ge \beta\bigl(\beta(1+\beta)(c - 1) + (1+\beta)c + 1\bigr) = \beta\bigl((1+\beta)^2(c - 1) + 2 + \beta\bigr). \]
So,
\[ \beta(c - 1) \le \frac{d - \beta(2+\beta)}{(1+\beta)^2} \le d - 1 = d' \quad \left(\text{as } d \ge \frac{1}{\beta(2+\beta)}\right). \]
Also, note that when $\beta = 0$, all trees are $\beta$-balanced so the rotations (while not needed) preserve balance.
Theorem 11 With the special handling of the case d' = 0, the tree T' resulting from a deletion in a $\beta$-BBST is also $\beta$-balanced for $0 \le \beta \le \sqrt{2} - 1$.
Lemma 15 The time needed to delete an element from an n node $\beta$-BBST is O(log n) provided $0 < \beta \le \sqrt{2} - 1$.
3.4.4 Enhancements
Since our objective is to create search trees with minimum search cost, the rebalancing rotations may be performed at each positioning of gp during the backward restructuring pass so long as the conditions for the rotation apply rather than only at gp positions where the tree is unbalanced.
Consider Figure 3.4(a). If $p' \le d$, then the conditions of Lemmas 5 and 6 cannot apply as $q < p' \le d$. However, it is possible that e > p' where e is the size of either the left or right subtree of d. In this case, an RR or RL rotation would reduce the total search cost. The proofs of Lemmas 5 and 6 are easily extended to show that these rotations would preserve balance even though no insertion was done in the subtree d. The same observation applies to deletion. Hence the backward restructuring pass for the insert and delete operations can determine the need for a rotation at each gp location as below (l and r are, respectively, the left and right children of gp).
if s(l) > s(r) then check conditions for an LL and LR rotation
else check conditions for an RR and RL rotation.
The enhanced restructuring procedure used for insertion and deletion is given in Figure 3.10. In the RR and RL cases, we have used the relation '>' rather than '≥' as this results in better observed run time.
Since it can be shown that the rotations preserve balance even when there has been no insert or delete, we may check the rotation conditions during a search operation and perform rotations when these improve total search cost.
Finally, we note that it is possible to use other definitions of $\beta$-balance. For example, we could require $\beta(s(a) - 2) \le s(b)$ and $\beta(s(b) - 2) \le s(a)$ for $\beta(a, b)$. One can show that the development of this chapter applies to these modifications also. Furthermore, when this new definition is used, the number of comparisons in the second substep of the LR and RL rotations is reduced by one.
3.4.5 Top Down Algorithms
As in the case of red/black and WB(a) trees, it is possible to perform, in O(log n) time, inserts and deletes using a single top to bottom pass. The algorithms are similar to those already presented.
procedure Restructuring;
begin
  while (gp) do
  begin
    if (s(gp.left) > s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left;
      if (s(p.left) > s(p.right)) then
      begin
        if (s(p.left) > s(gp.right)) then do LL rotation;
      end
      else
      begin
        if (s(p.right) > s(gp.right)) then {LR}
        begin
          do LR rotation;
          {now notations a, b, c, and d follow from figure 3.1(b)}
          if (β(s(a) − 1) > s(b)) then
            if ((s(a.right) ≤ (1 + β)s(a.left) + 1 − β) and (s(b) < s(a.left))) then do LL rotation
            else do LR rotation
          else if (β(s(d) − 1) > s(c)) then
            if ((s(d.left) ≤ (1 + β)s(d.right) + 1 − β) and (s(c) < s(d.right))) then do RR rotation
            else do RL rotation;
        end
      end
    end
    else {check conditions for an RR and RL rotation}
    begin
      p = gp.right;
      if (s(p.left) > s(p.right)) then
      begin
        if (s(p.left) > s(gp.left)) then {RL}
          do symmetric to the above LR case;
      end
      else
      begin
        if (s(p.right) > s(gp.left)) then do RR rotation;
      end
    end;
    gp = gp.parent;
  end;
end;
Figure 3.10. Restructuring procedure
3.5 Simple $\beta$-BBSTs
The development of Section 3.4 was motivated by our desire to construct trees with minimal search cost. If instead, we desire only logarithmic performance per operation, we may simplify the restructuring pass so that rotations are performed only at nodes where the $\beta$-balance property is violated. In this case, we may dispense with the LL/RR rotations and the first substep of an LR/RL rotation. Only LR/RL substep (ii) rotations are needed. To see this, observe that Lemmas 7 and 11 show that the second substep rotations rebalance at gp (see Figures 3.6 and 3.7) provided $\frac{\beta}{1+\beta}(p, d)$. (The remaining conditions are ensured by the bottom-up nature of restructuring and the fact that the tree was $\beta$-balanced prior to the insert or delete.)
If the operation that resulted in loss of balance at gp was an insert, then $\beta(p - 2) \le d$ (as p > d, the insert took place in subtree p and gp was $\beta$-balanced prior to the insert) and $\beta(p - 1) > d$ (gp is not $\beta$-balanced following the insert). For the substep (ii) rotation to restore balance, we need $\beta(p - 1) \le (1+\beta)d$. This is assured if $d + \beta \le (\beta + 1)d$ (as $\beta(p - 2) \le d$). So, we need $d \ge 1$. If d < 1, then d = 0. Now $\beta(p - 2) \le d$ and $\beta(p - 1) > d$ imply p = 2. One may verify that when p = 2, the LR(ii) rotations restore balance.
If the loss of $\beta$-balance at gp is the result of a deletion (say from its right subtree), then $\beta(p - 1) \le d + 1$ (as gp was $\beta$-balanced prior to the delete). For the substep (ii) rotation to accomplish the rebalancing, we need $\beta(p - 1) \le (\beta + 1)d$. This is guaranteed if $d + 1 \le (\beta + 1)d$ or $d \ge 1/\beta$. When $d < 1/\beta$ and $\beta \ge 1/3$, $d \le 2$. Since $\beta(p - 1) \le d + 1$ and $\beta \ge 1/3$, when d = 2, $p \le 10$; when d = 1, $p \le 7$; and when d = 0, $p \le 4$. We may verify that for all these cases, the LR(ii) rotations restore balance. Hence, the only problematic case is when $\beta < 1/3$ and $d < 1/\beta$.
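The small-case bounds quoted for deletion follow directly from $\beta(p - 1) \le d + 1$. The sketch below recomputes them with exact arithmetic; the function name is hypothetical and the code only reproduces the arithmetic, not the case-by-case verification of the rotations.

```python
from fractions import Fraction
from math import floor

def max_p_after_delete(beta, d):
    """Largest integer p satisfying beta*(p - 1) <= d + 1."""
    return floor((d + 1) / beta) + 1

beta = Fraction(1, 3)                    # smallest beta covered by Theorem 10
for d in (2, 1, 0):
    print(d, max_p_after_delete(beta, d))

# d + 1 <= (beta + 1)*d is equivalent to d >= 1/beta, so for beta = 1/3 only
# d = 0, 1, 2 fall outside the guaranteed range.
print([d for d in range(6) if d + 1 > (beta + 1) * d])
```

Larger values of $\beta$ can only shrink these bounds, so $\beta = 1/3$ is the worst case for the finite check.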
When $\beta < 1/3$, an LL rotation fails to restore balance only when d = 0 (see discussion following Theorem 10). So we need to rearrange the at most $1/\beta + 2$ nodes in gp into any $\beta$-balanced tree when d = 0. An LR rotation fails only when $d < \frac{1}{\beta(2+\beta)} - 1$. To see this, note that in the terminology of Lemma 14, d is d'. The proof of P2 is extended to the case $\beta < 1/3$ when $d' \ge \frac{1}{\beta(2+\beta)} - 1$. Also, since $d < 1/\beta$, for the case $b \ge c$, we get $\beta(d' - 1) < 1 - \beta < c$ (as $c \ge 1$). For the case $b < c$, we need to show $\beta(a - 1) \le b$. Since an LR rotation is done only when condition $D1 \vee D2$ holds, from Lemmas 10 and 11, it follows that $\beta(a - 1) \le b$. So, an LR rotation rebalances when $\beta < 1/3$ provided $d \ge \frac{1}{\beta(2+\beta)} - 1$. For smaller d, the at most $\frac{1}{\beta^2(2+\beta)} + \frac{1}{\beta(2+\beta)}$ nodes in the subtree gp may be directly rearranged into a $\beta$-balanced tree.
The restructuring algorithm for simple $\beta$-BBSTs is given in Figures 3.11 and 3.12. The algorithm of Figure 3.11 is used following an insert and that of Figure 3.12 after a delete.
Simple $\beta$-BBSTs are expected to have higher search cost than the $\beta$-BBSTs of Section 3.4. However, they are a good alternative to traditional WB($\alpha$) trees as they are expected to be "better balanced". To see this, note that from the proof of Lemma 3, the balance, B(p), at any node p in a $\beta$-balanced tree satisfies
\[ B(p) = \frac{1}{1 + \frac{s(r)+1}{s(l)+1}}. \]
procedure Restructuring2;
begin
  while (gp) do
  begin
    if (β(s(gp.left) − 1) > s(gp.right)) then {do an LL or LR rotation}
    begin
      p = gp.left;
      if ((s(p.right) ≤ (1 + β)s(p.left) + 1 − β) and
          (s(gp.right) < s(p.left))) then
        do LL rotation
      else do LR rotation;
    end
    else
      do symmetric to the above L case;
    gp = gp.parent;
  end;
end;
Figure 3.11. Simple restructuring procedure for insertion
procedure Restructuring3;
begin
  while (gp) do
  begin
    if (β(s(gp.left) − 1) > s(gp.right)) then
      if (β < 1/3) and (s(gp.right) < 1/(β(2 + β)) − 1) then
        rearrange the subtree rooted at gp into any β-balanced tree
      else {do an LL or LR rotation}
      begin
        p = gp.left;
        if ((s(p.right) ≤ (1 + β)s(p.left) + 1 − β) and
            (s(gp.right) < s(p.left))) then
          do LL rotation
        else do LR rotation;
      end
    else
      do symmetric to the above L case;
    gp = gp.parent;
  end;
end;
Figure 3.12. Simple restructuring procedure for deletion
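Since $s(l) + 1 \le s(r)/\beta + 2$,
\[ \frac{s(l)+1}{s(r)+1} \le \frac{1}{\beta} + \frac{2 - 1/\beta}{s(r)+1} = \frac{1}{\beta}\left(1 - \frac{1 - 2\beta}{s(r)+1}\right). \]
So,
\[ B(p) \le \frac{1}{1 + \beta \Big/ \left(1 - \frac{1-2\beta}{s(r)+1}\right)}. \]
Also, since $s(r) - 1 \le s(l)/\beta$, $s(r) + 1 \le s(l)/\beta + 2$. Hence, $\frac{s(r)+1}{s(l)+1} \le \frac{1}{\beta}\left(1 - \frac{1-2\beta}{s(l)+1}\right)$. So,
\[ B(p) \ge \frac{1}{1 + \frac{1}{\beta}\left(1 - \frac{1-2\beta}{s(l)+1}\right)}. \]
Consequently,
\[ \frac{1}{1 + \frac{1}{\beta}\left(1 - \frac{1-2\beta}{s(l)+1}\right)} \le B(p) \le \frac{1}{1 + \beta \Big/ \left(1 - \frac{1-2\beta}{s(r)+1}\right)}. \]
When $\beta = \sqrt{2} - 1$,
\[ \frac{1}{2 + \sqrt{2} - \frac{\sqrt{2}-1}{s(l)+1}} \le B(p) \le 1 - \frac{1}{2 + \sqrt{2} - \frac{\sqrt{2}-1}{s(r)+1}}. \]
If $s(p) \le 10$, $0.296 \le B(p) \le 1 - 0.296$. So, every $\beta$-balanced subtree with 10 or fewer nodes is in WB($\alpha$) for $\alpha \approx 0.296$. Similarly, every subtree with 100 or fewer nodes is in WB($\alpha$) for $\alpha \approx 0.293$. In fact, for every fixed k, subtrees of size k or less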
procedure Restructuring4;
begin
  while (gp) do
  begin
    if (s(gp.left) > s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left;
      if (s(p.left) > s(p.right)) and (s(p.left) > s(gp.right)) then
        do LL rotation
      else if (s(p.left) ≤ s(p.right)) and (s(p.right) > s(gp.right)) then
        do LR rotation;
    end
    else {check conditions for an RR and RL rotation}
      do symmetric to the above L case;
    gp = gp.parent;
  end;
end;
Figure 3.13. Simple restructuring procedure without a β value
are in WB($\alpha$) for $\alpha$ slightly higher than $1 - 1/\sqrt{2} \approx 0.2929$, which is the largest value of $\alpha$ for which WB($\alpha$) trees can be maintained.
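The figures 0.296 and 0.293 quoted for subtrees of at most 10 and 100 nodes can be reproduced numerically. The sketch assumes the lower bound $B(p) \ge 1/\bigl(1 + \frac{1}{\beta}(1 - \frac{1-2\beta}{s(l)+1})\bigr)$, which is obtained from $s(r) + 1 \le s(l)/\beta + 2$, and takes $s(l) + 1 \le k$ for a subtree of at most k nodes; the function name is illustrative.

```python
from math import sqrt

def balance_lower_bound(beta, k):
    """Lower bound on B(p) when s(l) + 1 <= k (assumed form of the bound)."""
    return 1.0 / (1.0 + (1.0 / beta) * (1.0 - (1.0 - 2.0 * beta) / k))

beta = sqrt(2.0) - 1.0
limit = 1.0 - 1.0 / sqrt(2.0)      # largest alpha for which WB(alpha) is maintainable

for k in (10, 100, 1000):
    print(k, round(balance_lower_bound(beta, k), 4))
print("limit", round(limit, 4))
```

The bound decreases toward $1 - 1/\sqrt{2}$ as k grows but stays above it for every fixed k, which is the sense in which small subtrees are slightly better balanced than the best WB($\alpha$) guarantee.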
3.6 BBSTs without Deletion
In some applications of a dictionary, we need to support only the insert and search operations. In these applications, we can construct binary search trees with total cost
\[ C(T) \le n \log_\phi(\sqrt{5}(n + 1)) \]
by using the simpler restructuring algorithm of Figure 3.13.
Theorem 12 When the only operations are search and insert and restructuring is done as in Figure 3.13, $C(T) \le n \log_\phi(\sqrt{5}(n + 1))$.
Proof Suppose T currently has m - 1 elements and a new element is inserted. Let u be the level at which the new element is inserted. Suppose that the restructuring pass performs rotations at $q \le u$ of the nodes on the path from the root to the newly inserted node. Then C(T) increases by at most v = u - q as a result of the insertion. The number of nodes on the path from the root to the newly inserted node at which no rotation is performed is also v. Let these nodes be numbered 1 through v bottom to top. Let $S_i$ denote the number of elements in the subtree with root i prior to the restructuring pass. We see that $S_1 \ge 1$ and $S_2 \ge 2$. For node i, $2 < i \le v$, one of its subtrees contains node i - 1. Without loss of generality, let this be the left subtree of i. Let the root of the right subtree of i be d. So,
\[ S_i \ge S_{i-1} + s(d) + 1. \]
If i - 1 is not the left child of i, then since no rotation is done at i, $s(d) \ge S_{i-1}$. If i - 1 is the left child of i, then consider node i - 2. This is in one of the subtrees of i. Since no rotation is performed at i - 1, $s(d) \ge S_{i-2}$. Since $S_{i-1} > S_{i-2}$, we get
\[ S_i \ge S_{i-1} + S_{i-2} + 1. \]
Hence, $S_v \ge N_v$ where $N_v$ is the minimum number of elements in a COST of height v. So, $v \le \log_\phi(\sqrt{5}(m + 1))$. So, when an element is inserted into a tree that has m - 1 elements, its cost C(T) increases by at most $\log_\phi(\sqrt{5}(m + 1))$. Starting with an empty tree and inserting n elements results in a tree whose cost is at most $n \log_\phi(\sqrt{5}(n + 1))$. □
Corollary 2 The expected cost of a search or insert in a BBST constructed as above is O(log n).
Proof Since $C(T) \le n \log_\phi(\sqrt{5}(n + 1))$, the expected search cost is $C(T)/n \le \log_\phi(\sqrt{5}(n + 1))$. The cost of an insert is the same order as that of a search as each insert follows the corresponding search path twice (top down and bottom up). □
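The height bound used in the proof of Theorem 12 can be illustrated by iterating the recurrence with equality. The sketch assumes base values 1 and 2 (from $S_1 \ge 1$ and $S_2 \ge 2$) and the step $N_v = N_{v-1} + N_{v-2} + 1$; the function names are illustrative.

```python
from math import log, sqrt

PHI = (1 + sqrt(5)) / 2          # golden ratio, the base of the logarithm

def min_sizes(v_max):
    """N_1..N_vmax with N_1 = 1, N_2 = 2, N_v = N_{v-1} + N_{v-2} + 1."""
    n = [1, 2]
    while len(n) < v_max:
        n.append(n[-1] + n[-2] + 1)
    return n[:v_max]

def height_bound(m):
    """log_phi(sqrt(5) * (m + 1))."""
    return log(sqrt(5) * (m + 1)) / log(PHI)

for v, n_v in enumerate(min_sizes(12), start=1):
    print(v, n_v, round(height_bound(n_v), 3))
```

With these base values $N_v + 1$ follows the Fibonacci sequence, which is why the bound is a logarithm to base $\phi$.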
3.7 Experimental Results
For comparison purposes, we wrote C programs for BBSTs, SBBSTs (simple BBSTs), BBSTDs (BBSTs in which procedure Restructuring4 (Figure 3.13) is used to restructure following inserts as well as deletes), unbalanced binary search trees (BST), AVL-trees, top-down red-black trees (RBT), bottom-up red-black trees (RBB) [31], weight balanced trees (WB), deterministic skip lists (DSL), treaps (TRP), and skip lists (SKIP). For the BBST and SBBST structures, we used $\beta = 207/500$ while for the WB structure, we used $\alpha = 207/707$. While these are not the highest permissible values of $\beta$ and $\alpha$, this choice permitted us to use integer arithmetic rather than the substantially more expensive real arithmetic. For instance, $\beta(a, b)$ for $\beta = 207/500$ can be checked using the comparisons $207(s(a) - 1) \le 500 s(b)$ and $207(s(b) - 1) \le 500 s(a)$. The randomized structures TRP and SKIP used the same random number generator with the same seed. SKIP was programmed with probability value p = 1/4 as in Pugh [26].
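The integer-only balance test described above can be sketched as follows. The function names are hypothetical, and the exact-fraction version is included only to show that the integer comparisons give the same answer; the dissertation's programs are in C.

```python
from fractions import Fraction

BETA_NUM, BETA_DEN = 207, 500

def beta_balanced_int(sa, sb):
    """beta(a, b) for beta = 207/500 using only integer multiplications."""
    return (BETA_NUM * (sa - 1) <= BETA_DEN * sb and
            BETA_NUM * (sb - 1) <= BETA_DEN * sa)

def beta_balanced_exact(sa, sb, beta=Fraction(BETA_NUM, BETA_DEN)):
    """Same predicate evaluated with exact rational arithmetic."""
    return beta * (sa - 1) <= sb and beta * (sb - 1) <= sa

beta = Fraction(BETA_NUM, BETA_DEN)
print(beta / (1 + beta))                     # the matching alpha for the WB structure
print(beta_balanced_int(5, 2), beta_balanced_int(7, 2))
```

Multiplying both sides of $\beta(s(a) - 1) \le s(b)$ by the denominator 500 removes the division, which is why a rational $\beta$ slightly below $\sqrt{2} - 1$ was chosen.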
To minimize the impact of system call overheads on run time measurements, we programmed all structures using simulated pointers (i.e., an array of nodes with integer pointers) [27]. Skip lists use variable size nodes. This requires more complex storage management than required by the remaining structures, which use nodes of the same size. For our experiments, we implemented skip lists using fixed size nodes, each node being of the maximum size. As a result, our run times for skip lists are smaller than if a space efficient implementation had been used. In all our tree structure implementations, null pointers were replaced by a pointer to a tail node whose data field could be set to the search/insert/delete key and thus avoid checking for falling off the tree. Similar tail pointers are part of the defined structure of skip and deterministic skip lists. Each tree also had a head node. WB($\alpha$) trees were implemented with a bottom-up restructuring pass. Our codes for SKIP and DSL are based on the codes of Pugh [26] and Papadakis [22], respectively. Our AVL and RBT codes are based on those of Papadakis [22] and Sedgewick [28]. The treap structure was implemented using joins and splits rather than rotations. This results in better performance. Furthermore, AVL, RBB, WB, and BBST were implemented with parent pointers in addition to left and right child pointers. For BBSTs, the enhancements described in Section 3.4.4 for insert and delete (see Figure 3.10) were employed. No rotations were performed during a search when using any of the structures.
For our experiments, we tried two versions of the code. These varied in the order in which the 'equality' and 'less than' or 'greater than' checks between x and e (where x is the key being searched/inserted/deleted and e is the key in the current node) are done. In version 1, we conducted an initial experiment to determine if the total comparison count is less using the order L:
if x < e then move to left child
else if x > e then move to right child
else found
or the order R:
if x > e then move to right child
else if x < e then move to left child
else found.
Our experiment indicated that doing the 'left child' check first (i.e., order L) worked better for the AVL, BBST, BBSTD, and DSL structures while R worked better for the RBT, RBB, WB, SBBST, and TRP structures. No significant difference between L and R was observed for BSTs. For skip lists, we do not have the flexibility to change the comparison order. The version 1 codes performed the comparisons in the order determined to be better. For BSTs, the order R was used.
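The effect of the comparison order on the comparison count can be seen in a small Python sketch. It uses a hypothetical tuple representation `(key, left, right)` with `None` for an empty subtree (the original codes used a tail node instead) and counts one comparison per key test.

```python
def search(root, x, order="L"):
    """Search a BST whose nodes are (key, left, right) tuples.

    Returns (found, comparisons), counting one comparison per key test.
    """
    comparisons = 0
    node = root
    while node is not None:
        e, left, right = node
        if order == "L":
            comparisons += 1
            if x < e:            # 'left child' check first
                node = left
                continue
            comparisons += 1
            if x > e:
                node = right
                continue
        else:                    # order R
            comparisons += 1
            if x > e:            # 'right child' check first
                node = right
                continue
            comparisons += 1
            if x < e:
                node = left
                continue
        return True, comparisons
    return False, comparisons


# A three-node example tree with root 2, left child 1, right child 3.
TREE = (2, (1, None, None), (3, None, None))
```

In this tree, finding key 1 costs 3 comparisons under order L and 4 under order R, while finding key 3 costs 4 under L and 3 under R. Which order wins overall therefore depends on how often searches move left versus right in the trees a given structure builds.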
In the version 2 codes the comparisons in each node took the standard form
if x = e then found
else if x < e then move to left child
else move to right child.
The version 2 restructuring code for BBSTs differed from that of Figure 3.10 in that the '>' test in the second, third, and fourth if statements was changed to ''. No change was made in the corresponding if statements for RR and RL rotations. While this increased the number of comparisons, it reduced the run time.
We experimented with n = 10,000, 50,000, 100,000, and 200,000. For each n, the following experiments were conducted:
(a) start with an empty structure and perform n inserts;
(b) search for each item in the resulting structure once; items are searched for in the order they were inserted;
(c) perform an alternating sequence of n inserts and n deletes; in this, the n elements inserted in (a) are deleted in the order they were inserted and n new elements are inserted;
(d) search for each of the remaining n elements in the order they were inserted;
(e) delete the n elements in the order they were inserted.
For each n, the above five part experiment was repeated ten times using different random permutations of distinct elements. For each permutation, we measured the total number of element comparisons performed and then averaged these over the ten permutations.
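The five-part protocol can be sketched as a small Python harness. This is only an illustration under stated assumptions: the `SortedArray` class is a stand-in dictionary and not one of the structures studied, the `Key` wrapper is a hypothetical way of counting element comparisons, and part (c) is assumed to alternate one insert with one delete.

```python
import bisect
import random


class Key:
    """Integer key that counts every comparison made on it."""
    count = 0

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):
        Key.count += 1
        return self.v < other.v

    def __eq__(self, other):
        Key.count += 1
        return self.v == other.v


class SortedArray:
    """Stand-in dictionary (assumption; not a structure from the study)."""

    def __init__(self):
        self.a = []

    def insert(self, k):
        bisect.insort_left(self.a, k)

    def search(self, k):
        i = bisect.bisect_left(self.a, k)
        return i < len(self.a) and self.a[i] == k

    def delete(self, k):
        i = bisect.bisect_left(self.a, k)
        if i < len(self.a) and self.a[i] == k:
            self.a.pop(i)


def measure(step):
    """Run step() and return the number of key comparisons it made."""
    before = Key.count
    step()
    return Key.count - before


def run_experiment(n, seed, make=SortedArray):
    rng = random.Random(seed)
    values = list(range(2 * n))
    rng.shuffle(values)                      # random permutation of distinct elements
    first = [Key(v) for v in values[:n]]     # inserted in part (a)
    second = [Key(v) for v in values[n:]]    # inserted in part (c)
    d = make()

    def alternate():
        for new, old in zip(second, first):
            d.insert(new)
            d.delete(old)

    counts = {
        "a: insert": measure(lambda: [d.insert(k) for k in first]),
        "b: search": measure(lambda: [d.search(k) for k in first]),
        "c: ins/del": measure(alternate),
        "d: search": measure(lambda: [d.search(k) for k in second]),
        "e: delete": measure(lambda: [d.delete(k) for k in second]),
    }
    return counts, d


def average_counts(n, repeats=10):
    """Average the per-part comparison counts over several permutations."""
    totals = {}
    for seed in range(repeats):
        counts, _ = run_experiment(n, seed)
        for part, c in counts.items():
            totals[part] = totals.get(part, 0) + c
    return {part: t / repeats for part, t in totals.items()}
```

Passing a different factory as `make` would let the same harness drive any structure that offers `insert`, `search`, and `delete`.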
First, we report on the relative performance of SBBSTs, BBSTDs, and BBSTs. For this comparison, we used only version 1 of the code. Table 3.1 gives the average number of key comparisons performed for each of the five parts of the experiment. The three versions of our proposed data structure are very competitive on this measure. BBSTDs and BBSTs generally performed fewer comparisons than did SBBSTs. All three structures had a comparison count within 2% of one another.
Table 3.1. The number of key comparisons on random inputs (version 1 code)
n        operation      SBBST     BBSTD      BBST
10,000   insert        212495    212223    212111
         search        194661    191599    191578
         ins/del       416125    416967    416862
         search        194957    191666    191676
         delete        168033    166441    166487
50,000   insert       1241080   1236682   1236114
         search       1152137   1135131   1134969
         ins/del      2437918   2438083   2437639
         search       1153821   1134277   1134062
         delete       1018675   1007766   1007688
100,000  insert       2635913   2624829   2623792
         search       2458079   2423988   2423613
         ins/del      5183619   5180383   5179653
         search       2461221   2420282   2419990
         delete       2190798   2168049   2168110
200,000  insert       5580139   5555190   5553256
         search       5223989   5148220   5147698
         ins/del     10981441  10969578  10968053
         search       5229172   5144808   5144148
         delete       4692447   4641349   4641389
However, when we used ordered data rather than random data (Table 3.2), SBBSTs performed noticeably worse than BBSTDs and BBSTs; the latter two remained very competitive.
Tables 3.3 and 3.4 give the average heights of the trees using random data and using ordered data, respectively. The first number gives the height following part (a) of the experiment and the second following part (c). The numbers are identical for BBSTDs and BBSTs and slightly higher (lower) for SBBSTs using random (ordered) data.
Table 3.2. The number of key comparisons on ordered inputs (version 1 code)
n        operation      SBBST     BBSTD      BBST
10,000   insert        170182    150554    150554
         search        188722    185530    185530
         ins/del       425305    315177    314998
         search        191681    184155    184155
         delete        215214    135311    135131
50,000   insert        991526    872967    872967
         search       1117174   1101481   1101481
         ins/del      2472808   1806346   1805439
         search       1116390   1098065   1098065
         delete       1277756    792717    791815
100,000  insert       2103808   1850548   1850548
         search       2384327   2354757   2354757
         ins/del      5249194   3823415   3821594
         search       2382759   2346118   2346128
         delete       2738294   1686397   1684584
200,000  insert       4449143   3903083   3903083
         search       5068632   4946753   4946753
         ins/del     11105525   8051695   8048058
         search       5065496   5001967   5001967
         delete       5842168   3580856   3577223
Table 3.3. Height of the trees on random inputs (version 1 code)
n          SBBST    BBSTD     BBST
10,000     17,17    16,16    16,16
50,000     20,20    19,19    19,19
100,000    21,21    20,20    20,20
200,000    22,23    21,21    21,21
Table 3.4. Height of the trees on ordered inputs (version 1 code)
n          SBBST    BBSTD     BBST
10,000     16,15    17,17    17,17
50,000     20,20    20,20    20,20
100,000    21,21    21,21    21,21
200,000    22,22    23,22    23,22
The average number of rotations performed by each of the three structures is given in Tables 3.5 and 3.6. A single rotation (i.e., LL or RR) is denoted 'S' and a double rotation (i.e., LR or RL) is denoted 'D'. In the case of BBSTs, double rotations have been divided into three categories: D = LR and RL rotations that do not perform a second substep rotation; DS = LR and RL rotations with a second substep rotation of type LL or RR; DD = LR and RL rotations with a second substep rotation of type LR or RL. BBSTDs and BBSTs performed a comparable number of rotations on both data sets. However, on random data SBBSTs performed about half as many rotations as did BBSTDs and BBSTs. On ordered data, SBBSTs performed 15 to 20% fewer rotations on part (a), 34% fewer on part (c), and 51% fewer on part (e).
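For readers who want a concrete picture of the rotation types being counted, the following Python sketch shows single and double rotations on a plain binary search tree. It assumes the usual textbook meaning of LL, RR, LR, and RL; the size bookkeeping and the second substep rotations of BBSTs are not modeled, and the class and function names are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_ll(p):
    """Single rotation: the left child of p becomes the subtree root."""
    q = p.left
    p.left, q.right = q.right, p
    return q


def rotate_rr(p):
    """Single rotation: the right child of p becomes the subtree root."""
    q = p.right
    p.right, q.left = q.left, p
    return q


def rotate_lr(p):
    """Double rotation: the right child of p's left child becomes the root."""
    p.left = rotate_rr(p.left)
    return rotate_ll(p)


def rotate_rl(p):
    """Double rotation: the left child of p's right child becomes the root."""
    p.right = rotate_ll(p.right)
    return rotate_rr(p)


def inorder(t):
    """Keys in symmetric order; unchanged by any rotation."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

A double rotation is composed here of two single rotations, which is why the tables count the S and D categories separately.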
The run time performance of the structures is significantly influenced by compiler and architectural features as well as the complexity of a key comparison. The results we report are from a SUN SPARC5 using the UNIX C compiler cc with optimization option. Because of instruction pipelining features, cache replacement policies, etc., the measured run times are not always consistent with the compiler- and architecture-independent metrics reported in Tables 3.1 through 3.6 and later in Tables 3.11 through 3.16. For example, since the search codes for all tree based methods are essentially identical, we would expect methods with a smaller comparison count to have a smaller run time for parts (b) and (d) of the experiment. This was not always the case.
Table 3.5. The number of rotations on random inputs (version 1 code)
[Rotated table; contents not recoverable from the scan.]
Table 3.6. The number of rotations on ordered inputs (version 1 code)
                          SBBST            BBSTD                     BBST
n        operation        S   D        S       D        S       D    DS    DD
10,000   insert        9984   0     9985    2387     9985    2387     0     0
         ins/del      14997   0    16567    6130    16644    5797    25   154
         delete        4989   0     6570    3726     6647    3392    26   154
50,000   insert       49980   0    49983   11956    49983   11956     0     0
         ins/del      74996   0    82862   30659    83247   28982   137   770
         delete       24987   0    32859   18686    33242   17018   136   766
100,000  insert       99979   0    99983   23917    99983   23917     0     0
         ins/del     149996   0   165738   61327   166504   57969   280  1540
         delete       49986   0    65733   37392    66505   34040   278  1536
200,000  insert      199978   0   199982   47839   199982   47839     0     0
         ins/del     299996   0   331473  122653   333012  115938   559  3078
         delete       99985   0   131478   74795   133016   68086   557  3076
Tables 3.7 and 3.8 give the run times of the three BBST structures using integer keys and Tables 3.9 and 3.10 do this for the case of real (i.e., floating point) keys. The sum of the run time for parts (a)-(e) of the experiment is graphed in Figure 3.14. For random data, SBBSTs significantly and consistently outperformed BBSTDs and BBSTs. On ordered data, however, BBSTDs were slightly faster than BBSTs and both were significantly faster than SBBSTs.
Since BBSTs generated trees with the least search cost, we expect BBSTs to outperform SBBSTs and BBSTDs in applications where the comparison cost is very high relative to that of other operations and searches are done with a much higher frequency than inserts and deletes. However, with the mix of operations used in our tests, SBBSTs are the clear choice for random inputs and BBSTDs for ordered inputs.
In comparing with the other structures, our tables repeat the data for BBSTs. The reader may make the comparison with SBBSTs and BBSTDs.
Table 3.7. Run time on random inputs using integer keys (version 1 code)
n        operation    SBBST   BBSTD    BBST
10,000   insert        0.27    0.30    0.34
         search        0.06    0.06    0.07
         ins/del       0.57    0.62    0.70
         search        0.06    0.06    0.06
         delete        0.22    0.25    0.26
50,000   insert        1.48    1.61    1.75
         search        0.35    0.36    0.37
         ins/del       2.90    3.47    3.84
         search        0.36    0.38    0.39
         delete        1.13    1.47    1.62
100,000  insert        3.00     3.5    3.80
         search        0.78    0.83    0.84
         ins/del       6.28    7.78    8.41
         search        0.83    0.87    0.88
         delete        2.54    3.31    3.58
200,000  insert        6.56    7.74    8.37
         search        1.80    1.89    1.89
         ins/del      13.89   17.32   18.57
         search        1.86    1.98    1.98
         delete        5.64    7.41    8.02
Time Unit: sec
Table 3.8. Run time on ordered inputs using integer keys (version 1 code)
n        operation    SBBST   BBSTD    BBST
10,000   insert        0.32    0.20    0.27
         search        0.05    0.03    0.05
         ins/del       0.58    0.43    0.57
         search        0.07    0.03    0.03
         delete        0.20    0.17    0.23
50,000   insert         1.8    1.20    1.10
         search        0.25    0.20    0.20
         ins/del       2.63    2.18    2.40
         search        0.25    0.20    0.20
         delete        0.95    0.92    1.05
100,000  insert        3.43    2.23    2.53
         search        0.72    0.45    0.42
         ins/del       5.97    4.70    5.13
         search        0.55    0.47    0.42
         delete        2.10    1.98    2.15
200,000  insert        6.65    4.95    5.25
         search        1.20    0.92    0.90
         ins/del      13.13   10.23   10.88
         search        1.17    0.90    0.90
         delete        4.63    4.25    4.58
Time Unit: sec
Table 3.9. Run time on random real inputs (version 1 code)
n        operation    SBBST   BBSTD    BBST
10,000   insert        0.23    0.34    0.36
         search        0.07    0.10    0.10
         ins/del       0.44    0.75    0.79
         search        0.08    0.10    0.10
         delete        0.17    0.29    0.30
50,000   insert        1.43    1.76    1.93
         search        0.47    0.53    0.52
         ins/del       2.76    3.89    4.22
         search        0.50    0.54    0.55
         delete        1.13    1.62    1.76
100,000  insert        2.96    3.94    4.36
         search        1.08    1.17    1.16
         ins/del       6.11    8.58    9.30
         search        1.12    1.20    1.22
         delete        2.50    3.66    3.95
200,000  insert        6.85    8.92    9.33
         search        2.41    2.58    2.57
         ins/del      13.86   19.49   20.46
         search        2.49    2.69    2.66
         delete        5.61    8.25    8.80
Time Unit: sec
Table 3.10. Run time on ordered real inputs (version 1 code)
n        operation    SBBST   BBSTD    BBST
10,000   insert        0.27    0.23    0.20
         search        0.08    0.07    0.07
         ins/del       0.53    0.50    0.43
         search        0.08    0.07    0.05
         delete        0.18    0.23    0.20
50,000   insert        1.43    1.25    1.12
         search        0.40    0.30    0.30
         ins/del       2.80    2.17    2.37
         search        0.40    0.30    0.30
         delete        1.07    0.90    0.97
100,000  insert        3.28    2.58    2.77
         search        0.90    0.62    0.63
         ins/del       6.15    4.70    5.13
         search        0.87    0.62    0.63
         delete        2.35    1.93    2.10
200,000  insert        7.37    4.55    4.92
         search        1.85    1.32    1.32
         ins/del      13.35   10.03   10.93
         search        1.87    1.33    1.33
         delete        5.08    4.17    4.43
Time Unit: sec
[Plot: total run time in seconds (sum of the times for parts (a)-(e) of the experiment) against n, with one curve each for SBBST, BBSTD, and BBST on random inputs and on ordered inputs.]
Figure 3.14. Run time on real inputs (version 1 code)
The average number of comparisons for each of the five parts of the experiment is given in Table 3.11 for the version 1 implementation. On the comparison measure, AVL, RBB, WB, and BBSTs are the front runners and are quite competitive with one another. On parts (a) (insert n elements) and (c) (insert n and delete n elements), AVL trees performed best while on the two search tests ((b) and (d)) and the deletion test (e), BBSTs performed best.
Table 3.12 gives the number of comparisons performed when ordered data (i.e., the elements in part (a) are 1, 2, ..., n, inserted in this order, and those in part (c) are n + 1, ..., 2n, inserted in this order) are used instead of random permutations of distinct elements. This experiment attempts to model realistic situations in which the inserted elements are in "nearly sorted order". BSTs were not included in this test as they perform very poorly with ordered data, taking $O(n^2)$ time to insert n items. The computer time needed to perform this test on BSTs was determined to be excessive. This test exhibited greater variance in performance. Among the deterministic structures, BBSTs outperformed the others in parts (a)-(d) while AVL trees were ahead in part (e). For part (a), BBSTs performed approximately 45% fewer comparisons than did AVL trees and approximately 12% fewer than WB trees. The randomized structure TRP was the best of the eight structures reported in Table 3.12 for part (a). It performed approximately 10% fewer comparisons than did BBSTs. However, the BBST remained best overall on parts (b), (c), and (d).
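The quadratic behavior of unbalanced BSTs on ordered data is easy to reproduce with a small Python sketch. It counts one comparison per node visited, which is a simplifying assumption and not the two-way comparison scheme of the version 1 code; the function names are hypothetical.

```python
import random


def bst_insert_comparisons(keys):
    """Insert keys into an unbalanced BST and return the total number of
    nodes visited, counting one comparison per visited node."""
    root = None                      # a node is a list [key, left, right]
    total = 0
    for x in keys:
        if root is None:
            root = [x, None, None]
            continue
        node = root
        while True:
            total += 1
            side = 1 if x < node[0] else 2
            if node[side] is None:
                node[side] = [x, None, None]
                break
            node = node[side]
    return total


def shuffled(n, seed=0):
    """The keys 1..n in a pseudo-random order."""
    keys = list(range(1, n + 1))
    random.Random(seed).shuffle(keys)
    return keys
```

With keys 1, 2, ..., n inserted in order, the tree degenerates into a right-going path and the count is n(n - 1)/2. Under the same counting assumption, n = 200,000 gives 19,999,900,000 comparisons for part (a) alone, whereas a random order needs far fewer.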
The heights of the trees (number of levels in the case of DSL and SKIP) for the experiments with random and ordered data are given in Tables 3.13 and 3.14, respectively. The first number in each table entry is the tree height after part (a) of the experiment and the second, the height after part (c). In all cases, the number of levels using skip lists is fewest. However, among the tree structures, AVL and BBST trees have the least height on random data and AVL has the least with ordered data.
Tables 3.15 and 3.16, respectively, give the number of rotations performed by each of the deterministic tree schemes for experiment parts (a), (c), and (e). Note that none of the schemes performs rotations during a search.
On ordered data, BBSTs perform about 25% more rotations than do the remaining structures. These remaining structures perform about the same number of rotations. On random data, AVL trees, bottom-up red-black trees and WB trees perform a comparable number of rotations. Top-down red-black trees and BBST trees
[Rotated full-page tables; contents not recoverable from the scan.]

Full Text 
PAGE 1
()),&,(17 $/*25,7+06 $1' '$7$ 6758&785(6 )25 9/6, &$' %\ 6(21*+81 &+2 $ ',66(57$7,21 35(6(17(' 72 7+( *5$'8$7( 6&+22/ 2) 7+( 81,9(56,7< 2) )/25,'$ ,1 3$57,$/ )8/),//0(17 2) 7+( 5(48,5(0(176 )25 7+( '(*5(( 2) '2&725 2) 3+,/2623+< 81,9(56,7< 2) )/25,'$
PAGE 2
$&.12:/('*0(176 0\ KHDUWIHOW DSSUHFLDWLRQ JRHV WR P\ DGYLVRU 3URIHVVRU 6DUWDM 6DKQL IRU JLYLQJ PH FRQWLQXHG JXLGDQFH LQ P\ WKHVLV ZRUN WKDQN KLP IRU WKH KHOS SDWLHQFH DQG VXSn
PAGE 3
7$%/( 2) &217(176 $&.12:/('*0(176 LL /,67 2) 7$%/(6 YLL /,67 2) ),*85(6 r $%675$&7 [L &+$37(56 ,1752'8&7,21 %DFNJURXQG 'LVVHUWDWLRQ 2XWOLQH 0,1,080 $5($ 2,1,1* 2) &203$&7(' &(//6 ,QWURGXFWLRQ 0D\HU 5LYHU 5RXWLQJ &RQVWUDLQW *UDSK 5HSUHVHQWDWLRQ +HXULVWLFV WR 0LQLPL]H $UHD +HXULVWLF +HXULVWLF +HXULVWLF ([SHULPHQWDO 5HVXOWV &RQFOXVLRQ $ 1(: :(,*+7 %$/$1&(' %,1$5< 6($5&+ 75(( ,QWURGXFWLRQ %DODQFHG 7UHHV DQG 5RWDWLRQV %%67V 6HDUFK ,QVHUW DQG 'HOHWH LQ D %%67 6HDUFK ,QVHUWLRQ 'HOHWLRQ (QKDQFHPHQWV 7RS 'RZQ $OJRULWKPV LLL
PAGE 4
6LPSOH %%67V %%67V ZLWKRXW 'HOHWLRQ ([SHULPHQWDO 5HVXOWV &RQFOXVLRQ :(,*+7 %,$6(' /()7,67 75((6 $1' 02',),(' 6.,3 /,676 ,QWURGXFWLRQ :HLJKW %LDVHG /HIWLVW 7UHHV 0RGLILHG 6NLS /LVWV 06/V $V 3ULRULW\ 4XHXHV ([SHULPHQWDO 5HVXOWV )RU 3ULRULW\ 4XHXHV &RQFOXVLRQ &21&/86,216 $ $%%5(9,$7,216 5()(5(1&(6 %,2*5$3+,&$/ 6.(7&+ ,9
PAGE 5
/,67 2) 7$%/(6 (UURU UDWH bf RYHU RSWLPDO O O ,PSURYHPHQW bf RYHU )DQJ 7LPH WDNHQ (UURU UDWH bf RYHU RSWLPDO ,PSURYHPHQW bf RYHU FDVHV ,PSURYHPHQW bf RYHU )DQJ 7LPH WDNHQ 7KH QXPEHU RI NH\ FRPSDULVRQV RQ UDQGRP LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf +HLJKW RI WKH WUHHV RQ UDQGRP LQSXWV YHUVLRQ FRGHf +HLJKW RI WKH WUHHV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI URWDWLRQV RQ UDQGRP LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI URWDWLRQV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP LQSXWV XVLQJ LQWHJHU NH\V YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG LQSXWV XVLQJ LQWHJHU NH\V YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP UHDO LQSXWV YHUVLRQ FRGHf Y
PAGE 6
5XQ WLPH RQ RUGHUHG UHDO LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV RQ UDQGRP LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf +HLJKW RI WKH WUHHV RQ UDQGRP LQSXWV YHUVLRQ FRGHf +HLJKW RI WKH WUHHV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI URWDWLRQV RQ UDQGRP LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI URWDWLRQV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP LQSXWV XVLQJ LQWHJHU NH\V YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG LQSXWV XVLQJ LQWHJHU NH\V YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG UHDO LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV RQ UDQGRP LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV RQ RUGHUHG LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG UHDO LQSXWV YHUVLRQ FRGHf 7KH QXPEHU RI NH\ FRPSDULVRQV 1XPEHU RI OHYHOV 5XQ WLPH 7KH QXPEHU RI NH\ FRPSDULVRQV +HLJKWOHYHO RI WKH VWUXFWXUHV 5XQ WLPH XVLQJ LQWHJHU NH\V 7KH QXPEHU RI NH\ FRPSDULVRQV YL
PAGE 7
+HLJKWOHYHO RI WKH VWUXFWXUHV \ $O PLQ WLPH XVLQJ PLHJHL .H\ D YL c
PAGE 8
/,67 2) ),*85(6 &HOO MRLQLQJ 0D\HU ULYHU URXWLQJ 5RXQG URELQ DQG JUHHG\ OD\HU DVVLJQPHQWV 0LQLPL]LQJ WKH QXPEHU RI WUDFNV RU OD\HUV &RQVWUDLQW JUDSK UHSUHVHQWDWLRQ 0HUJH LQ FRQVWUDLQW JUDSK +HXULVWLF +HXULVWLF +HXULVWLF // DQG 5/ URWDWLRQV $ WUHH LQ :%Of WKDW LV QRW AEDODQFHG fÂ§EDODQFHG WUHH WKDW LV QRW D &267 // URWDWLRQ IRU LQVHUWLRQ 6XEVWHS Lf RI LQVHUWLRQ /5 URWDWLRQ &DVH // IRU /5LLf URWDWLRQ &DVH /5 IRU /5LLf URWDWLRQ YLLL
PAGE 9
// URWDWLRQ IRU GHOHWLRQ /5 URWDWLRQ IRU GHOHWLRQ 5HVWUXFWXULQJ SURFHGXUH 6LPSOH UHVWUXFWXULQJ SURFHGXUH IRU LQVHUWLRQ 6LPSOH UHVWUXFWXULQJ SURFHGXUH IRU GHOHWLRQ 6LPSOH UHVWUXFWXULQJ SURFHGXUH ZLWKRXW D YDOXH 5XQ WLPH RQ UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ UDQGRP UHDO LQSXWV YHUVLRQ FRGHf 5XQ WLPH RQ RUGHUHG UHDO LQSXWV YHUVLRQ FRGHf ([DPSOH PLQ99%/7V PLQ:%/7 ,QVHUW PLQ:%/7 'HOHWHPLQ 6NLS /LVWV ,OO 0RGLILHG 6NLS /LVWV 06/ 6HDUFK 06/ ,QVHUW 06/ 'HOHWH 5XQ WLPH 706/ ,QVHUW 706/ 'HOHWHPLQ ,;
PAGE 10
706/ 'HOHWHPD[ 5XQ WLPH RQ UDQGRPO 5XQ WLPH RQ UDQGRP 5XQ WLPH RQ UDQGRPO 5XQ WLPH RQ UDQGRP [
PAGE 11
$EVWUDFW RI 'LVVHUWDWLRQ 3UHVHQWHG WR WKH *UDGXDWH 6FKRRO RI WKH 8QLYHUVLW\ RI )ORULGD LQ 3DUWLDO )XOILOOPHQW RI WKH 5HTXLUHPHQWV IRU WKH 'RFWRU RI 3KLORVRSK\ ()),&,(17 $/*25,7+06 $1' '$7$ 6758&785(6 )25 9/6, &$' %\ 6HRQJKXQ &KR 0D\ &KDLUPDQ 'U 6DUWDM 6DKQL 0DMRU 'HSDUWPHQW &RPSXWHU DQG ,QIRUPDWLRQ 6FLHQFH DQG (QJLQHHULQJ ,Q WKLV GLVVHUWDWLRQ ZH GHYHORS HIILFLHQW DOJRULWKPV DQG GDWD VWUXFWXUHV IRU SUREn OHPV WKDW DULVH LQ HOHFWURQLF FRPSXWHU DLGHG GHVLJQ (&$'f :H FRQVLGHU WKH SUREOHP RI MRLQLQJ D URZ RI FRPSDFWHG FHOOV VR DV WR PLQLPL]H WKH DUHD RFFXSLHG E\ WKH FHOOV DQG WKH LQWHUFRQQHFWV 7KH FHOO MRLQLQJ SURFHVV LQFOXGHV FHOO VWUHWFKLQJ DQG ULYHU URXWLQJ :H SURSRVH VHYHUDO KHXULVWLFV WR MRLQ D URZ RI FHOOV LQ VXFK D ZD\ WKDW DUHD LV PLQLPL]HG 7KH SURSRVHG KHXULVWLFV DUH FRPSDUHG H[SHULPHQWDOO\ ZLWK WKH SUHYLRXVO\ SURSRVHG KHXULVWLF :H GHYHORS D QHZ FODVV RI ZHLJKW EDODQFHG ELQDU\ VHDUFK WUHHV FDOOHG AEDODQFHG ELQDU\ VHDUFK WUHHV 7%%67Vf %%67V DUH GHVLJQHG WR KDYH UHGXFHG LQWHUQDO SDWK [L
PAGE 12
OHQJWK $V D UHVXOW WKH\ DUH H[SHFWHG WR H[KLELW JRRG VHDUFK WLPH FKDUDFWHULVWLFV ,QGLYLGXDO VHDUFK LQVHUW DQG GHOHWH RSHUDWLRQV LQ DQ Q QRGH %%67 WDNH 2ORJQf WLPH IRU \ fÂ§ ([SHULPHQWDO UHVXOWV FRPSDULQJ WKH SHUIRUPDQFH RI %%67V :%Tfn UHVXOWV IRU GRXEOH HQGHG SULRULW\ TXHXHV DUH SUHVHQWHG ;OO
PAGE 13
&+$37(5 ,1752'8&7,21 %DFNJURXQG ,Q 9/6, OD\RXW ZH DUH FRQFHUQHG ZLWK WUDQVIRUPLQJ D FLUFXLW IURP LWV ORJLFDO GHn VLJQ WR D SK\VLFDO LPSOHPHQWDWLRQ 7KH OD\RXW SUREOHP IRU 9/6, FLUFXLWV LV JHQHUDOO\ GHFRPSRVHG LQWR VPDOOHU SUREOHPV VXFK DV SDUWLWLRQLQJ IORRUSODQQLQJ SODFHPHQW URXWLQJ DQG FRPSDFWLRQ 7KH SDUWLWLRQLQJ SURFHVV GHFRPSRVHV D ODUJH FLUFXLWPRGXOH LQWR D FROOHFWLRQ RI VPDOOHU VXEFLUFXLWVPRGXOHV ,Q IORRUSODQQLQJ ORJLFDO FRPSRQHQWV RI D FLUFXLW DUH DVVLJQHG UHODWLYH SRVLWLRQV RQ D FKLS 7KH SK\VLFDO UHDOL]DWLRQ RI HDFK FRPSRQHQW LH LWV DUHD DQG DVSHFW UDWLRf
PAGE 14

PAGE 15
n VHDUFK DQG SULRULW\ TXHXH VWUXFWXUHV DUH SURSRVHG DQG WKRUn RXJKO\ FRPSDUHG ZLWK RWKHU GDWD VWUXFWXUHV )LQDOO\ LQ WKH ODVW FKDSWHU ZH SUHVHQW FRQFOXVLRQV RI WKLV ZRUN
PAGE 16
&+$37(5 0,1,080 $5($ 2,1,1* 2) &203$&7(' &(//6 ,QWURGXFWLRQ :KHQ GHVLJQLQJ FLUFXLWV ZLWK FRPSDFWHG V\PEROLF VWLFNV EDVLF FHOOV WKH FLUFXLW LV UHDOL]HG E\ D FROOHFWLRQ RI FRPSDFWHG FHOOV WKDW WLOH D WZRGLPHQVLRQDO DUHD 7KH LQWHUFHOO LQWHUFRQQHFWV DUH VXFK WKDW HDFK LQWHUFRQQHFW FRQQHFWV WZR WHUPLQDOV WKDW DUH RQ DGMDFHQW ERXQGDULHV RI QHLJKERULQJ FHOOV 6R IRU H[DPSOH LI FHOOV $ DQG % )LJXUH Dff DUH QHLJKERULQJ FHOOV RI WKH FLUFXLW WKHQ WKH ULJKW ERXQGDU\ RI $ LV DGMDFHQW WR WKH OHIW ERXQGDU\ RI % 7KH QXPEHU RI WHUPLQDOV RQ HDFK RI WKHVH ERXQGDULHV ZLOO EH WKH VDPH DQG WKH 3WK WHUPLQDO IURP WKH ERWWRPf RQ WKH ULJKW ERXQGDU\ RI $ LV WR EH FRQQHFWHG WR WKH 3WK WHUPLQDO IURP WKH ERWWRPf RQ WKH OHIW ERXQGDU\ RI % 6LQFH WKH FHOOV DUH DYDLODEOH LQ FRPSDFWHG IRUP LW LV QRW SRVVLEOH WR UHGXFH WKH GLVWDQFH EHWZHHQ DQ\ SDLU RI WHUPLQDOV RQ DQ\ VLGH RI D FHOO +RZHYHU WKLV GLVWDQFH FDQ EH LQFUHDVHG E\ VWUHWFKLQJ WKH FHOO ,Q WKH H[DPSOH RI )LJXUH Df ZH FDQ VWUHWFK HLWKHU FHOO YHUWLFDOO\ E\ GHILQLQJ D KRUL]RQWDO FXW OLQH DW DQ\ SRVLWLRQ DQG SXOOLQJ WKH WZR FHOO SLHFHV DSDUW E\ DQ\ GHVLUHG DPRXQW WKH FHOO FDQ DOVR EH VWUHWFKHG KRUL]RQWDOO\ E\ XVLQJ D YHUWLFDO FXW OLQHf
PAGE 17
Df+RUL]RQWDO DGMDFHQW FHOOV EfRLQLQJ E\ VWUHWFKLQJ FfRLQLQJ E\ ULYHU URXWLQJ Gf &RPELQDWLRQ FHOO MRLQLQJ )LJXUH &HOO MRLQLQJ 7KH UHTXLUHG LQWHUFRQQHFWV EHWZHHQ FHOOV $ DQG % RI )LJXUH Df FDQ EH DFn FRPSOLVKHG E\ VWUHWFKLQJ FHOOV $ DQG % VR WKDW WKH WHUPLQDOV RI $ DQG % OLQH XS DV LQ )LJXUH Ef 7KH EURNHQ OLQHV LQ )LJXUH Df LQGLFDWH WKH FXW OLQHV XVHG IRU VWUHWFKLQJ 7KH VWUHWFKLQJ HQDEOHV XV WR MRLQ FHOOV $ DQG % XVLQJ QR URXWLQJ WUDFNV E\ fMRLQf ZH PHDQ PDNH WKH LQWHUFRQQHFWV EHWZHHQ FHOOV $ DQG %f 7KLV PHWKRG RI MRLQLQJ FHOOV LV DOVR FDOOHG SLWFK PDWFKLQJ $QRWKHU ZD\ WR MRLQ FHOOV $ DQG % LV WR ULYHU URXWH WKH LQWHUFRQQHFWV DV LQ )LJXUH Ff 7KLV XVHV URXWLQJ WUDFNV LQ D FKDQQHO EHWZHHQ FHOOV $ DQG % EXW GRHV QRW LQFUHDVH FHOO KHLJKW 7KH SLWFK PDWFKLQJ DQG ULYHU URXWLQJ DSSURDFKHV WR FHOO MRLQLQJ KDYH EHHQ VWXGLHG LQ %R\HU >@ DQG :HVWH >@ $OJRULWKPV IRU VLQJOHOD\HU
PAGE 18
ULYHU URXWLQJ FDQ EH IRXQG LQ VHYHUDO ZRUNV > @ DQG WKRVH IRU PXOWLOD\HU ULYHU URXWLQJ FDQ EH IRXQG LQ %DUDW] >@ 6LQJOHOD\HU JULGOHVV ULYHU URXWLQJ LV VWXGLHG LQ 7RPSD >@ 7ZR DSSOLFDWLRQV RI ULYHU URXWLQJ DUH K\EULG FLUFXLW GHVLJQ DQG VWUXFWXUHG GHVLJQ '63f &HOO VWUHWFKLQJ RU SLWFK PDWFKLQJf LQFUHDVHV WKH KHLJKW RI WKH OD\RXW ZKLOH ULYHU URXWLQJ LQFUHDVHV LWV ZLGWK %RWK DIIHFW WKH OD\RXW DUHD 7KH OD\RXW RI )LJXUH Ef KDV DUHD 7R FRPSXWH WKH DUHD RI WKH OD\RXW RI )LJXUH Ff ZH DVVXPH WUDFNV KDYH XQLW VHSDUDWLRQ 6R WKH OD\RXW ZLGWK LV DQG KHLJKW LV 7KH OD\RXW KDV DUHD &KHQJ DQG 'HVSDLQ >6@ KDYH SURSRVHG XVLQJ D FRPELQDWLRQ RI FHOO VWUHWFKLQJ DQG ULYHU URXWLQJ VR DV WR REWDLQ OD\RXWV ZLWK VPDOOHU DUHD WKDQ SRVVLEOH ZKHQ RQO\ RQH RI WKHVH MRLQLQJ PHWKRGV LV XVHG )LJXUH Gf VKRZV WKH UHVXOW RI MRLQLQJ FHOOV $ DQG % XVLQJ ERWK VWUHWFKLQJ DQG ULYHU URXWLQJ 7KH DUHD RI WKLV OD\RXW LV 7KLV LV PLQLPXP IRU WKH LQVWDQFH RI )LJXUH Df &KHQJ DQG 'HVSDLQ >@ KDYH SURSRVHG D KHXULVWLF IRU VLQJOH OD\HU MRLQLQJ RI FRPSDFWHG FHOOV $W HDFK VWHS RI WKHLU KHXULVWLF HLWKHU D URZ RU FROXPQ RI FRPSDFWHG FHOOV LV MRLQHG )ROORZLQJ WKLV WKH URZ RU FROXPQ RI MRLQHG FHOOV LV UHSODFHG E\ D FRPSRVLWH FHOO WKDW UHSUHVHQWV WKH UHVXOW RI MRLQLQJ 1RWLFH WKDW ZKHQ D URZ FROXPQf RI FHOOV LV MRLQHG FHOOV PD\ EH VWUHWFKHG YHUWLFDOO\ KRUL]RQWDOO\f DQG ULYHU URXWLQJ LV GRQH LQ D YHUWLFDO KRUL]RQWDOf FKDQQHO 7R MRLQ D URZ RI FHOOV &KHQJ DQG 'HVSDLQ >@ ERXQG WKH PD[LPXP KHLJKW WR ZKLFK D FHOO PD\ EH VWUHWFKHG 7KLV ERXQG LV APDL A/ r K PD[f
PAGE 19
fFf DOJRULWKP WR ILQG WKH PLQLPXP DUHD MRLQ RI F FHOOV KDYLQJ D WRWDO RI Q WHUPLQDOV 7KLV DOJRULWKP GRHV DQ H[KDXVWLYH VHDUFK RYHU DOO SRVVLEOH QXPEHUV RI WUDFNV LQ WKH F fÂ§ URXWLQJ FKDQQHOV EHWZHHQ DGMDFHQW FHOOV $ FRQVWUDLQW JUDSK LV XVHG WR GHWHUPLQH WKH PLQLPXP KHLJKW OD\RXW IRU HDFK DVVLJQPHQW RI QXPEHU RI WUDFNV WR URXWLQJ FKDQQHOV 7KH WLPH UHTXLUHG SHU WUDFN DVVLJQPHQW LV Qf DQG WKH ZRUVW FDVH QXPEHU RI WUDFN DVVLJQPHQWV LV QFfFBf 7KH DOJRULWKP RI /LP >@ LV IODZHG DV LW KDQGOHV FKDQQHOV ZLWK ]HUR URXWLQJ WUDFNV E\ MRLQLQJ WKH DGMDFHQW FHOOV XVLQJ PLQLPXP KHLJKW FHOO VWUHWFKLQJ DQG WKHQ FRQVLGHUV WKH MRLQHG FHOOV DV RQH 7KLV SUREOHP LV HDVLO\ IL[HG KRZHYHU E\ FRPELQLQJ LQ WKH FRQVWUDLQW JUDSK SDLUV RI YHUWLFHV WKDW UHSUHVHQW FRUUHVSRQGLQJ WHUPLQDOV RI WKH WZR FHOOV LH ÂfWK WHUPLQDOV RI HDFK FHOOf ZLWK ]HUR URXWLQJ WUDFNV LQ EHWZHHQ ,Q WKLV FKDSWHU ZH FRQVLGHU WKH FDVH ZKHQ URXWLQJ OD\HUV DUH DYDLODEOH WR ULYHU URXWH WKH LQWHU FHOO FRQQHFWLRQV 1RWH WKDW ZKLOH PXOWLSOH OD\HUV GR QRW DIIHFW OD\RXW DUHD ZKHQ FHOO VWUHWFKLQJ DORQH LV XVHG D UHGXFWLRQ LQ DUHD LV SRVVLEOH ZKHQ
PAGE 20
Df OD\HU Ef OD\HU )LJXUH OD\HU ULYHU URXWLQJ FHOO VWUHWFKLQJ LV FRPELQHG ZLWK ULYHU URXWLQJ RU ZKHQ ULYHU URXWLQJ DORQH LV XVHG :H DVVXPH WKDW LQ HDFK OD\HU RI HDFK URXWLQJ FKDQQHO WKH LQWHUFRQQHFWV DUH WR EH DFFRPSOLVKHG XVLQJ ULYHU URXWLQJ $Q DOWHUQDWLYH LV WR XVH +9 URXWLQJ ZKHQ +9+ RU 9+9 URXWLQJ ZKHQ DQG H[WHQVLRQV RI +9+ DQG 9+9 URXWLQJ IRU +RZHYHU IRU ULYHU URXWLQJ LQVWDQFHV XVLQJ URXWLQJ OD\HUV LQ WKLV ZD\ KDV QR DGYDQWDJH RYHU ULYHU URXWLQJ LQ HDFK OD\HU VHH 7KHRUHP 6HFWLRQ f :KHQ WKH QXPEHU RI OD\HUV DYDLODEOH IRU ULYHU URXWLQJ LV LQFUHDVHG RQH PD\ VHH D GUDPDWLF UHGXFWLRQ LQ WKH QXPEHU RI URXWLQJ WUDFNV QHHGHG SHU OD\HU )LJXUH VKRZV DQ LQVWDQFH WKDW QHHGV Q WUDFNV ZKHQ URXWHG LQ RQH OD\HU EXW RQO\ RQH WUDFNOD\HU ZKHQ URXWHG LQ WZR OD\HUV :H EHJLQ LQ VHFWLRQ E\ VWDWLQJ WKH QHFHVVDU\ DQG VXIILFLHQW FRQGLWLRQV IRU D ULYHU URXWLQJ LQVWDQFH WR EH URXWDEOH LQ OD\HUV XVLQJ DW PRVW W WUDFNV SHU OD\HU DQG VWDWLQJ KRZ WR SHUIRUP OD\HU ULYHU URXWLQJ ZKHQ VXFK D URXWLQJ LV SRVVLEOH ,Q
PAGE 21
WKLV VHFWLRQ ZH DOVR VKRZ WKDW +9 VW\OH URXWLQJ KDV QR DGYDQWDJH RYHU ULYHU URXWLQJ LQ HDFK OD\HU ,Q 6HFWLRQ ZH GHVFULEH WKH FRQVWUDLQW JUDSK XVHG WR GHWHUPLQH PLQLPXP KHLJKW VWUHWFKLQJ RI F FHOOV +HXULVWLFV IRU WKH PLQLPXP DUHD MRLQLQJ RI F FHOOV DUH SURSRVHG LQ 6HFWLRQ DQG WKH UHVXOWV RI H[SHULPHQWV ZLWK WKHVH DUH SURYLGHG LQ 6HFWLRQ 2XU FRQFOXVLRQV DSSHDU LQ 6HFWLRQ 0D\HU 5LYHU 5RXWLQJ /HW $^%cf L P EH D VHW RI WHUPLQDO SDLUV VXFK WKDW WKH fV DUH RQ RQH VLGH VD\ OHIW RU WRSf RI D URXWLQJ FKDQQHO DQG WKH %IV DUH RQ WKH RWKHU ULJKW RU ERWWRPf VLGH 7HUPLQDO LV WR EH FRQQHFWHG WR WHUPLQDO %^ L P )RU WKLV FKDQQHO URXWLQJ LQVWDQFH WR EH DQ LQVWDQFH RI ULYHU URXWLQJ LW PXVW EH WKH FDVH WKDW DL D DP DQG EL E EP ZKHUH D DQG UHVSHFWLYHO\ JLYH WKH SRVLWLRQV RI WHUPLQDOV $ DQG L P :H PD\ DVVXPH DQ XQGHUO\LQJ JULG ZLWK HDFK WHUPLQDO EHLQJ DW D JULG SRVLWLRQ ,Q WKH FDVH RI D KRUL]RQWDO YHUWLFDOf FKDQQHO WKH DV DQG V DUH JULG FROXPQ URZf QXPEHUV /HLVHUVRQ DQG 3LQWHU >@ KDYH REWDLQHG WKH IROORZLQJ QHFHVVDU\ DQG VXIILFLHQW FRQGLWLRQ IRU D ULYHU URXWLQJ LQVWDQFH WR EH URXWDEOH LQ D VLQJOH OD\HU XVLQJ DW PRVW W WUDFNV 7KHRUHP >@ 7KH ULYHU URXWLQJ LQVWDQFH GHILQHG DERYH LV URXWDEOH LQ D VLQJOH OD\HU XVLQJ DW PRVW W WUDFNV LI DQG RQO\ LI Df DLW EL!W Ef EW D^!W IRU HYHU\ L P fÂ§ W
PAGE 22
)RU WKH JHQHUDO FDVH RI OD\HUV ZH REWDLQ WKH QHFHVVDU\ DQG VXIILFLHQW FRQGLWLRQ RI 7KHRUHP 7KHRUHP 7KH ULYHU URXWLQJ LQVWDQFH GHILQHG DERYH LV URXWDEOH LQ O OD\HUV HDFK OD\HU URXWLQJ ZKROH QHWVf XVLQJ DW PRVW W WUDFNV SHU OD\HU LI DQG RQO\ LI Df DLLW E W Ef ELLW 2L!W IRU HYHU\ L P fÂ§ ,W 3URRI )LUVW ZH HVWDEOLVK WKDW Df DQG Ef DUH QHFHVVDU\ IRU OD\HU URXWLQJ 6LQFH WKH SURRIV IRU Df DQG Ef DUH VLPLODU ZH SURYLGH WKDW IRU Df RQO\ 6XSSRVH WKDW DW fÂ§ EL W IRU VRPH L &RQVLGHU WKH ,W WHUPLQDO SDLUV $ %Mf L M L ,W :KHQ URXWLQJ WKHVH RQ OD\HUV DW OHDVW RQH OD\HU KDV WR EH DVVLJQHG W WHUPLQDO SDLUV 6R VXSSRVH WKDW WHUPLQDO SDLUV $?%>f $n%cf ^$nW %nWOf DUH DVVLJQHG WR WKH VDPH OD\HU IRU ULYHU URXWLQ J :H PD\ DVVXPH WKDW D> Dn Dn DQG E> En nL 6LQFH DnWO DW DQG M DnW fÂ§ E> DLX fÂ§ W )URP 7KHRUHP LW IROORZV WKDW WKH WHUPLQDO SDLUV $IL %Mf M W FDQQRW EH ULYHU URXWHG RQ D VLQJOH OD\HU +HQFH $M %Mf L M L Â FDQQRW EH ULYHU URXWHG RQ OD\HUV 6R ^$M IMf M P FDQQRW EH ULYHU URXWHG RQ OD\HUV $V D UHVXOW Df LV D QHFHVVDU\ FRQGLWLRQ 7R VKRZ WKDW Df DQG Ef DUH VXIILFLHQW FRQGLWLRQV IRU URXWDELOLW\ ZH SUHVHQW WZR DOJRULWKPV 5RXQG5RELQ DQG *UHHG\f WKDW DVVLJQ WKH QHWV WR OD\HUV LQ VXFK D ZD\ WKDW HDFK OD\HU LV ULYHU URXWDEOH ZKHQ ERWK Df DQG Ef DUH VDWLVILHG 7KH FRUUHFWQHVV
PAGE 23
SURFHGXUH 5RXQG5RELQ ^ $VVLJQ WKH P QHWV WR OD\HUV ` EHJLQ IRU L WR P GR DVVLJQ QHW $ %If WR OD\HU L PRG f HQG SURFHGXUH *UHHG\ ^ $VVLJQ WKH P QHWV WR O OD\HUV ` EHJLQ IRU L WR P GR DVVLJQ QHW $L%^f WR OD\HU T VXFK WKDW T LV WKH VPDOOHVW LQWHJHU IRU ZKLFK WKH FRQGLWLRQV RI 7KHRUHP D[H QRW YLRODWHG RQ OD\HU T LI WKHUH LV QR VXFK T WKHQ IDLOf HQG )LJXUH 5RXQG URELQ DQG JUHHG\ OD\HU DVVLJQPHQWV RI WKHVH DOJRULWKPV LV HVWDEOLVKHG LQ 7KHRUHPV DQG UHVSHFWLYHO\ Â’ :H ODWHU GLVFRYHUHG WKDW %DUDW] >@ KDV QRW RQO\ REWDLQHG WKH VDPH FRQGLWLRQ EXW DOVR SURSRVHG WKH VDPH WZR DOJRULWKPV IRU OD\HU ULYHU URXWLQJ 2QH DVVLJQV QHWV WR OD\HUV LQ D URXQG URELQ IDVKLRQ DQG WKH RWKHU XVHV D JUHHG\ VWUDWHJ\ 7KH FRUUHVSRQGLQJ SURFHGXUHV DUH JLYHQ LQ )LJXUH 7KHRUHP 7KH OD\HU DVVLJQPHQW SURGXFHG E\ WKH 5RXQG5RELQ SURFHGXUH LV ULYHU URXWDEOH LI Df 2L Ec W Ef ELLW DL!W IRU DOO L P fÂ§ ,W
Proof. Let $(A'_1, B'_1), (A'_2, B'_2), \ldots$ be the nets assigned to layer $j \bmod l$, $1 \le j \le l$. So, $a'_i = a_{j+(i-1)l}$ and $b'_i = b_{j+(i-1)l}$. Hence, $a'_{i+t} - b'_i = a_{j+(i+t-1)l} - b_{j+(i-1)l} \ge t$ (from (a)). Similarly, $b'_{i+t} - a'_i \ge t$. So, the layer assignment satisfies the conditions of the single-layer theorem and is river routable using $t$ tracks. □

Theorem. If (a) $a_{i+lt} - b_i \ge t$ and (b) $b_{i+lt} - a_i \ge t$ for all $i$, $1 \le i \le m - lt$, then procedure Greedy assigns nets to layers such that the assignment to each layer is routable using $t$ tracks.

Proof. If procedure Greedy is able to assign each of the $m$ nets to a layer, then the layer assignments satisfy the conditions of the single-layer theorem and so are routable using $t$ tracks. Suppose the algorithm fails while trying to assign net $(A_r, B_r)$ to a layer. At this time, nets $(A_i, B_i)$, $1 \le i < r$, have been assigned to layers so as to satisfy the conditions of the single-layer theorem, and the assignment of net $(A_r, B_r)$ to each of these layers violates these conditions.

Consider first those layers $L_a$ on which condition (a) is violated. For a layer $s \in L_a$, suppose that the assigned nets are $(A'_1, B'_1), \ldots, (A'_{j-1}, B'_{j-1})$. Let $(A'_j, B'_j) = (A_r, B_r)$. Since $s \in L_a$, we have $a'_j - b'_{j-t} < t$. Now, if $b_{r-lt} \ge b'_{j-t}$, then $a_r - b_{r-lt} < t$, which violates condition (a) of this theorem. So, $b_{r-lt} < b'_{j-t}$. Since $b_{r-lt} < b'_{j-t} < b'_{j-t+1} < \cdots < b'_{j-1}$, $t$ of the $lt - 1$ nets
$(A_{r-lt+1}, B_{r-lt+1}), \ldots, (A_{r-1}, B_{r-1})$ have been assigned to layer $s \in L_a$. Consequently, the layers in $L_a$ account for $t|L_a|$ of these $lt - 1$ nets. In a similar way, we can show that the remaining $l - |L_a|$ layers account for another $t(l - |L_a|)$ of these nets. This gives us a total of $tl$ nets, whereas we had only $tl - 1$. This contradiction implies that procedure Greedy cannot fail unless conditions (a) and (b) are not satisfied. □

Procedure RoundRobin is easily seen to have complexity $O(m)$. A straightforward implementation of procedure Greedy will have complexity $O(ml)$. However, by using priority search trees [], the complexity can be reduced to $O(m \log l)$. In practice, since $l$ is quite small, it is unlikely that the priority search tree implementation will run faster than the straightforward implementation, in which the layers are checked in sequence. The actual routing for all $l$ layers can be done in $O(mt)$ time.

In the procedure that minimizes the number of tracks (or layers), $t$ ($l$) is increased only if the current $t$ ($l$) is found to be infeasible. The complexity is $O(m)$ as neither
$i$ nor $t$ ($l$) can exceed $m$. So, neither clause of the if statement can be executed more than $m - 1$ times.

   procedure MinimizeTracks {or MinimizeLayers}
   { Determine the minimum number of tracks per layer
     (or minimum number of layers) needed for multilayer river routing }
   begin
      t := 1 {or l := 1}; i := 1;
      while (i <= m - lt) do
         if (a_{i+lt} - b_i < t) or (b_{i+lt} - a_i < t)
         then t := t + 1 {or l := l + 1}
         else i := i + 1;
   end

Figure. Minimizing the number of tracks or layers

Using the multilayer river routing results of Baratz [], one can trivially extend all the results of Lim, Cheng, and Sahni [] to the case of multilayer joining of compacted cells. So, the multilayer minimum area join of two compacted cells with $m$ nets can be obtained in $O(m)$ time. If we wish to minimize the maximum wire length while keeping area minimum, the asymptotic time complexity is still $O(m)$. The total wire length can be minimized, while keeping area minimum, in $O(m \log m)$ time.

In HV style routing, each routing layer is assigned a routing direction (either H or V). In an H (V) layer, only horizontal (vertical) wire segments can be laid out. Horizontal segments on one layer connect to vertical segments of the same net on another layer by means of vias. In the case of river routing instances, one can see that there is no advantage to having more than two V-layers (i.e., two V-layers are sufficient to route all river routing instances).
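The layer-assignment and track-minimization procedures above can be sketched in Python as follows. This is an illustrative sketch, not the dissertation's code: it assumes the nets are given sorted by terminal position as pairs `(a_i, b_i)`, that `t >= 1`, and that the per-layer test is the single-layer condition (a net must clear the net `t` places before it on the same layer). The function names and the starting value `t = 1` are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Net = Tuple[int, int]  # (a_i, b_i): terminal positions on the two sides of the channel


def fits(layer: Sequence[Net], net: Net, t: int) -> bool:
    """Assumed single-layer test (t >= 1): compare the new net with the
    net t places before it on this layer."""
    if len(layer) < t:
        return True
    a_old, b_old = layer[len(layer) - t]
    a_new, b_new = net
    return a_new - b_old >= t and b_new - a_old >= t


def round_robin(nets: Sequence[Net], l: int) -> List[List[Net]]:
    """Assign net i to layer i mod l."""
    layers: List[List[Net]] = [[] for _ in range(l)]
    for i, net in enumerate(nets):
        layers[i % l].append(net)
    return layers


def greedy(nets: Sequence[Net], l: int, t: int) -> Optional[List[List[Net]]]:
    """Assign each net to the lowest-numbered layer on which it fits;
    return None if some net fits on no layer."""
    layers: List[List[Net]] = [[] for _ in range(l)]
    for net in nets:
        for layer in layers:
            if fits(layer, net, t):
                layer.append(net)
                break
        else:
            return None
    return layers


def minimize_tracks(nets: Sequence[Net], l: int) -> int:
    """Smallest t (starting from 1, an assumption) satisfying conditions
    (a) and (b) for l layers; i is never reset, as in the procedure above."""
    m = len(nets)
    t, i = 1, 0  # i is 0-based here
    while i + l * t < m:
        a_i, b_i = nets[i]
        a_j, b_j = nets[i + l * t]
        if a_j - b_i < t or b_j - a_i < t:
            t += 1
        else:
            i += 1
    return t
```

For example, with the four nets `[(1, 3), (2, 4), (3, 5), (4, 6)]` the sketch reports four tracks on one layer and two tracks per layer on two layers.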
Let $RR(l,t)$ be the set of all river routing instances that can be routed in $l$ layers using $t$ tracks per layer and using river routing in each layer. Let $HV(l,t)$ be all river routing instances that can be routed using HV style routing, $l$ layers, and $t$ tracks per layer. Note that $HV(l,t)$ includes instances routable with different numbers of V-layers. Let $HVV(l,t)$ be all river routing instances routable using HV style routing, $l - 2$ H-layers, and 2 V-layers. The first two theorems below hold for both the knock-knee [] and directional HV models. The third holds only for the directional model.

Theorem. $HV(l,t) \subset RR(l,t)$ for every $l$ and every $t$.

Proof. $HV(l,t) \subseteq RR(l,t)$ follows from a more general result obtained by Baratz []. Baratz [] has shown that for river routing instances there is no advantage to using any routing scheme that wires a net on more than one layer. Since it is easy to construct river routing instances $X$ such that $X \in RR(l,t)$ and $X \notin HV(l,t)$, it follows that $HV(l,t) \subset RR(l,t)$.

We provide a simpler proof of $HV(l,t) \subseteq RR(l,t)$. This proof will also establish our next result. We shall show that if $X$ is a river routing instance such that $X \notin RR(l,t)$, then $X \notin HV(l,t)$. Hence, $HV(l,t) \subseteq RR(l,t)$. Suppose that $X \notin RR(l,t)$. From the multilayer routability theorem, it follows that $a_{i+lt} - b_i < t$ or $b_{i+lt} - a_i < t$ for some $i$. Suppose that $a_{i+lt} - b_i < t$ (the proof is similar when $b_{i+lt} - a_i < t$). So, $a_{i+lt} < b_i + t$. Since $X$ is a river routing instance, at least nets $i + t, \ldots, i + lt$ intersect a vertical cut line drawn at $a_{i+lt}$. Hence, the density of $X$ at $a_{i+lt}$ is at least $i + lt - (i + t) + 1 = (l - 1)t + 1$. When HV style routing is used with
$l > 1$ layers, at most $l - 1$ layers are available for horizontal routes. With $t$ tracks per layer, densities of at most $(l - 1)t$ can be accommodated. So, $X \notin HV(l,t)$. □

Theorem. $HVV(l,t) \subset RR(l - 1, t)$ for every $l$ and every $t$.

Proof. As in the preceding theorem, suppose that $X \notin RR(l - 1, t)$. Let $i$ be such that $a_{i+(l-1)t} - b_i < t$. The net density at $a_{i+(l-1)t}$ is at least $(l - 2)t + 1$. In HVV routing, two layers are V-layers. So, only $l - 2$ layers are available for horizontal segments. This is not enough, as the horizontal segment density is at least $(l - 2)t + 1$ at $a_{i+(l-1)t}$. Hence, $X \notin HVV(l,t)$. One may easily construct river routing instances that are in $RR(l - 1, t)$ but not in $HVV(l,t)$. □

Theorem. $RR(1,t) \not\subset HV(l,t) \cup HVV(l,t)$ for every $l$ and every $t$.

Proof. Consider the RR instance $(a_1, b_1) = (\ldots)$ and $(a_2, b_2) = (\ldots)$. This is in $RR(1,t)$ for every $t$ but is not in $HV(l,t) \cup HVV(l,t)$ for any $l$. □

As remarked earlier, the preceding theorem holds only for the directional model. For the knock-knee model, one can show that $RR(l,t)$ is contained in a corresponding HV class for every $l$ and every $t$.

Constraint Graph Representation

Lim [] has proposed the use of a constraint graph to determine the terminal positions in a row of compacted cells. This is for the case when the number of tracks
... Figure (b) shows the chains (solid edges) for the four cell row of Figure (a). To complete the constraint graph, directed edges are added to introduce the channel routing constraints of the routability theorem. These are represented by the broken edges of Figure (b). Figure (b) is for the two layer case.

Lim [] has shown that the constraint graph is acyclic provided the number of tracks in each routing channel is at least one. He has proposed handling channels with zero tracks by finding first the minimum area joining of the adjacent cells (only cell stretching is permitted now) and then combining these two cells into one. I.e., the two cells are replaced by their minimum area join. This strategy can be shown to result in nonoptimality of the algorithm proposed in Lim []. To preserve optimality, it is necessary to merge the vertices that represent terminals that are the endpoints of
Figure. Constraint graph representation

nets that are to be routed using no tracks, as in the figure. The resultant constraint graph is also acyclic.

It is easy to see that the number of vertices and edges in the constraint graph is $O(n)$, where $n$ is the total number of terminals. Furthermore, the graph can be constructed in $O(n)$ time given the number of routing layers and the number of tracks in each channel. The constraint graph described by us is identical to that of Lim [] except in the way channels with zero tracks are handled and in that our graph is defined for $l$ routing layers while that of Lim [] is only for the single-layer case.
The length of the longest path from the source vertex of the constraint graph to each of the remaining vertices can be computed in $O(n)$ time by doing this in topological order []. It is easy to see that if each terminal is placed at a vertical position given by the longest path length from the source, then all nets can be routed in the given number of tracks (as the conditions of the routability theorem are satisfied in each routing channel). Furthermore, Lim [] has shown that such a positioning of terminals results in a stretched layout of minimum height for the given channel widths. As a result, when channel widths are known, cells can be stretched to minimize area in $O(n)$ time. The channel widths that result in minimum area can be determined in $O(n(n/c)^c)$ time, where $c$ is the number of cells, by trying out all
   procedure Heuristic1
   begin
      for i := 1 to c - 1 do
      ...

... The time to do this for all pairs of adjacent
cells is $\sum_i O(n_i^2) = O(n^2)$. So, the for loop iteration with $i = 1$ takes $O(n^2)$ time. ... cells being joined are tried, the constraint graph is used to determine the minimum height of the combined cell. So, the time to combine two (composite) cells with $n_i$ terminals in the channel between them is $O(n n_i)$. Hence, the time for the remaining $c - 2$ iterations is $O(n \sum_i n_i) = O(n^2)$. The overall complexity of Heuristic 1 is, therefore, $O(n^2)$. In case the terminals are uniformly distributed over the cells, $n_i = O(n/c)$ for all $i$. The time for the first iteration of the for loop is now $O(n^2/c)$ and that for each of the remaining iterations is $O(n^2/c)$. The overall time is $O(n^2)$.

Heuristic 2

In this heuristic, we begin by assigning each channel the number of tracks needed to route the channel with no cell stretching. This number can be determined in $O(n_i)$ time for a channel with $n_i$ nets, as described in the section on minimizing the number of tracks. The time taken to do this for all $c - 1$ channels is $O(n)$. The configuration obtained in this way is the maximum width layout. Starting from this configuration, we reduce the total number of tracks available across all $c - 1$ channels by one on each iteration. For this, the
effect of a one track reduction is computed for each channel. The minimum layout height is determined by computing the length of the longest path in the constraint graph described earlier. The track reduction is done in the channel that results in the smallest layout height (hence the minimum area for the given number of tracks). The algorithm is stated more formally in the figure below. When the algorithm terminates, $A$ is the area of the minimum area join found by the heuristic. To reconstruct the layout, it is necessary to store the tracks per channel each time $A$ is updated in the statement $A := \min\{A, a_j\}$.

   procedure Heuristic2
   begin
      for each channel i, 1 <= i < c, determine the number of tracks t_i
         needed to route with no stretching;
      t := sum of t_i, 1 <= i < c;
      set up the constraint graph using t_i tracks in channel i, 1 <= i < c;
      compute layout area A;
      for tracks := t downto ... do {reduce by 1}
      begin
         for i := 1 to c - 1 do
         begin
            reduce the number of tracks in channel i by 1;
            modify the constraint graph to reflect this;
            determine the length of the longest path in the graph
               and from this the layout area a_i;
         end;
         select j such that a_j = min{ a_i };
         reduce the number of tracks in channel j by 1;
         A := min{ A, a_j };
      end;
   end

Figure. Heuristic 2

For the time complexity, we see that the steps that precede the outer for loop take $O(n)$ time. Each iteration of the outer loop takes $O(nc)$ time. Hence, this loop
contributes a total of $O(nct)$ to the time. Since $t = O(n)$, the overall time complexity of Heuristic 2 is $O(n^2 c)$.

Heuristic 3

Unlike Heuristic 2, which attempts to minimize the layout height for each value of $t$, the total number of tracks, Heuristic 3 attempts to minimize the width (i.e., total number of tracks) for each choice of layout height. The heuristic begins with a layout height $ht$ equal to the height of the tallest compacted cell. At each iteration, the next layout height to use is computed as described later. During each iteration, cells are combined in groups of at most $k$ ($k$ is a parameter to the heuristic) ... channels between the $k$ composite cells. We found this to give better results than when composite cells were regarded as atomic.

For the case $k = 2$, the minimum area is determined by a binary search over the
number of tracks in the single channel. This takes $O(n \log n_i)$ time, where $n_i$ is the number of nets in channel $i$. Thus, the time needed for the inner repeat loop when $k = 2$ is $O(cn \log n)$ (for uniform terminal distribution, it is $O(cn \log(n/c))$). During the binary search, the heights corresponding to channel widths that require height greater than $ht$ are recorded. The minimum of these heights yields the next value of $ht$.

   procedure Heuristic3
   begin
      ht := height of the tallest cell;
      repeat { minimize width subject to height <= ht }
         repeat { do this by combining k cells at a time }
            select k adjacent cells such that the minimum height cell is
               selected and the height of the tallest selected cell is minimum
               (if there are fewer than k cells, then select all of them);
            obtain the minimum area layout for the selected cells under the
               constraint that the layout height does not exceed ht;
            during the preceding step, record the next value of ht
               that is possible for a layout;
         until one cell remains;
         compute the area of the remaining cell and record it if it is
            less than the minimum area found so far;
         if there is no next height then terminate;
         ht := next height;
      until false;
   end

Figure. Heuristic 3

When $k > 2$, all track combinations for the $k - 1$ channels are tried, as in the exhaustive method described earlier. Again, each composite cell is broken up into its basic cells. As different track combinations are tried, we record the minimum height greater than $ht$ that results from any track combination. This gives the next value of $ht$. The time for the inner repeat loop is $O(\frac{c}{k-1}\, n\, (\max_i n_i)^{k-1})$ (or $O(\frac{c}{k-1}\, n\, (n/c)^{k-1})$ when terminals are uniformly distributed).
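All three heuristics evaluate a candidate track assignment by computing longest path lengths in the acyclic constraint graph. A minimal Python sketch of that step is given below. The vertex numbering, the `(u, v, weight)` edge format, and the function name are assumptions made for illustration; the dissertation does not prescribe a representation.

```python
from collections import deque
from typing import List, Sequence, Tuple


def longest_paths(n: int, edges: Sequence[Tuple[int, int, int]], source: int = 0) -> List[float]:
    """Longest path length from `source` to every vertex of a DAG with n
    vertices, processing vertices in topological order (Kahn's algorithm).
    Vertices unreachable from the source keep the value -inf."""
    adj: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1

    dist = [float("-inf")] * n
    dist[source] = 0
    queue = deque(v for v in range(n) if indeg[v] == 0)
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v, w in adj[u]:
            if dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if processed != n:
        raise ValueError("constraint graph contains a cycle")
    return dist
```

Each vertex and edge is handled once, so the running time is linear in the size of the graph, in line with the $O(n)$ bound quoted for the constraint graph.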
In all our experiments, the outer repeat loop was iterated fewer than $(k - 1)n$ times. To ensure that the number of iterations is $O(kn)$, one may adopt the following scheme. When the number of iterations first reaches $(k - 1)$ ... as the next height. Again, two iterations of the outer repeat loop are done. Next, the heuristic is resumed with $\max\{h, ht\}$ as the next height. This continues until we have gone through $p$ resumptions of the heuristic. With this scheme to limit the number of iterations, the complexity of Heuristic 3 becomes $O(cn^2 \log n)$ when $k = 2$ and $O(\frac{c}{k-1}\, k n^2 (\max_i n_i)^{k-1}) = O(cn^2 (\max_i n_i)^{k-1})$ when $k > 2$. For the case when the $n$ terminals are uniformly distributed over the $c$ cells, the complexity is $O(cn^2 \log(n/c))$ when $k = 2$ and $O(\frac{c}{k-1}\, k n^2 (n/c)^{k-1}) = O(cn^2 (n/c)^{k-1})$ when $k > 2$.

One may verify that, since Heuristic 3 tries the maximum useful height (i.e., the height needed when no routing tracks are available), it generates optimal solutions when $k = c$.

Experimental Results

We programmed our three heuristics as well as the heuristic Fang [] in C and ran tests on a single KSR processor. Optimal solutions for instances with up to nine
cells were obtained using the corrected version of the exhaustive search algorithm of Lim []. Our test set consisted of instances that had a number of cells $c$ equal to one of the numbers in the set {...}. For each value of $c$, there were twenty instances, and the results were averaged over these instances. An instance with $c$ cells had $c - 1$ routing channels. The number $t_i$ of terminals on either side of each routing channel was equal to ... for $c$ ... and was ... for the other values of $c$. In addition, when $c$ = ..., we also had instances with ... terminals on either side. In our experiments, we considered only single layer and two layer routing.

The first table gives the average percentage by which the area of the single layer solutions generated by each of the heuristics exceeded the area of the single layer optimal solution. As is evident, each of the heuristics proposed in this chapter gave noticeably better solutions than did Fang. This table covers only the smaller values of $c$, as for larger $c$ the optimal algorithm of Lim [] required too much time to complete. When $k = c$, Heuristic 3 is guaranteed to generate an optimal solution. So, we did not run these cases.

In the second table, we have used the single layer solution produced by Fang as the benchmark against which the solutions obtained by our three heuristics are compared. This table gives the average percentage by which the area of the solutions produced by our heuristics is less than that of the solutions produced by Fang. Our solutions have area ... to ...% less.

The third table compares the computing time requirements of the various algorithms for the case of one layer. The optimal algorithm is useful only for small values of $c$.
[Table. Error rate (%) over optimal, single layer. $t$ = number of terminals on each side of each routing channel. * $k = c$.]

[Table. Improvement (%) over Fang, single layer. Columns: cells, $t$, Heuristic 1, Heuristic 2, Heuristic 3 (for several values of $k$). * excessive run time.]
[Table. Time taken ($l = 1$). Columns: cells, $t$, Fang, Heuristic 1, Heuristic 2, Heuristic 3 (for several values of $k$), Optimal. * Times are in seconds. † Times are in hours.]

... say up to ...% less area than when
[Table. Error rate (%) over optimal, two layers.]

[Table. Improvement (%) over Fang, two layers. Columns: cells, $t$, Heuristic 1, Heuristic 2, Heuristic 3 (for several values of $k$).]
[Table. Improvement (%) over Fang. Columns: cells, $t$, Heuristic 1, Heuristic 2, Heuristic 3 (for several values of $k$).]

This table is the analog of the single layer table for the case of two layers. The results are similar to those in the single layer table. The next table gives the average computing times for the two layer instances. These are less than for the one layer case, as the constraint graph has fewer edges.

For large $c$, we recommend the use of heuristic ... (or ... with $k$ = ...), and for small $c$ we recommend using heuristic ... with $k$ = ... or ....

Conclusion

We have considered the problem of joining a row of compacted cells and developed heuristics to stretch cells and river-route the nets so that the layout area is minimized. Our proposed heuristics were compared experimentally with Fang [] and found to produce layouts with less area. However, Fang is faster. We recommend the use of our Heuristic ... with $k$ = ... or ... in practice.
[Table. Time taken, two layers. Columns: cells, $t$, Fang, Heuristic 1, Heuristic 2, Heuristic 3 (for several values of $k$), Optimal. * Times are in seconds. † Times are in hours.]
... are balanced binary search trees. When representing a dictionary with $n$ elements using one of these schemes, the corresponding binary search tree has height $O(\log n)$, and individual search, insert, and delete operations take $O(\log n)$ time. When (unbalanced) binary search trees, treaps, or skip lists are used, each operation has an expected complexity of $O(\log n)$, but the worst case complexity is $O(n)$. When hash tables are used, the expected complexity is $O(1)$ per operation. However, the worst case complexity is $O(n)$. So, in applications where a worst case complexity guarantee is critical, one of the balanced binary search tree schemes is to be preferred.
In this chapter, we develop a new balanced binary search tree called $\beta$-BBST ($\beta$-balanced binary search tree). Like WB($\alpha$) trees, this achieves balancing by controlling the relative number of nodes in each subtree. However, unlike WB($\alpha$) trees, during insert and delete operations, rotations are performed along the search path whenever they reduce the internal path length of the tree (rather than only when a subtree is out of balance). As a result, the constructed trees are expected to have a smaller internal path length than the corresponding WB($\alpha$) tree. Since the average search time is closely related to the internal path length, the time needed to search in a $\beta$-BBST is expected to be less than that in a WB($\alpha$) tree.

In the next section, we define the total search cost of a binary search tree and show that the rebalancing rotations performed in AVL and red-black trees might increase this metric. We also show that while similar rotations in WB($\alpha$) trees do not increase this metric, insert and delete operations in WB($\alpha$) trees do not avail of all opportunities to reduce the metric. We then define $\beta$-BBSTs and show their relationship to WB($\alpha$) trees. Search, insert, and delete algorithms for $\beta$-BBSTs are developed next. A simplified version of $\beta$-BBSTs is then developed. Search, insert, and delete operations for this version also take $O(\log n)$ time each. An even simpler version of $\beta$-BBSTs is developed after that. For this version, we show that the average cost of an insert and search operation is $O(\log n)$ provided no deletes are performed.

An experimental evaluation of $\beta$-BBSTs and competing schemes for dictionaries (AVL, red-black, skip lists, etc.) was done, and the results of this are presented in
the final section of the chapter. That section also compares the relative performance of $\beta$-BBSTs and the two simplified versions.

Balanced Trees and Rotations

Following an insert or delete operation in a balanced binary search tree (e.g., AVL, red-black, WB($\alpha$), etc.) ... is the original node $p$; however, its subtrees are different).

Let $h(x)$ be the height of the subtree with root $x$. Let $s(x)$ be the number of nodes in this subtree. When searching for an element $x$, $x$ is compared with one element at each of $l(x)$ levels, where $l(x)$ is the level at which $x$ is present (the root is at level 1). So, one measure of the "goodness" of the binary search tree $T$ for search operations (assuming each element is searched for with equal probability) is its total search cost, defined as

$$C(T) = \sum_{x \in T} l(x).$$
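The total search cost defined above is easy to compute directly. The following Python sketch (the `Node` class and function names are illustrative assumptions) computes $C(T)$ together with the internal and external path lengths, so that the standard identities $C(T) = I(T) + n$ and $E(T) = I(T) + 2n$ can be checked on small trees.

```python
from typing import Optional


class Node:
    """Bare binary tree node; keys are irrelevant for path-length measures."""

    def __init__(self, left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.left = left
        self.right = right


def size(t: Optional[Node]) -> int:
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def total_search_cost(t: Optional[Node], level: int = 1) -> int:
    """C(T): sum of the levels of all nodes, with the root at level 1."""
    if t is None:
        return 0
    return level + total_search_cost(t.left, level + 1) + total_search_cost(t.right, level + 1)


def internal_path_length(t: Optional[Node], depth: int = 0) -> int:
    """I(T): sum of the depths of all nodes, with the root at depth 0."""
    if t is None:
        return 0
    return depth + internal_path_length(t.left, depth + 1) + internal_path_length(t.right, depth + 1)


def external_path_length(t: Optional[Node], depth: int = 0) -> int:
    """E(T): sum of the depths of all external (null) positions."""
    if t is None:
        return depth
    return external_path_length(t.left, depth + 1) + external_path_length(t.right, depth + 1)
```

A left-leaning path of three nodes has $C = 6$, while the full tree on three nodes has $C = 5$, which is why rotations that shorten paths reduce the metric.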
[Figure. LL and LR rotations: (a) LL rotation, (b) LR rotation]

Notice that $C(T) = I(T) + n$, where $I(T)$ is the internal path length of $T$ and $n$ is the number of elements/nodes in $T$. The cost of unsuccessful searches is equal to the external path length $E(T)$. Since $E(T) = I(T) + 2n$, minimizing $C(T)$ also minimizes $E(T)$.

Total search cost is important, as this is the dominant operation in a dictionary (note that insert can be modeled as an unsuccessful search followed by the insertion of a node at the point where the search terminated, and deletion can be modeled by
a successful search followed by a physical deletion; both operations are then followed by a rebalancing/restructuring step).

Observe that in an actual implementation of the search operation in programming languages such as C, C++, and Pascal, the search for an $x$ at level $l(x)$ will involve up to two comparisons at levels $1, \ldots, l(x)$. If the code first checks $x = e$, where $e$ is the element at level $i$ to be compared, and then $x < e$ to decide whether to move to the left or right subtree, then the number of element comparisons is exactly $2l(x) - 1$. In this case, the total number of element comparisons is

$$NC(T) = \sum_{x \in T} (2\,l(x) - 1) = 2C(T) - n$$

and minimizing $C(T)$ also minimizes $NC(T)$. If the code first checks $x < e$ and then $x > e$ (or $x = e$), the number of element comparisons done to find $x$ is $l(x) + r(x) + 1$, where $r(x)$ is the number of right branches on the path from the root to $x$. The total number of comparisons is bounded by $2C(T)$. For simplicity, we use $C(T)$ to motivate our data structure.

In an AVL tree, when an LL rotation is performed, $h(q) = h(c) + 1 = h(d) + 1$ (see part (a) of the rotation figure). At this time, the balance factor at $gp$ is $h(p) - h(d) = 2$. The rotation restores height balance, which is necessary to guarantee $O(\log n)$ search, insert, and delete operations in an $n$ node AVL tree. The rotation may, however, increase the total search cost. To see this, notice that an LL rotation affects the level numbers of only those nodes that are in the subtree with root $gp$ prior to the rotation. We see that $l(q') = l(q) - 1$, $l(p') = l(p) - 1$, $l(gp') = l(gp) + 1$; the total search cost of the subtree
with root $a$ is decreased by $s(a)$ as a result of the rotation, etc. Hence, the increase in $C(T)$ due to the rotation is

$$l(gp') - l(gp) + l(p') - l(p) + l(q') - l(q) - s(a) - s(b) + s(d) = s(d) - s(a) - s(b) - 1 = s(d) - s(q).$$

A similar analysis shows that an LR rotation increases $C(T)$ by $s(d) - s(q)$.

If the LL rotation was triggered by an insertion, $s(q)$ is at least one more than the minimum number of nodes in an AVL tree of height $t = h(q) - 1$. So, $s(q) \ge \phi^{t+2}/\sqrt{5}$, where $\phi = (1 + \sqrt{5})/2$. The maximum value for $s(d)$ is $2^t - 1$. So, an LL rotation has the potential of increasing total search cost by as much as

$$2^t - 1 - \phi^{t+2}/\sqrt{5}.$$

This is negative for small $t$ and positive for larger $t$. As $t$ gets larger, the potential increase in search cost gets much greater. This analysis is easily extended to the remaining rotations and also to red-black trees.

Definition (WB($\alpha$) []). The balance, $B(p)$, of a node $p$ in a binary tree is the ratio $(s(l) + 1)/(s(p) + 1)$, where $l$ is the left child of $p$. For $\alpha \in [0, 1/2]$, a binary tree $T$ is in WB($\alpha$) iff $\alpha \le B(p) \le 1 - \alpha$ for every node $p$ in $T$. By definition, the empty tree is in WB($\alpha$) for all $\alpha$.
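The claim that an LL rotation changes the total search cost by $s(d) - s(q)$ can be checked numerically. The sketch below is illustrative only: the `Node` class, the helper names, and the particular six-node example are assumptions. It builds a subtree in the LL configuration ($gp$ with left child $p$ and right subtree $d$, $p$ with left child $q$), rotates it, and compares the measured change in $C(T)$ with $s(d) - s(q)$. It also evaluates the WB($\alpha$) balance $B(p)$ as defined above.

```python
from fractions import Fraction
from typing import Optional, Tuple


class Node:
    def __init__(self, left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.left = left
        self.right = right


def size(t: Optional[Node]) -> int:
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def cost(t: Optional[Node], level: int = 1) -> int:
    """Total search cost C(T), root at level 1."""
    if t is None:
        return 0
    return level + cost(t.left, level + 1) + cost(t.right, level + 1)


def balance(p: Node) -> Fraction:
    """WB(alpha) balance B(p) = (s(left child) + 1) / (s(p) + 1)."""
    return Fraction(size(p.left) + 1, size(p) + 1)


def rotate_ll(gp: Node) -> Node:
    """LL rotation: p becomes the subtree root, gp becomes its right child."""
    p = gp.left
    gp.left = p.right  # subtree c moves under gp
    p.right = gp
    return p


def build_example() -> Node:
    """gp with left child p (whose only child is the leaf q) and a
    three-node right subtree d; an arbitrary illustrative shape."""
    q = Node()
    p = Node(left=q)
    d = Node(Node(), Node())
    return Node(left=p, right=d)


def ll_cost_increase(gp: Node) -> Tuple[int, int]:
    """Return (measured increase in C, predicted increase s(d) - s(q))."""
    predicted = size(gp.right) - size(gp.left.left)
    before = cost(gp)
    after = cost(rotate_ll(gp))
    return after - before, predicted
```

In this example $s(d) = 3$ and $s(q) = 1$, so the rotation raises the total search cost by 2, illustrating that a rotation can make search more expensive when $s(d) > s(q)$.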
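Lemma.
(1) The maximum height, $h_{max}(n)$, of an $n$ node tree in WB($\alpha$) is approximately $\log_{1/(1-\alpha)}(n + 1)$ [].
(2) Inserts and deletes can be performed in an $n$ node tree in WB($\alpha$) in $O(\log n)$ time for $2/11 < \alpha \le 1 - 1/\sqrt{2}$ [].
(3) Each search operation in an $n$ node tree in WB($\alpha$) takes $O(\log n)$ time [].

In the case of weight balanced trees WB($\alpha$), an LL rotation is performed when $B(gp) > 1 - \alpha$ and $B(p) \ge \alpha/(1 - \alpha)$ (see part (a) of the rotation figure) []. So,

$$1 - \alpha < B(gp) = \frac{s(p) + 1}{s(p) + s(d) + 2} \quad\text{or}\quad s(d) + 1 < \frac{\alpha}{1 - \alpha}\,(s(p) + 1)$$

and

$$\frac{\alpha}{1 - \alpha} \le B(p) = \frac{s(q) + 1}{s(p) + 1} \quad\text{or}\quad s(q) + 1 \ge \frac{\alpha}{1 - \alpha}\,(s(p) + 1).$$

So, $s(q) > s(d)$, and LL rotations (and also RR) do not increase the search cost. For LR rotations [], $B(gp) > 1 - \alpha$ and $B(p) < \alpha/(1 - \alpha)$. So, $s(d) + 1 < \frac{\alpha}{1-\alpha}(s(p) + 1)$ and, with respect to part (b) of the rotation figure,

$$\frac{\alpha}{1 - \alpha} > B(p) = \frac{s(p) - s(q)}{s(p) + 1}$$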
or

$$s(q) + 1 > \frac{1 - 2\alpha}{1 - \alpha}\,(s(p) + 1).$$

For $\alpha \le 1/3$, $s(q) > s(d)$, and LR (RL) rotations do not increase search cost. Thus, in the case of WB($\alpha$) trees, the rebalancing rotations do not increase search cost. This statement remains true if the conditions for LL and LR rotation are changed to those in Blum and Mehlhorn [].

While rotations do not increase the search cost of WB($\alpha$) trees, these trees miss performing some rotations that would reduce search cost. For example, it is possible to have $\alpha \le B(gp) \le 1 - \alpha$, $B(p) \ge \alpha/(1 - \alpha)$, and $s(q) > s(d)$. Since $B(gp)$ isn't high enough, an LL rotation isn't performed.
... COST. $U$ is a minimal COST of height $h - 2$ and so has $N_{h-2}$ nodes. Since $Q$ is a COST, $|S| \ge \max\{|U|, |V|\}$. We may assume that $N_h$ is a nondecreasing function of $h$. So, $|S| \ge N_{h-2}$. Since $Q$ is a minimal COST of height $h$, $|S| = N_{h-2}$. So,

$$N_h = N_{h-1} + N_{h-2} + 1,\ h \ge 2, \qquad N_0 = 0,\ N_1 = 1.$$

This recurrence is the same as that for the minimum number of nodes in an AVL tree of height $h$. So, $N_h = F_{h+2} - 1$, where $F_i$ is the $i$th Fibonacci number. Consequently, $N_h \approx \phi^{h+2}/\sqrt{5} - 1$ and $h \approx \log_\phi(\sqrt{5}(n + 1)) - 2$. □

Corollary. The maximum height of a COST with $n$ nodes is the same as that of an AVL tree with this many nodes.

Definition. Let $a$ and $b$ be the roots of two binary trees. $a$ and $b$ are $\beta$-balanced with respect to one another, denoted $\beta(a,b)$, iff
(a) $\beta(s(a) - 1) \le s(b)$
(b) $\beta(s(b) - 1) \le s(a)$.
A binary tree $T$ is $\beta$-balanced iff the children of every node in $T$ are $\beta$-balanced.

A full binary tree is 1-balanced, and a binary tree whose height equals its size (i.e., number of nodes) is 0-balanced.
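The $\beta$-balance test can be written down in a few lines. The sketch below is an assumption-laden illustration rather than the dissertation's code: the relational operators in the definition did not survive in this copy of the text, so the test is taken to be $\beta(s(a) - 1) \le s(b)$ and $\beta(s(b) - 1) \le s(a)$, the reading under which the two closing examples above (a full tree is 1-balanced, a path is 0-balanced) both hold. The `Node` class and function names are hypothetical.

```python
from fractions import Fraction
from typing import Optional, Tuple, Union

Number = Union[int, Fraction]


class Node:
    def __init__(self, left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.left = left
        self.right = right


def pair_balanced(sa: int, sb: int, beta: Number) -> bool:
    """Assumed form of beta(a, b), stated on subtree sizes."""
    return beta * (sa - 1) <= sb and beta * (sb - 1) <= sa


def _check(t: Optional[Node], beta: Number) -> Tuple[int, bool]:
    """Return (size of t, whether every node's children are beta-balanced)."""
    if t is None:
        return 0, True
    sl, ok_left = _check(t.left, beta)
    sr, ok_right = _check(t.right, beta)
    return 1 + sl + sr, ok_left and ok_right and pair_balanced(sl, sr, beta)


def is_beta_balanced(t: Optional[Node], beta: Number) -> bool:
    return _check(t, beta)[1]


def full_tree(height: int) -> Optional[Node]:
    """Full binary tree with the given number of levels."""
    return None if height == 0 else Node(full_tree(height - 1), full_tree(height - 1))


def path(k: int) -> Optional[Node]:
    """Left-leaning path with k nodes (height equals size)."""
    return None if k == 0 else Node(left=path(k - 1))
```

Using `Fraction` for non-integer values of `beta` keeps the comparisons exact, which matters when a tree sits right on the balance boundary.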
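Lemma. If the binary tree $T$ is $\beta$-balanced, then it is $\gamma$-balanced for $0 \le \gamma \le \beta$.

Proof. Follows from the definition of balance. □

Lemma. If the binary tree $T$ is $\beta$-balanced, $0 \le \beta \le 1/2$, then it is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$.

Proof. Consider any node $p$ in $T$. Let $l$ and $r$ be node $p$'s left and right children.

$$B(p) = \frac{s(l) + 1}{s(p) + 1} = \frac{s(l) + 1}{s(l) + s(r) + 2}.$$

Since $T$ is $\beta$-balanced, $\beta(s(l) - 1) \le s(r)$, or $\beta(s(l) + 1) \le s(r) + 2\beta \le s(r) + 1$. So, $s(l) + s(r) + 2 \ge (1 + \beta)(s(l) + 1)$, and so $B(p) \le 1/(1 + \beta)$. Further, $\beta(s(r) - 1) \le s(l)$. So, $s(l) + s(r) + 2 \ge (1 + \beta)(s(r) + 1)$, and $B(p) \ge \beta/(1 + \beta)$. Hence, $\beta/(1 + \beta) \le B(p) \le 1/(1 + \beta)$ for every $p$ in $T$. So, $T$ is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$. □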
[Figure. A tree in WB(1/3) that is not $\frac{1}{2}$-balanced]

Remark. While every $\beta$-balanced tree, $0 \le \beta \le 1/2$, is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$, there are trees in WB($\alpha$) that are not $\beta$-balanced. The figure shows an example of a tree in WB(1/3) that is not $\frac{1}{2}$-balanced.

Lemma. If $T$ is a COST, then $T$ is $\frac{1}{2}$-balanced.

Proof. If $T$ is a COST, then every subtree of $T$ is a COST. Consider any subtree with root $p$, left child $l$, and right child $r$. If neither $l$ nor $r$ exist, then $s(l) = s(r) = 0$ and $p$ is $\frac{1}{2}$-balanced. If $s(l) = 0$ and $s(r) > 1$, then $r$ has a nonempty subtree with root $t$ and $s(t) > s(l)$. So, $p$ is not a COST. Hence, $s(r) \le 1$ and $p$ is $\frac{1}{2}$-balanced. The same is true when $s(r) = 0$. So, assume $s(l) > 0$ and $s(r) > 0$. If $s(l) = 1$, then $s(r) \le 3$, as otherwise one of the subtrees of $r$ has $m \ge 2$ nodes, and $m > s(l)$ implies $p$ is not a COST. Since $s(r) \le 3$, $\frac{1}{2}(s(r) - 1) \le s(l)$ and $\frac{1}{2}(s(l) - 1) \le s(r)$. So, $p$ is $\frac{1}{2}$-balanced. The same proof applies when $s(r) = 1$. When $s(l) > 1$ and $s(r) > 1$, let $a$ and $b$ be the roots of the left and right subtrees of $l$. Since $p$ is a COST, $s(a) \le s(r)$ and $s(b) \le s(r)$. So, $s(l) = s(a) + s(b) + 1 \le 2s(r) + 1$ and $\frac{1}{2}(s(l) - 1) \le s(r)$. Similarly, $\frac{1}{2}(s(r) - 1) \le s(l)$. So, $\frac{1}{2}(l, r)$. Since this proof applies to every node in $T$, the children of every $p$ are $\frac{1}{2}$-balanced and $T$ is $\frac{1}{2}$-balanced. □
[Figure. A $\frac{1}{2}$-balanced tree that is not a COST]

Remark. There are $\frac{1}{2}$-balanced trees that are not COSTs (see the figure).

While a COST is in WB(1/3) and WB($\alpha$) trees can be maintained efficiently only for $2/11 < \alpha \le 1 - 1/\sqrt{2}$, a COST is better balanced than WB($\alpha$) trees with $\alpha$ in the usable range. Unfortunately, we are unable to develop $O(\log n)$ insert/delete algorithms for a COST.

In the next section, we develop insert and delete algorithms for $\beta$-balanced binary search trees ($\beta$-BBSTs) for $0 < \beta \le \sqrt{2} - 1$. Note that every $(\sqrt{2} - 1)$-BBST is in WB($\alpha$) for $\alpha = 1 - 1/\sqrt{2}$, which is the largest permissible $\alpha$. Since our insert and delete algorithms perform rotations along the search path whenever these result in improved search cost, $\beta$-BBSTs are expected to have better search performance than WB($\alpha$) trees (for $\alpha = \beta/(1 + \beta)$).

Each node of a $\beta$-BBST has the fields LeftChild, Size, Data, and RightChild. Since every $\beta$-BBST, $0 < \beta \le \sqrt{2} - 1$, is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$, $\beta$-BBSTs have height that is logarithmic in $n$, the number of nodes (provided $\beta > 0$).
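A node with the four fields named above might be modelled as follows. This Python sketch is an assumption for illustration: it shows only how the Size field is maintained by a plain binary search tree insertion and how it supports a search by rank. It performs no rebalancing rotations, so it is not the $\beta$-BBST insertion algorithm itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BBSTNode:
    data: int
    left: "Optional[BBSTNode]" = None   # LeftChild
    right: "Optional[BBSTNode]" = None  # RightChild
    size: int = 1                        # number of nodes in this subtree


def size(t: Optional[BBSTNode]) -> int:
    return t.size if t is not None else 0


def insert(t: Optional[BBSTNode], x: int) -> BBSTNode:
    """Unbalanced BST insert that keeps the size fields correct.
    Duplicate keys are ignored."""
    if t is None:
        return BBSTNode(x)
    if x < t.data:
        t.left = insert(t.left, x)
    elif x > t.data:
        t.right = insert(t.right, x)
    t.size = 1 + size(t.left) + size(t.right)
    return t


def kth_smallest(t: Optional[BBSTNode], k: int) -> int:
    """Return the k-th smallest key (k is 1-based) using the size fields."""
    while t is not None:
        left = size(t.left)
        if k == left + 1:
            return t.data
        if k <= left:
            t = t.left
        else:
            k -= left + 1
            t = t.right
    raise IndexError("rank out of range")
```

Because each step of `kth_smallest` descends one level, its cost is proportional to the height of the tree, which is logarithmic when the tree is kept $\beta$-balanced.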
Search, Insert, and Delete in a $\beta$-BBST

To reduce notational clutter, in the rest of the chapter we abbreviate $s(a)$ by $a$ (i.e., the node name denotes subtree size).

Search

This is done exactly as in any binary search tree. Its complexity is $O(h)$, where $h$ is the height of the tree. Notice that since each node has a size field, it is easy to perform a search based on index (i.e., find the key of a given rank). Similarly, our insert and delete algorithms can be adapted to indexed insert and delete.

Insertion

To insert a new element $x$ into a $\beta$-BBST, we first search for $x$ in the $\beta$-BBST. This search is unsuccessful (as $x$ is not in the tree) and terminates by falling off the tree. A new node $y$ containing $x$ is inserted at the point where the search falls off the tree. Let $p'$ be the parent (if any) of the newly inserted node. We now retrace the path from $p'$ to the root, performing rebalancing rotations.

There are four kinds of rotations: LL, LR, RL, and RR. LL and RR rotations are symmetric, and so also are LR and RL rotations. The typical configuration before an LL rotation is performed is given in part (a) of the LL rotation figure. $p'$ denotes the root of a subtree in which the insertion was made. Let $p$ be (the size of the) subtree before the insertion. Then, since the tree was a $\beta$-BBST prior to the insertion, $\beta(p,d)$. Also, for the LL rotation to be performed, we require that $(q > c)$ and $(q > d)$. Note that $q > d$ implies $q \ge 1$. We shall see that $\beta(q,c)$ follows from the fact that the insertion is
Df EHIRUH Ef DIWHU )LJXUH // URWDWLRQ IRU LQVHUWLRQ PDGH LQWR D %%67 DQG IURP SURSHUWLHV RI WKH URWDWLRQ )ROORZLQJ DQ // URWDWLRQ Sn LV XSGDWHG WR EH WKH QRGH S /HPPD >// LQVHUWLRQ OHPPD@ ,I >IS Gf $ IT Ff $ T Ff $ T GfIRU c EHIRUH WKH URWDWLRQ WKHQ TJSnf DQG "F Gf DIWHU WKH URWDWLRQ 3URRI $VVXPH WKH EHIRUH FRQGLWLRQ Df IT fÂ§ f F DV ITFff JSn $OVR JSn fÂ§ f cF?Gf IT DV T!F DQG T Gf T DV c f 6R TJSnf Ef G T G fÂ§ T fÂ§ IG fÂ§ f T fÂ§ f F DV ITFff $OVR Fa f 3T F f ^Sn f SS?f G DV ^SGff 6R FGf Â’ ,Q DQ /5 URWDWLRQ WKH EHIRUH FRQILJXUDWLRQ LV DV LQ )LJXUH Df +RZHYHU WKLV WLPH T F )LJXUH Df LV UHGUDZQ LQ )LJXUH Df ,Q WKLV WKH QRGH ODEHOHG F LQ )LJXUH Df KDV EHHQ ODEHOHG T DQG WKDW ODEHOHG T LQ )LJXUH Df KDV EHHQ ODEHOHG
Df EHIRUH Ef DIWHU VXEVWHS Lf )LJXUH 6XEVWHS Lf RI LQVHUWLRQ /5 URWDWLRQ D :LWK UHVSHFW WR WKH ODEHOLQJV RI )LJXUH Df URWDWLRQ /5 LV DSSOLHG ZKHQ >" Df$T! Gf@ 7KH RWKHU FRQGLWLRQV WKDW DSSO\ ZKHQ DQ /5 URWDWLRQ LV SHUIRUPHG DUH >"SGf $ 0D! f $ Ff@ +HUH S GHQRWHV WKH VL]H RI WKHf OHIW VXEWUHH RI JS SULRU WR WKH LQVHUWLRQ $Q /5 URWDWLRQ LV DFFRPSOLVKHG LQ WZR VXEVWHSV RU WZR VXEURWDWLRQVf 7KH ILUVW RI WKHVH LV VKRZQ LQ )LJXUH Ef )ROORZLQJ DQ /5 URWDWLRQ Sn LV XSGDWHG WR EH QRGH Tn /HPPD >/5 VXEVWHSLf LQVHUWLRQ OHPPD@ ,I >IOSGf $ ILDTf $ Ff $ T Df$T!Gf@IRU2 EHIRUH WKH VXEURWDWLRQ WKHQ >5^SJSnf$^DEf$\A F Gff 9 \IAD f $ "F Gff`DIWHU WKH VXEURWDWLRQ
3URRI $VVXPH WKH EHIRUH FRQGLWLRQ )LUVW ZH VKRZ WKDW ^SJSnf DIWHU WKH URWDWLRQ 1RWH WKDW S fÂ§ f D Ef D E F f fÂ§ F f Sn fÂ§ f fÂ§ F f S f F GF G JSn $OVR JSn f F Gf E G DV EFff E TDVT!GfE^D DV DTff S DV DQG S D f 6R SS&JSnf 1H[W ZH SURYH WZR SURSHUWLHV WKDW ZLOO EH XVHG WR FRPSOHWH WKH SURRI 3, "t f D 7R VHH WKLV QRWH WKDW E fÂ§ f T fÂ§ f D DV D Tff 3 F f G )RU WKLV REVHUYH WKDW Sn fÂ§ ? fÂ§ D T! T fÂ§ f T DV D Tff f fÂ§ f 6R T 6LPLODUO\ T E F ^F f F DV ^EFff ! f f 6R IOH f MIMIL f MM" f DV SGff G 7R FRPSOHWH WKH SURRI RI WKH OHPPD ZH QHHG WR VKRZ ^"} f $ 9 $}0ff` :H GR WKLV E\ FRQVLGHULQJ WKH WZR FDVHV E F DQG E F &DVH E F 6LQFH DT E F D fÂ§ f E Ff E 7KLV DQG 3, LPSO\ D Ef $OVR GT E F 6R AI\G fÂ§ f AI\ F fÂ§ f L"7F a f fÂ§ @LF DV EFff F 7KLV WRJHWKHU ZLWK 3 LPSOLHV ,ASFGf 6R DEf $ ASFGf &DVH E F 6LQFH D T F D fÂ§ F 6R D fÂ§ E F fÂ§ ORU
L6 Â00f 7KLV DQG 3, LPSO\ f $OVR G T E F 6R #G f E F f F f F 7KLV WRJHWKHU ZLWK 3 LPSOLHV F Gf 6R \AMDf $ FGf Â’ 6LQFH DQ /5Lf URWDWLRQ FDQ FDXVH WKH WUHH WR ORVH LWV EDODQFH SURSHUW\ LW LV QHFHVVDU\ WR IROORZ WKLV ZLWK DQRWKHU URWDWLRQ WKDW UHVWRUHV WKH EDODQFH SURSHUW\ ,W VXIILFHV WR FRQVLGHU WKH WZR FDVHV RI )LJXUHV DQG IRU WKLV IROORZ XS URWDWLRQ 7KH UHPDLQLQJ FDVHV DUH V\PPHWULF WR WKHVH ,Q )LJXUHV DQG S DQG G GHQRWH WKH QRGHV WKDW GR QRW VDWLVI\ S Gf 1RWH KRZHYHU WKDW WKHVH QRGHV GR VDWLVI\ 6LQFH WKH IROORZ XS URWDWLRQ WR /5Lf LV GRQH RQO\ ZKHQ HLWKHU SfÂ§ f G RU G fÂ§ f S :KHQ SfÂ§ f G WKH VHFRQG VXEVWHS URWDWLRQ LV RQH RI WKH WZR JLYHQ LQ )LJXUHV DQG :KHQ G fÂ§ f S URWDWLRQV V\PPHWULF WR WKHVH DUH SHUIRUPHG ,Q WKH IROORZLQJ ZH DVVXPH S fÂ§ f G )XUWKHU ZH PD\ DVVXPH G DV G DQG \AS Gf LPSO\ S +HQFH IOSGf $OVR G DQG S fÂ§ f G LPSO\ S 7KH /5LLf // URWDWLRQ LV GRQH ZKHQ WKH FRQGLWLRQ $ T Gf $ F fT fÂ§ ff $ % ZKHUH
Df EHIRUH Ef DIWHU )LJXUH &DVH // IRU /5LLf URWDWLRQ % $ $ TFf $ SS f G f /HPPD >&DVH /5LLf // URWDWLRQ@ ,I $ KROGV EHIRUH WKH URWDWLRQ RI )LJXUH WKHQ 3^TJSnf DQG IWFGf DIWHU WKH URWDWLRQ SURYLGHG c 9 fÂ§ 3URRI Df STJSnf? 3Ta f F DV 3T Fff JSn $OVR SJSnOf SFGf 3O3fTOa3f Gf 3^ 3fT 3 f f DV T Gf 3 DV Sf IRU ? ,f 6R S^TJSnf Ef AI Lf f F DV "" Fff $QG SF f AF f AF f L_AF f f L_A F f MAS f AS f G DV ASGff 6R 3FGf Â’ /HPPD ,I F 3fT fÂ§ "ff $ "S fÂ§ f Gf LQ )LJXUH WKHQ G T SURYLGHG 3 ? fÂ§
3URRI 6LQFHG SfÂ§ f IT?Ff T O^I`fTOfÂ§f fT OfÂ§f T DV "" f DQG c "f IRU \ Of 6R G T Â’ 6R WKH RQO\ WLPH DQ /5LLf // URWDWLRQ LV QRW GRQH LV ZKHQ & &? 9 &"f $ % KROGV ZKHUH F Gf$F O "fJ f & F!O fT Of $W WKLV WLPH WKH /5 URWDWLRQ RI )LJXUH LV GRQH ,Q WHUPV RI WKH QRWDWLRQ RI )LJXUH WKH FRQGLWLRQ & EHFRPHV fÂ§ '? 9 '"f $ ( ZKHUH 'L D fÂ§ Gf $ T cfD f fÂ§ T "fD fÂ§ ( Gf $ f3}Gf $ 0D! f $ +} Ff $ 3 f G f /HPPD :KHQ DQ /5LLf /5 URWDWLRQ LV SHUIRUPHG DQG IO \ fÂ§ T G DQG VR VHDUFK FRVW LV UHGXFHG 3URRI ,I 'L WKHQ VLQFH G IS fÂ§ f "D Tf G f Gc fÂ§ G G DV 98'f WKHQ L IWSf D f f +JST A DV" ?f 2
D E G /5LLf /5 D E F G F Df EHIRUH Ef DIWHU )LJXUH &DVH /5 IRU /5LLf URWDWLRQ /HPPD :KHQ G Df $ Ff $ S fÂ§ f Gf $ c ? fÂ§ f VHH )LJXUH f 3D fÂ§ f E DQG G fÂ§ f F 3URRI 6LQFH S fÂ§ f G DQG G D S fÂ§ f D RU D Tf D RU DO fÂ§ f IOT RU D AT 6R D f T IL AE F f ,I F WKHQ }Â‘f < M 3 ""Of 4 If ""f f DV f IRU ? DQG e IRU f
6LQFH F fÂ§ f F I 6R fE n 6R +RZHYHU VLQFH fÂ§ IRU Y fÂ§ fO fÂ§ f A DQG OfO f 6R D E ,I D F WKHQ F D I :H KDYH DOUHDG\ VKRZQ WKDW IRU F E D fÂ§ f E 6R DVVXPH D F 1RZ D F DQG D f F f E DV Fff 6R D f E LQ DOO FDVHV D fÂ§ f F PD\ EH VKRZQ LQ D VLPLODU ZD\ 6LQFH D G ZH JHW G fÂ§ f F Â’ /HPPD >&DVH /5LLf /5 URWDWLRQ@ ,I KROGV EHIRUH WKH URWDWLRQ RI )LJXUH WKHQ SnJSnf DEf DQG FGf IROORZLQJ WKH URWDWLRQ SURYLGHG \ fÂ§ 3URRI Df SfJSnf? JSn fÂ§ f ^F Gf E G DV EFff E T IURP /HPPDV DQG T Gf E D D E D E^ ? Sn $OVR VLQFH MA^S Gf DQG T GS fÂ§ f fG RU D I Tf fG RU D T MMfG RU D MMfGT sfGG G 6R ^Sn f D Ef G E G F DV E Fff G? F JSn Ef D Ef 6LQFH E T DQG DTfEfÂ§ f T fÂ§ f D
:KHQ 'L D fÂ§ f E ZDV SURYHG LQ /HPPD 6R DEf :KHQ T DO f 6R A T O B E F r f 7 a 8 f f n 6R "DOf E F E E fÂ§ E 6R ^DEf Ff ^FGf 1RWH WKDW F f J f MA^T f MAS f G :KHQ '? G fÂ§ f F ZDV SURYHG LQ /HPPD 6R FGf :KHQ LI G E WKHQ G E DQG G f E f F 6R DVVXPH G E 1RZ E G fÂ§ ^S fÂ§ f fÂ§ 6R E D Â FOf fÂ§ r f MAFO"fFOffO MA_ F A :_ Fff F Om O mLLAFffO F fF fÂ§ fF fÂ§ fF DV ? fÂ§ f
[…] as β ≤ √2 − 1. Also, from […] and the above derivation, we get […]. So […] as […]. So B(c, d). □

Theorem […]: If T is β-balanced (β ≤ √2 − 1) prior to insertion, it is so following the insertion.

Proof: First note that since all binary search trees are 0-balanced, for β = 0 the rotations (while unnecessary) preserve balance. So assume β > 0. Consider the tree T' just after the new element has been inserted but before the backward restructuring pass begins. If the newly inserted node z has no parent in T', then T was empty and T' is balanced. If z has a parent but no grandparent, then T has at most one nonempty
subtree X. Since T is β-balanced, […]. So |X| […]. Following the insertion, T' has one subtree with […] nodes and one with exactly one. So T' is β-balanced. We may therefore assume that z has a grandparent in T'.

From the downward insertion path, it follows that all nodes u in T' that have children l and r for which B(l, r) does not hold must lie on the path from the root to z. During the backward restructuring pass, each node on this path (other than z and its parent) plays the role of gp in Figures […] and […]. The balance property cannot be violated at z, as z has no children. It cannot be violated at the parent s of z, as s satisfied the property prior to insertion. As a result, its other subtree has […] element. So, following the insertion, s satisfies the property. As a result, each node in T' that might possibly violate the property becomes the gp node during the restructuring pass.

Consider one such gp node. It has children in T' denoted by p' and d. Its children in T are p and d. Figures […] and […] show the case when d is the right subtree of gp in both T and T'. The cases RR and RL arise when d is the left subtree. During the restructuring pass, gp begins at the grandparent of z and moves up to the root of T'. If z is at level r in T' (the root being at level […]), then gp takes on r − […] values during the restructuring pass. We shall show that at each of these r − […] positions, either

(a) no rotation is performed and all descendants of gp satisfy the balance property, or
(b) a rotation is performed and, following this, all descendants of node p (Figure […]) or of node q' (Figure […]) satisfy the balance property.
As a result, following the rotation (if any) performed when gp becomes the root of T', the restructured tree is β-balanced.

The proof is by induction on r. When r […] (recall we assume z has a grandparent), gp begins at the root of T and its descendants satisfy the balance property. Without loss of generality, assume that the insertion took place in the left subtree of gp. With respect to Figure […], we have three cases: (i) q […] c and q […] d, (ii) q […] c and c […] d, and (iii) q […] d and c […] d. In case (i), all conditions for an LL rotation hold and such a rotation is performed. In case (ii), an LR rotation is performed. Following either rotation, T' is balanced. In case (iii), […]. Also, […]. So […]. Hence B(p', d) and T' is balanced.

For the induction hypothesis, assume (a) and (b) whenever r […] k. In the induction step, we show (a) and (b) for trees T with r […] k. The subtree in which the insertion is done has r […] k. So (a) and (b) hold for all gp locations in the subtree. We need to show (a) and (b) only when gp is at the root of T'. This follows from Lemmas […] and […]. The theorem now follows. □

Lemma […]: The time needed to do an insertion in an n node β-BBST is O(log n) provided β ≤ √2 − 1.
Proof: Follows from the fact that insertion takes O(h) time, where h is the tree height, and h = O(log n) when […] (Lemmas […] and […]). □

Deletion

To delete element x from a β-BBST, we first use the unbalanced binary search tree deletion algorithm of Horowitz and Sahni […] to delete x and then perform a series of rebalancing rotations. The steps are:

Step 1 [Locate x] Search the β-BBST for the node y that contains x. If there is no such node, terminate.

Step 2 [Delete x] If y is a leaf, set d' to nil, gp to the parent of y, and delete node y. If y has exactly one child, set d' to be this child; change the pointer from the parent (if any) of y to point to the child of y; delete node y; set gp to be the parent of d'. If y has two children, find the node z in the left subtree of y that has largest value; move this value into node y; set y = z; go to the start of Step 2. { note that the new y has either 0 or 1 child }

Step 3 [Rebalance] Retrace the path from d' to the root, performing rebalancing rotations.

There are four rebalancing rotations: LL, LR, RR, and RL. Since LL and RR, as well as LR and RL, are symmetric rotations, we describe LL and LR only. The discussion is very similar to the case of insertion. The differences in proofs are due to the fact that a deletion reduces the size of encountered subtrees by 1 while an
insertion increases it by 1.

(a) before (b) after
Figure […]: LL rotation for deletion

In an LL rotation, the configuration just before and after the rotation is shown in Figure […]. This rotation is performed when q […] c and q […] d'. Following the rotation, d' is updated to the node p'. Let d denote the size of the right subtree of gp before the deletion. So d = d' + 1. Since, prior to the deletion, the β-BBST was β-balanced, it follows that B(p, d) and B(q, c).

Lemma […] [LL deletion lemma]: If [B(p, d) ∧ B(q, c) ∧ (q […] c) ∧ (q […] d')] before the rotation, then [B(q, gp') ∧ B(c, d')] after the rotation.

Proof: (a) B(q, gp'): […] (as B(q, c)) […] gp'. Also, […] (as c […] q and d' […] q) […]. So B(q, gp').
(b) B(c, d'): […] Hence
In an LR rotation, the before configuration is as in Figure […](a). However, this time q […] c. Figure […](a) is redrawn in Figure […](a). In this, the node labeled c in Figure […](a) has been relabeled q and that labeled q in Figure […](a) has been relabeled a. With respect to the labelings of Figure […](a), rotation LR is applied when [(q […] a) ∧ (q […] d')]. The other conditions that apply when an LR rotation is performed are [B(p, d) ∧ B(a, q) ∧ B(b, c)]. Here, d denotes (the size of the) right subtree of gp prior to the deletion. As in the case of insertion, an LR rotation is accomplished in two substeps (or two subrotations). The first of these is shown in Figure […]. Following an LR rotation, d is updated to node q'.

Lemma […] [LR substep (i) deletion lemma]: If [B(p, d) ∧ B(a, q) ∧ B(b, c) ∧ (q […] a) ∧ (q […] d')] before the subrotation LR(i), then [[…](p', gp') ∧ {([…](a, b) ∧ […](c, d')) ∨ ([…](a, b) ∧ […](c, d'))}] after the subrotation, provided […].

Proof: Assume the before condition.
(a) If b […] c […], then q = b + c + 1 […]. Furthermore, (q […] a) and (q […] d') imply a […] d' […]. So gp' […] p' […]. Hence [[…](p', gp') ∧ […](a, b) ∧ […](c, d')].
Df EHIRUH Ef DIWHU VXEVWHS Lf )LJXUH /5 URWDWLRQ IRU GHOHWLRQ Ef ,I E DQG F WKHQ T D DQG G 6R Sn DQG JMI +HQFH >\SnJSnf $ ?^DEf $ FGnf@ Ff ,I E DQG F WKHQ T D DQG Gn 6R Sn DQG +HQFH $ ?DEf $ F Ânf@ $V D UHVXOW RI Df Ff WR FRPSOHWH WKH SURRI ZH PD\ DVVXPH WKDW E DQG F 6R T D DV IDTf Â‘ IOT fÂ§ f D RU D ! f S D T O G DV "S Gf S fÂ§ f G DQG f DQG Gn G fÂ§ )LUVW ZH VKRZ WKDW SnJSnf )RU WKLV QRWH WKDW D F O S fÂ§ )URP AS Gf LW IROORZV WKDW D F f S fÂ§ f G 6R ILD Ef G fÂ§ fF fÂ§ )URP )LJXUH Ef ZH VHH WKDW Sn fÂ§ f D Ef +HQFH Sn fÂ§ f G fÂ§ F fÂ§ GO fÂ§ F G fÂ§ JSn $OVR JSOf F Gnf E U DV E Fff E T DV T Gnf
D Â DV DTff 3n 6R 3^SnJSnf 1H[W ZH SURYH WZR SURSHUWLHV WKDW ZLOO EH XVHG WR FRPSOHWH WKH SURRI 3, Âf D 7R VHH WKLV QRWH WKDW Â fÂ§ f IWT fÂ§ f D DV "D ff 3 3F f Gf )RU WKLV REVHUYH WKDW ÂF fÂ§ f T fÂ§ f DV F T fÂ§ f ÂS fÂ§ f DV T S fÂ§ D fÂ§ DQG D f ÂS f =IL G DV ÂS Gf DQG Â f Gn 7R FRPSOHWH WKH SURRI RI WKH OHPPD ZH QHHG WR VKRZ ++f f $ MaF f 9 f $ +H )RU WKLV FRQVLGHU WKH WZR FDVHV F DQG E F DV LQ /HPPD f &DVH E F 6LQFH DT E?F ÂD fÂ§ f Ff Â 7KLV WRJHWKHU ZLWK 3, LPSOLHV ÂD f $OVR In 6R A\In fÂ§ f A\ F fÂ§ f 6,F Lr fÂ§ rf VIUF MIO mf 7KLV WRJHWKHU ZLWK 3 LPSOLHV 6R +L2$U0nf &DVH F 6LQFH D F D fÂ§ F 6R D fÂ§ FfÂ§ RU %X A YE A Ut WK E 7KLV DQG 3 LP3\ LÂUD!f $OVR! Gn fÂ§ fÂ§ FfÂ§ 6R "In fÂ§ f 3E F fÂ§ f ILF fÂ§ f F 7KLV DQG
3 LPSO\ IFGnf +HQFH MAD Ef $ F Gnf Â’ 7KH VXEVWHSLLf URWDWLRQV DUH WKH VDPH DV IRU LQVHUWLRQ 7KHRUHP ,I7 LV AEDODQFHG WKHQ IROORZLQJ D GHOHWLRQ WKH UHVXOWLQJ WUHH 7n LV DOVR cEDODQFHG SURYLGHG c \ fÂ§ 3URRI 6LPLODU WR WKDW RI 7KHRUHP D :KHQ c ZH QHHG WR DXJPHQW WKH // URWDWLRQ E\ D WUDQVIRUPDWLRQ IRU WKH FDVH GO :KHQ G ^LS fÂ§ f G Gn 6R S DQG JS S G 7R EDODQFH DW JS WKH DW PRVW QRGHV LQ JS DUH UHDUUDQJHG LQWR DQ\ %%67 LQ FRQVWDQW WLPH DV LV D FRQVWDQWf :KHQ G WKH SURRI RI /HPPD SDUW Ef FDQ EH FKDQJHG WR VKRZ "F fÂ§ f GO IRU I \ fÂ§ 7KH QHZ SURRI LV VLQFH F TF S fÂ§ Of DQG "F fÂ§ f "S fÂ§ Of fÂ§ G fÂ§ G fÂ§ G fÂ§ GfÂ§?fÂ§cGn 7KH /5 URWDWLRQ QHHGV WR EH DXJPHQWHG E\ D WUDQVIRUPDWLRQ IRU WKH FDVH G fÂ§ G fÂ§ SASf a 7 WKLV WLPH IS f G MAS\ 6R JS S G SAS@ 7R EDODQFH DW JS ZH UHDUUDQJH WKH IHZHU WKDQ SWASf SSM QRAHV !Q VXEWUHH LQ FRQVWDQW WLPH LQWR DQ\ AEDODQFHG WUHH :KHQ Gn SMSM fÂ§ WKH SURRI IRU F fÂ§ f Gn LQ /HPPD QHHGV WR EH FKDQJHG WR VKRZ WKDW WKH /5 VXEVWHSLf OHPPD KROGV 7KH QHZ SURRI LV G ^S f D E F f T f E FI f
[…] So […](c − […]) […] d' as […].

Also note that when β = 0, all trees are β-balanced, so the rotations (while not needed) preserve balance.

Theorem […]: With the special handling of the case d' […], the tree T' resulting from a deletion in a β-BBST is also β-balanced for […] β ≤ √2 − 1.

Lemma […]: The time needed to delete an element from an n node β-BBST is O(log n) provided β ≤ √2 − 1.

Enhancements

Since our objective is to create search trees with minimum search cost, the rebalancing rotations may be performed at each positioning of gp during the backward restructuring pass so long as the conditions for the rotation apply, rather than only at gp positions where the tree is unbalanced. Consider Figure […](a). If p' […] d, then the conditions of Lemmas […] and […] cannot apply, as q […] p' […] d. However, it is possible that e […] p', where e is the size of either the left or right subtree of d. In this case, an RR or RL rotation would reduce the total search cost. The proofs of Lemmas […] and […] are easily extended to show that these rotations would preserve balance even though no insertion was done in the subtree
d. The same observation applies to deletion. Hence, the backward restructuring pass for the insert and delete operations can determine the need for a rotation at each gp location as below (l and r are, respectively, the left and right children of gp):

    if s(l) […] s(r)
    then check conditions for an LL and LR rotation
    else check conditions for an RR and RL rotation

The enhanced restructuring procedure used for insertion and deletion is given in Figure […]. In the RR and RL cases, we have used the relation "[…]" rather than ">", as this results in better observed run time. Since it can be shown that the rotations preserve balance even when there has been no insert or delete, we may check the rotation conditions during a search operation and perform rotations when these improve total search cost.

Finally, we note that it is possible to use other definitions of balance. For example, we could require β(s(a) − […]) […] s(b) and β(s(b) − […]) […] s(a) for B(a, b). One can show that the development of this chapter applies to these modifications also. Furthermore, when this new definition is used, the number of comparisons in the second substep of the LR and RL rotations is reduced by one.

Top Down Algorithms

As in the case of red-black and WB(α) trees, it is possible to perform, in O(log n) time, inserts and deletes using a single top to bottom pass. The algorithms are similar to those already presented.
procedure Restructuring;
begin
  while (gp) do
  begin
    if (s(gp.left) […] s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left;
      if (s(p.left) […] s(p.right)) then
      begin
        if (s(p.left) […] s(gp.right)) then do LL rotation;
      end
      else begin
        if (s(p.right) […] s(gp.right)) then {LR}
        begin
          do LR rotation;
          { now notations a, b, c, and d follow from figure […](b) }
          if (β(s(a) − […]) […] s(b)) then
            if ((s(a.right) […] s(a.left) − β) and (s(b) […] s(a.left)))
            then do LL rotation
            else do LR rotation
          else if (β(s(d) − […]) […] s(c)) then
            if ((s(d.left) […] s(d.right) − β) and (s(c) […] s(d.right)))
            then do RR rotation
            else do RL rotation;
        end;
      end;
    end
    else {check conditions for an RR and RL rotation}
    begin
      p = gp.right;
      if (s(p.left) […] s(p.right)) then
      begin
        if (s(p.left) […] s(gp.left)) then {RL}
          do symmetric to the above LR case;
      end
      else begin
        if (s(p.right) […] s(gp.left)) then do RR rotation;
      end;
    end;
    gp = gp.parent;
  end;
end;

Figure […]: Restructuring procedure
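The effect of a single LL rotation on total search cost can be illustrated with a small sketch. It assumes that "LL rotation" means the standard single right rotation at gp, which is an assumption because the thesis figures are not reproduced here. The names `Node`, `size`, `rotate_ll`, and `search_cost` are hypothetical helpers, not the author's code. With q the left child of p and d the right child of gp, the rotation changes the sum of node depths by s(d) − s(q), so it lowers the cost exactly when s(q) > s(d).

```python
def size(node):
    """s(node): number of nodes in the subtree; 0 for an empty subtree."""
    return node.size if node else 0


class Node:
    """Binary search tree node that stores its subtree size."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def rotate_ll(gp):
    """Single right rotation at gp (assumed meaning of 'LL'); returns the new subtree root."""
    p = gp.left
    gp.left = p.right
    p.right = gp
    # Only gp and p change their subtree contents, so only their sizes are refreshed.
    gp.size = 1 + size(gp.left) + size(gp.right)
    p.size = 1 + size(p.left) + size(p.right)
    return p


def search_cost(node, depth=1):
    """Sum of node depths (root at depth 1): cost of searching for every key once."""
    if node is None:
        return 0
    return (depth
            + search_cost(node.left, depth + 1)
            + search_cost(node.right, depth + 1))


# Example: gp = 5 with left child p = 3 (children q = {1, 2} and c = 4) and right child d = 6.
q = Node(2, left=Node(1))
p = Node(3, q, Node(4))
gp = Node(5, p, Node(6))
cost_before = search_cost(gp)
root = rotate_ll(gp)
cost_after = search_cost(root)
```

Here s(q) = 2 and s(d) = 1, so the rotation should lower the total cost by one; the stored sizes make the test for a beneficial rotation a constant-time comparison.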
Simple BBSTs

The development of Section […] was motivated by our desire to construct trees with minimal search cost. If, instead, we desire only logarithmic performance per operation, we may simplify the restructuring pass so that rotations are performed only at nodes where the balance property is violated. In this case, we may dispense with the LL/RR rotations and the first substep of an LR/RL rotation. Only LR/RL substep (ii) rotations are needed. To see this, observe that Lemmas […] and […] show that the second substep rotations rebalance at gp (see Figures […] and […]) provided […](p, d). (The remaining conditions are ensured by the bottom-up nature of restructuring and the fact that the tree was β-balanced prior to the insert or delete.)

If the operation that resulted in loss of balance at gp was an insert, then β(p − […]) […] d (as p […] d, the insert took place in subtree p, and gp was β-balanced prior to the insert) and β(p − […]) […] d (gp is not β-balanced following the insert). For the substep (ii) rotation to restore balance, we need β(p − […]) […] d. This is assured if […] (as β(p − […]) […] d). So we need d […]. If d […], then d […]. Now β(p − […]) […] d and β(p − […]) […] d imply p […]. One may verify that when p […], the LR(ii) rotations restore balance.

If the loss of balance at gp is the result of a deletion (say, from its right subtree), then β(p − […]) […] d […] (as gp was balanced prior to the delete). For the substep (ii) rotation to accomplish the rebalancing, we need β(p − […]) […] d. This is guaranteed if […] or d […]. When […], d […] and d […]. Since β(p − […]) […] d and […], when d […], p […]; when d […], p […]; and when
d […], p […]. We may verify that for all these cases, the LR(ii) rotations restore balance. Hence, the only problematic case is when β […] and d […].

When […], an LL rotation fails to restore balance only when d […] (see the discussion following Theorem […]). So we need to rearrange the at most […] nodes in gp into any β-balanced tree when d […]. An LR rotation fails only when d […]. To see this, note that in the terminology of Lemma […], d is d'. The proof of P2 is extended to the case when d […]. Also, since d' […], for the case b […] c we get […](d' − […]) […] c as c […]. For the case b […] c, we need to show […](a − […]) […] b. Since an LR rotation is done only when condition […] holds, from Lemmas […] and […] it follows that β(a − […]) […] b. So an LR rotation rebalances when […], provided d […]. For smaller d, the at most […] nodes in the subtree gp may be directly rearranged into a β-balanced tree.

The restructuring algorithm for simple BBSTs is given in Figures […] and […]. The algorithm of Figure […] is used following an insert and that of Figure […] after a delete.

Simple β-BBSTs are expected to have higher search cost than the BBSTs of Section […]. However, they are a good alternative to traditional WB(α) trees, as they are expected to be "better balanced". To see this, note that from the proof of Lemma […], the balance B(p) at any node p in a β-balanced tree satisfies […]
procedure Restructuring;
begin
  while (gp) do
  begin
    if ((s(gp.left) − […]) […] s(gp.right)) then {do an LL or LR rotation}
    begin
      p = gp.left;
      if ((s(p.right) […] s(p.left) − […]) and (s(gp.right) […] s(p.left)))
      then do LL rotation
      else do LR rotation
    end
    else do symmetric to the above L case;
    gp = gp.parent;
  end;
end;

Figure […]: Simple restructuring procedure for insertion

procedure Restructuring;
begin
  while (gp) do
  begin
    if ((s(gp.left) − […]) […] s(gp.right)) then
      if ([…] and s(gp.right) […])
      then rearrange the subtree rooted at gp into any β-balanced tree
      else {do an LL or LR rotation}
      begin
        p = gp.left;
        if ((s(p.right) […] s(p.left) − […]) and (s(gp.right) […] s(p.left)))
        then do LL rotation
        else do LR rotation
      end
    else do symmetric to the above L case;
    gp = gp.parent;
  end;
end;

Figure […]: Simple restructuring procedure for deletion
! mfn+Lf 3 mrf L 3 7 }UfOf 6R r3f S mrf $OVR VLQFH VUf Vf VUf VOf +HQFH SALf rLn 6R %Sf fÂ§ LOO L S 3VLff W LL f L f 3 mrf &RQVHTXHQWO\ ,,, 3 mrf %Sf M fÂ§ f rUfOf :KHQ YA Â‘ %Sf U AA YAAIW ,I VSf %Sf fÂ§ 6R HYHU\ EDODQFHG VXEWUHH ZLWK RU IHZHU QRGHV LV LQ :%Df IRU D VV 6LPLODUO\ HYHU\ VXEWUHH ZLWK RU IHZHU QRGHV LV LQ :%Df IRU D m ,Q IDFW IRU HYHU\ IL[HG N VXEWUHHV RI VL]H N RU OHVV
are in WB(α) for α slightly higher than […], which is the largest value of α for which WB(α) trees can be maintained.

procedure Restructuring;
begin
  while (gp) do
  begin
    if (s(gp.left) […] s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left;
      if ((s(p.left) […] s(p.right)) and (s(p.left) […] s(gp.right)))
      then do LL rotation
      else if ((s(p.left) […] s(p.right)) and (s(p.right) […] s(gp.right)))
      then do LR rotation
    end
    else {check conditions for an RR and RL rotation}
      do symmetric to the above L case;
    gp = gp.parent;
  end;
end;

Figure […]: Simple restructuring procedure without a β value

BBSTs without Deletion

In some applications of a dictionary, we need to support only the insert and search operations. In these applications, we can construct binary search trees with total cost C(T) ≤ n log_φ(√5(n + 1)) by using the simpler restructuring algorithm of Figure […].

Theorem […]: When the only operations are search and insert and restructuring is done as in Figure […], C(T) ≤ n log_φ(√5(n + 1)).
Proof: Suppose T currently has m − 1 elements and a new element is inserted. Let u be the level at which the new element is inserted. Suppose that the restructuring pass performs rotations at q […] u of the nodes on the path from the root to the newly inserted node. Then C(T) increases by at most v = u − q as a result of the insertion. The number of nodes on the path from the root to the newly inserted node at which no rotation is performed is also v. Let these nodes be numbered 1 through v, bottom to top. Let S_i denote the number of elements in the subtree with root i prior to the restructuring pass. We see that S_1 […] and S_2 […]. For node i, […] i […] v, one of its subtrees contains node i − 1. Without loss of generality, let this be the left subtree of i. Let the root of the right subtree of i be d. So S_i […] S_{i−1} + s(d) […]. If i − 1 is not the left child of i, then, since no rotation is done at i, s(d) […] S_{i−1}. If i − 1 is the left child of i, then consider node i − 2. This is in one of the subtrees of i − 1. Since no rotation is performed at i − 1, s(d) […] S_{i−2}. Since S_i […], we get S_i ≥ S_{i−1} + S_{i−2} + 1. Hence S_v ≥ N_v, where N_v is the minimum number of elements in a COST of height v. So v ≤ log_φ(√5(m + 1)). So when an element is inserted into a tree that has m − 1 elements, its cost C(T) increases by at most log_φ(√5(m + 1)). Starting with an empty tree and inserting n elements results in a tree whose cost is at most
n log_φ(√5(n + 1)). □

Corollary […]: The expected cost of a search or insert in a BBST constructed as above is O(log n).

Proof: Since C(T) ≤ n log_φ(√5(n + 1)), the expected search cost is C(T)/n ≤ log_φ(√5(n + 1)). The cost of an insert is of the same order as that of a search, as each insert follows the corresponding search path twice (top down and bottom up). □

Experimental Results

For comparison purposes, we wrote C programs for BBSTs, SBBSTs (simple BBSTs), BBSTDs (BBSTs in which procedure Restructuring (Figure […]) is used to restructure following inserts as well as deletes), unbalanced binary search trees (BST), AVL trees, top-down red-black trees (RBT), bottom-up red-black trees (RBB) […], weight balanced trees (WB), deterministic skip lists (DSL), treaps (TRP), and skip lists (SKIP). For the BBST and SBBST structures, we used β = […], while for the WB structure we used α = […]. While these are not the highest permissible values of β and α, this choice permitted us to use integer arithmetic rather than the substantially more expensive real arithmetic. For instance, B(a, b) for β = […] can be checked using the comparisons (s(a) − […]) […] s(b) and (s(b) − […]) […] s(a). The randomized structures TRP and SKIP used the same random number generator with the same seed. SKIP was programmed with probability value p = […], as in Pugh […].
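The integer-arithmetic remark can be illustrated with a short sketch. Both the form of the predicate, β(s(a) − 1) ≤ s(b) and β(s(b) − 1) ≤ s(a), and the value β = 1/2 are assumptions made for illustration, since the corresponding symbols did not survive in the text above. The function names are hypothetical. Under those assumptions, multiplying through by 2 removes all fractional arithmetic.

```python
from fractions import Fraction

BETA = Fraction(1, 2)  # assumed value, chosen so that the test reduces to integers


def balanced_exact(sa, sb, beta=BETA):
    """Assumed predicate B(a, b), evaluated with exact rational arithmetic."""
    return beta * (sa - 1) <= sb and beta * (sb - 1) <= sa


def balanced_int(sa, sb):
    """The same test for beta = 1/2 using integer operations only."""
    return sa - 1 <= 2 * sb and sb - 1 <= 2 * sa
```

For subtree sizes 5 and 2, both functions report balance (4 ≤ 4); for sizes 6 and 2, both report a violation (5 > 4). The two functions agree on all inputs because the integer form is the rational form scaled by 2.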
To minimize the impact of system call overheads on run time measurements, we programmed all structures using simulated pointers (i.e., an array of nodes with integer pointers […]). Skip lists use variable size nodes. This requires more complex storage management than required by the remaining structures, which use nodes of the same size. For our experiments, we implemented skip lists using fixed size nodes, each node being of the maximum size. As a result, our run times for skip lists are smaller than if a space efficient implementation had been used. In all our tree structure implementations, null pointers were replaced by a pointer to a tail node whose data field could be set to the search/insert/delete key, thus avoiding the check for falling off the tree. Similar tail pointers are part of the defined structure of skip and deterministic skip lists. Each tree also had a head node. WB(α) trees were implemented with a bottom-up restructuring pass. Our codes for SKIP and DSL are based on the codes of Pugh […] and Papadakis […], respectively. Our AVL and RBT codes are based on those of Papadakis […] and Sedgewick […]. The treap structure was implemented using joins and splits rather than rotations. This results in better performance. Furthermore, AVL, RBB, WB, and BBST were implemented with parent pointers in addition to left and right child pointers. For BBSTs, the enhancements described in Section […] for insert and delete (see Figure […]) were employed. No rotations were performed during a search when using any of the structures.

For our experiments, we tried two versions of the code. These varied in the order in which the "equality" and "less than" or "greater than" check between x and e (where x is the key being searched/inserted/deleted and e is the key in the current
node) is done. In version […], we conducted an initial experiment to determine if the total comparison count is less using the order L:

    if x < e then move to left child
    else if x ≠ e then move to right child
    else found

or the order R:

    if x > e then move to right child
    else if x < e then move to left child
    else found

Our experiment indicated that doing the "left child" check first, i.e., order L, […] test in the second, third, and fourth if statements was changed to "[…]".
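The implementation choices described here (simulated pointers, a shared tail node, and the two comparison orders) can be sketched together on a plain unbalanced search tree. This is an illustrative sketch only, not the thesis code: the class name `ArrayBST`, the use of slot 0 as the tail node, and the exact form of the second test in each order are assumptions.

```python
class ArrayBST:
    """Unbalanced BST kept in parallel lists; integer indices act as pointers."""

    TAIL = 0  # index of the tail node that stands in for every null pointer

    def __init__(self):
        self.key = [None]   # slot 0 is the tail node
        self.left = [0]
        self.right = [0]
        self.root = self.TAIL

    def insert(self, x):
        """Insert x if absent; return the index of the node holding x."""
        self.key[self.TAIL] = x      # sentinel guarantees the loop terminates
        parent, cur, went_left = self.TAIL, self.root, False
        while self.key[cur] != x:
            parent, went_left = cur, x < self.key[cur]
            cur = self.left[cur] if went_left else self.right[cur]
        if cur != self.TAIL:
            return cur               # x was already in the tree
        self.key.append(x)
        self.left.append(self.TAIL)
        self.right.append(self.TAIL)
        new = len(self.key) - 1
        if parent == self.TAIL:
            self.root = new
        elif went_left:
            self.left[parent] = new
        else:
            self.right[parent] = new
        return new

    def search(self, x, order="L"):
        """Return (found, number of key comparisons) for comparison order 'L' or 'R'."""
        self.key[self.TAIL] = x
        cur, comparisons = self.root, 0
        while True:
            e = self.key[cur]
            comparisons += 1
            if (x < e) if order == "L" else (x > e):
                cur = self.left[cur] if order == "L" else self.right[cur]
                continue
            comparisons += 1
            if x != e:
                cur = self.right[cur] if order == "L" else self.left[cur]
                continue
            return cur != self.TAIL, comparisons


tree = ArrayBST()
for k in (2, 1, 3):
    tree.insert(k)
```

Because the tail node always holds the key being sought, neither loop needs a separate "fell off the tree" test; an unsuccessful search is recognized only after the loop stops at index 0. The comparison counts returned by `search` differ between the two orders depending on whether the search path turns left or right more often.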
No change was made in the corresponding if statements for RR and RL rotations. While this increased the number of comparisons, it reduced the run time.

We experimented with n = […], […], […], and […]. For each n, the following experiments were conducted:

(a) start with an empty structure and perform n inserts;
(b) search for each item in the resulting structure once; items are searched for in the order they were inserted;
(c) perform an alternating sequence of n inserts and n deletes; in this, the n elements inserted in (a) are deleted in the order they were inserted and n new elements are inserted;
(d) search for each of the remaining n elements in the order they were inserted;
(e) […] experiment.

The three versions of our proposed data structure are very competitive on this measure. BBSTDs and BBSTs generally performed fewer comparisons than did SBBSTs. All three structures had a comparison count within […]% of one another.
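Parts (a) through (d) of this protocol can be sketched as a small driver. The sketch makes several assumptions: the structure under test exposes `insert`, `search`, and `delete`; part (c) alternates "insert one new key, delete one old key"; and part (e) is omitted because its description did not survive. `SetDictionary`, `run_experiment`, and `random_inputs` are hypothetical names, and the set-backed class is only a stand-in for a real tree.

```python
import random


class SetDictionary:
    """Stand-in structure; any object with insert/search/delete would do."""

    def __init__(self):
        self._items = set()

    def insert(self, x):
        self._items.add(x)

    def search(self, x):
        return x in self._items

    def delete(self, x):
        self._items.discard(x)


def run_experiment(structure, first, second):
    """Run parts (a)-(d); return the successful-search counts of parts (b) and (d)."""
    if len(first) != len(second):
        raise ValueError("both key lists must have n elements")
    for x in first:                       # (a) n inserts into an empty structure
        structure.insert(x)
    hits_b = sum(structure.search(x) for x in first)    # (b) search in insertion order
    for new, old in zip(second, first):   # (c) alternate: insert a new key, delete an old one
        structure.insert(new)
        structure.delete(old)
    hits_d = sum(structure.search(x) for x in second)   # (d) search the remaining n keys
    return hits_b, hits_d


def random_inputs(n, seed=0):
    """Two disjoint lists of n distinct integer keys in random order."""
    rng = random.Random(seed)
    keys = rng.sample(range(10 * n), 2 * n)
    return keys[:n], keys[n:]


def ordered_inputs(n):
    """Keys 1..n for part (a) and n+1..2n for part (c), in increasing order."""
    return list(range(1, n + 1)), list(range(n + 1, 2 * n + 1))
```

A run such as `run_experiment(SetDictionary(), *random_inputs(1000))` should find every key in both search phases; a timing or comparison-counting harness would wrap each part separately.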
Table […]: The number of key comparisons on random inputs (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete.

However, when we used ordered data rather than random data (Table […]), SBBSTs performed noticeably inferior to BBSTDs and BBSTs; the latter two remained very competitive.

Tables […] and […] give the average heights of the trees using random data and using ordered data, respectively. The first number gives the height following part (a) of the experiment and the second, following part (c). The numbers are identical for BBSTDs and BBSTs and slightly higher (lower) for SBBSTs using random (ordered) data.
Table […]: The number of key comparisons on ordered inputs (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete.

Table […]: Height of the trees on random inputs (version […] code)
Columns: n, SBBST, BBSTD, BBST.

Table […]: Height of the trees on ordered inputs (version […] code)
Columns: n, SBBST, BBSTD, BBST.
The average number of rotations performed by each of the three structures is given in Tables […] and […]. A single rotation (i.e., LL or RR) is denoted "S" and a double rotation (i.e., LR or RL) is denoted "D". In the case of BBSTs, double rotations have been divided into three categories:

D = LR and RL rotations that do not perform a second substep rotation
DS = LR and RL rotations with a second substep rotation of type LL and RR
DD = LR and RL rotations with a second substep rotation of type LR and RL

BBSTDs and BBSTs performed a comparable number of rotations on both data sets. However, on random data, SBBSTs performed about half as many rotations as did BBSTDs and BBSTs. On ordered data, SBBSTs performed […] to […]% fewer rotations on part (a), […]% fewer on part (c), and […]% fewer on part (e).

The run-time performance of the structures is significantly influenced by […] comparison count to have a smaller run time for parts (b) and (d) of the experiment. This was not always the case. Tables […] and […] give the run times of the three BBST structures using integer keys, and Tables […] and […] do this for the case of real (i.e., floating point) keys. The
Table […]: The number of rotations on random inputs (version […] code)
Columns: n, operation, SBBST (S, D), BBSTD (S, D), BBST (S, D, DS, DD). Rows for each n: insert, ins/del, delete.
Table […]: The number of rotations on ordered inputs (version […] code)
Columns: n, operation, SBBST (S, D), BBSTD (S, D), BBST (S, D, DS, DD). Rows for each n: insert, ins/del, delete.

sum of the run time for parts (a)-(e)
Table […]: Run time on random inputs using integer keys (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
Table […]: Run time on ordered inputs using integer keys (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
Table […]: Run time on random real inputs (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
Table […]: Run time on ordered real inputs (version […] code)
Columns: n, operation, SBBST, BBSTD, BBST. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
Figure […]: Run time on real inputs (version […] code). Time is the sum of the times for parts (a)-(e) of the experiment.

The average number of comparisons for each of the five parts of the experiment is given in Table […] for the version […] implementation. On the comparison measure, AVL, RBB, WB, and BBSTs are the front runners and are quite competitive with one another. On parts (a) (insert n elements) and (c) (insert n and delete n elements), AVL trees performed best, while on the two search tests ((b) and (d)) and the deletion test (e), BBSTs performed best.

Table […] gives the number of comparisons performed when ordered data (i.e., the elements in part (a) are […] n and are inserted in this order, and those in part (c) are n […] n, in this order) is used instead of random permutations of distinct elements. This experiment attempts to model realistic situations in which the inserted elements are in "nearly sorted order". BSTs were not included in this test as they perform very poorly with ordered data, taking […] time to insert n
times. The computer time needed to perform this test on BSTs was determined to be excessive. This test exhibited greater variance in performance. Among the deterministic structures, BBSTs outperformed the others in parts (a)-(d), while AVL trees were ahead in part (e). For part (a), BBSTs performed approximately […]% fewer comparisons than did AVL trees and approximately […]% fewer than WB trees. The randomized structure TRP was the best of the eight structures reported in Table […] for part (a). It performed approximately […]% fewer comparisons than did BBST trees. However, the BBST remained best overall on parts (b), (c), and (d).

The heights of the trees (number of levels in the case of DSL and SKIP) for the experiments with random and ordered data are given in Tables […] and […], respectively. The first number in each table entry is the tree height after part (a) of the experiment and the second, the height after part (c). In all cases, the number of levels using skip lists is fewest. However, among the tree structures, AVL and BBST trees have least height on random data and AVL has least with ordered data.

Tables […] and […], respectively, give the number of rotations performed by each of the deterministic tree schemes for experiment parts (a), (c), and (e). Note that none of the schemes performs rotations during a search. On ordered data, BBSTs perform about […]% more rotations than do the remaining structures. These remaining structures perform about the same number of rotations. On random data, AVL trees, bottom-up red-black trees, and WB trees perform a comparable number of rotations. Top-down red-black trees and BBST trees
Table […]: The number of key comparisons on random inputs (version […] code)
Columns: n, operation, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete.
Table […]: The number of key comparisons on ordered inputs (version […] code)
Columns: n, operation, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete.
perform a significantly larger number of rotations. In fact, BBSTs perform about twice as many rotations as AVL trees.

Table […]: Height of the trees on random inputs (version […] code)
Columns: n, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP.

Table […]: Height of the trees on ordered inputs (version […] code)
Columns: n, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP.

The average run times for the random data tests are given in Table […] and in Table […] for the ordered data test. Both of these use integer keys. The times using real keys are given in Tables […] and […]. The sum of the run time for parts (b) and (d) of the experiment is graphed in Figure […] for random data and in Figure […] for ordered data. The graph of Figure […] shows only one line, MIX, for AVL, RBT, RBB, WB, and BBST, while that of Figure […] shows MIX for AVL, RBT, RBB, and WB, as the times for these are very close.

With integer keys and random data, unbalanced binary search trees (BSTs) outperformed each of the remaining structures. The next best performance was exhibited by bottom-up red-black trees. They did marginally better than AVL trees. The remaining
Table 3.15: The number of rotations on random inputs (version 1 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST (sub-columns labelled S and D). Rows for each n: insert, ins/del, delete.

Table 3.16: The number of rotations on ordered inputs (version 1 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST (sub-columns labelled S and D). Rows for each n: insert, ins/del, delete.
Figure 3.15: Run time on random real inputs (version 1 code). Time is the sum of time for parts (b) and (d) of the experiment.

structures have a noticeably inferior performance. For ordered integer keys, BSTs take more time than we were willing to expend. Of the remaining structures, treaps generally performed best on parts (a), (c), and (e), while BBSTs did best on parts (b) and (d).

With real keys and random data, BSTs did not outperform the remaining structures. Now, the five balanced binary tree structures became quite competitive with respect to the search operations (i.e., parts (b) and (d)). RBB generally outperformed the other structures on parts (a), (c), and (e). Using ordered real keys, the treap was the clear winner on parts (a), (c), and (e), while BBSTs handily outperformed the remaining structures on parts (b) and (d).

Some of the experimental results using version 2 of the code are shown in Tables 3.21-3.24. On the comparison measure, with random data (Table 3.21), skip
Table 3.17: Run time on random inputs using integer keys (version 1 code). Columns: n, operation, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.

Table 3.18: Run time on ordered inputs using integer keys (version 1 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.

Table 3.19: Run time on random real inputs (version 1 code). Columns: n, operation, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.

Table 3.20: Run time on ordered real inputs (version 1 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
Figure 3.16: Run time on ordered real inputs (version 1 code). Time is the sum of time for parts (b) and (d) of the experiment.

lists performed best on part (a). Of the deterministic methods, BBSTs slightly outperformed the others on part (a). On parts (b)-(e), AVL, RBT, RBB, WB, and BBSTs were quite competitive and outperformed BSTs and the randomized schemes. BBSTs performed best on parts (b) and (d); RBTs did best on part (e); and RBB and AVL did best on part (c). In comparing the results of Table 3.21 to those of Table 3.11 (using version 1 code), we see that the change to version 2 generally increased the comparison cost of the deterministic tree structures by about [...]%. For the DSL, the change in code had mixed results. Notice that for RBT and DSLs, the comparison counts for parts (a), (c), and (e) are the same as for the version 1 code. This is because for inserts and deletes it is necessary to do the equal check first when using these structures. For SKIPs, the count is the same for all five parts, as the version 1 and 2 codes are the same.
With ordered data (Table 3.22), treaps required the fewest comparisons for part (a). Skip lists did best on parts (c) and (e), and AVL trees generally outperformed the other structures on parts (b) and (d). Once again, the comparison counts were generally higher using the version 2 code than using the version 1 code.

Run time data using real keys is given in Tables 3.23 and 3.24. The sum of the run time for parts (b) and (d) of the experiment is graphed in Figure 3.17 for random data and in Figure 3.18 for ordered data. The graph of Figure 3.17 shows only one line, MIX, for AVL, RBT, RBB, WB, and BBST, while that of Figure 3.18 shows MIX for AVL, RBT, RBB, and WB, as the times for these are very close. With random data, RBB generally performed best on part (a); on parts (b) and (d), the front runner varied among AVL, RBT, and WB; and on parts (c) and (e), RBBs generally did best. On ordered data, TRPs did best on parts (a), (c), and (e), while BBSTs did best on parts (b) and (d).

3.8 Conclusion

We have developed a new weight balanced data structure called BBST. This was developed for the representation of a dictionary. In developing the insert/delete algorithms, we sought to minimize the search cost of the resulting tree. Our experimental results show that BBSTs generally have the best search cost of the structures considered. Furthermore, this translates into reduced search time when the key comparison cost is relatively high (e.g., for real keys). The insert and delete algorithms for BBSTs are not as efficient as those for other dictionary structures (such as AVL trees). As a result, we recommend BBSTs for environments where searches
Table 3.21: The number of key comparisons on random inputs (version 2 code). Columns: n, operation, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete.

Table 3.22: The number of key comparisons on ordered inputs (version 2 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete.

Table 3.23: Run time on random real inputs (version 2 code). Columns: n, operation, BST, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.

Figure 3.17: Run time on random real inputs (version 2 code). Time is the sum of time for parts (b) and (d) of the experiment.

Figure 3.18: Run time on ordered real inputs (version 2 code). Time is the sum of time for parts (b) and (d) of the experiment.

Table 3.24: Run time on ordered real inputs (version 2 code). Columns: n, operation, AVL, RBT, RBB, WB, BBST, DSL, TRP, SKIP. Rows for each n: insert, search, ins/del, search, delete. Time unit: sec.
are done with much greater frequency than inserts and/or deletes. Based on our experiments, we conclude that AVL trees remain the best dictionary structure for general applications.

We have also proposed two simplified versions of the BBST, called SBBST and BBSTD. The SBBST seeks only to provide logarithmic run time per operation and, unlike the general BBST, does not reduce search cost at every opportunity. The SBBST provides slightly better balance than provided by WB(α) trees. The BBSTD does not attempt to maintain balance. However, it performs rotations to reduce search cost whenever possible. Both versions are very competitive with BBSTs. The SBBST exhibited much better run time performance than BBSTs on random data, and the BBSTD slightly outperformed the BBST on ordered data. However, BBSTs generated trees with the lowest search cost (though not by much).
CHAPTER 4
WEIGHT BIASED LEFTIST TREES AND MODIFIED SKIP LISTS

4.1 Introduction

Several data structures (e.g., heaps, leftist trees [...], binomial heaps [...]) have been proposed for the representation of a (single ended) priority queue. Heaps permit one to delete the min element and insert an arbitrary element into an n element priority queue in O(log n) time. Leftist trees support both these operations and the merging of pairs of priority queues in logarithmic time. Using binomial heaps, inserts and combines take O(1) time and a delete-min takes O(log n) amortized time.

In this chapter, we begin, in Section 4.2, by developing the weight biased leftist tree. This is similar to a leftist tree. However, biasing of left and right subtrees is done by number of nodes rather than by length of paths. Experimental results presented in Section 4.5 show that weight biased leftist trees provide better performance than provided by leftist trees. The experimental comparisons of Section 4.5 also include a comparison with heaps and binomial heaps as well as with unbalanced binary search trees and the probabilistic structures treap [...] and skip lists [...].

In Section 4.3, we propose a fixed node size representation for skip lists. The new structure is called modified skip lists and is experimentally compared with the
variable node size structure skip lists. Our experiments indicate that modified skip lists are faster than skip lists when used to represent dictionaries. Modified skip lists are augmented by a thread in Section 4.4 to obtain a structure suitable for use as a priority queue. For completeness, we include, in Section 4.5, a comparison of data structures for double ended priority queues.

4.2 Weight Biased Leftist Trees

Let T be an extended binary tree. For any internal node x of T, let LeftChild(x) and RightChild(x), respectively, denote the left and right children of x. The weight, w(x), of any node x is the number of internal nodes in the subtree with root x. The length, shortest(x), of a shortest path from x to an external node satisfies the recurrence

$$\mathit{shortest}(x) = \begin{cases} 0 & \text{if } x \text{ is an external node} \\ 1 + \min\{\mathit{shortest}(\mathit{LeftChild}(x)),\ \mathit{shortest}(\mathit{RightChild}(x))\} & \text{otherwise} \end{cases}$$

Definition [...]: A leftist tree (LT) is a binary tree such that if it is not empty, then

$$\mathit{shortest}(\mathit{LeftChild}(x)) \ge \mathit{shortest}(\mathit{RightChild}(x))$$

for every internal node x.

A weight biased leftist tree (WBLT) is defined by using the weight measure in place of the measure shortest.
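The two measures can be compared directly on small trees. The following is a minimal Python sketch, not code from the dissertation: the `Node` class, the function names, and the two example trees are illustrative assumptions, and external nodes are modelled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Internal node of an extended binary tree; None plays the external node."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def shortest(x: Optional[Node]) -> int:
    """Length of a shortest path from x to an external node."""
    if x is None:
        return 0
    return 1 + min(shortest(x.left), shortest(x.right))


def weight(x: Optional[Node]) -> int:
    """Number of internal nodes in the subtree rooted at x."""
    if x is None:
        return 0
    return 1 + weight(x.left) + weight(x.right)


def is_leftist(x: Optional[Node]) -> bool:
    """shortest(left) >= shortest(right) at every internal node."""
    if x is None:
        return True
    return (shortest(x.left) >= shortest(x.right)
            and is_leftist(x.left) and is_leftist(x.right))


def is_weight_biased_leftist(x: Optional[Node]) -> bool:
    """weight(left) >= weight(right) at every internal node."""
    if x is None:
        return True
    return (weight(x.left) >= weight(x.right)
            and is_weight_biased_leftist(x.left)
            and is_weight_biased_leftist(x.right))


# Hypothetical example: leftist, but the right subtree is heavier than the left.
leftist_only = Node(1, Node(2), Node(3, Node(4)))

# Hypothetical example: weights balance (3 vs 3), but the left spine is a chain,
# so shortest(left) = 1 < shortest(right) = 2.
wblt_only = Node(1, Node(2, Node(3, Node(4))), Node(5, Node(6), Node(7)))
```

The two example trees show that neither property implies the other, which is why the weight biased variant needs its own bound on the rightmost path.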
Definition: A weight biased leftist tree (WBLT) is a binary tree such that if it is not empty, then

$$\mathit{weight}(\mathit{LeftChild}(x)) \ge \mathit{weight}(\mathit{RightChild}(x))$$

for every internal node x.

It is known [...] that the length, rightmost(x), of the rightmost root to external node path of any subtree, x, of a leftist tree satisfies

$$\mathit{rightmost}(x) \le \log_2(w(x) + 1).$$

The same is true for weight biased leftist trees.

Theorem: Let x be any internal node of a weight biased leftist tree. Then $\mathit{rightmost}(x) \le \log_2(w(x) + 1)$.

Proof: The proof is by induction on w(x). When w(x) = 1, rightmost(x) = 1 and $\log_2(w(x) + 1) = \log_2 2 = 1$. For the induction hypothesis, assume that $\mathit{rightmost}(x) \le \log_2(w(x) + 1)$ whenever w(x) < n. When w(x) = n, $w(\mathit{RightChild}(x)) \le (n - 1)/2$ and

$$\mathit{rightmost}(x) = 1 + \mathit{rightmost}(\mathit{RightChild}(x)) \le 1 + \log_2((n - 1)/2 + 1) = 1 + \log_2(n + 1) - 1 = \log_2(n + 1). \qquad \Box$$

Definition: A min (max)-WBLT is a WBLT that is also a min (max) tree.

Each node of a min-WBLT has the fields: lsize (number of internal nodes in left subtree), rsize, left (pointer to left subtree), right, and data. While the
Figure 4.1: Example minWBLTs. (a) Empty minWBLT; (b) nonempty minWBLT.

number of size fields in a node may be reduced to one, two fields result in a faster implementation. We assume a head node head with lsize = ∞ and lchild = head. In addition, we assume a bottom node bottom with data.key = ∞. All pointers that would normally be nil are replaced by a pointer to bottom. Figure 4.1(a) shows the representation of an empty minWBLT and Figure 4.1(b) shows an example nonempty minWBLT. Notice that all elements are in the right subtree of the head node.

Min (max)-WBLTs can be used as priority queues in the same way as min (max)-LTs. For instance, a minWBLT supports the standard priority queue operations of insert and delete-min in logarithmic time. In addition, the combine operation (i.e., join two priority queues together) can also be done in logarithmic time. The algorithms for these operations have the same flavor as the corresponding ones for minLTs. A high level description of the insert and delete-min algorithms for minWBLTs is given in Figures 4.2 and 4.3, respectively. The algorithm to combine two
procedure Insert(d);
{insert d into a minWBLT}
begin
  create a node x with x.data := d;
  t := head; {head node}
  while (t.right.data.key < d.key) do
  begin
    t.rsize := t.rsize + 1;
    if (t.lsize < t.rsize) then
      begin swap t's children; t := t.left; end
    else t := t.right;
  end;
  x.left := t.right; x.right := bottom;
  x.lsize := t.rsize; x.rsize := 0;
  if (t.lsize = t.rsize) then {swap children}
    begin
      t.right := t.left;
      t.left := x; t.lsize := x.lsize + 1;
    end
  else
    begin t.right := x; t.rsize := t.rsize + 1; end;
end;

Figure 4.2: minWBLT Insert

minWBLTs is similar to the delete-min algorithm. The time required to perform each of the operations on a minWBLT T is O(rightmost(T)). Notice that while the insert and delete-min operations for minLTs require a top-down pass followed by a bottom-up pass, these operations can be performed by a single top-down pass in minWBLTs. Hence, we expect minWBLTs to outperform minLTs.

4.3 Modified Skip Lists

Skip lists were proposed in Pugh [...] as a probabilistic solution for the dictionary problem (i.e., represent a set of keys and support the operations of search, insert, and delete). The essential idea in skip lists is to maintain up to lmax ordered chains
procedure Deletemin;
begin
  x := head.right;
  if (x = bottom) then return; {empty tree}
  head.right := x.left; head.rsize := x.lsize;
  a := head; b := x.right; bsize := x.rsize;
  delete x;
  if (b = bottom) then return;
  r := a.right;
  while (r ≠ bottom) do
  begin
    s := bsize + a.rsize; t := a.rsize;
    if (a.lsize < s) then {work on a.left}
      begin
        a.right := a.left; a.rsize := a.lsize; a.lsize := s;
        if (r.data.key > b.data.key) then
          begin a.left := b; a := b; b := r; bsize := t; end
        else
          begin a.left := r; a := r; end
      end
    else do symmetric operations on a.right;
    r := a.right;
  end;
  if (a.lsize < bsize) then
    begin
      a.right := a.left; a.left := b;
      a.rsize := a.lsize; a.lsize := bsize;
    end
  else
    begin a.right := b; a.rsize := bsize; end;
end;

Figure 4.3: minWBLT Deletemin
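The single top-down pass works because subtree sizes are known before descending, so the decision to swap children can be made on the way down. The following Python sketch illustrates that idea with a recursive meld; it is a simplification under stated assumptions, not the head/bottom sentinel implementation of the figures. It keeps one `size` field per node instead of lsize and rsize, uses `None` for empty subtrees, and all names are illustrative.

```python
class Node:
    """One element of a min-WBLT; size counts the nodes in this subtree."""
    __slots__ = ("key", "left", "right", "size")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def size(x):
    return x.size if x is not None else 0


def meld(a, b):
    """Combine two min-WBLTs; sizes and swaps are settled before recursing."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a                      # a now holds the smaller root
    a.size += b.size
    # b will be melded into a's right subtree; its final size is already known.
    if size(a.left) < size(a.right) + b.size:
        # The melded subtree would be heavier than the left one: make it the left child.
        a.left, a.right = meld(a.right, b), a.left
    else:
        a.right = meld(a.right, b)
    return a


def insert(root, key):
    return meld(root, Node(key))


def delete_min(root):
    """Return (smallest key, new root); root must be non-empty."""
    return root.key, meld(root.left, root.right)


def is_wblt(x):
    """Check weight(left) >= weight(right) at every node."""
    if x is None:
        return True
    return size(x.left) >= size(x.right) and is_wblt(x.left) and is_wblt(x.right)


def rightmost_length(x):
    """Number of nodes on the rightmost root-to-external path."""
    n = 0
    while x is not None:
        n += 1
        x = x.right
    return n
```

Each recursive call follows a right child of one of the two trees, so the recursion depth is bounded by the sum of the two rightmost path lengths.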
Figure 4.4: Skip Lists

designated as level 1 chain, level 2 chain, etc. If we currently have lcurrent number of chains, then all n elements of the dictionary are in the level 1 chain and, for each level l above the first, approximately a fraction p of the elements on the level l − 1 chain are also on the level l chain. Ideally, if the level l − 1 chain has m elements, then the approximately m × p elements on the level l chain are about 1/p apart in the level l − 1 chain. Figure 4.4 shows an ideal situation for the case lcurrent = [...] and p = [...].

While the search, insert, and delete algorithms for skip lists are simple and have probabilistic complexity O(log n) when the level 1 chain has n elements, skip lists suffer from the following implementational drawbacks:

1. In programming languages such as Pascal, it isn't possible to have variable size nodes. As a result, each node has one data field and lmax pointer fields. So, the n element nodes have a total of n × lmax pointer fields even though only about n/(1 − p) pointers are necessary. Since lmax is generally much larger than [...] (the recommended value is $\log_{1/p} nMax$, where nMax is the largest number of elements expected in the dictionary), skip lists require more space than WBLTs.
2. While languages such as C and C++ support variable size nodes, and we can construct variable size nodes using simulated pointers [...] in languages such as Pascal that do not support variable size nodes, the use of variable size nodes requires more complex storage management techniques than required by the use of fixed size nodes. So, greater efficiency can be achieved using simulated pointers and fixed size nodes.

With these two observations in mind, we propose a modified skip list (MSL) structure in which each node has one data field and three pointer fields: left, right, and down. Notice that this means MSLs use four fields per node while WBLTs use five (as indicated earlier, this can be reduced to four at the expense of increased run time). The left and right fields are used to maintain each level l chain as a doubly linked list, and the down field of a level l node x points to the leftmost node in the level l − 1 chain that has key value larger than the key in x. Figure 4.5 shows the modified skip list that corresponds to the skip list of Figure 4.4. Notice that each element is in exactly one doubly linked list. We can reduce the number of pointers in each node to two by eliminating the field left and having down point one node to the left of where it currently points (except for head nodes, whose down fields still point to the head node of the next chain). However, this results in a less time efficient implementation. H and T, respectively, point to the head and tail of the level lcurrent chain.

A high level description of the algorithms to search, insert, and delete is given in Figures 4.6, 4.7, and 4.8. The next theorem shows that their probabilistic complexity is O(log n), where n is the total number of elements in the dictionary.
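The level structure described above, in which a fraction p of each chain is promoted to the next level, is usually produced by repeated coin flips. The sketch below shows that standard technique in Python and the resulting average of about 1/(1 − p) levels per element, which matches the n/(1 − p) pointer estimate given for skip lists. The function names and the use of Python's `random.Random` are assumptions for illustration; the dissertation's own generator is not shown in the text.

```python
import random


def random_level(p, lmax, rng):
    """Draw a level in 1..lmax; each promotion succeeds with probability p."""
    level = 1
    while level < lmax and rng.random() < p:
        level += 1
    return level


def expected_levels(p):
    """Mean of the uncapped distribution: 1 + p + p^2 + ... = 1 / (1 - p)."""
    return 1.0 / (1.0 - p)


def average_level(p, lmax, samples, seed=0):
    """Empirical mean level over a number of draws (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(random_level(p, lmax, rng) for _ in range(samples)) / samples
```

In a skip list an element drawn at level k occupies k chains and so needs k forward pointers; in an MSL the same draw only selects the single chain that holds the element.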
Figure 4.5: Modified Skip Lists

procedure Search(key);
begin
  p := H;
  while (p ≠ nil) do
  begin
    while (p.data.key < key) do p := p.right;
    if (p.data.key = key) then report and stop
    else p := p.left.down; {1 level down}
  end;
end;

Figure 4.6: MSL Search
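The search procedure can be exercised on a small, statically built structure. The Python sketch below is an illustration under several assumptions rather than the dissertation's implementation: levels are supplied by the caller instead of being drawn at random, keys are distinct numbers so that ±infinity can serve as head and tail sentinels, and a head node's down field follows the same "leftmost larger key" rule as any other node. `MNode`, `build_msl`, and `search` are hypothetical names.

```python
import math


class MNode:
    """MSL node: one key and three pointers (left, right, down)."""
    __slots__ = ("key", "left", "right", "down")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.down = None


def build_msl(levels):
    """Build an MSL from {key: level} (level 1 is the bottom chain).

    Each key lives in exactly one doubly linked chain. Returns H, the head
    of the top chain.
    """
    top = max(levels.values(), default=1)
    chains = {}
    for lvl in range(1, top + 1):
        keys = sorted(k for k, l in levels.items() if l == lvl)
        nodes = [MNode(-math.inf)] + [MNode(k) for k in keys] + [MNode(math.inf)]
        for u, v in zip(nodes, nodes[1:]):
            u.right, v.left = v, u
        chains[lvl] = nodes
    for lvl in range(2, top + 1):
        lower = chains[lvl - 1]
        j = 0
        for x in chains[lvl][:-1]:            # head and elements, not the tail
            while lower[j].key <= x.key:      # leftmost lower node with larger key
                j += 1
            x.down = lower[j]
    return chains[top][0]


def search(head, key):
    """Follow the Search figure: scan right, step back one node, go down."""
    p = head
    while p is not None:
        while p.key < key:
            p = p.right
        if p.key == key:
            return True
        p = p.left.down                       # None once the bottom chain is passed
    return False


# Hypothetical example: 20 sits alone on level 3, 40 and 70 on level 2.
H = build_msl({10: 1, 20: 3, 30: 1, 40: 2, 50: 1, 60: 1, 70: 2})
```

Stepping back one node before descending is the source of the extra comparisons that the theorem below accounts for.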
procedure Insert(d);
begin
  randomly generate the level k at which d is to be inserted;
  search the MSL H for d.key, saving information useful for insertion;
  if d.key is found then fail; {duplicate}
  get a new node x and set x.data := d;
  if ((k > lcurrent) and (lcurrent ≠ lmax)) then
    begin
      lcurrent := lcurrent + 1;
      create a new chain with a head node, node x, and a tail and connect this chain to H;
      update H;
      set x.down to the appropriate node in the level lcurrent − 1 chain (to nil if k = 1);
    end
  else
    begin
      insert x into the level k chain;
      set x.down to the appropriate node in the level k − 1 chain (to nil if k = 1);
      update the down field of nodes on the level k + 1 chain (if any) as needed;
    end;
end;

Figure 4.7: MSL Insert

procedure Delete(z);
begin
  search the MSL H for a node x with data.key = z, saving information useful for deletion;
  if not found then fail;
  let k be the level at which z is found;
  for each node p on level k + 1 that has p.down = x, set p.down := x.right;
  delete x from the level k list;
  if the list at level lcurrent becomes empty then delete this and succeeding empty lists until we reach the first nonempty list, update lcurrent;
end;

Figure 4.8: MSL Delete
Theorem: The probabilistic complexity of the MSL operations is O(log n).

Proof: We establish this by showing that our algorithms do at most a logarithmic amount of additional work compared with those of Pugh [...]. Since the algorithms of Pugh [...] have probabilistic O(log n) complexity, so also do ours. During a search, the extra work results from moving back one node on each level and then moving down one level. When this is done from any level other than lcurrent, we expect to examine up to c = [...] additional nodes on the next lower level. Hence, up to c(lcurrent − [...]) additional nodes get examined. During an insert, we also need to verify that the element being inserted isn't one of the elements already in the MSL. This requires an additional comparison at each level. So, MSLs may make up to c(lcurrent − [...]) + lcurrent additional compares during an insert. The number of down pointers that need to be changed during an insert or delete is expected to be [...]. Since c and p are constants and $lmax = \log_{1/p} n$, the expected additional work is O(log n). □

The relative performance of skip lists and modified skip lists as a data structure for dictionaries was determined by programming the two in C. Both were implemented using simulated pointers. The simulated pointer implementation of skip lists used fixed size nodes. This avoided the use of complex storage management methods and biased the run time measurements in favor of skip lists. For the case of skip lists, we used p = [...] and for MSLs, p = [...]. These values of p were found to work best for each structure. lmax was set to [...] for both structures.
We experimented with n = [...]. For each n, the following five part experiment was conducted:

(a) start with an empty structure and perform n inserts;
(b) search for each item in the resulting structure once; items are searched for in the order they were inserted;
(c) perform an alternating sequence of n inserts and n deletes; in this, the n elements inserted in (a) are deleted in the order they were inserted and n new elements are inserted;
(d) search for each of the remaining n elements in the order they were inserted;
(e) [...]

[...]% to [...]% more comparisons on each of the five parts of the experiment. On ordered inputs, the disparity is even greater, with MSLs making [...]% to [...]% more comparisons. Table 4.2 gives the number of levels in SKIP and MSL. The first number of each entry is the number of levels following part (a) of the experiment and the second is the number of levels following part (b). As can be
Table 4.1: The number of key comparisons. Columns: n, operation, random inputs (SKIP, MSL), ordered inputs (SKIP, MSL). Rows for each n: insert, search, ins/del, search, delete.
Table 4.2: Number of levels. Columns: n, random inputs (SKIP, MSL), ordered inputs (SKIP, MSL).

seen, the number of levels is very comparable for both structures. MSLs generally had one or two levels fewer than SKIPs had.

Despite the large disparity in number of comparisons, MSLs generally required less time than required by SKIPs (see Table 4.3 and Figure 4.9).

4.4 MSLs As Priority Queues

[...] time and then deleted in O(log n) probabilistic time. In the case of MSLs, the min element is the first one in one of the lcurrent chains. This can be identified in logarithmic time using a loser tree whose elements are the first element from each MSL chain. By using an additional pointer field in each node, we can thread the elements in an MSL into a chain. The elements appear in nondescending order on this chain. The
Table 4.3: Run time. Columns: n, operation, random inputs (SKIP, MSL), ordered inputs (SKIP, MSL). Rows for each n: insert, search, ins/del, search, delete.
Figure 4.9: Run time. Time is the sum of time for parts (a)-(e) of the experiment.

resulting threaded structure is referred to as TMSL (threaded modified skip lists). A delete-min operation can be done in O(1) expected time when a TMSL is used. The expected time for an insert remains O(log n). The algorithms for the insert and delete-min operations for TMSLs are given in Figures 4.10 and 4.11, respectively. The last step of Figure 4.10 is implemented by first finding the largest element on level [...] with key < d.key (for this, start at level lcurrent − [...]) and then following the threaded chain.

Theorem: The expected complexity of an insert and a delete-min operation in a TMSL is O(log n) and O(1), respectively.

Proof: Follows from the notion of a thread, Theorem [...], and Pugh [...]. □
procedure Insert(d);
begin
  randomly generate the level k at which d is to be inserted;
  get a new node x and set x.data := d;
  if ((k > lcurrent) and (lcurrent ≠ lmax)) then
    begin
      lcurrent := lcurrent + 1;
      create a new chain with a head node, node x, and a tail and connect this chain to H;
      update H;
      set x.down to the appropriate node in the level lcurrent − 1 chain (to nil if k = 1);
    end
  else
    begin
      insert x into the level k chain;
      set x.down to the appropriate node in the level k − 1 chain (to nil if k = 1);
      update the down field of nodes on the level k + 1 chain (if any) as needed;
    end;
  find node with largest key < d.key and insert x into threaded list;
end;

Figure 4.10: TMSL Insert

procedure Deletemin;
begin
  delete the first node x from the thread list;
  let k be the level x is on;
  delete x from the level k list (note there are no down fields on level k + 1 that need to be updated);
  if the list at level lcurrent becomes empty then delete this and succeeding empty lists until we reach the first nonempty list, update lcurrent;
end;

Figure 4.11: TMSL Deletemin
procedure Deletemax;
begin
  delete the last node x from the thread list;
  let k be the level x is on;
  delete x from the level k list, updating p.down for nodes on level k + 1 as necessary;
  if the list at level lcurrent becomes empty then delete this and succeeding empty lists until we reach the first nonempty list, update lcurrent;
end;

Figure 4.12: TMSL Deletemax

TMSLs may be further extended by making the threaded chain a doubly linked list. This permits both delete-min and delete-max to be done in O(1) expected time and insert in O(log n) expected time. With this extension, TMSLs may be used to represent double ended priority queues.

4.5 Experimental Results For Priority Queues

The single-ended priority queue structures min heap (Heap), binomial heap (BHeap), leftist trees (LT), weight biased leftist trees (WBLT), and TMSLs were programmed in C. In addition, priority queue versions of unbalanced binary search trees (BST), AVL trees, treaps (TRP), and skip lists (SKIP)
were programmed using simulated pointers. The min heap was programmed using a one-dimensional array.

For our experiments, we began with structures initialized with n = [...] elements and then performed a random sequence of [...] operations. This random sequence consists of approximately [...]% insert and [...]% delete-min operations. The results are given in Tables 4.4, 4.5, and 4.6. In the data sets 'random1' and 'random2', the elements to be inserted were randomly generated, while in the data set 'increasing' an ascending sequence of elements was inserted and in the data set 'decreasing' a descending sequence of elements was used. Since BSTs have very poor performance on the last two data sets, we excluded them from this part of the experiment. In the case of both random1 and random2, ten random sequences were used and the average of these ten is reported. The random1 and random2 sequences differed in that for random1, the keys were integers in the range [...]
Table 4.4: The number of key comparisons. Columns: inputs, n, BST, Heap, BHeap, LT, WBLT, TRP, SKIP, TMSL, AVL. Rows: random1, random2, increasing, decreasing. n = the number of elements in initial data structures. Total number of operations performed = [...].
The structure height initially and following the [...] operations is given in Table 4.5 for BSTs, Heaps, TRPs, and AVL trees. For BHeaps, the height of the tallest tree is given. For SKIPs and TMSLs, this table gives the number of levels. In the case of LT and WBLT, this table gives the length of the rightmost path following initialization and the average of its length following each of the [...] operations. The two leftist structures are able to maintain their rightmost paths so as to have a length much less than $\log_2(n + 1)$.

The measured run times on a Sun Sparc are given in Table 4.6. For this, the codes were compiled using the cc compiler in optimized mode. The run time for the data set random1 is graphed in Figure 4.13. The run time for the data set random2 and Heap, LT, WBLT, SKIP, TMSL, and AVL is graphed in Figure 4.14. For the data sets random1 and random2 with n = [...] and [...], WBLTs required least time. For random1 with n = [...], BSTs took least time, while when n = [...], both BSTs and Heaps took least time. For random2 with n = [...], WBLTs were fastest, while for n = [...], Heap was best. On the ordered data sets, BSTs have a very high complexity and are the poorest performers (times not shown in Table 4.6). For increasing data, Heap was best for n = [...], [...], and [...], and both Heap and TRP were best for n = [...]. For decreasing data, WBLTs were generally best. On all data sets, WBLTs always did at least as well as (and often better than) LTs. Between SKIP and TMSL, we see that SKIP generally did better for small n and TMSL for large n.
Table 4.5: Height/level of the structures. Columns: inputs, n, BST, Heap, BHeap, LT, WBLT, TRP, SKIP, TMSL, AVL. Rows: random1, random2, increasing, decreasing. n = the number of elements in initial data structures. Total number of operations performed = [...].

Table 4.6: Run time using integer keys. Columns: inputs, n, BST, Heap, BHeap, LT, WBLT, TRP, SKIP, TMSL, AVL. Rows: random1, random2, increasing, decreasing. Time unit: sec. n = the number of elements in initial data structures. Total number of operations performed = [...].
Figure 4.13: Run time on random1 (horizontal axis: n).

Another way to interpret the time results is in terms of the ratio m/n (m = number of operations). In the experiments reported in Table 4.6, m = [...]. As m/n increases, WBLTs and LTs perform better relative to the remaining structures. This is because as m increases, the (weight biased) leftist trees constructed are very highly skewed to the left and the length of the rightmost path is close to one.

Tables 4.7, 4.8, and 4.9 provide an experimental comparison of BSTs, AVL trees, MMHs (min-max heaps) [...], Deaps [...], TRPs, SKIPs, and TMSLs as a data structure for double ended priority queues. The experimental setup is similar to that used for single ended priority queues. However, this time the operation mix was [...]% insert, [...]% delete-min, and [...]% delete-max. On the comparison measure, treaps did best on increasing data (except when n = [...]) and skip lists did best when decreasing data was used. On all other data, AVL trees did best. As far as run time
EFFICIENT ALGORITHMS AND DATA STRUCTURES FOR VLSI CAD
By
SEONGHUN CHO
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1996
ACKNOWLEDGMENTS
My heartfelt appreciation goes to my advisor Professor Sartaj Sahni for giving
me continued guidance in my thesis work. I thank him for the help, patience and
support he provided throughout my stay in the University of Florida. Weekly meetings
and discussions with him have spawned many ideas, for which I am thankful.
I would like to thank other members in my supervisory committee, Dr LiMin
Fu, Dr Theodore Johnson, Dr Sanguthevar Rajasekaran, and Dr Paul W. Chun for
their interest and comments.
Thanks go to Venkat Thanvantri for his willingness to discuss the general subject
of algorithms.
Thanks go to my wife Joungyim for her love, encouragement and patience.
Finally, I would like to thank my parents for the love and support, without which I
could not have pursued my doctoral studies. To them I dedicate this work.
TABLE OF CONTENTS
ACKNOWLEDGMENTS ii
LIST OF TABLES vii
LIST OF FIGURES x
ABSTRACT xi
CHAPTERS
1 INTRODUCTION 1
1.1 Background 1
1.2 Dissertation Outline 2
2 MINIMUM AREA JOINING OF COMPACTED CELLS 4
2.1 Introduction 4
2.2 Mayer River Routing 9
2.3 Constraint Graph Representation 16
2.4 Heuristics to Minimize Area 20
2.4.1 Heuristic 1 20
2.4.2 Heuristic 2 21
2.4.3 Heuristic 3 23
2.5 Experimental Results 25
2.6 Conclusion 30
3 A NEW WEIGHT BALANCED BINARY SEARCH TREE 32
3.1 Introduction 32
3.2 Balanced Trees and Rotations 34
3.3 β-BBSTs 39
3.4 Search, Insert, and Delete in a β-BBST 44
3.4.1 Search 44
3.4.2 Insertion 44
3.4.3 Deletion 57
3.4.4 Enhancements 63
3.4.5 Top Down Algorithms 64
3.5 Simple β-BBSTs 66
3.6 BBSTs without Deletion 70
3.7 Experimental Results 72
3.8 Conclusion 98
4 WEIGHT BIASED LEFTIST TREES AND MODIFIED SKIP LISTS 105
4.1 Introduction 105
4.2 Weight Biased Leftist Trees 106
4.3 Modified Skip Lists 109
4.4 MSLs As Priority Queues 118
4.5 Experimental Results For Priority Queues 122
4.6 Conclusion 129
5 CONCLUSIONS 135
A ABBREVIATIONS 136
REFERENCES 138
BIOGRAPHICAL SKETCH 141
LIST OF TABLES
2.1 Error rate (%) over optimal, l = 1 27
2.2 Improvement (%) over Fang, l = 1 27
2.3 Time taken, l = 1 28
2.4 Error rate (%) over optimal, l = 2 29
2.5 Improvement (%) over l = 1 cases 29
2.6 Improvement (%) over Fang, l = 2 30
2.7 Time taken, l = 2 31
3.1 The number of key comparisons on random inputs (version 1 code) . 76
3.2 The number of key comparisons on ordered inputs (version 1 code) . 77
3.3 Height of the trees on random inputs (version 1 code) 77
3.4 Height of the trees on ordered inputs (version 1 code) 77
3.5 The number of rotations on random inputs (version 1 code) 79
3.6 The number of rotations on ordered inputs (version 1 code) 80
3.7 Run time on random inputs using integer keys (version 1 code) .... 81
3.8 Run time on ordered inputs using integer keys (version 1 code) .... 82
3.9 Run time on random real inputs (version 1 code) 83
v
3.10 Run time on ordered real inputs (version 1 code) 84
3.11 The number of key comparisons on random inputs (version 1 code) . 87
3.12 The number of key comparisons on ordered inputs (version 1 code) . 88
3.13 Height of the trees on random inputs (version 1 code) 89
3.14 Height of the trees on ordered inputs (version 1 code) 89
3.15 The number of rotations on random inputs (version 1 code) 90
3.16 The number of rotations on ordered inputs (version 1 code) 91
3.17 Run time on random inputs using integer keys (version 1 code) .... 93
3.18 Run time on ordered inputs using integer keys (version 1 code) .... 94
3.19 Run time on random real inputs (version 1 code) 95
3.20 Run time on ordered real inputs (version 1 code) 96
3.21 The number of key comparisons on random inputs (version 2 code) . 99
3.22 The number of key comparisons on ordered inputs (version 2 code) . 100
3.23 Run time on random real inputs (version 2 code) 101
3.24 Run time on ordered real inputs (version 2 code) 103
4.1 The number of key comparisons 117
4.2 Number of levels 118
4.3 Run time 119
4.4 The number of key comparisons 124
4.5 Height/level of the structures 126
4.6 Run time using integer keys 127
4.7 The number of key comparisons 130
4.8 Height/level of the structures 132
4.9 Run time using integer keys
A.1 137
LIST OF FIGURES
2.1 Cell joining 5
2.2 l-layer river routing 8
2.3 Round robin and greedy layer assignments 11
2.4 Minimizing the number of tracks or layers 14
2.5 Constraint graph representation 18
2.6 Merge in constraint graph 19
2.7 Heuristic 1 20
2.8 Heuristic 2 22
2.9 Heuristic 3 24
3.1 LL and RL rotations 35
3.2 A tree in WB(1/4) that is not β-balanced 42
3.3 â€”balanced tree that is not a COST 43
3.4 LL rotation for insertion 45
3.5 Substep (i) of insertion LR rotation 46
3.6 Case LL for LR(ii) rotation 49
3.7 Case LR for LR(ii) rotation 51
3.8 LL rotation for deletion 58
3.9 LR rotation for deletion 60
3.10 Restructuring procedure 65
3.11 Simple restructuring procedure for insertion 68
3.12 Simple restructuring procedure for deletion 68
3.13 Simple restructuring procedure without a β value 70
3.14 Run time on real inputs (version 1 code) 85
3.15 Run time on random real inputs (version 1 code) 92
3.16 Run time on ordered real inputs (version 1 code) 97
3.17 Run time on random real inputs (version 2 code) 102
3.18 Run time on ordered real inputs (version 2 code) 102
4.1 Example minWBLTs 108
4.2 minWBLT Insert 109
4.3 minWBLT Deletemin 110
4.4 Skip Lists 111
4.5 Modified Skip Lists 113
4.6 MSL Search 113
4.7 MSL Insert 114
4.8 MSL Delete 114
4.9 Run time 120
4.10 TMSL Insert 121
4.11 TMSL Deletemin 121
4.12 TMSL Deletemax 122
4.13 Run time on random1 128
4.14 Run time on random2 129
4.15 Run time on random1 131
4.16 Run time on random2 133
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Doctor of Philosophy
EFFICIENT ALGORITHMS AND DATA STRUCTURES FOR VLSI CAD
By
Seonghun Cho
May 1996
Chairman: Dr. Sartaj Sahni
Major Department: Computer and Information Science and Engineering
In this dissertation, we develop efficient algorithms and data structures for problems
that arise in electronic computer aided design (ECAD).
We consider the problem of joining a row of compacted cells so as to minimize the
area occupied by the cells and the interconnects. The cell joining process includes
cell stretching and river routing. We propose several heuristics to join a row of
cells in such a way that area is minimized. The proposed heuristics are compared
experimentally with the previously proposed heuristic.
We develop a new class of weight balanced binary search trees called β-balanced
binary search trees (β-BBSTs). β-BBSTs are designed to have reduced internal path
length. As a result, they are expected to exhibit good search time characteristics.
Individual search, insert, and delete operations in an n node β-BBST take O(log n)
time for $0 < \beta \le \sqrt{2} - 1$. Experimental results comparing the performance of
β-BBSTs, WB(α) trees, AVL-trees, red/black trees, treaps, deterministic skip lists and
skip lists are presented. Two simplified versions of β-BBSTs are also developed.
We propose the weight biased leftist tree as an alternative to traditional leftist
trees for the representation of mergeable priority queues. A modified version of
skip lists that uses fixed size nodes is also proposed. Experimental results show
our modified skip list structure is faster than the original skip list structure for the
representation of dictionaries. Experimental results comparing weight biased leftist
trees and competing priority queue structures as well as experimental results for
double ended priority queues are presented.
CHAPTER 1
INTRODUCTION
1.1 Background
In VLSI layout, we are concerned with transforming a circuit from its logical design
to a physical implementation. The layout problem for VLSI circuits is generally
decomposed into smaller problems such as partitioning, floorplanning, placement,
routing and compaction.
The partitioning process decomposes a large circuit/module into a collection of
smaller subcircuits/modules. In floorplanning, logical components of a circuit are
assigned relative positions on a chip. The physical realization of each component (i.e.,
its area and aspect ratio) is also selected. The objectives of floorplanning include
overall area minimization, minimization of power consumption, etc. The precise
locations for the components of a design are then determined during the placement
process to optimize the area and the timing. After the components are placed, the
pins are connected during the routing process. During the process of compaction,
the components and interconnections are moved so as to further optimize the layout
in terms of area and delay.
The routing process is usually divided into three smaller subproblems: global
routing, detailed routing, and specialized routing. Global routing decomposes the
complex routing problem into small and manageable subproblems. It assigns each
net to a set of routing regions (such as channel, switchbox, and planar routing regions)
so as to minimize a combination of criteria such as area and circuit delay. Steiner trees
and spanning trees are the commonly used approaches for net connection in global
routing. Specialized routing is used to connect power and ground nets or clock nets.
Detailed routing is either general or restricted, and there are three types of
restricted detailed routing: channel, switchbox, and planar.
In channel routing, all terminals of nets are located in two parallel rows across
a routing region called channel. In switchbox routing, terminals of nets are located
on the four sides of the routing region. Planar routing is a problem in which the
interconnection topology of the nets is planar. That is, all connections can be realized
on a single layer. Vias allow wires to change layers but the presence of vias reduces
reliability and performance of a circuit. Single layer routing is not always possible.
River routing is a special case of planar routing in which all nets have exactly
two terminals, one on each side of the channel, and the net sequence on each side
of the channel is the same. River routing is used in PCB routing, particularly in
dataflow architectures with multibit buses connecting a series of logic blocks, and
symbolic IC design systems.
1.2 Dissertation Outline
This dissertation is divided into four chapters. In Chapter 2, we consider the
problem of joining a row of compacted cells so as to minimize the area occupied by
the cells and the interconnects. The cell joining process includes cell stretching and
river routing. We propose several heuristics to join a row of cells in such a way that
area is minimized. The proposed heuristics are compared experimentally with the
previously proposed one.
VLSI Physical Design Automation is essentially the study of algorithms and data
structures related to the physical design process. Specific data structures can be used
to improve the performance of algorithms. For example, maze routing algorithms and
line-probe algorithms in global routing [30] use search structures, retiming algorithms
[6, 29] use priority queue structures, and layout compaction algorithms [13] use radix
priority search trees.
Chapters 3 and 4 are about search and priority queue structures. Since specific
data structures can improve the performance of algorithms in VLSI design
automation, new search and priority queue structures are proposed and
thoroughly compared with other data structures.
Finally, in the last chapter, we present conclusions of this work.
CHAPTER 2
MINIMUM AREA JOINING OF COMPACTED CELLS
2.1 Introduction
When designing circuits with compacted symbolic sticks basic cells, the circuit
is realized by a collection of compacted cells that tile a two-dimensional area. The
intercell interconnects are such that each interconnect connects two terminals that
are on adjacent boundaries of neighboring cells. So, for example, if cells A and B
(Figure 2.1(a)) are neighboring cells of the circuit, then the right boundary of A
is adjacent to the left boundary of B. The number of terminals on each of these
boundaries will be the same and the i'th terminal (from the bottom) on the right
boundary of A is to be connected to the i'th terminal (from the bottom) on the left
boundary of B.
Since the cells are available in compacted form, it is not possible to reduce the
distance between any pair of terminals on any side of a cell. However, this distance
can be increased by stretching the cell. In the example of Figure 2.1(a), we can stretch
either cell vertically by defining a horizontal cut line at any position and pulling the
two cell pieces apart by any desired amount (the cell can also be stretched horizontally
by using a vertical cut line).
(a) Horizontal adjacent cells (b) Joining by stretching (c) Joining by river routing (d) Combination cell joining
Figure 2.1. Cell joining
The required interconnects between cells A and B of Figure 2.1(a) can be accomplished
by stretching cells A and B so that the terminals of A and B line up as
in Figure 2.1(b). The broken lines in Figure 2.1(a) indicate the cut lines used for
stretching. The stretching enables us to join cells A and B using no routing tracks
(by "join" we mean make the interconnects between cells A and B). This method of
joining cells is also called pitch matching.
Another way to join cells A and B is to river route the interconnects as in
Figure 2.1(c). This uses routing tracks in a channel between cells A and B but does
not increase cell height. The pitch matching and river routing approaches to cell
joining have been studied in Boyer [5] and Weste [33]. Algorithms for single-layer
river routing can be found in several works [15, 19, 23, 24] and those for multilayer
river routing can be found in Baratz [3]. Single-layer gridless river routing is studied in
Tompa [32]. Two applications of river routing are hybrid circuit design and structured
design (DSP).
Cell stretching (or pitch matching) increases the height of the layout while river
routing increases its width. Both affect the layout area. The layout of Figure 2.1(b)
has area 150. To compute the area of the layout of Figure 2.1(c), we assume tracks
have unit separation. So, the layout width is 14 and height is 11. The layout has area
154. Cheng and Despain [8] have proposed using a combination of cell stretching and
river routing so as to obtain layouts with smaller area than possible when only one
of these joining methods is used. Figure 2.1(d) shows the result of joining cells A
and B using both stretching and river routing. The area of this layout is 144. This
is minimum for the instance of Figure 2.1(a).
Cheng and Despain [8] have proposed a heuristic for single layer joining of
compacted cells. At each step of their heuristic either a row or column of compacted
cells is joined. Following this, the row or column of joined cells is replaced by a
composite cell that represents the result of joining. Notice that when a row (column)
of cells is joined, cells may be stretched vertically (horizontally) and river routing is
done in a vertical (horizontal) channel. To join a row of cells, Cheng and Despain [8]
bound the maximum height to which a cell may be stretched. This bound is
$h_{max} + h_{avg}^2 / (4 h_{max})$
where $h_{max}$ is the height of the tallest compacted cell being joined and $h_{avg}$ is the
average height of the cells being joined.
Using this bound, cells are joined one-at-a-time using a penalty/reward scheme
to determine if a pair of terminals is to be joined by stretching or by river routing.
Lim, Cheng, and Sahni [17] have considered the case when only two cells are to
be joined. They develop fast polynomial time algorithms to obtain the minimum area
join of two cells. In addition, they are able to obtain, in low order polynomial time,
minimum area joins that minimize the length of the longest wire or the total wire
length. Lim [16] has proposed an $O(n(n/c)^{c-1})$ algorithm to find the minimum area
join of c cells having a total of n terminals. This algorithm does an exhaustive search
over all possible numbers of tracks in the c - 1 routing channels between adjacent
cells. A constraint graph is used to determine the minimum height layout for each
assignment of number of tracks to routing channels. The time required per track
assignment is O(n) and the worst case number of track assignments is $O((n/c)^{c-1})$.
The algorithm of Lim [16] is flawed as it handles channels with zero routing tracks
by joining the adjacent cells using minimum height cell stretching and then considers
the joined cells as one. This problem is easily fixed, however, by combining, in the
constraint graph, pairs of vertices that represent corresponding terminals of the two
cells (i.e., the i'th terminals of each cell) with zero routing tracks in between.
In this chapter, we consider the case when $l \ge 1$ routing layers are available to
river route the intercell connections. Note that while multiple layers do not affect
layout area when cell stretching alone is used, a reduction in area is possible when
cell stretching is combined with river routing or when river routing alone is used.
We assume that in each layer of each routing channel, the interconnects are to be
accomplished using river routing. An alternative is to use HV routing when l = 2,
HVH or VHV routing when l = 3, and extensions of HVH and VHV routing for
l > 3. However, for river routing instances, using routing layers in this way has
no advantage over river routing in each layer (see Theorem 5, Section 2.2). When
the number of layers available for river routing is increased, one may see a dramatic
reduction in the number of routing tracks needed per layer. Figure 2.2 shows an
instance that needs n tracks when routed in one layer but only one track/layer when
routed in two layers.
(a) 1-layer (b) 2-layer
Figure 2.2. l-layer river routing
We begin, in Section 2.2, by stating the necessary and sufficient conditions for
a river routing instance to be routable in l layers using at most t tracks per layer
and stating how to perform l-layer river routing when such a routing is possible. In
this section, we also show that HV style routing has no advantage over river routing
in each layer. In Section 2.3, we describe the constraint graph used to determine
minimum height stretching of c cells. Heuristics for the minimum area joining of
c cells are proposed in Section 2.4 and the results of experiments with these are
provided in Section 2.5. Our conclusions appear in Section 2.6.
2.2 l-layer River Routing
Let $(A_i, B_i)$, $1 \le i \le m$, be a set of terminal pairs such that the $A_i$'s are on
one side (say left or top) of a routing channel and the $B_i$'s are on the other (right
or bottom) side. Terminal $A_i$ is to be connected to terminal $B_i$, $1 \le i \le m$. For
this channel routing instance to be an instance of river routing, it must be the case
that $a_1 < a_2 < \cdots < a_m$ and $b_1 < b_2 < \cdots < b_m$ where $a_i$ and $b_i$, respectively, give
the positions of terminals $A_i$ and $B_i$, $1 \le i \le m$. We may assume an underlying
grid with each terminal being at a grid position. In the case of a horizontal (vertical)
channel the $a_i$'s and $b_i$'s are grid column (row) numbers. Leiserson and Pinter [15] have
obtained the following necessary and sufficient condition for a river routing instance
to be routable in a single layer using at most t > 0 tracks.
Theorem 1 [15] The river routing instance defined above is routable in a single layer
using at most t > 0 tracks if and only if
(a) $a_{i+t} - b_i \ge t$
(b) $b_{i+t} - a_i \ge t$
for every $i \le m - t$.
For the general case of $l \ge 1$ layers, we obtain the necessary and sufficient
condition of Theorem 2.
Theorem 2 The river routing instance defined above is routable in $l \ge 1$ layers (each
layer routing whole nets) using at most t > 0 tracks per layer if and only if
(a) $a_{i+lt} - b_i \ge t$
(b) $b_{i+lt} - a_i \ge t$
for every $i \le m - lt$.
Proof: First, we establish that (a) and (b) are necessary for l layer routing. Since
the proofs for (a) and (b) are similar, we provide that for (a) only. Suppose that
$a_{i+lt} - b_i < t$ for some i. Consider the lt + 1 terminal pairs $(A_j, B_j)$, $i \le j \le i + lt$.
When routing these on l layers, at least one layer has to be assigned $\ge t + 1$ terminal
pairs. So, suppose that terminal pairs $(A'_1, B'_1), (A'_2, B'_2), \ldots, (A'_{t+1}, B'_{t+1}), \ldots$ are
assigned to the same layer for river routing. We may assume that $a'_1 < a'_2 < \cdots < a'_{t+1}$
and $b'_1 < b'_2 < \cdots < b'_{t+1}$. Since $a'_{t+1} \le a_{i+lt}$ and $b'_1 \ge b_i$,
$a'_{t+1} - b'_1 \le a_{i+lt} - b_i < t$.
From Theorem 1, it follows that the terminal pairs $(A'_j, B'_j)$, $1 \le j \le t + 1$, cannot be
river routed on a single layer. Hence, $(A_j, B_j)$, $i \le j \le i + lt$, cannot be river routed
on l layers. So, $(A_j, B_j)$, $1 \le j \le m$, cannot be river routed on l layers. As a result,
(a) is a necessary condition.
To show that (a) and (b) are sufficient conditions for routability, we present two
algorithms (RoundRobin and Greedy) that assign the nets to layers in such a way
that each layer is river routable when both (a) and (b) are satisfied. The correctness
of these algorithms is established in Theorems 3 and 4, respectively. □
procedure RoundRobin ;
{ Assign the m nets to l layers. }
begin
  for i := 1 to m do
    assign net (A_i, B_i) to layer (i mod l) + 1 ;
end ;
procedure Greedy ;
{ Assign the m nets to l layers. }
begin
  for i := 1 to m do
    assign net (A_i, B_i) to layer q such that q is the smallest integer <= l
    for which the conditions of Theorem 1 are not violated on layer q
    (if there is no such q, then fail) ;
end ;
Figure 2.3. Round robin and greedy layer assignments
We later discovered that Baratz [3] has not only obtained the same condition
but also proposed the same two algorithms for l-layer river routing. One assigns
nets to layers in a round robin fashion and the other uses a greedy strategy. The
corresponding procedures are given in Figure 2.3.
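The two layer assignment procedures can be sketched in Python as follows. This is an illustrative sketch rather than the dissertation's code: the function names, the 0-based position lists a and b, and the helper fits are assumptions introduced here. The helper relies on the observation that appending a net to a layer creates only one new pair of nets that are t places apart, so only that pair needs to be tested against Theorem 1.

```python
def round_robin(m, l):
    """Layer (1..l) of each net i = 1..m, following procedure RoundRobin."""
    return [(i % l) + 1 for i in range(1, m + 1)]


def fits(layer, i, a, b, t):
    """Would appending net i keep this layer routable in t tracks (Theorem 1)?"""
    if t == 0:
        return a[i] == b[i]        # zero tracks: the terminals must line up
    if len(layer) < t:
        return True                # no net lies t places before net i yet
    p = layer[-t]                  # the net t places before net i on this layer
    return a[i] - b[p] >= t and b[i] - a[p] >= t


def greedy(a, b, l, t):
    """Layer (1..l) of each net, or None if some net fits on no layer."""
    layers = [[] for _ in range(l)]    # net indices placed on each layer so far
    assignment = []
    for i in range(len(a)):
        for q, layer in enumerate(layers):
            if fits(layer, i, a, b, t):
                layer.append(i)
                assignment.append(q + 1)
                break
        else:
            return None                # procedure Greedy fails
    return assignment


# Hypothetical instance: four nets, each shifted four columns to the right.
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
print(round_robin(len(a), 2))
print(greedy(a, b, 2, 2))
```

The greedy sketch checks the l layers in sequence, which corresponds to the straightforward implementation rather than one based on priority search trees.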
Theorem 3 The layer assignment produced by the RoundRobin procedure is river
routable if
(a) $a_{i+lt} - b_i \ge t$
(b) $b_{i+lt} - a_i \ge t$
for all $i \le m - lt$.
Proof: Let $(A'_1, B'_1), (A'_2, B'_2), \ldots = (A_j, B_j), (A_{j+l}, B_{j+l}), (A_{j+2l}, B_{j+2l}), \ldots$ be the
nets assigned to layer (j mod l) + 1, $j \le l$. So, $a'_i = a_{j+(i-1)l}$ and $b'_i = b_{j+(i-1)l}$.
Hence,
$a'_{i+t} - b'_i = a_{j+(i+t-1)l} - b_{j+(i-1)l}$
$= a_{j+(i-1)l+tl} - b_{j+(i-1)l}$
$\ge t$ (from (a))
Similarly, $b'_{i+t} - a'_i \ge t$. So, the layer assignment satisfies the conditions of Theorem 1
and is river routable using t tracks. □
Theorem 4 If (a) $a_{i+lt} - b_i \ge t$ and (b) $b_{i+lt} - a_i \ge t$ for all $i \le m - lt$, then procedure
Greedy assigns nets to layers such that the assignment to each layer is routable using
t tracks.
Proof: If procedure Greedy is able to assign each of the m nets to a layer, then the
layer assignments satisfy the conditions of Theorem 1 and so are routable using t
tracks. Suppose the algorithm fails while trying to assign net $(A_r, B_r)$ to a layer. At
this time nets $(A_i, B_i)$, $1 \le i < r$, have been assigned to layers so as to satisfy the
conditions of Theorem 1 and the assignment of net $(A_r, B_r)$ to each of these layers
violates these conditions. Consider first those layers, $L_a$, on which condition (a) is
violated. For a layer $s \in L_a$, suppose that the assigned nets are $\ldots, (A'_{j-t}, B'_{j-t}),
\ldots, (A'_{j-1}, B'_{j-1})$. Let $(A'_j, B'_j) = (A_r, B_r)$. Since $s \in L_a$, we have $a'_j - b'_{j-t} < t$.
Now, if $b_{r-lt} \ge b'_{j-t}$, then $a_r - b_{r-lt} < t$ which violates condition (a) of this theorem.
So, $b_{r-lt} < b'_{j-t}$. Since $b_{r-lt} < b'_{j-t} < \cdots < b'_{j-2} < b'_{j-1}$, t of the lt - 1 nets
$(A_{r-lt+1}, B_{r-lt+1}), \ldots, (A_{r-1}, B_{r-1})$ have been assigned to layer $s \in L_a$. Consequently,
the layers in $L_a$ account for $t|L_a|$ of these lt - 1 nets.
In a similar way, we can show that the remaining $l - |L_a|$ layers account for
another $t(l - |L_a|)$ of these nets. This gives us a total of tl nets, whereas we had only
tl - 1. This contradiction implies that procedure Greedy cannot fail unless conditions
(a) and (b) are not satisfied. □
Procedure RoundRobin is easily seen to have complexity O(m). A straightforward
implementation of procedure Greedy will have complexity O(ml). However,
by using priority search trees [18] the complexity can be reduced to O(m log l). In
practice, since l is quite small, it is unlikely that the priority search tree implementation
will run faster than the straightforward implementation in which the l layers
are checked in sequence. The actual routing for all l layers can be done in O(mt)
time using the computed layer assignment and the single layer routing algorithm of
Leiserson and Pinter [15].
Using Theorem 2, we can develop a linear time algorithm to determine the
minimum number of tracks needed to route an instance in l layers as well as to find
the minimum number of layers needed for a t track routing. The algorithm for the
former is given in Figure 2.4. This figure also shows the changes needed in case t is
given and we wish to determine the minimum number of layers. The correctness of
the algorithm follows from that of Theorem 2 and the fact that t (or l) is increased
only if the current t (l) is found to be infeasible. The complexity is O(m) as neither
i nor t (l) can exceed m. So, neither clause of the if statement can be executed more
than m - 1 times.
procedure MinimizeTracks ; {or MinimizeLayers}
{ Determine the minimum number of tracks per layer (or minimum number
  of layers) needed for multilayer river routing }
begin
  t := 0 ; {or l := 1}
  i := 1 ;
  while (i <= m - lt) do
    if (a[i+lt] - b[i] < t) or (b[i+lt] - a[i] < t)
    then t := t + 1 {or l := l + 1}
    else i := i + 1 ;
end ;
Figure 2.4. Minimizing the number of tracks or layers
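As an illustration, the track minimization procedure and the Theorem 2 test can be written in Python roughly as below. The function names, the 0-based lists, and the sample instance are assumptions made for this sketch; the loop itself follows procedure MinimizeTracks.

```python
def routable(a, b, l, t):
    """Theorem 2 test for 0-based, strictly increasing position lists a and b."""
    k = l * t
    return all(a[i + k] - b[i] >= t and b[i + k] - a[i] >= t
               for i in range(len(a) - k))


def min_tracks(a, b, l):
    """Smallest t for which the instance passes the Theorem 2 test on l layers."""
    m, t, i = len(a), 0, 0
    while i + l * t < m:               # 0-based form of i <= m - lt
        k = l * t
        if a[i + k] - b[i] < t or b[i + k] - a[i] < t:
            t += 1                     # the current t is infeasible
        else:
            i += 1                     # net i is satisfied, move on
    return t


# Hypothetical instance: four nets, each shifted four columns to the right.
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
print(min_tracks(a, b, 1), min_tracks(a, b, 2))   # prints "4 2"
```

For this instance a single layer needs four tracks while two layers need two tracks per layer. Neither i nor t ever decreases, which is why the loop runs in time linear in m.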
Using the multilayer river routing results of Baratz [3], one can trivially extend
all the results of Lim, Cheng and Sahni [17] to the case of multilayer joining of
compacted cells. So, the multilayer minimum area join of two compacted cells with
m nets can be obtained in $O(m^2)$ time. If we wish to minimize the maximum wire
length while keeping area minimum, the asymptotic time complexity is still $O(m^2)$.
The total wire length can be minimized while keeping area minimum in $O(m^2 \log m)$
time.
In HV style routing, each routing layer is assigned a routing direction (either
H or V). In an H (V) layer only horizontal (vertical) wire segments can be laid out.
Horizontal segments on one layer connect to vertical segments, of the same net, on
another layer by means of vias. In the case of river routing instances, one can see
that there is no advantage to having more than two V-layers (i.e., two V-layers are
sufficient to route all river routing instances).
Let RR(l, t) be the set of all river routing instances that can be routed in l
layers, using t tracks per layer and using river routing in each layer. Let HV(l, t) be
all river routing instances that can be routed using HV style routing, l layers, and
t tracks per layer. Note that HV(l, t) includes instances routable with 0, 1, and 2
V-layers. Let HVV(l, t) be all river routing instances using HV style routing, l - 2
H-layers, and 2 V-layers. Theorems 5 and 6 below hold for both the knock-knee [25]
and directional HV models. Theorem 7 holds only for the directional model.
Theorem 5 $HV(l, t) \subset RR(l, t)$ for every $l \ge 1$ and every $t \ge 1$.
Proof: $HV(l, t) \subseteq RR(l, t)$ follows from a more general result obtained by Baratz [3].
Baratz [3] has shown that, for river routing instances, there is no advantage to using
any routing scheme that wires a net on more than one layer. Since it is easy to
construct river routing instances X such that $X \in RR(l, t)$ and $X \notin HV(l, t)$, it
follows that $HV(l, t) \subset RR(l, t)$.
We provide a simpler proof of $HV(l, t) \subseteq RR(l, t)$. This proof will also establish
our next result. We shall show that if X is a river routing instance such that
$X \notin RR(l, t)$, then $X \notin HV(l, t)$. Hence, $HV(l, t) \subseteq RR(l, t)$.
Suppose that $X \notin RR(l, t)$. From Theorem 2, it follows that $a_{i+lt} - b_i < t$ or
$b_{i+lt} - a_i < t$ for some i. Suppose that $a_{i+lt} - b_i < t$ (the proof is similar when
$b_{i+lt} - a_i < t$). So, $a_{i+lt} < b_i + t$. Since X is a river routing instance, at least nets
$i + t, \ldots, i + lt$ intersect a vertical cut line drawn at $a_{i+lt}$. Hence, the density of X
at $a_{i+lt}$ is $\ge i + lt - (i + t) + 1 = (l - 1)t + 1$. When HV style routing is used with
l, $l \ge 1$, layers, at most l - 1 layers are available for horizontal routes. With t tracks
per layer, densities of at most $(l - 1)t$ can be accommodated. So, $X \notin HV(l, t)$. □
Theorem 6 $HVV(l, t) \subset RR(l - 1, t)$ for every $l \ge 2$ and every $t \ge 1$.
Proof: As in Theorem 5, suppose that $X \notin RR(l - 1, t)$. Let i be such that
$a_{i+(l-1)t} - b_i < t$. The net density at $a_{i+(l-1)t}$ is $\ge (l - 2)t + 1$. In HVV routing, two layers
are V-layers. So only l - 2 layers are available for horizontal segments. This is not
enough as the horizontal segment density is $\ge (l - 2)t + 1$ at $a_{i+(l-1)t}$. Hence,
$X \notin HVV(l, t)$.
One may easily construct river routing instances that are in RR(l - 1, t) but not
in HVV(l, t). □
Theorem 7 $RR(2, t) \not\subset HV(l, t) - HVV(l, t)$ for every $l \ge 1$ and every $t \ge 1$.
Proof: Consider the RR instance $(a_1, b_1) = (1, 2)$ and $(a_2, b_2) = (2, 3)$. This is in
RR(2, t) for every $t \ge 1$ but is not in HV(l, t) - HVV(l, t) for any l. □
As remarked earlier, Theorem 7 holds only for the directional model. For the
knock-knee model, one can show that $RR(l, t) \subseteq HV(l + 1, t)$ for every $l \ge 1$ and
every $t \ge 1$.
2.3 Constraint Graph Representation
Lim [16] has proposed the use of a constraint graph to determine the terminal
positions in a row of compacted cells. This is for the case when the number of tracks
in each routing channel is given and we wish to minimize the layout height. In the
constraint graph, each cell is represented by a directed chain of vertices. Each cell
terminal is represented by a vertex. The exception is when a compacted cell has
terminals at the same y-position on both sides of the cell. In this case, the two
terminals at the same y-position are represented by a single vertex. The vertex chain
is linked in the direction of increasing y-position. The chain edges are labeled by the
minimum allowable terminal separation. In addition, the constraint graph contains
a source vertex that represents the bottom of the layout and a sink vertex that
represents the layout top. The source vertex connects to the bottom of each chain
and the top of each chain is connected to the sink vertex.
Figure 2.5(b) shows the chains (solid edges) for the four cell row of Figure 2.5(a).
To complete the constraint graph, directed edges are added to introduce the channel
routing constraints of Theorem 2. These are represented by the broken edges of
Figure 2.5(b). Figure 2.5(b) is for the two layer case.
Lim [16] has shown that the constraint graph is acyclic provided the number
of tracks in each routing channel is > 0. He has proposed handling channels with
zero tracks by first finding the minimum area joining of the adjacent cells (only cell
stretching is permitted now) and then combining these two cells into one; i.e., the two
cells are replaced by their minimum area join. This strategy can be shown to result
in nonoptimality of the algorithm proposed in Lim [16]. To preserve optimality, it
is necessary to merge the vertices that represent terminals that are the endpoints of
nets that are to be routed using no tracks as in Figure 2.6. The resultant constraint
graph is also acyclic.
Figure 2.5. Constraint graph representation
It is easy to see that the number of vertices and edges in the constraint graph
is O(n) where n is the total number of terminals. Furthermore, the graph can be
constructed in O(n) time given the number of routing layers and the number of tracks
in each channel. The constraint graph described by us is identical to that of Lim [16]
except in the way channels with zero tracks are handled and in that our graph is
defined for $l \ge 1$ routing layers while that of Lim [16] is only for l = 1.
Figure 2.6. Merge in constraint graph
The length of the longest path from the source vertex of the constraint graph
to each of the remaining vertices can be computed in O(n) time by doing this in
topological order [14, Section 6.5]. It is easy to see that if each terminal is placed
at a vertical position given by the longest path length from the source, then all nets
can be routed in the given number of tracks (as the conditions of Theorem 2 are
satisfied in each routing channel). Furthermore, Lim [16] has shown that such a
positioning of terminals results in a stretched layout of minimum height for the given
channel widths. As a result, when channel widths are known, cells can be stretched
to minimize area in O(n) time. The channel widths that result in minimum area can
be determined in $O(n(n/c)^{c-1})$ time where c is the number of cells by trying out all
possible channel widths [16]. Since this is feasible only for small c, we propose several
heuristics in the next section.
procedure Heuristic1 ;
begin
  for i := 1 to c - 1 do
  begin
    determine the minimum area join of each pair of adjacent cells ;
    select the pair that has minimum area and replace it with its
    minimum area join ;
  end ;
end ;
Figure 2.7. Heuristic 1
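The longest path computation on the constraint graph can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the edge-list representation, the function name, and the small example graph are hypothetical. An edge (u, v, w) is read here as the constraint that vertex v must lie at least w units above vertex u, so a routing condition such as $a_{i+lt} - b_i \ge t$ would appear as an edge of weight t from the vertex for $B_i$ to the vertex for $A_{i+lt}$.

```python
from collections import deque


def longest_paths(n, edges, source=0):
    """Longest path length from source to every vertex of a DAG on 0..n-1."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    dist = [float("-inf")] * n
    dist[source] = 0
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:                       # process vertices in topological order
        u = queue.popleft()
        for v, w in adj[u]:
            dist[v] = max(dist[v], dist[u] + w)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return dist


# Hypothetical graph: source 0, sink 5, chains 1 -> 2 and 3 -> 4 for two
# cells, and one routing constraint edge from vertex 1 to vertex 4.
edges = [(0, 1, 1), (1, 2, 2), (2, 5, 1),
         (0, 3, 2), (3, 4, 1), (4, 5, 1),
         (1, 4, 3)]
print(longest_paths(6, edges))
```

The value computed for the sink is the minimum layout height for the given channel widths, and the other values are the terminal positions. Each vertex and edge is handled once, in line with the O(n) bound stated above.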
2.4 Heuristics to Minimize Area
We formulate three greedy heuristics to obtain the minimum area join of a row
of c compacted cells that have a total of n terminals.
2.4.1 Heuristic 1
The heuristic is described in Figure 2.7.
In each iteration of the for loop we examine every pair of adjacent cells. For
each pair, the minimum area join is found using the algorithm of Lim, Cheng and
Sahni [17] extended to the multilayer case as discussed in Section 2.2. The pair which
has the minimum area join is replaced by a single cell that represents this join. So
following each iteration of the for loop the number of cells decreases by one. When
the for loop terminates, we are left with a single cell that represents the join of
all c cells. The time needed to determine the minimum area join of a pair of cells
with $n_i$ nets between them is $O(n_i^2)$. The time to do this for all pairs of adjacent
cells is $O(\sum_i n_i^2) = O(n^2)$. So, the for loop iteration with i = 1 takes $O(n^2)$ time.
On subsequent iterations, only the two pairs that include the cell introduced in the
previous iteration need to have their minimum area join computed. Since each cell
pair being considered includes at least one composite cell, the minimum area join
is computed by considering the portion of the constraint graph that represents all
the basic cells in the cell pair. Channel widths for channels within a composite cell
are not changed while obtaining the minimum area join of the cell pair. However,
as different channel widths for the channel between the two (composite) cells being
joined are tried, the constraint graph is used to determine the minimum height of the
combined cell. So, the time to combine two (composite) cells with $n_i$ terminals in the
channel between them is $O(n n_i)$. Hence the time for the remaining c - 2 iterations
is $O(n \sum_i n_i) = O(n^2)$. The overall complexity of Heuristic 1 is therefore $O(n^2)$. In
case the terminals are uniformly distributed over the cells, $n_i = O(n/c)$ for all i. The
time for the first iteration of the for loop is now $O(n^2/c)$ and that for each of the
remaining iterations is $O(n^2/c)$. The overall time is $O(n^2)$.
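The control flow of Heuristic 1 can be sketched in Python as below. The join and area callbacks are hypothetical stand-ins for the minimum area join of two cells; the toy versions here treat a cell as a (width, height) pair and are not the join algorithm of [17]. For brevity the sketch recomputes every adjacent pair on each iteration, whereas only the two pairs touching the newly formed composite cell actually need recomputation.

```python
def heuristic1(cells, join, area):
    """Repeatedly replace the adjacent pair whose join has minimum area."""
    cells = list(cells)
    while len(cells) > 1:
        joins = [join(cells[k], cells[k + 1]) for k in range(len(cells) - 1)]
        k = min(range(len(joins)), key=lambda idx: area(joins[idx]))
        cells[k:k + 2] = [joins[k]]    # the composite cell replaces the pair
    return cells[0]


# Toy model (an assumption for illustration): a cell is (width, height), a
# join adds one routing track of width, and the taller cell sets the height.
def toy_join(x, y):
    return (x[0] + y[0] + 1, max(x[1], y[1]))


def toy_area(cell):
    return cell[0] * cell[1]


print(heuristic1([(2, 3), (1, 5), (4, 2)], toy_join, toy_area))
```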
2.4.2 Heuristic 2
In this heuristic, we begin by assigning each channel the number of tracks needed
to route the channel with no cell stretching. This number can be determined in $O(n_i)$
time for a channel with $n_i$ nets as described in Section 2.2. The time taken to do
this for all c - 1 channels is O(n). The configuration obtained in this way is the
maximum width layout. Starting from this configuration, we reduce the total number
of tracks available across all c - 1 channels by one on each iteration. For this, the
effect of a one track reduction is computed for each channel. The minimum layout
height is determined by computing the length of the longest path in the constraint
graph of Section 2.3. The track reduction is done in the channel that results in the
smallest layout height (hence the minimum area for the given number of tracks). The
algorithm is stated more formally in Figure 2.8.
procedure Heuristic2 ;
begin
  for each channel determine the number of tracks, t_i,
  needed to route with no stretching, 1 <= i < c ;
  t := sum of t_i, 1 <= i < c ;
  set up the constraint graph using t_i tracks in channel i, 1 <= i < c ;
  compute layout area, A ;
  for tracks := t downto 1 do {reduce by 1}
  begin
    for i := 1 to c - 1 do
    begin
      reduce the number of tracks in channel i by 1 ;
      modify the constraint graph to reflect this ;
      determine the length of the longest path in the graph
      and from this the layout area, a_i ;
    end ;
    select j such that a_j = min{ a_i } ;
    reduce the number of tracks in channel j by 1 ;
    A = min{ A, a_j } ;
  end ;
end ;
Figure 2.8. Heuristic 2
When the algorithm terminates, A is the area of the minimum area join found
by the heuristic. To reconstruct the layout, it is necessary to store the tracks per
channel each time A is updated in the statement A = min{ A, a_j }.
For the time complexity, we see that the steps that precede the outer for loop
take $O(n)$ time. Each iteration of the outer loop takes $O(nc)$ time. Hence this loop
contributes a total of $O(nct)$ to the time. Since $t = O(n)$, the overall time complexity
of Heuristic 2 is $O(n^2 c)$.
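The greedy loop of Heuristic 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the constraint graph and longest path computation are hidden behind a caller-supplied function `layout_area` (a hypothetical name, not from the thesis) that maps a list of per-channel track counts to a layout area.

```python
from typing import Callable, List, Tuple


def heuristic2(initial_tracks: List[int],
               layout_area: Callable[[List[int]], int]) -> Tuple[int, List[int]]:
    """Greedy one-track-at-a-time reduction in the spirit of Heuristic 2.

    initial_tracks[i] is the number of tracks channel i needs with no
    stretching. layout_area is an assumed stand-in for the longest-path
    computation on the constraint graph.
    """
    tracks = list(initial_tracks)
    best_area = layout_area(tracks)
    best_tracks = list(tracks)  # remembered so the layout can be rebuilt

    # The total number of tracks drops by one on every outer iteration.
    for _ in range(sum(initial_tracks)):
        trials = []
        for i, t_i in enumerate(tracks):
            if t_i == 0:
                continue  # nothing left to remove from this channel
            trial = list(tracks)
            trial[i] -= 1  # tentative reduction only
            trials.append((layout_area(trial), i))
        area_j, j = min(trials)  # channel whose reduction costs least
        tracks[j] -= 1
        if area_j < best_area:
            best_area, best_tracks = area_j, list(tracks)
    return best_area, best_tracks
```

Each outer iteration evaluates at most $c - 1$ tentative reductions, which mirrors the $O(nc)$ cost per iteration when one area evaluation takes $O(n)$ time.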
2.4.3 Heuristic 3
Unlike Heuristic 2 which attempts to minimize the layout height for each value
of t, the total number of tracks, Heuristic 3 attempts to minimize the width (i.e.,
total number of tracks) for each choice of layout height. The heuristic begins with a
layout height, ht, equal to the height of the tallest compacted cell. At each iteration,
the next layout height to use is computed as described later. During each iteration,
cells are combined in groups of at most k (k > 1 is a parameter to the heuristic).
Each group of combined cells is replaced by its minimum area join subject to the
constraint that the height of the join does not exceed ht. This joining of at most k cells at
a time continues until only one cell remains. Its area is computed and recorded. The
minimum area obtained over all heights tried is then reported as the best. Heuristic
3 is given in Figure 2.9.
In our implementation of Heuristic 3, the minimum area join of k cells is found
by considering the portion of the constraint graph for all the basic cells included in
these k cells. So, for this purpose composite cells are not handled as single cells.
Rather, as in Heuristic 1, the basic cells they are composed of are considered and
channel widths previously assigned to the associated channels are not changed. Track
assignment is done only for the $k - 1$ channels between the k composite cells. We
found this to give better results than when composite cells were regarded as atomic.
procedure Heuristic3 ;
begin
  ht := height of the tallest cell ;
  repeat { minimize width subject to height <= ht }
    repeat { do this by combining k cells at a time }
      select k adjacent cells such that the minimum height cell is selected
          and the height of the tallest selected cell is minimum
          (if there are fewer than k cells, then select all of them) ;
      obtain the minimum area layout for the selected cells under
          the constraint that the layout height does not exceed ht ;
      during the preceding step record the next value of ht
          that is possible for a layout ;
    until one cell remains ;
    compute the area of the remaining cell and record it
        if it is less than the minimum area found so far ;
    if there is no next height then terminate ;
    ht := next height ;
  until false ;
end ;
Figure 2.9. Heuristic 3
For the case k = 2, the minimum area is determined by a binary search over the
number of tracks in the single channel. This takes $O(n \log n_i)$ time where $n_i$ is the
number of nets in channel i. Thus the time needed for the inner repeat loop when
k = 2 is $O(cn \log n)$ (for uniform terminal distribution it is $O(cn \log(n/c))$). During
the binary search, the heights corresponding to channel widths that require height
> ht are recorded. The minimum of these heights yields the next value of ht.
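The binary search used for k = 2 can be illustrated with a small Python sketch. It assumes that the layout height is a nonincreasing function of the number of tracks in the channel, and it hides the constraint graph evaluation behind a caller-supplied function `height_for` (a hypothetical name introduced here for illustration).

```python
from typing import Callable, Optional


def min_tracks_within_height(max_tracks: int,
                             height_for: Callable[[int], int],
                             ht: int) -> Optional[int]:
    """Smallest track count in [0, max_tracks] whose layout height is <= ht.

    Assumes height_for is nonincreasing in the number of tracks.
    Returns None when even max_tracks tracks exceed the height limit.
    """
    if height_for(max_tracks) > ht:
        return None
    lo, hi = 0, max_tracks
    while lo < hi:
        mid = (lo + hi) // 2
        if height_for(mid) <= ht:
            hi = mid        # mid is feasible, try fewer tracks
        else:
            lo = mid + 1    # mid is infeasible, need more tracks
    return lo
```

With at most $n_i$ candidate track counts and one $O(n)$ height evaluation per probe, this gives the $O(n \log n_i)$ bound quoted above.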
When k > 2, all track combinations for the $k - 1$ channels are tried as in
Section 2.3. Again, each composite cell is broken up into its basic cells. As different
track combinations are tried, we record the minimum height > ht that results from
any track combination. This gives the next value of ht. The time for the inner
repeat loop is $O((c/(k-1))\, n\, (n/k)^{k-1})$ (or $O((c/(k-1))\, n\, (n/c)^{k-1})$ when terminals
are uniformly distributed).
In all our experiments, the outer repeat loop was iterated fewer than $(k - 1)n$
times. To ensure that the number of iterations is $O(kn)$, one may adopt the following
scheme. When the number of iterations first reaches $(k - 1)n$, compute a set of at
most n new heights by beginning with the current constraint graph. This uses the
current assignment of number of tracks in each channel. Heuristic 2 is next used to
reduce the total number of available tracks by one and determine the height needed
to complete the routing with the reduced number of tracks. This process gives us
at most n new heights $h_1 < h_2 < \ldots < h_p$. Heuristic 3 is now resumed with $h_1$ as
the next height. Only two iterations are performed. Then Heuristic 3 is resumed
with max{ $h_2$, ht } as the next height. Again two iterations of the outer repeat loop
are done. Next the heuristic is resumed with max{ $h_3$, ht } as the next height. This
continues until we have gone through p resumptions of the heuristic. With this scheme
to limit the number of iterations, the complexity of Heuristic 3 becomes $O(cn^2 \log n)$
when k = 2 and $O((c/(k-1))\, k n^2 (n/k)^{k-1}) = O(cn^2 (n/k)^{k-1})$ when k > 2. For the
case when the n terminals are uniformly distributed over the c cells, the complexity
is $O(cn^2 \log(n/c))$ when k = 2 and $O((c/(k-1))\, k n^2 (n/c)^{k-1}) = O(cn^2 (n/c)^{k-1})$ when
k > 2. One may verify that since Heuristic 3 tries the maximum useful height (i.e.,
the height needed when no routing tracks are available), it generates optimal solutions
when k = c.
2.5 Experimental Results
We programmed our three heuristics as well as the heuristic of Fang [8] in C and
ran tests on a single KSR processor. Optimal solutions for instances with up to nine
cells were obtained using the corrected version of the exhaustive search algorithm of
Lim [16]. Our test set consisted of instances that had a number of cells, c, equal to
one of the numbers in the set {3, ..., 9, 10, 20, 50, 100}. For each value of c, there
were twenty instances and the results were averaged over these instances. An instance
with c cells had $c - 1$ routing channels. The number, t, of terminals on either side of
each routing channel was equal to c for $3 \le c \le 9$ and was 10 for the other values of
c. In addition, when c = 100, we also had instances with 20 terminals on either side.
In our experiments, we considered only single layer and two layer routing.
Table 2.1 gives the average percentage by which the area of the single layer
solutions generated by each of the heuristics exceeded the area of the single layer
optimal solution. As is evident, each of the heuristics proposed in this chapter gave
noticeably better solutions than did Fang. This table is only for the cases $3 \le c \le 9$
as for c > 9 the optimal algorithm of Lim [16] required too much time to complete.
When $k \ge c$, Heuristic 3 is guaranteed to generate an optimal solution. So, we did
not run these cases.
In table 2.2, we have used the single layer solution produced by Fang as the
benchmark against which the solutions obtained by our three heuristics are compared.
This table gives the average percentage by which the area of the solutions produced
by our heuristics is less than that of the solutions produced by Fang. Our solutions
have area 9 to 18% less.
Table 2.3 compares the computing time requirements of the various algorithms
for the case of one layer. The optimal algorithm is useful only for small values of c
(say up to 7). While Fang is significantly faster than the heuristics proposed here,
the quality of the solutions generated by our heuristics is superior.

Table 2.1. Error rate (%) over optimal, l = 1
t = number of terminals on each side of each routing channel
* : $k \ge c$

Table 2.2. Improvement (%) over Fang, l = 1
cells    t   Heuristic 1   Heuristic 2   Heuristic 3
                                         k = 2    k = 3    k = 4
   10   10       9.4          14.0        14.0     14.2     14.3
   20   10      10.5          15.4        15.4     15.6     15.6
   50   10      10.6          16.1        16.1     16.2       -
  100   10       9.1          16.5        16.5     16.6       -
  100   20       9.3          18.3        18.0       -        -
- : excessive run time

Table 2.3. Time taken, l = 1
cells    t   Fang   Heuristic 1   Heuristic 2   Heuristic 3                Optimal
                                                k = 2    k = 3    k = 4
    3    3   0.0       0.01          0.02        0.02      *        *       0.01
    4    4   0.0       0.02          0.02        0.04     0.09      *       0.05
    5    5   0.0       0.03          0.04        0.09     0.36     1.2      0.77
    6    6   0.0       0.03          0.08        0.21     0.66     3.7      13
    7    7   0.0       0.04          0.14        0.50     3.4      27       278
    8    8   0.0       0.07          0.24        0.84     4.4      34       1.8†
    9    9   0.0       0.09          0.42        1.8      16       72.3     47†
   10   10   0.0       0.14          0.78        3.1      19       334       -
   20   10   0.01      0.32          5.8         16       117      1196      -
   50   10   0.01      1.4           93          127      896       -        -
  100   10   0.03      4.4           782         590      4873      -        -
  100   20   0.06      11            3022        4087      -        -        -
Times are in seconds.
† : Times are in hours.
Table 2.4 is the analog of table 2.1 for the case of two layers. Again, our heuristics
performed considerably better than did Fang. Table 2.5 gives the improvement in
area due to increasing the number of routing layers from one to two. This is influenced
somewhat by the width of cells which in our case ranged from 5 to 30 times the track
separation. With narrower cells, the impact of the second layer would have been
greater and with wider cells, it would have been less. Also, the impact of the second
layer is more when more routing tracks are needed. For the smaller instances of
table 2.4, for example, the optimal solutions with l = 2 required, on average, only
1.8% less area than when l = 1.

Table 2.5. Improvement (%) over l = 1 cases
cells    t   Fang   Heuristic 1   Heuristic 2   Heuristic 3
                                                k = 2    k = 3    k = 4
   10   10   4.1       4.7           3.3         3.3      3.0      3.0
   20   10   4.1       5.8           3.4         3.4      3.3      3.2
   50   10   4.6       5.2           3.4         3.2      3.2       -
  100   10   4.7       6.3           3.5         3.4      3.4       -
  100   20   7.1      10.4           4.8         5.2       -        -

Table 2.6. Improvement (%) over Fang, l = 2
cells    t   Heuristic 1   Heuristic 2   Heuristic 3
                                         k = 2    k = 3    k = 4
   10   10      10.0          13.3        13.2     13.2     13.3
   20   10      12.2          14.8        14.8     14.9     14.8
   50   10      11.2          15.1        14.9     15.0       -
  100   10      10.7          15.4        15.4     15.4       -
  100   20      12.5          16.3        16.3       -        -

Table 2.6 is the analog of table 2.2 for the case of two layers. The results are
similar to those in table 2.2. Table 2.7 gives the average computing times for the two
layer instances. These are less than for the one layer case as the constraint graph has
fewer edges.
For large c, we recommend the use of Heuristic 2 or 3 (with k = 2) and for small
c we recommend using Heuristic 3 (with k = 3 or 4).
2.6 Conclusion
We have considered the problem of joining a row of compacted cells and developed
heuristics to stretch cells and river route the nets so that the layout area is
minimized. Our proposed heuristics were compared, experimentally, with Fang [8] and
found to produce layouts with less area. However, Fang is faster. We recommend the
use of our Heuristic 3 with k = 3 or 4 in practice.

Table 2.7. Time taken, l = 2
cells    t   Fang   Heuristic 1   Heuristic 2   Heuristic 3                Optimal
                                                k = 2    k = 3    k = 4
    3    3   0.0       0.01          0.02        0.02      *        *       0.0
    4    4   0.0       0.02          0.02        0.03     0.06      *       0.04
    5    5   0.0       0.02          0.02        0.07     0.25     0.75     0.60
    6    6   0.0       0.02          0.04        0.13     0.38     2.0      11
    7    7   0.0       0.04          0.06        0.29     1.9      14       223
    8    8   0.0       0.06          0.12        0.53     2.5      19       1.5†
    9    9   0.0       0.06          0.20        0.96     7.5      34       39†
   10   10   0.01      0.09          0.32        1.7      9.9      153       -
   20   10   0.01      0.24          2.7         9.9      65       651       -
   50   10   0.01      1.1           42          85       586       -        -
  100   10   0.02      3.7           357         472      3069      -        -
  100   20   0.05      7.9           1377        3166      -        -        -
Times are in seconds.
† : Times are in hours.

CHAPTER 3
A NEW WEIGHT BALANCED BINARY SEARCH TREE
3.1 Introduction
A dictionary is a set of elements on which the operations of search, insert, and
delete are performed. Many data structures have been proposed for the efficient
representation of a dictionary [14]. These include direct addressing schemes such as
hash tables and comparison schemes such as binary search trees, AVL-trees, red/black
trees [12], trees of bounded balance [21], treaps [1], deterministic skip lists [20], and
skip lists [26]. Of these schemes, AVL-trees, red/black trees, and trees of bounded
balance (WB($\alpha$)) are balanced binary search trees. When representing a dictionary
with n elements, using one of these schemes, the corresponding binary search tree
has height $O(\log n)$ and individual search, insert, and delete operations take $O(\log n)$
time. When (unbalanced) binary search trees, treaps, or skip lists are used, each
operation has an expected complexity of $O(\log n)$ but the worst case complexity is
$O(n)$. When hash tables are used, the expected complexity is $O(1)$ per operation.
However, the worst case complexity is $O(n)$. So, in applications where a worst case
complexity guarantee is critical, one of the balanced binary search tree schemes is to
be preferred.
In this chapter, we develop a new balanced binary search tree called $\beta$-BBST
($\beta$-balanced binary search tree). Like WB($\alpha$) trees, this achieves balancing by controlling
the relative number of nodes in each subtree. However, unlike WB($\alpha$) trees, during
insert and delete operations, rotations are performed along the search path whenever
they reduce the internal path length of the tree (rather than only when a subtree is
out of balance). As a result, the constructed trees are expected to have a smaller
internal path length than the corresponding WB($\alpha$) tree. Since the average search
time is closely related to the internal path length, the time needed to search in a $\beta$-BBST
is expected to be less than that in a WB($\alpha$) tree.
In Section 3.2, we define the total search cost of a binary search tree and show
that the rebalancing rotations performed in AVL and red/black trees might increase
this metric. We also show that while similar rotations in WB($\alpha$) trees do not increase
this metric, insert and delete operations in WB($\alpha$) trees do not avail of all opportunities
to reduce the metric. In Section 3.3, we define $\beta$-BBSTs and show their
relationship to WB($\alpha$) trees. Search, insert, and delete algorithms for $\beta$-BBSTs are
developed in Section 3.4. A simplified version of $\beta$-BBSTs is developed in Section 3.5.
Search, insert and delete operations for this version also take $O(\log n)$ time each. An
even simpler version of $\beta$-BBSTs is developed in Section 3.6. For this version, we
show that the average cost of an insert and search operation is $O(\log n)$ provided no
deletes are performed.
An experimental evaluation of $\beta$-BBSTs and competing schemes for dictionaries
(AVL, red/black, skip lists, etc.) was done and the results of this are presented in
Section 3.7. This section also compares the relative performance of $\beta$-BBSTs and the
two simplified versions of Sections 3.5 and 3.6.
3.2 Balanced Trees and Rotations
Following an insert or delete operation in a balanced binary search tree (e.g.,
AVL, red/black, WB($\alpha$), etc.), it may be necessary to perform rotations to restore
balance. The rotations are classified as LL, RR, LR, and RL [14]. LL and RR
rotations as well as LR and RL rotations are symmetric. While the conditions under
which the rotations are performed vary with the class of balanced tree considered,
the node movement patterns are the same. Figure 3.1 shows the transformation
performed by an LL and an LR rotation. In this figure, nodes whose subtrees have
changed as a result of the rotation are designated by a prime. So, p' is the original
node p; however, its subtrees are different.
Let h(x) be the height of the subtree with root x. Let s(x) be the number of
nodes in this subtree. When searching for an element x, x is compared with one
element at each of l(x) levels, where l(x) is the level at which x is present (the root
is at level 1). So, one measure of the "goodness" of the binary search tree, T, for
search operations (assuming each element is searched for with equal probability) is
its total search cost defined as:
$$C(T) = \sum_{x \in T} l(x).$$
Figure 3.1. LL and LR rotations: (a) LL rotation, (b) LR rotation
Notice that C(T) = I(T) + n where I(T) is the internal path length of T and
n is the number of elements/nodes in T. The cost of unsuccessful searches is equal
to the external path length E(T). Since E(T) = I(T) + 2n, minimizing C(T) also
minimizes E(T).
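The identities $C(T) = I(T) + n$ and $E(T) = I(T) + 2n$ are easy to check numerically. The Python sketch below is only an illustration: it assumes a tree is represented as `None` (empty) or a tuple `(left, key, right)`, a representation chosen here for brevity and not taken from the thesis.

```python
def levels(tree, level=1):
    """Yield l(x) for every node x; the root is at level 1."""
    if tree is not None:
        left, _, right = tree
        yield level
        yield from levels(left, level + 1)
        yield from levels(right, level + 1)


def external_depths(tree, depth=0):
    """Yield the depth of every external (empty) position; root depth is 0."""
    if tree is None:
        yield depth
    else:
        left, _, right = tree
        yield from external_depths(left, depth + 1)
        yield from external_depths(right, depth + 1)


def search_costs(tree):
    """Return (n, C, I, E) for the given tree."""
    ls = list(levels(tree))
    n = len(ls)
    total_cost = sum(ls)                # C(T)
    internal = sum(l - 1 for l in ls)   # I(T): sum of node depths
    external = sum(external_depths(tree))  # E(T)
    return n, total_cost, internal, external


balanced = ((None, 1, None), 2, (None, 3, None))
chain = (((None, 1, None), 2, None), 3, None)
for t in (balanced, chain):
    n, C, I, E = search_costs(t)
    print(n, C, I, E, C == I + n, E == I + 2 * n)
```

For the three-node balanced tree this gives C = 5 and E = 8, and for the three-node chain C = 6 and E = 9, so the tree with the smaller total search cost also has the smaller external path length.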
Total search cost is important as this is the dominant operation in a dictionary
(note that insert can be modeled as an unsuccessful search followed by the insertion
of a node at the point where the search terminated and deletion can be modeled by
a successful search followed by a physical deletion; both operations are then followed
by a rebalancing/restructuring step).
Observe that in an actual implementation of the search operation in programming
languages such as C++, C, and Pascal, the search for an x at level l(x) will
involve up to two comparisons at levels 1, 2, ..., l(x). If the code first checks $x = e_i$,
where $e_i$ is the element at level i to be compared and then $x < e_i$ to decide whether to
move to the left or right subtree, then the number of element comparisons is exactly
$2l(x) - 1$. In this case, the total number of element comparisons is
$$NC(T) = 2 \sum_{x \in T} l(x) - n = 2C(T) - n$$
and minimizing C(T) also minimizes NC(T). If the code first checks $x < e_i$ and then
$x = e_i$ (or $x > e_i$), the number of element comparisons done to find x is $l(x) + r(x) + 1$
where r(x) is the number of right branches on the path from the root to x. The
total number of comparisons is bounded by 2C(T). For simplicity, we use C(T) to
motivate our data structure.
In an AVL tree, when an LL rotation is performed, $h(q) = h(c) + 1 = h(d) + 1$ (see
Figure 3.1(a)). At this time, the balance factor at gp is $h(p) - h(d) = 2$. The rotation
restores height balance which is necessary to guarantee $O(\log n)$ search, insert, delete
operations in an n node AVL tree. The rotation may, however, increase the total
search cost. To see this, notice that an LL rotation affects the level numbers of only
those nodes that are in the subtree with root gp prior to the rotation. We see that
$l(q') = l(q) - 1$, $l(p') = l(p) - 1$, $l(gp') = l(gp) + 1$, the total search cost of the subtree
with root a is decreased by s(a) as a result of the rotation, etc. Hence, the increase
in C(T) due to the rotation is:
$$[l(q') - l(q)] + [l(p') - l(p)] + [l(gp') - l(gp)] - s(a) - s(b) + s(d)$$
$$= -1 - 1 + 1 - s(q) + 1 + s(d) = s(d) - s(q),$$
where a and b are the subtrees of q, so that $s(a) + s(b) = s(q) - 1$.
A similar analysis shows that an LR rotation increases C(T) by $s(d) - s(q)$.
If the LL rotation was triggered by an insertion, s(q) is at least one more than the
minimum number of nodes in an AVL tree of height $t = h(q) - 1$. So, $s(q) \ge \phi^{t+2}/\sqrt{5}$
where $\phi = (1 + \sqrt{5})/2$. The maximum value for s(d) is $2^t - 1$. So, an LL rotation
has the potential of increasing total search cost by as much as
$$2^t - 1 - \phi^{t+2}/\sqrt{5} \approx 2^t - 1 - 1.62^{t+2}/2.24.$$
This is negative for $t \le 2$ and positive for t > 2. When t = 10, for example, an
LL rotation may increase total search cost by as much as 877. As t gets larger, the
potential increase in search cost gets much greater. This analysis is easily extended
to the remaining rotations and also to red/black trees.
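As a quick arithmetic check of this bound, the following Python sketch evaluates $2^t - 1 - \phi^{t+2}/\sqrt{5}$ both with exact constants and with the rounded constants 1.62 and 2.24 used in the approximation. The function name and its parameters are introduced here purely for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def max_ll_increase(t, phi=PHI, root5=math.sqrt(5)):
    """Largest possible increase in total search cost from an LL rotation
    in an AVL tree when h(q) - 1 = t, per the bound in the text."""
    return 2 ** t - 1 - phi ** (t + 2) / root5


for t in (1, 2, 3, 10):
    exact = max_ll_increase(t)
    rounded = max_ll_increase(t, phi=1.62, root5=2.24)
    print(t, round(exact, 2), round(rounded, 2))
```

The rounded constants reproduce the figure of about 877 for t = 10; with exact constants the value is slightly higher (about 879). The sign change between t = 2 and t = 3 is the same in both cases.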
Definition (WB($\alpha$) [21]) The balance, B(p), of a node p in a binary tree is the
ratio $(s(l) + 1)/(s(p) + 1)$ where l is the left child of p. For $\alpha \in [0, 1/2]$, a binary tree
T is in WB($\alpha$) iff $\alpha \le B(p) \le 1 - \alpha$ for every node p in T. By definition, the empty
tree is in WB($\alpha$) for all $\alpha$.
Lemma 1 (1) The maximum height, $h_{max}(n)$, of an n node tree in WB($\alpha$) is
$\approx \log_{1/(1-\alpha)}(n + 1)$ [21].
(2) Inserts and deletes can be performed in an n node tree in WB($\alpha$) in $O(\log n)$ time
for $2/11 < \alpha \le 1 - \sqrt{2}/2$ [4].
(3) Each search operation in an n node tree in WB($\alpha$) takes $O(\log n)$ time [21].
In the case of weight balanced trees WB($\alpha$), an LL rotation is performed when
$B(gp) \approx 1 - \alpha$ and $B(p) \ge \alpha/(1 - \alpha)$ (see Figure 3.1(a)) [21]. So,
$$1 - \alpha \approx \frac{s(p) + 1}{s(gp) + 1} = \frac{s(p) + 1}{s(p) + s(d) + 2}$$
or
$$s(d) \approx \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$$
and
$$\frac{\alpha}{1 - \alpha} \le B(p) = \frac{s(q) + 1}{s(p) + 1}$$
or
$$s(q) \ge \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}.$$
Comparing the two bounds, s(q) is no smaller than s(d). So, LL rotations (and also RR)
do not increase the search cost. For LR rotations
[21], $B(gp) \approx 1 - \alpha$ and $B(p) < \alpha/(1 - \alpha)$. So, $s(d) \approx \frac{\alpha}{1-\alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$ and with
respect to Figure 3.1(b),
$$\frac{\alpha}{1 - \alpha} > B(p) = \frac{s(p) - s(q)}{s(p) + 1}$$
or
$$s(q) > \frac{1 - 2\alpha}{1 - \alpha}\, s(p) - \frac{\alpha}{1 - \alpha}.$$
For $\alpha \le 1/3$, s(q) > s(d) and LR (RL) rotations do not increase search cost. Thus,
in the case of WB($\alpha$) trees, the rebalancing rotations do not increase search cost.
This statement remains true if the conditions for LL and LR rotation are changed to
those in Blum and Mehlhorn [4].
While rotations do not increase the search cost of WB($\alpha$) trees, these trees
miss performing some rotations that would reduce search cost. For example, it is
possible to have $\alpha < B(gp) < 1 - \alpha$, $B(p) \ge \alpha/(1 - \alpha)$, and s(q) > s(d). Since B(gp) isn't
high enough, an LL rotation isn't performed. Yet, performing such a rotation would
reduce search cost.
3.3 $\beta$-BBSTs
Definition A cost optimized search tree (COST) is a binary search tree whose search
cost cannot be reduced by performing a single LL, RR, LR, or RL rotation.
Theorem 8 If T is a COST with n nodes, its height is at most $\log_\phi(\sqrt{5}(n + 1)) - 2$.
Proof Let $N_h$ be the minimum number of nodes in a COST of height h. Clearly,
$N_0 = 0$ and $N_1 = 1$. Consider a COST Q of height $h \ge 2$ having the minimum
number of nodes $N_h$. Q has one subtree R whose height is $h - 1$ and another, S,
whose height is $\le h - 1$. R must be a minimal COST of height $h - 1$ and so has
$N_{h-1}$ nodes. R, in turn, must have one subtree, U, of height $h - 2$ and another, V,
of height $\le h - 2$. Both U and V are COSTs as R is a COST. Since R is a minimal
COST, U is a minimal COST of height $h - 2$ and so has $N_{h-2}$ nodes. Since Q is a
COST, $|S| \ge \max\{|U|, |V|\}$. We may assume that $N_h$ is a nondecreasing function of
h. So, $|S| \ge N_{h-2}$. Since Q is a minimal COST of height h, $|S| = N_{h-2}$. So,
$$N_h = N_{h-1} + N_{h-2} + 1, \quad h \ge 2$$
$$N_0 = 0, \quad N_1 = 1.$$
This recurrence is the same as that for the minimum number of nodes in an AVL tree
of height h. So, $N_h = F_{h+2} - 1$ where $F_i$ is the i'th Fibonacci number. Consequently,
$N_h \approx \phi^{h+2}/\sqrt{5} - 1$ and $h \le \log_\phi(\sqrt{5}(n + 1)) - 2$. □
Corollary 1 The maximum height of a COST with n nodes is the same as that of an
AVL tree with this many nodes.
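The closed form $N_h = F_{h+2} - 1$ used in the proof of Theorem 8 can be checked directly against the recurrence. The short Python sketch below does this for small h; the function names are illustrative only, and the Fibonacci numbering assumes $F_1 = F_2 = 1$.

```python
def min_cost_nodes(h_max):
    """Return [N_0, ..., N_{h_max}] from N_h = N_{h-1} + N_{h-2} + 1."""
    n = [0, 1]
    for h in range(2, h_max + 1):
        n.append(n[h - 1] + n[h - 2] + 1)
    return n[:h_max + 1]


def fib(i):
    """i-th Fibonacci number with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a


N = min_cost_nodes(12)
print(N)
print(all(N[h] == fib(h + 2) - 1 for h in range(len(N))))
```

Because the minimum node count grows like $\phi^h$, the height of a COST with n nodes is logarithmic in n, exactly as for AVL trees.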
Definition Let a and b be the roots of two binary trees. a and b are $\beta$-balanced,
$0 \le \beta \le 1$, with respect to one another, denoted $\beta(a,b)$, iff
(a) $\beta(s(a) - 1) \le s(b)$
(b) $\beta(s(b) - 1) \le s(a)$
A binary tree T is $\beta$-balanced iff the children of every node in T are $\beta$-balanced.
A full binary tree is 1-balanced and a binary tree whose height equals its size
(i.e., number of nodes) is 0-balanced.
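The definition translates directly into a recursive check. The following Python sketch is a minimal illustration, assuming a tree shape is represented as `None` (empty) or a pair `(left, right)`; this representation and the function names are choices made here, not part of the thesis.

```python
def size(tree):
    """Number of nodes s(x) in the subtree; an empty tree has size 0."""
    return 0 if tree is None else 1 + size(tree[0]) + size(tree[1])


def pair_balanced(sa, sb, beta):
    """beta(a, b): conditions (a) and (b) of the definition, on sizes."""
    return beta * (sa - 1) <= sb and beta * (sb - 1) <= sa


def is_beta_balanced(tree, beta):
    """True iff the children of every node are beta-balanced."""
    if tree is None:
        return True
    left, right = tree
    return (pair_balanced(size(left), size(right), beta)
            and is_beta_balanced(left, beta)
            and is_beta_balanced(right, beta))


leaf = (None, None)
full7 = ((leaf, leaf), (leaf, leaf))   # full binary tree with 7 nodes
chain3 = ((leaf, None), None)          # height equals size
print(is_beta_balanced(full7, 1))      # a full tree is 1-balanced
print(is_beta_balanced(chain3, 0))     # a chain is 0-balanced
print(is_beta_balanced(chain3, 0.5))   # but not 1/2-balanced
```

Recomputing sizes on every call keeps the sketch short; a real implementation would read the Size field stored in each node.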
Lemma 2 If the binary tree T is $\beta$-balanced, then it is $\gamma$-balanced for $0 \le \gamma \le \beta$.
Proof Follows from the definition of balance. □
Lemma 3 If the binary tree T is $\beta$-balanced, $0 \le \beta \le 1/2$, then it is in WB($\alpha$) for
$\alpha = \beta/(1 + \beta)$.
Proof Consider any node p in T. Let l and r be node p's left and right children.
$$B(p) = \frac{s(l) + 1}{s(l) + s(r) + 2} = \frac{1}{1 + \frac{s(r) + 1}{s(l) + 1}}.$$
Since T is $\beta$-balanced, $s(l) - 1 \le s(r)/\beta$ or $s(l) + 1 \le s(r)/\beta + 2$. So,
$$\frac{s(l) + 1}{s(r) + 1} \le \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(r) + 1)} \le \frac{1}{\beta}$$
or
$$\frac{s(r) + 1}{s(l) + 1} \ge \beta.$$
So, $B(p) \le 1/(1 + \beta)$. Further, $s(r) - 1 \le s(l)/\beta$. So,
$$\frac{s(r) + 1}{s(l) + 1} \le \frac{1}{\beta}.$$
And, $B(p) \ge 1/(1 + 1/\beta) = \beta/(1 + \beta)$. Hence $\beta/(1 + \beta) \le B(p) \le 1/(1 + \beta)$ for
every p in T. So, T is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$. □
Figure 3.2. A tree in WB(1/4) that is not $\frac{1}{3}$-balanced
Remark While every $\beta$-balanced tree, $0 \le \beta \le 1/2$, is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$,
there are trees in WB($\alpha$) that are not $\beta$-balanced. Figure 3.2 shows an example of a
tree in WB(1/4) that is not $\frac{1}{3}$-balanced.
Lemma 4 If T is a COST then T is $\frac{1}{2}$-balanced.
Proof If T is a COST, then every subtree of T is a COST. Consider any subtree
with root p, left child l, and right child r. If neither l nor r exist, then $s(l) = s(r) = 0$
and p is $\frac{1}{2}$-balanced. If $s(l) = 0$ and $s(r) > 1$, then r has a nonempty subtree with
root t and $s(t) > s(l)$. So p is not a COST. Hence, $s(r) \le 1$ and p is $\frac{1}{2}$-balanced. The
same is true when $s(r) = 0$. So, assume $s(l) > 0$ and $s(r) > 0$.
If $s(l) = 1$, then $s(r) \le 3$ as otherwise, one of the subtrees of r has $m \ge 2$
nodes and $m > s(l)$ implies p is not a COST. Since $s(r) \le 3$, $\frac{1}{2}(s(r) - 1) \le s(l)$ and
$\frac{1}{2}(s(l) - 1) \le s(r)$. So, p is $\frac{1}{2}$-balanced. The same proof applies when $s(r) = 1$. When
$s(l) > 1$ and $s(r) > 1$, let a and b be the roots of the left and right subtrees of l. Since
p is a COST, $s(a) \le s(r)$ and $s(b) \le s(r)$. So, $s(l) = s(a) + s(b) + 1 \le 2s(r) + 1$ and
$\frac{1}{2}(s(l) - 1) \le s(r)$. Similarly, $\frac{1}{2}(s(r) - 1) \le s(l)$. So, $\frac{1}{2}(l, r)$. Since this proof applies
to every node in T, the children of every p are $\frac{1}{2}$-balanced and T is $\frac{1}{2}$-balanced. □
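Lemma 4 can also be checked by brute force on small trees. The Python sketch below enumerates every binary tree shape with up to 7 nodes and confirms that each COST is 1/2-balanced. It relies on the rotation analysis of Section 3.2: a single rotation at a node reduces search cost exactly when some grandchild subtree is larger than the opposite child subtree. The tuple representation and function names are assumptions made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shapes(n):
    """All binary tree shapes with n nodes; None is the empty tree."""
    if n == 0:
        return (None,)
    return tuple((l, r)
                 for k in range(n)
                 for l in shapes(k)
                 for r in shapes(n - 1 - k))


@lru_cache(maxsize=None)
def size(t):
    return 0 if t is None else 1 + size(t[0]) + size(t[1])


def is_cost(t):
    """No single LL, RR, LR, or RL rotation reduces the search cost."""
    if t is None:
        return True
    l, r = t
    for child, other in ((l, r), (r, l)):
        # A rotation helps iff a grandchild subtree outweighs the other child.
        if child is not None and max(size(child[0]), size(child[1])) > size(other):
            return False
    return is_cost(l) and is_cost(r)


def is_half_balanced(t):
    """beta-balance with beta = 1/2, written with integer arithmetic."""
    if t is None:
        return True
    l, r = t
    return (size(l) - 1 <= 2 * size(r) and size(r) - 1 <= 2 * size(l)
            and is_half_balanced(l) and is_half_balanced(r))


print(all(is_half_balanced(t)
          for n in range(8) for t in shapes(n) if is_cost(t)))
```

The converse fails: an 8-node tree whose root has a 2-node left subtree and a 5-node right subtree with children of sizes 3 and 1 is 1/2-balanced but not a COST.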
Figure 3.3. A $\frac{1}{2}$-balanced tree that is not a COST
Remark There are $\frac{1}{2}$-balanced trees that are not COSTs (see Figure 3.3).
Since a COST is in WB(1/3) while WB($\alpha$) trees can be maintained efficiently
only for $2/11 < \alpha \le 1 - 1/\sqrt{2} \approx 0.293$, a COST is better balanced than WB($\alpha$)
trees with $\alpha$ in the usable range. Unfortunately, we are unable to develop $O(\log n)$
insert/delete algorithms for a COST.
In the next section, we develop insert and delete algorithms for $\beta$-balanced
binary search trees ($\beta$-BBST) for $0 < \beta \le \sqrt{2} - 1$. Note that every $(\sqrt{2} - 1)$-BBST is
in WB($\alpha$) for $\alpha = 1 - 1/\sqrt{2}$ which is the largest permissible $\alpha$. Since our insert and
delete algorithms perform rotations along the search path whenever these result in
improved search cost, $\beta$-BBSTs are expected to have better search performance than
WB($\alpha$) trees (for $\alpha = \beta/(1 + \beta)$).
Each node of a $\beta$-BBST has the fields LeftChild, Size, Data, and RightChild.
Since every $\beta$-BBST with $\beta > 0$ is in WB($\alpha$) for some $\alpha > 0$, $\beta$-BBSTs have height that is
logarithmic in n, the number of nodes.
3.4 Search, Insert, and Delete in a $\beta$-BBST
To reduce notational clutter, in the rest of the chapter, we abbreviate s(a) by a
(i.e., the node name denotes subtree size).
3.4.1 Search
This is done exactly as in any binary search tree. Its complexity is $O(h)$ where
h is the height of the tree. Notice that since each node has a size field, it is easy
to perform a search based on index (i.e., find the 10'th smallest key). Similarly, our
insert and delete algorithms can be adapted to indexed insert and delete.
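A search by index using the size field can be sketched as follows. This is a minimal Python illustration, not the thesis implementation: the `Node` class, the `make` helper, and the `select` function are names assumed here, with `left`, `size`, `data`, and `right` standing in for the LeftChild, Size, Data, and RightChild fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1  # number of nodes in the subtree rooted here


def make(data, left=None, right=None):
    """Build a node and fill in its size from its children."""
    node = Node(data, left, right)
    node.size = 1 + (left.size if left else 0) + (right.size if right else 0)
    return node


def select(root, k):
    """Return the k-th smallest key (1-based), or None if k is out of range."""
    node = root
    while node is not None:
        left_size = node.left.size if node.left else 0
        if k == left_size + 1:
            return node.data
        if k <= left_size:
            node = node.left
        else:
            k -= left_size + 1  # skip the left subtree and this node
            node = node.right
    return None


tree = make(4, make(2, make(1), make(3)), make(6, make(5)))
print([select(tree, k) for k in range(1, 7)])
```

The loop follows a single root-to-node path, so the indexed search has the same $O(h)$ cost as a search by key.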
3.4.2 Insertion
To insert a new element x into a $\beta$-BBST, we first search for x in the $\beta$-BBST.
This search is unsuccessful (as x is not in the tree) and terminates by falling off the
tree. A new node y containing x is inserted at the point where the search falls off
the tree. Let p' be the parent (if any) of the newly inserted node. We now retrace
the path from p' to the root performing rebalancing rotations.
There are four kinds of rotations LL, LR, RL, and RR. LL and RR rotations are
symmetric and so also are LR and RL rotations. The typical configuration before an
LL rotation is performed is given in Figure 3.4(a). p' denotes the root of a subtree in
which the insertion was made. Let p be the (size of the) subtree before the insertion.
Then, since the tree was a $\beta$-BBST prior to the insertion, $\beta(p,d)$. Also, for the LL
rotation to be performed, we require that $(q \ge c)$ and $(q > d)$. Note that $q > d$
implies $q \ge 1$. We shall see that $\beta(q,c)$ follows from the fact that the insertion is
made into a $\beta$-BBST and from properties of the rotation. Following an LL rotation,
p' is updated to be the node p''.
Figure 3.4. LL rotation for insertion: (a) before, (b) after
Lemma 5 [LL insertion lemma] If $[\beta(p,d) \wedge \beta(q,c) \wedge (q \ge c) \wedge (q > d)]$ for $0 < \beta \le 1/2$
before the rotation, then $\beta(q,gp')$ and $\beta(c,d)$ after the rotation.
Proof Assume the before condition.
(a) $\beta(q - 1) \le c$ (as $\beta(q,c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) < 2\beta q$ (as $\beta > 0$, $q \ge c$
and $q > d$) $\le q$ (as $\beta \le 1/2$). So, $\beta(q,gp')$.
(b) $d < q \Rightarrow d - 1 < q - 1 \Rightarrow \beta(d - 1) < \beta(q - 1) \le c$ (as $\beta(q,c)$). Also,
$\beta(c - 1) \le \beta(q + c - 1) = \beta(p' - 2) = \beta(p - 1) \le d$ (as $\beta(p,d)$). So, $\beta(c,d)$. □
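The LL rotation and its trigger condition can be sketched in Python as below. This is only an illustration under assumptions: the `Node` class and helper names are introduced here, sizes are kept in a `size` field, and the condition tested is the pair $(q \ge c)$ and $(q > d)$ stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1


def sz(node):
    return node.size if node else 0


def make(data, left=None, right=None):
    return Node(data, left, right, 1 + sz(left) + sz(right))


def ll_applicable(gp):
    """(q >= c) and (q > d), with q, c the subtrees of gp.left and d = gp.right."""
    p = gp.left
    if p is None:
        return False
    q, c, d = sz(p.left), sz(p.right), sz(gp.right)
    return q >= c and q > d


def rotate_ll(gp):
    """Rotate right at gp; returns the new subtree root with sizes updated."""
    p = gp.left
    gp.left = p.right          # subtree c moves under gp
    p.right = gp
    gp.size = 1 + sz(gp.left) + sz(gp.right)
    p.size = 1 + sz(p.left) + sz(p.right)
    return p


def total_search_cost(node, level=1):
    if node is None:
        return 0
    return (level + total_search_cost(node.left, level + 1)
            + total_search_cost(node.right, level + 1))


gp = make(5, make(3, make(2, make(1)), make(4)), make(6))
before = total_search_cost(gp)
root = rotate_ll(gp) if ll_applicable(gp) else gp
print(before, total_search_cost(root))
```

In this example q = 2, c = 1, and d = 1, so the rotation applies and the total search cost drops by $q - d = 1$, matching the $s(d) - s(q)$ change derived in Section 3.2.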
In an LR rotation, the before configuration is as in Figure 3.4(a). However, this
time $q < c$. Figure 3.4(a) is redrawn in Figure 3.5(a). In this, the node labeled c in
Figure 3.4(a) has been labeled q and that labeled q in Figure 3.4(a) has been labeled
a. With respect to the labelings of Figure 3.5(a), rotation LR is applied when
$$[(q > a) \wedge (q > d)].$$
The other conditions that apply when an LR rotation is performed are
$$[\beta(p,d) \wedge \beta(a,q) \wedge \beta(b,c)].$$
Here p denotes the (size of the) left subtree of gp prior to the insertion. An LR
rotation is accomplished in two substeps (or two subrotations). The first of these is
shown in Figure 3.5(b). Following an LR rotation, p' is updated to be node q'.
Figure 3.5. Substep (i) of insertion LR rotation: (a) before, (b) after substep (i)
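Lemma 6 [LR substep(i) insertion lemma] If $[\beta(p,d) \wedge \beta(a,q) \wedge \beta(b,c) \wedge (q > a) \wedge (q > d)]$
for $0 < \beta \le 1/2$ before the subrotation, then
$[\beta(p'',gp') \wedge \{(\beta(a,b) \wedge \frac{\beta}{1+\beta}(c,d)) \vee (\frac{\beta}{1+\beta}(a,b) \wedge \beta(c,d))\}]$ after the subrotation.
Proof Assume the before condition. First, we show that $\beta(p'',gp')$ after the
rotation. Note that $\beta(p'' - 1) = \beta(a + b) = \beta(a + b + c + 1) - \beta(c + 1) = \beta(p' - 1) -
\beta(c + 1) = \beta(p - 1) - \beta c \le d - \beta c \le d < gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le b + \beta + \beta d$
(as $\beta(b,c)$) $\le b + \beta + \beta(q - 1)$ (as $q > d$) $\le a + b + \beta$ (as $\beta(a,q)$) $< p''$ (as
$p'' = a + b + 1$). So, $\beta(p'',gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b - 1) \le a$.
To see this, note that $\beta(b - 1) < \beta(q - 1) \le a$ (as $b < q$ and $\beta(a,q)$).
P2: $\beta(c - 1) \le d$.
For this, observe that $p' - 1 = a + q \ge \beta(q - 1) + q$ (as $\beta(a,q)$) $= (\beta + 1)(q - 1) + 1$.
So, $q - 1 \le \frac{p' - 2}{\beta + 1} = \frac{p - 1}{\beta + 1}$. Similarly, $q - 1 = b + c \ge \beta(c - 1) + c$ (as $\beta(b,c)$)
$= (\beta + 1)(c - 1) + 1$. So, $\beta(c - 1) \le \frac{\beta}{\beta+1}(q - 2) < \frac{\beta}{\beta+1}(q - 1) \le \frac{\beta}{(\beta+1)^2}(p - 1) \le \frac{d}{(\beta+1)^2}$ (as
$\beta(p,d)$) $< d$.
To complete the proof of the lemma, we need to show
$$\{(\beta(a,b) \wedge \tfrac{\beta}{1+\beta}(c,d)) \vee (\tfrac{\beta}{1+\beta}(a,b) \wedge \beta(c,d))\}.$$
We do this by considering the two cases $b \ge c$ and $b < c$.
Case $b \ge c$: Since $a < q = b + c + 1 \le 2b + 1$, $\beta(a - 1) < 2\beta b \le b$ (as $\beta \le 1/2$). This and
P1 imply $\beta(a,b)$. Also, $d < q = b + c + 1$. So, $\frac{\beta}{1+\beta}(d - 1) < \frac{\beta}{1+\beta}(b + c - 1) =
\frac{\beta}{\beta+1}(c - 1) + \frac{\beta}{\beta+1} b \le \frac{\beta}{\beta+1}(c - 1) + \frac{c + \beta}{\beta + 1}$ (as $\beta(b,c)$) $= c$. This, together with P2 implies
$\frac{\beta}{1+\beta}(c,d)$. So, $\beta(a,b) \wedge \frac{\beta}{1+\beta}(c,d)$.
Case $b < c$: Since $a < q = b + c + 1$, $a - 1 < b + c$. So, $a - 1 \le b + c - 1$ or
$\frac{\beta}{1+\beta}(a - 1) \le \frac{\beta}{1+\beta}(b + c - 1) = \frac{\beta}{1+\beta} b + \frac{\beta}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta} b + \frac{b}{1+\beta}$ (as $\beta(b,c)$) $= b$. This and P1 imply
$\frac{\beta}{1+\beta}(a,b)$.
Also, $d - 1 \le q - 2 = b + c - 1$. So, $\beta(d - 1) \le \beta(b + c - 1) < \beta(2c - 1) < c$. This,
together with P2 implies $\beta(c,d)$. So, $\frac{\beta}{1+\beta}(a,b) \wedge \beta(c,d)$. □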
Since an LR(i) rotation can cause the tree to lose its $\beta$-balance property, it is
necessary to follow this with another rotation that restores the $\beta$-balance property.
It suffices to consider the two cases of Figures 3.6 and 3.7 for this follow up rotation.
The remaining cases are symmetric to these. In Figures 3.6 and 3.7, p and d denote
the nodes that do not satisfy $\beta(p,d)$. Note, however, that these nodes do satisfy
$\frac{\beta}{1+\beta}(p,d)$. Since the follow up rotation to LR(i) is done only when
$$\neg\beta(p,d) \wedge \tfrac{\beta}{1+\beta}(p,d),$$
either $\beta(p - 1) > d$ or $\beta(d - 1) > p$. When $\beta(p - 1) > d$, the second substep rotation is
one of the two given in Figures 3.6 and 3.7. When $\beta(d - 1) > p$, rotations symmetric
to these are performed. In the following, we assume $\beta(p - 1) > d$. Further, we may
assume $d > 0$, as $d = 0$ and $\frac{\beta}{1+\beta}(p,d)$ imply $p \le 1$. Hence, $\beta(p,d)$. Also, $d > 0$ and
$\beta(p - 1) > d$ imply $p > 1$.
The LR(ii) LL rotation is done when the condition
$$A = (q > d) \wedge (c \le (1 + \beta)q + (1 - \beta)) \wedge B$$
holds, where
$$B = \tfrac{\beta}{1+\beta}(p,d) \wedge \neg\beta(p,d) \wedge \beta(q,c) \wedge (\beta(p - 1) > d > 0).$$
Figure 3.6. Case LL for LR(ii) rotation: (a) before, (b) after
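Lemma 7 [Case LR(ii) LL rotation] If A holds before the rotation of Figure 3.6, then
$\beta(q,gp')$ and $\beta(c,d)$ after the rotation provided $0 < \beta \le \sqrt{2} - 1$.
Proof (a) $\beta(q,gp')$:
$\beta(q - 1) \le c$ (as $\beta(q,c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le \beta((1 + \beta)q + (1 - \beta) + d) \le
\beta(1 + \beta)q + \beta(1 - \beta) + \beta(q - 1)$ (as $q > d$) $= \beta(2 + \beta)q - \beta^2 < q$ (as $\beta(2 + \beta) \le 1$
for $0 < \beta \le \sqrt{2} - 1$). So, $\beta(q,gp')$.
(b) $\beta(c,d)$:
$\beta(d - 1) < \beta(q - 1) \le c$ (as $q > d$ and $\beta(q,c)$). Also, $\beta(c - 1) = \frac{\beta}{1+\beta}\beta(c - 1) + \frac{\beta}{1+\beta}(c - 1) \le
\frac{\beta}{1+\beta} q + \frac{\beta}{1+\beta}(c - 1) = \frac{\beta}{1+\beta}(q + c - 1) = \frac{\beta}{1+\beta}(p - 2) < \frac{\beta}{1+\beta}(p - 1) \le d$ (as $\frac{\beta}{1+\beta}(p,d)$).
So, $\beta(c,d)$. □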
Lemma 8 If (c < (1 + P)q + (1 â€” /?)) A (/?(p â€” 1) > d) in Figure 3.6, then d < q
provided 0 < P < \/2 â€” 1.
50
Proof Sinced < 0(pâ€” 1) = 0(q+c) < 0(q + (l+0)q+lâ€”0) = 0(0+2)q + 0(l â€” 0) <
9 + 1 (as 0(0 + 2) < 1 and 0(1  0) < 1 for 0 < 0 < y/2  1). So, d < q. â–¡
So, the only time an LR(ii) LL rotation is not done is when $C = (C_1 \vee C_2) \wedge B$ holds where
$C_1 = (q = d) \wedge (c \le (1+\beta)q + 1 - \beta)$
$C_2 = c > (1+\beta)q + (1-\beta)$.
At this time, the LR rotation of Figure 3.7 is done. In terms of the notation of Figure 3.7, the condition $C$ becomes $D = (D_1 \vee D_2) \wedge E$ where
$D_1 = (a = d) \wedge (q \le (1+\beta)a + 1 - \beta)$
$D_2 = q > (1+\beta)a + 1 - \beta$
$E = \frac{\beta}{1+\beta}(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (\beta(p-1) > d > 0)$.
Lemma 9 When an LR(ii) LR rotation is performed and $\beta \le \sqrt{2}-1$, $q > d$ and so search cost is reduced.
Proof If $D_1$, then since $d < \beta(p-1) = \beta(a+q) = \beta(d+q)$, $q > d/\beta - d > d$ as $\beta \le \sqrt{2}-1$. If $D_2$, then $d < \beta(p-1) = \beta(a+q) < \beta\left(\frac{q}{1+\beta} + q\right) = \frac{\beta(2+\beta)}{1+\beta}\,q \le q$. □
(a) before (b) after
Figure 3.7. Case LR for LR(ii) rotation
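Lemma 10 When $(d = a) \wedge \beta(b, c) \wedge (\beta(p-1) > d) \wedge (\beta \le \sqrt{2}-1)$ (see Figure 3.7), $\beta(a-1) \le b$ and $\beta(d-1) \le c$.
Proof Since $\beta(p-1) > d$ and $d = a$, $\beta(p-1) > a$ or $\beta(a+q) > a$ or $a(1-\beta) < \beta q$ or $a < \frac{\beta q}{1-\beta}$. So, $\beta(a-1) < \frac{\beta^2}{1-\beta}q - \beta = \frac{\beta^2}{1-\beta}(b+c+1) - \beta$.
If $c \le b/\beta + \beta$, then
$\beta(a-1) < \frac{\beta^2 b}{1-\beta} + \frac{\beta^2(b/\beta + \beta)}{1-\beta} + \frac{\beta^2}{1-\beta} - \beta$
$= \frac{\beta(\beta+1)b}{1-\beta} + \frac{\beta(\beta^2 + \beta - 1 + \beta)}{1-\beta}$
$= \frac{\beta(\beta+1)b}{1-\beta} + \frac{\beta(\beta^2 + 2\beta - 1)}{1-\beta}$
$\le b$ (as $\beta(\beta+1) \le 1-\beta$ for $\beta \le \sqrt{2}-1$ and $\beta^2 + 2\beta - 1 \le 0$ for $\beta \le \sqrt{2}-1$).
Since $\beta(c-1) \le b$, $c \le b/\beta + 1$. So,
$\beta(a-1) < \frac{\beta(\beta+1)b}{1-\beta} + \frac{3\beta^2 - \beta}{1-\beta}$.
So,
$a - 1 < \frac{(1+\beta)b}{1-\beta} + \frac{3\beta - 1}{1-\beta}$.
However, since $\beta^2 + 2\beta - 1 \le 0$ for $\beta \le \sqrt{2}-1$, $(1+\beta)/(1-\beta) \le 1/\beta$ and $(3\beta-1)/(1-\beta) \le \beta$. So, $a - 1 < b/\beta + \beta$. If $a \ge c+1$, then $c \le a - 1 < b/\beta + \beta$. We have already shown that for $c \le b/\beta + \beta$, $\beta(a-1) \le b$. So, assume $a < c+1$. Now, $a \le c$ and $\beta(a-1) \le \beta(c-1) \le b$ (as $\beta(b, c)$). So, $\beta(a-1) \le b$ in all cases.
$\beta(a-1) \le c$ may be shown in a similar way. Since $a = d$, we get $\beta(d-1) \le c$. □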
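Lemma 11 [Case LR(ii) LR rotation] If $D$ holds before the rotation of Figure 3.7, then $\beta(p', gp')$, $\beta(a, b)$, and $\beta(c, d)$ following the rotation provided $0 \le \beta \le \sqrt{2}-1$.
Proof (a) $\beta(p', gp')$:
$\beta(gp'-1) = \beta(c+d) \le b + \beta + \beta d$ (as $\beta(b, c)$) $< b + \beta + \beta q$ (from Lemmas 9 and 10, $q > d$) $\le a + b + 2\beta$ (as $\beta(a, q)$) $< a + b + 1 = p'$. Also, since $\frac{\beta}{1+\beta}(p, d)$ and $q > d$, $\beta(p-1) \le (\beta+1)d$ or $\beta(a+q) \le (\beta+1)d$ or $a + q \le (1+\frac{1}{\beta})d$ or $a \le (1+\frac{1}{\beta})d - q < (1+\frac{1}{\beta})d - d = d/\beta$. So, $\beta(p'-1) = \beta(a+b) < d + \beta b \le d + c + \beta$ (as $\beta(b, c)$) $< d + c + 1 = gp'$.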
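(b) $\beta(a, b)$:
Since $b < q$ and $\beta(a, q)$, $\beta(b-1) < \beta(q-1) \le a$.
When $D_1$, $\beta(a-1) \le b$ was proved in Lemma 10. So, $\beta(a, b)$.
When $D_2$, $q > a(1+\beta) + 1 - \beta$. So,
$a < \frac{q}{1+\beta} - \frac{1-\beta}{1+\beta} = \frac{b+c+1}{1+\beta} - \frac{1-\beta}{1+\beta}$.
So,
$\beta(a-1) < \frac{\beta b + \beta c + \beta}{1+\beta} - \frac{1-\beta}{1+\beta}\beta - \beta \le \frac{\beta b + b + 2\beta}{1+\beta} - \frac{1-\beta}{1+\beta}\beta - \beta = b$.
So, $\beta(a, b)$.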
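(c) $\beta(c, d)$:
Note that $\beta(c-1) \le \frac{\beta}{1+\beta}(q-2)$ (as $\beta(b, c)$) $< \frac{\beta}{1+\beta}(q-1) < \frac{\beta}{1+\beta}(p-1) \le d$.
When $D_1$, $\beta(d-1) \le c$ was proved in Lemma 10. So, $\beta(c, d)$.
When $D_2$, if $d < b+1$, then $d \le b$ and $\beta(d-1) \le \beta(b-1) \le c$. So, assume $d \ge b+1$. Now, $b \le d - 1 < \beta(p-1) - 1$. So,
$b < \beta(a + b + c + 1) - 1$
$< \beta\left(\frac{q-1+\beta}{1+\beta} + b + c + 1\right) - 1$
$= \frac{\beta}{1+\beta}\left(b + c + \beta + (1+\beta)(b+c+1)\right) - 1$
$\le \frac{\beta}{1+\beta}\left(\frac{c}{\beta} + 1 + c + \beta + (1+\beta)\left(\frac{c}{\beta} + 1 + c + 1\right)\right) - 1$
$= \frac{\beta}{1+\beta}\left(\frac{1+\beta}{\beta}c + (1+\beta) + (1+\beta)\left(\frac{1+\beta}{\beta}c + 2\right)\right) - 1$
$= c + \beta + (1+\beta)c + 2\beta - 1$
$= (2+\beta)c + 3\beta - 1 < (2+\beta)c + \beta$ (as $\beta \le \sqrt{2}-1$)
$\le \frac{c}{\beta} + \beta$ (as $\beta \le \sqrt{2}-1$).
Also, from $d < \beta(p-1)$ and the above derivation, we get
$d < \frac{\beta}{1+\beta}\left(b + c + \beta + (1+\beta)(b+c+1)\right)$
$< \frac{\beta}{1+\beta}\left(\frac{c}{\beta} + \beta + c + \beta + (1+\beta)\left(\frac{c}{\beta} + \beta + c + 1\right)\right)$
$= \frac{\beta}{1+\beta}\left(\frac{1+\beta}{\beta}c + 2\beta + (1+\beta)\left(\frac{1+\beta}{\beta}c + \beta + 1\right)\right)$
$= (2+\beta)c + \frac{2\beta^2}{1+\beta} + \beta(\beta+1)$
$= (2+\beta)c + \frac{2\beta^2 + \beta^2 + \beta^3 + \beta + \beta^2}{1+\beta}$
$= (2+\beta)c + \frac{\beta^3 + 4\beta^2 + \beta}{1+\beta}$
$< (2+\beta)c + 1$ (as $\beta^3 + 4\beta^2 + \beta \le 1 + \beta$ for $\beta \le \sqrt{2}-1$).
So, $\beta(d-1) < \beta(2+\beta)c \le c$ (as $\beta \le \sqrt{2}-1$). So, $\beta(c, d)$. □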
Theorem 9 If $T$ is $\beta$-balanced, $0 \le \beta \le \sqrt{2}-1$, prior to insertion, it is so following the insertion.
Proof First note that since all binary search trees are $\beta$-balanced for $\beta = 0$, the rotations (while unnecessary) preserve 0-balance. So, assume $\beta > 0$. Consider the tree $T'$ just after the new element has been inserted but before the backward restructuring pass begins.
If the newly inserted node, $z$, has no parent in $T'$, then $T$ was empty and $T'$ is $\beta$-balanced. If $z$ has a parent but no grandparent, then $T$ has at most one nonempty subtree $X$. Since $T$ is $\beta$-balanced, $\beta(|X| - 1) \le 0$. So, $|X| \le 1$. Following the insertion, $T'$ has one subtree with at most one node and one with exactly one. So, $T'$ is $\beta$-balanced. We may therefore assume that $z$ has a grandparent in $T'$.
From the downward insertion path, it follows that all nodes $u$ in $T'$ that have children $l$ and $r$ for which $\neg\beta(l, r)$ must lie on the path from the root to $z$. During the backward restructuring pass, each node on this path (other than $z$ and its parent) plays the role of $gp$ in Figures 3.4 and 3.5. The $\beta$-property cannot be violated at $z$ as $z$ has no children. It cannot be violated at the parent, $s$, of $z$ as $s$ satisfied the $\beta$-property prior to insertion. As a result its other subtree has at most one element. So, following the insertion, $s$ satisfies the $\beta$-property. As a result, each node in $T'$ that might possibly violate the $\beta$-property becomes the $gp$ node during the restructuring pass. Consider one such $gp$ node. It has children in $T'$ denoted by $p'$ and $d$. Its children in $T$ are $p$ and $d$. Figures 3.4 and 3.5 show the case when $d$ is the right subtree of $gp$ in both $T$ and $T'$. The cases RR and RL arise when $d$ is the left subtree.
During the restructuring pass, $gp$ begins at the grandparent of $z$ and moves up to the root of $T'$. If $z$ is at level $r$ in $T'$ (the root being at level 1), then $gp$ takes on $r - 2$ values during the restructuring pass. We shall show that at each of these $r - 2$ positions either
(a) no rotation is performed and all descendants of $gp$ satisfy the $\beta$-property or
(b) a rotation is performed and following this, all descendants of node $p''$ (Figure 3.4) or of node $q'$ (Figure 3.5) satisfy the $\beta$-property.
As a result, following the rotation (if any) performed when $gp$ becomes the root of $T'$, the restructured tree is $\beta$-balanced. The proof is by induction on $r$. When $r = 3$ (recall, we assume $z$ has a grandparent), $gp$ begins at the root of $T'$ and its descendants satisfy the $\beta$-property.
Without loss of generality, assume that the insertion took place in the left subtree of $gp$. With respect to Figure 3.4, we have three cases: (i) $q \ge c$ and $q > d$, (ii) $q < c$ and $c > d$, and (iii) $q \le d$ and $c \le d$. In case (i), all conditions for an LL rotation hold and such a rotation is performed. In case (ii), an LR rotation is performed. Following either rotation, $T'$ is $\beta$-balanced. In case (iii), $\beta(p'-1) = \beta(q+c) \le 2\beta d < d$ (as $\beta \le \sqrt{2}-1$). Also, $\beta(d-1) \le p < p + 1 = p'$. So, $\beta(d-1) < p'$. Hence, $\beta(p', d)$ and $T'$ is $\beta$-balanced.
For the induction hypothesis, assume (a) and (b) whenever $r \le k$. In the induction step, we show (a) and (b) for trees $T'$ with $r = k + 1$. The subtree in which the insertion is done has $r = k$. So, (a) and (b) hold for all $gp$ locations in the subtree. We need to show (a) and (b) only when $gp$ is at the root of $T'$. This follows from Lemmas 5, 6, 7, and 11.
The theorem now follows. □
Lemma 12 The time needed to do an insertion in an $n$ node $\beta$-BBST is $O(\log n)$ provided $0 < \beta \le \sqrt{2}-1$.
Proof Follows from the fact that insertion takes $O(h)$ time where $h$ is the tree height and $h = O(\log n)$ when $\beta > 0$ (Lemmas 1 and 3). □
3.4.3 Deletion
To delete element $x$ from a $\beta$-BBST, we first use the unbalanced binary search tree deletion algorithm of Horowitz and Sahni [14] to delete $x$ and then perform a series of rebalancing rotations. The steps are:
Step 1 [Locate x] Search the $\beta$-BBST for the node $y$ that contains $x$. If there is no such node, terminate.
Step 2 [Delete x] If $y$ is a leaf, set $d'$ to nil, $gp$ to the parent of $y$, and delete node $y$. If $y$ has exactly one child, set $d'$ to be this child; change the pointer from the parent (if any) of $y$ to point to the child of $y$; delete node $y$; set $gp$ to be the parent of $d'$. If $y$ has two children, find the node $z$ in the left subtree of $y$ that has largest value; move this value into node $y$; set $y = z$; go to the start of Step 2. { note that the new $y$ has either 0 or 1 child }
Step 3 [Rebalance] Retrace the path from $d'$ to the root performing rebalancing rotations.
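Steps 1 and 2 can be sketched as follows. This is a minimal illustration on a plain binary search tree with parent pointers, not the dissertation's C code; the `Node` class, the helper `insert`, and the choice to return the node at which Step 3 would begin retracing are assumptions made for the example. Step 3 (the rebalancing rotations) is deliberately omitted.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None

def insert(root, key):
    """Plain BST insert without rebalancing (helper for the example only)."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, Node(key, cur))
            return root
        cur = child

def delete(root, x):
    """Steps 1 and 2. Returns (root, gp); gp is where Step 3 would start."""
    y = root
    while y is not None and y.key != x:           # Step 1: locate x
        y = y.left if x < y.key else y.right
    if y is None:
        return root, None                         # x is not in the tree
    if y.left is not None and y.right is not None:
        z = y.left                                # largest key in left subtree
        while z.right is not None:
            z = z.right
        y.key = z.key
        y = z                                     # the new y has 0 or 1 child
    child = y.left if y.left is not None else y.right   # plays the role of d'
    gp = y.parent
    if child is not None:
        child.parent = gp
    if gp is None:
        root = child
    elif gp.left is y:
        gp.left = child
    else:
        gp.right = child
    return root, gp

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

For instance, after inserting 50, 30, 70, 20, 40, 60, 80 in that order, deleting 30 moves the key 20 into the node that held 30 and unlinks the old leaf, so retracing would begin at the node now holding 20.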
There are four rebalancing rotations LL, LR, RR, and RL. Since LL and RR as well as LR and RL are symmetric rotations, we describe LL and LR only. The discussion is very similar to the case of insertion. The differences in proofs are due to the fact that a deletion reduces the size of encountered subtrees by 1 while an insertion increases it by 1. In an LL rotation, the configuration just before and after the rotation is shown in Figure 3.8. This rotation is performed when $q \ge c$ and $q > d'$. Following the rotation, $d'$ is updated to the node $p'$.
(a) before (b) after
Figure 3.8. LL rotation for deletion
Let $d$ denote the size of the right subtree of $gp$ before the deletion. So, $d = d' + 1$. Since prior to the deletion the $\beta$-BBST was $\beta$-balanced, it follows that $\beta(p, d)$ and $\beta(q, c)$.
Lemma 13 [LL deletion lemma] If $[\beta(p, d) \wedge \beta(q, c) \wedge (q \ge c) \wedge (q > d') \wedge (1/3 \le \beta \le 1/2)]$ before the rotation, then $[\beta(q, gp') \wedge \beta(c, d')]$ after the rotation.
Proof (a) $\beta(q, gp')$:
$\beta(q-1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp'-1) = \beta(c+d') < 2\beta q$ (as $c \le q$ and $d' < q$) $\le q$ (as $\beta \le 1/2$). So, $\beta(q, gp')$.
(b) $\beta(c, d')$:
$d' < q \Rightarrow d' - 1 < q - 1 \Rightarrow \beta(d'-1) \le \beta(q-1) \le c$. Also, when $c \le 1$, $\beta(c-1) \le 0 \le d'$ (as $d' \ge 0$). When $c > 1$, $q \ge c \Rightarrow q \ge 2$ and $p = q + c + 1 \ge c + 3$. So, $\beta(c-1) \le \beta(p-1) - 3\beta \le d - 3\beta$ (as $\beta(p, d)$) $\le d - 1$ (as $\beta \ge 1/3$) $= d'$. Hence, $\beta(c, d')$. □
In an LR rotation, the before configuration is as in Figure 3.8(a). However, this time $q < c$. Figure 3.8(a) is redrawn in Figure 3.9(a). In this, the node labeled $c$ in Figure 3.8(a) has been relabeled $q$ and that labeled $q$ in Figure 3.8(a) has been relabeled $a$. With respect to the labelings of Figure 3.9(a), rotation LR is applied when
$[(q > a) \wedge (q > d')]$.
The other conditions that apply when an LR rotation is performed are
$[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c)]$.
Here $d$ denotes the (size of) right subtree of $gp$ prior to the deletion. As in the case of insertion, an LR rotation is accomplished in two substeps (or two subrotations). The first of these is shown in Figure 3.9. Following an LR rotation, $d'$ is updated to node $q'$.
Lemma 14 [LR substep(i) deletion lemma] If $[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (q > a) \wedge (q > d')]$ before the subrotation LR(i), then $[\beta(p', gp') \wedge \{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d'))\}]$ after the subrotation provided $1/3 \le \beta \le 1/2$.
Proof Assume the before condition.
(a) If $b = c = 0$, then $q = b + c + 1 = 1$. Furthermore, $(q > a)$ and $(q > d')$ imply $a = d' = 0$. So, $gp' = p' = 1$. Hence, $[\frac{1}{2}(p', gp') \wedge \frac{1}{2}(a, b) \wedge \frac{1}{2}(c, d')]$.
(b) If $b = 1$ and $c = 0$, then $q = 2$, $a \le 1$, and $d' \le 1$. So, $1 \le p' \le 3$ and $1 \le gp' \le 2$. Hence, $[\frac{1}{2}(p', gp') \wedge \frac{1}{2}(a, b) \wedge \frac{1}{2}(c, d')]$.
(c) If $b = 0$ and $c = 1$, then $q = 2$, $a \le 1$, and $d' \le 1$. So, $1 \le p' \le 2$ and $1 \le gp' \le 3$. Hence, $[\frac{1}{2}(p', gp') \wedge \frac{1}{2}(a, b) \wedge \frac{1}{2}(c, d')]$.
(a) before (b) after substep (i)
Figure 3.9. LR rotation for deletion
As a result of (a) - (c), to complete the proof, we may assume that $b \ge 1$ and $c \ge 1$. So, $q \ge 3$, $a \ge 1$ (as $\beta(a, q) \Rightarrow \beta(q-1) \le a$ or $a \ge 2\beta > 0$), $p = a + q + 1 \ge 5$, $d \ge 2$ (as $\beta(p, d) \Rightarrow \beta(p-1) \le d$ and $\beta \ge 1/3$), and $d' = d - 1 \ge 1$.
First, we show that $\beta(p', gp')$. For this, note that $a + b + c + 1 = p - 1$. From $\beta(p, d)$, it follows that $\beta(a + b + c + 1) = \beta(p-1) \le d$. So, $\beta(a+b) \le d - \beta c - \beta$. From Figure 3.9(b), we see that $\beta(p'-1) = \beta(a+b)$. Hence, $\beta(p'-1) \le d - \beta c - \beta = d' - \beta c + 1 - \beta \le d' + 1 - 2\beta < gp'$. Also,
$\beta(gp'-1) = \beta(c+d') \le b + \beta + \beta d'$ (as $\beta(b, c)$)
$< b + \beta q + \beta$ (as $q > d'$)
$\le b + a + 2\beta$ (as $\beta(a, q)$)
$\le p'$.
So, $\beta(p', gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b-1) \le a$.
To see this, note that $\beta(b-1) < \beta(q-1) \le a$ (as $\beta(a, q)$).
P2: $\beta(c-1) \le d'$.
For this, observe that $\beta(c-1) \le \beta(q-2)$ (as $c \le q-1$) $\le \beta(p-4)$ (as $q = p - a - 1$ and $a \ge 1$) $= \beta(p-1) - 3\beta \le d - 1$ (as $\beta(p, d)$ and $\beta \ge 1/3$) $= d'$.
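To complete the proof of the lemma, we need to show
$\{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d'))\}$.
For this, consider the two cases $b > c$ and $b \le c$ (as in Lemma 6).
Case $b > c$: Since $a < q = b + c + 1$, $a - 1 < b + c \le 2b - 1$. So, $\beta(a-1) < \beta(2b-1) < b$. This together with P1 implies $\beta(a, b)$. Also, $d' - 1 \le q - 2 = b + c - 1$. So, $\frac{\beta}{1+\beta}(d'-1) \le \frac{\beta}{1+\beta}(b + c - 1) \le \frac{\beta}{1+\beta}\left(\frac{c}{\beta} + c\right) = c$ (as $\beta(b, c)$). This, together with P2, implies $\frac{\beta}{1+\beta}(c, d')$. So, $\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')$.
Case $b \le c$: Since $a < q = b + c + 1$, $a - 1 < b + c$. So, $a - 1 \le b + c - 1$ or $\frac{\beta}{1+\beta}(a-1) \le \frac{\beta b}{1+\beta} + \frac{b}{1+\beta} = b$ (as $\beta(b, c)$). This and P1 imply $\frac{\beta}{1+\beta}(a, b)$. Also, $d' - 1 \le q - 2 = b + c - 1$. So, $\beta(d'-1) \le \beta(b + c - 1) \le \beta(2c - 1) < c$. This and P2 imply $\beta(c, d')$. Hence, $\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d')$. □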
The substep(ii) rotations are the same as for insertion.
Theorem 10 If $T$ is $\beta$-balanced, then following a deletion the resulting tree $T'$ is also $\beta$-balanced provided $1/3 \le \beta \le \sqrt{2}-1$.
Proof Similar to that of Theorem 9. □
When $0 < \beta < 1/3$, we need to augment the LL rotation by a transformation for the case $d' = 0$. When $d' = 0$, $\beta(p-1) \le d = d' + 1 = 1$. So, $p \le 1/\beta + 1$ and $gp = p + d' + 1 \le 1/\beta + 2$. To $\beta$-balance at $gp$, the at most $1/\beta + 2$ nodes in $gp$ are rearranged into any $\beta$-BBST in constant time (as $1/\beta + 2$ is a constant). When $d' > 0$, the proof of Lemma 13 part (b) can be changed to show $\beta(c-1) \le d'$ for $0 < \beta \le \sqrt{2}-1$. The new proof is: since $c \le q$, $c \le (p-1)/2$ and $\beta(c-1) \le \beta(p-1)/2 - \beta \le d/2 - \beta = d - d/2 - \beta \le d - 1 - \beta < d'$.
The LR rotation needs to be augmented by a transformation for the case $d' = d - 1 < \frac{1}{\beta(2+\beta)} - 1$. At this time, $\beta(p-1) \le d < \frac{1}{\beta(2+\beta)}$. So, $gp = p + d < \frac{1}{\beta^2(2+\beta)} + 1 + \frac{1}{\beta(2+\beta)}$. To $\beta$-balance at $gp$, we rearrange the fewer than $\frac{1}{\beta^2(2+\beta)} + 1 + \frac{1}{\beta(2+\beta)}$ nodes in this subtree, in constant time, into any $\beta$-balanced tree. When $d' \ge \frac{1}{\beta(2+\beta)} - 1$, the proof for $\beta(c-1) \le d'$ in Lemma 14 needs to be changed to show that the LR substep(i) lemma holds. The new proof is:
$d \ge \beta(p-1) = \beta(a + b + c + 1) \ge \beta(\beta(q-1) + b + c + 1)$
$= \beta(\beta(b+c) + b + c + 1)$
$\ge \beta((1+\beta)\beta(c-1) + (1+\beta)c + 1)$
$= \beta((1+\beta)^2(c-1) + 2 + \beta)$.
So, $\beta(c-1) \le \frac{d - \beta(2+\beta)}{(1+\beta)^2} \le d - 1$ (as $d \ge \frac{1}{\beta(2+\beta)}$) $= d'$.
Also, note that when $\beta = 0$, all trees are $\beta$-balanced so the rotations (while not needed) preserve balance.
Theorem 11 With the special handling of the case $d' = 0$, the tree $T'$ resulting from a deletion in a $\beta$-BBST is also $\beta$-balanced for $0 \le \beta \le \sqrt{2}-1$.
Lemma 15 The time needed to delete an element from an $n$ node $\beta$-BBST is $O(\log n)$ provided $0 < \beta \le \sqrt{2}-1$.
3.4.4 Enhancements
Since our objective is to create search trees with minimum search cost, the rebalancing rotations may be performed at each positioning of $gp$ during the backward restructuring pass so long as the conditions for the rotation apply, rather than only at $gp$ positions where the tree is unbalanced.
Consider Figure 3.4(a). If $p' < d$, then the conditions of Lemmas 5 and 6 cannot apply as $q < p' < d$. However, it is possible that $e > p'$ where $e$ is the size of either the left or right subtree of $d$. In this case, an RR or RL rotation would reduce the total search cost. The proofs of Lemmas 5 and 6 are easily extended to show that these rotations would preserve balance even though no insertion was done in the subtree $d$. The same observation applies to deletion. Hence the backward restructuring pass for the insert and delete operations can determine the need for a rotation at each $gp$ location as below ($l$ and $r$ are, respectively, the left and right children of $gp$).
if s(l) > s(r) then check conditions for an LL and LR rotation
else check conditions for an RR and RL rotation.
The enhanced restructuring procedure used for insertion and deletion is given in Figure 3.10. In the RR and RL cases, we have used the relation '≥' rather than '>' as this results in better observed run time.
Since it can be shown that the rotations preserve balance even when there has been no insert or delete, we may check the rotation conditions during a search operation and perform rotations when these improve total search cost.
Finally, we note that it is possible to use other definitions of $\beta$-balance. For example, we could require $\beta(s(a) - 2) \le s(b)$ and $\beta(s(b) - 2) \le s(a)$ for $\beta(a, b)$. One can show that the development of this chapter applies to these modifications also. Furthermore, when this new definition is used, the number of comparisons in the second substep of the LR and RL rotations is reduced by one.
3.4.5 Top Down Algorithms
As in the case of red/black and WB($\alpha$) trees, it is possible to perform, in $O(\log n)$ time, inserts and deletes using a single top to bottom pass. The algorithms are similar to those already presented.
procedure Restructuring ;
begin
  while (gp) do
  begin
    if (s(gp.left) > s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left ;
      if (s(p.left) > s(p.right)) then
      begin if (s(p.left) > s(gp.right)) then do LL rotation; end
      else
      begin
        if (s(p.right) > s(gp.right)) then {LR}
        begin
          do LR rotation ;
          { now notations a, b, c, and d follow from figure 3.1(b) }
          if (β(s(a) − 1) > s(b)) then
            if ((s(a.right) ≤ (1 + β)s(a.left) + 1 − β) and
                (s(b) < s(a.left))) then
              do LL rotation
            else do LR rotation
          else if (β(s(d) − 1) > s(c)) then
            if ((s(d.left) ≤ (1 + β)s(d.right) + 1 − β) and
                (s(c) < s(d.right))) then
              do RR rotation
            else do RL rotation ;
        end
      end
    end
    else {check conditions for an RR and RL rotation}
    begin
      p = gp.right ;
      if (s(p.left) > s(p.right)) then
      begin
        if (s(p.left) > s(gp.left)) then {RL}
          do symmetric to the above LR case ;
      end
      else
      begin if (s(p.right) > s(gp.left)) then do RR rotation; end ;
    end ;
    gp = gp.parent ;
  end ;
end ;
Figure 3.10. Restructuring procedure
3.5 Simple $\beta$-BBSTs
The development of Section 3.4 was motivated by our desire to construct trees with minimal search cost. If instead, we desire only logarithmic performance per operation, we may simplify the restructuring pass so that rotations are performed only at nodes where the $\beta$-balance property is violated. In this case, we may dispense with the LL/RR rotations and the first substep of an LR/RL rotation. Only LR/RL substep (ii) rotations are needed. To see this, observe that Lemmas 7 and 11 show that the second substep rotations rebalance at $gp$ (see Figures 3.6 and 3.7) provided $\frac{\beta}{1+\beta}(p, d)$. (The remaining conditions are ensured by the bottom-up nature of restructuring and the fact that the tree was $\beta$-balanced prior to the insert or delete.)
If the operation that resulted in loss of balance at $gp$ was an insert, then $\beta(p-2) \le d$ (as $p > d$, the insert took place in subtree $p$ and $gp$ was $\beta$-balanced prior to the insert) and $\beta(p-1) > d$ ($gp$ is not $\beta$-balanced following the insert). For the substep (ii) rotation to restore balance, we need $\beta(p-1) \le (1+\beta)d$. This is assured if $d + \beta \le (\beta+1)d$ (as $\beta(p-2) \le d$). So, we need $d \ge 1$. If $d < 1$, then $d = 0$. Now $\beta(p-2) \le d$ and $\beta(p-1) > d$ imply $p = 2$. One may verify that when $p = 2$, the LR(ii) rotations restore balance.
If the loss of $\beta$-balance at $gp$ is the result of a deletion (say from its right subtree), then $\beta(p-1) \le d + 1$ (as $gp$ was $\beta$-balanced prior to the delete). For the substep (ii) rotation to accomplish the rebalancing, we need $\beta(p-1) \le (\beta+1)d$. This is guaranteed if $d + 1 \le (\beta+1)d$ or $d \ge 1/\beta$. When $d < 1/\beta$ and $\beta \ge 1/3$, $d \le 2$. Since $\beta(p-1) \le d + 1$ and $\beta \ge 1/3$, when $d = 2$, $p \le 10$; when $d = 1$, $p \le 7$; and when $d = 0$, $p \le 4$. We may verify that for all these cases, the LR(ii) rotations restore balance. Hence, the only problematic case is when $\beta < 1/3$ and $d < 1/\beta$.
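The three size bounds quoted above follow from $\beta(p-1) \le d + 1$ at the smallest permitted value $\beta = 1/3$, which gives the largest $p$. A short check, with an illustrative helper name, is:

```python
import math
from fractions import Fraction

def max_p(d, beta):
    """Largest integer p satisfying beta * (p - 1) <= d + 1."""
    return math.floor((d + 1) / beta) + 1

# beta = 1/3 is the worst case for beta >= 1/3 (smaller beta allows larger p).
bounds = {d: max_p(d, Fraction(1, 3)) for d in (0, 1, 2)}
```

Here `bounds` maps $d = 0, 1, 2$ to $4, 7, 10$, so only finitely many small configurations remain to be verified by hand.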
When $\beta < 1/3$, an LL rotation fails to restore balance only when $d = 0$ (see discussion following Theorem 10). So we need to rearrange the at most $1/\beta + 2$ nodes in $gp$ into any $\beta$-balanced tree when $d = 0$. An LR rotation fails only when $d < \frac{1}{\beta(2+\beta)} - 1$. To see this, note that in the terminology of Lemma 14, $d$ is $d'$. The proof of P2 is extended to the case $\beta < 1/3$ when $d' \ge \frac{1}{\beta(2+\beta)} - 1$. Also, since $d' < 1/\beta$, for the case $b > c$, we get $\beta(d'-1) < 1 - \beta < c$ (as $c \ge 1$). For the case $b \le c$, we need to show $\beta(a-1) \le b$. Since an LR rotation is done only when condition $D_1 \vee D_2$ holds, from Lemmas 10 and 11, it follows that $\beta(a-1) \le b$. So, an LR rotation rebalances when $\beta < 1/3$ provided $d \ge \frac{1}{\beta(2+\beta)} - 1$. For smaller $d$, the at most $\frac{1}{\beta^2(2+\beta)} + \frac{1}{\beta(2+\beta)} + 1$ nodes in the subtree $gp$ may be directly rearranged into a $\beta$-balanced tree.
The restructuring algorithm for simple $\beta$-BBSTs is given in Figures 3.11 and 3.12. The algorithm of Figure 3.11 is used following an insert and that of Figure 3.12 after a delete.
procedure Restructuring2 ;
begin
  while (gp) do
  begin
    if (β(s(gp.left) − 1) > s(gp.right)) then {do an LL or LR rotation}
    begin
      p = gp.left ;
      if ((s(p.right) ≤ (1 + β)s(p.left) + 1 − β) and
          (s(gp.right) < s(p.left))) then
        do LL rotation
      else do LR rotation ;
    end
    else
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 3.11. Simple restructuring procedure for insertion
procedure Restructuring3 ;
begin
  while (gp) do
  begin
    if (β(s(gp.left) − 1) > s(gp.right)) then
    begin
      if (β < 1/3) and (s(gp.right) < 1/(β(2 + β)) − 1) then
        rearrange the subtree rooted at gp into any β-balanced tree
      else {do an LL or LR rotation}
      begin
        p = gp.left ;
        if ((s(p.right) ≤ (1 + β)s(p.left) + 1 − β) and
            (s(gp.right) < s(p.left))) then
          do LL rotation
        else do LR rotation ;
      end
    end
    else
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 3.12. Simple restructuring procedure for deletion
Simple $\beta$-BBSTs are expected to have higher search cost than the $\beta$-BBSTs of Section 3.4. However, they are a good alternative to traditional WB($\alpha$) trees as they are expected to be "better balanced". To see this, note that from the proof of Lemma 3, the balance, $B(p)$, at any node $p$ in a $\beta$-balanced tree satisfies
$\frac{1}{B(p)} = 1 + \frac{s(r)+1}{s(l)+1} \ge 1 + \frac{1}{\frac{1}{\beta} + \frac{2\beta-1}{\beta(s(r)+1)}} = \frac{1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(r)+1)}}{\frac{1}{\beta} + \frac{2\beta-1}{\beta(s(r)+1)}}$.
So,
$B(p) \le 1 - \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(r)+1)}}$.
Also, since $s(r) - 1 \le s(l)/\beta$, $s(r) + 1 \le s(l)/\beta + 2$. Hence, $1 + \frac{s(r)+1}{s(l)+1} \le 1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(l)+1)}$.
So,
$B(p) \ge \frac{1}{1 + \frac{1}{\beta} - \frac{1}{\beta(s(l)+1)} + \frac{2}{s(l)+1}} = \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(l)+1)}}$.
Consequently,
$\frac{1}{1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(l)+1)}} \le B(p) \le 1 - \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta-1}{\beta(s(r)+1)}}$.
When $\beta = \sqrt{2}-1$,
$\frac{1}{2 + \sqrt{2} + \frac{1-\sqrt{2}}{s(l)+1}} \le B(p) \le 1 - \frac{1}{2 + \sqrt{2} + \frac{1-\sqrt{2}}{s(r)+1}}$.
If $s(p) \le 10$, $0.296 \le B(p) \le 1 - 0.296$. So, every $\beta$-balanced subtree with 10 or fewer nodes is in WB($\alpha$) for $\alpha \approx 0.296$. Similarly, every subtree with 100 or fewer nodes is in WB($\alpha$) for $\alpha \approx 0.293$. In fact, for every fixed $k$, subtrees of size $k$ or less
procedure Restructuring4 ;
begin
  while (gp) do
  begin
    if (s(gp.left) > s(gp.right)) then
    begin {check conditions for an LL and LR rotation}
      p = gp.left ;
      if (s(p.left) > s(p.right)) and (s(p.left) > s(gp.right)) then
        do LL rotation
      else if (s(p.left) < s(p.right)) and (s(p.right) > s(gp.right)) then
        do LR rotation ;
    end
    else {check conditions for an RR and RL rotation}
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 3.13. Simple restructuring procedure without a $\beta$ value
are in WB($\alpha$) for $\alpha$ slightly higher than 0.2929, which is the largest value of $\alpha$ for which WB($\alpha$) trees can be maintained.
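The numerical values 0.296 and 0.293 can be reproduced from the bounds on $B(p)$. The sketch below evaluates the two-sided bound at $\beta = \sqrt{2}-1$ for the largest child size that a subtree of 10 or 100 nodes can have; the function name is illustrative, and floating-point arithmetic is used, so the figures are approximate.

```python
import math

def balance_bounds(size_left, size_right, beta):
    """Lower and upper bounds on B(p) for a beta-balanced node whose
    children have the given sizes."""
    x_left = 1 + 1 / beta + (2 * beta - 1) / (beta * (size_left + 1))
    x_right = 1 + 1 / beta + (2 * beta - 1) / (beta * (size_right + 1))
    return 1 / x_left, 1 - 1 / x_right

beta = math.sqrt(2) - 1
# A subtree with at most 10 (100) nodes has children of size at most 9 (99).
low10, high10 = balance_bounds(9, 9, beta)
low100, high100 = balance_bounds(99, 99, beta)
limit = 1 - 1 / math.sqrt(2)          # about 0.2929, approached as sizes grow
```

The lower bound decreases toward `limit` as the subtree size grows, which matches the statement that small subtrees are in WB($\alpha$) for $\alpha$ slightly above 0.2929.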
3.6 BBSTs without Deletion
In some applications of a dictionary, we need to support only the insert and search operations. In these applications, we can construct binary search trees with total cost
$C(T) \le n \log_\phi(\sqrt{5}(n+1))$
by using the simpler restructuring algorithm of Figure 3.13.
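A minimal sketch of insertion followed by the restructuring pass of Figure 3.13 is given below. It is an illustration under assumptions, not the dissertation's C implementation: nodes store a `size` field and a parent pointer, the comparisons follow the figure as printed (strict in both the LL and LR tests), and the names `Node`, `s`, `rotate`, `restructure`, and `insert` are hypothetical.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key, self.parent = key, parent
        self.left = self.right = None
        self.size = 1                       # number of nodes in this subtree

def s(t):
    return t.size if t else 0

def rotate(x):
    """Rotate x above its parent, repair sizes, and return x."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g:
        if g.left is p:
            g.left = x
        else:
            g.right = x
    p.size = s(p.left) + s(p.right) + 1
    x.size = s(x.left) + s(x.right) + 1
    return x

def restructure(gp):
    """Backward pass from gp to the root; returns the new root (or None)."""
    top = gp
    while gp:
        if s(gp.left) > s(gp.right):
            p = gp.left
            if s(p.left) > s(p.right) and s(p.left) > s(gp.right):
                gp = rotate(p)                      # LL
            elif s(p.left) < s(p.right) and s(p.right) > s(gp.right):
                gp = rotate(rotate(p.right))        # LR as two single rotations
        elif s(gp.right) > s(gp.left):
            p = gp.right
            if s(p.right) > s(p.left) and s(p.right) > s(gp.left):
                gp = rotate(p)                      # RR
            elif s(p.right) < s(p.left) and s(p.left) > s(gp.left):
                gp = rotate(rotate(p.left))         # RL
        top, gp = gp, gp.parent
    return top

def insert(root, key):
    if root is None:
        return Node(key)
    cur = root
    while True:
        cur.size += 1                               # key will land below cur
        nxt = cur.left if key < cur.key else cur.right
        if nxt is None:
            break
        cur = nxt
    z = Node(key, cur)
    if key < cur.key:
        cur.left = z
    else:
        cur.right = z
    return restructure(cur.parent) or root          # gp starts at z's grandparent

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Because each rotation preserves the inorder sequence and the size of the rotated subtree, only the two or three nodes involved need their `size` fields recomputed; ancestors are unaffected.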
Theorem 12 When the only operations are search and insert and restructuring is done as in Figure 3.13, $C(T) \le n \log_\phi(\sqrt{5}(n+1))$.
Proof Suppose $T$ currently has $m - 1$ elements and a new element is inserted. Let $u$ be the level at which the new element is inserted. Suppose that the restructuring pass performs rotations at $q \le u$ of the nodes on the path from the root to the newly inserted node. Then $C(T)$ increases by at most $v = u - q$ as a result of the insertion. The number of nodes on the path from the root to the newly inserted node at which no rotation is performed is also $v$. Let these nodes be numbered 1 through $v$ bottom to top. Let $S_i$ denote the number of elements in the subtree with root $i$ prior to the restructuring pass. We see that $S_1 \ge 1$ and $S_2 \ge 2$. For node $i$, $2 < i \le v$, one of its subtrees contains node $i - 1$. Without loss of generality, let this be the left subtree of $i$. Let the root of the right subtree of $i$ be $d$. So,
$S_i \ge S_{i-1} + s(d) + 1$.
If $i - 1$ is not the left child of $i$, then since no rotation is done at $i$, $s(d) \ge S_{i-1}$. If $i - 1$ is the left child of $i$, then consider node $i - 2$. This is in one of the subtrees of $i$. Since no rotation is performed at $i - 1$, $s(d) \ge S_{i-2}$. Since $S_{i-1} > S_{i-2}$, we get
$S_i \ge S_{i-1} + S_{i-2} + 1$.
Hence, $S_v \ge N_v$ where $N_v$ is the minimum number of elements in a COST of height $v$. So, $v \le \log_\phi(\sqrt{5}(m+1))$. So, when an element is inserted into a tree that has $m - 1$ elements, its cost $C(T)$ increases by at most $\log_\phi(\sqrt{5}(m+1))$. Starting with an empty tree and inserting $n$ elements results in a tree whose cost is at most $n \log_\phi(\sqrt{5}(n+1))$. □
Corollary 2 The expected cost of a search or insert in a BBST constructed as above is $O(\log n)$.
Proof Since $C(T) \le n \log_\phi(\sqrt{5}(n+1))$, the expected search cost is $C(T)/n \le \log_\phi(\sqrt{5}(n+1))$. The cost of an insert is the same order as that of a search as each insert follows the corresponding search path twice (top down and bottom up). □
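The step from the recurrence to the logarithmic bound can be checked numerically. The sketch below assumes the boundary values $N_1 = 1$ and $N_2 = 2$ suggested by the proof of Theorem 12 and takes $\phi$ to be the golden ratio; the helper names are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2                 # golden ratio

def min_sizes(v_max):
    """N_1 .. N_vmax from N_i = N_{i-1} + N_{i-2} + 1 with N_1 = 1, N_2 = 2."""
    n = [1, 2]
    while len(n) < v_max:
        n.append(n[-1] + n[-2] + 1)
    return n[:v_max]

def path_bound(m):
    """log_phi(sqrt(5) * (m + 1)), the bound on v for a subtree of m elements."""
    return math.log(math.sqrt(5) * (m + 1), PHI)

sizes = min_sizes(30)
# For every v, a subtree with N_v elements admits a rotation-free path of
# length v, and v should not exceed the bound.
bound_holds = all(v <= path_bound(n_v) for v, n_v in enumerate(sizes, start=1))
```

Since $N_v + 1$ is a Fibonacci number, $\sqrt{5}(N_v + 1)$ is close to $\phi^{v+2}$, which is why the bound holds with some slack.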
3.7 Experimental Results
For comparison purposes, we wrote C programs for BBSTs, SBBSTs (simple BBSTs), BBSTDs (BBSTs in which procedure Restructuring4 (Figure 3.13) is used to restructure following inserts as well as deletes), unbalanced binary search trees (BST), AVL-trees, top-down red-black trees (RBT), bottom-up red-black trees (RBB) [31], weight balanced trees (WB), deterministic skip lists (DSL), treaps (TRP), and skip lists (SKIP). For the BBST and SBBST structures, we used $\beta = 207/500$ while for the WB structure, we used $\alpha = 207/707$. While these are not the highest permissible values of $\beta$ and $\alpha$, this choice permitted us to use integer arithmetic rather than the substantially more expensive real arithmetic. For instance, $\beta(a, b)$ for $\beta = 207/500$ can be checked using the comparisons $207(s(a) - 1) > 500s(b)$ and $207(s(b) - 1) > 500s(a)$; the property holds when neither comparison succeeds. The randomized structures TRP and SKIP used the same random number generator with the same seed. SKIP was programmed with probability value $p = 1/4$ as in Pugh [26].
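The integer-only balance test described above can be sketched as follows. The constants come from the text; the function names are illustrative, and Python is used here only for exposition (the experiments themselves used C).

```python
BETA_NUM, BETA_DEN = 207, 500        # beta = 207/500, slightly below sqrt(2) - 1

def violates(size_a, size_b):
    """True when beta * (s(a) - 1) > s(b), using integers only."""
    return BETA_NUM * (size_a - 1) > BETA_DEN * size_b

def beta_balanced(size_a, size_b):
    """beta(a, b) holds when neither one-sided comparison succeeds."""
    return not violates(size_a, size_b) and not violates(size_b, size_a)
```

Cross-multiplying by the denominator keeps every operand an integer, so no division or floating-point comparison is needed in the inner loop of the restructuring pass.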
To minimize the impact of system call overheads on run time measurements, we programmed all structures using simulated pointers (i.e., an array of nodes with integer pointers) [27]. Skip lists use variable size nodes. This requires more complex storage management than required by the remaining structures, which use nodes of the same size. For our experiments, we implemented skip lists using fixed size nodes, each node being of the maximum size. As a result, our run times for skip lists are smaller than if a space efficient implementation had been used. In all our tree structure implementations, null pointers were replaced by a pointer to a tail node whose data field could be set to the search/insert/delete key and thus avoid checking for falling off the tree. Similar tail pointers are part of the defined structure of skip and deterministic skip lists. Each tree also had a head node. WB($\alpha$) trees were implemented with a bottom-up restructuring pass. Our codes for SKIP and DSL are based on the codes of Pugh [26] and Papadakis [22], respectively. Our AVL and RBT codes are based on those of Papadakis [22] and Sedgewick [28]. The treap structure was implemented using joins and splits rather than rotations. This results in better performance. Furthermore, AVL, RBB, WB, and BBST were implemented with parent pointers in addition to left and right child pointers. For BBSTs, the enhancements described in Section 3.4.4 for insert and delete (see Figure 3.10) were employed. No rotations were performed during a search when using any of the structures.
For our experiments, we tried two versions of the code. These varied in the order in which the 'equality' and 'less than' or 'greater than' check between $x$ and $e$ (where $x$ is the key being searched/inserted/deleted and $e$ is the key in the current node) is done. In version 1, we conducted an initial experiment to determine if the total comparison count is less using the order L:
if x < e then move to left child
else if x ≠ e then move to right child
else found
or the order R:
if x > e then move to right child
else if x ≠ e then move to left child
else found.
Our experiment indicated that doing the 'left child' check first (i.e., order L) worked better for AVL, BBST, BBSTD, and DSL structures while R worked better for the RBT, RBB, WB, SBBST, and TRP structures. No significant difference between L and R was observed for BSTs. For skip lists, we do not have the flexibility to change the comparison order. The version 1 codes performed the comparisons in the order determined to be better. For BSTs, the order R was used.
In the version 2 codes the comparisons in each node took the standard form
if x = e then found
else if x < e then move to left child
else move to right child.
The version 2 restructuring code for BBSTs differed from that of Figure 3.10 in that the '>' test in the second, third, and fourth if statements was changed to '≥'. No change was made in the corresponding if statements for RR and RL rotations. While this increased the number of comparisons, it reduced the run time.
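The effect of the comparison order on the comparison count can be illustrated by charging each node on a search path according to the three orders described above. This is a sketch of the counting only, under the assumption that each `if` test costs one key comparison; the path encoding ('L' for a move left, 'R' for a move right, 'F' for found) and the function name are hypothetical.

```python
# Key comparisons charged per node outcome under each ordering.
COST = {
    "order_L":   {"L": 1, "R": 2, "F": 2},   # test x < e first
    "order_R":   {"L": 2, "R": 1, "F": 2},   # test x > e first
    "version_2": {"L": 2, "R": 2, "F": 1},   # test x = e first
}

def comparisons(path, order):
    """Total key comparisons along a search path under the given ordering."""
    return sum(COST[order][step] for step in path)

path = ["L", "R", "L", "F"]                   # a successful search at depth 4
counts = {order: comparisons(path, order) for order in COST}
```

For this path the counts are 6, 7, and 7, which shows why the better of L and R depends on how often searches branch left versus right in a given structure.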
We experimented with $n$ = 10,000, 50,000, 100,000, and 200,000. For each $n$, the following experiments were conducted:
(a) start with an empty structure and perform $n$ inserts;
(b) search for each item in the resulting structure once; items are searched for in the order they were inserted;
(c) perform an alternating sequence of $n$ inserts and $n$ deletes; in this, the $n$ elements inserted in (a) are deleted in the order they were inserted and $n$ new elements are inserted;
(d) search for each of the remaining $n$ elements in the order they were inserted;
(e) delete the $n$ elements in the order they were inserted.
For each $n$, the above five part experiment was repeated ten times using different random permutations of distinct elements. For each permutation, we measured the total number of element comparisons performed and then averaged these over the ten permutations.
First, we report on the relative performance of SBBSTs, BBSTDs, and BBSTs. For this comparison, we used only version 1 of the code. Table 3.1 gives the average number of key comparisons performed for each of the five parts of the experiment. The three versions of our proposed data structure are very competitive on this measure. BBSTDs and BBSTs generally performed fewer comparisons than did SBBSTs. All three structures had a comparison count within 2% of one another.
Table 3.1. The number of key comparisons on random inputs (version 1 code)
n         operation   SBBST      BBSTD      BBST
10,000    insert      212495     212223     212111
          search      194661     191599     191578
          ins/del     416125     416967     416862
          search      194957     191666     191676
          delete      168033     166441     166487
50,000    insert      1241080    1236682    1236114
          search      1152137    1135131    1134969
          ins/del     2437918    2438083    2437639
          search      1153821    1134277    1134062
          delete      1018675    1007766    1007688
100,000   insert      2635913    2624829    2623792
          search      2458079    2423988    2423613
          ins/del     5183619    5180383    5179653
          search      2461221    2420282    2419990
          delete      2190798    2168049    2168110
200,000   insert      5580139    5555190    5553256
          search      5223989    5148220    5147698
          ins/del     10981441   10969578   10968053
          search      5229172    5144808    5144148
          delete      4692447    4641349    4641389
However, when we used ordered data rather than random data (Table 3.2), SBBSTs performed noticeably worse than BBSTDs and BBSTs; the latter two remained very competitive.
Tables 3.3 and 3.4 give the average heights of the trees using random data and using ordered data, respectively. The first number gives the height following part (a) of the experiment and the second following part (c). The numbers are identical for BBSTDs and BBSTs and slightly higher (lower) for SBBSTs using random (ordered) data.
Table 3.2. The number of key comparisons on ordered inputs (version 1 code)
n         operation   SBBST      BBSTD      BBST
10,000    insert      170182     150554     150554
          search      188722     185530     185530
          ins/del     425305     315177     314998
          search      191681     184155     184155
          delete      215214     135311     135131
50,000    insert      991526     872967     872967
          search      1117174    1101481    1101481
          ins/del     2472808    1806346    1805439
          search      1116390    1098065    1098065
          delete      1277756    792717     791815
100,000   insert      2103808    1850548    1850548
          search      2384327    2354757    2354757
          ins/del     5249194    3823415    3821594
          search      2382759    2346118    2346128
          delete      2738294    1686397    1684584
200,000   insert      4449143    3903083    3903083
          search      5068632    4946753    4946753
          ins/del     11105525   8051695    8048058
          search      5065496    5001967    5001967
          delete      5842168    3580856    3577223
Table 3.3. Height of the trees on random inputs (version 1 code)

n           SBBST   BBSTD    BBST
10,000      17,17   16,16   16,16
50,000      20,20   19,19   19,19
100,000     21,21   20,20   20,20
200,000     22,23   21,21   21,21
Table 3.4. Height of the trees on ordered inputs (version 1 code)

n           SBBST   BBSTD    BBST
10,000      16,15   17,17   17,17
50,000      20,20   20,20   20,20
100,000     21,21   21,21   21,21
200,000     22,22   23,22   23,22
The average number of rotations performed by each of the three structures is
given in Tables 3.5 and 3.6. A single rotation (i.e., LL or RR) is denoted 'S' and a
double rotation (i.e., LR or RL) is denoted 'D'. In the case of BBSTs, double rotations
have been divided into three categories: D = LR and RL rotations that do not perform
a second substep rotation; DS = LR and RL rotations with a second substep rotation
of type LL and RR; DD = LR and RL rotations with a second substep rotation of
type LR and RL. BBSTDs and BBSTs performed a comparable number of rotations
on both data sets. However, on random data SBBSTs performed about half as many
rotations as did BBSTDs and BBSTs. On ordered data, SBBSTs performed 15 to
20% fewer rotations on part (a), 34% fewer on part (c), and 51% fewer on part (e).
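To pin down the rotation terminology used in these counts, the following is a minimal Python sketch of the single (LL, RR) and double (LR, RL) rotations on a binary search tree. It is an illustration only, not the code used in the experiments: the `Node` class and the function names are assumptions introduced here, and both the balance bookkeeping and the BBST "second substep" rotations (DS, DD) are omitted.

```python
class Node:
    """Hypothetical binary search tree node (not from the dissertation)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_ll(p):
    """Single LL rotation: the left child of p becomes the subtree root."""
    q = p.left
    p.left = q.right
    q.right = p
    return q


def rotate_rr(p):
    """Single RR rotation: the right child of p becomes the subtree root."""
    q = p.right
    p.right = q.left
    q.left = p
    return q


def rotate_lr(p):
    """Double LR rotation: RR at p.left, then LL at p.
    The right child of p's left child ends up as the subtree root."""
    p.left = rotate_rr(p.left)
    return rotate_ll(p)


def rotate_rl(p):
    """Double RL rotation: LL at p.right, then RR at p."""
    p.right = rotate_ll(p.right)
    return rotate_rr(p)


def inorder(t):
    """Keys in symmetric order; rotations must leave this unchanged."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


# Left-heavy "zig-zag" shape: 3 -> left 1 -> right 2.
root = rotate_lr(Node(3, Node(1, None, Node(2))))
print(root.key, inorder(root))
```

Each function returns the new subtree root, so a caller would reattach it to the parent. A double rotation is counted once in the tables even though it is composed of two single rotations here.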
The runtime performance of the structures is significantly influenced by compiler
and architectural features as well as the complexity of a key comparison. The
results we report are from a SUN SPARC5 using the UNIX C compiler cc with
optimization option. Because of instruction pipelining features, cache replacement
policies, etc., the measured run times are not always consistent with the compiler
and architecture independent metrics reported in Tables 3.1 through 3.6 and later
in Tables 3.11 through 3.16. For example, since the search codes for all tree based
methods are essentially identical, we would expect methods with a smaller comparison
count to have a smaller run time for parts (b) and (d) of the experiment. This
was not always the case.
Tables 3.7 and 3.8 give the run times of the three BBST structures using integer
keys and Tables 3.9 and 3.10 do this for the case of real (i.e., floating point) keys. The
Table 3.5. The number of rotations on random inputs (version 1 code)

                       SBBST           BBSTD                     BBST
n        operation       S       D       S       D       S       D      DS      DD
10,000   insert       2341    2220    5045    4314    5025    3938     151      93
         ins/del      4269    3216   10158    6311   10104    5849     232     103
         delete       1607    1110    5235    2104    5201    2018      51      28
50,000   insert      11719   11120   25216   21596   25059   19732     754     455
         ins/del     21330   16125   51238   31499   50979   29198    1161     531
         delete       8058    5648   26214   10462   26068   10033     248     131
100,000  insert      23450   22262   50283   43230   50047   39461    1527     920
         ins/del     42780   32203  102218   62967  101836   58491    2275    1046
         delete      16095   11306   52227   21022   51943   20147     496     260
200,000  insert      46934   44525  100664   86605  100205   79013    3054    1840
         ins/del     85283   64417  204459  125960  203568  116940    4593    2059
         delete      32233   22551  104344   41884  103826   40157     990     523
Table 3.6. The number of rotations on ordered inputs (version 1 code)

                       SBBST           BBSTD                     BBST
n        operation       S       D       S       D       S       D      DS      DD
10,000   insert       9984       0    9985    2387    9985    2387       0       0
         ins/del     14997       0   16567    6130   16644    5797      25     154
         delete       4989       0    6570    3726    6647    3392      26     154
50,000   insert      49980       0   49983   11956   49983   11956       0       0
         ins/del     74996       0   82862   30659   83247   28982     137     770
         delete      24987       0   32859   18686   33242   17018     136     766
100,000  insert      99979       0   99983   23917   99983   23917       0       0
         ins/del    149996       0  165738   61327  166504   57969     280    1540
         delete      49986       0   65733   37392   66505   34040     278    1536
200,000  insert     199978       0  199982   47839  199982   47839       0       0
         ins/del    299996       0  331473  122653  333012  115938     559    3078
         delete      99985       0  131478   74795  133016   68086     557    3076
sum of the run time for parts (a) through (e) of the experiment is graphed in Figure 3.14.
For random data, SBBSTs significantly and consistently outperformed BBSTDs and
BBSTs. On ordered data, however, BBSTDs were slightly faster than BBSTs and
both were significantly faster than SBBSTs.
Since BBSTs generated trees with the least search cost, we expect BBSTs to
outperform SBBSTs and BBSTDs in applications where the comparison cost is very
high relative to that of other operations and searches are done with a much higher
frequency than inserts and deletes. However, with the mix of operations used in
our tests, SBBSTs are the clear choice for random inputs and BBSTDs for ordered
inputs.
In comparing with the other structures, our tables repeat the data for BBSTs.
The reader may make the comparison with SBBSTs and BBSTDs.
Table 3.7. Run time on random inputs using integer keys (version 1 code)

n        operation  SBBST  BBSTD   BBST
10,000   insert      0.27   0.30   0.34
         search      0.06   0.06   0.07
         ins/del     0.57   0.62   0.70
         search      0.06   0.06   0.06
         delete      0.22   0.25   0.26
50,000   insert      1.48   1.61   1.75
         search      0.35   0.36   0.37
         ins/del     2.90   3.47   3.84
         search      0.36   0.38   0.39
         delete      1.13   1.47   1.62
100,000  insert      3.00   3.57   3.80
         search      0.78   0.83   0.84
         ins/del     6.28   7.78   8.41
         search      0.83   0.87   0.88
         delete      2.54   3.31   3.58
200,000  insert      6.56   7.74   8.37
         search      1.80   1.89   1.89
         ins/del    13.89  17.32  18.57
         search      1.86   1.98   1.98
         delete      5.64   7.41   8.02

Time unit: sec
Table 3.8. Run time on ordered inputs using integer keys (version 1 code)

n        operation  SBBST  BBSTD   BBST
10,000   insert      0.32   0.20   0.27
         search      0.05   0.03   0.05
         ins/del     0.58   0.43   0.57
         search      0.07   0.03   0.03
         delete      0.20   0.17   0.23
50,000   insert      1.38   1.20   1.10
         search      0.25   0.20   0.20
         ins/del     2.63   2.18   2.40
         search      0.25   0.20   0.20
         delete      0.95   0.92   1.05
100,000  insert      3.43   2.23   2.53
         search      0.72   0.45   0.42
         ins/del     5.97   4.70   5.13
         search      0.55   0.47   0.42
         delete      2.10   1.98   2.15
200,000  insert      6.65   4.95   5.25
         search      1.20   0.92   0.90
         ins/del    13.13  10.23  10.88
         search      1.17   0.90   0.90
         delete      4.63   4.25   4.58

Time unit: sec
Table 3.9. Run time on random real inputs (version 1 code)

n        operation  SBBST  BBSTD   BBST
10,000   insert      0.23   0.34   0.36
         search      0.07   0.10   0.10
         ins/del     0.44   0.75   0.79
         search      0.08   0.10   0.10
         delete      0.17   0.29   0.30
50,000   insert      1.43   1.76   1.93
         search      0.47   0.53   0.52
         ins/del     2.76   3.89   4.22
         search      0.50   0.54   0.55
         delete      1.13   1.62   1.76
100,000  insert      2.96   3.94   4.36
         search      1.08   1.17   1.16
         ins/del     6.11   8.58   9.30
         search      1.12   1.20   1.22
         delete      2.50   3.66   3.95
200,000  insert      6.85   8.92   9.33
         search      2.41   2.58   2.57
         ins/del    13.86  19.49  20.46
         search      2.49   2.69   2.66
         delete      5.61   8.25   8.80

Time unit: sec
Table 3.10. Run time on ordered real inputs (version 1 code)

n        operation  SBBST  BBSTD   BBST
10,000   insert      0.27   0.23   0.20
         search      0.08   0.07   0.07
         ins/del     0.53   0.50   0.43
         search      0.08   0.07   0.05
         delete      0.18   0.23   0.20
50,000   insert      1.43   1.25   1.12
         search      0.40   0.30   0.30
         ins/del     2.80   2.17   2.37
         search      0.40   0.30   0.30
         delete      1.07   0.90   0.97
100,000  insert      3.28   2.58   2.77
         search      0.90   0.62   0.63
         ins/del     6.15   4.70   5.13
         search      0.87   0.62   0.63
         delete      2.35   1.93   2.10
200,000  insert      7.37   4.55   4.92
         search      1.85   1.32   1.32
         ins/del    13.35  10.03  10.93
         search      1.87   1.33   1.33
         delete      5.08   4.17   4.43

Time unit: sec
Time is sum of time for parts (a) through (e) of the experiment
Figure 3.14. Run time on real inputs (version 1 code)
The average number of comparisons for each of the five parts of the experiment
is given in Table 3.11 for the version 1 implementation. On the comparison measure,
AVL, RBB, WB, and BBSTs are the front runners and are quite competitive with
one another. On parts (a) (insert n elements) and (c) (insert n and delete n elements),
AVL trees performed best while on the two search tests ((b) and (d)) and the deletion
test (e), BBSTs performed best.
Table 3.12 gives the number of comparisons performed when ordered data is used
instead of random permutations of distinct elements (i.e., the elements in part (a)
are 1, 2, ..., n and are inserted in this order, and those in part (c) are
n + 1, ..., 2n, also inserted in this order). This experiment attempts to model realistic
situations in which the inserted elements are in "nearly sorted order". BSTs were not
included in this test as they perform very poorly with ordered data, taking O(n^2) time
to insert n elements. The computer time needed to perform this test on BSTs was determined
to be excessive. This test exhibited greater variance in performance. Among the
deterministic structures, BBSTs outperformed the others in parts (a) through (d) while
AVL trees were ahead in part (e). For part (a), BBSTs performed approximately
45% fewer comparisons than did AVL trees and approximately 12% fewer than WB
trees. The randomized structure TRP was the best of the eight structures reported
in Table 3.12 for part (a). It performed approximately 10% fewer comparisons than
did BBSTs. However, the BBST remained best overall on parts (b), (c), and
(d).
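The reason unbalanced BSTs were dropped from the ordered-data test can be seen with a small experiment. The sketch below is an assumption-laden illustration in Python, not the dissertation's test harness: it inserts distinct keys into a plain unbalanced BST and charges one key comparison per node visited on the way down (the counting rule is an assumption made here). With keys 1, 2, ..., n inserted in order, the tree degenerates into a right-going path and the count is exactly n(n-1)/2.

```python
import random


def bst_insert_comparisons(keys):
    """Insert distinct keys into an unbalanced BST, iteratively.

    Returns the total number of key comparisons, charging one comparison
    per node visited during each insertion (an assumed counting rule).
    Nodes are lists of the form [key, left, right].
    """
    root = None
    comparisons = 0
    for k in keys:
        if root is None:
            root = [k, None, None]
            continue
        node = root
        while True:
            comparisons += 1
            side = 1 if k < node[0] else 2  # 1 = left child, 2 = right child
            if node[side] is None:
                node[side] = [k, None, None]
                break
            node = node[side]
    return comparisons


n = 2000
ordered = list(range(1, n + 1))
shuffled = ordered[:]
random.Random(1).shuffle(shuffled)  # fixed seed, purely for repeatability

print("ordered :", bst_insert_comparisons(ordered))   # n(n-1)/2 = 1999000
print("shuffled:", bst_insert_comparisons(shuffled))  # far smaller, O(n log n) on average
```

At n = 200,000 the ordered count would be roughly 2 x 10^10 comparisons for part (a) alone, which is several thousand times the largest part (a) entry in Table 3.12.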
The heights of the trees (number of levels in the case of DSL and SKIP) for
the experiments with random and ordered data are given in Tables 3.13 and 3.14
respectively. The first number in each table entry is the tree height after part (a) of
the experiment and the second, the height after part (c). In all cases, the number of
levels using skip lists is fewest. However, among the tree structures, AVL and BBST
trees have least height on random data and AVL has least with ordered data.
Tables 3.15 and 3.16, respectively, give the number of rotations performed by
each of the deterministic tree schemes for experiment parts (a), (c), and (e). Note
that none of the schemes performs rotations during a search.
On ordered data, BBSTs perform about 25% more rotations than do the remaining
structures. These remaining structures perform about the same number of
rotations. On random data, AVL trees, bottom-up red-black trees and WB trees perform
a comparable number of rotations. Top-down red-black trees and BBSTs
Table 3.11. The number of key comparisons on random inputs (version 1 code)

n        operation       BST       AVL       RBT       RBB        WB      BBST       DSL       TRP      SKIP
10,000   insert       264175    211401    262838    211886    211916    212111    276247    296866    224757
         search       254175    193253    194606    194291    194153    191578    258089    258662    255072
         ins/del      516853    411220    515184    414990    414635    416862    923524    601137    519430
         search       252200    193141    197399    195525    194442    191676    256578    254119    256124
         delete       215555    167312    200218    167455    167531    166487    526242    242743    231745
50,000   insert      1560958   1234911   1550701   1236968   1238628   1236114   1640660   1717037   1357076
         search      1510958   1147273   1150466   1146754   1149970   1134969   1512093   1503452   1537547
         ins/del     3061868   2417733   3058045   2424944   2431281   2437639   5351715   3456045   2996512
         search      1500504   1145808   1173662   1152764   1151578   1134062   1499657   1497081   1501731
         delete      1316917   1013535   1242426   1013144   1015988   1007688   3077266   1451835   1373858
100,000  insert      3329780   2623894   3305332   2626314   2631411   2623792   3513401   3632046   2919371
         search      3229780   2445659   2451137   2446466   2453855   2423613   3244497   3247143   3188621
         ins/del     6537563   5137280   6564352   5154118   5170695   5179653  11545200   7476441   6399463
         search      3208453   2443038   2502098   2457531   2456748   2419990   3229747   3310823   3225343
         delete      2839934   2181327   2692672   2177946   2185213   2168110   6561272   3177135   2981173
200,000  insert      7076132   5553640   7016676   5558174   5571133   5553256   7483199   7682439   6178596
         search      6876132   5191730   5209189   5199786   5215568   5147698   6887196   6797942   6697223
         ins/del    13907058  10862426  13940982  10921880  10956496  10968053  24207106  15543559  13377747
         search      6830718   5186737   5332771   5223154   5220965   5144148   6814733   6916150   6680642
         delete      6095324   4664876   5800203   4664344   4680768   4641389  13811271   6700557   6149268
Table 3.12. The number of key comparisons on ordered inputs (version 1 code)

n        operation       AVL       RBT       RBB        WB      BBST       DSL       TRP      SKIP
10,000   insert       277234    376228    241383    171017    150554    435199    135989    247129
         search       191917    188246    190106    188722    185530    262423    271087    256706
         ins/del      421032    718040    508810    425843    314998    983676    390899    354566
         search       195133    189494    190090    191681    184155    249694    269031    250538
         delete       104038    276136    218216    214930    135131    468244    193080     84392
50,000   insert      1618930   2233658   1436225    995720    872967   2585557    825390   1422120
         search      1120497   1117001   1120495   1117174   1101481   1509152   1540082   1467217
         ins/del     2418422   4311748   3055100   2475487   1805439   6019215   2194668   1973416
         search      1124001   1168633   1126126   1116390   1098065   1481819   1568903   1449810
         delete       607478   1719212   1323918   1276262    791815   2785792   1181612    486498
100,000  insert      3437858   4767564   3072389   2112201   1850548   5521408   1724473   2925618
         search      2390963   2383979   2390961   2384327   2354757   3218246   3564282   2970715
         ins/del     5111850   9223606   6510188   5254541   3821594  12788447   4438266   4406427
         search      2397971   2487243   2402224   2382759   2346128   3163554   3281308   3277089
         delete      1289954   3737982   2847792   2735270   1684584   5971196   2403622    961283
200,000  insert      7275714  10135418   6544713   4465935   3903083  11743159   3428355   6403207
         search      5081893   5067933   5081891   5068632   4946753   6836428   7174727   6448304
         ins/del    10773706  19647336  13820364  11116226   8048058  27076911   9054078   9062233
         search      5095909   5274461   5104418   5065496   5001967   6727017   7006341   6458321
         delete      2729906   8075474   6095538   5836096   3577223  12741948   5094044   1995215
Table 3.13. Height of the trees on random inputs (version 1 code)

n             BST     AVL     RBT     RBB      WB    BBST     DSL     TRP    SKIP
10,000      31,31   16,16   17,18   16,17   17,17   16,16   12,11   32,31     8,8
50,000      38,38   19,19   20,21   19,20   20,20   19,19   13,12   38,37     9,9
100,000     41,41   20,20   21,22   20,21   21,22   20,20   14,13   41,40     9,9
200,000     44,43   21,21   22,24   21,22   23,23   21,21   15,14   43,44     9,9
Table 3.14. Height of the trees on ordered inputs (version 1 code)

n             AVL     RBT     RBB      WB    BBST     DSL     TRP    SKIP
10,000      14,14   20,20   24,24   16,15   17,17   14,13   33,34     8,8
50,000      16,16   23,23   29,28   20,20   20,20   16,16   41,41     9,9
100,000     17,17   25,25   31,30   21,21   21,21   17,17   46,41     9,9
200,000     18,18   27,27   33,32   22,22   23,22   18,18   47,46     9,9
perform a significantly larger number of rotations. In fact, BBSTs perform about
twice as many rotations as AVL trees.
The average run times for the random data tests are given in Table 3.17 and
in Table 3.18 for the ordered data test. Both of these use integer keys. The times
using real keys are given in Tables 3.19 and 3.20. The sum of the run time for parts
(b) and (d) of the experiment is graphed in Figure 3.15 for random data and in
Figure 3.16 for ordered data. The graph of Figure 3.17 shows only one line MIX
for AVL, RBT, RBB, WB, and BBST while that of Figure 3.18 shows MIX for
AVL, RBT, RBB, and WB as the times for these are very close. With integer
keys and random data, unbalanced binary search trees (BSTs) outperformed each
of the remaining structures. The next best performance was exhibited by bottom-up
red-black trees. They did marginally better than AVL trees. The remaining
Table 3.15. The number of rotations on random inputs (version 1 code)

                        AVL             RBT             RBB             WB                      BBST
n        operation       S       D       S       D       S       D       S       D       S       D      DS      DD
10,000   insert       2328    2322    1964    1955    1946    1933    2274    2065    5025    3938     151      93
         ins/del      4343    3224   14773    8213    4053    2591    4256    2978   10104    5849     232     103
         delete       1645    1120    9558    2678    1845    1166    1595    1022    5201    2018      51      28
50,000   insert      11664   11614    9822    9815    9710    9689   11355   10352   25059   19732     754     455
         ins/del     21585   16214   81895   45180   20255   12979   21266   14975   50979   29198    1161     531
         delete       8231    5630   54806   13431    9196    5844    7963    5194   26068   10033     248     131
100,000  insert      23316   23254   19593   19677   19340   19414   22723   20730   50047   39461    1527     920
         ins/del     43243   32361  196769  103835   40618   25919   42567   29898  101836   58491    2275    1046
         delete      16466   11264  119825   26953   18530   11708   16024   10420   51943   20147     496     260
200,000  insert      46631   46518   39290   39291   38797   38793   45458   41480  100205   79013    3054    1840
         ins/del     86218   64712  394187  209941   80892   52030   84927   59911  203568  116940    4593    2059
         delete      33047   22477  247905   54046   37083   23379   31984   20800  103826   40157     990     523
Table 3.16. The number of rotations on ordered inputs (version 1 code)

                        AVL             RBT             RBB             WB                      BBST
n        operation       S       D       S       D       S       D       S       D       S       D      DS      DD
10,000   insert       9986       0    9980       0    9976       0    9984       0    9985    2387       0       0
         ins/del     14996       0   14999       0   14995       0   14997       0   16644    5797      25     154
         delete       4990       0    4983       1    4989       0    4989       0    6647    3392      26     154
50,000   insert      49984       0   49977       0   49971       0   49980       0   49983   11956       0       0
         ins/del     74994       0   75000       0   74994       0   74996       0   83247   28982     137     770
         delete      24988       0   24978       1   24986       0   24987       0   33242   17018     136     766
100,000  insert      99983       0   99975       0   99969       0   99979       0   99983   23917       0       0
         ins/del    149994       0  150000       0  149994       0  149996       0  166504   57969     280    1540
         delete      49987       0   49977       1   49985       0   49986       0   66505   34040     278    1536
200,000  insert     199982       0  199973       0  199967       0  199978       0  199982   47839       0       0
         ins/del    299994       0  300000       0  299994       0  299996       0  333012  115938     559    3078
         delete      99986       0   99976       1   99984       0   99985       0  133016   68086     557    3076
Time is sum of time for parts (b) and (d) of the experiment
Figure 3.15. Run time on random real inputs (version 1 code)
structures performed noticeably worse. For ordered integer keys, BSTs take
more time than we were willing to expend. Of the remaining structures, treaps
generally performed best on parts (a), (c), and (e) while BBSTs did best on parts
(b) and (d).
With real keys and random data, BSTs did not outperform the remaining structures.
Now, the five balanced binary tree structures became quite competitive with
respect to the search operations (i.e., parts (b) and (d)). RBB generally outperformed
the other structures on parts (a), (c), and (e). Using ordered real keys, the
treap was the clear winner on parts (a), (c), and (e) while BBSTs handily outperformed
the remaining structures on parts (b) and (d).
Some of the experimental results using version 2 of the code are shown in
Tables 3.21 through 3.24. On the comparison measure, with random data (Table 3.21), skip
Table 3.17. Run time on random inputs using integer keys (version 1 code)

n        operation    BST    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.08   0.12   0.15   0.12   0.20   0.34   0.19   0.18   0.24
         search      0.05   0.05   0.05   0.06   0.05   0.07   0.09   0.09   0.18
         ins/del     0.14   0.21   0.36   0.22   0.39   0.70   0.49   0.33   0.45
         search      0.05   0.05   0.05   0.05   0.05   0.06   0.09   0.09   0.18
         delete      0.05   0.08   0.12   0.09   0.16   0.26   0.20   0.08   0.16
50,000   insert      0.65   0.79   0.98   0.73   1.18   1.75   1.10   1.01   1.36
         search      0.40   0.36   0.36   0.36   0.35   0.37   0.58   0.56   1.25
         ins/del     1.04   1.48   2.50   1.26   2.22   3.84   2.77   1.86   2.73
         search      0.40   0.41   0.44   0.36   0.36   0.39   0.57   0.56   1.16
         delete      0.39   0.54   1.01   0.51   0.94   1.62   1.16   0.51   1.10
100,000  insert      1.34   1.57   2.10   1.54   2.54   3.80   2.46   2.23   2.84
         search      0.88   0.80   0.80   0.83   0.78   0.84   1.36   1.30   2.63
         ins/del     2.36   3.21   5.52   2.74   4.86   8.41   6.35   4.10   6.13
         search      0.93   0.94   1.00   0.84   0.83   0.88   1.33   1.29   2.61
         delete      0.88   1.24   2.26   1.14   2.11   3.58   2.64   1.23   2.41
200,000  insert      2.79   3.37   4.41   3.18   5.21   8.37   5.56   4.70   6.25
         search      2.00   1.80   1.81   1.81   1.78   1.89   3.03   2.91   5.85
         ins/del     5.24   6.99  12.51   5.99  10.54  18.57  14.29   8.95  13.29
         search      2.08   2.12   2.25   1.91   1.87   1.98   3.04   2.93   5.81
         delete      2.01   2.69   5.06   2.51   4.55   8.02   5.84   2.76   5.35

Time unit: sec
Table 3.18. Run time on ordered inputs using integer keys (version 1 code)

n        operation    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.12   0.17   0.12   0.18   0.27   0.23   0.08   0.20
         search      0.05   0.03   0.03   0.07   0.05   0.07   0.05   0.12
         ins/del     0.18   0.32   0.20   0.35   0.57   0.42   0.17   0.20
         search      0.05   0.05   0.05   0.05   0.03   0.07   0.05   0.13
         delete      0.05   0.10   0.07   0.13   0.23   0.15   0.05   0.07
50,000   insert      0.75   1.02   0.92   1.25   1.10   0.98   0.47   0.92
         search      0.32   0.27   0.27   0.28   0.20   0.33   0.32   0.62
         ins/del     1.28   2.17   1.25   2.20   2.40   2.03   0.80   1.07
         search      0.28   0.28   0.27   0.28   0.20   0.30   0.37   0.62
         delete      0.30   0.75   0.37   0.85   1.05   0.65   0.30   0.27
100,000  insert      1.50   2.52   1.70   2.58   2.53   2.58   0.90   1.72
         search      0.70   0.60   0.57   0.70   0.42   0.70   0.63   1.23
         ins/del     2.60   4.68   2.53   4.78   5.13   4.42   1.52   2.43
         search      0.63   0.60   0.55   0.62   0.42   0.70   0.58   1.35
         delete      0.62   1.65   0.78   1.87   2.15   1.42   0.45   0.55
200,000  insert      3.12   4.82   3.38   5.67   5.25   4.72   1.80   3.52
         search      1.38   1.30   1.22   1.33   0.90   1.60   1.25   2.70
         ins/del     5.15  10.40   5.35  10.40  10.88   9.48   3.10   5.13
         search      1.33   1.33   1.18   1.32   0.90   1.50   1.28   2.72
         delete      1.35   3.63   1.68   4.12   4.58   2.98   0.93   1.12

Time unit: sec
Table 3.19. Run time on random real inputs (version 1 code)

n        operation    BST    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.14   0.15   0.21   0.17   0.23   0.36   0.22   0.23   0.30
         search      0.09   0.07   0.09   0.10   0.08   0.10   0.13   0.13   0.21
         ins/del     0.24   0.27   0.51   0.32   0.38   0.79   0.62   0.41   0.53
         search      0.09   0.08   0.09   0.10   0.08   0.10   0.12   0.12   0.21
         delete      0.09   0.09   0.17   0.14   0.14   0.30   0.28   0.11   0.19
50,000   insert      0.94   0.97   1.22   0.86   1.29   1.93   1.48   1.19   1.67
         search      0.64   0.52   0.50   0.51   0.51   0.52   0.87   0.71   1.44
         ins/del     1.68   1.77   2.74   1.53   2.29   4.22   3.93   2.17   3.15
         search      0.66   0.55   0.56   0.54   0.56   0.55   0.86   0.71   1.33
         delete      0.63   0.67   1.10   0.72   0.92   1.76   1.80   0.69   1.22
100,000  insert      2.06   1.85   2.34   1.90   2.66   4.36   3.05   2.67   3.61
         search      1.43   1.13   1.09   1.13   1.14   1.16   1.84   1.66   3.00
         ins/del     3.63   3.93   6.18   3.33   4.96   9.30   8.45   4.84   7.10
         search      1.45   1.26   1.27   1.17   1.26   1.22   1.83   1.65   3.01
         delete      1.39   1.50   2.51   1.55   2.03   3.95   3.91   1.61   2.75
200,000  insert      4.34   3.95   5.20   3.88   5.56   9.33   6.77   5.81   7.90
         search      3.19   2.49   2.42   2.50   2.45   2.57   4.14   3.67   6.62
         ins/del     8.01   8.25  13.78   7.29  10.65  20.46  18.88  10.48  15.83
         search      3.21   2.83   2.86   2.62   2.74   2.66   4.08   3.73   6.74
         delete      3.11   3.27   5.55   3.41   4.43   8.80   8.56   3.54   6.04

Time unit: sec
Table 3.20. Run time on ordered real inputs (version 1 code)

n        operation    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.13   0.22   0.15   0.25   0.20   0.25   0.12   0.30
         search      0.07   0.08   0.07   0.07   0.07   0.10   0.07   0.15
         ins/del     0.23   0.42   0.27   0.40   0.43   0.47   0.18   0.28
         search      0.07   0.05   0.08   0.08   0.05   0.08   0.08   0.12
         delete      0.07   0.17   0.08   0.15   0.20   0.20   0.05   0.07
50,000   insert      1.15   1.58   1.12   1.85   1.12   1.30   0.67   1.35
         search      0.42   0.42   0.43   0.40   0.30   0.53   0.38   0.82
         ins/del     1.28   2.75   1.57   2.57   2.37   3.02   0.92   1.40
         search      0.40   0.42   0.42   0.48   0.30   0.53   0.40   0.75
         delete      0.38   0.95   0.55   0.93   0.97   1.15   0.33   0.35
100,000  insert      1.77   3.23   2.12   3.35   2.77   3.13   1.17   2.42
         search      0.90   0.87   0.90   0.88   0.63   1.12   0.92   1.70
         ins/del     3.00   6.00   3.42   5.38   5.13   6.32   1.92   3.22
         search      0.97   0.92   0.88   0.98   0.63   1.12   0.82   1.70
         delete      0.87   2.08   1.17   2.05   2.10   2.40   0.70   0.67
200,000  insert      3.92   6.42   4.27   7.25   4.92   6.03   2.58   4.93
         search      1.92   1.87   1.92   1.88   1.32   2.40   1.85   3.87
         ins/del     5.78  13.80   7.33  11.88  10.93  13.72   3.75   6.67
         search      1.90   1.93   1.92   2.13   1.33   2.38   1.75   3.97
         delete      1.67   4.55   2.48   4.45   4.43   5.10   1.40   1.35

Time unit: sec
Time is sum of time for parts (b) and (d) of the experiment
Figure 3.16. Run time on ordered real inputs (version 1 code)
lists performed best on part (a). Of the deterministic methods, BBSTs slightly
outperformed the others on part (a). On parts (b) through (e), AVL, RBT, RBB, WB, and
BBSTs were quite competitive and outperformed BSTs and the randomized schemes.
BBSTs performed best on parts (b) and (d), RBTs did best on part (e), and RBB
and AVL did best on part (c). In comparing the results of Table 3.21 to those of
Table 3.11 (using version 1 code), we see that the change to version 2 generally
increased the comparison cost of the deterministic tree structures by about 25%. For
the DSL, the change in code had mixed results. Notice that for RBTs and DSLs,
the comparison counts for parts (a), (c), and (e) are the same as for the version 1
code. This is because for inserts and deletes, it is necessary to do the equal check
first when using these structures. For SKIPs the count is the same for all five parts
as the version 1 and 2 codes are the same.
With ordered data (Table 3.22), treaps required the fewest comparisons for part
(a). Skip lists did best on parts (c) and (e), and AVL trees generally outperformed
the other structures on parts (b) and (d). Once again, the comparison counts were
generally higher using the version 2 code than using the version 1 code.
Run time data using real keys is given in Tables 3.23 and 3.24. The sum of the
run time for parts (b) and (d) of the experiment is graphed in Figure 3.17 for random
data and in Figure 3.18 for ordered data. The graph of Figure 3.17 shows only one
line MIX for AVL, RBT, RBB, WB, and BBST while that of Figure 3.18 shows
MIX for AVL, RBT, RBB, and WB as the times for these are very close. With
random data, RBB generally performed best on part (a); on parts (b) and (d), the
front runner varied among AVL, RBT, and WB; and on parts (c) and (e), RBBs
generally did best. On ordered data, TRPs did best on parts (a), (c), and (e) while
BBSTs did best on parts (b) and (d).
3.8 Conclusion
We have developed a new weight balanced data structure called β-BBST. This
was developed for the representation of a dictionary. In developing the insert/delete
algorithms, we sought to minimize the search cost of the resulting tree. Our
experimental results show that BBSTs generally have the best search cost of the structures
considered. Furthermore, this translates into reduced search time when the key
comparison cost is relatively high (e.g., for real keys). The insert and delete algorithms
for β-BBSTs are not as efficient as those for other dictionary structures (such as
AVL trees). As a result, we recommend β-BBSTs for environments where searches
Table 3.21. The number of key comparisons on random inputs (version 2 code)

n        operation       BST       AVL       RBT       RBB        WB      BBST       DSL       TRP      SKIP
10,000   insert       332753    262198    262838    262726    263177    260896    276247    375698    224757
         search       322753    241557    242258    242262    242824    240126    348403    329411    255072
         ins/del      650901    514371    515184    513920    515732    518124    923524    755629    519430
         search       318749    241536    247130    243191    242867    240126    335613    320612    256124
         delete       271004    206558    200218    206721    207622    207210    526242    300619    231745
50,000   insert      1983939   1546988   1550701   1549795   1554520   1539666   1640660   2184066   1357076
         search      1933939   1443879   1447870   1446679   1452920   1435927   2043618   1921255   1537547
         ins/del     3892221   3043090   3058045   3040654   3055092   3061443   5351715   4393520   2996512
         search      1913068   1443837   1476158   1451163   1452625   1435726   1969926   1909919   1501731
         delete      1674128   1267637   1242426   1268881   1275612   1270935   3077266   1815736   1373858
100,000  insert      4245062   3297162   3305332   3302792   3314410   3281959   3513401   4637264   2919371
         search      4145062   3090057   3098143   3096011   3111095   3074661   4387427   4161175   3188621
         ins/del     8336846   6490752   6564352   6486464   6520729   6528606  11545200   9484761   6399463
         search      4102672   3089826   3176862   3105465   3110184   3074305   4270168   4224698   3225343
         delete      3623179   2738267   2692672   2740846   2756006   2744369   6561272   4008111   2981173
200,000  insert      9045367   6999791   7016676   7012317   7040203   6969465   7483199   9834444   6178596
         search      8845367   6584279   6603044   6599643   6633218   6554714   9373163   8752856   6697223
         ins/del    17782478  13790643  13940982  13789492  13862467  13867876  24207106  19825904  13377747
         search      8757433   6585758   6747566   6618833   6630334   6554354   8995685   8889053   6680642
         delete      7800524   5882302   5800203   5889983   5923552   5893982  13811271   8456931   6149268
Table 3.22. The number of key comparisons on ordered inputs (version 2 code)

n        operation       AVL       RBT       RBB        WB      BBST       DSL       TRP      SKIP
10,000   insert       267234    376228    442766    302034    261108    435199    216958    247129
         search       237262    239442    247706    237298    243110    372444    332060    256706
         ins/del      493028    718040    727620    562808    558770    983676    482499    354566
         search       240910    238834    246330    238320    242736    349730    344982    250538
         delete       178076    276136    208216    204930    239028    468244    183080     84392
50,000   insert      1568930   2233658   2672450   1791440   1545934   2585557   1375770   1422120
         search      1418962   1421560   1459588   1420858   1455936   2159176   1990474   1467217
         ins/del     2881762   4311748   4360200   3301668   3251450   6019215   2742877   1973416
         search      1419154   1444824   1444258   1424442   1452494   2131862   1956194   1449810
         delete      1064956   1719212   1273918   1226262   1427504   2785792   1131612    486498
100,000  insert      3337858   4767564   5744778   3824402   3301096   5521408   2898893   2925618
         search      3037892   3043084   3119128   3041676   3121098   4618272   4538718   2970715
         ins/del     6113530   9223606   9320376   7027676   6930932  12788447   5492066   4406427
         search      3038276   3089612   3088470   3048844   3114012   4563600   4158994   3277089
         delete      2279908   3737982   2747792   2635270   3056908   5971196   2303622    961283
200,000  insert      7075714  10135418  12289426   8131870   7006166  11743159   5756575   6403207
         search      6475750   6486128   6638204   6483310   6646168   9836456   9102954   6448304
         ins/del    12927066  19647336  19840728  14904040  14671602  27076911  11290926   9062233
         search      6476518   6579184   6576890   6497646   6634260   9727066   8918638   6458321
         delete      4859812   8075474   5895538   5636096   6529928  12741948   4894044   1995215
Table 3.23. Run time on random real inputs (version 2 code)

n        operation    BST    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.15   0.14   0.20   0.18   0.25   0.36   0.23   0.25   0.31
         search      0.10   0.08   0.10   0.11   0.09   0.11   0.13   0.16   0.21
         ins/del     0.27   0.27   0.52   0.34   0.47   0.80   0.64   0.50   0.54
         search      0.10   0.08   0.10   0.11   0.09   0.11   0.13   0.14   0.21
         delete      0.10   0.10   0.20   0.14   0.18   0.32   0.29   0.14   0.19
50,000   insert      1.02   0.98   1.15   0.89   1.46   1.88   1.44   1.34   1.65
         search      0.69   0.55   0.57   0.55   0.57   0.55   0.89   0.83   1.42
         ins/del     1.79   1.80   2.99   1.59   2.93   3.97   3.82   2.44   3.16
         search      0.71   0.60   0.63   0.55   0.57   0.56   0.87   0.79   1.32
         delete      0.67   0.67   1.22   0.66   1.19   1.63   1.80   0.75   1.21
100,000  insert      2.15   2.00   2.58   1.90   3.18   4.01   3.11   2.95   3.69
         search      1.52   1.21   1.24   1.18   1.23   1.23   1.97   1.84   3.04
         ins/del     3.88   3.92   6.74   3.46   6.28   8.73   8.50   5.39   7.18
         search      1.55   1.32   1.45   1.25   1.29   1.27   1.95   1.82   2.98
         delete      1.51   1.49   2.75   1.45   2.57   3.64   3.93   1.73   2.77
200,000  insert      5.04   4.45   5.79   4.28   6.92   9.20   7.05   6.81   8.01
         search      3.43   2.63   2.70   2.64   2.73   2.69   4.43   4.00   6.60
         ins/del     8.92   8.87  15.36   7.88  13.85  19.53  19.55  12.17  16.11
         search      3.43   2.98   3.13   2.73   2.83   2.77   4.37   4.02   6.70
         delete      3.33   3.32   6.08   3.20   5.65   8.24   8.91   3.88   6.04

Time unit: sec
Time is sum of time for parts (b) and (d) of the experiment
Figure 3.17. Run time on random real inputs (version 2 code)
Time is sum of time for parts (b) and (d) of the experiment
Figure 3.18. Run time on ordered real inputs (version 2 code)
Table 3.24. Run time on ordered real inputs (version 2 code)

n        operation    AVL    RBT    RBB     WB   BBST    DSL    TRP   SKIP
10,000   insert      0.17   0.23   0.28   0.27   0.30   0.23   0.15   0.30
         search      0.08   0.08   0.12   0.08   0.08   0.12   0.12   0.13
         ins/del     0.23   0.43   0.40   0.47   0.60   0.48   0.17   0.27
         search      0.08   0.08   0.07   0.08   0.08   0.08   0.10   0.13
         delete      0.08   0.15   0.12   0.17   0.20   0.20   0.08   0.05
50,000   insert      0.83   1.45   1.43   1.57   1.37   1.35   0.82   1.18
         search      0.45   0.48   0.48   0.47   0.38   0.60   0.50   0.83
         ins/del     1.35   2.65   1.95   2.75   2.47   3.05   1.05   1.42
         search      0.45   0.47   0.45   0.47   0.37   0.63   0.58   0.77
         delete      0.45   1.05   0.50   1.00   1.03   1.17   0.43   0.33
100,000  insert      1.78   2.75   2.73   3.43   2.63   3.23   1.33   2.18
         search      0.97   0.98   1.00   1.03   0.77   1.30   1.15   1.55
         ins/del     2.85   6.22   3.98   6.00   5.33   6.37   2.02   3.33
         search      0.97   1.10   0.98   1.02   0.77   1.32   1.03   1.70
         delete      0.97   2.18   1.05   2.15   2.22   2.43   0.63   0.67
200,000  insert      3.78   6.08   5.43   7.18   5.37   6.07   2.87   5.23
         search      2.08   2.13   2.13   2.17   1.63   3.10   2.27   3.47
         ins/del     6.13  13.93   8.48  13.42  11.33  13.60   4.10   7.02
         search      2.12   2.15   2.13   2.17   1.63   2.80   2.18   4.27
         delete      2.03   4.75   2.27   4.77   4.72   5.18   1.35   1.35

Time unit: sec
are done with much greater frequency than inserts and/or deletes. Based on our
experiments, we conclude that AVL trees remain the best dictionary structure for
general applications.
We have also proposed two simplified versions of the BBST called SBBST and
BBSTD. The SBBST seeks only to provide logarithmic run time per operation and,
unlike the general BBST, does not reduce search cost at every opportunity. The
SBBST provides slightly better balance than provided by WB(α) trees. The BBSTD
does not attempt to maintain β-balance. However, it performs rotations to reduce
search cost whenever possible. Both versions are very competitive with BBSTs. The
SBBST exhibited much better run time performance than BBSTs on random data
and the BBSTD slightly outperformed the BBST on ordered data. However, BBSTs
generated trees with the lowest search cost (though not by much).
CHAPTER 4
WEIGHT BIASED LEFTIST TREES AND MODIFIED SKIP LISTS
4.1 Introduction
Several data structures (e.g., heaps, leftist trees [9], binomial heaps [10]) have
been proposed for the representation of a (single ended) priority queue. Heaps permit
one to delete the min element and insert an arbitrary element into an n element
priority queue in O(log n) time. Leftist trees support both these operations and the
merging of pairs of priority queues in logarithmic time. Using binomial heaps, inserts
and combines take O(1) time and a delete-min takes O(log n) amortized time. In
this chapter, we begin, in Section 4.2, by developing the weight biased leftist tree.
This is similar to a leftist tree. However, biasing of left and right subtrees is done
by number of nodes rather than by length of paths. Experimental results presented
in Section 4.5 show that weight biased leftist trees provide better performance than
provided by leftist trees. The experimental comparisons of Section 4.5 also include a
comparison with heaps and binomial heaps as well as with unbalanced binary search
trees and the probabilistic structures treap [1] and skip lists [26].
In Section 4.3, we propose a fixed node size representation for skip lists. The
new structure is called modified skip lists and is experimentally compared with the
variable node size structure skip lists. Our experiments indicate that modified skip
lists are faster than skip lists when used to represent dictionaries.
Modified skip lists are augmented by a thread in Section 4.4 to obtain a structure
suitable for use as a priority queue. For completeness, we include, in Section 4.5, a
comparison of data structures for double ended priority queues.
4.2 Weight Biased Leftist Trees
Let T be an extended binary tree. For any internal node x of T, let LeftChild(x)
and RightChild(x), respectively, denote the left and right children of x. The weight,
w(x), of any node x is the number of internal nodes in the subtree with root x.
The length, shortest(x), of a shortest path from x to an external node satisfies the
recurrence
$$\mathrm{shortest}(x) = \begin{cases} 0 & \text{if } x \text{ is an external node} \\ 1 + \min\{\mathrm{shortest}(\mathrm{LeftChild}(x)),\ \mathrm{shortest}(\mathrm{RightChild}(x))\} & \text{otherwise.} \end{cases}$$
Definition [9] A leftist tree (LT) is a binary tree such that if it is not empty, then
shortest(LeftChild(x)) ≥ shortest(RightChild(x))
for every internal node x.
A weight biased leftist tree (WBLT) is defined by using the weight measure in
place of the measure shortest.
Definition A weight biased leftist tree (WBLT) is a binary tree such that if it is
not empty, then
weight(LeftChild(x)) ≥ weight(RightChild(x))
for every internal node x.
It is known [9] that the length, rightmost(x), of the rightmost root to external
node path of any subtree, x, of a leftist tree satisfies
$$\mathrm{rightmost}(x) \le \log_2(w(x) + 1).$$
The same is true for weight biased leftist trees.
Theorem 13 Let x be any internal node of a weight biased leftist tree. Then
rightmost(x) ≤ log2(w(x) + 1).
Proof The proof is by induction on w(x). When w(x) = 1, rightmost(x) = 1
and log2(w(x) + 1) = log2 2 = 1. For the induction hypothesis, assume that
rightmost(x) ≤ log2(w(x) + 1) whenever w(x) < n. When w(x) = n,
w(RightChild(x)) ≤ (n − 1)/2 and
rightmost(x) = 1 + rightmost(RightChild(x)) ≤ 1 + log2((n − 1)/2 + 1)
= 1 + log2(n + 1) − 1 = log2(n + 1). □
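The bound of Theorem 13 can be checked numerically. The following Python sketch is illustrative only: the names Node, weight, rightmost, is_wblt, and random_wblt are hypothetical helpers, not part of the original implementation. It builds arbitrary tree shapes that satisfy the WBLT weight condition and verifies that the rightmost path never exceeds log2(w(x) + 1).

```python
import math
import random

class Node:
    def __init__(self):
        self.left = None    # None plays the role of an external node
        self.right = None

def weight(x):
    # number of internal nodes in the subtree rooted at x
    return 0 if x is None else 1 + weight(x.left) + weight(x.right)

def rightmost(x):
    # length of the rightmost root-to-external-node path
    length = 0
    while x is not None:
        length += 1
        x = x.right
    return length

def is_wblt(x):
    # weight(LeftChild) >= weight(RightChild) at every internal node
    if x is None:
        return True
    return (weight(x.left) >= weight(x.right)
            and is_wblt(x.left) and is_wblt(x.right))

def random_wblt(n, rng):
    # assumption: any split with right weight <= (n - 1) / 2 is a legal shape
    if n == 0:
        return None
    x = Node()
    r = rng.randint(0, (n - 1) // 2)
    x.right = random_wblt(r, rng)
    x.left = random_wblt(n - 1 - r, rng)
    return x

rng = random.Random(7)
for n in range(1, 150):
    t = random_wblt(n, rng)
    assert is_wblt(t)
    assert rightmost(t) <= math.log2(weight(t) + 1) + 1e-9
```

The check mirrors the induction in the proof: the right subtree holds at most (n − 1)/2 of the nodes, so each step down the rightmost path at least halves w(x) + 1.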
Definition A min (max)WBLT is a WBLT that is also a min (max) tree.
Each node of a minWBLT has the fields: lsize (number of internal nodes
in left subtree), rsize, left (pointer to left subtree), right, and data. While the
number of size fields in a node may be reduced to one, two fields result in a faster
implementation. We assume a head node head with lsize = ∞ and lchild = head. In
addition, we assume a bottom node bottom with data.key = ∞. All pointers that would
normally be nil are replaced by a pointer to bottom. Figure 4.1(a) shows the
representation of an empty minWBLT and Figure 4.1(b) shows an example nonempty
minWBLT. Notice that all elements are in the right subtree of the head node.
Figure 4.1. Example minWBLTs: (a) Empty minWBLT (b) Nonempty minWBLT
Min (max)WBLTs can be used as priority queues in the same way as min (max)
LTs. For instance, a minWBLT supports the standard priority queue operations
of insert and deletemin in logarithmic time. In addition, the combine operation
(i.e., join two priority queues together) can also be done in logarithmic time. The
algorithms for these operations have the same flavor as the corresponding ones for
minLTs. A high level description of the insert and deletemin algorithms for
minWBLTs is given in Figures 4.2 and 4.3, respectively. The algorithm to combine two
minWBLTs is similar to the deletemin algorithm. The time required to perform
each of the operations on a minWBLT T is O(rightmost(T)).
procedure Insert(d) ;
{insert d into a minWBLT}
begin
  create a node x with x.data = d ;
  t = head ; {head node}
  while (t.right.data.key < d.key) do
  begin
    t.rsize = t.rsize + 1 ;
    if (t.lsize < t.rsize) then
      begin swap t's children ; t = t.left ; end
    else t = t.right ;
  end ;
  x.left = t.right ; x.right = bottom ;
  x.lsize = t.rsize ; x.rsize = 0 ;
  if (t.lsize = t.rsize) then {swap children}
  begin
    t.right = t.left ;
    t.left = x ; t.lsize = x.lsize + 1 ;
  end
  else
    begin t.right = x ; t.rsize = t.rsize + 1 ; end ;
end ;
Figure 4.2. minWBLT Insert
Notice that while the insert and deletemin operations for minLTs require a
top-down pass followed by a bottom-up pass, these operations can be performed by
a single top-down pass in minWBLTs. Hence, we expect minWBLTs to outperform
minLTs.
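The single top-down pass can be illustrated with a small Python sketch of a meld (combine) operation, from which insert and deletemin follow. This is a simplified illustration under stated assumptions, not the code of Figures 4.2 and 4.3: it keeps one weight field w per node instead of lsize and rsize, uses None instead of the head and bottom nodes, and the names Node, w, meld, insert, and delete_min are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.w = 1          # number of nodes in the subtree rooted here
        self.left = None
        self.right = None

def w(x):
    return x.w if x is not None else 0

def meld(a, b):
    """Combine two min-ordered weight biased leftist trees in one downward pass."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    root = a
    while True:
        # b ends up in a's right subtree, so a's final weight is known now
        a.w += b.w
        right_becomes_heavier = w(a.left) < a.w - 1 - w(a.left)
        r = a.right
        if r is not None and b.key < r.key:
            r, b = b, r                 # keep the smaller root on the path
        child = b if r is None else r
        if right_becomes_heavier:       # swap children before descending
            a.right, a.left = a.left, child
        else:
            a.right = child
        if r is None:
            return root
        a = child

def insert(root, key):
    return meld(root, Node(key))

def delete_min(root):
    # returns (min key, new root); root must not be None
    return root.key, meld(root.left, root.right)
```

Because the weight of the merged right subtree is known before descending, the decision to swap children is made on the way down and no bottom-up fix-up pass is needed. In this sketch, inserting a key smaller than the current minimum costs a single key comparison, which matches the behavior on decreasing data reported later in the chapter.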
4.3 Modified Skip Lists
Skip lists were proposed in Pugh [26] as a probabilistic solution for the dictionary
problem (i.e., represent a set of keys and support the operations of search, insert,
and delete). The essential idea in skip lists is to maintain up to lmax ordered chains
designated as level 1 chain, level 2 chain, etc. If we currently have lcurrent
chains, then all n elements of the dictionary are in the level 1 chain and for each
l, 2 ≤ l ≤ lcurrent, approximately a fraction p of the elements on the level l − 1
chain are also on the level l chain. Ideally, if the level l − 1 chain has m elements
then the approximately m × p elements on the level l chain are about 1/p apart in
the level l − 1 chain. Figure 4.4 shows an ideal situation for the case lcurrent = 4
and p = 1/2.
procedure Deletemin ;
begin
  x = head.right ;
  if (x = bottom) then return ; {empty tree}
  head.right = x.left ; head.rsize = x.lsize ;
  a = head ;
  b = x.right ; bsize = x.rsize ;
  delete x ;
  if (b = bottom) then return ;
  r = a.right ;
  while (r ≠ bottom) do
  begin
    s = bsize + a.rsize ; t = a.rsize ;
    if (a.lsize < s) then {work on a.left}
    begin
      a.right = a.left ; a.rsize = a.lsize ; a.lsize = s ;
      if (r.data.key > b.data.key) then
        begin a.left = b ; a = b ; b = r ; bsize = t ; end
      else
        begin a.left = r ; a = r ; end
    end
    else
      do symmetric operations on a.right ;
    r = a.right ;
  end ;
  if (a.lsize < bsize) then
  begin
    a.right = a.left ; a.left = b ;
    a.rsize = a.lsize ; a.lsize = bsize ;
  end
  else
    begin a.right = b ; a.rsize = bsize ; end ;
end ;
Figure 4.3. minWBLT Deletemin
Figure 4.4. Skip Lists
While the search, insert, and delete algorithms for skip lists are simple and have
probabilistic complexity O(log n) when the level 1 chain has n elements, skip lists
suffer from the following implementational drawbacks:
1. In programming languages such as Pascal, it isn't possible to have variable size
nodes. As a result, each node has one data field and lmax pointer fields. So,
the n element nodes have a total of n × lmax pointer fields even though only
about n/(1 − p) pointers are necessary. Since lmax is generally much larger than
3 (the recommended value is log_{1/p} nMax where nMax is the largest number of
elements expected in the dictionary), skip lists require more space than WBLTs.
2. While languages such as C and C++ support variable size nodes and we can
construct variable size nodes using simulated pointers [27] in languages such as
Pascal that do not support variable size nodes, the use of variable size nodes
requires more complex storage management techniques than required by the
use of fixed size nodes. So, greater efficiency can be achieved using simulated
pointers and fixed size nodes.
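The space argument in the first drawback can be made concrete with a short calculation. The Python sketch below is illustrative only; the function name pointer_fields and the sample parameters are assumptions chosen for the example, not values taken from the experiments.

```python
def pointer_fields(n, p, n_max):
    """Compare pointer counts for fixed size skip list nodes and the
    roughly n / (1 - p) pointers that are actually needed."""
    # smallest lmax with (1/p)^lmax >= n_max, i.e. lmax = ceil(log_{1/p} n_max)
    lmax = 1
    capacity = 1 / p
    while capacity < n_max:
        capacity /= p
        lmax += 1
    allocated = n * lmax          # every node carries lmax pointer fields
    needed = n / (1 - p)          # expected number of pointers in use
    return lmax, allocated, needed

lmax, allocated, needed = pointer_fields(n=1000, p=0.25, n_max=65536)
print(lmax, allocated, round(needed))   # 8 8000 1333
```

For these sample values the fixed size representation allocates six times as many pointer fields as are expected to be in use, which is the overhead that a node with a constant number of pointer fields avoids.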
With these two observations in mind, we propose a modified skip list (MSL)
structure in which each node has one data field and three pointer fields: left, right,
and down. Notice that this means MSLs use four fields per node while WBLTs use
five (as indicated earlier this can be reduced to four at the expense of increased run
time). The left and right fields are used to maintain each level l chain as a doubly
linked list and the down field of a level l node x points to the leftmost node in the level
l − 1 chain that has key value larger than the key in x. Figure 4.5 shows the modified
skip list that corresponds to the skip list of Figure 4.4. Notice that each element is
in exactly one doubly linked list. We can reduce the number of pointers in each node
to two by eliminating the field left and having down point one node to the left of where
it currently points (except for head nodes whose down fields still point to the head
node of the next chain). However, this results in a less time efficient implementation.
H and T, respectively, point to the head and tail of the level lcurrent chain.
A high level description of the algorithms to search, insert, and delete is given in
Figures 4.6, 4.7, and 4.8. The next theorem shows that their probabilistic complexity
is O(log n) where n is the total number of elements in the dictionary.
Figure 4.5. Modified Skip Lists
procedure Search(key) ;
begin
  p = H ;
  while (p ≠ nil) do
  begin
    while (p.data.key < key) do
      p = p.right ;
    if (p.data.key = key) then report and stop
    else p = p.left.down ; {1 level down}
  end ;
end ;
Figure 4.6. MSL Search
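The search of Figure 4.6 can be exercised on a small hand-built MSL. The Python sketch below is an illustration under assumptions: build_msl is a hypothetical helper that constructs a static structure from a given key-to-level assignment (it is not the insert algorithm of Figure 4.7), head and tail nodes are given keys −∞ and +∞, and level 1 nodes have down = None.

```python
import math

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.down = None

def build_msl(levels):
    """levels maps key -> level (>= 1). Returns H, the head of the top chain."""
    top = max(levels.values())
    chains = {}
    for l in range(1, top + 1):
        keys = sorted(k for k, lev in levels.items() if lev == l)
        nodes = [Node(-math.inf)] + [Node(k) for k in keys] + [Node(math.inf)]
        for a, b in zip(nodes, nodes[1:]):      # doubly linked chain
            a.right, b.left = b, a
        chains[l] = nodes
    for l in range(2, top + 1):
        lower = chains[l - 1]
        chains[l][0].down = lower[0]            # head points to next head
        for x in chains[l][1:-1]:
            # leftmost node one level down with a larger key (tail is +inf)
            x.down = next(y for y in lower[1:] if y.key > x.key)
    return chains[top][0]

def search(H, key):
    p = H
    while p is not None:
        while p.key < key:
            p = p.right
        if p.key == key:
            return p
        p = p.left.down                         # back one node, down one level
    return None

H = build_msl({10: 1, 20: 2, 30: 1, 40: 3, 50: 1, 60: 2, 70: 1})
print(search(H, 30) is not None, search(H, 35) is None)   # True True
```

Each element appears in exactly one chain, so a search for a key that lives on a low level passes through every higher level first; the step back to p.left before going down is the extra work that the complexity argument has to account for.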
procedure Insert(d) ;
begin
  randomly generate the level k at which d is to be inserted ;
  search the MSL H for d.key saving information useful for insertion ;
  if d.key is found then fail ; {duplicate}
  get a new node x and set x.data = d ;
  if ((k > lcurrent) and (lcurrent ≠ lmax)) then
  begin
    lcurrent = lcurrent + 1 ;
    create a new chain with a head node, node x, and a tail and
    connect this chain to H ;
    update H ;
    set x.down to the appropriate node in the level lcurrent − 1 chain
    (to nil if k = 1) ;
  end
  else
  begin
    insert x into the level k chain ;
    set x.down to the appropriate node in the level k − 1 chain
    (to nil if k = 1) ;
    update the down field of nodes on the level k + 1 chain (if any) as needed ;
  end ;
end ;
Figure 4.7. MSL Insert
procedure Delete(z) ;
begin
  search the MSL H for a node x with data.key = z saving information useful
  for deletion ;
  if not found then fail ;
  let k be the level at which z is found ;
  for each node p on level k + 1 that has p.down = x, set p.down = x.right ;
  delete x from the level k list ;
  if the list at level lcurrent becomes empty then
    delete this and succeeding empty lists until we reach the first nonempty list,
    update lcurrent ;
end ;
Figure 4.8. MSL Delete
Theorem 14 The probabilistic complexity of the MSL operations is O(log n).
Proof We establish this by showing that our algorithms do at most a logarithmic
amount of additional work compared with those of Pugh [26]. Since the algorithms of
Pugh [26] have probabilistic O(log n) complexity, so also do ours. During a search,
the extra work results from moving back one node on each level and then moving down
one level. When this is done from any level other than lcurrent, we expect to examine
up to c = 1/p − 1 additional nodes on the next lower level. Hence, up to
c(lcurrent − 2) additional nodes get examined. During an insert, we also need to
verify that the element being inserted isn't one of the elements already in the MSL.
This requires an additional comparison at each level. So, MSLs may make up to
c(lcurrent − 2) + lcurrent additional compares during an insert. The number of down
pointers that need to be changed during an insert or delete is expected to be
$\sum_{i=1}^{\infty} i p^i = p/(1-p)^2$. Since c and p are constants and
lmax = log_{1/p} n, the expected additional work is O(log n). □
The relative performance of skip lists and modified skip lists as a data structure
for dictionaries was determined by programming the two in C. Both were implemented
using simulated pointers. The simulated pointer implementation of skip lists used
fixed size nodes. This avoided the use of complex storage management methods and
biased the run time measurements in favor of skip lists. For the case of skip lists, we
used p = 1/4 and for MSLs, p = 1/5. These values of p were found to work best for
each structure. lmax was set to 16 for both structures.
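The role of the parameters p and lmax can be seen in the way a level is drawn for each inserted element. The Python sketch below is an assumption about how such a generator is commonly written, following the standard geometric scheme for skip lists; the name random_level is hypothetical and the original C code is not reproduced here.

```python
import random

def random_level(p, lmax, rng=random):
    """Draw a level k in 1..lmax; level k + 1 is reached with probability p."""
    k = 1
    while k < lmax and rng.random() < p:
        k += 1
    return k

rng = random.Random(0)
sample = [random_level(0.2, 16, rng) for _ in range(10000)]
share_on_level_1 = sample.count(1) / len(sample)   # close to 1 - p = 0.8
```

With p = 1/5 about four fifths of the elements are expected to land on level 1, so the upper chains stay short; in an MSL the drawn value names the single chain that holds the element, whereas in a skip list it is the number of chains the element appears on.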
We experimented with n = 10,000, 50,000, 100,000, and 200,000. For each n,
the following five part experiment was conducted:
(a) start with an empty structure and perform n inserts;
(b) search for each item in the resulting structure once; items are searched for in the
order they were inserted;
(c) perform an alternating sequence of n inserts and n deletes; in this, the n elements
inserted in (a) are deleted in the order they were inserted and n new elements are
inserted;
(d) search for each of the remaining n elements in the order they were inserted;
(e) delete the n elements in the order they were inserted.
For each n, the above five part experiment was repeated ten times using different
random permutations of distinct elements. For each sequence, we measured the total
number of element comparisons performed and then averaged these over the ten
sequences. The average number of comparisons for each of the five parts of the
experiment are given in Table 4.1.
Also given in this table is the number of comparisons using ordered data. For
this data set, elements were inserted and deleted in the order 1,2,3,.... For the
case of random data, MSLs make 40% to 50% more comparisons on each of the five
parts of the experiment. On ordered inputs, the disparity is even greater with MSLs
making 30% to 140% more comparisons. Table 4.2 gives the number of levels in SKIP
and MSL. The first number of each entry is the number of levels following part (a)
of the experiment and the second the number of levels following part (b). As can be
seen, the number of levels is very comparable for both structures. MSLs generally
had one or two levels fewer than SKIPs had.

Table 4.1. The number of key comparisons
                        random inputs            ordered inputs
n         operation     SKIP        MSL          SKIP        MSL
10,000    insert        224757      322499       247129      318854
          search        255072      362865       256706      339019
          ins/del       519430      734161       354566      560219
          search        256124      349591       250538      339121
          delete        231745      320594       84392       185489
50,000    insert        1357076     1950583      1422120     1911818
          search        1537547     1965649      1467217     1836713
          ins/del       2996512     4142186      1973416     3204400
          search        1501731     2038774      1449810     1989550
          delete        1373858     1853671      486498      931975
100,000   insert        2919371     4146428      2925618     4275880
          search        3188621     4315576      2970715     4082193
          ins/del       6399463     9103135      4406427     6895510
          search        3225343     4427979      3277089     4345874
          delete        2981173     4161994      961283      2052638
200,000   insert        6178596     8927523      6403207     9022631
          search        6697223     9273707      6448304     8946474
          ins/del       13377747    19370831     9054078     9062233
          search        6680642     9662006      6458321     9197714
          delete        6149268     9101721      1995215     4837867

Table 4.2. Number of levels
           random inputs     ordered inputs
n          SKIP    MSL       SKIP    MSL
10,000     8,8     7,7       8,8     7,7
50,000     9,9     7,7       9,9     7,7
100,000    9,9     7,8       9,9     7,8
200,000    9,9     8,9       9,9     8,9
Despite the large disparity in number of comparisons, MSLs generally required
less time than required by SKIPs (see Table 4.3 and Figure 4.9). Integer keys were
used for our run time measurements. In many practical situations the observed time
difference will be noticeably greater as one would need to code skip lists using more
complex storage management techniques to allow for variable size nodes.
4.4 MSLs As Priority Queues
At first glance, it might appear that skip lists are clearly a better choice than
modified skip lists for use as a priority queue. The min element in a skip list is
the first element in the level one chain. So, it can be identified in O(1) time and
then deleted in O(log n) probabilistic time. In the case of MSLs, the min element
is the first one in one of the lcurrent chains. This can be identified in logarithmic
time using a loser tree whose elements are the first element from each MSL chain.
By using an additional pointer field in each node, we can thread the elements in an
MSL into a chain. The elements appear in nondecreasing order on this chain. The
resulting threaded structure is referred to as TMSL (threaded modified skip lists).
A delete min operation can be done in O(1) expected time when a TMSL is used.
The expected time for an insert remains O(log n). The algorithms for the insert and
delete min operations for TMSLs are given in Figures 4.10 and 4.11, respectively.
The last step of Figure 4.10 is implemented by first finding the largest element on
level 1 with key < d.key (for this, start at level lcurrent − 1) and then following the
threaded chain.

Table 4.3. Run time
                        random inputs      ordered inputs
n         operation     SKIP     MSL       SKIP     MSL
10,000    insert        0.24     0.18      0.20     0.17
          search        0.18     0.12      0.12     0.07
          ins/del       0.45     0.35      0.20     0.20
          search        0.18     0.12      0.13     0.07
          delete        0.16     0.12      0.07     0.05
50,000    insert        1.36     1.22      0.92     0.80
          search        1.25     0.98      0.62     0.38
          ins/del       2.73     2.53      1.07     1.08
          search        1.16     1.00      0.62     0.42
          delete        1.10     0.83      0.27     0.23
100,000   insert        2.84     2.86      1.72     1.60
          search        2.63     2.39      1.23     0.85
          ins/del       6.13     5.80      2.43     2.28
          search        2.61     2.33      1.35     0.92
          delete        2.41     2.02      0.55     0.52
200,000   insert        6.25     6.49      3.52     3.47
          search        5.85     5.34      2.70     1.87
          ins/del       13.29    13.02     5.13     4.75
          search        5.81     5.51      2.72     1.92
          delete        5.35     4.85      1.12     1.18

Figure 4.9. Run time (time is the sum of the times for parts (a)-(e) of the experiment)
Theorem 15 The expected complexity of an insert and deletemin operation in a
TMSL is O(log n) and O(1), respectively.
Proof Follows from the notion of a thread, Theorem 14, and Pugh [26]. □
procedure Insert(d) ;
begin
  randomly generate the level k at which d is to be inserted ;
  get a new node x and set x.data = d ;
  if ((k > lcurrent) and (lcurrent ≠ lmax)) then
  begin
    lcurrent = lcurrent + 1 ;
    create a new chain with a head node, node x, and a tail and
    connect this chain to H ;
    update H ;
    set x.down to the appropriate node in the level lcurrent − 1 chain
    (to nil if k = 1) ;
  end
  else
  begin
    insert x into the level k chain ;
    set x.down to the appropriate node in the level k − 1 chain (to nil if k = 1) ;
    update the down field of nodes on the level k + 1 chain (if any) as needed ;
  end ;
  find node with largest key < d.key and insert x into threaded list ;
end ;
Figure 4.10. TMSL Insert
procedure Deletemin ;
begin
  delete the first node x from the thread list ;
  let k be the level x is on ;
  delete x from the level k list (note there are no down fields on level k + 1
  that need to be updated) ;
  if the list at level lcurrent becomes empty then
    delete this and succeeding empty lists until we reach the first nonempty list,
    update lcurrent ;
end ;
Figure 4.11. TMSL Deletemin
procedure Deletemax ;
begin
  delete the last node x from the thread list ;
  let k be the level x is on ;
  delete x from the level k list updating p.down for nodes on level k + 1 as necessary ;
  if the list at level lcurrent becomes empty then
    delete this and succeeding empty lists until we reach the first nonempty list,
    update lcurrent ;
end ;
Figure 4.12. TMSL Deletemax
TMSLs may be further extended by making the threaded chain a doubly linked
list. This permits both deletemin and deletemax to be done in O(1) expected time
and insert in O(log n) expected time. With this extension, TMSLs may be used to
represent double ended priority queues.
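The effect of the doubly linked thread can be sketched in isolation from the level chains. In the Python sketch below, the class names TNode and Thread are hypothetical, and the predecessor is located by a linear scan that merely stands in for the logarithmic MSL search; only the constant-time unlinking at either end of the thread is the point of the example.

```python
class TNode:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None

class Thread:
    """Doubly linked thread holding all elements in nondecreasing key order."""
    def __init__(self):
        self.head = TNode(float("-inf"))
        self.tail = TNode(float("inf"))
        self.head.next, self.tail.prev = self.tail, self.head

    def insert(self, key):
        # stand-in for the MSL search that finds the predecessor
        pred = self.head
        while pred.next.key <= key:
            pred = pred.next
        x = TNode(key)
        x.prev, x.next = pred, pred.next
        pred.next.prev = x
        pred.next = x

    def _unlink(self, x):
        x.prev.next, x.next.prev = x.next, x.prev
        return x.key

    def delete_min(self):
        x = self.head.next                 # first node on the thread
        return None if x is self.tail else self._unlink(x)

    def delete_max(self):
        x = self.tail.prev                 # last node on the thread
        return None if x is self.head else self._unlink(x)

t = Thread()
for k in [5, 1, 9, 3]:
    t.insert(k)
print(t.delete_min(), t.delete_max())      # 1 9
```

In a full TMSL the removed node must also be unlinked from its level chain, and for deletemax the down fields one level up may need updating, as Figure 4.12 notes.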
4.5 Experimental Results For Priority Queues
The single ended priority queue structures min heap (Heap), binomial heap
(BHeap), leftist trees (LT), weight biased leftist trees (WBLT), and TMSLs were
programmed in C. In addition, priority queue versions of unbalanced binary search
trees (BST), AVL trees, treaps (TRP), and skip lists (SKIP) were also programmed.
The priority queue version of these structures differed from their normal dictionary
versions in that the delete operation was customized to support only a delete min.
For skip lists and TMSLs, the level allocation probability p was set to 1/4. While
BSTs are normally defined only for the case when the keys are distinct, they are
easily extended to handle multiple elements with the same key. In our extension,
if a node has key x, then its left subtree has values < x and its right values ≥ x.
To minimize the effects of system call overheads, all structures (other than Heap)
were programmed using simulated pointers. The min heap was programmed using a
one-dimensional array.
For our experiments, we began with structures initialized with n = 100, 1,000,
10,000, and 100,000 elements and then performed a random sequence of 100,000
operations. This random sequence consists of approximately 50% insert and 50%
delete min operations. The results are given in Tables 4.4, 4.5, and 4.6. In the data
sets 'random1' and 'random2', the elements to be inserted were randomly generated
while in the data set 'increasing' an ascending sequence of elements was inserted
and in the data set 'decreasing', a descending sequence of elements was used. Since
BSTs have very poor performance on the last two data sets, we excluded them from this
part of the experiment. In the case of both random1 and random2, ten random
sequences were used and the average of these ten is reported. The random1 and
random2 sequences differed in that for random1, the keys were integers in the range
0..(10^6 − 1) while for random2, they were in the range 0..999. So, random2 is expected
to have many more duplicates.
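A workload of this kind can be generated as in the following Python sketch. It is a reconstruction under assumptions rather than the original test driver: the names make_workload and run_on_heap are hypothetical, the choice to force an insert when the queue is empty is ours, and Python's heapq module is used only as a reference priority queue on which to replay the sequence.

```python
import heapq
import random

def make_workload(n, m, key_range, seed):
    """n initial keys followed by m operations, about half inserts and
    half delete mins; keys are integers in 0..key_range-1."""
    rng = random.Random(seed)
    initial = [rng.randrange(key_range) for _ in range(n)]
    ops = []
    size = n
    for _ in range(m):
        # assumption: never issue a delete min on an empty queue
        if size == 0 or rng.random() < 0.5:
            ops.append(("insert", rng.randrange(key_range)))
            size += 1
        else:
            ops.append(("deletemin", None))
            size -= 1
    return initial, ops

def run_on_heap(initial, ops):
    """Replay the workload on a binary heap and return the deleted keys."""
    heap = list(initial)
    heapq.heapify(heap)
    deleted = []
    for op, key in ops:
        if op == "insert":
            heapq.heappush(heap, key)
        else:
            deleted.append(heapq.heappop(heap))
    return deleted

# random1-style keys use key_range = 10**6; random2-style keys use 1000
initial, ops = make_workload(n=100, m=1000, key_range=1000, seed=1)
deleted = run_on_heap(initial, ops)
```

Replaying the same generated sequence on every structure keeps the comparison fair, and the smaller key range of the random2 setting makes duplicate keys common.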
Table 4.4 gives the total number of comparisons made by each of the methods.
On the two random data tests, weight biased leftist trees required the fewest number
of comparisons except when n = 100,000. In this case, AVL trees required the fewest.
With ascending data, treaps did best and with descending data, LTs and WBLTs did
best. For both, each insert could be done with one comparison as both structures
build a left skewed tree.
Table 4.4: The number of key comparisons
inputs      n         BST       Heap      BHeap     LT        WBLT      TRP       SKIP      TMSL      AVL
random1     100       621307    823685    268224    165771    165525    407326    270584    503567    373683
            1,000     677570    1317728   383285    203550    202274    537418    397181    729274    542789
            10,000    726875    1693955   645664    476534    468713    757946    711869    1104805   685422
            100,000   1067670   1746376   1153516   1199207   1171181   1119554   1327083   1778579   877131
random2     100       522728    781808    260209    164288    164031    384721    261169    485995    358885
            1,000     612630    1273828   389333    199886    198720    518022    386837    707997    534406
            10,000    1027346   1576921   642968    448379    439014    768783    710612    1131421   685926
            100,000   5641638   1713676   1146214   1032043   978785    1746732   1339410   1800404   890190
increasing  100       -         564332    410119    552081    536796    363045    629085    836223    382513
            1,000     -         946659    655107    917664    882490    496223    1018473   1331317   584422
            10,000    -         1284712   814412    1234622   1192325   645765    1313298   1685844   747581
            100,000   -         1617645   923048    1550741   1560866   723825    1568819   1939835   902437
decreasing  100       -         836361    202402    50010     50010     362723    194245    425238    334965
            1,000     -         1394286   313587    50010     50010     515579    300298    637341    512942
            10,000    -         1950062   413840    50010     50010     558222    400082    836879    672910
            100,000   -         2400032   534821    50010     50010     648730    450090    936986    835044
n = the number of elements in initial data structures
Total number of operations performed = 100,000
The structure height initially and following the 100,000 operations is given in
Table 4.5 for BSTs, Heaps, TRPs and AVL trees. For BHeaps, the height of the
tallest tree is given. For SKIPs and TMSLs, this table gives the number of levels. In
the case of LT and WBLT, this table gives the length of the rightmost path following
initialization and the average of its length following each of the 100,000 operations.
The two leftist structures are able to maintain their rightmost paths so as to have a
length much less than log2(n + 1).
The measured run times on a Sun Sparc 5 are given in Table 4.6. For this, the
codes were compiled using the cc compiler in optimized mode. The run time for the
data set random1 is graphed in Figure 4.13. The run time for the data set random2
and Heap, LT, WBLT, SKIP, TMSL, and AVL is graphed in Figure 4.14. For the
data sets random1 and random2 with n = 100 and 1,000, WBLTs required least time.
For random1 with n = 10,000, BSTs took least time while when n = 100,000, both
BSTs and Heaps took least time. For random2 with n = 10,000, WBLTs were fastest
while for n = 100,000, Heap was best. On the ordered data sets, BSTs have a very
high complexity and are the poorest performers (times not shown in Table 4.6). For
increasing data, Heap was best for n = 100, 1,000 and 10,000 and both Heap and
TRP best for n = 100,000. For decreasing data, WBLTs were generally best. On
all data sets, WBLTs always did at least as well as (and often better than) LTs. Between
SKIP and TMSL, we see that SKIP generally did better for small n and TMSL for
large n.
Table 4.5: Height/level of the structures
inputs      n         BST       Heap    BHeap   LT      WBLT    TRP       SKIP   TMSL   AVL
random1     100       13,13     7,6     1,6     4,2     4,2     12,11     4,4    4,4    8,7
            1,000     22,22     10,10   1,10    7,2     7,2     22,24     6,6    6,6    12,12
            10,000    31,32     14,14   1,14    8,4     8,4     33,30     8,7    8,7    16,16
            100,000   40,41     17,17   1,17    10,9    10,9    42,42     9,9    9,9    20,20
random2     100       13,16     7,7     1,7     4,2     4,2     14,17     4,4    4,4    8,7
            1,000     23,69     10,10   1,10    5,2     5,2     23,63     6,5    6,5    12,11
            10,000    39,93     14,14   1,14    6,4     6,4     36,83     8,7    8,7    16,15
            100,000   147,201   17,17   1,17    6,8     6,8     133,183   9,9    9,9    19,19
increasing  100       -         7,8     1,8     6,4     6,4     11,15     4,5    4,5    7,8
            1,000     -         10,11   1,11    9,7     9,7     24,24     6,6    6,6    10,11
            10,000    -         14,14   1,14    13,9    13,9    33,34     8,8    8,8    14,14
            100,000   -         17,17   1,17    16,11   16,11   46,41     9,9    9,9    17,17
decreasing  100       -         7,7     1,7     1,1     1,1     11,15     4,4    4,4    7,7
            1,000     -         10,10   1,10    1,1     1,1     24,24     6,6    6,6    10,10
            10,000    -         14,14   1,14    1,1     1,1     33,33     8,8    8,8    14,14
            100,000   -         17,17   1,17    1,1     1,1     46,46     9,9    9,9    17,17
n = the number of elements in initial data structures
Total number of operations performed = 100,000

Table 4.6: Run time using integer keys
inputs      n         BST     Heap    BHeap   LT      WBLT    TRP     SKIP    TMSL    AVL
random1     100       0.32    0.32    0.53    0.30    0.23    0.56    0.36    0.37    0.50
            1,000     0.34    0.44    0.59    0.29    0.25    0.56    0.35    0.39    0.56
            10,000    0.38    0.57    0.98    0.62    0.49    0.71    0.62    0.59    0.70
            100,000   0.66    0.66    1.90    1.77    1.26    1.26    1.33    1.32    0.96
random2     100       0.29    0.30    0.55    0.27    0.25    0.55    0.32    0.35    0.50
            1,000     0.32    0.44    0.58    0.27    0.23    0.53    0.32    0.36    0.53
            10,000    0.49    0.51    0.93    0.57    0.44    0.70    0.59    0.57    0.67
            100,000   3.83    0.68    1.92    1.44    0.99    1.70    1.32    1.34    1.02
increasing  100       -       0.22    0.63    0.50    0.40    0.42    0.43    0.42    0.62
            1,000     -       0.35    0.92    0.95    0.70    0.48    0.78    0.58    0.58
            10,000    -       0.47    1.05    1.25    0.95    0.55    0.72    0.60    0.63
            100,000   -       0.60    1.30    1.83    1.40    0.60    0.83    0.65    0.78
decreasing  100       -       0.30    0.38    0.13    0.12    0.45    0.23    0.28    0.40
            1,000     -       0.45    0.47    0.13    0.10    0.50    0.28    0.33    0.48
            10,000    -       0.58    0.55    0.12    0.12    0.52    0.32    0.38    0.67
            100,000   -       0.70    0.73    0.12    0.12    0.55    0.35    0.45    0.80
Time Unit : sec
n = the number of elements in initial data structures
Total number of operations performed = 100,000

Figure 4.13. Run time on random1
Another way to interpret the time results is in terms of the ratio m/n (m =
number of operations). In the experiments reported in Table 4.6, m = 100,000. As
m/n increases, WBLTs and LTs perform better relative to the remaining structures.
This is because as m increases, the (weight biased) leftist trees constructed are very
highly skewed to the left and the length of the rightmost path is close to one.
Tables 4.7, 4.8, and 4.9 provide an experimental comparison of BSTs, AVL
trees, MMHs (min-max heaps) [2], Deaps [7], TRPs, SKIPs, and TMSLs as a data
structure for double ended priority queues. The experimental setup is similar to that
used for single ended priority queues. However, this time the operation mix was 50%
insert, 25% deletemin, and 25% deletemax. On the comparison measure, treaps
did best on increasing data (except when n = 100) and skip lists did best when
decreasing data was used. On all other data, AVL trees did best. As far as run time
is concerned, BSTs did best on the random data tests except when n = 100,000 and
the set random2 was used. In this case, deaps and AVL trees took least time. For
increasing data, treaps were best and for decreasing data, skip lists were best. The
run time for the data set random1 is graphed in Figure 4.15. The run time for the
data set random2 and MMH, Deap, SKIP, TMSL, and AVL is graphed in Figure 4.16.

Figure 4.14. Run time on random2
4.6 Conclusion
We have developed two new data structures: weight biased leftist trees and
modified skip lists. Experiments indicate that WBLTs have better performance (i.e.,
run time characteristic and number of comparisons) than LTs as a data structure for
single ended priority queues and MSLs have a better performance than skip lists as
a data structure for dictionaries. MSLs have the added advantage of using fixed size
nodes.

Table 4.7: The number of key comparisons
inputs      n         BST       MMH       Deap      TRP       SKIP      TMSL      AVL
random1     100       534197    994374    581071    402987    462690    666845    371435
            1,000     676634    1677964   912282    550363    698132    996774    545600
            10,000    738100    2328247   1122599   759755    1034669   1437925   693935
            100,000   1068250   2795369   1123423   1127616   1439387   1862864   878709
random2     100       514680    941925    557549    396910    447909    650533    357556
            1,000     908437    1651830   909747    564414    658284    949230    537616
            10,000    1339444   2239519   1145724   881889    1003300   1379690   689415
            100,000   5956407   2754760   1125210   1873922   1454138   1875706   891332
increasing  100       -         926017    592503    364026    624430    803024    363399
            1,000     -         1660945   1052894   507812    999886    1304223   562873
            10,000    -         2392627   1411640   614534    1373332   1766792   726087
            100,000   -         3015425   1849318   679120    1487465   1866498   878978
decreasing  100       -         926041    615944    360030    193021    413284    355999
            1,000     -         1711076   1062010   490022    287887    603089    563085
            10,000    -         2400014   1480876   676190    352780    732673    725079
            100,000   -         3035292   1851845   698740    450090    927278    878244
n = the number of elements in initial data structures
Total number of operations performed = 100,000

Table 4.8. Height/level of the structures
inputs      n         BST       MMH     Deap    TRP       SKIP   TMSL   AVL
random1     100       13,12     7,6     7,6     13,11     4,4    4,4    8,7
            1,000     22,22     10,10   10,10   23,22     6,6    6,6    12,12
            10,000    32,31     14,14   14,14   33,32     8,8    8,8    16,16
            100,000   41,41     17,17   17,17   41,42     9,9    9,9    20,20
random2     100       13,13     7,7     7,7     14,12     4,4    4,4    8,7
            1,000     23,69     10,10   10,10   22,60     6,6    6,6    12,11
            10,000    38,93     14,14   14,14   35,82     8,7    8,7    16,15
            100,000   147,199   17,17   17,17   135,186   9,9    9,9    19,19
increasing  100       -         7,8     7,8     11,16     4,5    4,5    7,8
            1,000     -         10,11   10,11   24,27     6,7    6,7    10,11
            10,000    -         14,14   14,14   33,33     8,8    8,8    14,14
            100,000   -         17,17   17,17   46,43     9,9    9,9    17,17
decreasing  100       -         7,7     7,7     11,15     4,5    4,5    7,8
            1,000     -         10,10   10,10   24,21     6,7    6,7    10,11
            10,000    -         14,14   14,14   33,36     8,8    8,8    14,14
            100,000   -         17,17   17,17   46,43     9,9    9,9    17,17
n = the number of elements in initial data structures
Total number of operations performed = 100,000

Figure 4.15. Run time on random1

Table 4.9. Run time using integer keys
inputs      n         BST     MMH     Deap    TRP     SKIP    TMSL    AVL
random1     100       0.29    0.42    0.39    0.44    0.45    0.42    0.52
            1,000     0.32    0.62    0.57    0.49    0.56    0.51    0.57
            10,000    0.34    0.83    0.81    0.65    0.87    0.74    0.67
            100,000   0.64    1.18    1.05    1.17    1.51    1.45    0.99
random2     100       0.27    0.42    0.39    0.47    0.51    0.46    0.56
            1,000     0.47    0.64    0.59    0.54    0.53    0.48    0.55
            10,000    0.59    0.85    0.78    0.72    0.89    0.71    0.69
            100,000   4.22    1.07    1.01    1.91    1.50    1.47    1.01
increasing  100       -       0.38    0.38    0.35    0.48    0.38    0.60
            1,000     -       0.60    0.63    0.42    0.65    0.53    0.53
            10,000    -       0.88    0.82    0.47    0.80    0.63    0.62
            100,000   -       1.12    1.15    0.57    0.92    0.85    0.77
decreasing  100       -       0.37    0.40    0.35    0.35    0.42    0.53
            1,000     -       0.63    0.62    0.43    0.33    0.38    0.55
            10,000    -       0.83    0.83    0.50    0.35    0.40    0.62
            100,000   -       1.05    1.10    0.63    0.40    0.45    0.83
Time Unit : sec
n = the number of elements in initial data structures
Total number of operations performed = 100,000

Figure 4.16. Run time on random2
Our experiments show that binary search trees (modified to handle equal keys)
perform best of the tested double-ended priority queue structures on random data.
Of course, these are unsuitable for general application as they have very poor
performance on ordered data. Min-max heaps, deaps and AVL trees guarantee O(log n)
behavior per operation. Of these three, AVL trees generally do best for large n. It
is possible that other balanced search structures such as bottom-up red-black trees
might do even better. Treaps and skip lists are randomized structures with O(log n)
expected complexity. Treaps were generally faster than skip lists (except for
decreasing data) as double-ended priority queues.
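
As an illustration of how an unbalanced binary search tree can serve as a double-ended priority queue, the following Python sketch may help. It is not the code used in the experiments: the names BSTDepq, insert, delete_min and delete_max are hypothetical, and storing a per-node count is only one assumed way of handling equal keys. The minimum is the leftmost node and the maximum is the rightmost node, so each delete walks a single path.

```python
class _Node:
    __slots__ = ("key", "count", "left", "right")

    def __init__(self, key):
        self.key = key
        self.count = 1      # assumption: equal keys share one node
        self.left = None
        self.right = None


class BSTDepq:
    """Double-ended priority queue on an unbalanced binary search tree."""

    def __init__(self):
        self.root = None
        self.size = 0

    def insert(self, key):
        self.size += 1
        if self.root is None:
            self.root = _Node(key)
            return
        node = self.root
        while True:
            if key == node.key:
                node.count += 1
                return
            branch = "left" if key < node.key else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, _Node(key))
                return
            node = child

    def _delete_extreme(self, toward, away):
        # The extreme node has no child on the `toward` side, so it can be
        # spliced out by linking its `away` subtree to its parent.
        if self.root is None:
            raise IndexError("delete from empty queue")
        parent, node = None, self.root
        while getattr(node, toward) is not None:
            parent, node = node, getattr(node, toward)
        self.size -= 1
        node.count -= 1
        if node.count == 0:
            replacement = getattr(node, away)
            if parent is None:
                self.root = replacement
            else:
                setattr(parent, toward, replacement)
        return node.key

    def delete_min(self):
        return self._delete_extreme("left", "right")

    def delete_max(self):
        return self._delete_extreme("right", "left")
```

Inserting keys in increasing or decreasing order makes this tree degenerate into a chain of length n, which is the poor behavior on ordered data noted above.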
For single-ended priority queues, if we exclude BSTs because of their very poor
performance on ordered data, WBLTs did best on the data sets random1 and random2
(except when n = 100,000), and decreasing. Heaps did best on the remaining data
sets. The probabilistic structures TRP, SKIP and TMSL were generally slower than
WBLTs. When the ratio m/n (m = number of operations, n = average queue size)
is large, WBLTs (and LTs) outperform heaps (and all other tested structures) as the
binary trees constructed tend to be highly skewed to the left and the length of the
rightmost path is close to one.
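
The following Python sketch illustrates the idea behind a weight biased leftist tree; it is a minimal illustration under stated assumptions, not the implementation that was timed. It assumes the usual definition: the weight of a node is the number of nodes in its subtree, and at every node the weight of the left subtree is at least that of the right subtree. Because the weight of a melded subtree is known before the meld is carried out, the meld can proceed in a single top-down pass along the rightmost paths. The names Node, meld and WBLT are hypothetical.

```python
class Node:
    __slots__ = ("key", "weight", "left", "right")

    def __init__(self, key):
        self.key = key
        self.weight = 1     # number of nodes in the subtree rooted here
        self.left = None
        self.right = None


def _w(node):
    return node.weight if node is not None else 0


def meld(a, b):
    """Meld two min-ordered WBLTs in one top-down pass."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    root = a
    while True:
        # b is melded into the right subtree of a, so the final weight
        # of a is already known.
        a.weight += b.weight
        right = a.right
        if _w(a.left) < _w(right) + b.weight:
            # The melded subtree will be the heavier one: it goes left.
            a.right = a.left
            slot = "left"
        else:
            slot = "right"
        if right is None:
            setattr(a, slot, b)
            return root
        if b.key < right.key:
            right, b = b, right
        setattr(a, slot, right)
        a = right


class WBLT:
    """Single-ended (min) priority queue built on meld."""

    def __init__(self):
        self.root = None

    def __len__(self):
        return _w(self.root)

    def insert(self, key):
        self.root = meld(self.root, Node(key))

    def delete_min(self):
        if self.root is None:
            raise IndexError("delete from empty queue")
        top = self.root
        self.root = meld(top.left, top.right)
        return top.key
```

Each step of the loop moves one level down a rightmost path, so when the trees are heavily skewed to the left, as described above, an insert or delete touches very few nodes.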
Our experimental results for single-ended priority queues are in marked
contrast to those reported in Gonnet and Baeza-Yates [11, p. 228] where leftist trees are
reported to take approximately four times as much time as heaps. We suspect this
difference in results is because of different programming techniques (recursion vs.
iteration, dynamic vs. static memory allocation, etc.) used in Gonnet and Baeza-Yates
[11] for the different structures. In our experiments, all structures were coded using
similar programming techniques.
CHAPTER 5
CONCLUSIONS
We have considered the problem of joining a row of compacted cells and
developed heuristics to stretch cells and river-route the nets so that the layout area is
minimized. Our proposed heuristics were compared, experimentally, with Fang [8]
and found to produce layouts with less area.
We have developed a new weight balanced search structure called the β-BBST. Our
experimental results show that β-BBSTs generally have the best search cost of the
structures considered. We recommend β-BBSTs for environments where searches
are done with much greater frequency than inserts and/or deletes. Based on our
experiments, we conclude that AVL trees remain the best dictionary structure for
general applications.
We have also proposed two simplified versions of the β-BBST called SBBST and
BBSTD. Both versions are very competitive with β-BBSTs. The β-BBST and SBBST
provide slightly better balance than provided by WB(α) trees.
Another new data structure developed by us, the weight biased leftist tree, has
better performance than leftist trees as a data structure for single-ended priority
queues. We have also proposed a fixed node size representation for skip lists. As
a data structure for dictionaries, this has better performance than skip lists with
variable node size.
APPENDIX A
ABBREVIATIONS
Table A.1. Abbreviations used in tables
AVL      Adelson-Velskii and Landis' trees
BHeap    binomial heaps
BBST     β-balanced binary search trees
BBSTD    BBSTs without deletion
BST      unbalanced binary search trees
Deap     deaps
DSL      deterministic skip lists
Heap     heaps
LT       leftist trees
MMH      min-max heaps
MSL      modified skip lists
RBB      bottom-up red-black trees
RBT      top-down red-black trees
SBBST    simple β-balanced binary search trees
SKIP     skip lists
TMSL     threaded modified skip lists
TRP      treaps
WB       weight balanced trees
WBLT     weight biased leftist trees
REFERENCES
[1] Aragon, C. R., and Seidel, R. G. Randomized Search Trees. In Proc.
30th Ann. IEEE Symposium on Foundations of Computer Science (Oct. 1989),
pp. 540-545.
[2] Atkinson, M., Sack, J., Santoro, N., and Strothotte, T. Min-max
Heaps and Generalized Priority Queues. Commun. ACM 29, 10 (1986),
996-1000.
[3] Baratz, A. Algorithms for Integrated Circuit Signal Routing. Ph.D.
dissertation, 1981. MIT.
[4] Blum, N., and Mehlhorn, K. On the Average Number of Rebalancing
Operations in Weight-balanced Trees. Theoretical Comput. Sci. 11 (1980),
303-320.
[5] Boyer, D. G. Symbolic Layout Compaction Review. In Proc. 25th Design
Automation Conf. (June 1988), pp. 383-389.
[6] Carlson, B. S., and Lee, S. J. Delay Optimization of Digital CMOS VLSI
Circuits by Transistor Reordering. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems 14, 10 (Oct. 1995), 1183-1192.
[7] Carlsson, S. The Deap: a Double-Ended Heap to Implement Double-Ended
Priority Queues. Inf. Process. Lett. 26 (1987), 33-36.
[8] Cheng, G., and Despain, A. Fang: A Joiner for Compacted Cells. In VLSI
89 (1989), pp. 455-463.
[9] Crane, C. Linear Lists and Priority Queues as Balanced Binary Trees. Tech.
Rep. CS-72-259, Stanford University, 1972. Dept. of Comp. Sci.
[10] Fredman, M., and Tarjan, R. Fibonacci Heaps and Their Uses in Improved
Network Optimization Algorithms. Journal of ACM 34, 3 (1987), 596-615.
[11] Gonnet, G. H., and Baeza-Yates, R. Handbook of Algorithms and Data
Structures, 2nd ed. Addison-Wesley Pub. Co., Reading, MA, 1991.
[12] Guibas, L. J., and Sedgewick, R. A Dichromatic Framework for Balanced
Trees. In Proc. 19th FOCS (1978), pp. 8-21.
[13] Harrison, A. J. VLSI Layout Compaction Using Radix Priority Search Trees.
In Proc. 28th Design Automation Conf. (March 1991), pp. 732-735.
[14] Horowitz, E., and Sahni, S. Fundamentals of Data Structures in Pascal,
4th ed. W. H. Freeman and Company, New York, NY, 1994.
[15] Leiserson, C., and Pinter, R. Optimal Placement for River Routing. SIAM
J. Comput. 12, 3 (Aug. 1983), 447-462.
[16] Lim, A. Efficient Algorithm for CAD in VLSI. Ph.D. dissertation, 1992.
University of Minnesota, MN.
[17] Lim, A., Cheng, S., and Sahni, S. Optimal Joining of Compacted Cells.
IEEE Trans. Comput. 42, 5 (May 1993), 597-607.
[18] McCreight, E. M. Priority Search Trees. SIAM J. Comput. 14, 2 (May 1985),
257-276.
[19] Mirzaian, A. River Routing in VLSI. J. Comput. Syst. Sci. 34, 1 (1987),
43-54.
[20] Munro, J. I., Papadakis, T., and Sedgewick, R. Deterministic Skip
Lists. In 3rd Annual ACM-SIAM Symposium on Discrete Algorithms (Jan.
1992), pp. 367-375.
[21] Nievergelt, J., and Reingold, E. M. Binary Search Trees of Bounded
Balance. SIAM J. Comput. 2, 2 (March 1973), 33-43.
[22] Papadakis, T. Skip Lists and Probabilistic Analysis of Algorithms. Ph.D.
dissertation, 1993. University of Waterloo.
[23] Pinter, R. Y. On Routing Two Point Nets Across A Channel. In Proc. 19th
Design Automation Conf. (June 1982), pp. 899-902.
[24] Pinter, R. Y. River Routing: Methodology and Analysis. In Third Caltech
Conference on VLSI, R. Bryant, Ed. Computer Science Press, Rockville, MD,
March 1983, pp. 141-163.
[25] Preparata, F. P., and Lipski, W., Jr. Optimal Three-Layer Channel Routing.
IEEE Trans. Comput. C-33, 5 (May 1984), 427-437.
[26] Pugh, W. Skip Lists: a Probabilistic Alternative to Balanced Trees. Commun.
ACM 33, 6 (1990), 668-676.
[27] Sahni, S. Software Development in Pascal. NSPAN Printing and Publishing
Co., Gainesville, FL, 1993.
[28] Sedgewick, R. Algorithms in C++. Addison-Wesley Pub. Co., Reading, MA,
1994.
[29] Shenoy, N., and Rudell, R. Efficient Implementation of Retiming. In Proc.
International Conf. on CAD (June 1994), pp. 226-233.
[30] Sherwani, N. Algorithms for VLSI Physical Design Automation. Kluwer
Academic, Norwell, MA, 1992.
[31] Tarjan, R. E. Updating a Balanced Search Tree in O(1) Rotations. Inf.
Process. Lett. 16 (June 1983), 253-257.
[32] Tompa, M. An Optimal Solution to A Wire-routing Problem. J. Comput. Syst.
Sci. 23, 2 (May 1981), 127-150.
[33] Weste, N. Virtual Grid Symbolic Layout. In Proc. 18th Design Automation
Conf. (June 1981), pp. 225-233.
BIOGRAPHICAL SKETCH
Seonghun Cho was brought up in the Republic of Korea. He completed his
undergraduate degree in mathematics at Seoul National University. He obtained
an M.S. degree in computer science from the University of Missouri-Columbia, Columbia,
MO. He began work toward his Ph.D. in Fall 1991 in the Computer and
Information Science and Engineering Department at the University of Florida. His
research interests include VLSI CAD algorithms and data structures. He will be
graduating in May 1996.
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate, in
scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Professor of Computer and
Information Science and Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate, in
scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Li-Min Fu
Associate Professor of Computer and
Information Science and Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate, in
scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Theodore Johnson
Associate Professor of Computer and
Information Science and Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate, in
scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Sanguthevar Rajasekaran
Associate Professor of Computer and
Information Science and Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate, in
scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Pyi W.Omn
Professor of Biochemistry and
Molecular Biology
This dissertation was submitted to the Graduate Faculty of the College
of Engineering and to the Graduate School and was accepted as partial
fulfillment of the requirements for the degree of Doctor of Philosophy.
May 1996
Winfred M. Phillips
Dean, College of Engineering
Karen A. Holbrook
Dean, Graduate School

