A New Weight Balanced Binary Search Tree¹
Seonghun Cho and Sartaj Sahni
Department of Computer and Information Science and Engineering
University of Florida
Gainesville, FL 32611, U.S.A.
Technical Report 96001
Abstract
We develop a new class of weight balanced binary search trees called $\beta$-balanced
binary search trees ($\beta$-BBSTs). $\beta$-BBSTs are designed to have reduced internal path
length. As a result, they are expected to exhibit good search time characteristics.
Individual search, insert, and delete operations in an $n$ node $\beta$-BBST take $O(\log n)$
time for $0 < \beta \le \sqrt{2} - 1$. Experimental results comparing the performance of $\beta$-BBSTs,
WB($\alpha$) trees, AVL-trees, red/black trees, treaps, deterministic skip lists, and skip lists
are presented. Two simplified versions of $\beta$-BBSTs are also developed.
Keywords and Phrases. data structures, weight balanced binary search trees
1 Introduction
A dictionary is a set of elements on which the operations of search, insert, and delete are
performed. Many data structures have been proposed for the efficient representation of a
dictionary [HORO94]. These include direct addressing schemes such as hash tables and
comparison schemes such as binary search trees, AVL-trees, red/black trees [GUIB78], trees
of bounded balance [NIEV73], treaps [ARAG89], deterministic skip lists [MUNR92], and skip
lists [PUGH90]. Of these schemes, AVL-trees, red/black trees, and trees of bounded balance
(WB($\alpha$)) are balanced binary search trees. When representing a dictionary with $n$ elements
using one of these schemes, the corresponding binary search tree has height $O(\log n)$ and
individual search, insert, and delete operations take $O(\log n)$ time. When (unbalanced)
binary search trees, treaps, or skip lists are used, each operation has an expected complexity
of $O(\log n)$ but the worst case complexity is $O(n)$. When hash tables are used, the expected
complexity is $O(1)$ per operation. However, the worst case complexity is $O(n)$. So, in
applications where a worst case complexity guarantee is critical, one of the balanced binary
search tree schemes is to be preferred.
¹ This research was supported, in part, by the Army Research Office under grant DAA H04-95-1-0111,
and by the National Science Foundation under grant MIP-9103379.
In this paper, we develop a new balanced binary search tree called the $\beta$-BBST ($\beta$-balanced
binary search tree). Like WB($\alpha$) trees, this achieves balancing by controlling the relative
number of nodes in each subtree. However, unlike WB($\alpha$) trees, during insert and delete
operations, rotations are performed along the search path whenever they reduce the internal
path length of the tree (rather than only when a subtree is out of balance). As a result, the
constructed trees are expected to have a smaller internal path length than the corresponding
WB($\alpha$) tree. Since the average search time is closely related to the internal path length, the
time needed to search in a $\beta$-BBST is expected to be less than that in a WB($\alpha$) tree.
In Section 2, we define the total search cost of a binary search tree and show that the
rebalancing rotations performed in AVL and red/black trees might increase this metric. We
also show that while similar rotations in WB($\alpha$) trees do not increase this metric, insert and
delete operations in WB($\alpha$) trees do not avail of all opportunities to reduce the metric. In
Section 3, we define $\beta$-BBSTs and show their relationship to WB($\alpha$) trees. Search, insert,
and delete algorithms for $\beta$-BBSTs are developed in Section 4. A simplified version of
$\beta$-BBSTs is developed in Section 5. Search, insert, and delete operations for this version also
take $O(\log n)$ time each. An even simpler version of $\beta$-BBSTs is developed in Section 6.
For this version, we show that the average cost of an insert and search operation is $O(\log n)$
provided no deletes are performed.
An experimental evaluation of $\beta$-BBSTs and competing schemes for dictionaries (AVL,
red/black, skip lists, etc.) was done and the results are presented in Section 7. This
section also compares the relative performance of $\beta$-BBSTs and the two simplified versions
of Sections 5 and 6.
2 Balanced Trees and Rotations
Following an insert or delete operation in a balanced binary search tree (e.g., AVL, red/black,
WB($\alpha$), etc.), it may be necessary to perform rotations to restore balance. The rotations
are classified as LL, RR, LR, and RL [HORO94]. LL and RR rotations as well as LR and
RL rotations are symmetric. While the conditions under which the rotations are performed
vary with the class of balanced tree considered, the node movement patterns are the same.
Figure 1 shows the transformation performed by an LL and an LR rotation. In this figure,
nodes whose subtrees have changed as a result of the rotation are designated by a prime.
So, $p'$ is the original node $p$; however, its subtrees are different.
Let $h(x)$ be the height of the subtree with root $x$. Let $s(x)$ be the number of nodes in this
subtree. When searching for an element $x$, $x$ is compared with one element at each of $l(x)$
levels, where $l(x)$ is the level at which $x$ is present (the root is at level 1). So, one measure
of the goodness of the binary search tree, $T$, for search operations (assuming each element
is searched for with equal probability) is its total search cost defined as:
$$C(T) = \sum_{x \in T} l(x).$$
Notice that $C(T) = I(T) + n$ where $I(T)$ is the internal path length of $T$ and $n$ is the
number of elements/nodes in $T$. The cost of unsuccessful searches is equal to the external
path length $E(T)$. Since $E(T) = I(T) + 2n$, minimizing $C(T)$ also minimizes $E(T)$.
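The relations $C(T) = I(T) + n$ and $E(T) = I(T) + 2n$ can be checked on a small tree. The following is a minimal sketch, not code from the report; the `Node` class, the function names, and the five-node example tree are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def size(t):
    """Number of nodes in the subtree rooted at t."""
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def total_search_cost(t, level=1):
    """C(T): sum of the levels l(x) of all nodes, with the root at level 1."""
    if t is None:
        return 0
    return (level + total_search_cost(t.left, level + 1)
            + total_search_cost(t.right, level + 1))


def internal_path_length(t, depth=0):
    """I(T): sum of node depths, with the root at depth 0."""
    if t is None:
        return 0
    return (depth + internal_path_length(t.left, depth + 1)
            + internal_path_length(t.right, depth + 1))


def external_path_length(t, depth=0):
    """E(T): sum of the depths of the n + 1 external (nil) positions."""
    if t is None:
        return depth
    return (external_path_length(t.left, depth + 1)
            + external_path_length(t.right, depth + 1))


# Hypothetical example: 4 at the root, 2 and 5 below it, 1 and 3 below 2.
tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
n = size(tree)
```

For this tree the three quantities are 11, 6, and 16, so both identities hold with $n = 5$.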
Total search cost is important as this is the dominant operation in a dictionary (note
that insert can be modeled as an unsuccessful search followed by the insertion of a node
at the point where the search terminated and deletion can be modeled by a successful
search followed by a physical deletion; both operations are then followed by a
rebalancing/restructuring step).
Observe that in an actual implementation of the search operation in programming languages
such as C++, C, and Pascal, the search for an $x$ at level $l(x)$ will involve up to two
comparisons at each of the levels $1, 2, \ldots, l(x)$. If the code first checks $x = e_i$, where $e_i$ is the element
at level $i$ to be compared, and then $x < e_i$ to decide whether to move to the left or right
subtree, then the number of element comparisons is exactly $2l(x) - 1$. In this case, the total
number of element comparisons is
$$NC(T) = 2\sum_{x \in T} l(x) - n = 2C(T) - n$$
and minimizing $C(T)$ also minimizes $NC(T)$. If the code first checks $x < e_i$ and then $x = e_i$
(or $x > e_i$), the number of element comparisons done to find $x$ is $l(x) + r(x) + 1$ where $r(x)$ is the
number of right branches on the path from the root to $x$. The total number of comparisons
is bounded by $2C(T)$. For simplicity, we use $C(T)$ to motivate our data structure.
Figure 1: LL and LR rotations. (a) LL rotation; (b) LR rotation.
In an AVL tree, when an LL rotation is performed, $h(q) = h(c) + 1 = h(d) + 1$ (see
Figure 1(a)). At this time, the balance factor at $gp$ is $h(p) - h(d) = 2$. The rotation restores
height balance, which is necessary to guarantee $O(\log n)$ search, insert, and delete operations in
an $n$ node AVL tree. The rotation may, however, increase the total search cost. To see
this, notice that an LL rotation affects the level numbers of only those nodes that are in the
subtree with root $gp$ prior to the rotation. We see that $l(q') = l(q) - 1$, $l(p') = l(p) - 1$, $l(gp') = l(gp) + 1$,
the total search cost of the subtree with root $a$ is decreased by $s(a)$ as a result of
the rotation, etc. Hence, the increase in $C(T)$ due to the rotation is:
$$l(p') - l(p) + l(q') - l(q) + l(gp') - l(gp) - s(a) - s(b) + s(d)$$
$$= -1 - 1 + 1 - s(q) + 1 + s(d) = s(d) - s(q).$$
A similar analysis shows that an LR rotation increases $C(T)$ by $s(d) - s(q)$.
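The $s(d) - s(q)$ change for an LL rotation can be confirmed by rotating a concrete tree and recomputing its cost. This is a sketch under assumptions: the helper names are illustrative, the subtrees $q$, $c$, and $d$ are filled with balanced trees of the requested sizes (their shape does not affect the cost difference), and $gp$ is taken as the root of the whole tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cost(t, level=1):
    """Total search cost C(T): sum of node levels, root at level 1."""
    if t is None:
        return 0
    return level + cost(t.left, level + 1) + cost(t.right, level + 1)


def build(lo, hi):
    """Balanced binary search tree on the keys lo, lo + 1, ..., hi - 1."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(mid, build(lo, mid), build(mid + 1, hi))


def make(sq, sc, sd):
    """gp whose left child p has subtrees q (sq nodes) and c (sc nodes),
    and whose right subtree d has sd nodes."""
    p = Node(sq, build(0, sq), build(sq + 1, sq + 1 + sc))
    return Node(sq + 1 + sc, p, build(sq + 2 + sc, sq + 2 + sc + sd))


def rotate_ll(gp):
    """LL rotation at gp: the left child p becomes the subtree root."""
    p = gp.left
    gp.left = p.right  # subtree c moves under gp
    p.right = gp
    return p


def cost_change(sq, sc, sd):
    """Increase in C(T) caused by an LL rotation at the root."""
    t = make(sq, sc, sd)
    before = cost(t)
    return cost(rotate_ll(t)) - before
```

For example, `cost_change(3, 1, 7)` is positive (the rotation hurts), while `cost_change(7, 1, 1)` is negative (the rotation helps).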
If the LL rotation was triggered by an insertion, $s(q)$ is at least one more than the
minimum number of nodes in an AVL tree of height $t = h(q) - 1$. So, $s(q) \ge \phi^{t+2}/\sqrt{5}$ where
$\phi = (1 + \sqrt{5})/2$. The maximum value for $s(d)$ is $2^t - 1$. So, an LL rotation has the potential
of increasing total search cost by as much as
$$2^t - 1 - \phi^{t+2}/\sqrt{5} \approx 2^t - 1 - 1.62^{t+2}/2.24.$$
This is negative for $t \le 2$ and positive for $t > 2$. When $t = 10$, for example, an LL
rotation may increase total search cost by as much as 877. As $t$ gets larger, the potential
increase in search cost gets much greater. This analysis is easily extended to the remaining
rotations and also to red/black trees.
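The bound can be evaluated directly. The sketch below (function names are illustrative) computes it both with the exact constants and with the rounded constants 1.62 and 2.24 used in the text; the figure 877 for $t = 10$ corresponds to the rounded form.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def max_increase(t):
    """2^t - 1 - phi^(t+2)/sqrt(5): potential increase in C(T) from an LL rotation."""
    return 2 ** t - 1 - PHI ** (t + 2) / math.sqrt(5)


def max_increase_rounded(t):
    """The same bound with phi rounded to 1.62 and sqrt(5) rounded to 2.24."""
    return 2 ** t - 1 - 1.62 ** (t + 2) / 2.24


for t in (1, 2, 3, 10, 20):
    print(t, round(max_increase(t), 2), round(max_increase_rounded(t), 2))
```

The sign change between $t = 2$ and $t = 3$ is visible in both columns, and the gap grows roughly like $2^t$ afterwards.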
Definition (WB($\alpha$) [NIEV73]) The balance, $B(p)$, of a node $p$ in a binary tree is the ratio
$(s(l) + 1)/(s(p) + 1)$ where $l$ is the left child of $p$. For $\alpha \in [0, 1/2]$, a binary tree $T$ is in
WB($\alpha$) iff $\alpha \le B(p) \le 1 - \alpha$ for every node $p$ in $T$. By definition, the empty tree is in WB($\alpha$)
for all $\alpha$.
Lemma 1 (1) The maximum height, $h_{max}(n)$, of an $n$ node tree in WB($\alpha$) is $\approx \log_{1/(1-\alpha)}(n + 1)$ [NIEV73].
(2) Inserts and deletes can be performed in an $n$ node tree in WB($\alpha$) in $O(\log n)$ time for
$2/11 < \alpha \le 1 - \sqrt{2}/2$ [BLUM80].
(3) Each search operation in an $n$ node tree in WB($\alpha$) takes $O(\log n)$ time [NIEV73].
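In the case of weight balanced trees WB($\alpha$), an LL rotation is performed when $B(gp) > 1 - \alpha$
and $B(p) > \alpha/(1 - \alpha)$ (see Figure 1(a)) [NIEV73]. So,
$$1 - \alpha < B(gp) = \frac{s(p) + 1}{s(gp) + 1} = \frac{s(p) + 1}{s(p) + s(d) + 2}$$
or
$$s(d) < \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$$
and
$$\frac{\alpha}{1 - \alpha} < B(p) = \frac{s(q) + 1}{s(p) + 1}$$
or
$$s(q) > \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}.$$
So, $s(q) > s(d)$ and LL rotations (and also RR) do not increase the search cost. For LR rotations
[NIEV73], $B(gp) > 1 - \alpha$ and $B(p) < \alpha/(1 - \alpha)$. So, $s(d) < \frac{\alpha}{1 - \alpha}\, s(p) + \frac{2\alpha - 1}{1 - \alpha}$ and, with
respect to Figure 1(b),
$$\frac{\alpha}{1 - \alpha} > B(p) = \frac{s(p) - s(q)}{s(p) + 1}$$
or
$$s(q) > \frac{1 - 2\alpha}{1 - \alpha}\, s(p) - \frac{\alpha}{1 - \alpha}.$$
For $\alpha \le 1/3$, $s(q) > s(d)$ and LR (RL) rotations do not increase search cost. Thus, in the
case of WB($\alpha$) trees, the rebalancing rotations do not increase search cost. This statement
remains true if the conditions for LL and LR rotation are changed to those in [BLUM80].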
While rotations do not increase the search cost of WB($\alpha$) trees, these trees miss
performing some rotations that would reduce search cost. For example, it is possible to have
$\alpha < B(gp) < 1 - \alpha$, $B(p) > \alpha/(1 - \alpha)$, and $s(q) > s(d)$. Since $B(gp)$ isn't high enough, an LL
rotation isn't performed. Yet, performing such a rotation would reduce search cost.
3 $\beta$-BBSTs
Definition A cost optimized search tree (COST) is a binary search tree whose search cost
cannot be reduced by performing a single LL, RR, LR, or RL rotation.
Theorem 1 If $T$ is a COST with $n$ nodes, its height is at most $\log_\phi(\sqrt{5}(n + 1)) - 2$.
Proof Let $N_h$ be the minimum number of nodes in a COST of height $h$. Clearly, $N_0 = 0$
and $N_1 = 1$. Consider a COST $Q$ of height $h \ge 2$ having the minimum number of nodes $N_h$.
$Q$ has one subtree $R$ whose height is $h - 1$ and another, $S$, whose height is $\le h - 1$. $R$ must
be a minimal COST of height $h - 1$ and so has $N_{h-1}$ nodes. $R$, in turn, must have one
subtree, $U$, of height $h - 2$ and another, $V$, of height $\le h - 2$. Both $U$ and $V$ are COSTs
as $R$ is a COST. Since $R$ is a minimal COST, $U$ is a minimal COST of height $h - 2$ and
so has $N_{h-2}$ nodes. Since $Q$ is a COST, $|S| \ge \max\{|U|, |V|\}$. We may assume that $N_h$ is
a nondecreasing function of $h$. So, $|S| \ge N_{h-2}$. Since $Q$ is a minimal COST of height $h$,
$|S| = N_{h-2}$. So,
$$N_h = N_{h-1} + N_{h-2} + 1, \quad h \ge 2$$
$$N_0 = 0, \quad N_1 = 1.$$
This recurrence is the same as that for the minimum number of nodes in an AVL tree
of height $h$. So, $N_h = F_{h+2} - 1$ where $F_i$ is the $i$'th Fibonacci number. Consequently,
$N_h \approx \phi^{h+2}/\sqrt{5} - 1$ and $h \le \log_\phi(\sqrt{5}(n + 1)) - 2$. □
Corollary 1 The maximum height of a COST with $n$ nodes is the same as that of an AVL
tree with this many nodes.
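The recurrence and its closed form can be tabulated with a few lines of code. This is an illustrative sketch (the function names are assumptions). Because the height formula rests on the approximation $N_h \approx \phi^{h+2}/\sqrt{5} - 1$, the sketch compares the computed value with $h$ after rounding rather than asserting an exact inequality.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def min_cost_nodes(h):
    """N_h from N_h = N_{h-1} + N_{h-2} + 1 with N_0 = 0 and N_1 = 1."""
    a, b = 0, 1  # N_0, N_1
    if h == 0:
        return a
    for _ in range(h - 1):
        a, b = b, a + b + 1
    return b


def fib(i):
    """i-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a


def height_estimate(n):
    """log_phi(sqrt(5) (n + 1)) - 2, the height expression of Theorem 1."""
    return math.log(math.sqrt(5) * (n + 1), PHI) - 2


for h in range(1, 8):
    n = min_cost_nodes(h)
    print(h, n, fib(h + 2) - 1, round(height_estimate(n), 3))
```

The second and third printed columns agree, and the last column stays within a small fraction of $h$.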
Definition Let $a$ and $b$ be the roots of two binary trees. $a$ and $b$ are $\beta$-balanced, $0 \le \beta \le 1$,
with respect to one another, denoted $\beta(a, b)$, iff
(a) $\beta(s(a) - 1) \le s(b)$
(b) $\beta(s(b) - 1) \le s(a)$
A binary tree $T$ is $\beta$-balanced iff the children of every node in $T$ are $\beta$-balanced.
A full binary tree is 1-balanced and a binary tree whose height equals its size (i.e., number
of nodes) is 0-balanced.
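The definition translates directly into a recursive check. In this sketch, which is only an illustration, a tree is represented as `None` or a `(left, right)` pair, and `Fraction` is used so that the comparisons are exact; both choices are assumptions rather than anything prescribed by the text.

```python
from fractions import Fraction


def balanced(beta, a, b):
    """beta(a, b) for two subtree sizes a and b."""
    return beta * (a - 1) <= b and beta * (b - 1) <= a


def check(tree, beta):
    """Return (size, is_beta_balanced) for a tree given as None or (left, right)."""
    if tree is None:
        return 0, True
    left_size, left_ok = check(tree[0], beta)
    right_size, right_ok = check(tree[1], beta)
    ok = left_ok and right_ok and balanced(beta, left_size, right_size)
    return left_size + right_size + 1, ok


def full(h):
    """Full binary tree of height h."""
    return None if h == 0 else (full(h - 1), full(h - 1))


def chain(n):
    """Degenerate tree whose height equals its size n."""
    return None if n == 0 else (chain(n - 1), None)


print(check(full(4), 1))                  # full tree, beta = 1
print(check(chain(5), 0))                 # chain, beta = 0
print(check(chain(5), Fraction(1, 3)))    # chain, beta = 1/3
```

The chain passes only for $\beta = 0$, which matches the two boundary examples given in the definition.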
Lemma 2 If the binary tree $T$ is $\beta$-balanced, then it is $\gamma$-balanced for $0 \le \gamma \le \beta$.
Proof Follows from the definition of balance. □
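Lemma 3 If the binary tree $T$ is $\beta$-balanced, $0 \le \beta \le 1/2$, then it is in WB($\alpha$) for
$\alpha = \beta/(1 + \beta)$.
Proof Consider any node $p$ in $T$. Let $l$ and $r$ be node $p$'s left and right children.
$$B(p) = \frac{s(l) + 1}{s(l) + s(r) + 2} = \frac{1}{1 + \frac{s(r) + 1}{s(l) + 1}}.$$
Since $T$ is $\beta$-balanced, $s(l) - 1 \le s(r)/\beta$ or $s(l) + 1 \le s(r)/\beta + 2$. So,
$$\frac{s(l) + 1}{s(r) + 1} \le \frac{1}{\beta} + \frac{2 - 1/\beta}{s(r) + 1} \le \frac{1}{\beta}$$
or
$$\frac{s(r) + 1}{s(l) + 1} \ge \beta.$$
So, $B(p) \le 1/(1 + \beta)$. Further, $s(r) - 1 \le s(l)/\beta$. So,
$$\frac{s(r) + 1}{s(l) + 1} \le \frac{1}{\beta}.$$
And, $B(p) \ge 1/(1 + 1/\beta) = \beta/(1 + \beta)$. Hence $\beta/(1 + \beta) \le B(p) \le 1/(1 + \beta)$ for every $p$ in
$T$. So, $T$ is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$. □
Figure 2: A tree in WB(1/4) that is not $\frac{1}{3}$-balanced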
Remark While every $\beta$-balanced tree, $0 \le \beta \le 1/2$, is in WB($\alpha$) for $\alpha = \beta/(1 + \beta)$,
there are trees in WB($\alpha$) that are not $\beta$-balanced. Figure 2 shows an example of a tree in
WB(1/4) that is not $\frac{1}{3}$-balanced.
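Lemma 3 and the remark can be spot-checked numerically over pairs of child sizes. The sketch below is an illustration only: it tests the balance condition at a single node for all size pairs up to an arbitrary `limit`, which is evidence rather than a proof, and the function names are assumptions.

```python
from fractions import Fraction


def beta_balanced(beta, a, b):
    """beta(a, b) for two subtree sizes a and b."""
    return beta * (a - 1) <= b and beta * (b - 1) <= a


def in_wb_range(alpha, sl, sr):
    """alpha <= B(p) <= 1 - alpha for a node with child sizes sl and sr."""
    balance = Fraction(sl + 1, sl + sr + 2)
    return alpha <= balance <= 1 - alpha


def lemma3_holds(beta, limit=60):
    """Check that beta(l, r) implies the WB(beta/(1+beta)) condition at the node."""
    alpha = beta / (1 + beta)
    return all(in_wb_range(alpha, sl, sr)
               for sl in range(limit)
               for sr in range(limit)
               if beta_balanced(beta, sl, sr))


print(lemma3_holds(Fraction(1, 3)), lemma3_holds(Fraction(1, 2)))
# Child sizes (0, 2) satisfy the WB(1/4) condition but are not 1/3-balanced.
print(in_wb_range(Fraction(1, 4), 0, 2), beta_balanced(Fraction(1, 3), 0, 2))
```

The size pair (0, 2) is one concrete witness for the remark; it is not necessarily the tree drawn in Figure 2.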
Lemma 4 If $T$ is a COST then $T$ is $\frac{1}{2}$-balanced.
Proof If $T$ is a COST, then every subtree of $T$ is a COST. Consider any subtree with
root $p$, left child $l$, and right child $r$. If neither $l$ nor $r$ exist, then $s(l) = s(r) = 0$ and $p$
is $\frac{1}{2}$-balanced. If $s(l) = 0$ and $s(r) > 1$, then $r$ has a nonempty subtree with root $t$ and
$s(t) > s(l)$. So $p$ is not a COST. Hence, $s(r) \le 1$ and $p$ is $\frac{1}{2}$-balanced. The same is true
when $s(r) = 0$. So, assume $s(l) > 0$ and $s(r) > 0$.
If $s(l) = 1$, then $s(r) \le 3$ as otherwise, one of the subtrees of $r$ has $m \ge 2$ nodes and
$m > s(l)$ implies $p$ is not a COST. Since $s(r) \le 3$, $\frac{1}{2}(s(r) - 1) \le s(l)$ and $\frac{1}{2}(s(l) - 1) \le s(r)$.
So, $p$ is $\frac{1}{2}$-balanced. The same proof applies when $s(r) = 1$. When $s(l) > 1$ and $s(r) > 1$,
let $a$ and $b$ be the roots of the left and right subtrees of $l$. Since $p$ is a COST, $s(a) \le s(r)$
and $s(b) \le s(r)$. So, $s(l) = s(a) + s(b) + 1 \le 2s(r) + 1$ and $\frac{1}{2}(s(l) - 1) \le s(r)$. Similarly,
$\frac{1}{2}(s(r) - 1) \le s(l)$. So, $\frac{1}{2}(l, r)$. Since this proof applies to every node in $T$, the children of
every $p$ are $\frac{1}{2}$-balanced and $T$ is $\frac{1}{2}$-balanced. □
Figure 3: A $\frac{1}{2}$-balanced tree that is not a COST
Remark There are $\frac{1}{2}$-balanced trees that are not COSTs (see Figure 3).
While a COST is in WB(1/3) and WB($\alpha$) trees can be maintained efficiently only for
$2/11 < \alpha \le 1 - 1/\sqrt{2} \approx 0.293$, a COST is better balanced than WB($\alpha$) trees with $\alpha$ in the
usable range. Unfortunately, we are unable to develop $O(\log n)$ insert/delete algorithms for
a COST.
In the next section, we develop insert and delete algorithms for $\beta$-balanced binary search
trees ($\beta$-BBSTs) for $0 \le \beta \le \sqrt{2} - 1$. Note that every $(\sqrt{2} - 1)$-BBST is in WB($\alpha$) for
$\alpha = 1 - 1/\sqrt{2}$, which is the largest permissible $\alpha$. Since our insert and delete algorithms
perform rotations along the search path whenever these result in improved search cost,
$\beta$-BBSTs are expected to have better search performance than WB($\alpha$) trees (for $\alpha = \beta/(1 + \beta)$).
Each node of a $\beta$-BBST has the fields LeftChild, Size, Data, and RightChild. Since every
$\beta$-BBST, $\beta > 0$, is in WB($\alpha$), for $\alpha > 0$, $\beta$-BBSTs have height that is logarithmic in $n$, the
number of nodes (provided $\beta > 0$).
4 Search, Insert, and Delete in a $\beta$-BBST
To reduce notational clutter, in the rest of the paper, we abbreviate s(a) by a (i.e., the node
name denotes subtree size).
4.1 Search
This is done exactly as in any binary search tree. Its complexity is $O(h)$ where $h$ is the
height of the tree. Notice that since each node has a size field, it is easy to perform a search
based on index (i.e., find the 10'th smallest key). Similarly, our insert and delete algorithms
can be adapted to indexed insert and delete.
Figure 4: LL rotation for insertion. (a) before; (b) after.
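Search by key and search by index can be sketched as follows. The node layout mirrors the four fields named in Section 3 (LeftChild, Size, Data, RightChild), but the Python attribute names and the 1-based `find_kth` convention are assumptions made for this illustration.

```python
class Node:
    """Node with the fields LeftChild, Size, Data, and RightChild."""

    def __init__(self, data, left=None, right=None):
        self.left_child = left
        self.right_child = right
        self.data = data
        self.size = 1 + subtree_size(left) + subtree_size(right)


def subtree_size(t):
    return 0 if t is None else t.size


def search(t, x):
    """Ordinary binary search tree search; returns the node or None."""
    while t is not None and t.data != x:
        t = t.left_child if x < t.data else t.right_child
    return t


def find_kth(t, k):
    """Return the node holding the k-th smallest key (1-based), or None."""
    while t is not None:
        left = subtree_size(t.left_child)
        if k == left + 1:
            return t
        if k <= left:
            t = t.left_child
        else:
            k -= left + 1  # skip the left subtree and the current node
            t = t.right_child
    return None


# Hypothetical five-key example.
tree = Node(40, Node(20, Node(10), Node(30)), Node(50))
print([find_kth(tree, k).data for k in range(1, 6)])
```

Both functions follow a single root-to-leaf path, so each takes $O(h)$ time.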
4.2 Insertion
To insert a new element $x$ into a $\beta$-BBST, we first search for $x$ in the $\beta$-BBST. This search
is unsuccessful (as $x$ is not in the tree) and terminates by falling off the tree. A new node $y$
containing $x$ is inserted at the point where the search falls off the tree. Let $p'$ be the parent
(if any) of the newly inserted node. We now retrace the path from $p'$ to the root performing
rebalancing rotations.
There are four kinds of rotations: LL, LR, RL, and RR. LL and RR rotations are symmetric
and so also are LR and RL rotations. The typical configuration before an LL rotation is
performed is given in Figure 4(a). $p'$ denotes the root of a subtree in which the insertion
was made. Let $p$ be the (size of the) subtree before the insertion. Then, since the tree was
a $\beta$-BBST prior to the insertion, $\beta(p, d)$. Also, for the LL rotation to be performed, we
require that $(q \ge c)$ and $(q > d)$. Note that $q > d$ implies $q \ge 1$. We shall see that $\beta(q, c)$
follows from the fact that the insertion is made into a $\beta$-BBST and from properties of the
rotation. Following an LL rotation, $p'$ is updated to be the node $p''$.
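The LL test and the rotation itself, including the update of the Size fields, might look as follows. This is a hedged sketch: the attribute and function names are assumptions, only the LL case is shown, and the surrounding retrace loop and the LR case are omitted.

```python
class Node:
    """Node carrying the subtree size, as in the Size field of the text."""

    def __init__(self, data, left=None, right=None):
        self.left, self.right, self.data = left, right, data
        self.size = 1 + size(left) + size(right)


def size(t):
    return 0 if t is None else t.size


def ll_applies(gp):
    """Insertion LL test (q >= c) and (q > d); gp.left plays the role of p'."""
    p = gp.left
    if p is None:
        return False
    q, c, d = size(p.left), size(p.right), size(gp.right)
    return q >= c and q > d


def rotate_ll(gp):
    """LL rotation at gp; returns the new subtree root (p'' in the text)."""
    p = gp.left
    gp.left, p.right = p.right, gp      # subtree c moves under gp
    gp.size = 1 + size(gp.left) + size(gp.right)
    p.size = 1 + size(p.left) + size(p.right)
    return p


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.data] + inorder(t.right)


# Hypothetical configuration: q has 2 nodes, c has 1 node, d has 1 node.
gp = Node(5, Node(3, Node(2, Node(1)), Node(4)), Node(6))
applies = ll_applies(gp)
root = rotate_ll(gp)
```

Only the two nodes whose children change need their sizes recomputed, and the old $gp$ must be updated before the new root because it is now a child of that root.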
Lemma 5 [LL insertion lemma] If $[\beta(p, d) \wedge \beta(q, c) \wedge (q \ge c) \wedge (q > d)]$ for $0 < \beta \le 1/2$
before the rotation, then $\beta(q, gp')$ and $\beta(c, d)$ after the rotation.
Proof Assume the before condition.
(a) $\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) < 2\beta q$ (as $\beta > 0$, $q \ge c$, and
$q > d$) $\le q$ (as $\beta \le 1/2$). So, $\beta(q, gp')$.
(b) $d < q \Rightarrow d - 1 < q - 1 \Rightarrow \beta(d - 1) < \beta(q - 1) \le c$ (as $\beta(q, c)$). Also, $\beta(c - 1) \le
\beta(q + c - 1) = \beta(p' - 2) = \beta(p - 1) \le d$ (as $\beta(p, d)$). So, $\beta(c, d)$. □
Figure 5: Substep (i) of insertion LR rotation. (a) before; (b) after substep (i).
In an LR rotation, the before configuration is as in Figure 4(a). However, this time $q < c$.
Figure 4(a) is redrawn in Figure 5(a). In this, the node labeled $c$ in Figure 4(a) has been
labeled $q$ and that labeled $q$ in Figure 4(a) has been labeled $a$. With respect to the labelings
of Figure 5(a), rotation LR is applied when
$$[(q > a) \wedge (q > d)].$$
The other conditions that apply when an LR rotation is performed are
$$[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c)].$$
Here $p$ denotes the (size of the) left subtree of $gp$ prior to the insertion. An LR rotation is
accomplished in two substeps (or two subrotations). The first of these is shown in Figure 5(b).
Following an LR rotation, $p'$ is updated to be node $q'$.
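Lemma 6 [LR substep(i) insertion lemma] If $[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (q > a) \wedge (q > d)]$
for $0 < \beta \le 1/2$ before the subrotation, then $[\beta(p'', gp') \wedge \{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d)) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d))\}]$
after the subrotation.
Proof Assume the before condition. First, we show that $\beta(p'', gp')$ after the rotation.
Note that $\beta(p'' - 1) = \beta(a + b) = \beta(a + b + c + 1) - \beta(c + 1) = \beta(p' - 1) - \beta(c + 1) =
\beta(p - 1) - \beta c \le d - \beta c \le d < gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le b + \beta + \beta d$ (as $\beta(b, c)$)
$\le b + \beta q$ (as $q > d$) $\le b + a + \beta$ (as $\beta(a, q)$) $< p''$ (as $\beta \le 1/2$ and $p'' = a + b + 1$). So,
$\beta(p'', gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b - 1) \le a$.
To see this, note that $\beta(b - 1) < \beta(q - 1) \le a$ (as $\beta(a, q)$).
P2: $\beta(c - 1) \le d$.
For this, observe that $p' - 1 = a + q \ge \beta(q - 1) + q$ (as $\beta(a, q)$) $= (\beta + 1)(q - 1) + 1$. So,
$q - 1 \le \frac{p' - 2}{\beta + 1} = \frac{p - 1}{\beta + 1}$. Similarly, $q - 1 = b + c \ge \beta(c - 1) + c$ (as $\beta(b, c)$) $= (\beta + 1)(c - 1) + 1$.
So,
$$\beta(c - 1) \le \frac{\beta(q - 2)}{\beta + 1} < \frac{\beta(q - 1)}{\beta + 1} \le \frac{\beta(p - 1)}{(\beta + 1)^2} \le \frac{d}{(\beta + 1)^2} \text{ (as } \beta(p, d)) \le d.$$
To complete the proof of the lemma, we need to show
$$\{(\beta(a, b) \wedge \tfrac{\beta}{1+\beta}(c, d)) \vee (\tfrac{\beta}{1+\beta}(a, b) \wedge \beta(c, d))\}.$$
We do this by considering the two cases $b \ge c$ and $b < c$.
Case $b \ge c$: Since $a < q = b + c + 1$, $\beta(a - 1) < \beta(b + c) \le 2\beta b \le b$. This and P1 imply $\beta(a, b)$.
Also, $d < q = b + c + 1$. So, $\frac{\beta}{1+\beta}(d - 1) \le \frac{\beta}{1+\beta}(b + c - 1) = \frac{\beta}{1+\beta}c + \frac{\beta}{1+\beta}(b - 1) \le \frac{\beta}{1+\beta}c + \frac{1}{1+\beta}c$
(as $\beta(b, c)$) $= c$. This, together with P2, implies $\frac{\beta}{1+\beta}(c, d)$. So, $\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d)$.
Case $b < c$: Since $a < q = b + c + 1$, $a - 1 < b + c$. So, $a - 1 \le b + c - 1$ or
$\frac{\beta}{1+\beta}(a - 1) \le \frac{\beta}{1+\beta}b + \frac{\beta}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta}b + \frac{1}{1+\beta}b$ (as $\beta(b, c)$) $= b$. This and P1 imply $\frac{\beta}{1+\beta}(a, b)$. Also,
$d - 1 \le q - 2 = b + c - 1$. So, $\beta(d - 1) \le \beta(b + c - 1) \le \beta(2c - 1) < c$. This, together with
P2, implies $\beta(c, d)$. So, $\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d)$. □
Figure 6: Case LL for LR(ii) rotation. (a) before; (b) after.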
Since an LR(i) rotation can cause the tree to lose its $\beta$-balance property, it is necessary
to follow this with another rotation that restores the $\beta$-balance property. It suffices to
consider the two cases of Figures 6 and 7 for this follow up rotation. The remaining cases
are symmetric to these. In Figures 6 and 7, $p$ and $d$ denote the nodes that do not satisfy
$\beta(p, d)$. Note, however, that these nodes do satisfy $\frac{\beta}{1+\beta}(p, d)$.
Since the follow up rotation to LR(i) is done only when
$$\neg\beta(p, d) \wedge \tfrac{\beta}{1+\beta}(p, d),$$
either $\beta(p - 1) > d$ or $\beta(d - 1) > p$. When $\beta(p - 1) > d$, the second substep rotation is one
of the two given in Figures 6 and 7. When $\beta(d - 1) > p$, rotations symmetric to these are
performed. In the following, we assume $\beta(p - 1) > d$. Further, we may assume $d > 0$, as
$d = 0$ and $\frac{\beta}{1+\beta}(p, d)$ imply $p \le 1$. Hence, $\beta(p, d)$. Also, $d > 0$ and $\beta(p - 1) > d$ imply $p > 1$.
The LR(ii) LL rotation is done when the condition
$$A = (q > d) \wedge (c \le (1 + \beta)q + (1 - \beta)) \wedge B$$
holds, where
$$B = \neg\beta(p, d) \wedge \tfrac{\beta}{1+\beta}(p, d) \wedge \beta(q, c) \wedge (\beta(p - 1) > d > 0).$$
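Lemma 7 [Case LR(ii) LL rotation] If $A$ holds before the rotation of Figure 6, then
$\beta(q, gp')$ and $\beta(c, d)$ after the rotation provided $0 \le \beta \le \sqrt{2} - 1$.
Proof (a) $\beta(q, gp')$:
$\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d) \le \beta((1 + \beta)q + (1 - \beta) + d) \le
\beta(1 + \beta)q + \beta(1 - \beta) + \beta(q - 1)$ (as $q > d$) $= \beta(2 + \beta)q - \beta^2 \le q$ (as $\beta(2 + \beta) \le 1$ for
$0 \le \beta \le \sqrt{2} - 1$). So, $\beta(q, gp')$.
(b) $\beta(c, d)$:
$\beta(d - 1) < \beta(q - 1) \le c$ (as $\beta(q, c)$). And, $\beta(c - 1) = \frac{\beta^2}{1+\beta}(c - 1) + \frac{\beta}{1+\beta}(c - 1) \le
\frac{\beta}{1+\beta}q + \frac{\beta}{1+\beta}(c - 1) = \frac{\beta}{1+\beta}(q + c - 1) = \frac{\beta}{1+\beta}(p - 2) < \frac{\beta}{1+\beta}(p - 1) \le d$ (as $\frac{\beta}{1+\beta}(p, d)$). So,
$\beta(c, d)$. □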
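Lemma 8 If $(c \le (1 + \beta)q + (1 - \beta)) \wedge (\beta(p - 1) > d)$ in Figure 6, then $d \le q$ provided
$0 \le \beta \le \sqrt{2} - 1$.
Proof Since $d < \beta(p - 1) = \beta(q + c) \le \beta(q + (1 + \beta)q + 1 - \beta) = \beta(\beta + 2)q + \beta(1 - \beta) \le q + 1$
(as $\beta(\beta + 2) \le 1$ and $\beta(1 - \beta) \le 1$ for $0 \le \beta \le \sqrt{2} - 1$), $d < q + 1$. So, $d \le q$. □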
So, the only time an LR(ii) LL rotation is not done is when $C = (C_1 \vee C_2) \wedge B$ holds
where
$$C_1 = (q = d) \wedge (c \le (1 + \beta)q + 1 - \beta)$$
$$C_2 = c > (1 + \beta)q + (1 - \beta).$$
At this time, the LR rotation of Figure 7 is done. In terms of the notation of Figure 7, the
condition $C$ becomes $D = (D_1 \vee D_2) \wedge E$ where
$$D_1 = (a = d) \wedge (q \le (1 + \beta)a + 1 - \beta)$$
$$D_2 = q > (1 + \beta)a + 1 - \beta$$
$$E = \neg\beta(p, d) \wedge \tfrac{\beta}{1+\beta}(p, d) \wedge \beta(a, q) \wedge \beta(b, c) \wedge (\beta(p - 1) > d > 0).$$
Figure 7: Case LR for LR(ii) rotation. (a) before; (b) after.
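Lemma 9 When an LR(ii) LR rotation is performed and $\beta \le \sqrt{2} - 1$, $q > d$ and so search
cost is reduced.
Proof If $D_1$, then since $d < \beta(p - 1) = \beta(a + q) = \beta(d + q)$, $q > d/\beta - d > d$ as $\beta \le \sqrt{2} - 1$.
If $D_2$, then $a < \frac{q - 1 + \beta}{1 + \beta}$ and so $d < \beta(p - 1) = \beta(a + q) < \beta\left(\frac{q - 1 + \beta}{1 + \beta} + q\right) = \frac{\beta(2 + \beta)q - \beta(1 - \beta)}{1 + \beta} < q$ (as
$\beta \le \sqrt{2} - 1$). □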
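Lemma 10 When $(d = a) \wedge \beta(b, c) \wedge (\beta(p - 1) > d) \wedge (\beta \le \sqrt{2} - 1)$ (see Figure 7),
$\beta(a - 1) \le b$ and $\beta(d - 1) \le c$.
Proof Since $\beta(p - 1) > d$ and $d = a$, $\beta(p - 1) > a$ or $\beta(a + q) > a$ or $a(1 - \beta) < \beta q$ or
$a < \frac{\beta}{1 - \beta}q$. So, $\beta(a - 1) < \frac{\beta^2}{1 - \beta}q - \beta = \frac{\beta^2}{1 - \beta}(b + c + 1) - \beta$.
If $c \le b/\beta + \beta$, then
$$\beta(a - 1) < \frac{\beta^2}{1 - \beta}\left(b + \frac{b}{\beta} + \beta + 1\right) - \beta
= \frac{\beta(\beta + 1)}{1 - \beta}b + \frac{\beta^2(\beta + 1)}{1 - \beta} - \beta
= \frac{\beta(\beta + 1)}{1 - \beta}b + \frac{\beta(\beta^2 + 2\beta - 1)}{1 - \beta}
\le b$$
(as $\frac{\beta(\beta + 1)}{1 - \beta} \le 1$ and $\beta^2 + 2\beta - 1 \le 0$ for $\beta \le \sqrt{2} - 1$).
Since $\beta(c - 1) \le b$, $c \le b/\beta + 1$. So,
$$\beta(a - 1) < \frac{\beta^2}{1 - \beta}(b + c + 1) - \beta \le \frac{\beta^2}{1 - \beta}\left(b + \frac{b}{\beta} + 2\right) - \beta = \frac{\beta(\beta + 1)}{1 - \beta}b + \frac{3\beta^2 - \beta}{1 - \beta}.$$
So,
$$a - 1 < \frac{\beta + 1}{1 - \beta}b + \frac{3\beta - 1}{1 - \beta}.$$
However, since $\beta^2 + 2\beta - 1 \le 0$ for $\beta \le \sqrt{2} - 1$, $(1 + \beta)/(1 - \beta) \le 1/\beta$ and $(3\beta - 1)/(1 - \beta) \le \beta$.
So, $a - 1 < b/\beta + \beta$. If $a \ge c + 1$, then $c \le a - 1 < b/\beta + \beta$. We have already shown that for
$c \le b/\beta + \beta$, $\beta(a - 1) \le b$. So, assume $a < c + 1$. Now, $a \le c$ and $\beta(a - 1) \le \beta(c - 1) \le b$ (as
$\beta(b, c)$). So, $\beta(a - 1) \le b$ in all cases. $\beta(a - 1) \le c$ may be shown in a similar way. Since
$a = d$, we get $\beta(d - 1) \le c$. □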
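Lemma 11 [Case LR(ii) LR rotation] If $D$ holds before the rotation of Figure 7, then
$\beta(p', gp')$, $\beta(a, b)$, and $\beta(c, d)$ following the rotation provided $0 \le \beta \le \sqrt{2} - 1$.
Proof (a) $\beta(p', gp')$:
$\beta(gp' - 1) = \beta(c + d) \le b + \beta + \beta d$ (as $\beta(b, c)$) $< b + \beta + \beta q$ (from Lemmas 9 and 10, $q > d$)
$\le b + \beta + a + \beta = a + b + 2\beta < a + b + 1 = p'$. Also, since $\frac{\beta}{1+\beta}(p, d)$ and $q > d$, $\beta(p - 1) \le (\beta + 1)d$
or $\beta(a + q) \le (\beta + 1)d$ or $a + q \le (1 + 1/\beta)d$ or $a \le (1 + 1/\beta)d - q < (1 + 1/\beta)d - d = d/\beta$. So,
$\beta(p' - 1) = \beta(a + b) < d + \beta b \le d + c + \beta$ (as $\beta(b, c)$) $< gp'$. So, $\beta(p', gp')$.
(b) $\beta(a, b)$:
Since $b < q$ and $\beta(a, q)$, $\beta(b - 1) < \beta(q - 1) \le a$.
When $D_1$, $\beta(a - 1) \le b$ was proved in Lemma 10. So, $\beta(a, b)$.
When $D_2$, $q > a(1 + \beta) + 1 - \beta$. So,
$$a < \frac{q}{1 + \beta} - \frac{1 - \beta}{1 + \beta} = \frac{b + c + 1}{1 + \beta} - \frac{1 - \beta}{1 + \beta}$$
and
$$\beta(a - 1) < \frac{\beta b + \beta c + \beta}{1 + \beta} - \frac{\beta(1 - \beta)}{1 + \beta} - \beta \le \frac{\beta b + b + 2\beta}{1 + \beta} - \frac{\beta(1 - \beta)}{1 + \beta} - \beta = b.$$
So, $\beta(a, b)$.
(c) $\beta(c, d)$:
Note that $\beta(c - 1) < \beta(q - 1) < \frac{\beta}{1+\beta}(p - 1) \le d$ (the middle inequality uses $p - 1 = a + q$ and $a \ge \beta(q - 1)$ from $\beta(a, q)$).
When $D_1$, $\beta(d - 1) \le c$ was proved in Lemma 10. So, $\beta(c, d)$.
When $D_2$, if $d < b + 1$, then $d \le b$ and $\beta(d - 1) \le \beta(b - 1) \le c$. So, assume $d \ge b + 1$. Now,
$$\begin{aligned}
b \le d - 1 &< \beta(a + b + c + 1) - 1 \\
&< \beta\left(\frac{q - 1 + \beta}{1 + \beta} + b + c + 1\right) - 1 \\
&= \frac{\beta}{1 + \beta}\bigl(b + c + \beta + (1 + \beta)(b + c + 1)\bigr) - 1 \\
&\le \frac{\beta}{1 + \beta}\left(\frac{c}{\beta} + 1 + c + \beta + (1 + \beta)\left(\frac{c}{\beta} + c + 2\right)\right) - 1 \quad \text{(as } \beta(b, c)) \\
&= (2 + \beta)c + 3\beta - 1 \\
&\le (2 + \beta)c + \beta \quad \text{(as } \beta \le 1/2) \\
&\le \frac{c}{\beta} + \beta \quad \text{(as } \beta \le \sqrt{2} - 1).
\end{aligned}$$
Also, from $d < \beta(p - 1)$ and the above derivation, we get
$$\begin{aligned}
d &< \frac{\beta}{1 + \beta}\bigl(b + c + \beta + (1 + \beta)(b + c + 1)\bigr) \\
&< \frac{\beta}{1 + \beta}\left(\frac{c}{\beta} + \beta + c + \beta + (1 + \beta)\left(\frac{c}{\beta} + \beta + c + 1\right)\right) \\
&= (2 + \beta)c + \frac{\beta^3 + 4\beta^2 + \beta}{1 + \beta} \\
&< (2 + \beta)c + 1 \quad \text{(as } \beta^3 + 4\beta^2 + \beta < 1 + \beta \text{ for } \beta \le \sqrt{2} - 1).
\end{aligned}$$
So, $\beta(d - 1) < \beta(2 + \beta)c \le c$ (as $\beta \le \sqrt{2} - 1$). So, $\beta(c, d)$. □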
Theorem 2 If $T$ is $\beta$-balanced, $0 \le \beta \le \sqrt{2} - 1$, prior to insertion, it is so following the
insertion.
Proof First note that since all binary search trees are $\beta$-balanced for $\beta = 0$, the rotations
(while unnecessary) preserve 0-balance. So, assume $\beta > 0$. Consider the tree $T'$ just after
the new element has been inserted but before the backward restructuring pass begins.
If the newly inserted node, $z$, has no parent in $T'$, then $T$ was empty and $T'$ is $\beta$-balanced.
If $z$ has a parent but no grandparent, then $T$ has at most one nonempty subtree $X$. Since
$T$ is $\beta$-balanced, $\beta(X - 1) \le 0$. So, $X \le 1$. Following the insertion, $T'$ has one subtree
with $\le 1$ nodes and one with exactly one. So, $T'$ is $\beta$-balanced. We may therefore assume
that $z$ has a grandparent in $T'$.
From the downward insertion path, it follows that all nodes $u$ in $T'$ that have children
$l$ and $r$ for which $\neg\beta(l, r)$ must lie on the path from the root to $z$. During the backward
restructuring pass, each node on this path (other than $z$ and its parent) plays the role of $gp$
in Figures 4 and 5. The $\beta$-property cannot be violated at $z$ as $z$ has no children. It cannot
be violated at the parent, $s$, of $z$ as $s$ satisfied the $\beta$-property prior to insertion. As a result,
its other subtree has $\le 1$ element. So, following the insertion, $s$ satisfies the $\beta$-property. As
a result, each node in $T'$ that might possibly violate the $\beta$-property becomes the $gp$ node
during the restructuring pass. Consider one such $gp$ node. It has children in $T'$ denoted by
$p'$ and $d$. Its children in $T$ are $p$ and $d$. Figures 4 and 5 show the case when $d$ is the right
subtree of $gp$ in both $T$ and $T'$. The cases RR and RL arise when $d$ is the left subtree.
During the restructuring pass, $gp$ begins at the grandparent of $z$ and moves up to the
root of $T'$. If $z$ is at level $r$ in $T'$ (the root being at level 1), then $gp$ takes on $r - 2$ values
during the restructuring pass. We shall show that at each of these $r - 2$ positions either
(a) no rotation is performed and all descendants of $gp$ satisfy the $\beta$-property or
(b) a rotation is performed and following this, all descendants of node $p''$ (Figure 4) or of
node $q'$ (Figure 5) satisfy the $\beta$-property.
As a result, following the rotation (if any) performed when $gp$ becomes the root of $T'$, the
restructured tree is $\beta$-balanced. The proof is by induction on $r$. When $r = 3$ (recall, we
assume $z$ has a grandparent), $gp$ begins at the root of $T'$ and its descendants satisfy the
$\beta$-property.
Without loss of generality, assume that the insertion took place in the left subtree of $gp$.
With respect to Figure 4, we have three cases: (i) $q \ge c$ and $q > d$, (ii) $q < c$ and $c > d$,
and (iii) $q \le d$ and $c \le d$. In case (i), all conditions for an LL rotation hold and such a
rotation is performed. In case (ii), an LR rotation is performed. Following either rotation,
$T'$ is $\beta$-balanced. In case (iii), $\beta(p' - 1) = \beta(q + c) \le 2\beta d \le d$ (as $\beta \le \sqrt{2} - 1$). Also,
$\beta(d - 1) \le p < p + 1 = p'$. So, $\beta(d - 1) \le p'$. Hence, $\beta(p', d)$ and $T'$ is $\beta$-balanced.
For the induction hypothesis, assume (a) and (b) whenever $r \le k$. In the induction step,
we show (a) and (b) for trees $T$ with $r = k + 1$. The subtree in which the insertion is done
has $r = k$. So, (a) and (b) hold for all $gp$ locations in the subtree. We need to show (a) and
(b) only when $gp$ is at the root of $T'$. This follows from Lemmas 5, 6, 7, and 11.
The theorem now follows. □
Lemma 12 The time needed to do an insertion in an $n$ node $\beta$-BBST is $O(\log n)$ provided
$0 < \beta \le \sqrt{2} - 1$.
Proof Follows from the fact that insertion takes $O(h)$ time where $h$ is the tree height and
$h = O(\log n)$ when $\beta > 0$ (Lemmas 1 and 3). □
4.3 Deletion
To delete element $x$ from a $\beta$-BBST, we first use the unbalanced binary search tree deletion
algorithm of [HORO94] to delete $x$ and then perform a series of rebalancing rotations. The
steps are:
Step 1 [Locate $x$] Search the $\beta$-BBST for the node $y$ that contains $x$. If there is no such
node, terminate.
Step 2 [Delete $x$] If $y$ is a leaf, set $d'$ to nil, $gp$ to the parent of $y$, and delete node $y$. If
$y$ has exactly one child, set $d'$ to be this child; change the pointer from the parent (if
any) of $y$ to point to the child of $y$; delete node $y$; set $gp$ to be the parent of $d'$. If $y$
has two children, find the node $z$ in the left subtree of $y$ that has largest value; move
this value into node $y$; set $y = z$; go to the start of Step 2. { note that the new $y$ has
either 0 or 1 child }
Step 3 [Rebalance] Retrace the path from $d'$ to the root performing rebalancing rotations.
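Steps 1 and 2 can be sketched as below. This is an illustration under assumptions, not the report's code: the names are hypothetical, the Size fields along the path are decremented here for convenience, and Step 3 (the rebalancing rotations) is deliberately left out. The returned `path` lists the nodes that Step 3 would retrace, from the root down to the parent of the removed node.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.left, self.right, self.data = left, right, data
        self.size = 1 + size(left) + size(right)


def size(t):
    return 0 if t is None else t.size


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.data] + inorder(t.right)


def delete(root, x):
    """Steps 1 and 2 only. Returns (new_root, path); path is empty if x is absent."""
    path, y = [], root
    while y is not None and y.data != x:            # Step 1: locate x
        path.append(y)
        y = y.left if x < y.data else y.right
    if y is None:
        return root, []
    if y.left is not None and y.right is not None:  # Step 2: two children
        path.append(y)
        z = y.left
        while z.right is not None:                  # largest value in left subtree
            path.append(z)
            z = z.right
        y.data = z.data
        y = z                                       # the new y has 0 or 1 child
    child = y.left if y.left is not None else y.right   # d' (None for a leaf)
    if not path:
        root = child
    elif path[-1].left is y:
        path[-1].left = child
    else:
        path[-1].right = child
    for node in path:                               # every ancestor loses one node
        node.size -= 1
    return root, path
```

Replacing a two-child node by its in-order predecessor reduces every case to the removal of a node with at most one child, which is why the pointer update at the end is written only once.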
There are four rebalancing rotations: LL, LR, RR, and RL. Since LL and RR as well as
LR and RL are symmetric rotations, we describe LL and LR only. The discussion is very
similar to the case of insertion. The differences in proofs are due to the fact that a deletion
reduces the size of encountered subtrees by 1 while an insertion increases it by 1. In an
LL rotation, the configuration just before and after the rotation is shown in Figure 8. This
rotation is performed when $q \ge c$ and $q > d'$. Following the rotation, $d'$ is updated to the
node $p'$.
Let $d$ denote the size of the right subtree of $gp$ before the deletion. So, $d = d' + 1$. Since
prior to the deletion the $\beta$-BBST was $\beta$-balanced, it follows that $\beta(p, d)$ and $\beta(q, c)$.
Lemma 13 [LL deletion lemma] If $[\beta(p, d) \wedge \beta(q, c) \wedge (q \ge c) \wedge (q \ge d) \wedge (1/3 \le \beta \le 1/2)]$
before the rotation, then $[\beta(q, gp') \wedge \beta(c, d')]$ after the rotation.
Proof (a) $\beta(q, gp')$:
$\beta(q - 1) \le c$ (as $\beta(q, c)$) $< gp'$. Also, $\beta(gp' - 1) = \beta(c + d') < 2\beta q$ (as $c \le q$ and $d' < q$)
$\le q$ (as $\beta \le 1/2$). So, $\beta(q, gp')$.
(b) $\beta(c, d')$:
$d' < q \Rightarrow d' - 1 < q - 1 \Rightarrow \beta(d' - 1) < \beta(q - 1) \le c$. Also, when $c \le 1$, $\beta(c - 1) \le 0 \le d'$ (as
$d' \ge 0$). When $c > 1$, $q \ge c \Rightarrow q \ge 2$ and $p = q + c + 1 \ge c + 3$. So, $\beta(c - 1) \le \beta(p - 1) - 3\beta \le
d - 3\beta$ (as $\beta(p, d)$) $\le d - 1$ (as $\beta \ge 1/3$) $= d'$. Hence, $\beta(c, d')$. □
Figure 8: LL rotation for deletion. (a) before; (b) after.
In an LR rotation, the before configuration is as in Figure 8(a). However, this time $q < c$.
Figure 8(a) is redrawn in Figure 9(a). In this, the node labeled $c$ in Figure 8(a) has been
relabeled $q$ and that labeled $q$ in Figure 8(a) has been relabeled $a$. With respect to the
labelings of Figure 9(a), rotation LR is applied when
$$[(q > a) \wedge (q > d')].$$
The other conditions that apply when an LR rotation is performed are
$$[\beta(p, d) \wedge \beta(a, q) \wedge \beta(b, c)].$$
Here $d$ denotes the (size of the) right subtree of $gp$ prior to the deletion. As in the case of
insertion, an LR rotation is accomplished in two substeps (or two subrotations). The first
of these is shown in Figure 9. Following an LR rotation, $d'$ is updated to node $q'$.
Figure 9: LR rotation for deletion. (a) before; (b) after substep (i).
Lemma 14 [LR substep(i) deletion lemma] If [f(p, d) A(a, q) A(b, c)A(q > a)A(q > d')]
before the subrotation LR(i), then [/(p', gp') A{(t(a, b)A ~~(c, d'))V(i/ (a, b) A/(c, d'))}]
after the subrotation provided 1/3 < 3 < 1/2
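Proof Assume the before condition.
(a) If $b = c = 0$, then $q = b + c + 1 = 1$. Furthermore, $(q > a)$ and $(q > d')$ imply $a = d' = 0$.
So, $gp' = p' = 1$. Hence, $[\beta(p', gp') \wedge \beta(a, b) \wedge \beta(c, d')]$.
(b) If $b = 1$ and $c = 0$, then $q = 2$, $a \le 1$, and $d' \le 1$. So, $1 \le p' \le 3$ and $1 \le gp' \le 2$.
Hence, $[\beta(p', gp') \wedge \beta(a, b) \wedge \beta(c, d')]$.
(c) If $b = 0$ and $c = 1$, then $q = 2$, $a \le 1$, and $d' \le 1$. So, $1 \le p' \le 2$ and $1 \le gp' \le 3$. Hence,
$[\beta(p', gp') \wedge \beta(a, b) \wedge \beta(c, d')]$.
As a result of (a)-(c), to complete the proof, we may assume that $b \ge 1$ and $c \ge 1$. So,
$q \ge 3$, $a \ge 1$ (as $\beta(a, q)$ implies $\beta(q - 1) \le a$ or $a \ge 2\beta > 0$), $p = a + q + 1 \ge 5$, $d \ge 2$ (as
$\beta(p, d)$ implies $\beta(p - 1) \le d$ and $\beta \ge 1/3$), and $d' = d - 1 \ge 1$.
First, we show that $\beta(p', gp')$. For this, note that $a + b + c + 1 = p - 1$. From $\beta(p, d)$, it
follows that $\beta(a + b + c + 1) = \beta(p - 1) \le d$. So, $\beta(a + b) \le d - \beta c - \beta$. From Figure 9(b), we
see that $\beta(p' - 1) = \beta(a + b)$. Hence, $\beta(p' - 1) \le d - \beta c - \beta = d' - \beta c + 1 - \beta \le d' + 1 - 2\beta < gp'$.
Also,
$$\begin{aligned}
\beta(gp' - 1) = \beta(c + d') &\le b + \beta + \beta d' && (\text{as } \beta(b, c)) \\
&< b + \beta + \beta q && (\text{as } q > d') \\
&\le b + a + 2\beta && (\text{as } \beta(a, q)) \\
&\le p'.
\end{aligned}$$
So, $\beta(p', gp')$.
Next, we prove two properties that will be used to complete the proof.
P1: $\beta(b - 1) \le a$.
To see this, note that $\beta(b - 1) \le \beta(q - 1) \le a$ (as $\beta(a, q)$).
P2: $\beta(c - 1) \le d'$.
For this, observe that $\beta(c - 1) \le \beta(q - 2)$ (as $c \le q - 1$) $\le \beta(p - 4)$ (as $q = p - a - 1$ and
$a \ge 1$) $= \beta(p - 1) - 3\beta \le d - 1$ (as $\beta \ge 1/3$) $= d'$.
To complete the proof of the lemma, we need to show
$\{(\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')) \vee (\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d'))\}$.
For this, consider the two cases $b \ge c$ and $b < c$ (as in Lemma 6).
Case $b \ge c$: Since $a < q = b + c + 1$, $\beta(a - 1) < \beta(b + c) \le 2\beta b \le b$. This, together with P1,
implies $\beta(a, b)$. Also, $d' < q = b + c + 1$. So,
$\frac{\beta}{1+\beta}(d' - 1) \le \frac{\beta}{1+\beta}(b + c - 1) = \frac{\beta}{1+\beta}c + \frac{\beta}{1+\beta}(b - 1) \le \frac{\beta}{1+\beta}c + \frac{1}{1+\beta}c = c$.
This, together with P2, implies $\frac{\beta}{1+\beta}(c, d')$. So, $\beta(a, b) \wedge \frac{\beta}{1+\beta}(c, d')$.
Case $b < c$: Since $a < q = b + c + 1$, $a \le b + c$. So, $a - 1 \le b + c - 1$ or
$\frac{\beta}{1+\beta}(a - 1) \le \frac{\beta}{1+\beta}b + \frac{\beta}{1+\beta}(c - 1) \le \frac{\beta}{1+\beta}b + \frac{1}{1+\beta}b = b$.
This and P1 imply $\frac{\beta}{1+\beta}(a, b)$. Also, $d' - 1 \le q - 2 = b + c - 1$.
So, $\beta(d' - 1) \le \beta(b + c - 1) < \beta(2c - 1) < c$. This and P2 imply $\beta(c, d')$. Hence,
$\frac{\beta}{1+\beta}(a, b) \wedge \beta(c, d')$. □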
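The lemma can also be checked exhaustively on small subtree sizes. The sketch below is not part of the original report. It assumes that $\beta(x, y)$ on sizes means $\beta(x - 1) \le y$ and $\beta(y - 1) \le x$, that the relaxed conjunct is the same test with parameter $\beta/(1+\beta)$, and that sizes follow Figure 9 ($q = b + c + 1$, $p = a + q + 1$, $p' = a + b + 1$, $gp' = c + d' + 1$). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def balanced(beta, x, y):
    """Assumed beta-balance test on two subtree sizes x and y."""
    return beta * (x - 1) <= y and beta * (y - 1) <= x

def lemma14_holds(beta, a, b, c, d):
    """Return False only if sizes (a, b, c, d) contradict Lemma 14."""
    q = b + c + 1            # size of subtree q before the subrotation
    p = a + q + 1            # size of subtree p before the subrotation
    d1 = d - 1               # d', right subtree of gp after the deletion
    before = (balanced(beta, p, d) and balanced(beta, a, q)
              and balanced(beta, b, c) and q > a and q > d1)
    if not before:
        return True          # the lemma makes no claim for these sizes
    p1 = a + b + 1           # p' after substep (i)
    gp1 = c + d1 + 1         # gp' after substep (i)
    relaxed = beta / (1 + beta)
    return balanced(beta, p1, gp1) and (
        (balanced(beta, a, b) and balanced(relaxed, c, d1))
        or (balanced(relaxed, a, b) and balanced(beta, c, d1)))

def find_counterexample(max_size=8):
    """Search all sizes up to max_size for a few beta in [1/3, 1/2]."""
    betas = [Fraction(1, 3), Fraction(2, 5), Fraction(207, 500), Fraction(1, 2)]
    for beta in betas:
        for a, b, c in product(range(max_size + 1), repeat=3):
            for d in range(1, max_size + 2):   # d >= 1: an element was deleted from d
                if not lemma14_holds(beta, a, b, c, d):
                    return beta, a, b, c, d
    return None
```

Exact rational arithmetic (`Fraction`) is used so that boundary cases such as $\beta = 1/3$ are not blurred by floating point rounding. A search of this kind only supports the lemma on the sizes tried; it does not replace the proof.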
The substep(ii) rotations are the same as for insertion.
Theorem 3 If $T$ is $\beta$-balanced, then following a deletion the resulting tree $T'$ is also
$\beta$-balanced provided $1/3 \le \beta \le \sqrt{2} - 1$.
Proof Similar to that of Theorem 2. □
When $0 < \beta < 1/3$, we need to augment the LL rotation by a transformation for the case
$d' = 0$. When $d' = 0$, $\beta(p - 1) \le d = d' + 1 = 1$. So, $p \le 1/\beta + 1$ and $gp = p + d' + 1 \le 1/\beta + 2$.
To $\beta$-balance at $gp$, the at most $1/\beta + 2$ nodes in $gp$ are rearranged into any $\beta$-BBST in
constant time (as $1/\beta + 2$ is a constant). When $d' > 0$, the proof of Lemma 13 part (b) can be
changed to show $\beta(c - 1) \le d'$ for $0 < \beta \le \sqrt{2} - 1$. The new proof is: since $c < q$, $c < (p - 1)/2$
and $\beta(c - 1) < \beta(p - 1)/2 - \beta \le d/2 - \beta = d - d/2 - \beta \le d - 1 - \beta < d'$. The LR rotation
needs to be augmented by a transformation for the case $d' = d - 1 < \frac{1}{\beta(2+\beta)} - 1$. At this
time, $\beta(p - 1) \le d < \frac{1}{\beta(2+\beta)}$. So, $gp = p + d < \frac{1}{\beta^2(2+\beta)} + \frac{1}{\beta(2+\beta)} + 1$. To $\beta$-balance at $gp$, we
rearrange the fewer than $\frac{1}{\beta^2(2+\beta)} + \frac{1}{\beta(2+\beta)} + 1$ nodes in the subtree, in constant time, into any
$\beta$-balanced tree. When $d' \ge \frac{1}{\beta(2+\beta)} - 1$, the proof for $\beta(c - 1) \le d'$ in Lemma 14 needs to be
changed to show that the LR substep(i) lemma holds. The new proof is:
$$\begin{aligned}
d \ge \beta(p - 1) = \beta(a + b + c + 1) &\ge \beta(\beta(q - 1) + b + c + 1) \\
&= \beta(\beta(b + c) + b + c + 1) \\
&\ge \beta((1 + \beta)\beta(c - 1) + (1 + \beta)c + 1) \\
&= \beta((1 + \beta)^2(c - 1) + 2 + \beta).
\end{aligned}$$
So, $\beta(c - 1) \le \frac{d - \beta(2+\beta)}{(1+\beta)^2} \le d - 1$ (as $d \ge \frac{1}{\beta(2+\beta)}$) $= d'$.
Also, note that when $\beta = 0$, all trees are $\beta$-balanced so the rotations (while not needed)
preserve balance.
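The threshold $d \ge \frac{1}{\beta(2+\beta)}$ used in the new proof of $\beta(c - 1) \le d'$ is exactly what makes the bound $\frac{d - \beta(2+\beta)}{(1+\beta)^2}$ at most $d - 1$. The following worked step is a reconstruction of that algebra for $\beta > 0$, not text from the original report:

```latex
\begin{align*}
\frac{d-\beta(2+\beta)}{(1+\beta)^2} \le d-1
&\iff d-\beta(2+\beta) \le (1+\beta)^2 d-(1+\beta)^2 \\
&\iff (1+\beta)^2-\beta(2+\beta) \le \bigl((1+\beta)^2-1\bigr)\,d \\
&\iff 1 \le \beta(2+\beta)\,d \\
&\iff d \ge \frac{1}{\beta(2+\beta)} .
\end{align*}
```

The third line uses $(1+\beta)^2 - \beta(2+\beta) = 1$ and $(1+\beta)^2 - 1 = \beta(2+\beta)$.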
Theorem 4 With the special handling of the case $d' = 0$, the tree $T'$ resulting from a deletion
in a $\beta$-BBST is also $\beta$-balanced for $0 \le \beta \le \sqrt{2} - 1$.
Lemma 15 The time needed to delete an element from an $n$ node $\beta$-BBST is $O(\log n)$
provided $0 < \beta \le \sqrt{2} - 1$.
4.4 Enhancements
Since our objective is to create search trees with minimum search cost, the rebalancing
rotations may be performed at each positioning of gp during the backward restructuring
pass so long as the conditions for the rotation apply rather than only at gp positions where
the tree is unbalanced.
Consider Figure 4(a). If p' < d, then the conditions of Lemmas 5 and 6 cannot apply as
q < p' < d. However, it is possible that e > p' where e is the size of either the left or right
subtree of d. In this case, an RR or RL rotation would reduce the total search cost. The
proofs of Lemmas 5 and 6 are easily extended to show that these rotations would preserve
balance even though no insertion was done in the subtree d. The same observation applies
to deletion. Hence the backward restructuring pass for the insert and delete operations can
determine the need for a rotation at each $gp$ location as below ($l$ and $r$ are, respectively, the
left and right children of $gp$).
if s(l) > s(r) then check conditions for an LL and LR rotation
else check conditions for an RR and RL rotation.
The enhanced restructuring procedure used for insertion and deletion is given in Figure 10.
In the RR and RL cases, we have used the relation '≥' rather than '>' as this results in
better observed run time.
Since it can be shown that the rotations preserve balance even when there has been no
insert or delete, we may check the rotation conditions during a search operation and perform
rotations when these improve total search cost.
Finally, we note that it is possible to use other definitions of $\beta$-balance. For example,
we could require $\beta(s(a) - 2) < s(b)$ and $\beta(s(b) - 2) < s(a)$ for $\beta(a, b)$. One can show that
the development of this paper applies to these modifications also. Furthermore, when this
new definition is used, the number of comparisons in the second substep of the LR and RL
rotations is reduced by one.
procedure Restructuring ;
begin
  while (gp) do
  begin
    if s(gp.left) > s(gp.right) then { check conditions for an LL and LR rotation }
    begin
      p = gp.left ;
      if (s(p.left) > s(p.right)) then
      begin if (s(p.left) > s(gp.right)) then do LL rotation ; end
      else
      begin
        if (s(p.right) > s(gp.right)) then { LR }
        begin
          do LR rotation ;
          { now notations a, b, c, and d follow from figure l(b) }
          if (β(s(a) - 1) > s(b)) then
            if ((s(a.right) < (1 + β)s(a.left) + 1 - β) and
                (s(b) < s(a.left))) then
              do LL rotation
            else do LR rotation
          else if (β(s(d) - 1) > s(c)) then
            if ((s(d.left) < (1 + β)s(d.right) + 1 - β) and
                (s(c) < s(d.right))) then
              do RR rotation
            else do RL rotation ;
        end
      end
    end
    else { check conditions for an RR and RL rotation }
    begin
      p = gp.right ;
      if (s(p.left) ≥ s(p.right)) then
      begin
        if (s(p.left) ≥ s(gp.left)) then { RL }
          do symmetric to the above LR case ;
      end
      else
      begin if (s(p.right) ≥ s(gp.left)) then do RR rotation ; end ;
    end ;
    gp = gp.parent ;
  end ;
end ;
Figure 10: Restructuring procedure
4.5 Top Down Algorithms
As in the case of red/black and WB($\alpha$) trees, it is possible to perform, in $O(\log n)$ time,
inserts and deletes using a single top to bottom pass. The algorithms are similar to those
already presented.
5 Simple $\beta$-BBSTs
The development of Section 4 was motivated by our desire to construct trees with minimal
search cost. If instead, we desire only logarithmic performance per operation, we may simplify
the restructuring pass so that rotations are performed only at nodes where the $\beta$-balance
property is violated. In this case, we may dispense with the LL/RR rotations and the first
substep of an LR/RL rotation. Only LR/RL substep (ii) rotations are needed. To see this,
observe that Lemmas 7 and 11 show that the second substep rotations rebalance at $gp$ (see
Figures 6 and 7) provided $\frac{\beta}{1+\beta}(p, d)$. (The remaining conditions are ensured by the bottom-up
nature of restructuring and the fact that the tree was $\beta$-balanced prior to the insert or delete.)
If the operation that resulted in loss of balance at $gp$ was an insert, then $\beta(p - 2) \le d$
(as $p > d$, the insert took place in subtree $p$ and $gp$ was $\beta$-balanced prior to the insert)
and $\beta(p - 1) > d$ ($gp$ is not $\beta$-balanced following the insert). For the substep (ii) rotation
to restore balance, we need $\beta(p - 1) \le (1 + \beta)d$. This is assured if $d + \beta \le (\beta + 1)d$ (as
$\beta(p - 2) \le d$). So, we need $d \ge 1$. If $d < 1$, then $d = 0$. Now $\beta(p - 2) \le d$ and $\beta(p - 1) > d$
imply $p = 2$. One may verify that when $p = 2$, the LR(ii) rotations restore balance.
If the loss of $\beta$-balance at $gp$ is the result of a deletion (say from its right subtree), then
$\beta(p - 1) \le d + 1$ (as $gp$ was $\beta$-balanced prior to the delete). For the substep (ii) rotation to
accomplish the rebalancing, we need $\beta(p - 1) \le (\beta + 1)d$. This is guaranteed if $d + 1 \le (\beta + 1)d$
or $d \ge 1/\beta$. When $d < 1/\beta$ and $\beta \ge 1/3$, $d \le 2$. Since $\beta(p - 1) \le d + 1$ and $\beta \ge 1/3$,
when $d = 2$, $p \le 10$; when $d = 1$, $p \le 7$; and when $d = 0$, $p \le 4$. We may verify that for all
these cases, the LR(ii) rotations restore balance. Hence, the only problematic case is when
$\beta < 1/3$ and $d < 1/\beta$.
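The small cases listed for deletion follow directly from $\beta(p - 1) \le d + 1$ at the smallest permitted value $\beta = 1/3$. The sketch below, which is not from the original report and uses hypothetical function names, reproduces that arithmetic with exact rationals.

```python
from fractions import Fraction
from math import floor

def max_p_after_delete(d, beta=Fraction(1, 3)):
    """Largest integer p satisfying beta * (p - 1) <= d + 1."""
    return floor((d + 1) / beta) + 1

def small_cases(beta=Fraction(1, 3)):
    """(d, largest p) for every d < 1/beta, i.e. the cases verified by hand."""
    # range(0, 10) is ample here because 1/beta <= 3 whenever beta >= 1/3
    ds = [d for d in range(0, 10) if d < 1 / beta]
    return [(d, max_p_after_delete(d, beta)) for d in ds]
```

For $\beta = 1/3$ this gives the pairs $(0, 4)$, $(1, 7)$, and $(2, 10)$. Larger values of $\beta$ give smaller bounds on $p$, so $\beta = 1/3$ is the worst case in the range $\beta \ge 1/3$.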
procedure Restructuring2 ;
begin
  while (gp) do
  begin
    if (β(s(gp.left) - 1) > s(gp.right)) then { do an LL or LR rotation }
    begin
      p = gp.left ;
      if ((s(p.right) < (1 + β)s(p.left) + 1 - β) and
          (s(gp.right) < s(p.left))) then
        do LL rotation
      else do LR rotation ;
    end
    else
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 11: Simple restructuring procedure for insertion
When $\beta < 1/3$, an LL rotation fails to restore balance only when $d = 0$ (see discussion
following Theorem 3). So we need to rearrange the at most $1/\beta + 2$ nodes in $gp$ into any
$\beta$-balanced tree when $d = 0$. An LR rotation fails only when $d < \frac{1}{\beta(2+\beta)} - 1$. To see this, note
that in the terminology of Lemma 14, $d$ is $d'$. The proof of P2 is extended to the case $\beta < 1/3$
when $d' \ge \frac{1}{\beta(2+\beta)} - 1$. Also, since $d' < 1/\beta$, for the case $b \ge c$, we get $\beta(d' - 1) < 1 - \beta < c$ (as
$c \ge 1$). For the case $b < c$, we need to show $\beta(a - 1) \le b$. Since an LR rotation is done only
when condition D1 $\vee$ D2 holds, from Lemmas 10 and 11, it follows that $\beta(a - 1) \le b$. So, an
LR rotation rebalances when $\beta < 1/3$ provided $d \ge \frac{1}{\beta(2+\beta)} - 1$. For smaller $d$, the at most
$\frac{1}{\beta^2(2+\beta)} + \frac{1}{\beta(2+\beta)} + 1$ nodes in the subtree $gp$ may be directly rearranged into a $\beta$-balanced
tree.
The restructuring algorithm for simple $\beta$-BBSTs is given in Figures 11 and 12. The
algorithm of Figure 11 is used following an insert and that of Figure 12 after a delete.
Simple $\beta$-BBSTs are expected to have higher search cost than the $\beta$-BBSTs of Section 4.
However, they are a good alternative to traditional WB($\alpha$) trees as they are expected to be
"better balanced". To see this, note that from the proof of Lemma 3, the balance, $B(p)$, at
procedure Restructuring3 ;
begin
  while (gp) do
  begin
    if (β(s(gp.left) - 1) > s(gp.right)) then
    begin
      if (β < 1/3) and (s(gp.right) < 1/(β(2 + β)) - 1) then
        rearrange the subtree rooted at gp into any β-balanced tree
      else { do an LL or LR rotation }
      begin
        p = gp.left ;
        if ((s(p.right) < (1 + β)s(p.left) + 1 - β) and
            (s(gp.right) < s(p.left))) then
          do LL rotation
        else do LR rotation ;
      end
    end
    else
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 12: Simple restructuring procedure for deletion
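any node $p$ in a $\beta$-balanced tree satisfies
$$B(p) = 1 - \frac{1}{1 + \frac{s(l)+1}{s(r)+1}} \le 1 - \frac{1}{1 + \frac{s(r)/\beta + 2}{s(r)+1}} = 1 - \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(r)+1)}}.$$
Also, since $s(r) - 1 \le s(l)/\beta$, $s(r) + 1 \le s(l)/\beta + 2$. Hence,
$$1 + \frac{s(r)+1}{s(l)+1} \le 1 + \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(l)+1)}.$$
So,
$$B(p) \ge \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(l)+1)}}.$$
Consequently,
$$\frac{1}{1 + \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(l)+1)}} \le B(p) \le 1 - \frac{1}{1 + \frac{1}{\beta} + \frac{2\beta - 1}{\beta(s(r)+1)}}.$$
When $\beta = \sqrt{2} - 1$,
$$\frac{1}{2 + \sqrt{2} + \frac{1 - \sqrt{2}}{s(l)+1}} \le B(p) \le 1 - \frac{1}{2 + \sqrt{2} + \frac{1 - \sqrt{2}}{s(r)+1}}.$$
If $s(p) \le 10$, $0.296 < B(p) < 1 - 0.296$. So, every $\beta$-balanced subtree with 10 or fewer nodes
is in WB($\alpha$) for $\alpha \approx 0.296$. Similarly, every subtree with 100 or fewer nodes is in WB($\alpha$) for
$\alpha \approx 0.293$. In fact, for every fixed $k$, subtrees of size $k$ or less are in WB($\alpha$) for $\alpha$ slightly
higher than $1 - 1/\sqrt{2} \approx 0.2929$, which is the largest value of $\alpha$ for which WB($\alpha$) trees can be
maintained.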
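The quoted values of $\alpha$ can be reproduced numerically. The sketch below is an illustration, not code from the report. It assumes the lower bound on $B(p)$ has the form $1/(1 + 1/\beta + (2\beta - 1)/(\beta k))$, where $k$ is the largest possible value of $s(l) + 1$ in a subtree of at most $k$ nodes. The function name is hypothetical.

```python
from math import sqrt

def wb_alpha_lower_bound(max_nodes, beta=sqrt(2) - 1):
    """Lower bound on B(p) for beta-balanced subtrees with at most max_nodes nodes.

    Since s(l) + 1 <= s(p) <= max_nodes and the bound decreases as s(l) + 1
    grows (for beta < 1/2), the worst case is s(l) + 1 = max_nodes.
    """
    k = max_nodes
    return 1 / (1 + 1 / beta + (2 * beta - 1) / (beta * k))

# Limiting value as the subtree size grows without bound.
LIMIT = 1 - 1 / sqrt(2)
```

With the default $\beta = \sqrt{2} - 1$, `wb_alpha_lower_bound(10)` is a little above 0.296 and `wb_alpha_lower_bound(100)` is a little above 0.293, and both stay above the limit $1 - 1/\sqrt{2}$.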
6 BBSTs without Deletion
In some applications of a dictionary, we need to support only the insert and search operations.
In these applications, we can construct binary search trees with total cost
$C(T) \le n \log_\phi(\sqrt{5}(n + 1))$
by using the simpler restructuring algorithm of Figure 13.
Theorem 5 When the only operations are search and insert and restructuring is done as in
Figure 13, $C(T) \le n \log_\phi(\sqrt{5}(n + 1))$.
Proof Suppose $T$ currently has $m - 1$ elements and a new element is inserted. Let $u$ be
the level at which the new element is inserted. Suppose that the restructuring pass performs
rotations at $q \le u$ of the nodes on the path from the root to the newly inserted node. Then
$C(T)$ increases by at most $v = u - q$ as a result of the insertion. The number of nodes on
the path from the root to the newly inserted node at which no rotation is performed is also
$v$. Let these nodes be numbered 1 through $v$ bottom to top. Let $S_i$ denote the number of
elements in the subtree with root $i$ prior to the restructuring pass. We see that $S_1 \ge 1$ and
procedure Restructuring4 ;
begin
  while (gp) do
  begin
    if s(gp.left) > s(gp.right) then { check conditions for an LL and LR rotation }
    begin
      p = gp.left ;
      if (s(p.left) > s(p.right)) and (s(p.left) > s(gp.right)) then
        do LL rotation
      else if (s(p.left) < s(p.right)) and (s(p.right) > s(gp.right)) then
        do LR rotation ;
    end
    else { check conditions for an RR and RL rotation }
      do symmetric to the above L case ;
    gp = gp.parent ;
  end ;
end ;
Figure 13: Simple restructuring procedure without a $\beta$ value
$S_2 \ge 2$. For node $i$, $2 < i \le v$, one of its subtrees contains node $i - 1$. Without loss of
generality, let this be the left subtree of $i$. Let the root of the right subtree of $i$ be $d$. So,
$S_i \ge S_{i-1} + s(d) + 1$.
If $i - 1$ is not the left child of $i$, then since no rotation is done at $i$, $s(d) \ge S_{i-1}$. If $i - 1$
is the left child of $i$, then consider node $i - 2$. This is in one of the subtrees of $i$. Since no
rotation is performed at $i - 1$, $s(d) \ge S_{i-2}$. Since $S_{i-1} > S_{i-2}$, we get
$S_i \ge S_{i-1} + S_{i-2} + 1$.
Hence, $S_v \ge N_v$, where $N_v$ is the minimum number of elements in a COST of height
$v$. So, $v \le \log_\phi(\sqrt{5}(m + 1))$. So, when an element is inserted into a tree that has $m - 1$
elements, its cost $C(T)$ increases by at most $\log_\phi(\sqrt{5}(m + 1))$. Starting with an empty tree
and inserting $n$ elements results in a tree whose cost is at most $n \log_\phi(\sqrt{5}(n + 1))$. □
Corollary 2 The expected cost of a search or insert in a BBST constructed as above is
$O(\log n)$.
Proof Since $C(T) \le n \log_\phi(\sqrt{5}(n + 1))$, the expected search cost is $C(T)/n \le \log_\phi(\sqrt{5}(n + 1))$.
The cost of an insert is the same order as that of a search as each insert follows the
corresponding search path twice (top down and bottom up). □
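The insert-only scheme of Figure 13 is short enough to sketch in full. The code below is an illustration under stated assumptions, not the authors' C program: it uses recursion instead of parent pointers, treats "LR rotation" as an ordinary double rotation, fills in the RR/RL case by mirroring the LL/LR tests, assumes distinct keys, and takes $C(T)$ to be the sum of node levels with the root at level 1 and $\phi = (1 + \sqrt{5})/2$. All names are hypothetical.

```python
import math

class Node:
    __slots__ = ("key", "left", "right", "size")
    def __init__(self, key):
        self.key, self.left, self.right, self.size = key, None, None, 1

def s(t):
    """Subtree size; an empty subtree has size 0."""
    return t.size if t else 0

def fix(t):
    t.size = s(t.left) + s(t.right) + 1
    return t

def rot_right(t):                      # LL rotation at t
    p = t.left
    t.left, p.right = p.right, t
    fix(t)
    return fix(p)

def rot_left(t):                       # RR rotation at t
    p = t.right
    t.right, p.left = p.left, t
    fix(t)
    return fix(p)

def restructure(gp):
    """Apply the Figure 13 tests at gp and return the new subtree root."""
    if s(gp.left) > s(gp.right):
        p = gp.left
        if s(p.left) > s(p.right) and s(p.left) > s(gp.right):
            return rot_right(gp)                     # LL
        if s(p.left) < s(p.right) and s(p.right) > s(gp.right):
            gp.left = rot_left(p)                    # LR as a double rotation
            return rot_right(gp)
    elif gp.right:                                   # assumed mirror image of the L case
        p = gp.right
        if s(p.right) > s(p.left) and s(p.right) > s(gp.left):
            return rot_left(gp)                      # RR
        if s(p.right) < s(p.left) and s(p.left) > s(gp.left):
            gp.right = rot_right(p)                  # RL as a double rotation
            return rot_left(gp)
    return gp

def insert(t, key):
    """Insert a distinct key, restructuring bottom-up along the search path."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return restructure(fix(t))

def cost(t, level=1):
    """C(T), taken here as the sum of node levels (root at level 1)."""
    return 0 if t is None else level + cost(t.left, level + 1) + cost(t.right, level + 1)

def cost_bound(n):
    phi = (1 + math.sqrt(5)) / 2
    return n * math.log(math.sqrt(5) * (n + 1), phi)
```

Inserting the keys $1, 2, \ldots, n$ in order and comparing `cost(root)` with `cost_bound(n)` gives a quick empirical check of the bound in Theorem 5 for this variant.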
7 Experimental Results
For comparison purposes, we wrote C programs for BBSTs, SBBSTs (simple BBSTs), BBSTDs
(BBSTs in which procedure Restructuring4 (Figure 13) is used to restructure following
inserts as well as deletes), unbalanced binary search trees (BST), AVL trees, top-down
red-black trees (RBT), bottom-up red-black trees (RBB) [TARJ83], weight balanced trees
(WB), deterministic skip lists (DSL), treaps (TRP), and skip lists (SKIP). For the BBST and
SBBST structures, we used $\beta = 207/500$ while for the WB structure, we used $\alpha = 207/707$.
While these are not the highest permissible values of $\beta$ and $\alpha$, this choice permitted us to use
integer arithmetic rather than the substantially more expensive real arithmetic. For instance,
$\beta(a, b)$ for $\beta = 207/500$ can be checked using the comparisons $207(s(a) - 1) > 500s(b)$ and
$207(s(b) - 1) > 500s(a)$. The randomized structures TRP and SKIP used the same random
number generator with the same seed. SKIP was programmed with probability value $p = 1/4$
as in [PUGH90].
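The integer-only balance test described above can be sketched as follows. This is an illustration rather than the authors' C code; it assumes that $\beta(a, b)$ holds exactly when neither of the two quoted comparisons succeeds, and the function name and parameters are hypothetical.

```python
def beta_balanced(sa, sb, num=207, den=500):
    """True iff subtrees of sizes sa and sb are beta-balanced for beta = num/den.

    Balance is violated exactly when num*(sa - 1) > den*sb or
    num*(sb - 1) > den*sa, so only integer multiplications are needed.
    """
    return not (num * (sa - 1) > den * sb or num * (sb - 1) > den * sa)
```

For example, sizes 3 and 1 are balanced ($207 \cdot 2 = 414 \le 500$) while sizes 4 and 1 are not ($207 \cdot 3 = 621 > 500$).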
To minimize the impact of system call overheads on run time measurements, we programmed
all structures using simulated pointers (i.e., an array of nodes with integer pointers
[SAHN93]). Skip lists use variable size nodes. This requires more complex storage management
than required by the remaining structures which use nodes of the same size. For our
experiments, we implemented skip lists using fixed size nodes, each node being of the maximum
size. As a result, our run times for skip lists are smaller than if a space efficient
implementation had been used. In all our tree structure implementations, null pointers were
replaced by a pointer to a tail node whose data field could be set to the search/insert/delete
key and thus avoid checking for falling off the tree. Similar tail pointers are part of the defined
structure of skip and deterministic skip lists. Each tree also had a head node. WB($\alpha$)
trees were implemented with a bottom-up restructuring pass. Our codes for SKIP and DSL
are based on the codes of [PUGH90] and [PAPA93], respectively. Our AVL and RBT codes
are based on those of [PAPA93] and [SEDG94]. The treap structure was implemented using
joins and splits rather than rotations. This results in better performance. Furthermore,
AVL, RBB, WB, and BBST were implemented with parent pointers in addition to left and
right child pointers. For BBSTs, the enhancements described in Section 4.4 for insert and
delete (see Figure 10) were employed. No rotations were performed during a search when
using any of the structures.
For our experiments, we tried two versions of the code. These varied in the order in which
the 'equality' and 'less than' or 'greater than' checks between x and e (where x is the key
being searched/inserted/deleted and e is the key in the current node) are done. In version 1,
we conducted an initial experiment to determine if the total comparison count is less using
the order L:
if x < e then move to left child
else if x ≠ e then move to right child
else found
or the order R:
if x > e then move to right child
else if x ≠ e then move to left child
else found.
Our experiment indicated that doing the 'left child' check first (i.e., order L) worked better
for AVL, BBST, BBSTD, and DSL structures while R worked better for the RBT, RBB,
WB, SBBST, and TRP structures. No significant difference between L and R was observed
for BSTs. For skip lists, we do not have the flexibility to change the comparison order. The
version 1 codes performed the comparisons in the order determined to be better. For BSTs,
the order R was used.
In the version 2 codes the comparisons in each node took the standard form
if x = e then found
else if x < e then move to left child
else move to right child.
The version 2 restructuring code for BBSTs differed from that of Figure 10 in that the
'>' test in the second, third, and fourth if statements was changed to '≥'. No change was
made in the corresponding if statements for RR and RL rotations. While this increased the
number of comparisons, it reduced the run time.
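The effect of the comparison order on the comparison count can be seen on a toy tree. The sketch below is illustrative only: trees are nested tuples `(key, left, right)` with `None` for an empty subtree (the programs described above use a tail node instead), the second test in orders L and R is taken to be an inequality test, and all names are hypothetical.

```python
def step(x, e, order):
    """Return (direction, comparisons used) at a node with key e."""
    if order == "L":                       # 'less than' test first
        if x < e:
            return "left", 1
        return ("right", 2) if x != e else ("found", 2)
    if order == "R":                       # 'greater than' test first
        if x > e:
            return "right", 1
        return ("left", 2) if x != e else ("found", 2)
    if x == e:                             # version 2: equality test first
        return "found", 1
    return ("left", 2) if x < e else ("right", 2)

def search(tree, x, order):
    """Search for x; returns (found, total key comparisons)."""
    total, node = 0, tree
    while node is not None:
        direction, used = step(x, node[0], order)
        total += used
        if direction == "found":
            return True, total
        node = node[1] if direction == "left" else node[2]
    return False, total

def leaf(k):
    return (k, None, None)

# A perfectly balanced tree on the keys 1..7.
TREE = (4, (2, leaf(1), leaf(3)), (6, leaf(5), leaf(7)))
```

On this tree, finding key 1 costs 4 comparisons with order L, 6 with order R, and 5 with the version 2 order, while finding key 7 costs 6, 4, and 5 respectively. Which order wins therefore depends on how often searches go left versus right in a given structure.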
We experimented with n = 10,000, 50,000, 100,000, and 200,000. For each n, the following
experiments were conducted:
(a) start with an empty structure and perform n inserts;
(b) search for each item in the resulting structure once; items are searched for in the order
they were inserted
(c) perform an alternating sequence of n inserts and n deletes; in this, the n elements inserted
in (a) are deleted in the order they were inserted and n new elements are inserted
(d) search for each of the remaining n elements in the order they were inserted
(e) delete the n elements in the order they were inserted.
For each n, the above five part experiment was repeated ten times using different random
permutations of distinct elements. For each permutation, we measured the total number of
element comparisons performed and then averaged these over the ten permutations.
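The five part protocol can be written as a small driver. The sketch below is an assumption-laden illustration: a Python `set` stands in for the dictionary structure under test, each step of part (c) is taken to insert one new element and then delete one old element, and the key range and function name are hypothetical.

```python
import random

def run_experiment(n, seed=0):
    """Run parts (a)-(e) once against a stand-in dictionary (a Python set)."""
    rng = random.Random(seed)
    keys = rng.sample(range(10 * n), 2 * n)      # 2n distinct keys in random order
    first, second = keys[:n], keys[n:]
    table = set()
    for k in first:                               # (a) n inserts
        table.add(k)
    found_b = sum(k in table for k in first)      # (b) search in insertion order
    for old, new in zip(first, second):           # (c) alternate inserts and deletes
        table.add(new)
        table.remove(old)
    found_d = sum(k in table for k in second)     # (d) search the remaining n elements
    for k in second:                              # (e) delete in insertion order
        table.remove(k)
    return found_b, found_d, len(table)
```

To mirror the measurements reported here, the stand-in set would be replaced by the structure being measured, with a counter for key comparisons, and the run repeated over ten different seeds.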
First, we report on the relative performance of SBBSTs, BBSTDs, and BBSTs. For this
comparison, we used only version 1 of the code. Table 1 gives the average number of key
comparisons performed for each of the five parts of the experiment. The three versions of our
proposed data structure are very competitive on this measure. BBSTDs and BBSTs generally
performed fewer comparisons than did SBBSTs. All three structures had a comparison count
within '' of one another. However, when we used ordered data rather than random data
(Table 2), SBBSTs performed noticeably worse than BBSTDs and BBSTs; the latter two
remained very competitive.
Table 1: The number of key comparisons on random inputs (version 1 code)
Tables 3 and 4 give the average heights of the trees using random data and using ordered
data, respectively. The first number gives the height following part (a) of the experiment
and the second following part (c). The numbers are identical for BBSTDs and BBSTs and
slightly higher (lower) for SBBSTs using random (ordered) data.
The average number of rotations performed by each of the three structures is given in
Tables 5 and 6. A single rotation (i.e., LL or RR) is denoted 'S' and a double rotation
(i.e., LR or RL) denoted 'D'. In the case of BBSTs, double rotations have been divided into
three categories: D = LR and RL rotations that do not perform a second substep rotation;
DS = LR and RL rotations with a second substep rotation of type LL and RR; DD = LR
and RL rotations with a second substep rotation of type LR and RL. BBSTDs and BBSTs
n operation SBBST BBSTD BBST
insert 212495 212223 212111
search 194661 191599 191578
10,000 ins/del 416125 416967 41, '. 2
search 194957 191666 191676
delete 11.i ; 166441 166487
insert 1241080 1' ii;i 1236114
search 1152137 1135131 1134969
50,000 ins/del 2437918 24:;"11 ; 24:;1. ;'
search 1153821 1134277 1134062
delete 1018675 1007766 1007688
insert 21. ;.,13 2624829 2623792
search 24511' 1 242 ;',s1 2 1 ;.13
100,000 ins/del 51T ;i.19 5111" ; 5179653
search 2461221 242 11"' 2419990
delete 2190798 2168049 2168110
insert 5580139 5555190 5553256
search 5223989 5148220 5147698
200,000 ins/del 10981441 10969578 1'll.,;
search 5229172 5144n"'s 5144148
delete 4692447 4641349 4641: ;'
Table 2: The number of key comparisons on ordered inputs (version 1 code)
n SBBST BBSTD BBST
10,000 17,17 16,16 16,16
50,000 20,20 19,19 19,19
100,000 21,21 20,20 20,20
200,000 22,23 21,21 21,21
Table 3: Height of the trees on random inputs (version 1 code)
n SBBST BBSTD BBST
10,000 16,15 17,17 17,17
50,000 20,20 20,20 20,20
100,000 21,21 21,21 21,21
200,000 22,22 23,22 23,22
Table 4: Height of the trees on ordered inputs (version 1 code)
n operation SBBST BBSTD BBST
insert 170182 150554 150554
search 188722 1IS10 1,10
10,000 ins/del I1,!_. 1 315177 314998
search 191681 184155 184155
delete 215214 135311 135131
insert 991526 872967 872967
search 1117174 1101481 1101481
50,000 ins/del 2472 ' l 1 II, ;46 1~11,139
search 111. ;'I l; 1 n.,',1 l1 1 1i .,'1
delete 1277756 792717 791815
insert 2111 ;S SIi,48 1Si,48
search 2384327 2354757 2354757
100,000 ins/del 5249194 3823415 3821594
search 2217.' 2346118 2346128
delete 2;7 ;"* 11, ,i 1684584
insert 4449143 :;' i ii :;'i ;ii
search 5111.I. ;' 4946753 4946753
200,000 ins/del 11105525 Sil11695 804S'_.8
search 5065496 5001967 5001967
delete 5842168 35 '.i 3577223
SBBST BBSTD BBST
n operation S D S D S D DS DD
insert 2341 2220 5045 4314 5025 3938 151 93
10,000 ins/del 12'1.' 3216 10158 6311 10104 5849 232 103
delete 1607 1110 5235 2104 5201 2018 51 28
insert 11719 11120 25216 21596 25059 19732 754 455
50,000 ins/del 21330 16125 51238 31499 50979 29198 1161 531
delete n'.S 5648 26214 10462 26068 10033 248 131
insert 23450 22262 51ll' ; 43230 50047 39461 1527 920
100,000 ins/del 42780 32203 102218 62967 101; 1. 58491 2275 1046
delete 16095 11306 52227 21022 51943 20147 496 260
insert 46934 44525 100664 t..ll"' 100205 79013 3054 1840
200,000 ins/del 8'" ; 64417 204459 125960 203568 116940 4593 2059
delete 32233 22551 104344 41884 l 1'. 40157 990 523
Table 5: The number of rotations on random inputs (version 1 code)
performed a comparable number of rotations on both data sets. However, on random data
SBBSTs performed about half as many rotations as did BBSTDs and BBSTs. On ordered
data, SBBSTs performed 15% to 21% fewer rotations on part (a), 34% fewer on part (c), and
51% fewer on part (e).
The runtime performance of the structures is significantly influenced by compiler and
architectural features as well as the complexity of a key comparison. The results we report
are from a SUN SPARC5 using the UNIX C compiler cc with optimization option. Because
of instruction pipelining features, cache replacement policies, etc., the measured run times
are not always consistent with the compiler and architecture independent metrics reported
in Tables 1 through 6 and later in Tables 11 through 16. For example, since the search codes
for all tree based methods are essentially identical, we would expect methods with a smaller
comparison count to have a smaller run time for parts (b) and (d) of the experiment. This
was not always the case.
Tables 7 and 8 give the run times of the three BBST structures using integer keys and
Tables 9 and 10 do this for the case of real (i.e., floating point) keys. The sum of the run
SBBST BBSTD BBST
n operation S D S D S D DS DD
insert 9984 0 **ii" 2387 *'*"* 2387 0 0
10,000 ins/del 14997 0 16567 6130 16644 5797 25 154
delete 4989 0 6570 3726 6647 3392 26 154
insert 4'' 11 0 4'1'' ; 11956 4' 1' ; 11956 0 0
50,000 ins/del 74996 0 ."', 30659 83247 28982 137 770
delete 24987 0 :; i* 1 'S.6 33242 17018 136 766
insert 99979 0 ,i' ; 23917 *i'*' ; 23917 0 0
100,000 ins/del 149996 0 1';7 ;s 61327 166504 57969 280 1540
delete 4*'l*l'l 0 65733 37392 66505 34040 278 1536
insert 199978 0 199982 47839 199982 47839 0 0
200,000 ins/del 299996 0 331473 122653 333012 115938 559 3078
delete ''' 0 131478 74795 133016 61I 1. 557 3076
Table 6: The number of rotations on ordered inputs (version 1 code)
time for parts (a)-(e) of the experiment is graphed in Figure 14. For random data, SBBSTs
significantly and consistently outperformed BBSTDs and BBSTs. On ordered data, however,
BBSTDs were slightly faster than BBSTs and both were significantly faster than SBBSTs.
Since BBSTs generated trees with the least search cost, we expect BBSTs to outperform
SBBSTs and BBSTDs in applications where the comparison cost is very high relative to that
of other operations and searches are done with a much higher frequency than inserts and
deletes. However, with the mix of operations used in our tests, SBBSTs are the clear choice
for random inputs and BBSTDs for ordered inputs.
In comparing with the other structures, our tables repeat the data for BBSTs. The reader
may make the comparison with SBBSTs and BBSTDs.
The average number of comparisons for each of the five parts of the experiment are given
in Table 11 for the version 1 implementation. On the comparison measure, AVL, RBB, WB,
and BBSTs are the front runners and are quite competitive with one another. On parts (a)
(insert n elements) and (c) (insert n and delete n elements), AVL trees performed best while
on the two search tests ((b) and (d)) and the deletion test (e), BBSTs performed best.
n operation SBBST BBSTD BBST
insert 0.27 0.30 0.34
search 0.06 0.06 0.07
10,000 ins/del 0.57 0.62 0.70
search 0.06 0.06 0.06
delete 0.22 0.25 0.26
insert 1.48 1.61 1.75
search 0.35 0.36 0.37
50,000 ins/del 2.90 3.47 3.84
search 0.36 0.38 0.39
delete 1.13 1.47 1.62
insert 3.00 3.57 3.80
search 0.78 0.83 0.84
100,000 ins/del 6.28 7.78 8.41
search 0.83 0.87 0.88
delete 2.54 3.31 3.58
insert 6.56 7.74 8.37
search 1.80 1.89 1.89
200,000 ins/del 13.89 17.32 18.57
search 1.i1, 1.98 1.98
delete 5.64 7.41 8.02
Time Unit : sec
Table 7: Run time on random inputs using integer keys (version 1 code)
n operation SBBST BBSTD BBST
insert 0.32 0.20 0.27
search 0.05 0.03 0.05
10,000 ins/del 0.58 0.43 0.57
search 0.07 0.03 0.03
delete 0.20 0.17 0.23
insert 1.38 1.20 1.10
search 0.25 0.20 0.20
50,000 ins/del 2.63 2.18 2.40
search 0.25 0.20 0.20
delete 0.95 0.92 1.05
insert 3.43 2.23 2.53
search 0.72 0.45 0.42
100,000 ins/del 5.97 4.70 5.13
search 0.55 0.47 0.42
delete 2.10 1.98 2.15
insert 6.65 4.95 5.25
search 1.20 0.92 0.90
200,000 ins/del 13.13 10.23 10.88
search 1.17 0.90 0.90
delete 4.63 4.25 4.58
Time Unit : sec
Table 8: Run time on ordered inputs using integer keys (version 1 code)
n operation SBBST BBSTD BBST
insert 0.23 0.34 0.36
search 0.07 0.10 0.10
10,000 ins/del 0.44 0.75 0.79
search 0.08 0.10 0.10
delete 0.17 0.29 0.30
insert 1.43 1.76 1.93
search 0.47 0.53 0.52
50,000 ins/del 2.76 3.89 4.22
search 0.50 0.54 0.55
delete 1.13 1.62 1.76
insert 2.96 3.94 4.36
search 1.08 1.17 1.16
100,000 ins/del 6.11 8.58 9.30
search 1.12 1.20 1.22
delete 2.50 3.66 3.95
insert 6 8.92 9.33
search 2.41 2.58 2.57
200,000 ins/del 13 19.49 20.46
search 2.49 2.69 2.66
delete 5.61 8.25 .iI
Time Unit : sec
Table 9: Run time on random real inputs (version 1 code)
n operation SBBST BBSTD BBST
insert 0.27 0.23 0.20
search 0.08 0.07 0.07
10,000 ins/del 0.53 0.50 0.43
search 0.08 0.07 0.05
delete 0.18 0.23 0.20
insert 1.43 1.25 1.12
search 0.40 0.30 0.30
50,000 ins/del 2.80 2.17 2.37
search 0.40 0.30 0.30
delete 1.07 0.90 0.97
insert 3.28 2.58 2.77
search 0.90 0.62 0.63
100,000 ins/del 6.15 4.70 5.13
search 0.87 0.62 0.63
delete 2.35 1.93 2.10
insert 7.37 4.55 4.92
search 1.S, 1.32 1.32
200,000 ins/del 13.35 10.03 10.93
search 1.87 1.33 1.33
delete 5.08 4.17 4.43
Time Unit : sec
Table 10: Run time on ordered real inputs (version 1 code)
n operation BST AVL RBT RBB WB BBST DSL TRP SKIP
insert 264175 211401 21 .' ; 211P i 211916 212111 276247 2" 224757
search 254175 193253 194606 194291 194153 191578 25 i 2", ..' 255072
10,000 ins/del 511 411220 515184 414990 41 11. ; 41,. ,' 923524 601137 519430
search 252200 193141 197399 195525 194442 191676 256578 254119 256124
delete 215555 167312 200218 167455 167531 166487 526242 242743 231745
insert 1560958 1234911 1550701 1';i.'.S 1;' I. 1236114 1640660 1717037 1357076
search 1510958 1147273 1150466 1146754 1149970 1134969 1512093 1503452 1537547
50,000 ins/del 3061i.S 2417733 31'.1145 2424944 2431281 24:;;1.;', 5351715 3456045 2996512
search 1500504 1145",i 1173662 1152764 1151578 1134062 1499657 1497081 1501731
delete 1316917 1013535 1242 11. 1013144 1015'Y1 1007688 3077266 1451, ; 1:;7 ;,S
insert 3329780 21.' I 3305332 2626314 2631411 2623792 3513401 3632046 2919371
search 3229780 2445659 2451137 2446466 245 ;"' 22;';13 3244497 3247143 31t.'2
100,000 ins/del 6537563 51:;7 1 6564352 5154118 5170695 5179653 11545200 7476441 1 ;'i"'113
search :;'I 13 2443038 2502098 2457531 2456748 2419990 3229747 33111' ; 3225343
delete 2 ;'' *;4 2181327 21.'r'72 2177946 21S;213 2168110 6561272 3177135 2981173
insert 7076132 555 ;. 10 7016676 5558174 5571133 5553256 7483199 7682439 617S,.
search 6876132 5191730 5209189 51*'*' 5215568 5147698 6887196 6797942 i;
200,000 ins/del 13907058 11D,''1., 2 13940982 10921',, 10956496 1' '; 24207106 1554:;'.'., 13377747
search 6830718 5186737 5332771 5223154 5220965 5144148 6814733 6916150 I,'ii42
delete 6095324 4664876 5s111 1; 4664344 4.17.S 4641:;' 13811271 6700557 6149268
Table 11: The number of key comparisons on random inputs (version 1 code)
n operation AVL RBT RBB WB BBST DSL TRP SKIP
insert 277234 :I .'' 241383 171017 150554 435199 1:.'' 247129
search 191917 188246 190106 188722 IS, 10 262423 271087 256706
10,000 ins/del 421032 718040 51"110 !'1.S43 314998 *" ;1.76 390899 354566
search 195133 189494 190090 191681 184155 249694 21'Ii ; 250538
delete 104038 276136 218216 214930 135131 468244 1' i i I 84392
insert 1618930 2233658 1436225 ''* ,.211 872967 2 .';.7 ,, K)0 1422120
search 1120497 1117001 1120495 1117174 1101481 1509152 1540082 1467217
50,000 ins/del 2418422 4311748 3055100 2475487 lt', 139 6019215 2194668 1973416
search 1124001 11'.. ; 1126126 111i. il li' 1,11 1481819 1568903 1449810
delete 607478 1719212 1323918 1276262 791815 27S,792 1181612 4'i. 98
insert 3437S.S 4767564 3i1', 1 i 2112201 1SI,48 5521408 1724473 2'r,'18
search 2390963 2383979 2390961 2384327 2354757 3218246 :;i. 1"'' 2970715
100,000 ins/del 5111Sn 9223606 6510188 5254541 3821594 12788447 44:;, 1., 4406427
search 2397971 2487243 2402224 2 ; 27.' 2346128 3163554 :E':l;ns 3277089
delete 1289954 3737982 2847792 2735270 1684584 5971196 2403622 9612 ;
insert 7275714 10135418 6544713 4!..', ; :;'ii ;,i ; 11743159 3! ;., 6403207
search 5081893 5067933 5081891 5111.. ;' 4946753 1.' ; .1 7174727 6448304
200,000 ins/del 10773706 19647336 1321i'1;. 11116226 804t'1,S 27076911 9054078 9062233
search 5095909 5274461 5104418 5065496 5001967 6727017 7006341 6458321
delete 2729906 '7iV474 6095538 58 i;.i'. 3577223 12741948 5094044 1995215
Table 12: The number of key comparisons on ordered inputs (version 1 code)
[Plot: time in seconds (the sum of the times for parts (a)-(e) of the experiment) against n from 50,000 to 200,000, with one curve each for SBBST, BBSTD, and BBST on random inputs and on ordered inputs.]
Figure 14: Run time on real inputs (version 1 code)
Table 12 gives the number of comparisons performed when ordered data (i.e., the elements
in part (a) are $1, 2, \ldots, n$ and are inserted in this order, and those in part (c) are $n + 1, \ldots, 2n$
(in this order)) is used instead of random permutations of distinct elements. This experiment
attempts to model realistic situations in which the inserted elements are in "nearly sorted
order". BSTs were not included in this test as they perform very poorly with ordered data,
taking $O(n^2)$ time to insert $n$ elements. The computer time needed to perform this test on
BSTs was determined to be excessive. This test exhibited greater variance in performance.
Among the deterministic structures, BBSTs outperformed the others in parts (a)-(d) while
AVL trees were ahead in part (e). For part (a), BBSTs performed approximately 42,' .
fewer comparisons than did AVL trees and approximately 12% fewer than WB trees. The
randomized structure TRP was the best of the eight structures reported in Table 12 for part
(a). It performed approximately Ii' fewer comparisons than did BBST trees. However, the
BBST remained best overall on parts (b), (c), and (d).
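The quadratic cost of inserting ordered data into an unbalanced BST can be illustrated with a small sketch. It assumes a plain iterative insert and counts one three-way key comparison per node visited. This counting convention and the names `Node` and `insert_all` are illustrative assumptions; they are not the instrumentation used in the experiments reported here.

```python
import random


class Node:
    """A node of an unbalanced binary search tree (illustrative only)."""
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert_all(keys):
    """Insert keys one by one; return (root, number of key comparisons)."""
    root = None
    comparisons = 0
    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            comparisons += 1  # assumed: one three-way comparison per node visited
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            elif key > node.key:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right
            else:
                break  # duplicate key, nothing to insert
    return root, comparisons


n = 1000
_, ordered_cost = insert_all(range(1, n + 1))

shuffled_keys = list(range(1, n + 1))
random.Random(0).shuffle(shuffled_keys)
_, shuffled_cost = insert_all(shuffled_keys)

print("ordered:", ordered_cost, "shuffled:", shuffled_cost)
```

With ordered input the k-th key visits k-1 nodes, so the total is n(n-1)/2 comparisons, which is 499500 for n = 1000. A random permutation of the same keys needs far fewer.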
The heights of the trees (number of levels in the case of DSL and SKIP) for the experiments
with random and ordered data are given in Tables 13 and 14 respectively. The first
number in each table entry is the tree height after part (a) of the experiment and the second,
the height after part (c). In all cases, the number of levels using skip lists is fewest. However,
among the tree structures, AVL and BBST trees have least height on random data and AVL
has least with ordered data.
n        BST    AVL    RBT    RBB    WB     BBST   DSL    TRP    SKIP
10,000   31,31  16,16  17,18  16,17  17,17  16,16  12,11  32,31  8,8
50,000   38,38  19,19  20,21  19,20  20,20  19,19  13,12  38,37  9,9
100,000  41,41  20,20  21,22  20,21  21,22  20,20  14,13  41,40  9,9
200,000  44,43  21,21  22,24  21,22  23,23  21,21  15,14  43,44  9,9
Table 13: Height of the trees on random inputs (version 1 code)
n        AVL    RBT    RBB    WB     BBST   DSL    TRP    SKIP
10,000   14,14  20,20  24,24  16,15  17,17  14,13  33,34  8,8
50,000   16,16  23,23  29,28  20,20  20,20  16,16  41,41  9,9
100,000  17,17  25,25  31,30  21,21  21,21  17,17  46,41  9,9
200,000  18,18  27,27  33,32  22,22  23,22  18,18  47,46  9,9
Table 14: Height of the trees on ordered inputs (version 1 code)
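As a point of reference for Table 14, the AVL heights after part (a) on ordered inputs can be compared with the smallest number of levels any binary tree on n nodes can have, which is ceil(log2(n+1)). The sketch below assumes that "height" in the tables counts levels (nodes on the longest root-to-leaf path). The helper name `min_levels` is hypothetical.

```python
def min_levels(n):
    """Fewest levels of any binary tree with n >= 1 nodes: ceil(log2(n + 1)).

    For a positive integer n this equals n.bit_length().
    """
    return n.bit_length()


# AVL heights after part (a), copied from Table 14 (ordered inputs).
avl_levels_ordered = {10_000: 14, 50_000: 16, 100_000: 17, 200_000: 18}

for n, reported in avl_levels_ordered.items():
    print(n, "minimum:", min_levels(n), "AVL (Table 14):", reported)
```

Under this assumption, the reported AVL heights coincide with the minimum for all four values of n.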
Tables 15 and 16, respectively, give the number of rotations performed by each of the
deterministic tree schemes for experiment parts (a), (c), and (e). Note that none of the
schemes performs rotations during a search.
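For readers who want a concrete picture of what is being counted, the following is a minimal sketch of a single right rotation with a counter. It is a generic rotation, not the specific rebalancing rules of any scheme compared here. The names `Node`, `rotate_right`, and the global counter `rotations` are assumptions made for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


rotations = 0  # hypothetical counter, incremented once per rotation


def rotate_right(y):
    """Single right rotation at y; returns the new root of the subtree."""
    global rotations
    x = y.left
    y.left = x.right   # x's right subtree becomes y's left subtree
    x.right = y        # y becomes the right child of x
    rotations += 1
    return x


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def height(t):
    """Number of levels in the subtree rooted at t."""
    return 0 if t is None else 1 + max(height(t.left), height(t.right))


chain = Node(3, Node(2, Node(1)))          # left-leaning chain 3 -> 2 -> 1
before = (inorder(chain), height(chain))
root = rotate_right(chain)
after = (inorder(root), height(root))
print(before, after, rotations)
```

The rotation leaves the inorder key sequence unchanged while reducing the number of levels of this chain from 3 to 2. A search only follows child pointers and never calls the rotation, which is why the counts are reported only for the update parts of the experiment.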
On ordered data, BBSTs perform about 2[illegible]% more rotations than do the remaining
structures. These remaining structures perform about the same number of rotations. On
random data, AVL trees, bottom-up red-black trees, and WB trees perform a comparable
number of rotations. Top-down red-black trees and BBST trees perform a significantly larger
number of rotations. In fact, BBSTs perform about twice as many rotations as AVL trees.
The average run times for the random data tests are given in Table 17 and in Table 18
for the ordered data test. Both of these use integer keys. The times using real keys are
AVL RBT RBB WB BBST
n operation S D S D S D S D S D DS DD
insert 2 ;'s 2322 1964 1955 1946 1933 2274 2065 5025 3938 151 93
10,000 ins/del 4343 3224 14773 8213 4053 2591 !'".. 2978 10104 5849 232 103
delete 1645 1120 9558 2678 1845 1166 1595 1022 5201 2018 51 28
insert 11664 11614 9822 9815 9710 9689 11355 10352 25059 19732 754 455
50,000 ins/del 215s' 16214 81895 45180 20255 12979 21266 14975 50979 29198 1161 531
delete 8231 5630 54ii., 13431 9196 5844 7963 5194 26068 10033 248 131
insert 23316 23254 19593 19677 19340 19414 22723 20730 50047 39461 1527 920
100,000 ins/del 43243 32361 196769 111 ;, 40618 25919 1,.1.7 29898 101 ;i. 58491 2275 1046
delete 16466 11264 11i'"_ 26953 1S,10 11708 16024 10420 51943 20147 496 260
insert 46631 46518 39290 39291 38797 38793 45458 41480 100205 79013 3054 1840
200,000 ins/del ,.,18 64712 394187 209941 80892 52030 84927 59911 203568 116940 4593 2059
delete 33047 22477 247905 54046 37083 23379 31984 20800 11; ". 40157 990 523
Table 15: The number of rotations on random inputs (version 1 code)
AVL RBT RBB WB BBST
n operation S D S D S D S D S D DS DD
insert *I'l i 0 'l' II 0 9976 0 9984 0 *'*'* 2387 0 0
10,000 ins/del 14996 0 14999 0 14995 0 14997 0 16644 5797 25 154
delete 4990 0 4' ; 1 4989 0 4989 0 6647 3392 26 154
insert 49984 0 49977 0 49971 0 4*'*',, 0 4'1'' ; 11956 0 0
50,000 ins/del 74994 0 75000 0 74994 0 74996 0 83247 28982 137 770
delete 24's s 0 24978 1 24', 1 0 24987 0 33242 17018 136 766
insert *" ; 0 99975 0 99969 0 99979 0 *''"; 23917 0 0
100,000 ins/del 149994 0 150000 0 149994 0 149996 0 166504 57969 280 1540
delete 49987 0 49977 1 4'***,' 0 4'l's 0 66505 34040 278 1536
insert 199982 0 199973 0 199967 0 199978 0 199982 47839 0 0
200,000 ins/del 299994 0 300000 0 299994 0 299996 0 333012 115938 559 3078
delete *irl. 0 99976 1 99984 0 *i'. 0 133016 611. 557 3076
Table 16: The number of rotations on ordered inputs (version 1 code)
Time Unit : sec
Table 17: Run time on random inputs using integer keys (version 1 code)
given in Tables 19 and 20. The sum of the run time for parts (b) and (d) of the experiment
is graphed in Figure 15 for random data and in Figure 16 for ordered data. The graph of
Figure 15 shows only one line MIX for AVL, RBT, RBB, WB, and BBST while that of
Figure 16 shows MIX for AVL, RBT, RBB, and WB as the times for these are very close.
With integer keys and random data, unbalanced binary search trees (BSTs) outperformed
each of the remaining structures. The next best performance was exhibited by bottom-up
red-black trees, which did marginally better than AVL trees. The remaining structures
performed noticeably worse. For ordered integer keys, BSTs take more time than we were
willing to expend. Of the remaining structures, treaps generally performed best on parts
(a), (c), and (e) while BBSTs did best on parts (b) and (d).
n operation BST AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.08 0.12 0.15 0.12 0.20 0.34 0.19 0.18 0.24
search 0.05 0.05 0.05 0.06 0.05 0.07 0.09 0.09 0.18
10,000 ins/del 0.14 0.21 0.36 0.22 0.39 0.70 0.49 0.33 0.45
search 0.05 0.05 0.05 0.05 0.05 0.06 0.09 0.09 0.18
delete 0.05 0.08 0.12 0.09 0.16 0.26 0.20 0.08 0.16
insert 0.65 0.79 0.98 0.73 1.18 1.75 1.10 1.01 1.36
search 0.40 0.36 0.36 0.36 0.35 0.37 0.58 0.56 1.25
50,000 ins/del 1.04 1.48 2.50 1.26 2.22 3.84 2.77 1.ii1 2.73
search 0.40 0.41 0.44 0.36 0.36 0.39 0.57 0.56 1.16
delete 0.39 0.54 1.01 0.51 0.94 1.62 1.16 0.51 1.10
insert 1.34 1.57 2.10 1.54 2.54 3.80 2.46 2.23 2.84
search 0.88 0.80 0.80 0.83 0.78 0.84 1.36 1.30 2.63
100,000 ins/del 2.36 3.21 5.52 2.74 4.i'1 8.41 6.35 4.10 6.13
search 0.93 0.94 1.00 0.84 0.83 0.88 1.33 1.29 2.61
delete 0.88 1.24 2.26 1.14 2.11 3.58 2.64 1.23 2.41
insert 2.79 3.37 4.41 3.18 5.21 8.37 5.56 4.70 6.25
search 2.00 1.80 1.81 1.81 1.78 1.89 3.03 2.91 5.S5
200,000 ins/del 5.24 6.99 12.51 5.99 10.54 18.57 14.29 8.95 13.29
search 2.08 2.12 2.25 1.91 1.87 1.98 3.04 2.93 5.81
delete 2.01 2.69 5.06 2.51 4.55 8.02 5.84 2.76 5.35
n operation AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.12 0.17 0.12 0.18 0.27 0.23 0.08 0.20
search 0.05 0.03 0.03 0.07 0.05 0.07 0.05 0.12
10,000 ins/del 0.18 0.32 0.20 0.35 0.57 0.42 0.17 0.20
search 0.05 0.05 0.05 0.05 0.03 0.07 0.05 0.13
delete 0.05 0.10 0.07 0.13 0.23 0.15 0.05 0.07
insert 0.75 1.02 0.92 1.25 1.10 0.98 0.47 0.92
search 0.32 0.27 0.27 0.28 0.20 0.33 0.32 0.62
50,000 ins/del 1.28 2.17 1.25 2.20 2.40 2.03 0.80 1.07
search 0.28 0.28 0.27 0.28 0.20 0.30 0.37 0.62
delete 0.30 0.75 0.37 0 1.05 0.65 0.30 0.27
insert 1.50 2.52 1.70 2.58 2.53 2.58 0.90 1.72
search 0.70 0.60 0.57 0.70 0.42 0.70 0.63 1.23
100,000 ins/del 2.60 4.68 2.53 4.78 5.13 4.42 1.52 2.43
search 0.63 0.60 0.55 0.62 0.42 0.70 0.58 1.35
delete 0.62 1.65 0.78 1.87 2.15 1.42 0.45 0.55
insert 3.12 4.82 3.38 5.67 5.25 4.72 1.80 3.52
search 1.38 1.30 1.22 1.33 0.90 1.60 1.25 2.70
200,000 ins/del 5.15 10.40 5.35 10.40 10.88 9.48 3.10 5.13
search 1.33 1.33 1.18 1.32 0.90 1.50 1.28 2.72
delete 1.35 3.63 1.68 4.12 4.58 2.98 0.93 1.12
Time Unit : sec
Table 18: Run time on ordered inputs using integer keys (version 1 code)
Time Unit : sec
Table 19: Run time on random real inputs (version 1 code)
With real keys and random data, BSTs did not outperform the remaining structures.
Now, the five balanced binary tree structures became quite competitive with respect to the
search operations (i.e., parts (b) and (d)). RBB generally outperformed the other structures
on parts (a), (c), and (e). Using ordered real keys, the treap was the clear winner on parts
(a), (c), and (e) while BBSTs handily outperformed the remaining structures on parts (b)
and (d).
Some of the experimental results using version 2 of the code are shown in Tables 21-24.
On the comparison measure, with random data (Table 21), skip lists performed best on
part (a). Of the deterministic methods, BBSTs slightly outperformed the others on part
(a). On parts (b)-(e), AVL, RBT, RBB, WB, and BBSTs were quite competitive and
n operation BST AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.14 0.15 0.21 0.17 0.23 0.36 0.22 0.23 0.30
search 0.09 0.07 0.09 0.10 0.08 0.10 0.13 0.13 0.21
10,000 ins/del 0.24 0.27 0.51 0.32 0.38 0.79 0.62 0.41 0.53
search 0.09 0.08 0.09 0.10 0.08 0.10 0.12 0.12 0.21
delete 0.09 0.09 0.17 0.14 0.14 0.30 0.28 0.11 0.19
insert 0.94 0.97 1.22 0.86 1.29 1.93 1.48 1.19 1.67
search 0.64 0.52 0.50 0.51 0.51 0.52 0.87 0.71 1.44
50,000 ins/del 1.68 1.77 2.74 1.53 2.29 4.22 3.93 2.17 3.15
search 0.66 0.55 0.56 0.54 0.56 0.55 0.86 0.71 1.33
delete 0.63 0.67 1.10 0.72 0.92 1.76 1.80 0.69 1.22
insert 2.06 1.;S 2.34 1.90 2.66 4.36 3.05 2.67 3.61
search 1.43 1.13 1.09 1.13 1.14 1.16 1.84 1.66 3.00
100,000 ins/del 3.63 3.93 6.18 3.33 4.96 9.30 8.45 4.84 7.10
search 1.45 1.26 1.27 1.17 1.26 1.22 1.83 1.65 3.01
delete 1.39 1.50 2.51 1.55 2.03 3.95 3.91 1.61 2.75
insert 4.34 3.95 5.20 3.88 5.56 9.33 6.77 5.81 7.90
search 3.19 2.49 2.42 2.50 2.45 2.57 4.14 3.67 6.62
200,000 ins/del 8.01 8.25 13.78 7.29 10.65 20.46 18.88 10.48 15.83
search 3.21 2.83 2.86 2.62 2.74 2.66 4.08 3.73 6.74
delete 3.11 3.27 5.55 3.41 4.43 .'ii 8.56 3.54 6.04
[Plot: y-axis Time (sec), the sum of time for parts (b) and (d) of the experiment; x-axis n from 50000 to 200000]
Figure 15: Run time on random real inputs (version 1 code)
[Plot: y-axis Time (sec), the sum of time for parts (b) and (d) of the experiment; x-axis n from 50000 to 200000]
Figure 16: Run time on ordered real inputs (version 1 code)
n operation AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.13 0.22 0.15 0.25 0.20 0.25 0.12 0.30
search 0.07 0.08 0.07 0.07 0.07 0.10 0.07 0.15
10,000 ins/del 0.23 0.42 0.27 0.40 0.43 0.47 0.18 0.28
search 0.07 0.05 0.08 0.08 0.05 0.08 0.08 0.12
delete 0.07 0.17 0.08 0.15 0.20 0.20 0.05 0.07
insert 1.15 1.58 1.12 1.S 1.12 1.30 0.67 1.35
search 0.42 0.42 0.43 0.40 0.30 0.53 0.38 0.82
50,000 ins/del 1.28 2.75 1.57 2.57 2.37 3.02 0.92 1.40
search 0.40 0.42 0.42 0.48 0.30 0.53 0.40 0.75
delete 0.38 0.95 0.55 0.93 0.97 1.15 0.33 0.35
insert 1.77 3.23 2.12 3.35 2.77 3.13 1.17 2.42
search 0.90 0.87 0.90 0.88 0.63 1.12 0.92 1.70
100,000 ins/del 3.00 6.00 3.42 5.38 5.13 6.32 1.92 3.22
search 0.97 0.92 0.88 0.98 0.63 1.12 0.82 1.70
delete 0.87 2.08 1.17 2.05 2.10 2.40 0.70 0.67
insert 3.92 6.42 4.27 7.25 4.92 6.03 2.58 4.93
search 1.92 1.87 1.92 1.88 1.32 2.40 1.S5 3.87
200,000 ins/del 5.78 13.80 7.33 11.88 10.93 13.72 3.75 6.67
search 1.90 1.93 1.92 2.13 1.33 2.38 1.75 3.97
delete 1.67 4.55 2.48 4.45 4.43 5.10 1.40 1.35
Time Unit : sec
Table 20: Run time on ordered real inputs (version 1 code)
outperformed BSTs and the randomized schemes. BBSTs performed best on parts (b) and
(d), RBTs did best on part (e), and RBB and AVL did best on part (c). In comparing
the results of Table 21 to those of Table 11 (using version 1 code), we see that the change
to version 2 generally increased the comparison cost of the deterministic tree structures by
about 2[illegible]%. For the DSL, the change in code had mixed results. Notice that for RBTs
and DSLs, the comparison counts for parts (a), (c), and (e) are the same as for the version 1
code. This is because, for inserts and deletes, it is necessary to do the equal check first when
using these structures. For SKIPs, the count is the same for all five parts as the version 1
and 2 codes are the same.
With ordered data (Table 22), treaps required the fewest comparisons for part (a). Skip
lists did best on parts (c) and (e), and AVL trees generally outperformed the other structures
on parts (b) and (d). Once again, the comparison counts were generally higher using the
version 2 code than using the version 1 code.
Run time data using real keys is given in Tables 23 and 24. The sum of the run time for
parts (b) and (d) of the experiment is graphed in Figure 17 for random data and in Figure 18
for ordered data. The graph of Figure 17 shows only one line MIX for AVL, RBT, RBB,
WB, and BBST while that of Figure 18 shows MIX for AVL, RBT, RBB, and WB as the
times for these are very close. With random data, RBB generally performed best on part
(a); on parts (b) and (d), the front runner varied among AVL, RBT, and WB; and on parts
(c) and (e), RBBs generally did best. On ordered data, TRPs did best on parts (a), (c), and
(e) while BBSTs did best on parts (b) and (d).
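The quantity graphed for ordered data, the sum of the times for parts (b) and (d), can be recomputed directly from Table 24. The sketch below does so for n = 200,000. The values are copied from the two search rows of that table; the variable names are illustrative.

```python
# Columns of Table 24 (ordered real inputs, version 2 code).
structures = ["AVL", "RBT", "RBB", "WB", "BBST", "DSL", "TRP", "SKIP"]

# Search times in seconds for n = 200,000: part (b), then part (d).
search_b = [2.08, 2.13, 2.13, 2.17, 1.63, 3.10, 2.27, 3.47]
search_d = [2.12, 2.15, 2.13, 2.17, 1.63, 2.80, 2.18, 4.27]

# Sum of parts (b) and (d), rounded to the two decimals used in the table.
total = {
    name: round(b + d, 2)
    for name, b, d in zip(structures, search_b, search_d)
}
fastest = min(total, key=total.get)

for name in structures:
    print(f"{name:5s} {total[name]:.2f}")
print("fastest on (b)+(d):", fastest)
```

For this row of the table the BBST has the smallest combined search time (3.26 seconds), while AVL, RBT, RBB, and WB lie between 4.20 and 4.34 seconds, consistent with their being drawn as the single line MIX.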
8 Conclusion
We have developed a new weight balanced data structure called β-BBST. This was developed
for the representation of a dictionary. In developing the insert/delete algorithms, we sought
to minimize the search cost of the resulting tree. Our experimental results show that BBSTs
generally have the best search cost of the structures considered. Furthermore, this translates
n operation BST AVL RBT RBB WB BBST DSL TRP SKIP
insert 332753 262198 21 ;S 262726 263177 21 11. 276247 375698 224757
search 322753 241557 21 LS 2 1'2i , 2!__'4 240126 348403 329411 255072
10,000 ins/del 650901 514371 515184 513920 515732 518124 923524 755629 519430
search 318749 241536 247130 243191 2!",1.7 240126 335613 320612 256124
delete 271004 206558 200218 206721 207622 207210 526242 300619 231745
insert 1'" ;' ;'i 15 "' 1550701 1549795 1554520 1539666 1640660 2184066 1357076
search 1933939 1443879 1447870 1446679 1452920 1435927 2043618 1921255 1537547
50,000 ins/del :; ',2221 3043090 311'. 145 3040654 3055092 3061443 5351715 4:;' ;1'l 2996512
search 191:11.s 1443837 1476158 1451163 1452625 1435726 1'.li'i.'", 1909919 1501731
delete 1674128 1267637 1242 ,1. 1268881 1275612 12'ii', 3077266 1815736 1:;i ;,S
insert 4245062 3297162 3305332 3302792 3314410 :;1s1959 3513401 11i. ; ', 2919371
search 4145062 3090057 3098143 3096011 3111095 3074661 4387427 4161175 31U'i'!
100,000 ins/del ; ;."16 6490752 6564352 64'i, 1.L 6520729 '(.;''. 1 11545200 9484761 1,;'1 '!.3
search 4102672 3089826 317.1 .2 3105465 3110184 3074305 4270168 4224698 3225343
delete 3623179 27 ;' 7 21.' r'272 274084G 2756006 2744369 6561272 4008111 2981173
insert 9045367 6999791 7016676 7012317 7040203 6969465 7483199 ', ;4444 617S,,
search 8845367 6584279 6603044 6599643 6633218 6554714 9373163 8752 .; I. 'I;2' ;
200,000 ins/del 17782478 1:;7'11. 13 13940982 13789492 1:; 1.467 1:;'.7876 24207106 1' 2",'4 13377747
search 8757433 ,; 758 6747566 6618833 6630334 6554354 8995>;i 5 '1,; I111 .42
delete 7~Ti"I'24 5882302 5 sli ; 5s**" ; 5923552 5' *"'' 13811271 8 ',' ;[ 6149268
Table 21: The number of key comparisons on random inputs (version 2 code)
n operation AVL RBT RBB WB BBST DSL TRP SKIP
insert 267234 : "1."' 442766 302034 261108 435199 216958 247129
search 237262 2 ;''l 247706 237298 243110 372444 332060 256706
10,000 ins/del 4' ;ii> 718040 727620 .'.'IS 558770 '." ;i76 482499 354566
search 240910 238834 21, ;;1 238320 21' i. 349730 344982 250538
delete 17tl7. 276136 208216 204930 2 ;'i' 468244 1 II1i 84392
insert 1568930 2233658 2672450 1791440 1545934 2 .'',7 1375770 1422120
search 1418962 1421560 1459588 1421 .S 1455' ;. 2159176 1990474 1467217
50,000 ins/del 201762 4311748 4360200 3301668 3251450 6019215 271'_'77 1973416
search 1419154 1444824 141 S 1424442 1452494 2131 .' 1956194 1449810
delete 1064956 1719212 1273918 1226262 1427504 27S,792 1131612 4',. 98
insert 3337S.S 4767564 5744778 3824402 3301096 5521408 2*'" ; 2' '.i18
search 3037892 304:;11 [ 3119128 3041676 3121098 4618272 4538718 2970715
100,000 ins/del 6113530 9223606 9320376 7027676 1,'i ;1 r1 ;' 12788447 5492066 4406427
search 3 l;21. 3089612 :;" 170 3048844 3114012 4563600 4158994 3277089
delete 2279908 3737982 2747792 2635270 3056908 5971196 2303622 961_ ;
insert 7075714 10135418 1""_' L26 8131870 7006166 11743159 5756575 6403207
search 6475750 64i1.128 1.1. ;i214 6483310 6646168 *" ;, 156 9102954 6448304
200,000 ins/del 12927066 19647336 198411 ' 14904040 14671602 27076911 112" i' r . 9062233
search 6476518 6579184 6576890 6497646 663 '21.1 9727066 891h,;s 6458321
delete 4S "'12 'i117 474 5895538 ,". 1i.ii (1,_"_'s 12741948 4t'" 1044 1995215
Table 22: The number of key comparisons on ordered inputs (version 2 code)
n operation BST AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.15 0.14 0.20 0.18 0.25 0.36 0.23 0.25 0.31
search 0.10 0.08 0.10 0.11 0.09 0.11 0.13 0.16 0.21
10,000 ins/del 0.27 0.27 0.52 0.34 0.47 0.80 0.64 0.50 0.54
search 0.10 0.08 0.10 0.11 0.09 0.11 0.13 0.14 0.21
delete 0.10 0.10 0.20 0.14 0.18 0.32 0.29 0.14 0.19
insert 1.02 0.98 1.15 0.89 1.46 1.88 1.44 1.34 1.65
search 0.69 0.55 0.57 0.55 0.57 0.55 0.89 0.83 1.42
50,000 ins/del 1.79 1.80 2.99 1.59 2.93 3.97 3.82 2.44 3.16
search 0.71 0.60 0.63 0.55 0.57 0.56 0.87 0.79 1.32
delete 0.67 0.67 1.22 0.66 1.19 1.63 1.80 0.75 1.21
insert 2.15 2.00 2.58 1.90 3.18 4.01 3.11 2.95 3.69
search 1.52 1.21 1.24 1.18 1.23 1.23 1.97 1.84 3.04
100,000 ins/del 3.88 3.92 6.74 3.46 6.28 8.73 8.50 5.39 7.18
search 1.55 1.32 1.45 1.25 1.29 1.27 1.95 1.82 2.98
delete 1.51 1.49 2.75 1.45 2.57 3.64 3.93 1.73 2.77
insert 5.04 4.45 5.79 4.28 6.92 9.20 7.05 6.81 8.01
search 3.43 2.63 2.70 2.64 2.73 2.69 4.43 4.00 6.60
200,000 ins/del 8.92 8.87 15.36 7.88 13 , 19.53 19.55 12.17 16.11
search 3.43 2.98 3.13 2.73 2.83 2.77 4.37 4.02 6.70
delete 3.33 3.32 6.08 3.20 5.65 8.24 8.91 3.88 6.04
Time Unit : sec
Table 23: Run time on random real inputs (version 2 code)
[Plot: y-axis Time (sec), the sum of time for parts (b) and (d) of the experiment; x-axis n from 50000 to 200000; curves for MIX, BST, DSL, SKIP, and TRP]
Figure 17: Run time on random real inputs (version 2 code)
[Plot: y-axis Time (sec), the sum of time for parts (b) and (d) of the experiment; x-axis n from 50000 to 200000; curves for MIX, BBST, DSL, SKIP, and TRP]
Figure 18: Run time on ordered real inputs (version 2 code)
n operation AVL RBT RBB WB BBST DSL TRP SKIP
insert 0.17 0.23 0.28 0.27 0.30 0.23 0.15 0.30
search 0.08 0.08 0.12 0.08 0.08 0.12 0.12 0.13
10,000 ins/del 0.23 0.43 0.40 0.47 0.60 0.48 0.17 0.27
search 0.08 0.08 0.07 0.08 0.08 0.08 0.10 0.13
delete 0.08 0.15 0.12 0.17 0.20 0.20 0.08 0.05
insert 0.83 1.45 1.43 1.57 1.37 1.35 0.82 1.18
search 0.45 0.48 0.48 0.47 0.38 0.60 0.50 0.83
50,000 ins/del 1.35 2.65 1.95 2.75 2.47 3.05 1.05 1.42
search 0.45 0.47 0.45 0.47 0.37 0.63 0.58 0.77
delete 0.45 1.05 0.50 1.00 1.03 1.17 0.43 0.33
insert 1.78 2.75 2.73 3.43 2.63 3.23 1.33 2.18
search 0.97 0.98 1.00 1.03 0.77 1.30 1.15 1.55
100,000 ins/del 2 6.22 3.98 6.00 5.33 6.37 2.02 3.33
search 0.97 1.10 0.98 1.02 0.77 1.32 1.03 1.70
delete 0.97 2.18 1.05 2.15 2.22 2.43 0.63 0.67
insert 3.78 6.08 5.43 7.18 5.37 6.07 2.87 5.23
search 2.08 2.13 2.13 2.17 1.63 3.10 2.27 3.47
200,000 ins/del 6.13 13.93 8.48 13.42 11.33 13.60 4.10 7.02
search 2.12 2.15 2.13 2.17 1.63 2.80 2.18 4.27
delete 2.03 4.75 2.27 4.77 4.72 5.18 1.35 1.35
Time Unit : sec
Table 24: Run time on ordered real inputs (version 2 code)
into reduced search time when the key comparison cost is relatively high (e.g., for real keys).
The insert and delete algorithms for β-BBSTs are not as efficient as those for other dictionary
structures (such as AVL trees). As a result, we recommend β-BBSTs for environments where
searches are done with much greater frequency than inserts and/or deletes. Based on our
experiments, we conclude that AVL trees remain the best dictionary structure for general
applications.
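The trade-off behind this recommendation can be made concrete with a rough break-even estimate. The sketch below uses only the n = 200,000 insert (part (a)) and search (part (b)) times for AVL and BBST from Table 24. It assumes a workload of one insert pass followed by a number of repeated search passes, ignores deletes, and is a single data point on ordered real keys. The function `total_time` and the notion of a "search pass" are illustrative assumptions.

```python
def total_time(times, search_passes):
    """Time for one insert pass plus `search_passes` repetitions of the search pass."""
    return times["insert"] + search_passes * times["search"]


# Seconds, n = 200,000, Table 24 (ordered real inputs, version 2 code).
avl = {"insert": 3.78, "search": 2.08}
bbst = {"insert": 5.37, "search": 1.63}

# Number of search passes at which the two structures cost the same:
# insert_avl + r * search_avl = insert_bbst + r * search_bbst
break_even = (bbst["insert"] - avl["insert"]) / (avl["search"] - bbst["search"])

print(f"break-even at about {break_even:.2f} search passes per insert pass")
for passes in (1, 3, 4, 10):
    print(passes, round(total_time(avl, passes), 2), round(total_time(bbst, passes), 2))
```

On these figures the BBST overtakes the AVL tree only once the search pass is repeated roughly 3.5 times or more per insert pass, which is in line with recommending it where searches are much more frequent than updates.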
We have also proposed two simplified versions of the BBST called SBBST and BBSTD.
The SBBST seeks only to provide logarithmic run time per operation and unlike the general
BBST, does not reduce search cost at every opportunity. The SBBST provides slightly
better balance than provided by WB(α) trees. The BBSTD does not attempt to maintain
β-balance. However, it performs rotations to reduce search cost whenever possible. Both
versions are very competitive with BBSTs. The SBBST exhibited much better run time
performance than BBSTs on random data and the BBSTD slightly outperformed the BBST
on ordered data. However, BBSTs generated trees with the lowest search cost (though not
by much).
References
[ARAG89] C. R. Aragon and R. G. Seidel, Randomized Search Trees, Proc. 30th Ann. IEEE
Symposium on Foundations of Computer Science, pp. 540-545, October 1989.
[BLUM80] N. Blum and K. Mehlhorn, On the Average Number of Rebalancing Operations
in Weight-balanced Trees, Theoretical Computer Science, Vol. 11, pp. 303-320, 1980.
[GUIB78] L. J. Guibas and R. Sedgewick, A Dichromatic Framework for Balanced Trees,
Proc. 19th FOCS, pp. 8-21, 1978.
[HORO94] E. Horowitz and S. Sahni, Fundamentals of Data Structures in Pascal, 4th
Edition, New York: W. H. Freeman and Company, 1994.
[MUNR92] J. I. Munro, T. Papadakis and R. Sedgewick, Deterministic Skip Lists, 3rd Annual
ACM-SIAM Symposium on Discrete Algorithms, pp. 367-375, January 1992.
[NIEV73] J. Nievergelt and E. M. Reingold, Binary Search Trees of Bounded Balance, SIAM
J. Computing, Vol. 2, No. 2, pp. 33-43, March 1973.
[PAPA93] T. Papadakis, Skip Lists and Probabilistic Analysis of Algorithms, PhD Dissertation,
Univ. of Waterloo, 1993.
[PUGH90] W. Pugh, Skip Lists: a Probabilistic Alternative to Balanced Trees, Communications
of the ACM, Vol. 33, No. 6, pp. 668-676, 1990.
[SAHN93] S. Sahni, Software Development in Pascal, Florida: NSPAN Printing and Publishing
Co., 1993.
[SEDG94] R. Sedgewick, Algorithms in C++, Mass.: Addison-Wesley Pub. Co., 1994.
[TARJ83] R. E. Tarjan, Updating a Balanced Search Tree in O(1) Rotations, Information
Processing Letters, Vol. 16, pp. 253-257, June 1983.
