Gator: A Discrimination Network Structure for
Active Database Rule Condition Matching
Eric N. Hanson
301 CSE
CIS Department
University of Florida
Gainesville, FL 32611
hanson@cis.ufl.edu
April 7, 1993
UF-CIS-TR-93-009
(revised June 14, 1993)
Abstract
This paper introduces a new discrimination network structure called Gator that is
a generalization of the widely known Rete and TREAT algorithms. Gator can be used
as a replacement for Rete or TREAT in active database rule systems and production
system interpreters. Gator is designed as the target structure for a discrimination network
optimizer. Algorithms for performing pattern matching using a Gator network to
see if a rule condition has been satisfied are given. Moreover, cost estimation functions
for Gator networks are introduced, along with one possible strategy for building an
optimized Gator network for a collection of rules.
1 Introduction
Both production systems and active database systems must perform rule condition matching
during execution to determine which rules to fire. The most successful rule condition testing
mechanisms developed for main-memory production systems are discrimination networks
known as Rete [3] and TREAT [9]. Like production systems, active database systems
must also test rule conditions, and we believe some kind of discrimination network will be
the best tool for doing so. However, choosing a good discrimination network structure is
far more important in the active database environment than in main-memory production
systems because of the volume of data involved, and the fact that some or all of the data
may be on disks several orders of magnitude slower than memory. This has led us to look
for ways to optimize rule condition testing. One way to do this is to generate an optimized
discrimination network. An example of this approach is an OPS5 system developed by Ishida
[8] that contained an optimizer for building a high-performance Rete network for a particular
OPS5 application.
However, previous work has shown that the TREAT algorithm can sometimes outperform
the Rete algorithm [9]. Our recent performance study comparing Rete and TREAT
in a database environment showed that neither Rete nor TREAT is always best: TREAT
is normally better than Rete, but sometimes Rete can vastly outperform TREAT [14]. This
led us to search for a more general structure than Rete or TREAT.
This paper presents a generalized discrimination network structure called the Gator
(Generalized Treat/Rete) network. Gator networks are general tree structures. Rete and TREAT
networks are special cases of Gator. Gator networks are suitable for optimization because
there is a very wide variety of Gator structures that can perform pattern matching for a
single rule. In contrast, there is only a single TREAT network for a given rule, and Rete
networks are limited to binary-tree structures.
In many situations the optimal discrimination network will have the form of a Gator
network, not a Rete or TREAT network. Gator networks are also appropriate when there
are storage constraints since the number of internal memory nodes is not fixed as in Rete.
The challenge is to develop an optimizer with search strategies and heuristics that can find
a good Gator network in a reasonable amount of time.
The main body of this paper describes the Gator network structure and a set-oriented
strategy that can be used with a Gator network to do rule condition matching. This
set-oriented strategy is suitable for active database rule condition matching. An outline of
an optimization strategy for Gator networks is discussed, and cost estimation functions for
Gator networks are given. The appendix discusses a recursive, tuple-at-a-time version of
the rule condition matching process for Gator networks. This latter matching strategy is
appropriate for main-memory-based production system implementations.
2 Gator Network Structure
The condition of a rule in a production system or active database system has a structure
similar to that of a relational database query. This structure can be represented as a rule
condition graph with one node for each condition element, and one edge for each join condition.
The nodes are decorated with the selection conditions associated with the corresponding
condition elements. An example rule condition graph is shown in figure 1. This rule condition
graph is based on some hypothetical relations R1 through R5, the meaning of which is
not important for this discussion.
Sample Rete, TREAT, and Gator networks for a rule whose condition has five elements,
similar to the condition that corresponds to the graph in figure 1, are shown in figure 2. The
figure just shows the shape of the networks; other details such as the predicates associated
with the α- and β-memory nodes are omitted. In the figure, the Rete network is a binary
tree, the TREAT network is a degenerate tree with a root and five leaves, and the Gator
network is a tree with two subtrees, one with three children and one with two children. The
subtree with three children has the structure of a TREAT network.
In TREAT, the only memory nodes are α-memories. Rete has a left-deep binary tree
format and maintains internal (β-memory) nodes, which always have two inputs.
[Figure 1 graph: nodes R1 (R1.a>17), R2, R3 (R3.c="on"), R4, and R5 (R5.b="Friday");
join edges R1.d=R2.d, R3.g=R2.g, R2.e=R4.e, and R4.f=R5.f.]
Figure 1: Example rule condition graph.
[Figure 2 diagram: three panels labeled Rete, TREAT, and Gator, each showing a tree of
memory nodes ending in a P-node.]
Figure 2: Examples of Rete, TREAT and Gator discrimination network structure.
Gator differs from Rete and TREAT in the following way. A Gator network can have
internal memory nodes with any number of inputs, not just two. We will call these
multiple-input nodes β-memories to be consistent with terminology of the Rete algorithm. Children of
a multiple-input node may be a combination of leaves (α-memories) and other multiple-input
nodes.
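To make the three shapes concrete, the following is a minimal sketch of how the networks described for figure 2 could be represented as trees. The class names (AlphaMemory, BetaMemory, PNode) and the helper count_beta are illustrative assumptions, not part of the Gator design itself.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AlphaMemory:
    """Leaf node for one condition element (hypothetical class name)."""
    name: str


@dataclass
class BetaMemory:
    """Internal memory node; in Gator it may have two or more inputs."""
    name: str
    inputs: List["Node"] = field(default_factory=list)


Node = Union[AlphaMemory, BetaMemory]


@dataclass
class PNode:
    """Root node holding complete matches for one rule."""
    inputs: List[Node]


def count_beta(node) -> int:
    """Count the beta-memories in the subtree rooted at node."""
    if isinstance(node, AlphaMemory):
        return 0
    own = 1 if isinstance(node, BetaMemory) else 0
    return own + sum(count_beta(child) for child in node.inputs)


a = [AlphaMemory(f"alpha{i}") for i in range(1, 6)]

# TREAT: the P-node is fed directly by all five alpha-memories.
treat = PNode(inputs=list(a))

# Rete: left-deep binary tree, every beta-memory has exactly two inputs.
b1 = BetaMemory("beta1", [a[0], a[1]])
b2 = BetaMemory("beta2", [b1, a[2]])
b3 = BetaMemory("beta3", [b2, a[3]])
rete = PNode(inputs=[b3, a[4]])

# Gator: one three-input subtree (TREAT-like) and one two-input subtree.
g1 = BetaMemory("beta1", [a[0], a[1], a[2]])
g2 = BetaMemory("beta2", [a[3], a[4]])
gator = PNode(inputs=[g1, g2])
```

In this sketch the TREAT network stores no β-memories, the Rete network stores three, and the Gator network stores two, which illustrates why the number of internal memory nodes is a degree of freedom for an optimizer.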
3 Virtual α-memories
Another property of the Gator network is that α-memory nodes may be either materialized,
and thus contain all the tokens that match their selection condition, or they may be virtual,
in which case they contain only their selection predicate but not the tokens matching the
predicate. The concept of virtual α-memories has been used in a variation of the TREAT
algorithm called A-TREAT [4]. Use of virtual α-memories in Gator is identical to that in
A-TREAT. Virtual α-memories save space since the matching tokens need not be stored
in the memory node. This is particularly important in a database environment since the
underlying data sets can be huge.
When attempting to join a token t to a virtual α-memory, the system runs a one-relation
query on the relation associated with the α-memory. The selection predicate for this query
is constructed from the selection predicate of the rule condition element associated with the
α-memory, ANDed with all conditions from the rule of this form:
α.field = t.field
The value t.field is a constant extracted from token t.
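As an illustration, the predicate construction described above might look like the following sketch. The function name, the SQL-like string form of the predicate, and the dictionary representation of a token are assumptions made for this example only.

```python
def virtual_alpha_predicate(selection, join_fields, token):
    """Build the predicate used to probe a virtual alpha-memory.

    selection   -- selection predicate of the condition element (string)
    join_fields -- (alpha_field, token_field) pairs, one per equality join
    token       -- dict mapping field names of token t to constant values
    """
    clauses = [f"({selection})"] if selection else []
    for alpha_field, token_field in join_fields:
        # t.field is replaced by the constant extracted from token t.
        clauses.append(f"{alpha_field} = {token[token_field]!r}")
    return " AND ".join(clauses)


# Probe the alpha-memory for R5 (selection b = 'Friday') with a token
# whose f value is 42, using the join condition R4.f = R5.f of figure 1.
predicate = virtual_alpha_predicate("b = 'Friday'", [("f", "f")], {"f": 42})
print(predicate)
```

A production system would pass the constants as query parameters rather than splicing them into a string; the string form is used here only to show which conditions are ANDed together.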
Virtual α-memories are really a version of the indexing concept used to implement production
system matching prior to the development of the Rete algorithm [3]. In pure indexing,
there is no discrimination network. Tokens are matched against individual condition elements
of rules using an index. When a token matches a single condition element, the system
attempts to directly match the token against elements of working memory to see if the entire
rule condition has been satisfied by a new combination of working memory elements.
A Gator network for the same rule can thus range from a degenerate one with all virtual
α-memory nodes and no β-memory nodes (a pure indexing strategy), to a full binary tree
structure with all memory nodes materialized (a Rete network).
4 Processing Tokens in a Gator Network
When a token enters a node in a discrimination network, the token must be processed to
see if the condition of any rule becomes matched or is no longer matched. The algorithm
for processing tokens in a Gator network is similar to the algorithms for processing tokens
in Rete and TREAT. The rest of this section covers handling of + tokens, − tokens, and
negated condition elements in Gator networks. The algorithms for processing tokens are
described here in a set-oriented style that is suitable for use in active database systems. A
tuple-at-a-time, recursive style of the algorithm that is similar to how Rete and TREAT are
typically implemented for main-memory production systems is presented in [5].
The following terminology will be used. Memory nodes of type α and β will be referred
to together as memory nodes. Nodes that can have multiple inputs, including β-memories
and P-nodes, will be called multiple input nodes. The term node may be used to describe an
α-memory, β-memory, or P-node.
4.1 Handling + tokens
The set-oriented version of the Gator algorithm keeps around at each step a set of tokens
called a temporary join result. For processing a + token, this algorithm is implemented in
terms of a recursive function, InsertPlusTempResult. As part of each memory node N in
the network, there is a list of pairs, one pair for each multiple-input node for which N is an
input (N may be an input to more than one node if it is part of a shared subexpression).
Each pair contains the following:
• a multiple input node MInode which has N as one of its inputs, and
• a plan P for how to join a temporary result inserted into N to the other input nodes of
  MInode. This plan is simply a list of the identifiers of these other nodes, in the order
  in which they should be joined to the temporary result.
The list of pairs of the above form is called the node-plan-pair list, or NPPlist. The function
MInode(NPpair) extracts the multiple input node from the node/plan pair NPpair. The
function plan(NPpair) extracts from NPpair the list of nodes specifying the join order. Based
on this terminology, InsertPlusTempResult is shown in Figure 3. To initiate match processing
when a new token t is inserted into a leaf node LEAF of the Gator network (either a P-node
or an α-memory), an initial temporary result is constructed. This first temporary result TR1
is a set containing only one token, t. Then InsertPlusTempResult is called with TR1 and
LEAF as arguments. The algorithm InsertPlusTempResult terminates when the temporary
result is empty or the final temporary result is added to the P-node.
Later, this paper will consider the issue of how to construct the join plan that is associated
with each input node of a multiple input node. InsertPlusTempResult works for any valid join
order. However, selection of the join order when a Gator network is built is an important
step that can have a big impact on the performance of the network.
The simplest implementation of InsertPlusTempResult will use nested-loop join as the
join algorithm, always using the temporary result as the outer scan, and the memory node
being joined to the temporary result as the inner scan. An alternative might allow use of a
sort-merge algorithm for some of the joins.
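The InsertPlusTempResult procedure with a nested-loop join can be sketched in executable form as follows. This is a minimal illustration under several assumptions that are not in the paper: tokens are Python dictionaries, each plan entry carries its own join predicate, and the names Node, npp_list, and nested_loop_join are hypothetical.

```python
class Node:
    """Memory node or P-node (hypothetical representation)."""

    def __init__(self, name, is_pnode=False):
        self.name = name
        self.is_pnode = is_pnode
        self.tokens = []      # stored tokens, each a dict of field values
        self.npp_list = []    # list of (mi_node, plan) pairs


def nested_loop_join(temp_result, node, pred):
    """Temporary result is the outer scan, the memory node the inner scan."""
    return [{**t, **u} for t in temp_result for u in node.tokens if pred(t, u)]


def insert_plus_temp_result(temp_result, node):
    if not temp_result:
        return
    node.tokens.extend(temp_result)
    if node.is_pnode:
        return  # a real system would adjust the rule agenda here
    for mi_node, plan in node.npp_list:
        current = temp_result          # each plan starts from the new tokens
        for sibling, pred in plan:
            current = nested_loop_join(current, sibling, pred)
        insert_plus_temp_result(current, mi_node)


# Two alpha-memories joined on R1.d = R2.d feed a P-node directly.
a1, a2, p = Node("alpha1"), Node("alpha2"), Node("P", is_pnode=True)
a1.npp_list = [(p, [(a2, lambda t, u: t["R1.d"] == u["R2.d"])])]
a2.npp_list = [(p, [(a1, lambda t, u: t["R2.d"] == u["R1.d"])])]

insert_plus_temp_result([{"R1.d": 5}], a1)   # no partner yet
insert_plus_temp_result([{"R2.d": 5}], a2)   # completes one match
insert_plus_temp_result([{"R2.d": 7}], a2)   # joins with nothing
```

After these three insertions the P-node holds one combined token, and the recursion stopped early in the other two cases because the temporary result became empty.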
InsertPlusTempResult(TR,Node)
{
    /* TR is a temporary result.
       Node is a memory node or P-node in a Gator network. */
    If TR is empty, return.
    Insert the tokens in TR into the collection of tokens
    belonging to Node.
    If Node is a P-node,
        adjust the rule agenda if necessary, and return.
    For each element x of NPPlist(Node) {
        /* Start each plan from the tokens just inserted into Node. */
        Temp = TR
        For each node y in plan(x),
        in order from first to last,
        {
            /* Join the current temporary result to the next
               memory node specified by the join order plan,
               forming the next temporary result. */
            Temp = join(Temp,y)
        }
        /* Insert the final temporary result into the
           multiple input node for which we are trying
           to find new matching tokens. */
        InsertPlusTempResult(Temp,MInode(x))
    }
}
Figure 3: Procedure for inserting a temporary result into a node
4.2 Handling − tokens
Handling − (minus) tokens is slightly different from handling of + tokens. The standard
delete optimization familiar from implementations of Rete is used. This optimization does
not do joins during deletions. Rather, when a − token t enters a node, the token is deleted
from the node. Then, if that node feeds into a multiple input node, the tokens in the multiple
input node are scanned to see if they contain t as a component. If so, they are deleted. In
turn, more − tokens are generated and passed to the successor of the multiple input node.
Detailed algorithms for − tokens won't be presented.
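Although detailed algorithms are not given, the delete optimization described above can be sketched as follows. The representation is an assumption made for illustration: each stored token is a frozenset of base tuple identifiers, so that "contains t as a component" becomes a subset test, and each node has at most one successor (shared subexpressions are ignored).

```python
class MemoryNode:
    """Memory node with at most one successor (simplifying assumption)."""

    def __init__(self, name, successor=None):
        self.name = name
        self.tokens = set()       # each token: frozenset of base tuple ids
        self.successor = successor


def delete_token(token, node):
    """Delete every stored token that contains `token`, without any joins."""
    doomed = {s for s in node.tokens if token <= s}   # subset test
    node.tokens -= doomed
    if node.successor is not None:
        for s in doomed:
            # Each deleted token becomes a minus token for the successor.
            delete_token(s, node.successor)


p = MemoryNode("P")
beta = MemoryNode("beta", successor=p)
a1 = MemoryNode("alpha1", successor=beta)

a1.tokens = {frozenset({"x1"}), frozenset({"x2"})}
beta.tokens = {frozenset({"x1", "y1"}), frozenset({"x1", "y2"}),
               frozenset({"x2", "y1"})}
p.tokens = {frozenset({"x1", "y1", "z1"}), frozenset({"x2", "y1", "z1"})}

delete_token(frozenset({"x1"}), a1)
```

Only scans are performed: the tokens built from x1 disappear from every level, while those built from x2 are left untouched.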
4.3 Negated Condition Elements
In some production systems such as OPS5 [1], one or more condition elements of a rule can
be negated. For example, consider this OPS5 rule defined on relations R1 through R6, with
negated condition elements on R3 and R5.
(P rule1
    (R1 ^a ^b 17)
    (R2 ^a ^c ^e ^h )
    -(R3 ^d ^f 32) ; negated condition element
    (R4 ^g ^i )
    -(R5 ^h ) ; negated condition element
    (R6 ^i ^a>100)
    -->
    ... rule action ...)
In OPS5, a combination C of working memory elements matches a rule condition if it
matches all the positive condition elements, and there is no working memory element in any
negated condition element that matches C.
The Gator network handles negated condition elements in the following way. As with
Rete, each rule condition must have at least one positive condition element. No join edges
are allowed in the rule condition graph between two negated condition elements. In OPS5
terminology, this means there are no pattern variables shared between two negated condition
elements. A negated α-memory is associated with each negated condition element. Unlike
a positive memory node, a negated α-memory does not have an output node. Rather, it is
connected to each positive α-memory to which it has a join relationship in the rule condition
graph. On each positive α-memory node there is a count field for every negated α-memory
node with which the positive node has a join condition. Each count field on a token t in such
a positive α-memory contains the number of tokens in the corresponding negated node that
match t. An example of the condition graph for the rule called "rule1" shown above is given
in figure 4. A Gator network for the rule is given in figure 5. The nodes in the condition
graph corresponding to negated condition elements are shown with a '−' sign in front of
the relation name, and a dashed circle around them. In figure 5 the dashed lines show the
edges in the rule condition graph of figure 4 superimposed on the leaf nodes of a Gator
network for rule1. These edges serve as the connections between negated memory nodes and
the positive nodes with which they have a join relationship. A dashed edge from a positive
α-memory to a negated α-memory indicates that pattern matching needs to be performed
with the negated memory when a token enters the positive memory, and vice versa. The
node α2 in figure 5 has two count fields on each token, showing the number of matching
tokens in α3 and α5, respectively. Any token entering node β1 does not match any token in
the negated condition elements α3 and α5. In general, a token entering a β-memory never
matches a negated condition element connected to one of the leaves in the subtree rooted at
the β-memory. Its leaves are all positive α-memories.
[Figure 4 graph: positive nodes R1 (R1.b=7), R2, R4, and R6 (R6.a>100); negated nodes
R3 (R3.f=32) and R5; labeled join edges R1.a=R2.a, R2.e=R4.g, and R4.i=R6.i.]
Figure 4: Condition graph for rule1.
[Figure 5 diagram: leaf nodes α3, α5, α1, α2, α4, and α6, an internal node β1, and a P-node.]
Figure 5: Example Gator network for rule1.
There are the following cases to consider when processing a token t entering an α-memory
or a temporary result TR being propagated downward in the network:
1. a + token enters a positive α-memory node,
2. a + token enters a negated α-memory node,
3. a − token enters a positive α-memory node,
4. a − token enters a negated α-memory node,
5. a + TR enters a β-memory node,
6. a − TR enters a β-memory node,
7. a + TR enters a P-node,
8. a − TR enters a P-node.
The InsertPlusTempResult and InsertMinusTempResult functions discussed previously
would have to be modified to handle negated condition elements. We will not give a modified
version of InsertPlusTempResult here. Rather, we will discuss the issues involved in handling
negated condition elements for the cases above.
The discussion below describes in detail what to do for each of the cases. A token entering
a node will be called t and a temporary result entering a node will be called TR. The node
the value is entering is called N.
+ token enters a positive α-memory: If the positive α-memory N has no connections
in the rule condition graph to a negated node, then everything proceeds as described
previously in the InsertPlusTempResult function. If N is connected to a set of m
negated α-memories called Sneg, then we add token t to N as follows. First, a new token
t' is constructed by combining t with m initially null count field values, one for each
negated α-memory in Sneg. Then, t is joined with each node in Sneg, and each count
field of t' is initialized to the number of matching tokens in the corresponding negated
node. If any count field of t' is greater than zero, processing stops. Otherwise, a new
temporary result TR' is formed, containing only token t. Then, processing proceeds
as in the standard InsertPlusTempResult function, except that TR' is substituted for
TR. (One can think of the negated α-memories connected to a non-negated α-memory
as filters that restrict the tuples passing out of the non-negated memory.)
+ token enters a negated α-memory: In this case, t is inserted into the negated
α-memory node N. Then t is joined to all non-negated nodes connected to the negated
node by an edge in the rule condition graph. The match count field on each token that
t joins to is incremented. If a token's count becomes nonzero, the token is put in a
temporary result TR' associated with the token's memory node. After all tokens from
TR are joined to that memory node, TR' is propagated downward in the network as
a − (minus) temporary result. The process is completed for all positive α-memories
connected to N.
− token enters a positive α-memory: The token t is removed from α-memory N. If N
is connected to any negated α-memories and t has all count fields equal to zero, a
temporary result TR' is constructed that contains only t, and then TR' is propagated
downward in the network as a − (minus) temporary result.
− token enters a negated α-memory: The token t is joined to each positive α-memory
connected to N, and the appropriate count field of each token to which t joins is
decremented. For each of these positive α-memories, a + temporary result TR' is
created that contains those tokens whose count fields dropped to zero. Each of these
+ temporary results is then propagated downward.
+ TR enters a β-memory: The same process described in InsertPlusTempResult is
used, except that if TR is joined to a positive α-memory, only the tokens in the
α-memory with all zero-valued counts, or no counts, may qualify.
− TR enters a β-memory: This is handled the same as the previous case.
+ TR enters a P-node: Use the same procedure as in InsertPlusTempResult.
− TR enters a P-node: Use the same procedure as in InsertPlusTempResult.
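The count-field bookkeeping for three of the α-memory cases above can be illustrated with the following sketch. It assumes, for brevity, a positive α-memory connected to a single negated α-memory (the text allows m of them), tokens represented as tuples, and hypothetical function names; each function returns the temporary result that would be propagated downward.

```python
class PositiveAlpha:
    def __init__(self):
        self.counts = {}          # token -> count of matching negated tokens


class NegatedAlpha:
    def __init__(self, matches):
        self.tokens = set()
        self.matches = matches    # matches(neg_token, pos_token) -> bool


def plus_into_positive(pos, neg, t):
    """+ token enters a positive alpha-memory: returns the + TR (may be empty)."""
    pos.counts[t] = sum(1 for n in neg.tokens if neg.matches(n, t))
    return [t] if pos.counts[t] == 0 else []


def plus_into_negated(pos, neg, n):
    """+ token enters a negated alpha-memory: returns the minus TR."""
    neg.tokens.add(n)
    minus_tr = []
    for t in pos.counts:
        if neg.matches(n, t):
            pos.counts[t] += 1
            if pos.counts[t] == 1:        # count went from zero to nonzero
                minus_tr.append(t)
    return minus_tr


def minus_into_negated(pos, neg, n):
    """Minus token enters a negated alpha-memory: returns the + TR."""
    if n not in neg.tokens:
        return []
    neg.tokens.discard(n)
    plus_tr = []
    for t in pos.counts:
        if neg.matches(n, t):
            pos.counts[t] -= 1
            if pos.counts[t] == 0:        # count dropped to zero
                plus_tr.append(t)
    return plus_tr


# Positive tokens are (a, h) pairs; negated tokens are (h,) and match on h.
pos = PositiveAlpha()
neg = NegatedAlpha(lambda n, t: n[0] == t[1])

r1 = plus_into_positive(pos, neg, ("x", 1))   # nothing blocks it yet
r2 = plus_into_negated(pos, neg, (1,))        # now ("x", 1) is blocked
r3 = plus_into_positive(pos, neg, ("y", 1))   # blocked on arrival
r4 = minus_into_negated(pos, neg, (1,))       # both become unblocked
```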
This concludes the discussion of pattern matching using Gator networks for rules with
both positive and negated condition elements. Of course, it is not enough to just know how
to do pattern matching using a Gator network. It must be decided which Gator network
to build for a given rule to get the best (or at least good) performance. This optimization
problem is the subject of the next section.
5 Optimization of Gator Networks
The key parts of an optimization strategy for building a good Gator network are computation
of costs of partial solutions, search strategies and heuristics for building new partial solutions
from smaller partial solutions, and strategies for sharing subexpressions. These issues are
addressed below.
5.1 A Strategy for Optimizing Gator Networks
A dynamic programming approach combined with pruning and heuristics can be used to
construct an efficient and potentially optimal Gator network for a given collection of rules,
database, and update pattern. In this approach, when optimizing a rule, we build an array of
sets of partial Gator networks called Nets indexed from 1..n. Before optimizing rule R, Nets
is emptied, and then initialized to contain all Gator subnetworks computed for previously
optimized rules that are usable for the current rule because of shared subexpressions in the
rule condition. A possibly shareable subnetwork with k α-memories as leaves is placed in
the set Nets[k]. The shareable subnetworks in Nets all are assigned a cost value of zero
since their cost was already counted. The algorithm builds Nets[i] for i=1..n where n is
the number of condition elements in the rule condition. After building the set Nets[n], the
lowest cost Gator network in Nets[n] is output as the result. This algorithm is illustrated
in the getOptimalGatorNet procedure shown in figure 6.
getOptimalGatorNet(ruleConditionGraph,previousGatorNets)
{
    /* ruleConditionGraph is the rule condition graph.
       previousGatorNets is the set of Gator subnetworks
       built previously for other rules. */
    /* Initialize Nets[1..n] with networks potentially useful
       for subexpressions. */
    initSubExpressions(Nets,ruleConditionGraph,previousGatorNets)
    /* Find the set of single-input nodes for this rule. */
    add getNetsSize1(ruleConditionGraph) to Nets[1]
    For i = 2 to n {
        For j = 1 to i-1 {
            /* Combine compatible networks of size j and i-j to
               form new networks, prune out the ones that will
               definitely not be part of the optimal solution, and
               add the remaining ones to Nets[i]. */
            add combineAndPrune(Nets[j],Nets[i-j]) to Nets[i]
        }
    }
    winner = the network in Nets[n] with lowest cost.
    return (winner)
}
Figure 6: Gator network optimization algorithm
In order to limit the number of subnetworks generated, the following heuristics will be used:
Connectivity Heuristic: Do not combine two Gator networks unless there is an explicit
join between them in the rule condition graph.
In other words, do not combine a Gator network n1 with a Gator network n2 if there is
no edge between the subgraph of the rule condition graph corresponding to n1, and the
subgraph corresponding to n2. In the case where the rule condition graph is not connected,
dummy join edges with "true" predicates are added until the graph becomes connected.
These dummy edges are placed so that no cycles are added to the graph.
[Figure 7 diagram: the Join rule combines nodes m and m' under a new two-input node; the
Merge rule adds a node m as a further input of a multiple-input node with inputs m1, m2, ..., mk.]
Figure 7: Graphical representation of rules for combining Gator networks.
A caveat to the connectivity heuristic is that the optimization algorithm proposed here
assumes that the rule condition graph is acyclic. If there are cycles in the graph, edges
will be removed until no cycles remain before optimization starts. Conditions corresponding
to the removed edges will be added to the Gator network chosen by the optimizer so that
pattern matching will function correctly. A better way to handle cyclic joins awaits further
study.
Lowest Cost Heuristic: In the case where there already is a network in Nets corresponding
to the same set of condition elements as the network just created, and the existing
network costs no more than the new network, discard the new network.
In addition to these heuristics, rules for combining Gator subnetworks into larger
subnetworks are needed. The following rules suffice:
Join: Combine two memory nodes m and m' by creating a single two-input node β with
the other two nodes as inputs.
Merge: Combine a node m and a multiple-input node β by adding m as another input node
of β.
A straightforward inductive proof shows that these two rules are sufficient for building any
Gator network. A graphical representation of the rules is shown in figure 7.
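The two combination rules can be illustrated with a small sketch. The encoding is an assumption chosen for brevity: an α-memory is a string, and a multiple-input node is a tuple of its inputs.

```python
def join(m1, m2):
    """Join rule: create a new two-input node with m1 and m2 as inputs."""
    return (m1, m2)


def merge(m, beta):
    """Merge rule: add m as another input of the multiple-input node beta."""
    if not isinstance(beta, tuple):
        raise ValueError("merge requires a multiple-input node")
    return beta + (m,)


# A three-input (TREAT-like) node is built by one Join and one Merge.
three_way = merge("a3", join("a1", "a2"))

# The Gator shape described for figure 2: a three-input subtree joined
# with a two-input subtree.
gator = join(three_way, join("a4", "a5"))

# A left-deep Rete shape uses the Join rule only.
rete = join(join(join("a1", "a2"), "a3"), "a4")
```

Repeating Merge on a node produces inputs of any arity, and nesting Join produces any depth, which is the intuition behind the sufficiency claim above.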
In the earlier discussion of how the match is performed using a Gator network, it was
mentioned that there is a join order plan associated with each memory node N for each
multiple input node to which N is an input. During optimization, this plan will be constructed
as follows. Let N be an input node of a multiple input node β. When a temporary result
TR_1 enters N, TR_1 is joined to a sibling of N to form TR_2, then TR_2 is joined to another
sibling of N to form TR_3, and so on until no siblings remain. The join order plan is built
so that TR_{i+1} is always formed by joining TR_i to the remaining sibling S of N such that
(1) TR_i has a join edge to S in the rule condition graph, and (2) the TR_{i+1} value with the
smallest expected size is generated. To be more specific, we say TR_i has a join edge to sibling
S if the subgraph of the rule condition graph corresponding to TR_i, and the subgraph of the
rule condition graph corresponding to S are linked directly by an edge.
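The greedy construction of a join order plan can be sketched as follows. The size estimate used here (current size times sibling size times the product of the join selectivity factors on the connecting edges) is an assumption for illustration, as are the function and parameter names; the paper's actual estimator may differ.

```python
def edge_selectivity(left, right, jsf):
    """Product of JSFs over edges linking the two element sets, or None."""
    sel, linked = 1.0, False
    for (u, v), factor in jsf.items():
        if (u in left and v in right) or (v in left and u in right):
            sel *= factor
            linked = True
    return sel if linked else None


def build_join_plan(n, siblings, covers, size, jsf):
    """Greedy join order for a temporary result entering input node n.

    covers -- node name -> set of condition elements under that node
    size   -- node name -> estimated cardinality
    jsf    -- (element, element) edge -> join selectivity factor
    """
    covered = set(covers[n])
    tr_size = 1.0                        # a single token enters n
    remaining = list(siblings)
    plan = []
    while remaining:
        best = None
        for s in remaining:
            sel = edge_selectivity(covered, covers[s], jsf)
            if sel is None:
                continue                 # condition (1): needs a join edge
            estimate = tr_size * size[s] * sel
            if best is None or estimate < best[0]:
                best = (estimate, s)     # condition (2): smallest next TR
        if best is None:
            raise ValueError("no remaining sibling has a join edge to TR")
        tr_size, chosen = best
        plan.append(chosen)
        covered |= covers[chosen]
        remaining.remove(chosen)
    return plan


# Edges of the figure 1 graph with made-up statistics.
covers = {r: {r} for r in ["R1", "R2", "R3", "R4", "R5"]}
size = {"R2": 100, "R3": 10, "R4": 1000, "R5": 50}
jsf = {("R1", "R2"): 0.01, ("R3", "R2"): 0.05,
       ("R2", "R4"): 0.001, ("R4", "R5"): 0.02}
plan = build_join_plan("R1", ["R2", "R3", "R4", "R5"], covers, size, jsf)
```

With these numbers a token entering R1 is joined to R2 first (the only connected sibling), then to R3, R4, and R5.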
The procedure described above uses a heuristic to construct a reasonably good join plan
quickly. An alternative would be to use a query optimizer to decide the join plan, but
that would be prohibitively expensive since construction of the plan is part of the larger
optimization process to build an optimized Gator network. Running a query optimizer
hundreds or thousands of times as part of optimizing the Gator network for a single rule
would simply take too long to be feasible. Also, since the initial temporary result coming
into N is likely to be small, it will usually be best to join that temporary result to a sibling
of N as the first step because that will tend to make the next temporary result small as well.
This concludes the discussion of the proposed search strategy for constructing Gator
networks, pruning bad subnetworks, and eventually returning the best network. Even though
a very large number of Gator networks and subnetworks will be built during the optimization
process, the process should still terminate in a reasonable amount of time (less than a few
seconds for one rule) because typical rule condition graphs will have only a few nodes. It
will be very rare for a rule condition graph to have more than ten nodes. Most will have one
to five nodes.
The optimization technique discussed above is now being implemented as part of a simulator
to compare Gator with Rete and TREAT. Other optimization strategies such as
simulated annealing, iterative improvement, and two-phase optimization [6, 7] could also be
applied to Gator networks. Finding which strategy is best is a possible research topic.
This section assumes all α-memories in a Gator network are stored, not virtual. Developing
a process for building optimized Gator networks that may contain virtual α-memories
is a subject for future work. For now, if virtual α-memories are to
be used, a postprocessor can be applied to the Gator network produced by the optimizer to
decide which nodes should be virtual. Below, we discuss in detail the functions for estimating
the cost of a piece of Gator network.
5.2 Cost Functions
The question that must be answered is, what does a Gator network cost? First, the units
of cost will be update frequency times elapsed time. To see why this is important and how
it can lead to good Gator network structures, observe the following. If no tokens ever enter
the memory nodes at the leaves of a Gator network, then its cost is zero, so its structure is
not important. If tokens frequently enter one leaf node α1 but not the others, then it
will probably be appropriate to construct any Gator network for the infrequently updated
nodes, as long as it has a β-memory at the bottom, and then join α1 to that β-memory. This
will require only a one-way join when tokens enter α1, rather than a multiway join.
Since the optimizer will operate in a database management system environment, the
catalog statistics about relation size, attribute cardinality, and attribute value distribution
will be available. These statistics can be used by the optimizer to compute estimates of
selection predicate and join predicate selectivity, just as they are used by query optimizers
[11]. The most important variable in the problem of discrimination network optimization that
is different from those used for query optimization is update frequency. Relative frequency of
updates to different memory nodes in a discrimination network can have a major impact on
the choice of the optimal structure. It is assumed that the database system keeps the insert
and delete frequencies of each relation in the system catalogs. Modifications of existing
records are treated as deletes followed by inserts. The frequency might be in units like
"total number of operations performed in the last 24 hours." The discussion below gives the
parameters involved in computing the cost of a Gator network, and then develops formulas
for estimating the total cost of a network.
5.3 Parameters used
The following are the parameters that are needed to estimate the cost of a Gator network.
Unless stated otherwise, N represents any kind of Gator network node (α, β or P-node).
F_i(N) is the frequency with which new + tokens enter a node and update it.
F_d(N) is the frequency with which new − tokens enter a node and update it.
S(N) is the cardinality of a given node.
Pages(N) is the number of pages occupied by N.
Sel(α) is the selectivity of the predicate associated with the α-memory node α. This value
can be estimated from catalog data using standard techniques [11].
JSF(N_1, N_2) is the estimated join selectivity factor between a pair of nodes N_1 and N_2. It
is an estimate of the following value:
$$\frac{S(N_1 \bowtie N_2)}{S(N_1) \cdot S(N_2)}$$
JSF can be estimated using standard techniques from query optimization.
I/O_weight is the time to do a disk read or write.
CPU_weight is the CPU time spent to insert or delete a tuple in a memory node or perform
a predicate test on a tuple or between a pair of tuples.
Reln(α) is the relation from which α-memory node α is derived. The frequency F(Reln(α))
of updates to this relation can be obtained from statistics maintained in the catalog.
Cost(N) is the total cost associated with the subnetwork rooted at N.
LocalCost(N) is the cost associated with only local processing at N (i.e. not including the
cost of its children).
In the cost formulas to follow, it will sometimes be necessary to estimate the number of pages
touched in a memory node of size m blocks when k records in it are accessed at random.
For this the following approximation to the Yao function [15] is used:
$$\mathrm{Yao}(m, k) = \begin{cases} m\left(1 - (1 - 1/m)^{k}\right) & \text{if } k \ge 1 \\ k & \text{if } k < 1 \end{cases}$$
In the cost functions that follow, unless stated otherwise, it is assumed that any disk pages
read are not already in the buffer pool.
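The approximation can be written directly as a small function. This is a sketch of the formula as reconstructed above; the function name and argument types are illustrative.

```python
def yao(m: float, k: float) -> float:
    """Approximate number of pages touched when k records are accessed at
    random in a memory node occupying m pages."""
    if k < 1:
        return k
    return m * (1.0 - (1.0 - 1.0 / m) ** k)


# A node that fits on one page never costs more than one page access.
single_page = yao(1, 5)

# Accessing many records in a 100-page node approaches, but never
# exceeds, the 100 pages of the node.
many_records = yao(100, 1000)
```

Both branches agree at k = 1, where the first expression evaluates to one page.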
5.3.1 Total Cost of a Gator Network
The total cost of a Gator network with P-node P is Cost(P). The Cost function is recursively
defined. Its implementation for each of the different node types is discussed below.
5.3.2 Cost Computation of α nodes
The α nodes will either be stored on a single page, or organized using a clustered index
with the entire tuple contents as the key. If an index is used, we assume no index I/O will
be needed because the index will be in the buffer pool. Hence, one disk read and one disk
write are required to do either an insert or delete. Since only one tuple is touched, only one
CPU_weight cost is incurred. Hence, we have:
$$Cost(\alpha) = LocalCost(\alpha) = (2 \cdot I/O_{weight} + CPU_{weight}) \cdot (F_i(\alpha) + F_d(\alpha))$$
The insert and delete frequencies of the α node, F_i(α) and F_d(α), are given by these formulas:
$$F_i(\alpha) = F_i(Reln(\alpha)) \cdot Sel(\alpha)$$
$$F_d(\alpha) = F_d(Reln(\alpha)) \cdot Sel(\alpha)$$
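As a worked example, the α node cost formula can be evaluated as follows. The parameter names and the numeric values are made up for illustration; the weights are in arbitrary time units.

```python
def alpha_cost(f_ins_reln, f_del_reln, sel, io_weight, cpu_weight):
    """Cost of an alpha node: one read, one write, and one CPU step per
    token that passes the selection predicate."""
    f_i = f_ins_reln * sel      # F_i(alpha) = F_i(Reln(alpha)) * Sel(alpha)
    f_d = f_del_reln * sel      # F_d(alpha) = F_d(Reln(alpha)) * Sel(alpha)
    return (2 * io_weight + cpu_weight) * (f_i + f_d)


# 1000 inserts and 200 deletes per period on the relation, a predicate
# selecting 10% of tuples, I/O weight 10 and CPU weight 1.
cost = alpha_cost(1000, 200, 0.1, io_weight=10, cpu_weight=1)
```

Here F_i(α) = 100 and F_d(α) = 20, so the cost is (2·10 + 1)·120 = 2520 units.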
5.3.3 Cost computation of β nodes
A β node can have two or more children, and the children may be either α or β nodes. The
cost of a β node is a function of the cost of processing tokens propagated from each of its
children. This involves join costs for doing the matching, plus costs to update the stored
copy of the β node itself. Associated with each child c of a β node is a join plan that gives the
sequence in which join operations should be performed when a temporary result TR arrives
at a child c of a β node. The join plan is the sequence (m1, m2, ..., mk), where m1 through
mk are the children of β except for c. When TR enters c, then the following sequence of
operations is performed:
TR = TR ⋈ m1
TR = TR ⋈ m2
...
TR = TR ⋈ mk
At this point, the contents of TR are inserted into or deleted from β, as appropriate.
Each child of a β node either fits on one page or has a secondary index (or indexes) for
faster access on the join fields. The indexes on the child nodes are used when the TR is
joined with them. The cost of a β node includes the cost of its children, the cost of updating
it, and the cost of performing joins when temporary results are propagated from its children.
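\[ \mathrm{Cost}(\beta) = \mathrm{LocalCost}(\beta) + \sum_{N \in \mathrm{children}(\beta)} \mathrm{Cost}(N) \]
\[ \mathrm{LocalCost}(\beta) = \sum_{N \in \mathrm{children}(\beta)} \left( F_i(N) \cdot \mathrm{PerChildInsCost}(N, \beta) + F_d(N) \cdot \mathrm{PerChildDelCost}(N, \beta) \right) \]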
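The recursive structure of Cost can be sketched in Python as follows. This is an illustration only: the `Node` class, its field names, and the idea of passing the per-child cost functions in as callables are assumptions made for the example, and leaf nodes simply carry a precomputed local cost.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    name: str
    f_ins: float = 0.0       # Fi(N): insert frequency of tokens leaving N
    f_del: float = 0.0       # Fd(N): delete frequency of tokens leaving N
    leaf_cost: float = 0.0   # LocalCost for a leaf (alpha) node
    children: List["Node"] = field(default_factory=list)


PerChildCost = Callable[[Node, Node], float]  # (child, parent) -> cost


def local_cost(parent: Node, ins_cost: PerChildCost, del_cost: PerChildCost) -> float:
    """Frequency-weighted cost of processing tokens arriving from each child."""
    return sum(c.f_ins * ins_cost(c, parent) + c.f_del * del_cost(c, parent)
               for c in parent.children)


def total_cost(node: Node, ins_cost: PerChildCost, del_cost: PerChildCost) -> float:
    """Cost(node) = LocalCost(node) + sum of Cost over its children."""
    if not node.children:
        return node.leaf_cost
    return local_cost(node, ins_cost, del_cost) + sum(
        total_cost(c, ins_cost, del_cost) for c in node.children)


# Made-up example: two alpha nodes feeding one beta node,
# with constant per-child insert and delete costs.
a1 = Node("a1", f_ins=2.0, f_del=1.0, leaf_cost=5.0)
a2 = Node("a2", f_ins=3.0, f_del=0.0, leaf_cost=7.0)
beta = Node("beta", children=[a1, a2])
example_total = total_cost(beta, lambda n, p: 10.0, lambda n, p: 4.0)
```

In the example the local cost of the β node is 2·10 + 1·4 + 3·10 = 54, and adding the two leaf costs gives a total of 66.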
The function PerChildInsCost(N, β) accounts for the cost to process a temporary result
containing one + token arriving at β from child N. This represents a simplifying assumption
that if a temporary result TR arrives at β from N, each token in TR will be processed
individually. Not making this assumption would make the cost estimation functions significantly
more complex. The function PerChildDelCost(N, β) accounts for the cost to process
a temporary result containing a single − token arriving at β from child N. The function
PerChildInsCost(N, β) is most easily represented as the following procedure:
PerChildInsCost(N, β) {
    (size, cost) = JoinSizeAndCost(N, β)
    return( cost + UpdateCost(β, size) )
}
The body of this procedure calls a function JoinSizeAndCost(N, β) that returns a pair
of numbers: the expected size of the last temporary result generated after a
token comes into N, and the cost of generating that temporary result. The function
JoinSizeAndCost(N, β) is shown in figure 8. Another procedure called in PerChildInsCost
is UpdateCost(β, TRsize), which gives the time required to update a multiple
input node β to reflect the contents of a temporary result of size TRsize that is to be
propagated through β. The UpdateCost function is shown in figure 9.
The procedure PerChildDelCost(N, β) has a slightly different structure than the procedure
PerChildInsCost(N, β). It is assumed that the standard delete optimization often
used in Rete and TREAT implementations is employed. That means that when a temporary
result TR with a − tag arrives at a child N of node β, rather than joining TR with siblings
of N, the node β is scanned to see if any members of TR are components of tokens in β. If
so, those tokens from β are removed and propagated onward as a new temporary result with
a − tag.
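The delete optimization can be illustrated with a small Python sketch. It assumes, purely for the example, that a compound token is a tuple of hashable component tuples and that β is held as an in-memory list; the function name is hypothetical.

```python
def apply_delete_optimization(beta_tokens, tr):
    """Scan a beta node for compound tokens that contain any member of TR.

    beta_tokens: list of compound tokens, each a tuple of component tuples
    tr: iterable of component tuples arriving with a minus tag
    Returns (kept, removed); `removed` is the new minus-tagged temporary result.
    """
    doomed = set(tr)
    kept, removed = [], []
    for token in beta_tokens:
        if any(component in doomed for component in token):
            removed.append(token)   # propagate onward with a minus tag
        else:
            kept.append(token)
    return kept, removed
```

For instance, if β holds the tokens (e1, d1), (e2, d1) and (e3, d2) and the component d1 is deleted, the first two tokens are removed and propagated, and only (e3, d2) remains. No join with the siblings of N is needed.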
JoinSizeAndCost(N, β) {
    TRsize = 1 /* initial temporary result size */
    cost = 0
    L = plan(N, β) /* initialize join order list */
    previousNode = N
    m = first(L) /* let m be first element of list L */
    while ( m ≠ Null ) {
        if m fits on one page then
            cost = cost + I/Oweight + CPUweight · S(m) · TRsize
        else /* m is bigger than one page, so assume */
             /* we only look at tuples on one page of m */
             /* for each tuple in TR. */
            cost = cost + I/Oweight · min(pages(m), TRsize) +
                   CPUweight · tuplesPerPage(m) · TRsize
        TRsize = JSF(previousNode, m) · TRsize · S(m)
        L = rest(L) /* set list L to remaining list elements
                       after the first element */
        previousNode = m
        m = first(L)
    }
    return(TRsize, cost)
}
Figure 8: When a temporary result TR is propagated out of a node N that is an input node
to a node β, TR must be joined to the other input nodes of β. This procedure computes
both the cost of doing these joins, and the size of the final temporary result generated after
doing the joins.
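A runnable Python rendering of the procedure in Figure 8 may help when experimenting with the cost model. It is a sketch under stated assumptions: the `Mem` record, the way the join plan and the join selectivity function JSF are passed in, and all numbers in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Mem:
    name: str
    tuples: float            # S(m): number of tuples stored in the node
    pages: int               # pages(m)
    tuples_per_page: float   # tuplesPerPage(m)


def join_size_and_cost(n: Mem, plan: Sequence[Mem],
                       jsf: Callable[[Mem, Mem], float],
                       io_weight: float, cpu_weight: float) -> Tuple[float, float]:
    """Return (final temporary result size, cost of the joins in the plan)."""
    tr_size, cost, previous = 1.0, 0.0, n
    for m in plan:
        if m.pages <= 1:
            # m fits on one page: one I/O, then compare against every tuple of m.
            cost += io_weight + cpu_weight * m.tuples * tr_size
        else:
            # Assume one page of m is examined per tuple in TR.
            cost += (io_weight * min(m.pages, tr_size)
                     + cpu_weight * m.tuples_per_page * tr_size)
        tr_size = jsf(previous, m) * tr_size * m.tuples
        previous = m
    return tr_size, cost


# Made-up example: a token enters n and is joined to m1, then m2.
n = Mem("n", tuples=10, pages=1, tuples_per_page=10)
m1 = Mem("m1", tuples=20, pages=1, tuples_per_page=20)
m2 = Mem("m2", tuples=1000, pages=10, tuples_per_page=100)
size, cost = join_size_and_cost(n, [m1, m2], lambda a, b: 0.1,
                                io_weight=10.0, cpu_weight=1.0)
```

With a constant join selectivity of 0.1, the first join costs 10 + 20 = 30 and yields 2 tuples; the second costs 10·2 + 100·2 = 220 and yields 200 tuples, so the procedure returns a size of about 200 and a cost of about 250.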
UpdateCost(β, TRsize) {
    /* A TRsize value less than one is significant
       since small join selectivities can produce temporary
       results that are small on average. The Yao function
       takes this into consideration. */
    cost = Yao(pages(β), TRsize) · 2 · I/Oweight + TRsize · CPUweight
    /* If TR is larger than one page, add the cost to
       allocate or delete pages in β. */
    if TRsize > tuplesPerPage(β)
        cost = cost + (TRsize / tuplesPerPage(β)) · 2 · I/Oweight
return(cost)
}
Figure 9: Function to compute cost to update a β memory node.
The function PerChildDelCost(N, β) must account for the cost to read all pages of the
β node (no index is available to support this operation), plus the cost to write pages of β
that contain tokens with components from child N. Each token in β must be examined, so
one CPUweight factor must be paid for each token in β as well. This function is most easily
represented as the following procedure:
PerChildDelCost(N, β) {
    (size, cost) = JoinSizeAndCost(N, β)
    return( (Yao(pages(β), size) + pages(β)) · I/Oweight
            + S(β) · CPUweight )
}
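The update and delete costs for a β node can be sketched together in Python. Several points are assumptions rather than statements from the paper: the Yao approximation is taken to be m(1 - (1 - 1/m)^k) for k > 1 and k otherwise, the extra term for results larger than one page is read as (TRsize / tuplesPerPage) · 2 · I/Oweight, and the join size is passed in directly instead of being computed by JoinSizeAndCost. All names are hypothetical.

```python
def yao(m: float, k: float) -> float:
    """Assumed form of the Yao approximation used in the text."""
    return k if k <= 1 else m * (1.0 - (1.0 - 1.0 / m) ** k)


def update_cost(pages: float, tuples_per_page: float, tr_size: float,
                io_weight: float, cpu_weight: float) -> float:
    """Cost to apply a temporary result of tr_size tuples to a beta node."""
    # Each touched page is read and written; each tuple costs one CPU unit.
    cost = yao(pages, tr_size) * 2 * io_weight + tr_size * cpu_weight
    if tr_size > tuples_per_page:
        # ASSUMPTION: reconstructed term for allocating or deleting pages.
        cost += (tr_size / tuples_per_page) * 2 * io_weight
    return cost


def per_child_del_cost(join_size: float, pages: float, tuples: float,
                       io_weight: float, cpu_weight: float) -> float:
    """Read every page of the beta node, write the touched pages,
    and pay one CPU unit per stored token."""
    return (yao(pages, join_size) + pages) * io_weight + tuples * cpu_weight
```

As a made-up example, for a β node of 10 pages holding 1000 tokens, with an I/O weight of 10, a CPU weight of 1 and an expected result size of 0.5, the update cost is 0.5 · 20 + 0.5 = 10.5 and the delete cost is (0.5 + 10) · 10 + 1000 = 1105. The delete cost is dominated by the full scan, which is why it does not depend on the join plan.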
5.3.4 Pnode Cost
The Pnode is not stored permanently on durable storage, and it is emptied when the rule
associated with it is fired. The cost of a Pnode is the sum of the cost of its children and the
local cost involved when performing joins when temporary results arrive from its children.
The formula shown below for the cost is similar to the one for a β-memory except the cost
to update the contents of the Pnode is a CPU-only cost.
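\[ \mathrm{Cost}(\mathrm{Pnode}) = \mathrm{LocalCost}(\mathrm{Pnode}) + \sum_{N \in \mathrm{children}(\mathrm{Pnode})} \mathrm{Cost}(N) \]
\[ \mathrm{LocalCost}(\mathrm{Pnode}) = \sum_{N \in \mathrm{children}(\mathrm{Pnode})} \left( F_i(N) \cdot \mathrm{PerChildInsCost}(N, \mathrm{Pnode}) + F_d(N) \cdot \mathrm{PerChildDelCost}(N, \mathrm{Pnode}) \right) \]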
The above formulas are identical to the ones for β nodes. The difference for a Pnode is
in the PerChildInsCost and PerChildDelCost functions. The PerChildInsCost is the
cost to update N plus the cost to join the + token in the TR that just arrived at N to the
other children of the Pnode, plus the cost to place in the Pnode any new compound tokens
matching the complete rule condition. The Pnode is usually small since it only contains
the data that has matched the rule condition since the last time the rule executed. The
Pnode will thus normally be in main memory, so there will not be any I/O cost to update
the Pnode.
PerChildInsCost(N, Pnode) {
    (size, cost) = JoinSizeAndCost(N, Pnode)
    return( cost + CPUweight · size )
}
The PerChildDelCost is the cost to scan the Pnode to see if any compound token in the
Pnode contains a component which is equal to the token in the TR that just arrived at
N. The actual size of the Pnode is impossible to know. It can be assumed that the Pnode
is usually small or empty. Hence it is assumed that the cost to scan the Pnode is just
CPUweight. Since the Pnode will be in main memory there will be no I/O cost to update it.
PerChildDelCost(N, Pnode) = CPUweight
The return statement in PerChildInsCost(N, Pnode) above returns the join portion of the
cost, plus a per-tuple CPU cost for updating the Pnode.
5.3.5 Estimating update frequency of β nodes
Formulas given previously have used the frequency functions Fi and Fd on β nodes. To finish
the story we need to show how Fi and Fd for β nodes can be estimated. The value of Fi(β)
is the relative frequency of propagation of compound + tokens out of β compared with other
nodes. Similarly, the value of Fd(β) is the relative frequency of propagation of compound −
tokens out of β. An estimate for Fi is:
\[ F_i(\beta) = \sum_{N \in \mathrm{children}(\beta)} F_i(N) \cdot \mathrm{JoinSize}(N, \beta) \]
where JoinSize(N, β) simply returns the size value from the pair of values returned by
JoinSizeAndCost(N, β). Similarly, an estimate for Fd is:
\[ F_d(\beta) = \sum_{N \in \mathrm{children}(\beta)} F_d(N) \cdot \mathrm{JoinSize}(N, \beta) \]
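The two frequency estimates share the same shape, which a short Python sketch makes explicit. The function name and the representation of each child as a (frequency, join size) pair are assumptions made for the example.

```python
def beta_frequency(children):
    """Estimate Fi(beta) or Fd(beta).

    children: iterable of (child_frequency, join_size) pairs, where
    child_frequency is Fi(N) or Fd(N) and join_size is JoinSize(N, beta).
    """
    return sum(freq * join_size for freq, join_size in children)
```

For example, if one child emits 2 tokens per unit time with an expected join size of 0.5 and another emits 4 with a join size of 0.25, the β node is estimated to emit 2 · 0.5 + 4 · 0.25 = 2 compound tokens per unit time. Because a β node can itself be a child, these estimates would be computed bottom-up, starting from the α nodes.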
5.4 Cost estimation and optimization: conclusion
This section has presented a search strategy for building an optimized Gator network, and
functions for estimating the cost of a Gator network to help guide the search.
The cost formulas developed in this section are only estimates of the actual cost that
will be incurred when performing matching with a Gator network. These estimates may
vary from actual observed times by a significant fraction. However, the estimates do have
value for the purpose of comparing the cost of different Gator networks. Rough estimates
of the cost of query plans have proven highly effective when used in query optimizers, which
consistently produce good or even optimal plans [11].
6 Related Work
The Gator network is a descendant of Rete, TREAT, and work on production system matching
that predates Rete [3, 9]. The Gator network is only useful with an optimizer or at least
a good set of heuristics for constructing a network for a particular rule or set of rules. The
feasibility of generating an optimizer for Rete networks was demonstrated by Ishida [8], and
this lends evidence that it should be possible to develop an effective Gator optimizer. Work
on conventional query optimization and query plan cost estimation is relevant [11], as is
work on extended query optimization problems such as optimizing large join queries and
considering bushy join trees [6, 7]. There is also some similarity between the problem of
optimizing the discrimination network for a collection of rules, and optimizing an execution
plan for simultaneously evaluating a set of queries [2, 10, 12]. The key difference between
work on discrimination network optimization and query optimization is that in discrimination
network optimization, update frequency is a key variable. Moreover, discrimination
network construction requires making decisions about whether or not to construct memory
nodes, and that requires considering the cost of maintaining that memory node. In query
processing there is no need to consider update frequency, and there is nothing analogous to
maintaining a memory node.
7 Summary and Conclusion
This paper has introduced a new, generalized discrimination network structure called Gator
that can be used to test rule conditions in active databases and production systems. Gator is
extremely general in that the network for a rule can vary all the way from having no memory
nodes materialized, to having a binary tree structure like a Rete network. Because Gator is
so general, it is a good target for an optimizer. We plan to do future research to develop
an optimizer for Gator along the lines described in this paper. This will serve to evaluate
whether the optimization strategy we propose here is feasible, and whether it produces good
quality discrimination networks. One indication of the quality of a discrimination network
for a rule would be that the network has lower cost than a TREAT network, an arbitrary
Rete network, and an optimized Rete network.
We are now building a simulator that will generate a synthetic collection of relations and
rules, and then build an optimized Gator network for the collection of rules. This simulator
will be used to hold a contest between Gator networks optimized using the strategy proposed
in this paper, Rete networks, TREAT networks, and Gator networks built using one or more
alternate optimization strategies or heuristics. If Gator optimization proves viable, then the
Gator network may well become a key part of future active database systems and possibly
main-memory production systems as well.
References
[1] L. Brownston, R. Farrell, E. Kant, and N. Martin. Programming Expert Systems in
OPS5: An Introduction to Rule-Based Programming. Addison Wesley, 1985.
[2] S. Finkelstein. Common expression analysis in database applications. In Proc. ACM
SIGMOD International Conference on Management of Data, pages 235-245, 1982.
[3] C. L. Forgy. Rete: A fast algorithm for the many pattern/many object pattern match
problem. Artificial Intelligence, 19:17-37, 1982.
[4] Eric N. Hanson. Rule condition testing and action execution in Ariel. In Proc. ACM
SIGMOD International Conference on Management of Data, pages 49-58, June 1992.
[5] Eric N. Hanson. Gator: A discrimination network suitable for optimizing production rule
matching. Technical Report CISTR00793, University of Florida CIS Dept., February
1993.
[6] Yiannis Ioannidis and Younkyung Cha Kang. Randomized algorithms for optimizing
large join queries. In Proc. ACM SIGMOD International Conference on Management
of Data, pages 312-321, May 1990.
[7] Yiannis Ioannidis and Younkyung Cha Kang. Left-deep vs. bushy trees: An analysis of
strategy spaces and its implications for query optimization. In Proc. ACM SIGMOD
International Conference on Management of Data, pages 168-177, May 1991.
[8] Toru Ishida. Optimizing rules in production system programs. In Proc. AAAI National
Conference on Artificial Intelligence, pages 699-704, 1988.
[9] Daniel P. Miranker. TREAT: A better match algorithm for AI production systems. In
Proc. AAAI National Conference on Artificial Intelligence, pages 42-47, August 1987.
[10] A. Rosenthal and U.S. Chakravarthy. Anatomy of a modular multiple query optimizer.
In Proc. of VLDB Conf., pages 230-239, 1988.
[11] P. Selinger et al. Access path selection in a relational database management system.
In Proceedings of the 1979 ACM SIGMOD International Conference on Management of
Data, June 1979. (reprinted in [13]).
[12] Timos Sellis. Global query optimization. ACM TODS, 13(1):23-52, 1988.
[13] Michael Stonebraker, editor. Readings in Database Systems. Morgan Kaufmann, 1988.
[14] Yuwang Wang and Eric N. Hanson. A performance comparison of the Rete and TREAT
algorithms for testing database rule conditions. In Proc. IEEE Data Eng. Conf., pages
88-97, February 1992.
[15] S. B. Yao. Approximating block accesses in database organizations. Communications
of the ACM, 20(4), 1977.
/* A function that inserts a token into any kind of node (Pnode,
alpha-memory, or beta-memory), and then initiates match processing. */
InsertPlusToken(t,Node)
{
Insert t into the collection of tokens belonging to Node.
If Node is a Pnode,
adjust the rule agenda if necessary, and return.
nextNodes = the set of nodes for which Node is an input node
(there may be more than one if subexpressions are shared
between rules).
For each element x of nextNodes {
otherChildren = the set of input nodes of x other than Node.
If every element of otherChildren has one or more tokens in it then
ProcessPlusToken(t,x,otherChildren).
}
}
Figure 10: Function to start match processing for a token t and insert t into a memory node
or Pnode.
8 Appendix: Tuple-at-a-time Matching
This appendix describes how to do match processing for tokens with a Gator network,
using a recursive, tuple-at-a-time approach similar to how Rete and TREAT are normally
implemented for main-memory production systems such as OPS5. This algorithm always
results in processing that is equivalent in time complexity to a set-oriented approach like
that discussed in section 4 that always uses nested-loop join.
When a + token t arrives at a memory node N, first the token is inserted into N. Let
MInode be a multiple input node into which the output of N is directed. There is a set SJ
of join conditions associated with MInode. Let S be the cross product of t and all the input
nodes of MInode other than N. If a tuple in S matches all conditions in SJ, then that tuple
forms a new token that is then passed on to MInode for processing.
The algorithm for inserting t into a node N and doing match processing for it if necessary
is analogous to the algorithm used to process a token entering an α-memory in the TREAT
algorithm [9]. This algorithm is described below using a tuple-at-a-time, recursive style,
similar to the way Rete and TREAT are normally implemented for main-memory production
systems such as OPS5. The tuple-at-a-time version of the algorithm is based on two
mutually recursive functions, InsertPlusToken and ProcessPlusToken. To start things off,
InsertPlusToken would be called for a token t on a "leaf" node of a Gator network (which
would be either an α-memory or a Pnode). The InsertPlusToken function is shown in Figure
10, and the ProcessPlusToken function is shown in Figure 11.
In ProcessPlusToken, if there are no children of MInode remaining to join t to, that means
t has satisfied all the conditions associated with MInode, so t is inserted into MInode. The
other case is when children remain. Whenever possible, token t is joined next to an element of
childrenRemaining that shares a join condition with at least one of the input nodes already
joined into t (if no such join condition exists, then a cross product is performed
with an arbitrary member of childrenRemaining). For each tuple/token pair that joins, the
pair is combined to form a new token and the algorithm is called recursively for this new
token. If no joining pairs are generated, processing stops.
/* A function to join token t to a remaining child of multiple
   input node MInode (if there is a child left). If there are no
   children of MInode left to join to, then t matches all conditions
   associated with MInode, so it is inserted into MInode. */
ProcessPlusToken(t, MInode, childrenRemaining)
{
    /*
    t is the token to be processed. It may be a simple token
    containing one tuple, or a compound token containing the join
    of several tuples.
    MInode is the multiple input node that is to be signaled if
    a token matches all the conditions associated with the node.
    childrenRemaining is a subset of the children of MInode such that token t
    has not yet been joined to any of those children.
    combine(token, tuple) is a function that combines a token and a tuple
    that joins with that token into a new compound token.
    */
    If childrenRemaining is empty, then
        InsertPlusToken(t, MInode)
    else {
        select a child C of MInode from childrenRemaining
        to join t to next.
        For each tuple x in C, if t joins with x, then
            ProcessPlusToken(combine(t, x), MInode, childrenRemaining − {C})
    }
}
Figure 11: Function to perform match processing for a token t that has arrived at one of the
input nodes of a multiple input node.
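The two mutually recursive functions can be tried out with the following Python sketch. It is not the implementation described in the paper: the `Node` class, the representation of a token as a tuple of dictionaries, the `natural_join` predicate, and the choice to always join with the first remaining child (rather than picking one that shares a join condition) are all simplifying assumptions for illustration.

```python
class Node:
    """A memory node or Pnode; `inputs` are its children in the network."""

    def __init__(self, name, inputs=(), joins=None, is_pnode=False):
        self.name = name
        self.inputs = list(inputs)
        self.outputs = []                 # nodes for which this node is an input
        self.tokens = []                  # stored tokens (tuples of dicts)
        self.joins = joins or (lambda t, x: True)
        self.is_pnode = is_pnode
        for child in self.inputs:
            child.outputs.append(self)


def natural_join(t, x):
    """Hypothetical join test: components must agree on shared field names."""
    merged = {}
    for component in t + x:
        for key, value in component.items():
            if merged.setdefault(key, value) != value:
                return False
    return True


def insert_plus_token(t, node):
    node.tokens.append(t)
    if node.is_pnode:
        return                            # a real system would adjust the agenda here
    for parent in node.outputs:
        others = [c for c in parent.inputs if c is not node]
        if all(c.tokens for c in others):
            process_plus_token(t, parent, others)


def process_plus_token(t, mi_node, children_remaining):
    if not children_remaining:
        insert_plus_token(t, mi_node)     # t matched every condition of mi_node
        return
    child, rest = children_remaining[0], children_remaining[1:]
    for x in list(child.tokens):
        if mi_node.joins(t, x):
            process_plus_token(t + x, mi_node, rest)   # combine(t, x)


# Made-up rule: match employees with their departments.
emp = Node("alpha_emp")
dept = Node("alpha_dept")
pnode = Node("pnode", inputs=[emp, dept], joins=natural_join, is_pnode=True)

insert_plus_token(({"dept": "toy", "floor": 1},), dept)
insert_plus_token(({"name": "ann", "dept": "toy"},), emp)
insert_plus_token(({"name": "bob", "dept": "shoe"},), emp)
```

After these three insertions the Pnode holds a single compound token, pairing ann with the toy department; bob finds no joining department, so processing for that token stops without reaching the Pnode.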
8.1 Handling − tokens
To handle − tokens, the Gator algorithm uses two functions, DeleteMinusToken and ProcessMinusToken,
which are very similar to InsertPlusToken and ProcessPlusToken, respectively.
The difference is that DeleteMinusToken removes token t from a node rather than inserting
it. Furthermore, after deleting t from the node, if t is to be propagated further in the network,
then ProcessMinusToken is called instead of ProcessPlusToken. The body of ProcessMinusToken
contains a call to DeleteMinusToken instead of InsertPlusToken. For brevity, we
do not show the complete algorithms for DeleteMinusToken and ProcessMinusToken.
