Group Title: Department of Computer and Information Science and Engineering Technical Reports
Permanent Link: http://ufdc.ufl.edu/UF00095646/00001
 Material Information
Title: Efficient implementation techniques for topological predicates on complex spatial objects : the evaluation phase
Alternate Title: Department of Computer and Information Science and Engineering Technical Report
Physical Description: Book
Language: English
Creator: Praing, Reasey
Schneider, Markus
Publisher: Department of Computer and Information Science and Engineering, University of Florida
Place of Publication: Gainesville, Fla.
Copyright Date: 2007
 Record Information
Bibliographic ID: UF00095646
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.








Efficient Implementation Techniques for Topological Predicates on Complex Spatial Objects:
The Evaluation Phase

Reasey Praing & Markus Schneider*

University of Florida
Department of Computer & Information Science & Engineering
Gainesville, FL 32611, USA
{rpraing, mschneid}@cise.ufl.edu



Abstract
Topological predicates like overlap, inside, meet, and disjoint uniquely characterize the relative position
between objects in space. They have been the subject of extensive interdisciplinary research. Spatial
database systems and geographical information systems have shown a special interest in them since they
enable the support of suitable query languages for spatial data retrieval and analysis. A look into the
literature reveals that the research efforts so far have mainly dealt with the conceptual design of and the
reasoning with these predicates while the development of efficient and robust implementation methods
for them has been largely neglected. Especially the recent design of topological predicates for different
combinations of complex spatial data types has resulted in a large increase of their numbers and stressed
the importance of their efficient implementation. The goal of this article is to develop efficient imple-
mentation techniques of topological predicates for all combinations of the complex spatial data types
point2D, line2D, and region2D within the framework of the spatial algebra SPAL2D. Our solution con-
sists of two phases. In the exploration phase described in previous work of the authors, for a given scene
of two spatial objects, all topological events are registered in so-called topological feature vectors. These
vectors serve as input for the evaluation phase which is the focus of this article and which analyzes the
topological events and determines the Boolean result of a topological predicate or the kind of topological
predicate by a formally defined method called 9-intersection matrix characterization. Besides this very
general evaluation method, the article presents an optimized method for predicate verification, called
matrix thinning, and an optimized method for predicate determination, called minimum cost decision tree.
Our evaluation methods also turn out to be able to compute dimension-refined topological predicates.

*This work was partially supported by the National Science Foundation (NSF) under grant number NSF-CAREER-IIS-0347574.








1 Introduction


In several disciplines like artificial intelligence, linguistics, robotics, and cognitive science, the study of
topological predicates has been a topic of extensive research. Topological predicates (like overlap, meet,
inside) characterize the relative position of spatial objects (like points, lines, regions). They are of purely
qualitative nature and are independent of any quantitative, metric measures like distance or direction mea-
sures. Instead, they are associated with notions like adjacency, coincidence, connectivity, inclusion, and
continuity and are preserved under affine transformations such as translation, scaling, and rotation. Spatial
database systems and geographical information systems have shown a special interest in these predicates
since they enable the support of suitable query languages for spatial data retrieval and analysis. A look into
the literature reveals that the research efforts so far have mainly dealt with the conceptual design of and the
reasoning with these predicates as well as with strategies for avoiding the necessity of their computation
at all or their repetitive computation at least. The two main conceptual approaches proposed so far are the
9-intersection model [16], which rests on point set theory and point set topology, and the RCC model [10],
which leverages spatial logic. Their main, common features are the provision of a complete set of mutually
exclusive topological predicates for each spatial data type combination and their restriction to simple spatial
objects. Both models have produced very similar results and form the basis of most publications in this field.
In this article, we are especially interested in topological predicates for complex spatial objects, as they have
been recently specified in [38] on the basis of the 9-intersection model. In addition, we also contemplate
dimension-refined topological predicates [30] which take into account the dimensions of the intersections
between two spatial objects and enable the posing of more fine-grained than purely topological queries.
In contrast to the large amount of conceptual work, implementation issues for topological predicates
have been widely neglected. Since topological predicates are expensive predicates that cannot be evaluated
in constant time, the strategy in query plans has consequently been to avoid their computation. Important
examples are spatial index structures that are employed as filtering techniques in query processing. Their aim
is to identify a hopefully small collection of candidate pairs of spatial objects that could possibly fulfil the
predicate of interest and to exclude a large collection of pairs of spatial objects that definitely cannot sat-
isfy the predicate. However, efficient implementation techniques for the topological predicates themselves
cannot be found in the literature. But such techniques are indispensable due to the transition from simple
to complex spatial objects, the consequential increase of the number of topological predicates, and the only,
unacceptable alternative of error-prone and inappropriate ad hoc implementations.
The goal of this (and a previous [39]) article is to develop and present systematic, correct, robust, and
efficient implementation strategies for topological predicates between all combinations of the three complex
spatial data types point2D, line2D, and region2D within the framework of the spatial algebra SPAL2D.
Given two objects A and B of any complex spatial data type point2D, line2D, or region2D [34], we can
pose at least two kinds of topological queries that our implementation supports: (1) "Do A and B satisfy the
topological predicate p?" and (2) "What is the topological predicate p between A and B?". Only query 1
yields a Boolean value, and we call it hence a verification query. Query 2 returns a predicate (name), and we
call it hence a determination query. For both query types, we can even ask more fine-grained queries that
include the dimension of an intersection. For example, the topological predicate meet between two region
objects can be dimensionally refined into 0-meet, 1-meet, and 01-meet respectively. This states that the two
region objects meet in a point object (0D), a line object (1D), and a point object and a line object (0D+1D)
respectively. We distinguish two phases of predicate execution: In an exploration phase described in detail
in [39], a plane sweep scans a given configuration of two spatial objects, detects all topological events (like
intersections), and records them in so-called topological feature vectors. These vectors serve as input for the
evaluation phase which analyzes these topological data and determines the Boolean result of a topological
predicate (query 1) or the kind of topological predicate (query 2). This article reviews the exploration phase,
which outputs the topological feature vectors, and puts an emphasis on sophisticated and efficient methods
for the evaluation phase, which takes the topological feature vectors as arguments. The two-phase approach
provides a direct and sound interaction and synergy between conceptual work (9-intersection model) and
implementation (algorithmic design).
The special goals of the evaluation phase are the correct and efficient interpretation and matching of
the topological feature vectors from the exploration phase with characteristic properties of the topological
predicates. For this purpose, we introduce a general method called 9-intersection matrix characterization
that can be leveraged for both predicate verification and predicate determination, that depends on the spatial
data type combination under consideration, and whose correctness is formally proved. A slight extension of
this method can then be used for the matching of dimension-refined topological predicates. A fine-tuning
of the 9-intersection matrix characterization leads to two optimized approaches called matrix thinning for
predicate verification and minimum cost decision trees for predicate determination.
Section 2 discusses related work about simple and complex spatial data types as well as available design
and implementation concepts for pure and dimension-refined topological predicates. In Section 3, we give
an overview of the overall approach and review the exploration phase and a simple evaluation method called
direct predicate characterization from [39]. Section 4 focuses on the evaluation phase and matches the
acquired topological feature vectors with characteristic properties of the topological predicates by the 9-
intersection matrix characterization. Section 5 presents the two fine-tuned and optimized approaches of
matrix thinning for predicate verification and minimum cost decision trees for predicate determination. In
Section 6, we give a brief overview of the implementation environment, present our testing strategy, and
present an assessment of our overall approach. Finally, Section 7 draws some conclusions.


2 Related Work

In this section we present related work on spatial data types as the argument types of topological predicates
(Section 2.1), give an overview of the essential conceptual models for topological relationships (Section 2.2)
and dimension-refined topological relationships (Section 2.3), and discuss implementation aspects of topo-
logical predicates (Section 2.4).

2.1 Spatial Data Types

In the spatial database and GIS community, spatial data types (e.g., [14, 24, 34]) like point, line, or region
have found wide acceptance as fundamental abstractions for modeling the structure of geometric entities,
their relationships, properties, and operations. They form the basis of a number of data models and query
languages for spatial data and have gained access into commercial software products. Topological predicates
operate on instances of these data types, called spatial objects. The literature distinguishes simple and
complex spatial data types, depending on the spatial complexity they are able to model. Simple spatial
data types only provide simple object structures like single points, continuous lines, and simple regions
(Figure 1(a)-(c)). Until recently, topological relationships have only been defined for them. However, from
an application perspective, simple geometric structures have turned out to be inadequate abstractions for
real spatial applications since they are insufficient to cope with the variety and complexity of geographic
reality. From a formal perspective, simple spatial data types are not closed under the geometric set operations
intersection, union, and difference. This means that these operations applied to two simple spatial objects as
input can produce a spatial object that is not simple. Complex spatial data types solve both problems. They
provide universal and versatile spatial objects and are closed under the geometric set operations. They allow
objects with multiple components, region components that may have holes, and line components that may
model ramified, connected networks (Figure 1(d)-(f)).

Figure 1: Examples of a simple point object (a), a simple line object (b), a simple region object (c), a
complex point object (d), a complex line object (e), and a complex region object (f).

Our formal specification of complex spatial data types given in [38] rests on point set theory and point
set topology [22] and forms the basis of our spatial data type implementation. We call this specification
the abstract model of our spatial type system SPAL2D. Only a few formal approaches [5, 24, 40] have
been developed for complex spatial objects; they all share a number of structural features with our type
specifications. In their OGC Abstract Specification [31], the OpenGIS Consortium (OGC) has proposed
some informally described geometric structures that are called simple features and that are similar to ours.
Implementations of spatial type systems are available in ESRI's Spatial Database Engine (ArcSDE) [21]
and in a few spatial extension packages of commercial database management systems that have integrated
ArcSDE functionality. Examples are the Informix Geodetic DataBlade [28], the Oracle Spatial Cartridge
[32], and DB2's Spatial Extender [11]. Descriptions of data structures for spatial data types that are able
to support the efficient implementation of topological predicates are rare. We have specified our own data
structures in [39] and call this specification the discrete model of SPAL2D. They are an extension of the
realm-based data structures of the ROSE Algebra [25] in the sense that they generalize these concepts and
accommodate realm-based and general spatial objects.

2.2 Conceptual Models for Topological Relationships

Topological relationships characterize the relative position between spatial objects. The 9-intersection model
[16], which is based on point set theory and point set topology, and the RCC model [10], which is based on
spatial logic, have proved to be the fundamental approaches to deriving these relationships. Despite rather
different foundations, both approaches come to very similar results. Our implementation strategy relies
heavily on the 9-intersection model since it enables us to create a direct link between the concepts of this
model and our implementation approach as well as to prove the correctness of our strategy.
Based on the 9-intersection model, a complete collection of mutually exclusive topological relationships
can be determined for each combination of simple and recently also complex spatial data types. The model
is based on the nine possible intersections of the boundary ($\partial A$), the interior ($A^\circ$), and the exterior ($A^-$)
[22] of a spatial object A with the corresponding components $\partial B$, $B^\circ$, and $B^-$ of another spatial object
B. Each intersection is tested with regard to the topologically invariant criterion of non-emptiness. The
topological relationship between two spatial objects A and B can be expressed by evaluating the 3 x 3
matrix in Figure 2(a). We call each matrix element a matrix predicate. A total number of $2^9 = 512$ different
configurations are possible from which only a certain subset makes sense depending on the definition and
combination of the spatial data types considered. For each combination of spatial types, this means that
each of its predicates is associated with a unique 9-intersection matrix so that all predicates are mutually
exclusive and complete with regard to the topologically invariant criteria of non-emptiness and emptiness.
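
To make the count of 512 configurations concrete, the following minimal sketch enumerates all 3 x 3 Boolean matrices. The row and column order (interior, boundary, exterior) and the names `all_matrices` and `DISJOINT` are assumptions made for illustration only; they are not taken from SPAL2D.

```python
from itertools import product

# Assumed order of rows (parts of A) and columns (parts of B):
# interior, boundary, exterior. True means "intersection is non-empty".

def all_matrices():
    """Yield every 3x3 Boolean matrix as a tuple of three row tuples."""
    for bits in product((False, True), repeat=9):
        yield tuple(tuple(bits[3 * r:3 * r + 3]) for r in range(3))

# Illustrative matrix for two disjoint regions: only the intersections
# that involve an exterior are non-empty.
DISJOINT = (
    (False, False, True),   # interior of A with interior/boundary/exterior of B
    (False, False, True),   # boundary of A with ...
    (True,  True,  True),   # exterior of A with ...
)

matrices = list(all_matrices())
print(len(matrices))  # 512
```

Only a subset of these matrices corresponds to a valid predicate for a given type combination, as stated in the text above.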
Topological relationships have been first investigated for two simple regions (disjoint, meet, overlap,
equal, inside, contains, covers, coveredBy) [8, 12, 15, 16], for two simple lines [6, 13, 17], and for a simple
line and a simple region [18]. Topological predicates involving simple points have been considered trivial.
Two restricted attempts to a definition of topological relationships on more complex spatial objects are the
TRCR (Topological Relationships for Composite Regions) model [7], which only allows sets of disjoint
simple regions without holes and does not derive topological relationships systematically from the underlying
model, and the approach in [20], which only considers topological relationships of simple regions with
holes, does not permit multi-part regions, and depends on the number of holes of the operand objects. In
[1, 37, 38] we have given a thorough, systematic, and complete specification of topological relationships for
all combinations of complex spatial data types. It is also based on the 9-intersection model and forms the
basis of our implementation. Figure 2(b) shows the increase of topological predicates for complex objects
compared to simple objects and underpins the need for efficient predicate execution techniques.

(a)
$$\begin{pmatrix}
A^\circ \cap B^\circ \neq \emptyset & A^\circ \cap \partial B \neq \emptyset & A^\circ \cap B^- \neq \emptyset \\
\partial A \cap B^\circ \neq \emptyset & \partial A \cap \partial B \neq \emptyset & \partial A \cap B^- \neq \emptyset \\
A^- \cap B^\circ \neq \emptyset & A^- \cap \partial B \neq \emptyset & A^- \cap B^- \neq \emptyset
\end{pmatrix}$$

(b)
            point2D    line2D    region2D
point2D     2/5        3/14      3/7
line2D      3/14       33/82     19/43
region2D    3/7        19/43     8/33

Figure 2: The 9-intersection matrix (a) and the numbers of topological predicates between two simple/complex
spatial objects (b).

2.3 Conceptual Models for Dimension-Refined Topological Relationships
A replacement of the topological invariant of non-emptiness and emptiness of an intersection by the topo-
logical invariant of the dimension of an intersection leads from the 9-intersection matrix (Figure 2(a)) to the
9-intersection dimension matrix (Figure 3(a)). In [30] we have explored which predicates can be derived on
the basis of the dimension matrix for all combinations of the spatial data types point, line, and region. It
turns out that these predicates are refinements of the topological predicates on spatial data types. Hence, we
denote them as dimension-refined topological predicates. Figure 3(b) shows the numbers of these predicates
for all combinations of simple and complex spatial data types respectively. A comparison with Figure 2(b)
reveals that all type combinations that include the type point cannot be dimension-refined. This is only pos-
sible if one-dimensional components of a spatial object are involved. But a point object is zero-dimensional.
Therefore, a single dimension-refined topological predicate is derived from a corresponding "pure" topolog-
ical predicate. This is different with the other type combinations that include the types line and region. The
interior of a line object and the boundary of a region object are one-dimensional structures that can interact
in several different ways. Either they are disjoint, or they only share a point object, or they only share a line
object, or they share a point object and a line object. Consequently, the dimension of the intersection is un-
defined, only zero-dimensional, only one-dimensional, or zero- and one-dimensional. This leads to different
refinements of the underlying topological predicates. In this article, we are interested in dimension-refined
predicates since our approach also enables us to evaluate them.
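
The four possible outcomes for the dimension of an intersection can be sketched as a small classification function. The function names and the generic `<dimension>-<predicate>` naming scheme are assumptions that generalize the 0-meet, 1-meet, and 01-meet example from the introduction; the actual predicate names are those defined in [30].

```python
def intersection_dimension(shares_point: bool, shares_line: bool) -> str:
    """Classify an intersection as 'undefined', '0', '1', or '01'.

    shares_point: the intersection contains a point object (0D).
    shares_line:  the intersection contains a line object (1D).
    """
    if shares_point and shares_line:
        return "01"
    if shares_line:
        return "1"
    if shares_point:
        return "0"
    return "undefined"  # empty intersection, no dimension


def refine(predicate: str, shares_point: bool, shares_line: bool) -> str:
    """Prefix a predicate name with the dimension of the intersection (assumed scheme)."""
    dim = intersection_dimension(shares_point, shares_line)
    return predicate if dim == "undefined" else f"{dim}-{predicate}"


print(refine("meet", True, False))   # 0-meet
print(refine("meet", True, True))    # 01-meet
```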
Another approach that also includes a concept of dimension in predicates is the dimension-extended
model described in [4, 8]. Our concept improves and generalizes this approach (for a comparison see [30]).


(a)
$$\begin{pmatrix}
\dim(A^\circ \cap B^\circ) & \dim(A^\circ \cap \partial B) & \dim(A^\circ \cap B^-) \\
\dim(\partial A \cap B^\circ) & \dim(\partial A \cap \partial B) & \dim(\partial A \cap B^-) \\
\dim(A^- \cap B^\circ) & \dim(A^- \cap \partial B) & \dim(A^- \cap B^-)
\end{pmatrix}$$

(b)
            point2D    line2D    region2D
point2D     2/5        3/14      3/7
line2D      3/14       61/146    43/75
region2D    3/7        43/75     16/53

Figure 3: The 9-intersection dimension matrix (a) and the numbers of dimension-refined topological predicates
between two simple/complex spatial objects (b).








2.4 Implementation Aspects of Topological Predicates


In queries, topological predicates are usually employed as filter conditions in spatial selections and spatial
joins. At least two main strategies can be distinguished for their processing. The first, optional but worth-
while strategy aims at avoiding the execution of topological predicates since they are expensive predicates
that cannot be executed in constant time. At the algebraic level, topological reasoning techniques examine
the topological relationships contained in a query for topological consistency [33]. Optimization techniques
minimize the number of computations needed [9] (see also Section 5.1) and aim at eliminating topological
relationships that are implied uniquely by composition [19]. At the physical level, methods like predicate
migration [27], predicate placement [26], disjunctive query optimization [3], and approximation-based eval-
uation [2] pursue the concept of avoiding unnecessary or repetitive predicate evaluations in query access
plans. In a filtering step on the basis of minimum bounding rectangles, spatial index structures [23] identify
those candidate pairs of spatial objects that could possibly fulfil the topological predicate of interest.
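
The filtering step can be illustrated with a minimal sketch based on minimum bounding rectangles (MBRs). A nested loop stands in for a real spatial index here, objects are reduced to lists of their vertices, and all names are hypothetical. The filter is meaningful for predicates that require a common point; pairs whose MBRs do not intersect are necessarily disjoint.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: Iterable[Point]) -> Box:
    """Minimum bounding rectangle of a non-empty collection of points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if the two closed rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(objs_a: Sequence[List[Point]],
                    objs_b: Sequence[List[Point]]) -> List[Tuple[int, int]]:
    """Index pairs whose MBRs intersect; only these need the exact predicate."""
    boxes_b = [mbr(o) for o in objs_b]
    return [(i, j)
            for i, a in enumerate(objs_a)
            for j, box_b in enumerate(boxes_b)
            if boxes_intersect(mbr(a), box_b)]


objs_a = [[(0, 0), (2, 2)], [(10, 10), (11, 11)]]
objs_b = [[(1, 1), (3, 3)], [(5, 5), (6, 6)]]
print(candidate_pairs(objs_a, objs_b))  # [(0, 0)]
```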
The second, required strategy aims at designing algorithms for the efficient execution of topological
predicates. This is the goal of this article together with a previous article in [39] that is reviewed in Sec-
tion 3. This research goes far beyond our own initial attempts in [35, 36] that extend the concept of the eight
topological predicates for two simple regions to a concept and implementation of these eight predicates for
complex regions. We are not aware of any other published research on the implementation of topological
predicates. Spatial extension packages like the ESRI ArcSDE, the Informix Geodetic DataBlade, the Oracle
Spatial Cartridge, and DB2's Spatial Extender offer limited sets of named topological predicates for spatial
objects. But their definitions are vague and their implementations unpublished. The open source JTS Topol-
ogy Suite [29] implements topological predicates through topology graphs. Such a graph stores for each
node (endpoint) and edge (segment) of a spatial object whether it is located in the interior, in the exterior, or
on the boundary of another spatial object. Computing the topology graphs and deriving the 9-intersection
matrix from them require quadratic time and quadratic space in terms of the nodes and edges of the two
operand objects; our solution requires linearithmic (loglinear) time and linear space.


3 Review: Exploration Phase and Evaluation Phase

The motivation for this article and the companion article in [39] is our design and derivation of topologi-
cal predicates ([38], Section 2.2) and dimension-refined topological relationships ([30], Section 2.3) for all
combinations of the complex spatial data types point2D, line2D, and region2D (Section 2.1) and the result-
ing open issue of their efficient, correct, and robust implementation. Our proposed solution is a two-phase
approach consisting of an exploration phase and a subsequent evaluation phase. It has been implemented
within the framework of a new, sophisticated type system for two-dimensional spatial data, called Spatial
Algebra 2D (SPAL2D), that arranges for its integration into extensible database systems. In this section, we
review the exploration phase (Section 3.1) and also an ad hoc solution for the evaluation phase (Section 3.2)
on the basis of our results in [39].

3.1 The Exploration Phase for Collecting Topological Information

The main task of the exploration phase is the detection and collecting of all relevant topological events as
they are crucial for a predicate verification and predicate determination in the evaluation phase. For this, we
have designed exploration algorithms (Figure 4). They take two general or realm-based spatial objects F of
type $\alpha$ and G of type $\beta$ with $\alpha, \beta \in \{point2D, line2D, region2D\}$ as input, discover the topological events of
this particular scene, and output the gained topological information in two topological feature vectors $v_F$ for
object F and $v_G$ for object G. The algorithms are all based on special and sophisticated instances of the plane
sweep paradigm, which is a well known concept of computational geometry. We distinguish exploration
algorithms for the type combinations point2D/point2D, point2D/line2D, point2D/region2D, line2D/line2D,
line2D/region2D, and region2D/region2D. That is, the dimension of the first argument object is always
lower than or equal to the dimension of the second argument object. Due to the rather different properties
of the six type combinations, a single exploration algorithm covering all cases is unfavorable.

Figure 4: Exploration phase. The spatial objects F of type $\alpha$ and G of type $\beta$ are the input of an
exploration algorithm, which outputs the topological feature vectors $v_F$ and $v_G$.

We refer the reader to [39] for a detailed description of the data structures of the three complex spatial
data types as well as the exploration algorithms for the six type combinations. For our purposes here, we
only need to know the following: A point2D object F is represented as a sequence of lexicographically
ordered points. Let P(F) be the set of all points of F. A line2D object F is represented as a sequence of
ordered halfsegments. A halfsegment includes segment information and emphasizes one of its end points as
the dominating point. Consequently, each segment corresponds to two halfsegments, a left halfsegment and
a right halfsegment. Let H(F) be the set of all halfsegments of F. For $f \in H(F)$, let f.s denote its segment
component and dp(f) be a function yielding the dominating point of f. Let B(F) be the set of all boundary
points of F. These are all those points of F from which exactly one segment emanates. A region2D object
F is represented as a sequence of ordered attributed halfsegments. In our case, the attribute of a halfsegment
is a flag ia for "Interior Above" indicating whether the interior of a region is above or, for vertical segments,
left of the segment. Let H(F) be the set of all attributed halfsegments of F. For $f \in H(F)$, let f.s denote
its segment component and f.ia its attribute component. Note that H(F), where F is a line2D or region2D
object, only includes (half)segments that are either disjoint from, equal to, or meeting (in an end point) the
(half)segments of G due to our special splitting strategy during plane sweeps [39].
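
The halfsegment representation of a line2D object and the boundary point set B(F) can be sketched as follows. This is an illustration under simplifying assumptions, not the data structure of [39]: coordinates are integers, halfsegments are ordered by their dominating point only (the full ordering is defined in [39]), and the names `HalfSegment`, `halfsegments`, and `boundary_points` are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple, Set, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


class HalfSegment(NamedTuple):
    s: Segment      # segment component, end points in lexicographic order
    is_left: bool   # True: the dominating point is the smaller end point


def dp(h: HalfSegment) -> Point:
    """Dominating point of a halfsegment."""
    return h.s[0] if h.is_left else h.s[1]


def halfsegments(segments: List[Segment]) -> List[HalfSegment]:
    """Two halfsegments per segment, sorted by dominating point (simplified order)."""
    hs = []
    for a, b in segments:
        s = (min(a, b), max(a, b))
        hs.append(HalfSegment(s, True))    # left halfsegment
        hs.append(HalfSegment(s, False))   # right halfsegment
    return sorted(hs, key=dp)


def boundary_points(segments: List[Segment]) -> Set[Point]:
    """End points from which exactly one segment emanates."""
    counts = Counter(p for seg in segments for p in seg)
    return {p for p, c in counts.items() if c == 1}


segs = [((1, 1), (0, 0)), ((1, 1), (2, 0))]
print([dp(h) for h in halfsegments(segs)])
print(sorted(boundary_points(segs)))
```

In this example the point (1, 1) is shared by two segments and is therefore not a boundary point.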
Topological feature vectors as the output of the exploration phase serve as input of the evaluation phase;
i.e., they represent the interface between both phases. Therefore, we focus here on their definitions since
they are later needed in this article. A topological feature vector $v_F$ of a spatial object F is a Boolean vector
consisting of a special set of "topological flags". These flags are specific for each type combination, and
their values are unique for a given spatial configuration. The flags are all initialized to false at the beginning.
Once certain topological information about a spatial argument object has been discovered, the corresponding
flag of its topological feature vector is set to true. For example, if two line2D objects share a segment, the
corresponding flag $v_F[\mathit{seg\_shared}]$ is set to true (see Definition 4(i) below). To minimize the number of
topological flags, in symmetric cases, we do not specify a corresponding flag for G. In our example, we do
not have a flag $v_G[\mathit{seg\_shared}]$. In the following, we give the definitions of the topological feature vectors
for all type combinations and explain them as needed. Their detailed description as well as a discussion of
their identification, uniqueness, and completeness can be found in [39].
For two point2D objects, we consider the cases whether both objects have a point in common (flag
poi_shared) and whether F (G) contains a point that is not part of G (F) (flag poi_disjoint in $v_F$ and $v_G$).

Definition 1 Let $F, G \in point2D$, and let $v_F$ and $v_G$ be their topological feature vectors. Then

(i) $v_F[\mathit{poi\_shared}] \;:\Leftrightarrow\; \exists f \in P(F)\ \exists g \in P(G) : f = g$
(ii) $v_F[\mathit{poi\_disjoint}] \;:\Leftrightarrow\; \exists f \in P(F)\ \forall g \in P(G) : f \neq g$
(iii) $v_G[\mathit{poi\_disjoint}] \;:\Leftrightarrow\; \exists g \in P(G)\ \forall f \in P(F) : f \neq g$

The predicates "=" and "$\neq$" check the equality and inequality of two single points respectively.
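
The three flags of Definition 1 can be transcribed directly into set operations. This sketch is not the plane sweep of [39], which scans the two ordered point sequences in parallel; it only mirrors the logical definitions. The function name and the dictionary representation of the feature vectors are assumptions.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[int, int]


def explore_point_point(F: Iterable[Point], G: Iterable[Point]):
    """Return (v_F, v_G) for two point2D objects given as point collections."""
    P_F, P_G = set(F), set(G)
    v_F: Dict[str, bool] = {
        "poi_shared": bool(P_F & P_G),    # some point of F equals some point of G
        "poi_disjoint": bool(P_F - P_G),  # some point of F differs from all points of G
    }
    v_G: Dict[str, bool] = {
        "poi_disjoint": bool(P_G - P_F),  # some point of G differs from all points of F
    }
    return v_F, v_G


v_F, v_G = explore_point_point([(0, 0), (1, 1)], [(1, 1)])
print(v_F, v_G)
```

In this example F and G share the point (1, 1), F has the additional point (0, 0), and G has no point outside F.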
In case of a point2D object F and a line2D object G, we distinguish the following cases: First, a point
f of F can be disjoint from G (flag poi_disjoint). Second, a point f can lie in the interior of a segment of G
(flag poi_on_interior). Third, a point f can be equal to a boundary point of G (flag poi_on_bound). Fourth,
G can contain a boundary point that is unequal to all points in F (flag bound_poi_disjoint).

Definition 2 Let $F \in point2D$, $G \in line2D$, and $v_F$ and $v_G$ be their topological feature vectors. Then

(i) $v_F[\mathit{poi\_disjoint}] \;:\Leftrightarrow\; \exists f \in P(F)\ \forall g \in H(G) : \neg\mathit{on}(f, g.s)$
(ii) $v_F[\mathit{poi\_on\_interior}] \;:\Leftrightarrow\; \exists f \in P(F)\ \exists g \in H(G)\ \forall b \in B(G) : \mathit{on}(f, g.s) \wedge f \neq b$
(iii) $v_F[\mathit{poi\_on\_bound}] \;:\Leftrightarrow\; \exists f \in P(F)\ \exists g \in B(G) : f = g$
(iv) $v_G[\mathit{bound\_poi\_disjoint}] \;:\Leftrightarrow\; \exists g \in B(G)\ \forall f \in P(F) : f \neq g$

The predicate on is a robust geometric predicate and checks whether a point is located on a segment.
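
As an illustration of Definition 2, the following sketch evaluates the four flags by brute force. It assumes integer coordinates so that the `on` test is exact; the robust geometric predicate used in SPAL2D is not reproduced here. The line2D object is given as a plain list of segments, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


def on(p: Point, seg: Segment) -> bool:
    """True if p lies on the closed segment (exact for integer coordinates)."""
    (ax, ay), (bx, by) = seg
    px, py = p
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return (cross == 0
            and min(ax, bx) <= px <= max(ax, bx)
            and min(ay, by) <= py <= max(ay, by))


def boundary_points(segments: List[Segment]):
    """End points from which exactly one segment emanates."""
    counts = Counter(p for seg in segments for p in seg)
    return {p for p, c in counts.items() if c == 1}


def explore_point_line(F: Iterable[Point], G: List[Segment]):
    """Return (v_F, v_G) for a point2D object F and a line2D object G."""
    P_F, B_G = set(F), boundary_points(G)
    v_F = {
        "poi_disjoint": any(all(not on(f, s) for s in G) for f in P_F),
        "poi_on_interior": any(f not in B_G and any(on(f, s) for s in G)
                               for f in P_F),
        "poi_on_bound": any(f in B_G for f in P_F),
    }
    v_G = {"bound_poi_disjoint": any(b not in P_F for b in B_G)}
    return v_F, v_G


G = [((0, 0), (2, 2)), ((2, 2), (4, 0))]
print(explore_point_line([(0, 0), (1, 1), (5, 5)], G))
```

Note that the connector point (2, 2) of G counts as an interior point, since two segments emanate from it.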
In case of a point2D object F and a region2D object G, we are interested in the topological events
whether a point of F lies either inside G (flag poi_inside), on the boundary of G (flag poi_on_bound), or
outside of G (flag poi_outside).

Definition 3 Let $F \in point2D$, $G \in region2D$, and $v_F$ and $v_G$ be their topological feature vectors. Then

(i) $v_F[\mathit{poi\_inside}] \;:\Leftrightarrow\; \exists f \in P(F) : \mathit{poiInRegion}(f, G)$
(ii) $v_F[\mathit{poi\_on\_bound}] \;:\Leftrightarrow\; \exists f \in P(F)\ \exists g \in H(G) : \mathit{on}(f, g.s)$
(iii) $v_F[\mathit{poi\_outside}] \;:\Leftrightarrow\; \exists f \in P(F)\ \forall g \in H(G) : \neg\mathit{poiInRegion}(f, G) \wedge \neg\mathit{on}(f, g.s)$

The predicate poiInRegion checks whether a point lies inside a region2D object.
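
Definition 3 can be illustrated in the same brute-force style. The sketch below replaces the plane sweep of [39] with a standard even-odd ray casting test for poiInRegion, which is a swapped-in technique and not the authors' method. It assumes integer coordinates and a region given as the list of its boundary segments; all function names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


def on(p: Point, seg: Segment) -> bool:
    """True if p lies on the closed segment (exact for integer coordinates)."""
    (ax, ay), (bx, by) = seg
    px, py = p
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return (cross == 0
            and min(ax, bx) <= px <= max(ax, bx)
            and min(ay, by) <= py <= max(ay, by))


def poi_in_region(p: Point, boundary: List[Segment]) -> bool:
    """True if p lies strictly inside the region (even-odd ray casting)."""
    if any(on(p, s) for s in boundary):
        return False
    px, py = p
    crossings = 0
    for (ax, ay), (bx, by) in boundary:
        if (ay > py) != (by > py):  # segment straddles the horizontal line through p
            x_cross = ax + Fraction((py - ay) * (bx - ax), by - ay)
            if px < x_cross:        # crossing lies to the right of p
                crossings += 1
    return crossings % 2 == 1


def explore_point_region(F: Iterable[Point], G: List[Segment]):
    """Return v_F for a point2D object F and a region2D object G."""
    P_F = set(F)
    return {
        "poi_inside": any(poi_in_region(f, G) for f in P_F),
        "poi_on_bound": any(on(f, s) for f in P_F for s in G),
        "poi_outside": any(not poi_in_region(f, G)
                           and all(not on(f, s) for s in G) for f in P_F),
    }


square = [((0, 0), (4, 0)), ((4, 0), (4, 4)), ((4, 4), (0, 4)), ((0, 4), (0, 0))]
print(explore_point_region([(2, 2), (4, 2), (6, 6)], square))
```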
For two line2D objects, we obtain the following cases: First, the interiors of two segments of F and
G can partially or completely coincide (flag seg_shared). Second, if a segment of F does not partially or
completely coincide with any segment of G, we register this in the flag seg_unshared. Third, we set the flag
interiorpoishared if two segments intersect in a single point that does not belong to the boundaries of F
or G. Fourth, a boundary endpoint of a segment of F can be located in the interior of a segment (including
connector points) of G (flag bound_oninterior). Fifth, both objects F and G can share a boundary point (flag
boundshared). Sixth, if a boundary endpoint of a segment of F lies outside of all segments of G, we set
the flag bound_disjoint. The last three cases introduce the analogous flags seg_unshared, bound_on_interior,
and bound_disjoint for G.

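Definition 4 Let $F, G \in \mathit{line2D}$, and let $v_F$ and $v_G$ be their topological feature vectors. Then

(i) $v_F[\mathit{seg\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : \mathit{segIntersect}(f.s, g.s)$
(ii) $v_F[\mathit{interior\_poi\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G)\ \forall p \in B(F) \cup B(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \neq p$
(iii) $v_F[\mathit{seg\_unshared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s)$
(iv) $v_F[\mathit{bound\_on\_interior}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G)\ \exists p \in B(F) \setminus B(G) : \mathit{poiIntersection}(f.s, g.s) = p$
(v) $v_F[\mathit{bound\_shared}] \;:\Leftrightarrow\; \exists p \in B(F)\ \exists q \in B(G) : p = q$
(vi) $v_F[\mathit{bound\_disjoint}] \;:\Leftrightarrow\; \exists p \in B(F)\ \forall g \in H(G) : \neg\mathit{on}(p, g.s)$
(vii) $v_G[\mathit{seg\_unshared}] \;:\Leftrightarrow\; \exists g \in H(G)\ \forall f \in H(F) : \neg\mathit{segIntersect}(f.s, g.s)$
(viii) $v_G[\mathit{bound\_on\_interior}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G)\ \exists p \in B(G) \setminus B(F) : \mathit{poiIntersection}(f.s, g.s) = p$
(ix) $v_G[\mathit{bound\_disjoint}] \;:\Leftrightarrow\; \exists q \in B(G)\ \forall f \in H(F) : \neg\mathit{on}(q, f.s)$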


The predicates poiIntersect and segIntersect test whether two segments intersect in a point or in a segment, respectively. The operation poiIntersection returns the intersection point of two segments.
In case of a line2D object and a region2D object, we differentiate the following cases: First, the intersection of the interiors of F and G means that a segment of F lies in G (flag seg_inside). Second and third, the interior of a segment of F intersects with a boundary segment of G if either both segments partially or fully coincide (flag seg_shared), or if they properly intersect in a single point (flag poi_shared). Fourth, the interior of a segment of F intersects with the exterior of G if the segment is disjoint from G (flag seg_outside). Fifth, a boundary point of F intersects the interior of G if the boundary point lies inside of G (flag bound_inside). Sixth, if it lies on the boundary of G, we set the flag bound_shared. Seventh, if it lies outside of G, we set the flag bound_disjoint. Eighth, if the boundary of G intersects the exterior of F, a boundary segment of G must be disjoint from F (flag seg_unshared).

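Definition 5 Let $F \in \mathit{line2D}$, $G \in \mathit{region2D}$, and $v_F$ and $v_G$ be their topological feature vectors. Then

(i) $v_F[\mathit{seg\_inside}] \;:\Leftrightarrow\; \exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s) \wedge \mathit{segInRegion}(f.s, G)$
(ii) $v_F[\mathit{seg\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : \mathit{segIntersect}(f.s, g.s)$
(iii) $v_F[\mathit{seg\_outside}] \;:\Leftrightarrow\; \exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s) \wedge \neg\mathit{segInRegion}(f.s, G)$
(iv) $v_F[\mathit{poi\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \notin B(F)$
(v) $v_F[\mathit{bound\_inside}] \;:\Leftrightarrow\; \exists f \in H(F) : \mathit{poiInRegion}(\mathit{dp}(f), G) \wedge \mathit{dp}(f) \in B(F)$
(vi) $v_F[\mathit{bound\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \in B(F)$
(vii) $v_F[\mathit{bound\_disjoint}] \;:\Leftrightarrow\; \exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{poiInRegion}(\mathit{dp}(f), G) \wedge \mathit{dp}(f) \in B(F) \wedge \neg\mathit{on}(\mathit{dp}(f), g.s)$
(viii) $v_G[\mathit{seg\_unshared}] \;:\Leftrightarrow\; \exists g \in H(G)\ \forall f \in H(F) : \neg\mathit{segIntersect}(f.s, g.s)$
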
The operation segInRegion checks whether a segment is located inside a region2D object.
Finally, we consider the possible topological events between two region2D objects. The main goal here is to determine the existing segment classes in both objects. The segment class of a segment s is a pair (m/n) of overlap numbers indicating the number of overlapping region2D objects below/above s in a given scene. In our case, $0 \le m, n \le 2$ holds. The topological feature vector for each object is therefore a segment classification vector. Each vector contains a field for the segment classes (0/1), (1/0), (0/2), (2/0), (1/2), (2/1), and (1/1). We know from [39] that the pairs (0/0) and (2/2) are invalid. Another flag bound_poi_shared indicates whether any two unequal boundary segments of both objects share a common point. The following definition establishes a connection, needed later, between representational and point set topological concepts. For a segment $s = (p, q) \in \mathit{seg2D}$, the function pts yields the infinite point set of s as $\mathit{pts}(s) = \{r \in \mathbb{R}^2 \mid r = p + \lambda (q - p),\ \lambda \in \mathbb{R},\ 0 \le \lambda \le 1\}$. Further, for $F \in \mathit{region2D}$, we define $\partial F = \bigcup_{f \in H(F)} \mathit{pts}(f.s)$, $F^\circ = \{p \in \mathbb{R}^2 \mid \mathit{poiInRegion}(p, F)\}$, and $F^- = \mathbb{R}^2 - \partial F - F^\circ$.

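Definition 6 Let $F, G \in \mathit{region2D}$ and $v_F$ be the segment classification vector of F. Then

(i) $v_F[(0/1)] \;:\Leftrightarrow\; \exists f \in H(F) : f.ia \wedge \mathit{pts}(f.s) \subset G^-$
(ii) $v_F[(1/0)] \;:\Leftrightarrow\; \exists f \in H(F) : \neg f.ia \wedge \mathit{pts}(f.s) \subset G^-$
(iii) $v_F[(1/2)] \;:\Leftrightarrow\; \exists f \in H(F) : f.ia \wedge \mathit{pts}(f.s) \subset G^\circ$
(iv) $v_F[(2/1)] \;:\Leftrightarrow\; \exists f \in H(F) : \neg f.ia \wedge \mathit{pts}(f.s) \subset G^\circ$
(v) $v_F[(0/2)] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : f.s = g.s \wedge f.ia \wedge g.ia$
(vi) $v_F[(2/0)] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : f.s = g.s \wedge \neg f.ia \wedge \neg g.ia$
(vii) $v_F[(1/1)] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : f.s = g.s \wedge ((f.ia \wedge \neg g.ia) \vee (\neg f.ia \wedge g.ia))$
(viii) $v_F[\mathit{bound\_poi\_shared}] \;:\Leftrightarrow\; \exists f \in H(F)\ \exists g \in H(G) : f.s \neq g.s \wedge \mathit{dp}(f) = \mathit{dp}(g)$

The segment classification vector $v_G$ of G includes the cases (i) to (iv) with F and G swapped; we omit the flags for the cases (v) to (viii) due to their symmetry (or equivalence) to flags of F.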








3.2 Direct Predicate Characterization as a Simple Evaluation Method


The goal of the evaluation phase is to leverage the output of the exploration phase, i.e., the topological feature vectors $v_F$ and $v_G$ of two given spatial argument objects F and G, for either verifying a given (proper or dimension-refined) topological predicate or determining such a predicate. A first method already presented in [39] is the direct predicate characterization of all n topological predicates of each type combination (see Figure 2(b) for the different type combinations and values of n). For example, for the line2D/line2D case, we have to determine which topological flags of $v_F$ and $v_G$ must be turned on and off so that a given topological predicate (verification query) or a predicate to be found (determination query) is fulfilled. The direct predicate characterization gives an answer for each individual predicate of each individual type combination. This means that we obtain 184 individual predicate characterizations without converse predicates and 248 individual predicate characterizations with converse predicates. In general, each characterization is a Boolean expression in conjunctive normal form and expressed in terms of $v_F$ and $v_G$.
As an example, we consider the topological predicate number 8 (meet) between two line2D objects F and G (Figure 5 and [38]) and see how the flags of the topological feature vectors (Definition 4) are used.

$p_8(F, G) \;:\Leftrightarrow\; \neg v_F[\mathit{seg\_shared}] \wedge \neg v_F[\mathit{interior\_poi\_shared}] \wedge v_F[\mathit{seg\_unshared}] \wedge \neg v_F[\mathit{bound\_on\_interior}] \wedge v_F[\mathit{bound\_shared}] \wedge v_F[\mathit{bound\_disjoint}] \wedge v_G[\mathit{seg\_unshared}] \wedge \neg v_G[\mathit{bound\_on\_interior}] \wedge v_G[\mathit{bound\_disjoint}]$

If we take into account the semantics of the topological feature flags, the right side of the equivalence means that both objects must share boundary parts and may share nothing else. More precisely and by considering the matrix in Figure 5, intersections between both interiors ($\neg v_F[\mathit{seg\_shared}]$, $\neg v_F[\mathit{interior\_poi\_shared}]$) as well as between the boundary of one object and the interior of the other object ($\neg v_F[\mathit{bound\_on\_interior}]$, $\neg v_G[\mathit{bound\_on\_interior}]$) are not allowed; besides intersections between both boundaries ($v_F[\mathit{bound\_shared}]$), each component of one object must interact with the exterior of the other object ($v_F[\mathit{seg\_unshared}]$, $v_G[\mathit{seg\_unshared}]$, $v_F[\mathit{bound\_disjoint}]$, $v_G[\mathit{bound\_disjoint}]$).



$\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$

Figure 5: The 9-intersection matrix number 8 for the predicate meet between two line2D objects

The predicate characterizations can be read in both directions. For predicate verification, i.e., for evaluating a specific topological predicate, we look from left to right and check the respective right side of the predicate's direct characterization. For predicate determination, i.e., for deriving the topological relationship from a given spatial configuration of two spatial objects, we have to look from right to left. That is, we consecutively evaluate the right sides of the predicate characterizations by applying them to the given topological feature vectors $v_F$ and $v_G$. For the characterization that matches, we look at its left side to obtain the name or number of the pertaining predicate.
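
The two reading directions can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a topological feature vector is a dict from flag name to Boolean value; the names `p8_meet`, `verify`, and `determine` are hypothetical and not taken from the paper or its implementation.

```python
from typing import Callable, Dict, Optional

FeatureVector = Dict[str, bool]  # assumed representation: flag name -> Boolean
Characterization = Callable[[FeatureVector, FeatureVector], bool]

def p8_meet(vF: FeatureVector, vG: FeatureVector) -> bool:
    """Right side of the direct characterization of predicate 8 (meet)."""
    return (not vF["seg_shared"] and not vF["interior_poi_shared"]
            and vF["seg_unshared"] and not vF["bound_on_interior"]
            and vF["bound_shared"] and vF["bound_disjoint"]
            and vG["seg_unshared"] and not vG["bound_on_interior"]
            and vG["bound_disjoint"])

def verify(p: Characterization, vF: FeatureVector, vG: FeatureVector) -> bool:
    """Left to right: evaluate the right side of one given predicate."""
    return p(vF, vG)

def determine(table: Dict[str, Characterization],
              vF: FeatureVector, vG: FeatureVector) -> Optional[str]:
    """Right to left: try the characterizations one after another."""
    for name, p in table.items():
        if p(vF, vG):
            return name
    return None

# Flags as they might result for two lines touching in one common endpoint.
vF = {"seg_shared": False, "interior_poi_shared": False, "seg_unshared": True,
      "bound_on_interior": False, "bound_shared": True, "bound_disjoint": True}
vG = {"seg_unshared": True, "bound_on_interior": False, "bound_disjoint": True}
result = determine({"meet (8)": p8_meet}, vF, vG)
```

A complete table for line2D/line2D would need one such function per predicate, which is exactly the dependence on the number of predicates that the next section addresses.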


4 The Evaluation Phase: 9-Intersection Matrix Characterization for Matching Topological Relationships

The motivation for this article is that the direct predicate characterization method suffers from three main shortcomings. First, this evaluation method depends on the number of topological predicates. That is, each of the 184 (248) topological predicates between complex spatial objects requires its own specification. Second, in the worst case, all direct predicate characterizations with respect to a particular type combination have to be checked for predicate determination. Third, the direct predicate characterization method is error-prone. It is difficult to ensure that each characterization is correct and unique and that all characterizations together are mutually exclusive and cover all topological predicates. From this standpoint, this solution is an ad hoc approach.

Figure 6: Predicate verification (a) and predicate determination (b) based on the 9-intersection matrix characterization method

In the following, we present a sophisticated approach, called 9-intersection matrix characterization (9IMC), that avoids these shortcomings and has a formal foundation. Besides a higher efficiency, this approach has the additional feature that its correctness can be formally proved. Section 4.1 introduces the general method. The next two subsections elaborate on a particular step of the general method that is dependent on the type combination under consideration. Section 4.3 deals with the special region2D/region2D case while Section 4.2 handles the cases of all other type combinations. Section 4.4 applies and extends our approach to the evaluation of dimension-refined topological predicates.


4.1 General Method

Instead of characterizing each topological predicate directly, the central idea of our second approach is to uniquely characterize each element of the 3 x 3-matrix of the 9-intersection model (Figure 2(a)) by means of the topological feature vectors $v_F$ and $v_G$. As we know, each matrix element is a predicate called matrix predicate that checks one of the nine intersections between the boundary $\partial F$, interior $F^\circ$, or exterior $F^-$ of a spatial object F with the boundary $\partial G$, interior $G^\circ$, or exterior $G^-$ of another spatial object G for inequality to the empty set. For each topological predicate, its specification is then given as the logical conjunction of the characterizations of the nine matrix predicates. Since the topological feature vectors are different for each type combination, the characterization of each matrix predicate is different for each type combination too. The characterizations themselves are the themes of the next subsections.
Figure 6(a) shows the general method for predicate verification. Based on the topological predicate p to be verified as well as $v_F$ and $v_G$ as input, we evaluate in a loop the characterizations of all matrix predicates numbered from left to right and from top to bottom. The ninth matrix predicate $F^- \cap G^- \neq \emptyset$ always yields true [38]; hence, we do not have to check it. After the computation (dark shaded action element) of the value of the matrix predicate i, we compare it to the corresponding value of the matrix predicate p(i) of p. If the values are equal, we proceed with the next matrix predicate i + 1. Otherwise, we stop, and p yields false. If the computed values of all matrix predicates coincide with the corresponding values of p's matrix, p yields true. The benefit of this approach is that it only requires eight predicate characterizations and that these characterizations are the same for each of the n topological predicates of the same type combination. In particular, an individual characterization of all n topological predicates is not needed here. In Section 5.1 we show that this method can be even further improved.
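
The verification loop of Figure 6(a) can be sketched in a few lines of Python. The sketch assumes that feature vectors are dicts from flag name to Boolean value and that the eight characterizations are passed in as callables in row-major matrix order; the name `verify_9imc` is hypothetical. The usage part borrows the line2D/line2D characterizations stated in Lemma 4 (Section 4.2) and the matrix entries implied by the characterization of predicate 8 (meet) in Section 3.2.

```python
from typing import Callable, Dict, Sequence

FeatureVector = Dict[str, bool]  # assumed representation: flag name -> Boolean
MatrixPredicate = Callable[[FeatureVector, FeatureVector], bool]

def verify_9imc(chars: Sequence[MatrixPredicate], p_matrix: Sequence[bool],
                vF: FeatureVector, vG: FeatureVector) -> bool:
    """Return True iff all characterized matrix predicates agree with p's matrix.

    The ninth matrix predicate is always true, so only eight entries are passed.
    """
    for char, expected in zip(chars, p_matrix):
        if char(vF, vG) != expected:
            return False  # first mismatch: p yields false
    return True

# Characterizations for line2D/line2D in row-major order (Lemma 4).
LINE_LINE = [
    lambda vF, vG: vF["seg_shared"] or vF["interior_poi_shared"],  # interior/interior
    lambda vF, vG: vG["bound_on_interior"],                        # interior/boundary
    lambda vF, vG: vF["seg_unshared"],                             # interior/exterior
    lambda vF, vG: vF["bound_on_interior"],                        # boundary/interior
    lambda vF, vG: vF["bound_shared"],                             # boundary/boundary
    lambda vF, vG: vF["bound_disjoint"],                           # boundary/exterior
    lambda vF, vG: vG["seg_unshared"],                             # exterior/interior
    lambda vF, vG: vG["bound_disjoint"],                           # exterior/boundary
]
# First eight entries of the matrix of predicate 8 (meet).
MEET = [False, False, True, False, True, True, True, True]

vF = {"seg_shared": False, "interior_poi_shared": False, "seg_unshared": True,
      "bound_on_interior": False, "bound_shared": True, "bound_disjoint": True}
vG = {"seg_unshared": True, "bound_on_interior": False, "bound_disjoint": True}
is_meet = verify_9imc(LINE_LINE, MEET, vF, vG)
```

The list `LINE_LINE` is shared by all predicates of this type combination; only the second argument changes from predicate to predicate.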
Figure 6(b) shows the general method for predicate determination. Based on $v_F$ and $v_G$ as input, we evaluate the 9IM characterizations (dark shaded action element) of all eight matrix predicates and insert the Boolean values into an intersection matrix m initialized with true for each matrix predicate. Matrix m is then compared against the matrices $p_i$ of all n topological predicates. We know that one of them must match m. The merit of this approach is that only eight characterizations are needed to determine the intersection matrix of the topological predicate. Unfortunately, we need n matrix comparisons to determine the pertaining topological predicate in the worst case. In Section 5.2 we introduce a method that eliminates this problem to a large extent. But the method here is already a significant improvement compared with the necessity to compute all n direct predicate characterizations.
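
Predicate determination as in Figure 6(b) can be sketched as follows, here for the point2D/region2D case with the characterizations stated in Lemma 3 (Section 4.2). Feature vectors are again assumed to be dicts of Booleans, and the function names are hypothetical. The table of candidate matrices is generated under the assumption that a non-empty point object sets at least one of its three flags; its order is arbitrary and does not reproduce the predicate numbering of the paper.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

FeatureVector = Dict[str, bool]  # assumed representation: flag name -> Boolean
Matrix = Tuple[bool, ...]

def matrix_point_region(vF: FeatureVector) -> Matrix:
    """Reduced 2x3 intersection matrix for point2D/region2D (Lemma 3).

    Only the first row depends on flags; the second row is constantly true.
    """
    return (vF["poi_inside"], vF["poi_on_bound"], vF["poi_outside"],
            True, True, True)

def determine(m: Matrix, matrices: List[Matrix]) -> Optional[int]:
    """Compare m with each predicate matrix; n comparisons in the worst case."""
    for i, p_i in enumerate(matrices, start=1):
        if m == p_i:
            return i  # 1-based position in the (illustrative) table
    return None

# Illustrative table: every non-zero first row combined with a true second row.
TABLE = [bits + (True, True, True)
         for bits in product((False, True), repeat=3) if any(bits)]

# Some points of F lie inside G, some outside, none on the boundary.
vF = {"poi_inside": True, "poi_on_bound": False, "poi_outside": True}
m = matrix_point_region(vF)
index = determine(m, TABLE)
```

The linear scan in `determine` is the step whose worst-case cost grows with n.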
We apply the general method and the characterizations to both proper and dimension-refined topological
predicates. However, the dimension-refined predicates require some additional characterizations.

4.2 Type Combination Dependent 9-Intersection Matrix Characterization

The last, missing step refers to the characterizations of the eight matrix predicates of the 9-intersection matrix for all spatial data type combinations. A 9IMC means that each matrix predicate, which takes abstract, infinite point sets F and G representing spatial objects as arguments, is uniquely characterized by the topological feature vectors $v_F$ and $v_G$, which are discrete implementation concepts. For this purpose, for each discrete spatial object $F \in \alpha \in \{\mathit{point2D}, \mathit{line2D}, \mathit{region2D}\}$, we determine the corresponding abstract point sets of its boundary, interior, and exterior. For the region2D data type, we have already done this for Definition 6. For $F \in \mathit{point2D}$, we define $\partial F = \emptyset$, $F^\circ = P(F)$, and $F^- = \mathbb{R}^2 - P(F)$. For $F \in \mathit{line2D}$, we define $\partial F = \{p \in \mathbb{R}^2 \mid \forall f, g \in H(F), f \neq g : p = \mathit{dp}(f) \Rightarrow p \neq \mathit{dp}(g)\}$, $F^\circ = \bigcup_{f \in H(F)} \mathit{pts}(f.s) - \partial F$, and $F^- = \mathbb{R}^2 - \partial F - F^\circ$. As we will see, each characterization can be performed in constant time, and its correctness can be shown by a simple proof. In this subsection, we present the characterizations for all type combinations except for the more complicated case of two region2D objects; this case is dealt with in the next subsection. The central idea in the proofs of the lemmas below is to accomplish a correspondence between a matrix predicate based on the point sets $\partial F$, $F^\circ$, $F^-$, $\partial G$, $G^\circ$, and $G^-$ and an equivalent expression based on finite representations like P(F), H(F), B(F), P(G), H(G), and B(G).
In case of two point2D objects, the 3 x 3-matrix is reduced to a 2 x 2-matrix since the boundary of a
point2D object is defined to be empty [38]. We obtain the following statement:

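Lemma 1 Let $F, G \in \mathit{point2D}$. Then the characterization of the matrix predicates of the (reduced) 9-intersection matrix is as follows:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_shared}]$
(ii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_disjoint}]$
(iii) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{poi\_disjoint}]$
(iv) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$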








Proof In (i), the intersection of the interiors of F and G is non-empty if, and only if, both objects share a point. That is, $\exists f \in P(F)\ \exists g \in P(G) : \mathit{equal}(f, g)$. This matches directly the definition of $v_F[\mathit{poi\_shared}]$ in Definition 1(i). In (ii), a point of F can only be part of the exterior of G if it does not belong to G. That is, $\exists f \in P(F)\ \forall g \in P(G) : \neg\mathit{equal}(f, g)$. This fits directly to the definition of $v_F[\mathit{poi\_disjoint}]$ in Definition 1(ii). Case (iii) is symmetric to (ii). Case (iv) follows from Lemma 5.1.2 in [38]. □
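
To make the correspondence in Lemma 1 concrete, the following Python sketch derives the point2D/point2D flags directly from two finite point sets and then builds the reduced matrix. The quadratic set comparison is only a stand-in for the exploration phase and is not the plane sweep algorithm of [39]; points are assumed to be tuples that can be compared exactly, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[int, int]  # assumed: exactly comparable coordinates

def explore_points(F: Iterable[Point], G: Iterable[Point]
                   ) -> Tuple[Dict[str, bool], Dict[str, bool]]:
    """Brute-force stand-in for the exploration phase of two point objects."""
    pF, pG = set(F), set(G)
    vF = {"poi_shared": bool(pF & pG),    # some point of F equals a point of G
          "poi_disjoint": bool(pF - pG)}  # some point of F equals no point of G
    vG = {"poi_disjoint": bool(pG - pF)}  # some point of G equals no point of F
    return vF, vG

def matrix_point_point(vF: Dict[str, bool], vG: Dict[str, bool]
                       ) -> Tuple[bool, bool, bool, bool]:
    """Reduced 2x2 matrix in row-major order following Lemma 1 (i) to (iv)."""
    return (vF["poi_shared"], vF["poi_disjoint"], vG["poi_disjoint"], True)

vF, vG = explore_points([(0, 0), (1, 1)], [(1, 1), (2, 2)])
m = matrix_point_point(vF, vG)  # both objects share (1, 1) and each has a private point
```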
In case of a point2D object and a line2D object, the 3 x 3-matrix is reduced to a 2 x 3-matrix since the boundary of a point2D object is defined to be empty. We obtain the following statement:

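Lemma 2 Let $F \in \mathit{point2D}$ and $G \in \mathit{line2D}$. Then the characterization of the matrix predicates of the (reduced) 9-intersection matrix is as follows:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_on\_interior}]$
(ii) $F^\circ \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_on\_bound}]$
(iii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_disjoint}]$
(iv) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
(v) $F^- \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{bound\_poi\_disjoint}]$
(vi) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
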
Proof In (i), the intersection of the interiors of F and G is non-empty if, and only if, a point of F is located on G but is not a boundary point of G. That is, $\exists f \in P(F)\ \exists g \in H(G)\ \forall b \in B(G) : \mathit{on}(f, g.s) \wedge f \neq b$. This corresponds directly to the definition of $v_F[\mathit{poi\_on\_interior}]$ in Definition 2(ii). In (ii), the intersection of the interior of F and the boundary of G is non-empty if, and only if, a point of F coincides with a boundary point of G. That is, $\exists f \in P(F)\ \exists g \in B(G) : f = g$. But this matches the definition of $v_F[\mathit{poi\_on\_bound}]$ in Definition 2(iii). Statement (iii) is satisfied if, and only if, a point of F is outside of G. That is, $\exists f \in P(F)\ \forall g \in H(G) : \neg\mathit{on}(f, g.s)$. But this is just the definition of $v_F[\mathit{poi\_disjoint}]$ in Definition 2(i). Statement (iv) always holds according to Lemma 6.1.2 in [38]. To be fulfilled, statement (v) requires that a boundary point of G lies outside of F. That is, $\exists g \in B(G)\ \forall f \in P(F) : f \neq g$. This corresponds to the definition of $v_G[\mathit{bound\_poi\_disjoint}]$ in Definition 2(iv). The last statement follows from Lemma 6.1.3 in [38]. □
In case of a point2D object and a region2D object, we also obtain a reduction of the 3 x 3-matrix to a 2 x 3-matrix. We obtain the following statement:

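Lemma 3 Let $F \in \mathit{point2D}$ and $G \in \mathit{region2D}$. Then the characterization of the matrix predicates of the (reduced) 9-intersection matrix is as follows:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_inside}]$
(ii) $F^\circ \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_on\_bound}]$
(iii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{poi\_outside}]$
(iv) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
(v) $F^- \cap \partial G \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
(vi) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
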
Proof Statement (i) requires that a point of F is located inside G but not on the boundary of G. That is, $\exists f \in P(F) : \mathit{poiInRegion}(f, G)$ (where poiInRegion is the predicate which checks whether a single point lies inside a region2D object). This corresponds directly to the definition of $v_F[\mathit{poi\_inside}]$ in Definition 3(i). In (ii), the intersection of F and the boundary of G is non-empty if, and only if, a point of F lies on one of the boundary segments of G. That is, $\exists f \in P(F)\ \exists g \in H(G) : \mathit{on}(f, g.s)$. This matches the definition of $v_F[\mathit{poi\_on\_bound}]$ in Definition 3(ii). Statement (iii) is satisfied if, and only if, a point of F is outside of G. That is, $\exists f \in P(F)\ \forall g \in H(G) : \neg\mathit{poiInRegion}(f, G) \wedge \neg\mathit{on}(f, g.s)$. This corresponds to the definition of $v_F[\mathit{poi\_outside}]$ in Definition 3(iii). Statements (iv) and (v) follow from Lemma 6.2.3 in [38]. The last statement follows from Lemma 6.2.1 in [38]. □
In case of two line2D objects, we obtain the following statement:








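Lemma 4 Let $F, G \in \mathit{line2D}$. Then the characterization of the matrix predicates of the 9-intersection matrix is as follows:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{seg\_shared}] \vee v_F[\mathit{interior\_poi\_shared}]$
(ii) $F^\circ \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{bound\_on\_interior}]$
(iii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{seg\_unshared}]$
(iv) $\partial F \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_on\_interior}]$
(v) $\partial F \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_shared}]$
(vi) $\partial F \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_disjoint}]$
(vii) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{seg\_unshared}]$
(viii) $F^- \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{bound\_disjoint}]$
(ix) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
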
Proof In (i), the interiors of two line2D objects intersect if, and only if, any two segments partially or completely coincide or if two segments share a single point that does not belong to the boundaries of F and G. That is, $\exists f \in H(F)\ \exists g \in H(G) : \mathit{segIntersect}(f.s, g.s) \;\vee\; \exists f \in H(F)\ \exists g \in H(G)\ \forall p \in B(F) \cup B(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \neq p$. The first expression corresponds to the definition of $v_F[\mathit{seg\_shared}]$ in Definition 4(i). The second expression is the definition of $v_F[\mathit{interior\_poi\_shared}]$ in Definition 4(ii). Statement (ii) requires that an intersection point p of F and G exists such that p is a boundary point of G but not a boundary point of F. That is, $\exists f \in H(F)\ \exists g \in H(G)\ \exists p \in B(G) \setminus B(F) : \mathit{poiIntersection}(f.s, g.s) = p$. This matches the definition of $v_G[\mathit{bound\_on\_interior}]$ in Definition 4(viii). Statement (iii) is satisfied if, and only if, there is a segment of F that is outside of G. That is, $\exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s)$. This corresponds to the definition of $v_F[\mathit{seg\_unshared}]$ in Definition 4(iii). Statement (iv) is symmetric to statement (ii) and based on Definition 4(iv). In (v), the boundaries of F and G intersect if, and only if, they share a boundary point. That is, $\exists p \in B(F)\ \exists q \in B(G) : p = q$. This matches the definition of $v_F[\mathit{bound\_shared}]$ in Definition 4(v). Statement (vi) requires the existence of a boundary point of F that is not located on any segment of G. That is, $\exists p \in B(F)\ \forall g \in H(G) : \neg\mathit{on}(p, g.s)$. This corresponds to the definition of $v_F[\mathit{bound\_disjoint}]$ in Definition 4(vi). Statement (vii) is symmetric to statement (iii) and based on Definition 4(vii). Statement (viii) is symmetric to statement (vi) and based on Definition 4(ix). The last statement follows from Lemma 5.2.1 in [38]. □
In case of a line2D object and a region2D object, we obtain the following statement:

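Lemma 5 Let $F \in \mathit{line2D}$ and $G \in \mathit{region2D}$. Then the characterization of the matrix predicates of the 9-intersection matrix is as follows:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{seg\_inside}]$
(ii) $F^\circ \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{seg\_shared}] \vee v_F[\mathit{poi\_shared}]$
(iii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{seg\_outside}]$
(iv) $\partial F \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_inside}]$
(v) $\partial F \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_shared}]$
(vi) $\partial F \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[\mathit{bound\_disjoint}]$
(vii) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
(viii) $F^- \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[\mathit{seg\_unshared}]$
(ix) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$
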
Proof In (i), the interiors of F and G intersect if, and only if, a segment of F is located in G but does not coincide with a boundary segment of G. That is, $\exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s) \wedge \mathit{segInRegion}(f.s, G)$. This corresponds to the definition of $v_F[\mathit{seg\_inside}]$ in Definition 5(i). Statement (ii) requires that either F and G share a segment, or they share an intersection point that is not a boundary point of F. That is, $\exists f \in H(F)\ \exists g \in H(G) : \mathit{segIntersect}(f.s, g.s) \;\vee\; \exists f \in H(F)\ \exists g \in H(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \notin B(F)$. The first argument of the disjunction matches the definition of $v_F[\mathit{seg\_shared}]$ in Definition 5(ii). The second argument matches the definition of $v_F[\mathit{poi\_shared}]$ in Definition 5(iv). Statement (iii) is satisfied if, and only if, a segment of F is located outside of G. That is, $\exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{segIntersect}(f.s, g.s) \wedge \neg\mathit{segInRegion}(f.s, G)$. This corresponds to the definition of $v_F[\mathit{seg\_outside}]$ in Definition 5(iii). Statement (iv) holds if, and only if, a segment of F lies inside G and one of the end points of the segment is a boundary point. That is, $\exists f \in H(F) : \mathit{poiInRegion}(\mathit{dp}(f), G) \wedge \mathit{dp}(f) \in B(F)$. This corresponds to the definition of $v_F[\mathit{bound\_inside}]$ in Definition 5(v). In (v), we must find a segment of F and a segment of G which intersect in a point that is a boundary point of F. That is, $\exists f \in H(F)\ \exists g \in H(G) : \mathit{poiIntersect}(f.s, g.s) \wedge \mathit{poiIntersection}(f.s, g.s) \in B(F)$. This matches the definition of $v_F[\mathit{bound\_shared}]$ in Definition 5(vi). Statement (vi) requires the existence of an endpoint of a segment of F that is a boundary point and not located inside or on any segment of G. That is, $\exists f \in H(F)\ \forall g \in H(G) : \neg\mathit{poiInRegion}(\mathit{dp}(f), G) \wedge \mathit{dp}(f) \in B(F) \wedge \neg\mathit{on}(\mathit{dp}(f), g.s)$. This corresponds to the definition of $v_F[\mathit{bound\_disjoint}]$ in Definition 5(vii). Statement (vii) always holds according to Lemma 6.3.2 in [38]. Statement (viii) is satisfied if, and only if, a segment of G does not coincide with any segment of F. That is, $\exists g \in H(G)\ \forall f \in H(F) : \neg\mathit{segIntersect}(f.s, g.s)$. This fits to the definition of $v_G[\mathit{seg\_unshared}]$ in Definition 5(viii). The last statement follows from Lemma 6.3.1 in [38]. □

4.3 9-Intersection Matrix Characterization of the region2D/region2D Case
As indicated in [39], the exploration algorithm for the region2D/region2D case is quite different from the
algorithms of the other type combinations. It has to take into account the areal extent of both objects and
has resulted in the concepts of overlap number, segment classes, and segment classification vector. In this
subsection, we deal with the 9IMC based on two segment classification vectors. The goal of the following
lemmas is to prepare the unique characterization of all matrix predicates by means of segment classes. The
first lemma provides a translation of each segment class into a matrix predicate expression.

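Lemma 6 Let $F, G \in \mathit{region2D}$ and $v_F$ and $v_G$ be their segment classification vectors. Then we can infer the following implications and equivalences between segment classes and matrix predicates:

(i) $v_F[(0/1)] \vee v_F[(1/0)] \;\Leftrightarrow\; \partial F \cap G^- \neq \emptyset$
(ii) $v_G[(0/1)] \vee v_G[(1/0)] \;\Leftrightarrow\; F^- \cap \partial G \neq \emptyset$
(iii) $v_F[(1/2)] \vee v_F[(2/1)] \;\Leftrightarrow\; \partial F \cap G^\circ \neq \emptyset$
(iv) $v_G[(1/2)] \vee v_G[(2/1)] \;\Leftrightarrow\; F^\circ \cap \partial G \neq \emptyset$
(v) $v_F[(0/2)] \vee v_F[(2/0)] \;\Rightarrow\; \partial F \cap \partial G \neq \emptyset \wedge F^\circ \cap G^\circ \neq \emptyset$
(vi) $v_F[(1/1)] \;\Rightarrow\; \partial F \cap \partial G \neq \emptyset \wedge F^\circ \cap G^- \neq \emptyset \wedge F^- \cap G^\circ \neq \emptyset$
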
Proof According to Definition 6(i) and (ii), the left side of (i) is equivalent to the expression $\exists f \in H(F) : \mathit{pts}(f.s) \subset G^-$. This is equivalent to $\partial F \cap G^- \neq \emptyset$. The proof of (iii) is similar and based on Definition 6(iii) and (iv); only the term $G^-$ has to be replaced by $G^\circ$. The proof of (ii) can be obtained by swapping the roles of F and G in (i). Similarly, the proof of (iv) requires a swapping of F and G in (iii). According to Definition 6(v) and (vi), the left side of (v) is equivalent to the expression $\exists f \in H(F)\ \exists g \in H(G) : f.s = g.s \wedge ((f.ia \wedge g.ia) \vee (\neg f.ia \wedge \neg g.ia))$. From the first element of the conjunction, we can (only) conclude that $\partial F \cap \partial G \neq \emptyset$. Equivalence does not hold since two boundaries can also intersect if they only share single intersection or meeting points but not (half)segments. The second element of the conjunction requires that the interiors of both region2D objects are located on the same side. Hence, $F^\circ \cap G^\circ \neq \emptyset$ must hold. This, too, is only an implication since an intersection of both interiors is possible without having any (0/2)- or (2/0)-segments. According to Definition 6(vii), the left side of (vi) is equivalent to the expression $\exists f \in H(F)\ \exists g \in H(G) : f.s = g.s \wedge ((f.ia \wedge \neg g.ia) \vee (\neg f.ia \wedge g.ia))$. The first element of the conjunction implies that $\partial F \cap \partial G \neq \emptyset$. The second element of the conjunction requires that the interiors of both region2D objects are located on different sides. Since the definition of type region2D disallows (1/1)-segments for single objects, the interior of F must intersect the exterior of G, and vice versa. This is only an implication since an intersection of the interior of one region2D object with the exterior of another region2D object is possible without having (1/1)-segments. □
The second lemma provides a translation of some matrix predicates into segment classes.

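Lemma 7 Let $F, G \in \mathit{region2D}$ and $v_F$ and $v_G$ be their segment classification vectors. Then we can infer the following implications between matrix predicates and segment classes:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Rightarrow\; v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/2)] \vee v_F[(2/1)] \vee v_G[(1/2)] \vee v_G[(2/1)]$
(ii) $F^\circ \cap G^- \neq \emptyset \;\Rightarrow\; v_F[(0/1)] \vee v_F[(1/0)] \vee v_F[(1/1)] \vee v_G[(1/2)] \vee v_G[(2/1)]$
(iii) $F^- \cap G^\circ \neq \emptyset \;\Rightarrow\; v_F[(1/2)] \vee v_F[(2/1)] \vee v_F[(1/1)] \vee v_G[(0/1)] \vee v_G[(1/0)]$
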


Proof In (i), the intersection of the interiors of F and G implies that both objects share a common area. Consequently, this area must have overlap number 2 so that at least one of the two objects must have a (a/2)- or (2/a)-segment with $a \in \{0, 1\}$. In (ii), the fact that $F^\circ$ intersects $G^-$ means that F contains an area which it does not share with G. That is, the overlap number of this area is 1, and F must have a (a/1)- or a (1/a)-segment with $a \in \{0, 1\}$. The fact that a part of the interior of F is located outside of G implies two possible topological situations for G: either both objects share a common segment and their interiors are on different sides, i.e., G has a (1/1)-segment (covered by $v_F[(1/1)]$), or the interior of F is intersected by the boundary and the interior of G so that G has a (1/2)- or (2/1)-segment. We prove (iii) by swapping F and G in (ii). □
The third lemma states some implications between matrix predicates.

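Lemma 8 Let $F, G \in \mathit{region2D}$. Then we can infer the following implications between matrix predicates:

(i) $v_F[\mathit{bound\_poi\_shared}] \;\Rightarrow\; \partial F \cap \partial G \neq \emptyset$
(ii) $\partial F \cap G^- \neq \emptyset \;\Rightarrow\; F^\circ \cap G^- \neq \emptyset \wedge F^- \cap G^- \neq \emptyset$
(iii) $F^- \cap \partial G \neq \emptyset \;\Rightarrow\; F^- \cap G^\circ \neq \emptyset \wedge F^- \cap G^- \neq \emptyset$
(iv) $\partial F \cap G^\circ \neq \emptyset \;\Rightarrow\; F^\circ \cap G^\circ \neq \emptyset \wedge F^- \cap G^\circ \neq \emptyset$
(v) $F^\circ \cap \partial G \neq \emptyset \;\Rightarrow\; F^\circ \cap G^\circ \neq \emptyset \wedge F^\circ \cap G^- \neq \emptyset$
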


Proof Statement (i) can be shown by considering the definition of bound_poi_shared. This flag is true if any two halfsegments of F and G share a single meeting or intersection point. Hence, the intersection of both boundaries is non-empty. The proofs for (ii) to (v) require point set topological concepts. Statements (ii) and (iii) follow from Lemma 5.3.6 in [38]. Statements (iv) and (v) result from Lemma 5.3.5 in [38]. □
The following theorem collects the results we have already obtained so far and proves the lacking parts
of the nine matrix predicate characterizations.

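Theorem 1 Let $F, G \in \mathit{region2D}$ and $v_F$ and $v_G$ be their segment classification vectors. Then the matrix predicates of the 9-intersection matrix are equivalent to the following segment class characterizations:

(i) $F^\circ \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/2)] \vee v_F[(2/1)] \vee v_G[(1/2)] \vee v_G[(2/1)]$
(ii) $F^\circ \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[(1/2)] \vee v_G[(2/1)]$
(iii) $F^\circ \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[(0/1)] \vee v_F[(1/0)] \vee v_F[(1/1)] \vee v_G[(1/2)] \vee v_G[(2/1)]$
(iv) $\partial F \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[(1/2)] \vee v_F[(2/1)]$
(v) $\partial F \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/1)] \vee v_F[\mathit{bound\_poi\_shared}]$
(vi) $\partial F \cap G^- \neq \emptyset \;\Leftrightarrow\; v_F[(0/1)] \vee v_F[(1/0)]$
(vii) $F^- \cap G^\circ \neq \emptyset \;\Leftrightarrow\; v_F[(1/2)] \vee v_F[(2/1)] \vee v_F[(1/1)] \vee v_G[(0/1)] \vee v_G[(1/0)]$
(viii) $F^- \cap \partial G \neq \emptyset \;\Leftrightarrow\; v_G[(0/1)] \vee v_G[(1/0)]$
(ix) $F^- \cap G^- \neq \emptyset \;\Leftrightarrow\; \mathit{true}$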

Proof For (i), the forward implication corresponds to Lemma 7(i). The backward implication can be derived from Lemma 6(v) for (0/2)- and (2/0)-segments of F (and G). For (1/2)- and (2/1)-segments, Lemma 6(iii) and 6(iv) imply $\partial F \cap G^\circ \neq \emptyset$ and $F^\circ \cap \partial G \neq \emptyset$, respectively. From these two implications, by using Lemma 8(iv) and 8(v), we can derive in both cases $F^\circ \cap G^\circ \neq \emptyset$. Statements (ii) and (iv) correspond to Lemma 6(iv) and 6(iii), respectively. For (iii) [(vii)], the forward implication corresponds to Lemma 7(ii) [7(iii)]. The backward implication for (iii) [(vii)] requires Lemma 6(i) [6(ii)] and Lemma 8(ii) [8(iii)] for the (0/1)- and (1/0)-segments of F [G], Lemma 6(vi) [6(vi)] for the (1/1)-segments of F (and hence G), as well as Lemma 6(iv) [6(iii)] and Lemma 8(v) [8(iv)] for the (1/2)- and (2/1)-segments of G [F]. For (v), the forward implication can be shown as follows: if the boundaries of F and G intersect, then either they share a common meeting or intersection point, i.e., the flag $v_F[\mathit{bound\_poi\_shared}]$ is set, or there are two halfsegments of F and G whose segment components are equal. No other alternative is possible due to our splitting strategy for halfsegments during the plane sweep. As we know, equal segments of F and G must have the segment classes (0/2), (2/0), or (1/1). The backward implication requires Lemma 6(v) for (0/2)- and (2/0)-segments of F (and hence G), Lemma 6(vi) for (1/1)-segments of F (and hence G), and Lemma 8(i) for single meeting and intersection points. Statement (vi) [(viii)] corresponds to Lemma 6(i) [6(ii)]. Statement (ix) turns out to be always true since our assumption in an implementation is that our universe of discourse U is always properly larger than the union of spatial objects contained in it. This means for F and G that always $F \cup G \subset U$ holds. We can conclude that $U - (F \cup G) \neq \emptyset$. According to De Morgan's laws, this is equivalent to $(U - F) \cap (U - G) \neq \emptyset$. But this leads us to the statement that $F^- \cap G^- \neq \emptyset$. □
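
The characterizations of Theorem 1 translate almost literally into code. The following Python sketch assumes, for illustration only, that a segment classification vector is represented as the set of segment class labels that are set (for example "0/1"), with "bound_poi_shared" as an additional label in the vector of F; the function name `matrix_region_region` is hypothetical.

```python
from typing import Set, Tuple

def matrix_region_region(vF: Set[str], vG: Set[str]) -> Tuple[bool, ...]:
    """9-intersection matrix in row-major order following Theorem 1 (i) to (ix)."""
    def f(*classes: str) -> bool:
        # True if F has at least one of the given segment classes or flags.
        return any(c in vF for c in classes)

    def g(*classes: str) -> bool:
        # True if G has at least one of the given segment classes.
        return any(c in vG for c in classes)

    return (
        f("0/2", "2/0", "1/2", "2/1") or g("1/2", "2/1"),  # (i)    interior/interior
        g("1/2", "2/1"),                                   # (ii)   interior/boundary
        f("0/1", "1/0", "1/1") or g("1/2", "2/1"),         # (iii)  interior/exterior
        f("1/2", "2/1"),                                   # (iv)   boundary/interior
        f("0/2", "2/0", "1/1", "bound_poi_shared"),        # (v)    boundary/boundary
        f("0/1", "1/0"),                                   # (vi)   boundary/exterior
        f("1/2", "2/1", "1/1") or g("0/1", "1/0"),         # (vii)  exterior/interior
        g("0/1", "1/0"),                                   # (viii) exterior/boundary
        True,                                              # (ix)   exterior/exterior
    )

# Two disjoint regions: every segment of one object lies in the exterior of the other.
disjoint = matrix_region_region({"0/1", "1/0"}, {"0/1", "1/0"})
# F strictly inside G: segments of F lie in the interior of G, segments of G outside F.
inside = matrix_region_region({"1/2", "2/1"}, {"0/1", "1/0"})
```

Each entry is a fixed disjunction over a constant number of flags, which reflects the constant evaluation time per matrix predicate.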
Summarizing our results from the last two subsections, we see that Lemmas 1, 2, 3, 4, 5, and Theorem 1 provide us with a unique characterization of each individual matrix predicate of the 9-intersection matrix for each type combination. This approach has several benefits. First, it is a systematically developed and not an ad hoc approach. Second, it has a formal and sound foundation. Hence, we can be sure about the correctness of topological flags and segment classes assigned to matrix predicates, and vice versa. Third, this evaluation method is independent of the number of topological predicates and only requires a constant number of evaluations for matrix predicate characterizations. Instead of nine, only eight matrix predicates have to be checked since the predicate $F^- \cap G^- \neq \emptyset$ yields true for all type combinations. Fourth, we have proved the correctness of our provided implementation.
Based on this result, we accomplish the predicate verification of a topological predicate p with respect
to a particular spatial data type combination on the basis of p's 9-intersection matrix (see the complete
matrices in Figures 7 to 9 and 11 to 13) and the topological feature vectors vF and vG as follows: Depending
on the spatial data type combination, we evaluate the logical expression (given in terms of vF and vG) on
the right side of the first 9IMC according to Lemma 1, 2, 3, 4, 5, or Theorem 1, respectively. We then
match the Boolean result with the Boolean value at the respective position in p's intersection matrix. If both
Boolean values are equal, we proceed with the next matrix predicate in the 9-intersection matrix; otherwise
p is false, and the algorithm terminates. Predicate p yields true if the Boolean results of the evaluated logical
expressions of all 9IMCs coincide with the corresponding Boolean values in p's intersection matrix. This
requires constant time.
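
The verification loop described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of this report: the name `verify_predicate`, the representation of the feature vectors as a dictionary of Boolean flags, and the toy characterization functions are hypothetical. In a real setting, each characterization would be the logical expression given by Lemma 1, 2, 3, 4, 5, or Theorem 1.

```python
from typing import Callable, Dict, Sequence

# Assumption: the pair of topological feature vectors is modelled as a
# dictionary of Boolean flags, and every matrix predicate characterization
# as a function that maps these flags to a Boolean value.
Flags = Dict[str, bool]
Characterization = Callable[[Flags], bool]


def verify_predicate(matrix: Sequence[int],
                     characterizations: Sequence[Characterization],
                     flags: Flags) -> bool:
    """Return True iff all evaluated characterizations agree with `matrix`.

    `matrix` holds the 0/1 entries of p's 9-intersection matrix, read row by
    row; `characterizations[i]` evaluates the matrix predicate at position i.
    """
    for expected, evaluate in zip(matrix, characterizations):
        if evaluate(flags) != bool(expected):
            return False  # first mismatch: p is false, terminate early
    return True


# Toy setup: characterization i simply reads a precomputed flag "m<i>".
toy_characterizations = [lambda f, i=i: f[f"m{i}"] for i in range(9)]
toy_flags = {f"m{i}": bool(b) for i, b in enumerate([0, 0, 1, 0, 0, 1, 1, 1, 1])}
holds = verify_predicate([0, 0, 1, 0, 0, 1, 1, 1, 1], toy_characterizations, toy_flags)
```

Since at most nine characterizations are evaluated, the loop runs in constant time regardless of how many topological predicates exist for the type combination.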
Predicate determination also depends on a particular spatial data type combination and leverages
9-intersection matrices and topological feature vectors. In a first step, depending on the spatial data type
combination and by means of $v_F$ and $v_G$, we evaluate the logical expressions on all right sides of the 9IMCs
according to Lemma 1, 2, 3, 4, 5, or Theorem 1, respectively. This yields a Boolean 9-intersection matrix. In
a second step, this Boolean matrix is checked consecutively for equality against all 9-intersection matrices of
the topological predicates of the particular type combination. If $n_{\alpha,\beta}$ with
$\alpha, \beta \in \{\mathit{point2D}, \mathit{line2D}, \mathit{region2D}\}$ is the number of topological predicates
between the types $\alpha$ and $\beta$, this requires $n_{\alpha,\beta}$ tests in the worst case.
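
A minimal Python sketch of the second step of predicate determination is given below. The names `determine_predicate` and `TOY_MATRICES` are assumptions made for this example. The three matrices are the well-known 9-intersection matrices of disjoint, meet, and overlap for two regions, read row by row (interior, boundary, exterior); they stand in for the full collection of matrices of a type combination.

```python
from typing import Dict, Optional, Sequence, Tuple

Matrix = Tuple[int, ...]

# Illustrative subset only: three region/region matrices, read row by row.
TOY_MATRICES: Dict[str, Matrix] = {
    "disjoint": (0, 0, 1, 0, 0, 1, 1, 1, 1),
    "meet":     (0, 0, 1, 0, 1, 1, 1, 1, 1),
    "overlap":  (1, 1, 1, 1, 1, 1, 1, 1, 1),
}


def determine_predicate(evaluated: Sequence[bool],
                        matrices: Dict[str, Matrix]) -> Optional[str]:
    """Match the evaluated Boolean matrix against all candidate matrices.

    In the worst case, every candidate is tested once.
    """
    key = tuple(int(b) for b in evaluated)
    for name, matrix in matrices.items():
        if matrix == key:
            return name
    return None  # no predicate of this collection matches


result = determine_predicate(
    [False, False, True, False, True, True, True, True, True], TOY_MATRICES)
```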

4.4 Evaluation of Dimension-Refined Topological Predicates

The 9IMC described so far is applicable to the evaluation of proper topological predicates. In this subsection,
we show that, with a few extensions, the 9IMC can also evaluate dimension-refined topological predicates
(Section 2.3) for which the dimension of an intersection of two spatial objects plays an essential role. That
is, the exploration algorithms provide us with enough topological information to perform this task. In
this sense, dimension-refined topological predicates are a refinement of proper topological predicates. For
example, in a meeting situation, we could be interested in a query whether two line2D objects meet in a
zero-dimensional object (i.e., a point2D object), or in a one-dimensional object (i.e., a line2D object), or
in both. Dimension refinement is only possible for the type combinations line2D/line2D, line2D/region2D,
and region2D/region2D. We distinguish the different dimensions of the maximal connected components of
a given point set in the two-dimensional space and define a "dimension type" as follows [30]:

$\mathit{dimType} = \{\bot, 0D, 1D, 2D, 01D, 02D, 12D, 012D\}$

This type allows us to represent the dimension of a given point set as the "union" of the dimensions
of its maximal connected components. If a point set consists only of single points, only of lines, or only
of regions, the corresponding value of type dimType is 0D, 1D, or 2D, respectively. If a point set contains
maximal connected components of different dimensions, the corresponding value is 01D, 02D, 12D, or
012D, respectively. The $\bot$ symbol is used to represent the undefined dimension of an empty point set.
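
The "union" of component dimensions can be illustrated with a short Python sketch. The function name `dim_type` and the encoding of the component dimensions as a list of integers are assumptions made for this example.

```python
from typing import Iterable

UNDEFINED = "⊥"  # undefined dimension of an empty point set


def dim_type(component_dims: Iterable[int]) -> str:
    """Combine the dimensions (0, 1, or 2) of the maximal connected
    components of a point set into a single dimension-type value."""
    present = sorted(set(component_dims))
    if any(d not in (0, 1, 2) for d in present):
        raise ValueError("component dimensions must be 0, 1, or 2")
    if not present:
        return UNDEFINED
    return "".join(str(d) for d in present) + "D"


mixed = dim_type([1, 0, 0, 1])  # points and lines together
```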
In [30], we have shown that the dimension of the intersection of a zero-, one-, or two-dimensional spatial
component with another zero-, one-, or two-dimensional spatial component can only have the value $\bot$, 0D,
1D, 2D, or 01D. Further, only in case of two one-dimensional components, their intersection may lead to
components of different dimensions. This refers exclusively to the intersection between the interiors of two
line2D objects, between the interior of a line2D object and the boundary of a region2D object, or between
the boundaries of two region2D objects. The possible dimension values for these three cases are $\bot$, 0D, 1D,
or 01D. They are evaluated or determined by our extended 9IMC method described now.
Let $F, G \in \alpha \in \{\mathit{line2D}, \mathit{region2D}\}$, $p$ be a topological predicate between F and G,
and $p_d$ be a dimension-refined predicate that is a refinement of $p$. We leverage the following relationship
between $p$ and $p_d$:

$(p_d(F, G) \Rightarrow p(F, G)) \Leftrightarrow (\neg p(F, G) \Rightarrow \neg p_d(F, G))$

For the predicate verification of $p_d$, we first apply the 9IMC for the predicate verification of $p$. From
the right side of the equivalence, we conclude that, if $p(F, G)$ yields false, $p_d(F, G)$ must yield false too.
Otherwise, if $p(F, G)$ yields true, we cannot make a statement about $p_d$ because the satisfaction of $p$ is only
a necessary but not sufficient condition for the satisfaction of $p_d$. In this case, we need additional
dimension characterizations that are given in Table 1. For each type combination, the table shows the relevant
intersection between the one-dimensional components of these types, the possible dimensions such an
intersection can have, and for each dimension its characterization on the basis of the topological feature vectors
of this type combination. Since the dimension characterizations are different for each type combination,
they are unique. Therefore, we only have to look up the type combination and the dimension of interest,
evaluate the corresponding dimension characterization, and obtain its Boolean value as the result for the
dimension-refined predicate.








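Type combination   Intersection                                  Dimension   Dimension characterization
line × line        $F^\circ \cap G^\circ = \emptyset$            $\bot$      $\neg v_F[\mathit{seg\_shared}] \wedge \neg v_F[\mathit{interior\_poi\_shared}]$
                   $F^\circ \cap G^\circ \neq \emptyset$         0D          $\neg v_F[\mathit{seg\_shared}] \wedge v_F[\mathit{interior\_poi\_shared}]$
                                                                 01D         $v_F[\mathit{seg\_shared}] \wedge v_F[\mathit{interior\_poi\_shared}]$
                                                                 1D          $v_F[\mathit{seg\_shared}] \wedge \neg v_F[\mathit{interior\_poi\_shared}]$
line × region      $F^\circ \cap \partial G = \emptyset$         $\bot$      $\neg v_F[\mathit{seg\_shared}] \wedge \neg v_F[\mathit{poi\_shared}]$
                   $F^\circ \cap \partial G \neq \emptyset$      0D          $\neg v_F[\mathit{seg\_shared}] \wedge v_F[\mathit{poi\_shared}]$
                                                                 01D         $v_F[\mathit{seg\_shared}] \wedge v_F[\mathit{poi\_shared}]$
                                                                 1D          $v_F[\mathit{seg\_shared}] \wedge \neg v_F[\mathit{poi\_shared}]$
region × region    $\partial F \cap \partial G = \emptyset$      $\bot$      $\neg(v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/1)] \vee v_F[\mathit{bound\_poi\_shared}])$
                   $\partial F \cap \partial G \neq \emptyset$   0D          $\neg(v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/1)]) \wedge v_F[\mathit{bound\_poi\_shared}]$
                                                                 01D         $(v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/1)]) \wedge v_F[\mathit{bound\_poi\_shared}]$
                                                                 1D          $(v_F[(0/2)] \vee v_F[(2/0)] \vee v_F[(1/1)]) \wedge \neg v_F[\mathit{bound\_poi\_shared}]$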

Table 1: Dimension characterizations for dimension-refined topological predicates
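
The lookup in Table 1 amounts to a two-flag case distinction. The following Python sketch shows one possible reading of it; the function names and the dictionary keys used for the region/region flags are assumptions for illustration, and the sketch presumes that exactly one of the four characterizations of a type combination holds.

```python
from typing import Dict

UNDEFINED = "⊥"  # dimension of an empty intersection


def intersection_dimension(shares_segment: bool, shares_point: bool) -> str:
    """Dimension of the intersection of two one-dimensional components,
    derived from a 'shared segment' flag and a 'shared point' flag."""
    if shares_segment:
        return "01D" if shares_point else "1D"
    return "0D" if shares_point else UNDEFINED


def region_region_dimension(flags: Dict[str, bool]) -> str:
    """Region/region case: a shared boundary segment is indicated by any of
    the segment classes (0/2), (2/0), or (1/1)."""
    shares_segment = flags["(0/2)"] or flags["(2/0)"] or flags["(1/1)"]
    return intersection_dimension(shares_segment, flags["bound_poi_shared"])


def verify_refined(p_holds: bool, dimension: str, wanted: str) -> bool:
    """A dimension-refined predicate holds only if p holds and the
    intersection has the requested dimension."""
    return p_holds and dimension == wanted


dim = region_region_dimension(
    {"(0/2)": False, "(2/0)": True, "(1/1)": False, "bound_poi_shared": True})
```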

For the predicate determination of the matching dimension-refined topological predicate $p_d$ between
F and G, we first apply the 9IMC for the predicate determination of the matching topological predicate $p$
between F and G. Afterwards, we consecutively evaluate the four dimension characterizations of the type
combination under consideration from Table 1. If the resulting dimension is the $\bot$ element, the
dimension-refined predicate $p_d$ coincides with $p$. Otherwise, we obtain either $p_d$ = 0D-$p$, $p_d$ = 01D-$p$,
or $p_d$ = 1D-$p$.


5 Optimized Evaluation Methods

Based on the exploration algorithms and leveraging the 9-intersection matrix characterization, we have
found a universal, correct, complete, and effective method for both predicate verification and predicate de-
termination of proper and dimension-refined topological predicates. So far, we have focused on the general
applicability and universality of our overall approach. In this section, we show that it is even possible to
fine-tune and thus improve our 9IMC approach with respect to efficiency if we look at predicate verification
and predicate determination separately. Section 5.1 delineates a sophisticated method called matrix thin-
ning for speeding up predicate verification. Section 5.2 describes a fine-tuned method called minimum cost
decision tree for accelerating predicate determination.

5.1 Matrix Thinning for Predicate Verification
The approach of matrix thinning (MT) described in this subsection is based on the observation that for
predicate verification only a subset of the nine matrix predicates has to be evaluated in order to determine
the validity of a given topological relationship between two spatial objects F and G. For example, for
the aforementioned predicate 1 (disjoint) of the region2D/region2D case, the combination that
$F^\circ \cap G^\circ = \emptyset \wedge \partial F \cap \partial G = \emptyset$ holds (indicated by two 0's) is unique
among the 33 predicates. Consequently, only these two matrix predicates have to be tested in order to decide
whether this topological predicate is true or false.
The question arises how the 9-intersection matrices can be systematically "thinned out" and nevertheless
remain unique among the $n_{\alpha,\beta}$ topological predicates between two spatial data types $\alpha$ and
$\beta$. We use a brute-force algorithm (Figure 10) that is applicable to all type combinations and that
determines the thinned out version of each intersection matrix associated with one of the $n_{\alpha,\beta}$
topological predicates. Since this algorithm only has to be executed once for each type combination, runtime
performance and space efficiency are not important here.

Figure 7: Complete and thinned out matrices for the 5 topological predicates of the point2D/point2D case.

Figure 8: Complete and thinned out matrices for the 14 topological predicates of the point2D/line2D case.

In a first step (lines 8 to 10), we create a matrix pos of so-called position matrices corresponding to all
possible 9-intersection matrices, i.e., to the binary versions of the decimal numbers 1 to 511 if we read the
9-intersection matrix (9IM) entries row by row. Each "1" in a position matrix indicates a position or entry
that is later used for checking two intersection matrices against each other. A "0" in a position matrix means
that the corresponding entries in two compared intersection matrices are not compared and hence ignored.
Because our goal is to minimize the number of matrix predicates that have to be evaluated, in a second
step, we sort the position matrices with respect to the number of ones in increasing order (lines 11 to 12).
That is, the list of position matrices will first contain all matrices with a single "1", then the matrices with two
ones, etc., until we reach the matrix with nine ones. At the latest here, it is guaranteed that an intersection
matrix is different from all the other $n_{\alpha,\beta} - 1$ intersection matrices. Hence, our algorithm terminates.
In a third step, we initialize the entries of all $n_{\alpha,\beta}$ thinned out intersection matrices with the
"don't care" symbol "*".
The fourth and final step computes the thinned out matrices (lines 15 to 33). The idea is to find for each
intersection matrix (line 15) a minimal number of entries that together uniquely differ from the corresponding
entries of all the other $n_{\alpha,\beta} - 1$ intersection matrices. Therefore, we start traversing the 511 position
matrices (line 17). For all "1"-positions of a position matrix we find out whether for the intersection matrix
under consideration another intersection matrix exists that has the same matrix values at these positions
(lines 20 to 21). As long as no equality has been found, the intersection matrix under consideration is com-
pared to the next intersection matrix (lines 19 to 23). If an equality is found, the next position matrix is taken
(line 30). Otherwise, we have found a minimal number of matrix predicates that are sufficient and unique
for evaluation (line 24). It remains to copy the corresponding values of the 9-intersection matrix into the
thinned out matrix (lines 25 to 28).

Figure 9: Complete and thinned out matrices for the 7 topological predicates of the point2D/region2D case.









01 algorithm MatrixThinning
02 input:  Three-dimensional 9IM im. im[i,l,m] ∈ {0, 1}
03         denotes entry (l,m) (1 ≤ l,m ≤ 3) of the ith
04         9IM (1 ≤ i ≤ n_α,β).
05 output: Three-dimensional thinned out 9IM tim.
06         tim[i,l,m] ∈ {0, 1, *}. '*' is 'don't care' symbol.
07 begin
08   Create three-dimensional matrix pos of 'position'
09   matrices where pos[j,l,m] ∈ {0, 1} denotes entry
10   (l,m) of the jth possible 9IM (1 ≤ j ≤ 511);
11   Sort pos increasingly with respect to the number of
12   ones in a matrix;
13   Initialize all entries of matrices of tim with '*'; r := 1;
14   // Compute thinned out matrices
15   for each i in 1...n_α,β do
16     j := 1; stop := false;
17     while j ≤ 511 and not stop do
18       k := 1; unequal := true;
19       while 1 ≤ k ≤ n_α,β and i ≠ k and unequal do
20         equal := im[i] and im[k] have the same values
21                  at all positions (l,m) where pos[j,l,m] = 1;
22         unequal := unequal and not equal; inc(k);
23       endwhile;
24       if unequal then // Thin out im[i] by pos[j].
25         for each l,m in 1...3 do
26           if pos[j,l,m] = 1
27           then tim[r,l,m] := im[i,l,m] endif
28         endfor;
29         inc(r); stop := true;
30       else inc(j);
31       endif
32     endwhile
33   endfor
34 end MatrixThinning.


Figure 10: Algorithm for computing the thinned out versions of the $n_{\alpha,\beta}$ intersection matrices associated
with the topological predicates between two spatial data types $\alpha$ and $\beta$
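
The brute-force idea of Figure 10 can be sketched compactly in Python. This is an illustration under stated assumptions rather than a transcription of the figure: matrices are flat 9-tuples read row by row, a position matrix is encoded as an integer between 1 and 511 whose most significant bit corresponds to the first matrix entry, and the comparison simply skips the matrix under consideration (k = i) and continues with the remaining ones. The names `thin_matrices` and `TOY` are hypothetical, and the three region/region matrices (disjoint, meet, overlap) only serve as a small test collection.

```python
from typing import List, Sequence, Tuple, Union

Entry = Union[int, str]

# Position matrices 1..511, sorted by their number of ones. Python's sort is
# stable, so ties keep increasing numerical order.
POSITIONS = sorted(range(1, 512), key=lambda j: bin(j).count("1"))


def thin_matrices(matrices: Sequence[Tuple[int, ...]]) -> List[Tuple[Entry, ...]]:
    """For each 9IM, keep a smallest set of entries that distinguishes it
    from all other matrices; all remaining entries become '*'."""
    thinned = []
    for i, im in enumerate(matrices):
        for j in POSITIONS:
            # Indices of the entries selected by position matrix j.
            idxs = [p for p in range(9) if (j >> (8 - p)) & 1]
            clash = any(k != i and all(other[p] == im[p] for p in idxs)
                        for k, other in enumerate(matrices))
            if not clash:
                thinned.append(tuple(im[p] if p in idxs else "*" for p in range(9)))
                break
        else:
            raise ValueError("matrix %d is not unique in the collection" % i)
    return thinned


TOY = [
    (0, 0, 1, 0, 0, 1, 1, 1, 1),  # disjoint
    (0, 0, 1, 0, 1, 1, 1, 1, 1),  # meet
    (1, 1, 1, 1, 1, 1, 1, 1, 1),  # overlap
]
toy_thinned = thin_matrices(TOY)
```

In this toy collection, a single entry suffices for disjoint and overlap, whereas meet needs two entries. With the full set of 33 region/region matrices, more entries are needed, as the figures for the individual type combinations show.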


Note that for the same intersection matrix it may be possible to find several thinned out matrices with the
same number of matrix predicates to be checked such that each of them represents the intersection matrix
uniquely among the $n_{\alpha,\beta}$ intersection matrices. Our algorithm always computes the thinned out matrix with
the "lowest numerical value". Figures 7 to 9 and 11 to 13 show the results for the different spatial data type
combinations. Definition 7 defines the measures we use to summarize and interpret these results.

Figure 11: Complete and thinned out matrices for the 33 topological predicates of the region2D/region2D case.

Figure 12: Complete and thinned out matrices for the 82 topological predicates of the line2D/line2D case.

Figure 13: Complete and thinned out matrices for the 43 topological predicates of the line2D/region2D case.

Definition 7 Let $IM^{MT}$ be a thinned out 9IM, and cnt be a function that counts the number of relevant matrix
predicates of $IM^{MT}$. Let $n_{\alpha,\beta}$ with $\alpha, \beta \in \{\mathit{point2D}, \mathit{line2D}, \mathit{region2D}\}$
be the number of (thinned out) 9IMs of the topological predicates between the types $\alpha$ and $\beta$, and
$n^k_{\alpha,\beta}$ be the number of thinned out 9IMs for which $k$ (with $1 \le k \le 9$) matrix predicates have to
be evaluated. Let the cost, i.e., the total number of matrix predicates to be evaluated for $\alpha$ and $\beta$, be
$C_{\alpha,\beta}$ without matrix thinning and $C^{MT}_{\alpha,\beta}$ with matrix thinning. We then denote with
$RAC^{MT}_{\alpha,\beta}$ the reduced average cost in percent when using matrix thinning. We obtain:

(i)    $cnt(IM^{MT}) = |\{(l,m) \mid 1 \le l,m \le 3,\ IM^{MT}[l,m] \in \{0,1\}\}|$
(ii)   $n^k_{\alpha,\beta} = |\{IM^{MT}_i \mid 1 \le i \le n_{\alpha,\beta},\ 1 \le k \le 9,\ cnt(IM^{MT}_i) = k\}|$
(iii)  $n_{\alpha,\beta} = \sum_{k=1}^{9} n^k_{\alpha,\beta}$
(iv)   $C_{\alpha,\beta} = 8 \cdot n_{\alpha,\beta}$
(v)    $AC_{\alpha,\beta} = C_{\alpha,\beta} / n_{\alpha,\beta} = 8$
(vi)   $C^{MT}_{\alpha,\beta} = \sum_{k=1}^{9} k \cdot n^k_{\alpha,\beta}$
(vii)  $AC^{MT}_{\alpha,\beta} = C^{MT}_{\alpha,\beta} / n_{\alpha,\beta}$
(viii) $RAC^{MT}_{\alpha,\beta} = 100 \cdot AC^{MT}_{\alpha,\beta} / AC_{\alpha,\beta} = 100 \cdot C^{MT}_{\alpha,\beta} / C_{\alpha,\beta}$



$AC^{MT}_{\alpha,\beta}$ denotes the average number of matrix predicates to be evaluated. Table 2 shows a summary of
the results and in the last two columns the considerable performance enhancement of matrix thinning. The
reduction of matrix predicate computations ranges from 27 percent for the line2D / line2D case to 75 percent
for the point2D / point2D case.


5.2 The Minimum Cost Decision Tree for Predicate Determination

In Section 4, we have seen that, in the worst case, $n_{\alpha,\beta}$ matching tests are needed to determine the
topological relationship between any two spatial objects. For each test, Boolean expressions have to be evaluated
that are equivalent to the eight matrix predicates and based on topological feature vectors. We propose two methods
to improve the performance. The first method reduces the number of matrix predicates to be evaluated. This
goal can be directly achieved by applying the method of matrix thinning described in Section 5.1. That is,
the number $n_{\alpha,\beta}$ of tests remains the same but for each test we can reduce the number of matrix predicates
that have to be evaluated by taking the thinned out instead of the complete 9-intersection matrices.








Type combination      $n_{\alpha,\beta}$  $n^k_{\alpha,\beta}$ with $k$ =          $C_{\alpha,\beta}$  $AC_{\alpha,\beta}$  $C^{MT}_{\alpha,\beta}$  $AC^{MT}_{\alpha,\beta}$  $RAC^{MT}_{\alpha,\beta}$
                                          1   2   3   4   5   6   7   8   9
point2D / point2D     5                   1   3   1   0   0   0   0   0   0   40                  8                    10                       2.00                      25.00
line2D / line2D       82                  0   0   2   12  4   50  12  2   0   656                 8                    474                      5.78                      72.26
region2D / region2D   33                  0   6   6   10  11  0   0   0   0   264                 8                    125                      3.79                      47.35
point2D / line2D      14                  0   0   6   8   0   0   0   0   0   112                 8                    50                       3.57                      44.64
point2D / region2D    7                   0   3   4   0   0   0   0   0   0   56                  8                    18                       2.57                      32.14
line2D / region2D     43                  0   0   5   18  12  7   1   0   0   344                 8                    196                      4.56                      56.98

Table 2: Summary of complete and thinned out 9IMs for the topological predicates of all type combinations.
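
The derived columns of Table 2 follow from the distribution of the thinned out matrices over the number k of matrix predicates to be evaluated. The short Python sketch below recomputes them for three rows; the function name `thinning_measures` is an assumption, and the computation uses only the relationships that the table itself exhibits (eight evaluations per matrix without thinning, k evaluations per matrix with thinning).

```python
from typing import Sequence, Tuple


def thinning_measures(counts: Sequence[int]) -> Tuple[int, int, int, float, float]:
    """counts[k-1] is the number of thinned out 9IMs that need k evaluations.

    Returns (number of matrices, cost without thinning, cost with thinning,
    average cost with thinning, cost with thinning in percent of the cost
    without thinning).
    """
    n = sum(counts)
    cost = 8 * n                                   # eight predicates per matrix
    cost_mt = sum(k * n_k for k, n_k in enumerate(counts, start=1))
    avg_mt = cost_mt / n
    percent = 100 * cost_mt / cost
    return n, cost, cost_mt, round(avg_mt, 2), round(percent, 2)


point_point = thinning_measures([1, 3, 1, 0, 0, 0, 0, 0, 0])
line_line = thinning_measures([0, 0, 2, 12, 4, 50, 12, 2, 0])
region_region = thinning_measures([0, 6, 6, 10, 11, 0, 0, 0, 0])
```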

The second method, which will be our focus in this subsection, aims at reducing the number $n_{\alpha,\beta}$ of
tests. This method is based on the complete 9-intersection matrices (Figures 7 to 9 and 11 to 13) but
also manages to reduce the number of matrix predicates that have to be evaluated. We propose a global
concept called minimum cost decision tree (MCDT) for this purpose. The term "global" means that we do
not look at each intersection matrix individually but consider all $n_{\alpha,\beta}$ intersection matrices together. The
idea is to construct a full binary decision tree whose inner nodes represent all matrix predicates, whose
edges represent the Boolean values true or false, and whose leaf nodes are the $n_{\alpha,\beta}$ topological predicates.
Note that, in a full binary tree, each node has exactly zero or two children. For searching, we employ
a depth-first search procedure that starts at the root of the tree and proceeds down to one of the leaves
which represents the matching topological predicate. The performance gain through the use of a decision
tree is significant since the tree partitions the search space at each node and gradually excludes more and
more topological predicates. In the best case, at each node of the decision tree, the search space, which
comprises the remaining topological predicates to be assigned to the remaining leaves of the node's subtree,
is partitioned into two halves so that we obtain a perfectly balanced tree. This would guarantee a search
time of $O(\log n_{\alpha,\beta})$. But in general, we cannot expect to obtain a bisection of topological predicates at
each node since the number of topological predicates yielding true for the node's matrix predicate will be
different from the number of topological predicates yielding false for that matrix predicate. An upper bound
is the number 8, since at most eight matrix predicates have to be checked to identify a topological predicate
uniquely; the ninth matrix predicate always yields true. Hence, our goal is to determine a nearly balanced,
cost-optimal, full binary decision tree for each collection of $n_{\alpha,\beta}$ intersection matrices.
If we do not have specific knowledge about the probability distribution of topological predicates in an
application (area), we can only assume that they occur with equal distribution. But sometimes we have more
detailed information. For example, in cadastral map applications, an adequate estimate is that 95 percent
of all topological relationships between regions are disjoint and the remaining 5 percent are meet. Our
algorithm for constructing MCDTs considers these frequency distributions. It is based on the following cost
model:

Definition 8 Let $M_{\alpha,\beta}$ be an MCDT for the spatial data types
$\alpha, \beta \in \{\mathit{point2D}, \mathit{line2D}, \mathit{region2D}\}$, $w_i$ be the weight of the topological
predicate $p_i$ with $1 \le i \le n_{\alpha,\beta}$ and $0 \le w_i \le 1$, and $d_i$ with $1 \le d_i \le 8$ be the depth
of a node in $M_{\alpha,\beta}$ at which $p_i$ is identified. We define the total cost $C^{MCDT}_{\alpha,\beta}$ of
$M_{\alpha,\beta}$ as

$C^{MCDT}_{\alpha,\beta} = \sum_{i=1}^{n_{\alpha,\beta}} w_i \cdot d_i$ with $\sum_{i=1}^{n_{\alpha,\beta}} w_i = 1$

That is, our cost model is to sum up all the weighted path lengths from each leaf node representing a
topological predicate to the root of the MCDT. If all topological predicates occur with equal probability, we
set $w_i = 1/n_{\alpha,\beta}$. The issue now is how to find and build an optimal MCDT with minimal total cost
$C^{MCDT}_{\alpha,\beta}$ on the basis of a given probability distribution (weighting) for the topological predicates.
If all topological predicates occur with equal probability, this problem corresponds to finding an optimal MCDT
that requires the minimal number of matrix predicate evaluations to arrive at an answer.

01 algorithm MCDT
02 input:  list im = ((im_1, w_1), ..., (im_n, w_n)) of 9IMs
03         with weights, list mp of the eight matrix predicates
04 output: MCDT M_α,β
05 begin
06   best_node := new_node(); stop := false;
07   discriminator := select_first(mp);
08   while not eol(mp) and not stop do
09     node := new_node();
10     node.discr := discriminator; node.im := im;
11     if no_of_elem(im) = 1 then /* leaf node */
12       best_node := node; best_node.cost := 0;
13       stop := true;
14     else
15       /* Let im = ((im_k1, w_k1), ..., (im_kn, w_kn))
16          with 1 ≤ k1 < ... < kn ≤ n_α,β. */
17       partition(im, discriminator, im_l, im_r);
18       if no_of_elem(im_l) ≠ 0 and
19          no_of_elem(im_r) ≠ 0 then
20         copy(mp, new_mp); del(new_mp, discriminator);
21         node.lchild := MCDT(im_l, new_mp);
22         node.rchild := MCDT(im_r, new_mp);
23         node.cost := node.lchild.cost + node.rchild.cost
24                      + Σ_{i=k1..kn} w_i;
25         if node.cost < best_node.cost
26         then best_node := node; endif;
27       endif;
28       discriminator := select_next(mp);
29     endif
30   endwhile;
31   return best_node;
32 end MCDT.

Figure 14: Minimum cost decision tree algorithm

Figure 14 shows our recursive algorithm MCDT for computing a minimum cost decision tree for a set im
of $n_{\alpha,\beta}$ 9-intersection matrices that are annotated with a weight representing the corresponding
predicate's probability of occurrence, as it is characteristic in a particular application (line 2). Later these
matrices become the leaves of the decision tree. In addition, the algorithm takes as an input the list mp of eight
matrix predicates (we remember that the exterior/exterior intersection always yields true) that serve as
discriminators and are attached to the inner nodes (line 3). This list is necessary to ensure that a matrix predicate
is not used more than once as a discriminator in a search path. During the recursion, the while-loop (lines 8 to 30)
terminates if either the list mp of matrix predicates to be processed is empty or the list im of 9-intersection
matrices contains only a single element. For each matrix predicate used as a discriminator, the operation
new_node creates a new tree node node (line 9). The matrix predicate discriminator as well as the list im
annotate the tree node node (line 10). If im has only one element (line 11), we know that node is a leaf node
representing the topological predicate pertaining to the single element in im. The cost for this leaf node is
0 since its current depth is 0 (line 12). Otherwise, if im consists of more than one element, we partition it
into two lists im_l and im_r (line 17). The partitioning is based on the values of each 9-intersection matrix in
im with respect to the matrix predicate serving as the discriminator. If such a value is 0 (false), the
corresponding 9-intersection matrix is added to the list im_l; otherwise, it is added to the list im_r. A special case
now is that im has not been partitioned so that either im_l or im_r is empty (condition in lines 18 to 19 yields
false). In this case, the discriminator does not contribute to a decision and is skipped; the next discriminator
is selected (line 28). If both lists im_l and im_r are nonempty (lines 18 to 19), we remove the discriminator
from a new copy new_mp of the list mp (line 20) and recursively find the minimum cost decision trees for
the 9-intersection matrices in im_l (line 21) and in im_r (line 22). Eventually, all recursions will reach all leaf
nodes and begin returning while recursively calculating the cost of each subtree found. The cost of a leaf
node is 0. The cost of an inner node node can be expressed in terms of the cost of its two nonempty subtrees
node.lchild and node.rchild processing the lists im_l and im_r, respectively. The depth of each leaf node with
respect to node is exactly one larger than the depth of the same leaf node with respect to either node.lchild
or node.rchild. Therefore, besides the costs of these two subtrees, for each leaf node of the subtree with root
node, we have to add the leaf node's cost (weight) one time (lines 23 to 24). These weights are stored in
node.im. The cost of node is then compared with the best cost determined so far, and the minimum will be
the new best option (lines 25 to 26). Eventually, when all the matrix predicates have been considered, we
obtain the best choice and return the corresponding minimum cost decision tree (line 31).

Table 3: MCDT pre-order representations for all type combinations on the basis of equal probability of
occurrence of all topological predicates.

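The recursive construction of Figure 14 can be sketched in Python as follows. The sketch is a simplified reading of the algorithm under stated assumptions: matrices are flat 9-tuples read row by row, matrix predicates are identified by their index 0 to 7 (the exterior/exterior entry at index 8 is left out), a tree is either a predicate name (leaf) or a triple (discriminator index, subtree for false, subtree for true), and the names `build_mcdt` and `TOY` are hypothetical. The three region/region matrices and the weights 95 percent for disjoint and 5 percent for meet merely illustrate the cost model.

```python
import math
from typing import List, Sequence, Tuple, Union

Item = Tuple[str, Tuple[int, ...], float]       # (name, 9IM, weight)
Tree = Union[str, Tuple[int, "Tree", "Tree"], None]


def build_mcdt(items: List[Item], predicates: Sequence[int]) -> Tuple[float, Tree]:
    """Return (cost, tree) of a minimum cost decision tree for `items`."""
    if len(items) == 1:
        return 0.0, items[0][0]                 # leaf node: cost 0
    best_cost, best_tree = math.inf, None
    weight_sum = sum(w for _, _, w in items)    # each leaf sinks one level deeper
    for d in predicates:
        left = [it for it in items if it[1][d] == 0]
        right = [it for it in items if it[1][d] == 1]
        if not left or not right:
            continue                            # d does not discriminate: skip it
        rest = [p for p in predicates if p != d]
        left_cost, left_tree = build_mcdt(left, rest)
        right_cost, right_tree = build_mcdt(right, rest)
        cost = left_cost + right_cost + weight_sum
        if cost < best_cost:
            best_cost, best_tree = cost, (d, left_tree, right_tree)
    return best_cost, best_tree


DISJOINT = (0, 0, 1, 0, 0, 1, 1, 1, 1)
MEET = (0, 0, 1, 0, 1, 1, 1, 1, 1)
OVERLAP = (1, 1, 1, 1, 1, 1, 1, 1, 1)

equal = [("disjoint", DISJOINT, 1 / 3), ("meet", MEET, 1 / 3), ("overlap", OVERLAP, 1 / 3)]
skewed = [("disjoint", DISJOINT, 0.95), ("meet", MEET, 0.05), ("overlap", OVERLAP, 0.0)]

equal_cost, equal_tree = build_mcdt(equal, range(8))
skewed_cost, skewed_tree = build_mcdt(skewed, range(8))
```

With the skewed weights, the tree tests the boundary/boundary entry first so that the frequent predicate disjoint is identified after a single evaluation. The returned cost equals the sum of the weighted leaf depths.
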
Table 3 shows the results of this algorithm by giving a textual pre-order (depth-first search) encoding
of all MCDTs for all type combinations on the basis of equal probability of occurrence of all topological
predicates. The encodings allow an easy construction of the MCDTs. Since MCDTs are full (or proper)
binary trees, each node has exactly zero or two child nodes. We leverage this feature by representing an
MCDT as the result of a pre-order tree traversal. The pre-order sequence of nodes is unique in this case
for constructing the decision tree since inner nodes with only one child node cannot occur. Each inner
node in the pre-order sequence is described as a term XY where $X, Y \in \{\circ, \partial, -\}$. Such a term
represents a matrix predicate $A^X \cap B^Y \neq \emptyset$ serving as a discriminator. For example, the term
$XY = \circ\partial$ denotes the matrix predicate $A^\circ \cap \partial B \neq \emptyset$ (prefix notation for boundary).
Each leaf node represents the 9-intersection matrix number of a topological predicate. The matrix numbers are
specified in Figures 7 to 9 and 11 to 13.
Figure 15 shows a visualization of the MCDTs of three spatial data type combinations on the assumption
that all topological predicates occur with equal probability. The MCDTs for the other type combinations
have been omitted due to their very large size. Each inner node is annotated with a label XY where
$X \in \{A^\circ, \partial A, A^-\}$ and $Y \in \{B^\circ, \partial B, B^-\}$. A label represents a matrix predicate
$X \cap Y \neq \emptyset$ serving as a discriminator. For example, the label $XY = A^\circ \partial B$ denotes the matrix
predicate $A^\circ \cap \partial B \neq \emptyset$. If the evaluation
of a matrix predicate yields false, we move to the left child; otherwise, we move to the right child. Each leaf
node represents the 9-intersection matrix number of a topological predicate.
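
Because every node of a full binary tree has zero or two children, a pre-order sequence can be decoded without any additional markers, and the resulting tree can be searched by moving left on false and right on true. The Python sketch below illustrates both steps. The token sequence and the discriminator labels "ii" and "bb" are made up for the example and are not taken from Table 3; the names `decode_preorder` and `classify` are likewise hypothetical.

```python
from typing import Callable, Iterator, List, Tuple, Union

Token = Union[str, int]                 # str: inner-node label, int: leaf number
Tree = Union[int, Tuple[str, "Tree", "Tree"]]


def decode_preorder(tokens: List[Token]) -> Tree:
    """Rebuild a full binary decision tree from its pre-order sequence."""
    it: Iterator[Token] = iter(tokens)

    def parse() -> Tree:
        tok = next(it)
        if isinstance(tok, int):
            return tok                  # leaf: matrix number
        left = parse()                  # subtree taken on false
        right = parse()                 # subtree taken on true
        return (tok, left, right)

    tree = parse()
    if next(it, None) is not None:
        raise ValueError("trailing tokens after a complete tree")
    return tree


def classify(tree: Tree, evaluate: Callable[[str], bool]) -> Tuple[int, int]:
    """Walk from the root to a leaf; return (leaf number, evaluations used)."""
    steps = 0
    while not isinstance(tree, int):
        label, left, right = tree
        tree = right if evaluate(label) else left
        steps += 1
    return tree, steps


toy_tree = decode_preorder(["ii", "bb", 1, 2, 3])
answers = {"ii": False, "bb": True}
leaf, evaluations = classify(toy_tree, lambda label: answers[label])
```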
The following Definition 9 defines measures that we use to summarize and interpret these results. We
are especially interested in the average number of matrix predicates to be evaluated.


Type combination MCDT preorder representation
point2D point2D - 2 3 o" 1 4 5
line2D /line2D a ao 0- -0 33 -3 34 35 -3 1 2 43 -3 44 45 o
3 4 46 47 48 -0 36 37 38 -a 5 6
49 50 51 -a 7 8 52 53 54 aa -a 9 10 -
11 12 0- a 13 14 a 15 16 0- a -a 39 40 a 41 42 3 0- -
55 56 57 58 -a 59 60 61 62 " o 0a -a 17 18 -3
19 20 -a 21 22 23 24 aa -a 25 26 27 28 -a 29
30 31 32 a aa -0 63 64 65 66 67 68 "- 69
a 70 71 72 a 73 74 3 0- -a 75 76 a 77 78 0- a 79 80
a 81 82
region2D / region2D 3 0- -0 5 6 o" 2 10 aa 1 -3 3 4 11 -3 12
13 3a 7 8 9 -a 15 16 aa 14 17 18 a3 0- 21
a 22 23 3a 19 20 24 25 26 a 0- 27 28 29 -
30 31 a 32 33
point2D / line2D "o3 1 2 "- 3 4 -3 5 6 3 0- 7 8 -3 9 10
0- 11 12 13 14
point2D /region2D "o 1 2 3 3 "- 4 5 6 7
line2D /region2D 3o 00 a 0- 0- 5 6 a 1 3 8 9 a 2 10 11 3 0
3 4 7 12 13 3 0- aa 14 15 a 0- 18 19 20 21 -
a -a 26 27 28 3 0- -a 32 33 34 35 36 37 3 "- a 16
17 a 0- 22 23 24 25 "- a -a 29 30 31 a 0- -a 38 39 -3
40 41 42 43









[Figure 15: decision tree diagrams, panels (a), (b), and (c)]


Figure 15: Minimum cost decision trees for the 5 topological predicates of the point/point case (a), the 7
topological predicates of the point/region case (b), and the 14 topological predicates of the point/line case
(c) under the assumption that all topological predicates occur with equal probability.

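Definition 9 Let $C^{MCDT}_{\alpha,\beta}$ denote the total cost of an MCDT $M_{\alpha,\beta}$ according to Definition 8. Let $n_{\alpha,\beta}$ with
$\alpha, \beta \in \{point2D, line2D, region2D\}$ be the number of 9IMs of the topological predicates between the types $\alpha$
and $\beta$, $IM_i$ with $1 \le i \le n_{\alpha,\beta}$ be a 9IM, and $d_k$ be the number of topological predicates associated with leaf
nodes in $M_{\alpha,\beta}$ of depth $k$ (with $1 \le k \le 9$). Further, let $C_{\alpha,\beta}$ be the cost without using an MCDT, $AC_{\alpha,\beta}$ be
the average cost without using an MCDT, $AC^{MCDT}_{\alpha,\beta}$ be the average cost when using an MCDT, and $RAC^{MCDT}_{\alpha,\beta}$
be the reduced average cost in percent when using an MCDT. The measures are defined as:

(i) $d_k = |\{ IM_i \mid 1 \le i \le n_{\alpha,\beta},\ IM_i \text{ labels a leaf of depth } k \text{ in } M_{\alpha,\beta} \}|$
(ii) $n_{\alpha,\beta} = \sum_{k=1}^{9} d_k$
(iii) $C_{\alpha,\beta} = 8 \cdot n_{\alpha,\beta}$
(iv) $AC_{\alpha,\beta} = 4 \cdot (n_{\alpha,\beta} + 1)$
(v) $AC^{MCDT}_{\alpha,\beta} = C^{MCDT}_{\alpha,\beta} / n_{\alpha,\beta}$
(vi) $RAC^{MCDT}_{\alpha,\beta} = 100 \cdot AC^{MCDT}_{\alpha,\beta} / AC_{\alpha,\beta}$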

To determine the average cost $AC_{\alpha,\beta}$ without using an MCDT in (iv), we observe that the best case is
to check 8 matrix predicates, the second best case is to check 16 matrix predicates, etc., and the worst case
is to check all $8 \cdot n_{\alpha,\beta}$ matrix predicates. The average number of matrix predicates that has to be checked
without using an MCDT is therefore $8 \cdot (1 + 2 + \dots + n_{\alpha,\beta}) / n_{\alpha,\beta} = 4 \cdot (n_{\alpha,\beta} + 1)$. $AC^{MCDT}_{\alpha,\beta}$ in (v) yields the
average number of matrix predicates to be evaluated when an MCDT is used. Table 4 shows a summary of the
results and, in the last two columns, the considerable performance enhancement of minimum cost decision
trees. The reduction of matrix predicate computations ranges from 90 percent for the point2D / point2D
case to 98 percent for the line2D / line2D case.








                                       d_k with k =
Type combination      n_{α,β}   1   2   3   4   5   6   7   8   9   C_{α,β}  AC_{α,β}  C^MCDT_{α,β}  AC^MCDT_{α,β}  RAC^MCDT_{α,β}
point2D / point2D        5      0   3   2   0   0   0   0   0   0      40       24          12           2.40           10.00
line2D / line2D         82      0   0   0   0   0  48  30   4   0     656      332         530           6.46            1.95
region2D / region2D     33      0   0   0   3  22   8   0   0   0     264      136         170           5.15            3.79
point2D / line2D        14      0   0   2  12   0   0   0   0   0     112       60          54           3.86            6.43
point2D / region2D       7      0   1   6   0   0   0   0   0   0      56       32          20           2.86            8.94
line2D / region2D       43      0   0   0   3  15  19   6   0   0     344      176         243           5.65            3.21

Table 4: Summary of the MCDTs for all type combinations on the basis of equal probability of occurrence
of all topological predicates.
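
As a cross-check, the measures of Definition 9 can be recomputed from the d_k columns of Table 4. The following sketch is not part of the original implementation. It assumes that, under equal probabilities, the total cost of Definition 8 reduces to the sum of the leaf depths, i.e. the sum of k * d_k, an assumption that reproduces the total MCDT cost column of Table 4. The function name and dictionary keys are illustrative.

```python
from typing import Dict, List


def mcdt_measures(d: List[int]) -> Dict[str, float]:
    """Compute the measures of Definition 9 from d[0..8] = d_1..d_9."""
    n = sum(d)                  # (ii)  number of 9IMs
    c = 8 * n                   # (iii) cost without an MCDT
    ac = 4 * (n + 1)            # (iv)  average cost without an MCDT
    # Assumption: with equal probabilities, the total MCDT cost is the
    # sum of the depths of all leaves.
    c_mcdt = sum(k * dk for k, dk in enumerate(d, start=1))
    ac_mcdt = c_mcdt / n        # (v)   average cost with an MCDT
    rac = 100 * ac_mcdt / ac    # (vi)  remaining cost in percent
    return {"n": n, "C": c, "AC": ac,
            "C_MCDT": c_mcdt, "AC_MCDT": ac_mcdt, "RAC_MCDT": rac}


# d_1..d_9 as listed in Table 4.
LEAF_DEPTHS = {
    "point2D/point2D":   [0, 3, 2, 0, 0, 0, 0, 0, 0],
    "line2D/line2D":     [0, 0, 0, 0, 0, 48, 30, 4, 0],
    "region2D/region2D": [0, 0, 0, 3, 22, 8, 0, 0, 0],
    "point2D/line2D":    [0, 0, 2, 12, 0, 0, 0, 0, 0],
    "point2D/region2D":  [0, 1, 6, 0, 0, 0, 0, 0, 0],
    "line2D/region2D":   [0, 0, 0, 3, 15, 19, 6, 0, 0],
}

if __name__ == "__main__":
    for name, depths in LEAF_DEPTHS.items():
        m = mcdt_measures(depths)
        print(f"{name:20s} {m['C_MCDT']:4d} {m['AC_MCDT']:.2f} {m['RAC_MCDT']:.2f}")
```

For point2D/region2D the unrounded computation yields 8.93 rather than the listed 8.94. This is consistent with the table value having been derived from the rounded average 2.86.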

The MCDT approach is similar to a technique introduced in [9] for topological predicates between
simple regions. However, their method of determining a binary decision tree rests on the thinned out 9-
intersection matrices and results in a near optimal algorithm and solution. The reason why optimality is not
achieved is that a topological predicate can have multiple, equipollent thinned out matrices, i.e., thinned out
matrices are not unique. Therefore, using a specific set of thinned out matrices as the basis for partitioning
the search space can only lead to an optimal decision tree for this set of thinned out matrices and may not
be optimal in the general case. Our algorithm rests on the complete 9-intersection matrices. It produces
an optimal decision tree (several optimal trees with the same total cost may exist) for the specified set of
9-intersection matrices and the given probability distribution. One can verify this by applying our algorithm
to the eight 9-intersection matrices for two simple regions and the same probability distribution as specified
in [9]. Our algorithm produces an optimal tree with the total cost of 2.13 while the so-called "refined cost
method" in [9], which uses thinned out matrices, produces a tree with the total cost of 2.16.
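
To make the notion of an optimal tree over complete matrices concrete, the following sketch searches exhaustively for the minimum total cost. It is a brute-force recursion with memoization and is not the algorithm of this article. It assumes that each matrix predicate evaluation costs one unit and that weights default to 1 (equal probabilities). The sample input, the five point2D/point2D predicates restricted to the three matrix entries assumed to vary for point objects, is likewise an assumption made for illustration.

```python
from functools import lru_cache
from typing import Dict, Optional, Tuple


def min_total_cost(matrices: Dict[str, Tuple[bool, ...]],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Minimum total cost of a decision tree separating all matrices."""
    if weights is None:
        weights = {name: 1 for name in matrices}
    width = len(next(iter(matrices.values())))

    @lru_cache(maxsize=None)
    def cost(subset: Tuple[str, ...]) -> float:
        if len(subset) <= 1:
            return 0        # a single candidate needs no further test
        # Every candidate still in play pays for the test at this node.
        here = sum(weights[name] for name in subset)
        best = None
        for p in range(width):
            false_side = tuple(n for n in subset if not matrices[n][p])
            true_side = tuple(n for n in subset if matrices[n][p])
            if not false_side or not true_side:
                continue    # predicate p does not discriminate this subset
            total = here + cost(false_side) + cost(true_side)
            if best is None or total < best:
                best = total
        if best is None:
            raise ValueError("two matrices are indistinguishable")
        return best

    return cost(tuple(sorted(matrices)))


# Assumed entries, in the order (interior/interior, interior/exterior,
# exterior/interior), for two non-empty point objects.
PP_MATRICES = {
    "disjoint": (False, True, True),
    "equal":    (True, False, False),
    "inside":   (True, False, True),
    "contains": (True, True, False),
    "overlap":  (True, True, True),
}
```

For this input, min_total_cost(PP_MATRICES) returns 12, which agrees with the total MCDT cost listed for point2D / point2D in Table 4. The exhaustive search is exponential in the worst case and is meant only to illustrate the cost model.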
We can observe the following relationship between MCDTs and thinned out matrices:

Lemma 9 For each combination of spatial data types $\alpha$ and $\beta$, the total cost of its minimum cost decision
tree (given in Table 4) is greater than or equal to the total cost of all its thinned out matrices (given in
Table 2), i.e.,

$C^{MCDT}_{\alpha,\beta} \ge C^{MT}_{\alpha,\beta}$

Proof The proof is given by contradiction. Assume that for a spatial data type combination the total cost
of its MCDT is less than the total cost of all its thinned out matrices. Consequently, there must be at least
one path from the root to a leaf in the MCDT that contains a smaller number of matrix predicates than
the number of matrix predicates in the thinned out matrix for the topological predicate associated with that
leaf. This implies that we can identify this topological predicate with a smaller number of matrix predicate
decisions than the number of matrix predicates in its thinned out matrix. But this contradicts the definition
of a thinned out matrix. □


6 Implementation, Testing, Approach Assessment, and Performance Study

Section 6.1 describes our implementation approach and environment for the concepts presented in this arti-
cle. Section 6.2 delineates various tests based on this implementation in order to verify the correctness of
the concepts. We perform an overall assessment of our approach from two perspectives. From a qualitative
perspective, in Section 6.3, we briefly summarize the benefits of our design decision and compare our ap-
proach with an alternative ad hoc approach. From a quantitative perspective, in Section 6.4, we conduct a
performance study and analyze the results.








6.1 Implementation


To verify the feasibility, practicality, and correctness of the concepts presented, we have implemented and
tested our approach. The implementation of the algebra package SPAL2D for handling two-dimensional
spatial data includes the implementation of all six exploration algorithms from [39], the general evaluation
method of 9-intersection matrix characterization from Section 4 as well as the optimized evaluation tech-
niques of matrix thinning and minimum cost decision trees from Section 5. The implementation makes
use of the complex spatial data types point2D, line2D, and region2D, as they have been described in
Sections 2.1 and 3.1 and specified in [39]. Since performance is one of the goals of this implementation,
we have chosen C++ as the implementation language.
Two universal interface methods TopPredVerification and TopPredDetermination are provided to handle
predicate verification and predicate determination queries respectively. Both interfaces are overloaded and
take two spatial objects of any and possibly different type as input. The interface method TopPredVerifica-
tion takes a predicate identification number as an additional input parameter. It corresponds to the matrix
number (specified in [38] and used in Figures 7 to 9 and 11 to 13) of the topological predicate to be eval-
uated. The output is the Boolean value true if the topological relationship between the two spatial objects
corresponds to the specified predicate; otherwise, the value is false. The interface method TopPredDetermi-
nation outputs the matrix number of the topological predicate corresponding to the topological relationship
between the two spatial objects. Each of these interface methods involves three consecutive execution steps:
(i) the execution of an appropriate exploration algorithm which produces two topological feature vectors
(this functionality is provided by the interface method TopPredExploration from [39]), (ii) the computation
of the 9-intersection characterization for the respective spatial data type combination, and (iii) the execu-
tion of either the predicate verification using thinned out matrices or the predicate determination using a
minimum cost decision tree.
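
The three execution steps can be sketched for the simplest case of two non-empty finite point sets. This is an illustrative toy and not the SPAL2D code: set operations stand in for the plane-sweep exploration, the flag names poi_shared and poi_disjoint, the predicate numbering 1 to 5, the thinned out matrices, and the hard-coded decision tree are all assumptions, and the function names merely echo the two interface methods.

```python
from typing import Dict, Set, Tuple

Point = Tuple[float, float]

# Hypothetical numbering of the five point/point predicates.
DISJOINT, EQUAL, INSIDE, CONTAINS, OVERLAP = 1, 2, 3, 4, 5

# Hypothetical thinned out matrices: only the listed entries are checked.
THINNED = {
    DISJOINT: {"ii": False},
    EQUAL:    {"ie": False, "ei": False},
    INSIDE:   {"ie": False, "ei": True},
    CONTAINS: {"ie": True, "ei": False},
    OVERLAP:  {"ii": True, "ie": True, "ei": True},
}


def explore(a: Set[Point], b: Set[Point]):
    """Step (i): produce two feature vectors (set operations, no sweep)."""
    shared = bool(a & b)
    v_a = {"poi_shared": shared, "poi_disjoint": bool(a - b)}
    v_b = {"poi_shared": shared, "poi_disjoint": bool(b - a)}
    return v_a, v_b


def characterize(v_a: Dict[str, bool], v_b: Dict[str, bool]) -> Dict[str, bool]:
    """Step (ii): derive the matrix predicates that can vary for points."""
    return {"ii": v_a["poi_shared"],      # interior(A) meets interior(B)
            "ie": v_a["poi_disjoint"],    # interior(A) meets exterior(B)
            "ei": v_b["poi_disjoint"]}    # exterior(A) meets interior(B)


def top_pred_verification(a: Set[Point], b: Set[Point], number: int) -> bool:
    """Step (iii), verification: compare only the thinned out entries."""
    nine_im = characterize(*explore(a, b))
    return all(nine_im[t] == v for t, v in THINNED[number].items())


def top_pred_determination(a: Set[Point], b: Set[Point]) -> int:
    """Step (iii), determination: descend a small decision tree."""
    nine_im = characterize(*explore(a, b))
    if not nine_im["ie"]:
        return INSIDE if nine_im["ei"] else EQUAL
    if not nine_im["ii"]:
        return DISJOINT
    return OVERLAP if nine_im["ei"] else CONTAINS
```

For example, top_pred_determination({(0.0, 0.0), (1.0, 1.0)}, {(1.0, 1.0), (2.0, 2.0)}) yields 5 (overlap under the assumed numbering). Both functions share steps (i) and (ii) and differ only in the final matching step, which mirrors the structure described above.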

6.2 Testing

The testing of the concepts presented in this article is an extension of the testing environment and procedure
in [39] checking the functionality of the exploration algorithms and the correctness of the resulting values of
the topological feature vectors. We take the topological feature vectors as input for the 9-intersection matrix
characterization method and the optimization methods of matrix thinning and minimum cost decision trees.
The correctness of all methods has been checked by a technique known as gray-box testing (see [39] for
more details), which combines the advantages of two other techniques called black-box testing and white-
box testing. The black-box testing technique arranges for well defined input and output objects. In our case,
the input consists of two correct topological feature vectors as well as a matrix number of the topological
predicate to be verified in case of predicate verification. This enables us to test the functional behavior
of the three method implementations. The output is guaranteed to be either a Boolean value (predicate
verification) or a valid matrix number of a topological relationship predefined for the type combination under
consideration (predicate determination). The white-box testing technique considers every single execution
path and guarantees that each statement is executed at least once. This ensures that all cases that are specified
and handled by the algorithms are properly tested. All cases have been successfully tested and indicate the
correctness of our concepts and the ability of our algorithms to correctly verify or determine a topological
predicate.

6.3 Qualitative Assessment

Although it is usually accepted that some kind of plane-sweep algorithm is sufficient for implementing
topological predicates, the article in [39] and this article have demonstrated that the decision on an appropriate
and sophisticated implementation strategy is of crucial importance. The possible ad hoc approach of
implementing a separate algorithm for each of the topological predicates results in a large number of algorithms
possibly up to the total number of topological predicates of a type combination. Even though this approach
is relatively straightforward, it suffers from many problems including large system implementation,
non-guaranteed correctness of the algorithms, error-proneness, redundancy, testing and evaluation difficulties,
and performance degradation. An essential problem of the ad hoc approach is the difficulty in handling
predicate determination queries. No particular algorithm is suitable for this task, thus requiring a linear
iteration through the large number of algorithms for all topological predicates.

[Figure 16: (a) overall averages per type combination; (b), (c) averages per predicate number]

Figure 16: Predicate verification without and with matrix thinning

Unlike the ad hoc approach, our approach does not suffer from these problems. In our implementation,
only a single plane-sweep algorithm is employed for all exploration algorithms. Only a single exploration
algorithm is implemented for all topological predicates of each type combination. This implementation
strategy allows us to take advantage of significantly smaller system implementation, widespread code reusability
and sharing, manageable system testing, and efficient handling of both predicate determination and
verification queries. This centralized approach is possible because, instead of considering each topological predicate
individually, we look deeper into their common definition blocks which are the nine matrix predicates of the
9-intersection matrix. This leads us to a systematic method. By creating a bidirectional link between the
matrix predicates and topological feature vectors, we are able to give a unique characterization for each matrix
predicate. This unique characterization frees us from providing algorithms for each topological predicate
in case of predicate verification and from evaluating all topological predicates in case of predicate
determination. Furthermore, the correctness of the method is formally proven. Last but not least, based on the
concept of topological feature vectors, predicate matching techniques such as matrix thinning and minimum
cost decision trees can be used to increase the efficiency of answering predicate verification and predicate
determination queries respectively.

[Figure 17: (a) overall averages per type combination; (b) point2D/point2D, point2D/line2D, and
point2D/region2D averages per predicate number; (c) line2D/line2D, line2D/region2D, and
region2D/region2D averages per predicate number]

Figure 17: Predicate determination without and with MCDT



6.4 Performance Study and Analysis


We have performed a performance study that underpins the strengths of our approach by quantitatively
comparing the performance of our non-optimized alternative (only 9IMC) with our optimized evaluation
techniques (9IMC plus matrix thinning, 9IMC plus minimum cost decision tree). Our study shows that our
approach does not only provide qualitative but also quantitative benefits by applying optimization methods
for the evaluation of topological predicates. For each type combination, we measure and calculate the
average execution time for verifying and determining each predicate both without and with optimization.








For predicate verification (PV), Figures 16(b) and 16(c) illustrate the average execution time for each
predicate of each type combination without and with matrix thinning (MT). The overall average for each
type combination is shown in Figure 16(a). The performance improvements from using matrix thinning are
quite noticeable and range from 13 percent execution time reduction for the line2D/line2D case up to 55
percent for the point2D/point2D case.
Similarly, for predicate determination (PD), Figures 17(b) and 17(c) show the average execution time
for each predicate of each type combination without and with the use of minimum cost decision trees.
The overall average for each type combination is shown in Figure 17(a). The results indicate significant
performance improvements from using minimum cost decision trees. The improvements range from 75
percent execution time reduction for the point2D/region2D case up to 91 percent for the line2D/line2D
case.
Although the execution time reductions are remarkable for both predicate verification and especially
predicate determination and clearly reflect the trend, the empirical results shown in Figures 16 and 17 are not
as optimistic as the computational results given in Tables 2 and 4. The reason why we cannot reach these
lower bounds in practice lies in programming and runtime overheads such as extra conditional checks,
the construction of thinned out matrices and minimum cost decision trees, and their traversals. Even with these
overheads, it is evident that our approach offers considerable performance optimizations with
respect to predicate evaluations.


7 Conclusions

Conceptual models of topological predicates have been investigated in a large number of publications.
However, their efficient implementation, which is of essential importance for the processing of queries that
contain spatial joins and spatial selections, has been widely neglected. In particular, the introduction of complex
spatial data types and the resulting increase of topological predicates between all their combinations
require efficient and correct implementation concepts. We propose a two-phase approach that consists of an
exploration phase and an evaluation phase and that can be applied to both predicate verification and
predicate determination. The goal of the exploration phase is to traverse a given scene of two objects in space,
collect any topological information of importance, and store it in topological feature vectors. The goal of
the evaluation phase is to interpret the gained topological information and match it against the topological
predicates.
In this article, we have focused on the evaluation phase and in detail developed a general evaluation
method called 9-intersection matrix characterization. This method generates a formally proven connection
between theoretical concepts (9-intersection matrix, matrix predicate) and implementation concepts
(topological feature vector, topological flag, segment class). Two sophisticated optimization techniques, called
matrix thinning and minimum cost decision tree, are introduced which help to speed up predicate
verification and predicate determination, respectively. Besides a quantitative analysis, we have also provided an
experimental performance study which documents the correctness, robustness, and efficiency of our
approach. Our approach has been implemented in the SPAL2D software library, which is currently under
development and intended for integration into extensible databases.


References

[1] T. Behr and M. Schneider. Topological Relationships of Complex Points and Complex Regions. Int.
Conf. on Conceptual Modeling, pp. 56-69, 2001.
[2] A. Brodsky and X. S. Wang. On Approximation-based Query Evaluation, Expensive Predicates and
Constraint Objects. Int. Workshop on Constraints, Databases, and Logic Programming, 1995.
[3] J. Claussen, A. Kemper, G. Moerkotte, K. Peithner, and M. Steinbrunn. Optimization and Evaluation
of Disjunctive Queries. IEEE Trans. on Knowledge and Data Engineering, 12(2):238-260, 2000.
[4] E. Clementini and P. Di Felice. A Comparison of Methods for Representing Topological Relationships.
Information Sciences - Applications, 3(3):149-178, 1995.
[5] E. Clementini and P. Di Felice. A Model for Representing Topological Relationships between Complex
Geometric Features in Spatial Databases. Information Sciences, 90(1-4):121-136, 1996.
[6] E. Clementini and P. Di Felice. Topological Invariants for Lines. IEEE Trans. on Knowledge and Data
Engineering, 10(1), 1998.
[7] E. Clementini, P. Di Felice, and G. Califano. Composite Regions in Topological Queries. Information
Systems, 20(7):579-594, 1995.
[8] E. Clementini, P. Di Felice, and P. van Oosterom. A Small Set of Formal Topological Relationships
Suitable for End-User Interaction. 3rd Int. Symp. on Advances in Spatial Databases, LNCS 692, pp.
277-295, 1993.
[9] E. Clementini, J. Sharma, and M. J. Egenhofer. Modeling Topological Spatial Relations: Strategies for
Query Processing. Computers and Graphics, 18(6):815-822, 1994.
[10] Z. Cui, A. G. Cohn, and D. A. Randell. Qualitative and Topological Relationships. 3rd Int. Symp. on
Advances in Spatial Databases, LNCS 692, pp. 296-315, 1993.
[11] J. R. Davis. IBM's DB2 Spatial Extender: Managing Geo-Spatial Information within the DBMS.
Technical report, IBM Corporation, 1998.
[12] M. J. Egenhofer. A Formal Definition of Binary Topological Relationships. 3rd Int. Conf. on
Foundations of Data Organization and Algorithms, LNCS 367, pp. 457-472. Springer-Verlag, 1989.
[13] M. J. Egenhofer. Definitions of Line-Line Relations for Geographic Databases. 16th Int. Conf. on
Data Engineering, pp. 40-46, 1993.
[14] M. J. Egenhofer. Spatial SQL: A Query and Presentation Language. IEEE Trans. on Knowledge and
Data Engineering, 6(1):86-94, 1994.
[15] M. J. Egenhofer and R. D. Franzosa. Point-Set Topological Spatial Relations. Int. Journal of
Geographical Information Systems, 5(2):161-174, 1991.
[16] M. J. Egenhofer and J. Herring. A Mathematical Framework for the Definition of Topological
Relationships. 4th Int. Symp. on Spatial Data Handling, pp. 803-813, 1990.
[17] M. J. Egenhofer and J. Herring. Categorizing Binary Topological Relations Between Regions, Lines,
and Points in Geographic Databases. Technical Report 90-12, National Center for Geographic
Information and Analysis, University of California, Santa Barbara, 1990.
[18] M. J. Egenhofer and D. Mark. Modeling Conceptual Neighborhoods of Topological Line-Region
Relations. Int. Journal of Geographical Information Systems, 9(5):555-565, 1995.
[19] M. J. Egenhofer. Deriving the Composition of Binary Topological Relations. Journal of Visual
Languages and Computing, 2(2):133-149, 1994.
[20] M. J. Egenhofer, E. Clementini, and P. Di Felice. Topological Relations between Regions with Holes.
Int. Journal of Geographical Information Systems, 8(2):128-142, 1994.
[21] ESRI Spatial Database Engine (SDE). Environmental Systems Research Institute, Inc., 1995.
[22] S. Gaal. Point Set Topology. Academic Press, 1964.
[23] V. Gaede and O. Günther. Multidimensional Access Methods. ACM Computing Surveys, 30(2):170-
231, 1998.
[24] R. H. Güting and M. Schneider. Realm-Based Spatial Data Types: The ROSE Algebra. VLDB Journal,
4:100-143, 1995.
[25] R. H. Güting, T. de Ridder, and M. Schneider. Implementation of the ROSE Algebra: Efficient
Algorithms for Realm-Based Spatial Data Types. Int. Symp. on Advances in Spatial Databases, 1995.
[26] J. M. Hellerstein. Practical Predicate Placement. ACM SIGMOD Int. Conf. on Management of Data,
pp. 325-335.
[27] J. M. Hellerstein and M. Stonebraker. Predicate Migration: Optimizing Queries with Expensive
Predicates. ACM SIGMOD Int. Conf. on Management of Data, pp. 267-276, 1993.
[28] Informix Geodetic DataBlade Module: User's Guide. Informix Press, 1997.
[29] JTS Topology Suite. Vivid Solutions. URL: http://www.vividsolutions.com/JTS/JTSHome.htm.
[30] M. McKenney, A. Pauly, R. Praing, and M. Schneider. Dimension-Refined Topological Predicates.
13th ACM Symp. on Geographic Information Systems, pp. 240-249, 2005.
[31] OGC Abstract Specification. OpenGIS Consortium (OGC), 1999. URL:
http://www.opengis.org/techno/specs.htm.
[32] Oracle8: Spatial Cartridge. An Oracle Technical White Paper. Oracle Corporation, 1997.
[33] M. A. Rodriguez, M. J. Egenhofer, and A. D. Blaser. Query Pre-Processing of Topological Constraints:
Comparing a Composition-Based with Neighborhood-Based Approach. Int. Symp. on Spatial and
Temporal Databases, LNCS 2750, pp. 362-379. Springer-Verlag, 2003.
[34] M. Schneider. Spatial Data Types for Database Systems - Finite Resolution Geometry for Geographic
Information Systems, volume LNCS 1288. Springer-Verlag, Berlin Heidelberg, 1997.
[35] M. Schneider. Implementing Topological Predicates for Complex Regions. Int. Symp. on Spatial Data
Handling, pp. 313-328, 2002.
[36] M. Schneider. Computing the Topological Relationship of Complex Regions. 15th Int. Conf. on
Database and Expert Systems Applications, pp. 844-853, 2004.
[37] M. Schneider and T. Behr. Topological Relationships between Complex Lines and Complex Regions.
Int. Conf. on Conceptual Modeling, 2005.
[38] M. Schneider and T. Behr. Topological Relationships between Complex Spatial Objects. ACM Trans.
on Database Systems, 31(1):39-81, 2006.
[39] M. Schneider and R. Praing. Efficient Implementation Techniques for Topological Predicates on
Complex Spatial Objects: The Exploration Phase. Technical report, University of Florida, Department of
Computer & Information Science & Engineering, 2006.
[40] M. F. Worboys and P. Bofakos. A Canonical Model for a Class of Areal Spatial Objects. 3rd Int. Symp.
on Advances in Spatial Databases, LNCS 692, pp. 36-52. Springer-Verlag, 1993.



