Self-organizing features for regularized image standardization

Material Information

Self-organizing features for regularized image standardization
Gökçay, Didem ( Gokcay, Didem ), 1966- ( Dissertant )
Harris, John G. ( Thesis advisor )
Place of Publication:
Gainesville, Fla.
University of Florida
Publication Date:
Copyright Date:


Subjects / Keywords:
Atlases ( jstor )
Conceptual lattices ( jstor )
Dimensionality ( jstor )
Images of transformations ( jstor )
Interpolation ( jstor )
Landmarks ( jstor )
Pixels ( jstor )
Standardization ( jstor )
Topology ( jstor )
Triangulation ( jstor )
Computer and Information Science and Engineering thesis, Ph. D ( lcsh )
Dissertations, Academic -- Computer and Information Science and Engineering -- UF ( lcsh )
Human face recognition (Computer science) ( lcsh )
Image processing -- Digital techniques -- Standards ( lcsh )
Self-organizing maps ( lcsh )
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )


Image standardization is an important preprocessing step in several image processing applications. In neuroimaging, by reducing normal variability through the standardization of brains, functional activity from multiple subjects can be overlaid to study localization. Furthermore, variability outside normal ranges can be used to report abnormalities. In automatic facial expression recognition, by standardizing the facial features, the accuracy of the facial expression recognition can be increased. The current standardization methods are mostly based on global alignment and warping strategies. However, global standardization methods fail to align individual structures accurately. In this study, we propose a feature-based, semi-automatic, non-parametric, and non-linear standardization framework to complement the existing global methods. The method consists of three phases: In phase one, templates are generated from the atlas structures, using Self-Organizing Maps (SOMs). The parameters of each SOM are determined using a new topology evaluation technique. In phase two, the atlas templates are reconfigured using points from individual features, to establish a one-to-one correspondence between the atlas and individual structures. During training, a regularization procedure can be optionally invoked to guarantee smoothness in areas where the discrepancy between the atlas and individual feature is high. In the final phase, difference vectors are generated using the corresponding points of the atlas and individual structure. The whole image is warped by interpolation of the difference vectors through Gaussian radial basis functions, which are determined by minimizing the membrane energy. Results are demonstrated on simulated features, as well as selected sulci in brain MRIs, and facial features. There are two significant advantages of this system over the existing standardization schemes: increased accuracy and speed in the alignment of internal features. Although our framework does not handle standardization of global shape and size differences, it can easily be used as a complementary module for the existing global standardization techniques, to increase precision of local alignment. ( ,,, )
KEYWORDS: Image standardization, image warping, self-organizing maps, regularization, brain MRI
Thesis (Ph. D.)--University of Florida, 2001.
Includes bibliographical references (p. 109-116).
General Note:
Title from first page of PDF file.
General Note:
Document formatted into pages; contains ix, 117 p.; also contains graphics.
General Note:
Statement of Responsibility:
by Didem Gökçay.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
50743800 ( OCLC )
002729348 ( AlephBibNum )
ANK7112 ( NOTIS )


This item has the following downloads:

Full Text







Copyright 2001


Didem G6kgay

To the girls in our family:

my mom, who made learning a lifelong quest,

my sister, who discovered the champion within.

In celebration of our triple graduation,

with a MS in political sciences, an MBA, and a PhD in computer sciences,

at the turn of this millennium.


During the course of my dissertation, I received tremendous support. Dr. John

Harris deserves the best compliments for supervision. After I started working with him, a

dissertation so boring and monstrous became a challenge, a quest, of which I enjoyed

every minute. His technical judgement helped put a lot of quality into this work. I am

greatly indebted for all his efforts.

Dr. Ugur Halici from Middle East Technical University was a big influence when

I made the decision to pursue PhD. studies. I would like to thank her for being a

wonderful role model. Dr. Dawn Bowers always helped in hardships, as a friend, mentor,

and boss. I am grateful to Dr. Bruce Crosson for his support, and Dr. Richard Briggs for

his valuable time spent in the collection of the brain images used in this study. I have

learned a lot from them and I feel priviledged to have been a part of their research groups

for a long time. During times when I needed anatomical guidance, Dr. Tiana Leonard was

always available. I appreciate her patience in continually explaining to us engineers the

extent of anatomical variability. By her help, I learned that the precision expected in brain

anatomy by engineers is nonexistent. I am also indebted to Dr. Sartaj Sahni, who did not

hesitate to take charge at a very difficult time. And I would like to thank Dr. Jose

Principe for serving on my committee.

Some hidden heroes were also with me throughout this study. Mr. John Bowers,

our graduate secretary, and Dr. Doug Dankel, our graduate coordinator, helped

tremendously with official procedures and problems.

As a last word, I would like to mention that completing my doctoral studies has

been more of an emotional challenge than a technical one. Without the emotional support

of my husband, my mom, my dad, my sister, and most importantly my son, Tugra, I

would not have accomplished this feat. Erhan's perseverance, along with all my family's

confidence, restored my willingness to fight back at difficult times.



A C K N O W L E D G M E N T S .................................................................................................. iv

A B S T R A C T ................................ .....................................................v iii


1. IN T R O D U C T IO N ........................................................... ...................... ................ 1

1. 1. P relim in aries ................................................ ..... ......... ...... 3
1 .2 O v erv iew ................. ...................................................................... .... 6
1.2.1. Preprocessing Modules (Al, A2 and B1, B2)............................................. 7
1.2.2. T em plate C reaction (A 3) ........................................... ............................. 8
1.2.3. Feature A lignm ent (B 3) ........................................... ............................. 8
1.2.4. Im age W arping (B 4).............................................. ................................ 9
1.3. Im portant Issues .... ........................ ........ .............................. . ............ 12

2. B A C K G R O U N D ............................................................ .. .......... .. .............. 14

2.1. Im age M orphing Survey.......................................... .................................. 18
2.1.1. Voxel/pixel Based Morphing Methods ................................. .............. 18
2.1.2. Surface/boundary Based Morphing Methods ............................................20
2.2. Im age Standardization Survey ......................................... ...................................... 22
2.2.1. V oxel-based Standardization................................................. ....... ....... 22
2.2.2. Surface-based Standardization .................................................................. 25


3.1. Evaluation of the Quality of Topology Preservation ................... ...................... 31
3.1.1. Background on Methods for Determining Intrinsic Dimensionality ............. 32
3.1.2. Background on Methods for Evaluating Topology Preservation ................ 33
3.2. Proposed Method for Evaluating the Quality of the Atlas SOM Template........... 37
3.2.1. Delaunay Triangulation with Respect to Anchors ..........................................38
3.2.2. Intrinsic Dimensionality Estimation................. .......................................... 42
3.2.3. Calculation of Extended Topology Function ........................................... 43

F U N C T IO N S .................................................................................................................... 5 1

4 .1. Self-organized A lignm ent ................................................................ .................... 5 1
4.1.1. Background on Regularization Approaches.................................................. 52
4.1.2. Proposed Method for Regularized Feature Alignment ................................ 55
4.2. Proof of Preservation of Order Between the Template and Individual SOMs....... 59
4.3. Warping with Radial Basis Functions....................... .................. 63
4.3.1. Optimization of RBF Parameters ............................... .............. 63
4.3.2. Boundary Conditions ......... .................................... ............... .............. 68
4.3.3. Generalization to Multiple Features..... ................................. 69


5.1. D ata C collection and Preprocessing ........... ............. ....................................... 77
5.1.1. Transformation into Standard Coordinates ...............................................77
5.1.2. Feature Extraction ......... ........................... ........ ... .... .. ...... .... 78
5.2. Standardization............................................. .......... .................. .......... .. 80
5.2.1. Tem plate Generation ....................................................... ....... 80
5.2.2. Self-organized Alignment and Warping ................................ ............. 83
5.3. Perform ance Evaluation .............. ................................................. .............. 87
5.3.1. Q uantification of Errors .............. ......................................................... 87
5.3.2 C om parative E valuation ........................................................ .... .. .............. 89

6. 2D APPLICATIONS: STANDARDIZATION OF FACES.................................... 95

6.1. D ata C collection and Preprocessing ......... .. ............... ...................................... 95
6.1.1. Transformation into Standard Coordinates ................................. ...............96
6.1.2. Feature Extraction .............................................. ........ .. .......... 96
6.2. Standardization ............................................. .......... .................. .......... .. 98
6.3. Perform ance Evaluation ......................................................... .............. 100

7. C ON CLU SION ..................................... ................. ............ .............. .. 102

OPERATION PRINCIPLES OF LOFA ..................................................................... 105

LIST O F REFEREN CE S ................................................... ................................. 109

B IO G R A PH IC A L SK E T C H .......................................................................................... 117

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Didem G6kcay

May 2001

Chairman: John G. Harris
Major Department: Computer and Information Sciences and Engineering

Image standardization is an important preprocessing step in several image

processing applications. In neuroimaging, by reducing normal variability through the

standardization of brains, functional activity from multiple subjects can be overlaid to

study localization. Furthermore, variability outside normal ranges can be used to report

abnormalities. In automatic facial expression recognition, by standardizing the facial

features, the accuracy of the facial expression recognition can be increased. The current

standardization methods are mostly based on global alignment and warping strategies.

However, global standardization methods fail to align individual structures accurately.

In this study, we propose a feature-based, semi-automatic, non-parametric, and

non-linear standardization framework to complement the existing global methods. The

method consists of three phases: In phase one, templates are generated from the atlas

structures, using Self-Organizing Maps (SOMs). The parameters of each SOM are

determined using a new topology evaluation technique. In phase two, the atlas templates

are reconfigured using points from individual features, to establish a one-to-one

correspondence between the atlas and individual structures. During training, a

regularization procedure can be optionally invoked to guarantee smoothness in areas

where the discrepancy between the atlas and individual feature is high. In the final phase,

difference vectors are generated using the corresponding points of the atlas and individual

structure. The whole image is warped by interpolation of the difference vectors through

Gaussian radial basis functions, which are determined by minimizing the membrane


Results are demonstrated on simulated features, as well as selected sulci in brain

MRIs, and facial features. There are two significant advantages of this system over the

existing standardization schemes: increased accuracy and speed in the alignment of

internal features. Although our framework does not handle standardization of global

shape and size differences, it can easily be used as a complementary module for the

existing global standardization techniques, to increase precision of local alignment.


The objective of this dissertation is to develop a fast and efficient framework for

image standardization. Image standardization is usually considered as a preprocessing

step in imaging applications. One example application is facial expression recognition, or

expression invariant face recognition. In order to differentiate between facial expressions,

the proposed algorithms should be invariant for lighting, pose and facial feature

differences. Developing invariant algorithms for lighting and pose differences is

somewhat simpler than developing invariant techniques for facial features, since the

changes in facial features is the main quantity that the algorithms must differentiate.

Therefore, facial feature standardization, when performed as an initial step for facial

expression identification, becomes very useful in eliminating the expression differences

caused by the variability of facial features.

Neuroimaging is another important application. Identification of the anatomical

correlates of functional activity in the brain through intersubject studies is an important

problem in neuroimaging. In order to investigate functional mapping of the brain and

connectivity of related anatomic structures, data must be pooled across subjects.

Unfortunately, anatomical variability introduced by individual subjects hinders

overlaying the functional activity data from multiple subjects. Combination and general

interpretation of functional activity images are possible only after an adequate amount of

standardization is performed in the individual anatomic images. Another objective in

neuroimaging is the detection of abnormal brain structures. After standardization of

individual brain images, an automatic comparison with a probabilistic atlas could be used

to reveal deviations from normal variability.

As discussed in the survey presented in chapter 2, the global alignment strategies

used by the current standardization methods are efficient in minimizing general size and

shape differences. However, global methods are highly computational, and less accurate

in aligning the individual features. In independent studies, some of the methods presented

in chapter 2 are subjected to tests which enhanced the global warping schemes with

interactive tools to align individual structures. Significant improvements are observed in

the registration errors in brain images by coupling global and local methods in this way

(Collins and Evans, 1999; Davatzikos, 1997; Thompson and Toga, 1996). Therefore in

our study, we focused on the development of a method complementary to global warping

techniques, that could improve accuracy in the alignment of individual structures with

low computational cost.

First, the variability in the features needs to be addressed. In some contexts such

as brain images, the variability of the features is large. Some of the features are

interrupted, branchy, or even non-existent, while others follow a general curvature pattern

(Ono et al., 1990). The performance of a fully automatic alignment algorithm is

unpredictable for this type of variability. On the other hand, manual methods offer the

precision sought for, but it is infeasible to trace and extract every single feature manually

if the number of features are high. Therefore, middle ground solutions can be found for

the accuracy versus feasibility dilemma in the alignment of highly variable features

through semi-automatic methods.

The other issue we need to concentrate on is speed. We observed that the feature

alignment strategies in image morphing are not used in the current image standardization

techniques, mostly because the image standardization techniques are preferably global,

whereas the morphing techniques are local. If a subset of individual structures are chosen

as features to be aligned, a whole body of literature in current image morphing

applications can be utilized for faster image warping. As indicated at the beginning, there

is a need for complementary local methods in image standardization. Therefore, adopting

local strategies similar to those proposed in the current image morphing literature may

speed up the alignment process of individual features.

We developed a framework for fast and efficient semi-automatic feature-based

image standardization. Our scheme combines self-organizing features and scattered data

interpolation techniques in the alignment and warping steps respectively. For simplicity,

at this point, we assume there is only one feature to be aligned. Generalization of the

alignment and warping procedures for multiple features will be presented in chapter 4.

1.1. Preliminaries

In this section, notation is introduced, the basic operations in our standardization

scheme are overviewed, and the data structures and algorithms are explained.


a) Ab) cB) d)

Figure 1. Illustration of features on a hypothetical atlas and individual image


Let IA and IB be the atlas and individual images, as shown in figure la and lb

respectively. Consider the same structural feature, AE IA, and Be IB in these images,

such that

A = {i, O2, ., CM} and B = {P1, P2, ..., N}

where am, m = 1,.., M and 3n, n = 1,.., N are points in 2d or 3d space in which IA and IB

are defined, and M and N are not constrained to be equal.

Given two images, IA and IB, and corresponding features, A and B, the basic

operation principles for the alignment and warping steps are as follows. Find a mapping

between A and B, and derive a displacement vector, dam, for each point, 0m e A. Then,

assign displacement vectors, daj, for all other pixels in the image, aj e IA (ajzm). This is

done by calculating dist(aj, cm)1, followed by interpolation of daj from the known

displacement vectors, dam. After determining all the displacement vectors, daj, obtain the

warped image, IB', by assigning an intensity value to each pixel, aj, such that IB'(aj) =

h(IB, aj + daj). This type of intensity assignment is called backward mapping (Wolberg,

1990). This way, all pixels in the image space are guaranteed to have intensity

assignments. In order to accommodate several different types of intensity interpolation,

the function h can be chosen from bilinear, nearest neighbor, or more complex


Since the features on the atlas image and the individual image are not constrained

to contain an equal number of points, finding a non-parametric topology preserving one-

to-one mapping between the atlas and individual feature is challenging. In our feature

1 dist is Euclidian distance

alignment scheme, we used self-organizing maps (SOM) developed by Kohonen (1988)

to vector quantize the features A and B into equal number of anchor points, while

preserving the topology during the mapping process. Hypothetical SOMs for features A

and B are illustrated in figure Ic and Id respectively.

Self-Organizing Maps:

Let X be a set of input vectors, X={xl, x2, ..., xj}, where xj e Rn. Let SOM = (S,

L) be a graph, such that, S = {si, s2, ..., sK}, is a set of nodes, along with a uniform lattice

structure, L, that defines Id, 2d, or 3d connectivity between Sk, k=1,...,K. Define a weight

vector for each Sk such that wk e Rn. W = {wi, w2, ..., wK} is called the codebook. The

objective is to project the input space X on to the SOM, such that

The codebook vectors, wk e W quantize X with minimal quantization


The topological distribution of input space is reflected on S, and W,

through the connectivity of the lattice.

Kohonen (1988) developed the following algorithm to achieve the above


Initialize wk to small random values

Given xj, determine a winner, w, with closest similarity to xj:

Ilw xjll |lwk- xjll, k=l, ..., K (1)

update wk iteratively for all inputs xj in the input set such that

Wk(t) = wk(t-l) + g(d,t) [xj wk(t-l)], k=1,...,K (2)

where g(d, t) is a gain term with monotonically decreasing amplitude, a(t),

and range, r(t), such that

g(d,t)= a(t)e-(d r(t)2) (3)

Here, d implicates the lattice distance, Il||.s of node Sk to winner node Sc:

d= |sc- Sklls

And 0
Intuitively, the SOM algorithm iterates to make W resemble X, in such a way that

close values of X are mapped onto neighboring nodes in S. This is illustrated in figure 2

below. In figure 2, X covers the shaded area, the vertices are Sk, plotted at locations wk,

and the edges are members of L.

a) Before training b) Mid-point in training c) End of training
Figure 2. Change of a 2d Self-Organizing Map During Training

1.2. Overview

Using the SOM algorithm, standardization can be achieved in two phases, as

illustrated in figure 3. In phase I, a SOM template is generated from the atlas feature,

using the original SOM algorithm. Once the atlas SOM template is created, it remains

unchanged throughout the standardization of individual images. In phase II, the

individual image is normalized in two basic steps: 1. Matching the individual feature to

the atlas feature by using the atlas SOM template as a model, and 2. Warping the entire

image accordingly. Phase II is repeated for each individual image.

Transformation Transformation
into standard coordinates into standard coordinates
(Talairach transformation) (Talairach transformation)
Al B1
Preprocessing \/ \ Preprocessing
Feature extraction Feature extraction
(Manual segmentation) (Automatic segmentation)
A2 B2

\1/ \J

Template creation
(SOM training)

Feature alignment
(Regularized SOM training)

Image warping
(Scattered data interpolation)

a) Phase I: Generation of atlas b) Phase II: Standardization of
template individual image
Figure 3. Image standardization scheme based on self-organizing features

1.2.1. Preprocessing Modules (Al, A2 and B1, B2)

The initial steps of phase I and II are similar. First, orientation and scaling

differences between the atlas and individual images are corrected. This step involves

using affine transformations to register the images into a standard coordinate system. For

brain images, one widely accepted coordinate system is Talairach (Talairach and

Tournoux, 1988). Transformation into Talairach coordinates can be achieved by either

fully automatic (Collins et al., 1994) or manual (Cox, 1996) methods.

The next step involves the extraction of features. It is feasible to maximize the

precision of feature extraction on the atlas image through manual tracing, since the atlas

image is processed only once. However, for the extraction of the feature on the individual

images, automatic or semi-automatic methods are preferred. In neuroimaging, one semi-

automatic method proposed recently involves thresholding and skeletonizing the brain

images to obtain a sulcal map of the brain (Royackkers et al., 1999). In this method,

depth and variability information on individual sulci are used to obtain a topographical

representation of individual structures. The adoption of such a semi-automatic method in

the feature extraction step in phase II is preferred since it is both feasible and accurate.

1.2.2. Template Creation (A3)

In this step, the atlas feature is modeled through a SOM template. Given the

mechanics of the SOM, the operations that are required for the creation of the atlas SOM

template are summarized as follows:

Generate the atlas template, SOMA, and initialize with random weights,


Set the initial radius, r(0), so that it is equal to the diameter of SOMA.

Additionally, determine a suitable initial gain value, c(0).

Given points cm e A from the atlas feature, train vk using (2) .

Use the final weight values, Vk(ted), where ted is the time and the end of

training, as the anchor points of the atlas feature.

This completes phase I of standardization.

1.2.3. Feature Alignment (B3)

In this step, a new template, SOMB, is constructed for the feature B in the

individual image by using the atlas template as a model. Initially the positioning of the

atlas template, SOMA, is used for setting the weights of SOMB. Then the points of the

individual feature are used to retrain SOMB, to vector quantize the feature B. During

training, the regularization procedure described in chapter 4 can be optionally invoked.

This is due to the fact that perfect alignment is not desirable at areas where the

discrepancy between the atlas and the individual feature is unacceptably high.

The operations involved in feature alignment can be summarized as follows:

Generate SOMB, such that its size and dimensionality are the same as

those of SOMA.

Initialize the weights, wk(0), such that wk(0) = vk(tend).

Set the initial neighborhood size, r(0), to a value less than the diameter of

SOMB, and determine a suitable initial value for the gain term, c(0).

Given points P3n E B from the individual feature, train wi using the update

rule in (2).

Determine the anchor points of the individual feature as the final weight

values, Wk(tend), where tend is the time at the end of training.

1.2.4. Image Warping (B4)

This is the final step of phase II. Using the one-to-one correspondence between

the atlas and individual SOMs established in the above steps, difference vectors are

generated for each anchor point. The whole image is warped by interpolating the

difference vectors at each pixel from the difference vectors at the anchor points.

Interpolation is done using Gaussian radial basis functions (RBF) centered at the anchor

points of the atlas SOM template. The whole procedure can be summarized as follows.

Let T be a 3d transformation: T = (Tx, Ty, Tz), where Tx(x, y, z), Ty(x, y, z), Tz(x,

y, z), are real valued multivariate functions, R3-- R. Compute d(aj), the displacement

vector, at voxel aj e IA, by:

d(aj) = (Tx(aj), Ty(aj), Tz(aj)), where the x component, Tx is equal to

K -(lla -v 112/a 2)
Tx(aj)= Xcke k k (4)

and ck are obtained by using the known displacement vectors, d(vk), at the anchor points,

solving a set of K linear equations:

Tx(vk) = dx(vk), and d(vk) = Vk-Wk (5)

Here, d(vk) shows the discrepancy between the corresponding codebook vectors in SOMA

and SOMB, and dx(vk) denotes the x component of d(vk). The parameters, ok, need to be

determined by the user, or some procedure that guarantees the smoothness of the warp, as

explained in chapter 4. Interpolation of the other components, Ty and Tz is obtained

similarly. Through this warping scheme, the displacements fade out smoothly as voxel

aj's distance from the feature increases. This completes phase II of standardization.

In figure 4, the proposed scheme is illustrated on a simulated image with a linear

feature. Figure 4a and 4b contain the original atlas and individual images and their

transformation into standard coordinates. The template SOM and the individual SOM are

chosen to be Id, as illustrated in figure 4c. The anchor point locations are indicated

through triangles. The SOM algorithm distributes the anchor points somewhat evenly

along the given feature space (Kohonen, 1988). Figure 4d shows the result of the

alignment and warping steps, in which regularization is not performed. As seen in figure

4d, the perfect alignment of the anchor points without regularization causes an


a) Original images

b) Images in standard coordinates



c) SOM representation of features

d) Warped image
Figure 4. Image standardization steps on a simulated feature

unacceptably high deformation at the end points of the individual feature. This is due to

the fact that the discrepancy between the atlas feature and the individual feature is not

homogenoeus. The discrepancy is high at the end points, and low at the middle points.

Therefore a perfect alignment strategy causes high tension, or stretching, at the areas

where discrepancy is above average.

1.3. Important Issues

In summary, the standardization procedure proposed above works as follows. It is

necessary to develop a SOM representation of the individual feature through the use of

the atlas feature in order to establish a correspondence between the homologous points on

the features. Once the correspondence is established, discrepancies between

corresponding anchor points can be determined, and spatial transformation of the whole

image can be performed by interpolating these discrepancies. There are several

challenges with this approach:

1) The choice of SOM size and dimensionality affects the accuracy of the

alignment dramatically. Because if the topology of a specific feature is not represented

accurately by the anchor points, matching of the corresponding anchor points of atlas and

individual images will be irrelevant. In this study, we developed our own technique to

evaluate the quality of the topological matching between the feature and the anchor

points. Using this technique, the SOM parameters can be adjusted to reflect the topology

of the feature in a better way. This is discussed in chapter 3.

2) Discrepancies in some areas of the features might be high, due to large

variations between specific parts of the features. Therefore, the spatial transformations in

the vicinity of these areas are likely to contain unacceptable deformations. An example of

this situation is presented in figure 4. The deformations at the end points of the feature in

figure 4 are much larger than deformations elsewhere. This problem can be dealt with

using a regularization approach. We extended the SOM algorithm to align areas with high

mismatches differently, so that the discrepancies between the atlas and individual features

are regularized. The new SOM update rule for regularized alignment is illustrated in

chapter 4.

3) A foldover effect might occur in the final warped image, when the anchor point

sets of the atlas feature and the individual feature do not reflect a similar topological

spread, or when the parameters of the interpolation procedure are not chosen

appropriately. The foldover effect is caused by multiple spatial assignments for a specific

point. Usually, in the warped images, the foldover effect presents itself as an interruption

of a smooth transition. We used membrane energy to optimize the parameters of the

warping scheme towards better smoothness, and to avoid foldovers. This is demonstrated

in chapter 4.

The layout of the rest of this dissertation is as follows. In chapter 3, a new method

is presented for evaluating the preservation of topology after modeling specific features

with SOM. Using this method, the size and dimensionalities of the SOMs are adjusted to

represent the given structures in a better way. In chapter 4, the details of the alignment

and warping procedures are explained. Applications are studied in chapters 5 and 6, for

3d and 2d images. Standardization of brain MR images and face images are selected as

3d and 2d applications.


In image processing, there are several related procedures that transform images

such as standardization, warping, morphing, and registration. It is essential to point out

the differences and key components of all of these procedures before discussing the

existing image standardization methods in the literature.

In a broad sense, image warping can be defined as computing mapping functions

that define a spatial relationship between all points in two images, Ii and I2. In image

morphing, a series of images IT1 to ITn are generated such that a visually pleasing fluid

transformation of I to 12 is obtained when the images I1, IT1, IT2, ..., ITn, 12 are displayed

in sequence. On the other hand, in image registration, the objective is to find a

correspondence function, M:I1- I2. Through the use of M, each pixel in image I1 can be

labeled using the predetermined labels in the corresponding pixel in image I2. In image

standardization, I1 and I2 are samples from the same population of images, but contain

some natural shape and size variability that should be minimized to facilitate comparison

and/or aggregation of information related to the images. The goal is to normalize I1, so

that it becomes more like the standard image, 12, in terms of its shape and size. For this

purpose, a standardization transformation, S:11- I1', is sought, where Il' is the image in

which a similarity criterion between Ii'and I2 is maximized.

There are several key components that are common to all of these image warping

problems. A general categorization of the suggested techniques is possible by making a

classification of the different approaches taken in each of these components. In the

following, we summarize the five key components of all image warping systems:

1. Object representation:

In the given images, the object to be warped can be represented either as a set of

pixels, or as a boundary. The 3d equivalent of these is a set of voxels versus a surface.

The choice of representation usually determines which methods are applicable in the

following components.

2. Alignment:

The alignment strategies can be classified as feature-based or global. Generally,

for visual effects, warping is performed through either linear or nonlinear alignment of a

set of features. Alternatively, in other methods, a global measure is minimized for finding

the best warping transformation. Minimum squared distance between the two images, I1

and 12, for instance, is a widely accepted global measure for alignment.

3. Feature Extraction:

Depending on the alignment strategy, the feature extraction step can be

categorized as follows: no feature extraction, automatic, or manual feature extraction.

The choice of object representation determines how features are treated. In pixel/voxel

based representations, a feature consists of a set of anchor points. On the other hand, in

boundary/surface representations, features are treated through a parametric curve or a

geometric surface model.

4. Warping:

Through warping, for every pixel, ii, in the image, Ii, a displacement vector is

determined which maps ii into its new location ii'. Warping strategies are contingent

upon the alignment and feature extraction steps. Broadly, we can generalize warping

approaches as either global or local. While global techniques lead to either linear or

nonlinear deformations, local techniques must be nonlinear. Affine and polynomial

transformations are two example warping methods, one global linear, the other global

nonlinear. Scattered data interpolation techniques, on the other hand, are widely used for

warping images locally and nonlinearly.

5. Resampling:

Once the deformations are determined spatially through warping, the only

remaining question is how to determine the intensity values for the newly mapped pixels.

This is merely an interpolation problem, and is not the major concern in this study. For

curious readers, a good survey is provided in Wolberg (1990).

At this point, before surveying the image morphing and image standardization

literature, it might be beneficial to illustrate the image standardization problem on some

real images. In figure 5, activations in functional magnetic resonance images (fMRI)

during a word generation task is shown. The anatomical detail in the individual image of

figure 5a is lost after averaging the anatomy for 28 subjects. This is due to the variation

of individual structures. By looking at the normalized average fMRI in figure 5b, it is

hard to tell the precise location of the average activity. Figure 6 illustrates the variability

of the cingulate sulcus, a relevant anatomical structure in word generation, on a sagittal

slice 1 cm. lateral from the midline. The discrepancy between the atlas1 and the

individual images reaches as much as 1.5 cm at some specific locations.

1 The atlas is chosen randomly from among 10 individual images. Scanner parameters used in the
acquisition of these images are given in chapter 5.


On the other hand, for the facial expression image standardization application,

some example images are provided in figure 7. The features are determined by using the

important landmark points in dental and maxillofacial surgery literature, as explained in

chapter 6. As seen in figure 7b, significant differences in facial features are observed,

when all features from ten subjects are overlaid. The effect of these differences are

observed as blurry areas in the average image of figure 7c.

a) Single brain fMRI b) Average brain fMRI
Figure 5. Activations in fMRI of word generation

a) Atlas image with tracing b) Atlas image with overlaid traces
from 10 brains
Figure 6. Anatomical variability in the cingulate sulcus


.... .. . ...... ..... .4

a) Standard image b) Standard image with c) Average image
features from 10 subjects of 10 subjects
Figure 7. Variability of facial features

2.1. Image Morphing Survey

Generation of visual effects by morphing one object into another has become a

popular application in the computer graphics field. Most of the problems addressed in the

morphing literature are relevant to image standardization, excluding transition control2. In

the following, we present these methods based on recent surveys of the morphing

literature (Wolberg, 1998; and Lazarus and Verroust, 1998) skipping the transition

control procedures. Our presentation below is split into two categories with respect to

object representation: voxel/pixel-based and surface/boundary-based morphing.

2.1.1. Voxel/pixel Based Morphing Methods

fplhinig 11 ith radial basis functions: This method is developed by Arad et al.

(1994) and Arad and Reisfeld (1995), for warping 2d images, and extended by Cohen et

al. (1998) for 3d. Some performance improvements are provided by Ruprecht and Muller

(1995). In this method, pairs of anchor points are specified on Ii and I2 interactively. The

2 Transition control addresses how to generate and sequence a set of morphs so that the transformation is
visually fluid.

mismatches between the corresponding anchor points are used as displacement vectors to

be interpolated for all other pixels in the image. The interpolation is done by radial basis

functions centered at the anchor points. The choice of radial basis functions is as

important as the choice of the basis function parameters (Arad and Reisfeld, 1995;

Ruprecht and Muller, 1995). By changing either the radial basis function type, or the

parameters of the radial basis functions, a range of deformations from global to local is

possible. Some popular radial basis functions are Gaussian, Hardy's multiquadratic, and

thin-plate splines. This method can be classified as a nonlinear warping method with

feature-based alignment, in which features are extracted manually. The basic advantage

of this method is that visually pleasing warps can be obtained using only a few anchor

points. Therefore, the definition of the features is not heavily user interactive. Another

advantage is that the method is not model dependent, since any type of feature can be

reduced to a set of points. One important disadvantage is that the radial basis functions do

not have compact support, therefore in order to determine the displacement vector at each

pixel, all radial basis functions should be evaluated. This is computationally intensive.

Ruprecht and Muller (1995) and Arad and Reisfeld (1995) provide examples of a few

compact support radial basis functions that perform well, although theoretically there is

no guarantee that the radial basis function coefficients are solvable for these functions.

Fieldmorphing: This method is developed for 2d images by Beier and Neely

(1992) and adapted by Lerios et al. (1995) for 3d. During alignment, multiple line pairs

are drawn to represent the features in Ii and I2. The mapping of the points in the vicinity

of the lines can be determined using their distances from the lines. Since multiple line

pairs are given, the displacement of each pixel is actually a weighted sum of the mapping

due to each line pair, where weights are attributed to distance and length. In terms of

classification, this method is global, nonlinear and performs alignment through manually

extracted features. However unlike the radial basis function morphs summarized above,

in this method, features are model-dependent: lines in 2d, and boxes in 3d images. One

basic advantage is that the resulting morphs are very expressive. The two biggest

disadvantages are speed and control. For every pixel, all line segments must be

referenced to obtain the displacement, which increases the computational complexity. In

addition, occasionally, unexpected deformations, or ghosts, may be generated, which

requires adoption of irrelevant line segments to counter their effects.

2.1.2. Surface/boundary Based Morphing Methods

There are several methods that use lattices as intermediate structures to facilitate

warping. Free form deformation, warping with delaunay triangulation, and mesh warping

are three different methods developed by Sederberg and Parry (1986), Ruprecht and

Muller (1995) and Smythe (1990) respectively. In free form deformation, a uniform

lattice structure is used to embody the object to be deformed. In the other methods, the

lattice is non-uniform, and the vertices of the lattice are defined to coincide with

landmark points on the image.

Free form deformation (ffd) is achieved by first registering the pixels on the

object with the lattice coordinates. All the other pixels inside the lattice are then assigned

lattice coordinates with respect to their relationship with the pixels on the object. Bezier

polynomials are used to establish the relationship between the object and non-object

pixels. Finally, the user defines deformation points and displacements on the lattice,

which are then applied to the object that lies within. If the object to be deformed is

embodied fully within the lattice, then warping is global. Alternatively, local

deformations are possible by enclosing the object partially. Other versions of ffd, such as

extended ffd, rational ffd (Bechmann, 1994) and multi-level ffd (Lee et al., 1995) are also

available, to allow non-uniform lattices, utilize weighted control points and prohibit

foldover effects. Advantages of the free-form deformation are as follows. It can be

applied locally or globally, with derivative continuity. It can be used on surface or

polygonal models in addition to solid models. And parametric curves and surfaces remain

parametric under ffd. One disadvantage is that local ffd is restricted, forming a planar

boundary with the undeformed portions of the object.

Delaunay triangulation of scattered data points is a commonly used concept, and

its application to warping is an ordinary interpolation problem (Ruprecht and Muller,

1995). Initially, the object is partitioned into delaunay triangles3 using anchor points. For

warping, each pixel is associated with a triangle, and linear deformation is applied within

the triangular patches. Mesh warping (Wolberg, 1998) is quite similar to delaunay

triangulation. In mesh warping, first, topologically equivalent meshes are defined on

source and target images. Then warp generation is achieved through bicubic spline

interpolation. The advantage of non-uniform lattice methods is that warping is a

straightforward process. On the other hand, one basic disadvantage of the non-uniform

lattice-based techniques is that unwanted foldover effects can occur during deformation,

when the topological structures of the lattices on the source image and target image are

not equivalent. Another disadvantage of these methods is that defining landmarks for

every lattice vertex is heavily user-interactive.

3 The triangles that partition the object are chosen to maximize the minimum inner angle of triangles.

2.2. Image Standardization Survey

Developments in neuroimaging in the past five years have been the driving force

in the introduction of several standardization algorithms for brain images. Therefore, our

survey in this section is mostly based on 3d techniques developed in the context of brain

warping. The presentation below is split into two categories with respect to object

representation: voxel-based standardization and surface-based standardization.

2.2.1. Voxel-based Standardization

In the literature there are several voxel-based standardization methods that

suggest completely different alignment and warping strategies. In the following, we

survey the most interesting and successful methods case by case. A complete review can

be found in Toga (1999).

Woods et al. (1998a, 1998b) developed Automated Image Registration (AIR) to

achieve both linear and nonlinear alignment based solely on voxel intensities. The

warping transformation utilizes either affine or polynomial mapping in 3d. In order to

determine the parameters of the transformation, a cost function, ratio image uniformity

(RIU) is minimized. RIU is obtained by dividing deformed atlas intensities by individual

image intensities voxel by voxel, and calculating the standard deviation divided by the

mean. The performance of the algorithm for global shape standardization and alignment

of structures with small variability is good.

In a method developed by Ashburner and Friston (1999), the nonlinear spatial

transformations are described by a linear combination of smooth basis functions

consisting of the lowest frequencies of the 3d discrete sine or cosine transform. The

objective is to determine optimum coefficients for each of the bases while minimizing a

cost function with two terms. The first term in the cost function is the minimum square

difference between the deformed atlas and the individual image. And the second term is

the energy of a rubber membrane, to guarantee smoothness.

Bookstein (1996) observed that converting the given shapes into Procustes

geometry has several advantages. In this geometry of shapes, all multivariate methods in

statistics are applicable. In addition, splines are linear, and their eigenvalues are

orthonormal, resulting in the linearity of thin plate spline deformations in this space.

Therefore when linear standardization methods are used in the Procustes space, nonlinear

transformations are obtained in the real data space. This results in superior performance

of the Procustes-linear methods compared to the bounding box methods in the real data


Two low resolution methods suggested by Kosugi et al. (1993), and Lin and

Huang (1995) are interesting since these are the only methods based on self-organizing

maps. Self-organizing maps vector quantize a given data space forming a topological

layout of the codebook vectors. Independently Kosugi et al. (1993), and Lin and Huang

(1995) used a self-organizing strategy to determine the displacement vectors that define

the spatial transformation. The data space is partitioned blindly into cubes. The 3d

nonlinear deformation is achieved in a piecewise linear fashion, fitting cubical

neighborhoods in sequence. Each data cube in one volume is translated to achieve an

optimal match within the other volume. Cubes are allowed to move within a range

defined by the grid spacing in relation to the central node inside the cubical region. The

range which defines the translation shrinks as the iterations proceed, generating a stiffer

grid structure over time. Smoothing is achieved by allowing cubes to overlap, and by

performing interpolation accordingly. The two basic disadvantages of this method are

low resolution, and increased computation time due to the representation of the whole

data space with self-organizing maps.

The above low resolution method is upgraded by Collins and Evans (1999), using

a multi-resolution approach in a system named ANIMAL. The major distinguishing

feature of this method is that the resolution is refined as the iterations proceed. Another

difference is that the above method uses minimum square difference to determine the

optimal match, whereas ANIMAL uses normalized correlation of intensities. A penalty is

introduced in ANIMAL to avoid large transformations. This penalty, when combined

with multi-resolution, generates a stiffer grid throughout iterations. One major

improvement is observed in standardization with this technique, when the individual

image is segmented along with the atlas.

All of the procedures described so far involve small deformations, with the

assumption that small piecewise linear transformations lead to nonlinear deformations.

These methods fail to smoothly deform large volumes into small ones and vice versa. On

the other hand, a technique developed by Christensen et al. (1996) allows for large

deformations without jeopardizing smoothness. This is achieved by modeling the data

space as a viscous fluid medium, which allows stress within the deformed configuration

to relax with time. Thereby, the mechanical energy does not increase with the magnitude

of deformation. The deformations are obtained from the nonlinear partial differential

equations of continuum mechanics. The number of parameters are high, comparable to

the number of data points. Alignment is achieved by applying forces in the direction of

maximum similarity, which is evaluated by the gradient of the atlas image times the

intensity difference between the deformed atlas image and individual image. The results

of this method are very impressive, however applicability is limited, since it is very

computational. The authors were able to reach a solution using massively parallel

computers, however this type of computer is not widely available.

2.2.2. Surface-based Standardization

The surface-based methods develop a generic model of the object to be deformed.

Then, standardization is performed by applying forces in the direction of alignment. In

cases where the objects exhibit convoluted surfaces, it may be more efficient to transform

the 3d models into a flat map. Over the past decade, several techniques have been

developed by Bajcsy and Kovacic (1989), Drury (1996), Davatzikos (1997), Dale (1999), Fischl (1999a), and Thompson and Toga (1996) for surface-based

standardization of brain images. These methods can be summarized best by addressing

surface construction, flattening, and alignment separately.

Surface construction. In surface construction of a given object, the definition of

the surface is not always limited to the outer surface. For instance, in brain applications,

generally the outer surface of the brain (cortex), is used. However, sometimes, inner

surfaces, such as the boundaries of the ventricles, or the surface of individual anatomic

structures are also used (Thompson and Toga, 1996; Davatzikos, 1997; Bajcsy and

Kovacic, 1989). Surface construction can be achieved either by deforming a generic

surface to the given object (Davatzikos and Bryan, 1996), or vice versa (Fischl et al.,


In the first approach (Davatzikos and Bryan, 1996), a homothetic4 mapping

between a spherical surface and the surface of the brain is formed as follows. Initially, the

gray matter of the brain is segmented, to serve as the object to be modeled. Then points

on the generic surface (thin shell) are transformed such that, for a given surface point, the

distance between itself and the center of mass in a spherical neighborhood which includes

some portion of the gray matter is minimized. At equilibrium, the transformed points

collide with the centers of mass, and the intersection of the generic surface and the

segmented gray matter boundary is maximized. This process is coupled with additional

elastic processes to guarantee the homotheticity of the mapping. One problem with this

mapping is the inability to capture deep foldings. Solutions to this problem are sought by

Vaillant and Davatzikos (1997) through representation of deep foldings with active

contour models.

On the other hand, in the second approach (Fischl, 1999b), first the gray

matter white matter boundary is segmented. Then a triangular tesellation is formed on

this highly convoluted surface. This surface is inflated by minimizing an energy term

which consists of two parts. The first part of the energy term addresses metric distortions,

and the second part addresses smoothness through spring energy. Points on the original

surface are moved to minimize the energy term, resulting in the inflation of the crumbled

structures into smooth surface portions.

Flattening. The procedures used in warping are performed more efficiently in 2d.

Also, one problem with 3d representations is that, for highly convoluted surfaces,

distances measured in 3d underestimate the true distance along a folded 2d surface,

particularly in cases where the points lie on different sides of a folding. Therefore, the

4 Homothetic mapping guarantees that distance relationships are preserved under a global scaling factor.

generic 3d surfaces obtained from the transformation of original objects can be converted

into 2d, by introducing cutting points to flatten the 3d geometry into a map. An example

cutting method is developed by Fischl et al (1999b). The brain is first mapped into two

spheres, one for each hemisphere, using the procedure defined before. Then radial cuts

are defined on the sides of the spheres which face each other. One cut is introduced to

remove the midbrain structures, while 4-5 additional equally spaced incisions accomplish

separation of prominent anatomical structures.

Alignment. Several different types of alignment methods are suggested.

Davatzikos (1997) suggests a two level warping strategy, global and local. At the global

level, the surface representations of the individual and atlas brains are compared and

homothetic warping is performed using forces to align centers of mass of the

corresponding surfaces. At the local level, curvatures of the surfaces are matched in a

multi-scale fashion. In order to match curvatures, binary curvature maps of both

individual and atlas brains are obtained at coarse scale, and forces are applied in the

direction of minimum square distance. In both global and local alignment schemes, the

forces operate with linear elastic equations to perform warping of the entire object.

Another method suggested by Bajcsy and Kovacic (1989) also uses linear elastic

equations for warping. The forces are defined in the direction of maximum similarity,

where similarity is measured through four features in a weighted manner around a small

neighborhood of 3x3x3 voxels. The four features are the gray value average inside the

neighborhood, and gradients in three dimensions.

Convexity is another feature as powerful as curvature for alignment. Fischl,

(1999b) propose alignment through maximization of the correlation of the convexity

between the individual and the average images. On the other hand, interpolation is always

an alternative, as proposed by Thompson and Toga (1996). Given multiple surfaces to be

aligned, weighted linear combination of deformation functions can be calculated for

warping, where weight is modulated by the distance to the corresponding surface. This

warping method is similar to Beier and Neely's (1992), and that of Lerios et al. (1995), as

discussed in the image morphing section.

Unfortunately there is no standard way of comparing the efficiency of the

methods described in this section. Some of the methods are completely automatic, while

others require intervention. The datasets are different, and the validations are done using

different measures. Smoothness, although a very important issue addressed by most of

the methods, is not reported quantitatively in the results. Computation times also vary

from tens of minutes to several hours, some steps requiring overnight runs. After

comparing these methods from the information presented in the articles, we did not

observe a clear advantage of one over the other. Obviously a benchmarking procedure is

required to be able to differentiate the performance versus the computational cost, which

is out of the scope of this study.


As discussed in section 1.3, the selection of a SOM template with appropriate size

and dimensionality is crucial for the precise alignment of features. The intrinsic

dimensionalities of the feature, and the SOM template model must be in agreement to

obtain visually pleasing warps. An example is provided in the following to demonstrate

the problems that may arise in case the SOM model and the feature have conflicting

dimensionalities. In figure 8a and 8b, warping results on two pairs of simulated features

with dimensionalities Id and 2d are shown respectively. In both cases, a Id SOM is used

to model the features. As seen in figure 8a, Id SOMs modeling the Id features produce

the standardization of the individual image through perfect alignment of features. This is

because a topologically consistent correspondence is established between the anchor

point pairs. However, unwanted disturbances occur, as shown in figure 8b, when the

dimensionality of the feature and the SOMs are not the same. This is because there are

topology violations in the SOMs, and therefore, the correspondence between the anchor

point pairs is topologically inconsistent.

More generally, when the dimensionality of the SOM is lower than that of the

feature space, the nodes snake around in the data space to fill the whole area, causing

non-uniform vector displacements. On the other hand, if the dimensionality of the input

space is lower, some of the SOM nodes, hence neighborhoods are obsolete, causing

irrelevant matching between corresponding anchor points.







a) Id features



b) 2d features
Figure 8. Sample warps with Id SOM templates

3.1. Evaluation of the Quality of Topology Preservation

Topology preservation is a demanded quality when two different data spaces are

mapped onto each other. Inherent in this definition is the dimensionality of the data

spaces, because usually mapping is performed to reduce the dimensionality of the given

data space. Given a set of input points, X e Rn, the intrinsic dimensionality of X might be

m, where m<
data surface, and is a measure of the number of parameters that govern the data

generation (Olsen and Fukunaga, 1973). In other words, intrinsic dimensionality is a

characteristic that relates to the geometry of the points. For example, if the input points

represent a line, or a curve in 3d coordinates, the intrinsic dimensionality is one.

Similarly, the intrinsic dimensionality is two, for planes, or augmented surfaces.

During the mapping of the input space onto a space with smaller dimensionality,

it is important to keep the geometry of the input space intact. The objective is to find a

topology preserving mapping M, such that M: x-y, where x e Rn, y e Rk, and k
Finding such a mapping is possible -although not trivial-, when k is equal to m, the

intrinsic dimensionality. Unfortunately, if k is less than m, a perfect topology preserving

mapping does not exist. Therefore it becomes important to evaluate the extent of

topology violation in these situations.

In broad terms, a mapping is said to be topology preserving, if the relationship

between the points in the input space is the same in the output space. How to define this

relationship is ambiguous, and usually determined by the requirements of the application

at hand. Therefore, several different measures were proposed in the past, for evaluating

the topology preservation quality of a given mapping. In the following, first a brief

summary will be provided for determining the intrinsic dimensionality of data spaces.

Later, some evaluation methods for the quality of topology preservation will be


3.1.1. Background on Methods for Determining Intrinsic Dimensionality

One method to choose the dimensionality of the SOM template accurately, is to

determine the dimensionality of the feature first. Then the SOM template can be made to

have exactly the same dimensionality. A general survey of the methods for determining

intrinsic dimensionality is given in Wyse (1980). These methods can be categorized using

two different criteria: The first criterion indicates whether the algorithm determines the

dimensionality globally or locally. The second criterion indicates whether the algorithm

is static or dynamic. In static methods, the intrinsic dimensionality calculation is applied

to the original data space, only once. The dynamic algorithms on the other hand, modify

the data space iteratively, unfolding it until a linear space of a lesser dimension is


One widely known global and static method is principal component analysis

(PCA), or Karhunen-Loeve transformation. In PCA, the number of significant

eigenvalues of the covariance matrix indicate the dimensionality. Since this is a linear

method, it overestimates the dimensionality in cases where the data space contains global

folding (Fukunaga and Olsen, 1971).

Another widely known method is FO, by Fukunaga and Olsen (1971). This

method is local and static. In this method, the pattern space is divided into small

overlapping, or non-overlapping hyperspherical regions with different sizes. PCA is

computed in each of these regions separately, and a final decision about the whole data

space is made based on these local intrinsic dimensionalities.

There are other local, and static methods in addition to dynamic methods. All of

the dynamic methods iterate to unfold the data space until it is roughly linear. The

description of these methods is not central to our discussion. Further details are provided

in Wyse (1980).

3.1.2. Background on Methods for Evaluating Topology Preservation

An alternative approach for the selection of suitable SOM template parameters is

to generate several SOM templates with different dimensionalities, and evaluate the

quality of topology preservation in each. Then the best SOM template can be chosen on

the basis of this information. In this section, a summary of evaluation methods for

determining the quality of topology preservation in a mapping is provided. At this point,

it is important to clarify the differences between three basic topology preservation


1. Continuity: The neighborhoods in the pattern space are preserved.

2. Metric Topology Preservation(MTP): Distance -or similarity- ordering

between pairs of points in the pattern space is preserved.

3. Isometry: Not only distance ordering, but also actual distances between pairs

of points are preserved.

Among these concepts, the topology constraints imposed on the mapping get

stronger from continuity to isometry. In the following, evaluation methods will be

summarized under two groupings: methods for one-to-one mappings, and methods for

many-to-one mappings.

Topology preservation in one-to-one mappings:

The following methods are chosen to clarify the above three different types of

topology reservations. One of the widely known isometric topology evaluation methods

is Sammon's (1969). Given N pairs of points, xi, xj, e X, and yi, yj e Y, and a mapping

M: X->Y, topology is preserved through actual distance relationships, di = dist(xi, xj) and

dij* = dist(yi, yj). In the ideal case, the following equation is expected to be minimal.

1 L [d- d*]2
E= Y
Y,* ,< d,
In this error function, the first term is used for normalization purposes, and the second

term shows the discrepancy between the distances in the input and output spaces. Another

method developed by Bezdek and Pal (1995) is based on MTP using equivalence of

ranks, such that:

R = j[r*(ij) -r(i,j)]2
is minimized in the ideal case, where r(i, j) and r*(i, j) are ranks associated with dij and

dij*, such that:

r(i, j) = 1 if {i, j} = argminr,s(drs) and

r(i, j) = N if {i, j} = argmaxr,s(drs).

When R is minimized, k-nearest-neighbors are preserved, where k is equal to N. An

alternative method by Goodhill et al (1995), with continuous topology preservation, is

based on the minimization of the following term in the ideally topology preserving case:

C>= Yd_*d

An important criticism for all of these methods is that it is very computational to

evaluate topology preservation since the data space can be quite large. Besides, indirect

techniques are required to make these methods applicable to many-to-one mappings.

Topology preservation of many-to-one mappings:

This type of evaluations for topology preservation is more relevant to our study,

since SOM is a many to one mapping. As a matter of fact, the methods which are

presented in this section are developed specifically for the evaluation of the SOM

topology. When a pattern space is mapped onto a SOM, it is important that the

neighborhoods established on the lattice are in correspondence with the neighborhoods in

the feature space. This correspondence can be verified by comparing immediate lattice

neighbors of a node sj, with the nodes having voronoi regions that neighbor the voronoi

region of node sj or vice versa. The following two methods are based on this type of


The Topological product is a measure developed by Bauer and Pawelzik (1992).

It defines two measures, Qi, and Qf to evaluate distance order violations in the lattice and

in the feature spaces respectively. Ql(j, k) compares lattice distances between the kth

lattice neighbor, and kth feature space neighbor of node j. Similarly, Qfj, k) compares

feature space distances between the kth lattice neighbor and kh feature space neighbor of

node j.

d(j,n, Ok)) d (j,n(, k))
Q j, k)= 1 Q (, k)= d f, where
S d In fk)) ik d f,n fk))

d(j,i)=lii-j js d (j,i)=||w -w I|E, and
/ f j '

IH.||s is the shortest distance between nodes i, andj on the lattice with respect to

graph L, and 1|.ElE is the Euclidean distance in the feature space.

n (j, k) is the kt neighboring node of node j on the lattice, and n (j, k) is the kth
1 f

neighboring node of node j in feature space

For an ideal topology preserving mapping, Qi (j,k) and Q(j,k) must both be equal

to 1 for all j, k. To obtain the topology preservation of the whole map, a value called

topographic product is computed by multiplying Qi and Qf for all j and k's. One

disadvantage of this measure is that the value of the topographic product is not intuitive.

It does not directly indicate the extent of topology violations.

An alternative evaluation measure, topologicalfunction is developed by Villmann

et al. (1997). This method also investigates the lattice violations and feature space

violations separately, but uses distance order relationships rather than actual distances.

Given node, sj, a violation is encountered, when some of the immediate neighbors of sj in

the feature space are not immediate neighbors in the lattice space, and vice versa.

Furthermore the extent of violation is indicated by a parameter, k, such that, the number

of nodes violating the immediate neighboring in the lattice space is reported only when

their distance to node sj is more than k. In this way, by changing k from 1, up to the

diameter of SOM, one can observe the number and extent of violations indexed to each

node. The measures f(j, k) and ffj, k) shown below measure violations in the lattice and

feature spaces respectively, around a node sj, for violations of distance order k and


f (j,k) =# i: d (i,j) = 1d (i,j) > k) f (j,k) =#{i:d(i,j) = ,d f(i,j) > k}
I f I fI

where d (j,i) =Ij -ill
1 S

d f (ji) =IIj -ill

such that, I||.|s and II||.D are the shortest paths between the given nodes in the lattice and

delaunay triangulation graphs, L and D.

For an ideal topology preserving mapping, fi and ff must both be equal to 0 for

k>0. To obtain the topology preservation of the whole map, two topographic functions,

gi(k), and g(k) are computed by averaging fi and ff for all j's. The topographic functions

gi and gf clarify when there is a folding of the SOM with respect to the data space. If the

SOM is folding around the data space, then there are several lattice violations, for small

and also high k's. This is observed as a nonzero gi function. Similarly, for feature space

violations, a nonzero gf function is observed. One disadvantage of this measure is that

normalization is not done. Therefore comparing several SOMs is not straightforward.

3.2. Proposed Method for Evaluating the Quality of the Atlas SOM Template

In this section we propose a method for choosing the best atlas template for the

feature on the atlas image. This is a general purpose method which can be used in

applications other than imaging, for determining the best SOM parameters to represent a

data manifold. The outline is presented in figure 9. Initially, a SOM template with

randomly assigned dimensionality and size parameters is chosen, and trained using the

points on the atlas feature. In the second step, delaunay triangulation of the data space is

obtained using the newly generated codebook, because the succeeding steps are built on

this partitioning. In the third step, the intrinsic dimensionality of the feature is estimated

using a local, static method proposed by Bruske and Sommer (1998). At the same time,

topology preservation statistics are derived through an extended version of the topology

evaluation method of Villmann et al. (1997). Multiple SOM templates can be evaluated

repetitively in this fashion, until a SOM template with acceptable topology preservation

quality is found.

SOM template

triangulation with
respect to anchors

Intrinsic dimensionality Calculation of extended
estimation topology function

Figure 9. Outline of the SOM topology evaluation method

3.2.1. Delaunay Triangulation with Respect to Anchors

An efficient delaunay triangulation method is suggested in Villmann et al. (1997),

assuming that the SOM generated from a set of input points is dense on the input space,

X. At this point it must be clarified what is meant by dense.

Definition I: A SOM is dense, if a triangle with vertices, xi, Sp, and Sq completely

resides inside the data space, where p = argminkll xi Wk I| and q = argminjl| xi wjl|, such

that j = k, but j p (Martinetz and Schulten, 1994).

Briefly, the above definition means that a triangle drawn by connecting a given

input point, and the two closest SOM nodes must stay inside the feature space, for a SOM

to be regarded as dense.

Given this definition, finding the corresponding delaunay triangulation is


Definition II: Let DLY = (S, D) be the delaunay triangulation graph, such that an

edge (p, q) e D if Sp and Sq are the first and second closest nodes to any given x e X,

satisfying the density condition given in definition I. S = {sl, S2, ...., SK} is the same set of

nodes explained in chapter 1.

Unfortunately, because of the condition that the SOM nodes must be dense in the

given pattern space, this method is not applicable to a wide collection of SOMs which

represent augmented data spaces. An example is provided in figure 10. If the number of

nodes in the SOM is small, the density condition is violated everywhere on curves with

high curvature, such as the one in figure 10a. On the other hand, in figure 10b, the

condition is violated only for the points that are around the edges of the closing ends of

the shape. The type of features that we intend to work with are highly augmented (i.e.

anatomic structures in the brain, or facial features such as eyes, mouth, ...etc.). Therefore,

we have developed an extension to the above delaunay triangulation method, which

removes the condition on the density of the SOM as follows.

2 1


4 7 x
5 6
a) Id pattern space b) 2d pattern space

Figure 10. Two sample SOMs which are not dense

In order to derive an efficient delaunay triangulation scheme for non-dense

SOMs, first we need to observe the problems that arise when the density criterion is

violated. In figure 10a and 10b, delaunay triangulation of nodes 1 through 6 is found

correctly, even though the SOM is not dense. However, for nodes 0 and 7, an edge (0,7)

will be added to D superficially, due to the interruption of continuity of the pattern space.

This problem can be corrected for, if we revert to the original definition of delaunay

triangulation. An edge, (p, q) should be inserted to D, only if: Vp(X) n Vq(X) # 0, where

Vp(X) and Vq(X) are voronoi regions1 around the nodes sp and Sq. If this definition is

invoked, the edge (0,7) will not be present in the delaunay triangulation of the data spaces

shown in figure 6, because the voronoi regions of nodes 0 and 7 are not intersecting.

Based on similar observations on other data spaces, the following two pass algorithm is

developed as an extension to the Villmann's (1997) algorithm.

Firstpass: Determine all edges, find borderline data points

Given x e X, let (p, q) e D if sp and Sq are the two closest SOM nodes to x.

Let x E Bpq if ||x Wp|| -||x Wq|| | < b, where b is a predetermined

threshold, and Bpq denotes the set of nodes that are borderline between the

voronoi regions of node Sp and node Sq.

Secondpass: Disconnect disqualified edges by investigating disjoint voronoi

regions at borders

For all p, q, where (p, q) e D, find two closest data points, xi and x2, in the

border of the voronoi regions of nodes Sp and Sq, such that, I||x x211 <= |Xr

-xs||, where x1, X2, Xr, Xs E Bpq.

remove (p, q) from D, if I||x x211 > d, where d is a predetermined

threshold based on interpoint distances in the data space.

I Vp(X) x x:x-wpJJ

Figure 11. Borderline areas of a sainple data space and SOM

The effect of this new algorithm is illustrated in figure 11. The dark areas indicate

borderline sets, B. As b increases, the size of the borderline sets increase. Data density

imposes a natural lower bound on b. The highest limit ofb is reached when all the points

in the voronoi regions are included in the borderline sets. This occurs as b gets closer to

the distance between two neighboring nodes. On the other hand, d must definitely be

higher than the interpoint distances between data points. Determining a higher bound on

d is difficult. Setting d too high will allow nodes 0 and 7 to stay connected in the

delaunay triangulation. Nevertheless, as a rule of thumb, we can say that d <<

0.5*avg(l|wi-wjl|) must hold.

The precision and general applicability provided by this algorithm comes with a

computational cost. While the complexity of the delaunay triangulation algorithm for

dense SOMs is O(n), where n is the number of points in the feature space, the complexity

of our extended algorithm is O(n2). Nevertheless, since the order of the feature space is

much less than the order of the whole image space, this increase in complexity is not an

important speed problem in the whole image warping framework. The delaunay

triangulation method described in this section is an essential step for the estimation of the

intrinsic dimensionality and the evaluation of the topology preservation as explained in

the following sections.

3.2.2. Intrinsic Dimensionality Estimation

The intrinsic dimensionality estimation method of Bruske and Sommer (1998), as

described in the following is directly applicable to SOMs. This method is a local, static

method, and involves optimally topology preserving maps (OTPM), which are structures

similar to SOMs. Bruske and Sommer (1998) proposed the following four steps to find

the intrinsic dimensionality of an OTPM (Martinetz and Schulten, 1994).

Quantize the data space X into a set of nodes, S = {s, S2, ..., sK} with

values, W = {wi, w2, ..., WK}.

Generate a graph, DLY = (V, D) such that V = S, and (p, q) e D +=Vp(X)

rn Vq(X) # 0, where Vp(X) and Vq(X) are voronoi regions around the

nodes Sp and sq.

Calculate local intrinsic dimensionalities in the delaunay neighborhood of

each node, sp e S, by performing PCA on the set Wp Wq, where (p,q) e D.

Average all local intrinsic dimensionality estimates to obtain the global

intrinsic dimensionality. Optionally, round to the nearest integer if the

global intrinsic dimensionality contains a fraction.

The first two steps above are exactly equivalent to creation of the SOM template,

and finding its delaunay triangulation. The difference between OTPMs and SOM is in the

lattice structure, L, such that the nodes in OTPM have variable connectivity, and the

lattice structure is not enforced. This difference does not prohibit the applicability of this

method to SOMs, because PCA of local areas are computed on delaunay triangulation

graph, D, rather than the lattice graph, L.

3.2.3. Calculation of Extended Topology Function

While the topology function developed by Villmann et al (1997) is powerful for

investigating the degree of folding in a SOM, it does not investigate the violations at the

level of individual nodes. In some applications it might be more important to observe the

pattern of topology violations on the lattice layout of SOM. We have developed several

different normalization techniques for the topology function, and modified the suggested

averaging method, so that violations can be reported with respect to nodes rather than the

violation distance. In addition, we developed some summary statistics which are

comparable across SOMs. By studying these summary statistics, the user can

conveniently make a selection among different SOMs to vector quantize a given data

space more appropriately. Based on Villmann's topologic function, we define two

measures Havg,l and Havg,f as follows.

Definition III: Let Havg,1(j) indicate the average lattice distance among immediate

voronoi region neighbors of node sj. Let Havg,fj) indicate the average delaunay graph

distance among immediate lattice neighbors of node sj.

Sd(ij) d (i, j)
1=1, I=l,
d 0,y)=1 da(,;)=1
H ( j)= H (/ )= -'(7)
Hvg, #{i:d (i,j) = 1,i= 1,..,K} #{i: d(i,j) = 1,i = 1,..,K})
f I

The d(.) and dl(.) terms in the above formulations are the same as in Villmann's

(1997) definition which is discussed earlier. Ideally, both of these measures are expected

to be 1, if topology is preserved. In order to facilitate comparison between different

SOMs, these measures can be normalized as Navg,i and Navg,f. Normalization is achieved

by dividing Havg, with the diameter of the lattice, LD, and Havg,f, with the diameter of the

delaunay graph, FD.

H U) H (j)
N (j) = ",g N (j)= "vgf (8)
avg, L avf F

Alternatively, maximum distances, Hmax,, and Hmaxf can be reported instead of the

average distances as follows.

Definition IV: Let Hmax,(j) indicate the maximum lattice distance among

immediate voronoi region neighbors of node sj. Let Hmaxfj) indicate the maximum

delaunay graph distance among immediate lattice neighbors of node sj.

H (j) = max {d(i,j)} H (j) = max {d (i,j)} (9)
max, =1, ,K max,f '=1, K f
d f(,j)=1 d (zj)=1

These formulations are very similar to those in (7), and their normalization is

performed in the same way, as given below. Ideally, if there is no topology violation,

Hmax, and Hmaxfare expected to be 1.

H (J) H (j)
N j)= max N (j) =max (10)
max,l L max,f F

Another set of measures is determined by reporting the total number of nodes that

violate the immediate neighborhoods of a specific node, sj.

Definition V: Let Hno,1(j) indicate the total number of violations in lattice

neighborhoods among immediate voronoi region neighbors of a node, sj. Let Hno,fj)

indicate the total number of violations in voronoi neighborhoods among immediate lattice

neighbors of a node, sj.

H () =#i: d(i,j) = 1, d(i,j) > 1, i = 1,.., K}
no,l f I

H (j)=#i: d (i,j)= 1,d (i,j)> 1, i = 1,..,K) (11)
no,f 1 f

Ideally, Hno,1 and Hno,fare expected to be 0, when there is no topology violation.

Normalization of Hno,1 and Hno, are done as Nno,i and Nno,f, by dividing the number of

nodes in violation to the total number of nodes that are immediate neighbors.

H (j) H 0)
N (j)= no' N (j) = nof (12)
no,1 #{i:d (i,j) = 1,i = 1,..,K} o #i:d (i,j) = 1,i = 1,..,K}
f I

The key difference between these functions and the previous topology violation

measures of Villmann et al. (1997) is that they are indexed to individual nodes.

Therefore, ifH -or N- is plotted on the SOM lattice, the areas that are in high violation

become visible immediately. In figure 12, the layout of topological violations are

illustrated on Id and 2d SOM lattices. Figures on the left are from the mapping of an

ellipsoidal contour to a Id SOM, whereas the ones on the right are from the mapping of

the same ellisoidal contour on a 2d SOM. Intuitively, one would expect no topological

violation when the ellipsoidal contour is mapped onto a Id SOM, since a contour is

definitely a Id structure. However, the generic Id SOM is a Id curve, whereas the

ellipsoidal contour is a Id loop. Therefore, a topological violation is observed at the end

points of the Id SOM on the left hand side of figure 12.

On the other hand, the 2d SOM lattice is clearly in violation of the Id ellipsoidal

data space. In particular, the two SOM nodes located inside the ellipsoidal contour do not

represent any portion of the data space, because their voronoi regions are empty sets.

These nodes have drifted with the other nodes, merely because of the connectivity of the

lattice, and the updating of the neighbors. These two nodes have not been chosen as

winners themselves through the evolution of the weight updates. It is interesting to note

that the delaunay triangulation method presented by Villman, et al. (1997) would indicate

these two nodes as valid if the density condition were removed. However, because the

density condition can not be removed, Villman's delaunay triangulation method becomes

inapplicable. On the other hand, our extension on Villman's method remains applicable

even under this situation, and marks these two nodes as irrelevant, by assigning them

negative values during the topological evaluation. As seen on the right hand side of figure

12, topology is severely violated by the mapping of ellipsoidal contour to a 2d SOM. All

the values plotted in figure 12 are normalized, allowing comparison with other SOM


Investigation of the quality of a given topological mapping is facilitated through

the H and N plots, as illustrated in figure 12. However, comparing a single value which

indicates the degree of topology preservation across several different SOMs is easier than

comparing the H functions, in our quest to determine a suitable SOM configuration.

Therefore we developed the following three different topology preservation quality


Definition VI: Let RD be the ratio of the diameter of the lattice, and the diameter

of the delaunay graph.

R =D (13)

The lattice and delaunay graph diameters are two simple, but powerful indicators

for observing the data layout. In an ideal topology preserving mapping, their ratios must

be 1. If RD>1, that means the lattice is folded on the feature space. On the other hand, if

RD<1, it means the data space is folded on the lattice. For example, if a uniform feature

a) SOM nodes



b) Navg

i r


c) Nmax

- 005

0.75 050
r 033
-I I

-1 -1

d) Nno
Figure 12. Layout of topology violations on Id (left) and 2d(right)
SOM mappings of an ellipsoidal contour



- I


space inside a square is represented by a Id SOM, the lattice folds within the square to

cover the entire data space. In this case, RD is greater than 1. Or similarly, when a line is

represented by a 2d SOM, RD
Another type of measure is the percentage of topology violations among lattice

and voronoi neighborhoods.

Definition VII: Let Pneigh,l be the percentage of violations in the lattice

neighborhoods, and Pneigh,f, be the percentage of violations in the voronoi space


P = *100
neigh,l #(i:d (i,j)= l,i = ,..,K}

P = 100 (14)
neighxf #ti:d (i, j) = 1,i = 1,..,K

In cases where topology violation is inevitable, a SOM that contains minimum

Pneigh values is preferred. The last measure is the percentage of nodes participating in a


DefinitionVIII: Let Pnodes,l indicate the percentage of nodes that is a part of lattice

violation. Let Pnodes,f indicate the percentage of nodes that is a part of delaunay graph


sgn(H o,()) sgn(Ho ())
P = _*100 P = *-100 (15)
nodes,l K nodes,f K

Similar to Pneigh, a lower value of Pnodes is preferable, in cases where violations are

inevitable. By observing RD, Pneigh and Pnodes in several different SOM configurations, one

can make a quick decision about the suitability of a given SOM. Ideally, RD=I, Pneigh=0,

and Pnodes=0.

We performed tests on simple Id and 2d pattern spaces to demonstrate the use of

these three measures. The results are summarized in table 1. The first two pattern spaces

are the same, and consist of the contour of a circle. The succeeding two pattern spaces are

also same, but consist of not only the contour, but also the interior area of a circle.

Intuitively one would expect that the Id SOM mapping of a contour and 2d SOM

mapping of a full circle would be better than others. And, this is exactly what is

suggested by RD, Pneigh and Pnodes in table 1. The mapping of a full circle to a 2d SOM in

the last row indicate zero topology violation. On the other hand, the mapping of a circle

contour to a Id SOM in the first row demonstrate a slight violation. From the earlier

illustration in figure 12, it is known that this violation occurs only at the end points of the

SOM, because the circle is closed at these points, but the SOM lattice is not. This is

indicated indirectly by the low values of Pneigh and Pnodes. The values of Pneigh and Pnodes in

the middle rows are unacceptably high, indicating severe topology violations.

Given a SOM, the user can investigate the existence, nature, and acceptability of

topology violations using the measures presented here. In some cases, violations are

inevitable. However, choosing a SOM configuration with fewer violations is possible by

studying the proposed measures.

Table 1. Topology preservation quality of SOMs for ID and 2D data patterns

Data SOM ID RD Pneigh Pnodes Violation
size (%) (%) Type

Circle 1x16 1 1.87 6 12 Lattice

Circle 4x4 1 0.57 40 88 Lattice and
Contour Feature space

Circle 1x16 2 3.70 55 100 Lattice

Circle 4x4 2 1.0 0 0 None

There is an inherent shape-constraint imposed by the generic structure of the

SOMs. The generic shapes of the SOMs allow for open, non-intersecting curves and

surfaces. Standardization with respect to features that contain self-intersecting curves and

surfaces might cause topology violations, which in turn, jeopardize the alignment, and

create undesirable warps. Unfortunately, it is impossible to develop a cure for this

problem within the framework proposed in this study. It is recommended that this

methodology be applied exclusively to structures that are not self-intersecting, to avoid

topology violations completely.


The regularized alignment of individual features and warping the entire image is

addressed in this section. In our framework, SOMs are used exclusively for developing a

model for the feature during the alignment step. On the other hand, radial basis functions

are used to interpolate the new position assignments of individual image pixels during the

warping step. There is no coupling between the alignment and warping steps, so the SOM

and RBF interpolation procedures operate independently.

4.1. Self-organized Alignment

It is important to point out that, in the context of alignment of a pair of features,

regularization means imperfect matching of corresponding feature points for the sake of

smooth deformations in the surrounding area. There is a tradeoff between smoothness

and precision in the alignment. Regularization is accomplished at two levels in our


1) Vector quantization: Alignment is performed only at the anchor points obtained

through vector quantization. Precision is lost by not aligning all the feature points,

however, this loss in precision is compensated by a gain in smoothness. Examples that

demonstrate the benefits of vector quantization in terms of smoothness in interpolation

are provided in Bishop, 1995, chapter 5. As illustrated by Bishop (1995), the balance

between the loss in precision and gain in smoothness is achieved easily by adjusting the

number of anchor points.

2) Regularized matching of anchor points: At some anchor points, mismatches are

higher. If these anchor points are aligned as perfectly as others, the deformations in the

surrounding area will be larger than those in the other parts of the image. Therefore a

mechanism is needed to align the highly mismatched anchor points in a relaxed fashion.

This is the basic topic that is addressed in this section.

4.1.1. Background on Regularization Approaches

Regularized alignment of anchor points can be achieved through either

regularizing the SOMs or RBFs independently. Our literature search yielded only one

study (Goppert and Rosenstiel, 1996) which brings together regularization and SOMs. On

the contrary, regularization is a widely applied concept in interpolation with RBFs. In the

following, the existing regularization approaches in SOMs and RBFs will be summarized


Regularized SOM lattice: At the end of SOM training, due to the final fine tuning

phase of SOM weight updates, in which the radius of the neighborhood reduces to one,

the positions of the codebook vectors become slightly disorganized, although their

ordering is kept intact. An extension to the SOM weight update rule can be made

(Goppert and Rosenstiel, 1996), so that the final positions of the nodes would reflect a

more regular lattice structure. Considering the SOM described in chapter 1, this extension

is formulated by replacing the weight update rule (2) with (16):

wk(t) = wk(t -1) + g(d, t)[xj wk( t -1)] + h(d,t) f(wk(t -1)) (16)

where g(d,t), and d are as described in chapter 1, f(.) is a multivariate function that returns

the ideal regularized position for Wk, and

SO, ifd = O
h(d, t) = _(2 / -2 with o set to either 2 or 3 to limit the range.
Xe elsee

Intuitively, this update rule works as follows. In the beginning of training, the

neighborhood function is much stronger, therefore the first term performs the

organization of the SOM. Towards the end, the effect of the first term diminishes, as

g(d,t) approaches zero. Therefore the second term becomes more important. At the last

stages of training, the winner is adapted towards the input vector, while its immediate

neighbors are adapted towards their regularized positions. In section 4.1.2 we adopt this

approach to retrain the template SOM in a regularized fashion for the feature in the

individual image.

RegularizedRBF interpolation: Given a set of sample points in the data space,

along with scalar target values, interpolation with basis functions is a process, which

associates scalar function values to all the points in the data space. This is an ill-defined

problem, because there are infinitely many solutions. In order to reach a pleasing unique

solution, smoothness constraints are introduced through regularization theory. The

smoothness criteria is enforced by the second term in the following error function, which

is minimized during interpolation.

K 2
E= L(d(v )-T(v)) + IP(T) 12, (17)
k=1 k k

where P is a differential operator, called stabilizer. The first term in the above equation

measures the accuracy of the function approximation, whereas the second term measures

the cost associated with the deviation from smoothness. The first term is small, when the

data, d, and the desired solution, T, are close. On the other hand, the norm of P(T) is

small, when T is a smooth warp. The regularization parameter, X, controls the

compromise between the degree of smoothness of the solution and its closeness to the

data (Poggio and Girosi, 1990a). Regularization theory restricts the types of functions

that can be used as basis functions to satisfy the smoothness criteria. Gaussian', and

inverse multiquadratic2 basis functions are supported by the regularization theory, while

multiquadratic3 functions are not (Haykin, 1999). The effect of the smoothness term in

the error function is reflected in the calculation of the coefficients of the radial basis

functions when ck are obtained by solving a set of linear equations. For the X translation,

the set of K linear equations (Poggio and Girosi, 1990b) are:

K -(01V -V 1 12/ya2)
T)x(vj)= Icck.(e k + S ) (18)
k=l k jk

= dx(vj) = vjx-wjx

where 67k is the Kronecker Delta (equal to 1 ifj = k; equal to 0 ifj k).

On the other hand, the smoothness term does not affect the calculation involved in

the interpolation of the whole image. The spatial displacements of each pixel, aj are still

computed in the x dimension by:

K -(lla -v 1/ k2)
Tx(aj)= Y cke

1 f(r) = exp(-r2/2o2), o>0
2 f(r) = 1 / (r2 + c2)1/2, c>
3 f(r) = (r2 + c2)1/2, C>

4.1.2. Proposed Method for Regularized Feature Alignment

In our framework, we have chosen to implement the regularization strategy in the

alignment step, through an extension to the SOMs. The reasons for this choice will be

clarified at the end of this section. There are two ways to develop a regularization

extension to SOMs. First, matching could be regularized with respect to the deformations

in the surrounding area of individual anchor points. Second, regularization could be based

exclusively on the positioning on the anchor points. In order to implement the first

approach, a region of interest, which consists of a subset of the whole image, must be

defined around each anchor point. At each iteration of the SOM, the part of the image

that falls inside the region of interest around the winning node must be warped and

evaluated for deformations. Accordingly, the SOM update term can be adjusted either

towards better fitting the individual feature, or for better smoothness. However, this

approach is very computational. It requires warping to be performed at every iteration,

which is quite time consuming. Therefore we developed the following regularization

scheme based on the displacements at the anchor points alone.

The basic error term at work in original SOM updates is the quantization error, Eq.

1 N 2
Eq= 111P -w |2 (19)

where 3n are the points on the individual feature, and w, is the closest node to 3n. The

quantization error measures how accurately the codebook vectors represent the feature

points. However, for regularized matching, accurate matching of the feature points is

unwanted at highly variable areas of the features. Instead, in these areas, less variability

between the atlas and individual features is preferred. Therefore an error term, Ed can be

constructed for measuring the discrepancy between the atlas and the individual feature.

1K 2
Ed= I v -w || (20)
Kk=l k k

Hence, the whole error term becomes:

E = Eq + Ed (21)

It is known (Kohonen, 1995) that the SOM algorithm does not employ a steepest

descent procedure to minimize E. However, it is shown (Kohonen, 1995) that the

resulting codebook vectors through the SOM procedure approximate the positioning that

would have resulted through a steepest descent procedure. Therefore, in order to

accommodate the new error term, we modified the original SOM weight update rule in

(2) as follows:

wk(t) = wk(t -1) + g(d, t)[xj wk( t -1)] + )(t)[wk(0) wk(t -1)], (22)

where wk(0) = vk(tend), as specified in the feature alignment section of chapter 1.The first

term in the weight update rule, (22), is the original SOM weight update term, which is

responsible for reducing Eq. On the other hand, the second term is for reducing Ed. The

parameter ) adjusts the balance between perfect matching of the individual feature, and

regularized matching. This approach is philosophically very similar to the approach taken

by Goppert and Rosenstiel (1996) which is described in the previous section.

It is important to note that there is a dilemma between accuracy versus regularity

(Goppert and Rosenstiel, 1996). Initially, when no action is taken, Ed is equal to zero,

because initially wk(0) = Vk(tend). As the SOM algorithm proceeds, Eq tends to be

minimized, while Ed increases. The final stage of the SOM training, in which the second

term dominates, aims to reduce Ed, and find a compromise between Eq and Ed. There are

several options to be considered in setting )(t). It could be a fixed small value, or a

slowly increasing function which reaches saturation towards the end of training. In our

simulations, we obtained successful results, particularly with the type of function

illustrated in figure 13. In figure 13, a(t) and )(t) obtain the same non-zero value towards

the end of training, and their values do not change from that point on.


20 40
Figure 13. Sample training functions

The simulated 2d image warping example presented in figure 4 of chapter 1 is

repeated using the regularized SOM (RSOM) algorithm. The resulting individual SOMB,

and the warped image IB' are shown in figure 14a and 14b. As seen from figure 14b, the

deformations at the end points are more acceptable. The compromise for this smoothness

is made by reduced alignment accuracy. As seen in figure 14a, the individual RSOM

anchor points are positioned midway through the atlas and the individual SOM points.

Therefore, the loss in accuracy in RSOM can be predicted to be around 50% in

comparison to SOM alignment.

Using data from the above simulation, we investigated the change in Eq and Ed

during training. We also studied the distribution of error with respect to individual nodes,

after training. In figure 15a-15c, the change in Eq, Ed, and their sum are shown

respectively throughout training. As seen from these figures, the value of Eq in the

a) Location of nodes in b) Warped image using
SOM versus RSOM RSOM nodes
Figure 14. Alignment and warping results with regularized SOM

original SOM algorithm is lower than that of the RSOM algorithm. However, the reverse

is true for Ed, because the RSOM algorithm reduces Ed, while the original SOM

algorithm does not. Overall, the improvement in Ed is much higher than the improvement

in Eq, so the total error is less in the RSOM algorithm.

When the errors are plotted after training, at the individual node level, the figures

15d-15f are obtained. In these figures, individual nodes are indexed along the x axis via

their node id numbers. As seen from figure 15d, Eq is high for the RSOM at the end

points of the feature. This is due to the relaxed matching of the anchor points at the ends.

On the contrary, Ed is much higher for the original SOM algorithm, as seen in figure 15e.

Finally, as shown in figure 15f, the RSOM improves the total error twice as much

compared to SOM.

In addition to the above tests based on regularized SOMs, we have performed

tests with regularized RBFs, on simulated images. The performance of both

regularization approaches were similar, both computationally and in terms of smoothness.

Generally we did not find a performance improvement in one method over the other.

However, we noticed that regularized SOMs have two advantages:

1) The setting of the parameter, k is more intuitive, and easier, because it is in the

same domain with c(t), which is a part of the g(d,t) multiplier. The user needs to specify

the initial gain term, c(t), regardless of whether regularization is performed or not. This

facilitates the setting of the )(t) multiplier, which can be set in comparison to c(t), to

achieve a desired balance between accuracy and smoothness.

2) One basic assumption that we made is that the feature space is uniform. While

this is true for simulated data, it will not hold for real images, especially when the feature

is extracted automatically. SOMs are sensitive to the density of the data space, whereas

RBFs are not. Therefore for non-uniform features, the regularized SOM algorithm is

expected to perform better than the regularized RBFs, in obtaining a desired level of


One last issue to address is the complexity. The generation and retraining of the

standard SOM template both have the same complexity, O(n), for either regularized, or

non-regularized alignment, where n is the number of points in the feature space.

4.2. Proof of Preservation of Order Between the Template and Individual SOMs

The proposed alignment scheme is based on the assumption that the ordering of

the codebook vectors are the same in the template and individual SOM. If this

assumption does not hold, a valid correspondence will not be achieved between the atlas

and individual features. In this section it is shown that the order between the template and

individual SOMs is preserved, based on a condition imposed on the initial radius at the


a) Quantization error, Eq, versus time

b) Deformation, Ed, versus time

d) Quantization error, Eq,
of individual nodes

e) Deformation, Ed,
of individual nodes

000 2 -C R


2-. SOM RO .- /1 RSOM

c) Eq + Ed, versus time f)Eq + Ed of individual nodes

Figure 15. Errors Eq and Ed, during and after training

beginning of training. When the dimensionality is higher than id, due to the complex

interaction between the codebook vectors, the proof of SOM characteristics involve

intensive theoretical depth. Therefore we will outline the order preservation proofs only

for Id data.

a vi v2 VK b atlas

case 1c Wi W2 e WKd individual

iO--O wO'd
case 2 c wK WK-I e wi d individual

Figure 16. Id template SOM and probable orderings of individual SOM

Theorem 1: Let [a, b] be the Id atlas feature space, and {vi, v2, .., VK} be the

weights of the nodes of the template SOM as shown in figure 16. If a new SOM

containing weights, {wi, w2, ..., wK}, and initial radius, r, is trained using the individual

data set, [c, d], such that, wk(0) = vk, k=1,...,K and 1 < r < K, the ordering of all the nodes

Vk and wk are in topological correspondence.


Part I, r < K: There are only two possibilities for the ordering of the individual

SOM, as illustrated in case 1, and case 2 of figure 16 (Kohonen, 1995). Case 1 is in

correct topological ordering with the atlas SOM, but case 2 exhibits reverse topological

ordering. We need to show that with the original SOM update rule, it is impossible to

obtain the ordering in case 2, when r < K. In order to prove that case 2 can not be

obtained, it is sufficient to show that after training, wi can not be positioned between


Consider feature points PE [e, d]. Since wK(O) = vK, and vK is the closest node to

points in [e, d], wK is the winner node for this set of P. Since r < K, only w2, w3, ...,wK are

updated with 3; wi never receives updates for PE [e,d]. Therefore according to Kohonen,

(1995), wi can only be positioned somewhere inside [c, e]. Indeed, if wi was the only

node to map into [c, e] it would have been placed at the centroid of [c, e]. But this is in

contradiction with case 2, which places wi outside [c, e]. Hence case 2 can not hold.

Part II, r > 1: Let r be 2. Initially, only a small subset of nodes will be winners,

because only few Vk are close to points in [c, d]. Assume only one node, wK, is a winner

initially. This will cause wK and wK-1 to be updated with P3e [c, d]. We know from

Kohonen (1995) that this will cause wK and wK-1 to be positioned somewhere inside [c,d].

But then this means that 3 3e [c, d], for which wK-1 is a winner, which in turn causes WK-2

to be updated with P. This way, not only wK, and wK-1, but also wK-2 will be positioned in

[c, d]. This process is repeated like a chain reaction, until all wk, k=l,...K are positioned

in [c, d].

Theorem 2: The regularized update rule in (22) does not spoil the ordering of the

codebook vectors, if )(t) is negligeably small initially.


When )(t) is negligible, the original SOM update rule is in effect. It is known

from theorem 1, that in this case, Wk will be in the same order as vk. After this point, )(t)

increases. At this stage, the updates each node receives are multiples of vk-Wk, and the

first term in the weight update rule (22) is negligible. But this is equivalent to a reverse

mapping: from the individual, to the atlas (from wk e [c,d] to Vk e [a,b]). Since the

individual SOM's codebook vectors are in the same order with the atlas as shown in

theorem 1, this reverse mapping will initiate with the atlas ordering. Therefore, the

ordering of the nodes must remain the same.

4.3. Warping with Radial Basis Functions

Image warping is the final stage of the framework proposed in this study. Radial

basis functions centered at anchor points carry out the interpolation process for finding

the spatial displacements of each pixel in the atlas image. The equations involved in this

step are given in chapter 1. An important problem is determining the variances for the

Gaussian radial basis functions. If the variances are not chosen for optimal smoothness,

unacceptable warps, such as the ones shown in figure 18 are inevitable. The features in

figure 18 are the same as those presented in figure 4.

Haykin (1999) suggested an ad-hoc rule for determining the variances in an RBF

network, such that:

yk = dmax/ (2*K)1/2, k=l,..., K (23)

where dmax is the maximum distance between the anchor points, and K is the total number

of anchor points. However, this setting is sub-optimal and poor deformations may result.

For example, the warped image of figure 4 is generated by setting the variances as

suggested above.

4.3.1. Optimization of RBF Parameters

Ideally, the deformations in the warped image are expected to be local, but not

sharp. Small variances lead to sharp deformations, whereas large variances lead to global

deformations. If an energy term which can capture this feature of the deformations can be

found, then variances can be determined by minimizing the energy. We studied several

energy terms, including the energy of a spring, bending energy of a thin-plate, and the

energy of a rubber membrane. The energy of a rubber membrane was found to be the

most suitable, because its minimal values generated visually pleasing warps. Our

preference was reinforced after finding that some of the published nonlinear image

standardization procedures in the brain warping literature used the same type of energy

function (Ashbumer and Friston, 1999). The energy of a rubber membrane for a 2d image

is calculated using (24) below.

T (x,y) dT(x,y) dT (x ,y) dT (x ,y)
Em = x( 1 )2 +( )2"+( Y )2 +( Y y )2 (24)
m dox y ox ay

When we apply this energy definition to the interpolation term in (4), which is

used for assigning new spatial locations to pixels, we find:

K 2a (X -Xk + k 2)
E = I [- e k[( +y ( +
m k 21 k(

k 2b )2 2
2b -_( k 2)
+[- e y [(x +y)-(x +y)]] (25)
( k k

where ak and 7xk are the coefficients, and variances for Tx, and bk and Gyk are the

coefficients, and variances for Ty.

Now we can study the suitability of this energy function. We would prefer the

value of the energy function to be high when the deformations are unacceptably sharp.

Also, as the deformations become more global, the energy term must increase. So, the

energy function should exhibit a graph with large values at low and high variances, with a

minimum somewhere in the middle. By setting all of the variances to the same value we

investigated the change in membrane energy on the simulated feature of figure 4. As

illustrated in figure 17, membrane energy has high values for very low and very high

variances. Optimum membrane energy is achieved only when the variances are allowed





E 10000
a 6967=optimum
E 5000

0 10 20 30 40 50 60

Figure 17. Change of membrane energy with respect to variance

to differ, and its value is half of that obtained through the suboptimal setting of the

variances. In figure 18, the effect of non-optimal setting of the variances is shown on the

simulated feature. When the variance is low, the deformation is unacceptably sharp, and

when the variance is high, global deformations are observed at the end points of the


a) Gk= 15 b) Gk =40

Figure 18. Two nonoptimal warps

We have used the steepest descent strategy to minimize the energy term in (25).

In figure 19a, 19c, 19b and 19d, results obtained from non-regularized alignment of the

linear features of figure 4 can be compared. Figures 19a and 19b are obtained using sub-

optimal variances, whereas figures 19c and 19d are obtained from optimal variances. The

warp in figure 19c is smoother than the one in figure 19a. The distribution of the

corresponding membrane energies in figures 19b and 19d illustrates that smoothness is

attained through a reduction in membrane energy. The total membrane energy Em is equal

to 11959.6 in figure 19b, and 6967.76 in figure 19d.

Similarly, in figure 19e, 19g, 19f and 19h, results obtained from regularized

alignment of the linear features of figure 4 are given. Figures 19e and 19f are obtained

using sub-optimal variances, whereas figures 19g and 19h are obtained from optimal

variances. The warp in figure 19g is smoother than the one in figure 19e. The total

membrane energy Em is equal to 2304.75 in figure 19f, and 658.45 in figure 19h.

Finding optimal variances for the warp is the most computational part of the

standardization procedure. Its order is O(P), where P is the number of pixels in the whole

image. In order to reduce the computations, the energy function might be evaluated not

on the whole image, but rather on a select subset of pixels (Woods et al, 1998a). The

pixels for which the evaluations are done can be spread randomly, uniformly, or more

closely around the feature. Assuming there are Q such pixels where Q<
computations can be reduced to the order O(Q), without significantly affecting the


a) No regularization, Mj l warp

c) No regularization, optimal warp

e) Regularization, g m1 warp

d) E,, (total=6967.76)

f) Em (total=2304.75)

g) Regularization, optimal warp h) Em (total58.45)
Figure 19. Sample warps using SOM and RSOM for alignment

b) E, (total=11959.6)

4.3.2. Boundary Conditions

Up to this point, no restrictions are imposed for the new spatial assignments of the

pixels on the boundary of the image. In some applications, the outer boundaries of the

image, which consists of a rectangular frame for the 2d images, and a cubical box for the

3d images, must be immobilized. This can be achieved in three ways:

1) Penalty term: A new term can be introduced to the energy function to penalize

the situation in which the boundary moves. Therefore, the energy to be minimized can be

reconstructed as: E = y*Em + rl*Ep, where Em is the energy of the rubber membrane, Ep is

the penalty term, and y, fr are weights to adjust the effect of the penalty. In 2d, the penalty

term might be chosen as:

E, = Tx (xb yb2 y (xb,y )2 (26)

where b is a pixel on the boundary of the image.

2) Displacement multiplier: After finding the displacements of the pixels in the

whole image using Tx and Ty, these displacements can be multiplied with a 2d surface,

xy. Surface values are assigned to the image pixels such that, xy = 1 when the pixel

(x,y) is away from the boundaries. On the other hand, xy = 0, if(x,y) is on the boundary.

And when (x,y) is close to the boundaries, 0 < xy <1 holds. This way the boundaries are

pinned down with a smooth transition. We prefer to use this approach since it is the

simplest way to guarantee immobilized boundaries.

3) Boundary features: Boundary movement can also be avoided by defining them

as features. When the atlas image and the individual image have exactly the same

features, the discrepancies at the anchor points of these features is zero. Warping can be

accomplished by aligning multiple features rather than a single feature. Therefore, for 2d

images, 4 additional Id features can be defined, consisting of the sides of the bounding

frame. For 3d images there must be 12 additional 2d features, one for each side of the

bounding cubical box. The treatment of multiple features, as summarized in the next

section, is merely a warping issue, which does not affect the alignment step.

4.3.3. Generalization to Multiple Features

In this section, we show how to handle the standardization problem with respect

to multiple features. First, we need to revise the definition of features, based on the

definition of a single feature given in chapter 1.

Definition I:

Let IA and IB be the atlas and individual images respectively. Consider two sets of

structural features, AE IA, and Be IB in these images, consisting of U elements, such that:

A = {Ai, A2, ...., Au}, and B = {B1, B2, ...., Bu},

where A1 = {y1, U12, ..., (} and B1 = {3n, 312, ..., I1N},

A2 = {021, U22, ..., 2p} and B2 = {21, 322, ..., 32Q}, ... etc.

Here, tim, m = 1,.., M and Pin, n = 1,.., N are points in 2d or 3d space in which IA

and IB are defined, and M and N are not constrained to be equal. The same argument

holds for the points aum, and (Xun in the other features Au and Bu.

The objective is to align Au with Bu, u =1, ..., U by generating anchor points, and

to warp the image IB by interpolating the discrepancies between the whole set of anchor

points. The alignment step is not affected by having multiple features, because alignment

is a process which is exclusive to a pair of features, one from the atlas image, the other

from the individual image. However, we still need to revise the notation of the data

structures in the template generation and alignment steps.

Definition II:

Let the anchor points of atlas feature, Au, be Vu = {vul, Vu2, ..., VuK} and the

anchor points of the same feature, Bu, in the individual image be Wu = {wul, Wu2, ...,

WuK}. Define the discrepancies as Du = {vul ul, Vu2 Wu2, ...., VuK WuK}.

The multiplicity of the features only affect the warping step. Since warping is

accomplished by interpolating the discrepancies at the anchor points, there is no reason to

discriminate between the anchor points as belonging to specific features. The set of

displacements, Du, u = 1, ..., U can be pooled under one displacement set, D. Therefore,

the RBF coefficients must be calculated for this whole set, and variances of the RBFs at

all anchor points must be optimized at the same time. The spatial displacements for

individual pixels can be found as follows.

Let T be a 3D transformation: T = (Tx, Ty, Tz), where Tx(x, y, z), Ty(x, y, z), Tz(x,

y, z), are real valued multivariate functions, R3 --R. Compute d(aj), the displacement

vector, at voxel aj e IA, by:

d(aj) = (Tx(aj), Ty(aj), Tz(aj)), where

K -(|a -v H12/o 2) K -(lla -v 112 /a 2)
T (a)= clk.e k k + C22.e C2k 2k
x k=l k=l

K -(a -v l12/y 2)
+ IUk.e J (27)

Here Cuk, and Guk are the coefficients and the variances for the RBFs in feature Au

respectively. Ku is the number of anchor points. The coefficients can be obtained by using

the known displacement vectors, duk, at the anchor points, solving the set of K + K2 +.

+ Ku linear equations:

Tx(Vuk) = (Vuk-Wuk)x4 (28)

4 (.)x is an operation that returns the x component of the vector in the argument


One important application of neuroimaging is localization of brain activity. In

localization studies, neuroanatomical correlates of function are investigated by overlaying

the functional activity in multiple brains. However, the anatomical variability hinders

precise localization of the activity, as illustrated previously in figure 5 of chapter 2.

Reduction of normal variability non-linearly, without introducing sharp deformations is

the ultimate objective in standardization of brain images.

While the current automatic methods are quite efficient in normalizing the overall

size and shape of brain, mismatches of up to 1.5 cm in major anatomic structures such as

the sylvian fissure, central sulcus, parieto-occipital sulcus and superior temporal sulcus

still prevail after standardization (Grachev et al, 1999; Steinmetz et al.; 1989, Leonard et

al., 1998). Considering the complexity of the anatomy and the extent of variability in

individual structures, this result is not surprising. Although most structures follow a

general pattern of curvature, they also exhibit broad differences in continuity, branching,

and positioning. Further details of the types of variability in major cortical structures can

be found in Ono (1990) and Royackkers (1999), and numerical figures are provided in

Rademacher et al. (1993), Steinmetz et al. (1989), and Thompson et al. (1996a).

In the earlier chapters, we have presented a framework for performing smooth

non-linear alignment of the individual features. However, there are two important issues

to be resolved before applying these procedures to brain standardization.

1) Selection of the standard image:

Currently, there is no consensus in the neuroimaging field regarding the standard

atlas brain. The widely accepted Talairach atlas (Talairach and Tournoux, 1988) consists

of manual sketches of the brain of a 57 year old woman. Although the coordinate system

of the Talairach atlas is inarguably the current standard reference system, the validity of

the atlas itself as a standard is not yet established. This is because the atlas does not

consist of MR images, and a single brain can not exhibit the average characteristics of all

brains. If the chosen atlas brain contains average characteristics of the structures, at least

it is guaranteed that the atlas itself does not contain some extreme features. In addition,

the discrepancies between the atlas and any given brain will be less pronounced.

Therefore, a widely accepted approach is the creation of probabilistic atlases derived

from multiple brain images. In the past six years, following an initiative for the creation

of probabilistic brain atlases (Roland and Zilles, 1994; Mazziotta et al., 1995), several

groups independently generated probabilistic atlases consisting of merged topologies

from 10-300 brains (Evans at al., 1994; Roland et al., 1994; Thurfjell et al., 1995; Woods

et al., 1999; Thompson et al., 2000).

On the other hand, in the literature, it is a general practice to test new

standardization methods on a standard image which is randomly chosen from the image

pool of that particular study. Obviously, such an image cannot be accepted as a general

standard atlas. However, this choice facilitates quick and easy use of the newly developed

methodology, and helps demonstrate the benefits of the proposed method without getting

into the format and access difficulties of a third-party atlas image. In the standardization

procedures in this chapter, we subscribed to this approach, and dedicated one randomly

chosen brain MRI as the standard image. We are aware that such a choice of the atlas is

not ideal, but it is sufficient to demonstrate the results of our method. In the future, the

studies presented in this chapter can be replicated on a larger subject pool, and a

generally accepted atlas image.

2) Quantitative evaluation of the variability:

Evaluation of the variability in brain images is another subject on which

consensus is not yet established partly because the optimal metric depends on the

particular application. For an efficient evaluation and benchmarking of the

standardization algorithms, differences in the four following issues need to be resolved.

a) Definition of error: Several different types of errors are suggested in the

literature. Broadly these can be grouped under two categories: registration errors and

volumetric errors. Registration error is defined as the distance between an anatomic

landmark on the standard image and on the warped individual image. Ideally, after

standardization, the landmarks on the standard and the warped image should coincide,

resulting in zero registration error. Registration errors may differ according to the

distance term used. On the other hand, volumetric errors can be used to capture the

variability more generally, usually along an anatomic structure, rather than distinct

anatomic landmark points. As illustrated in figure 20, three types of mismatches are

possible while comparing the volumes of the structure in the standard and the warped

image. Type I: The feature on the standard image has points that lie outside the feature on

the warped image. Type II: The feature on the warped image has points that lie outside

the feature on the standard image. Type III: The features on the standard and warped

image are completely mismatched.

Type I Type II Type III

Figure 20. Volumetric mismatch types

For instance, Gee et al. (1993) defined volumetric discrepancy as:

V r)V
A B (29)
V uV

where VA is the volume of the given structure in the standard image and VB is the volume

of the same structure in the warped individual image. Ideally, when the structures in the

standard, and the warped image completely overlap, this volume difference becomes one.

While it captures both type I and type II of mismatches, the volume error definition in

(29) does not indicate the severeness of type III mismatch.

Another volume definition by Fischl et al. (1999a) is as follows:

V -V
B,all Bavg (30)

where VB,all is the union of the volume of the given structure across all subjects and VB,avg

is the average volume of the given structure among all subjects in the warped images.

Ideally, if the volumes overlap completely, this volume error will become zero. Only type

I and type II errors are captured in this formula for multiple subjects.

In order to report volumetric errors, we preferred to define a new volume error

which has ability to capture all three types of mismatches. The details of this volumetric

error are provided in section 5.3.

b) Selection of landmarks: In Grachev et al. (1999), an extensive set of anatomic

landmark points (128 per hemisphere) is defined. Several other sets of landmarks are

used throughout the literature to demonstrate the results of individual standardization

algorithms. However, there is no study that brings together all reported landmarks to

validate their use and summarize the underlying incentives in their selection. While

reporting the registration errors, we selected landmark points that are relevant to the

structures in our study using the list of landmarks provided in Grachev et al. (1999).

c) Measure of deformation: The distribution and scale of the deformation that the

individual images undergo during the warping process are very important measures.

Some of the standardization algorithms in the literature, including ours, take the energy

embodied in the deformation into account while choosing the parameters of the warping.

However, there is no numeric report in the literature about the scale of deformation, nor

its change relative to the parameters that are assigned independently. In the context of

brain images, it is especially important that the deformation is smooth and minimal. At

the end of section 5.3, we provide the scaling of the deformation in our framework with

respect to the only independent parameter, the number of SOM nodes.

d) Standard test set: In order to compare the performance of different

standardization techniques reliably, all algorithms must be run on the same data set

consisting of the standard image and multiple individual images. Currently such a test set

is not established for brain imaging applications. The research reported in the literature

are all performed on independently chosen test sets. In addition, the scan parameters also

vary from one test set to the other. Therefore, we have collected our own test data as

explained in the next section.

5.1. Data Collection and Preprocessing

Multi-spectral MRIs consisting of TI and T2 contrasts are collected from 10

subjects. Our motivation behind this type of image acquitision was to enable the

preprocessing of the same data set through both automatic and manual feature extraction.

In the literature, multi-spectral MRIs are shown to be efficient in automatic segmentation

of CSF, gray matter and white matter (Hall et al., 1992; Bezdek et al., 1993; Clarke et al.,

1993; Fletcher et al., 1993; Ozkan et al., 1993; Lundervold and Storvik, 1995; Alfano et

al., 1997; Vaidyanathan et al., 1997). Once these three types of tissues are segmented,

specific features can be extracted using shape recognition techniques.

The MRI acquisitions are performed on a 1.5T Siemens with the following scan

parameters. For the T1 weighted images, TR=15 ms, TE=7ms, FA=150, slab number=l,

slab thickness= 160mm, number of partitions=160, shift mean=0mm, imaging

plane=sagittal, field ofview=250mm, voxel size=0.98x0.98x1.00mm, scan time=10 min.

For the T2 weighted images, TR=3300ms, TE=105ms, FA=1800, slab number=8, slab

thickness=20mm, number of partitions=16, shift mean=0mm, imaging plane=sagittal,

field ofview=250mm, voxel size=0.95x0.98x1.25mm, scan time=19.5 min.

5.1.1. Transformation into Standard Coordinates

An affine transformation is performed in order to transform the images into

Talairach coordinates, which is considered to be the current standard coordinate system

for the brain. In order to perform the transformation, 10 anatomic landmark points1 are

specified manually using the AFNI toolbox (Cox, 1996). Using these landmarks, the

1 List of landmark points: Anterior commissure, posterior commissure, two points on the interhemispheric
fissure, two extreme points on the brain for each dimension (a total of six points)

affine transformation brings two structures, anterior commissure (AC) and posterior

commissure (PC) into register with the standard coordinates, aligns the line between AC

and PC with the horizontal axis, and coarsely scales the brain using the specified 6

extreme points on the outer brain. In figure 21, a brain image is shown before and after

the affine transformation into Talairach coordinates.

..': :. : .' . . .... .....: .. . .

t 1..*

Figure 21. A sample brain image before (left) and after (right)
transformation into standard coordinates

5.1.2. Feature Extraction

All the features presented in this dissertation, including some of the simulated

ones are extracted manually using LOFA, a software developed in our lab (Gokcay,

1999). LOFA, Localization of Functional Activity, is a toolbox written in Pv-Wave for

delineating anatomic structures in individual subjects, and extracting the corresponding

region of interests in overlaid functional images. Additional information on the

operational principles of LOFA is provided in the Appendix.

The benefits of self-organizing feature based alignment is maximized in the

standardization of brain images, if several structures are chosen to represent the entire

brain. Some of these structures are Sylvian fissure, superio-temporal sulcus, parieto-

occipital sulcus, cingulate sulcus, central sulcus. Unfortunately, there aren't many sulci in

the frontal areas of the brain that follow a general pattern of curvature having variability

in the ranges comparable to the sulci listed above. The variability in the frontal lobe is

higher than in other regions, and therefore the deformations in this area would be greater,

if frontal structures are included in the standardization.

In this section, our main objective is to compare the performance differences of

the proposed self-organizing framework with respect to the alignment of topologically

different structures. Therefore we have chosen two structures, the Sylvian fissure and the

cingulate sulcus to demonstrate our method. The Sylvian fissure starts on the lateral

aspect of the brain and runs deep. Most of the current standardization algorithms result in

high registration errors for landmarks on the Sylvian fissure. On the contrary, the

cingulate sulcus is a thin structure located on the medial aspect of the brain. According to

the figures in Steinmetz et al. (1989), the maximum variation in the posterior part of the

Sylvian fissure and the marginal ramus of the cingulate sulcus are both 2 cm. However,

when variability is compared in all areas of these two sulci, the cingulate sulcus is aligned

somewhat better during transformation into Talairach coordinates, since this

transformation specifically aligns two other midline structures AC and PC.

While tracing these two structures using LOFA, we established some guidelines

for consistency. The cingulate sulcus is traced sagittally, slice by slice, starting in the

midline, working laterally, until it no longer is visible. On the average, there were 13

slices that qualified. Whenever the sulcus is interrupted, the traces are discontinued and a

new trace is started at a new branch. Traces are specifically positioned on the sulcus,

rather than the gyral area. The posteriormost point is chosen as the marginal ramus, and

the anteriormost point is chosen as the point that meets a vertical line passing through the

genu of the corpus callosum. Sample tracings of the right cingulate sulcus (RCS) are

shown in figure 22a.

Similarly, the Sylvian fissure is also traced on the sagittal slices. The drawings

proceeded from medial to lateral. The first slice is chosen as the one where the Heschl's

gyms is clearly visible, and the insula appears in the background. The last slice is

determined as the lateralmost slice where the frontal brain disappears at the background

and only the temporal lobe becomes visible. On average, there were 24 slices that

qualified. Traces are specifically positioned on the lower bank of the Sylvian fissure,

other branches were left out. The posterior and anteriormost points are found at the two

end points of the horizontal ramus. Sample tracings of the right sylvian fissure (RSF) are

shown in figure 22b.

5.2. Standardization

As summarized in chapter 1, standardization is performed in two phases. In phase

I, the SOM templates are generated from the structures in the atlas image. In phase II,

these SOM templates are retrained using the structures in the given image for alignment,

and warping is performed using the difference vectors generated during alignment. In this

section, the steps involved in both phase I and II are discussed.

5.2.1. Template Generation

In order to choose the correct dimensionality for the SOM templates of the two

features, RCS and RSF, the topology evaluation computations presented in chapter 3 are

performed. Results of these evaluations are shown in table 2. As seen from this table, the

correct dimensionality for RCS is 1, and RSF is 2. Because otherwise, severe topology

viloations are encountered, as indicated by the Pnodes value of the rows two and three. The

slight topology violation observed in the 2d RSF is investigated. This violation is

consistent across the representation of Sylvian fissures from different subjects, hence

does not present a conflict in the one-to-one correspondence. The violation occurs

because of the flattening of the Heschl's gyms as the slices move from medial to lateral.

". ". :" ...

a) RCS, first eight medial to lateral slices ordered first left-right, then top-bottom

b) RSF, first eight medial to lateral slices ordered left-right then top-bottom
Figure 22. Tracings of two sulci (RCS, RSF)on the standard image

Table 2. Evaluation of topology preservation of the RCS and RSF SOM templates

Data SOM ID RD Pneigh Pnodes Violation
size (%) (%) Type

RCS 1x10 1 1.0 0 0 None

RCS 7x3 1 0.5 34 76 Feature space

RSF 1x20 2 2.1 34 75 Lattice

RSF 7x3 2 0.85 6 19 Feature space

a) Id SOM model (1x20) of RSF

c) Id SOM model (1x10) of RCS

b) 2d SOM model (3x7) of RSF

d) 2d SOM model (2x10) of RCS

Figure 23. SOM templates for two sulci: Right sylvian fissure (RSF)
and Right cingulate sulcus (RCS)

In figure 23, the SOM templates that are evaluated in table 2 are visualized. As

seen in figures 23b and 23c, the 2d RSF and Id RCS result in perfect lattices with equally

spaced nodes in the feature space. On the other hand, the Id RSF in figure 23a, and the

2d RCS in figure 23d exhibit nodes wrapping around in the feature space and vice versa.

5.2.2. Self-organized Alignment and Warping

Alignment of the features in a given brain image is achieved by retraining the

SOM templates of figure 23b and 23c with the points belonging to the individual

structures of the given brain image. During retraining, we used an initial radius of 2, and

the initial gain and regularization multipliers are set as 0.2 and 0.05. In figure 24, the

difference vectors are visualized for each SOM node after alignment. As seen in figures

24b and 24d, regularization improves the uniformity of the discrepancy vectors.

In figure 25, the resulting standardized images are shown for one slice in each

structure, RCS and RSF. The tracing of the atlas structure is overlaid on all images to

indicate the differences between the feature on the standard, given and warped images.

As seen in figure 25b, there are significant differences between the atlas and the given

images in terms of the positioning of the two features, RCS and RSF. After warping, the

differences are reduced as seen in figures 25c and 25d. The matching of the features after

SOM alignment is better in comparsion to RSOM alignment. However, SOM alignment

results in high deformation at the anterior tip of the Sylvian fissure, whereas the

deformation in RSOM alignment is smoother, hence, more acceptable. Numeric

comparison of membrane energy and matching errors for the standardization of all 10

brains is provided in section 5.3.


Finally, in figure 26, the features from all 10 brains are overlaid on the standard

image to visualize the discrepancy after transformation into standard coordinates and

warping simultaneously. As seen in figure 26a, the registration error in some parts extend

to 1.5-2.0 cm. This error roughly drops by half after applying the framework proposed in

our study. SOM alignment provides better matching as indicated by the difference in the

alignment of figures 26b and 26c. However, the advantage of RSOM alignment in

smoothness is not clear. This is covered in figure 27c of section 5.3.

a) Discrepancy in SOM alignment of RSF


1^0 -

: ^

b) Discrepancy in RSOM alignment of RSF

c) Discrepancy in SOM alignment ofRCS d) Discrepancy in RSOM alignment of RCS

Figure 24: Discrepancy vectors for two sulci:Right sylvian fissure (RSF)
and Right cingulate sulcus (RCS)


~ i-

a) Standard brain (IA)

b) Individual brain (IB)

c) Warped individual brain (IB') with RSOM alignment

d) Warped individual brain (IB') with SOM alignment

Figure 25: Sample warps with SOM and RSOM alignment on the
RCS(left) and RSF(right)

a) After affine alignment

b) After RSOM alignment

c) After SOM alignment

Figure 26: RCS (left) and RSF (right) traces overlaid from 10 brains after

affine, RSOM, and SOM alignment



. ..... .....

.... . .... .
.. ... ..... ... ...

... ....
:i........ ...
... ...... ... .

.... .. ..

. .....

............. ..
............ .
........... ..

.... ..

. .. . .

.. .........

5.3. Performance Evaluation

5.3.1. Quantification of Errors

In our performance evaluation studies, we studied both registration errors and

volumetric errors. Registration errors are determined by evaluating the Euclidean

distance between the landmark points on the standard image and on the warped

individual image. Two landmarks are used for each structure. One of the landmarks is

chosen on the structure which was aligned, while the other is chosen on a nearby

structure to test the success of alignment in the surrounding area. The anatomic definition

of the four landmarks are as follows:

Cac: Intersection of the cingulate sulcus with the vertical line passing through

anterior commissure. The sagittal slice is chosen to be 1 cm. lateral to the

interhemispheric fissure.

Cs: The superiormost point of the central sulcus. The axial slice is chosen from

the midline sagittal slice where a vertical line drawn on the anterior commissure meets

the hemispheric margin.

Sph: The posteriormost point of the horizontal ramus of the Sylvian fissure. The

sagittal slice is chosen to be midway between the posteriormost point of the insula and

the hemispheric margin, on the axial slice at the level of the anterior commissure.

Spa: The posteriormost point of the ascending ramus of the Sylvian fissure. The

sagittal slice is chosen as in Sph.

Further details on the anatomic definitions can be found in Grachev et al. (1999).

The volume error, Ev is defined to evaluate the goodness of matching even in severe

situations where the structures to be aligned does not even overlap after warping. This is

accomplished by the following calculation.

f max f (x) min fk (x) dx
E =k k (31)
v P(f (x))

where i is an index over all slices in which the structure occurs, j and k are indices over

all subjects, and s is the index of the standard image. The functionfj plots the structure in

slice i, for subject j, and the function P returns the number of points belonging to the

specified structure. This volumetric error definition accentuates the volume difference,

because it also takes into account the type III mismatches. If the features in figure 20 are

studied, the formula in (31) is the same as:

V uV uV
A B Till (32)

where VTIII denotes the volume of the type III mismatch.In the ideal case, where all

structures in the warped images coincide perfectly, Ev is 1.

In figure 27, the maximum values of the registration and volumetric errors are

shown for the alignment of the cingulate sulcus, and the Sylvian fissure in a set of 9

subjects. The SOMs are chosen to be Id, with 10 nodes for the cingulate sulcus (RCS),

and 2d, with 3x7 nodes for the Sylvian fissure (RSF). As seen in figure 27a, the original

SOM alignment yields more improvement in registration error over the standard linear

Talairach transformation in comparison to regularized SOM alignment. However, we find

the results of RSOM alignment preferable, because it not only improves the registration

error on a scale comparable to SOM alignment, but it also reduces the membrane energy,

by half, as seen in figure 27c. If the volumetric errors are studied, 24-28% improvement

is observed in the alignment of cingulate sulcus, in contrast to 41-59% improvement in

the Sylvian fissure depending on whether regularized SOM or original SOM is used. On

the other hand, the improvement in the registration error is 25-38% and 21-45%


It is very important to point out that the error improvements in figure 27 are

contingent upon the selection of the number of SOM nodes. We studied the change of

these errors with respect to the number of nodes on one sample individual image, and

only for the RCS. The results are plotted in figure 28. As seen from figure 28a, as the

number of SOM nodes increases, the quantization error, Eq, decreases. The same trend

applies to the volumetric error and membrane energy. In the volume error graph, K=0 is

used to indicate the initial volume error after the transformation into Talairach


5.3.2. Comparative Evaluation

In this section, the proposed feature based standardization method is compared

with a widely used standardization method, AIR (Woods et al., 1998b). As summarized

in chapter 2, AIR is based on minimizing the global intensity differences between the

given image and the standard image. Minimization is performed through least squared

distance, and warping is achieved through polynomial transformations. For maximum

performance, prior to the application of this algorithm, external tissues, such as skull,

scalp and dura, and the background must be stripped, exposing only the brain. We used

the brain extraction procedure provided in the AFNI package (Cox, 1996) for this


O Talairach M Reg. SOM H Orig. SOM

registration error


volume error


membrane energy








SCing. S. Sylv. F.

Reg. SOM Orig. SOM

0 Original SOM 0 Regularized SOM

Figure 27: Quantification of errors and membrane energy in RSOM and SOM


sOO s02 s03 s04 s06 s07 s08 s09 sl)


25 E




10 15 20 25 30

Volume Error


0 5 10 15 20 25 30


Figure 28: Change in quantization error, volume error, and membrane energy
with respect to number of SOM nodes in the alignment ofRCS