CHARACTERIZATION OF THE DISTRIBUTION OF DEVELOPMENTAL
INSTABILITY
By
GREGORY ALAN BABBITT
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2006
Copyright 2006
by
Gregory Alan Babbitt
This document is dedicated to my grandmother, Sarah Miller, who taught me to admire
the natural world.
ACKNOWLEDGMENTS
I would like to thank Rebecca Kimball, Susan Halbert, Bernie Hauser, Jane
Brockmann, and Christian Klingenberg for helpful discussions and comments; Susan
Halbert and Gary Steck (Division of Plant Industry, State of Florida) for specimen ID;
Marta Wayne for use of her microscope and digital camera; Glenn Hall (Bee Lab,
University of Florida) for a plentiful supply of bees; and the Department of Entomology
and Nematology (University of Florida) for use of its environmental chambers. I thank
Callie Babbitt for help in manuscript preparation and much additional love and support.
TABLE OF CONTENTS
page
A C K N O W L E D G M E N T S ................................................................................................. iv
LIST OF TABLES .................................................... ....... .. .............. viii
LIST OF FIGURES ......... ......................... ...... ........ ............ ix
ABSTRACT ........ ........................... .. ...... .......... .......... xii
CHAPTER
1 SEARCHING FOR A CONSISTENT INTERPRETATION OF
DEVELOPMENTAL INSTABILITY? A GENERAL INTRODUCTION ................1
2 ARE FLUCTUATING ASYMMETRY STUDIES ADEQUATELY SAMPLED?
IMPLICATIONS OF A NEW MODEL FOR SIZE DISTRIBUTION..................... 10
Intro du action .......................... ................... ... ... ................. ............ 10
Developmental Stability: Definition, Measurement, and Current Debate ..........10
The Distribution of Fluctuating Asymmetry .................................................12
M e th o d s ........................................................................... 1 7
R e su lts ...................................... .................................................... 2 2
D isc u ssio n ......................................................... .............. ................ 3 0
T he D distribution of F A ................................................................. .....................30
Sample Size and the Estimation of M ean FA................... .................................32
The Basis of Fluctuating Asymmetry..... .......... ...................................... 33
C o n c lu sio n .................................................. ................ 3 4
3 INBREEDING REDUCES POWERLAW SCALING IN THE DISTRIBUTION
OF FLUCTUATING ASYMMETRY: AN EXPLANATION OF THE BASIS OF
DEVELOPMENTAL INSTABILITY .............................. .....................36
In tro d u ctio n ......................................................... ............. ................ 3 6
W hat is the B asis of FA ? .................................. ........... .................. 36
Exponential Growth and NonNormal Distribution of FA..............................38
Testing a M odel for the Basis of FA ............................ .............. ............. 40
M e th o d s ..............................................................................4 1
M odel D evelopm ent ....... .......... ........ .................... .............. 41
Simulation of geometric Brownian motion................................ ............... 41
Simulation of fluctuating asymmetry .........................................................44
Inbreeding E xperim ent ............................................................. .....................4 5
M orphom etric analyses ............................................... ............................ 48
M odel selection and inference....................................... .......................... 49
R e su lts ...................................... .................................................... 4 9
M odel Sim ulation ...................... .................. ................... ..... .... 49
E x p erim ental R esu lts......................................... ............................................50
M odel Selection and Inference................................ ......................... ........ 52
D iscu ssion ................................................................................. 54
Revealing the Genetic Component of FA .............. .....................................54
Lim stations of the M odel ......................................................... .............. 56
T he Sources of Scaling .......................................................................... .... ... 56
Potential Application to Cancer Screening........................................................57
C o n c lu sio n .................................................. ................ 5 8
4 TEMPERATURE RESPONSE OF FLUCTUATING ASYMMETRY TO IN AN
APHID CLONE: A PROPOSAL FOR DETECTING SEXUAL SELECTION ON
DEVELOPM EN TAL IN STABILITY ........................................................................60
In tro d u ctio n ......................................................... ............. ................ 6 0
T he G enetic B asis of F A ......................................................................... ..... 60
The Environm ental Basis of FA ....................................................................... 61
Temperature and FA in and Aphid Clone ................................. ............... 63
M eth o d s ..............................................................................6 5
R e su lts ...................................... .................................................... 6 8
D iscu ssio n ...................................... ................................................. 7 2
5 CONCLUDING REMARKS AND RECOMMENDATIONS .............................. 75
H ow M any Sam ples Are Enough? ................................ .. ............................... 76
W hat M measure of FA is best?................ ......................................... ............... 76
Does rapid growth stabilize or destabilize development? .......................................78
Can fluctuating asymmetry be a sexually selected trait?................. .................78
Is fluctuating asymmetry a valuable environmental bioindicator?.............................79
Scaling Effects in Statistical Distributions: The Bigger Picture..............................80
APPENDIX
A LANDMARK WING VEIN INTERSECTIONS CHOSEN FOR ANALYSISOF
FLUCTUATING A SYM M ETRY ..................................... ......................... .......... 84
B USEFUL MATHEMATICAL FUNCTIONS ...................................... ...............86
A sym m etric Laplace D istribution......................................... .......................... 86
H alfN orm al D distribution ........................................ ............................................86
L ognorm al D distribution ............................................................. ...........................86
Double Pareto Lognormal Distribution ............................. .................... 86
L IST O F R E F E R E N C E S ........................................................................ .. ....................88
B IO G R A PH IC A L SK E TCH ...................................................................... ..................96
LIST OF TABLES
Tablege
21. Maximized loglikelihood (MLL), number of model parameters (P) and Akaike
Information Criterion differences (A AIC) for all distributional models tested.
Winning models have AIC difference of zero. Models with nearly equivalent
goodnessoffit to winners are underscored (A AIC <3.0). 4.0
indicates some support for specified model. A AIC >10.0 indicates no support
(Burnham and Anderson 1998). Distances between landmarks (LM) used for
first univariate size FA, aphids LM 12, bees LM 14 and longlegged fly LM 3
6 and for second univariate FA aphids LM 23, bees LM 26 and longlegged fly
L M 4 5 ............ .......... .... ................. .................................................2 3
22. Best fit parameters for models in Table 21. Parameters for univariate size FA
are very similar to multivariate size FA and are not shown. Skew indexes for
asymmetric Laplace are also not shown............... ................... ....... ........... 24
31. Distribution parameters and model fit for multivariate FA in two wild
populations and four inbred lines of Drosophila simulans and one isogenic line
of Drosophila melanogaster. Model fits are A AIC for unsigned centroid size
FA (zero is best fit, lowest number is next best fit). .............................................. 51
LIST OF FIGURES
Figure page
21. Schematic representation of mathematical relationships between candidate
models for the distribution of fluctuating asymmetry. Mixtures here are
continue ou s. ........................................................ ................. 14
22. Distribution of multivariate shape FA of A) cotton aphid (Aphis gossipyii) B)
domestic honeybee (Apis mellifera) and C) longlegged fly (Chrysosoma
crinitus). Best fitting lognormal (dashed line, lower inset), and double Pareto
lognormal (solid line, upper inset) are indicated................................................25
23. Distribution of multivariate centroid size FA of A) cotton aphid (Aphis gossipyii)
B) domestic honeybee (Apis mellifera) and C) longlegged fly (Chrysosoma
crinitus). Best fitting halfnormal (dashed line, lower inset) and double Pareto
lognormal distribution (solid line, upper inset) are indicated. ................................26
24. Distribution of univariate unsigned size FA of A) cotton aphid (Aphis gossipyii)
B) domestic honeybee (Apis mellifera) and C) longlegged fly (Chrysosoma
crinitus). Best fitting halfnormal (dashed line, lower inset) and double Pareto
lognormal distribution (solid line, upper inset) are indicated. ................................27
25. Distribution of sample sizes (n) from 229 fluctuating asymmetry studies reported
in three recent metaanalyses (Vollestad et al. 1999, Thornhill and Moller 1998
and Polak et al. 2003). Only five studies had sample sizes greater than 500 (not
show n). ...............................................................................28
26. Relationship between sample size and % error for estimates of mean FA drawn
from best fitting size (dashed line) and shape (solid line) distributions using
1000 draws per sample size. All runs use typical winning double Pareto
lognormal parameters (shape FA v = 3.7, T = 0.2, a = 1000, 0 = 9; for size FA v
= 1.2 = 0 .7 a = 4 .0 = 4 .0 ) .......................................................................... .....2 9
27. The proportion and percentage (inset) of individuals with visible developmental
errors on wings (phenodeviants) are shown for cotton aphids (A) and honeybees
(B) in relation to distribution of shape FA (Procrustes distance). Average FA for
both normal and phenodeviant aphids (C) and honeybees (D) are also given.........30
31. Ordinary Brownian motion (lower panel) in N simulated by summing
independent uniform random variables (W) (upper panel) ....................................42
32. Geometric Brownian motion in N and log N simulated by multiplying
independent uniform random variables. This was generated using Equation 1.5
w ith C = 0 .5 4 ...................................................................... 4 3
33. Geometric Brownian motion in N and log N with upward drift. This was
generated using Equation 1.5 with C = 0.60. ................................ ..................44
34. Model representation of Reed and Jorgensen's (2004) physical size distribution
model. Variable negative exponentially distributed stopping Times of random
proportional growth (GBM with C = 0.5) create double Pareto lognormal
distribution of size .............................. ................................... 46
35. A model representation of developmental instability. Normally variable
stopping times of random proportional growth (GBM with C > 0.5) create
double Pareto lognormal distribution of size. .................................. .................47
36. Simulated distributions of cell population size and FA for different amounts of
variation in the termination of growth (variance in normally distributed growth
stop time). The fit of simulated data to the normal distribution can determined
by how closely the plotted points follow the horizontal line (a good fit is
h horizontal ........................................................ ................ 50
37. Distribution of fluctuating asymmetry and detrended fit to normal for two
samples of wild population collected in Gainesville, FL in summers of 2004 and
2005 and four inbred lines of Drosophila simulans derived from eight
generations of fullsib crossing of the wild population of 2004. Also included is
one isogenic line of Drosophila melanogaster (me175). All n = 1000. The fit of
data to the normal distribution can determined by how closely the plotted points
follow the horizontal line (a good fit is horizontal). .............................................. 53
41. Predicted proximate and ultimate level correlations of temperature and growth
rate to fluctuating asymmetry are different. Ultimate level (evolutionary) effects
assume energetic limitation of individuals in the system. Proximate level
(growth mechanical) effects do not. Notice that temperature and fluctuating
asymmetry are negatively correlated in the proximate model while in the
ultimate model they are positively correlated. ................................. ............... 64
42. Cotton aphid mean development time +1 SE in days in relation to temperature
(n= 53 1). ...............................................................................68
43. Distribution of isogenic size, size based and shape based FA in monoclonal
cotton aphids grown in controlled environment at different temperatures.
Distributions within each temperature treatment are similar to overall
distributions show n here................................................. .............................. 69
44. Mean isogenic FA for (A.) centroid size based and (B.) Procustes shapebased)
in monoclonal cotton aphids (collected in Gainesville FL) grown on isogenic
cotton seedlings at different temperatures.................... ........... .............. 70
45. Kurtosis of sizebased FA in monoclonal cotton aphids grown on isogenic cotton
seedlings at different tem peratures ........................................................................ 71
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
CHARACTERIZATION OF THE DISTRIBUTION OF DEVELOPMENTAL
INSTABILITY
By
Gregory Alan Babbitt
May 2006
Chair: Rebecca Kimball
Cochair: Benjamin Bolker
Major Department: Zoology
Previous work on fluctuating asymmetry (FA), a measure of developmental
instability, has highlighted its controversial relationship with environmental stress and
genetic architecture. I suggest that conflict may derive from the fact that the basis of FA
is poorly understood and, as a consequence, the methodology for FA studies may be
flawed. While sizebased measures of FA have been assumed to have halfnormal
distributions within populations, developmental modeling studies have suggested other
plausible distributions for FA. Support for a nonnormal distribution of FA is further
supported by empirical studies that often record leptokurtic (i.e., fat or longtailed)
distributions of FA as well. In this dissertation, I investigate a series of questions
regarding the both the basis and distribution of FA in populations. Is FA normally
distributed and therefore likely to be properly sampled in FA studies? If not normal,
what candidate model distribution best fits the distribution of FA? Is the shape of the
distribution of FA similar to a simple and specific growth model (geometric Brownian
motion)? Does reducing individual variation in populations through inbreeding affect
follow the prediction of this model? How does this shape respond to environmental
factors such as temperature when genetic variation is controlled?
In three species of insects (cotton aphid, Aphis gossipyii Glover; honeybee, Apis
mellifera; and longlegged fly, Chrysosoma crinitus (Dolichipodidae)), I find that FA was
best described by a double Pareto lognormal distribution (DPLN), a lognormal
distribution with powerlaw tails. The large variance in FA under this distribution and
the scaling in the tails both act to slow convergence to the mean, suggesting that many
past FA studies are undersampled when the distribution of FA is assumed to be normal.
Because DPLN can be generated by geometric Brownian motion, it is ideal for describing
behavior of cell populations in growing tissue. I demonstrated through both a
mathematical growth model and an inbreeding experiment in Drosophila simulans that
the shape of the distribution of FA is highly dependent on the level of genetic redundancy
or heterogeneity in a population. In monoclonal lines of cotton aphids, I also demonstrate
that FA decreases with temperature and that a shift in kurtosis is associated with
temperature induced phenotypic plasticity. This supports the prediction of a proximate
model for the basis of FA and also suggests shape of the distribution of FA responds to
environmentally induced changes in gene expression on the same genetic background.
CHAPTER 1
SEARCHING FOR A CONSISTENT INTERPRETATION OF DEVELOPMENTAL
INSTABILITY? A GENERAL INTRODUCTION
Everywhere, nature works true to scale and everything has its proper size.
 D'Arcy Thompson
There have been many incarnations of the idea that stability and symmetry are
somehow related. Hippocrates (460377 B.C.) was the first to postulate internal
corrective properties that work in the presence of disease. Waddington (1942) suggested
existence of similar homeostatic buffering against random and presumably additive errors
occurring during development. The term "fluctuating asymmetry," first introduced by
Ludwig (1932), was later adopted by Mather (1953), Reeve (1960) and Van Valen (1962)
to describe a measurable form of morphological noise representing a hypothetical lack of
buffering that is always present during development of organisms. Recently, biologists
have become very interested both in fluctuating asymmetry's potential usefulness as a
universal bioindicator of environmental health (Parsons 1992) and in its potential as an
indicator or even an overt signal of an individual's overall genetic quality (Moller 1990).
However, over a decade of work has left the field with no clear relationship between
increased fluctuating asymmetry and either environmental or genetic stress (Bjorksten et
al. 2000, Lens et al. 2002). Despite this fact, fluctuating asymmetry is still often assumed
to indicate of developmental instability. Recently, Debat and David (2001) page 560
define developmental stability as "a set of mechanisms historically selected to keep the
phenotype constant in spite of small random developmental irregularities potentially
inducing slight differences among homologous parts within individuals." (I have
italicized the aspects of this definition that I think are left undefined.)
Fluctuating asymmetry is defined and measured as the average right minus left
difference in size or shape of morphological characters in a population and has been
generally accepted as an indicator of developmental instability because both sides of a
bilaterally symmetric organism have been developed by the same genetic program in the
same environment (Moller and Swaddle 1997). Fluctuating asymmetry is measured as
leftright side differences in the size or shape of paired bilaterally symmetric biological
structures of organisms. In a character trait that demonstrates fluctuating asymmetry, it is
assumed that the distribution of signed leftright differences is near zero and that there is
no selection for asymmetry (Palmer and Strobek 1986, 2003). Other types of asymmetry
do exist and are thought to indicate selection against symmetry. Directional asymmetry
denotes a bias towards left or right sidedness that causes the population mean to move
away from zero. In antisymmetry, left or right side biases occur equally at the individual
level creating a population that is bimodal or platykurtic.
While much attention has been directed toward the possible genetic basis of
fluctuating asymmetry (reviewed by Leamy and Klingenberg 2005, Woolf and Markow
2003), response of fluctuating asymmetry to stress (reviewed by Hoffman and Woods
2003) and correlation of fluctuating asymmetry with mate choice (Moller and Swaddle
1997); little scientific effort has been directed toward investigating its basis or origin at
levels of organization lower than the individual. A few theoretical explanations for the
basis of fluctuating asymmetry have been developed (reviewed by Klingenberg 2003).
Only two of these offer a causal explanation for the increased levels of fluctuating
asymmetry that are sometimes observed under periods of environmental stress. While
not mutually exclusive, these two explanations differ at the organizational level at which
they are thought to act. Moller and Pomiankowski (1993) first suggested that strong
natural or sexual selection can remove regulatory steps controlling the symmetric
development of certain traits (e.g., morphology used in sexual display). They suggest that
with respect to sexually selected traits (and assuming that they are somehow costly to
produce), individuals may vary in their ability to buffer against environmental stress in
proportion to the size of their own energetic reserves. These reserves are often indicative
of the genetic quality of individuals. Therefore, high genetic quality is expected to be
associated with low fluctuating asymmetry. Emlen et al. (1993) present another and
more proximate explanation of the basis of fluctuating asymmetry. They do not invoke
sexual selection but hypothesize that fluctuating asymmetry is due largely to the non
linear dynamics of signaling and supply that may occur during growth. Here fluctuating
asymmetry is thought to result from the scaling up of compounding temporal
asymmetries in signaling between cells during growth. In their model, hypothetical
levels of signaling compounds (morphogens) and or growth precursors used in the
construction of cells vary randomly over time. When growth suffers less interruption, in
other words, when it occurs faster and under less stress, there is also less complexity (and
fractal dimension) in the dynamics of signaling and supply. This is should reduce
fluctuating asymmetry.
Only a few other models have since been proposed. A model by Graham et al.
(1993) suggests that fluctuating asymmetry in the individual can also be the net result of
compounding time lags and chaotic behavior between hormonally controlled growth rates
on both sides of an axis of bilateral symmetry. More recently Klingenberg and Nijhout
(1999) present a model of morphogen diffusion and threshold response that includes
genetic control of each component. They demonstrate that fluctuating asymmetry can
result from genetically modulated expression of variation that is entirely nongenetic in
origin. In other words, even without specific genes for fluctuating asymmetry,
interaction between genetic and nongenetic sources of variation (G x E) can cause
fluctuating asymmetry (Klingenberg 2003).
While all these theories for the basis of fluctuating asymmetry have proven
useful in making some predictions about fluctuating asymmetry in relation to sexual
selection and growth rate/trait size, none are grounded in any known molecular
mechanisms. The search for any single molecular mechanism that stabilizes the
developmental process has proven elusive. A recent candidate was the heat shock
protein, Hsp90, which normally target conformationally plastic proteins that act as signal
tranducers (i.e., molecular switches) in many developmental pathways (Rutherford and
Lundquist 1998). Because Hsp90 recognizes protein folding, it can also be diverted to
misfolded proteins that are denatured during environmental stress. Therefore, Hsp90 can
potentially link the developmental process to the environment and these authors suggest it
may also capacitate the evolution of novel morphology during times of stress by
revealing genetic variation previously hidden to selection in nonstressful environments.
However, additional research by Milton et al. (2003) shows that while Hsp90 does buffer
against a wide range of morphologic changes and does mask the effect of much hidden
genetic variation in Drosophila, it does not appear to affect average levels of fluctuating
asymmetry through any single Hsp90 dependent pathway or process.
It is said that wisdom begins with the naming of things. The case of fluctuating
asymmetry reminds us that the act of giving names to things in science can, in fact, lend a
false impression that we have achieved a true understanding of that which has been
named. While fluctuating asymmetry has had several specific descriptive definitions, we
do not really know how to define it at a fundamental level because we do not understand
exactly how development is destabilized under certain conditions of both gene and
environment. All we can say for now is that fluctuating asymmetry is a mysterious form
of morphological variation. Mary Jane WestEberhard (2003) has dubbed it the "dark
side" of variation because it may represent that noisy fraction of the physicalbiological
interface that is still free of selection, and not under direct control of the gene.
Any study of natural variation would do well to begin by simply observing its
shape or its distribution in full. In nature, statistical distributions come mostly in two
flavors: those generated by large systems of independent additive components and those
generated by large systems of interacting multiplicative components (Vicsek 2001,
Sornette 2003). When large systems are composed of independent subunits, random
processes result in the normal distribution, the cornerstone of classical statistics. The
normal distribution is both unique and extreme in its rapidly decaying tails, its very
strong central tendency and its sufficient description by just two parameters, the mean
and variance, making it, in this sense, the most parsimonious description of random
variation. However, the normal distribution does not describe all kinds of stochastic or
random behavior commonly observed in the natural sciences. When subunits comprising
large systems interact, random processes are best described by models that underlie
statistical physics (sometimes called Levy statistics) (Bardou et al. 2003. Sornette 2003).
Fattailed distributions are often the result of propagation of error in the presence of
strong interaction. These types of distributions include the power function, Pareto, Zipf
and the double Pareto or logLaplace distributions, all characterized by a power law and
an independence of scale. Collective effects in complex interacting systems are also
often characterized by these power laws (Wilson 1979, Stanley 1995). Examples include
higher order phase transitions, selforganized criticality and percolation. During second
order phase transition at the critical temperature between physical phases, external
perturbation of the network of microscopic interactions between molecules results in
system reorganization at a macroscopic level far above that of interacting molecules (e.g.,
the change from water to ice). This results in collective imitation that propagates among
neighboring molecules over long distances. Exactly at these critical temperatures,
imitation between neighbors can be observed at all scales creating regions of similar
behavior that occur at all sizes (Wilson 1979). Thus, a selfsimilar powerlaw manifests
itself in the interacting system's susceptibility to perturbation and results, in this case,
from the multiplicity of interaction paths in the interaction network (Stanley 1995). As
the distance between two objects in a network increases, the number of potential
interaction pathways increases exponentially and the correlation between such paths
decreases exponentially. The constant continuous degree of change represented by the
power law is the result of a combined effect between both an exponentially increasing
and decreasing rate of change. This highlights the fact that power laws can be easily
manifested from combinations of exponential functions which are very common to
patterns of change in many natural populations (both living and nonliving).
Therefore, it is important to note that power laws in statistical distributions do not
have to always indicate strong interaction in a system and that many other simple
mechanisms can create them (Sornette 2003), especially where the behavior of natural
populations are concerned. For example, an apparently common method by which power
laws are generated in nature is when stochastic proportional (geometric) growth is
observed randomly in time (Reed 2001). Power law size distributions in particle size,
human population size, and economic factors are all potentially explained by this process
(Reed and Jorgensen 2004). Here we also have exponential increase in size opposing an
exponential decrease in the probability of observation or termination that results in a
gently decaying power law. This process can also explain why power law scaling can
occur through the mixing of certain distributions where locally exponentially increasing
and decreasing distributions overlap. For example, superimposing lognormal
distributions results in a lognormal distribution with power law tails (Montroll and
Shlesinger 1982,1983).
Given that exponential relationships are so common in the natural world, we should
assume that in observing any large natural population outside of an experiment, there is
probably some potential for a powerlaw scaling effect to occur. Therefore some degree
of nonnormal behavior may be likely to be observed, often in the underlying
distribution's tail. If we assume an underlying normal distribution, and sample it
accordingly, we are likely to undersample this tail. And so we rarely ever present
ourselves with enough data to challenge our assumption of normality, and we risk
missing the chance to observe a potentially important aspect of natural variation.
Until now, the basis of fluctuating asymmetry has been addressed only with very
abstract models of hypothetical cell signaling, or at the level of selection working on the
organism with potential mechanism remaining hypothetical. In this dissertation, the
underlying common theme is that fluctuating asymmetry must first and foremost be
envisioned as a stochastic process occurring during tissue growth, or in other words,
occurring within an exponentially expanding population of cells. This expansion process
can be modeled by stochastic proportional (geometric) growth that is terminated or
observed randomly over time. As will be explained in subsequent chapters, this
generative process can naturally lead to variation that distributes according to a
lognormal distribution with power laws in both tails (Reed 2001). In chapter 2, I
examine the distribution of fluctuating asymmetry in the wings of three species of insects
(cotton aphid, Aphis gossipyi Glover, honeybee, Apis mellifera, and longlegged flies,
Chrysosoma crinitus) and test various candidate models that might describe the statistical
distribution of fluctuating asymmetry. I then address whether, given the best candidate
model, fluctuating asymmetry studies have been appropriately sampled. I suggest that
much of the current controversy over fluctuating asymmetry may be due to the fact that
past studies have been undersampled. In chapter 3, I extend and test a
phenomenological model for fluctuating asymmetry that is introduced in chapter 2. I
present the model and then examine some of its unique predictions concerning the effects
of inbreeding on the shape of the distribution of fluctuating asymmetry in Drosophila. I
present evidence that the genetic structure of a population can have a profound effect on
the scaling and shape of the observed distribution of fluctuating asymmetry. In chapter 4,
I characterize the pattern of developmental noise (fluctuating asymmetry in the absence
9
of genetic variation) in two monoclonal populations of cotton aphid cultured under
graded environmental temperatures. I investigate how developmental noise is altered by
this simple change in the environment. I present evidence that the environmental
response of sizebased FA is directly related to developmental time. Lastly, I conclude
with a review of my major findings in the context of the introduction I have presented
here.
CHAPTER 2
ARE FLUCTUATING ASYMMETRY STUDIES ADEQUATELY SAMPLED?
IMPLICATIONS OF A NEW MODEL FOR SIZE DISTRIBUTION
Introduction
Developmental Stability: Definition, Measurement, and Current Debate
Developmental stability is maintained by an unknown set of mechanisms that
buffer the phenotype against small random perturbations during development (Debat and
David 2001). Fluctuating asymmetry (FA), the most commonly used assay of
developmental instability, is defined either as the average deviation of multiple traits
within a single individual (Van Valen 1962) or the deviation of a single trait within a
population (Palmer and Strobek 1986, 2003; Parsons 1992) from perfect bilateral
symmetry. Ultimately, an individual's developmental stability is the collective result of
random noise, environmental influences, and the exact genetic architecture underlying the
developmental processes in that individual (Klingenberg 2003; Palmer and Strobek
1986). Extending this to a population, developmental stability is the result of individual
variation within each of these three components.
Currently, there is conflict in the literature regarding the effect of both environment
and genes on the developmental stability of populations. The development of bilateral
symmetry appears to be destabilized to various degrees by both environmental stressors
(review in Moller and Swaddle 1997) and certain genetic architectures (usually created
by inbreeding: Graham 1992; Lerner 1954; Mather 1953; Messier and Mitton 1996;
review by Mitton and Grant 1984). While the influence of inbreeding on FA is not
consistent (Radwan 2003; Carchini et al. 2001; Fowler and Whitlock 1994; Leary et al.
1983,1984; Lens et al. 2002; Perfectti and Camachi 1999; Rao et al. 2002; Vollestad et al.
1999), it has led biologists to use terminology such as "genetic stress" or "developmental
stress" when describing inbred populations (Clarke et al. 1986 and 1993; Koehn and
Bayne 1989; Palmer and Strobek 1986).
While genetic and environmental stressors have been shown to contribute to
developmental instability and FA, the full picture is still unclear (Bjorksten et al. 2000;
Lens et al. 2002). While FA has been proposed as a universal indicator of stress within
individual organisms (Parsons 1992), its utility as a general indicator of environmental
stress has been contentious (Bjorksten et al. 2000; Ditchkoff et al. 2001; McCoy and
Harris 2003; Merila and Bjorklund 1995; Moller 1990; Rasmuson 2002; Thornhill and
Moller 1998; Watson and Thornhill 1994; Whitlock 1998). Despite many studies, no
clear general relationship between environmental stress and FA has been demonstrated or
replicated through experimentation across different taxa (Bjorksten et al. 2000; Lens et al.
2002). Furthermore, the effects of stress on FA appear to be not only speciesspecific but
also traitspecific and stressspecific (Bjorksten et al. 2000). Several metaanalyses have
attempted to unify individual studies on the relation of sexual selection, heterozygosity,
and trait specificity to FA (Polak et al. 2003; Thornhill and Moller 1998; Vollestad et al.
1999); while some weak general effects have been found, their biological significance is
still unresolved.
Taken together, the ambiguity of the results from FA studies suggests unresolved
problems regarding the definition and/or measurement of FA. The distribution and
overall variability of FA are sometimes discussed with regards to repeatability in FA
studies (Whitlock 1998; Palmer and Strobeck 2003), but is seldom a primary target of
investigation. Until we can quantify FA more reliably and understand its statistical
properties, the potential for misinterpretation of FA is likely to persist.
The Distribution of Fluctuating Asymmetry
Although it is always risky to infer underlying processes from observed patterns,
careful examination of the distribution of FA in large samples may help distinguish
between possible scenarios driving FA. For instance, a good fit to a single statistical
distribution may imply that the same process operates to create FA in all individuals in a
population. In contrast, a good fit to a discrete mixture of several different density
functions might suggest that highly asymmetric individuals suffer from fundamentally
different developmental processes than their more symmetric counterparts. Thintailed
distributions (e.g., normal or exponential) may indicate relative independence in the
accumulation of small random developmental errors, whereas heavytailed distributions
may implicate nonindependent cascades in the propagation of such error. Despite much
interest in the relationship between environmental stress and levels of FA, the basic
patterns of its distribution in populations remain largely unexplored.
One common distributional attribute of FA, leptokurtosis, has been discussed in the
literature. Leptokurtosis denotes a distribution that has many small and many extreme
values, relative to the normal distribution. Two primary causes of this kind of departure
from the normal distribution are the mixing of distributions and/or scaling effects in data.
For example, the Laplace or double exponential distribution is leptokurtic (but not heavy
tailed) and can be represented as a continuous mixture of normal distributions (Kotz et al.
2001; Kozubowski and Podgorski 2001). Just as log scaling in the normal distribution
results in the lognormal distribution, log scaling in the Laplace leads to logLaplace (also
called double Pareto) distributions (Kozubowski and Podgorski 2002; Reed 2001), which
are both leptokurtic and heavytailed (see Figure 21). Several explanations for
leptokurtosis in the distribution of FA have been proposed. Both individual differences
in developmental stability within a population (Gangestad and Thornhill 1999) and
differences in FA between subpopulations (Houle 2000) have been suggested to lead to
continuous or discrete mixtures of normal distributions with different developmental
variances, which in turn would cause leptokurtosis (e.g., a Laplace distribution) in the
overall distribution of FA. Mixtures of nonnormal distributions may also cause either
leptokurtosis or platykurtosis (more intermediate values than the normal distribution:
Palmer and Strobek 2003). A potential example of this is illustrated by Hardersen and
Frampton's (2003) demonstration that a positive relationship between mortality and
asymmetry can cause leptokurtosis. Alternatively, Graham et al. (2003) have argued that
developmental error should behave multiplicatively in actively growing tissues, creating
a lognormal size distribution in most traits rather than the normal distribution that is
usually assumed. They argue that this ultimately results in leptokurtosis (but not fat tails)
and size dependent expression of FA. Because simple growth models are often geometric,
we should not be surprised if distributions of sizebased FA followed the lognormal
distribution (see Limpert et al. 2001 for a review of lognormal distributions in sciences).
Not well recognized within biology is the fact that close interaction of many
components can result in powerlaw scaling distributionall tails that decrease
proportional to xa rather than to some exponential function of x such as exp(ax)
[exponential] or exp(ax2) [normal] and hence to heavytailed distributions (Sornette
2003). Power law scaling is often associated with the tail of the lognormal distribution,
14
especially when log standard deviation is large ( Mitzenmacher unpublished ms.;
Montroll and Shlesinger 1982, 1983; Roman and Porto 2001; Romeo et al. 2003).
additive/linear
geometric/logarithmic
NORMAL
Normal
Laplace
gA
Double Pareto
lognormal (DPLN)
:t
LAPLACE
LOGNORMAL
M
x
X
T
U
R
E
LOGLAPLACE
Figure 21. Schematic representation of mathematical relationships between candidate
models for the distribution of fluctuating asymmetry. Mixtures here are
continuous.
KEY:
Solid line scale of variable (x * ln(x))
Dashed line random walk observed at constant stopping rate (i.e., negative
exponentially distributed stop times) Note: random walk on log scale exhibits geometric
Brownian motion
Dotted line convolution of two distributions (one of each type)
Block arrow a continuous mixture of distributions with stochastic (exponentially
distributed) variance
(Note LogLaplace is also called double Pareto by Reed 2001, Reed and Jorgensen
2004)
* *
Because FA in a population may reflect fundamental developmental differences
between different classes or groups of individuals, for example stressed and nonstressed,
or between different subpopulations as suggested by Houle (2000), we might expect
discrete mixtures of different distributions to best describe FA. For instance, extreme
individuals falling in a heavy upper tail may be those who have exceeded some
developmental threshold. Major disruption of development, resulting in high FA, may
also reveal the scaling that exists in the underlying gene regulatory network (Albert and
Barabasi 2002; Clipsham et al. 2002; Olvai and Barabasi 2002). Alternatively, ifFA is
produced by a single process, but to various degrees in different individuals, then one
might expect a continuous mixture model to best describe the distribution of FA.
The possibility of nonnormal distribution of FA opens the door to several potential
sampling problems. For instance, if the lognormal shape, or multiplicative variance
parameter, is large, then broad distribution effects may slow the convergence of the
sample mean to the population mean as sample sizes are increased (Romeo et al. 2003).
An additional, thornier, problem is caused by powerlaw scaling in the tails of
distributions. Many lognormally distributed datasets exhibit powerlaw scaling (or
amplification) in the tail region (Montroll and Shlesinger 1982, 1983; Romeo et al. 2003;
Sornette 2003), sometimes called ParetoLevy tails or just Levy tails. As sample sizes
grow infinitely large, powerlaw and Pareto distributions may approach infinite mean (if
the scaling exponent is less than three) and infinite variance (if the scaling exponent is
less than two), and therefore will not obey the Law of Large Numbers (that sample means
approach the population mean as sample sizes increase). Increased sampling actually
increases the likelihood of sampling a larger value in the tail of a Pareto distribution
(Bardou et al. 2003; Quandt 1966), creating more uncertainty in estimates of the mean as
sample size increases. The presence of powerlaw tails can slow overall convergence
considerably even in distributions that are otherwise lognormal with low variance (which
may not look very different from wellbehaved lognormal distributions unless a large
amount of data is accumulated).
This discussion points to two effects that need to be assessed broad distribution
effects, controlled by the shape (lognormal variance) parameter of the body of the FA
distribution, and powerlaw tails, controlled by the scaling exponents of the tails of the
FA distribution. To assess these effects, I apply a new statistical model, the double
Pareto lognormal (DPLN) distribution (Reed and Jorgensen 2004). The DPLN
distribution is a lognormal distribution with powerlaw behavior in both tails (for values
near zero and large positive values). Similar to the logLaplace distributions, the DPLN
distribution can be represented as a continuous mixture of lognormal distributions with
different variances. It can also be derived from a geometric Brownian motion (a
multiplicative random walk) that is stopped or "killed" at a constant rate (i.e., the
distribution of stop times is exponentially distributed: Reed and Jorgensen 2004; Sornette
2003). The parameters of the DPLN distribution include a lognormal mean (v) and
variance ('2) parameter which control the location and spread of the body of the
distribution, and powerlaw scaling exponents for the left (3) and right (c) tails. Special
cases of the DPLN include the right Pareto lognormal (RPLN) distribution, with a power
law tail on the right but not the left side (>oo); the left Pareto lognormal (LPLN)
distribution, with a powerlaw tail only near zero (cac); and the lognormal distribution,
with no powerlaw tails (a>c,3>co). For comparison, I also fit normal and halfnormal
distributions as well as the asymmetric Laplace distribution to the data on FA. See
Figure 21 for schematic representation of relationships between these candidate models
for FA.
In the following study, I directly test the fit of different distributions to large FA
datasets from three species of insects. I include a lab cultured monoclonal line of cotton
aphid, Aphis gossipyii Glover, in an attempt to isolate the distribution of developmental
noise for the first time. I also analyze data from a semiwild population of domestic
honeybee, Apis mellifera, taken from a single inseminated single queen colony, and from
a large sample of unrelated wildtrapped Longlegged flies (Chrysosoma crinitus:
Dolichopididae).
I address two primary groups of goals in this study. First, I investigate what
distributions fit FA data best and how the parameters of these distributions vary across
species, rearing conditions, and levels of genetic relatedness. I also address whether
"outliers" (individuals with visible developmental errors) appear to result from discrete or
continuous processes. Secondly, I determine how accurate the estimates of population
mean FA are at various sample sizes to determine whether past studies of FA been
adequately sampled to accurately estimate mean or average FA in populations. In
addition to these two primary goals, I also compare the bestfitting distributions and level
of sampling error for three of the most common methods of measuring FA: a univariate
and a multivariate sizebased metric of asymmetry, and a multivariate shapebased
method.
Methods
Wings were collected and removed from three populations of insects and dry
mounted on microscope slides. These populations included a monoclonal population of
1022 Cotton Aphid (Aphis gossipyii Glover) started from a single individual collected
from citrus in Lake Alfred, Florida, 1001 honeybees Apis mellifera, maintained at
University of Florida and 889 longlegged flies (Chrysosoma crinitus :Dolichopodidae).
All species identifications were made through the State of Florida Department of Plant
Industry in Gainesville and voucher specimens remain available in their collections.
Aphid cultures were maintained on potted plants in reachin environmental
chambers at 150C with constant 14/10 hour LD cycle generated by 4 20 watt
fluorescent Grolux brand bulbs. Aphid cultures were cultured on approximately 10 day
old cotton seedlings (Gossipium) and allowed to propagate until crowded. Crowding
stimulated alate formation (winged forms) in later generations which were collected
every twenty days with a fine camel hair brush wetted in ethanol. New plants were added
every ten days and alates were allowed to move freely from plant to plant starting new
clones until they were collected. The temperature at which the colony was maintained
created a low temperature "dark morph" Cotton Aphid which still propagated on host
plants quickly but was larger than high temperature "light morphs" that form at
temperatures greater than 17C. Dark morphs colonize stems on cotton whereas light
morphs colonized the undersides of leaves.
The bees were collected in June 2004 by Dr. Glenn Hall at the University of
Florida's Bee Lab from a single inseminated single queen colony. They were presumably
all foragers and haplodiploid sisters collected as they exited the hive into a collection bag.
The bag was frozen for three hours and then the bees were placed in 85% ethanol.
Longlegged flies were trapped from a wild population using 14 yellow plastic
water pan traps in southwest Gainesville, Florida, during May 2003 and MayJune 2004.
A very small amount of dishwashing detergent was added to the water to eliminate
surface tension and enhance trapping. Traps were checked every three hours during
daylight and set up fresh every day of trapping.
All insect specimens were dried in 85% ethanol, and then pairs of wings were
dissected (in ethanol) and airdried to the glass slides while the ethanol evaporated.
Permount was used to attach cover slips. This technique prevented wings from floating
up during mounting, which might slightly distort the landmark configuration. Dry
mounts were digitally photographed. All landmarks were identified as wing vein
intersections on the digital images (six landmarks on each wing for aphids, eight for
honeybees and Dolichopodid flies). See Appendix A for landmark locations on wings for
each species.
Wing vein intersections were digitized three times each on all specimens using
TPSDIG version 1.31 (Rohlf, 1999). All measures of FA were taken as the average FA
value of the three replicate measurements for each specimen. Specimens damaged at or
near any landmarks were discarded. Fluctuating asymmetry was measured in three ways
on all specimens. First, a common univariate metric of absolute unsigned asymmetry
was taken for two different landmarks: FA = abs(R L) where R and L are the Euclidean
distances between the same two landmarks on either wing. In aphids, landmarks 12 and
23 were used; in bees, landmarks 14 and 26 were used; and in longlegged flies,
landmarks 36 and 45 were used. Two multivariate geometric morphometrics using
landmarkbased methods were performed using all landmarks shown in Appendix A. A
multivariate sizebased FA (FA 1 in Palmer and Strobek 2003) was calculated as absolute
value of (R L) where R and L are the centroid sizes of each wing (i.e., the average of
the distances of each landmark to their combined center of mass or centroid location). In
addition, a multivariate shapebased measure of FA known as the Procrustes distance was
calculated as the square root of the sum of all squared Euclidean distances between each
left and right landmark after twodimensional Procrustes fitting of the data (Bookstein
1991; Klingenberg and McIntyre 1998; FA 18 in Palmer and Strobeck 2003; Smith et al.
1997). Procrustes fitting is a three step process including a normalization for centroid
size followed by superimposition of two sets of landmarks (right and left) and rotation
until all distances between each landmark set is minimized. Centroid size calculation,
Euclidean distance calculation and Procrustes fitting were performed using 0yvind
Hammer's Paleontological Statistics program PAST version 0.98 (Hammer 2002). For
assessing measurement error (ME) of FA (or more specifically, the digitizing error), we
conducted a Procrustes ANOVA (in Microsoft Excel) on all pairs of wing images
resampled three times each for every species (Klingenberg and McIntyre 1998). Percent
measurement error was computed as (ME/average FA) x 100 where
ME = (FA FA2 + FA2 FA3 + FA FA3)/ 3 All subsequent statistical analyses
were performed using SPSS Base 8.0 statistical software (SPSS Inc.).
The fits of all measures of FA to eight distributional models (normal, halfnormal,
lognormal, asymmetric Laplace, double Pareto lognormal (DPLN), two limiting forms of
DPLN, the right Pareto lognormal (RPLN) and the left Pareto lognormal (LPLN) and a
discrete mixture of lognormal and Pareto) were compared by calculating negative log
likelihood and Likelihood Ratio Test (LRT) if models were nested and Akaike
Information Criteria (AIC) if not nested (Burnham and Anderson 1998; Hilborn and
Mangel 1998). Both of these approaches penalize more complex models (those with
more parameters) when selecting the bestfit distributional model for a given dataset.
(The Likelihood Ratio Test does not technically apply when the nesting parameter is at
the boundary of its allowed region, e.g., when c>oc for the DPLN, but Pinheiro and
Bates (2000) suggest that the LRT is conservative, favoring simpler models, under these
conditions.) Best fitting parameters were obtained by maximizing the loglikelihood
function for each model (Appendix B). The maximization was performed using the
conjugate gradient method within unconstrained solve blocks in the program MathCad by
MathSoft Engineering and Education Inc (2001), and was also confirmed using Nelder
Mead simplex algorithm or quasiNewton methods in R version 2.0.1 (2003), a
programming environment for data analysis and graphics.
Phenodeviants were defined as individuals demonstrating missing wing veins, extra
wing veins or partial wing veins on either one or both wings. All phenodeviants in
honeybees involved absence of the vein at landmark 6 (LM 6). Phenodeviants in aphids
were more variable but mostly involved absence of wing vein intersections at LM 2 or
LM 3. Procrustes distances were estimated for phenodeviants by omitting the missing
landmarks (caused by the phenodeviance) and controlling for the effect of this removal
on the sums of squares. I added an average of the remaining sums of squares in place of
the missing sums of squares so that the calculated Procrustes distance is comparable to
normal specimens (i.e., six landmarks). In almost all phenodeviants, this involved
omission of only one set of landmark values. The frequency of phenodeviants was
examined across the range of the FA distribution (i.e., Procrustes distance), and mean
values of the FA for phenodeviants were compared to normal individuals in order to
assess whether phenodeviants tended to show higher than normal levels of FA in
characters that were not affected by the missing, partial, or extra wing veins.
The best fitting parameters of the best fitting models were used to build a
distributional model under which repeated sampling was simulated at various sample
sizes. Average error in estimation of the mean FA was calculated as a coefficient of
variation ((s / x) 100) for 1,000 randomly generated datasets. Lastly, comparison were
made of the estimation errors given the best fitting distributions of FA to a distribution of
sample sizes from 229 FA studies published in three recent metaanalyses (Polak et al.
2003; Thornhill and Moller 1998; Vollestad et al. 1999).
Results
In the distributions of shapebased FA in monoclonal cotton aphids (n = 1022),
domesticated honeybees (n = 1001), and wild trapped longlegged flies (n = 889), AIC
and LRTs always favored DPLN or RPLN models by a large margin. All sizebased FA
distributions favored DPLN or LPLN by a large margin (Table 21). All variants of
discrete mixture models we tried had very poor results (data not shown). Figures 22
through 24 demonstrate best fitting models for multivariate shape FA (DPLN and
lognormal), multivariate centroid size FA (DPLN and halfnormal), and univariate size
FA (DPLN and halfnormal) for all three species. FA was often visually noticeable in
aphids, where the mean shape FA (Procrustes distance) was three times higher (0.062 +
0.00050) than in bees (0.023 0.00026) or flies (0.019 0.00028). I note that
distribution of size FA in aphids and bees fit halfnormal distribution in the upper tails
fairly well but fit relatively poorly among individuals with low FA. Longlegged flies
exhibit poor fit to halfnormal in both tails.
Table 21. Maximized loglikelihood (MLL), number of model parameters (P) and
Akaike Information Criterion differences (AAIC) for distributional models
tested. Winning models have AAIC zero. Models with goodnessoffit nearly
equal to winners are underscored (AAIC <3.0). 4.0
some support for specified model. AAIC >10.0 indicates no support (Burnham
and Anderson 1998). Distances between landmarks for first univariate size
FA, aphids LM 12, bees LM 14 and longlegged fly LM 36 and for second
univariate FA aphids LM 23, bees LM 26 and longlegged fly LM 45.
Multivariate shape FA Multivariate size FA First univariate size FA
MLL
255.117
264.72
256.723
734.726
2241
3523
637.044
132.908
130.224
136.361
523.667
3371
5319
1148
146.603
128.617
149.585
506.553
2985
5147
1041
A AIC
0.000
17.207
1.213
955.218
3968
6530
761.855
7.369
0.000
12.275
784.888
6480
10370
2036
37.971
0.000
41.935
753.871
5710
10030
1825
MLL
542.906
547.44
543.288
1261
2847
2370
1638
536.278
549.641
535.906
1262
2180
1775
1387
452.728
454.056
455.818
1055
1760
1084
940.241
A AIC
1.236
8.303
0.000
1433
4606
3650
2189
2.744
27.47
0.000
1451
3286
2475
1703
0.000
0.656
4.18
1201
2610
1256
973.02
MLL
545.725
549.789
545.317
1265
3352
2893
1883
454.337
460.51
455.483
1081
1709
2687
1162
453.958
460.261
456.237
1064
1806
1252
1016
A AIC
2.815
8.944
0.000
1438
5612
4691
2676
0.000
10.346
0.292
1249
2505
1771
1413
0.000
10.606
2.559
1216
2701
1590
1121
Second univariate size FA
MLL AAIC
614.123 2.333
617.498 7.083
613.957 0.000
1413 1596
3534 5838
3047 4862
1950 2671
Species/model
Cotton Aphid
DPLN
RPLN
LPLN
LNORM
NORM
HNORM
LAPLACE
Honey Bee
DPLN
RPLN
LPLN
LNORM
NORM
HNORM
LAPLACE
Longlegged Fly
DPLN
RPLN
LPLN
LNORM
NORM
HNORM
LAPLACE
6.329
23.82
0.000
1393
3574
2816
1863
0.000
20.182
0.212
1144
2508
1368
1169
For multivariate shape analysis, right tail powerlaw exponents (a) were very high
(thousands), lefttail exponents (3) from 3.99.9, while the dispersion parameter (') was
narrowly distributed from 0.310 to 0.356. Thus, shape FA exhibited little scaling in tails
(i.e., nearly lognormal). For sizebased FA, dispersion was much larger (0.570.74), and
powerlaw exponents more variable for univariate and multivariate sizebased FA (left
515.93
525.675
513.765
1211
2302
1924
1445
412.007
422.993
412.902
985.894
1668
1099
997.28
tail (P) and right tail (ca) were generally low, indicating moderate powerlaw scaling in
both tails) as shown in Table 22.
Table 22. Best fit parameters for models in Table 1. Parameters for univariate size FA
are very similar to multivariate size FA and are not shown. Skew indexes for
asymmetric Laplace are also not shown.
Multivariate shape FA
Multivariate size FA
Cotton Location Dispersion/ Right tail
Aphid
DPLN
RPLN
LPLN
LNORM
NORM
HNORM
LAPLACE
Honey Bee
DPLN
RPLN
LPLN
2.62
3.01
Shape
0.356
0.415
2.62 0.353
2.87
0.062
0.009
0.049
1160
7.03
Left
Tail
4.04
00
0o 3.90
0.434
0.027
0.059
0.058
3.74 0.310 8380 9.80
4.02 0.274 5.52 oo
3.71 0.305
LNORM 3.84
NORM 0.023
HNORM 0.007
LAPLACE 0.021
Longlegged Fly
DPLN Q
RPLN
LPLN
LNORM
NORM
HNORM
LAPLACE
00 7.75
0.327
0.008
0.018
0.018
flnA AQnf an ICW~
2
4.28 0.231 3.73 oo
3.92 0.341
4.02
0.019
0.007
0.017
0.351
0.008
0.015
0.017
oo 9.91
2
Location Dispersion/ Right tail
1.45
1.07
1.73
1.28
4.92
0.163
0.657
1.32
0.571
1.52
0.777
2.95
0.078
0.550
Shape
0.735
0.800
0.699
0.824
3.94
6.31
2.19
6.24
4.58
Left
Tail
3.01
00
0o 2.23
0.565 10.4 1.57
0.829 4.82 oo
0.510
0.850
2.14
3.64
2.24
oo 1.35
0.067 0.683 3.09 5.14
0.302 0.698 2.75
0.319 0.743
 0.060
 2.00
 0.075
 0.346
oo 3.85
0.783
2.24
2.11
1.06
Figure 25 shows the distribution of FA sample sizes from 229 studies published in
three recent metaanalyses (Polak et al. 2003; Thornhill and Moller 1998; Vollestad et al.
1999).
A. Cc tton Aphid
.J. I
B. Domestic Honeybee
, DPLN DPLN
0 005 1 0 15 02 025
obsDATA
o Lognormal 9
0xLGN 2
o 1
S005 01 0 15 02 025
12 0 002 004 006 022 008 24
10 DPLNo
x pDPLN
obDATA
Lognormal
X..
Figure 22. Distribution of multivariate shape FA of A) cotton aphid (Aphis gossipyii) B)
domestic honeybee (Apis mellifera) and C) longlegged fly (Chrysosoma
crinitus). Best fitting lognormal (dashed line, lower inset), and double Pareto
lognormal (solid line, upper inset) are indicated.
Il*lr.mr:,~ n I I I "
Figure 23. Distribution of multivariate centroid size FA of A) cotton aphid (Aphis
gossipyii) B) domestic honeybee (Apis mellifera) and C) longlegged fly
(Chrysosoma crinitus). Best fitting halfnormal (dashed line, lower inset) and
double Pareto lognormal distribution (solid line, upper inset) are indicated.
A Cotton Aphid (LM 23)
DPLN
o o 0 fnormal
oboATA
Halfnormal
, i i i~
B Domestic Honeybee (LM 14)
B Domestic Honeybee (LM 26)1
xpDaFN
DPLN
obr ATA
Halfnormal ; +
n5, i
Fly (LM 36) l
S, DPLN
Halfnormal
5 I
C Longlegged Fly (LM 45)
Nf
A Cotton Aphid (LM 12)
Figure 24. Distribution of univariate unsigned size FA of A) cotton aphid (Aphis
gossipyii) B) domestic honeybee (Apis mellifera) and C) longlegged fly
(Chrysosoma crinitus). Best fitting halfnormal (dashed line, lower inset) and
double Pareto lognormal distribution (solid line, upper inset) are indicated.
'\1 h P
20 DPLN
o o 0 o o 4'o
obhDATA
Halfnormal
0 5 0 10
DPLN
obsDATA
Halfnormal
DPLN
Hfr
obDATA
Halfnormal
i i i
40
30
20
10
0
sample or treatment size
Figure 25. Distribution of sample sizes (n) from 229 fluctuating asymmetry studies
reported in three recent metaanalyses (Vollestad et al. 1999, Thornhill and
Moller 1998 and Polak et al. 2003). Only five studies had sample sizes greater
than 500 (not shown).
Nearly 70% of the 229 FA studies have sample or treatment sizes less than 100.
Figure 26 demonstrates the hypothetical error levels (coefficients of variations) in
estimated mean FA at various sample sizes. Approximate best fit parameters were used
to estimate the coefficient of variation (CV) under the DPLN distribution (for shape FA,
v = 3.7, T = 0.35, a = 1000, 0 = 9; for size FA, v = 1.2, T = 0.70, a = 4.0, 0 = 4.0). For
the same set of landmarks, multivariate shape FA measures lead to the least amount of
error in estimating mean FA under DPLN at any sample size. Both univariate and
multivariate size FA perform more poorly in terms of both convergence and overall
percentage error.
I found that while phenodeviants occurred in almost all regions of the distribution
range of FA, the percentage of phenodeviant individuals increased dramatically with
increasing FA (Figure 27; aphids, r = 0.625 p = 0.013; bees, r = 0.843 p = 0.001). I also
found that individuals with phenodeviant wings (both aphids and bees) showed
significantly higher levels of FA across those wing landmarks unaffected by the
phenodeviant traits (p < 0.002 in both aphids and bees). Only a single case of
phenodeviance was sampled in longlegged flies.
45
4 0 1 .. ...............
40
35
3 0 ...................... .. .. .. .... .... .. ..............................................
o30 Legend
0 ............................................................................... ................. heg epn d F ....................... ... .............
% ERROR IN ESTIMATION 25 shape FA
OF THE MEAN (CV)  size FA
20
2 0 .. ... .. ... .. ... .... .... ..... ... ... .... ... .... .... ... ... .... ................................................ ....... .....................
15
0
0 \ \. I, '. i ii
0 50 100 150 200 250 300 350 400 450 500
SAMPLE SIZE
Figure 26. Relationship between sample size and % error for estimates of mean FA
drawn from best fitting size (dashed line) and shape (solid line) distributions
using 1000 draws per sample size. All runs use typical winning double Pareto
lognormal parameters (shape FA v = 3.7, T = 0.2, a = 1000, 0 = 9; for size FA
v = 1.2, z = 0.7, a = 4.0, 0 = 4.0).
Percent measurement error for shape FA was 1.41% in aphids, 1.63% in bees, and
2.42 % in flies while for size FA, it was 4.5% in aphids, 5% in bees, and 6.5 % in
flies. In a Procrustes ANOVA (Klingenberg and McIntyre 1998) the mean squares for
the interaction term of the ANOVA (MSinteraction) was highly significant p<0.001 in all
three species indicating that FA variation was significantly larger than variation in
measurement error (ME). The distribution of signed ME was normal and exhibited
moderate platykurtosis for all types of FA in all species examined. Measurement error
was very weakly correlated to FA in all samples (0.01 < r2 <0.07).
300 300
A B
250 250 
200 200
Frequency
150 150
100 100
Legend IIIILegend
50 n phenodeviants I phenodeviants
50 normal 50 normal
0 0
&'' '' & .O k '',:' .... ...... O *' o g"O" C o" O
o o. r t $ *0
Procrustes Distance Procrustes Distance
064 030
C 0 D
So 028
o60
Q 027
058
054 024
S052 023
NORMALWING PHENODEVIANT NORMALWING PHENODEVIANT
APHID WINGS BEE WINGS
Figure 27. The proportion and percentage (inset) of individuals with visible
developmental errors on wings (phenodeviants) are shown for cotton aphids
(A) and honeybees (B) in relation to distribution of shape FA (Procrustes
distance). Average FA for both normal and phenodeviant aphids (C) and
honeybees (D) are also given.
Discussion
The Distribution of FA
The data demonstrate a common pattern of distribution in the FA in wing size and
shape of three different species whose populations existed under very different
environmental conditions (lab culture, freeliving domesticated, and wild) and genetic
structure (monoclonal, haplodiploid sisters, and unrelated). The similarities across the
very different species and rearing conditions used in this study suggest that the
distribution of size and shape FA may have universal parameters (e.g., T 0.35 for shape
FA and T 0.7 for size based FA). The data confirm that although size FA sometimes
exhibits reasonable fit to halfnormal in the upper tail, and shape FA is reasonably well fit
by lognormal distributions, large datasets of FA in both size and shape are always best
described by a double Pareto lognormal distribution (DPLN) or one of its limiting forms,
LPLN and RPLN. Multivariate shape FA demonstrates narrow distribution with a large
right tail, including the top few percent of the most extremely asymmetric individuals,
that is best fit by DPLN or RPLN. Both univariate and multivariate size FA exhibit a
considerably broader distribution with moderate leptokurtosis that is best fit by DPLN or
LPLN. The data suggest that the DPLN distribution and its limiting forms are generally
the most appropriate models for the distribution of FA regardless of method of
measurement.
Evidence that distribution of FA closely follows DPLN, a continuous mixture
model, and appearance of phenodeviance across nearly the entire range of data suggested
that developmental errors may be caused by a similar process across the entire
distribution of FA in a population. In other words, variation in FA may have a single
cause in most of the data. Although phenodeviance is significantly related to increased
levels of FA and is more prevalent in the right tail region of the shape FA distribution, it
does not appear associated exclusively with the right tail, as a threshold model for high
FA might predict. The very poor fits to all variations on the discrete mixture model also
suggest a lack of distinct processes creating extreme FA in the three datasets. However, I
caution that use of maximum likelihood methods to fit data to discrete mixtures is often
technically challenging. I found no block effects (e.g., no differences in FA levels
between longlegged fly samples collected on different weeks or aphid samples collected
from different pots or growth chambers), so it appears there is no obvious sample
heterogeneity that could result in a discrete mixture. With the usual caveats about
inferring process from pattern, I do not find obvious thresholds in the distribution of
asymmetries at the population level that would suggest threshold effects at a genetic or
molecular level. Lastly, based on the appearance of only one phenodeviant among our
wild trapped longlegged fly population (as opposed to many in the bee and lab reared
aphid populations), I speculate that mortality related to phenodeviance (and perhaps high
FA) in wing morphology may be relaxed in lab culture and domestication. But this could
be confounded by other differences between the three datasets including genetic
redundancy and species differences. Further comparisons among populations of single
species under different conditions would be needed to test this idea.
Sample Size and the Estimation of Mean FA
In random sampling under DPLN, we found that broad distribution effects due to
the shape parameter were minimal in their effect of slowing convergence to the
population mean in multivariate shapebased FA ('z 0.35 for all three datasets).
However, these effects are considerable for univariate and multivariate sizebased FA
(where T Z 0.70). The effects of scaling in the tails of the distribution, which cause
divergence from the mean, appear to have little effect in the right tail of the distribution
of shapebased FA. However, larger effects in the tails of the size FA create more
individuals with very low and very high asymmetry than expected under the assumption
of normality. The point estimates of the scaling exponents for size FA are close to the
range where very extreme values may be sampled under the distribution tails (if a < 3 or
p <3), greatly affecting confidence in the estimate of mean FA. With a sample size of 50
and the best fitting DPLN parameters typical for asymmetry in multivariate shape, we
found that the coefficient of variation for mean FA is about 5%, whereas sizebased mean
FA fluctuates about 13% from sample to sample. At a sample size of 100, these
coefficients of variation are 3.2% and 7.5% respectively. Unless experimental treatment
effects in most FA studies are larger than this, which is unlikely in studies using size
based measures of FA of more canalized traits, statistical power and repeatability will be
low. Given the sample size range of most previous studies (n = 30100) and their
tendency to favor sizebased measurement methods, our results suggest that many past
FA studies may be undersampled. Furthermore, it is also likely that given the small
sample sizes in many FA studies, particularly involving vertebrates where n < 50, the tail
regions of natural FA distributions are often severely under sampled and sometimes
truncated by the removal of outliers. These factors may artificially cause nonnormal
distributions to appear normal, also potentially resulting in inaccurate estimation of mean
FA.
The Basis of Fluctuating Asymmetry
The surprisingly good fit of FA distributions to the DPLN model in our study
suggests that the physical basis of FA may be created by the combination of random
effects in geometrically expanding populations of cells on either side of the axis of
symmetry (i.e., geometric Brownian motion). Studies in the Drosophila wing indicate
that cell lines generally compete to fill a prescribed space during development with more
rapidly dividing lines outcompeting weaker ones (Day and Lawrence 2000). Because
regulation of the growth of such cell populations involves either nutrients and/or
signaling substances that stop the cell cycle when exhausted, it is likely that the
distribution of numbers of cells present at the completion of growth follows an
exponential distribution. Reed and Jorgensen (2004) demonstrate that when a population
of repeated geometric random walks is "killed" at such a constant rate, the DPLN
distribution is the natural result. There are many other examples of growth processes in
econometrics and physics where random proportional change combined with random
stopping/observation create size distributions of the kind described here (Reed 2001;
Kozubowski and Podgorski 2002). In the future, when applying this model to instability
during biological growth, it would be very interesting to investigate how genetic and
environmental stress might affect the parameters of this model. If scaling effects are
found to vary with stress, then leptokurtosis may potentially be a better candidate signal
of developmental instability than increased mean FA.
Conclusion
Although sizebased FA distributions can sometimes appear to fit normal
distributions reasonably well as previous definitions of FA suppose, I demonstrate that
three large empirical datasets all support a new statistical model for the distribution of FA
(the double Pareto lognormal distribution), which potentially exhibits powerlaw scaling
in the tail regions and leading to uncertain estimation of true population mean at sample
sizes reported by most FA studies. The assumption of normality fails every time
candidate models are compared on large datasets. Failure of this assumption in many
datasets may have been a major source of discontinuity in results of past FA studies.
Future work should attempt to collect larger sample/treatment sizes (n = 500) unless the
magnitude of treatment effects on FA (and thus the statistical power of comparisons) is
very large. Our results demonstrate that multivariate shapebased methods (Klingenberg
and McIntyre 1998) result in more repeatable estimates of mean FA than either
multivariate or univariate sizebased methods. I would also recommend that
methodology be reexamined even in large sample studies of FA. For example, because
Drosophila are usually reared in many replicates of small tubes with less than 50 larvae
per tube, many large studies may still be compromised by individual sizes of replicate
samples. I also suggest that authors of past metaanalyses and reviews of FA literature
reassess their conclusions after excluding studies in which undersampling is found to be
problematic. Careful attention to distributional and sampling issues in FA studies has the
potential both to mitigate problems with repeatability and possibly to suggest some of the
underlying mechanisms driving variation in FA among individuals, populations, and
species.
CHAPTER 3
INBREEDING REDUCES POWERLAW SCALING IN THE DISTRIBUTION OF
FLUCTUATING ASYMMETRY: AN EXPLANATION OF THE BASIS OF
DEVELOPMENTAL INSTABILITY
Introduction
Fluctuating asymmetry (FA) is the average difference in size or shape of paired or
bilaterally symmetric morphological trait sampled across a population. The study of FA,
thought to be a measure of developmental instability, has a controversial history.
Fluctuating asymmetry is hypothesized by some to universally represent a population's
response to environmental and/or genetic stress (Parsons 1992, Clarke 1993, Graham
1992). It is also generally accepted that FA may be coopted as an indicator or even a
signal of individual genetic buffering capacity to environmental stress (Moller 1990,
Moller and Pomiankowski 1993). Recent literature reviews reveal that these conclusions
are perhaps premature and analyses of individual studies often demonstrate conflicting
results (Lens 2002, Bjorksten 2000). Babbitt et al. (2006) demonstrate that this conflict
may be caused by undersampling due to a false assumption that FA always exhibits a
normal distribution. Also, FA may be responding to experimental treatment in a complex
and as yet unpredictable fashion. Until the basis of FA is better understood, general
interpretation of FA studies remains difficult.
What is the Basis of FA?
Earlier studies of sexual selection and FA conclude that FA is ultimately a result
of strong selection against the regulation of the development of a particular morphology
(e.g., morphology used in sexual display). Thus in some instances FA may increase and
therefore become a better signal and/or more honest indicator of good genes. Sexually
selected traits tend to have increased FA (Moller and Swaddle 1997), however the exact
mechanism by which FA increases remains unexplained (i.e., a black box). More
recently, theoretical attempts have been made to explain how FA may be generated but
none have been explicitly tested. Models for the phenomenological basis of FA fall into
two general categories: reactiondiffusion models and diffusionthreshold models
(reviewed in Klingenberg 2003). The former class of models involves the chaotic and
nonlinear dynamics in the regulation or negative feedback among neighboring cells
(Emlen et al. 1993) or adjacent bilateral morphology (Graham et al. 1993). The latter
class of models combines morphogen diffusion and a threshold response, the parameters
of which are controlled by hypothetical genes and a small amount of random
developmental noise (Klingenberg and Nijhout 1999). The result of this latter class of
model is that different genotypes respond differently to the same amount of noise,
providing an explanation for genetic variation in FA response to the same environments.
Traditionally, models for the basis of FA assume that variation in FA arises from
independent stochastic events that influence the regulation of growth through negative
feedback rather than processes that may fuel or promote growth. None of these models
explicitly or mathematically address the effect of stochastic behavior in cell cycling on
the exponential growth curve. More recently Graham et al. (2003) makes a compelling
argument that fluctuating asymmetry often results from multiplicative errors during
growth. This is consistent with one particular detail about how cells behave during
growth. For several decades there has been evidence that during development, cells
actually compete to fill prescribed space until limiting nutrients or growth signals are
depleted (Diaz and Moreno 2005, Day and Lawrence 2000). Cell populations effectively
double each generation until signaled or forced to stop. It has also been observed that in
Drosophila wing disc development, that synchrony in cell cycling does not occur across
large tissue fields but rather extends only to an average cluster of 48 neighboring cells
regardless of the size and stage of development of the imaginal disc (Milan et al. 1995).
The assumption of previous models, particularly the reactiondiffusion type, that cell
populations are collectively controlling their cell cycling rates across a whole
developmental compartment is probably unrealistic. Regulatory control of fluctuating
asymmetry almost certainly does occur, but probably at a higher level involving multiple
developmental compartments where competing cells are prevented from crossing
boundaries. However, given that individual cells are behaving more or less
autonomously during growth within a single developmental compartment, I suggest that
variation in fluctuating asymmetry can be easily generated at this level by a process
related to stochastic exponential expansion and its termination in addition to regulatory
interactions that probably act at higher levels in the organism. In this paper, I explore and
test simple model predictions regarding the generation of fluctuating asymmetry though
multiplicative error without regulatory feedback.
Exponential Growth and NonNormal Distribution of FA
In previous work, Babbitt et al. (2006) demonstrate that the distribution of unsigned
FA best fits a lognormal distribution with scaled or powerlaw tails (double Pareto
lognormal distribution or DPLN). This distribution can be generated by random
proportional (exponential) growth (or geometric Brownian motion) that is stopped or
observed randomly according to a negative exponential probability (Reed and Jorgensen
2004). I suggest that this source of powerlaw scaling in the tails of the FA distribution is
also the cause of leptokurtosis that is often observed empirically in the distribution of FA.
Kurtosis is the value of the standardized fourth central moment. Like the other moments,
(location, scale, and skewness), kurtosis is best viewed as a concept that can be
formalized in multiple ways (Mosteller and Tukey 1977). Leptokurtosis is best
visualized as the location and scalefree movement of probability mass from the
shoulders of a symmetric distribution towards both its center and tail (Balanda and
MacGillivray 1988). Both the Pareto and powerfunction distributions have shapes
characterized by the powerlaw and a large tail and therefore exhibit a lack of
characteristic scale. Both because kurtosis is strongly affected by tail behavior, and
because leptokurtosis involves a diminishing of characteristic scale in the shape of a
distribution, the concepts of scaling and kurtosis in real data can be, but are not
necessarily always, interrelated.
Both in the past and very recently, leptokurtosis in the distribution of FA has been
attributed to a mixture of normal FA distributions caused by a mixing of individuals, all
with different geneticallybased developmental buffering capacity, or in other words,
different propensity for expressing FA (Gangestad and Thornhill 1999, Palmer and
Strobek 2003, Van Dongen et al. 2005). Although not noted by these authors, continuous
mixtures of normal distributions generate the Laplace distribution (Kotz et al. 2001,
Kozubowski and Podgorski 2001) and can be distinguished from other potential
candidate distributions by using appropriate model selection techniques, such as the
Akaike Information Criterion technique (Burnham and Anderson 1998). Graham et al.
(2003) rejects the typical explanation of leptokurtosis through the mixing of normal
distributions by noting that differences in random lognormal variable can generate
leptokurtosis. Babbitt et al. (2006) also reject the explanation that leptokurtosis in the
distribution of FA is caused by a mixture of normal distributions because the double
Paretolognormal distribution, not the Laplace distribution, always appears the better fit
to large samples of FA. Therefore, leptokurtosis, often observed in the distribution of
FA, may not be due to a mixing process, but instead may be an artifact of scaling in the
distribution tail, which provides evidence of geometric Brownian motion during
exponential expansion of populations of cells.
Testing a Model for the Basis of FA
I propose that the proximate basis for variation of FA in a population of organisms
is due to the random termination of stochastic geometric growth. The combination of
opposing stochastic exponential functions results in the slow powerlaw decay that
describes the shape of the distribution's tail. In this paper, I present a model for FA and
through simulation, test the prediction that genetic variation in the ability to precisely
terminate growth will lead to increased kurtosis and decreased scaling exponent in the
upper tail of the distribution. Then, I assess the validity of this model by direct
comparison to the distribution of FA within large samples of wild and inbred populations
of Drosophila. Under the assumption that a less heterozygous population will have less
variance in the termination of growth, I would predict that inbreeding should act to
reduce powerlaw scaling effects in the distribution of FA in a population. Inbreeding
should also reduce the tail weight kurtosiss) and mean FA assuming inbred individuals
have lower variance in the times at which they terminate growth. I also assume that
inbreeding within specific lines does not act to amplify FA due to inbreeding depression.
It has been demonstrated that Drosophila melanogaster do not increase mean FA in
response to inbreeding (Fowler and Whitlock 1994) and it is suggested that large
panmictic populations typical of Drosophila melanogaster may not harbor as many
hidden deleterious recessive mutations as other species (Houle 1989), making them
resistant to much of the typical genetic stress of inbreeding. Therefore, the absence of
specific gene effects during inbreeding suggests that Drosophila may be a good model
for investigating the validity of our model as an explanation of the natural variation
occurring in population level FA.
Methods
Model Development
Simulation of geometric Brownian motion
Ordinary Brownian motion is most easily simulated by summing independent
Gaussian distributed random numbers or white noise (X,). See Figure 31.
(3.1) W(X)= X,
i=1
which simulated in discrete steps is
(3.2) N,= Nt, + 1
where N = cell population size, t = time step and W = a random Gaussian variable.
Exponential or geometric Brownian motion, a random walk on a natural log scale, can be
similarly simulated. Geometric Brownian motion is described by the stochastic
differential equation
(3.3) dY(t) = pY(t)dt + vY(t)dW(t)
or also as
(3.4) d =(t) udt + udW(t)
Y(t)
WHITE NOISE
ORDINARY BROWNIAN MOTION
10050 If
1000
600
N 200
47 70 93 116139162185 08 231 2542773003233466 43
400
600
time
Figure 31. Ordinary Brownian motion (lower panel) in N simulated by summing
independent uniform random variables (W) (upper panel).
where W(t) is a Brownian motion (or Weiner process) and / and v are constants that
represent drift and volatility respectively. Equation 3.3 has a lognormal analytic solution
(3.5) Y(t) = Y(O)e(' 12)t+vW(t)
A simulation of geometric Brownian motion in discrete form follows as
(3.6) Nt = Nt,1 tW,
where N = cell population size, t = time step and W = a random Gaussian variable. See
Figure 32. Equation 3.6 is identical to the equation for multiplicative error in Graham et
al. (2003). I modify equation 3.6 slightly by letting Wrange uniformly from 0.0 to 1.0
with W= 0.5 and adding the drift constant C that allows for stochastic upward drift (at C
> 0.5) or downward drift (at C < 0.5).
(3.7) N = CN1 + NtlW
= (C + W,)N,,
At C = 0.5, eqns. 3.6 and 3.7 behave identically. Geometric Brownian motion with
upward drift is shown in Figure 33.
0.00045
0.0004
0.00035
0.0003
0.00025
0.0002
0.00015
0.0001
0.00005
0
35 69 103 137 171 205 239 273 307 341 375 409 443 477
time step
Log N
1
0.1
0.01
0.001
0.0001
0.00001
0.000001
0.0000001
time step
time
Figure 32. Geometric Brownian motion in N and log N simulated by multiplying
independent uniform random variables. This was generated using Equation
1.5 with C = 0.54.
36 71 106 141 176211 246 281 316 351 386421 456
j___A__ War_
AA VW



44
4500000000
4000000000
3500000000
3000000000
2500000000
N 2000000000
N
1500000000
1000000000
500000000
0
1 37 73 109 145 181 217 253 289 325 361 397 433 469
time step
1E+11
1E+09
1000000
0
100000
Log N 1000
10 
01 34 133 166 199 232 265 298 331 364 397 430 463
0 001
000001
time step
time
Figure 33. Geometric Brownian motion in N and log N with upward drift. This was
generated using Equation 1.5 with C = 0.60.
Simulation of fluctuating asymmetry
Using the MathCad 13 (Mathsoft Engineering and Education 2005), two
independent geometric random walks were performed and stopped randomly at mean
time t= 200 steps with some variable normal probability. The random walks result in cell
population size equal to N, (orNf and NI on left and right sides respectively).
Fluctuating asymmetry (FA) was defined as the difference in size resulting from this
random proportional growth on both sides of the bodies axis of symmetry plus a small
degree of random uniformly distributed noise or
(3.8) FA= NL N +rv
where (rv) was uniformly distributed with a range of +0.1 ( N N ). Using a
MathCadbased simulation in VisSim LE, the generation of individual FA values was
repeated until a sample size of 5000 was reached. The random noise (rv) has no effect on
the shape of the FA distribution but fills empty bins (gaps) in distribution tails. Because
rv is small in comparison to N NJ, its effect is similar to that of measurement error
(which would be normally distributed rather than uniform). Schematic representation of
the simulation process for Reed and Jorgensen (2004) model and simulation of
fluctuating asymmetry are shown in Figure 4 and 5. The simulation of the distribution of
fluctuating asymmetry was compared at normal standard deviation of termination of
growth (t) ranging from a = 0.5, 0.8, 1.2,3 and 7 with a drift constant ofC = 0.7.
Inbreeding Experiment
In May 2004 in Gainesville, Florida, 320 freeliving Drosophila simulans were
collected in banana baited traps and put into a large glass jar and cultured on instant rice
meal and brewer's yeast. After two generational cycles 1000 individuals were collected
in alcohol. This was repeated again in June 2005 with 200 wild trapped flies.
Lines of inbred flies were created from the May 2004 wild population through eight
generations of full sib crosses removing an estimated 75% of the preexisting
heterozygosity (after Crow and Kimura 1970). Initially, ten individual pairs were
isolated from the stock culture and mated in 1/2 pint mason jars with media and capped
with coffee filters. In each generation, and in each line, and to ensure that inbred lines
were not accidentally lost though an inviable pairing, four pairs of Fl sibs from each
cross were then mated in 12 pint jars. Offspring from one of these four crosses were
randomly selected to set up the next generation. Of the original ten lines, only four
remained viable after eight generations of full sib crossing. These remaining lines were
allowed to increase to 1000+ individuals in 1 quart mason jars and then were collected
for analysis in 85% ethanol. This generally took about 4 generations (8 weeks) of open
breeding. One completely isogenic (balancer) line of Drosophila melanogaster was
obtained from Dr. Marta Wayne, University of Florida and also propagated and collected.
0 10 20 30 40 50 60 70 80 90 100
I001
1 DOUBLE PARETO LOGNORMAL DISTRIBUTIONITH
NEGATIVE EXPONENTI AL REPEAT
VARIATION IN GROWTH SAMPLING
0 10 20 30 40 50 60 70 80 90 100
STOP TIME
T SO DOUBLE PARETO LOGNORMAL DISTRIBUTION
'0 100
o VARIATION IN GO 00
Figure 34. Model representation of Reed and Jorgensen's (2004) physical size
distribution model. Variable negative exponentially distributed stopping
Times of random proportional growth (GBM with C = 0.5) create double
Pareto lognormal distribution of size.
Stochastic Geometric Growth
RANDOM WALKS WITH
GEOMETRIC BROWNIAN
MOTION
GROWTH STOP TIME
SIZE DISTRIBUTION
I_ \ 1
REPEAT
NORMAL VARIATION SAMPLING
IN GROWTH STOP TIME
0 50 100 150
Figure 35. A model representation of developmental instability. Normally variable
stopping times of random proportional growth (GBM with C > 0.5) create
double Pareto lognormal distribution of size.
All flies were cultured on rice meal and yeast at 290C with 12:12 LD cycle in
environmentally controlled chambers at the Department of Entomology and Nematology
at the University of Florida. Wings were collected and removed from 1000 flies from
each of the two samples of wild population and four samples of inbred lines and dry
mounted on microscope slides. Specimens were dried in 85% ethanol, and then pairs of
wings were dissected (in ethanol) and airdried to the glass slides. Permount was used to
attach cover slips. This technique prevented wings from floating up during mounting,
which might slightly distort the landmark configuration. Dry mounts were digitally
photographed. All landmarks were identified as wing vein intersections on the digital
images (eight landmarks on each wing). See Appendix A for landmark locations.
Morphometric analyses
Wing vein intersections were digitized on all specimens using TPSDIG version
1.31 (Rohlf, 1999). Specimens damaged at or near any landmarks were discarded.
Fluctuating asymmetry was measured in two ways on all specimens using landmark
based multivariate geometric morphometrics. A multivariate sizebased FA (FA 1 in
Palmer and Strobek 2003) was calculated as absolute value of(R L) or just RL in
signed FA distributions where R and L are the centroid sizes of each wing (i.e., the sum
of the distances of each landmark to their combined center of mass or centroid location).
In addition, a multivariate shapebased measure of FA known as the Procrustes distance
was calculated as the square root of the sum of all squared Euclidean distances between
each left and right landmark after twodimensional Procrustes fitting of the data
(Bookstein 1991; Klingenberg and McIntyre 1998; FA 18 in Palmer and Strobeck 2003;
Smith et al. 1997). Centroid size calculation, Euclidean distance calculation and
Procrustes fitting were performed using 0yvind Hammer's Paleontological Statistics
program PAST version 0.98 (Hammer 2002). A subsample of 50 individuals from the
fourth inbred line (pp4B3) was digitized five times to estimate measurement error. In
these cases, measures of FA were taken as the average FA value of the five replicate
measurements for each specimen. Percent measurement error was also computed as
(ME/average FA) x 100 whereME = sd(FA, FA2, FA3, FA4, FA5) (as per Palmer and
Strobek 2003). For assessing whether measurement error (ME) interfered significantly
with FA, a Procrustes ANOVA (in Microsoft Excel) was performed on the five
replications of the 50 specimen subsample (Klingenberg and McIntyre 1998). Any
subsequent statistical analyses were performed using SPSS Base 8.0 statistical software
(SPSS Inc.).
Model selection and inference
The fits of unsigned size FA to three distributional models (halfnormal, lognormal,
and double Pareto lognormal (DPLN)) were compared in the Drosophila lines, by
calculating negative log likelihood and Akaike Information Criteria (AIC) (Burnham
and Anderson 1998; Hilborn and Mangel 1998). This method penalizes more complex
models (those with more parameters) when selecting the bestfit distributional model for
a given dataset. Best fitting parameters were obtained by maximizing the loglikelihood
function for each model (Appendix B). The maximization was performed using the
conjugate gradient method within unconstrained solve blocks in the program MathCAD
by MathSoft Engineering and Education Inc (2001).
Results
Model Simulation
The amount of variance in termination times related directly to levels of simulated
FA (i.e., low variance in termination time (t) gives low FA and vice versa). I found that
not only does amount of FA increase with increased variance in (t), but so do both
kurtosis and the scaling effect in the distribution tails. In Figure 36, normal quantile
plots of signed FA are shown for different standard deviations in the normal variation of
the termination of growth of geometrically expanding cell populations. The degree of the
50
S or sigmoidal shape in the plot indicates level ofleptokurtosis. The leptokurtosis in the
quantile plot is reduced greatly with a decrease in the standard deviation of the normal
variation in growth termination times.
S= 0.5
kurtosis
o =0.8
kurtosis
 =1.2
kurtosis
0.634
0.933
=1.149
o =3.0
kurtosis = 11.812
r = 7.0
kurtosis
:27.266
SIGNED DISTRIBUTION OF
FLUCTUATING ASYMMETRY
4
7at
a
fl
th
DETRENDED FIT
TO NORMAL
Figure 36. Simulated distributions of cell population size and FA for different amounts
of variation in the termination of growth (variance in normally distributed
growth stop time). The fit of simulated data to the normal distribution can
determined by how closely the plotted points follow the horizontal line (a
good fit is horizontal).
I
I
Experimental Results
Both mean unsigned size FA and shape FA decreased with inbreeding in all lines.
I also observed that the kurtosis of signed size FA and the skewness of both unsigned size
and shape FA follow an identical trend. The trend was strongest in kurtosis, which
decreased rapidly with inbreeding, indicating that, as predicted, changes in mean FA are
influenced strongly by the shape and tail behavior of the distribution of FA (Table 31).
Table 31. Distribution parameters and model fit for multivariate FA in two wild
populations and four inbred lines of Drosophila simulans and one isogenic
line of Drosophila melanogaster. Model fits are A AIC for unsigned centroid
size FA (zero is best fit, lowest number is next best fit).
Kurtosis Mean Mean Scaling
(signed (unsigned (shape FA) exponent AAIC AAIC AAIC
size FA) size FA) in upper HNORM LGN DPLN
tail
Wild 2004 28.3 5.73 0.0223 1.49 90.6 324 0.000
+0.12 0.38 0.0003
Wild 2005 45.9 4.68 0.0237 2.87 66.2 261 0.000
+0.15 0.19 0.0003
Inbred line 2.43 3.27 0.0183 5.09 0.000 104 15.5
1 (OR18D) 0.11 0.09 0.0002
Inbred line 8.85 4.24 0.0212 3.10 11.50 223 0.000
2(OR18D3) 0.15 0.13 0.0002
Inbred line 2.90 3.31 0.0184 4.63 0.000 130 44.0
3 (PP4B2) 0.15 0.09 0.0002
Inbred line 3.30 3.72 0.0213 5.78 0.000 228 79.7
4 (PP4B3) 0.15 0.11 0.0003
Isogenic 0.550 3.90 0.0201 9.99 0.000 84.9 101
(me175) 0.11 0.10 0.0002
While wild populations on the whole, demonstrated increased mean, skew and
leptokurtosis of FA compared to inbred lines, significant differences between populations
were found in all distributional parameters, even between each of the inbred lines
(ANOVA mean size FA for all lines; F = 27.49, p < 0.001; ANOVA mean size FA for
inbred lines only; F = 14.38, p < 0.001; note shape FA also show same result). Overall,
inbred lines demonstrated lower kurtosis, just slightly above that expected from a normal
distribution (Table 1). They also had lower mean FA. No significant differences were
found with respect to these results according to sexes of flies. In Figure 37, I show the
distribution of FA and detrended normal quantile plots for the wild population, four
inbred lines and one isogenic line respectively.
As in the simulated data, the degree of the S shape in the plot indicated level of
leptokurtosis. The S shape in quantile plot is reduced greatly with inbreeding and nearly
disappears in the isogenic line.
Model Selection and Inference
The comparison of candidate distributional models of FA demonstrated normalization
associated with inbreeding in three of the four inbred lines (Table 31). In the remaining
inbred line, the halfnormal candidate model was a close second to the double Pareto
lognormal distribution. In the wild population samples, the distribution of unsigned size
based FA was best described by the double Pareto lognormal distribution (DPLN), a
lognormal distribution with scaling in both tails. In the best fitting parameters of this
distribution there was no observable trend in lognormal mean or variance across wild
populations and inbred lines. The scaling exponent of the lower tail (P in Reed and
Jorgensen 2004) was close to one in all lines while the scaling exponent in the upper tail
(a in Reed and Jorgensen 2004) increased with inbreeding (Table 31). Both samples of
the wild population demonstrated the lowest scaling exponent in the upper tail of the
distribution. The low scaling exponents here indicate divergence in variance (2004 and
2005 where a < 3) and in the mean (2004 only where a < 2). The inbred lines all show
higher scaling exponents that are consistent with converging mean and variance.
wild 2004
kurtosis = 28.25
wild 2005
kurtos =45.85
inbred OR18D
kurtosis =2.43
inbred OR18D3
kurtosis =8.85
inbred PP4B2
kurtosis = 2.90
kurtosis =3.30
isogenic mel 75
kurtosis =0.55
aP
II
A
r
a
U,\
[r
eat ai
Figure 37. Distribution of fluctuating asymmetry and detrended fit to normal for two
samples of wild population collected in Gainesville, FL in summers of 2004
and 2005 and four inbred lines of Drosophila simulans derived from eight
generations of fullsib crossing of the wild population of 2004. Also included
is one isogenic line of Drosophila melanogaster (me175). All n = 1000. The
fit of data to the normal distribution can determined by how closely the
plotted points follow the horizontal line (a good fit is horizontal).
a
I
~~
"
The isogenic Drosophila melanogaster line demonstrated the highest scaling exponent
and the most normalized distribution of unsigned FA.
Measurement error was 7.6% for shapebased FA and 13.0% for centroid size
based FA. In a Procrustes ANOVA (Klingenberg and McIntyre 1998) the mean squares
for the interaction term of the ANOVA (MSinteraction) was highly significant p<0.001
indicating that FA variation was significantly larger than variation due to measurement
error (ME).
Discussion
Revealing the Genetic Component of FA
While we should be cautious about inferring process from pattern, the very similar
results of both the modeling and the inbreeding experiment in Drosophila seem to
suggest the presence of a scaling component in the distribution of fluctuating asymmetry
that is caused by a random multiplicative growth process as suggested previously by
Graham et al. 2003. This parameter appears to change with the genetic redundancy of the
population which is presumably increased by genetic drift and and reduction in
heterozygosity during inbreeding. Specifically, the scaling exponent(s) of the upper tail
(a) of the unsigned FA distribution, or outer tails of the signed FA distribution, are
increased with inbreeding, causing more rapid powerlaw decay in the shape of the tails.
This effect also reduces kurtosis and apparently normalizes the distribution of FA in more
inbred populations. This suggests that individual genetic differences in the capacity to
control variance in the termination of random proportional growth (i.e., geometric
Brownian motion) may be responsible for determining the shape and kurtosis of the
distribution of FA. In other words, leptokurtosis kurtosiss >3) in signed FA distribution
indicates genetic variability in the population while normality kurtosiss = 3) indicates
genetic redundancy.
Because leptokurtosis is very often observed in the distribution of FA, genetic
variability potentially underlies a large proportion of the variability observed in the FA of
a given population. Observed differences or changes in FA are therefore not only a
response of development to environmental stress, but clearly also can reflect inherent
differences in the genetic redundancy of populations. The significantly different levels of
mean FA among the inbred lines in this study, presumably caused by the random fixation
of certain alleles, also suggests that there is a strong genetic component to the ability to
buffer development against random noise. It is assumed that the differences observed in
FA between wild trapped and inbred populations in this study do not indicate an effect of
inbreeding depression in the study for two reasons. First, the four inbred lines analyzed
in this study were vigorous in culture so the fixation of random alleles was probably not
deleterious. Second, and more important, inbreeding reduced FA rather than increasing it
as would have been expected under genetic stress.
It is also important to note that kurtosis is potentially a much stronger indicator of
FA than the distribution mean. The low scaling exponents found in the nonnormal
distributions of FA in the wild populations of Drosophila simulans are capable of
slowing and perhaps even stopping, the convergence on mean FA with increased sample
size. Because kurtosis is a fourth order moment, estimating it accurately also requires
larger sample sizes. However, if kurtosis can be demonstrated to respond as strongly to
environmental stress as it does here to inbreeding, its potential strength as a signal of FA
may allow new interpretation of past studies of FA without the collection of more data.
This may help resolve some of the current debate regarding FA as a universal indicator of
environmental health and as a potential sexual signal in "good genes" models of sexual
selection.
Limitations of the Model
There are certain aspects of the model presented here that may be
oversimplifications of the real developmental process. First of all, this model assumes
developmental instability is generated leftright growing tissue fields with no regulatory
feedback or control other than when growth is stopped. It is very likely that leftright
regulation is able to occur at higher levels of organization (e.g., across multiple
developmental compartments) even though there is no evidence of regulated cell cycling
rates beyond the distance of 68 cells on average within any given developmental
compartment (Milan et al. 1995). Therefore this model explains how fluctuating
asymmetry is generated, not how it is regulated. Second, this model only considers cell
proliferation as influencing size. It is known that both cell size and programmed cell
death, or apoptosis, are also important in regulating body size (Raff 1992, Conlin and
Raff 1999). Both of these may play a more prominent role in vertebrate development,
than they do in insect wings, where apparently growth is terminated during its
exponential phase. Nevertheless, this simple model seems to replicate very well, certain
behavioral aspects of the distribution of fluctuating asymmetry in the insect wing.
The Sources of Scaling
The basic process generating powerlaw scaling effects illustrated here (Reed 2001)
offers an alternative perspective to the often narrow explanations of powerlaws caused
by selforganized criticality in the interactions among system components (Gisiger 2000).
The power law that results from selforganized criticality is created from the multiplicity
of interaction paths in the network. As the distance between two interacting objects is
increased in a network or multidimensional lattice, the number of potential interaction
pathways increases exponentially, while the correlation between such paths decreases
exponentially (Stanley 1995). These opposing exponential relationships create the power
law scaling observed in simulations of selforganized critical systems. While there is no
such "interaction" in our model, there is a powerlaw generated by opposing exponential
functions. The constant degree of change represented by the power law in both the
statistical physics of critical systems and the mathematics of both Reed and Jorgensen's
process and the model given here is the result of the combined battle between both the
exponentially increasing and decreasing rates of change (Reed 2001). While the natural
processes are quite different, the underlying mathematical behavior is very similar.
Potential Application to Cancer Screening
Just as changes in the shape of the distribution of fluctuating asymmetry is
normalized across a population of genetically redundant individuals, genetic redundancy
in a population of cells may also help maintain normal cell size and appearance. The loss
of genetic redundancy in a tissue is a hallmark of cancer. The abnormal gene expression
and consequent genetic instability that characterizes cancerous tissue often results in
asymmetric morphology in cells, tissues and tissue borders. Baish and Jain (2000)
review the many studies connecting fractal (scale free) geometry to the morphology of
cancer. Cancer cells also are typically pleomorphic or more variable in size and shape
than normal cells and this pleomorphy is associated with intercellular differences in the
amount of genetic material (Ruddon 1995). Frigyesyi et al. (2003) have demonstrated a
powerlaw distribution of chromosomal aberrations in cancer. Currently the type of
distributional shape of the pleomorphic variability in cell size is not known, or at least not
published. However, Mendes et al. (2001) demonstrated that cluster size distributions of
HN5 (cancer) cell aggregates in culture followed a powerlaw scaled distribution.
Furthermore these authors also demonstrated that in MDCK (normal) cells and Hep2
(cancer) cells, cluster size distributions transitioned from shorttailed exponential
distributions to longtailed powerlaw distributions over time. The transition is
irreversible and is likely an adaptive response to high density and long permanence in
culture due to changes in either control of replication or perhaps cell signaling. Taken
collectively, these studies may suggest that scaling at higher levels of biological
organization observed in cancer is due to increased relative differences in length of cell
cycling rates of highly pleomorphic cell populations that have relatively larger
intercellular differences in amount of genetic material. The stochastic growth model I
have proposed as the basis for higher variability in population level developmental
instability or FA may also provide a possible explanation for higher variability in the cell
sizes of cancerous tissue. If genetic redundancy in growing tissue has the same
distributional effect as genetic redundancy in populations of organisms and tends to
normalize the observed statistical distribution, then one might predict that the genetic
instability of cancerous tissue would create a scaling effect that causes pleomorphy in
cells and scaling in cell cluster aggregations. Statistical comparison of cell and cell
cluster size distributions in normal and cancerous tissues may provide a useful and
general screening technique for detecting when genetic redundancy is compromised by
cancer in normal tissues.
Conclusion
Until now, the basis of fluctuating asymmetry has been addressed only with
abstract models of hypothetical cell signaling, or elsewhere, at the level of selection
working on the organism with potential mechanism remaining in the black box.
However, fluctuating asymmetry must first and foremost be envisioned as a stochastic
process occurring during tissue growth, or in other words, occurring in an exponentially
expanding population of cells. As demonstrated in the model presented here, this
expansion process can be represented by stochastic proportional (geometric) growth that
is terminated or observed randomly over time. These results imply that the fluctuating
asymmetry observed in populations is not only related to potential environmental
stressors, but also to a large degree, the underlying genetic variability in those molecular
processes that control the termination of growth. Therefore, fluctuating asymmetry
responses to stress may be hard to interpret without controlling for genetic redundancy in
the population. Both the simulation and experimental results suggest that measures of
distributional shape like kurtosis, scaling exponent and tail weight may actually be a
strong signal of variability in the underlying process that causes developmental
instability. Therefore the kurtosis parameter of the fluctuating asymmetry distribution
may provide more information about fluctuating asymmetry response than does a
populations' average or mean fluctuating asymmetry. This may provide a novel method
by which to resolve conflicts in previous undersampled research without the collection
of more data.
CHAPTER 4
TEMPERATURE RESPONSE OF FLUCTUATING ASYMMETRY TO IN AN APHID
CLONE: A PROPOSAL FOR DETECTING SEXUAL SELECTION ON
DEVELOPMENTAL INSTABILITY
Introduction
Developmental instability is a potentially maladaptive component of individual
phenotypic variation with some unknown basis in both gene and environment (Moller
and Swaddle 1997, Fuller and Houle 2003). Developmental instability is most often
measured by the manifestation of fluctuating asymmetry (FA), the right minus left side
difference in size or shape in a single trait across the population (Palmer and Strobek
1986, 2003; Parsons 1992, Klingenberg and McIntyre 1998). Because FA is thought to
indicate stress during development, the primary interest in the study of FA has been its
potential utility as an indicator of good genes in mate choice (Moller 1990, Moller and
Pomiankowski 1993) or its utility as a general bioindicator of environmental health
(Parsons 1992).
The Genetic Basis of FA
For FA to become a sexually selected trait, it must be assumed that it has a
significant genetic basis, and can therefore evolve (Moller 1990, Moller and
Pomiankowski 1993). However, researchers who use FA as a bioindicator of
environmental health often assume that FA is a phenotypic response that is mostly
environmental in origin (Parsons 1992, Lens et al. 2002). Studies of the heritability of
FA seem to indicate low heritability for FA exists (Whitlock and Fowler 1997, Gangestad
and Thornhill 1999). However, because FA is essentially a variance that is often
measured with only two data points per individual, FA may have a stronger but less
easily detectable genetic basis (Whitlock 1996, Fuller and Houle 2003). Additive genetic
variation in FA in most studies has been found to be minimal, but several quantitative
trait loci studies suggest significant dominance and character specific epistatic influences
on FA (Leamy 2003, Leamy and Klingenberg 2005). Babbitt (chapter 3) has
demonstrated that a population's genetic variability affects the distributional shape of FA.
So while studies investigating mean FA may be inconclusive, changes in the population's
distributional shape seem to indicate potential genetic influence on FA. However, no
studies have observed FA in a clonal organism for the express purpose of assessing
developmental instability that is purely environmental (i.e., nongenetic) in its origin (i.e.,
developmental noise).
The Environmental Basis of FA
It has long been assumed that FA is the result of some level of geneticallybased
buffering of additive independent molecular noise during development. Because of the
difference in scale between the size of molecules and growing cells it would be unlikely
that molecular noise would comprise an important source of variation in functioning
cells. However, Leamy and Klingenberg (2005) rationalize that molecular noise could
only scale to the level of tissue when developmentally important molecules exist in very
small quantity (e.g., DNA or protein) and therefore FA may represent a stochastic
component of gene expression. This is somewhat similar in spirit to the Emlen et al.
(1993) explanation of FA that invokes nonlinear dynamics of signaling and supply that
may also occur during growth. Here FA is thought to be the result of the scaling up of
compounding temporal asymmetries in signaling between cells during growth. In this
model, hypothetical levels of signaling compounds (morphogens) and or growth
precursors used in the construction of cells vary randomly over time. When growth
suffers less interruption, thus when it occurs faster, there is also less complexity (or
fractal dimension) in the dynamics of signaling and supply. Graham et al. (1993) suggest
that nonlinear dynamics of hormonal signaling across the whole body may also play a
similarly important role in the manifestation of FA. However, while the levels of FA are
certainly influenced by the regulation of the growth process, both Graham et al. (2003)
and Babbitt (chapter 3) also suggest that FA levels reflect noise during cell cycling that is
amplified by exponentially expanding populations of growing cells.
Although the proximate basis of FA is not well understood, its ultimate
evolutionary basis, while heavily debated, is easier to understand. Moller and
Pomiankowski (1993) first suggested that strong natural or sexual selection can remove
regulatory steps controlling the symmetric development of certain traits (e.g.,
morphology used in sexual display). They suggest that with respect to these traits (and
assuming that they are somehow costly to produce), individuals may vary in their ability
to buffer against environmental stress and developmental noise in relation to the size of
their individual energetic reserves; which are in turn often indicative of individual genetic
quality. Therefore high genetic quality is associated with low FA.
Most existing proximate or growth mechanical explanations of FA assume that
rapid growth is less stressful in the sense that fewer interruptions of growth by various
types of noise should result in lower FA (Emlen et al. 1993, Graham et al. 1993).
However, ultimate evolutionary explanations of FA assume that rapid growth is
potentially more stressful because it is energetically costly and therefore rapid growth
should increase the level of FA when energy supply is limiting (Moller and
Pomiankowski 1993). This high FA is relative and so should be especially prominent in
individuals of lower genetic quality who can least afford to pay this additional energy
cost. This fundamental difference between the predictions of proximate (or mechanistic)
level and ultimate evolutionary level effects of temperature on FA is shown in Figure 1.
The theoretical difference in the correlation of temperature and growth rate to FA in both
the presence and absence of energetic limitation could be used to potentially detect sexual
selection on FA. However it first should be confirmed that FA should decrease with
more rapid growth in the absence of energetic limitation to growth and genetic
differences between individuals in a population. This later objective is the primary goal
of this study.
Temperature and FA in and Aphid Clone
At a very basic level, entropy or noise in physical and chemical systems has a
direct relationship with the physical energy present in the system. This energy is
measured by temperature. Because FA is speculated to tap into biological variation that
is somewhat free of direct genetic control, it may therefore respond to temperature in
simple ways.
First, increased temperature may increase molecular entropy which may in turn
increase developmental noise during development thereby increasing FA. Second,
increased temperature may act as a cue to shorten development time (as in many aphids
where higher temperature reduces both body size and development time), thereby
reducing the total time in which developmental errors may occur. This should reduce
FA. A third possibility is that a species specific optimal temperature exists. If so, FA
should increase while approaching both the upper and lower thermal tolerance limits of
organisms.
PROXIMATE LEVEL EFFECT
DEVELOPMENTAL +
TIME
TEMPERATURE FLUCTUATING
ASYMMETRY
GROWTH RATE
ULTIMATE LEVEL EFFECT
+
STRESS DEVELOPMENTAL
+ \^ / TIME
NONOPTIMAL  FLUCTUATING
TEMPERATURE ASYMMETRY
ENERGETIC G R
RESERVES GROWTH RATE
+
Figure 41. Predicted proximate and ultimate level correlations of temperature and
growth rate to fluctuating asymmetry are different. Ultimate level
(evolutionary) effects assume energetic limitation of individuals in the system.
Proximate level (growth mechanical) effects do not. Notice that temperature
and fluctuating asymmetry are negatively correlated in the proximate model
while in the ultimate model they are positively correlated.
Only a few studies have directly investigated the relationship between FA and
temperature. The results are conflicting. FA is either found to increase on both sides of
an "optimal" temperature (Trotta et al. 2005, Zakarov and Shchepotkin 1995), to be
highest at low temperature (Chapman and Goulson 2000), to simply increase with
increasing temperature (Savage and Hogarth 1999, Mpho et al. 2002) or not to respond
(Hogg et al. 2001). None of these studies investigate the relationship between
developmental noise (FA in a clonal line) and temperature.
The characterization of developmental noise in response to temperature was
investigated in this study using the cotton aphid, Aphis gossipyii. These aphids reproduce
parthenogenetically, are not energetically limited in their diet (because they excrete
excess water and sugar as "honeydew") and produce wings that are easily measured using
multiple landmarks. They demonstrate large visible variation in body size, wing size and
even wing FA. The visible levels of wing asymmetry in cotton aphids reflect levels of
FA that are about four times higher than that observed in other insect wings (Babbitt et al.
2006, Babbitt in press). Because parthenogenetic aphids cannot purge deleterious
mutations each generation and because Florida clones often never use sexual
reproduction to produce overwintering eggs, this remarkably high FA may be the result
of Muller's ratchet. Cotton aphids are also phenotypically plastic in response to
temperature, producing smaller lighter morphs at high temperatures and larger dark
morphs at low temperatures. This feature allows observation of two genetically
homogeneous groups in which different gene expression patterns (causing the color
morphs) exist. The central prediction is that of the proximate effect model: that in the
absence of energetic limitation and genetic variation, temperature and environmentally
induced FA (or developmental noise) should be negatively correlated in a more or less
monotonic relationship.
Methods
In March 2003, a monoclonal population of Cotton Aphids (Aphis gossipyii
Glover) was obtained from Dr. J.P. Michaud in Lake Alfred, Florida and was brought to
the Department of Entomology and Nematology at the University of Florida. The culture
was maintained on cotton seedlings (Gossipium) grown at different temperatures (12.50C,
150C, 170C, 190C, 22.50C and 250C with n = 677 total or about 100+ per treatment)
under artificial grow lights (14L:10D cycle). Because of potential undersampling caused
by a nonnormal distribution of FA (see Babbitt et al. 2006), a second monoclonal
population collected from Gainesville, FL in June 2004 was reared similarly but in much
larger numbers at 12.50C, 150C, 170C, 190C, and 250C (n = 1677 or about 300+ per
treatment).
Development time for individual apterous cotton aphids (Lake Alfred clone) were
determined on excised cotton leaf discs using the method Kersting et al. (1999). Twenty
randomly selected females were placed upon twenty leaf discs (5 cm diameter) per
temperature treatment. Discs were set upon wet cotton wool in petri dishes and any first
instar nymphs (usually 35) appearing in 24 hours were then left on the discs.
Development time was taken as the average number of days taken to reach adult stage
and compared across temperatures. Presence of shed exoskeleton was used to determine
instar stages. Cotton was wetted daily and leaf discs were changed every 5 days.
Humidity was maintained at 50+5%.
In each temperature treatment, single clonal populations were allowed to increase
on plants until crowded in order to stimulate alate (winged individuals) production.
Temperature treatments above 170C produced light colored morphs that were smaller and
tended to feed on the undersides of leaves of cotton seedlings. Temperature treatments
below 17C produced larger dark morphs that tended to feed on the stems of cotton
seedlings. Alatae were collected using small brushes dipped in alcohol and stored in 80%
ethanol. Wings were dissected using fine insect mounting pins and dry mounted as pairs
on microscope slides. Species identification was by Dr. Susan Halbert at the State of
Florida Department of Plant Industry in Gainesville.
Specimens were dried in 85% ethanol, and then pairs of wings were dissected (in
ethanol) and airdried to the glass slides while ethanol evaporated. Permount was used to
attach cover slips. This technique prevented wings from floating up during mounting,
which might slightly distort the landmark configuration. Dry mounts were digitally
photographed. Six landmarks were identified as the two wing vein intersections and four
termination points for the third subcostal. See Appendix A for landmark locations.
Wing vein intersections were digitized using TPSDIG version 1.31 (Rohlf,
1999). Specimens damaged at or near any landmarks were discarded. Fluctuating
asymmetry was calculated using a multivariate geometric morphometric landmarkbased
method. All landmarks are shown in Appendix A. FA (FA 1 in Palmer and Strobek
2003) was calculated as absolute value of (R L) where R and L are the centroid sizes of
each wing (i.e., the sum of the distances of each landmark to their combined center of
mass or centroid location). In addition, a multivariate shapebased measure of FA known
as the Procrustes distance was calculated as the square root of the sum of all squared
Euclidean distances between each left and right landmark after twodimensional
Procrustes fitting of the data (Bookstein 1991; Klingenberg and McIntyre 1998; FA 18 in
Palmer and Strobeck 2003; Smith et al. 1997). This removed any difference due to size
alone. Centroid size calculation, Euclidean distance calculation and Procrustes fitting
were performed using 0yvind Hammer's Paleontological Statistics program PAST
version 0.98 (Hammer 2002). Percent measurement error was computed as (ME/average
FA) x 100 where ME= ( FA FA2 + FA2 FA3 + FA FA3)/3 in a smaller subset
(200 wings each measured 3 times = FA1, FA2 and FA3) of the total sample. All
subsequent statistical analyses were performed using SPSS Base 8.0 statistical software
(SPSS Inc.). Unsigned multivariate size and shape FA as well as the kurtosis of signed
FA were then compared at various temperatures using oneway ANOVA.
Results
Development time (Figure 42) was very similar to previously published data
(Kersting et al. 1999, Xia et al. 1999) decreasing monotonically at a much steeper rate in
dark morphs than in light morphs. The distributional pattern of centroid size, size FA and
shape FA appear similar exhibiting right log skew distributions (Figure 43). Similar
distributional patterns are observed within temperatures (not shown) as that observed
across temperatures (Figure 43).
30
Dark Morph Light Morph
2 5 ......................... . . .
25 
20 
DEVELOPMENT
TIME (days) 15 
1 5 .. .... .... ... 1 ......... 1 ........ ............................................................
10
8 10 12 14 16 18 20 22 24 26
TEMPERATURE (C)
Figure 42. Cotton aphid mean development time +1 SE in days in relation to
temperature (n= 531).
FREQUENCY
100
ND S364 00
CENTROID SIZE
St Dev = 596
T O N = 1822 00
CENTROID SIZE FLUCTUATING ASYMMETRY
J Std Dev= 02
Mean= 056
0 N = 121900
,% *% ,% ,% ,% ,% *o, *,% *% ,% *,%. *% *%
SHAPE FLUCTUATING ASYMMETRY
Figure 43. Distribution of isogenic size, size based and shape based FA in monoclonal
cotton aphids grown in controlled environment at different temperatures.
Distributions within each temperature treatment are similar to overall
distributions shown here.
Coefficient of variation for FA was slightly higher for dark morphs (12.5 C =
92.59%, 15 C = 93.02%, 17 C = 88.95%, 19 C = 79.02%, 22.5 C = 80.00% and 25 C =
85.35%). Mean isogenic FA (both size and shape) was highly significantly different
across temperatures (ANOVA F = 6.691, df between group = 4, df within group = 1673,
p < 0.001) in the Gainesville FL clone (Figure 44) but not in the Lake Alfred clone
(ANOVA F = 1.992, df between group = 5, df within group = 672, p < 0.078). This is an
indication ofundersampling in the Lake Alfred clone. In the Gainesville clone, mean
centroid size FA (Figure 44A) and development time (Figure 42) follow a nearly
identical pattern, decreasing rapidly at first then slowing with increased temperature.
1 1 1 1 1 1
10.5 Dark Morph Light Morph
10 A
9.5 
9
8.5
7.5
6.5
mean isogenic 610 12 14 16 18 20 22 24 26
fluctuating asymmetry
0.068 
Dark Morph Light Morph
0.066 
0.064 
0.062
0.06
0.058
0.056
0.054
0.052 .
10 12 14 16 18 20 22 24 26
TEMPERATURE (C)
Figure 44. Mean isogenic FA for (A.) centroid size based and (B.) Procustes shape
based) in monoclonal cotton aphids (collected in Gainesville FL) grown on
isogenic cotton seedlings at different temperatures.
71
Mean shape FA was also significantly different across temperature classes
(ANOVA F = 4.863, df within group = 4, df between group = 1673, p = 0.001) but this
difference is due solely to elevated FA in the 12.50C group (Figure 44B). Less than one
percent of the variation in FA was due to variation in body size (r = 0.101 for shape FA;
r = 0.088 for size FA). Kurtosis in the shape of the distribution of size based FA (Figure
45) was significantly higher in dark morphs than in light morphs (t = 2.21 p = 0.027).
Within each morph (light or dark), kurtosis in the distribution of FA appears to increase
with temperature slightly (Figure 45). Measurement error for shape FA was estimated
at2.6% (Lake Alfred clone) and 2.2% (Gainesville clone). For size FA these estimates
were 6.1% (Lake Alfred) and 5.7% (Gainesville).
KURTOSIS
1
10
10
12 14 16 18 20
TEMPERATURE (C)
Figure 45. Kurtosis of sizebased FA in monoclonal cotton aphids grown on isogenic
cotton seedlings at different temperatures.
Dark Morph Light Morph ...................................
. ..........
22 24
Discussion
It appears that the prediction of the proximate effect model holds in the case of this
population of cotton aphid, which, in general, is not energetically limited and in this
study, is not genetically variable. Temperature and growth rate (which are positively
correlated in insects) are negatively associated with purely environmentally derived FA
(i.e., developmental noise). This confirms the predictions of several proximate models of
the basis of FA. Furthermore, it appears that centroid sizebased FA is a simple function
of development time. Because individual genetic differences in the capacity to buffer
against developmental noise are, in a sense, controlled for in this study by the use of
natural clones, the response of FA to temperature in this study represents a purely
environmental response of FA. The fact that aphids excrete large amount of water and
sugar in the form of honeydew, as well as the lack of a strong correlation between FA and
size also suggests that there is no real energetic "cost" to being large in the aphids in this
study. An important next step will be to compare this result to the association of growth
rate to FA in genetically diverse and energetically limited sexually selected traits where
the predicted association between growth rate and FA would be the opposite of this
study.
It also appears that because the environmental component of sizebased isogenic
FA is largely a function of developmental time and temperature, dark morphs of cotton
aphids, which have much longer development time than light morphs, also have
significantly higher FA. The temperature trend in mean FA within both light and dark
morphs, where development times are similar, is not consistent although it appears that
the kurtosis of FA increases slightly with temperature within each morph.
It is very interesting that there is a strong difference in kurtosis between the two
temperature morphs in this study. Previous work suggests that kurtosis is related to
genetic variability in a population (Babbitt in press). The difference observed here in
light of genetic homogeneity in the monoclonal aphid cultures, suggests that there may be
a difference in developmental stability of light and dark aphid morphotypes that is due
primarily to the differential expression of genes in each phenotype.
It is surprising that temperature trends in mean developmental noise are slightly
different regarding whether a size or shapebased approach was used. In populations
where individuals are genetically diverse, both size and shapebased measures of FA are
often correlated to some degree (as they are here too). However, size and shape are
regulated somewhat differently in that cell proliferation is mediated both extrinsically via
cyclin E acting at the G1/S checkpoint of the cell cycle, predominantly affecting size, and
intrinsically at via cdc25/string at the G2/M checkpoint, predominantly affecting pattern
or shape (Day and Lawrence 2000). Extrinsic mechanisms regulate size through the
insulin pathway and its associated hormones (Nijhout 2003) providing a link between
size and the nutritional environment. This may explain why size FA follows
development time more closely than shape FA in this study.
Size is usually less canalized than is shape and therefore more variable. While this
makes the sizebased measure of FA generally attractive, it has been found in previous
work (Babbitt et al. 2006) that population averages of sizebased FA are often so variable
that they are frequently undersampled because of the broad and often long tailed
distribution of FA. Shapebased FA does not suffer as much from this problem and so its
estimation is much better although it may be less indicative of environmental influences
and perhaps therefore better suited for genetic studies of FA. Because size FA has been
adequately sampled in this study and because size is more heavily influenced by
environmental factors like temperature, I find that sizebased FA is the more interesting
measure of developmental instability in this study.
In general, it is clear that developmental noise is not constant in genetically
identical individuals cultured under near similar environments. The overall distribution
pattern in both size and shape FA is log skewed even within temperature classes. This
suggests that sampling error due solely to random noise can play a very significant role in
FA studies. Furthermore, it appears that the response of FA to the environment is
potentially quite strong. This suggests that FA may indeed be a responsive bioindicator,
however its response will not be very generalizable in environments with fluctuation in
temperature.
In conclusion, the environmental response of developmental noise to temperature
in absence of genetic variability and nutrient limitation supports an important prediction
of theoretical explanations of the proximate basis of FA. This prediction is that FA
should decrease with growth rate (and temperature) in ectothermic organisms. Because it
is only in sexually selected traits that the opposite prediction should hold, that FA should
increase and not decrease with growth rate, the results presented here may offer a useful
method for discriminating between FA that is under sexual selection and FA that is not.
CHAPTER 5
CONCLUDING REMARKS AND RECOMMENDATIONS
In this dissertation, I confirm some existing hypotheses concerning FA and provide
a new explanatory framework that explains alteration in the distribution of fluctuating
asymmetry (FA) and its subsequent effect upon mean FA. The basis of FA and the
influences that shape its response to genes and the environment in populations can be
suitably modeled by stochastic proportional growth in expanding populations of cells on
both sides of the body that are terminated with a small degree of geneticallybased
random error. I model stochastic growth with geometric Brownian motion, a random
walk on a log scale. And I also model error in terminating growth with a normal
distribution. The resulting distribution of FA is a lognormal distribution characterized by
powerlaw scaled tails. Because under a powerlaw distribution, increased sample size
increases the chance of sampling rare events under these long tails, convergence to the
mean is potentially slowed or even stopped depending on the scaling exponents of the
powerlaw describing the tails. I demonstrated that the effect of reduced convergence to
the mean is substantial and has probably caused much of the previous work on FA to be
undersampled (Chapter 1). I have also demonstrated that the scaling effect in the
distribution of FA is directly related to genetic variation in the population. Therefore a
low scaling exponent and also high kurtosis is associated with a large degree of genetic
variation between individuals in their ability to precisely terminate the growth process
(Chapter 2). I also demonstrate that when genetic differences do not exist (where FA is
comprised of only developmental noise) and when development is not limited
energetically, such as in a parthenogenetic aphid clone, FA depends upon developmental
time (Chapter 3).
How Many Samples Are Enough?
The answer to this clearly depends upon whether researchers chose to study FA in
size or FA in shape. The required sample size depends upon the distributional type and
parameters of the distribution of FA. As previously discussed, FA distributions differ
depending upon whether FA is based on size or shape differences between sides. As
demonstrated in chapter 1, multivariate measures of FA based upon shape, such as
Procrustes Distances, tend to have distributional parameters that allow convergence to the
mean that is about five times more rapid than either univariate or multivariate measures
of FA based upon size (Euclidian distance or centroid size). While this information
might seem to favor a shapebased approach, there are some other very important
considerations given below. In the end, sample size must be independently evaluated in
each study depending on the magnitude of asymmetry that one wishes to detect between
treatments or populations. As a general rule, if shape FA is used, 100200 samples may
suffice, but if size FA is used then many hundreds or even a thousand samples may be
required to detect a similar magnitude difference.
What Measure of FA is best?
The handicap of sampling requirements of sizebased FA aside, it appears that
measures of the distributional shape of FA, like kurtosis, scaling exponent in the upper
tail and skewness are strongly related to similar but smaller changes in mean size FA
(chapter 2). Because size FA is most commonly used in past studies of FA, these
measures of distribution shape might allow clearer conclusions to be drawn from
literature reviews and metaanalyses as well as individual studies of FA where the
collection of more data is not possible. My work linking shape of the FA distribution to
the genetic variability of the population will require that future studies of FA which are
focused on the environmental component of FA be controlled for genetic differences
between populations. Where FA is sought as a potential bioindicator of environmental
stress, the potentially differing genetic structure of populations will need to be considered
in order to make meaningful conclusions about levels of FA. Additionally, because in the
absence of genetic variation in monoclonal aphids, size FA is directly related to the total
developmental times of individuals (chapter 3), it may a better choice than shape FA for
studies of environmental influences on FA.
In most organisms, body size is less canalized than body shape. Body size is highly
polygenic and depends upon many environmentally linked character traits. However
body shape or patterning is determined by a sequential progression of the activity of far
fewer genes. Body size is also regulated at a different checkpoint during the cell cycle
than is body pattern formation or shape (Day and Lawrence 2000). Patterning or shape is
regulated at G2/M checkpoint while size is regulated at G1/S checkpoint. The latter
checkpoint is associated with the insulin pathway, linking size to nutrition and hence to
the environment. Because of this the study of environmental responses of FA may
ultimately be best served by using multivariate sizebased measures of FA (i.e., centroid
size FA) while at the same time making sure to collect enough samples to accurately
estimate the mean.
In conclusion, multivariate size FA may be the better metric for large studies
interested in the effects of environmental factors upon FA. Where data is harder to come
by (e.g., vertebrate field studies), multivariate shape FA may perform better with the
caveat that shape and pattern are less likely to vary in a population than does size.
Because centroid size must be calculated in order to derive Procrustes distance, both
measures can easily and should be examined together.
Does rapid growth stabilize or destabilize development?
The effects of temperature and growth rate on FA appear to support the proximate
model which predicts that FA declines with increasing growth rates (Figure 43). Aphids
from the Gainesville clone decrease mean FA in response to increased temperature and
growth rate. It is likely that the Lake Alfred clone is undersampled and therefore
estimates of mean FA are not accurate in that sample. The results of the Gainesville
clone, which was sampled adequately, suggests that the prediction of proximate models
of the basis ofFA (Emlen et al. 1993, Graham et al. 1993), that rapid growth should
decrease FA holds true. Moller's hypothesis that rapid growth is stressful because of
incurred energetic costs does not appear to hold in the case of aphids. Of course aphids,
being phloem feeders, generally have access to more water and carbon than they ever
need. This is evidenced by analysis of honeydew composition in this and many other
aphid species. Growth and reproduction in aphids is probably more limited by nitrogen
than by water or energy (carbon). So it would seem that this system may not be a very
good one in which to assess Moller's hypothesis directly. It should be further
investigated whether FA of sexually selected traits in energetically limited, genetically
diverse populations is positively associated with growth rate as is predicted by the
ultimate or evolutionary model presented in Chapter 4.
Can fluctuating asymmetry be a sexually selected trait?
Given that a genetically altered population (chapter 3) demonstrates a consistent
response in the shape of the FA distribution does suggest that a potentially strong genetic
basis for FA exists, but is not easily observable through levels of mean FA. This implies
some heritability exists and possibly opens the door for selection to act upon FA in the
context of mate choice. However, in the future it must be demonstrated, with appropriate
sample sizes, that the level of FA in the laboratory can be altered by selection for
increased or decreased FA. The opposing predictions of the proximate or mechanistic
and ultimate (evolution through sexual selection) explanations of FA, regarding the
relationship of temperature, growth rate and FA may offer a method for detecting sexual
selection upon FA (chapter 4). If FA has evolved as an indicator of good genes, and
hence has related energetic costs, the expectation that FA should increase with growth
rate and temperature (in ectotherms) is plausible. Otherwise, ifFA is caused by random
accumulation of error during development, then the expectation is reversed. FA should
decrease with growth rate and temperature as I have demonstrated in a clonal population
of aphids that are not energetically limited (chapter 4).
Is fluctuating asymmetry a valuable environmental bioindicator?
Given that the genetic structure of a population (chapter 3) has potentially strong
effects upon the shape and location (mean) of the distribution of FA, the usefulness of FA
as a bioindicator of environmental stress has to be questioned. Average levels of FA of
populations would be difficult to interpret or compare without additional information
regarding genetic heterogeneity. However, provided that the study of FA and population
genetics were undertaken simultaneously, this problem could be avoided. So FA could
still be of use in this regard, however its expense and ease of use compared to other
bioindicators would need to be reassessed.
Scaling Effects in Statistical Distributions: The Bigger Picture
This dissertation demonstrates that underlying distributions of some biological data
can contain partial selfsimilarity or powerlaw scaling. The proper model for describing
data such as this sits at a midpoint between those models used in classical statistics and
those of statistical physics: Levy statistics (Bardou et al. 2003). To my knowledge, this
work represents the first application of such a model in biology.
The sources of powerlaw scaling in the natural sciences are diverse. Sornette
(2003) outlines 14 different ways that powerlaws can be created, some of which are very
simple. The hypothesis that all powerlaw scaling in nature is due to a single phenomena
such as selforganized criticality (SOC) (Bak 1996, Gisiger 2000) or highly optimized
tolerance (HOT) (Newman 2000) is unlikely. Reed (2001) suggests that the model I have
used here, stochastic proportional growth that is observed randomly, may explain a great
deal of powerlaw scaled size distributions formerly speculated to have a single cause
like SOC or HOT. These phenomena include distributions of city size (Zipf's law),
personal income (Pareto's Law), sand particle size, species per genus in flowering plants,
frequencies of words in sequences of text, sizes of areas burnt in forest fires, and species
body sizes, just to name a few examples.
One of the most common ways that powerlaws can be obtained is by combining
exponential functions. This is effect is also observed when positive exponential/
geometric/proportional growth is observed with a likelihood described by to a negative
exponential function such as in Reed and Jorgensen's (2004) model for generating size
distributions. Power laws generated by combing exponential functions are also present in
the statistical mechanics of highly interactive systems (e.g., SOC) where the correlation
in behavior between two nodes or objects in an interaction network or lattice decreases
exponentially with the distance between them while the number of potential interaction
paths increases exponentially with the distance between them (Stanley 1995). In
Laplacian fractals, or diffusion limited aggregations observed in chemical
electrodeposition and bacterial colony growth (Viscek 2001) there may be a similar
interplay between exponential growth (doubling) and the diffusion of nutrients supporting
growth which are governed by the normal distribution, of the form y = e
The exponential function holds a special place in the natural sciences. Any
frequency dependent rate of change in nature, or in other words, any rate of change of
something that is dependent on the proportion of that something present at that time, is
described by the exponential function. Therefore it finds application to many natural
phenomena including behavior of populations, chemical reactions, radioactive decay, and
diffusion just to name a few. It seems only fitting that many of the powerlaws we
observe in the natural sciences probably owe their existence to the interplay of
exponential functions, one of the most common mathematical relationships observed in
nature.
As far as we can ascertain from the recording of ancient civilizations, human beings
have been using numbers for at least 5000 years if not longer. And yet the concepts of
probability are a relatively recent human invention. The idea first appears in 1545 in the
writings of Girolamo Cardano and is later adopted by the mathematicians, Galileo,
Fermat, Pascal, Huygens, Bernoulli and de Moivre in discussions of gambling over the
next several hundred years. The concepts of odds and of random chance are not
generalized until the early twentieth century by Andreyevich Markov and not formalized
into mathematics until the work of Andrei Kolmogorov in 1946.
The role of uncertainty in nature is yet to be resolved. Does uncertainty lie only
with us or does it underlie the very fabric of the cosmos? Most 19th century scientists
(excepting perhaps Darwin) believed that the universe was governed by deterministic
laws and that uncertainty is solely due to human error. The first application of statistical
distributions by Carl Frederick Gauss and Pierre Simon Laplace were concerned only
with the problem of accounting for measurement error in astronomical calculations. The
more recent view, that uncertainty is something real, was largely the work of early
twentieth century quantum theorists Werner Heisenberg and Erwin Schrodinger and the
statistical theory embodied in the work of Sir Ronald Fisher. The basic conceptual
revolution in modem physics was that observations cannot be made at an atomic level
without some disturbance, therefore while one might observe something exactly in time,
one cannot predict how the act of observing will affect the observed in the future with
any degree of certainty (at least at very small scales). This is Heisenberg's Uncertainty or
Indeterminancy Principle.
Scientists observe natural "laws" only through emergent properties of many atoms
observed at vastly larger scales where individual behavior is averaged into a collective
whole. Jakob Bernoulli's Law of Large Numbers, Abraham de Moivre's bell shaped
curve, and Sir Ronald Fisher's estimation of the mean are also examples of how scientists
rely upon emergent properties of large systems of randomly behaving things in order to
make sense of what is thought to be fundamentally uncertain world. However, this
probabilistic view of the natural world has been challenged in recent decades by some
mathematicians who are again championing a deterministic view of nature. Observations
by Edward Lorenz, Benoit Mandelbrot and others, of simple and purely deterministic
equations that behave in complex unpredictable ways, has led to a revival in the
deterministic view among mathematicians; in this case, uncertainty is a product of human
inability to perceive systems that are extremely sensitive to initial starting conditions.
For this reason, this new view is often called deterministicc chaos". And so the question
as to whether uncertainty is real or imagined is yet to be resolved by modern science.
One remarkable observation of this latest revolution in mathematics is the frequent
occurrence of scale invariance or powerlaw scaling in systems that exhibit this sort of
complex and unpredictable behavior. And so just as the normal distribution or bell curve
is an emergent property governing the random behavior of independent objects, the
powerlaw appears to be an emergent property of nature as well; one that seems not only
to often to appear in the behavior of interacting objects (i.e., critical systems) but in
systems where growth in randomly observed as well. Statistical physics now recognizes
two classes of "stable laws", one that leads to the Gaussian or normal distribution and
another that is Levy or powerlaw distributed. In the former class, emergent behavior is
governed by commonly occurring random events while in the latter class the behavior of
the group is governed by a few rarely occurring random events. As I have already
reviewed earlier, we find that convergence to the mean under these two frameworks can
be radically different. Yet both are present in the natural world, and as I have shown in
my work, both of these classes of behavior can underlie naturally occurring distributions,
causing partial scale invariance in real data. It is my conviction that the biological
sciences must in the future adopt the statistical methods of working under both of these
frameworks and not simply make assumptions of normality whenever and wherever
random events are found to occur.
APPENDIX A
LANDMARK WING VEIN INTERSECTIONS CHOSEN FOR ANALYSISOF
FLUCTUATING ASYMMETRY
Figure Al. Six landmark locations digitized for Aphis gossipyii
Figure A2. Six landmark locations digitized for Apis mellifera
85
Figure A3. Six landmark locations digitized for Chrysosoma crinitus
Figure A4. Eight landmark locations digitized for Drosophila simulans
APPENDIX B
USEFUL MATHEMATICAL FUNCTIONS
The following is a list of the probability density functions for candidate models of
the distribution of fluctuating asymmetry. Log likelihood forms of these functions were
maximized to obtain best fitting parameters of each model for our data.
Asymmetric Laplace Distribution
(see Kotz et al. 2001)
f(x) = ( 1)e eo for x > 0
f(x) = ((1 )) for x < 0
where 0 = location, c = scale and K = skew index (Laplace when K = 1)
HalfNormal Distribution
f(x) = ( )( )e'
where [= minimum data value and 0 = dispersion
Lognormal Distribution
(see Evans 2000 or Limpert et al. 2001)
f(x) = 1 e( (logx v) 22)
xr(2r)1/2
where v = location and T = shape or multiplicative standard deviation
Double Pareto Lognormal Distribution
(see Reed and Jorgensen 2004)
f(x) =  g(log(x))
given the normalLaplace distribution
g(y) = (At) [R(a Y)+ R(P. ) + )]
where the Mill s ratio R(z) is
1 O(z)
R(z) =
and where D is the cumulative density function and q is probability density function for
standard normal distribution N(0,1), where a and 0 are parameters that control powerlaw
scaling in the tails of the lognormal distribution.
The limiting forms of the double Pareto lognormal are the left Pareto lognormal
(a = o ), right Pareto lognormal distributions (f = oo ), and lognormal distributions
(a = o, f = o) with Pareto tails on only the left side, only the right side, or on neither
side, respectively.
A description of Reed and Jorgensen's generative model of double Pareto lognormal size
distribution.
Reed and Jorgenson's (2004) generative model begins with the Ito stochastic
differential equation representing a geometric Brownian motion given below.
dX = uXdt + crXdw
with initial state X(0) = XO distributed lognormally, log XO N(v,'2). After Ttime units
the state X(T) is also distributed lognormally with log X(T) N(v + (LaC2/2)T,T2 + C2T).
The time T, at which the process is observed, is distributed with densityfT(t) = XeXt
where X is a constant rate. The double Pareto lognormal distribution is generated when
geometric Brownian motion is sampled repeatedly at time t with a negative exponential
probability.
