Citation
Chemical Biodiversity and Signaling: Detailed Analysis of FMRFamide-Like Neuropeptides and Other Natural Products by NMR and Bioinformatics

Material Information

Title:
Chemical Biodiversity and Signaling: Detailed Analysis of FMRFamide-Like Neuropeptides and Other Natural Products by NMR and Bioinformatics
Copyright Date:
2008

Subjects

Subjects / Keywords:
Amides ( jstor )
Amino acids ( jstor )
Chemical equilibrium ( jstor )
Insects ( jstor )
Isomers ( jstor )
Neuropeptides ( jstor )
Proteins ( jstor )
Protons ( jstor )
Receptors ( jstor )
Roundworms ( jstor )

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Embargo Date:
10/10/2006


Full Text












CHEMICAL BIODIVERSITY AND SIGNALING:
DETAILED ANALYSIS OF FMRFamide-LIKE NEUROPEPTIDES AND OTHER
NATURAL PRODUCTS BY NMR AND BIOINFORMATICS













By

AARON TODD DOSSEY


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2006

































Copyright 2006

by

Aaron Todd Dossey

































To my family; to members of the laboratory of Dr. Arthur S. Edison; and to God
Almighty and the magnificent natural world he created which has given me much joy and
a basis for my career in science.















ACKNOWLEDGMENTS

As with any endeavor one may pursue in life, I cannot take sole credit for anything

I have done and, thus, thanks are certainly in order to those who have helped make my

PhD possible. First and foremost, I would like to thank my family. In particular, I thank

my grandparents, Jerry and Emma Dossey, and mother, Teresa (Dossey) Scott, for

instilling in me three key components of my successes in life thus far: determination, a

strong work ethic, and faith in myself. I would also like to thank God the creator for my

life and the bountiful life forms of this earth that I enjoy studying every day.

I would also like to thank Oklahoma State University and others who helped foster

the early stages of my career in biochemistry. I thank Dr. Eldon C. Nelson, my

undergraduate advisor, for showing great care about my career and keeping me focused

and motivated on the career related aspects of my tenure there. For many interesting and

encouraging conversations about entomology from which I learned a lot, I would like to

thank Don C. Arnold, the curator of the Oklahoma State University Entomological

Museum. I also thank the Southwestern Bell Telephone Corporation and Sylvia Coles

Denebeim for supporting me through generous scholarships which helped tremendously

with school related costs and allowed me to focus on my education.

At the University of Florida (UF), I would like to thank Dr. Arthur Edison, my

supervisory committee chair, for providing a research atmosphere that has allowed me to

develop as a scientist. I also thank Dr. Edison for his patience while I was training in

protein NMR and learning how to write scientific articles. I thank James R. Rocca in the









UF AMRIS (Advanced Magnetic Resonance Imaging and Spectroscopy) facility for

countless hours of help and discussion. His tireless efforts in NMR training were an

invaluable complement to the training I received from Dr. Edison. Pat Jones, the

Biochemistry department secretary, was also invaluable to me by keeping me in line with

deadlines and course registration and I thank her for that as well. I also thank my

supervisory committee members (Drs. Arthur S. Edison, Ben M. Dunn, Brian D. Cain,

Joanna R. Long, and Stephen A. Hagen) for always being available to help guide me

through my PhD studies. I also thank the University of Florida for awarding me the

Grinter Fellowship for the first three years of my tenure there.

For help with specific projects, others certainly deserve my thanks. I thank Dr.

Cherian Zachariah for help, training, and experiments in protein expression and

purification and NMR. For experiments performed and exciting discussion on phasmid

insect pheromone chemistry, I thank Dr. Spencer Walse and James Rocca. For data

resulting in my first publication (Chapter 3), I thank Drs. Mario de Bono, Peter Evans,

and Vincenzina (Reale) Evans, and Heather Chatwin for bioassay experiments on FLP-18

related neuropeptides. For their support and friendship, I would like to thank Omjoy

Ganesh, Iman Al-Naggar, Ramazan Ajredini, Fatma Kaplan, Dr. James Smith, and Dr.

Terry B. Green (all fellow members of Dr. Edison's lab).
















TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION

    Importance of Nematodes
    FMRFamide-Like Neuropeptides (FLPs)
    FLP Precursor Proteins
    Receptors and Functions
    FLPs as Natural Products
    Natural Products
    Dissertation Outline

2 BIOCHEMICAL PROPERTIES OF FLPS AND THEIR PRECURSOR PROTEINS FROM THE NEMATODE Caenorhabditis elegans

    Introduction
    Experimental Methods
    Data Mining for Nematode flp Precursor Protein Sequences
    Alignment and Phylogenetic Analysis of flp Precursor Proteins
    Analysis of Biochemical Properties of flp Precursor Proteins and Figure Generation
    Analysis of FLP Mature Peptide Biochemical Properties
    Results
    Analysis of Biochemical Properties of flp Precursor Proteins
    Sequence Repetition Patterns
    Charge Distribution
    Unstructured Propensity
    Other Features
    Biochemical Properties of Mature Processed FLPs
    Peptide Charge
    Peptide Length and Amino Acid Conservation
    Discussion
    Grouping of FLP Subfamilies by Precursor and Peptide Properties
    The flp-1 Group
    The flp-6 Group
    The flp-7 Group
    The flp-20 Group
    The flp-21 Group
    Charge Compensation and Possible flp Precursor Structure

3 NMR ANALYSIS OF C. elegans FLP-18 NEUROPEPTIDES: IMPLICATIONS FOR NPR-1 ACTIVATION

    Introduction
    Experimental Procedures
    Peptide Synthesis
    Peptide Sample Preparation
    Biological Activity Assays:
    NMR Spectroscopy
    Results
    Peptide Design Rationale and Physiological Responses:
    NMR Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:
    pH Dependence of Amide Proton Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:
    pH Dependence of Arginine Side-Chains Reveal Long-Range Interactions:
    Quantitative Determination of pKa Reveals Multiple Interactions:
    Temperature Dependence of Amide Chemical Shifts Corroborates Regions with H-Bonding:
    Overall Peptide Charge is Correlated With Activity on NPR-1:
    Discussion
    The backbone structure of the conserved PGVLRF-NH2 is predominantly unstructured
    DFDG forms a structural loop stabilized by H-bonding
    The DFDG loop may interact with the second loop to form a dynamic bicyclic structure which reduces binding to NPR-1
    Charge is also important in determining the activity of flp-18 peptides on NPR-1

4 ANISOMORPHAL: NEW INSIGHTS WITH SINGLE INSECT NMR

    Introduction
    Experimental Procedures
    Animal Collection and Rearing
    Sample Collection and Handling
    NMR Spectroscopy
    High Pressure Liquid Chromatography-Mass Spectrometry (LC-MS)
    Gas Chromatography (GC)
    Gas Chromatography-Mass Spectrometry (GC-MS)
    Results
    NMR of Single Milkings Shows a New Component and Isomeric Heterogeneity
    Glucose Verified by Chromatography and Colorimetric Assay
    Stereoisomeric Heterogeneity Verified by Gas Chromatography and Mass Spectrometry
    Discussion
    Glucose Discovered in Stick Insect Defensive Spray: Potential Functions
    Isomeric Heterogeneity in Phasmid Defensive Compounds: Chemical Biodiversity
    Peruphasmal: A Novel Phasmid Defensive Compound Isomer

5 CONCLUSIONS AND FUTURE DIRECTIONS

    Conclusions
    Future Directions
    Evolutionary History of FLPs and Other Neuropeptides
    Neuropeptide Structure/Function Analyses
    Anisomorphal and Other Insect Natural Products

APPENDIX

A ACCESSION NUMBERS (WITH CORRESPONDING SEQUENCE NAMES) FOR ALL FLP PRECURSOR PROTEIN SEQUENCES FROM ALL NEMATODE SPECIES USED IN WORK RELATED TO CHAPTER 2

B ALIGNMENTS OF FLP PRECURSOR PROTEINS FROM Caenorhabditis elegans AND OTHER NEMATODE SPECIES

C 1H NMR ASSIGNMENTS FOR ALL PEPTIDES EXAMINED IN CHAPTER 3

D pKa VALUES CALCULATED FOR RESONANCES WITH pH-DEPENDENT CHEMICAL SHIFTS

E 1H AND 13C NMR CHEMICAL SHIFT ASSIGNMENTS FOR DOLICHODIAL-LIKE ISOMERS FROM THE WALKING STICK INSECT SPECIES Anisomorpha buprestoides AND Peruphasma schultei

F HPLC-MASS SPEC IDENTIFICATION OF GLUCOSE FROM DEFENSIVE SECRETIONS OF Anisomorpha buprestoides

G 2D NMR SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei

H GAS CHROMATOGRAPHY TRACES AND MASS SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei AND EXTRACTS OF Teucrium marum

LIST OF REFERENCES

BIOGRAPHICAL SKETCH
















LIST OF TABLES


Table

1-1 Global numbers of the major human nematode infections.

1-2 Sample quantities required for analysis using the three most powerful analytical techniques for chemical structure determination.

1-3 Mean values for a selection of molecular properties among natural, drug, and synthetic compounds.

2-1 Sequence patterns and properties common to various groups of flp precursor proteins in C. elegans.

3-1 Peptides examined by NMR and their activities on NPR-1.
















LIST OF FIGURES


Figure

1-1 FMRFamide-Like Neuropeptides predicted from database searches for precursor proteins.

1-2 The flp-18 and afl-1 precursor proteins from C. elegans and A. suum.

1-3 Schematic diagram of the neuropeptide activated GPCR NPR-1 from C. elegans.

2-1 Graphical illustrations of various chemical properties of FMRFamide-like Neuropeptide precursor proteins in C. elegans.

2-2 Examples of known natively structured and unstructured proteins analyzed by IUPRED.

2-3 Portion of the alignment for all known flp-7 precursor protein orthologues in the phylum Nematoda.

2-4 Groupings of similar FLP neuropeptide subfamilies based on their chemical properties and precursor protein sequence similarities.

2-5 Predicted net charge frequency (pH 7.0) for all FLP neuropeptides predicted from C. elegans.

2-6 Prevalence of peptide lengths for all predicted FLP neuropeptides in C. elegans.

3-1 Dose response curves of select FLP-18 peptides.

3-2 NMR chemical shift deviations from random coil values.

3-3 Amide region of one-dimensional NMR data, collected as a function of pH.

3-4 Arg Hε region of one-dimensional NMR data, collected as a function of pH.

3-5 Proposed H-bonding Interactions Between Backbone Amide Protons and Carboxyl Side-chains.

3-6 Relationship between Temperature Coefficients and pH dependence of chemical shift among Backbone Amide Protons.

3-7 Relationship between overall peptide charge and activity on the NPR-1 receptor.

3-8 Model of interactions thought to occur within native FLP-18 peptides.

4-1 Adult pair of Anisomorpha buprestoides on leaves of a sweetgum tree (Liquidambar styraciflua).

4-2 Geminal diol equilibrium observed for dolichodial-like isomers in defensive secretions from A. buprestoides and P. schultei.

4-3 Example of milking procedure for collecting defensive secretion from individual A. buprestoides.

4-4 Procedure used for obtaining pooled defensive spray samples from A. buprestoides and P. schultei.

4-5 1D NMR spectra of single milking samples from A. buprestoides, its chloroform extract, the aqueous fraction, a sample of glucose for comparison, and a glucose spike experiment.

4-6 Expansions of vinyl region of 1D 1H NMR spectra for single milkings of A. buprestoides on different days.

4-7 2D COSY and ROESY 1H NMR spectra of defensive secretions from A. buprestoides and P. schultei.

4-8 Integrals of aldehyde and vinyl regions from single milkings of A. buprestoides defensive secretion.

4-9 Gas Chromatographs of dolichodial-like isomers isolated from defensive secretions of the insects Anisomorpha buprestoides and Peruphasma schultei and extracts of the plant Teucrium marum.















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

CHEMICAL BIODIVERSITY AND SIGNALING:
DETAILED ANALYSIS OF FMRFAMIDE-LIKE NEUROPEPTIDES AND OTHER
NATURAL PRODUCTS BY NMR AND BIOINFORMATICS

By

Aaron Todd Dossey

August 2006

Chair: Arthur Scott Edison
Major Department: Biochemistry and Molecular Biology

Natural products are simply the molecules produced by biological systems. My

studies examined the structure, function, and chemical biodiversity of two types of

natural products, FMRFamide-like neuropeptides and defensive dolichodial-like

compounds from walkingstick insects (Order Phasmatodea).

FMRFamide-like neuropeptides (FLPs) make up one of the largest known

neuropeptide families. The first member was identified in 1977 as a cardioactive

component of extracts from clam. FLPs are characterized by an N- to C-terminal

gradient of conservation, with subfamilies produced on the same gene having similar C-

terminal sequences. The canonical C-terminal motif of FLPs is Arg-Phe-NH2. They are

involved in a wide variety of biological and behavioral processes. They have been

identified in humans and are particularly well represented in invertebrates.

First, my study examined evolutionary relationships among several FLP

subfamilies in the nematode Caenorhabditis elegans. Using bioinformatics tools and









precursor protein sequence comparisons, I identified several important features of FLP

neuropeptides and their precursor proteins and genes. Also, some subfamilies of FLP

neuropeptides and their precursor proteins were categorized into groups based on a

number of similar features.

Next, I examined the structure/function relationships for two FLPs of the FLP-18

subfamily (EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2) from C. elegans. These

peptides have been demonstrated to regulate feeding behavior in C.

elegans by activating the NPR-1 receptor. NMR pH titration experiments and chemical

shift indexing were used to probe transient hydrogen bonding and electrostatic

interactions between aspartate sidechains and amide protons. These results indicate that

the longer of the two native C. elegans peptides possesses N-terminal structure, stabilized

by a hydrogen bonding network, which reduces its potency on the NPR-1 receptor

relative to the shorter peptide.

To examine non-polypeptide natural products, a novel microsample NMR probe

was used to examine the chemical biodiversity of walkingstick insects. The results show

the following: 1) Anisomorpha buprestoides produces two stereoisomers of dolichodial in

its defensive spray, 2) Peruphasma schultei produces only a single isomer (Peruphasmal)

of the same compound, and 3) defensive secretions of both species contain glucose

(previously unreported from walkingstick insect defensive secretions). These findings

would not have been possible using other NMR technologies.














CHAPTER 1
INTRODUCTION

Importance of Nematodes

Nematodes (Kingdom Animalia, Phylum Nematoda) are among the most numerous

groups of animals on the planet. The number of species worldwide is controversial (1),

but some estimate there are as many as 1 million species in the world (2, 3). As well as

being quite speciose, nematodes are among the most ubiquitous groups in the animal

kingdom, and four of every five individual living animals are nematodes (4). One

hundred grams of a typical soil sample can contain about 3,000 individual nematodes (3).

Nathan Augustus Cobb's rather famous 1914 description of the abundance of nematodes

on earth illuminates their prevalence (5):

In short, if all the matter in the universe except the nematodes were swept away,
our world would still be dimly recognizable, and if, as disembodied spirits, we
could then investigate it, we should find its mountains, hills, vales, rivers, lakes,
and oceans represented by a film of nematodes. The location of towns would be
decipherable, since for every massing of human beings there would be a
corresponding massing of certain nematodes. Trees would still stand in ghostly
rows representing our streets and highways. The location of the various plants and
animals would still be decipherable, and, had we sufficient knowledge, in many
cases even their species could be determined by an examination of their erstwhile
nematode parasites.

Nematodes take advantage of a wide variety of ecological niches. They occur in

arid desert areas, the bottoms of freshwater bodies, and in hot springs; they have even

been thawed out alive from Antarctic ice (6). The trait for which nematodes are best

known is their parasitism. Of over 20,000 species of nematodes described, about 25 to

33% parasitize vertebrates (1, 2), causing extensive health problems for people as well as









the livestock animals on which we depend. Many other nematode species are plant

parasites, and cause about $80 billion in crop damage annually (4). However, nematodes

also present great potential benefits to mankind, agriculture, and ecology. Many

nematodes are parasitic on pest organisms such as insects (7), and several species are

even commercially available as pest control agents (8).

As mentioned above, nematodes are best known for their parasitism, particularly of

humans. Accumulation of nematode parasites in various human host tissues causes a

wide variety of physiological and deforming pathologies. Filarial diseases, caused when

filarial nematodes like Wuchereria bancrofti and Brugia malayi infect the lymphatic

system, result in fever, chills, skin lesions, and other debilitating symptoms. If left

untreated, filarial infections can manifest as elephantiasis. In this later stage of the

disease, the lymph ducts are actually clogged with nematodes and fluid builds up in the

extremities, causing them to swell and become grossly deformed. Intestinal nematode

parasites (such as Ascaris lumbricoides in humans) can result in malnutrition, difficulty

breathing (when they migrate to the lungs), and intestinal blockage of the host.

Hookworms (Necator americanus and Ancylostoma duodenale) can result in skin rashes

and asthma-like symptoms when they penetrate skin or migrate into the lungs during their

lifecycle. Additionally, since adult hookworms live in the intestine and feed on blood,

severe infections can result in abdominal pain, anemia, and heart conditions. Another

intestinal parasite, Trichuris trichiura (the human "whipworm"), can also cause anemia

as well as diarrhea and abdominal pain. Infection by a type of filarial worm, Onchocerca

volvulus (common in some tropical regions), can cause the disease known as river

blindness. It is the larval stage of this parasite which causes the most severe









complication. The adult female, which resides under the human host's skin, lays eggs

there. Once the larvae hatch, they reside in the bloodstream awaiting uptake by a

secondary host, the black fly (genus Simulium). Some migrate into the human eye.

Immune response to the worms in the eye causes damage to the cornea, resulting in

blindness. In addition to those described above (often considered the major nematode

parasites of humans), many other nematode parasites have a serious impact on millions of

people worldwide (Table 1-1).
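
As a quick consistency check on Table 1-1, the regional counts for the three intestinal nematodes can be summed and compared with the reported totals. The following is a minimal sketch in Python. The dictionary layout and the function name `total_discrepancy` are illustrative choices and not part of the original study; the values are transcribed from the table (in millions).

```python
# Regional infection counts (millions) transcribed from Table 1-1, in the order:
# Sub-Saharan Africa, Latin America, Middle Eastern Crescent, India, China,
# Other Asia and Islands. Each entry pairs the regional list with the reported total.
INFECTIONS = {
    "Ascaris lumbricoides": ([105.0, 171.0, 96.0, 188.0, 410.0, 303.0], 1273.0),
    "Trichuris trichiura": ([88.0, 147.0, 64.0, 134.0, 220.0, 249.0], 902.0),
    "Hookworm": ([138.0, 130.0, 95.0, 306.0, 367.0, 242.0], 1277.0),
}


def total_discrepancy(regional, reported_total):
    """Return (sum of regional values) minus the reported total, in millions."""
    return sum(regional) - reported_total


if __name__ == "__main__":
    for species, (regional, reported) in INFECTIONS.items():
        diff = total_discrepancy(regional, reported)
        print(f"{species}: regional sum = {sum(regional):.1f}, "
              f"reported = {reported:.1f}, difference = {diff:+.1f}")
```

The Ascaris and Trichuris rows sum exactly to their reported totals. The hookworm regions sum to 1278 against a reported 1277, a difference that is presumably due to rounding in the source data.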

Given the importance of nematodes to mankind and the rest of the planet, it is clear

that intense study of nematodes is necessary to both control their negative impacts and

exploit their potential benefits. One particular nematode, Caenorhabditis elegans, has

been employed as one of the best characterized model organisms in modern biology. It

was first described in 1900 in a study on nematode reproduction (9). Among the most

significant advances in the understanding of the biology of C. elegans was the mapping

of the complete cell lineage from fertilized egg through adult hermaphrodite and male

(10-12). That work was vital in elucidating the function of each cell type in these

animals. Such information also helps answer genetic questions. To date, this is the only

multicellular organism for which such information is available. To complement this fine

level of anatomic and developmental detail, the entire genome for C. elegans has also

been sequenced (13).

The wealth of nematode biology (mainly the extensive characterization of C.

elegans) tells us much about the neuroanatomy and neuroconnectivity of these animals

(2). Among genera of subclass Rhabditia (Caenorhabditis, Ascaris, etc.) the

neuroanatomy is both conserved and simple (2, 14). An adult hermaphrodite C. elegans










Table 1-1: Global numbers of the major human nematode infections (in millions).

Species                 Sub-Saharan   Latin       Middle Eastern   India    China    Other Asia     TOTAL
                        Africa        America^a   Crescent                           and Islands
Ascaris lumbricoides    105.0         171.00      96.00            188.0    410.0    303.0          1273.0
Trichuris trichiura      88.0         147.00      64.00            134.0    220.0    249.0           902.0
Hookworm^b              138.0         130.00      95.00            306.0    367.0    242.0          1277.0

Onchocerca volvulus, Wuchereria bancrofti, and Brugia malayi^c: 2.6, 4.2, 6.2, 12.9, 0.03, 0.40, 0.34, 45.5, 115.1 (column positions lost in extraction).

Notes: Data from the Global Burden of Disease Study from the World Bank. Regions are
also as defined by this study. ^a Includes Caribbean nations. ^b Both Necator americanus and
Ancylostoma duodenale combined. ^c Both infection and disease cases. Table regenerated
from "The Biology of Nematodes" 2002, p. 600 (15).









has exactly and invariably 302 neurons (14). Other species in the Rhabditia also have

invariant numbers of neurons (invariant among individuals within a species) ranging

from 150-300 cells (2). This simplicity and lack of variability seemingly contradicts the

behavioral diversity observed in this group of animals. Rhabditia range from small free-

living nematodes with simple soil existences (such as the Caenorhabditis sp.) to larger,

more complex, and more specialized parasitic species (such as Ascaris sp.). One

component of their nervous system which adds additional levels of diversity and

complexity is their neurochemistry. A substantial part of the neurochemical diversity

observed in phylum Nematoda is their neuropeptides (16). Though little is known about

the biological function of most nematode neuropeptides, the best characterized family of

these is the FMRFamide-Like Neuropeptides (FLPs). They are the second largest

neuropeptide family in nematodes (second only to peptides hypothesized to be processed

from the Neuropeptide-Like (nlp) protein gene family) and certainly the most studied to

date (16-18).

FMRFamide-Like Neuropeptides (FLPs)

FMRFamide was first discovered in 1977 by Price and Greenberg as a

cardioexcitatory peptide from the clam Macrocallista nimbosa (19). FMRFamide-like

peptides (FLPs) are the largest family of neuropeptides found in invertebrates (17, 18, 20,

21), but mammalian (even human) FLPs have also been identified (22-25). These

peptides are characterized by an N- to C-terminal gradient of increasing sequence

conservation, and most end in RF-NH2. This is true when FLP peptide sequences are

compared as a whole, from within taxa, within species, or even on specific precursor

proteins (18, 20, 21, 26). For many neuropeptides, including FLPs, the C-terminus

is conserved. However, other neuropeptide families have different patterns of









conservation; for example, in insect orcokinins the N-terminus is the conserved region

(27), and in insulin the cystine framework and other central residues are

conserved (28).
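
The N- to C-terminal gradient of conservation described above can be illustrated with a short calculation. The sketch below, in Python, takes a subset of the flp-1 sequences listed in Figure 1-1 (in their precursor-encoded form, ending in Gly), aligns them at the C-terminus, and reports the fraction of peptides that share the most common residue at each position. The scoring scheme and the function name `conservation_from_c_terminus` are assumptions made for illustration; they are not the method used in this dissertation.

```python
from collections import Counter

# Subset of flp-1 peptide sequences as listed in Figure 1-1
# (precursor-encoded forms, each ending in Gly).
FLP1_PEPTIDES = [
    "SQPNFLRFG",
    "ASGDPNFLRFG",
    "SDPNFLRFG",
    "AAADPNFLRFG",
    "SADPNFLRFG",
    "PNFLRFG",
]


def conservation_from_c_terminus(peptides):
    """Score conservation at each position counted back from the C-terminus.

    Index 0 of the returned list is the last residue. Each score is the
    number of peptides carrying the most common residue at that position,
    divided by the total number of peptides, so positions absent from
    shorter peptides lower the score.
    """
    longest = max(len(p) for p in peptides)
    scores = []
    for offset in range(longest):
        residues = [p[-1 - offset] for p in peptides if len(p) > offset]
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(peptides))
    return scores


if __name__ == "__main__":
    for offset, score in enumerate(conservation_from_c_terminus(FLP1_PEPTIDES)):
        print(f"position -{offset + 1}: {score:.2f}")
```

For this subset, the seven C-terminal positions (PNFLRFG) are identical in every peptide, and the score falls off toward the N-terminus.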

The first nematode FLP, AF1, was isolated from Ascaris suum (29), and most

subsequent early nematode FLP work was done on this species (30-36). The FLPs are

highly expressed in nematodes, and thus are likely important chemical components of

their anatomically simple nervous systems (17, 21, 32). Though much work has been

done to elucidate the activities of FLPs, the definitive biological functions of FLPs

(Figure 1-1) are still unknown. All hypothetical FLPs produced by C. elegans are

tabulated in Figure 1-1.

FLP Precursor Proteins

FLPs, like most neuropeptides and hormones, are synthesized as part of larger

precursor proteins and processed in the secretory pathway (37). Peptides on a particular

precursor have conserved regions in the mature peptides that are often associated with

receptor binding and make up a subfamily (18). In C. elegans, 28 different genes

encoding well over 60 possible FLPs have been identified using bioinformatic

approaches (18, 21, 38) and 28 of the putative processed peptides have been detected

biochemically (39-43) (Figure 1-1). Examples of two precursor proteins from two

nematode species are shown in Figure 1-2.

Receptors and Functions

FLPs are involved in a wide range of biological processes that have been reviewed

previously (26, 44-47). Some of the more prominent functional studies have focused on

their role in cardioexcitation (19), muscle contraction (33), modulation of the action of













flp-1a (note a): KPNFMRYG, SAAVKSLG, AGSDPNFLRFG, SQPNFLRFG*, ASGDPNFLRFG*, SDPNFLRFG*, AAADPNFLRFG*, SADPNFLRFG*, PNFLRFG*
flp-1b (note a): KPNFMRYG, SADPNFLRFG*, SQPNFLRFG*, ASGDPNFLRFG*, SDPNFLRFG*, AAADPNFLRFG*, SADPNFLRFG*, PNFLRFG*
flp-1c (note a): KPNFMRYG, SADPNFLRFG*, SDPNFLRFG*, AAADPNFLRFG*, SADPNFLRFG*, PNFLRFG*
flp-2a: LRGEPIRFG, SPREPIRFG
flp-2b: LRGEPIRFG, SPREPIRFG
flp-3 (note b): SPLGTMRFG, TPLGTMRFG*, SAEPFGTMRFG*, NPENDTPFGTMRFG*, ASEDALFGTMRFG*, EDGNAPFGTMKFG, EAEEPLGTMRFG*, SADDSAPFGTMRFG*, NPLGTMRFG
flp-4: PTFIRFG, ASPSFIRFG
flp-5 (note c): PKFIRFG, AGAKFIRFG, GAKFIRFG*
flp-6 (note d): KSAYMRFG*, KSAYMRFG*, KSAYMRFG*, KSAYMRFG*, KSAYMRFG*, KSAYMRFG*
flp-7: TPMQRSSMVRFG, SPMQRSSMVRFG, SPMQRSSMVRFG, SPMQRSSMVRFG, SPMERSAMVRFG, SPMDRSKMVRFG, SSIDRASMVRLG, TPMQRSSMVRFG
flp-8 (note e): KNEFIRFG*, KNEFIRFG*, KNEFIRFG*
flp-9 (note f): KPSFVRFG*, KPSFVRFG*
flp-10: QPKARSGYIRFG
flp-11a (note g): AMRNALVRFG, ASGGMRNALVRFG*, SPLDEEDFAPESPLQG*, NGAPQPFVRFG*
flp-11b (note g): AMRNALVRFG, ASGGMRNALVRFG*, SPLDEEDFAPESPLQG*
flp-11c (note g): AMRNALVRFG, ASGGMRNALVRFG*, SPLDEEDFAPESPLQG*, NGAPQPFG
flp-12: RNKFEFIRFG
flp-13 (note h): AMDSPLIRFG*, AADGAPLIRFG*, APEASPFIRFG*, AADGAPLIRFG*, APEASPFIRFG*, ASPSAPLIRFG*, SPSAVPLIRFG*, SAAAPLIRFG*, ASSAPLIRFG
flp-14 (note i): KHEYLRFG*, KHEYLRFG*, KHEYLRFG*, KHEYLRFG*
flp-15: GGPQGPLRFG, RGPSGPLRFG
flp-16 (note j): AQTFVRFG*, AQTFVRFG*, GQTFVRFG*
flp-17: KSAFVRFG, KSAFVRFG, KSQYIRFG
flp-18 (note k): DFDGAMPGVLRFG*, EMPGVLRFG, SVPGVLRFG*, SVPGVLRFG*, EIPGVLRFG*, SEVPGVLRFG*, DVPGVLRFG, SVPGVLRFG*
flp-19 (note l): WANQVRFG*, ASWASSVRFG
flp-20: AVFRMG, AMMRFG, AMMRFG, SVFRLG
flp-21: GLGPRPLRFG
flp-22 (note m): SPSAKWMRFG*, SPSAKWMRFG*, SPSAKWMRFG*
flp-23a: GALFRSG, WGQQDFLRFG
flp-23b: NSGCPGALFRSG, TKFQDFLRFG
flp-24 (note n): VPSAGDMMVRFG*
flp-25: DYDFVRFG, ASYDYIRFG
flp-26: EFNADDLTLRFG*, GGAGEPLAFSPDMLSLRFG*
flp-27: GLGGRMRFG
flp-28: VLMRFG


Figure 1-1: FMRFamide-Like Neuropeptides predicted from database searches for

precursor proteins. Peptide subfamilies are denoted by the number after "flp"

underlined in red (i.e., peptides produced on the same precursor protein, the

proteins being paralogues). A letter beside the number denotes peptides

derived from an alternate transcript of that precursor (for example, "flp-1a").

Each peptide has a C-terminal glycine (G) in red that is predicted to be

converted to a C-terminal amide group during processing. The conserved RF

motif in peptides that possess it is shown in blue. Peptides that have been

verified to exist using biochemical methods (in their mature C-terminal

amidated forms) are denoted by an asterisk to the right of the peptide. Notes:

Peptides in the noted precursors have been biochemically verified in the

following references: a) (39, 40), b) (39), c) (39), d) (43), e) f) (48), g),

h) (49), i) (41), j) (39), k) (39, 49), l) (39), m) (39), n) (39), o) (39)









flp-18 MRFDDDTTCATTCADKLRTIEVLTGPTRFIQLYCVFFSYFSTTLTFFNYSLHH
afp-1 MVELAAIAVHLFAILCISVSAEIELPDKRAQFDDSFLPYYPSSAFMDSDEAIV

flp-18 LPCFSIFKIVFFVSERADQLCFFLNEKSSSQALKFLPKIESYVYSRLDMQRWS
afp-1 AVPSSKPGRYYFDQVGLDAENAMSARE

flp-18 GVLLISLCCLLRGALAYTEPIYEIVEEDIPAEDIEVTRTNEKQDGRVFS
afp-1

flp-18 KR**DFDGAMPGVLRFGKRGGVWEKRESSVQKKEMPGVLRFGKRAYFDEKKSV
afp-1 KRGFGDEMSMPGVLRFGKR GMPGVLRFGKR ENEKKAV

flp-18 PGVLRFGKRSYFDEKK*SVPGVLRFGKRDVPMDKR*EIPGVLRFGKRDYMADS
afp-1 PGVLRFGKR GDVPGVLRFGKR SDMPGVLRFGKR

flp-18 FDKRSEVPGVLRFGKRDVPGVLRFGKRSDLEEHYAGVLLKKSVPGVLRFGRK
afp-1 *SMPGVLRFGRR


Figure 1-2: The flp-18 and afp-1 precursor proteins from C. elegans and A. suum,
respectively. Peptide sequences red, processing sites blue, denotes amino
acid gap in peptide, denotes a gap in a spacer region. Note that this analysis
is not an alignment (18, 36). The longer peptides for each precursor are
underlined.









morphine (50), egg laying (51) and feeding behavior in nematodes (47). Also, disruption

of the flp-1 gene in C. elegans resulted in a number of phenotypes (52). Two types of

receptors for FLPs have been identified: GPCRs (G-Protein Coupled Receptors) (53-58)

and a sodium channel gated by FMRF-NH2 (59-61). Other human/mammalian ion

channel receptors have been identified whose activities are modulated by FLPs, including

the Acid Sensing Ion Channels (ASICs) and Epithelial Na+ Ion Channels (ENaCs) (24,

62). The best characterized FLP GPCR in nematodes to date is NPR-1. This receptor has

been shown to be involved in feeding and foraging behavior and its endogenous ligands

have been identified as peptides encoded by the flp-18 and flp-21 genes (53). NPR-1's

characterization was greatly aided by the discovery of two natural isolates of C. elegans,

differing by only a single point mutation in the third intracellular loop of NPR-1 (Figure

1-3), which have drastically different feeding behaviors (63). In one case, this position is

a valine and the worms spread out to feed on a bacteria-coated agar plate. In the other

isolate, the same position in NPR-1 is, instead, a phenylalanine and the worms cluster to

feed in areas of high bacteria density on the plate. This is a fascinating story of how

minor changes in individual proteins can give rise to drastic phenotypic changes and

illustrates the importance of at least one function of a particular set of FLPs.

FLPs as Natural Products

The term "Natural Products Chemistry" often brings to mind the plethora of small

non-protein/non-nucleic acid metabolites isolated from nature, usually with some notable

biological function or use to people. However, many proteinaceous and polypeptide

substances definitely meet the requirements of natural products and have proven to be

extremely valuable to mankind. Two classic examples are insulin (used for decades

in forms purified from cow and pig, and as a cloned human form, "Humulin", from Eli Lilly since









1982) and venom toxin peptides from the sea snail genus Conus (conotoxins, or cone

snail toxins) (64, 65). The first conotoxin analogue (Ziconotide; Prialt, Elan

Pharmaceuticals; FDA NDA number 21-060 (66-68)) was recently approved by the

United States Food and Drug Administration (FDA) for use against severe chronic pain.

One of the advantages of these molecules is that they function as non-opioid pain killers,

so they are useful non-addictive alternatives to opioid pain killers such as morphine (69).

FLPs, as naturally occurring bioactive neuropeptides, should certainly be

considered in the realm of natural products. A direct illustration of the analogy is that

one FLP, "conorfamide", has even been isolated from the venom of cone snails (70).

Given the natural origins of FLPs, designing therapeutic agents based on them is

certainly an endeavor rooted in natural products chemistry.

Identification of compounds in natural sources is helpful in elucidating their biological

function and potential additional uses.

Natural Products

Recently, in the age of sequenced genomes, proteomics, and high throughput drug

screening, natural products have taken a back seat, particularly in the drug industry, to

synthetic combinatorial libraries (71-75). In the past, this was logical due to the limited

availability of natural products and quantities needed for full chemical characterization

(76). However, a plethora of recent technologies in nuclear magnetic resonance (NMR)

(77), chromatography (78), mass spectrometry (79), and small scale bioassay screening

technologies have helped to fuel a revisiting of natural products as sources of future

































NPR-1 Phenotypes:
V = Solitary Feeders
F = Social Feeders


Figure 1-3: Schematic diagram of the neuropeptide activated GPCR NPR-1 from C.
elegans. Each circle with a letter in it represents an amino acid along the
polypeptide chain. The red circles represent the natural single amino acid
polymorphism that gives rise to drastically different feeding behaviors.
Yellow circles represent conserved amino acids in the alignment that appears
in the same article as this figure. This figure adapted from de Bono, et al,
1998 (63).









molecules of choice for use in a variety of applications. The amount of material required

for chemical characterization has decreased drastically in recent decades. As late as the

1960's, many milligrams were required to characterize natural products (76, 80). Today,

with much improved analytical techniques at our disposal, natural products chemistry is

certainly poised to make substantial contribution to our knowledge of and ability to

benefit from the chemical complexity of the biological world. To illustrate the advances

that have been made since, the amount of material needed to acquire datasets for various

analytical techniques for a molecule with a mass of about 300 daltons is given in Table 1-

2 from a paper by one of the giants in natural products chemistry, Dr. Jerrold Meinwald

(76). Even though the quantities for various techniques listed in Table 1-2 are small, their

source was published in 2003. Since that year, improvements over these values have

been made in several of the categories, particularly for NMR (77, 81).
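
The relationship between sample mass and number of molecules that underlies Table 1-2 can be checked with a few lines of Python. This is a minimal sketch under the stated assumption of a 300 dalton molecule; the function name is illustrative and the sample masses are arbitrary examples.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_sample(mass_g, molar_mass_g_per_mol=300.0):
    """Number of molecules in a sample of the given mass (grams)."""
    return mass_g / molar_mass_g_per_mol * AVOGADRO


# 50 mg, 50 micrograms, and 50 ng of a ~300 Da compound
for mass in (50e-3, 50e-6, 50e-9):
    print(f"{mass:.0e} g -> {molecules_in_sample(mass):.1e} molecules")
```

Each thousand-fold reduction in mass removes three orders of magnitude from the molecule count, which is the step size between successive rows of the table.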

Purification and characterization aside, natural products represent a more logical

and still largely untapped reservoir within chemical space (71, 72, 76, 82). Part of what

makes natural products so attractive is largely the work of millions of years of evolution.

Issues such as solubility and refined chemical structure (chirality, molecular scaffolds,

etc.) have largely been solved by nature (72, 83). Some of these are illustrated in Table

1-3 (72). LogP is the logarithm of the octanol-water partition coefficient and represents the

solubility/hydrophobicity of organic molecules. The higher the number, the greater

fraction of that molecule will partition into octanol and, thus, the greater its

hydrophobicity. Conversely, the lower the number, the more water soluble the

compound will be.
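
As a worked illustration of how LogP relates to partitioning, the sketch below computes LogP from equilibrium concentrations in the two phases and converts a LogP value into the fraction of compound found in the octanol phase. The function names are illustrative, and equal phase volumes are assumed unless a volume ratio is supplied.

```python
import math


def log_p(conc_octanol, conc_water):
    """LogP = log10 of the octanol/water concentration ratio at equilibrium."""
    return math.log10(conc_octanol / conc_water)


def fraction_in_octanol(logp, volume_ratio=1.0):
    """Fraction of compound in octanol; volume_ratio = V_octanol / V_water."""
    p = 10 ** logp
    return p * volume_ratio / (1 + p * volume_ratio)


# Mean LogP values of the order reported in Table 1-3
for value in (2.4, 4.3):
    print(value, round(fraction_in_octanol(value), 5))
```

A compound with LogP of 0 splits evenly between the phases; each additional LogP unit shifts the octanol/water concentration ratio by a factor of ten.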









Also, the fundamental issue of biological activity and relevance is certainly met by

natural product compounds (72). This is where a good knowledge of biology helps

natural products chemistry substantially. In fact, the field involved with understanding

the functional role of natural products in the ecosystem, Chemical Ecology, has

substantial potential in aiding natural products chemistry in its search for biologically

relevant molecules and their potential function/uses (71, 73, 74). Though some emphasis

has been taken off of natural products chemistry in recent years, the case is certainly clear

that the field is poised for a formidable resurgence.

Dissertation Outline

The main goal of this dissertation is to inspire future researchers to make use of the

techniques and observations described within.

In Chapter 2, analysis is provided of the chemical biodiversity observed in FLPs

from C. elegans and their precursor proteins. I believe that a wise molecular evolutionist

in the future will note the observations I have made in this fascinatingly complex and

diverse protein family. Paired with future functional data for nematode FLPs as they

come online, they will be able to complete the protein family's evolutionary history

reconstruction. This will inevitably provide the fields of molecular evolution,

nematology, and neuroscience with a greater understanding of the nervous system and

behavior of nematodes and possibly other animals.

With Chapter 3, my hope is that future structural biologists will take note of the

value of pH titration NMR experiments in observing otherwise unobservable transient

electrostatic and hydrogen bonding interactions in polypeptides and other organic










Table 1-2: Sample quantities required for analysis using the three most powerful
analytical techniques for chemical structure determination. A plus (+) in the
column below each technique means that the corresponding sample quantities
(left) are applicable to that technique. A minus (-) in a similar position means
that that sample quantity (left) is not sufficient for use with the corresponding
technique (76). A red dot indicates improvement in NMR sensitivity
since 2003 (77).
Sample Size (g)   Number of Molecules   X-ray Crystallography   NMR Spectroscopy   Mass Spectrometry
~300              ~6 x 10^23            +                       +                  +
50 x 10^0         10^23
50 x 10^-3        10^20
50 x 10^-6        10^17
50 x 10^-9        10^14
50 x 10^-12       10^11
50 x 10^-15       10^8
50 x 10^-18       10^5
50 x 10^-21       10^2









Table 1-3: Mean values for a selection of molecular properties among natural, drug, and
synthetic compounds (72). The terms natural products, drugs, or synthetics
describe the type of chemical library that was used to generate the data.
Natural Products Drugs Synthetics

Molecular Weight 300-414 340-356 393

LogP 2.4-2.9 2.1-2.2 4.3

Number of Chiral Centers 3.2-6.2 1.2-2.3 0.1-0.4

Number of N atoms 0.84 1.64 2.69

Number of O atoms 5.9 4.03 2.77

% of rings that are aromatic 31% 55% 80%









molecules. In fact, these types of interactions are likely to provide the fields of structural

biology and protein folding with the next level of detail needed to fully understand how

macromolecules take their native form.

Chapter 4 is a recent, yet exciting and fascinating, "last-minute" addition to my

graduate research and is also a preview of my future goals as a scientist. My interest in

insects has always been a strong passion in my life. Earlier this year (2006) on a whim I

decided to take advantage of my recent access to the 1 mm high temperature

superconducting microsample NMR probe which Dr. Edison had helped to develop and

explore a curiosity I had about some of the insects (Anisomorpha buprestoides) I had

been breeding. This led to the work in Chapter 4. With the ability to examine single

milkings of individual stick insects, we have been able to illustrate an intriguing level of

chemical biodiversity of a single defensive compound produced by these creatures.

Additionally, we were able to identify a component of this secretion not previously

known and are beginning to hypothesize on its biological function. I hope that readers of

this chapter will come away with an appreciation for insects and the field of insect natural

products chemistry and will be inspired to explore the field further.














CHAPTER 2
BIOCHEMICAL PROPERTIES OF FLPS AND THEIR PRECURSOR PROTEINS
FROM THE NEMATODE Caenorhabditis elegans

Introduction

The focus of this study was to understand the patterns of sequence conservation and

variability observed in FMRFamide-Like Neuropeptides (FLPs) and their precursor

protein sequences from the nematode C. elegans. This work is largely observational in

nature, but is motivated by the hypothesis that important information about the function

and evolution of the nematode nervous system can be elucidated by understanding the

origin of the sequence properties observed in these polypeptides. Illustrated here are

various biochemical and sequence properties of FLPs and their precursor proteins that I

believe to be important in understanding this family of neuropeptide genes in C. elegans,

together with speculation on their relevance to FLP molecular evolution.

To my knowledge, no one has published such a study on this gene family. The

only discussion of the non-peptide "spacer" regions of flp precursors I am aware of in the

literature was by Greenberg and Price (20). Only non-nematode sequences were

analyzed, and it was postulated that they may serve to regulate the pH in the secretory

vesicles where FLPs are ultimately processed (37). In this chapter, based on comparisons

of complete flp precursor proteins from nematodes, I postulate that these spacer regions

may interact with peptide and processing site regions in the unprocessed protein to

stabilize their 3-dimensional structure. This is further examined in the Discussion section

of this Chapter. Also, I have found no structural data beyond primary sequence for the









flp precursors and no record of any having been recombinantly expressed or purified in

their unprocessed form. The only work I am aware of on structural properties of

unprocessed peptides illustrated that their processing sites are flexible in solution (84).

To investigate the molecular evolution of these gene products I extended some

analyses to flp precursors in all nematode species that could be identified from available

sequence databases (in collaboration with Dr. Slim Sassi from the laboratory of Prof.

Steven A. Benner). From this effort, I was able to collect 334 unique flp precursor

sequences from 38 different nematode species. With these, we attempted to reconstruct

ancestral sequences for each flp subfamily (30 in total) to aid in the reconstruction of the

evolutionary history of all 28 flp precursor proteins so far identified from C. elegans.

This effort was unsuccessful and will be discussed in the Discussion section of this

Chapter. Additionally, methods that may prove to be more useful in addressing this

problem are discussed in the "Future Directions" section of Chapter 5 of this Dissertation.

Experimental Methods

Data Mining for Nematode flp Precursor Protein Sequences

The flp precursor protein sequences used in this study were obtained from the

NCBI Entrez Protein databases (which can be accessed at:

http://www.ncbi.nlm.nih.gov/BLAST/) using accession numbers from a previous study

that identified hundreds of FLP peptides encoded by nematode ESTs (Expressed

Sequence Tags) (21), other similar studies (18), and from BLAST searches using those

sequences and the theoretical mature peptides which they contained. For protein

BLASTs, whole C. elegans flp precursors were used in "protein-protein" searches (blastp),

varying parameters around default settings (for example: expectation of 10, word size of

3, and a BLOSUM62 matrix, useful for weaker alignments). Also, each of the theoretical









peptides was used individually in "Short, nearly exact match" searches, varying

parameters around default settings (for example: expectation of 20,000, word size of 2,

and a PAM30 matrix, useful for shorter sequences). For ESTs, translated EST searches

(tblastn) were performed with both individual FLP peptides and whole precursors using

parameters previously described (21): searching only Phylum Nematoda, searching the

"estothers" NCBI database, expectation of 10,000, and a BLOSUM62 matrix.

Accession numbers for sequences used from the databases are shown in Appendix A.

EST hits were translated into protein using Expasy's TRANSLATE tool

(http://ca.expasy.org/tools/dna.html).
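
The translation step can be sketched in a few lines of Python. This is not the Expasy tool itself, only a minimal stand-in that applies the standard genetic code in all six reading frames; the function names and the example DNA string are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame(dna):
    """Translations of the three forward and three reverse frames."""
    rc = reverse_complement(dna)
    frames = {}
    for f in range(3):
        frames[f"+{f + 1}"] = translate(dna, f)
        frames[f"-{f + 1}"] = translate(rc, f)
    return frames


# Hypothetical example sequence encoding M-K-R-F-G followed by a stop codon
example = "ATGAAACGTTTTGGATAA"
print(six_frame(example)["+1"])
```

Translating every frame matters for ESTs because the reading frame and strand of a database hit are not known in advance.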

In this chapter, a few terms and symbols that require definition are used. The term

orthologue will refer to a particular protein or gene that is the same protein or gene from

another species (inferred by alignment). For example, all flp-1 proteins from various

nematode species are orthologues of one another. Paralogue will be used to denote a

gene or protein homologous to another within the same species. For example, flp-1 is a

paralogue of flp-18 in C. elegans. This nomenclature is well established in the field of

molecular evolution. For different types of sequences, the following symbolism is used:

FMRFamide-like neuropeptide(s) = FLPs, precursor protein(s) = flp(s), and DNA

sequence(s) coding for flp precursor(s) = flp(s).

True flp orthologues were recognized by regions of recognizable homology to the

theoretical processed peptide in the query sequence and by being flanked by canonical

mono- or dibasic processing sites. The longest reasonable EST available for a particular

precursor was used to represent that precursor from that particular species. Sequences

that were clearly too long, were made of multiple concatenated ESTs, or contained sequence errors in key









places (early stop codons, etc.) were ignored. Some sequences were clearly unique but

seemed to be orthologues of other flp precursors from the same species. No previous

record of more than one copy of a flp orthologue is known, but alternate transcripts of

some have been previously identified. Thus, these sequences are assumed to be alternate

transcripts.

Translated ESTs were truncated where there was a sequence error or stop codon.

The nomenclature used for flp precursors is as follows: flp_#x_yz(s) where # is the

number given to a specific flp precursor orthologue group (for example, flp-1) designated

in the literature, x is a letter indicating a flp protein resulting from a different alternate

transcript when more than one was found in the databases, y is the first letter of the genus

of the species from which the sequence came, z is the first letter of that species, and in

some cases a third letter (denoted here as "s") is used when more than one species have

the same genus-species initials. For example: flp_1a_ce denotes alternate transcript "a"

of flp-1 from C. elegans. As an additional example, flp_1_aca denotes flp-1 from

Ancylostoma caninum. In this case, the third letter in the genus/species initial

distinguishes that protein from flp_1_ace, which is flp-1 from Ancylostoma ceylanicum.

This nomenclature was used so that in text editor programs the names would be grouped

by orthologue (or subfamily, i.e., all flp-23's) rather than by species. Also, the term

subfamily will refer only to predicted mature processed peptides that occur on the same

precursor. For example, all of the peptides produced on the flp-18 precursor will be

called the FLP-18 subfamily.
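
The naming scheme above can be expressed as a small helper. This is a sketch only: the function name is hypothetical, and taking the second letter of the species name as the optional third initial is an assumption inferred from the two Ancylostoma examples rather than a stated rule.

```python
def flp_name(number, genus, species, transcript="", disambiguate=False):
    """Build a precursor name of the form flp_#x_yz(s)."""
    initials = genus[0] + species[0]
    if disambiguate:
        # Assumed rule: add the second letter of the species name.
        initials += species[1]
    return f"flp_{number}{transcript}_{initials}".lower()


names = [
    flp_name(18, "Caenorhabditis", "elegans"),
    flp_name(1, "Ancylostoma", "ceylanicum", disambiguate=True),
    flp_name(1, "Caenorhabditis", "elegans", transcript="a"),
    flp_name(1, "Ancylostoma", "caninum", disambiguate=True),
]
print(sorted(names))
```

Because the orthologue number comes first in the name, a plain alphabetical sort keeps names that share the same flp prefix next to each other.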

Alignment and Phylogenetic Analysis of flp Precursor Proteins

To attempt to build a molecular phylogeny and reconstruct the evolutionary history

of the 28 flp precursors known in C. elegans, we employed some novel techniques









developed by Dr. Slim Sassi and myself. The general method involved generating

ancestral sequences for all 28 precursors and using them for phylogenetic reconstruction.

First, all orthologues of the 28 C. elegans flp precursors from various nematode species

were compiled. Then, they were grouped in orthologous groups and aligned using

ClustalW (85) and manual alignment by eye using the program Bioedit (86, 87)

(Appendix B). Preference was given to aligning predicted peptide regions. Where this

was ambiguous (with the repetitive nature of these peptide sequences), processing sites,

regions immediately flanking those, and spacer regions between predicted peptides were

used. A complete phylogenetic reconstruction will not be shown in this Chapter due to

reasons described in the Discussion section.

Analysis of Biochemical Properties of flp Precursor Proteins and Figure Generation

In order to analyze various sequence motifs in flp precursor proteins, several graphs

were constructed and consolidated into Figure 2-1. For illustrating patterns of sequence

repetition, two dimensional dotplots were employed. The dotplots in Figure 2-1 were

made by comparing each flp precursor from C. elegans to itself using the program

Bioedit (86, 87). The program produces a dot when amino acids are the same in two

sequences being compared. Thus, when both sequences are the same a diagonal of dots

is generated; off-diagonal dots represent sequence repetition. The upper threshold limit

was set at 10 and the lower set at 5. Using a threshold between 3 and 5 had little effect

on the resulting plots. This plot was inserted into Microsoft Powerpoint (Powerpoint).

In this slide, the other components of each panel of Figure 2-1 corresponding to a

particular flp precursor were added. The color coded precursor annotation graphic was

produced in Powerpoint. The resulting object's width was scaled to align the predicted

repeated peptide units and processing sites with their off-diagonal regions of the dotplot.
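
The idea behind the self-comparison dotplots can be sketched as follows. This is not the algorithm used by Bioedit; it is a simplified stand-in that follows the description in the Figure 2-1 caption, where a dot marks a run of 5 adjacent amino acids that occurs more than once in the protein. The function name and the test sequence are illustrative assumptions.

```python
from collections import defaultdict


def repeat_dots(seq, window=5):
    """Off-diagonal dots for a sequence compared with itself.

    Returns sorted (i, j) pairs with i < j such that the `window`
    residues starting at i are identical to those starting at j.
    """
    starts = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts[seq[i:i + window]].append(i)
    dots = []
    for positions in starts.values():
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                dots.append((positions[a], positions[b]))
    return sorted(dots)


# Hypothetical fragment with one repeated peptide unit plus processing site
dots = repeat_dots("MPGVLRFGKRAAAPGVLRFGKR")
print(dots)
```

Consecutive pairs such as (1, 13), (2, 14), and so on form a diagonal line segment parallel to the main diagonal, which is how a repeated peptide unit appears in the plots.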









For signal sequence prediction, the online tool SignalP (88-90) was used to analyze the

first 100 amino acids of each flp precursor sequence. Results from the hidden Markov

model (HMM) cleavage site prediction were used to determine the C-terminal end of the

signal sequence (90). To determine intron/exon boundary positions in the precursor

proteins, BLAT searches using the UCSC C. elegans genome browser

(http://genome.brc.mcw.edu/cgi-bin/hgBlat?hgsid=148945) were run, requesting exons

to be in upper case and introns to be in lower case in the query results. Exons were then

translated using the Expasy Translate tool (http://www.expasy.org/tools/dna.html) and

compared to the published protein sequences to determine the intron/exon positions. The

theoretical pI for each precursor was calculated using the PeptideMass tool, which is

available for use on the web at: http://www.expasy.org/tools/peptide-mass.html (91, 92).

The charge plot for each panel of Figure 2-1 was created using Microsoft Excel with

comma-delimited protein sequences. The charged amino acids were counted

using the command "COUNTIF" in Excel to count arginine and lysine residues as +1 and

aspartate and glutamate residues as -1. All other amino acids were given a charge of 0.

The unstructured protein propensity plots were made using the online IUPRED tool using

the "short disorder" prediction method (93, 94). The raw data from this tool was copied

and pasted into Excel and plotted as a line plot. This plot was edited in CorelDRAW and

Powerpoint. Care was taken to maintain the same scale among graphs from all flp

precursors in the final figure.
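
The per-residue charge assignment behind the charge plots is simple enough to sketch outside a spreadsheet. The Python below is an equivalent of the COUNTIF-based scoring described above, not the original Excel workbook; the function names are illustrative and the example is a short stretch of the flp-18 precursor from Figure 1-2.

```python
# +1 for arginine and lysine, -1 for aspartate and glutamate, 0 otherwise
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_track(seq):
    """One charge value per residue, in sequence order."""
    return [RESIDUE_CHARGE.get(aa, 0) for aa in seq.upper()]


def net_charge(seq):
    """Sum of the per-residue charges."""
    return sum(charge_track(seq))


fragment = "KRDFDGAMPGVLRFGKR"
print(charge_track(fragment))
print(net_charge(fragment))
```

Plotting the track as bars (positive up, negative down) gives the layout described for the charge plots in Figure 2-1.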

Analysis of FLP Mature Peptide Biochemical Properties

Peptides were predicted from precursors for all nematode species by 1) their

homology to previously published FLP sequences, 2) being flanked by mono- or dibasic

processing sites, 3) by possessing a C-terminal glycine residue, and 4) being in a region









of the protein near where other FLPs were observed (i.e., not within the signal sequence or

far away from the other peptides in the primary sequence). These sequences were

excised from the precursor proteins and subjected to further analysis. In Figures 2-5 and

2-6, all FLPs from C. elegans are tabulated from one transcript of each precursor gene.

Where more than one alternate transcript is known, the longest precursor is used.
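
Two of the prediction criteria above (flanking basic processing sites and a C-terminal glycine) lend themselves to a pattern search. The sketch below is a deliberately simplified illustration and not the procedure used in this study, which also relied on homology and position in the precursor: it assumes two basic residues immediately upstream, no internal pair of basic residues, a length of 4 to 21 residues, and at least one basic residue downstream. The pattern, function names, and length limits are all assumptions.

```python
import re

# Candidate peptide: preceded by two basic residues (K/R), containing no
# internal pair of basic residues, ending in G, followed by a basic residue.
CANDIDATE = re.compile(
    r"(?<=[KR]{2})((?:(?![KR]{2})[A-Z]){3,20}?G)(?=[KR])"
)


def predict_peptides(precursor):
    """Return candidate glycine-extended peptides in sequence order."""
    return [m.group(1) for m in CANDIDATE.finditer(precursor)]


def mature_form(peptide):
    """Replace the C-terminal glycine with an amide, written as -NH2."""
    return peptide[:-1] + "-NH2"


# Stretch of the C. elegans flp-18 precursor (Figure 1-2, gap symbols removed)
fragment = "KRDFDGAMPGVLRFGKRGGVWEKRESSVQKKEMPGVLRFGKRAYFDEKKSVPGVLRFGKR"
peptides = predict_peptides(fragment)
print([mature_form(p) for p in peptides])
```

On this fragment the pattern recovers the first three flp-18 peptides listed in Figure 1-1 and skips the spacer segments, but peptides behind monobasic sites would be missed by such a strict pattern.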

Barplots of peptide charge and length for C. elegans FLPs were made in Excel with tab

delimited peptide sequences. The charged amino acids and the N-termini of the peptides

were counted using the command "COUNTIF" as described above for the precursors.

These were summed for each peptide to give their total theoretical charge at pH 7. For

the peptide length plots, the peptides were aligned in the spreadsheet anchored at C-

termini (with no gaps). The "COUNTIF" function was used for each column of letters

(representing a position in the peptide "alignment") to count all peptides which had a

letter (COUNTIF for each of the 20 possible amino acid single letter codes) in that

column. This resulted in a row of numbers, each giving the count of peptides at least that

long. Then, in another row, each value in the previous row was subtracted from the

value to the right of itself. This gave the true numbers for peptides that were exactly that

length, their N-terminus truncated at that position with counting starting at the C-

terminus. These figures aided in the observations, analyses, and conclusions presented

below.
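
The charge and length calculations described above can be reproduced with a short script. This is a sketch of the same arithmetic rather than the original spreadsheet: the function names are illustrative, the input peptides are assumed to be mature forms with the C-terminal glycine already removed, and the amidated C-terminus is treated as uncharged.

```python
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def peptide_charge(peptide):
    """Theoretical charge at pH 7: side chains plus a free N-terminus (+1)."""
    return 1 + sum(RESIDUE_CHARGE.get(aa, 0) for aa in peptide)


def length_distribution(peptides):
    """Number of peptides of each exact length.

    First count the peptides that have a residue at position k from the
    C-terminus (length >= k), then take differences of adjacent counts.
    """
    longest = max(len(p) for p in peptides)
    at_least = [sum(len(p) >= k for p in peptides)
                for k in range(1, longest + 1)]
    at_least.append(0)  # no peptide is longer than the longest one
    return {k: at_least[k - 1] - at_least[k] for k in range(1, longest + 1)}


examples = ["KSAYMRF", "PNFLRF", "SDPNFLRF"]
print([peptide_charge(p) for p in examples])
print(length_distribution(examples))
```

The difference step is what converts the cumulative counts into the number of peptides of exactly each length.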

Results

Analysis of Biochemical Properties of flp Precursor Proteins

The patterns of amino acid sequences observed in the FMRFamide-like

neuropeptide family and their precursor proteins as a whole have intrigued our research

group for some time (26, 36). To analyze various properties of these sequences that










[Figure 2-1, panels for flp-1a, flp-2a, flp-3, and flp-4; graphics not reproduced.
pI values legible in the panels: flp-1a, 9.87; flp-2a, 10.39; flp-3, 10.11; flp-4, 8.18.
Accession numbers shown: AAB22368, CAC42354, AAC08940, CAA88434.]

Figure 2-1: Graphical illustrations of various chemical properties of FMRFamide-like
Neuropeptide precursor proteins in C. elegans. 2D Dotplots: Each dotplot
provides a sequence comparison of the repetitive elements of a flp precursor
protein. When both dimensions of a dotplot are the same sequence a diagonal
of dots results. With the threshold set to 5, off-diagonal dots represent places
with 5 adjacent amino acids that are repeated more than once in the protein.
Graphical Annotation of flp protein sequences: Legend: Purple Rectangles
= predicted neuropeptide sequences, Red Rectangles = predicted processing
sites, = Glycine residue predicted to be posttranslationally
converted to a C-terminal amide group in the mature peptide, Grey
Rectangles = predicted signal peptide sequences (88-90), and Black
Rectangles = regions of unknown function or "spacer regions". Red dots
contained within purple rectangles indicate a peptide that has been
biochemically characterized in the literature (see also Figure 1-1 in Chapter 1
for corresponding references). Asterisks above a processing site indicate that
the sequence is KR (the most common processing site observed in flp
precursors). An arrow below the graph corresponds to an intron/exon
boundary (ie: protein region corresponding to an RNA splicing site); Charge
Plots of flp protein sequences: In the barplot below the graphical protein
annotation, bars pointing up and blue correspond to positively charged amino
acids at pH 7.0 (arginine and lysine). Bars pointing down that are red
correspond to negatively charged amino acids at pH 7.0 (aspartic acid and
glutamic acid); Unstructured propensity plots of flp protein sequences: These
plots are shown below the charge plots. The scale of these plots goes from 0
to 1. The horizontal axis shown is placed at 0.5. Values above the line denote
regions with a propensity to be unstructured and values below the line denote
regions predicted to be structured (93, 94).










Figure 2-1 Continued

[Panels for flp-5 through flp-28; graphics not reproduced. pI values legible in the panels:
Panel     pI
flp-5     10.76
flp-6     9.57
flp-7     11.31
flp-8a    8.60
flp-9     9.76
flp-10    10.50
flp-11a   10.34
flp-12    8.68
flp-13    11.77
flp-14    7.16
flp-15    9.52
flp-16    10.14
flp-17    9.24
flp-18    9.91
flp-19    5.13
flp-20    7.89
flp-21    5.61
flp-22    10.42
flp-23a   8.14
flp-24    9.10
flp-25    8.84
flp-26    9.69
flp-27    8.14
flp-28    6.53
Accession numbers shown: CAA94786, AAC69107, CAD21627, AAK68683, NP_501306, AAK39250, CAA93480, AAA96196, AAB88376, CAA21533, CAB05022, CAE17795, NP_503051, NP_508514, CAA90690, NP_509574, AAB37072, CAB03086, AAY18633, AAB70310, CAE54900, AAM51536, AAK31451, CAE17946.]

would aid in alignments and phylogenetic analyses, as well as suggest functional

properties of these polypeptides, plots illustrating functional motif arrangement

(hydrophobic signal sequences, predicted peptides, and predicted processing sites),

intron/exon boundaries, charged amino acid distribution, and regions of predicted

unstructured propensity were compiled. These plots are shown in Figure 2-1.

Some features are common to nearly all of the flp precursors in C. elegans. First,

the overall arrangement in nearly all of the precursors, from N- to C-terminus, is as

follows: 1) N-terminal hydrophobic signal sequence, 2) spacer region of unknown

function, and 3) a C-terminal region rich in predicted FLP peptides. This paradigm holds

true for the flp 1-3, 7-9, 12-22, and 24-27 precursors (Table 2-1, column A). The

exceptions are as follows: For flp-4, 5, and 20, the most N-terminal amino acids do not

fall within the region predicted by SignalP to be the hydrophobic signal sequence. This

could be due to several potential causes such as an artifact of the SignalP prediction

method, an incorrectly predicted start site for the transcripts in the database, or a signal

sequence that is truly not N-terminal. For flp-6, 10, 11, 23, and 28, there is no

substantially long spacer region between the first predicted FLP peptide and the N-

terminal signal sequence.

Also, many of the flp precursors end in a FLP peptide (with or without a C-terminal

basic processing site). The exceptions to this are: flp-2, 7, 10, 11, 15, 16, 22, 23, 24, 27,

and 28. Again, this could be an artifact of the stop codon chosen for the protein sequence

in the database. One striking feature of some precursors that do end in a processing site

is the sequence of that site. KR is certainly the most common processing site within

these precursors, and this is true among many families of neuropeptides ((95), Appendix









Table 2-1: Sequence patterns and properties common to various groups of flp precursor
proteins in C. elegans. An X or other annotation means that the precursor for
that row has the property in column: A) Precursors having the motif: N-
terminal signal sequence, large spacer region, C-terminal neuropeptide rich
region, B) the precursor protein ends in RK or K, C) N-terminal most
predicted peptide farther from the others in the sequence than they are to one
another, D) non-peptide spacer regions between nearly all pairs of FLPs, E)
several peptides occurring in tandem, F) large spacer between the signal
sequence and first FLP peptide that is rich in acidic residues, G) alternating
spacer regions between peptides that are rich in acidic residues, H) protein
isoelectric point (pI) less than 8.0, I) much of the protein predicted to be
folded by the IUPRED tool, J) intron/exon boundaries that occur in conserved
regions of FLP peptides. A dash (-) marks a property that is absent. See also
Figure 2-1 for graphical illustrations of these sequence patterns for individual
precursor proteins.

Precursor  A   B   C   D   E   F   G   H   I   J
flp-1      X   K   X   -   X   X   -   -   X   X
flp-2      X   -   -   -   -   X   -   -   -   X
flp-3      X   K   X   -   X   X   -   -   -   X
flp-4      -   K   -   -   -   X   -   -   X   -
flp-5      -   -   -   -   -   -   X   -   X   -
flp-6      -   -   -   X   -   -   X   -   -   X
flp-7      X   -   -   -   X   X   -   -   -   X
flp-8      X   -   -   X   -   -   X   -   X   -
flp-9      X   RK  -   -   -   -   X   -   X   -
flp-10     -   -   -   -   -   -   X   -   X   -
flp-11     -   -   -   -   X   X   -   -   X   X
flp-12     X   RK  -   -   -   X   -   -   X   -
flp-13     X   RK  -   -   X   X   -   -   X   X
flp-14     X   RK  -   -   X   X   -   X   X   X
flp-15     X   -   -   -   -   X   -   -   X   -
flp-16     X   -   -   -   X   X   -   -   X   -
flp-17     X   K   -   X   -   -   X   -   X   -
flp-18     X   RK  X   X   -   -   X   -   X   X
flp-19     X   -   -   -   -   X   -   X   X   -
flp-20     X   -   -   -   X   X   -   X   X   -
flp-21     X   -   -   -   -   X   -   X   X   -
flp-22     X   -   -   -   X   X   -   -   -   X
flp-23     -   -   -   -   -   -   -   -   X   X
flp-24     X   -   -   -   -   X   -   -   X   -
flp-25     X   K   -   -   -   -   X   -   X   -
flp-26     X   RK  -   -   -   -   -   -   X   -
flp-27     X   -   -   -   -   -   -   -   X   X
flp-28     -   -   -   -   -   X   -   X   X   X









B). However, many of the flp precursors end in RK (flp-9, 12-14, 18, and 25) or K (flp-1,

3, 4, 17, and 26) (Table 2-1, column B). Here too, errors in the predicted protein

sequences deposited in the database may explain why more flp precursors were not

observed to end in these seemingly conserved C-terminal sequence motifs.

Details of specific findings from Figure 2-1 are presented in the following

subsections.

Sequence Repetition Patterns

One of the better known properties of the flp precursor protein sequences (across

most phyla) is their repetitive nature. Many (possibly most) flp precursor proteins

contain several copies of C-terminally related neuropeptides that make up a subfamily.

Though most of the repeated sequence patterns are due to the conserved predicted

peptides, some of the peptides predicted by this study to be processed and secreted are

non-canonical and thus are not represented as repeated units in the dotplots of Figure 2-1.

Examples of this occur in the flp-1, 11, and 23 precursors. Thus, I studied the flp

precursor proteins from C. elegans for commonality in their patterns of sequence

repetition as illustrated by the dotplots.

For some precursors, the N-terminal-most FLP peptide unit is relatively distantly

spaced from the others in the protein primary sequence. This seems to be the case for

the flp-1, 3, and 18 precursors (Table 2-1, column C). Our research group has

previously postulated that the N-terminal most peptides on flp-18 and related precursor

proteins in other nematodes are unique based on their position in the precursor and length

(96). These common sequence patterns suggest possible evolutionary homology.









Another pattern of sequence repetition present in several flp precursors is the

presence of spacer regions between most pairs of FLP peptides. In such cases, no more

than two peptides occur in tandem. Precursors showing this pattern of repetition include

flp-6, 8, 17, and 18 (Table 2-1, column D).

For other flp precursors, multiple (3 or more) repeated regions corresponding to

predicted neuropeptides occur in tandem, separated only by processing sites. This pattern

results in what resembles a solid block of off-diagonal spots in the dotplots for those

precursors (Figure 2-1). This is the most common pattern of sequence repetition

observed in C. elegans flp precursors, and is reminiscent of the earliest sequenced

FMRFamide precursors (20). C. elegans precursors with this pattern of repetition include

flp-1, 3, 7, 11, 13, 14, 16, 20, and 22 (Table 2-1, column E).
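
The tandem-repeat pattern described above can be illustrated with a minimal self-comparison dotplot. This is only a sketch and is not the dotplot method used for Figure 2-1: the function name, the exact-match window of 4 residues, and the toy sequence (three copies of the flp-6 unit KSAYMRFG from Figure 2-4 joined by hypothetical KR processing sites) are all assumptions made for illustration.

```python
def self_dotplot(seq, window=4):
    """Return (i, j) pairs, i < j, where identical windows start at i and j."""
    hits = []
    n = len(seq) - window + 1
    for i in range(n):
        for j in range(i + 1, n):
            if seq[i:i + window] == seq[j:j + window]:
                hits.append((i, j))
    return hits


# Toy sequence, NOT a real precursor: three peptide units joined by KR sites.
toy = "KR".join(["KSAYMRFG"] * 3)
hits = self_dotplot(toy)

# Offsets between matching windows reveal the repeat period.
offsets = sorted({j - i for i, j in hits})
print(len(hits), offsets)
```

For this toy sequence every hit lies at an offset of 10 or 20 residues, the repeat period and its multiple. Plotted as points (i, j), such hits form the block of off-diagonal spots that tandem peptide copies produce.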

Charge Distribution

One of the more suggestive patterns in nematode flp precursor proteins observed by

this study is the distribution of charged amino acids. Peptide and processing site regions,

as expected, are together rich in positively charged (basic) amino acids. However, the

spacer regions tend to be rich in negatively charged (acidic) amino acids. These regions

are of unknown function and occur outside of the peptide, processing site, or hydrophobic

signal sequence regions. It has been previously reported that spacer regions in flp

precursors of other phyla also tend to be rich in acidic residues (20). However, no

published work to date has examined these sequences in as much detail as this Chapter.

For some precursors there is a distinct spacer region (immediately after the

hydrophobic signal sequence) rich in negatively charged amino acids followed by a

region of only FLP neuropeptides and processing sites (being rich in positively charged

amino acids). This is the case for the flp 1-4, 7, 11-16, 19-22, 24, and 28 precursors









(Table 2-1, column F). In other cases, spacer regions alternate with peptide/processing

site regions. This is the case for the flp-5, 6, 8, 9, 10, 17, 18, and 25 precursors

(Table 2-1, column G). In the case of flp-23, the predicted hydrophobic signal

sequence is immediately followed by a processing site and a FLP neuropeptide, which is

unique among the C. elegans precursors analyzed here.

Having made these observations, I became interested in determining the reason

behind the apparently biased charge distribution in these proteins, particularly in the

spacer regions. To determine the extent to which the charges are balanced in these

precursors, I calculated the theoretical pI for each protein. The pI values of the flp

precursors, with the exception of flps 14, 19-21, and 28 (Table 2-1, column H), are all

greater than 8.0, indicating a net positive charge at neutral pH. Thus, positively charged

residues prevail overall.

Notably, flps 21 and 28 only contain one FLP peptide each, resulting in less evolutionary

pressure to possess numerous positive charges in FLP neuropeptides and processing sites.

The implications of charge compensation in flp precursors will be covered in the

Discussion section of this chapter.
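
A theoretical pI of the kind discussed above can be sketched as the pH at which the summed Henderson-Hasselbalch charges of all ionizable groups equal zero. The program used for the values in Figure 2-1 is not named in this section, so the following is only an illustration: the pKa values, the function names, and the two test sequences (a flp-6 unit with a KR site, and an invented acidic stretch) are assumptions.

```python
# Illustrative pKa values (assumed); published pKa scales differ slightly.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def net_charge(seq, ph):
    """Net charge of a linear polypeptide with free termini at the given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def theoretical_pi(seq, tol=1e-4):
    """Bisect on pH; net charge decreases monotonically as pH rises."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


basic_seq = "KSAYMRFGKR"   # flp-6 peptide unit plus a KR site (toy fragment)
acidic_seq = "DEEDAEDS"    # invented acidic stretch, for contrast only
basic_pi = theoretical_pi(basic_seq)
acidic_pi = theoretical_pi(acidic_seq)
print(round(basic_pi, 2), round(acidic_pi, 2))
```

A pI above 7 means the net charge at pH 7 is positive, which is the sense in which precursors with pI values above 8.0 are described as carrying a net positive charge.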

Unstructured Propensity

Members of our laboratory have long thought that 3-dimensional structural motifs

could be beneficial or even required for functional interactions of these proteins with

their processing enzymes ((84), and personal correspondence with Dr. Arthur Edison and

Dr. Cherian Zachariah). The striking patterns of charge compensation observed in C.

elegans flp precursor proteins led me to further probe the precursor sequences for

potential structural properties. To do this, I used the IUPRED tool (93, 94) to plot the

unstructured propensities for each of the 28 precursor protein sequences. This method is

based on the energies of potential contacts between pairs of amino acids in a protein. The









Y axis scale of the plots shown in Figure 2-1 goes from a minimum of 0 (complete order)

to a maximum of 1 (complete disorder), and the X axis is simply the protein sequence.

The horizontal line is for a Y value of 0.5. The greater the propensity for a region of a

protein to be unstructured, the greater the Y value of that region on the plot. The

interpretation of the data given by the authors of the IUPRED method is that all protein

regions with values greater than 0.5 (above the line in Figure 2-1) are predicted to be

disordered and those below 0.5 ordered (93, 94).
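
The 0.5 cutoff described above can be applied to a per-residue score profile as in the following minimal sketch. The score list is invented for illustration and is not IUPRED output; the function names are hypothetical and no IUPRED code or API is called.

```python
def disordered_segments(scores, cutoff=0.5):
    """Return (start, end) index pairs, end exclusive, of runs scoring above cutoff."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > cutoff and start is None:
            start = i
        elif score <= cutoff and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


def fraction_disordered(scores, cutoff=0.5):
    """Fraction of residues whose score exceeds the cutoff."""
    return sum(score > cutoff for score in scores) / len(scores)


# Invented per-residue scores (0 = complete order, 1 = complete disorder).
scores = [0.05, 0.10, 0.20, 0.60, 0.80, 0.70, 0.40, 0.30, 0.55, 0.90]
segments = disordered_segments(scores)
fraction = fraction_disordered(scores)
print(segments, fraction)
```

A summary such as the fraction of residues above the cutoff is one possible way to decide whether "much of the plot" lies below 0.5 for a given precursor; the criterion used for column I of Table 2-1 may have been a visual one.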

For all flp precursors analyzed, the hydrophobic signal sequence is predicted by

IUPRED to have a maximal propensity to be ordered. This is likely due to the highly

hydrophobic nature of these sequences, and unrelated to charge compensation. Also, the

signal sequences are predicted to be cleaved from the rest of the protein and would not

contribute to its structure after that event.

For most of the flp precursors, much of the plot for the sequence lies below the 0.5

cutoff value. This is true for the flp-1, 4, 5, 8-21, and 23-28 precursors (Table 2-1,

column I). Those that show a predominantly unstructured propensity (outside the

hydrophobic signal sequence) include flp-2, 3, 6, 7, and 22. There appears to be no

correlation between which precursors show unstructured propensity and other chemical

properties observed (such as charge distribution).

To compare the IUPRED results for C. elegans flp precursors with those of

proteins which have known amounts of structure, I chose several proteins and show their

analyses in Figure 2-2.









[Figure 2-2 panels. Structured proteins: YprA; ubiquitin (human); lysozyme (hen egg
white). Unstructured proteins: IA3; MARCKS (human); alpha-synuclein (common canary).]


Figure 2-2: Examples of known natively structured and unstructured proteins analyzed
by IUPRED (93, 94). The X axis is the sequence of the protein and Y axis is
IUPRED score for unstructured propensity. Blue arrows indicate the
maximum score for unstructured propensity and red arrows indicate the
minimum score. All portions of the graph below the X axis are predicted to
be structured and regions above to be unstructured. References for these
proteins: (97-108)











A brief summary of the proteins used for Figure 2-2 is as follows:

Structured proteins: Yeast Proteinase A (YprA) is a globular aspartic proteinase

from the yeast Saccharomyces cerevisiae whose crystal structure has been solved (100,

101). Ubiquitin is a folded protein that, when covalently attached to misfolded proteins and

others targeted for degradation, acts as a signal for the tagged protein to be shuttled to the

proteasome and subsequently degraded (97-99). Lysozyme is a folded protein

component of the innate immune system in animals that is involved in destroying

bacterial pathogens (104-106).

Unstructured proteins: Inhibitor of Yeast Proteinase A from fraction 3 (IA3) is an

endogenous inhibitor of YprA in S. cerevisiae that has been shown to be natively

unstructured in solution (107). Myristoylated Alanine-Rich C-Kinase Substrate

(MARCKS) is a protein involved in a wide variety of possible functions and has been

shown previously to be unstructured in solution (108). Alpha-synuclein is a highly

conserved presynaptic protein that has been implicated in Parkinson's disease and other

types of dementia. It has been shown to be natively unstructured in solution with a strong

propensity to form helices when bound to phospholipid membranes (102, 103, 109).

It is not the intent of the work in this Chapter to exhaustively verify the IUPRED

methodology. However, to interpret the IUPRED results for the flp precursors, a few

points about the non-flp proteins analyzed are worth noting. First, the N-terminus of IA3

forms an alpha-helix when it binds to its native binding partner, Yeast Proteinase A (101,

107). This N-terminus is also thought to have a native propensity toward a low but

detectable population of alpha helix (based on work submitted by Omjoy Ganesh, and his









collaborators). This could be a reason for the apparent "dip" in the N-terminal half of the

plot for this protein in Figure 2-2.

Second, the slight dip in the middle of the plot for MARCKS corresponds to one of

that protein's important regulatory domains. This domain, called the phosphorylated site

domain (PSD), forms a more compact structure when phosphorylated (110).

Finally, for alpha-synuclein, as stated previously, it has been shown to become

alpha-helical upon binding to lipids. This protein also is highly repeated (like many of

the flp precursors in nematodes and other phyla), with 11 imperfect repeated units

conserved among vertebrates (103, 109). These points about the proteins analyzed in

Figure 2-2 may indicate that IUPRED is able to detect a structural predisposition that is

stabilized by interactions with other biological molecules. How the IUPRED results for

these proteins compare with those of the flp precursor proteins from C. elegans is

discussed in the Discussion section of this Chapter.

Other Features

One intriguing feature of the flp precursor genes is the position of some of their

intron/exon boundaries. The end sequences of many introns are well conserved in

eukaryotes, with a large number having a GT at the 5' end and an AG at the 3' end (111,

112). This is probably so that the RNA splicing sites are recognized by the RNA splicing

machinery. However, in flp precursors, I have noticed that the ends of exons are also

conserved and often occur in regions corresponding to FLP neuropeptide coding regions.

This occurs in the flp 1-3, 6, 7, 11, 13, 14, 18, 22, 23, 27, and 28 precursors (Table 2-1

column J and Figure 2-1). In fact, known alternate transcripts of the flp-1 and flp-11

precursors exist. These result in production of different FLP neuropeptides (113).

Additionally, for flp-23, previous FLP neuropeptide predictions in the literature disagree









as to which FLP neuropeptide is actually produced by the flp-23 gene, based on different

predicted transcripts (16, 21, 114). These observations illustrate the potential importance

of intron/exon boundaries and RNA splicing and their effect on FLP neuropeptide

diversity.
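
The GT...AG convention mentioned above, and the way splicing joins exons, can be sketched as follows. The nucleotide sequences are invented toy data rather than flp gene sequences, and the function names are hypothetical.

```python
def is_canonical_intron(intron):
    """True if the intron starts with GT (5' end) and ends with AG (3' end)."""
    s = intron.upper()
    return len(s) >= 4 and s[:2] == "GT" and s[-2:] == "AG"


def remove_introns(gene, introns):
    """Return the spliced sequence; introns are (start, end) pairs, end exclusive."""
    exons = []
    previous_end = 0
    for start, end in sorted(introns):
        exons.append(gene[previous_end:start])
        previous_end = end
    exons.append(gene[previous_end:])
    return "".join(exons)


# Invented toy gene: exon, intron, exon.
exon1, intron, exon2 = "ATGAAACC", "GTAAGTTTTTCAG", "GAATTTTGA"
gene = exon1 + intron + exon2
coords = [(len(exon1), len(exon1) + len(intron))]
spliced = remove_introns(gene, coords)
print(is_canonical_intron(intron), spliced)
```

Choosing a different set of intron coordinates yields a different spliced product, which is how alternate transcripts of one gene can encode different peptides.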

Another interesting phenomenon was discovered among the flp-7 nematode

orthologues, particularly in Caenorhabditis briggsae. In flp-7 from C. briggsae, a

peculiar and apparent insertion was observed (Figure 2-3). A novel FLP-7 related

peptide sequence exists in Caenorhabditis briggsae and not in the other nematode

sequences shown. This indel (insertion or deletion) could be viewed as an error in

alignment due to a few point mutations in a peptide otherwise like the others

(particularly, the N-terminal leucine and C-terminal leucine). However, the complete

alignment of these proteins can be seen in Appendix B. From those alignments, it is clear

that nearly all other sites in the C. elegans and C. briggsae flp-7 precursor protein

sequences are homologous. Also quite interesting is that this indel occurs in the middle

of an exon. I have not yet determined the true origin of this indel, nor do I understand its

functional consequences. It is nonetheless very intriguing and could be potentially very

informative for future researchers of nematode flp precursor evolution.

Biochemical Properties of Mature Processed FLPs

In addition to similarities in patterns observed among several of the flp precursor

proteins, several points of similarity are apparent even among the peptides derived from

certain sets of precursors (Figure 2-4). The rationale for those groupings is described in

the discussion section of this chapter.











flp-7-hg   --FESLA   APLDRSAMARFG   ------------   APLDRSALARFG
flp-7-ce   SSMVRFG   SPMQRSSMVRFG   ------------   SPMERSAMVRFG
flp-7-cb   SAMVRFG   SPMDRSAMVRFG   LPSDRSSMVRLG   SPMDRSAMVRFG
flp-7-cr   SSMVRFG   SPMQRSSMVRFG   ------------   SPMERSAMVRFG
flp-7-oo   STMVRFG   APMDRSTMVRFG   ------------   APMDRSSMVRFG
flp-7-ac   SSMVRFG   APMDRSSIVRFG   ------------   APMDRSSMVRFG
flp-7-mj   SALVRFG   APLDRSALVRFG   ------------   APLDRAAMVRFG
flp-7-mh   SALVRFG   APFDRSALVRFG   ------------   APLDRAAMVRFG
flp-7-mi   SALVRFG   APLDRSALVRFG   ------------   APLDRAAMVRYG
flp-7-rs   SAMARFG   APLDRSAMARFG   ------------   APLDRSAMVRFG


Figure 2-3: Portion of the alignment for all known flp-7 precursor protein orthologues in
the phylum Nematoda. The amino acids in blue are part of predicted FLP-7
neuropeptides which align with one another and contain the canonical "RFG"
C-terminus. In green are the predicted processing sites. In red is a novel indel
of a non-canonical FLP neuropeptide found only in Caenorhabditis briggsae
thus far. It is likely to be processed with the others.











flp-1a
KPNFMRYG
SAAVKSLG
AGSDPNFLRFG
SQPNFLRFG
ASGDPNFLRFG
SDPNFLRFG
AAADPNFLRFG
SADPNFLRFG
PNFLRFG
flp-3
SPLGTMRFG
TPLGTMRFG
SAEPFGTMRFG
NPENDTPFGTMRFG
ASEDALFGTMRFG
EDGNAPFGTMKFG
EAEEPLGTMRFG
SADDSAPFGTMRFG
NPLGTMRFG
flp-13
AMDSPLIRFG
AADGAPLIRFG
APEASPFIRFG
AADGAPLIRFG
APEASPFIRFG
ASPSAPLIRFG
SPSAVPLIRFG
SAAAPLIRFG
ASSAPLIRFG
flp-18
DFDGAMPGVLRFG
EMPGVLRFG
SVPGVLRFG
SVPGVLRFG
EIPGVLRFG
SEVPGVLRFG
DVPGVLRFG
SVPGVLRFG


flp-6
KSAYMRFG
KSAYMRFG
KSAYMRFG
KSAYMRFG
KSAYMRFG
KSAYMRFG
flp-8
KNEFIRFG
KNEFIRFG
KNEFIRFG
flp-9
KPSFVRFG
KPSFVRFG
flp-14
KHEYLRFG
KHEYLRFG
KHEYLRFG
KHEYLRFG
flp-17
KSAFVRFG
KSAFVRFG
KSQYIRFG



flp-7
TPMQRSSMVRFG
SPMQRSSMVRFG
SPMQRSSMVRFG
SPMQRSSMVRFG
SPMERSAMVRFG
SPMDRSKMVRFG
SSIDRASMVRLG
TPMQRSSMVRFG
flp-22
SPSAKWMRFG
SPSAKWMRFG
SPSAKWMRFG


flp-20
AVFRMG
AMMRFG
AMMRFG
SVFRLG
flp-28
VLMRFG



flp-21
GLGPRPLRFG
flp-27
GLGGRMRFG


Figure 2-4: Groupings of similar FLP neuropeptide subfamilies based on their chemical
properties and precursor protein sequence similarities.









Peptide Charge

The FLP neuropeptides in C. elegans nearly all have a net positive charge. In fact,

the most common net charge in these peptides is +2 (Figure 2-5). It is clear that a net

positive charge is a strongly conserved feature of FLP neuropeptides in C. elegans. In

fact, many entire subfamilies of FLPs in C. elegans have a conserved N-terminal lysine

residue (Figure 2-4). These subfamilies include peptides produced on the flp-6, 8, 9, 14,

and 17 precursors. Other subfamilies actually tend to be rich in negatively

charged residues in their N-terminal variable regions. Such subfamilies include FLP

neuropeptides produced on the flp-1, 3, 13, and 18 precursor proteins. This illustrates

possible homology and is strongly suggestive of evolutionary constraint. Indeed,

functional results in Chapter 3 show that more positively charged analogues of FLP-18

peptides are more active on the NPR-1 receptor than less positively charged ones (96).

The potential importance of charge similarities among FLP subfamilies will be explained

in the Discussion section of this Chapter.
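
A simple way to tally peptide net charge is to count charged groups, as in the sketch below. The counting rules are assumptions, since the rules behind Figure 2-5 are not stated in this section: the C-terminal glycine of each predicted peptide is taken to be removed on amidation, the amidated C-terminus is treated as uncharged, the free N-terminus counts as +1, and histidine is treated as neutral at pH 7.0. The function names are hypothetical, and the peptides are a small sample from Figure 2-4.

```python
from collections import Counter


def mature_peptide(predicted):
    """Drop the C-terminal glycine, assumed here to be the amide donor."""
    return predicted[:-1] if predicted.endswith("G") else predicted


def integer_net_charge(peptide, amidated=True):
    """Count charged groups at pH 7.0 (histidine treated as neutral)."""
    charge = 1  # free N-terminal amine
    charge += sum(peptide.count(aa) for aa in "KR")
    charge -= sum(peptide.count(aa) for aa in "DE")
    if not amidated:
        charge -= 1  # free C-terminal carboxylate
    return charge


# A few predicted peptides taken from Figure 2-4.
predicted = ["KSAYMRFG", "KNEFIRFG", "SVPGVLRFG", "DVPGVLRFG", "GLGPRPLRFG"]
charges = {p: integer_net_charge(mature_peptide(p)) for p in predicted}
histogram = Counter(charges.values())
print(charges)
print(sorted(histogram.items()))
```

Under these assumptions the N-terminal lysine of the flp-6 type peptides adds one positive charge relative to otherwise similar peptides, while an acidic residue in the N-terminal variable region, as in DVPGVLRFG, removes one.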

Peptide Length and Amino Acid Conservation

Another key structural property of FLP neuropeptides is simply the number of

amino acids they contain. Because FLPs act as ligands, their size likely affects their role in receptor

binding and activation. Peptide length appears to be evolutionarily constrained in FLPs.

Many that occur on the same precursor protein are exactly 7 amino acids; this is also the

most common length for FLPs identified in this study (Figure 2-6).

Also, some groups of FLPs have similar patterns of amino acid distribution. For

example, some of the FLPs of different subfamilies have a proline that is 5, 6, or 7 amino

acids from the C-terminus. Precursors with peptides containing a proline in such a

position include flp-1, 3, 4, 5, 6, 9, 11, 13, and 18. Another commonality among certain









[Figure 2-5 bar chart. X axis: Charge (+4, +3, +2, +1, 0, -1). Y axis ticks: 10, 20, 30, 40.]

Figure 2-5: Predicted net charge frequency (pH 7.0) for all FLP neuropeptides predicted
from C. elegans.


[Figure 2-6 bar chart. X axis: Peptide Length (18 down to 2). Y axis ticks: 10, 20.]

Figure 2-6: Prevalence of peptide lengths for all predicted FLP neuropeptides in C.
elegans.









FLPs is the persistence of an aromatic residue that is 4 amino acids from the C-terminus

for all the peptides on a particular precursor containing multiple peptides. This is

observed for flps 3-6, 8, 9, 14, 16, 17, and 25. This property is common to many

FMRFamide-like neuropeptides. Examples of these patterns of sequence similarity

among various groups of FLPs are illustrated in Figure 2-4. These groupings are

explained further in the Discussion section of this Chapter.
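
The length and residue-position patterns described above can be checked with a short sketch. The counting convention is an assumption: positions are counted from the C-terminus of the mature peptide, with the C-terminal residue as position 1 and the glycine amide donor already removed. Function names are hypothetical, and the peptides are a sample from Figure 2-4.

```python
AROMATIC = set("FWY")


def mature_peptide(predicted):
    """Drop the C-terminal glycine, assumed here to be the amide donor."""
    return predicted[:-1] if predicted.endswith("G") else predicted


def positions_from_c_terminus(peptide, residues):
    """Positions (C-terminal residue = 1) at which any of `residues` occurs."""
    n = len(peptide)
    return [n - i for i, aa in enumerate(peptide) if aa in residues]


# Sample peptides from Figure 2-4 (flp-6, 8, 9, 14, 17 and flp-1, 3, 13, 18).
flp6_group = ["KSAYMRFG", "KNEFIRFG", "KPSFVRFG", "KHEYLRFG", "KSQYIRFG"]
flp1_group = ["SDPNFLRFG", "SPLGTMRFG", "ASSAPLIRFG", "SVPGVLRFG"]

lengths = [len(mature_peptide(p)) for p in flp6_group]
aromatic_at_4 = [
    4 in positions_from_c_terminus(mature_peptide(p), AROMATIC) for p in flp6_group
]
proline_positions = [
    positions_from_c_terminus(mature_peptide(p), {"P"}) for p in flp1_group
]
print(lengths, aromatic_at_4, proline_positions)
```

With this convention the sampled flp-6 group peptides are all 7 residues long with an aromatic residue at position 4, and the sampled flp-1 group peptides carry their proline at position 5, 6, or 7.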

Discussion

The purpose of this work has been to catalogue various hypothesis-generating

observations about the FMRFamide-like neuropeptides and their precursor proteins, with

emphasis on those likely to guide future molecular evolution studies. Indeed, I have been

able to make a number of observations and have given support for their potential

importance. Though this chapter is largely observational, some logical conclusions can

be drawn from the data and results presented:

* Several striking commonalities among flp precursor proteins are strongly
suggestive of an evolutionary relationship

* Charge distribution is a conserved feature of flp precursor proteins that may imply
a propensity for 3-dimensional structure

* Charge, peptide length, and common amino acids among FLP neuropeptide
subfamilies together suggest an evolutionary relationship and functional
importance

* Positive charges in FLPs may be important for receptor binding or interactions with
the membrane

Grouping of FLP Subfamilies by Precursor and Peptide Properties

The flp precursor proteins as a protein family appear to be highly divergent in the

phylum Nematoda. However, some similarities among these various paralogous family

members exist. The similarities among FLPs and their precursors discussed in the results









section above warrant a preliminary grouping by relatedness. Below are suggested group

designations and the rationale behind these groupings which are summarized in Figure 2-

4.

The flp-1 Group

This group contains the flp-1, 3, 13, and 18 precursor proteins and their peptide

products. Nearly all of the peptides in this group contain a proline that is 5, 6, or 7

residues from the C-terminus in their mature processed form. Proline is one of the least

common amino acids and imparts a constrained local conformation on the polypeptide

chain. Thus, it is very likely the structures (either in solution or bound to receptors) of

FLP peptides containing proline in a similar position are more similar than those of

peptides lacking proline. The prevalence of proline in this subgroup suggests a structural

motif under evolutionary constraint. Additionally, the majority of the peptides in this

group have N-terminal regions that are rich in negatively charged residues. In the

following chapter I show data illustrating that charge influences the biological activities

and structures of members of this group. Since charge is an important property of these

molecules, it is very likely that this property has been conserved through evolution.

The sequence repetition pattern of flp-13 does not include a peptide that is separated

from the others in the precursor protein. In contrast to the other precursor proteins in

this group, flp-18 is replete with spacer regions. However, since

the mature processed peptides are known to be the functional portions of these proteins in

their interaction with receptors, it is likely that the most evolutionary pressure is on these

regions. Thus, based on FLP peptide similarities, the flp-13 and 18 precursors are

included in this group.









The flp-6 Group

This group contains the flp-6, 8, 9, 14, and 17 precursors and peptides processed

from those proteins. The common features among members of this subgroup are

definitely the most striking of the classifications made here. The peptides themselves all

contain exactly the same number of amino acids. They also all begin with the same

positively charged amino acid (lysine). Additionally, they all have an aromatic residue at

exactly the same relative position (Figure 2-4). These similarities strongly suggest an

evolutionary relationship. Additionally, their precursor proteins are quite similar; all the

precursors contain several copies of nearly identical peptides. This contrasts with other

flp precursors in C. elegans which contain peptides with considerable N-terminal

variability. Also, for the flp-6, 8, 9, and 17 precursors, there are acidic-residue-rich

spacer regions between nearly all pairs of peptides.

The flp-7 Group

This group contains the flp-7 and 22 precursors and peptides processed from those

proteins. There are two main reasons behind grouping these polypeptides together. First,

they have striking similarities in their N-terminal sequences (Figure 2-4). This is peculiar

because FLPs overall show a gradient of decreasing conservation from the C-terminus to

the N-terminus. It is very likely that these peptides bind an overlapping complement of

receptors. Indeed, it has been recently shown that flp-7 and flp-22

peptides can activate the same receptor from C. elegans (115). Also, the precursors for

this group of FLPs are in the category of those in which all predicted peptides are

concatenated in tandem, separated only by the necessary processing sites.









The flp-20 Group

This group contains the flp-20 and 28 precursors and peptides processed from those

proteins. The key similarities among these neuropeptides are their length and chemical

nature. Peptides in these subfamilies contain exactly 5 amino acids each in their mature

form. They are also rich in hydrophobic amino acids, with the only charged residue

being their conserved penultimate arginine.

For the flp-28 precursor, no published sequence has been annotated as flp-28 DNA or

protein, though two publications indicate, citing unpublished observations, that it has

been observed in the C. elegans genome (16, 21). Without further information

on those findings, I believe I am the first to identify a sequence in the database (accession

number CAE17946, currently annotated as a "hypothetical protein") as coding for the

FLP-28 neuropeptide and being a bona-fide flp-28 precursor in C. elegans. Additionally,

a previous report separated some of the sequences in the flp-28 alignment (Appendix B),

into flp-28 (VLMRF peptide motif) and some into flp-29 (ILMRF peptide motif) (21).

However, based on alignment of the full precursor proteins, and the facts that: 1) both

flp-28 and "flp-29" have not been identified within the same species and 2) no flp

precursor gene has been identified with more than one copy in any one nematode

genome, I designate the previously named flp-28 and "flp-29" precursor proteins simply

as flp-28. In order to retain the nomenclature in the literature, I will continue to use the

flp-30 and flp-31 designations as previously described by McVeigh et al. (21). This

leaves the name flp-29 vacant. It may be filled by a future nematode flp paralogue, which

would disrupt the chronological, discovery-order naming scheme previously used for

nematode flp paralogues.









The flp-21 Group

This group contains the flp-21 and 27 precursors and peptides processed from those

proteins. Their similarities are predominantly in their N- and C- termini, with a major

difference occurring due to the presence of two proline residues in the middle of FLP-21.

One of the proline positions in FLP-21 is held by a glycine in FLP-27 and the other

is absent. There is also an arginine in the middle of both peptides which is the only

charged residue in both aside from the penultimate arginine characteristic of all FLPs.

The evolutionary relationships postulated above are based on precursor proteins

and predicted neuropeptide subfamily similarities. Their shared properties very likely

affect precursor processing as well as receptor binding and activation. It has been

postulated that receptors are more likely to evolve the ability to bind and use a pre-

existing neuropeptide ligand than neuropeptides are to evolve to bind a different receptor

(116). It has also been estimated that there are about 50-130 neuropeptide receptors in C.

elegans (117). However, there are well over 209 neuropeptides predicted in the C.

elegans genome (16). Additionally, NPR-1 has been shown to be activated by peptides

from both flp-18 and flp-21 precursors (53). With fewer receptors than ligands and some

FLP subfamilies being very similar to others, it seems that indeed FLP receptors are

likely regulated by more than one ligand in the nervous systems of nematodes. In fact, a

receptor from C. elegans has been recently cloned and characterized which is activated

by a number of C. elegans FLPs at micromolar concentrations (115). This would imply

that similar FLP subfamilies co-evolve post-duplication to maintain their original

similarities and functionality on particular receptors. My hypothesis is that our flp

precursor grouping designations above are legitimate and very likely to be upheld by

future studies. This, however, remains to be proven. The Future Directions section of









Chapter 5 provides a discussion of some ways in which studies validating these

classifications may proceed.

It has been postulated previously that gene conversion in flp precursor proteins is

favored (36). The results of this study, particularly from the alignments of different

orthologous subfamilies (Appendix B), corroborate that hypothesis. Thus, an optimized

neuropeptide can be duplicated within a gene and processed along with the ancestral

copy or copies through the same preexisting regulation pathway. If a higher dose release of

peptide with the original sequence is beneficial, then concerted evolution would certainly

also be beneficial by maintaining the sequence conservation in the repeated peptide units.

A full evolutionary reconstruction of the FMRFamide-like neuropeptide precursor

gene family in C. elegans was not achieved. The method we attempted to use involved

generating ancestral sequences of all 28 precursors and using those for phylogenetic

reconstruction. The rationale behind this was that flp precursors have very divergent

sequences. Since ancestral sequences theoretically represent paralogues shortly after

their divergence, they should be more similar than the modern sequences and, thus, easier

to align. However, the resulting ancestral sequences were often too short and no more

helpful for phylogenetic analysis than were the modern sequences from which they were

generated. Thus, Dr. Sassi, Dr. Edison and I decided that generating a phylogenetic tree

for flp precursor proteins is more complicated than we had envisioned. Reconstructing

the molecular phylogenetic history of this large and diverse gene family will require a

more complete and higher quality data set. It may also require the development of new

methodologies and was deemed beyond the scope of this chapter. Likely requirements to









complete such a study are discussed in the Discussion section of this Chapter and in the

Future Directions section of Chapter 5.

Charge Compensation and Possible flp Precursor Structure

One of the questions guiding the above analyses of flp precursor proteins has been:

What could be driving the charge compensation observed? It was hypothesized in

previous work that the acidic residue rich spacer regions in flp precursors may be present

in order to regulate pH in secretory vesicles where the FLP peptides are processed from

those proteins (20). Another possibility is that they are conserved in order to interact

with receptors for trafficking to the proper vesicles for processing and secretion. They

may have an opposite charge from the FLP peptide regions so that these regions are

unbound and free to interact with processing enzymes. This latter suggestion seems

unlikely, particularly in cases like flp-6 where the protein consists almost entirely of

alternating peptide and spacer regions. It is hard to imagine a spacer region interaction

with a receptor that would also accommodate access by processing enzymes.

Charge compensation in protein sequences has been commented on previously

(118, 119). One of the more logical reasons for correlated conservation of amino acid

charge properties is to maintain necessary structure stabilizing contacts (119). For the

vast majority of the flp precursors analyzed, IUPRED results indicate more structure

propensity than would be expected from completely disordered proteins. Thus, it seems

likely that the spacer regions function to stabilize structural interactions with the peptide

and processing site regions of the proteins. However, the structural properties of these

proteins beyond their primary sequences have not yet been characterized. A discussion

of future experiments that would likely be beneficial to such an endeavor is provided in

the Future Directions section in Chapter 5 of this dissertation.









Based on alignments I have performed with many different nematode flp precursor

proteins (Appendix B), it is clear that observations made and conclusions drawn on C.

elegans precursors are applicable to the phylum Nematoda as a whole. In general, it is

my desire to convey through this work an inspiring and driving appreciation for the

fascinating sequence diversity and neurochemical importance of the FLPs and flp

precursor proteins in the Phylum Nematoda. I hope it will be useful in guiding further

research. It has been postulated that neuropeptides in general predate such classical

neurotransmitters as acetylcholine and serotonin (120). Thus, an understanding of the

evolutionary history of these polypeptides will likely give rise to a wealth of knowledge

on both the evolution and function of the components of the nematode nervous system

and their influence on behavior. Several ideas for future research based on my

experiences on this project are provided in the corresponding section of "Future

Directions" in Chapter 5 of this dissertation.














CHAPTER 3
NMR ANALYSIS OF C. elegans FLP-18 NEUROPEPTIDES: IMPLICATIONS FOR
NPR-1 ACTIVATION

The work in this Chapter was published in Biochemistry, Vol. 45, No. 24, pp. 7586-

7597, June 20, 2006 (96), in collaboration with the laboratories of Profs. Peter Evans and

Mario de Bono.

Introduction

NPR-1, a GPCR that modulates feeding behavior in C. elegans, is activated by two

subfamilies of FLPs: the FLP-18 peptides and the FLP-21 peptide

(53). All of the FLP-18 peptides occur on the same precursor and are presumably

processed and released simultaneously (18). The most active of these peptides is

EMPGVLRF-NH2; by comparison DFDGAMPGVLRF-NH2 is significantly less active

(53). These observations motivated us to further investigate the structural properties of

these peptides relative to their biological activities.

In previous studies, our research group has suggested that N-terminal hydrogen

bonding can influence FLP activity (121). Structural interactions in small peptides such

as FLPs are generally invisible to techniques such as X-ray crystallography, because

small peptides are dynamic in solution. Also, some NMR parameters such as NOE

correlations for distance measurements are of limited value on small peptides due to their

dynamic properties in solution (122). However, these transient structural properties can

modulate their activities (121). Although a high resolution X-ray crystallographic

structure for a FLP-18 peptide bound to NPR-1 would be extremely useful, it would not









provide any information on the unbound state of the peptides. We are interested in

understanding how structural interactions within free ligands can affect their binding

properties with a receptor. This work seeks to illuminate that portion of the FLP-18

pharmacology on NPR-1.

In the present study we have monitored pH titrations, temperature titrations, and

chemical shifts via NMR to identify transient long-range interactions within FLP-18

peptides and designed control analogues. The sequence and activity diversity among

these peptides motivated us to examine the structural properties of two extreme cases,

EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2, which may influence the activity of

each peptide on NPR-1. The material presented in this chapter examines the hypothesis

that local structure in the variable N-terminal regions of flp-18 peptides can modulate

their binding to NPR-1.

Experimental Procedures

Peptide Synthesis

Peptides listed in Table 3-1 were synthesized using standard Fmoc solid phase

methods, purified by HPLC, and verified by MALDI-TOF mass spectrometry at the

University of Florida Interdisciplinary Center for Biotechnology Research (UF ICBR)

protein core facility.

Peptide Sample Preparation

Lyophilized peptides were weighed and dissolved to ~1 mM in 95% H2O and 5%

D2O, and the pH was adjusted to 5.5 by additions of either HCl or NaOH. The peptide

solution was then aliquoted and frozen at -20 °C until needed for biological assays or

NMR experiments. Small aliquots of each sample were submitted for amino acid

analysis at the UF ICBR protein core to verify their concentrations. For NMR









spectroscopy, the pH-stable chemical shift standard DSS (2,2-dimethyl-2-silapentane-5-

sulfonic acid) was added to NMR samples at a final concentration of 0.17 mM (123).

Biological Activity Assays:

The experiments described in this section (Biological Activity Assays) were

performed by Dr. Vincenzina Reale from the lab of Prof. Peter Evans and Heather

Chatwin from the lab of Prof. Mario de Bono. Sense cRNA was prepared in vitro using

the mCAP™ RNA Capping Kit (Stratagene, La Jolla, CA) from plasmid DNA

containing full-length npr-1 215V cDNA cloned in pcDNA3 (Invitrogen Ltd., Paisley

UK). RNA transcripts were synthesized using T7 RNA Polymerase (Stratagene, La Jolla,

CA) after linearizing the plasmid with Apa I (Promega UK, Southampton, UK) and

blunting the 3' overhangs with T4 DNA Polymerase (Amersham Pharmacia Biotech,

Little Chalfont, Bucks, UK). T7 RNA transcripts synthesized in vitro with the mCAP™

RNA Capping Kit are initiated with the 5' 7MeGpppG 5' cap analog. Sense cRNA was

prepared in a similar manner from the GIRK1 and GIRK2 clones in pBS-MXT (124)

(kindly donated by Drs. S.K. Silverman and H.A. Lester, California Institute of

Technology, Pasadena, USA) after linearizing the plasmid with Sal I (Promega).

All experiments using Xenopus laevis were carried out under a Home Office (UK)

Project License. Stage V and VI oocytes from virgin female adult X. laevis were

prepared using standard procedures (53, 125-127). Oocytes were then injected with 50 ng

of npr-1 receptor sense cRNA, either alone, or together with 0.5 ng each of GIRK1 and

GIRK2 sense cRNA and incubated at 19 °C for 2-5 days. Uninjected oocytes were used

as controls. Electrophysiological recordings were made from oocytes using a two-

microelectrode voltage-clamp technique (53, 125-127).









NMR Spectroscopy

NMR data were collected at 600 MHz using a Bruker Avance (DRX)-600 console

with a 14.1 Tesla magnet equipped with a 5 mm TXI Z-Gradient CryoProbe. Unless

otherwise stated, all NMR experiments were collected at 288 K, and spectra were

collected with a 6600 Hz spectral width and were referenced by setting the methyl proton

resonance of DSS to 0.0 ppm. The 1H carrier frequency was centered on

water which was suppressed using a WATERGATE sequence (128) or presaturation.

Two-dimensional TOCSY (129) experiments were collected using a DIPSI-2 mixing

sequence with a 60 ms mixing time. Two-dimensional ROESY (130) experiments were

collected using a 2.27 kHz continuous-wave (cw) spin-lock field applied for 250 ms.

Processing of 1D NMR spectra and creation of stack plots of pH and temperature

titrations was done using Bruker XWINNMR and XWINPLOT version 4.0 software.

Two-dimensional NMR datasets were processed with NMRPipe (131) using standard

methods including removal of residual water signal by deconvolution, multiplying the

data with a squared cosine function, zero-filling, Fourier transformation, and phase

correction. Data were analyzed and assigned with NMRView (132) using standard 1H-

based methods (133).

One-dimensional pH titration experiments were performed for all peptides in Table

3-1 that contain aspartate and/or glutamate residue(s), as well as PGVLRF-NH2 and

SGSGAMPGVLRF-NH2. One-dimensional NMR spectra were collected at increments

of about 0.2 pH units from 5.5 to 1.9 by successive addition of 1-3 µL of 0.01-0.1 M HCl

for each pH value. pKa values and effective populations (c in Equation 1) of pH









dependent resonance peaks were calculated using Origin 7.0 software and a modified

version of the Henderson-Hasselbalch equation below as previously described (134):

Equation 1:

\[
\delta(\mathrm{pH}) = \delta_b + \sum_{i=1}^{j} c_i \, \frac{(\delta_a - \delta_b) \times 10^{\,\mathrm{p}K_{a,i} - \mathrm{pH}}}{1 + 10^{\,\mathrm{p}K_{a,i} - \mathrm{pH}}}, \qquad j = 1 \text{ or } 2
\]

where δ(pH) is the experimental chemical shift, δb is the chemical shift at the least acidic

condition, δa is the chemical shift at the more acidic condition, pKa,i is the negative

common log of the acid/base equilibrium constant for the ith titration event, and ci is the

contribution of the ith titration event to the total pH dependence of chemical shift.
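
As a rough illustration of how Equation 1 behaves, the sketch below evaluates a modified Henderson-Hasselbalch model in Python. It assumes the form δ(pH) = δb + Σ ci (δa - δb) 10^(pKa,i - pH) / (1 + 10^(pKa,i - pH)), which matches the symbol definitions given for Equation 1; the function name `titration_shift` and all numerical inputs are hypothetical and are not taken from the fits in Appendix D. The fits in this work were done in Origin 7.0, not with this code.

```python
def titration_shift(pH, delta_b, delta_a, pKas, weights):
    """Chemical shift (ppm) predicted by the assumed form of Equation 1."""
    if len(pKas) != len(weights):
        raise ValueError("need one weight c_i for each pKa_i")
    shift = delta_b
    for pKa, c in zip(pKas, weights):
        ratio = 10.0 ** (pKa - pH)              # protonated / deprotonated
        protonated_fraction = ratio / (1.0 + ratio)
        shift += c * (delta_a - delta_b) * protonated_fraction
    return shift


# Hypothetical amide proton: 8.60 ppm at high pH, 8.32 ppm at low pH,
# influenced by two carboxylates (illustrative numbers only).
example = titration_shift(3.5, 8.60, 8.32, pKas=[3.0, 4.0], weights=[0.4, 0.6])
```

When the weights sum to 1, the model returns δb well above the highest pKa and δa well below the lowest, and a single-site curve passes through the midpoint of δa and δb at pH = pKa.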

One-dimensional NMR temperature titrations were collected on a standard TXI

probe at 5 Kelvin (K) increments from 278 to 328 K, then ramped back to 278 K to check

for sample integrity. The temperature for each experiment was calibrated using methanol

(for 278.15-298.15 K) and ethylene glycol (for 308.15-328.15 K), and the corrected

temperatures were used for the determination of the temperature dependence of the

chemical shifts, called temperature coefficients (TC) in parts per billion per Kelvin.
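
A temperature coefficient of this kind can be sketched as the least-squares slope of chemical shift against calibrated temperature. The Python below is a minimal illustration under that assumption; the function names and the input values are hypothetical, and the classification thresholds simply restate the rough guideline quoted in the Results of this chapter (below 4, between 4 and 6, and above 6 ppb/K in absolute value).

```python
def temperature_coefficient(temps_K, shifts_ppm):
    """Least-squares slope of chemical shift vs temperature, in ppb/K."""
    n = len(temps_K)
    if n != len(shifts_ppm) or n < 2:
        raise ValueError("need at least two matched (T, shift) points")
    mean_t = sum(temps_K) / n
    mean_s = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_K)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_K, shifts_ppm))
    return 1000.0 * sxy / sxx                   # ppm/K converted to ppb/K


def classify_tc(tc_ppb_per_K):
    """Rough interpretation of a temperature coefficient by its magnitude."""
    magnitude = abs(tc_ppb_per_K)
    if magnitude < 4.0:
        return "internal hydrogen bond"
    if magnitude <= 6.0:
        return "weak hydrogen bond"
    return "no hydrogen bond"


# Hypothetical amide proton that moves upfield by 5 ppb per kelvin,
# sampled at the nominal 5 K steps from 278 K to 328 K.
temps = [278.0 + 5.0 * k for k in range(11)]
shifts = [8.50 - 0.005 * (t - 278.0) for t in temps]
tc = temperature_coefficient(temps, shifts)
```

In practice the calibrated temperatures, not the nominal set points, would be passed in as `temps_K`.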

Results

Peptide Design Rationale and Physiological Responses:

Three major considerations have motivated this study. First, our lab has been

intrigued for some time by the amino acid diversity in FLPs (20, 26, 29, 30, 36, 121, 135,

136). In particular, as described above, FLPs display patterns of decreasing amino acid

conservation from the C- to the N-termini, and the comparison of the C. elegans flp-18

(18) and A. suum afp-1 (36) genes suggests that the longer peptides produced by these

genes are unique (see Chapter 1, Table 1-1). Second, the activity at NPR-1 of the long









FLP-18 peptide, DFDGAMPGVLRF-NH2, is significantly lower than the shorter

EMPGVLRF-NH2 (53). Finally, in previous work on FLPs from mollusks, Edison,

Carlacci, and co-authors found that different amino acid substitutions significantly

changed the conformations of the peptides (121, 135) and that these conformational

differences are correlated with their differences in activity (137).

In designing the peptides for this study, we considered several possibilities to

explain the difference in NPR-1 activity between two native FLP-18 peptides,

EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2: First, the N-terminus could have

intrinsic activity or act as a competitive inhibitor; second, a glutamic acid might be

required in a position corresponding to the first residue of the more active EMPGVLRF-

NH2; third, the extra amino acids could prevent the active portion of the peptide from

efficiently binding to NPR-1; fourth, the N-terminal extension of DFDGAMPGVLRF-

NH2 could be involved in structural interactions that cause it to be less potent on NPR-1

than EMPGVLRF-NH2. To address these possibilities, two native FLP-18 peptides, and

a range of substituted and derived analogues (Table 3-1), were tested for their ability to

activate the NPR-1 215V receptor expressed as described in the Experimental Procedures.

In the following, we use the term "activity" to indicate the magnitude of the potassium

current evoked by a 10^-6 M pulse of peptide. We will refer to peptides by their peptide

number shown in Table 3-1.

It was observed that the long native FLP-18 peptide 2 was much less effective than

the shorter FLP-18 peptide 1 at activating the receptor (Table 3-1), confirming previous

observations (53). To test if the N-terminus of peptide 2 could have intrinsic activity or

act as a competitive inhibitor, we designed and analyzed the effect of peptide









Table 3-1: Peptides examined by NMR and their activities on NPR-1. a) Naturally
occurring sequences are underlined. The conserved PGVLRF-NH2 sequence
is in bold. N-terminal "extension" sequences of native FLP-18 peptides are
bold and colored: red for C. elegans based sequences and blue for A. suum based
sequences. "n" is the number of repeated experiments. b) Peptides (1 µM)
were applied in 2 minute pulses to Xenopus oocytes expressing NPR-1 215V.
Results expressed as a % of response to 1 µM Peptide 1 (EMPGVLRF-NH2)
+/- SEM (Standard Error). Data for all peptides were normalized to peptide 1
at 100%, which was repeated for each of the measurements. In a previous
study, the current measured from applying peptide 1 was 32.2 +/- 3.8 nA
(n=33) (53). c) Most active native C. elegans FLP-18 peptide. d) Longest
and least active native FLP-18 peptide. e) Longest A. suum AFP-1 peptide. f)
Chimera of long FLP-18 + long AFP-1. g) Chimera of long AFP-1 + long
FLP-18.

Name             Sequence (a)           % Response (n)
Peptide 1 (c)    EMPGVLRF-NH2           100
Peptide 2 (d)    DFDGAMPGVLRF-NH2       29.1 +/- 5.7 (16)
Peptide 3        DFDGAM-NH2             0 (3)
Peptide 4        DFDGEMPGVLRF-NH2       19.0 +/- 2.6 (8)
Peptide 5        SGSGAMPGVLRF-NH2       118.7 +/- 11.0 (4)
Peptide 6        AAAAAMPGVLRF-NH2       62.4 +/- 3.2 (10)
Peptide 7 (e)    GFGDEMSMPGVLRF-NH2     49.3 +/- 16.5 (4)
Peptide 8        GFGDEM-NH2             0 (3)
Peptide 9 (f)    DFDGEMSMPGVLRF-NH2     45.0 +/- 10.5 (6)
Peptide 10 (g)   GFGDAMPGVLRF-NH2       104.8 +/- 2.8 (8)
Peptide 11       PGVLRF-NH2             43.0 +/- 5.5 (14)
Peptide 12       PGVLRFPGVLRF-NH2       198.1 +/- 33.3 (10)









3. This peptide had no intrinsic activity on the receptor (Table 3-1) and did not block the

effects of 1 µM pulses of peptide 1 (n = 3) (data not shown). To test the possibility that a

glutamic acid might be required in a position corresponding to the first residue of peptide

1, we examined peptide 4, where a glutamic acid residue is substituted for the alanine at

position 5 in peptide 2. However, this substitution did not improve, and in fact

weakened, the effectiveness of the long peptide. It also seemed possible that the added

bulk of peptide 2 due to the extra amino acids could be preventing access to the NPR-1

binding site. Thus, we analyzed peptides 5 and 6. Peptide 5 was designed to eliminate

any potential structure in the N-terminus based on commonly used flexible linker

sequences in fusion protein constructs (pET fusion constructs, Novagen Inc.). The N-

terminus of peptide 6 was designed to induce a nascent helical structure in the same

region (138, 139). It can be seen that peptide 5 completely restored activity compared to

peptide 2, while peptide 6 only partially restored activity in comparison with the short

native peptide.

As shown in Chapter 1, Table 1-1, the longest native peptide from afp-1 in A. suum

(33), peptide 7, is two amino acids longer than the corresponding longest peptide from

the C. elegans flp-18 gene (18), peptide 2. Thus, we also synthesized and tested peptide

7, as well as its N-terminus, peptide 8. Peptide 7 was slightly more effective than

peptide 2. However, the short N-terminal peptide sequence was again inactive (Table 3-1)

and did not block the effects of 1 µM pulses of peptide 1 (n = 3) (data not shown). In

addition, we also made chimeras of the long C. elegans and A. suum sequences, peptides

9 and 10. Peptide 9, in which two extra amino acids (SM) are introduced into the center

of peptide 4 to give it the same number of amino acids as peptide 7, showed similar









activity to that of the long native Ascaris peptide itself, GFGDEMSMPGVLRF-NH2.

However, peptide 10 showed similar activity to that of peptide 1.

As shown later, our results indicate that the conserved C-terminal PGVLRF-NH2 is

largely unstructured in solution. Thus, we tested peptide 11 and also peptide 12, in which

the conserved sequence was duplicated. The activity of peptide 11 was less than that

of peptide 1 and similar to that of peptides 7 and 9. When compared to peptides 1 and 5,

the reduced activity of peptide 11 could indicate that methionine preceding proline is

important for activity. However, the C-terminal duplicated peptide with no methionine

was approximately twice as active as peptide 1.

To further investigate the effects of changing the structure of the N-terminal

sequence of the C. elegans long FLP-18 peptide, we determined full dose response curves

for the peptides 1, 2, and 5 (Fig. 3-1).

From Figure 3-1, peptide 5 is more potent on NPR-1 than peptide 2, both having a

length of 12 amino acids. This suggests that elimination of structure at the N-terminus of

peptide 2 can increase its potency on the receptor. Also, peptides 2 and 5 are more

efficacious at higher concentrations than peptide 1, suggesting that longer peptides might

be more efficacious on NPR-1 than shorter peptides.
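
To make the potency comparison concrete, the short Python sketch below converts the log EC50 values reported in the caption of Figure 3-1 into molar concentrations and fold differences. Only the three log EC50 values come from this chapter; the helper names `ec50_molar` and `fold_more_potent` are hypothetical and introduced purely for illustration.

```python
# log10(EC50 / M) values as reported in the Figure 3-1 caption.
LOG_EC50 = {
    "peptide 1": -6.80,   # EMPGVLRF-NH2
    "peptide 5": -6.12,   # SGSGAMPGVLRF-NH2
    "peptide 2": -5.28,   # DFDGAMPGVLRF-NH2
}


def ec50_molar(log_ec50):
    """Convert a log10 EC50 into a molar concentration."""
    return 10.0 ** log_ec50


def fold_more_potent(log_ec50_a, log_ec50_b):
    """How many times lower the EC50 of peptide a is than that of peptide b."""
    return 10.0 ** (log_ec50_b - log_ec50_a)


ec50_p1 = ec50_molar(LOG_EC50["peptide 1"])
p5_vs_p2 = fold_more_potent(LOG_EC50["peptide 5"], LOG_EC50["peptide 2"])
p1_vs_p5 = fold_more_potent(LOG_EC50["peptide 1"], LOG_EC50["peptide 5"])
```

On these numbers, peptide 5 has an EC50 roughly sevenfold lower than peptide 2, and peptide 1 roughly fivefold lower than peptide 5. Potency (EC50) is a separate quantity from the efficacy at high concentration discussed in the paragraph above.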

NMR Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:

Chemical shifts are extremely sensitive to molecular and electronic environments

and thus provide unique atomic probes in molecules. Specifically, in peptides and

proteins, chemical shifts of many nuclei along the polypeptide backbone have been










[Figure 3-1 plot. x-axis: Log [Agonist] (M), from -10 to -5. Legend: Peptide 1, EMPGVLRF-NH2; Peptide 2, DFDGAMPGVLRF-NH2; Peptide 5, SGSGAMPGVLRF-NH2.]


Figure 3-1: Dose response curves of select FLP-18 peptides. Peptides were applied to
Xenopus oocytes expressing NPR-1 215V, and the responses from inward
rectifying K+ channels were recorded and normalized to the response of
peptide 1 (EMPGVLRF-NH2) at 10^-6 M. Filled circles are peptide 1
(EMPGVLRF-NH2) (EC50 = 10^-6.80 M), open squares are peptide 5
(SGSGAMPGVLRF-NH2) (EC50 = 10^-6.12 M), and open circles are peptide 2
(DFDGAMPGVLRF-NH2) (EC50 = 10^-5.28 M). Three measurements at each
peptide concentration were obtained and results are shown +/- SEM. These
data were acquired by the lab of Prof. Peter Evans.









shown to be dependent on secondary structure (122, 140-145). Thus, the first step in

NMR analysis is the assignment of resonance peaks in spectra to atoms in the molecule.

For all peptides in Table 3-1, nearly complete NMR resonance assignments were made

using standard two-dimensional 1H-based methods (133) (Appendix C).

Short, linear peptides are often very dynamic, lack a 3D hydrophobic core, and

interconvert rapidly between many different conformations. Despite this inherent

flexibility, numerous studies have demonstrated that regions of short peptides can be

highly populated in specific types of secondary structure (146-149). A fundamental

hypothesis of this study is that differences in local structure of variable N-termini of free

FLPs could partially explain differences in their potencies on receptors.

In order to compare chemical shifts from one peptide to another and to identify

regions that contain significant populations of secondary structure, it is useful to compare

experimental chemical shifts to random-coil values (141-143, 145). Fig. 3-2 plots the

difference between experimental and random-coil values for some of the peptides

analyzed in this study. The white and black bars represent deviations from random-coil

values for amide and alpha protons, respectively, and the magnitude of the deviations

reflects the population of local structure along the backbone of the peptides (141, 145).

All data were compared to published random coil values at pH 2.3.
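
The comparison to random-coil values amounts to a per-residue subtraction, which can be sketched in Python as follows. This is only an illustration: the sign convention (experimental minus random coil), the residue labels, the shift values, and the 0.1 ppm flagging threshold are all assumptions made for the example and are not data or criteria from this study.

```python
def shift_deviations(experimental_ppm, random_coil_ppm):
    """Per-residue deviation, assumed here to be experimental minus random coil."""
    return {
        residue: round(experimental_ppm[residue] - random_coil_ppm[residue], 3)
        for residue in experimental_ppm
    }


def flag_structured(deviations, threshold_ppm=0.1):
    """Residues whose deviation magnitude meets a hypothetical threshold."""
    return [res for res, dev in deviations.items() if abs(dev) >= threshold_ppm]


# Made-up amide proton shifts for three placeholder residues.
experimental = {"res1": 8.70, "res2": 8.45, "res3": 7.95}
random_coil = {"res1": 8.30, "res2": 8.40, "res3": 8.35}

deviations = shift_deviations(experimental, random_coil)
flagged = flag_structured(deviations)
```

Plotting `deviations` as bars along the sequence gives a display of the kind shown in Figure 3-2.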

Several features in Figure 3-2 are worth noting. First, the chemical shifts of

residues in the conserved PGVLRF-NH2 regions of each peptide are close to random-coil

values, and rather similar among all the peptides examined. This suggests that this

conserved sequence is unstructured in solution and that flexibility is important for

binding to NPR-1. This flexibility may help the ligand diffuse/maneuver more













[Figure 3-2 plots. Panels: Peptide 3 (DFDGAM-NH2), Peptide 8 (GFGDEM-NH2), Peptide 2 (DFDGAMPGVLRF-NH2), Peptide 7 (GFGDEMSMPGVLRF-NH2), Peptide 5 (SGSGAMPGVLRF-NH2), and Peptide 1 (EMPGVLRF-NH2).]


Figure 3-2: NMR chemical shift deviations from random coil values. Experimental
chemical shifts at pH ~2.3 were subtracted from sequence corrected random
coil values (141, 142). Filled and open bars represent alpha and amide
protons, respectively.










effectively into a binding pocket on the receptor. Second, the N-terminal extension of

peptide 2 shows significant deviation from random-coil values. In particular, the G4

amide proton has a very large deviation, suggesting its involvement in significant

structure. Third, the chemical shift deviations of amide and alpha protons of peptides 3

and 8 are nearly identical to the corresponding regions of the full-length peptides 2 and 7,

respectively. This indicates that these N-terminal extensions are behaving as independent

structural units. Finally, peptide 5, designed to lack N-terminal structure, indeed shows a

consistent very small deviation from random-coil values in its first five residues.

pH Dependence of Amide Proton Chemical Shifts Reveal Regions of flp-18 Peptides
with Significant Structure:

The sensitivity of NMR chemical shifts to electronic structure and hydrogen

bonding make them ideal probes of longer-range interactions with titratable side-chains.

NMR studies of peptides that utilize amide protons often need to be conducted below

~pH 6 to prevent amide proton exchange (133, 150). By varying the pH from about 5.5

to 2, both aspartic and glutamic acid side-chains will be converted from negatively

charged and deprotonated to neutral and protonated. These different charge states of the

carboxylate groups will produce changes in the electronic environment in interacting

atoms proportional to 1/R^3, where R is the distance between the charged group and

chemical shift probe. Backbone amide resonances are particularly sensitive to

interactions such as H-bonding (134, 143) and thus provide ideal probes of long-range

interactions with side-chain carboxylates. This phenomenon provides a powerful

mechanism to study long-range hydrogen bonding and salt bridge interactions in small

peptides (121, 150, 151).









Many of the FLP-18 peptides contain aspartic or glutamic acids, so we performed

1D NMR pH titration experiments on all peptides in Table 3-1 containing these residues

and, as controls, on peptides 5 and 11, neither of which showed any pH dependence in the

proton chemical shifts. Stackplots of the amide region for a representative set of peptides

are shown in Figure 3-3.

First, no chemical shifts in peptide 5 (Figs. 3-3E and 3-4E) or peptide 11 (data not

shown) have any pH dependence, demonstrating that backbone amide proton chemical

shifts are not intrinsically pH-dependent in this pH range. Second, several resonances in

other peptides have large pH-dependent shifts. To our knowledge no systematic study

has been undertaken to identify the maximum change in chemical shift of backbone

amide protons as a function of pH, but Wüthrich and coworkers showed that a well-

defined (~70-90% populated) hydrogen-bond between an aspartic acid side chain and a

backbone amide proton in a small protein led to a change of 1.45 ppm over the titratable

range of the aspartic acid (152). Thus, in Figure 3-3, some amide proton resonances of

non-titratable amino acids have pH-dependent shifts that are characteristic of significant

H-bond interactions. Others have smaller pH-dependent changes, suggesting either more

transient dynamic interactions or much longer and weaker H-bonds. In contrast, several

resonances in peptides with titratable groups show little or no pH dependence, showing

that these effects are relatively specific. Next, the spectra from the N-terminal truncated

peptides 3 and 8 are highly dependent on pH and are nearly perfect subsets of the same

regions in their full length counterparts. Consistent with chemical shift data in Figure 3-

2, this demonstrates that the N-terminal extensions of peptides 2 and 7 behave as

independent structural units. The extensions also do not interact significantly













[Figure 3-3 plots. Panels A-F: stacked one-dimensional spectra of the amide region (x-axis: ppm) recorded between pH 5.5 and pH 1.9.]



Figure 3-3: Amide region of one-dimensional NMR data, collected as a function of pH
from about 1.9 to 5.5. Peaks are labeled with their assigned amino acids and
the panels correspond to the following peptides: A. Peptide 3 = DFDGAM-
NH2, B. Peptide 2 = DFDGAMPGVLRF-NH2, C. Peptide 8 = GFGDEM-
NH2, D. Peptide 7 = GFGDEMSMPGVLRF-NH2, E. Peptide 5 =
SGSGAMPGVLRF-NH2, F. Peptide 1 = EMPGVLRF-NH2. pH dependent
interactions are summarized in Figure 3-5, and complete pKa analyses are
provided in Appendix D.











with the more C-terminal backbone atoms, which show relatively little pH dependence,

indicating that the conserved C-termini are less structured than the N-terminal extensions.

pH Dependence of Arginine Side-Chains Reveal Long-Range Interactions:

The penultimate arginine residue is highly conserved and found in the same

position in all FLPs. This arginine is at least 7 residues away from any carboxyl groups,

so we were surprised to find in several FLP-18 peptides that its epsilon proton (Arg Hε) is

pH dependent (Figure 3-4).

Peptide 5 demonstrates that there is no intrinsic pH dependence for Arg Hε over the

pH range investigated, and we conclude that in other peptides there are long-range

interactions between the Arg and the N-terminal carboxylates. Such an interaction would

indicate a non-covalent ring structure. These interactions show up in most of the FLP-18

analogues having N-terminal carboxyl sidechains and, at first glance, do not appear to

relate to the activity of the peptides (Table 3-1). For example, peptide 1 (one of the more

active peptides) has nearly the same Arg Hε pH dependence as peptide 4 (the least active

PGVLRF-NH2-containing peptide examined). Moreover, peptide 10, with similar activity to

peptide 1, has nearly no Arg Hε pH dependence. Thus, Arg interaction with acidic

residues alone is not sufficient to explain the difference in activity among the FLP-18

analogues tested.

Quantitative Determination of pKa Reveals Multiple Interactions:

Several of the peptides in Table 3-1 have more than one carboxylate, so it is not

always obvious which is responsible for the pH dependence of a particular resonance. If

the titrating groups have distinct pKa values, then it should be possible to determine the











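[Figure 3-4 plots. Panels A-F: stacked one-dimensional spectra (x-axis: ppm) recorded between pH 5.5 and pH 1.9. Labeled peaks include the C-terminal amide proton and the Arg epsilon proton.]
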

Figure 3-4: Arg Hε region of one-dimensional NMR data, collected as a function of pH
from 1.9 to 5.5. These long-range interactions on the penultimate C-terminal
Arg result from the titratable carboxylate groups on the N-termini. pH
dependent interactions are summarized in Figure 3-5, and complete pKa
analyses are provided in Appendix D. Legend: A. Peptide 4 =
DFDGEMPGVLRF-NH2, B. Peptide 2 = DFDGAMPGVLRF-NH2, C.
Peptide 10 = GFGDAMPGVLRF-NH2, D. Peptide 7 =
GFGDEMSMPGVLRF-NH2, E. Peptide 5 = SGSGAMPGVLRF-NH2, F.
Peptide 1 = EMPGVLRF-NH2











contribution of each carboxylate on each titrating resonance using Equation 1. Every

peak that exhibited pH-dependent chemical shifts was fitted using first one, then two,

then three pKa values. In all cases we used the minimum number of interacting pKa

values to get a good quality of fit and maximum linear regression coefficient (R^2) to the

experimental data. In the peptides with three titrating groups, inclusion of three

interacting groups in the calculation did not improve the fits beyond those obtained with

two.
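
The selection rule described above (use the fewest pKa values that still give a good fit) can be sketched as follows. This is a hypothetical Python illustration, not the procedure run in Origin: the function names, the example R^2 values, and the 0.005 tolerance used to decide that an extra titrating group "did not improve" the fit are all assumptions.

```python
def r_squared(observed, predicted):
    """Coefficient of determination between observed and fitted shifts."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fewest_sites(r2_by_sites, tolerance=0.005):
    """Smallest number of pKa values whose R^2 is within tolerance of the best."""
    best = max(r2_by_sites.values())
    for n_sites in sorted(r2_by_sites):
        if best - r2_by_sites[n_sites] <= tolerance:
            return n_sites


# Hypothetical R^2 values from fitting one resonance with 1, 2, and 3 pKa values.
chosen = fewest_sites({1: 0.90, 2: 0.998, 3: 0.999})
```

With these made-up values, the three-site fit is only marginally better than the two-site fit, so two interacting groups would be retained.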

The complete table of relative pKa contributions (c from Eq 1) and pKa values are

provided in Appendix D, and the interactions are represented graphically in Fig. 3-5. As

we discuss below, the interactions between titrating groups and resonances in these

peptides are rather complicated and dynamic. The data presented here illustrate that,

though there is a heterogeneous ensemble of H-bonding interactions between various

backbone amide protons, certain ones are prominent.

Using the pKas calculated for the pH dependent resonances of the peptides in this

study we can assign most H-bonding interactions between titrating carboxyl side-chains

and either backbone amide or Arg Hε protons. Figure 3-5 also illustrates the relative

strength of these interactions. The most significant interaction (the largest shift from a

long range interaction) is from a hydrogen bond between the D1 carboxylate and G4

amide in all peptides containing the N-terminal DFDG sequence. It contributes 40% to

the observed titration of the G4 amide in peptides 2 and 3, and 55% to that of peptide 4

(Appendix D). The calculated pKa of D1 (~3.0) is significantly lower than that of D3

(~4.0), indicating that D1 is likely interacting with the positively charged amino terminus

and stabilizing its negative charge. This is also seen in peptide 1, as E1 also has an








[Figure 3-5 diagram: peptide sequences with arrows from H-bond acceptor residues to their H-bond donor amide protons, with bar plots of backbone amide temperature coefficients.]



Figure 3-5: Proposed H-bonding Interactions Between Backbone Amide Protons and
Carboxyl Side-chains: DFDGAM-NH2 = Peptide 3, DFDGAMPGVLRF-NH2
= Peptide 2, DFDGEMPGVLRF-NH2 = Peptide 4, SGSGAMPGVLRF-NH2
= Peptide 5, EMPGVLRF-NH2 = Peptide 1. Each H-bond acceptor residue is
color coded to match the arrows leading from it to its H-bond donors. The
arrow widths are proportional to the relative extent to which that particular
interaction affects the chemical shift of the amide proton at the point end of
the arrow. The bar plots show the temperature coefficient of the backbone
amide proton resonances









unusually low pKa (~3.5). Additional support for this pKa assignment comes from the

pKas of the alpha protons of D1 and D3 of peptide 3 and D1 of peptide 2, which are 3.23,

4.09, and 2.97, respectively (Appendix D).
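
The weighted contributions described above can be illustrated numerically. The following is a minimal sketch, assuming that each titrating group adds a Henderson-Hasselbalch term scaled by its relative contribution c; the exact form of Eq 1 is not reproduced here. The function name, the starting shift, and the total shift change are hypothetical. The pKa values (~3.0 for D1, ~4.0 for D3) and the 40% weight for D1 come from the text; assigning the remaining 60% to D3 is an assumption made only for illustration.

```python
def titration_shift(ph, delta_acid, total_change, sites):
    """Chemical shift (ppm) at a given pH for a resonance that senses
    several titrating groups.

    sites: list of (pKa, c) pairs, where c is the relative contribution
    of that group to the total pH-dependent shift change (the c values
    are assumed to sum to 1).
    """
    shift = delta_acid
    for pka, c in sites:
        # Fraction of this group that is deprotonated at the given pH
        # (Henderson-Hasselbalch).
        deprotonated = 1.0 / (1.0 + 10.0 ** (pka - ph))
        shift += c * total_change * deprotonated
    return shift


# Hypothetical G4 amide: pKa values and the 40% D1 weight follow the text;
# the 60% D3 weight, the starting shift (8.30 ppm) and the total change
# (0.28 ppm) are illustrative assumptions.
g4_sites = [(3.0, 0.40), (4.0, 0.60)]
for ph in (1.0, 3.0, 4.0, 7.0):
    print(ph, round(titration_shift(ph, 8.30, 0.28, g4_sites), 3))
```

In a fit, the pKa values and weights would be the adjustable parameters; here they are fixed so that the shape of a two-site curve can be inspected directly.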

The interactions observed from pH titrations of peptides beginning in GFGD are

different from and less substantial than those beginning in DFDG. For example, the

largest pH dependent chemical shift change of the amide proton of a non-acidic amino

acid for peptide 7 is -0.12 ppm (G3), whereas G4 in peptide 2 is -0.28 ppm. Also, the

arginine sidechains of peptides 7 and 9 show a rather small chemical shift change in the

pH titration experiments (Fig. 3-4D) compared to the same resonance in peptides 2 and 4.

Additionally, the D4 sidechain of GFGD containing peptides seems to interact primarily

with backbone amides N-terminal to it (Appendix D). This is an entirely different

conformation from that observed in DFDG containing peptides, where D1 has a

substantial interaction with G4.

Temperature Dependence of Amide Chemical Shifts Corroborates Regions with H-
Bonding:

Although complicated and often over-interpreted, the temperature dependence of

amide proton chemical shifts in polypeptides can be associated with hydrogen bonding

(122, 153). Additionally, some peptides analyzed here lacked carboxyl side-chains, so

pH titration could not be used to identify possible structural interactions in these

peptides. We therefore measured temperature coefficients (TCs) for several relevant

peptides (Fig. 3-5). A rough guideline to interpreting TCs is that an absolute value less

than 4 indicates an internal hydrogen bond, values between 4 and 6 indicate weak

hydrogen bonding, and values greater than 6 indicate no hydrogen bonding (122,

153, 154). The magnitudes of the temperature coefficients for all amide protons in this

study are inversely correlated with the magnitude of chemical shift pH dependence for those

resonances (Fig. 3-6), which is consistent with H-bonding interactions as described

above.
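
The rough guideline above can be written as a short calculation. This is a minimal sketch, assuming that the temperature coefficient is the least-squares slope of amide chemical shift against temperature and that it is expressed in ppb/K (the unit is an assumption; it is not stated in the text). The function names and the example data are hypothetical, and the treatment of values falling exactly on 4 or 6 is an arbitrary choice.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs. temperature, in ppb/K."""
    n = len(temps_k)
    mean_t = sum(temps_k) / n
    mean_s = sum(shifts_ppm) / n
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_k, shifts_ppm))
    sxx = sum((t - mean_t) ** 2 for t in temps_k)
    return 1000.0 * sxy / sxx  # convert ppm/K to ppb/K


def classify_tc(tc):
    """Apply the rough guideline to the absolute value of a TC."""
    magnitude = abs(tc)
    if magnitude < 4:
        return "internal hydrogen bond"
    if magnitude <= 6:
        return "weak hydrogen bonding"
    return "no hydrogen bonding"


# Hypothetical amide proton measured at four temperatures.
temps = [278.0, 288.0, 298.0, 308.0]
shifts = [8.40, 8.37, 8.34, 8.31]
tc = temperature_coefficient(temps, shifts)
print(round(tc, 2), classify_tc(tc))
```

Because the guideline is coarse, the category is best read alongside the pH titration data rather than on its own.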

Overall Peptide Charge is Correlated With Activity on NPR-1:

The experimental data presented above demonstrate that acidic residues in the N-

terminal regions of FLP-18 peptides can interact with numerous amide protons and the

conserved penultimate Arg. Although there are many additional factors influencing

activity as addressed below, there appears to be a qualitative relationship between their

charge properties (particularly of the N-terminus) and activities on NPR-1. This

relationship is demonstrated in Figure 3-7 where the overall net charge at pH 7 of the

entire peptide is plotted against its activity on NPR-1.
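
The net charge plotted on the horizontal axis of such a figure can be estimated from sequence alone. The sketch below is one common way to do this, not necessarily the procedure used in this study: each ionizable group is treated independently with the Henderson-Hasselbalch equation. All pKa values in the code are generic assumed values, not the pKas measured here, and the function name is hypothetical. The two sequences are the native FLP-18 peptides discussed in this chapter, both C-terminally amidated.

```python
# Assumed generic pKa values; these are textbook-style approximations,
# not values determined in this study.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1}
N_TERM_PKA = 8.0
C_TERM_PKA = 3.5


def net_charge(sequence, ph=7.0, amidated=True):
    """Approximate net charge of a peptide at the given pH."""
    # Free amino terminus (positive when protonated).
    charge = 1.0 / (1.0 + 10.0 ** (ph - N_TERM_PKA))
    # An amidated C-terminus carries no charge; a free one is acidic.
    if not amidated:
        charge -= 1.0 / (1.0 + 10.0 ** (C_TERM_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[residue] - ph))
    return charge


for name, seq in (("DFDGAMPGVLRF-NH2", "DFDGAMPGVLRF"),
                  ("EMPGVLRF-NH2", "EMPGVLRF")):
    print(name, round(net_charge(seq), 2))
```

With these assumed pKa values the longer native peptide comes out close to neutral and the shorter one close to +1, because the second aspartate offsets the charge of the amino terminus.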

Discussion

The goal of this work has been to determine the conformational properties of

unbound FLP-18 neuropeptides from C. elegans and how these may affect their potencies

on NPR-1. The starting point for this study was the knowledge that two of the peptides

encoded by the flp-18 gene have significantly different potencies on NPR-1 (53). The

major findings reported above can be summarized as follows:

* The backbone of the conserved PGVLRF-NH2 is predominantly unstructured.

* DFDG forms a structural loop stabilized by H-bonding.

* Another loop forms when N-terminal acidic residue(s) interact with the conserved
C-terminal penultimate arginine side-chain.

* The DFDG loop interacts with the second loop to form a dynamic bicyclic structure
that might influence binding to NPR-1.

* Charge also affects the activity of FLP-18 peptides on NPR-1.











Figure 3-6: Relationship between Temperature Coefficients and pH dependence of
chemical shift among Backbone Amide Protons: Plotted here is the chemical
shift change with pH (x-axis: pH Dependence, ppm) vs with Temperature for
backbone amide resonances. R² for the linear fit is 0.58. All data from
peptides 1-4 are represented.















Figure 3-7: Relationship between overall peptide charge and activity on the NPR-1
receptor. The overall peptide charge at neutral pH (x-axis: Charge) is plotted
against activity for all of the peptides analyzed in this study. For the linear
fit, R² = 0.67. Legend: Filled Circle = 4, Open Circle = 2, Filled Square = 9,
Open Square = 5, Filled Triangle = 6, Open Triangle = 7, Filled Diamond = 1,
Open Diamond = 10, Filled Hexagon = 11, Open Hexagon = 12 (numbers
correspond to peptide numbers in Table 3-1).









The backbone structure of the conserved PGVLRF-NH2 is predominantly
unstructured

All NMR structural parameters measured in this study for the PGVLRF-NH2 region

of FLP-18 peptides indicate that the peptide backbone of this conserved sequence is

predominantly unstructured. The only significant evidence for any kind of structural

motif is the interaction between the conserved penultimate arginine side-chain and acidic

residues in the N-termini. These results suggest that the primary receptor-binding region

of FLP-18 peptides is highly flexible before interacting with NPR-1.

DFDG forms a structural loop stabilized by H-bonding

We observe transient H-bonding and ionic interactions within FLP-18 peptides

beginning in the sequence DFDG. Specifically, acidic residues in the variable N-termini

form substantial H-bonds to backbone amides N-terminal to the conserved proline

(Figures 3-5 and 3-8).

In the DFDG containing peptides, G4 has the smallest temperature coefficient of all

amides in the study; this is characteristic of involvement in a significant H-bonding

interaction (122, 153, 155). This phenomenon is particularly prominent in peptide 4,

where the D1 pKa rather than that of D3 is the most significant contributor to the G4

amide proton titration. Peptide 4 is also the least active PGVLRF-NH2 containing peptide tested.

Also, weak ROESY peaks were observed between D1 beta protons and G4 alpha protons

in both peptides 2 and 3 (data not shown). This further corroborates the pH titration

results that indicate significant long-range H-bonding between the D1 sidechain and G4

backbone amide proton of peptides beginning with DFDG. In contrast, the N-terminal

SGSG region of peptide 5 is unstructured based on our NMR results, and peptide 5 is one of the

most active peptides analyzed.









The DFDG loop may interact with the second loop to form a dynamic bicyclic
structure which reduces binding to NPR-1

There is no direct or simple correlation between the activity data and any one set of

NMR data. However, the two carboxylate residues in peptides 2 and 4 allow both the

N-terminal loop and the ionic interaction between the conserved arginine and the

aspartates (Fig. 3-8A). The increased activities of peptides 7 and 9, along with their

apparently weaker interaction between the penultimate arginine and acidic residues relative

to peptides 2 and 4, illustrate that the residues SM inserted in the middle of these peptides

can interfere with loop formation between the N- and C-termini. FLP-18 peptides are

short and flexible, and both loop interactions are likely dynamic. However, there is a

possibility that the bulk of the N-terminal loop in DFDG containing peptides is brought

into proximity of the conserved receptor-binding region by the action of the second loop

involving the penultimate arginine. We propose that this bicyclic structure reduces

binding to NPR-1.

Charge is also important in determining the activity of flp-18 peptides on NPR-1

There is a significant correlation between charge and activity such that more

positively charged peptides tend to activate NPR-1 better than more negatively charged

ones. Interestingly, the vast majority of predicted FLPs in C. elegans are

positively charged (18), including the peptide encoded by flp-21, which has an overall

charge of +3 and is active on both naturally occurring isoforms of NPR-1 (215F and

215V). However, peptides 4 and 9 have the same charge but different activities on NPR-1.

The acidic residues of peptides 4 and 9 differ substantially in their interaction with the C-

terminal arginine. This is likely due to the insertion of the residues SM in the middle of

peptide 9. Thus, the N-terminal DFDG loop in peptide 9 does not interact well with the
























Figure 3-8: Model of interactions thought to occur within native FLP-18 peptides. This
figure shows the most significant H-bonding interactions supported by NMR
data. A: For DFDGAMPGVLRF-NH2, H-bonds between the D1 side-chain
carboxylate and G4/A5 backbone amide protons, as well as the H-bonding/ionic
interaction between the D3 side-chain and the penultimate arginine, are shown
by red dashed lines. The N-terminal loop structure implicated in inhibiting
binding to NPR-1 is circled in black.
B: EMPGVLRF-NH2 is shown with the most significant H-bonding and ionic
interactions for which we have evidence. Notice that it has no N-terminal
loop, in contrast to DFDGAMPGVLRF-NH2. Also, the same unstructured
region for both peptides is shown in ribbon view. The N- and C-termini are
also labeled on both peptides.









penultimate arginine, whereas that of peptide 4 does. This further supports the bicyclic

model and the effect of a two-loop conformation on the activities of DFDG containing

peptides.

Peptides 6 and 12 were often outliers in our attempts to correlate specific NMR

data parameters to activity results. Peptide 6 was designed to possess a helix in the N-

terminus, and we predicted reduced binding to NPR-1 resulting in activity similar to that

of peptide 2. This prediction was incorrect, and peptide 6 had more activity than peptide

2. However, with no carboxylates, peptide 6 lacks the ability to form sidechain mediated

H-bonding loops, which our model suggests should give it an activity more like that of

peptides 1 and 5. Thus, the activity of peptide 6 (intermediate between peptides 1 and 2)

suggests that other properties of its structure modulate its potency.

Peptide 12 unexpectedly had almost exactly twice the activity of peptide 1. It is

composed of two copies of the conserved PGVLRF sequence that is responsible for FLP-

18 activity on NPR-1. Previous studies on FLP receptors show that the C-terminal amide

group is necessary for activity (40), so it is extremely unlikely that the C-terminal

PGVLRF in peptide 12 can interact with the active site of NPR-1. However, this peptide

is also the most positively charged of all the peptides tested. This is consistent with our

observation that a peptide's charge influences its activity on NPR-1.

The two native FLP-18 peptides in this study, DFDGAMPGVLRF-NH2 and

EMPGVLRF-NH2, differ in both potency and efficacy. We have shown that N-terminal

structure, peptide charge, loop formation and backbone flexibility in PGVLRF-NH2

modulate the activity of FLP-18 peptides on NPR-1. One interesting feature of the dose

response curves in Figure 3-1 is that the two longer peptides have a larger maximal









response and a steeper linear portion than the shorter peptide. This suggests that the

native peptides could induce different configurations of the NPR-1 receptor with different

abilities to couple to the G-protein pathway under subsequent second messenger

pathways (156-159). Both the A. suum and C. elegans long peptides have been isolated

(33, 39), demonstrating that these exist in vivo. However, other studies (160, 161) have

shown that many peptide degradation products can also be found in cells. Perhaps

multiple forms of FLP-18 peptides could shape the behavioral response to NPR-1 activity

in a way that could not be achieved by any one peptide alone. It is possible that the

peptides function together as a bouquet (i.e., an ensemble) to achieve a unique,

beneficial, fine-tuned response (20, 121).














CHAPTER 4
ANISOMORPHAL: NEW INSIGHTS WITH SINGLE INSECT NMR

Introduction

Over 2500 walkingstick insect species (order Phasmatodea) have been identified on

earth so far (162). Many of these are known or postulated to produce allomonal

defensive secretions (80, 162-170) (allomone = compound secreted by one organism that

negatively affects the behavior of another). However, to date, the chemical composition of

the secretions from only a handful of species has been characterized (80, 164-169, 171).

Anisomorpha buprestoides is a walkingstick insect (phasmid) (Figure 4-1) which is

common to the southeastern United States.

It ranges from Florida to Texas and North to South Carolina, including the

following states: Florida, South Carolina, Georgia, Alabama, Mississippi, Louisiana, and

Texas (172). They are commonly found in mating pairs with the male riding on the

female's back (Figure 4-1). This is stereotypical behavior for the genus and begins when

the animals are not yet adult, which is atypical in insects. The species is mostly nocturnal

and groups of them often cluster on tree trunks or in hidden areas in the daytime while

remaining motionless. When disturbed or threatened, A. buprestoides is well known for

accurately spraying a defensive secretion up to 40 cm toward the offending stimulus,

causing temporary blindness or irritation (173). The active component of this secretion

was characterized as a cyclopentanyl monoterpene dialdehyde called "anisomorphal" by

Meinwald et al. in 1962 (80). This effort required over 1000 "milkings" of hundreds of

individual females. The substance was immediately extracted into methylene chloride






































Figure 4-1: Adult pair of Anisomorpha buprestoides on leaves of a sweetgum tree
(Liquidambar styraciflua). Adult females are usually about 6-7 cm in length
and adult males are about 4-5 cm. Photo by Aaron T. Dossey.









for analysis. Samples were mostly from females due to their larger body size and larger

venom ejection volume.

At about the same time, Cavill and Hinterberger isolated a similar compound in the

ant species Dolichoderus acanthoclinea (order Hymenoptera) that they named

"dolichodial" (174). Later, two other isomers of dolichodial were identified in cat thyme (Teucrium

marum) (175-177). In 1997, Eisner et al. reanalyzed anisomorphal for comparison with

the defensive secretion from another phasmid species. This work referred to the solution

sprayed as a pure single stereoisomer from extracts of A. buprestoides secretions (164).

The stereospecific structure shown in that paper referenced previous work by Smith et al.

(171) who referred to work by Pagnoni et al. (176). The Pagnoni paper from 1976

assigned specific stereochemistry to anisomorphal by comparison of physical and spectral

results (176) with those of Meinwald et al. in 1962 (80).

In this chapter I present new work on the defensive secretion of A. buprestoides.

Here it is necessary to define some terms that will be used: A milking (noun) will refer

to a single ejection of defensive spray from an insect. Secretion will refer to the crude

substance coming from an insect, either pooled or from a single milking. Milking (verb)

will also be used to describe the process of collecting secretion samples from insects.

Due to lack of clarity of stereospecific common names for various isomers of dolichodial,

all stereoisomers of this compound will henceforth be referred to collectively as

"dolichodial-like". In the Discussion section of this chapter, a detailed designation of

names for specific stereoisomers of dolichodial will be given based on the data presented.









Using very fresh unpurified samples of small single milkings from half grown male

A. buprestoides, we were able to collect high quality 1D and 2D NMR spectra in the time

normally required for standard 600 μL samples. Here we report previously unobserved

isomeric heterogeneity of dolichodial-like isomers. The ratios of the two major isomers

vary between individual insects and over time in some individuals. Each isomer is in

equilibrium with its geminal diol at one of the aldehyde positions (Figure 4-2).

We also show that in addition to those compounds, glucose is present in nearly

equimolar concentration compared with the dolichodial-like isomers. Additionally, we

were able to analyze a very small sample of defensive spray from a recently

described species of phasmid insect, Peruphasma schultei (163). Based on data that will

be presented, the defensive spray of this species contains both glucose and, in contrast to

A. buprestoides, a homogeneous and unique dolichodial-like isomer that we have named

peruphasmal. The rationale for this nomenclature will be given in the Discussion section

of this chapter.

This work has been made possible by a new high-temperature superconducting 1-mm

NMR probe which Dr. Edison helped develop (77). Other analytical techniques have

also improved tremendously in recent decades, as discussed in Chapter 1. The work in

this chapter takes advantage of these cutting edge tools in analytical chemistry. It shows

how, using such advances in analytical chemistry, our planet's chemical biodiversity can

now be explored as never before.




Full Text

PAGE 1

CHEMICAL BIODIVERSITY AND SIGNALING: DETAILED ANALYSIS OF FMRFamide-LIKE NEUROPEPTIDES AND OTHER NATURAL PRODUCTS BY NMR AND BIOINFORMATICS By AARON TODD DOSSEY A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLOR IDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2006

PAGE 2

Copyright 2006 by Aaron Todd Dossey

PAGE 3

To my family; to members of the laborato ry of Dr. Arthur S. Edison; and to God Almighty and the magnificent natural world he created which has given me much joy and a basis for my career in science.

PAGE 4

iv ACKNOWLEDGMENTS As with any endeavor one may pursue in li fe, I cannot take sole credit for anything I have done and, thus, thanks are certainly in order to those who ha ve helped make my PhD possible. First and foremost, I would like to thank my family. In particular, I thank my grandparents, Jerry and Emma Dossey, a nd mother, Teresa (Dossey) Scott, for instilling in me three key components of my su ccesses in life thus far: determination, a strong work ethic, and faith in myself. I w ould also like to thank God the creator for my life and the bountiful life forms of this earth that I enjoy st udying every day. I would also like to thank Oklahoma State Un iversity and others who helped foster the early stages of my career in bioche mistry. I thank Dr. Eldon C. Nelson, my undergraduate advisor, for showing great car e about my career and keeping me focused and motivated on the career related aspects of my tenure there. For many interesting and encouraging conversations about entomology from which I lear ned a lot, I would like to thank Don C. Arnold, the curator of the Oklahoma State University Entomological Museum. I also thank the Southwestern Bell Telephone Corporation and Sylvia Coles Denebeim for supporting me through gener ous scholarships which helped tremendously with school related costs and allowe d me to focus on my education. At the University of Flor ida (UF), I would like to th ank Dr. Arthur Edison, my supervisory committee chair, for providing a re search atmosphere that has allowed me to develop as a scientist. I al so thank Dr. Edison for his pati ence while I was training in protein NMR and learning how to write scientific articles. I thank James R. Rocca in the

PAGE 5

v UF AMRIS (Advanced Magnetic Resonance Imaging and Spectroscopy) facility for countless hours of help and discussion. Hi s tireless efforts in NMR training were an invaluable complement to the training I received from Dr. Edison. Pat Jones, the Biochemistry department secretary, was also in valuable to me by keeping me in line with deadlines and course registration and I thank her for that as well. I also thank my supervisory committee members (Drs. Arthur S. Edison, Ben M. Dunn, Brian D. Cain, Joanna R. Long, and Stephen A. Hagen) for always being available to help guide me through my PhD studies. I also thank the Un iversity of Florida for awarding me the Grinter Fellowship for the first th ree years of my tenure there. For help with specific projects, others cer tainly deserve my thanks. I thank Dr. Cherian Zachariah for help, training, and experiments in protein expression and purification and NMR. For experiments pe rformed and exciting discussion on phasmid insect pheromone chemistry, I thank Dr. Sp encer Walse and James Rocca. For data resulting in my first publicati on (Chapter 3), I thank Drs. Mario de Bono, Peter Evans, and Vincenzina (Reale) Evans, and Heathe r Chatwin for bioassay experiments on FLP-18 related neuropeptides. For their support and friendship, I would like to thank Omjoy Ganesh, Iman Al-Naggar, Ramazan Ajredini Fatma Kaplan, Dr. James Smith, and Dr. Terry B. Green (all fellow member s of Dr. EdisonÂ’s lab).

PAGE 6

vi TABLE OF CONTENTS page ACKNOWLEDGMENTS.................................................................................................iv LIST OF TABLES...............................................................................................................x LIST OF FIGURES...........................................................................................................xi ABSTRACT.....................................................................................................................xi ii CHAPTER 1 INTRODUCTION........................................................................................................1 Importance of Nematodes.............................................................................................1 FMRFamide-Like Neuropeptides (FLPs).....................................................................5 FLP Precursor Proteins..........................................................................................6 Receptors and Functions........................................................................................6 FLPs as Natural Products......................................................................................9 Natural Products.........................................................................................................10 Dissertation Outline....................................................................................................13 2 BIOCHEMICAL PROPERTIES OF FL PS AND THEIR PRECURSOR PROTEINS FROM THE NEMATODE Caenorhabditis elegans ..................................................17 Introduction.................................................................................................................17 Experimental Methods................................................................................................18 Data Mining for Nematode fl p Precursor Protein Sequences.............................18 Alignment and Phylogenetic Analys is of flp 
Precursor Proteins........................20 Analysis of Biochemical Properties of flp Precursor Proteins and Figure Generation........................................................................................................21 Analysis of FLP Mature Peptide Biochemical Properties...................................22 Results........................................................................................................................ .23 Analysis of Biochemical Propert ies of flp Precursor Proteins............................23 Sequence Repetition Patterns.......................................................................34 Charge Distribution......................................................................................35 Unstructured Propensity...............................................................................36 Other Features..............................................................................................40

PAGE 7

vii Biochemical Properties of Mature Processed FLPs............................................41 Peptide Charge.............................................................................................44 Peptide Length and Amino Acid Conservation............................................44 Discussion...................................................................................................................47 Grouping of FLP Subfamilies by Precu rsor and Peptide Properties...................47 The flp-1 Group............................................................................................48 The flp-6 Group............................................................................................49 The flp-7 Group............................................................................................49 The flp-20 Group..........................................................................................50 The flp-21 Group..........................................................................................51 Charge Compensation and Possibl e flp Precursor Structure...............................53 3 NMR ANALYSIS OF C. 
elegans FLP-18 NEUROPEPTIDES: IMPLICATIONS FOR NPR-1 ACTIVATION.......................................................................................55 Introduction.................................................................................................................55 Experimental Procedures............................................................................................56 Peptide Synthesis.................................................................................................56 Peptide Sample Preparation.................................................................................56 Biological Activity Assays:.................................................................................57 NMR Spectroscopy.............................................................................................58 Results........................................................................................................................ .59 Peptide Design Rationale and Physiological Responses:....................................59 NMR Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:..........................................................................................................63 pH Dependence of Amide Proton Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:.................................................................67 pH Dependence of Arginine Side-Chain s Reveal Long-Range Interactions:.....70 Quantitative Determination of pKa Re veals Multiple Interactions:....................70 Temperature Dependence of Amide Chemi cal Shifts Corroborates Regions with H-Bonding:......................................................................................................74 Overall Peptide Charge is Corre lated With Activity on NPR-1:.........................75 
Discussion...................................................................................................................75 The backbone structure of the conserved PGVLRF-NH2 is predominantly unstructured......................................................................................................78 DFDG forms a structural loop stabilized by H-bonding.....................................78 The DFDG loop may interact with the second loop to form a dynamic bicyclic structure which reduces binding to NPR-1......................................................79 Charge is also important in determin ing the activity of flp-18 peptides on NPR-1..............................................................................................................79 4 ANISOMORPHAL: NEW INSIGHTS WITH SINGLE INSECT NMR.................83 Introduction.................................................................................................................83 Experimental Procedures............................................................................................88 Animal Collection and Rearing...........................................................................88 Sample Collection and Handling.........................................................................88

PAGE 8

viii NMR Spectroscopy.............................................................................................90 High Pressure Liquid Chromatogra phy – Mass Spectrometry (LC-MS)............92 Gas Chromatography (GC)..................................................................................93 Gas Chromatography – Mass Spectrometry (GC-MS).......................................94 Results........................................................................................................................ .94 NMR of Single Milkings Shows a New Component and Isomeric Heterogeneity94 Glucose Verified by Chromatograp hy and Colorimetric Assay........................104 Stereoisomeric Heterogeneity Veri fied by Gas Chromatography and Mass Spectrometry..................................................................................................105 Discussion.................................................................................................................107 Glucose Discovered in Stick Insect De fensive Spray – Potential Functions....107 Isomeric Heterogeneity in Phasmi d Defensive Compounds – Chemical Biodiversity....................................................................................................108 Peruphasmal – A Novel Phasmid Defensive Compound Isomer......................109 5 CONCLUSIONS AND FUTURE DIRECTIONS...................................................111 Conclusions...............................................................................................................111 Future Directions......................................................................................................115 Evolutionary History of FLPs and Other Neuropeptides..................................116 Neuropeptide Structure/Function Analyses.......................................................118 Anisomorphal and Other Insect Natural Products.............................................119 APPENDIX A ACCESSION NUMBERS (WITH 
CORRESPONDING SEQUENCE NAMES) FOR ALL FLP PRECURSOR PROTEIN SEQUENCES FROM ALL NEMATODE SPECIES USED IN WORK RELATED TO CHAPTER 2.....................................121 B ALIGNMENTS OF FLP PRECURSOR PROTEINS FROM Caenorabditis elegans AND OTHER NEMAT ODE SPECIES...................................................................128 C 1H NMR ASSIGNMENTS FOR ALL PEPTIDES EXAMINED IN CHAPTER 3.163 D PKA VALUES CALCULATED FOR RE SONANCES WITH PH DEPENDANT CHEMICAL SHIFTS...............................................................................................175 E 1H AND 13C NMR CHEMICAL SHIFT ASSIGNMENTS FOR DOLICHODIALLIKE ISOMERS FROM THE WALKING STICK INSECT SPECIES Anisomorpha buprestoides AND Peruphasma schultei ..................................................................178 F HLPC-MASS SPEC IDENTIFICATION OF GLUCOSE FROM DEFENSIVE SECRETIONS OF Anisomorpha buprestoides ........................................................179 G 2D NMR SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei ..................................................................181

PAGE 9

ix H GAS CHROMATOGRAPHY TRACES AND MASS SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei AND EXTRACTS OF Teucrium marum ...........................................................................192 LIST OF REFERENCES.................................................................................................195 BIOGRAPHICAL SKETCH...........................................................................................212

PAGE 10

x LIST OF TABLES Table page 1-1 Global numbers of the major human nematode infections........................................4 1-2 Sample quantities required for analysis using the three most powerful analytical techniques for chemical structure determination......................................................14 1-3 Mean values for a sele ction of molecular properti es among natural, drug, and synthetic compounds................................................................................................15 2-1 Sequence patterns and properties comm on to various groups of flp precursor proteins in C. elegans ...............................................................................................33 3-1: Peptides examined by NMR and their activities on NPR-1.......................................61


LIST OF FIGURES

Figure
1-1 FMRFamide-Like Neuropeptides predicted from database searches for precursor proteins
1-2 The flp-18 and afp-1 precursor proteins from C. elegans and A. suum
1-3 Schematic diagram of the neuropeptide activated GPCR NPR-1 from C. elegans
2-1 Graphical illustrations of various chemical properties of FMRFamide-like Neuropeptide precursor proteins in C. elegans
2-2 Examples of known natively structured and unstructured proteins analyzed by IUPRED
2-3 Portion of the alignment for all known flp-7 precursor protein orthologues in the phylum Nematoda
2-4 Groupings of similar FLP neuropeptide subfamilies based on their chemical properties and precursor protein sequence similarities
2-5 Predicted net charge frequency (pH 7.0) for all FLP neuropeptides predicted from C. elegans
2-6 Prevalence of peptide lengths for all predicted FLP neuropeptides in C. elegans
3-1 Dose response curves of select FLP-18 peptides
3-2 NMR chemical shift deviations from random coil values
3-3 Amide region of one-dimensional NMR data, collected as a function of pH
3-4 Arg H region of one-dimensional NMR data collected as a function of pH
3-5 Proposed H-bonding Interactions Between Backbone Amide Protons and Carboxyl Side-chains
3-6 Relationship between Temperature Coefficients and pH dependence of chemical shift among Backbone Amide Protons
3-7 Relationship between overall peptide charge and activity on the NPR-1 receptor


3-8 Model of interactions thought occurring within native FLP-18 peptides
4-1 Adult pair of Anisomorpha buprestoides on leaves of a sweetgum tree (Liquidambar styraciflua)
4-2 Geminal diol equilibrium observed for dolichodial-like isomers in defensive secretions from A. buprestoides and P. schultei
4-3 Example of milking procedure for collecting defensive secretion from individual A. buprestoides
4-4 Procedure used for obtaining pooled defensive spray samples from A. buprestoides and P. schultei
4-5 1D NMR spectra of single milking samples from A. buprestoides, its chloroform extract, the aqueous fraction, a sample of glucose for comparison, and a glucose spike experiment
4-6 Expansions of vinyl region of 1D 1H NMR spectra for single milkings of A. buprestoides on different days
4-7 2D COSY and ROESY 1H NMR spectra of defensive secretions from A. buprestoides and P. schultei
4-8 Integrals of aldehyde and vinyl regions from single milkings of A. buprestoides defensive secretion
4-9 Gas Chromatographs of dolichodial-like isomers isolated from defensive secretions of the insects Anisomorpha buprestoides and Peruphasma schultei and extracts of the plant Teucrium marum


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

CHEMICAL BIODIVERSITY AND SIGNALING: DETAILED ANALYSIS OF FMRFAMIDE-LIKE NEUROPEPTIDES AND OTHER NATURAL PRODUCTS BY NMR AND BIOINFORMATICS

By

Aaron Todd Dossey

August 2006

Chair: Arthur Scott Edison
Major Department: Biochemistry and Molecular Biology

Natural products are simply the molecules produced by biological systems. My studies examined the structure, function, and chemical biodiversity of two types of natural products, FMRFamide-like neuropeptides and defensive dolichodial-like compounds from walkingstick insects (Order Phasmatodea).

FMRFamide-like neuropeptides (FLPs) make up one of the largest known neuropeptide families. The first member was identified in 1977 as a cardioactive component of extracts from clam. FLPs are characterized by an N- to C-terminal gradient of conservation, with subfamilies produced on the same gene having similar C-terminal sequences. The canonical C-terminal motif of FLPs is Arg-Phe-NH2. They are involved in a wide variety of biological and behavioral processes. They have been identified in humans and are particularly well represented in invertebrates.

First, my study examined evolutionary relationships among several FLP subfamilies in the nematode Caenorhabditis elegans. Using bioinformatics tools and precursor protein sequence comparisons, I identified several important features of FLP neuropeptides and their precursor proteins and genes. Also, some subfamilies of FLP neuropeptides and their precursor proteins were categorized into groups based on a number of similar features.

Next, I examined the structure/function relationships of two FLPs (EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2) from the FLP-18 subfamily in C. elegans. These peptides have been demonstrated to regulate feeding behavior in C. elegans by activating the NPR-1 receptor. NMR pH titration experiments and chemical shift indexing were used to probe transient hydrogen bonding and electrostatic interactions between aspartate sidechains and amide protons. These results indicate that the longer of the two native C. elegans peptides possesses N-terminal structure, stabilized by a hydrogen bonding network, which reduces its potency on the NPR-1 receptor relative to the shorter peptide.

To examine non-polypeptide natural products, a novel microsample NMR probe was used to examine the chemical biodiversity of walkingstick insects. The results show the following: 1) Anisomorpha buprestoides produces two stereoisomers of dolichodial in its defensive spray, 2) Peruphasma schultei produces only a single isomer (Peruphasmal) of the same compound, and 3) defensive secretions of both species contain glucose (previously unreported from walkingstick insect defensive secretions). These findings would not have been possible using other NMR technologies.


CHAPTER 1
INTRODUCTION

Importance of Nematodes

Nematodes (Kingdom Animalia, Phylum Nematoda) are among the most numerous groups of animals on the planet. The number of species worldwide is controversial (1), but some estimate there are as many as 1 million species in the world (2, 3). As well as being quite speciose, nematodes are among the most ubiquitous groups in the animal kingdom, and four of every five individual living animals are nematodes (4). One hundred grams of a typical soil sample can contain about 3,000 individual nematodes (3). Nathan Augustus Cobb's rather famous 1914 description of the abundance of nematodes on earth illuminates their prevalence (5):

In short, if all the matter in the universe except the nematodes were swept away, our world would still be dimly recognizable and if, as disembodied spirits, we could then investigate it, we should find its mountains, hills, vales, rivers, lakes, and oceans represented by a film of nematodes. The location of towns would be decipherable, since for every massing of human beings there would be a corresponding massing of certain nematodes. Trees would still stand in ghostly rows representing our streets and highways. The location of the various plants and animals would still be decipherable, and, had we sufficient knowledge, in many cases even their species could be determined by an examination of their erstwhile nematode parasites.

Nematodes take advantage of a wide variety of ecological niches. They occur in arid desert areas, the bottoms of freshwater bodies, and in hot springs; they have even been thawed out alive from Antarctic ice (6). The trait for which nematodes are best known is their parasitism. Of over 20,000 species of nematodes described, about 25 to 33% parasitize vertebrates (1, 2), causing extensive health problems for people as well as the livestock animals on which we depend. Many other nematode species are plant parasites, and cause about $80 billion in crop damage annually (4). However, nematodes also present great potential benefits to mankind, agriculture and ecology. Many nematodes are parasitic on pest organisms such as insects (7), and several species are even commercially available as pest control agents (8).

As mentioned above, nematodes are best known for their parasitism, particularly of humans. Accumulation of nematode parasites in various human host tissues causes a wide variety of physiological and deforming pathologies. Filarial diseases, caused when filarial nematodes like Wuchereria bancrofti and Brugia malayi infect the lymphatic system, result in fever, chills, skin lesions, and other debilitating symptoms. If left untreated, filarial infections can manifest as elephantiasis. In this later stage of the disease, the lymph ducts are actually clogged with nematodes and fluid builds up in the extremities, causing them to swell and become grossly deformed. Intestinal nematode parasites (such as Ascaris lumbricoides in humans) can result in malnutrition, difficulty breathing (when they migrate to the lungs), and intestinal blockage of the host. Hookworms (Necator americanus and Ancylostoma duodenale) can result in skin rashes and asthma-like symptoms when they penetrate skin or migrate into the lungs during their lifecycle. Additionally, since adult hookworms live in the intestine and feed on blood, severe infections can result in abdominal pain, anemia, and heart conditions. Another intestinal parasite, Trichuris trichiura (the human “whipworm”), can also cause anemia as well as diarrhea and abdominal pain.

Infection by a type of filarial worm, Onchocerca volvulus (common in some tropical regions), can cause the disease known as river blindness. It is the larval stage of this parasite which causes the most severe complication. The adult female, which resides under the human host's skin, lays eggs there. Once the larvae hatch, they reside in the bloodstream awaiting uptake by a secondary host, the black fly (genus Simulium). Some migrate into the human eye. Immune response to the worms in the eye causes damage to the cornea, resulting in blindness. In addition to those described above (often considered the major nematode parasites of humans), many other nematode parasites have a serious impact on millions of people worldwide (Table 1-1).

Given the importance of nematodes to mankind and the rest of the planet, it is clear that intense study of nematodes is necessary to both control their negative impacts and exploit their potential benefits. One particular nematode, Caenorhabditis elegans, has been employed as one of the best characterized model organisms in modern biology. It was first described in 1900 in a study on nematode reproduction (9). Among the most significant advances in the understanding of the biology of C. elegans was the mapping of the complete cell lineage from fertilized egg through adult hermaphrodite and male (10-12). That work was vital in elucidating the function of each cell type in these animals. Such information also helps answer genetic questions. To date, this is the only multicellular organism for which such information is available. To complement this fine level of anatomic and developmental detail, the entire genome for C. elegans has also been sequenced (13).

The wealth of nematode biology (mainly the extensive characterization of C. elegans) tells us much about the neuroanatomy and neuroconnectivity of these animals (2). Among genera of subclass Rhabditia (Caenorhabditis, Ascaris, etc.) the neuroanatomy is both conserved and simple (2, 14). An adult hermaphrodite C. elegans


Table 1-1: Global numbers of the major human nematode infections (in millions).

Species                    Sub-Saharan Africa   Latin America(a)   Middle Eastern Crescent   India    China    Other Asia and Islands   TOTAL
Ascaris lumbricoides       105.0                171.00             96.00                     188.0    410.0    303.0                    1273.0
Trichuris trichiura        88.0                 147.00             64.00                     134.0    220.0    249.0                    902.0
Hookworm(b)                138.0                130.00             95.00                     306.0    367.0    242.0                    1277.0
Onchocerca volvulus        17.5                 0.14               0.03                      __       __       __                       17.7
Wuchereria bancrofti(c)    50.2                 0.40               0.34                      45.5     __       __                       115.1
Brugia malayi(c)           __                   __                 __                        2.6      4.2      6.2                      12.9

Notes: Data from the Global Burden of Disease Study from the World Bank. Regions are also as defined by this study. (a) Includes Caribbean nations. (b) Both Necator americanus and Ancylostoma duodenale combined. (c) Both infection and disease cases. Table regenerated from “The Biology of Nematodes” 2002, p. 600 (15).


has exactly and invariably 302 neurons (14). Other species in the Rhabditia also have invariant numbers of neurons (invariant among individuals within a species) ranging from 150-300 cells (2). This simplicity and lack of variability seemingly contradicts the behavioral diversity observed in this group of animals. Rhabditia range from small free-living nematodes with simple soil existences (such as the Caenorhabditis sp.) to larger, more complex, and more specialized parasitic species (such as Ascaris sp.). One component of their nervous system which adds additional levels of diversity and complexity is their neurochemistry. A substantial part of the neurochemical diversity observed in phylum Nematoda is their neuropeptides (16). Though little is known about the biological function of most nematode neuropeptides, the best characterized family of these is the FMRFamide-Like Neuropeptides (FLPs). They are the second largest neuropeptide family in nematodes (second only to peptides hypothesized to be processed from the Neuropeptide-Like (nlp) protein gene family) and certainly the most studied to date (16-18).

FMRFamide-Like Neuropeptides (FLPs)

FMRFamide was first discovered in 1977 by Price and Greenberg as a cardioexcitatory peptide from the clam Macrocallista nimbosa (19). FMRFamide-Like Peptides (FLPs) are the largest family of neuropeptides found in invertebrates (17, 18, 20, 21), but mammalian (even human) FLPs have also been identified (22-25). These peptides are characterized by an N- to C-terminal gradient of increasing sequence conservation, and most end in RF-NH2. This is true when FLP peptide sequences are compared as a whole, from within taxa, within species, or even on specific precursor proteins (18, 20, 21, 26). For many neuropeptides, including FLPs, the C-terminus is conserved. However, other neuropeptide families have different patterns of conservation; for example, in insect orcokinins the N-terminus is the conserved region (27), and in insulin the cystine framework and other central residues are conserved (28).

The first nematode FLP, AF1, was isolated from Ascaris suum (29), and most subsequent early nematode FLP work was done on this species (30-36). The FLPs are highly expressed in nematodes, and thus are likely important chemical components of their anatomically simple nervous systems (17, 21, 32). Though much work has been done to elucidate the activities of FLPs, the definitive biological functions of FLPs (Figure 1-1) are still unknown. All hypothetical FLPs produced by C. elegans are tabulated in Figure 1-1.

FLP Precursor Proteins

FLPs, like most neuropeptides and hormones, are synthesized as part of larger precursor proteins and processed in the secretory pathway (37). Peptides on a particular precursor have conserved regions in the mature peptides that are often associated with receptor binding and make up a subfamily (18). In C. elegans, 28 different genes encoding well over 60 possible FLPs have been identified using bioinformatic approaches (18, 21, 38), and 28 of the putative processed peptides have been detected biochemically (39-43) (Figure 1-1). Examples of two precursor proteins from two nematode species are shown in Figure 1-2.

Receptors and Functions

FLPs are involved in a wide range of biological processes that have been reviewed previously (26, 44-47). Some of the more prominent functional studies have focused on their role in cardioexcitation (19), muscle contraction (33), modulation of the action of


flp-24 (n): VPSAGDMMVRFG *
flp-23a: GALFRSG, VVGQQDFLRFG
flp-23b: NSGCPGALFRSG, TKFQDFLRFG
flp-22 (m): SPSAKWMRFG, SPSAKWMRFG, SPSAKWMRFG *
flp-21: GLGPRPLRFG
flp-25: DYDFVRFG, ASYDYIRFG
flp-26 (o): EFNADDLTLRFG, GGAGEPLAFSPDMLSLRFG *
flp-27: GLGGRMRFG
flp-28: VLMRFG
flp-20: AVFRMG, AMMRFG, AMMRFG, SVFRLG
flp-19 (l): WANQVRFG, ASWASSVRFG
flp-16 (j): AQTFVRFG, AQTFVRFG, GQTFVRFG *
flp-18 (k): DFDGAMPGVLRFG, EMPGVLRFG, SVPGVLRFG, SVPGVLRFG, EIPGVLRFG, SEVPGVLRFG, DVPGVLRFG, SVPGVLRFG *
flp-17: KSAFVRFG, KSAFVRFG, KSQYIRFG
flp-1a (a): KPNFMRYG, SAAVKSLG, AGSDPNFLRFG, SQPNFLRFG, ASGDPNFLRFG, SDPNFLRFG, AAADPNFLRFG, SADPNFLRFG, PNFLRFG *
flp-15: GGPQGPLRFG, RGPSGPLRFG
flp-14 (i): KHEYLRFG, KHEYLRFG, KHEYLRFG, KHEYLRFG *
flp-13 (h): AMDSPLIRFG, AADGAPLIRFG, APEASPFIRFG, AADGAPLIRFG, APEASPFIRFG, ASPSAPLIRFG, SPSAVPLIRFG, SAAAPLIRFG, ASSAPLIRFG
flp-12: RNKFEFIRFG
flp-11c (g): AMRNALVRFG, ASGGMRNALVRFG, SPLDEEDFAPESPLQG, NGAPQPFG
flp-11b (g): AMRNALVRFG, ASGGMRNALVRFG, SPLDEEDFAPESPLQG *
flp-11a (g): AMRNALVRFG, ASGGMRNALVRFG, SPLDEEDFAPESPLQG, NGAPQPFVRFG *
flp-10: QPKARSGYIRFG
flp-9 (f): KPSFVRFG, KPSFVRFG *
flp-8 (e): KNEFIRFG, KNEFIRFG, KNEFIRFG *
flp-7: TPMQRSSMVRFG, SPMQRSSMVRFG, SPMQRSSMVRFG, SPMQRSSMVRFG, SPMERSAMVRFG, SPMDRSKMVRFG, SSIDRASMVRLG, TPMQRSSMVRFG
flp-6 (d): KSAYMRFG, KSAYMRFG, KSAYMRFG, KSAYMRFG, KSAYMRFG, KSAYMRFG *
flp-5 (c): PKFIRFG, AGAKFIRFG, GAKFIRFG *
flp-4: PTFIRFG, ASPSFIRFG
flp-3 (b): SPLGTMRFG, TPLGTMRFG, SAEPFGTMRFG, NPENDTPFGTMRFG, ASEDALFGTMRFG, EDGNAPFGTMKFG, EAEEPLGTMRFG, SADDSAPFGTMRFG, NPLGTMRFG
flp-2b: LRGEPIRFG, SPREPIRFG
flp-2a: LRGEPIRFG, SPREPIRFG
flp-1b (a): KPNFMRYG, SADPNFLRFG, SQPNFLRFG, ASGDPNFLRFG, SDPNFLRFG, AAADPNFLRFG, SADPNFLRFG, PNFLRFG *
flp-1c (a): KPNFMRYG, SADPNFLRFG, SDPNFLRFG, AAADPNFLRFG, SADPNFLRFG, PNFLRFG *

Figure 1-1: FMRFamide-Like Neuropeptides predicted from database searches for precursor proteins. Peptide subfamilies are denoted by the number after “flp” underlined in red (i.e., peptides produced on the same precursor protein, the proteins being paralogues). A letter beside the number denotes peptides derived from an alternate transcript of that precursor (for example, “flp-1a”). Each peptide has a C-terminal glycine (G) in red that is predicted to be converted to a C-terminal amide group during processing. The conserved RF motif in peptides that possess it is shown in blue. Peptides that have been verified to exist using biochemical methods (in their mature C-terminal amidated forms) are denoted by an asterisk to the right of the peptide. Notes: Peptides in the noted precursors have been biochemically verified in the following references: a) (39, 40), b) (39), c) (39), d) (43), e) f) (48), g) h) (49), i) (41), j) (39), k) (39, 49), l) (39), m) (39), n) (39), o) (39)
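The two sequence conventions described in the caption of Figure 1-1 (a C-terminal glycine predicted to become an amide, and the conserved RF motif) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are assumptions introduced here, not part of the original analysis, and the sequences are copied from the flp-18 and flp-20 entries of Figure 1-1.

```python
def mature_form(glycine_extended: str) -> str:
    """Predicted mature peptide: the C-terminal Gly is replaced by an amide."""
    if not glycine_extended.endswith("G"):
        raise ValueError("expected a C-terminal glycine")
    return glycine_extended[:-1] + "-NH2"


def has_rf_motif(glycine_extended: str) -> bool:
    """True if the peptide ends in Arg-Phe immediately before the Gly."""
    return glycine_extended.endswith("RFG")


# Glycine-extended sequences as listed in Figure 1-1 (hypothetical selection).
peptides = {
    "flp-18": ["DFDGAMPGVLRFG", "EMPGVLRFG", "SVPGVLRFG"],
    "flp-20": ["AVFRMG", "AMMRFG", "SVFRLG"],
}

for precursor, sequences in peptides.items():
    for seq in sequences:
        tag = "RF-amide" if has_rf_motif(seq) else "no RF motif"
        print(f"{precursor}: {mature_form(seq)} ({tag})")
```

Under these assumptions, EMPGVLRFG maps to EMPGVLRF-NH2, while AVFRMG from flp-20 is flagged as lacking the RF motif.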


flp-18 MRFDDDTTCATTCADKLRTIEVLTGPTRFIQLYCVFFSYFSTTLTFFNYSLHH
afp-1  MVELAAIAVHLFAILCISVSAEIELPDKRAQFDDSFLPYYPSSAFMDSDEAIV

flp-18 LPCFSIFKIVFFVSERADQLCFFLNEKSSSQALKFLPKIESYVYSRLDMQRWS
afp-1  AVPSSKPGRYYFDQVGLDAENAMSARE__________________________

flp-18 GVLLISLCCLLRGALAYTEPIYEIVEEDIPAEDIEVTRTNEKQDGRVFS
afp-1  _________________________________________________

flp-18 KR **DFDGAMPGVLRFG KR GGVWEKRESSVQ KK EMPGVLRFG KR AYFDE KK SV
afp-1  KR GFGDEMSMPGVLRFG KR ____________ __ GMPGVLRFG KR __ENE KK AV

flp-18 PGVLRFG KR SYFDE KK *SVPGVLRFG KR DVPMD KR *EIPGVLRFG KR DYMADS
afp-1  PGVLRFG KR _____ __ GDVPGVLRFG KR _____ __ SDMPGVLRFG KR ______

flp-18 FD KR SEVPGVLRFG KR DVPGVLRFG KR SDLEEHYAGVLL KK SVPGVLRFG RK
afp-1  __ __ *SMPGVLRFG RR

Figure 1-2: The flp-18 and afp-1 precursor proteins from C. elegans and A. suum, respectively. Peptide sequences red, processing sites blue, * denotes an amino acid gap in a peptide, _ denotes a gap in a spacer region. Note that this analysis is not an alignment (18, 36). The longer peptides for each precursor are underlined.
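As a rough illustration of how peptides are delimited by the dibasic processing sites in Figure 1-2, the following Python sketch splits a stretch of the flp-18 precursor at KR, KK, RK, and RR pairs and keeps only fragments ending in the glycine-extended RF motif. It is a deliberately naive sketch under stated assumptions: the function name is hypothetical, the input is the flp-18 row of the figure with gap symbols and spaces removed, and real processing in the secretory pathway is more selective than a simple string split (for example, the spacer GGVWEKRESSVQ contains a KR pair that the figure does not mark as a processing site).

```python
import re

# Dibasic residue pairs of the kind marked as processing sites in Figure 1-2.
DIBASIC = re.compile(r"KR|KK|RK|RR")


def candidate_flps(precursor: str) -> list[str]:
    """Split at dibasic pairs; keep fragments ending in RFG, shown as amides."""
    fragments = DIBASIC.split(precursor)
    return [frag[:-1] + "-NH2" for frag in fragments if frag.endswith("RFG")]


# Stretch of the flp-18 precursor from Figure 1-2 (spaces and gap marks removed).
flp18_stretch = (
    "KRDFDGAMPGVLRFGKRGGVWEKRESSVQKKEMPGVLRFGKRAYFDEKKSVPGVLRFGKR"
)

print(candidate_flps(flp18_stretch))
# ['DFDGAMPGVLRF-NH2', 'EMPGVLRF-NH2', 'SVPGVLRF-NH2']
```

Filtering on the RFG ending is what discards the spacer fragments here; a sequence-only rule like this cannot by itself say which dibasic sites are actually cleaved.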


morphine (50), egg laying (51) and feeding behavior in nematodes (47). Also, disruption of the flp-1 gene in C. elegans resulted in a number of phenotypes (52). Two types of receptors for FLPs have been identified: GPCRs (G-Protein Coupled Receptors) (53-58) and a sodium channel gated by FMRF-NH2 (59-61). Other human/mammalian ion channel receptors have been identified whose activities are modulated by FLPs, including the Acid Sensing Ion Channels (ASICs) and Epithelial Na+ Ion Channels (ENaCs) (24, 62).

The best characterized FLP GPCR in nematodes to date is NPR-1. This receptor has been shown to be involved in feeding and foraging behavior, and its endogenous ligands have been identified as peptides encoded by the flp-18 and flp-21 genes (53). NPR-1’s characterization was greatly aided by the discovery of two natural isolates of C. elegans differing by only a single point mutation in the third intracellular loop of NPR-1 (Figure 1-3), which have drastically different feeding behaviors (63). In one isolate, this position is a valine and the worms spread out to feed on a bacteria-coated agar plate. In the other isolate, the same position in NPR-1 is instead a phenylalanine and the worms cluster to feed in areas of high bacteria density on the plate. This is a fascinating story of how minor changes in individual proteins can give rise to drastic phenotypic changes, and it illustrates the importance of at least one function of a particular set of FLPs.

FLPs as Natural Products

The term “Natural Products Chemistry” often brings to mind the plethora of small non-protein/non-nucleic acid metabolites isolated from nature, usually with some notable biological function or use to people. However, many proteinaceous and polypeptide substances definitely meet the requirements of natural products and have proven to be extremely valuable to mankind.
In particular, two classic examples are insulin (used for decades from cow, pig, and a cloned human form called “Humulin” from Eli Lilly in 1982) and venom toxin peptides from the sea snail genus Conus (conotoxins, or cone snail toxins) (64, 65). The first conotoxin analogue (Ziconotide, Prialt, Elan Pharmaceuticals, FDA NDA number 21-060 (66-68)) was recently approved by the United States Food and Drug Administration (FDA) for use against severe chronic pain. One of the advantages of these molecules is that they function as non-opioid pain killers, so they are useful non-addictive alternatives to opioid pain killers such as morphine (69). FLPs, as naturally occurring bioactive neuropeptides, should certainly be considered in the realm of natural products. A direct analogy is that one FLP, “conorfamide”, has even been isolated from the venom of cone snails (70). Given the natural origins of FLPs, designing therapeutic agents based on them is certainly an endeavor rooted in natural products chemistry. Identification of compounds in natural sources is helpful in elucidating their biological function and potential additional uses.

Natural Products

Recently, in the age of sequenced genomes, proteomics, and high throughput drug screening, natural products have taken a back seat, particularly in the drug industry, to synthetic combinatorial libraries (71-75). In the past, this was logical due to the limited availability of natural products and the quantities needed for full chemical characterization (76). However, a plethora of recent technologies in nuclear magnetic resonance (NMR) (77), chromatography (78), mass spectrometry (79), and small scale bioassay screening have helped to fuel a revisiting of natural products as sources of future


Figure 1-3: Schematic diagram of the neuropeptide activated GPCR NPR-1 from C. elegans. Each circle with a letter in it represents an amino acid along the polypeptide chain. The red circles represent the natural single amino acid polymorphism that gives rise to drastically different feeding behaviors. Yellow circles represent conserved amino acids in the alignment that appears in the same article as this figure. This figure adapted from de Bono, et al, 1998 (63). NPR-1 phenotypes: V = Solitary Feeders; F = Social Feeders.


molecules of choice for use in a variety of applications. The amount of material required for chemical characterization has decreased drastically in recent decades. As late as the 1960s, many milligrams were required to characterize natural products (76, 80). Today, with much improved analytical techniques at our disposal, natural products chemistry is certainly poised to make substantial contributions to our knowledge of, and ability to benefit from, the chemical complexity of the biological world. To illustrate the advances that have been made since, the amount of material needed to acquire datasets for various analytical techniques for a molecule with a mass of about 300 daltons is given in Table 1-2, from a paper by one of the giants in natural products chemistry, Dr. Jerrold Meinwald (76). Even though the quantities for the various techniques listed in Table 1-2 are small, their source was published in 2003. Since that year, improvements over these values have been made in several of the categories, particularly for NMR (77, 81).

Purification and characterization aside, natural products represent a more logical and still largely untapped reservoir within chemical space (71, 72, 76, 82). Part of what makes natural products so attractive is largely the work of millions of years of evolution. Issues such as solubility and refined chemical structure (chirality, molecular scaffolds, etc.) have largely been solved by nature (72, 83). Some of these are illustrated in Table 1-3 (72). LogP is the octanol-water partitioning coefficient and represents the solubility/hydrophobicity of organic molecules. The higher the number, the greater the fraction of that molecule that will partition into octanol and, thus, the greater its hydrophobicity. Conversely, the lower the number, the more water soluble the compound will be.
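To make the meaning of LogP concrete, the following Python sketch converts a LogP value into the fraction of a compound expected in the octanol phase. It rests on assumptions that are not stated in the text: the standard definition LogP = log10([compound in octanol] / [compound in water]), equal volumes of the two phases, and a single neutral species. The function name is hypothetical, and the example values are the LogP means quoted in Table 1-3.

```python
def octanol_fraction(log_p: float) -> float:
    """Fraction of compound in octanol, assuming equal phase volumes."""
    partition_ratio = 10.0 ** log_p  # [octanol] / [water]
    return partition_ratio / (1.0 + partition_ratio)


# Mean LogP values quoted in Table 1-3 (lower bound of each range where given).
for label, log_p in [("natural products", 2.4), ("drugs", 2.1), ("synthetics", 4.3)]:
    print(f"{label}: LogP {log_p} -> {octanol_fraction(log_p):.5f} in octanol")
```

A LogP of 0 corresponds to an even split between the phases, and each unit increase shifts the octanol-to-water ratio by a factor of ten.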


Also, the fundamental issue of biological activity and relevance is certainly met by natural product compounds (72). This is where a good knowledge of biology helps natural products chemistry substantially. In fact, the field involved with understanding the functional role of natural products in the ecosystem, Chemical Ecology, has substantial potential in aiding natural products chemistry in its search for biologically relevant molecules and their potential functions/uses (71, 73, 74). Though some emphasis has been taken off of natural products chemistry in recent years, the case is certainly clear that the field is poised for a formidable resurgence.

Dissertation Outline

The main goal of this dissertation is to inspire future researchers to make use of the techniques and observations described within.

In Chapter 2, analysis is provided of the chemical biodiversity observed in FLPs from C. elegans and their precursor proteins. I believe that a wise molecular evolutionist in the future will note the observations I have made in this fascinatingly complex and diverse protein family. Paired with future functional data for nematode FLPs as they come online, they will be able to complete the reconstruction of the protein family's evolutionary history. This will inevitably provide the fields of molecular evolution, nematology, and neuroscience with a greater understanding of the nervous system and behavior of nematodes and possibly other animals.

With Chapter 3, my hope is that future structural biologists will take note of the value of pH titration NMR experiments in observing otherwise unobservable transient electrostatic and hydrogen bonding interactions in polypeptides and other organic


Table 1-2: Sample quantities required for analysis using the three most powerful analytical techniques for chemical structure determination. A plus (+) in the column below each technique means that the corresponding sample quantities (left) are applicable to that technique. A minus (-) in a similar position means that that sample quantity (left) is not sufficient for use with the corresponding technique (76). A red dot indicates improvement in NMR sensitivity since 2003 (77).

Sample Size (g)   Number of Molecules   X-ray Crystallography   NMR Spectroscopy   Mass Spectrometry
~300 x 10^0       6.23 x 10^23          +                       +                  +
50 x 10^0         10^23                 +                       +                  +
50 x 10^-3        10^20                 +                       +                  +
50 x 10^-6        10^17                 +                       +                  +
50 x 10^-9        10^14                                                            +
50 x 10^-12       10^11                                                            +
50 x 10^-15       10^8
50 x 10^-18       10^5
50 x 10^-21       10^2
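The "Number of Molecules" column of Table 1-2 follows from simple mole arithmetic, which can be checked with the short Python sketch below. The assumptions are a molar mass of 300 g/mol (the approximate mass stated in the text) and the standard value of the Avogadro constant; the function name is introduced here for illustration only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (standard defined value)


def molecules_in_sample(mass_g: float, molar_mass: float = 300.0) -> float:
    """Number of molecules in mass_g grams of a compound of given molar mass."""
    return mass_g / molar_mass * AVOGADRO


# Sample sizes from Table 1-2: 50 g down to 50 x 10^-21 g in steps of 1000.
for exponent in (0, -3, -6, -9, -12, -15, -18, -21):
    mass = 50.0 * 10.0 ** exponent
    print(f"50 x 10^{exponent} g -> {molecules_in_sample(mass):.1e} molecules")
```

With these assumptions, 50 g corresponds to roughly 10^23 molecules and each thousandfold reduction in mass removes three orders of magnitude, matching the order-of-magnitude entries in the table.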


Table 1-3: Mean values for a selection of molecular properties among natural, drug, and synthetic compounds (72). The terms natural products, drugs, or synthetics describe the type of chemical library that was used to generate the data.

Property                        Natural Products   Drugs     Synthetics
Molecular Weight                300-414            340-356   393
LogP                            2.4-2.9            2.1-2.2   4.3
Number of Chiral Centers        3.2-6.2            1.2-2.3   0.1-0.4
Number of N atoms               0.84               1.64      2.69
Number of O atoms               5.9                4.03      2.77
% of rings that are aromatic    31%                55%       80%


molecules. In fact, these types of interactions are likely to provide the fields of structural biology and protein folding with the next level of detail needed to fully understand how macromolecules take their native form.

Chapter 4 is a recent, yet exciting and fascinating, “last-minute” addition to my graduate research and is also a preview of my future goals as a scientist. My interest in insects has always been a strong passion in my life. Earlier this year (2006), on a whim, I decided to take advantage of my recent access to the 1 mm high-temperature superconducting microsample NMR probe, which Dr. Edison had helped to develop, and explore a curiosity I had about some of the insects (Anisomorpha buprestoides) I had been breeding. This led to the work in Chapter 4. With the ability to examine single milkings of individual stick insects, we have been able to illustrate an intriguing level of chemical biodiversity of a single defensive compound produced by these creatures. Additionally, we were able to identify a component of this secretion not previously known and are beginning to hypothesize about its biological function. I hope that readers of this chapter will come away with an appreciation for insects and the field of insect natural products chemistry and will be inspired to explore the field further.


CHAPTER 2
BIOCHEMICAL PROPERTIES OF FLPS AND THEIR PRECURSOR PROTEINS FROM THE NEMATODE Caenorhabditis elegans

Introduction

The focus of this study was to understand the patterns of sequence conservation and variability observed in FMRFamide-like neuropeptides (FLPs) and their precursor protein sequences from the nematode C. elegans. This work is largely observational in nature, but is motivated by the hypothesis that important information about the function and evolution of the nematode nervous system can be elucidated by understanding the origin of the sequence properties observed in these polypeptides. Illustrated here are various biochemical and sequence properties of FLPs and their precursor proteins that I believe to be important in understanding this family of neuropeptide genes in C. elegans, together with speculation on their relevance to FLP molecular evolution.

To my knowledge, no one has published such a study on this gene family. The only discussion of the non-peptide “spacer” regions of flp precursors I am aware of in the literature was by Greenberg and Price (20). Only non-nematode sequences were analyzed, and it was postulated that the spacer regions may serve to regulate the pH in the secretory vesicles where FLPs are ultimately processed (37). In this chapter, based on comparisons of complete flp precursor proteins from nematodes, I postulate that these spacer regions may interact with peptide and processing site regions in the unprocessed protein to stabilize their 3-dimensional structure. This is further examined in the Discussion section of this Chapter. Also, I have found no structural data beyond primary sequence for the flp precursors and no record of any having been recombinantly expressed or purified in their unprocessed form. The only work I am aware of on structural properties of unprocessed peptides illustrated that their processing sites are flexible in solution (84).

To investigate the molecular evolution of these gene products I extended some analyses to flp precursors in all nematode species that could be identified from available sequence databases (in collaboration with Dr. Slim Sassi from the laboratory of Prof. Steven A. Benner). From this effort, I was able to collect 334 unique flp precursor sequences from 38 different nematode species. With these, we attempted to reconstruct ancestral sequences for each flp subfamily (30 in total) to aid in the reconstruction of the evolutionary history of all 28 flp precursor proteins so far identified from C. elegans. This effort was unsuccessful and will be discussed in the Discussion section of this Chapter. Additionally, methods that may prove to be more useful in addressing this problem are discussed in the “Future Directions” section of Chapter 5 of this Dissertation.

Experimental Methods

Data Mining for Nematode flp Precursor Protein Sequences

The flp precursor protein sequences used in this study were obtained from the NCBI Entrez Protein databases (which can be accessed at: http://www.ncbi.nlm.nih.gov/BLAST/) using accession numbers from a previous study that identified hundreds of FLP peptides encoded by nematode ESTs (Expressed Sequence Tags) (21), other similar studies (18), and from BLAST searches using those sequences and the theoretical mature peptides which they contained. For protein BLASTs, each whole C. elegans flp precursor was used in “protein-protein” searches (blastp), varying parameters around default settings (for example: expectation of 10, word size of 3, and a BLOSUM62 matrix, useful for weaker alignments). Also, each of the theoretical peptides was used individually in “Short, nearly exact match” searches, varying parameters around default settings (for example: expectation of 20,000, word size of 2, and a PAM30 matrix, useful for shorter sequences). For ESTs, translated EST searches (tblastn) were performed with both individual FLP peptides and whole precursors using parameters previously described (21): searching only Phylum Nematoda, searching the “est_others” NCBI database, expectation of 10,000, and a BLOSUM62 matrix. Accession numbers for sequences used from the databases are shown in Appendix A. EST hits were translated into protein using Expasy’s TRANSLATE tool (http://ca.expasy.org/tools/dna.html).

In this chapter, a few terms and symbols that require definition are used. The term orthologue will refer to a protein or gene that corresponds to the same protein or gene in another species (inferred by alignment). For example, all flp-1 proteins from various nematode species are orthologues of one another. Paralogue will be used to denote a gene or protein homologous to another within the same species. For example, flp-1 is a paralogue of flp-18 in C. elegans. This nomenclature is well established in the field of molecular evolution. For different types of sequences, the following symbolism is used: FMRFamide-like neuropeptide(s) = FLPs, precursor protein(s) = flp(s), and DNA sequence(s) coding for flp precursor(s) = flp(s) (italicized).

True flp orthologues were recognized by regions of recognizable homology to the theoretical processed peptide in the query sequence and by being flanked by canonical mono- or dibasic processing sites. The longest reasonable EST available for a particular precursor was used to represent that precursor from that particular species. Sequences that were clearly too long, were made of multiple concatenated ESTs, or had sequence errors in key places (early stop codons, etc.) were ignored. Some sequences were clearly unique but seemed to be orthologues of other flp precursors from the same species. No previous record of more than one copy of a flp orthologue is known, but alternate transcripts of some have been previously identified. Thus, these sequences are assumed to be alternate transcripts. Translated ESTs were truncated where there was a sequence error or stop codon.

The nomenclature used for flp precursors is as follows: flp_#x_yz(s), where # is the number given to a specific flp precursor orthologue group (for example, flp-1) designated in the literature, x is a letter indicating a flp protein resulting from a different alternate transcript when more than one was found in the databases, y is the first letter of the genus of the species from which the sequence came, z is the first letter of the species name, and in some cases a third letter (denoted here as “s”) is used when more than one species has the same genus-species initials. For example, flp_1a_ce denotes alternate transcript “a” of flp-1 from C. elegans. As an additional example, flp_1_aca denotes flp-1 from Ancylostoma caninum. In this case, the third letter in the genus/species initials distinguishes that protein from flp_1_ace, which is flp-1 from Ancylostoma ceylanicum. This nomenclature was used so that in text editor programs the names would be grouped by orthologue (or subfamily, i.e., all flp-23’s) rather than by species. Also, the term subfamily will refer only to predicted mature processed peptides that occur on the same precursor. For example, all of the peptides produced on the flp-18 precursor will be called the FLP-18 subfamily.

Alignment and Phylogenetic Analysis of flp Precursor Proteins

To attempt to build a molecular phylogeny and reconstruct the evolutionary history of the 28 flp precursors known in C. elegans, we employed some novel techniques developed by Dr. Slim Sassi and myself. The general method involved generating ancestral sequences for all 28 precursors and using them for phylogenetic reconstruction. First, all orthologues of the 28 C. elegans flp precursors from various nematode species were compiled. Then, they were grouped in orthologous groups and aligned using ClustalW (85) and manual alignment by eye using the program Bioedit (86, 87) (Appendix B). Preference was given to aligning predicted peptide regions. Where this was ambiguous (with the repetitive nature of these peptide sequences), processing sites, regions immediately flanking those, and spacer regions between predicted peptides were used. A complete phylogenetic reconstruction will not be shown in this Chapter due to reasons described in the Discussion section.

Analysis of Biochemical Properties of flp Precursor Proteins and Figure Generation

In order to analyze various sequence motifs in flp precursor proteins, several graphs were constructed and consolidated into Figure 2-1. For illustrating patterns of sequence repetition, two-dimensional dotplots were employed. The dotplots in Figure 2-1 were made by comparing each flp precursor from C. elegans to itself using the program Bioedit (86, 87). The program produces a dot when amino acids are the same in two sequences being compared. Thus, when both sequences are the same a diagonal of dots is generated; off-diagonal dots represent sequence repetition. The upper threshold limit was set at 10 and the lower set at 5. Using a threshold between 3 and 5 had little effect on the resulting plots. This plot was inserted into Microsoft Powerpoint (Powerpoint). In this slide, the other components of each panel of Figure 2-1 corresponding to a particular flp precursor were added. The color coded precursor annotation graphic was produced in Powerpoint. The resulting object’s width was scaled to align the predicted repeated peptide units and processing sites with their off-diagonal regions of the dotplot.
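The idea behind the self-comparison dotplots can be illustrated with a short sketch. This is not Bioedit's implementation, and its threshold settings are not reproduced here; the sketch simply assumes that a dot is drawn wherever a window of five consecutive residues is identical at two positions. The function names and the toy sequence are hypothetical.

```python
def self_dotplot(seq, window=5):
    """Return the set of (i, j) start positions where the window-length
    substrings of seq beginning at i and at j are identical."""
    n = len(seq) - window + 1
    dots = set()
    for i in range(n):
        for j in range(n):
            if seq[i:i + window] == seq[j:j + window]:
                dots.add((i, j))
    return dots


def off_diagonal(dots):
    """Dots away from the main diagonal, i.e. internal sequence repeats."""
    return sorted((i, j) for i, j in dots if i != j)


# Hypothetical toy sequence with a repeated "PNFLRFGKR" unit (not a real precursor).
toy = "MKTAAPNFLRFGKRSDPNFLRFGKR"
repeats = off_diagonal(self_dotplot(toy))

if __name__ == "__main__":
    print(repeats)
```

Comparing a sequence to itself always fills the main diagonal, so only the off-diagonal dots carry information about repetition. Tandem repeats show up as short diagonal runs parallel to the main diagonal.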


For signal sequence prediction, the online tool SignalP (88-90) was used to analyze the first 100 amino acids of each flp precursor sequence. Results from the hidden Markov model (HMM) cleavage site prediction were used to determine the C-terminal end of the signal sequence (90). To determine intron/exon boundary positions in the precursor proteins, BLAT searches were run using the UCSC C. elegans genome browser (http://genome.brc.mcw.edu/cgi-bin/hgBlat?hgsid=148945), asking for exons to be in upper case and introns to be in lower case in the query results. Exons were then translated using the Expasy Translate tool (http://www.expasy.org/tools/dna.html) and compared to the published protein sequences to determine the intron/exon positions. The theoretical pI for each precursor was calculated using the PeptideMass tool, which is available for use on the web at: http://www.expasy.org/tools/peptide-mass.html (91, 92).

The charge plot for each panel of Figure 2-1 was created using Microsoft Excel with comma delimited protein sequences. Charged amino acids were counted using the “COUNTIF” command in Excel, scoring arginine and lysine residues as +1 and aspartate and glutamate residues as -1. All other amino acids were given a charge of 0. The unstructured protein propensity plots were made using the online IUPRED tool with the “short disorder” prediction method (93, 94). The raw data from this tool were copied and pasted into Excel and plotted as a line plot. This plot was edited in CorelDRAW and Powerpoint. Care was taken to maintain the same scale among graphs from all flp precursors in the final figure.

Analysis of FLP Mature Peptide Biochemical Properties

Peptides were predicted from precursors for all nematode species by 1) their homology to previously published FLP sequences, 2) being flanked by mono- or dibasic processing sites, 3) possessing a C-terminal glycine residue, and 4) being in a region of the protein near where other FLPs were observed (i.e., not within the signal sequence or far away from the other peptides in the primary sequence). These sequences were excised from the precursor proteins and subjected to further analysis. In Figures 2-5 and 2-6, all FLPs from C. elegans are tabulated from one transcript of each precursor gene. Where more than one alternate transcript is known, the longest precursor is used.

Barplots of peptide charge and length for C. elegans FLPs were made in Excel with tab delimited peptide sequences. The charged amino acids and the N-termini of the peptides were counted using the “COUNTIF” command as described above for the precursors. These were summed for each peptide to give their total theoretical charge at pH 7. For the peptide length plots, the peptides were aligned in the spreadsheet anchored at their C-termini (with no gaps). The “COUNTIF” function was used for each column of letters (representing a position in the peptide “alignment”) to count all peptides which had a letter (COUNTIF for each of the 20 possible amino acid single letter codes) in that column. This resulted in a row of numbers, each representing the number of peptides at least that long. Then, in another row, each value in the previous row was subtracted from the value to the right of itself. This gave the true number of peptides that were exactly that length, i.e., with their N-terminus at that position, counting from the C-terminus. These figures aided in the observations, analyses, and conclusions presented below.

Results

Analysis of Biochemical Properties of flp Precursor Proteins

The patterns of amino acid sequences observed in the FMRFamide-like neuropeptide family and their precursor proteins as a whole have intrigued our research group for some time (26, 36). To analyze various properties of these sequences that


Figure 2-1: Graphical illustrations of various chemical properties of FMRFamide-like neuropeptide precursor proteins in C. elegans. 2D Dotplots: Each dotplot provides a sequence comparison of the repetitive elements of a flp precursor protein. When both dimensions of a dotplot are the same sequence, a diagonal of dots results. With the threshold set to 5, off-diagonal dots represent places with 5 adjacent amino acids that are repeated more than once in the protein. Graphical Annotation of flp protein sequences: Legend: Purple Rectangles = predicted neuropeptide sequences, Red Rectangles = predicted processing sites, Green Rectangles = glycine residue predicted to be posttranslationally converted to a C-terminal amide group in the mature peptide, Grey Rectangles = predicted signal peptide sequences (88-90), and Black Rectangles = regions of unknown function or “spacer regions”. Red dots contained within purple rectangles indicate a peptide that has been biochemically characterized in the literature (see also Figure 1-1 in Chapter 1 for corresponding references). Asterisks above a processing site indicate that the sequence is KR (the most common processing site observed in flp precursors). An arrow below the graph corresponds to an intron/exon boundary (i.e., protein region corresponding to an RNA splicing site). Charge Plots of flp protein sequences: In the barplot below the graphical protein annotation, blue bars pointing up correspond to positively charged amino acids at pH 7.0 (arginine and lysine). Red bars pointing down correspond to negatively charged amino acids at pH 7.0 (aspartic acid and glutamic acid). Unstructured propensity plots of flp protein sequences: These plots are shown below the charge plots. The scale of these plots goes from 0 to 1. The horizontal axis shown is placed at 0.5. Values above the line denote regions with a propensity to be unstructured and values below the line denote regions predicted to be structured (93, 94).
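The charge plots described in this legend were made in Excel. As a rough equivalent, the per-residue scoring can be sketched as follows, using the rule stated in the Experimental Methods (arginine and lysine as +1, aspartate and glutamate as -1, everything else 0). The function names and the example fragment are hypothetical and do not come from a real precursor.

```python
# Per-residue charge at pH 7 as scored in the text: K/R = +1, D/E = -1, others 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_profile(seq):
    """Return a list with one charge value (+1, 0, or -1) per residue."""
    return [CHARGE.get(aa, 0) for aa in seq.upper()]


def net_charge(seq):
    """Sum of the per-residue charges (termini are not counted here)."""
    return sum(charge_profile(seq))


# Hypothetical fragment: an acidic stretch followed by peptide and processing sites.
fragment = "DEAEKRSPMQRSSMVRFGKR"

if __name__ == "__main__":
    print(charge_profile(fragment))
    print(net_charge(fragment))
```

Plotting the profile as upward and downward bars along the sequence gives the kind of charge plot shown in each panel of the figure.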


would aid in alignments and phylogenetic analyses, as well as suggest functional properties of these polypeptides, plots illustrating functional motif arrangement (hydrophobic signal sequences, predicted peptides, and predicted processing sites), intron/exon boundaries, charged amino acid distribution, and regions of predicted unstructured propensity were compiled. These plots are shown in Figure 2-1.

Some features are common to nearly all of the flp precursors in C. elegans. First, the overall arrangement in nearly all of the precursors, from N- to C-terminus, is as follows: 1) N-terminal hydrophobic signal sequence, 2) spacer region of unknown function, and 3) a C-terminal region rich in predicted FLP peptides. This paradigm holds true for the flp 1-3, 7-9, 12-22, and 24-27 precursors (Table 2-1, column A). The exceptions are as follows: For flp-4, 5, and 20, the most N-terminal amino acids do not fall within the region predicted by SignalP to be the hydrophobic signal sequence. This could be due to several potential causes, such as an artifact of the SignalP prediction method, an incorrectly predicted start site for the transcripts in the database, or a signal sequence that is truly not N-terminal. For flp-6, 10, 11, 23, and 28, there is no substantially long spacer region between the first predicted FLP peptide and the N-terminal signal sequence.

Also, many of the flp precursors end in a FLP peptide (with or without a C-terminal basic processing site). The exceptions to this are: flp-2, 7, 10, 11, 15, 16, 22, 23, 24, 27, and 28. Again, this could be an artifact of the stop codon chosen for the protein sequence in the database. One striking feature of some precursors that do end in a processing site is the sequence of that site. KR is certainly the most common processing site within these precursors, and this is true among many families of neuropeptides ((95), Appendix


Table 2-1: Sequence patterns and properties common to various groups of flp precursor proteins in C. elegans. An X or other annotation means that the precursor for that row has the property in that column; a dash means that it does not. A) Precursors having the motif: N-terminal signal sequence, large spacer region, C-terminal neuropeptide rich region, B) the precursor protein ends in RK or K, C) N-terminal most predicted peptide farther from the others in the sequence than they are to one another, D) non-peptide spacer regions between nearly all pairs of FLPs, E) several peptides occurring in tandem, F) large spacer between the signal sequence and first FLP peptide that is rich in acidic residues, G) alternating spacer regions between peptides that are rich in acidic residues, H) protein isoelectric point (pI) less than 8.0, I) much of the protein predicted to be folded by the IUPRED tool, J) intron/exon boundaries that occur in conserved regions of FLP peptides. See also Figure 2-1 for graphical illustrations of these sequence patterns for individual precursor proteins.

Precursor  A   B   C   D   E   F   G   H   I   J
flp-1      X   K   X   -   X   X   -   -   X   X
flp-2      X   -   -   -   -   X   -   -   -   X
flp-3      X   K   X   -   X   X   -   -   -   X
flp-4      -   K   -   -   -   X   -   -   X   -
flp-5      -   -   -   -   -   -   X   -   X   -
flp-6      -   -   -   X   -   -   X   -   -   X
flp-7      X   -   -   -   X   X   -   -   -   X
flp-8      X   -   -   X   -   -   X   -   X   -
flp-9      X   RK  -   -   -   -   X   -   X   -
flp-10     -   -   -   -   -   -   X   -   X   -
flp-11     -   -   -   -   X   X   -   -   X   X
flp-12     X   RK  -   -   -   X   -   -   X   -
flp-13     X   RK  -   -   X   X   -   -   X   X
flp-14     X   RK  -   -   X   X   -   X   X   X
flp-15     X   -   -   -   -   X   -   -   X   -
flp-16     X   -   -   -   X   X   -   -   X   -
flp-17     X   K   -   X   -   -   X   -   X   -
flp-18     X   RK  X   X   -   -   X   -   X   X
flp-19     X   -   -   -   -   X   -   X   X   -
flp-20     X   -   -   -   X   X   -   X   X   -
flp-21     X   -   -   -   -   X   -   X   X   -
flp-22     X   -   -   -   X   X   -   -   -   X
flp-23     -   -   -   -   -   -   -   -   X   X
flp-24     X   -   -   -   -   X   -   -   X   -
flp-25     X   K   -   -   -   -   X   -   X   -
flp-26     X   RK  -   -   -   -   -   -   X   -
flp-27     X   -   -   -   -   -   -   -   X   X
flp-28     -   -   -   -   -   X   -   X   X   X


B). However, many of the flp precursors end in RK (flp-9, 12-14, 18, and 25) or K (flp-1, 3, 4, 17, and 26) (Table 2-1, column B). This is another case where prediction of the true protein sequence deposited in the database may be the reason why even more flp precursors were not observed to end in such seemingly conserved C-terminal sequence motifs. Details of specific findings from Figure 2-1 are presented in the following subsections.

Sequence Repetition Patterns

One of the better known properties of the flp precursor protein sequences (across most phyla) is their repetitive nature. Many (possibly most) flp precursor proteins contain several copies of C-terminally related neuropeptides that make up a subfamily. Though most of the repeated sequence patterns are due to the conserved predicted peptides, some of the peptides predicted by this study to be processed and secreted are non-canonical and thus are not represented as repeated units in the dotplots of Figure 2-1. Examples of this occur in the flp-1, 11, and 23 precursors. Thus, I studied the flp precursor proteins from C. elegans for commonality in their patterns of sequence repetition as illustrated by the dotplots.

For some precursors, the N-terminal most FLP peptide region is relatively distantly spaced from the others in the protein primary sequence. This seems to be the case for the flp-1, 3, and 18 precursors (Table 2-1, column C). Our research group has previously postulated that the N-terminal most peptides on flp-18 and related precursor proteins in other nematodes are unique based on their position in the precursor and length (96). These common sequence patterns suggest possible evolutionary homology.
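The C-terminal endings tallied in column B of Table 2-1 could be checked programmatically along the following lines. This is only a sketch: the function name and the example sequences are hypothetical, and a trailing "*" is assumed to mark a translated stop codon.

```python
def c_terminal_ending(precursor):
    """Classify the C-terminal residues of a precursor sequence as a
    basic motif (KR, RK, K, or R) or 'none'."""
    seq = precursor.upper().rstrip("*")  # drop a stop-codon marker if present
    for motif in ("KR", "RK", "K", "R"):  # two-residue motifs are checked first
        if seq.endswith(motif):
            return motif
    return "none"


# Hypothetical C-terminal fragments, not taken from the databases.
examples = {
    "ends_in_RK": "SPMERSAMVRFGRK",
    "ends_in_K": "AMMRFGK",
    "ends_in_peptide": "KSAYMRFG",
}

if __name__ == "__main__":
    for name, seq in examples.items():
        print(name, c_terminal_ending(seq))
```

Checking the two-residue motifs before the single residues keeps a sequence ending in RK from being reported as ending in K.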


Another pattern of sequence repetition present in several flp precursors is the presence of spacer regions between most pairs of FLP peptides. In such cases, no more than two peptides occur in tandem. Precursors showing this pattern of repetition include flp-6, 8, 17, and 18 (Table 2-1, column D).

For other flp precursors, multiple (3 or more) repeated regions corresponding to predicted neuropeptides occur in tandem, separated only by processing sites. This pattern results in what resembles a solid block of off-diagonal spots in the dotplots for those precursors (Figure 2-1). This is the most common pattern of sequence repetition observed in C. elegans flp precursors, and is reminiscent of the earliest sequenced FMRFamide precursors (20). C. elegans precursors with this pattern of repetition include flp-1, 3, 7, 11, 13, 14, 16, 20, and 22 (Table 2-1, column E).

Charge Distribution

One of the more suggestive patterns in nematode flp precursor proteins observed by this study is the distribution of charged amino acids. Peptide and processing site regions, as expected, are together rich in positively charged (basic) amino acids. However, the spacer regions tend to be rich in negatively charged (acidic) amino acids. These regions are of unknown function and occur outside of the peptide, processing site, or hydrophobic signal sequence regions. It has been previously reported that spacer regions in flp precursors of other phyla also tend to be rich in acidic residues (20). However, no published work to date has examined these sequences in as much detail as this Chapter.

For some precursors there is a distinct spacer region (immediately after the hydrophobic signal sequence) rich in negatively charged amino acids followed by a region of only FLP neuropeptides and processing sites (being rich in positively charged amino acids). This is the case for the flp 1-4, 7, 11-16, 19-22, 24, and 28 precursors (Table 2-1, column F). For other cases, there is a pattern of spacer regions alternating with peptide/processing site regions. This is the case for the flp-5, 6, 8, 9, 10, 17, 18, and 25 precursors (Table 2-1, column G). In the case of flp-23, the predicted hydrophobic signal sequence is immediately followed by a processing site and a FLP neuropeptide, which is unique among the C. elegans precursors analyzed here.

Having made these observations, I became interested in determining the reason behind the apparently biased charge distribution in these proteins, particularly in the spacer regions. To determine the extent to which the charges are balanced in these precursors, I calculated the theoretical pI for each protein. The pI values of the flp precursors, with the exception of flps 14, 19-21, and 28 (Table 2-1, column H), are all greater than 8.0, indicating a net positive charge. Thus, positively charged residues prevail overall. Notably, flps 21 and 28 only contain one FLP peptide each, resulting in less evolutionary pressure to possess numerous positive charges in FLP neuropeptides and processing sites. The implications of charge compensation in flp precursors will be covered in the Discussion section of this chapter.

Unstructured Propensity

Members of our laboratory have long thought that 3-dimensional structural motifs could be beneficial or even required for functional interactions of these proteins with their processing enzymes ((84), and personal correspondence with Dr. Arthur Edison and Dr. Cherian Zachariah). The striking patterns of charge compensation observed in C. elegans flp precursor proteins led me to further probe the precursor sequences for potential structural properties. To do this, I used the IUPRED tool (93, 94) to plot the unstructured propensities for each of the 28 precursor protein sequences. This method is based on the energies of potential contacts between pairs of amino acids in a protein. The Y axis scale of the plots shown in Figure 2-1 goes from a minimum of 0 (complete order) to a maximum of 1 (complete disorder), and the X axis is simply the protein sequence. The horizontal line is for a Y value of 0.5. The greater the propensity for a region of a protein to be unstructured, the greater the Y value of that region on the plot. The interpretation of the data given by the authors of the IUPRED method is that all protein regions with values greater than 0.5 (above the line in Figure 2-1) are predicted to be disordered and those below 0.5 ordered (93, 94).

For all flp precursors analyzed, the hydrophobic signal sequence is predicted by IUPRED to have a maximal propensity to be ordered. This is likely due to the highly hydrophobic nature of these sequences, and unrelated to charge compensation. Also, the signal sequences are predicted to be cleaved from the rest of the protein and would not contribute to its structure after that event.

For most of the flp precursors, much of the plot for the sequence lies below the 0.5 cutoff value. This is true for the flp-1, 4, 5, 8-21, and 23-28 precursors (Table 2-1, column I). Those that show a predominantly unstructured propensity (outside the hydrophobic signal sequence) include flp-2, 3, 6, 7, and 22. There appears to be no correlation between which precursors show unstructured propensity and other chemical properties observed (such as charge distribution).

To compare the IUPRED results for C. elegans flp precursors with those of proteins which have known amounts of structure, I chose several proteins and show their analyses in Figure 2-2.
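The 0.5 cutoff used to read the IUPRED plots can be applied to the raw per-residue scores directly. The sketch below assumes the scores have already been obtained from the IUPRED tool as a list of numbers between 0 and 1; the function names and the example scores are hypothetical.

```python
def disordered_regions(scores, cutoff=0.5):
    """Return (start, end) index pairs, end exclusive, for each run of
    consecutive residues whose score is above the cutoff."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score > cutoff and start is None:
            start = i  # a disordered run begins here
        elif score <= cutoff and start is not None:
            regions.append((start, i))  # the run ended at the previous residue
            start = None
    if start is not None:
        regions.append((start, len(scores)))
    return regions


def fraction_disordered(scores, cutoff=0.5):
    """Fraction of residues scoring above the cutoff."""
    return sum(score > cutoff for score in scores) / len(scores)


# Hypothetical per-residue scores, not real IUPRED output.
scores = [0.1, 0.2, 0.6, 0.7, 0.4, 0.55, 0.9]

if __name__ == "__main__":
    print(disordered_regions(scores))
    print(fraction_disordered(scores))
```

A precursor for which "much of the plot lies below the cutoff" would correspond to a small value of `fraction_disordered`, although the text does not define a numerical criterion for that judgment.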


Figure 2-2: Examples of known natively structured and unstructured proteins analyzed by IUPRED (93, 94). The X axis is the sequence of the protein and the Y axis is the IUPRED score for unstructured propensity. Blue arrows indicate the maximum score for unstructured propensity and red arrows indicate the minimum score. All portions of the graph below the X axis are predicted to be structured and regions above to be unstructured. References for these proteins: (97-108)


A brief summary of the proteins used for Figure 2-2 is as follows:

Structured proteins: Yeast Proteinase A (YprA) is a globular aspartic proteinase from the yeast Saccharomyces cerevisiae whose crystal structure has been solved (100, 101). Ubiquitin is a folded protein that, when covalently bound to misfolded proteins and others targeted for degradation, acts as a signal for the protein to be shuttled to the proteasome in cells and subsequently degraded (97-99). Lysozyme is a folded protein component of the innate immune system in animals that is involved in destroying bacterial pathogens (104-106).

Unstructured proteins: Inhibitor of Yeast Proteinase A from fraction 3 (IA3) is an endogenous inhibitor of YprA in S. cerevisiae that has been shown to be natively unstructured in solution (107). Myristoylated Alanine Rich C-Kinase Substrate (MARCKS) is a protein involved in a wide variety of possible functions and has been shown previously to be unstructured in solution (108). Alpha-synuclein is a highly conserved presynaptic protein that has been implicated in Parkinson’s disease and other types of dementia. It has been shown to be natively unstructured in solution with a strong propensity to form helices when bound to phospholipid membranes (102, 103, 109).

It is not the intent of the work in this Chapter to exhaustively verify the IUPRED methodology. However, to interpret the IUPRED results for the flp precursors, a few points about the non-flp proteins analyzed are worth noting. First, the N-terminus of IA3 forms an alpha-helix when it binds to its native binding partner, Yeast Proteinase A (101, 107). This N-terminus is also thought to have a native propensity toward a low but detectable population of alpha helix (based on work submitted by Omjoy Ganesh and his collaborators). This could be a reason for the apparent “dip” in the N-terminal half of the plot for this protein in Figure 2-2.

Second, the slight dip in the middle of the plot for MARCKS corresponds to one of that protein’s important regulatory domains. This domain, called the phosphorylated site domain (PSD), forms a more compact structure when phosphorylated (110).

Finally, alpha-synuclein, as stated previously, has been shown to become alpha-helical upon binding to lipids. This protein also is highly repeated (like many of the flp precursors in nematodes and other phyla), with 11 imperfect repeated units conserved among vertebrates (103, 109). These points about the proteins analyzed in Figure 2-2 may indicate that IUPRED is able to detect a structural predisposition that is stabilized by interactions with other biological molecules. The implications of these proteins, as their IUPRED results compare to those of the flp precursor proteins from C. elegans, are discussed in the Discussion section of this Chapter.

Other Features

One intriguing feature of the flp precursor genes is the position of some of their intron/exon boundaries. The end sequences of many introns are well conserved in eukaryotes, with a large number having a GT at the 5’ end and an AG at the 3’ end (111, 112). This is probably so that the RNA splicing sites are recognized by the RNA splicing machinery. However, in flp precursors, I have noticed that the ends of exons are also conserved and often occur in regions corresponding to FLP neuropeptide coding regions. This occurs in the flp 1-3, 6, 7, 11, 13, 14, 18, 22, 23, 27, and 28 precursors (Table 2-1, column J and Figure 2-1). In fact, known alternate transcripts of the flp-1 and flp-11 precursors exist. These result in production of different FLP neuropeptides (113). Additionally, for flp-23, previous FLP neuropeptide predictions in the literature disagree as to which FLP neuropeptide is actually produced by the flp-23 gene, based on different predicted transcripts (16, 21, 114). These observations illustrate the potential importance of intron/exon boundaries and RNA splicing and their effect on FLP neuropeptide diversity.

Another interesting phenomenon was discovered among the nematode flp-7 orthologues, particularly in Caenorhabditis briggsae. In flp-7 from C. briggsae, a peculiar apparent insertion was observed (Figure 2-3). A novel FLP-7 related peptide sequence exists in Caenorhabditis briggsae and not in the other nematode sequences shown. This indel (insertion or deletion) could be viewed as an error in alignment due to a few point mutations in a peptide otherwise like the others (particularly, the N-terminal leucine and C-terminal leucine). However, the complete alignment of these proteins can be seen in Appendix B. From those alignments, it is clear that nearly all other sites in the C. elegans and C. briggsae flp-7 precursor protein sequences are homologous. Also quite interesting is that this indel occurs in the middle of an exon. I have not yet determined the true origin of this indel, nor do I understand its functional consequence(s). It is nonetheless very intriguing and could be potentially very informative for future researchers of nematode flp precursor evolution.

Biochemical Properties of Mature Processed FLPs

In addition to similarities in patterns observed among several of the flp precursor proteins, several points of similarity are apparent even among the peptides derived from certain sets of precursors (Figure 2-4). The rationale for those groupings is described in the Discussion section of this chapter.
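For the mature peptides, the length tally described in the Experimental Methods (count the peptides that are at least a given length, then take differences between neighbouring counts) can be sketched in a few lines. The function name is hypothetical, and the example peptides are based on the flp-20 and flp-21 entries of Figure 2-4, with the C-terminal glycine included in the length.

```python
def length_counts(peptides):
    """Count peptides of each exact length via cumulative 'at least' counts.

    at_least[k - 1] holds the number of peptides with length >= k, which
    mirrors counting occupied columns in a C-terminally anchored spreadsheet.
    """
    longest = max(len(p) for p in peptides)
    at_least = [sum(len(p) >= k for p in peptides)
                for k in range(1, longest + 2)]  # the final entry is always 0
    exact = {}
    for k in range(1, longest + 1):
        n = at_least[k - 1] - at_least[k]  # >= k minus >= k + 1 gives exactly k
        if n:
            exact[k] = n
    return exact


# Peptide strings based on the flp-20 and flp-21 entries of Figure 2-4.
peptides = ["AVFRMG", "AMMRFG", "AMMRFG", "SVFRLG", "GLGPRPLRFG"]

if __name__ == "__main__":
    print(length_counts(peptides))
```

The differencing step is what converts "at least this long" counts into "exactly this long" counts. A direct tally of `len(p)` would give the same result, but the version above follows the spreadsheet procedure described in the text.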


flp-7-hg  --FESLA  KR  APLDRSAMARFG  K   -------------R   APLDRSALARFG  K
flp-7-ce  SSMVRFG  KR  SPMQRSSMVRFG  K   -------------R   SPMERSAMVRFG
flp-7-cb  SAMVRFG  KR  SPMDRSAMVRFG  KR  LPSDRSSMVRLG KR  SPMDRSAMVRFG  K
flp-7-cr  SSMVRFG  KR  SPMQRSSMVRFG  K   -------------R   SPMERSAMVRFG
flp-7-oo  STMVRFG  -R  APMDRSTMVRFG  K   -------------R   APMDRSSMVRFG  K
flp-7-ac  SSMVRFG  -R  APMDRSSIVRFG  K   -------------R   APMDRSSMVRFG  K
flp-7-mj  SALVRFG  KR  APLDRSALVRFG  K   -------------R   APLDRAAMVRFG  K
flp-7-mh  SALVRFG  KR  APFDRSALVRFG  K   -------------R   APLDRAAMVRFG  K
flp-7-mi  SALVRFG  KR  APLDRSALVRFG  K   -------------R   APLDRAAMVRYG  K
flp-7-rs  SAMARFG  KR  APLDRSAMARFG  K   -------------R   APLDRSAMVRFG  K

Figure 2-3: Portion of the alignment for all known flp-7 precursor protein orthologues in the phylum Nematoda. The amino acids in blue are part of predicted FLP-7 neuropeptides which align with one another and contain the canonical “RFG” C-terminus. In green are the predicted processing sites. In red is a novel indel of a non-canonical FLP neuropeptide found only in Caenorhabditis briggsae thus far. It is likely to be processed with the others.


flp-1a   KPNFMRYG  SAAVKSLG  AGSDPNFLRFG  SQPNFLRFG  ASGDPNFLRFG  SDPNFLRFG  AAADPNFLRFG  SADPNFLRFG  PNFLRFG
flp-3    SPLGTMRFG  TPLGTMRFG  SAEPFGTMRFG  NPENDTPFGTMRFG  ASEDALFGTMRFG  EDGNAPFGTMKFG  EAEEPLGTMRFG  SADDSAPFGTMRFG  NPLGTMRFG
flp-13   AMDSPLIRFG  AADGAPLIRFG  APEASPFIRFG  AADGAPLIRFG  APEASPFIRFG  ASPSAPLIRFG  SPSAVPLIRFG  SAAAPLIRFG  ASSAPLIRFG
flp-18   DFDGAMPGVLRFG  EMPGVLRFG  SVPGVLRFG  SVPGVLRFG  EIPGVLRFG  SEVPGVLRFG  DVPGVLRFG  SVPGVLRFG
flp-6    KSAYMRFG  KSAYMRFG  KSAYMRFG  KSAYMRFG  KSAYMRFG  KSAYMRFG
flp-8    KNEFIRFG  KNEFIRFG  KNEFIRFG
flp-9    KPSFVRFG  KPSFVRFG
flp-14   KHEYLRFG  KHEYLRFG  KHEYLRFG  KHEYLRFG
flp-17   KSAFVRFG  KSAFVRFG  KSQYIRFG
flp-7    TPMQRSSMVRFG  SPMQRSSMVRFG  SPMQRSSMVRFG  SPMQRSSMVRFG  SPMERSAMVRFG  SPMDRSKMVRFG  SSIDRASMVRLG  TPMQRSSMVRFG
flp-22   SPSAKWMRFG  SPSAKWMRFG  SPSAKWMRFG
flp-20   AVFRMG  AMMRFG  AMMRFG  SVFRLG
flp-28   VLMRFG
flp-21   GLGPRPLRFG
flp-27   GLGGRMRFG

Figure 2-4: Groupings of similar FLP neuropeptide subfamilies based on their chemical properties and precursor protein sequence similarities.


Peptide Charge

The FLP neuropeptides in C. elegans nearly all have a net positive charge. In fact, the most common net charge in these peptides is +2 (Figure 2-5). It is clear that a net positive charge is a strongly conserved feature of FLP neuropeptides in C. elegans. In fact, many entire subfamilies of FLPs in C. elegans have a conserved N-terminal lysine residue (Figure 2-4). These subfamilies include peptides produced on the flp-6, 8, 9, 14, and 17 precursors. Other subfamilies tend instead to be rich in negatively charged residues in the N-terminal variable regions. Such subfamilies include FLP neuropeptides produced on the flp-1, 3, 13, and 18 precursor proteins. This illustrates possible homology and is strongly suggestive of evolutionary constraint. Indeed, functional results in Chapter 3 show that more positively charged analogues of FLP-18 peptides are more active on the NPR-1 receptor than less positively charged ones ( 96 ). The potential importance of charge similarities among FLP subfamilies will be explained in the Discussion section of this Chapter.

Peptide Length and Amino Acid Conservation

Another key structural property of FLP neuropeptides is simply the number of amino acids they contain. Because FLPs act as ligands, their molecular size likely affects their role in receptor binding and activation. Peptide length appears to be evolutionarily constrained in FLPs. Many that occur on the same precursor protein are exactly 7 amino acids long; this is also the most common length for FLPs identified in this study (Figure 2-6). Also, some groups of FLPs have similar patterns of amino acid distribution. For example, some of the FLPs of different subfamilies have a proline that is 5, 6, or 7 amino acids from the C-terminus. Precursors with peptides containing a proline in such a position include flp-1, 3, 4, 5, 6, 9, 11, 13, and 18. Another commonality among certain


Figure 2-5: Predicted net charge frequency (pH 7.0) for all FLP neuropeptides predicted from C. elegans.
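A net charge of the kind tallied in Figure 2-5 can be approximated with the Henderson-Hasselbalch relation. The following is only a minimal sketch and not the procedure used to generate the figure: the pKa values are generic textbook approximations chosen here as assumptions, the function name `net_charge` is hypothetical, and the C-terminus is treated as amidated (uncharged) by default, as expected for mature FLPs.

```python
from collections import Counter

# Assumed, approximate pKa values (generic textbook figures, not taken
# from this work; real values depend on sequence context).
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 3.1  # only used when the C-terminus is a free carboxylate


def net_charge(peptide, ph=7.0, amidated=True):
    """Estimate the net charge of a peptide (one-letter codes) at a given pH."""
    # Free N-terminal amine: fraction protonated carries +1.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    if not amidated:
        # Free C-terminal carboxylate: fraction deprotonated carries -1.
        charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in peptide:
        if residue in PKA_BASIC:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_BASIC[residue]))
        elif residue in PKA_ACIDIC:
            charge -= 1.0 / (1.0 + 10 ** (PKA_ACIDIC[residue] - ph))
    return charge


# Mature peptide sequences mentioned in the text (C-terminal amide implied).
peptides = ["EMPGVLRF", "DFDGAMPGVLRF", "VLMRF"]
charges = {p: net_charge(p) for p in peptides}

# Tally of rounded charges, analogous in spirit to a charge-frequency plot.
histogram = Counter(round(c) for c in charges.values())
```

Under these assumed pKa values, the short FLP-18 peptide EMPGVLRF-NH2 comes out close to +1 and the long peptide DFDGAMPGVLRF-NH2 close to 0, while the flp-28 peptide VLMRF-NH2 is close to +2.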


Figure 2-6: Prevalence of peptide lengths for all predicted FLP neuropeptides in C. elegans.


FLPs is the persistence of an aromatic residue that is 4 amino acids from the C-terminus for all the peptides on a particular precursor containing multiple peptides. This is observed for flps 3-6, 8, 9, 14, 16, 17, and 25. This property is common to many FMRFamide-like neuropeptides. Examples of these patterns of sequence similarity among various groups of FLPs are illustrated in Figure 2-4. These groupings are explained further in the Discussion section of this Chapter.

Discussion

The purpose of this work has been to catalogue various hypothesis-generating observations about the FMRFamide-like neuropeptides and their precursor proteins, with emphasis on those likely to guide future molecular evolution studies. Indeed, I have been able to make a number of observations and have given support for their potential importance. Though this chapter is largely observational, some logical conclusions can be drawn from the data and results presented:

- Several striking commonalities among flp precursor proteins are strongly suggestive of an evolutionary relationship
- Charge distribution is a conserved feature of flp precursor proteins that may imply a propensity for 3-dimensional structure
- Charge, peptide length, and common amino acids among FLP neuropeptide subfamilies together suggest an evolutionary relationship and functional importance
- Positive charges in FLPs may be important for receptor binding or interactions with the membrane

Grouping of FLP Subfamilies by Precursor and Peptide Properties

The flp precursor proteins as a protein family appear to be highly divergent in the phylum Nematoda. However, some similarities among these various paralogous family members exist. The similarities among FLPs and their precursors discussed in the results


section above warrant a preliminary grouping by relatedness. Below are suggested group designations and the rationale behind these groupings, which are summarized in Figure 2-4.

The flp-1 Group

This group contains the flp-1, 3, 13, and 18 precursor proteins and their peptide products. Nearly all of the peptides in this group contain a proline that is 5, 6, or 7 residues from the C-terminus in their mature processed form. Proline is one of the least common amino acids and imparts a constrained local conformation on the polypeptide chain. Thus, it is very likely that the structures (either in solution or bound to receptors) of FLP peptides containing proline in a similar position are more similar than those of peptides lacking proline. The prevalence of proline in this subgroup suggests a structural motif under evolutionary constraint. Additionally, the majority of the peptides in this group have N-terminal regions that are rich in negatively charged residues. In the following chapter I show data illustrating that charge influences the biological activities and structures of members of this group. Since charge is an important property of these molecules, it is very likely that this property has been conserved through evolution. The arrangement of the sequence repetition pattern in flp-13 does not include one peptide in the precursor protein which is separated from the others. Flp-18 is replete with spacer regions, in contrast to other precursor proteins listed in this group. However, since the mature processed peptides are known to be the functional portions of these proteins in their interaction with receptors, it is likely that the most evolutionary pressure is on these regions. Thus, based on FLP peptide similarities, the flp-13 and 18 precursors are included in this group.
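The proline criterion described for the flp-1 group can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, the example sequences are taken from Figure 2-4, and it assumes that the C-terminal glycine shown in that figure is consumed during amidation, so that positions are counted on the mature peptide with the last residue as position 1.

```python
def mature_form(precursor_peptide):
    """Strip one trailing glycine, assumed here to be the amidation donor."""
    if precursor_peptide.endswith("G"):
        return precursor_peptide[:-1]
    return precursor_peptide


def proline_positions_from_c_term(peptide):
    """Positions of proline counted from the C-terminus (last residue = 1)."""
    n = len(peptide)
    return [n - i for i, residue in enumerate(peptide) if residue == "P"]


# One representative peptide per precursor, as listed in Figure 2-4.
examples = {
    "flp-1": "SADPNFLRFG",
    "flp-3": "SAEPFGTMRFG",
    "flp-13": "AADGAPLIRFG",
    "flp-18": "EMPGVLRFG",
}

for name, sequence in examples.items():
    mature = mature_form(sequence)
    positions = proline_positions_from_c_term(mature)
    # The grouping criterion: a proline 5, 6, or 7 residues from the C-terminus.
    in_window = any(5 <= p <= 7 for p in positions)
    print(name, mature, positions, in_window)
```

A peptide without proline, such as ASEDALFGTMRFG from flp-3, returns an empty list, which is consistent with the statement that nearly all, rather than all, peptides in the group meet the criterion.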


The flp-6 Group

This group contains the flp-6, 8, 9, 14, and 17 precursors and peptides processed from those proteins. The common features among members of this subgroup are definitely the most striking of the classifications made here. The peptides themselves all contain exactly the same number of amino acids. They also all begin with the same positively charged amino acid (lysine). Additionally, they all have an aromatic residue at exactly the same relative position (Figure 2-4). These similarities strongly suggest an evolutionary relationship. Additionally, their precursor proteins are quite similar; all the precursors contain several copies of nearly identical peptides. This contrasts with other flp precursors in C. elegans, which contain peptides with considerable N-terminal variability. Also, for the flp-6, 8, 9, and 17 precursors, there are acidic-residue-rich spacer regions between nearly all pairs of peptides.

The flp-7 Group

This group contains the flp-7 and 22 precursors and peptides processed from those proteins. There are two main reasons behind grouping these polypeptides together. First, they have striking similarities in their N-terminal sequences (Figure 2-4). This is peculiar because FLPs overall have a gradient of decreasing conservation, with the N-terminus being less well conserved. It is very likely that these peptides bind an overlapping complement of receptors. Indeed, it has been recently shown that flp-7 and flp-22 peptides can activate the same receptor from C. elegans ( 115 ). Second, the precursors for this group of FLPs are in the category of those in which all predicted peptides are concatenated in tandem, separated only by the necessary processing sites.
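The three shared features cited for the flp-6 group (equal length, an N-terminal lysine, and an aromatic residue at the same relative position) can be expressed as a simple test. This is a hedged sketch: the function name is hypothetical, the aromatic set (F, W, Y) is a common convention assumed here, and the sequences are the mature forms of the Figure 2-4 peptides with the C-terminal glycine removed.

```python
AROMATIC = set("FWY")  # assumed definition of "aromatic" for this sketch


def shares_flp6_signature(peptides):
    """True if all peptides have equal length, start with K, and carry an
    aromatic residue 4 positions from the C-terminus (last residue = 1)."""
    lengths = {len(p) for p in peptides}
    return (
        len(lengths) == 1
        and all(p[0] == "K" for p in peptides)
        and all(p[-4] in AROMATIC for p in peptides)
    )


# Distinct mature peptides from the flp-6, 8, 9, 14, and 17 precursors.
flp6_group = ["KSAYMRF", "KNEFIRF", "KPSFVRF", "KHEYLRF", "KSAFVRF", "KSQYIRF"]

# Peptides from the flp-20 and flp-28 precursors, for contrast.
flp20_group = ["AVFRM", "AMMRF", "SVFRL", "VLMRF"]

print(shares_flp6_signature(flp6_group))
print(shares_flp6_signature(flp20_group))
```

The contrast set fails the check because its peptides do not begin with lysine, even though they share a common length.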


The flp-20 Group

This group contains the flp-20 and 28 precursors and peptides processed from those proteins. The key similarities among these neuropeptides are their length and chemical nature. Peptides in these subfamilies contain exactly 5 amino acids each in their mature form. They are also rich in hydrophobic amino acids, with the only charged residue being their conserved penultimate arginine.

For the flp-28 precursor, no published sequence has been annotated as flp-28 DNA or protein, though two publications suggest that it has been observed in unpublished observations in the C. elegans genome ( 16, 21 ). Without further information on those findings, I believe I am the first to identify a sequence in the database (accession number CAE17946, currently annotated as a “hypothetical protein”) as coding for the FLP-28 neuropeptide and being a bona fide flp-28 precursor in C. elegans. Additionally, a previous report separated some of the sequences in the flp-28 alignment (Appendix B) into flp-28 (VLMRF peptide motif) and some into flp-29 (ILMRF peptide motif) ( 21 ). However, based on alignment of the full precursor proteins, and the facts that 1) flp-28 and “flp-29” have not both been identified within the same species and 2) no flp precursor gene has been identified with more than one copy in any one nematode genome, I designate the previously named flp-28 and “flp-29” precursor proteins simply as flp-28. In order to retain the nomenclature in the literature, I will continue to use the flp-30 and flp-31 designations as previously described by McVeigh et al. ( 21 ). This leaves a designation void for the name flp-29, which may be filled by a future nematode flp paralogue, disrupting the chronological order-of-discovery naming scheme previously used for nematode flp paralogues.


The flp-21 Group

This group contains the flp-21 and 27 precursors and peptides processed from those proteins. Their similarities are predominantly in their N- and C-termini, with a major difference occurring due to the presence of two proline residues in the middle of FLP-21. One of the proline positions in FLP-21 is held by a glycine in FLP-27 and the other is missing. There is also an arginine in the middle of both peptides, which is the only charged residue in both aside from the penultimate arginine characteristic of all FLPs.

The evolutionary relationships postulated above are based on precursor proteins and predicted neuropeptide subfamily similarities. Their shared properties very likely affect precursor processing as well as receptor binding and activation. It has been postulated that receptors are more likely to evolve the ability to bind and use a preexisting neuropeptide ligand than neuropeptides are to evolve to bind a different receptor ( 116 ). It has also been estimated that there are about 50-130 neuropeptide receptors in C. elegans ( 117 ). However, there are well over 209 neuropeptides predicted in the C. elegans genome ( 16 ). Additionally, NPR-1 has been shown to be activated by peptides from both flp-18 and flp-21 precursors ( 53 ). With fewer receptors than ligands and some FLP subfamilies being very similar to others, it seems that FLP receptors are indeed likely regulated by more than one ligand in the nervous systems of nematodes. In fact, a receptor from C. elegans has been recently cloned and characterized which is activated by a number of C. elegans FLPs at micromolar concentrations ( 115 ). This would imply that similar FLP subfamilies co-evolve post-duplication to maintain their original similarities and functionality on particular receptors. My hypothesis is that the flp precursor grouping designations above are legitimate and very likely to be upheld by future studies. This, however, remains to be proven. The Future Directions section of


Chapter 5 provides a discussion of some ways in which studies validating these classifications may proceed.

It has been postulated previously that gene conversion in flp precursor proteins is favored ( 36 ). The results of this study, particularly from the alignments of different orthologous subfamilies (Appendix B), corroborate that hypothesis. Thus, an optimized neuropeptide can be duplicated within a gene and processed along with the ancestral copy or copies through the same preexisting regulation pathway. If a higher dose release of peptide with the original sequence is beneficial, then concerted evolution would certainly also be beneficial by maintaining the sequence conservation in the repeated peptide units.

A full evolutionary reconstruction of the FMRFamide-like neuropeptide precursor gene family in C. elegans was not achieved. The method we attempted to use involved generating ancestral sequences of all 28 precursors and using those for phylogenetic reconstruction. The rationale behind this was that flp precursors have very divergent sequences. Since ancestral sequences theoretically represent paralogues shortly after their divergence, they should be more similar than the modern sequences and, thus, easier to align. However, the resulting ancestral sequences were often too short and no more helpful for phylogenetic analysis than were the modern sequences from which they were generated. Thus, Dr. Sassi, Dr. Edison and I decided that generating a phylogenetic tree for flp precursor proteins is more complicated than we had envisioned. Reconstructing the molecular phylogenetic history of this large and diverse gene family will require a more complete and higher quality data set. It may also require the development of new methodologies and was deemed beyond the scope of this chapter. Likely requirements to


complete such a study are discussed in the Discussion section of this Chapter and in the Future Directions section of Chapter 5.

Charge Compensation and Possible flp Precursor Structure

One of the questions guiding the above analyses of flp precursor proteins has been: What could be driving the charge compensation observed? It was hypothesized in previous work that the acidic-residue-rich spacer regions in flp precursors may be present in order to regulate pH in secretory vesicles where the FLP peptides are processed from those proteins ( 20 ). Another possibility is that they are conserved in order to interact with receptors for trafficking to the proper vesicles for processing and secretion. They may have an opposite charge from the FLP peptide regions so that these regions are unbound and free to interact with processing enzymes. This latter suggestion seems unlikely, particularly in cases like flp-6 where the protein consists almost entirely of alternating peptide and spacer regions. It is hard to imagine a spacer region interaction with a receptor that would also accommodate access by processing enzymes.

Charge compensation in protein sequences has been commented on previously ( 118, 119 ). One of the more logical reasons for correlated conservation of amino acid charge properties is to maintain necessary structure-stabilizing contacts ( 119 ). For the vast majority of the flp precursors analyzed, IUPRED results indicate more structure propensity than would be expected from completely disordered proteins. Thus, it seems likely that the spacer regions function to stabilize structural interactions with the peptide and processing site regions of the proteins. However, the structural properties of these proteins beyond their primary sequences have not yet been characterized. A discussion of future experiments that would likely be beneficial to such an endeavor is provided in the Future Directions section in Chapter 5 of this dissertation.


Based on alignments I have performed with many different nematode flp precursor proteins (Appendix B), it is clear that observations made and conclusions drawn on C. elegans precursors are applicable to the phylum Nematoda as a whole. In general, it is my desire to convey through this work an inspiring and driving appreciation for the fascinating sequence diversity and neurochemical importance of the FLPs and flp precursor proteins in the phylum Nematoda. I hope it will be useful in guiding further research. It has been postulated that neuropeptides in general predate such classical neurotransmitters as acetylcholine and serotonin ( 120 ). Thus, an understanding of the evolutionary history of these polypeptides will likely give rise to a wealth of knowledge on both the evolution and function of the components of the nematode nervous system and their influence on behavior. Several ideas for future research based on my experiences on this project are provided in the corresponding section of “Future Directions” in Chapter 5 of this dissertation.


CHAPTER 3
NMR ANALYSIS OF C. elegans FLP-18 NEUROPEPTIDES: IMPLICATIONS FOR NPR-1 ACTIVATION

The work in this Chapter was published in Biochemistry Vol. 45, No. 24, pp. 7586-7597, June 20, 2006 ( 96 ), in collaboration with the laboratories of Profs. Peter Evans and Mario de Bono.

Introduction

NPR-1, a GPCR that modulates feeding behavior in C. elegans, is activated by two subfamilies of FLPs in C. elegans, including the FLP-18 peptides and the FLP-21 peptide ( 53 ). All of the FLP-18 peptides occur on the same precursor and are presumably processed and released simultaneously ( 18 ). The most active of these peptides is EMPGVLRF-NH2; by comparison, DFDGAMPGVLRF-NH2 is significantly less active ( 53 ). These observations motivated us to further investigate the structural properties of these peptides relative to their biological activities.

In previous studies, our research group has suggested that N-terminal hydrogen bonding can influence FLP activity ( 121 ). Structural interactions in small peptides such as FLPs are generally invisible to techniques such as X-ray crystallography, because small peptides are dynamic in solution. Also, some NMR parameters such as NOE correlations for distance measurements are of limited value on small peptides due to their dynamic properties in solution ( 122 ). However, these transient structural properties can modulate their activities ( 121 ). Although a high resolution X-ray crystallographic structure for a FLP-18 peptide bound to NPR-1 would be extremely useful, it would not


provide any information on the unbound state of the peptides. We are interested in understanding how structural interactions within free ligands can affect their binding properties with a receptor. This work seeks to illuminate that portion of the FLP-18 pharmacology on NPR-1.

In the present study we have monitored pH titrations, temperature titrations, and chemical shifts via NMR to identify transient long-range interactions within FLP-18 peptides and designed control analogues. The sequence and activity diversity among these peptides motivated us to examine the structural properties of two extreme cases, EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2, which may influence the activity of each peptide on NPR-1. The material presented in this chapter examines the hypothesis that local structure in the variable N-terminal regions of flp-18 peptides can modulate their binding to NPR-1.

Experimental Procedures

Peptide Synthesis

Peptides listed in Table 3-1 were synthesized using standard Fmoc solid phase methods, purified by HPLC, and verified by MALDI-TOF mass spectrometry at the University of Florida Interdisciplinary Center for Biotechnology Research (UF ICBR) protein core facility.

Peptide Sample Preparation

Lyophilized peptides were weighed and dissolved to ~1 mM in 95% H2O and 5% D2O, and the pH was adjusted to 5.5 by additions of either HCl or NaOH. The peptide solution was then aliquoted and frozen at -20 °C until needed for biological assays or NMR experiments. Small aliquots of each sample were submitted for amino acid analysis at the UF ICBR protein core to verify their concentrations. For NMR


spectroscopy, the pH-stable chemical shift standard DSS (2,2-dimethyl-2-silapentane-5-sulfonic acid) was added to NMR samples at a final concentration of 0.17 mM ( 123 ).

Biological Activity Assays:

The experiments described in this section (Biological Activity Assays) were performed by Dr. Vincenzina Reale from the lab of Prof. Peter Evans and Heather Chatwin from the lab of Prof. Mario de Bono. Sense cRNA was prepared in vitro using the mCAP™ RNA Capping Kit (Stratagene, La Jolla, CA) from plasmid DNA containing full-length npr-1 215V cDNA cloned in pcDNA3 (Invitrogen Ltd., Paisley, UK). RNA transcripts were synthesized using T7 RNA Polymerase (Stratagene, La Jolla, CA) after linearizing the plasmid with Apa I (Promega UK, Southampton, UK) and blunting the 3' overhangs with T4 DNA Polymerase (Amersham Pharmacia Biotech, Little Chalfont, Bucks, UK). T7 RNA transcripts synthesized in vitro with the mCAP™ RNA Capping Kit are initiated with the 5' 7MeGpppG 5' cap analog. Sense cRNA was prepared in a similar manner from the GIRK1 and GIRK2 clones in pBS-MXT ( 124 ) (kindly donated by Drs. S.K. Silverman and H.A. Lester, California Institute of Technology, Pasadena, USA) after linearizing the plasmid with Sal I (Promega). All experiments using Xenopus laevis were carried out under a Home Office (UK) Project License. Stage V and VI oocytes from virgin female adult X. laevis were prepared using standard procedures ( 53, 125-127 ). Oocytes were then injected with 50 ng of npr-1 receptor sense cRNA, either alone or together with 0.5 ng each of GIRK1 and GIRK2 sense cRNA, and incubated at 19 °C for 2-5 days. Uninjected oocytes were used as controls. Electrophysiological recordings were made from oocytes using a two-microelectrode voltage-clamp technique ( 53, 125-127 ).


NMR Spectroscopy

NMR data were collected at 600 MHz using a Bruker Avance (DRX)-600 console with a 14.1 Tesla magnet equipped with a 5 mm TXI Z-Gradient CryoProbe. Unless otherwise stated, all NMR experiments were collected at 288 K, and spectra were collected with a 6600 Hz spectral width and were referenced by setting the methyl proton resonance peak from DSS to 0.0 ppm. The 1H carrier frequency was centered on water, which was suppressed using a WATERGATE sequence ( 128 ) or presaturation. Two-dimensional TOCSY ( 129 ) experiments were collected using a DIPSI-2 mixing sequence with a 60 ms mixing time. Two-dimensional ROESY ( 130 ) experiments were collected using a 2.27 kHz field spin lock cw applied for 250 ms.

Processing of 1D NMR spectra and creation of stack plots of pH and temperature titrations was done using Bruker XWINNMR and XWINPLOT version 4.0 software. Two-dimensional NMR datasets were processed with NMRPipe ( 131 ) using standard methods including removal of residual water signal by deconvolution, multiplying the data with a squared cosine function, zero-filling, Fourier transformation, and phase correction. Data were analyzed and assigned with NMRView ( 132 ) using standard 1H-based methods ( 133 ).

One-dimensional pH titration experiments were performed for all peptides in Table 3-1 that contain aspartate and/or glutamate residue(s), as well as PGVLRF-NH2 and SGSGAMPGVLRF-NH2. One-dimensional NMR spectra were collected at increments of about 0.2 pH units from 5.5 to 1.9 by successive addition of 1-3 µL of 0.01-0.1 M HCl for each pH value. pKa values and effective populations (c in Equation 1) of pH


dependent resonance peaks were calculated using Origin 7.0 software and a modified version of the Henderson-Hasselbalch equation below, as previously described ( 134 ):

Equation 1:

\[
\delta(pH) = \delta_b - (\delta_b - \delta_a) \sum_{i=1}^{j} c_i \, \frac{10^{(pK_{ai} - pH)}}{1 + 10^{(pK_{ai} - pH)}}; \qquad \sum_{i=1}^{j} c_i = 1; \qquad j = 1 \text{ or } 2
\]

where \(\delta(pH)\) is the experimental chemical shift, \(\delta_b\) is the chemical shift at the least acidic condition, \(\delta_a\) is the chemical shift at the more acidic condition, \(pK_{ai}\) is the negative common log of the acid/base equilibrium constant for the ith titration event, and \(c_i\) is the contribution of the ith titration event to the total pH dependence of chemical shift.

One-dimensional NMR temperature titrations were collected on a standard TXI probe at 5 Kelvin (K) increments from 278 to 328 K, then ramped back to 278 K to check for sample integrity. The temperature for each experiment was calibrated using methanol (for 278.15 – 298.15 K) and ethylene glycol (for 308.15 – 328.15 K), and the corrected temperatures were used for the determination of the temperature dependence of the chemical shifts, called temperature coefficients (TC), in parts per billion per Kelvin.

Results

Peptide Design Rationale and Physiological Responses:

Three major considerations have motivated this study. First, our lab has been intrigued for some time by the amino acid diversity in FLPs ( 20, 26, 29, 30, 36, 121, 135, 136 ). In particular, as described above, FLPs display patterns of decreasing amino acid conservation from the C- to the N-termini, and the comparison of the C. elegans flp-18 ( 18 ) and A. suum afp-1 ( 36 ) genes suggests that the longer peptides produced by these genes are unique (see Chapter 1, Table 1-1). Second, the activity at NPR-1 of the long


FLP-18 peptide, DFDGAMPGVLRF-NH2, is significantly lower than that of the shorter EMPGVLRF-NH2 ( 53 ). Finally, in previous work on FLPs from mollusks, Edison, Carlacci, and co-authors found that different amino acid substitutions significantly changed the conformations of the peptides ( 121, 135 ) and that these conformational differences are correlated with their differences in activity ( 137 ).

In designing the peptides for this study, we considered several possibilities to explain the difference in NPR-1 activity between two native FLP-18 peptides, EMPGVLRF-NH2 and DFDGAMPGVLRF-NH2: first, the N-terminus could have intrinsic activity or act as a competitive inhibitor; second, a glutamic acid might be required in a position corresponding to the first residue of the more active EMPGVLRF-NH2; third, the extra amino acids could prevent the active portion of the peptide from efficiently binding to NPR-1; fourth, the N-terminal extension of DFDGAMPGVLRF-NH2 could be involved in structural interactions that cause it to be less potent on NPR-1 than EMPGVLRF-NH2.

To address these possibilities, two native FLP-18 peptides and a range of substituted and derived analogues (Table 3-1) were tested for their ability to activate the NPR-1 215V receptor expressed as described in the Experimental Procedures. In the following, we use the term “activity” to indicate the magnitude of the potassium current evoked by a 10^-6 M pulse of peptide. We will refer to peptides by their peptide number shown in Table 3-1.

It was observed that the long native FLP-18 peptide 2 was much less effective than the shorter FLP-18 peptide 1 at activating the receptor (Table 3-1), confirming previous observations ( 53 ). To test if the N-terminus of peptide 2 could have intrinsic activity or act as a competitive inhibitor, we designed and analyzed the effect of peptide


Table 3-1: Peptides examined by NMR and their activities on NPR-1. a) Naturally occurring sequences are underlined. The conserved PGVLRF-NH2 sequence is in bold. N-terminal “extension” sequences of native FLP-18 peptides are bold and: Red for C. elegans based sequences and Blue for A. suum based sequences. “n” is the number of repeated experiments. b) Peptides (1 µM) were applied in 2 minute pulses to Xenopus oocytes expressing NPR-1 215V. Results are expressed as a % of response to 1 µM Peptide 1 (EMPGVLRF-NH2) +/- SEM (Standard Error). Data for all peptides were normalized to peptide 1 at 100%, which was repeated for each of the measurements. In a previous study, the current measured from applying peptide 1 was 32.2 +/- 3.8 nA (n=33) ( 53 ). c) Most active native C. elegans FLP-18 peptide. d) Longest and least active native FLP-18 peptide. e) Longest A. suum AFP-1 peptide. f) Chimera of long FLP-18 + long AFP-1. g) Chimera of long AFP-1 + long FLP-18.

Name            Sequence (a)              % Response (n) (b)
Peptide 1 (c)   EMPGVLRF-NH2              100
Peptide 2 (d)   DFDGAMPGVLRF-NH2          29.1 +/- 5.7 (16)
Peptide 3       DFDGAM-NH2                0 (3)
Peptide 4       DFDGEMPGVLRF-NH2          19.0 +/- 2.6 (8)
Peptide 5       SGSGAMPGVLRF-NH2          118.7 +/- 11.0 (4)
Peptide 6       AAAAAMPGVLRF-NH2          62.4 +/- 3.2 (10)
Peptide 7 (e)   GFGDEMSMPGVLRF-NH2        49.3 +/- 16.5 (4)
Peptide 8       GFGDEM-NH2                0 (3)
Peptide 9 (f)   DFDGEMSMPGVLRF-NH2        45.0 +/- 10.5 (6)
Peptide 10 (g)  GFGDAMPGVLRF-NH2          104.8 +/- 2.8 (8)
Peptide 11      PGVLRF-NH2                43.0 +/- 5.5 (14)
Peptide 12      PGVLRFPGVLRF-NH2          198.1 +/- 33.3 (10)
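The modified Henderson-Hasselbalch relation of Equation 1 in the Experimental Procedures, which was used to extract pKa values from pH-dependent chemical shifts, can be evaluated numerically. The sketch below is an assumption-laden illustration: it implements one reading of Equation 1 based on the definitions given with it (shift at the least acidic limit, shift at the more acidic limit, and one or two weighted titration events), the function name `titration_shift` is hypothetical, and the example parameter values are invented for demonstration rather than taken from the measurements. The actual fitting was performed in Origin 7.0 and is not reproduced here.

```python
def titration_shift(ph, delta_b, delta_a, pkas, weights):
    """Chemical shift predicted at a given pH for one or two titration events.

    delta_b: shift at the least acidic condition (ppm)
    delta_a: shift at the more acidic condition (ppm)
    pkas:    pKa of each titration event
    weights: contribution c_i of each event; must sum to 1
    """
    if len(pkas) != len(weights):
        raise ValueError("one weight is required per pKa")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Weighted fraction of the protonated (acidic) state across events.
    protonated = sum(
        c * 10 ** (pka - ph) / (1.0 + 10 ** (pka - ph))
        for pka, c in zip(pkas, weights)
    )
    return delta_b - (delta_b - delta_a) * protonated


# Hypothetical amide proton titrating between 8.50 ppm and 8.10 ppm
# with a single event at pKa 4.0 (illustrative values only).
curve = [
    (ph / 10.0, titration_shift(ph / 10.0, 8.50, 8.10, [4.0], [1.0]))
    for ph in range(19, 56, 2)  # pH 1.9 to 5.5 in steps of 0.2
]
```

At a pH equal to the pKa of a single titration event, the predicted shift lies halfway between the two limiting shifts, which is a convenient sanity check when comparing a fitted curve against the data.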


3. This peptide had no intrinsic activity on the receptor (Table 3-1) and did not block the effects of 1 µM pulses of peptide 1 (n = 3) (data not shown). To test the possibility that a glutamic acid might be required in a position corresponding to the first residue of peptide 1, we examined peptide 4, where a glutamic acid residue is substituted for the alanine at position 5 in peptide 2. However, this substitution did not improve, and in fact weakened, the effectiveness of the long peptide. It also seemed possible that the added bulk of peptide 2 due to the extra amino acids could be preventing access to the NPR-1 binding site. Thus, we analyzed peptides 5 and 6. Peptide 5 was designed to eliminate any potential structure in the N-terminus based on commonly used flexible linker sequences in fusion protein constructs (pET fusion constructs, Novagen Inc.). The N-terminus of peptide 6 was designed to induce a nascent helical structure in the same region ( 138, 139 ). It can be seen that peptide 5 completely restored activity compared to peptide 2, while peptide 6 only partially restored activity in comparison with the short native peptide.

As shown in Chapter 1, Table 1-1, the longest native peptide from afp-1 in A. suum ( 33 ), peptide 7, is two amino acids longer than the corresponding longest peptide from the C. elegans flp-18 gene ( 18 ), peptide 2. Thus, we also synthesized and tested peptide 7, as well as its N-terminus, peptide 8. Peptide 7 was slightly more effective than peptide 2. However, the short N-terminal peptide sequence was again inactive (Table 3-1) and did not block the effects of 1 µM pulses of peptide 1 (n = 3) (data not shown). In addition, we also made chimeras of the long C. elegans and A. suum sequences, peptides 9 and 10. Peptide 9, in which two extra amino acids (SM) are introduced into the center of peptide 4 to give it the same number of amino acids as peptide 7, showed similar

activity to that of the long native Ascaris peptide itself, GFGDEMSMPGVLRF-NH2. However, peptide 10 showed similar activity to that of peptide 1.

As shown later, our results indicate that the conserved C-terminal PGVLRF-NH2 is largely unstructured in solution. Thus, we tested peptide 11 and also peptide 12, in which the conserved sequence was duplicated. The activity of peptide 11 was less than that of peptide 1 and similar to that of peptides 7 and 9. When compared to peptides 1 and 5, the reduced activity of peptide 11 could indicate that the methionine preceding proline is important for activity. However, the C-terminal duplicated peptide with no methionine was approximately twice as active as peptide 1.

To further investigate the effects of changing the structure of the N-terminal sequence of the C. elegans long FLP-18 peptide, we determined full dose response curves for peptides 1, 2, and 5 (Fig. 3-1). Figure 3-1 shows that peptide 5 is more potent on NPR-1 than peptide 2, although both are 12 amino acids long. This suggests that elimination of structure at the N-terminus of peptide 2 can increase its potency on the receptor. Also, peptides 2 and 5 are more efficacious at higher concentrations than peptide 1, suggesting that longer peptides might be more efficacious on NPR-1 than shorter peptides.

NMR Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:

Chemical shifts are extremely sensitive to molecular and electronic environments and thus provide unique atomic probes in molecules. Specifically, in peptides and proteins, chemical shifts of many nuclei along the polypeptide backbone have been

[Figure 3-1 plot. Y-axis: % response to 10^-6 M peptide 1 (0 to 200). X-axis ticks: -10 to -5. Legend: peptide 5 (SGSGAMPGVLRF-NH2), peptide 1 (EMPGVLRF-NH2), peptide 2 (DFDGAMPGVLRF-NH2).]

Figure 3-1: Dose response curves of select FLP-18 peptides. Peptides were applied to Xenopus oocytes expressing NPR-1 215V, and the responses from inward rectifying K+ channels were recorded and normalized to the response of peptide 1 (EMPGVLRF-NH2) at 10^-6 M. Filled circles are peptide 1 (EMPGVLRF-NH2) (EC50 = 10^-6.80 M), open squares are peptide 5 (SGSGAMPGVLRF-NH2) (EC50 = 10^-6.12 M), and open circles are peptide 2 (DFDGAMPGVLRF-NH2) (EC50 = 10^-5.28 M). Three measurements at each peptide concentration were obtained, and results are shown ± SEM. These data were acquired by the lab of Prof. Peter Evans.
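To make the EC50 values in the caption more concrete, the sketch below evaluates a generic Hill curve at 1 µM for each peptide. The EC50 values are those reported in the caption; the Hill equation itself, the Hill coefficient of 1, and the use of each peptide's own maximum as the scale are assumptions made for illustration and are not necessarily the model used to fit the data.

```python
def hill_response(conc_M, ec50_M, e_max=100.0, hill_n=1.0):
    """Response predicted by a Hill curve at a given molar concentration."""
    ratio = (conc_M / ec50_M) ** hill_n
    return e_max * ratio / (1.0 + ratio)


# EC50 values as reported in the Figure 3-1 caption.
EC50_M = {
    "peptide 1": 10 ** -6.80,
    "peptide 5": 10 ** -6.12,
    "peptide 2": 10 ** -5.28,
}

for name, ec50 in EC50_M.items():
    # Fraction of each peptide's own maximal response reached at 1 uM
    # (hill_n = 1 is an assumed value, not a fitted one).
    fraction = hill_response(1e-6, ec50, e_max=1.0)
    print(f"{name}: EC50 = {ec50:.2e} M, fraction of own maximum at 1 uM = {fraction:.2f}")
```

By construction, the response at a concentration equal to the EC50 is half of the maximum, so a lower EC50 shifts the curve to the left (higher potency) without saying anything about the maximal response (efficacy).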

shown to be dependent on secondary structure (122, 140-145). Thus, the first step in NMR analysis is the assignment of resonance peaks in spectra to atoms in the molecule. For all peptides in Table 3-1, nearly complete NMR resonance assignments were made using standard two-dimensional 1H-based methods (133) (Appendix C).

Short, linear peptides are often very dynamic, lack a 3D hydrophobic core, and interconvert rapidly between many different conformations. Despite this inherent flexibility, numerous studies have demonstrated that regions of short peptides can be highly populated in specific types of secondary structure (146-149). A fundamental hypothesis of this study is that differences in local structure of the variable N-termini of free FLPs could partially explain differences in their potencies on receptors. In order to compare chemical shifts from one peptide to another and to identify regions that contain significant populations of secondary structure, it is useful to compare experimental chemical shifts to random-coil values (141-143, 145). Fig. 3-2 plots the difference between experimental and random-coil values for some of the peptides analyzed in this study. The white and black bars represent deviations from random-coil values for amide and alpha protons, respectively, and the magnitude of the deviations reflects the population of local structure along the backbone of the peptides (141, 145). All data were compared to published random coil values at pH 2.3.

Several features in Figure 3-2 are worth noting. First, the chemical shifts of residues in the conserved PGVLRF-NH2 regions of each peptide are close to random-coil values, and rather similar among all the peptides examined. This suggests that this conserved sequence is unstructured in solution and that flexibility is important for binding to NPR-1. This flexibility may help the ligand diffuse/maneuver more

Figure 3-2: NMR chemical shift deviations from random coil values. Experimental chemical shifts at pH ~2.3 were subtracted from sequence corrected random coil values (141, 142). Filled and open bars represent alpha and amide protons, respectively.
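The quantity plotted in Figure 3-2 is a per-residue difference between an experimental shift and a random-coil reference. A minimal sketch of that calculation follows, using the "experimental minus random coil" convention described in the text. All shift values and the 0.1 ppm flagging threshold are hypothetical and chosen only to show the bookkeeping; the actual assignments are in Appendix C.

```python
def secondary_shifts(experimental_ppm, random_coil_ppm):
    """Per-residue deviation (experimental minus random coil), in ppm."""
    return {
        residue: round(experimental_ppm[residue] - random_coil_ppm[residue], 3)
        for residue in experimental_ppm
    }


# Hypothetical amide proton shifts (ppm); NOT values from this study.
experimental = {"G4": 8.90, "V9": 8.15}
random_coil = {"G4": 8.40, "V9": 8.12}

deviations = secondary_shifts(experimental, random_coil)

# Assumed illustrative threshold for calling a deviation "significant".
THRESHOLD_PPM = 0.1
flagged = [res for res, dev in deviations.items() if abs(dev) > THRESHOLD_PPM]
print(deviations, flagged)
```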

effectively into a binding pocket on the receptor. Second, the N-terminal extension of peptide 2 shows significant deviation from random-coil values. In particular, the G4 amide proton has a very large deviation, suggesting its involvement in significant structure. Third, the chemical shift deviations of amide and alpha protons of peptides 3 and 8 are nearly identical to the corresponding regions of the full-length peptides 2 and 7, respectively. This indicates that these N-terminal extensions are behaving as independent structural units. Finally, peptide 5, designed to lack N-terminal structure, indeed shows a consistently very small deviation from random-coil values in its first five residues.

pH Dependence of Amide Proton Chemical Shifts Reveal Regions of flp-18 Peptides with Significant Structure:

The sensitivity of NMR chemical shifts to electronic structure and hydrogen bonding makes them ideal probes of longer-range interactions with titratable side-chains. NMR studies of peptides that utilize amide protons often need to be conducted below ~pH 6 to prevent amide proton exchange (133, 150). By varying the pH from about 5.5 to 2, both aspartic and glutamic acid side-chains will be converted from negatively charged and deprotonated to neutral and protonated. These different charge states of the carboxylate groups will produce changes in the electronic environment of interacting atoms proportional to 1/R^3, where R is the distance between the charged group and the chemical shift probe. Backbone amide resonances are particularly sensitive to interactions such as H-bonding (134, 143) and thus provide ideal probes of long-range interactions with side-chain carboxylates. This phenomenon provides a powerful mechanism to study long-range hydrogen bonding and salt bridge interactions in small peptides (121, 150, 151).
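The two quantitative ideas in this paragraph, the change in carboxylate charge state between pH 5.5 and 2 and the 1/R^3 distance dependence, can be illustrated with a short calculation. The sketch below uses the standard Henderson-Hasselbalch relation; the pKa of 4.0 is an assumed round value for a carboxylic acid side-chain, and the distances are arbitrary units chosen only to show the scaling.

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch fraction of an acid in its charged (deprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def relative_effect(distance):
    """Relative size of an effect that falls off as 1/R^3 (arbitrary units)."""
    return 1.0 / distance ** 3


ASSUMED_PKA = 4.0  # illustrative value, not a fitted pKa from this study

for pH in (5.5, 4.0, 2.0):
    print(f"pH {pH}: {100 * fraction_deprotonated(pH, ASSUMED_PKA):.1f}% deprotonated")

# Doubling the distance between charge and probe reduces the effect eightfold.
print(relative_effect(2.0) / relative_effect(1.0))
```

Under these assumptions the side-chain is almost fully charged at pH 5.5 and almost fully neutral at pH 2, which is why this pH window spans essentially the whole titration.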

Many of the FLP-18 peptides contain aspartic or glutamic acids, so we performed 1D NMR pH titration experiments on all peptides in Table 3-1 containing these residues and, as controls, on peptides 5 and 11, neither of which showed any pH dependence in the proton chemical shifts. Stack plots of the amide region for a representative set of peptides are shown in Figure 3-3. First, no chemical shifts in peptide 5 (Figs 3-3E and 3-4E) or peptide 11 (data not shown) have any pH dependence, demonstrating that backbone amide proton chemical shifts are not intrinsically pH-dependent in this pH range. Second, several resonances in other peptides have large pH-dependent shifts. To our knowledge no systematic study has been undertaken to identify the maximum change in chemical shift of backbone amide protons as a function of pH, but Wüthrich and coworkers showed that a well-defined (~70-90% populated) hydrogen bond between an aspartic acid side chain and a backbone amide proton in a small protein led to a change of 1.45 ppm over the titratable range of the aspartic acid (152). Thus, in Figure 3-3, some amide proton resonances of non-titratable amino acids have pH-dependent shifts that are characteristic of significant H-bond interactions. Others have smaller pH-dependent changes, suggesting either more transient dynamic interactions or much longer and weaker H-bonds. In contrast, several resonances in peptides with titratable groups show little or no pH dependence, showing that these effects are relatively specific. Next, the spectra from the N-terminal truncated peptides 3 and 8 are highly dependent on pH and are nearly perfect subsets of the same regions in their full-length counterparts. Consistent with the chemical shift data in Figure 3-2, this demonstrates that the N-terminal extensions of peptides 2 and 7 behave as independent structural units. The extensions also do not interact significantly

Figure 3-3: Amide region of one-dimensional NMR data, collected as a function of pH from about 1.9 to 5.5. Peaks are labeled with their assigned amino acids and the panels correspond to the following peptides: A. Peptide 3 = DFDGAM-NH2, B. Peptide 2 = DFDGAMPGVLRF-NH2, C. Peptide 8 = GFGDEM-NH2, D. Peptide 7 = GFGDEMSMPGVLRF-NH2, E. Peptide 5 = SGSGAMPGVLRF-NH2, F. Peptide 1 = EMPGVLRF-NH2. pH dependent interactions are summarized in Figure 3-5, and complete pKa analyses are provided in Appendix D.

with the more C-terminal backbone atoms, which show relatively little pH dependence, indicating that the conserved C-termini are less structured than the N-terminal extensions.

pH Dependence of Arginine Side-Chains Reveal Long-Range Interactions:

The penultimate arginine residue is highly conserved and found in the same position in all FLPs. This arginine is at least 7 residues away from any carboxyl groups, so we were surprised to find in several FLP-18 peptides that its epsilon proton (Arg Hε) is pH dependent (Figure 3-4). Peptide 5 demonstrates that there is no intrinsic pH dependence for Arg Hε over the pH range investigated, and we conclude that in other peptides there are long-range interactions between the Arg and the N-terminal carboxylates. Such an interaction would indicate a non-covalent ring structure. These interactions show up in most of the FLP-18 analogues having N-terminal carboxyl side-chains and, at first glance, do not appear to relate to the activity of the peptides (Table 3-1). For example, peptide 1 (one of the more active peptides) has nearly the same Arg Hε pH dependence as peptide 4 (the least active PGVLRF-NH2 containing peptide examined). Moreover, peptide 10, with similar activity to peptide 1, has nearly no Arg Hε pH dependence. Thus, Arg interaction with acidic residues alone is not sufficient to explain the difference in activity among the FLP-18 analogues tested.

Quantitative Determination of pKa Reveals Multiple Interactions:

Several of the peptides in Table 3-1 have more than one carboxylate, so it is not always obvious which is responsible for the pH dependence of a particular resonance. If the titrating groups have distinct pKa values, then it should be possible to determine the

Figure 3-4: Arg Hε region of one-dimensional NMR data collected as a function of pH from 1.9 to 5.5. These long-range interactions on the penultimate C-terminal Arg result from the titratable carboxylate groups on the N-termini. pH dependent interactions are summarized in Figure 3-5, and complete pKa analyses are provided in Appendix D. Legend: A. Peptide 4 = DFDGEMPGVLRF-NH2, B. Peptide 2 = DFDGAMPGVLRF-NH2, C. Peptide 10 = GFGDAMPGVLRF-NH2, D. Peptide 7 = GFGDEMSMPGVLRF-NH2, E. Peptide 5 = SGSGAMPGVLRF-NH2, F. Peptide 1 = EMPGVLRF-NH2

contribution of each carboxylate on each titrating resonance using Equation 1. Every peak that exhibited pH-dependent chemical shifts was fitted using first one, then two, then three pKa values. In all cases we used the minimum number of interacting pKa values needed to get a good quality of fit and maximum linear regression coefficient (R2) to the experimental data. In the peptides with three titrating groups, inclusion of three interacting groups in the calculation did not improve the fits beyond those obtained with only two. The complete table of relative pKa contributions (c from Eq. 1) and pKa values is provided in Appendix D, and the interactions are represented graphically in Fig. 3-5. As we discuss below, the interactions between titrating groups and resonances in these peptides are rather complicated and dynamic. The data presented here illustrate that, though there is a heterogeneous ensemble of H-bonding interactions between various backbone amide protons, certain ones are prominent.

Using the pKa values calculated for the pH dependent resonances of the peptides in this study, we can assign most H-bonding interactions between titrating carboxyl side-chains and either backbone amide or Arg Hε protons. Figure 3-5 also illustrates the relative strength of these interactions. The most significant interaction (the largest shift from a long range interaction) is from a hydrogen bond between the D1 carboxylate and G4 amide in all peptides containing the N-terminal DFDG sequence. It contributes 40% to the observed titration of the G4 amide in peptides 2 and 3, and 55% to that of peptide 4 (Appendix D). The calculated pKa of D1 (~3.0) is significantly lower than that of D3 (~4.0), indicating that D1 is likely interacting with the positively charged amino terminus and stabilizing its negative charge. This is also seen in peptide 1, as E1 also has an

Figure 3-5: Proposed H-bonding Interactions Between Backbone Amide Protons and Carboxyl Side-chains: DFDGAM-NH2 = Peptide 3, DFDGAMPGVLRF-NH2 = Peptide 2, DFDGEMPGVLRF-NH2 = Peptide 4, SGSGAMPGVLRF-NH2 = Peptide 5, EMPGVLRF-NH2 = Peptide 1. Each H-bond acceptor residue is color coded to match the arrows leading from it to its H-bond donors. The arrow widths are proportional to the relative extent to which that particular interaction affects the chemical shift of the amide proton at the pointed end of the arrow. The bar plots show the temperature coefficient of the backbone amide proton resonances.
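The idea of fitting a titrating resonance with one, then two, pKa values can be sketched with a generic multi-site Henderson-Hasselbalch model. This is not Equation 1 of the dissertation (which is defined elsewhere and may differ in form); the function names, the grid search, and the starting shift of 8.30 ppm are assumptions for illustration. The synthetic curve borrows the approximate D1 and D3 pKa values (~3.0 and ~4.0), the 40% contribution, and the ~0.28 ppm total change quoted in the text, only to give the example a plausible scale.

```python
def titration_shift(pH, shift_protonated, total_change, sites):
    """Chemical shift at a given pH for a resonance sensing several carboxylates.

    sites is a list of (pKa, c) pairs, where c is the fractional contribution
    of that titrating group; the contributions are expected to sum to 1.
    """
    deprotonated = sum(c / (1.0 + 10.0 ** (pKa - pH)) for pKa, c in sites)
    return shift_protonated + total_change * deprotonated


def fit_single_pKa(pH_values, shifts, shift_protonated, total_change):
    """Grid-search the single pKa (2.00 to 6.00) that minimizes squared error."""
    best_pKa, best_sse = None, float("inf")
    for step in range(200, 601):
        pKa = step / 100.0
        sse = sum(
            (titration_shift(pH, shift_protonated, total_change, [(pKa, 1.0)]) - s) ** 2
            for pH, s in zip(pH_values, shifts)
        )
        if sse < best_sse:
            best_pKa, best_sse = pKa, sse
    return best_pKa, best_sse


# Synthetic curve: 40% from a pKa 3.0 group and 60% from a pKa 4.0 group.
pH_values = [2.0 + 0.5 * i for i in range(8)]  # pH 2.0 to 5.5
two_sites = [(3.0, 0.4), (4.0, 0.6)]
shifts = [titration_shift(pH, 8.30, 0.28, two_sites) for pH in pH_values]

pKa_one, sse_one = fit_single_pKa(pH_values, shifts, 8.30, 0.28)
print(f"best single pKa = {pKa_one:.2f}, residual = {sse_one:.2e}")
```

In this toy case a single pKa cannot reproduce the two-site curve exactly, so its residual stays above zero and the best single value falls between the two true pKa values. That is the kind of signal that would motivate adding a second interacting group while stopping at the minimum number needed for a good fit.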

unusually low pKa (~3.5). Additional support for this pKa assignment comes from the pKa values of the alpha protons of D1 and D3 of peptide 3 and D1 of peptide 2, which are 3.23, 4.09, and 2.97, respectively (Appendix D).

The interactions observed from pH titrations of peptides beginning in GFGD are different from and less substantial than those beginning in DFDG. For example, the largest pH dependent chemical shift change of the amide proton of a non-acidic amino acid for peptide 7 is ~0.12 ppm (G3), whereas G4 in peptide 2 is ~0.28 ppm. Also, the arginine side-chains of peptides 7 and 9 show a rather small chemical shift change in the pH titration experiments (Fig. 3-4D) compared to the same resonance in peptides 2 and 4. Additionally, the D4 side-chain of GFGD containing peptides seems to interact primarily with backbone amides N-terminal to it (Appendix D). This is an entirely different conformation than that observed in DFDG containing peptides, where D1 has a substantial interaction with G4.

Temperature Dependence of Amide Chemical Shifts Corroborates Regions with H-Bonding:

Although complicated and often over-interpreted, the temperature dependence of amide proton chemical shifts in polypeptides can be associated with hydrogen bonding (122, 153). Additionally, some peptides analyzed here lacked carboxyl side-chains, so pH titration results could not be used to determine possible structural interactions in these peptides. We therefore measured temperature coefficients (TCs) for several relevant peptides (Fig. 3-5). A rough guideline to interpreting TCs is that an absolute value less than 4 indicates an internal hydrogen bond, values between 4 and 6 indicate weak hydrogen bonding, and values greater than 6 indicate no involvement in hydrogen bonding (122, 153, 154). The magnitudes of the temperature coefficients for all amide protons in this study

are inversely correlated with the magnitude of chemical shift pH dependence for those resonances (Fig. 3-6), which is consistent with H-bonding interactions as described above.

Overall Peptide Charge is Correlated With Activity on NPR-1:

The experimental data presented above demonstrate that acidic residues in the N-terminal regions of FLP-18 peptides can interact with numerous amide protons and the conserved penultimate Arg. Although there are many additional factors influencing activity as addressed below, there appears to be a qualitative relationship between their charge properties (particularly of the N-terminus) and activities on NPR-1. This relationship is demonstrated in Figure 3-7, where the overall net charge at pH 7 of the entire peptide is plotted against its activity on NPR-1.

Discussion

The goal of this work has been to determine the conformational properties of unbound FLP-18 neuropeptides from C. elegans and how these may affect their potencies on NPR-1. The starting point for this study was the knowledge that two of the peptides encoded by the flp-18 gene have significantly different potencies on NPR-1 (53). The major findings reported above can be summarized as follows:

The backbone of the conserved PGVLRF-NH2 is predominantly unstructured.
DFDG forms a structural loop stabilized by H-bonding.
Another loop forms when N-terminal acidic residue(s) interact with the conserved C-terminal penultimate arginine side-chain.
The DFDG loop interacts with the second loop to form a dynamic bicyclic structure that might influence binding to NPR-1.
Charge also affects the activity of FLP-18 peptides on NPR-1.

[Figure 3-6 plot. X-axis: pH dependence (ppm). Y-axis: TC (-ppb/K).]

Figure 3-6: Relationship between Temperature Coefficients and pH dependence of chemical shift among Backbone Amide Protons: Plotted here is the chemical shift change with pH vs. with temperature for backbone amide resonances. R2 for the linear fit is 0.58. All data from peptides 1-4 are represented.
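A temperature coefficient is simply the slope of amide chemical shift against temperature, usually quoted in ppb/K. The sketch below computes that slope by least squares and applies the rough guideline quoted in the text (absolute value below 4, between 4 and 6, above 6). The temperatures and shifts are hypothetical, and treating the boundary values 4 and 6 as "weak" is an arbitrary choice made for the example.

```python
def temperature_coefficient(temps_K, shifts_ppm):
    """Least-squares slope of chemical shift versus temperature, in ppb/K."""
    n = len(temps_K)
    mean_t = sum(temps_K) / n
    mean_s = sum(shifts_ppm) / n
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_K, shifts_ppm))
    sxx = sum((t - mean_t) ** 2 for t in temps_K)
    return 1000.0 * sxy / sxx  # convert ppm/K to ppb/K


def classify_tc(tc_ppb_per_K):
    """Apply the rough guideline from the text to the magnitude of a TC."""
    magnitude = abs(tc_ppb_per_K)
    if magnitude < 4.0:
        return "internal hydrogen bond"
    if magnitude <= 6.0:
        return "weak hydrogen bonding"
    return "not hydrogen bonded"


# Hypothetical amide proton shifts at four temperatures; NOT measured data.
temps = [278.0, 288.0, 298.0, 308.0]
shifts = [8.500, 8.470, 8.440, 8.410]

tc = temperature_coefficient(temps, shifts)
print(f"TC = {tc:.1f} ppb/K -> {classify_tc(tc)}")
```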

[Figure 3-7 plot. X-axis: Charge (-2 to 4). Y-axis: Activity (%) (0 to 250).]

Figure 3-7: Relationship between overall peptide charge and activity on the NPR-1 receptor. The overall peptide charge at neutral pH is plotted against activity for all of the peptides analyzed in this study. For the linear fit, R2 = 0.67. Legend: Filled circle = 4, Open Circle = 2, Filled Square = 9, Open Square = 5, Filled Triangle = 6, Open Triangle = 7, Filled Diamond = 1, Open Diamond = 10, Filled Hexagon = 11, Open Hexagon = 12 (numbers correspond to peptide numbers in Table 3-1)
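A charge versus activity comparison of this kind can be sketched from the sequences and activities in Table 3-1. The charge model below is a deliberately simple assumption: +1 for the free N-terminus, +1 per Arg or Lys, -1 per Asp or Glu, and no charge on the amidated C-terminus. The method used to compute charge for Figure 3-7 is not specified here, so the R2 from this sketch need not match the reported value of 0.67.

```python
# Formal side-chain charges near neutral pH (simple integer model, an assumption).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def net_charge(sequence, amidated=True):
    """Integer net charge: free N-terminus (+1), side chains, optional free C-terminus (-1)."""
    charge = 1 + sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence)
    if not amidated:
        charge -= 1
    return charge


def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a simple linear fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Sequences and % responses from Table 3-1 (PGVLRF-NH2 containing peptides only).
PEPTIDES = {
    1: ("EMPGVLRF", 100.0), 2: ("DFDGAMPGVLRF", 29.1), 4: ("DFDGEMPGVLRF", 19.0),
    5: ("SGSGAMPGVLRF", 118.7), 6: ("AAAAAMPGVLRF", 62.4), 7: ("GFGDEMSMPGVLRF", 49.3),
    9: ("DFDGEMSMPGVLRF", 45.0), 10: ("GFGDAMPGVLRF", 104.8), 11: ("PGVLRF", 43.0),
    12: ("PGVLRFPGVLRF", 198.1),
}

charges = [net_charge(seq) for seq, _ in PEPTIDES.values()]
activities = [activity for _, activity in PEPTIDES.values()]
r2 = r_squared(charges, activities)
print(f"R^2 under the integer-charge assumption: {r2:.2f}")
```

Even this crude model gives peptides 4 and 9 the same net charge and makes peptide 12 the most positively charged, which agrees with the statements made about those peptides in the Discussion.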

The backbone structure of the conserved PGVLRF-NH2 is predominantly unstructured

All NMR structural parameters measured in this study for the PGVLRF-NH2 region of FLP-18 peptides indicate that the peptide backbone of this conserved sequence is predominantly unstructured. The only significant evidence for any kind of structural motif is the interaction between the conserved penultimate arginine side-chain and acidic residues in the N-termini. These results suggest that the primary receptor-binding region of FLP-18 peptides is highly flexible before interacting with NPR-1.

DFDG forms a structural loop stabilized by H-bonding

We observe transient H-bonding and ionic interactions within FLP-18 peptides beginning in the sequence DFDG. Specifically, acidic residues in the variable N-termini form substantial H-bonds to backbone amides N-terminal to the conserved proline (Figures 3-5 and 3-8). In the DFDG containing peptides, G4 has the smallest temperature coefficient of all amides in the study; this is characteristic of involvement in a significant H-bonding interaction (122, 153, 155). This phenomenon is particularly prominent in peptide 4, where the D1 pKa rather than that of D3 is the most significant contributor to the G4 amide proton titration. It is also the least active PGVLRF-NH2 containing peptide tested. Also, weak ROESY peaks were observed between D1 beta protons and G4 alpha protons in both peptides 2 and 3 (data not shown). This further corroborates the pH titration results that indicate significant long-range H-bonding between the D1 side-chain and G4 backbone amide proton of peptides beginning with DFDG. In contrast, the N-terminal SGSG region of peptide 5 is unstructured based on our NMR results, and is one of the most active peptides analyzed.

The DFDG loop may interact with the second loop to form a dynamic bicyclic structure which reduces binding to NPR-1

There is no direct or simple correlation between the activity data and any one set of NMR data. However, the two carboxylate residues in peptides 2 and 4 allow both the N-terminal loops and the ionic interaction between the conserved arginine and the aspartates (Fig. 3-8A). The increased activities of peptides 7 and 9, along with their apparently weaker interaction between the penultimate arginine and acidic residues relative to peptides 2 and 4, illustrate that the residues SM inserted in the middle of these peptides can interfere with loop formation between the N- and C-termini. FLP-18 peptides are short and flexible, and both loop interactions are likely dynamic. However, there is a possibility that the bulk of the N-terminal loop in DFDG containing peptides is brought into proximity of the conserved receptor-binding region by the action of the second loop involving the penultimate arginine. We propose that this bicyclic structure reduces binding to NPR-1.

Charge is also important in determining the activity of flp-18 peptides on NPR-1

There is a significant correlation between charge and activity such that more positively charged peptides tend to activate NPR-1 better than more negatively charged ones. Interestingly, the vast majority of predicted FLPs in C. elegans tend to be positively charged (18), including the peptide encoded by flp-21, which has an overall charge of +3 and is active on both naturally occurring isoforms of NPR-1 (215F and 215V). However, peptides 4 and 9 have the same charge but different activities on NPR-1. The acidic residues of peptides 4 and 9 differ substantially in their interaction with the C-terminal arginine. This is likely due to the insertion of the residues SM in the middle of peptide 9. Thus, the N-terminal DFDG loop in peptide 9 does not interact well with the

[Figure 3-8 image: panels A and B, with the N- and C-termini (C-NH2) labeled on each peptide.]

Figure 3-8: Model of interactions thought to occur within native FLP-18 peptides. This figure shows the most significant H-bonding interactions supported by NMR data. A: For DFDGAMPGVLRF-NH2, H-bonds between the D1 side-chain carboxylate and G4/A5 backbone amide protons, as well as the H-bonding/ionic interaction involving the D3 side-chain, are indicated by red dashed lines. The N-terminal loop structure implicated in inhibiting binding to NPR-1 is circled in black. B: EMPGVLRF-NH2 is shown with the most significant H-bonding and ionic interactions for which we have evidence. Notice that it has no N-terminal loop, in contrast to DFDGAMPGVLRF-NH2. Also, the same unstructured region for both peptides is shown in ribbon view. The N- and C-termini are also labeled on both peptides.

penultimate arginine, whereas that of peptide 4 does. This further supports the bicyclic model and the effect of a two-loop conformation on the activities of DFDG containing peptides.

Peptides 6 and 12 were often outliers in our attempts to correlate specific NMR data parameters to activity results. Peptide 6 was designed to possess a helix in the N-terminus, and we predicted reduced binding to NPR-1 resulting in activity similar to that of peptide 2. This prediction was incorrect, and peptide 6 had more activity than peptide 2. However, with no carboxylates, peptide 6 lacks the ability to form side-chain mediated H-bonding loops, which our model suggests should give it an activity more like that of peptides 1 and 5. Thus, the activity of peptide 6 (intermediate between peptides 1 and 2) suggests that other properties of its structure modulate its potency.

Peptide 12 unexpectedly had almost exactly twice the activity of peptide 1. It is composed of two copies of the conserved PGVLRF sequence that is responsible for FLP-18 activity on NPR-1. Previous studies on FLP receptors show that the C-terminal amide group is necessary for activity (40), so it is extremely unlikely that the N-terminal PGVLRF copy in peptide 12, which lacks an amide terminus, can interact with the active site of NPR-1. However, this peptide is also the most positively charged of all those tested. This is consistent with our observation that a peptide's charge influences its activity on NPR-1.

Both native FLP-18 peptides in this study, DFDGAMPGVLRF-NH2 and EMPGVLRF-NH2, differ in both potency and efficacy. We have shown that N-terminal structure, peptide charge, loop formation and backbone flexibility in PGVLRF-NH2 modulate the activity of FLP-18 peptides on NPR-1. One interesting feature of the dose response curves in Figure 3-1 is that the two longer peptides have a larger maximal

response and a steeper linear portion than the shorter peptide. This suggests that the native peptides could induce different configurations of the NPR-1 receptor with different abilities to couple to the G-protein pathway under subsequent second messenger pathways (156-159). Both the A. suum and C. elegans long peptides have been isolated (33, 39), demonstrating that these exist in vivo. However, other studies (160, 161) have shown that many peptide degradation products can also be found in cells. Perhaps multiple forms of FLP-18 peptides could shape the behavioral response to NPR-1 activity in a way that could not be achieved by any one peptide alone. It is possible that the ensemble of peptides functions as a bouquet to achieve a unique, beneficial, fine-tuned response (20, 121).

CHAPTER 4
ANISOMORPHAL: NEW INSIGHTS WITH SINGLE INSECT NMR

Introduction

Over 2500 walkingstick insect species (order Phasmatodea) have been identified on earth so far (162). Many of these are known or postulated to produce allomonal defensive secretions (80, 162-170) (allomone = compound secreted by one organism that negatively affects the behavior of another). However, to date, the chemical composition of the secretions from only a handful of species has been characterized (80, 164-169, 171).

Anisomorpha buprestoides is a walkingstick insect (phasmid) (Figure 4-1) which is common to the southeastern United States. It ranges from Florida to Texas and north to South Carolina, including the following states: Florida, South Carolina, Georgia, Alabama, Mississippi, Louisiana, and Texas (172). These insects are commonly found in mating pairs with the male riding on the female’s back (Figure 4-1). This is stereotypical behavior for the genus and begins when the animals are not yet adult, which is atypical in insects. The species is mostly nocturnal, and groups often cluster on tree trunks or in hidden areas in the daytime while remaining motionless.

When disturbed or threatened, A. buprestoides is well known for accurately spraying a defensive secretion up to 40 cm toward the offending stimulus, causing temporary blindness or irritation (173). The active component of this secretion was characterized as a cyclopentanyl monoterpene dialdehyde called “anisomorphal” by Meinwald et al. in 1962 (80). This effort required over 1000 “milkings” of hundreds of individual females. The substance was immediately extracted into methylene chloride

Figure 4-1: Adult pair of Anisomorpha buprestoides on leaves of a sweetgum tree (Liquidambar styraciflua). Adult females are usually about 6-7 cm in length and adult males are about 4-5 cm. Photo by Aaron T. Dossey.

for analysis. Samples were mostly from females due to their larger body size and larger venom ejection volume. At about the same time, Cavill and Hinterberger isolated a similar compound in the ant species Dolichoderus acanthoclinea (order Hymenoptera) that they named “dolichodial” (174). Later, two other isomers of dolichodial were identified in cat thyme (Teucrium marum) (175-177). In 1997, Eisner et al. reanalyzed anisomorphal for comparison with the defensive secretion from another phasmid species. This work referred to the solution sprayed as a pure single stereoisomer from extracts of A. buprestoides secretions (164). The stereospecific structure shown in that paper referenced previous work by Smith et al. (171), who referred to work by Pagnoni et al. (176). The Pagnoni paper from 1976 assigned specific stereochemistry to anisomorphal by comparison of physical and spectral results (176) with those of Meinwald et al. in 1962 (80).

In this chapter I present new work on the defensive secretion of A. buprestoides. Here it is necessary to define some terms that will be used:

A milking (noun) will refer to a single ejection of defensive spray from an insect.
Secretion will refer to the crude substance coming from an insect, either pooled or from a single milking.
Milking (verb) will also be used to describe the process of collecting secretion samples from insects.

Due to the lack of clarity of stereospecific common names for various isomers of dolichodial, all stereoisomers of this compound will henceforth be referred to collectively as “dolichodial-like”. In the Discussion section of this chapter, a detailed designation of names for specific stereoisomers of dolichodial will be given based on the data presented.

Using very fresh unpurified samples of small single milkings from half grown male A. buprestoides, we were able to collect high quality 1D and 2D NMR spectra in the time normally required for standard 600 µL samples. Here we report previously unobserved isomeric heterogeneity of dolichodial-like isomers. The ratios of the two major isomers vary between individual insects and over time in some individuals. Each isomer is in equilibrium with its geminal diol at one of the aldehyde positions (Figure 4-2). We also show that, in addition to those compounds, glucose is present in nearly equimolar concentration compared with the dolichodial-like isomers.

Additionally, we were able to analyze a very small sample of defensive spray from a recently described species of phasmid insect, Peruphasma schultei (163). Based on data that will be presented, the defensive spray of this species contains both glucose and, in contrast to A. buprestoides, a homogeneous and unique dolichodial-like isomer that we have named peruphasmal. The rationale for this nomenclature will be given in the Discussion section of this chapter.

This work has been made possible by a new high temperature superconducting 1 mm NMR probe which Dr. Edison helped develop (77). Other analytical techniques have also improved tremendously in recent decades, as discussed in Chapter 1. The work in this chapter takes advantage of these cutting edge tools in analytical chemistry. It shows how, using such advances in analytical chemistry, our planet's chemical biodiversity can now be explored as never before.

[Figure 4-2 structure drawing: a dolichodial-like dialdehyde and its geminal diol, with chiral carbons marked by asterisks and numbered positions.]

Figure 4-2: Geminal diol equilibrium observed for dolichodial-like isomers in defensive secretions from A. buprestoides and P. schultei. Asterisks (*) indicate chiral carbons. The labels on the right-hand molecule correspond to the NMR resonance assignments in Appendix E and are according to Chemical Abstracts Services (CAS) nomenclature.

Experimental Procedures

Animal Collection and Rearing

Mating adult pairs of A. buprestoides were collected near lights at night at a Circle K petroleum station in Gulf Hammock, Florida (FL), USA (Levy County – latitude = 29° 15′ 12.16″ north and longitude = 82° 43′ 26.96″ west) during the fall of 2005. Though many nocturnal insects are attracted to lights at night, phasmids are typically not. The animals were fed various plants including willow (Salix sp.) and oak (Quercus sp.). Eggs that were dropped to the bottom of their cages were collected and kept on moderately moist soil at room temperature until they began to hatch in about January 2006. The hatchlings were fed on a diet consisting only of variegated Chinese privet (Ligustrum sinense) purchased from a local plant nursery in Gainesville, FL. This diet remained consistent throughout the remainder of their lives.

Sample Collection and Handling

For single insect milkings of A. buprestoides, a clean glass Pasteur pipette was used as shown in Figure 4-3. The end of the pipette was pressed gently against the spray gland so as to disturb the insect and cause it to spray. For single milking NMR samples, the end of the pipette was dipped immediately into 10 µL of previously prepared deuterium oxide (D2O) containing 0.11 mM sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4 (TSP) as an internal chemical shift reference. The solution was mixed until it became clear. When the solution remained milky, more D2O with TSP was added until the solution became clear. The mixing was done in a clean conical glass vial. A syringe was used to transfer 10 µL of the resulting mixture to a 1 mm NMR capillary tube.

For organic extractions, a sample was removed from the 1 mm NMR capillary tube using a syringe, 15 mL of chloroform-d3 was added, the mixture was vortexed


Figure 4-3: Example of the milking procedure for collecting defensive secretion from individual A. buprestoides. Photo by Aaron T. Dossey.


vigorously, and the tube was centrifuged to separate the aqueous (top) and chloroform (bottom) phases. A syringe was then used to collect each phase and transfer it to its own 1 mm NMR capillary tube. For pooled samples of defensive spray from A. buprestoides and P. schultei, a small clean glass vial with a Teflon-coated cap was used as illustrated in Figure 4-4. The vial was simply pressed onto the thorax of the insect with the vial opening held above the spray gland. Care was taken to position the insect so that the vial did not have to be inverted, to avoid spilling the contents of previous milkings. All glassware used for milking procedures was rinsed with deionized water and then with HPLC (High Pressure Liquid Chromatography) grade methanol and allowed to dry.

NMR Spectroscopy

All experiments were done in the same magnet and with the same console described in the Experimental Methods section of Chapter 3. The probe used was a novel 1 mm high temperature superconducting (HTS) probe developed for microsample NMR (77). All spectra were collected at 20 °C. All 1D spectra were obtained with 8 scans. 2D COSY (Correlation Spectroscopy) (178) and TOCSY (Total Correlation Spectroscopy) (129) experiments provide correlations (cross-peaks) between protons that are within 3 covalent bonds. They were done using the following parameters: sweep width = 7184 Hz (Hertz) in both dimensions, carrier frequency set at 4.84 parts per million (ppm) for both dimensions, 2048 complex points acquisition, 256 complex points indirect (States-Time Proportional Phase Incrementation (States-TPPI)) (179), and number of scans = 8. The same parameters, except for the number of scans, were used for ROESY (Rotating-frame Overhauser Enhancement Spectroscopy) (130), which provides correlations between protons that are within 5 Å in space. ROESY spectra were collected with 32 scans.
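As a rough check on the homonuclear acquisition parameters above, the sweep width in Hz can be converted to ppm and the observed window computed around the carrier. The sketch below assumes a 600 MHz 1H operating frequency, which is the value quoted with the HMQC parameters later in this chapter rather than in this paragraph; the function name is illustrative only.

```python
def spectral_window_ppm(sweep_width_hz, carrier_ppm, spectrometer_mhz):
    """Return (width_ppm, low_ppm, high_ppm) for a spectrum centered on the carrier."""
    # 1 ppm corresponds to spectrometer_mhz Hz for the observed nucleus.
    width_ppm = sweep_width_hz / spectrometer_mhz
    half = width_ppm / 2.0
    return width_ppm, carrier_ppm - half, carrier_ppm + half


# Sweep width and carrier are the values quoted in the text.
# The 600 MHz operating frequency is an assumption (taken from the HMQC list).
width, low, high = spectral_window_ppm(7184.0, 4.84, 600.0)
print(f"1H window: {width:.2f} ppm wide, from {low:.2f} to {high:.2f} ppm")
```

Under that assumption the window is about 12 ppm wide and extends past 10 ppm, so the aldehyde resonances near 9.3 to 9.7 ppm discussed in the Results fall inside it.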


Figure 4-4: Procedure used for obtaining pooled defensive spray samples from A. buprestoides and P. schultei. Photo by Aaron T. Dossey.


2D HMQC (Heteronuclear Multiple Quantum Correlation) (180, 181) experiments correlate 1H with 13C through single bonds. These were obtained using a 13C-HMBC BIRD (Bilinear Rotation Decoupling) (182) pulse sequence to minimize signals from 1H attached to 12C nuclei. Parameters used for HMQC experiments were: operating frequency = 600 MHz for 1H and 150.9 MHz for 13C, sweep width = 7184 Hz for 1H and 24140 Hz for 13C, carrier frequency = 4.84 ppm for 1H and 82.8 ppm for 13C, 720 complex points acquisition, 128 complex points indirect (States-TPPI), and 32 scans. 2D HMBC (Heteronuclear Multiple-Bond Correlation) (183) experiments correlate 1H with 13C that are separated by up to 3 bonds. HMBC experiments were optimized for 10 Hz 13C-1H J couplings using the following parameters: operating frequency = 600 MHz for 1H and 150.9 MHz for 13C, sweep width = 7184 Hz for 1H and 36323 Hz for 13C, carrier frequency = 4.84 ppm for 1H and 117.8 ppm for 13C, 1024 complex points acquisition, magnitude indirect = 512, 64 scans, and a 50 ms delay to generate multiple quantum coherence.

High Pressure Liquid Chromatography – Mass Spectrometry (LC-MS)

Gas chromatography (GC), liquid chromatography (LC), colorimetric assay of glucose, and mass spectrometry (MS) were all done by Dr. Spencer Walse at the USDA Center for Medical, Agricultural, and Veterinary Entomology (USDA-ARS) in Gainesville, FL, 32604. The descriptions of these methods below were provided by Dr. Walse. To verify the presence of glucose in insect venom samples, HPLC-MS with (-) electrospray ionization (ESI) of aqueous fractions supplemented with 13C6-D-glucose was done. Data from these experiments are shown in Appendix F. For HPLC, eluants were 0.1% formic acid (FA) in acetonitrile (ACN) (a), 10 mM ammonium formate (b), and 10 mM ammonium


formate in 90% ACN (c). Elution through a YMC-NH2 analytical column (L = 250 mm, ID = 4.6 mm, S = 5 µm, 20 nm) was isocratic (4a:24b:72c) for 18 min. The mobile phase (1 mL/min) was split 10:1 between a photodiode array detector (PDA) and the MS. For mass spectra of corresponding HPLC traces (Appendix F), the following parameters were used: Total ion current (TIC) and C Select ion monitoring (SIM) of m/z 225; D m/z 225 MS2; E -SIM m/z 231; F m/z 231 MS2; sheath and sweep gas flow rates (arb) were 40 and 20, respectively. Equipment used for HPLC included a Thermo Separation Products SpectraSYSTEM SCM1000 membrane degasser, P4000 pump, and AS3000 auto-sampler. For mass spectrometry, a Thermo Finnigan UV6000LP LDC PDA and a Finnigan LCQ DecaXP Max mass spectrometer in ESI mode (+/-) (5 kV spray voltage and 275 °C capillary temperature) were used.

Gas Chromatography (GC)

For GC experiments, two injection methods were employed to illustrate that peaks observed were not artifacts of injection technique. Cool on-column and split-less injections (1 µL) were at 40 °C and 200 °C, respectively; the detector was maintained at 260 °C. The oven program was as follows: isothermal for 5 min, heating from 40 °C to 200 °C at 11 °C/min, isothermal for 10 min, heating from 200 °C to 250 °C at 25 °C/min, and then isothermal for 15 min. GlasSeal connectors (Supelco) fused three silica columns in series: a primary deactivated column (L = 8 cm, ID = 0.53 mm), an HP-1MS retention gap column (L = 2 m, ID = 0.25 mm, df = 0.25 µm), and a J&W DB-5 analytical column (L = 30 m, ID = 0.25 mm, df = 0.25 µm). Equipment used for the GC experiment included a Hewlett Packard (Palo Alto, CA) 5890 series II gas chromatograph and a flame-ionization detector (GC-FID) with nitrogen make-up gas (1.5


mL/min) and helium carrier gas (1.3 mL/min). Data from these experiments are shown in Appendix H.

Gas Chromatography – Mass Spectrometry (GC-MS)

Mass spectra for peaks from GC experiments (described above) were used to identify the eluting compounds. For these experiments, a Varian 3400 gas chromatograph and a Finnigan MAT Magnum ion trap mass spectrometer (GC-ITMS) in electron impact (EI) ionization mode (70 eV) with a filament bias of 11.765 volts or chemical ionization (CI) mode (isobutane) were employed to acquire full-scan spectra over the range m/z 40 to 400 at 0.85 s per scan. Holox (Charlotte, NC) high purity helium was used as a carrier gas (1.4 mL/min). Injection and oven conditions were as above. Transfer-line and manifold temperatures were 240 and 220 °C, respectively.

Results

The motivation for the first experiments in this study was to use the 1 mm HTS probe to characterize a single milking sample from A. buprestoides in aqueous solvent to determine 1) whether variation could be observed between individual animals and 2) whether additional, previously unreported water-soluble components could be seen in the defensive secretions. Previous reports indicated the secretion to be pure anisomorphal (80, 164, 176). Pure substances are uncommon from biological sources. An additional motivation was the possibility of discovering a new compound from a recently described phasmid insect, P. schultei (163).

NMR of Single Milkings Shows a New Component and Isomeric Heterogeneity

In the first aqueous single-milking NMR spectrum from A. buprestoides, it was clear that the sample contained a more complex mixture than expected for a pure ten-carbon molecule (Figures 4-2 and 4-5 A).


Immediately, chemical shifts of the vinyl and aldehyde protons were roughly verified to be those expected for anisomorphal based on original work by Meinwald and co-workers (80). However, due to the apparent complexity of the mixture in Figure 4-5 A, the sample was extracted with chloroform-d and NMR spectra were taken of the aqueous (Figure 4-5 B) and chloroform (Figure 4-5 C) phases of the extraction. Because the aqueous fraction appeared to contain a sugar-like molecule similar to glucose (Figure 4-5 A and B) (personal communication with James R. Rocca, UF AMRIS), we compared its spectrum to that of a pure sample of D-glucose (Figure 4-5 F). Indeed, the resonances observed from glucose are identical to those in the aqueous fraction of the single milking sample from A. buprestoides. To further verify by NMR that the aqueous component was glucose, another single milking sample (Figure 4-5 D) was spiked with D-glucose (Figure 4-5 E) and the spectrum was compared directly with that of a sample of pure D-glucose (Figure 4-5 F). Clearly, the only peaks in the spectrum that increased after addition of glucose are those of the sugar-like aqueous constituent of the defensive secretion of A. buprestoides.

In addition to glucose, early experiments indicated that possibly more than one isomer of the dolichodial-like cyclopentanyl monoterpene dialdehyde previously reported from A. buprestoides defensive secretions (80, 164) was present. This was intriguing since previous work reported only a single isomer present in this species (80, 164, 176). Thus, to verify these observations for other animals and to determine whether the relative isomeric concentrations vary between individual animals or within an individual animal over time, 1D spectra from several additional single milkings were collected. We separated four half-grown male A. buprestoides from our culture so that they could be


Figure 4-5: 1D NMR spectra of single milking samples from A. buprestoides, its chloroform extract, the aqueous fraction, a sample of glucose for comparison, and a glucose spike experiment. A) Single milking from A. buprestoides in D2O with 0.11 mM TSP. The milking (~1 µL) was added to 10 µL of D2O containing 0.11 mM TSP and then added to a 1 mm capillary NMR tube. The spectrum was taken within 10 minutes of being ejected from the insect. A


bracket indicates the region of expansion shown in D, E, and F. B) Aqueous phase from chloroform extract of sample A. C) Chloroform phase of chloroform extract from sample A. Dolichodial-like isomers extracted completely into chloroform and were 100% in the dialdehyde form in this solvent. D) Another single milking sample from A. buprestoides prepared similarly to sample A. E) Sample D (10 µL) spiked with 0.9 µL of 50 mM glucose dissolved in D2O containing 0.11 mM TSP. F) Pure glucose (~5 mM) in D2O containing 1 mM TSP.
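The volumes in this caption imply two simple dilution figures: how much glucose the spike in panel E adds, and how far the crude secretion is diluted in panel A. The sketch below is only a back-of-the-envelope illustration; it assumes additive volumes and treats the "~1 µL" milking as exactly 1 µL, and the function name is hypothetical.

```python
def mix_concentration_mm(vol_a_ul, conc_a_mm, vol_b_ul, conc_b_mm):
    """Concentration (mM) after mixing two solutions, assuming additive volumes."""
    total_ul = vol_a_ul + vol_b_ul
    return (vol_a_ul * conc_a_mm + vol_b_ul * conc_b_mm) / total_ul


# Glucose contributed by the spike alone (panel E): 0.9 uL of 50 mM stock
# mixed into 10 uL of sample. Glucose already in the sample is ignored here.
added_glucose_mm = mix_concentration_mm(0.9, 50.0, 10.0, 0.0)

# Dilution of the secretion in panel A: ~1 uL of milking into 10 uL of D2O.
# Treating the milking volume as exactly 1 uL is an assumption.
dilution_factor = (1.0 + 10.0) / 1.0

print(f"glucose added by spike: {added_glucose_mm:.2f} mM")
print(f"secretion dilution: {dilution_factor:.0f}-fold")
```

On these assumptions the spike adds roughly 4 mM glucose, which is comparable to the ~5 mM pure glucose reference in panel F.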


distinguished from one another. We then collected 1D 1H NMR spectra of single milking samples from each insect on different days (Figure 4-6). In the 1D 1H spectra from individual milkings of A. buprestoides, we observed four different dolichodial-like isomers that were in different ratios for different individual animals and even changed for certain individuals over time (Figure 4-6). This isomeric heterogeneity is most easily observed in the vinyl proton region, so we use expansions of this region to illustrate the isomeric ratios. For animals 1, 2, and 4, the same isomer remained dominant for all days observed. However, for animal 3, the second most abundant isomer became the most abundant from day 2 to day 8.

To verify that all resonances observed in NMR spectra were only from either the dolichodial-like isomers or from glucose, a sufficient complement of 2D NMR spectra was collected (Appendix G) and complete 1H and 13C NMR assignments of the dolichodial-like isomers were made (Figure 4-6 and Appendix E). Indeed, the only peaks consistent among all samples collected from A. buprestoides correspond to glucose, two dolichodial-like isomers, and geminal diols of those isomers at the f2 aldehyde position (Figure 4-2). However, inspection of the vinyl region expansions in Figure 4-6 shows that four isomers exist. One possible explanation is that at least one of the aldehydes is in equilibrium with its geminal diol, which is expected to be present in aqueous solution. Indeed, a 1H resonance at 4.92 ppm correlated to a 13C at 96.06 ppm was observed in 2D spectra (Appendices E and G). Also, integrals of aldehyde peaks (Figure 4-8 A) indicated that some of the isomers were deficient in one of the aldehyde protons. Specifically, the sum of the f1 aldehyde proton integrals (at ~9.7 and 9.3 ppm; Figures 4-2 and 4-8) is less than the sum of the f2 protons (at ~9.45 ppm).


This should not be the case if the molecules are all in the dialdehyde form (Figure 4-2). These protons can also be identified because the f1 protons (Figure 4-2) are doublets due to coupling with the C1 proton, whereas the f2 protons are singlets, as expected. Using 1D 1H integrals of the aldehyde protons (Figure 4-8 A), we were able to calculate that each isomer is in equilibrium with about 14% in the geminal diol form at the f2 position (Figure 4-2). Inspection of the 1D 1H spectra in the vinyl region shows two “major” isomers and two “minor” isomers (Figure 4-8). The two “major” isomers were assigned by 2D 1H and 13C NMR to correspond to dialdehydes. Each “major” form has a corresponding “minor” form that integrates to about 14% of the sum of the two. This strongly suggests that the two “minor” forms are the geminal diols. Additionally, by the same logic, the vinyl proton resonance for each dialdehyde isomer can be assigned to its corresponding geminal diol. This issue is described further in the Discussion section of this Chapter.

In addition to secretion samples from A. buprestoides, we were able to obtain a defensive secretion sample from the recently described stick insect species from Peru, Peruphasma schultei (163). The sample was pooled from milkings of three individual P. schultei. Sufficient sets of 1D and 2D NMR data for these were collected using NMR experiments identical to those described above for secretions from A. buprestoides (Figure 4-7). These show that P. schultei produces a single dolichodial-like stereoisomer that is different from the two observed by NMR from A. buprestoides (Appendices E and G). Complete 1H and 13C assignments were made of this isomer and its geminal diol (Appendix E). Presence of glucose was also verified in the sample by comparison of 1D 1H NMR spectra with that of an authentic pure glucose sample and by extracting the


[Figure panels: Animals 1 to 4 on Day 1, Day 2, and Day 8; horizontal axis 1H (ppm), 6.6 to 6.2.]

Figure 4-6: Expansions of the vinyl region of 1D 1H NMR spectra for single milkings of A. buprestoides on different days. Samples were prepared as described in the Experimental Methods section of this Chapter and in the caption for Figure 4-5.


Figure 4-7: 2D COSY and ROESY 1H NMR spectra of defensive secretions from A. buprestoides and P. schultei. The lines indicate independent spin systems involving the crosspeaks which they connect. Each panel (top and bottom) consists of spectra from three different types of 2D NMR experiments: two


leftmost spectra = ROESY, larger spectra to the right = COSY. TOP spectra = single milking from A. buprestoides, BOTTOM spectra = pooled sample from P. schultei.


[Figure panels: A) aldehyde region, 9.75 to 9.25 ppm, with printed integrals 32.685, 20.545, 135.208, and 99.968; B) vinyl region, 6.70 to 6.10 ppm, with printed integrals 17.127, 98.039, 7.927, 35.108, 5.969, 100.000, 17.779, and 33.440. Peaks are labeled a through l.]

Figure 4-8: Integrals of aldehyde and vinyl regions from single milkings of A. buprestoides defensive secretion. Integrals of NMR peaks demonstrate the relative concentrations of the protons that give rise to resolvable resonance peaks. The integrals here are all normalized to the peak at ~6.5 ppm (B) set at 100. A) Aldehyde and B) vinyl region expansions of the 1D 1H spectrum taken on the sample from animal number 2 on Day 1 (Figure 4-6). To assign aldehyde and vinyl resonances to the corresponding isomers, integrals were summed and compared for the different peaks as follows: a and d represent the same proton (on the f2 carbon, Figure 4-2) on two dolichodial-like isomers. Proton a is 76% of the total and d is 24%. Since we observe variability in isomeric ratios between samples, this ratio holds only for this particular sample. Also, for the major vinyl resonances, g and k have equal integrals and so do e and i. Protons g and k together make up 74% of that sum and protons e and i make up 26%. Thus, protons a, g, and k are from the same isomer and protons d, e, and i are from another isomer. To determine the proportion of geminal diol form for each isomer and to assign vinyl resonances to the diol form of each isomer, integrals were used as follows: b and c are resonances for the protons on the f1 carbon and a and d are the protons on the f2 carbon (Figure 4-2). Thus, integrals a + d should equal b + c. However, a + d = 0.86 x (b + c). This indicates that ~14% of the protons on the f2 carbon are missing. From 2D NMR assignments, those protons show up at around 5 ppm. This indicates that the missing proportion of the molecule is a diol at the f2 carbon position. Also, c = 0.14 x (b + c), so c is the f1 proton for the diol form of the molecule (Figure 4-2). Additionally, 0.86 x (f + g + k + l) = g + k and 0.83 x (e + h + i + j) = e + i. This indicates that f and l are resonances from the diol of the same stereoisomer as g and k, and h and j are from the diol of the same stereoisomer as e and i.
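The integral comparisons in this caption can be reproduced with a few lines of arithmetic. The sketch below uses the integral values printed on the figure, but the mapping of each value to a peak letter is an assumption: it was inferred from the ratios stated in the caption, not read from the figure itself. The computed ratios land close to, though not exactly on, the rounded percentages quoted above.

```python
# Integral values as printed on Figure 4-8. Which value belongs to which
# peak letter is an ASSUMPTION inferred from the ratios given in the caption.
aldehyde = {"a": 99.968, "b": 135.208, "c": 20.545, "d": 32.685}
vinyl = {"e": 35.108, "f": 17.127, "g": 98.039, "h": 7.927,
         "i": 33.440, "j": 5.969, "k": 100.000, "l": 17.779}


def total(table, labels):
    """Sum the integrals for the given peak labels."""
    return sum(table[label] for label in labels)


# Share of the first isomer among the f2 aldehyde protons (caption: ~76%).
isomer1_share_aldehyde = aldehyde["a"] / total(aldehyde, "ad")

# f2 protons relative to f1 protons (caption: a + d = 0.86 x (b + c)).
f2_over_f1 = total(aldehyde, "ad") / total(aldehyde, "bc")

# Fraction of f1 signal assigned to the diol form (caption: c = 0.14 x (b + c)).
diol_fraction_f1 = aldehyde["c"] / total(aldehyde, "bc")

# Share of the first isomer among the major vinyl peaks (caption: ~74%).
isomer1_share_vinyl = total(vinyl, "gk") / total(vinyl, "gkei")

# Dialdehyde fraction per isomer from the vinyl region (caption: 0.86 and 0.83).
dialdehyde_isomer1 = total(vinyl, "gk") / total(vinyl, "fgkl")
dialdehyde_isomer2 = total(vinyl, "ei") / total(vinyl, "ehij")

for name, value in [("isomer 1 share (aldehyde)", isomer1_share_aldehyde),
                    ("f2 / f1", f2_over_f1),
                    ("diol fraction at f1", diol_fraction_f1),
                    ("isomer 1 share (vinyl)", isomer1_share_vinyl),
                    ("dialdehyde fraction, isomer 1", dialdehyde_isomer1),
                    ("dialdehyde fraction, isomer 2", dialdehyde_isomer2)]:
    print(f"{name}: {value:.3f}")
```

Because the analysis only ever uses sums within each pair (g and k, e and i, f and l, h and j), swapping the two values inside a pair would not change any of the ratios.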


P. schultei sample with chloroform. These experiments were similar to those described above for milking samples from A. buprestoides.

Glucose Verified by Chromatography and Colorimetric Assay

To further verify the identity of glucose in defensive secretions of A. buprestoides, liquid chromatography coupled with mass spectrometry was employed (Appendix F). For this, a pooled sample of several milkings of A. buprestoides was used. The sample was first extracted with organic solvent to remove the dolichodial-like substance(s). Next, the sample was spiked with 13C6-D-glucose and loaded onto a polyamine analytical column followed by an isocratic elution with ammonium formate and formic acid (FA) in acetonitrile (ACN). To compare elution profiles of 13C glucose and the aqueous constituent of A. buprestoides secretions, negative ions with masses of 178.85, 184.75, 224.84, and 230.75 daltons were monitored using an electrospray ionization (ESI) mass spectrometer. These masses correspond to M-1 ions of D-glucose (179), 13C D-glucose (185), a formic acid adduct of D-glucose (225), and a formic acid adduct of 13C-D-glucose (231), respectively. The LC traces in Appendix F show that all of these constituents co-elute. This independently illustrates that the aqueous non-dolichodial-like constituent in defensive secretions of A. buprestoides is glucose, corroborating the NMR identification of glucose described above. Presence of glucose was also verified by colorimetric assay (184). This method also allowed the quantification of glucose concentration. We calculate that fresh crude ejections of defensive spray from A. buprestoides contain about 140-280 mM glucose. Based on 1D 1H NMR spectra, the dolichodial-like isomers are present at roughly the same concentration (Figure 4-5).
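The nominal ion masses quoted above (179, 185, 225, 231) can be checked from monoisotopic atomic masses. The sketch below is illustrative only: it assumes the "formic acid adduct" is the formate adduct [M+HCOO]-, uses standard monoisotopic masses rounded to six decimals, and introduces hypothetical helper names.

```python
# Monoisotopic masses in unified atomic mass units (standard values, rounded).
MASS = {"C": 12.000000, "13C": 13.003355, "H": 1.007825, "O": 15.994915}
PROTON = 1.007276
ELECTRON = 0.000549


def neutral_mass(formula):
    """Monoisotopic mass of a neutral species given {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())


glucose = {"C": 6, "H": 12, "O": 6}
glucose_13c6 = {"13C": 6, "H": 12, "O": 6}

# Formate anion HCOO-: neutral HCOO atoms plus one electron.
# Treating the "formic acid adduct" as [M+HCOO]- is an assumption.
FORMATE_ANION = neutral_mass({"C": 1, "H": 1, "O": 2}) + ELECTRON


def negative_ions(formula):
    """Return m/z for the deprotonated ion and the formate adduct."""
    m = neutral_mass(formula)
    return {"[M-H]-": m - PROTON, "[M+HCOO]-": m + FORMATE_ANION}


ions_glucose = negative_ions(glucose)
ions_13c6 = negative_ions(glucose_13c6)

for label, ions in [("D-glucose", ions_glucose), ("13C6-D-glucose", ions_13c6)]:
    for ion, mz in ions.items():
        print(f"{label} {ion}: {mz:.3f} (nominal {round(mz)})")
```

The calculated values round to the nominal masses given in parentheses in the text; the monitored values reported above sit a few tenths of a dalton below them.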


Stereoisomeric Heterogeneity Verified by Gas Chromatography and Mass Spectrometry

As additional verification of the identity of dolichodial-like isomers from A. buprestoides and P. schultei defensive secretions, organic extracts of these and of the catmint (T. marum) were characterized using gas chromatography coupled with mass spectrometry (Figure 4-9). Two stereoisomers were previously verified from T. marum in a 9:1 ratio by Pagnoni et al. Based on prior results by Meinwald et al., the minor form was determined in the Pagnoni paper to be anisomorphal (80, 176). The T. marum chromatogram in Figure 4-9 shows a major form (retention time = 11.95 minutes) and a minor form (retention time = 12.15 minutes). In samples from A. buprestoides, two major forms and one minor form were observed. Though the minor form was not observed by NMR, the presence of the two major forms corroborates the NMR results described above. One major form has the same retention time as the major form from T. marum and the other major form has the same retention time as the minor form from T. marum. Samples from P. schultei yielded only a single dolichodial-like isomer by GC that corresponds to a previously uncharacterized form. It has the same retention time as the minor form observed from A. buprestoides. GC peaks for all dolichodial-like isomers from all species studied were identified by their fragmentation patterns using both chemical ionization (CI) and electron impact ionization (EI) mass spectrometry (Appendix H). Additionally, to confirm that the various isomers of dolichodial observed in the different samples were not artifacts of the injection method, two different injection techniques were employed: cool on-column and splitless injection. Identical chromatograms were observed with both techniques, demonstrating that the peaks observed were not an artifact of injection method.


[Figure: chromatograms for P. schultei, A. buprestoides, and T. marum; relative abundance versus time (min), 11.00 to 13.00, with peaks marked at 11.78, 11.95, and 12.15 min.]

Figure 4-9: Gas chromatograms of dolichodial-like isomers isolated from defensive secretions of the insects Anisomorpha buprestoides and Peruphasma schultei and extracts of the plant Teucrium marum. Cool on-column injection was used to load the samples onto fused silica columns. Detection was done using a flame ionization detector (FID). Horizontal dotted lines show peaks from different chromatograms with identical relative retention times.
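For orientation, the GC oven program given in the Experimental Procedures can be written as a piecewise temperature function. The sketch below is a hypothetical helper, not part of the original method description. Mapping the peak times in Figure 4-9 onto oven temperature additionally assumes that those times are measured from injection under this same program, which the caption (which refers to relative retention times) does not confirm.

```python
# Oven program from the Experimental Procedures, as
# (start_temp_C, end_temp_C, duration_min) segments.
SEGMENTS = [
    (40.0, 40.0, 5.0),                       # isothermal hold
    (40.0, 200.0, (200.0 - 40.0) / 11.0),    # ramp at 11 C/min
    (200.0, 200.0, 10.0),                    # isothermal hold
    (200.0, 250.0, (250.0 - 200.0) / 25.0),  # ramp at 25 C/min
    (250.0, 250.0, 15.0),                    # final isothermal hold
]


def total_run_time(segments=SEGMENTS):
    """Total length of the oven program in minutes."""
    return sum(duration for _, _, duration in segments)


def oven_temperature(t_min, segments=SEGMENTS):
    """Oven temperature (C) at t_min minutes after the program starts."""
    t_min = max(t_min, 0.0)
    elapsed = 0.0
    for start, end, duration in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    # Past the end of the program: report the final temperature.
    return segments[-1][1]


print(f"total run time: {total_run_time():.1f} min")
for peak_time in (11.78, 11.95, 12.15):
    # ASSUMPTION: peak times are measured from injection under this program.
    print(f"{peak_time:.2f} min -> {oven_temperature(peak_time):.1f} C")
```

If that assumption holds, all three isomers would elute during the first ramp, within a few degrees of one another.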


Discussion

The work described above used cutting-edge microsample NMR technology to chemically characterize defensive secretions of two insect species (Anisomorpha buprestoides and Peruphasma schultei). These results were verified by comparison with extracts from the plant Teucrium marum and by more standard natural products chemistry techniques. The major findings made possible by the novel microsample NMR technology are as follows:

- Glucose was observed for the first time in phasmid insect defensive secretions.
- Anisomorpha buprestoides can produce two major stereoisomeric forms of its defensive compound.
- Relative isomeric concentrations of A. buprestoides defensive compounds vary between individuals and over time.
- Peruphasma schultei produces a unique stereoisomer of anisomorphal called peruphasmal, as well as glucose.

Glucose Discovered in Stick Insect Defensive Spray – Potential Functions

Glucose associated with defensive secretions has been previously observed in some arthropod taxa such as leaf beetles (Class Insecta, Order Coleoptera, Family Chrysomelidae) (185). The presence of glucose in such secretions is postulated to be due to β-glucosidase activity, which liberates glucose from the active component of their defensive secretion (185, 186). Other glandular components from insects also arise from β-glucosidase activity (186). However, the findings in this study are the first report of glucose in the defensive secretions of phasmid insects. The reason this substantial component was missed by other groups was undoubtedly that they analyzed only organic extracts of the secretions (80, 164-167, 171) or that the only method used


was gas chromatography, which analyzes only volatile compounds. These methods were probably chosen based on the original work by Meinwald et al. (80). The similar concentrations of glucose and dolichodial-like isomers (presumably the active defensive components) observed in the phasmid insect secretions suggest that these compounds are at some point covalently linked during their biosynthesis. However, this has yet to be proven. Hypotheses for the function of glucose in the defensive secretions of phasmid insects will be discussed in the Future Directions section of Chapter 5.

Isomeric Heterogeneity in Phasmid Defensive Compounds – Chemical Biodiversity

The findings in this chapter are in direct contradiction to those by Meinwald, Pagnoni, and Eisner (80, 164, 176) for the dolichodial-like secretion of A. buprestoides. Here we have shown that A. buprestoides can produce a mixture of dolichodial-like isomers. In our work we provide NMR evidence for the presence of two distinct stereoisomers and their corresponding geminal diols in aqueous solution (Figure 4-2). By GC we observe three distinct dolichodial-like isomers. Previous studies may have observed pure stereoisomers of dolichodial from A. buprestoides because different genetic backgrounds or environmental factors influence the stereochemistry of defensive secretions in this species. Alternatively, improvements in analytical techniques may have allowed the additional isomers to be detected. By comparing the integrals of the aldehyde peaks in 1D spectra, we can determine that the proportion of diol in solution is roughly 14%. This is, of course, an average of the two stereoisomers, since the f1 protons (Figure 4-2) of the stereoisomers are not resolved. Also, with this information we can determine which peak from an apparent diol in the vinyl region goes with which peak from the corresponding dialdehyde stereoisomer. The integrals are distinct enough to make these designations by simple


inspection of the vinyl region in Figure 4-6. However, this is based on the assumption that the aldehyde/geminal diol equilibria for both stereoisomers are the same. The integrals indeed make a compelling case that this is true, but further experiments on pure stereoisomer solutions are required to definitively make such assignments. Thus, for this Dissertation, chemical shift assignments (Appendix E) for the geminal diol isomers of dolichodial and anisomorphal (both present as major forms from A. buprestoides, as previously discussed) are given without specific stereochemistry.

Peruphasmal – A Novel Phasmid Defensive Compound Isomer

Correct stereochemistry is extremely important for the proper biological function of many molecules. The vast majority of amino acids found in proteins are of the L isoform at their alpha carbon and glucose is always D at carbon 5. Semiochemicals in insects are no exception. A famous case of this was shown for the sex pheromone of the Japanese beetle, Popillia japonica. In that case, the enantiomer of the pheromone actually inhibits male attraction to the proper isomer (187). Thus, the heterogeneity of isomer concentrations among Anisomorpha makes the findings in this chapter particularly intriguing.

The work in this Chapter is also the first report of chemical analysis of the defensive secretion from a very recently described phasmid insect species from Peru, Peruphasma schultei (163). This species is classified as a member of the same Tribe (Anisomorphini) as A. buprestoides within the order Phasmatodea. This study found that it produces a unique stereoisomer of dolichodial previously uncharacterized from any natural source. Also, P. schultei defensive secretions contain glucose in similar quantities to those of A. buprestoides. This is an example of both conservation of defensive secretion formulation and stereochemical biodiversity.


Based on the findings discussed above, we provide nomenclature designations for the three stereoisomers of dolichodial characterized from phasmid insect defensive secretions. First, since the isomer which eluted last (12.15 minutes, Figure 4-9) from A. buprestoides corresponded to the minor form from T. marum and was previously named anisomorphal by Meinwald and Pagnoni (80, 176), we will retain that designation. Since the isomer which eluted from GC at 11.95 minutes (Figure 4-9) is the major isomer from T. marum and has no previous stereospecific name designation, we designate its name to be dolichodial. The unique isomer from P. schultei which eluted at 11.78 minutes is the only isomer isolated from this species, is present only in slight traces from A. buprestoides, and has not previously been reported. Thus, we designate the name for that isomer to be peruphasmal.

These results are fascinating in their own right as novel observations of phasmid insect defensive secretion chemistry. However, the newly available NMR tools along with the methodology used here truly have farther-reaching impacts in chemical prospecting and addressing chemical biodiversity. The ability to apply such a nondestructive and information-rich method as NMR to minimally manipulated samples will certainly allow for never-before-possible discoveries and more efficient chemical characterizations at many stages of natural products chemistry, from isolation through synthesis.


CHAPTER 5
CONCLUSIONS AND FUTURE DIRECTIONS

Conclusions

The importance of understanding biological systems on a molecular level is well established. Paramount to that understanding is knowledge of the various intramolecular interactions that give rise to their 3-dimensional structure. This dissertation has sought to understand the molecular detail observed in two types of natural products: FMRFamide-like neuropeptides from nematodes and phasmid insect defensive secretions. The work described and illustrated in this dissertation leads to several conclusions summarized in this chapter.

For the work on bioinformatic analysis of FLPs and their precursor proteins in Chapter 2, a number of features of these molecules have been illuminated. FLPs in nematodes tend to have a net theoretical positive charge. This is clearly an important chemical feature that could affect their binding to and activation of receptors and their interaction with membranes and other biochemical entities. In fact, data in Chapter 3 (Figure 3-7) suggest a particular preference for NPR-1 to be activated by more positively charged flp-18 analogues. Also, FLPs are rather short, with the most common length being 7 amino acids. This is also likely an important property that has been conserved within this family of peptides to preserve their receptor interaction properties and, thus, their functions. Bioinformatic comparisons coupled with functional results for one subfamily versus others allow for elucidation of conserved functional motifs. As more


data become available, it may be possible to predict functional properties in advance of biological assays.

Most of Chapter 2 dealt with various properties of flp precursor proteins and comparison of the 28 known in C. elegans. Overall, the proteins begin with an N-terminal hydrophobic signal sequence, followed by a “spacer” region of unknown function, and the C-terminal portion of most flp precursors contains the predicted peptides themselves, in many cases as several repeated copies of C-terminally related sequences. As previously stated, the peptides tend to be positively charged. As is also well established for neuropeptides such as these, they are flanked by additional positively charged amino acids that serve as their proteolytic processing sites. What I was able to observe was that the “spacer” regions tend to be rich in negatively charged acidic residues. This property was previously known (20) but not studied in as much detail as it was in Chapter 2. The fact that this property and others are conserved across many of the flp precursor paralogues led me to investigate further. Based on an unstructured protein prediction model, it seems likely that the “spacer” regions are charge-compensated relative to the peptides and processing sites to stabilize some sort of structure. In addition to the precursor proteins, I was able to take some advantage of the complete genomic information available for C. elegans. One interesting feature of the flp genes is that, relatively frequently, intron/exon boundaries occur in the conserved regions of peptide coding regions. This could potentially add to the diversity of peptides produced. I also attempted to reconstruct the evolutionary history of all 28 flp precursors in C. elegans and phylum Nematoda using various phylogenetic reconstruction methods in collaboration with Dr. Slim Sassi and by comparison of the various sequence patterns


discussed in detail in Chapter 2. Due to the divergent and repetitive nature of these genes, it is clear that this level of analysis will require data not yet available and the development of novel algorithms beyond the scope of this dissertation. This reconstruction is nevertheless discussed further in the “Future Directions” section below.

In Chapter 3, I have shown data that demonstrate the importance of transient hydrogen bonding, electrostatic interactions, and charge in the function of short neuropeptides in the FLP-18 subfamily of C. elegans. This was done by comparing activity on the GPCR NPR-1 with NMR results for a longer native FLP-18 peptide (DFDGAMPGVLRF-NH2), a shorter native one (EMGVLRF-NH2), and a panel of other analogues. These results, taken together, show that a complex and dynamic H-bonding network forms in the DFDG region of the longer peptide and interacts with the penultimate arginine residue to attenuate binding to NPR-1. In addition to the biological implications specific to this one case study, this work demonstrates the value of these techniques in understanding the structure/function relationships of other FLPs and, in general, of any peptide or proteinaceous macromolecule. The NMR pH and temperature titration experiments illuminate such transient interactions in a way that is simply not possible with other techniques. In particular, these techniques, in combination with others currently in use, could provide valuable insights into the early stages of protein folding. This is discussed further in the “Future Directions” section below.

Chapter 4 provides a conclusive report of the complete chemical composition of the defensive sprays produced by two insect species from the Order Phasmatodea, Family Pseudophasmatidae, Tribe Anisomorphini: Anisomorpha buprestoides and Peruphasma schultei. The defensive substance from A. buprestoides, a ten carbon cyclopentanyl


monoterpene dialdehyde previously characterized and named anisomorphal, was previously described as an isomerically pure composition of this molecule (80, 171, 176). However, the work in Chapter 4 clearly demonstrates that the defensive spray from these creatures contains a substantial amount of glucose (nearly equimolar with “anisomorphal”) and at least two major stereoisomers of “anisomorphal”. We were also able to use substantially less material for our experiments than was required for the original characterization (80). Our work also shows that the relative concentrations of these isomers can vary over time within one individual animal and between different individuals. Additionally, the work presented in Chapter 4 showed that samples from the newly described species, P. schultei, also contained glucose, but only a single, pure third stereoisomer of the ten carbon molecule characterized from A. buprestoides. The presence of glucose in these defensive concoctions, as well as the varying isomeric mixtures within one species and the homogeneity in another, raises a number of intriguing scientific questions that should lead to some fascinating future experiments and findings. These are discussed in the “Future Directions” section of this chapter.

This dissertation describes scientific research on two rather different types of biological molecules: neuropeptides and non-polypeptide small molecules from defensive secretions. However, both are natural products in the true sense of the term. They are produced in biological systems, have evolved biological functions, and can be isolated and studied for possible human-related applications. Additionally, both types of molecules studied in this dissertation are signaling molecules. The FLP neuropeptides are produced to transmit signals from one cell to another within an individual animal. The defensive secretions of insects, such as those studied in Chapter 4,


function as allomones that are intended to send a signal from one animal to another of a different species: to get away!

A unifying conclusion of this dissertation is that, with the new technologies and resources available today, we are able to begin adding a level of knowledge about the behavior of biological molecules beyond what has been possible in the past. With genomes of species after species being completed and published, and with advances in analytical chemistry techniques, full correlations between molecular interactions, biological function, and genetics can begin to be realized. This level of analysis will inevitably provide science with a clearer picture of, and feel for, what is actually occurring at the molecular level of nature beyond traditional genetics. In understanding the chemistry of biological systems, we can not only begin to use them for human purposes but also adapt our own needs to reduce our detrimental impacts on the natural order on which we deeply depend. To understand chemical biodiversity is to be a witness to Mother Nature’s battery of possibilities and to observe her toolbox through its most fundamental components.

Future Directions

A theme that has evolved across the different projects in which I have been involved as a graduate student, and one that I think is central to science itself, is this: these results leave yet more questions to be resolved. This, I believe, is indeed a hallmark of good observational science. It is rare (arguably impossible) for any one realm of study to be logically deemed complete, requiring no further investigation. So, while I have drawn conclusions and answered questions to the satisfaction of myself and colleagues, possibly my greatest contribution to science in this dissertation will be the legacy of ideas: the


future directions that I hope, along with the data covered in this document, will inspire future investigators in similar endeavors.

Evolutionary History of FLPs and Other Neuropeptides

This work has been largely observational in nature. Good observational science can inspire and direct future successful experiments. Based on my study of this topic, I can make a number of well-guided suggestions for researchers who will be involved in this project in the future. First of all, much more data are needed to resolve the evolutionary history of all 28 flp precursor genes in C. elegans and of the complementary set in nematodes in general. The best data, of course, would be complete genomic and cDNA sequences for complete sets of flp genes from several species representing a variety of divergent nematode clades, as well as logical outgroups (possibly flp genes from annelid worms, etc.). Among the potentially very informative features that such a database of genomic DNA sequences could identify are regulatory elements and corresponding transcription factors for these proteins, which would allow comparison with published expression data (114). Such a study would provide extremely valuable information on the role of FLPs in nematode nervous system function and evolution. Additionally, a well-supported phylogeny of flp genes in C. elegans could be compared with phylogenies of G-protein coupled receptors to observe receptor/ligand co-evolution and potentially to predict other as yet unidentified FLP receptors. Also, since the protein alignments performed in Chapter 2 proved to be divergent and full of gaps, genomic DNA should at least show whether some of the gaps in these alignments are due to differential transcription. Of course, the DNA alignments should be anchored, when possible, by regions of known function such as signal sequences, peptide coding regions, peptide processing sites,


intron/exon boundaries, or exceptionally well conserved regions that may serve a structural function in the pre-processed precursor protein. Absent additional DNA data, it appears from the precursor alignments we have performed that the peptides themselves may be quite informative in determining the relatedness of flp paralogues in C. elegans. Thus, in collaboration with Dr. Slim Sassi, I plan to attempt an evolutionary reconstruction of flp precursors that considers only the peptides before we attempt to publish the results of Chapter 2.

Some of the data in Chapter 2 suggest that the “spacer” regions of flp precursor proteins may indeed play some structural role in the pre-processed protein. The apparent charge balance between the peptide/processing and spacer regions of the sequences, along with the unstructured protein prediction models showing a low propensity to be unstructured for most C. elegans flp precursors, provides substantial evidence that these proteins may indeed form structures in solution. These structures may also be crucial to their processing or other functions. Thus, one informative set of experiments that could illuminate the function of these regions would be structural characterization. Previous attempts in Dr. Edison’s lab (by Dr. Cherian Zachariah) to produce C. elegans flp precursors used constructs that contained their native N-terminal hydrophobic signal sequences, expressed in a prokaryotic system (the bacterium Escherichia coli). These constructs produced observable amounts of flp precursor protein in the insoluble fraction of bacterial lysis extracts. One possible reason for this result, among others, may have been that the signal sequences were not post-translationally cleaved and thus caused aggregation into insoluble inclusion bodies. It is well known that hydrophobic regions of proteins can result in aggregation (188). Based on these results and the analyses in Chapter 2


of this dissertation, I would propose that new constructs be made in which the C. elegans signal sequence and corresponding cleavage site are replaced with ones native to the expression system used (either prokaryotic or eukaryotic expression cells). Vectors containing such signal sequences (for secretion of the expressed protein) are commercially available (189, 190). Once the proteins are expressed and purified, of course, they are available for analysis by structural techniques such as NMR, X-ray crystallography, and circular dichroism.

Neuropeptide Structure/Function Analyses

I spent the bulk of my time as a PhD student working on this project. The results gained suggest a number of experiments that I believe would provide future researchers with valuable information. First, one NMR technique that would provide additional information on the structure/dynamic state of the flp-18 derived peptides analyzed in Chapter 3 is relaxation dispersion measurement. This technique provides information on the timescale of intramolecular motion and has been applied successfully to larger proteins (191). Also, for a more detailed model of the solution structure populations of FLP-18 peptides, quantum calculations of NMR chemical shifts based on molecular dynamics simulation data, and their comparison with our experimental values, would be extremely useful. These studies are currently being carried out in collaboration with Georgios Leonis in the laboratory of Dr. Adrian Roitberg in the Chemistry Department here at the University of Florida. In addition to the FLP-18 peptides, studies like those in Chapter 3 could be performed on other FLPs in C. elegans for which receptors have been identified. This was even suggested by one of the reviewers of the manuscript published in Biochemistry for the work in Chapter 3 (96). Also, in general, NMR techniques such as pH and temperature titrations


could be used along with relaxation dispersion experiments in systematic studies of how transient hydrogen bonding and electrostatic interactions (among others) guide polypeptide conformations in solution. This, in my opinion, will provide the basis for the next level of knowledge in fully understanding the mysteries of how proteins achieve their three-dimensional folds.

Anisomorphal and Other Insect Natural Products

Of the three chapters that I call my “Results Chapters” (Chapters 2-4), the work described in Chapter 4 is most relevant to my immediate future career goals as a scientist. For at least part of the future of this project, I will be involved directly. First of all, it will be crucial to the project to elucidate the stereochemistry of the three isomers of dolichodial that have been observed in this study (one from Peruphasma schultei, two from Teucrium marum, and three from Anisomorpha buprestoides). I plan to continue work toward this end during the year following my graduation in August 2006 as a post-doc under Dr. Edison (my current PhD advisor). Also, we plan to pursue various hypotheses about the role that glucose plays in the secretion of A. buprestoides. Among our hypotheses on this issue are: 1) glucose is covalently conjugated to anisomorphal at some point during its biosynthesis before release; 2) glucose works as a humectant to prevent evaporation of anisomorphal from the defensive spray before it makes contact with its target; 3) glucose works to stabilize an emulsion of a super-concentrated aqueous mixture of anisomorphal in the gland of the insect. In addition to the chemistry of anisomorphal, we plan to characterize the secretions of several other stick insects, for which we will be able to get defensive spray samples from breeders and biologists all over the world. Also, in connection with our hypothesis that glucose and anisomorphal are at some point a covalent conjugate, we plan


to perform RT-PCR experiments to identify sequences of new glycosidase enzymes from A. buprestoides. The experiments described in this section of “Future Directions” thus far are the subject of a grant proposal submitted to the National Science Foundation (NSF) by Dr. Edison. In addition to these experiments, I would also suggest, in general, that insects are a potential gold mine of novel compounds of potential therapeutic value. Very few studies have been done to isolate therapeutic compounds from insect sources. This is a topic that I definitely hope to pursue in the next stages of my scientific career.
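The charge comparisons summarized in this chapter can be illustrated with a minimal sketch. It is not the method used in Chapter 2 or Chapter 3; it assumes a deliberately simple model near neutral pH in which Lys and Arg each count as +1, Asp and Glu each count as -1, His is treated as neutral, the free N-terminus adds +1, and a free C-terminus adds -1 unless it is amidated. The function name `net_charge` and its parameter are hypothetical and introduced only for this illustration. The two sequences are the native FLP-18 peptides named in the Conclusions.

```python
def net_charge(sequence: str, amidated_c_terminus: bool = False) -> int:
    """Estimate the integer net charge of a peptide near neutral pH.

    Assumed simple model (not taken from the dissertation):
    K and R are +1, D and E are -1, H is neutral, the free
    N-terminus is +1, and the C-terminus is -1 unless amidated.
    """
    positive = sum(sequence.count(residue) for residue in "KR")
    negative = sum(sequence.count(residue) for residue in "DE")
    charge = positive - negative
    charge += 1  # free N-terminal amine
    if not amidated_c_terminus:
        charge -= 1  # free C-terminal carboxylate
    return charge


# Both native FLP-18 peptides carry a C-terminal amide (-NH2).
long_peptide = net_charge("DFDGAMPGVLRF", amidated_c_terminus=True)
short_peptide = net_charge("EMGVLRF", amidated_c_terminus=True)
print(long_peptide, short_peptide)
```

Under these assumptions the shorter peptide comes out more positively charged than the longer one, which is in line with the preference of NPR-1 for more positively charged analogues noted above. A real estimate would use residue-specific pKa values at the experimental pH.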


APPENDIX A
ACCESSION NUMBERS (WITH CORRESPONDING SEQUENCE NAMES) FOR ALL FLP PRECURSOR PROTEIN SEQUENCES FROM ALL NEMATODE SPECIES USED IN WORK RELATED TO CHAPTER 2

These are the accession numbers for all nematode flp precursor sequences (from protein and EST databases) used for the work of Chapter 2. Many of these sequences can be found in the alignments in Appendix B. All sequences can be found on the NCBI Pubmed database. This is not a complete list of all ESTs coding for nematode flp precursors. A description of the sequence naming scheme can be found in the Experimental Methods section of Chapter 2.

flp-1a-cv Caenorhabditis vulgaris 2019407A
flp-1b-cv Caenorhabditis vulgaris AAA74036
flp-1-gp Globodera pallida CAC36149
flp-1-od Oesophagostomum dentatum AAO18224
flp-1-cb Caenorhabditis briggsae CAE74119
flp-1a-ce Caenorhabditis elegans AAB22368
flp-1b-ce Caenorhabditis elegans CAD56243
flp-1c-ce Caenorhabditis elegans CAD56244
flp-1-cr Caenorhabditis remanei DR406727
flp-1-ace Ancylostoma ceylanicum CB276900
flp-1-na Necator americanus BG467519
flp-1-aca Ancylostoma caninum BQ667011
flp-1-wb Wuchereria bancrofti CK850238
flp-1-pt Parastrongyloides trichosuri BI451336
flp-1-ss Strongyloides stercoralis BE029557
flp-1-mh Meloidogyne hapla CA996026
flp-1-sr Strongyloides ratti BI073329
flp-1-mj Meloidogyne javanica CF350356
flp-1-gr Globodera rostochiensis AW506435
flp-1-hs Heterodera schachtii CF587423
flp-1-mp Meloidogyne paranaensis CK426955
flp-1-ma Meloidogyne arenaria CF358507
flp-1a-ov Onchocerca volvulus BF918235
flp-1b-ov Onchocerca volvulus BF918235
flp-1-mi Meloidogyne incognita BQ519755
flp-2a-ce Caenorhabditis elegans CAC42354
flp-2b-ce Caenorhabditis elegans CAA90031
flp-2-cb Caenorhabditis briggsae CAE58781
flp-2-oo Ostertagia ostertagi BQ625812
flp-2-aca Ancylostoma caninum BM077744


flp-2-na Necator americanus BU088173
flp-3-ce Caenorhabditis elegans AAC08940
flp-3-cb Caenorhabditis briggsae CAE58780
flp-3-mh Meloidogyne hapla BQ835769
flp-3-ma Meloidogyne arenaria CF357063
flp-3-mc Meloidogyne chitwoodi CD419092
flp-3-hg Heterodera glycines BI451558
flp-3-mi Meloidogyne incognita BE238982
flp-4-ce Caenorhabditis elegans CAA88434
flp-4-cb Caenorhabditis briggsae CAE59354
flp-4-cr Caenorhabditis remanei DR406767
flp-4-ace Ancylostoma ceylanicum CB175187
flp-4-aca Ancylostoma caninum AW589126
flp-4-hg Heterodera glycines BF013688
flp-5-hg Heterodera glycines AAO92289
flp-5-ce Caenorhabditis elegans AAK68683
flp-5-cb Caenorhabditis briggsae CAE74914
flp-5-cr Caenorhabditis remanei DR780615
flp-5-na Necator americanus BG467770
flp-5-aca Ancylostoma caninum BE352478
flp-5-gr Globodera rostochiensis BM355778
flp-5-ppe Pratylenchus penetrans BQ580548
flp-5-ma Meloidogyne arenaria CF357389
flp-5-mj Meloidogyne javanica CK427481
flp-6-hg Heterodera glycines AAP02990
flp-6-gp Globodera pallida CAC32451
flp-6-as Ascaris suum AAQ90306
flp-6-ce Caenorhabditis elegans CAA94786
flp-6-cb Caenorhabditis briggsae CAE72174
flp-6-oo Ostertagia ostertagi BQ625748
flp-6-aca Ancylostoma caninum AW588352
flp-6-ace Ancylostoma ceylanicum CB275709
flp-6-na Necator americanus BU088036
flp-6-ss Strongyloides stercoralis BE223657
flp-6-gr Globodera rostochiensis BM355107
flp-6-ov Onchocerca volvulus BE491891
flp-6-mh Meloidogyne hapla BQ836492
flp-6-mp Meloidogyne paranaensis CN478220
flp-6-mj Meloidogyne javanica CF350594
flp-6-mc Meloidogyne chitwoodi CB930118
flp-6-tc Teladorsagia circumcincta CB043060
flp-6-bm Brugia malayi H30948
flp-7-hg Heterodera glycines AAO92290
flp-7-ce Caenorhabditis elegans AAC69107
flp-7-cb Caenorhabditis briggsae CAE68815
flp-7-cr Caenorhabditis remanei DR777024
flp-7-oo Ostertagia ostertagi BQ457797
flp-7a-aca Ancylostoma caninum BG232666
flp-7-mj Meloidogyne javanica CF350741
flp-7-mh Meloidogyne hapla CA997325
flp-7-mi Meloidogyne incognita AW828638
flp-7-rs Radopholus similis CO961269
flp-7b-aca Ancylostoma caninum BM077310
flp-7-gp Globodera pallida BM416182
flp-7-gr Globodera rostochiensis BM354546
flp-7-ss Strongyloides stercoralis BE581906
flp-8a-as Ascaris suum AAQ23188


flp-8-hg Heterodera glycines AAO92292
flp-8-gp Globodera pallida CAC32452
flp-8b-as Ascaris suum AAQ23189
flp-8-ce Caenorhabditis elegans CAA93746
flp-8-cb Caenorhabditis briggsae CAE63241
flp-8-ace Ancylostoma ceylanicum CB276388
flp-8-nc Necator americanus BU666412
flp-8-xi Xiphinema index CV579592
flp-8-ov Onchocerca volvulus AI132906
flp-9-ce Caenorhabditis elegans CAA93480
flp-9-cb Caenorhabditis briggsae CAE74551
flp-9-ace Ancylostoma ceylanicum CB276818
flp-9-na Necator americanus BU088898
flp-9-oo Ostertagia ostertagi BQ098580
flp-9-aca Ancylostoma caninum AW870447
flp-10-ce Caenorhabditis elegans NP_501306
flp-10-cb Caenorhabditis briggsae CAE70900
flp-10-ace Ancylostoma ceylanicum CB190197
flp-10-xi Xiphinema index CV127810
flp-11b-as Ascaris suum AAU10528
flp-11a-ce Caenorhabditis elegans AAK39250
flp-11b-ce Caenorhabditis elegans AAM54174
flp-11c-ce Caenorhabditis elegans AAP68932
flp-11-cb Caenorhabditis briggsae CAE68632
flp-11-ace Ancylostoma ceylanicum CB277083
flp-11-aca Ancylostoma caninum BQ666279
flp-11-hc Haemonchus contortus CA958720
flp-11-ss Strongyloides stercoralis BE580749
flp-11-gp Globodera pallida CV578361
flp-11-rs Radopholus similis CO897709
flp-11-gr Globodera rostochiensis BM355337
flp-11-mi Meloidogyne incognita CN443314
flp-11-hg Heterodera glycines BG310901
flp-11a-oo Ostertagia ostertagi BQ626272
flp-11b-oo Ostertagia ostertagi BQ626347
flp-11-ppe Pratylenchus penetrans BQ580400
flp-11-wb Wuchereria bancrofti CK855093
flp-11-mp Meloidogyne paranaensis CK241887
flp-11-tc Teladorsagia circumcincta CB037331
flp-11-na Necator americanus BU087757
flp-12-as Ascaris suum AAQ23189
flp-12-ce Caenorhabditis elegans AAA96196
flp-12-cb Caenorhabditis briggsae CAE68619
flp-12-na Necator americanus BU666676
flp-12-aca Ancylostoma caninum BG232259
flp-12-ov Onchocerca volvulus AI322067
flp-12-gp Globodera pallida CV579010
flp-12-gr Globodera rostochiensis BM345276
flp-12-hg Heterodera glycines BF013867
flp-12-mh Meloidogyne hapla BQ836272
flp-12-mp Meloidogyne paranaensis CK426905
flp-12-ma Meloidogyne arenaria CF358524
flp-12-mj Meloidogyne javanica CF350621
flp-12-mc Meloidogyne chitwoodi CB855876
flp-12-mi Meloidogyne incognita BM774054
flp-13-ce Caenorhabditis elegans AAB88376
flp-13-cb Caenorhabditis briggsae CAE61941


flp-13-aca Ancylostoma caninum BG232320, BF250033
flp-13-nc Necator americanus BU087792
flp-13-hc Haemonchus contortus CB016271
flp-13-oo Ostertagia ostertagi BQ625928
flp-13-mi Meloidogyne incognita CN443399
flp-13-mc Meloidogyne chitwoodi CB930779
flp-13-mj Meloidogyne javanica BI744798
flp-13-ppe Pratylenchus penetrans BQ627113
flp-13-hg Heterodera glycines CA939675
flp-13-ace Ancylostoma ceylanicum CB189610
flp-13-ss Strongyloides stercoralis BG226450, BE579723
flp-13-gr Globodera rostochiensis BM356001
flp-13-ov Onchocerca volvulus AI322164
flp-13-ppa Pristionchus pacificus BM565888
flp-14a-as Ascaris suum AAQ90307
flp-14-hg Heterodera glycines AAO92291
flp-14-ce Caenorhabditis elegans CAA21533
flp-14-cb Caenorhabditis briggsae CAE69552
flp-14-od Oesophagostomum dentatum AAO18223
flp-14-gp Globodera pallida CAC33830
flp-14-ls Litomosoides sigmodontis DN557377
flp-14-pv Pratylenchus vulnus CV199627
flp-14-mj Meloidogyne javanica BE578321
flp-14-mp Meloidogyne paranaensis CK240793
flp-14a-ma Meloidogyne arenaria BI746287
flp-14b-ma Meloidogyne arenaria CF357195
flp-14-ace Ancylostoma ceylanicum CB276152
flp-14-mc Meloidogyne chitwoodi CB855512
flp-14-tc Teladorsagia circumcincta CB037888
flp-14-mh Meloidogyne hapla CA996620
flp-14-nc Necator americanus BU089161
flp-14-ts Trichinella spiralis BQ738434
flp-14-ppe Pratylenchus penetrans BQ626900
flp-14-mi Meloidogyne incognita BQ548259
flp-14-gr Globodera rostochiensis BM355758
flp-14-aca Ancylostoma caninum BM130358
flp-14-pt Parastrongyloides trichosuri BG661529
flp-14-ss Strongyloides stercoralis BE580752
flp-14-bm Brugia malayi AW562017
flp-14-ov Onchocerca volvulus AA624962
flp-14-rs Radopholus similis CO961028
flp-15-ce Caenorhabditis elegans CAB05022
flp-15-cb Caenorhabditis briggsae CAE71340
flp-15-ace Ancylostoma ceylanicum CB276974
flp-15-nb Nippostrongylus brasiliensis BU493427
flp-15-tc Teladorsagia circumcincta CB038363
flp-15-oo Ostertagia ostertagi BQ626183
flp-15-nc Necator americanus BU089071
flp-16-tc Teladorsagia circumcincta AAT76297
flp-16-ce Caenorhabditis elegans CAE17795
flp-16-cb Caenorhabditis briggsae CAE59445
flp-16-cr Caenorhabditis remanei DT933897
flp-16-pv Pratylenchus vulnus CV200735
flp-16-ppe Pratylenchus penetrans BQ627255
flp-16-rs Radopholus similis CO960999
flp-16-hg Heterodera glycines CB379328
flp-16a-ace Ancylostoma ceylanicum CB275518


flp-16b-ace Ancylostoma ceylanicum CB275518
flp-16-as Ascaris suum CB039338
flp-16-na Necator americanus BU666314
flp-16-oo Ostertagia ostertagi BQ626326
flp-16-gr Globodera rostochiensis BM355569
flp-16-pt Parastrongyloides trichosuri BG661621
flp-16-mi Meloidogyne incognita AW783259
flp-16-hc Haemonchus contortus AW670781
flp-16-ov Onchocerca volvulus AI239301
flp-17-ce Caenorhabditis elegans NP_503051
flp-17-cb Caenorhabditis briggsae CAE57405
flp-17-ace Ancylostoma ceylanicum CB275548
flp-17-na Necator americanus BU666368
flp-17-oo Ostertagia ostertagi BQ626145
flp-17-ss Strongyloides stercoralis BG226536
flp-17-aca Ancylostoma caninum AW700432
flp-17-hc Haemonchus contortus BF060047
flp-17-xi Xiphinema index CV507209
flp-18-df Dictyocaulus filaria AAT76299
flp-18-od Oesophagostomum dentatum AAO18225
flp-18-tc Teladorsagia circumcincta AAT76298
flp-18-gp Globodera pallida CAC36150
flp-18-as Ascaris suum (afp-1) P41854
flp-18-ce Caenorhabditis elegans Q9N4V0
flp-18-cb Caenorhabditis briggsae CAE68363
flp-18-ss Strongyloides stercoralis BG224856
flp-18-ace Ancylostoma ceylanicum BQ288579
flp-18-gr Globodera rostochiensis BM345252
flp-18-oo Ostertagia ostertagi BQ626359
flp-18-hc Haemonchus contortus BI595566
flp-18-ppa Pristionchus pacificus CN442744
flp-18-mc Meloidogyne chitwoodi CD683415
flp-18-aca Ancylostoma caninum BM077514
flp-18-na Necator americanus BG467632
flp-18-mj Meloidogyne javanica CK427248
flp-18-mi Meloidogyne incognita AW828021
flp-18-ov Onchocerca volvulus BG809034
flp-18-mh Meloidogyne hapla CA995344
flp-18-xi Xiphinema index CV581449
flp-18-ts Trichinella spiralis BG520853
flp-19-ce Caenorhabditis elegans CAA90690
flp-19-cb Caenorhabditis briggsae CAE60808
flp-19-tc Teladorsagia circumcincta CB038676
flp-19-na Necator americanus BU087010
flp-19-hg Heterodera glycines CK351572
flp-19-ppe Pratylenchus penetrans BQ537629
flp-19-di Dirofilaria immitis BQ482007
flp-19-mi Meloidogyne incognita AW829081
flp-19-sr Strongyloides ratti BQ091264
flp-20-ce Caenorhabditis elegans NP_509574
flp-20-cb Caenorhabditis briggsae CAE69832
flp-20-tc Teladorsagia circumcincta CB043312
flp-20-oo Ostertagia ostertagi BQ625941
flp-20-aca Ancylostoma caninum AW589141
flp-20-ace Ancylostoma ceylanicum BQ274673
flp-20-na Necator americanus BU666902
flp-20-hc Haemonchus contortus BM139246


flp-20-pt Parastrongyloides trichosuri BI743005
flp-21-ce Caenorhabditis elegans AAB37072
flp-21-ce Caenorhabditis elegans CAE73192
flp-21-tc Teladorsagia circumcincta CB043130
flp-21-hc Haemonchus contortus CA994703
flp-21-na Necator americanus BU666883
flp-21-oo Ostertagia ostertagi BQ626344
flp-21-aca Ancylostoma caninum AW735291
flp-21-rs Radopholus similis CO897716
flp-21-mh Meloidogyne hapla CA995582
flp-21-ppe Pratylenchus penetrans BQ580634
flp-21-bm Brugia malayi AA991111
flp-21-ss Strongyloides stercoralis N21799
flp-21-ppa Pristionchus pacificus AW115120
flp-21-hg Heterodera glycines CB281642
flp-22-ce Caenorhabditis elegans CAB03086
flp-22-ppa Pristionchus pacificus CO870762
flp-22-tc Teladorsagia circumcincta CB043100
flp-22-oo Ostertagia ostertagi BQ625860
flp-22-aca Ancylostoma caninum BG232417
flp-22-ace Ancylostoma ceylanicum CB190147
flp-22-ss Strongyloides stercoralis BG225971
flp-22-pt Parastrongyloides trichosuri BG661626
flp-22-rs Radopholus similis CO960970
flp-22-ppe Pratylenchus penetrans BQ538004
flp-22-gr Globodera rostochiensis AW506016
flp-22-hg Heterodera glycines BF014029
flp-23-tc Teladorsagia circumcincta CB036818
flp-23b-ce Caenorhabditis elegans AY941160
flp-23a-ce Caenorhabditis elegans AAY18633
flp-23-cb Caenorhabditis briggsae CAE57228
flp-24-2-ce Caenorhabditis elegans AAB70310
flp-24-1-ce Caenorhabditis elegans BJ130802
flp-24-cb Caenorhabditis briggsae CAE69204
flp-24-ace Ancylostoma ceylanicum CB277074
flp-24-na Necator americanus BU087019
flp-24-oo Ostertagia ostertagi BQ625734
flp-24-aca Ancylostoma caninum BM285294
flp-24-1-as Ascaris suum AAW78865
flp-24-2-as Ascaris suum BI594103
flp-24-ov Onchocerca volvulus BF918187
flp-24-bm Brugia malayi R47630
flp-25-ce Caenorhabditis elegans CAE54900
flp-25-cb Caenorhabditis briggsae CAE65230
flp-25-mc Meloidogyne chitwoodi CB933060
flp-25-na Necator americanus BU666043
flp-25-ss Strongyloides stercoralis BE030124
flp-25-mj Meloidogyne javanica CK427415
flp-25a-gr Globodera rostochiensis AW506297
flp-25b-gr Globodera rostochiensis BM354975
flp-26-ce Caenorhabditis elegans AAM51536
flp-26-cb Caenorhabditis briggsae CAE65832
flp-26-na Necator americanus BU087803
flp-26-ace Ancylostoma ceylanicum BQ289277
flp-26-aca Ancylostoma caninum BM077386
flp-27-ce Caenorhabditis elegans AAK31451
flp-27-cb Caenorhabditis briggsae CAE59169


flp-27-na Necator americanus BU087635
flp-27-aca Ancylostoma caninum AW700657
flp-27-mj Meloidogyne javanica CK427146
flp-27-mp Meloidogyne paranaensis CK241502
flp-27-mc Meloidogyne chitwoodi CB856153
flp-27-mh Meloidogyne hapla CA996597
flp-27-mi Meloidogyne incognita AW871675
flp-27-rs Radopholus similis CO961482
flp-27-hg Heterodera glycines BF014825
flp-28-ce Caenorhabditis elegans CAE17946
flp-28-cb Caenorhabditis briggsae CAE58779
flp-28c-ppa Pristionchus pacificus BM812533
flp-28-aca Ancylostoma caninum AW626880
flp-28-hc Haemonchus contortus AW670822
flp-28-oo Ostertagia ostertagi BQ626068
flp-28-pt Parastrongyloides trichosuri BI743005
flp-28a-ppa Pristionchus pacificus BM812533
flp-28-as Ascaris suum BM282578
flp-28b-ppa Pristionchus pacificus BM565728
flp-30-mj Meloidogyne javanica CF350654
flp-30-mc Meloidogyne chitwoodi CB931648
flp-30-mh Meloidogyne hapla BQ837449
flp-30-mi Meloidogyne incognita AW588622
flp-31-mh Meloidogyne hapla BU095081
flp-31-mc Meloidogyne chitwoodi CD682169
flp-31-ppe Pratylenchus penetrans BQ626685
flp-31-mi Meloidogyne incognita BM882182
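For readers who want to work with this list programmatically, the following is a minimal sketch of how a single entry could be split into its parts. It rests on assumptions that are not stated in this appendix: that each entry consists of a sequence name, a two-word species name, and then the accession field, and that the sequence name ends in a hyphen followed by a species code (the actual naming scheme is described in Chapter 2). The names `FlpEntry` and `parse_entry` are hypothetical. Stray dash characters between the name and the species are skipped.

```python
from typing import NamedTuple


class FlpEntry(NamedTuple):
    gene: str          # e.g. "flp-1a" (assumed: everything before the last hyphen)
    species_code: str  # e.g. "cv" (assumed: everything after the last hyphen)
    species: str       # two-word species name
    accession: str     # remaining text, usually a single accession number


def parse_entry(line: str) -> FlpEntry:
    """Split one list entry into gene, species code, species and accession."""
    # Drop isolated en-dash tokens that sometimes separate name and species.
    tokens = [token for token in line.split() if token != "\u2013"]
    name = tokens[0]
    gene, species_code = name.rsplit("-", 1)
    species = " ".join(tokens[1:3])
    accession = " ".join(tokens[3:])
    return FlpEntry(gene, species_code, species, accession)


print(parse_entry("flp-1a-cv Caenorhabditis vulgaris 2019407A"))
```

Entries that carry an annotation or more than one accession number after the species name keep that text together in the accession field, so they would need further handling before being used as database queries.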


APPENDIX B
ALIGNMENTS OF FLP PRECURSOR PROTEINS FROM Caenorhabditis elegans AND OTHER NEMATODE SPECIES

These alignments were used to generate ancestral flp precursor sequences. The accession numbers corresponding to the sequences used can be found in Appendix A. The single-letter amino acid codes in these sequences are colored based on amino acid type. For purposes of generating ancestral sequences, in every instance where there was a sequence error in the original EST data (usually resulting in an “X” in the translated protein), we replaced that position in the translated protein with a gap (denoted by “-”). A description of the sequence naming scheme can be found in the Experimental Methods section of Chapter 2.
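The gap substitution described above can be sketched in a few lines. This is only an illustration of the stated rule (each “X” in a translated protein becomes a gap character), not the procedure actually used to prepare the alignments; the function name `mask_sequence_errors` and the example sequence are hypothetical.

```python
def mask_sequence_errors(translated_protein: str, gap: str = "-") -> str:
    """Replace every ambiguous residue "X" with a gap character."""
    return translated_protein.replace("X", gap)


# Hypothetical translated EST fragment containing one ambiguous position.
example = "KPNFXRFG"
print(mask_sequence_errors(example))
```

Because each “X” is replaced by exactly one gap character, the length of the sequence and the column positions of all other residues are unchanged.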


129flp-1 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_1a_cv -----------------------M T LL Y Q V G LLLLV AA T Y K V S A E CC T P G A TS D F C T V F S ML ST M E Q N E VM S Y L G E N C E G D A E V A L Q K M E K R K P N F M R Y flp_1_gp M C RR G M A Q H F GG P I H C R P S E K AA ST M T K P TTS MI CT G I R Q R H N S G V G L R P LILLL N S A IV W C LL A Q P H T V D AA V SS G H L A P MV P N S LL S I Q S D P N F L R F flp_1_od -----------------------ML T LL Q A G LLL G L G A I T Q V S A E CC S P G D Q S D F C MV F N ML S P I E Q A E VM T YL G D T C T G D A D Q A L R LM E K R K P N F M R Y flp_1_cb -----------------------M T LL Y Q V G LLLLV AA S F K V S A E CC T P G A TS D F C T V F S ML ST M E Q N E VM S Y L G E N C E G D A D V A L Q K M E K R K P N F M R Y flp_1a_ce -----------------------M T LLY Q V G LLLLV AA T Y K V S A E CC T P G A TS D F C T V F S ML ST M E Q N E VM N F I G E N C D G D A E V A L Q K M E K R K P N F M R Y flp_1_cr ----------------------------------------------------------------------R S Y L G E N C E G D A E V A L Q K M E K R K P N F M R Y flp_1_ace ---------------------------S AR G LLL A L G A V A Q V S A E CC S P G D Q S D F C MV F N ML S P M E Q A E VM S Y L G D A C N G D A DE A L R LI E K R K P N F M R Y flp_1_na -----------------------RR P LL Q V G L F L T L G A L A Q V S A E CC SS G E Q S D F C IV F N ML S PM E Q A E VM S Y L G E T C N G D A DE A L R LM E K R K P N F M R Y flp_1_aca -------------------G LL Q M P T LL Q V G LLL A L G A V A Q V S A E CC S P S D Q S D F C MV F N ML S P M E Q A E VM S Y L G D V C N G D A DE A L R LI E K R K PN F M R Y flp_1_wb --------------------------------ILLI C S L A Q V SS E CC R N G I TS D Y C II F N ML SSS QQ A E I R Q Y F G H D C Q D V DE A T R K I E K R K P N F I R F flp_1_pt -----------------------------LL T V S Y I SSS I T A F P D CC K T N Q N A E V C LV F N K L S EDE K T -F V TT E G VL DE Q C E L P H I T 
P E K R K P N F I R Y flp_1_ss ------------------------------------------------------------------------------------------K R K P N F I R Y flp_1_mh ---------------------------------------K M K G Y C N M T E MVL F G F VMVI G Q M T VL G A N S A N R N S LLM S G -P W A L N S W S E A D P N F L R F flp_1_sr ----------I TF W C K IMI K YY Q K N I F IILL A V N FF S III N A F P E CC K R N V N A D I C Q G F D K L S P EE Q A S L S A V G VL DD Q C Q LI T H T I P D K R K P N F I R Y flp_1_mj -----------N L A H K R V K R F ILI FF Q ---------R L K M K G Y C N MT E L A L F G LLVI F V A Q M S VL G A N S A N R N S LLM S G -P W A L N S W S D A D P N F L R F flp_1_gr -----------------------------------------N S --------------------------------------P N S P L S I Q S D P N F L R F flp_1_hs ----V E A T M T G -V T QQQQ NN A T R F I R ---------R Q H A K H G T EY R S LLL F L S L A I G CC A L A Q S H AA D GG TS N G H L A P MV P N S P I SS I Q S D P N F L R F flp_1_mp --------F L R KK A H K R V K I Y F N FF Q ---------R L K M K G Y C N M T E L A L F G F VVLIV G Q M S VL G A N S A N R N S LLMS G -P W A L N S W S D A D P N F L R F flp_1_ma -------------------------------------R L K M K G Y C N M T E L A L F G F VVL F V G Q M S VL G A N S A N R N S LLM S G -P W A L N S W S D A D P N F L R F flp_1a_ov ------------------------------------------------------------------------------------A P KK I E KKK P N F I R F flp_1_mi ----------------------------------------------------------------------AN R N S LLM S G -S W A L N S W S D A D P N F L R F 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_1a_cv G R S AA V K S L G KK A G S D P N F L R F G -----R S Q P N F L R F G K A S G D P N F L R F R -------S D P N F L R F G K AAA ----------D P N F L R F G K R -----flp_1_gp G R S G S L ---------------------N E F G I P H Q T P T R TSS N F L R F G K SS A S M STSS E P N F L R F G R Q K --------GG V D P T F L R F G R A 
----N flp_1_od G R S -------------------------------I T F G KK G S D P N F L R F G R -------N Q P N F L R F G K AA G ----------D P NF L R F G R A -----flp_1_cb G R S AA V K S L G KK A G S D P N F L R F G -----R S Q P N F L R F G K A S G D P N F L R F G R -------S D P N F L R F G K AAA ----------D P N F L R F G K R -----flp_1a_ce G R S AA V K S L G KK A G S D P N F LR F G -----R S Q P N F L R F G K A S G D P N F L R F G R -------S D P N F L R F G K AAA ----------D P N F L R F G K R -----flp_1_cr G R S AA V K S L G KK A G S D P N F L R F G -----R S Q P N F L R F G K A S G D P N F L R F G R -------SD P N F L R F G K AAA ----------D P N F L R F G K R -----flp_1_ace G R S -------------------------------I T F G KK G S D P N F L R F G R -------N Q P N F L R F G K AA G ----------D P N F L R F G R A -----flp_1_na G R S -------------------------------I T F G KK G S D P N F L R F G R -------N Q P S F L RF G K AA G ----------D P N F L R F G R A -----flp_1_aca G R S -------------------------------I T F G KK G S D P N F L R F G R -------N Q P N F L R F G R AA G ----------D P N F L R F G R A -----

130flp_1_wb G R T A L P IM Y G KK D A D P K F L Q F G H SSS A F T P S G Q N F L R F G R E A E P N F L H F G R V ------T D P N F L R F G K S A -----------E P N F L R F G K R T ----flp_1_pt G TS --------------------------T GG V P T A V D KK AA D P N F L R F GR SS -----D H Q N F L R F G R N L G L N --------E A N F L R F G K S -----flp_1_ss G R S --------------------------L N VI P Q P M D KK A V D P N F L R F G R S ------E H Q N F L R F G R S L GG N --------N G N F L R F G K S ---N S flp_1_mh G R S D P S G ---------------------Q V TSN E G I K R AA Q S A N F L R F G K S ----A P Y D P N F L R F G R -A NNN QQ H N K G LV D Q S Y L R F G R S -G A K A flp_1_sr G R S --------------------------L S N M QQ S L D KK AA D P N F L R F G R S ------E H Q N F L F G R N L GG N --------N AN F L R F G K S ---N S flp_1_mj G R S A P S -------------------------N EE G I K R AA G Q S A N F L R F G R S ----A P Y D P N F L R F G R Q L G N QQQQ H N K G LV D Q S Y L R F G R SS GG N K G flp_1_gr G R S G S L ---------------------N E F G I P H LI P T R TSSN F L R F G K SS A S M STS E P N F L R F G R Q K --------GG V D P T F L R F G R A ----N flp_1_hs G R S N G Q L ---------------------N E F N S A S L T P T R TSS K F L R F G K S -S MLV S E P N F L R F G R Q K V GG A G ---GG V D P T F L RF G R A K ---N flp_1_mp G R S AA S -------------------------N EE G I K R AA G Q S A N F L R F G R S ----A P Y D P N F L R F G R Q L G N QQQQ H N K G LV G Q S Y L R F G R SS GG N K G flp_1_ma G R S AA S -------------------------N EE G I K R AA G Q S A N F L R F G R S -----A P Y D P N F L R F G R Q L G N QQQQ H N K G LV G Q S Y L R F G R SS GG N K G flp_1a_ov G R A ----------------------------A S P IM H G KK D T D P N F L Q F E R SS S A F T P S G Q N F L R F G R AA -----------E P N F L R F G R V R ----flp_1_mi G R S AA S --------------------------N EE G I K R AA G Q S A N F L R F G R S ----A P Y D P N F L R F G R Q L G N QQQQ H N K G F VVL S Y L R F G R SS GG NN G 210 220 230 240 250 260 270 280 290 300 
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_1a_cv ---------------------S A D P N F L R F G R -----------S F D N F D R E --------S R K P N F L R F G L --------------------------flp_1_gp NN F L RF G R AA GG E LLV AA EEE -----------------------P F E R N Y R Q ---------A N P N F L R F G ---------------------------flp_1_od ---------------------S A D P N F L R F G K R ----------S V D P N F L R F G R -------K P N F L R F G K --------------------------flp_1_cb ---------------------S A D P N F L R F G R S ----------S F D N F D R E --------S R K P N F LR F G K --------------------------flp_1a_ce ---------------------S A D P N F L R F G R -----------S F D N F D R E --------S R K P N F L R F G K --------------------------flp_1_cr ---------------------S A D P N F L R F G R -----------S F D N F D R E --------S R K P N F L R F G K --------------------------flp_1_ace ---------------------S A D P N F L R F G K R A V D P N FL R F G R ------------------K P N F L R F G K --------------------------flp_1_na ---------------------T A D P N F L R F G K R S V D P N F L R F G R ------------------K P N F L R F G K --------------------------flp_1_aca ---------------------S A D P N F L R F G K R A V D P N F L R F G R ------------------K P N F L R F G K --------------------------flp_1_wb --------------------------------E V G D PN F L R F G K N SS F Q P T P E Y N E G F S R Q D R K P N F L R F G K --------------------------flp_1_pt -----------------SS P D F L R F G K R N A E I K E P N F L R F G K R NN F L R F G R N LI DD Q F N R E Y R K P N F L R F G K --------------------------flp_1_ss P D F L R F GKK --------S I Q V G K E P N F L R F G K R E ---------N F I G F G K S MV EE Q F N R E Y R K P N F L R C G K --------------------------flp_1_mh NN F L R F G R G S ED ---I P T E A E -------------------------A F E R E -----Y R Q S NN P N F L R F G ---------------------------flp_1_sr P D F L R F G K R ------------------------------------NM E S D 
K E P ---------------------------------------------flp_1_mj NN F L R F G R G A ED ---I P S E A E -------------------------A F E R E -----Y R Q S NN P N F L R F G ---------------------------flp_1_gr NN F L R F G R AA GG E LLV AA EEE -------------------------P F E R D -----Y R Q A N P N F L R F G ---------------------------flp_1_hs NN F L R F G R AA GG D A MLIS A DDDE T -----------------------P F T R E -----Y R Q A N P N F L R F G ---------------------------flp_1_mp NN F L R F G R G A ED ---I P S E A E -------------------------A F E R E -----Y R Q S NN P N F L R F G ---------------------------flp_1_ma NN F L R F G R G A ED ---I P S E A E -------------------------A F E R E -----Y R Q S NN P N F LR F G ---------------------------flp_1a_ov ----------------------D S N F L R F G -------------------------------------------N LL N R I S F D S V N A L K R TS Q I F C N L E flp_1_mi NN F L R F A T G A ED ---I P S Q A E -------------------------A F E R E -----Y R Q P NN P N F L R F G ---------------------------

310 320 330 340
....|....|....|....|....|....|....|....|
flp_1a_cv ---------------------------------------
flp_1_gp ---------------------------------------
flp_1_od ---------------------------------------
flp_1_cb ---------------------------------------
flp_1a_ce ---------------------------------------
flp_1_cr ---------------------------------------
flp_1_ace ---------------------------------------
flp_1_na ---------------------------------------
flp_1_aca ---------------------------------------
flp_1_wb ---------------------------------------
flp_1_pt ---------------------------------------
flp_1_ss ---------------------------------------
flp_1_mh ---------------------------------------
flp_1_sr ---------------------------------------
flp_1_mj ---------------------------------------
flp_1_gr ---------------------------------------
flp_1_hs ---------------------------------------
flp_1_mp ---------------------------------------
flp_1_ma ---------------------------------------
flp_1a_ov E T L P C R MI K N L E K V S AA K I G N LI SY A S V NN A D R I H IL C Y P
flp_1_mi ---------------------------------------

flp-2 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_2a_ce M Q V S G IL S A L F LVLL A VI ------------------------------G TT V A Q P A V N D N T L G I F E A S A M A K R L R G E P I R F G K R S P R E P I R F G K R --F
flp_2_cb M Q A S A IL S A L F LVLL A VI ------------------------------G TT V A Q -S N D N Q L G V F E A SMM A K R L R G E P I R F G K R S P R E P I R F G K R --F
flp_2_oo ---------IVVVL A VL --------------------------------S LL A S A V S P Q A E A MM E S R QQ F K R F R G E P I R F G K R V P R E P I R F G K R G P M F
flp_2_aca --------V T LVVL A IL --------------------------------S LL TS A V SS Q A ET A M E A R QQ F K R F R G E P I R F G K R V P R E P I R F G K R A P L F
flp_2_na ---------------------------------------------------------------------Q F K R F R G E P I R F G K R V P R E P I R F G K R A P L F

132 105 ....|.... flp_2a_ce N P L P D Y D F Q flp_2_cb N P L P D Y D F Q flp_2_oo E P Y L D Y --flp_2_aca E P Y F D Y --flp_2_na E P Y F D Y --flp-3 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_3_ce -----MI S P N H LILL F C V N C A F LV A S D A T P --------------------------------------------------------------------flp_3_cb -----MI S P N R LIL FF LI G C A V S AA S E A S P --------------------------------------------------------------------flp_3_mh F P N I R H L T F A I F P F LL C H S N Q FF LM P F T P T Q I TST M T I Q K V T N P T A L F VL F AA S IVIS N C S P VV P R Q E L -P S LL P T G A L H E T L N D P F V QQQ -R W L S flp_3_ma ---------------------------P T Q T -I F T M T N I S N LV K L F VLI A V S F VI S N C S P A V P R Q E T I -P S LL P N G A L H E T L T D P F QQQ -R WI S flp_3_mc ------A I K I A I F LI S V S --F KK N F L C H L A I F T M A I R K I F N IVI FF LMIV F S LII S N C S P V TT R Q E II -P S LL P A G A L H E S L T D P F L QQQ -R W I S flp_3_hg ----------------------------S K L K VL SSS A D F SS F L F L A LL FD F S F S Q P T I QQ K S I -Q E LL N T P T F L P T A E K H D Y F V A P F L QQQ K Q R W L S flp_3_mi ----------------------------------------S -C Q T F VLI A V S F VI S N C S P A V P R Q E T I -P S LL P N G A L H E T L T D P F QQQ -R W I S 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_3_ce ----------------------------------------K R S P L G T M R F G K R A I A DE M T F EED G YY P S N VM W K R ST V D SS E P VI R D Q R T P L G T M R F flp_3_cb ----------------------------------------K R S P L G T M R F G K R A S L DD VL A F EEE Y P G -VL W K R ST V D S E P VI R D QR T P L G T M R F flp_3_mh V R P F Q D Y P R P H N EE L G LLL Q Y L D S K R N A P LL DE ------------------------------------------------N AA Q IV G E S P L G T M R F flp_3_ma V R P F Q D Y P R P 
H N EE L G LLL Q Y L D S K R N A P LL DE ------------------------------------------------N V A Q LV G E S P LG T M R F flp_3_mc V R P F Q E Y P R P H N EE L G LLL Q Y L D S K R N A P LI DE ------------------------------------------------N V A Q VV G E S P L G T M R F flp_3_hg V R P F Q E F RR S P Q T A D L G LLL Q Y L D N K R N A P S LM EDE G ----------------------------------------------N G R H G I S G S P L GT M R F flp_3_mi V R P F Q D Y P R P H N EE L G LLL Q Y L D S K R N A P LL DE ------------------------------------------------N V A Q LV G E S P L G T M R F 210 220 230 240 250 260 270 280 290 300 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_3_ce G K R S A E P F G T M R F G K R N P E N D T P F G T M R F G K R A S ED A L F G T MR F G K R ED G N A P F G T M K F G K R E A EE P L G T M R F G K R S A DD S A P F G T M R F G K R N P L G T M flp_3_cb G K R S A E P F G T M R F G K R D S E I D A P F G T M R F G K R E T V D A P F G T M RF G K R AA ED A P F G T M R F G K R D P DE P L G T M R F G K R S A DD T G A P F G T M R F G K R N T L G T M flp_3_mh G K RR N SS P L G T M R F G ------------------------------------------------------------------------------------flp_3_ma G K RR N TS P L G T M R F G ------------------------------------------------------------------------------------

flp_3_mc G K RR N SS P L G T M R F G ------------------------------------------------------------------------------------
flp_3_hg G K R T F N S P L G T M R F G K R E Y N K S PP G T M R F G --------------------------------------------------------------------
flp_3_mi G K RR N TS P L G T M R LV N F ----------------------------------------------------------------------------------

....
flp_3_ce R F G K
flp_3_cb R F G K
flp_3_mh ---
flp_3_ma ---
flp_3_mc ---
flp_3_hg ---
flp_3_mi ---

flp-4 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_4_ce M N A F SSS L K T F I F S LL F A T LL A L T AA H PP SS G -EE I A E Q EE K N I A S P DE LI P E IV E QQ N F W PP V H L R G L R SS N G K P T F I R F G K R A S P S F I R F G K -
flp_4_cb M N A F P SS L K T F LF S IL F A T LLV Y AA G Q T P SS ED V E P Q EE QQQ K E LI A P DE Y I T P E II E Q T N -L P A V H L R G L R SS N G K P T F I R F G K R A S P S F I R F G R -
flp_4_cr ---------------------------------ED V E Q IV Q KK G F E T Q DE Y I T P E IV E Q T NN FW PP V H L R G L R SS N G K P T F I R F G K R A S P S F I R F G R K
flp_4_ace -----T R N T Q C F V A F C I A C IVLV A G F D -----E R V N D A Y E P E P V AA D S G FF R N F --------------R SSS N G K P T F I R F G K R A Q P S F I RF G R A Q
flp_4_aca ------M N M Q C F V A F C I A C F VLV A G F D -----E R A N D V Y E P E P A V A D S A FF R N F --------------R SSS N G K P T F I R F G K R A Q P S F I R F G R A Q
flp_4_hg ------M TT A V Q W A P -----------------E L R V K S W T K WR P --D F W V H W N FF K A I KK S Q L C P N S -R K S G R T N SS F I R F G K RRR ---------

110
....|....|..
flp_4_ce -----------
flp_4_cb -----------
flp_4_cr -----------
flp_4_ace P S F I R F G R Q A T A
flp_4_aca P S F I R F G R Q A T A
flp_4_hg -----------

134flp-5 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_5_hg -------------M E M R C V P S RR P F R LI ----A P L F N -------A I P T F Y LIL T LI F C IL S P --------------W A S A D A DE W Y G A R Q L R A P K flp_5_ce ----M R S V P A F Q L P R Q H PP F T K Q S F L A T M SS R STT I A -----------F L F I A T LLVF Q C V S A Q S ------------S A ED A D Y L E K Y Q R I A R A P K flp_5_cb ----M R S F S PP F Q P R L P L PP LL N R S F L A T M SS R STT I A -----------F L F I A T LLV F Q C V S A Q L ------------S DEE TS F L D Q Y Q R V A R A P K flp_5_cr --------------------------------R STT I A ------------F L F I A T LLV F Q C V S A Q S ------------S DED S E Y L D K Y Q R I A R A P K flp_5_na -----------------------------TSTS R Q T Y A -----------VL F I A S ILVL Q Y V T A Q -------------S DD ---M Y E F Q R AA R A -flp_5_aca ----------------------------------H T Y A -----------VL F I A S ILVL Q Y V T A Q -------------S DE ---I Y E F H RD A R A -flp_5_gr ---T K F ILIL F L D Q K M S CC A S P A N R P F A ----RR FF S -------E I TT F Y VI AA ILLL C I Q SS E QQ I ----------R A D S D M D G W F D R Q L R V P K flp_5_ppe S G D Q T Q R S A M A Q H Q K Q LI P G I F SSSSSSS A S P S A SA M F T R QQQ RRR I P SSS F L A L SS M F VLLLI F C T E Q S N A QQQ P Q L A S A S V EE S P F Q N W F D R Q V R A P K flp_5_ma --K F K Y FF L H L P I KK M TS K Q F P I N K N L F N S LI K A T I F S -------P I F IVII F V S L F I F T LE M A L A E K A I N -------E V SS P F K N W F D R D T R S P K flp_5_mj --------------K M TS K Q F P I N K N L F N S LI K TT I F S -------P I F IVII F A S L F I F TS E M A L A EE A I N -------E V SS P F K N W F E R D V R S P K 110 120 130 140 150 160 170 180 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|. 
flp_5_hg P K F I R F G R S G ---Q K MI P F S R S F ST E T Q Y G DE G --I N Q L D A LV D A V D K L Y Q S -------S P E I R A F K T A P K R A Q K F I R F G -flp_5_ce P K F I R F G R A G ---A K F I R F G R S R N T W E --------D GY A S -P S V N E L Y V --------------------K R G A K F I R F G -flp_5_cb P K F I R F G R A G ---A K F I R F G R S G A N T W E --------D G Y AAAA P S V N D L Y V --------------------K R G A K F I R F G -flp_5_cr P K F M R F G R A G ---A K F I K F G R SG V N S W E --------D G Y AA -P S V N E L Y V --------------------K R G A K F I R F G -flp_5_na P K F I R F G R GGG --A K F I R F G R S G S N T W E --------N D V Y DDD VI P E IL R ED -------------------K R AA K F I R F G -flp_5_aca P K F I R F G R GGG --A K F I RF G R S G T N T W E --------N D M T D Y E GG S G ML R ED -------------------K R AA K F I R F G -flp_5_gr P K F I R F G R A G ---Q K LI R F G R S ST A P N Y S DE ---L S Q L D A LV D A V DE L Y P S -------S P E L R A F K SS P K R A Q K F I RF G -flp_5_ppe P K F I R F G R A NN G Q G Q K F I R F G K R N S P Q Y D G A E N L DE LI D VV ED L Y P A ----------QQQ L R Q M A --Y L T A P K R A Q K F I R F G -flp_5_ma P K F I R F G R S A G NN Q K F I R F G R T P S L ELV GG D P ST E V E N L DD LI D A V E G L Y P S E R L R QQQ E S P S M T A V Y L T A P K R A Q K F I R F G KK flp_5_mj P K F I R F G R S A G NN Q K F I R F G R T P S L E LV GG D P ST E V E N L DD LI D A V E G L Y PS E R L R QQQ E S P A M AA V Y L T A P K R A Q K F I R F G KK

135flp-6 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_6_hg ------M A M N T N S I H C SS A SSS P S V P MML S M F C F I F A I F L H P A SSS D LL SSS DD A K L QQ ML C A R F P G L A E C Q P N D E A S R G IM T K R K S A Y M R F G R AAAA flp_6_gr ----P VMLL NNN T M S C SS ----S L F MI W S LF VIL F S C F I F R P A SS D LL SSST D A Q L QQ L C A R F P G M E V C Q P DD E V L G A M A K R K S A Y M R F G R AAA E flp_6_gp ------MLL NNN T M S C SS ----S L F MM W S L FF IL F S C F I F R P A SS D LL SSST D A Q L QQ L C A RF P G M E V C Q P DD E A L G A M A K R K S A Y M R F G R AAA E flp_6_as -MML Q S A L Y M T L F G A V C A L R IL T K EE S E P Q LV S D N D P T MI C E L Y P Q L E L C T D A H LM D K R -K S A Y M R F G R S D SS AI R G D A EE V E K R K S A Y M R F G K R DD S flp_6_ce -------------------------M N S R G LIL T L G VVI A V A F A QQ D S E V E R E MM K R -K S A Y M R F G R S D G ---G N P M E M E K R K S A Y M R F G K R SS G flp_6_cb -------------------------M N S R G LIILML G VVIA V A V A QQ D S D L E R E M E K R -K S A Y M R F G R S D G ---G N P M E M E K R K S A Y M R F G K R S -flp_6_oo -----------------------------------F LL T F A VV Y A F Q P DD Q L S E M E K R -K S A Y M R F G R S D P Q -L A D Q LMM D K RK S A Y M R F G K R S E A flp_6_aca ---------------------------M N R T VV AA LLL T F A V A Y A F Q S DD Q F S G M E K R -K S A Y M R F G R S D P E -L A D Q F LM D K R K S A Y M R F G K R S E I flp_6_ace -------------------R AA V S G T R M N R T VV AA LLL T FA V A Y A F Q S DD Q F S G M E K R -K S A Y M R F G R S D P E -L A D Q F LM D K R K S A Y M R F G K R S E V flp_6_na -------------------------L S M N R T VV AA LLL T F A V A Y A F Q P DD Q F P G M E K R -K S A Y M R F G R S D PE -L T D Q LMM D K R K S A Y M R F G K R S E V flp_6_ss K YY ILVV F L S I S C L F N V K G NN E I K E SS A I E NN K Q L Y L C D II P E H Y L C TS DE S L ST P I K R -K S A Y M R F G R S D P 
------G E V E K R K S A Y M RF G K R SS G flp_6_ov ---------------T P G I R H E Q M P T IVVLLMV T MM A I Y S G V E VV E S L Q M Y E N D P E M N ----------------------------------E G E I R flp_6_tc ----------------------------P G L R N ST ML Q M R S ----------------------------------------M E --------------flp_6_bm -----------A V -N H IIML K N Q M P ST P A LL T VIMMII GG V E VV K C L EM F E N P E N V -D R I R T L C S L P T F I C A E Q A M D K R K SS Y M R F G R S Y P A flp_6_mh -----NN QQ K M T K I S K T F H F N L N H LLII TS LLLI N I T I F I S A T F D K T N I E F N D V T E I E R L C QQ FP G LV E C R F V --S PP M Q M E --------------flp_6_mp --LI K N TS F K M P N I S K S NN F N L N Q LLII TS LLLI N L T I F I S A N Y DE T N L E S N L A E I K Q L C QQ F P N L A E C R ILL -S PP M Q M E --------------flp_6_mj --LI K N P S F K M P NI S K S NN F N L N Q LLII TS LLLV N I T I F I S A N Y DE T N I E S N LV E I K Q L C QQ F P N L A E C R IL --S PP M Q M E --------------flp_6_mc --------------------------------------------------------V A K L C Q N Y P E IV E C L Y L P I E N Q M K M E --------------110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_6_hg A D P S EE Y --------E M E K R K S A Y M R F G K R ---------S P A EE G E T Y Q I W AA N E P D A Q ------------------------M E K R K S A Y M R F G K R flp_6_gr EE P -----------A M E K R K S A Y M R F G K R S --------P EDE N V D Q LL A F N E P D A L Q ------------------------M E K R K S A Y MR F G K R flp_6_gp Q E P -----------A M E K R K S A Y M R F G K R ---------S P EDE N V D Q LL A L N E P D A QQ ------------------------M E K R K S A Y M R F G K R flp_6_as A SS L S D N G Q -T Y D G E I E K R K S A Y M R F G K R K S A Y M R F G K RS -DE Q P T A E I -----------------------------------E K R K S A Y M R F G RR flp_6_ce G DE Q E LV G -G DD I D M E K R K S A Y M R F G K R -------S G P Q EDD --------------------------------------M P M E K R 
K S A Y M R F G K R flp_6_cb G DE Q ED IV G A GG DD M E M T K R K S A Y M R F G KR -------S G A P EED -------------------------------------VM S A E K R K S A Y M R F G K R flp_6_oo L DED T --------M D M E K R K S A Y M R F G K R K S A Y M R F G K R SS E F D ------E A --------------------------P D A I D M E K R K S A Y M R F G R flp_6_aca P N EE N --------L D M E K R K S AY M R F G K R K S A Y M R F G K R F S E F D ------D G --------------------------S E P F D M E K R K S A Y M R F G R flp_6_ace P N EE S --------L D M E K R K S A Y M R F G K R K S A Y M R F G K R F S E F D ------D G --------------------------S E P F D M E K R K S A YM R F G R flp_6_na P DEE S --------L D M E K R K S A Y M R F G K R K S A Y M R F G K R F S D F D ------DD --------------------------S E P M D M E K R K S A Y M R F G R flp_6_ss N DE I EDE A II P -E N G I E K R K S A Y M R F G K R K S A Y M R FG K R D M D M E ------S G -------------------------S D I Y S P L E K R K S A Y M R F G K flp_6_ov T L C S L N P T L S F C S E H A M E K R K SS Y M R F G R S Y P VIL D I E P -------------------------Y P F E ------------------K R K S A Y M R F G K R

136flp_6_tc -----------------K R K S A Y M R F G R ----------------------------------------------------------------------flp_6_bm IL E T E P H L S E -------K R K S A Y M R G K T L C R F Q ----------------------------------------------------------------flp_6_mh -----------------K R K S A Y M R L G K R K S A Y M R F G K R G V N ED N Q I P D S E Y TS I -------------------D G LM S E N Q P M E K R K SA Y M R F G K R K flp_6_mp -----------------K R K S A Y M R P A K R K S A Y M R F G KKK R ---------------------------------------------------------flp_6_mj -----------------K R K S A Y M R L G K R K S A Y M R F G K R -----------------------------------------------------------flp_6_mc -----------------K R K S A Y M R L G K R K S A Y M R F G K R G VV E V Q I P D A K YI D I -------------------N G LL EE N Q P M E K R K S A Y M R F G K R K 210 220 230 240 250 ....|....|....|....|....|....|....|....|....|....|....|.... flp_6_hg K S A Y M R F G R K ------------------------------------------------flp_6_gr K S A Y M R F G R K ------------------------------------------------flp_6_gp K S A Y M R F G R K ------------------------------------------------flp_6_as ----------------------------------------------------------flp_6_ce SS D M E VI G N E G V D G ---D A H D L F K R K SA Y M R F G K R S M G EEED H D MM K R K S A Y M R F G R flp_6_cb SS D M E IL G N E G I D G AA EDE S H D L F K R K S A Y M R F G K R S M G Q EED H D MM K R K S A Y M R F G R flp_6_oo ----------------------------------------------------------flp_6_aca ----------------------------------------------------------flp_6_ace ----------------------------------------------------------flp_6_na ----------------------------------------------------------flp_6_ss ----------------------------------------------------------flp_6_ov ST D S N --------------------------------------D F T K R K S A Y M R F G R flp_6_tc ----------------------------------------------------------flp_6_bm 
----------------------------------------------------------flp_6_mh S A Y M R L G ---------------------------------------------------flp_6_mp ----------------------------------------------------------flp_6_mj ------G V N E A N Q I P D S E Y I S I D G LM A E K Q P KKKKK --------------------flp_6_mc S A Y M R L G ---------------------------------------------------

137flp-7 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_7_hg ------------------M A Q F P V A N ILL A S LLL G F VIL N ---------------S K T N A Q F Q Y GGG ML ----------------A E G P Q M ---E G flp_7_ce ---------ML G S R F LLL A L G LLVLVL A EE S A E QQ V Q E P T E L E K S G E Q L ----S EED LI DE Q K R T P M Q R SS MV R F G R S P M QR SS MV R F G K R S P M Q R S flp_7_cb ---------ML G S P R F LLL A L G LLVLVL A E SS V E QQ V Q D Q T D L D K S G E Q L ----S EED II EE Q K R S P M E R SS MV R F G R S P M E R SS MV R F G K R S P M E R S flp_7_cr P T R PP T R P E ML G SR F LLL A L G L F VLV W A E K ST E QQ V Q E Q T D L D K S G E Q L ----S EED LI DE Q K R N P M Q R SS MV R F G R S P M Q R SS MV R F G K R S P M Q R S flp_7_oo -----------------------Q A S LLVV T V C VVII G A S --------------AA F D SS Q Y D F A D S IL -----------------T DE K RA P M D R S flp_7_rs --I Q N P F T L S F P R Q STT K M A Q F L Y A -F L A S LLL G VVL F D ---------------R N A M G QQ F A E F G ------------------V A G P E MM D L D Y G flp_7b_aca -----------------------CCC L R L T Q I S L P G A Q T R --------------EE F D N L E R F DD ---------------------F E K R AP M D R H flp_7_gp A T G A Q L E F G TS F K S PP T K T M A Q F P V A N ILL A S LLL S F VVL N ---------------R M TS G Q L Q Y GGG M F ----------------V D G P E M G AA D G flp_7_gr -----------K S -T E MM A Q F P V A N ILL A S LLL G F VVL N ---------------R M TS G Q L Q Y GGG VF ----------------V D G P E M G AA D G flp_7_ss -------S FFFFF C R N I D M S RR L F TS IVIV C I A T IV F G V E K E T Y PP N TS D V ----A I D F S E N K Y L S E V D G -I P E Y I S Q G N D V ED N E A I E K R A P L D R S flp_7_mj --------------------------------------------------------D N E N S YE S M A ------------------------K R A P L D R S flp_7_mh --------------------W P LI R P LI G --------------------------D N E NN Y E S M A ------------------------K R A P F D R S flp_7_mi E 
M A Q II Y N T LL A T LI F S A FF I S R F S N G Q L T N D S G S Q LM S L Q D F G DD Y NN A I F A D I D G D N E N SY E S M A ------------------------K R A P L D R S 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_7_hg LL ED Y G -----R V G D Y P F E S L A K R -------------A P L D R S A M A R F G K R A P L D R S A L A R F G K R A P L D R S A I A R F G K -------------------flp_7_ce S MV R F G --K R S P M Q R SS MV R F GK -------------R S P M E R S A MV R F G R S P M D R S K MV R F G R SS I D R A S MV R L G K R T P M Q R SS MV R F G K R S M E F E flp_7_cb A MV R F G --K R S P M D R S A MV R F G K R L P S D R SS MV R L G K R S P M D R S AMV R F G K R S P M D R S A MV R F G R SS I D R A S MV R L G K R T P I Q R SS MV R F G K R S A DE T flp_7_cr S MV R F G --K R S P M Q R SS MV R F G K -------------R S P M E R S A MV R F G R S P M D R S K MV R F G R SS I D RA S MV R L G K R T P M Q R SS MV R F G K R S A P S D flp_7_oo S MV R F G --K R A P M D R ST MV R F G R --------------A P M D R ST MV R F G K R A P M D R SS MV R F G K R A P M D R SS MV R F G K R I P S EE L A P Y F G L -----flp_7_rs SI N D I E S LV K R A S L D R S A M A R F G K R -------------A P L D R S A M A R F G K R A P L D R S A MV R F G K R A P L D R S A M A R F G K -------------------flp_7b_aca R MV R F G --K R A P M D R SS MV R F G R --------------A P M D R SS MV R F G K R AP M D R SS MV R F G K R A P M D --M F Q H G --------------------flp_7_gp ML ED Y G ---R V G D Y P F E S L A K R --------------A P L D R S A M A R F G K R A P L D R S A L A R F G K R A P L D R S A I A R F G K -------------------flp_7_gr ML ED Y G ---R V G D YP F E S L A K R --------------A P L D R S A M A R F G K R A P L D R S A L A R F G K R A P L D R S A I A R F G K -------------------flp_7_ss S MI R F G --K R A P L D R A MV R F G R --------------S P I D R SS MV R F G R A P L D R SS MV R F G R A PL A R TS MI R F G K R A P L D R A MV R 
--------flp_7_mj A LV R F G --K R A P L D R S A LV R F G K R -------------A P L D R AA MV R F G K R A P L D R AA MV R F G K R A P F D R SS MV R F G K R K -----------------flp_7_mh A LV R F G --K R A P F D R S ALV R F G K R -------------A P L D R AA MV R F G K R A P L D R AA MV R F G K R A P F D R SS MV R F G K R K -----------------flp_7_mi A LV R F G --K R A P L D R S A LV R F G K R -------------A P L D R AA MV R Y G K R A P L D R AA MV R F G K R A P F D R SS M------------------------

210
....|....|..
flp_7_hg -----------
flp_7_ce M Q S N E K N I ED S E
flp_7_cb E N ------T N E
flp_7_cr I N ---E I Q DDE
flp_7_oo -----------
flp_7_rs -----------
flp_7b_aca -----------
flp_7_gp -----------
flp_7_gr -----------
flp_7_ss -----------
flp_7_mj -----------
flp_7_mh -----------
flp_7_mi -----------

flp-8 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_8a_ce ------ML S G VL F S I F VL A I S A N A S C D V S A L TT E N E K E L G L R I C H L E A ---------------------------------------------E M Q VV
flp_8_cb ------MLL G VV F S I F VL A I S A H A T C D V S A L A T E S E K E L G L R L C R L E S ---------------------------------------------E M Q VI
flp_8_hg ---------M AL Q I S N T A F MLIVL A TS LLLLM --P SSSS A K VM P Q L G ----F A N A E Q LM N D N SS P M G A G V E G E VM D K V E A R LL G A L E LL Q S Y K E V P
flp_8_gp ---------MI QQ I P T A LLLV T L A T A LLML S G V K T N A Q N A H LLV E R D ----F G NA E R V N D R -P M N G V D G E VI D K M E S R LL G A L E LL Q T Y R D A P
flp_8b_as ----------M Y Q F V A F LLL F L S L A F S Q K T L A --Q S K G S P E LI Q P S I Y A --------------------T D S E VI A K V Q G Q LL G A I T LL D A L Q DG -
flp_8_ace --------------S D R H E A G F S Y E C D V G S F P E S Q R E L G RR V C R L E N -----------------------E V G VL E AA V Q E ML Q R T D VVL N S DE P
flp_8_nc ------------L G A T L F A S G F S Y E C D V T N F P E S Q R D L G RR V C R L E N ------------------------E V G ML E AA V G E ML Q R T V P A M N S DDE P
flp_8_xi Q F G S V R E R M D G H F L F V F VL AA VM A P N T I G SS V S E L C G S G H G N S E G L S D L C ----------------------G A K M E V D A L Q E R L E G M A D R IL Q E N GG Q
flp_8_ov -----T F K TT A M T F I P T V S N A F I Q VLLIMVLV P N VL P L P F A V H L P V H D S F Y DE PP VV D Y N F V P Y S G L R I N DD LM N E P F ---------------------

110 120 130 140 150 160 170 180 190 200
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_8a_ce Q R A L Q E VM QQ T D V T L Y D Q E V P VM N K R K N E F I R F G K R S ---------D G M E K R K N E F I R F G K R K N -E F I R F G R S D K G L G L DD N -------D V S M E K R -
flp_8_cb E R A L Q E VM QQ T D V TS F D Q E V P A M N K R K N E F I R F G K R S --------D G M E K R K N E F I R F G K R K N --E F I R F G R S D K G L G L DD N --------V S M E K R -
flp_8_hg --------------IV P K F T E K R K N K F E F I R F G K RRR -------------------------------------------------------------

flp_8_gp --------------IV P K F T V K R K N K F E F I R F G K RRR -------------------------------------------------------------
flp_8b_as ---------------T V K LL E K RR N K F E F I R F G RR ---------------------------------------------------------------
flp_8_ace --------------------T L E K R K N E F I R F G K R S --------L N D V K R K N E F I R F G K R K N -E F I R F G R S D P LIL DD AA --------V -E K R -
flp_8_nc --------------------N I E K R K N E F I R F G K R S --------L S D V K R K N E F I R F G K R K N -E F I R F G R S D P FF A DD A T -------V -E K R -
flp_8_xi V ---------------------E K R K N E F V R F G R S GG DE G S Y E V G H D M G K RK N E F V R F G K R K N -E F V R F G K R K N E F V R F G KK -S DD G -V -E K R -
flp_8_ov --------------------R V D K R K N E F I R F G K R D -------D P M K F K R K N E F I R F G K R ---------------S V K L KK F --------------

....|...
flp_8a_ce K N E F I R F G
flp_8_cb K N E F I R F G
flp_8_hg -------
flp_8_gp -------
flp_8b_as -------
flp_8_ace K N E F I R F G
flp_8_nc K N E F I R F G
flp_8_xi K N E F V R F G
flp_8_ov -------

flp-9 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_9_ce M C V Y V C A Q T PP I R VL S IL S Q D S A P I K A H FFF W S R F Q R K T QQ H R L KK G E T FF V S KKKK M N Q F Y A L F LV A C I AA M A N A Y EE -P D L D A L A E F C G K E S N R K Y C
flp_9_cb -M W V C VVLL R L RP F SSSSS L D S A H P E N P IV TS R TS --------------T A K I ED M S R L Y A LLLIV C I A N V A ST A P E S I P D L D A L A E F C A K E S N K R Y C
flp_9_ace -------------------------------------------------H E A P VM K S IVV A IL S LIL C AA MV T V S A Q D ---P E A LI D Y C A Q P Q N R E VC
flp_9_na --------------------------------------------------------------F S LIL C AA IV T V S G Q D ---N E A LV N Y C A Q P Q N R E V C
flp_9_oo --------------------------------------------------PP S VM R S IVL A IL S LI F A V A VV T V S A Q D ---Q E A L A E Y C A Q T Q N R E V C
flp_9_aca ---------------------------------------------------------------T LIL C AA MV T V S A Q D ---P E A LI D Y W AQ PP N R E V C


140 110 120 130 140 150 160 ....|....|....|....|....|....|....|....|....|....|....|....|... flp_9_ce D Q I A Q L A T Q H A I G I N Q E Q V R M E K R K P S F V R F G K R S G Y P LVI DDEE M R M D K R K P S F V R F G R K flp_9_cb A Q L A Q L S L D S A M E A N Q E Q VI Q M E K R K P S F V R F G K R S GY P LII D N EE L R M D K R K P S F V R F G R K flp_9_ace E Q LL A S ---VI A EE Q E S L P Q V D K R K P S F V R F G K R ---------AA LV E K R K P S F V R F G R K flp_9_na E Q LL A S ---LIL EE Q E S L P Q M D K R K P S F V R F GK R T --------V A MM E K R K P S F V R F G R K flp_9_oo D Q LL AA ---LI D G S E T I P Q M D K R K P S F V R F G K R S D G ------E M A I E K R K P S F V R F G R K flp_9_aca E Q LL A S ---VI A EE H D S L P Q V D K R K P S FV R F G K R ---------A P LM E K R K P S F V R F G R K flp-10 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_10_ce ------------M Q L S IV F V FF VL C L AA V F A V P I S D A S R A RR Q V A S E K ----------------------------------------------R Q flp_10_cb ------------M Q L S A V F V F LVL C L AA V F A V P L K D A S R A RR E V A P S E K ----------------------------------------------R Q flp_10_ace G --T R P S L R Y T M R FF S FA LI F A L F V F V A M A K P R G P P E V T R E T R G H N D K ----------------------------------------------A P flp_10_xi G S A E SS P Q RR K R K M G VV K F LLIIVIIL Y S A K T A L Q N E I P K F E L E C D G E N E K N MV K V K C E LM K L A K E Y EED Q R K EE N E N YK H L S G E G Q P D S I S P V E T D R M K 110 120 130 140 150 160 170 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|.. 
flp_10_ce P K A R S G Y I R F G K RR ----------V D P N A E LL Y L -----D Q LLI ------------------------------flp_10_cb T K S R S G Y I R F G K RR ----------V D P N A E LL Y L -----D Q LLL ------------------------------flp_10_ace Q K T R SS Y I R F G K R ------------A N P N A D LL Y L -----D Q LIL ------------------------------flp_10_xi K M A G Y K Y L R F G K R Q I G Y K T GG A Y W ----------------------------------------------------


141flp-11 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_11b_as -----------------------------ML R ST VV G L F A C I A V A VV A -S A EDDE Q ---------------------------------------V flp_11a_ce -----------------------------M T Q F S A L A LLLIV F V AA S F A -Q S Y DD V S ----------------------------------------flp_11_cb -----------------------------M T K F S A LVLILVI F V AA S F A -Q P Y DD V S ----------------------------------------flp_11_ace -----------G C R Y L A R G P T L R LL S LII F LMP S P A STT L C LI A VLVV A -A L A Q DD S ---------------------------------------S flp_11_aca -------------------------G LII F LM P S P A STT L C LI A VLVV A -A L A Q DD S ---------------------------------------S flp_11_hc ------------Y S FF T E G -Y G N I H I RR MI R A N F V S N VL Q LL D SS LV S -CC PP I H L S ---------------------------------E I W K F S flp_11_ss ----------------------------IM KSS IVL S LL F T V F AA FF I A N A QQ Y DE N S L ---------------------------------G --Y L flp_11_gp S L R Q I K F V T N C L C A P N S N G D PP E R PP E M S K L C R F S R FF Y A VLLLL T L C S IL -M A D AA V W ------------------------------------R M R flp_11_gr ------------------------I A M A K L C R F S R FF Y A LLLLLT L C S IL -M A E AA I W ------------------------------------R M R flp_11_hg ----L F RR F L P V N S A P D A H S I D G V D R C R V S P F SS D N P F A L F A V A Q E A DE N GG I G ----------------S A S LM A K R H L F E A L A R Q G R S P R S flp_11_mi ---------L P KF E N IM D II TT KK L K I F N LII FFF LI F N IL F SS A K P LI N DEE G II P T W ------------------------------------K M R flp_11_mp ---------------VM D III T KK L K I F N LII FFF LI F N IL F SS A K P LI N DEE G II P T W ------------------------------------K M R flp_11_rs --------F S R F H R H I QQ F S M A HH A L S L F V AA L P RP G T LLLLLV C AAA V F D C S A E A M T W ------------------------------------K M R flp_11b_oo 
---------------------------------------------------------------------------------------------------flp_11_ppe ------PP F P F HH R LI Y R L F E I R QQ K MM H L Q G M G F I S LLIV A I A P F I G H L TS AA D L DE G N MLIM E F S G Q S P A E I P T E Y G DE N P F A S LI G A S A E KR AA SS flp_11_wb -----------K E V K SS R D K ST I NN R T M H F V H T Y F LL S Y C F IV Q F V T I A I TS V R N DE A ---------------------------------------L flp_11_tc --------------------TT H T Q T AA MI G R S F VL T L F I S MLIV E AA L P ---------------------------------------------R M R flp_11_na ----------------------------ML A R S IV F T LL F T ILIV D AA L P ---------------------------------------------R LR 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_11b_as A E K R A M R N A LV R F G ----R S G -M R N A LV R F G K R -----------A -DD N E Y --ST L DE K R N G A P Q P F V R F G R S G R V D H I H D IL ST L Q R L Q L A N E flp_11a_ce A E K R A M R N A LV R F G ----R A S GGM R N A LV R F G K R -----------S P L DEED F A P E S P L Q G K R N G A P Q P F V R F G R S G Q L D H M H D LL ST L Q K L K F A NN K flp_11_cb A E K R A M R N A LV R F G ----R A S GG M R N A LV R F G K R -----------S A M DEED F S P E SS L QG K R N G A P Q P F V R F G R S G Q I D H M H D IL ST L Q K L Q Y A G N K flp_11_ace P E K R A M R N A LV R F G ----R A GGG M R N A LV R F G K R -------------S S A DDD Y E AA M Q D L F N G A P Q P F V R F G R S G H L D H M H DIL ST L Q K L E M A N YY flp_11_aca P E K R A M R N A LV R F G ----R A GGG M R N A LV R F G K R -------------S S A DDD Y E AA M Q D K R N G A P Q P F V R F G R S G H L D H M H D IL ST L Q K L E M A N YY flp_11_hc L E K R A M R N A LV RF G ----R A GG S M R N A LV R F G K R ------------Y L A T DDD Y A T AAA Q G K R N G A P Q P F V R F G R S G H L D H I H D IL ST L Q K L QQ A N -flp_11_ss T E K R A M R N A LV R F G ----R A G -M R N A V R F G K R --------------N -I D N DI P E F A 
L K R N AA P Q P F V R F G R SS N L S P S G Y F I P L NN M Y D N T E A flp_11_gp T D KK A M R N A LV R F G ----K R ---------------------------------------------------N A Y R SS G E A F V G AA G F G D S G A H ----flp_11_gr T D KK A M R N A LV R F G ----K R ---------------------------------------------------N AY R SS G E A F V G AA G F G D S G A H ----flp_11_hg A SS A T M R N A LV R F G K H A L F P ----------------------------------------------------V A L DD K H N PP Q P F I H F G H S A ------flp_11_mi N D KK A M R N S LV R F G ----K R SS Q SS L K RR N VIL P SS P Q L SS P Y F Y L P E NN EILL P S E F M K N LIL P E I SS K I S LI P S D S LI F V D K T R K E I Q R W K E R K --flp_11_mp N D KK A M R N S LV R F G ----K R SS Q SS KKK ---------------------------------------------------------------KKKK --


142flp_11_rs T D KK A M R N A LV R F G ----K R ---------------------------------------------------N A Y R P A G E A F V G A S G S E GG Q N ----flp_11b_oo --------LV R F G ----R A G R S M R N A LV R F G K R ------------SST A DDD Y AAA V A Q D K R N G A P Q P F V R F G R S G H L D H I H D IL S F LQ K L Q M A N YY flp_11_ppe S A S G T M R N A LI R F G P S P -R SS A T M R N A LV R F G K R -----------S A P A N G A N F D VL G S I K R N S A P E P F V R F G R S P HH RR S AAA V N DE N S I K M PP G F flp_11_wb G E K R A I R N ALV R F D ----R S G -I R N A LV R F G K R T --S D T Y F -------------L N A E S R G P T A L N P S V R -------------------------flp_11_tc H T K R A M R N S LV R F G ----K R ------------------------------------------------------------------------------flp_11_na P A KK A M R N S LV R F G ----K R ------------------------------------------------------------------------------210 220 230 240 250 260 ....|....|....|....|....|....|....|....|....|....|....|....|... flp_11b_as --------------------------------------------------------------flp_11a_ce --------------------------------------------------------------flp_11_cb --------------------------------------------------------------flp_11_ace --------------------------------------------------------------flp_11_aca Y -------------------------------------------------------------flp_11_hc --------------------------------------------------------------flp_11_ss --------------------------------------------------------------flp_11_gp ---------LL R D I G M D G R Q T Q W AA F G D GG A P -R P I K R LLL W P E Q ---------------flp_11_gr ---------LL R D I G M DD R Q T Q W AA F G D GG A P -R P I K R LLL W P E Q ---------------flp_11_hg --------------------------------------------------------------flp_11_mi --------------------------------------------------------------flp_11_mp --------------------------------------------------------------flp_11_rs ----------A Y A G S LM R G V D G LV A F Y N S A E Q P I W R L Q ----RR A LM D N S R 
GG V -------flp_11b_oo --------------------------------------------------------------flp_11_ppe A F A F L ---------------------------------------------------------flp_11_wb --------------------------------------------------------------flp_11_tc ---------A D L S D VVLL EE P S G I A D S D L F Y S G V A Q P R N Q L R T L Y N ---------------flp_11_na ---------G D V S DN V F L G E S F G P G E T D G L Y F E R E Q P K I P V Q Y S YY ---------------


143flp-12 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_12_as -------------------------------M Y Q F V A F LLL F L S L A F --------S Q K T L A Q S K G S P E LI Q P S I Y A T D S ----------------E V flp_12_ce --------------------------------M N V Q VII A LL F C LI --------A T C A T Q K V K G S P E VL P AA M Y D G E L S H --------------E S flp_12_cb --------------------------------M N S Q LVL A LLLC F I --------A TS V A Q K A K G S P E VL P AA M Y D G D A S H --------------E S flp_12_na LI A C R S L P R LL SS L HH I T Y I T L S K TT E T K EE S N M R TSTT VI F A M S MLL --------V T A Q K S H S K G A P E LI Q P VM Y D P D N S ---------------E T flp_12_aca -----------------------------AAN M R TSTT VL F LV S LLL --------V S A Q K S H S K G T P E LI PP MI Y E N D N S ---------------E M flp_12_ov -------------------------------L S MM TS F I P F II F I S A --------V F S Q K I Q N K Q T Q D LL E P I T Y N R G N R ---------------E L flp_12_gp -------------------L K R S I S K T K MI QQ I P T A LLLVT L A T A L -------LML S G V K T N A Q N A H LLV E R D F G N A E R -V N D R -P M N G V D G E V flp_12_gr ----------E C Q PP T R A K L K R SS K TT K MI QQ I P T A LLLV T L A T A L -------LML S G V K T N A Q N A H LLV E R D F D N A E R-V N D R -P M N G V D G E V flp_12_hg -----------------------------A L Q I P N T V F MLIVL A TS L -------LLLM --P SSS N A K VM P Q L G F A N A E Q L M N D N SS P M G A G V E G E V flp_12_mh ------------------------SS K MMI YY P K N L F LLL T V C IV S --V T -I A I N V Q N MN D L Q R N H LI E R E F P G E N IL N G E S Q L Q R Q V H T M DEE M flp_12_mp ------------------------SS K N T MV YY QQ N L F LLL T V C IV S --L T -L A I N V Q N M N D L Q R N H LI E R E F P G E N IL N A E S Q L Q R Q V H T M DEE M flp_12_ma ------------------------S F K N T MI YY QQ N L F LL F T V C IV S --L T -L A I N V Q N M N D L Q R N H LI E R E F P G E N IL N T E S Q L Q R Q V H T M 
DEE M flp_12_mj ------------------------S F K N T MI YY QQ N L F LL F T V C IV S --L T -L A I N V Q N M N D LQ R N H LI E R E F P G E N IL N T E S Q L Q R Q V H T M DEE M flp_12_mc ------------------------KK RR MI YYY P K N I Y F L F T L C II AA III TS F Y A I N V Q N M N D F Q K S R LV E V R E F P G E N VL N G E S K QQ Q P H T L DEE I flp_12_mi ------------------------S F KN T MI YY QQ N L F LL F T V C IV S --L T -L A I N V Q N M N D L Q R N H LI E R E F P G E N IL N T E S Q L Q R Q V H T M DEE M 110 120 130 140 ....|....|....|....|....|....|....|....|....|... flp_12_as I A K V Q G Q LL G A I T LL D A L Q D G ---T V K LL E K RR N K F E F I R FG RR --flp_12_ce V N K I S A Q LL N A L S E L E A L Q E G N -QQ L K M A E K RR N K F E F I R F G R K --flp_12_cb L N K I ST Q LL N A L A E L E A L Q E G S -QQ L K M A E K RR N K F E F I R F G R K --flp_12_na L A K V SS QLL T A L A T I E S I Q E GG V PP I K I A E K RR N K F E F I R F G R K --flp_12_aca L A K V S A Q LM N A L A T I E N M Q E G --T P I K I A E K RR N K F E F I R F G R K --flp_12_ov L A K A EE Q LL N T L S LL Q VL A DD S --N Q F E M E K RR N K F E F I R F G RR --flp_12_gp I D K M E S R LL G A L E LL Q T Y R D -A P IV P K F T V K R K N K F E F I R F G K RRR flp_12_gr I D K M E S R LL G A L E LL Q T Y R D V S A P IV P K F T V K R K N KF E F I R F G K RRR flp_12_hg M D K V E A R LL G A L E LL Q S Y K E -V P IV P K F T E K R K N K F E F I R F G K RRR flp_12_mh L G R V E A Q LM G A M E ML Q N Y R AA SS P A K F T E K R K NN K F E F I R F G ----flp_12_mp LG R V E A Q LM G A M E ML Q N Y R AA SS P A K F T E K R K NN K F E F I R F G R ---flp_12_ma L G R V E A Q LM G A M E ML Q N Y R AA SS P A K F T E K R K NN K F E F I R F G R ---flp_12_mj L G R V E A Q LM G A M E ML Q N YR AA SS P A K F T E KKK NN K F E F I R F G R ---flp_12_mc L G R I E G Q LM G A M E ML Q N Y R A SSS P T K Y TT E K R K NN K F E F I R F G R ---flp_12_mi L G R A E A Q LM G A M E ML Q N Y R AA SS P A K F T E K R K NN K F E FI R F G R ---


144flp-13 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_13_ce ----------------------------------MM TS LL T I S M F VV A I Q A F D SS E I R ML DE Q Y --------------D T K N P FF Q F L E N S K R S D R flp_13_cb ----------------------------------MM TS LLVI P MM F VV A I Q A F D SS E I R ML DE Q Y --------------E T N H P Y F P F L E Q K R S D R flp_13_aca -------------------------------R P Q TR P ML R A C VLVL T V G I AAA F D SS E M R ML EDE F L N G K R G R G L W F Y T N D A V E K R S E P L R L P -R E G R flp_13_na ---------------------------------Q T R P ML R T C VLVLIV G V AAA F D SS E I R ML ED G Y --------------H V E K R AA P I H L P -R EE R flp_13_hc ----------------------------------T RLML R T VVLLL A V S L A V A Y D SS E L R ML EDE Y --------------A M D K R V D T L S ---R E S R flp_13_oo -----------------------------------R P ML Q T C A LLL T V A L AAA F D SS E L R ML EDE Y --------------A M D K R A E P M Q P --R EE R flp_13_mi ------------------------------K H A L A L H S LL F IV AAA L PP H K G I YD ST E L TSS E M Q S ----------------G K Q M S P F I S Y Q P W S Y M flp_13_mc -------------------------------T A L A L N S LLILV A S L PP Y S A G I Y D ST E L N SS E M Q S ----------------G K R M S P F N S Y Q P W S Y M flp_13_mj ---------------------------------------------------------E L TSS E M Q S ----------------G K R M S P F I S Y Q P W SY M flp_13_ppe ---------------------F R M F S D C R P F S I G LLV S LLM A L F S A Q F A I A G S Y E S A E L TSS E L Q S ----------------G K R M S P Y V S Y Q P W S Y M flp_13_hg -K S Q C F A VV RR F VLI K FF V R -----PPPP F L S LL S LL F L S LL ---S V N S H S Y ESS E M R IM EE R G ----------------G K R M F ---Y Q P W S F M flp_13_ace ---------------------------------------------------------------------------------------------------flp_13_ss ----------------------------S L S LMI K Q T I S I FF LLL P LIV TS F G L E P A E I R T L D N 
T N V Q G K R N -----T L DDD S V A IL P Y K M F Y E P IV S flp_13_gr E R PPP I RR IV S R C S L TS P IA RRR Y L P W P V P T A I P R A L F L F L P LL F I SSS V S A Q S Y E S A E L R IL EE K G ----------------G K R M F ---Y Q P W S Y M flp_13_ov -----------------------------F D L R LL F N P MI Y LV AA L A C I T F T H V E S L G I R T L E S E Y -------------------------D I A P A E R flp_13_ppa ---------------------------------------------------------------------------------------------------110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_13_ce P T R A M D S P LI R F G K R -AA D G A P LI R -F G R A P E A S ------P F I R F G K R AA D G A P LI R F G R A P E A S P F I R F G K R A S P S A P LI R F G R S P S A V P LI R flp_13_cb P T R A MD S P LI R F G K R -AA D G A P LI R -F G R A P E A S ------P F I R F G K R AA D G A P LI R F G R A P E A S P F I R F G K R AA P S A P LI R F G R S P S AA P LI R flp_13_aca S Y DE T A G P LI R F G K R -D M E G A PLI R -F G R T P D A Q ------P LI R F G K R T P D G A P LI R F G R N P E A Q P LI R F G K R -S Q G A P LI R F G R S V S A P LV R F flp_13_na S F DE N A G P LI R F G K R -D S Q G V P LI R -F G R A P E A Q ------P LI RF G K R S P D S A P LI R F G R D S E A Q P LI R F G K R -S P AA P LI R F G R S V S A P LV R F flp_13_hc S F EE N A S P LI R F G K R -D L S G A P LI R -F G R A P E A H ------P LI R F G K R A P D S A P LI R F G RD P E A S P LI R F G K R -S P AA P LI R F G ----------flp_13_oo S F EE A S P LI R F G K R -D F S G A P LI R -F G R A P E A H ------P LI R F G K R S P D N A P F I R F G R N P E A S P LI R F G K R -S P AA P LI R F G ----------flp_13_mi K R A P T A P II R F G K R SS ------W EE LI E R L N K E ---N EE NN F QQQQ K R S P N S A P LI R F G RR -------------L NN A P LI R F G R ---------flp_13_mc K R A P T A P II R F G K R S N ------W E K LI E R S N E NN -V N QQQ F Y NNNN D K R S P N SA P LI R F G RR -------------L SS A P LI R F G R 
---------flp_13_mj K R A P T A P II R F G K R SS ------W EE LI E R L N K E ---N ED -F QQQQ K R S P N S A P LI R F G RR -------------L NN A P LI R F G R ---------flp_13_ppe K R A P AA P II R F G K R SS H S --V E W RE QQ P R N T R A -V A L S N P LI R F G K R V P S N A P LI R F G RR D A G P LI R F G K R S P A GG A P LI R F G R ---------flp_13_hg K R T P S V A K A N D F Q K R P E G R Q F LL R W P --S A D H F G -K R ST VV P LM R FG RR P A E R AA P LI R F G K R -------A Y I R T -D AA P LI R F G R ---------flp_13_ace -----------------------------------------------------D G A P LI R F G R N P E A Q P LI R F G K R -S Q G A P LI R F G R S V S A P LV R F flp_13_ss Y D K R SST P LV R F G K R SS D L K DE S LI T R E LR A S ML D S ------P IV R F G K R S L S G P LV R F G R S P N G P LV R F G R A --S A G P LV R F G K R S Y L N S F T P F flp_13_gr K R TSST E P II R F E K R S G DD Q F M E G F K P L SS A D P F R FF K R S IVV P LM R F G RRP A E R AA P LI R F G K R --------A Q I R T N A V P LI R F G R ---------


145flp_13_ov K N E N MM D Y F V R I G R -----G S E K P G H I F G R -I A R T E A F Q TS P LI R F G K R ------------------------------------------------flp_13_ppa F L S L A V S P LL R F G R -------------------------------------------------------------------------------------210 220 230 240 250 260 270 280 290 300 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_13_ce F G R S AAA P LI R F G R A SS A P LI R F G R K ------------------------------------------------------------------------flp_13_cb F G R S AAA P LIR F G R A SS A P LI R F G R K ------------------------------------------------------------------------flp_13_aca G R S L E AA P LL R F G R S P E A S P LI R F G K -------------------------------------------------------------------------flp_13_na G R S V E A V P LL R F G R S P E A S P LI R F G K -------------------------------------------------------------------------flp_13_hc ------------R S P N A S P LI R F G KK ------------------------------------------------------------------------flp_13_oo ------------R S P N A S P LI RF G KK ------------------------------------------------------------------------flp_13_mi ---------------------------S W K G ----N EE K E F N E ------------------------------------------------------flp_13_mc ---------------------------S W K G ----N E G E Y E -------------------------------------------------------flp_13_mj ---------------------------S W K G ----N EE K E F N E ------------------------------------------------------flp_13_ppe ---------------------------S W E N EED W T N E G EE R K -------------------------------------------------------flp_13_hg ---------------------------SS -EE R K ---------------------------------------------------------------flp_13_ace G R S L E AA P LL R F G R S P EA S P LI R F G K -------------------------------------------------------------------------flp_13_ss D K R L E SS P LI R F G R A SS D P LV R F G K R S F M EEDE F L P N I K A D 
--------------------------------------------------------flp_13_gr ---------------------------SS -EE -----------------------------------------------------------------flp_13_ov ---------------------------N VM DD A R S E VL A R LL Q Y L QQ E Q P Y F DE S N R L HH --------------------------------------flp_13_ppa --------------------------W AA L G L G VL W GA Y R L T VI R E Y H A D I R E W E H E K AAA K AA ED A KKKK W L A K DE M R Y LM K VV N I P F EE G V A Q F G V E ....|.. flp_13_ce ------flp_13_cb ------flp_13_aca ------flp_13_na ------flp_13_hc ------flp_13_oo ------flp_13_mi ------flp_13_mc ------flp_13_mj ------flp_13_ppe ------flp_13_hg ------flp_13_ace ------flp_13_ss ------flp_13_gr ------


146flp_13_ov ------flp_13_ppa D L Y R EE flp-14 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_14a_as ----------------------------------------------------------------------------M N V QQ A G LLLLL G VII A C V ---flp_14_hg ----------------------------------M K S R F S P G A SS F P SSS -P S L P VL H R A P LLLL C A M AA VL Q LI P C SSSSS L A N AA VL F A E S G H P -flp_14_ce M P L G LL T E PP I H FF W FF L FF I TS G A D S Q I A T VLL N F VS FFF LLL E L Y Q A T P TT F H F Q H R A PP IL K S P L A P T E SS L H MMI C L P T A LLL S A F VV AA S G Q E A P flp_14_cb ----------------------------------------------------------------------------MI R L T A T A LL VLIV AA Y G Q D AA flp_14_od ----------------------------------------------------------------------------M R G L G A G LLL A T L ST LV P S ---flp_14_gp --------------------------------------M T I R SSSSS C SS --P L SS LL H R G F VVL C A L S VL Q F K P ----S L TT AAA V F A E S G P S -flp_14_ls -----------------------------------------------------------------R V E N H S H L S I A M K S L F HH L R SS N LI T A LVI ---flp_14_pv ---------------------------------------S P H Q SS P KK G E F ----------S K M SSS P S ---------A LLLM C A L AA F L Q V P S -flp_14_ace ---------------------------------------------------------------------------L R H E V G A G LLL A T L ST LM P S ---flp_14_tc ----------------------------------------------------------G S R E A T R S G R G F G R H I R K M R G L G A G LLL A T L ST LL P S ---flp_14_na ---------------------------------------------------------------------------------------------------flp_14_ts ----------------------------------------------------------------M Y P S K N W L N K N K M Q S II Y IV T F A T VI C G S L C L E -flp_14_ppe ---------------------------------------L A N R M SS P ------------------S AA V S ---------LL F LM ST L AA LL N S I P S -flp_14_gr --------------------E T D S A S PP S P A 
P R PP Y ST M T I R SSSSSS C S F L P L SS LL H R G F VVL C A L S VL Q F K P ----S L TT AAA V F A E S G P A -flp_14_aca --------------------------------------------------------------------------------G A G LLL A T L ST LMP S ---flp_14_pt -------------------------------------------------------------------G L Y AA L A --------------VLIVIV ---flp_14_ss ---------------------------------------------------------------------------------------------------flp_14_bm -------------------------------------------S YY I R S P I T I T V T I T I T I T I T I T I T I T I T A N VIM K G R HH L F SST LI T A LII ---flp_14_ov ---------------------------------------------------------------------------A M K G LL R H L C SSS LI T A LII ---flp_14_rs ------------------------------------A LL T QQ M K SS VLV S -------------A LL C A I P -----------S LLL R V Q A N AA P D AA -flp_14_mi -----------------------------------------------T F L -------------L S LIL S ------------L F C VLV C LL Q P G ---flp_14_mh -----------------------------------F S F L K N K M Q SST N S ---------------LIIL S ------------L F C G LV C LL K P G ---flp_14_mj -----------------------------------I Y FF S K M Q P S NN S F -------------LMVIL S ------------L F C VLV C LL QP G ---flp_14_mp ---------------------------------------------------------------------------------------------------flp_14a_ma -------------------------------------------------------------T K F S Y R S VV AA F A F D T A D M S A R T G IL C F I A S VLL ---flp_14_mc ---------------------------------------------------------------------------------------------------


147 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_14a_as ---S A D A P Q T -----C S Q VM A S P G EE N H N K MLL C Q L F E SST LL A Q L G A LV S E G L D R LMI T Q G ----------------VV P D V S -----A EEE G Q S flp_14_hg ---A D QQQQ -F G Q S D C A Q L AA G DEE R ---MLL C Q L Y E SS A LL A Q L G ILV N E GI G R L A V S Q G -----------------M G -----------K F VV flp_14_ce A G A G A S G AA Q A P H N P K D C Q A IL A NN G D Q -Q E A LL C Q L S E SS MLL A Q L G A LV S E G V E R LV Q T H ---------------G L A L EEE T -----N E G D N D flp_14_cb P G A G A G A V Q AAH N P K D C Q S IL A NN G D Q -Q E A LL C Q L S E SS MLL A Q L G A LV S E G V E R LV Q T H G LI P L SS E P Q DE P MI P G L A L EEE T -----N E S D G D flp_14_od ---S A G A V E G T -A K N C N E IL S N GG D Q -Q E LLL C Q L SE SST LL A Q L G ILV S E G L D R LM Q N Q G --------------I AA V EE P Q -----G VV EE N G flp_14_gp ---SSS V D Q -F V H S D C A Q L A GG DEE R ---LLL C Q L Y E SS A LL A Q L G VLV N E G I G R L A V S Q G -----------------M G N ----------K F IV flp_14_ls ---TS C M Q Y AA G QL Q T C S Q IV A SST DE N D K VML C Q L Y E SSS LL A E L G TS I S K N V E N LL A N K G V ---------------V T A ED ----------V Q D flp_14_pv ---ST A T A S G A E S E M S C A Q L A G S DE K R ---LLL C Q L Y E SS A LL A Q L G A LV N D G I E R L A V TQ G -----------------L T R -----------AA G flp_14_ace ---S A G AA E G --A K N C N E IL S N GG D Q -Q E LLL C Q L S E SST LL A Q L G ILV S E G L D R LM Q N Q G --------------I AA I EE Q S -----G E V DE N G flp_14_tc ---S AAA P A D S VV S K S C N E IL A N GG D Q -Q E LLL CQ L S E SST LL A Q L G ILV S E G L D R LM Q N Q G --------------I AAA E G -------T E L DE S G flp_14_na ---------------------------------L C Q L S E SST LL A Q L G ILV S E G L D R LM Q N Q G --------------I A T V EE P A -----G E V DE T G flp_14_ts -----P EE K L C S Q VL SS E Q LL K ED S G S ---A QL C R MM Q IM T Q F Q T Y V R LL 
EE TT A Q A L A D R G ----------------II F D I P ---------A E I flp_14_ppe ---S L A V E S --V E SS C A Q M A G S DEE R ---LLL C Q L Y E SS A LL A Q L G A LV N D G I E R L AA T Q G -----------------L G K -----------A I G flp_14_gr ---SSS V D Q-F V H S D C A L F A GG DEE R ---LLL C Q L Y E SS A LL A Q L G VLV N E G I G R L A V S Q G -----------------M G N ----------K F IV flp_14_aca ---P G S AA E G --A K N C N E IL S N GG D Q -Q E LLL C Q L S E SST LL A Q L G ILV S E G L D R LM Q N Q G --------------IAA I EE Q A -----G E V DE N G flp_14_pt ---G TT L S AA D S G A P T C EE IM K TT N E L E P K F L T C K V Y H D S M A S A V E A I R V T K E L E H LLIL N G I ----------------A I D S N D I E S L N G S ED M EE S flp_14_ss -----------------------------K F L A C K V Y H N SI A TS L E A I R V T K E L E Q LLIL N G I ----------------II E G N E T G V S D L S N D G EED flp_14_bm ---S G C V Q Y TT G Q L Q T C S Q V F A S A T EE N G K MLL C E L Y E SSS LL A Q L G T F V S K D V E K LL A N E G ----------------V T I DD ----------V Q D flp_14_ov ---TTS V Q Y T M G Q L Q T C S Q IL A SST EE N E K V W L C Q L Y E SSS LL T Q L G A LV Y K D V K LL S N E G ----------------VII DE ----------V Q D flp_14_rs ---S D F LV P -V P S V C A Q L A G S DEE R ---MLL C Q L Y E SS A LL A Q L G A LV NE G I G R L A TS Q G -----------------L A R A P --------S V AA flp_14_mi ------L A E -N GG D N C A Q L A GG DEE R ---LLL C Q L Y E SST LL S Q L G N F V T E G I E R L AA T H G -----------------L A E --------------flp_14_mh ------L A E -N G D T C A Q L A GG DEE R ---LLL C Q L Y E SSTLL A Q L G N F V T E G I E R L AA T H G -----------------L A G --------------flp_14_mj ------L A E -N GG D N C A Q L A GG DEE R ---LLL C Q L Y E SST LL S Q L G N F V T E G I E R L AA T H G -----------------L A E --------------flp_14_mp ---------------------------------------------------------------------------------------------------flp_14a_ma ---S F Q L SS A D P I S Q S C S Q IV ANN P E G D E K VLL C Q L Y E SSS LL A Q L G A LV H D G L 
E R LML N Q G I ---------------ST D S D ----------S E N flp_14_mc ---------------------------------------------------------------------------------------------------210 220 230 240 ....|....|....|....|....|....|....|....|....|.... flp_14a_as I E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_hg A E A G R ---------E K R K HE Y L R F G K R K H E Y L R F G R K ----------flp_14_ce M E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K flp_14_cb M E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y LR F G K R K H E Y L R F G R K flp_14_od L E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K flp_14_gp A D GG R ---------E K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_ls IE K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K ----------


148flp_14_pv A E GG --------R M E K R K H E Y L R F G K R K H E F V R F G R K ----------flp_14_ace V E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K flp_14_tc V E K R K ---H E Y L R F G K R K HE Y L R F G K RR H E Y L R F G K R K H E Y L R F G R K flp_14_na V E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K flp_14_ts L N D TT ------F N I N K R K H E Y L R F G K RK H D Y L R F G K R K H --------flp_14_ppe SSS P S L S L N E AA V R M E K R K H E Y L R F G K R K H E F V R F G R K ----------flp_14_gr A D GG R ---------E K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_aca V E K R K ---H E Y L R F G KR K H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K flp_14_pt R E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_ss R E K R K ---H E Y L R F G K R K H E Y L R F G K R K HE Y L R F G KK ----------flp_14_bm I E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_ov I E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y I R F G R K ----------flp_14_rs A D A G R ---------E K R K H EY L R F G K R K H E Y L R F G R K ----------flp_14_mi K D A G R ---------E K R K H E Y L R F G K R K H E F V R F G R K ----------flp_14_mh R D A G R ---------E K R K H E Y L R F G K R K H E F V R F G R K ----------flp_14_mj K D A G R ---------E K R K H E Y LR F G K R K H E F V R F G R K ----------flp_14_mp ---------------K R K H E Y L R F G K R K H E F V R F G R K ----------flp_14a_ma R E K R K ---H E Y L R F G K R K H E Y L R F G K R K H E Y L R F G R K ----------flp_14_mc D A G R --------A D K R K H E Y L RF G K R K H E F V R F G KK ----------flp-15 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_15_ce -----------------------M Q F ST LI R V A V F A VL A I -------A T L A D Y DD N S V G T I P V A V D L D Y F S N Y V KK 
----------------GG P Q G flp_15_cb -----------------------M Q F ST L F R F A F L A VL A V -------S A F A D Y DD S I G T I P V A V D L D Y F S N Y V KK ----------------GGP Q G flp_15_ace ------T R L T A R ------D TS P M H S Y S IL R LLLVLL AA V -------C V F A E I DD ------VV S E P E Y F V P Y L KK ----------------A G P Q G flp_15_nb E N S A S G Y V R L P A Y I -----Y T Q LM H S Y S IL R LLLVLL AA G -------C I F A D I DD ------A V N E P E YF V P Y L KK ----------------A G P Q G flp_15_tc W T G N S A Y G R D S A S G P Y G Q R Q T H VLM H S Y R IV R LLLVLL AA V -------C I F A E I EE ------V A DD S K Y F N P Y L KK ----------------GG P Q G flp_15_oo ----S K L R K P S D Q F M Q V Q V G S R T F E A Q EL Q K LI P Q L EE A I Q R K D QQ L K A QQ G T V E N ---H I RR I A E L E A E V TS L Q K S K R P STT R ST LI R I C L GG P Q G flp_15_na --F H V D A K L T A R ------Y TS LM H S Y G IL R LLVLLIV A V -------C V F A D I DE ------V A S E P E Y FV S Y L KK ----------------A V P Q G


149 110 120 130 140 ....|....|....|....|....|....|....|....|.... flp_15_ce P L R F G K RR G P S G P L R F G K R SS F H V A P AA ED -V A S W Y Q ----flp_15_cb P L R F G K RR G P S G P L R F G K R S F H T A L A P ED -VV S F Y Q ----flp_15_ace P L R F G K RR D G P T G P L R F G K R SSL D Y S P L AA -QQ P H Y FF V --flp_15_nb P L R F G K RR GG P S G P L R F G K R S Q L D Y L S P L S Q P QQQ P YY L T MLV flp_15_tc P L R F G K RR G P S G P L R F G K R S A Y D Y R A L F D -QQ P YY F V ---flp_15_oo P LR F G K RR G P S G P L R F G K R S A Y D Y R A L F D -QQ P YY F V ---flp_15_na P L R F G K RR D G P T G P L R F G K R SS L D F Q P M A T -QQ P YY F LV --flp-16 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_16_tc -----------------------------------------------------M N G V E L A LL A T -------------------------------C A I flp_16_ce ---------------------------------------------------M N F S G F E F SS IV A ---------------------------------F flp_16_cb ---------------------------------------------------M S L S G F E F SS II A ---------------------------------V flp_16_cr ---------------------------------------G N T H TT R T V Q N K M N F S G F E L SS II A ---------------------------------V flp_16_rs LIL F D F R H T K ML Y R K MMI A P S AAVVMM S LLV ST G L C S A LL W QQ N E P K D K AA Q P A Q E AAAA P L FFF P AA N AAA L E Q L AA EE A Q LI A E AAA QQ AAA DEE Q E A flp_16_hg -------------------------------------------SSSS V C V P F S L G H K L F L F LL P -----------------------------IL F A V flp_16a_ac -----------------------------------------------------M N S V E LVLL AA -------------------------------C T V flp_16_as ------------------------------------------F Q I T E MA S A L A FF G FF G C IVM F ---------------------------------S flp_16_na ------------------------------------------------S L R Y N M SS V E LVLLV A -------------------------------C TT flp_16_oo 
------------------------------------------------------------------------------------------------C A I flp_16_gr -----------------------------------F E Q L T M SSSTSS H C V P SS F S R K L F L W LL P A T V T -------------------------L F TT I flp_16_pt -----------------------------------------T M N ------------FF S LLVV ------------------------------LL C G flp_16_mi --------------------------------------F P L K M N L KE QQ I Y L N I Q LL FF IL A V SS F L TT K G S E V K Q R E NN K L E Y N K N E I E R Q K E Q LI R D flp_16_hc --------------------------------------------------------V E L A F L A T -------------------------------C A I flp_16_ov -------------------------------------------K MV F TT F C L P A IML F S I S LI S ---------------------------------S flp_16_pv ---------------------------------------------------------------------------------------------------flp_16_ppe --------------H ST I A F P I F A FFI K P I H P S IL F SS K S P K L H K TT K M A L D N C L R ILL P LLVL SS F L T I S A R A S P S N W L R W G A N T N ----A N G H QQQ 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_16_tc VLL S C S N A S P V N ------D Q R --------------------LV E V S P ------EE I E R E R E LL A LL R Q E M P ES DD T PP --S K R A Q T F V R F G K R A Q


150flp_16_ce F LLIL Q L ST AA V -----------------------------L P A D Y A -Y G V A -DE M S A L P D S G S L F A ----------E Q R P S K R A Q T F V R F G K R A Q flp_16_cb LLLLI Q L SS AA V -----------------------------L P V D Y A S Q Y G V A S A DE M T A L P EE G S L F A ----------E R P A K R A QT F V R F G K R A Q flp_16_cr F LL F V Q L SS AA V -----------------------------L P V D Y A S Q Y G V A S A DE M T L P EE G S L F A ----------E R P S K R A Q T F V R F G K R A Q flp_16_rs E L AA L K P K R A Q T F V R F G K R S Q M E N E ----------------M AA G MK P --------K R A Q T F V R F G K R A P MM E G E K DE AA L Q N E K R A Q T F V R F G K L A Q flp_16_hg QQ F G M T E A S P LI ------QQ K DD S I P M P LV F D R T W Q P P ILLV P N PPP A Y F VI P A N V P L E R P I Y D Q N D S A IEDE L A EE A --K A K R A Q T F V R F G K R A Q flp_16a_ac VL F S L A R SS P V S ------D Q R --------------------LV E A S P ------ED I E R E R E L Y Q N L R Q A L A E S DE G P M --A K R A Q T F V R F G K R A Q flp_16_as Y A S VL N I P K --------------------------------D P RP E I S D V R S M DE A M Q K A Y A Q R Y R L F L E N LL S E AA L E N R L S A G D V Y AA S R P L D K R A Q flp_16_na VL F P L A H SS P TS ------D Q R --------------------L Y E GG P ------DD T E R E R E L Y Q S L R Q A L A E S Y E PP I --E KR A Q T F V R F G K R A Q flp_16_oo VLL S F P N D S P V N ------D Q R --------------------LV E V S P ------EE I E R E R E LL A LL R Q E M A E S DD T PP S M A K R A Q T F V R F G K R A Q flp_16_gr H Q S P F A V A S P F I ------P Q K DD S VVV P M TSF G Q P L PP S P L S LV P N PP L Y F V F P E N L P L E R P F DE Q N D G S EEE L A EE A M G T K A K R A Q T F V R F G K R A Q flp_16_pt L A V S IV S -A Y G -----Q D Q R F ------------------L Q S Q Y G P N ------------------F ED S V Q E A P S ----------------KR A Q flp_16_mi LI A S L T R E R Q Y S R D W QQ S QQQQ N ------------------F I N S F G P S P H L F P SS G I E W P QQQQ K I F L EE G E V EE P L EE N E K E K R A Q T F V R F G K R A Q flp_16_hc VLL A F S 
N A S P V N ------D Q R -------------------LV E V S P -------EDI E R E R E LL E LL R Q E M P A E S DD A PP S M A K R A Q T F V R F G K R A Q flp_16_ov S N AA I Y N P RR IV ------------------------F P V DEE I Q R E MI N D LLL R D Y A D R N R E Y I E K G L AA L A K NN L DD L E T L H S G S --------K R G Q flp_16_pv -------------------------------------------------------------------E W Q A M E A EEE P I G Q K AA K R A Q T F V R F G K R A Q flp_16_ppe QQ K G Q N Q NNN G V A P T A DD V E QQ K E ML N R D LL AA K Q Y F Q S QQ M P Q K P QQ L A Q Y L A M NN P M E Y EEE A E QQQQQ P M E E G DE L A EE A K A KR A Q T F V R F G K R A Q 210 220 230 ....|....|....|....|....|....|....|.... flp_16_tc T F V R F G K R A Q T F V R F G R S N P E Q M --------------flp_16_ce T F V R F G K R G Q T F V R F G R S A P F E Q ---------------flp_16_cb T F V R F G K R G Q T F V R F G R S A P F E Q ---------------flp_16_cr T F V R F G K R G Q T F V R F G R S A P F E Q ---------------flp_16_rs T F V R F G K R A Q T F V R F G K R AA P QQQQ E A E N ---------flp_16_hg T F V R F G K R A Q T K I R F G R D A Q R QQ E -M A N E Q A Q KK E --flp_16a_ac M F S G S A S GH RR S C G S G D P Y R SSS N SS P S P T H W H Q R T -flp_16_as T F V R F G K R A Q T F V R F G R D A ST H P T E -------------flp_16_na T F V R F G K R A Q T F V R F G ----------------------flp_16_oo T F V R F G K R A Q T F V R F G R S N P E Q M--------------flp_16_gr T F V R F G K R A Q T F V R L G R D T Q R Q F D G K M Q S E QQQ KK A --flp_16_pt T F V R F G K R A Q T F V R F G R S L P A R Y A M ED A E ---------flp_16_mi T F V R F G K R G Q T F V R F G R D S K H Q H N L S D QK Q L K T D K Q --flp_16_hc T F V R F G K R A Q T F V R F G R S N P E Q M --------------flp_16_ov T F V R F G K R S Q V S F R ------------------------flp_16_pv T F V R F G K R A Q T F V R F G R D V Q H V QQQ Q K A E A L N -----flp_16_ppe T F V R F G RR AQ T F V R F G R D V Q R V QQ EDE QQ KK T N -----


flp-17 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_17_ce ------ML S K LVL TT C LLL T I S G SS ----Q AA S M EE I Q S E K F C E K F P T L H M C R L K EE L T G S LV E L Q Y LL Q D G I NN -QQQ A G A Q E V Q K R K S A F V R
flp_17_cb ------ML F K LV F I A LL F A SS Y G --------A S M EE IQ S E K F C E K F P T L H M C R L K EE L T G S LV E L Q Y LL Q D G I N V GG QQQ A GG V Q E V Q K R K S A F V R
flp_17_ace ----A E A S L R VM W C V Y FFF S LIV C S ----F AA N ED N L S EE F C R Q F P S L H L C R L H D NL Q G S LV E L Q Y LL Q D S N I E I AA P --V S P V E K R K S A F V R
flp_17_na ----------D M W C V FFF L S LIV C S ----F A T N ED N L S EE F C R Q F P S L H L C R L H D T L Q G S LV E L Q Y LL Q D NN V E N A V P --VN P M E K R K S A F V R
flp_17_oo ------------------------S ----L A SSS D A Q L S E Q F C R Q F P S L H L C R L H D T L Q G S LV E L Q Y LL Q D T N V E T G E P --G I T A Q K R K S A F V R
flp_17_ss P TS F I T VL F Y S L S IVV S L T L SL P A M E S HH A P N P TS D Q I S A Y E M F C K D Y S H L Q L C K L E F T L QQ A L A E L Q Y IIL N DD P I DD S E N ----F K T K R K S A F V R
flp_17_aca ---------------Y FF L S LIV C S ----F AA N ED N L S EE F C R Q F P S L H L CR L H D N L Q G S LV E L Q Y LL Q D NN I E I G N P S AA T N P M D K R K S A F V R
flp_17_hc --------------------S P A H S -------P A L E MI S C R S N S VV SS L H I C A V Y M T H F K G LLL N C ST Y F K T P I S I QQ H Q E ---AA L K R KS A F V R
flp_17_xi -------------------------------------------------EE L C A F R G L L E G T L RR V N Q A L G S ----T D A D -----V E K R K SS Y V R

110 120 130 140 150
....|....|....|....|....|....|....|....|....|....|....|
flp_17_ce F G K R S A P EEE A M E M E K R K S A F V R F G R S F G M E P Q I T E K R K S Q Y I R F G K ------
flp_17_cb F G K R S A S DEE G M E M E K R K S A F V R F G R S I G M E P Q F T E K R K S Q Y I R F G K ------
flp_17_ace F G K R AA -DD A M E V E K R K S A F V R F G R S A P I E -T P E K R K S Q Y I R F G R K -----
flp_17_na F G K R AA -DD AA E I E K R K S A F V R F G R S V P V D -V P E K R K S Q Y I Q F G R K -----
flp_17_oo F G K R S I -DE A G D V E K R K S A F V R F G R S A P F D -ML E K R K S Q Y I R F G R K -----
flp_17_ss F G K RS N -DDD ML F D K R K S A F V R F G R S V ED P I N G Q K R K SS Y V R F G -------
flp_17_aca F G K R AA -DD A M E V E K R K S A F V R F G R S V P I E -T P E K R K S Q Y I R F G R K -----
flp_17_hc F G K R AA -EE S A E IE K R K S A F V R F G R S --A E F D M P E K R K S Q Y I R F G R K -----
flp_17_xi F G R SS A --D L AA P E K R K SS Y V R F G R S D P S G E F EE K E K R K S A LV R F G R S D G V D M E


152flp-18 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_18_ce -----------M Q R W S G VLLI S L CC LL R G A L A Y T E P I Y E IV EED I P A ED I E V T R T N E K Q D G R V F S ---------------------------------flp_18_cb -----------M Q R W S G VI F I T L CC LL R E A S A Y T E P I Y E IV EDD I P A ED I E V T R G N E K Q D G R V FN ---------------------------------flp_18_df ------------M W R V ST A P LILL A VL A S AA D P EDE V Y D L P DD K Y T E A M T LL G I S P Q A Q -H I Y A ---------------------------------flp_18_od ------------M W R V ST A P LILL A VL A N AA D L K E Q V Y D L P EEE Y S DD L K LL D I A P L A K -H V SS ---------------------------------flp_18_tc ------------M WR V ST A S LVLL AA V A Y AA D L EE Q V Y D V P DEE Y T E A L T LL G I G P E A Q -H I Y A ---------------------------------flp_18_ace ----------------------------N AA D L EE Q V Y D L P E G E Y P DDE T LL G IV S Q A Q -H V S A ---------------------------------flp_18_aca -------------------P LILL A VL A N AA D L EE Q V F D L P E G E Y PDD K T P L G IV S Q A Q -H V S E ---------------------------------flp_18_oo ---------F V S C GG C Q R SS L K LL AA V A Y AA D L EE Q V Y D V P DEE Y T E A L T LL G I G P E A Q -H I Y A ---------------------------------flp_18_ss ---------------------------------------------------------------------------------------------------flp_18_hc -----------V S C GG V D G F S NNN S RR L A R P M K N K Y T IF LM R N I Q K L H C W V S V Q K H ---------------------------------------flp_18_as ------------MV E L AA I A V H L F A IL C I S V S A E I E L P ---D K R A Q F DD S F L P YY P SS A F M D S DE A I -V A V P SS K P G R Y --Y F D Q V G L D A E N A M S A flp_18_na RR PD P S NN F T L G VM W R V ST A P LLLV A VL SS A T D L EE Q V Y D L P GG E Y T E Y A S W L D T I P Q S Q -H I SS K R V D --M D S D M P G V F R F G K ------R ED H V flp_18_gr L F V S A W R M F A L F V F P L P LLLL C VL QR I S 
AAAA E A S A E P G M D V T K R M F S Y A D LM NN F GG P S P L D F V T G N G YY ML DEE R P K R E ---DE A V E LL T A W K R SS flp_18_gp ---------------------------------------------------------------------------------------------------flp_18_mj -NN T MI C Y F Q M F IL L P S L F L C F G E I F A N G E A G H NN EE I GM E K R ML S Y A D LM NN F GG P S A L D L A G Q G -LLI DDE R P K R E Q N Y NN DD V H IM T L W K R S P flp_18_mi -NN T MI C Y F Q M F IL L TS L F L C F G E II G N G E A G H NN EE I G M E K R ML S Y A D LM NN F GG P S A L DL A G Q G -LLI DDE R P K R E Q N Y NN DD V H IM T L W K R S P flp_18_mh ---IML C Y L Q M F IV L TS LLL C F G E K F A N G E A G H E Q EE I G M E K R ML S Y A D LM NN F GG P N A L D L A G Q G N S LLI DDE R P K R E N Y NN ED VH IM T L W K R S P flp_18_mc A N S K R M F G Y V N F LIV A L SS LLI C L G E K IV Y G E A V N K QQ E I G M E K R M F S Y A D LM NN F GG P S A L D LV E P G N A LLL DDE R P K R EE N Y NN DD V H IM T L W K R S P flp_18_ppa ----------------------------LV SSS E SG L D ---D A E S LMI E N M Y P S Y ED Y H L Q D -----------------------------------flp_18_ov ------------------A E Y L NN Y LI A T M E TTS L T M H MV K LVIIII T I TT I TTT F G V Q M D N L N S Y R P ----P N D A Y E F V G P E LLM -----------flp_18_xi ----------L P D R H Y P LI F R N K IVLV F S F I S KST A E N M D SS K M A IMV G LLV F A I C V F S A V S E A R ------------------IV P N E R T A V E G AA S flp_18_ts ------------------SSS I F I Q LL A S P I S L S L S F T F L E Y T Y S R I F Y S R F Q I R K I A P NN QQ K S ---------------Y I S L Q S IM F VL G VVF 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_18_ce -----------------K R D ---F D G A M P G VL R F G K R GG V W E K R E SS V Q KK E M P G VL R F G K R A Y F DE KK S V P G VL R F G K R S Y F DE KK S V P G VL R F G flp_18_cb -----------------K R D ---F D G A M P G VL R F G K R GG V W E KR E SS V Q KK E M P G VL R F G K R A Y 
F DE KK S V P G VL R F G K R S Y F DE KK S V P G VL R F G flp_18_df -----------------K R D ---L N G D M P G VL R F G K ------R Q E S E KK D V P G I F R F G K R ----A N KK S V P G VL R F G K R ------SV P G VL R F G flp_18_od -----------------K R D ---M D GG M P G VL R F G K ------R E G F A E K E V P G VL R F G K R ----S N KK S V P G VL R F G K R ------N V P G VL R F G flp_18_tc -----------------K R -----D GG M P G VL R F G K ------R E N G V E KK E V P G VL R F GK R ----T N KK S M P G VL R F G K R ------N V P G VL R F G flp_18_ace -----------------K R D ---F D N G M P G VL R F G K ------R ED IV KK E V P G V F R F G K R ----S N KK S V P G VL R F G K R ------S V P G VL G F G flp_18_aca -----------------K R D ---F D N G M P G VLR F G K ------R E S IV KK E V P G V F R F G K R ----S N KK S V P G VL R F G K R ------S V P G VL R F G flp_18_oo -----------------K R D ---L D GG M P G VL R F G K ------R E N G V E KK E V P G VL R F G K R ----T N KK S M P G VL R F G K R ------N V P GVL R F G flp_18_ss T E F M ----------------------------------------------KK E M P G VL R F G K R K Y N G QQ KK A V P G VL R F G K R ------G V P G LL R F G


153flp_18_hc -------------N I ST P N D Y --F I GG M P G VL R F G K ------R E N G V E KK E V P G VL R F G K R ----T N KK S M P G VL R F G K R ------S V P G VL R F G flp_18_as R E ---------------K R G F G -DE M S M P G VL R F G K R G -------------M P G VL R F G K R E ---NE KK A V P G VL R F G K R G -----D V P G VL R F G flp_18_na --------------------------------------------------KK S V P G VL R F G K R ----S N KK S V P G VL R F G K R ------S V P H MLLL G flp_18_gr P F RR G F L N G V Q H N Y LM -KK D ----E F V A P G VL R F G K R --------------M P G VL R F G K RG P -Q H E KK A V P G VL R F G K R ---------------flp_18_gp -------G V Q H N Y LM -KK D ----E F V A P G VL R F G K R --------------M P G VL R F G K R G P -Q H E KK A V P G VL R F G K R ---------------flp_18_mj S Y G P S FF N T A G D L T ---KK D ----D F I A P G VL RF G K R --------------M P G VL R F G K R D R VVI Q E KK A V P G VL R F G K R Q S Q E S G A V P G VL R F G flp_18_mi S Y G P S FF N T A G D L T ---KK D ----D F I A P G VL R F G K R --------------M P VL R F G K R D R VVI Q E KK A V P G VL R F GK R Q S Q E S G A V P G VL R F G flp_18_mh S Y G R S FF STT G D L T ---KK D ----D F I A P G VL R F G K R --------------M P G VL R F G K R D R VV Q E KK A V P G VL R F G K R Q A Q E S G A V P G VL R F G flp_18_mc S --S Y L N M A SD L T ---KK D ----E F I A P G VL R F G K R --------------M P G VL R F G K R D K VV Q E KK A V P G VL R F G K R Q A Q E S G A V P G VL R F G flp_18_ppa -----------------K R D ------S M P G VL R F G K R ----------------------------------A Q A F V R F G K R L S P ------------flp_18_ov -----------------KK D -----------L F R F G K D N S Y N Q K F A P S IM E F ------------DDE Y E KK A V P G VL R F G K ----------------flp_18_xi D --------------E A Q I D P N F G Y G L Q V P G L F R F G K R -------------H F P G MM R F G K R S ----------------------------------flp_18_ts L C S I S F G S C T E A D M D N F K E N --------I P G R F L W KK ------------Y D A P G LM R F G K R V 
----------------------------------210 220 230 240 250 260 270 280 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|... flp_18_ce K R D V -P M D K R E I P G VL R F G K R D -Y M A D S F D K R S E V P G VL R F G K R D V P G VL R F G K R S D L EE H Y A G VLL KK S V P G VL R F G R K----flp_18_cb K R D V -P M D K R E I P G VL R F G K R D -Y T EE M F D K R S E V P G VL R F G K R D V P G VL R F G K R S D L EE H Y A G VLL KK S V P G VL R F G R K ----flp_18_df K R --------E M P G VL R F G K R A ------------V P G ML R FG K R F S A P G MM R F G K R ST Y D T F P V E I F D KK S V P G VL R F G K -----flp_18_od K R --------E I P G VL R F G K R S ------------A P VV F R Y N K R Q N I P G VM R F G K R A M Y D VI P L E LL D KK S V P G VL R F G K -----flp_18_tc K R ---------E M P G VL R F G K R G ------------M P G VL R F G K R H E I P G MM R F G K R S A Y D T I P L E LL D KK N V P G VL R F G K -----flp_18_ace K R --------E M P G VL R F G K R S ------------T P G VL R F G K R H D I P G VM R F G K R T V Y D MI P V D LLE KK S V P G VL R F G K -----flp_18_aca K R --------E M P G VL R F G K R S ------------T P G VL R F G K R H D I P G VM R F G K R T V Y D MI P V D LL E KK N V P G VL R F G K -----flp_18_oo K R --------E M P G VL R F G K R G ------------M P G VL R F G K R H E IP G MM R F G K R ST Y D T I P L E LL D KK N V P G VL R F G K -----flp_18_ss K R D -------D M P G LL R F G K R D -----------Q I P G LL R F G K R G D M P G VL R F G K R P S Y DD F LI D --KK D M P G LL R F G K -----flp_18_hc K R --------E M P G VL R FG K R A ------------M P G VL R F G K R T E I P G VM R F G K R ST Y D T I -----------------------flp_18_as K R S -------D M P G VL R F G K R ------------S M P G VL R F G RR -----------------------------------------flp_18_na K R ---------G A D V F R F G K R N ---------N G N MI S F P VLV KK S V P G ML F W K I K Q RG E V R A M S LL DD TT Y D ----------flp_18_gr ---------A E V P G VL R F G K R -------------M P Q VL R F G 
-------------------------------------------flp_18_gp ---------A E V P G VL R F G K R -------------M P Q VL R F G -------------------------------------------flp_18_mj K R ---------------------------------M P Q VL R F G K ------------------------------------------flp_18_mi K R ---------------------------------M P Q VL R F G K ------------------------------------------flp_18_mh K R ---------------------------------M P Q VLR F G K ------------------------------------------flp_18_mc K R ---------------------------------M P Q VL R F G K ------------------------------------------flp_18_ppa ------I E KK E M P G VL R F G K R --------N E KK S V P G VL R F G K R ----------S V G M D S D M E T LI F KK S V P G VL R F G R K ----flp_18_ov ---------R E I P G VL R F G RR ----------S ED V P G VL R F G KR -------------------------S E P G VL R F G RR ----flp_18_xi ---------------------------------------------------------A P A ED N L N F P -------------------flp_18_ts -------------------------V Q G Y D R Y DD S A P G LM R F G K R S Y F --------------------------------------


flp-19 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_19_ce ----------M S F Q L T L F S ML F LLI A VVV G -----Q P I Q S Q N G D L K M ----------Q A V Q D N S P L N M E A F N DD S A L Y D Y L E Q S D P S L K S M E K R W A N
flp_19_cb ----------M S F Q L T L F S MLLLLI A VVV G -----Q P I Q S P S G D LR V ----------Q A V Q D N S P L S M E A F N DD P A V Y D Y I E Q S D P T F K VM E KK W A N
flp_19_tc ----------D G F V Y E H F P --F LI HH Q E K -----Q Y S L N R -----------------T F L N L ST MMI G Y T D I N K Y P H I H G P -----Y Q K R W A N
flp_19_na ----------P SS ML R H LLLLL F VIV C VL A -----Y P S L D -----------------D A R Q D V D V A Y L D S W Q D T P ---FF Q G -P ---Y Q K R W A N
flp_19_hg ------------------------------------------------------------------------------------G ---------Q W A S
flp_19_ppe -L Q D T E M T -K N S II T ILM A LLILV N --------W K V C A K I D G --------------L N L EE L D NT IL P Y A ED W S P E N E W T L D P ----L R L KK W ST
flp_19_di -----S L Q N --YY F I A F S P VLVL A Q N --------A LL DE T N E Y R ------------Q D L W P Y M DD M K -Q R P N D L S N LV YY D P -----R L K R W A S
flp_19_mi ---ST K M F N L K Y F L Y LILF V F L F Q I T K C Q G Q G A G R W I G K N E I E G SSS V D GGG KK LV D S P L N V E A L D N IV S P Y A I G W S P E N E W T M D P ----I R L KK W ST
flp_19_sr D M A Y SS IL N K I T F L F L G F ILLI T A E L N K NN -----A V S EQ F T A N E G V ----------E S F N P Y L Y Q I KK F Q Y P Y DDD M S L YY Q M N E Q I P I R D R K W A S

110 120
....|....|....|....|....|..
flp_19_ce Q V R F G K R A S ------W A SS V R F G --
flp_19_cb Q V R F G K R A S ------W A SS V R F G --
flp_19_tc Q V R F G K R AS -----S W A SS V R F G --
flp_19_na Q V R F G K R A S -----S W A SS V R F G --
flp_19_hg Q V R F G R K -------------------
flp_19_ppe Q L R F G K R AA A I R -P W SS Q V R F G --
flp_19_di Q L R F G K R A N ------W A S K V R F G --
flp_19_mi Q L R Y G K RA V S A F RR S P W SS Q V R F G --
flp_19_sr Q L R Y G K R SS ------W A S Q L R Y G KK

flp-20 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_20_ce ML S L E V A V K F E F H R N F S E Q ML G Y T Q S R VVI T LLL F S V F L A V C M A T P S G Y P G Q E L Q N V S DD Y P I Y E EE G L Q L S A E G T DE P H EE K R A V F R M G K R A MM R F G K
flp_20_cb -------------------M G H SR S R F VI A LLLL S VLI A I C V AA PP A I S L Q D L P -A ED Y P F L E ED VL D L P S D G T D A P I A E K R A V F R M G K R A MM R F G K
flp_20_tc ------------------------------------------------------------------------------------R SS AAA K R A MM R L G K
flp_20_oo ---------------------------Q L S C F CC A IL P S Y VV A L P S Q Y ----------L E R P S F G S E LLL W R S PP V S V S N I S K TSS AAA K R A MM R L G K


flp_20_aca ------------------------------------------------------------------------------S Y P S W K S P A T I E K R A MM R L G K
flp_20_ace -------------------------H G S H P C L CC F IL A S LVV A F P N QQ ---------L K R Q S L ED G VV P W R LL Q Y Q S Y P S W K T P T MV E K R A MM R L G K
flp_20_na ------------H FF E G M A Q K F T Q T R T FI T C L CC F VL SS F VV A H Q N Q S ---------F S R Q S F D GG IV P W I L S L Y Q SS Q S W K S P S I P E K R S IM R L G K
flp_20_hc -----------------------------------------------------------S R D N V P ----W ----W S Y F T A C R H N V A V K R A MM R L G K
flp_20_pt -----------------------------IMII F M SSS Q TT V AAP N ----------------------------------------------------

110 120 130 140 150 160
....|....|....|....|....|....|....|....|....|....|....|....|....
flp_20_ce R A MM R F G K R S V F R L G ------------------------------------------------
flp_20_cb R A VM R F G K R S V F R L G ------------------------------------------------
flp_20_tc R VM S H Y Q K R A IM R L G K -----------------------------------------------
flp_20_oo R VM S H Y Q K R A IM R L G K -----------------------------------------------
flp_20_aca R A M K N L E K R A MM R L G K -----------------------------------------------
flp_20_ace R AM K VL E K R A MM R L G K -----------------------------------------------
flp_20_na R A T I H Y H K R A MM R L G K -----------------------------------------------
flp_20_hc R L E T Y H R K R A IM R L G K -----------------------------------------------
flp_20_pt -------R VMM R F G K R F SS F E N E H P Y N L H P LL Q F E G S P E N I Y R L TS Y F N R N Q L Y P MI P EI D M

flp-21 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_21_ce -----------------------------------------M R --L F ILL S C LL A W VL AA P Y I D Q E -----D A L R -------------VL N A Y L
flp_21_tc ------------------------------------L P S L R M R I S G F LV FF A C IV A W A F AA P V S D T E -----AA Y R -------------IL N K Y L
flp_21_hc --------------------------------I R F N Y P S L R M R I S G F LV FL A C IV A W A F AA P V S D T E -----AA Y R -------------IL N K Y L
flp_21_na -------------------------------------G F C R M R V S G F LVLI A C IV A W A F A S P V S D T E -----AA Y R -------------IL N K Y L
flp_21_oo --------------------------------------P ------------C S IV A W A F AA F P S N T E -----AA Y R -------------IL N K Y L
flp_21_aca -----------------------------------------R M R V S G F LV F I A C II A W A F AA P V S D T E -----AA Y R -------------IL N K Y L
flp_21_rs ------------------SS A LL S LL C LVL S C S L A L C F Q F SS A G R P L F R M A P K A R P A E P AA D VM D L G R ----Q NNN -------------E L A E L L
flp_21_mh -----SSF P L R P H F LLILLL S LIVLI N L T E A K T Q R IL K P F S Q F D N S Q L S Q K EDEE Q S DD I Q N IL EEE N ------NNN --------------I N DE L
flp_21_ppe F LI SS A N F P II K S Q I T K M P SSSSS N L SSS ML SSSS L S P F A V S R H S IV G R P ILLL C LLVL A L SS L R TS AAK P Q P S A S A Y N P AAAAAA M P F P A N A VI D V P F
flp_21_bm -----NN L F S K S A L F S L S L S Y LV S Q P S LLL ----Q C Y K LL R M N LIVL S ILLI T L F QQQ Y F A Q V A PP N -----D L F Y ------------E F M N P Y M
flp_21_ss ---------------------------------E L K M A K Y PY C L C LI G IIIIL SS N Y T Y T I P I D NN E -----N Y F I R ------------Q L R E F P V
flp_21_ppa -----------------P LV PP T L R PP -------I H I Q E M R F S L A S IL A LL A VIV G L T M G V PP A D T E -----A T M R -------------LL R N -L
flp_21_hg LL S V SS Q M SS G Q S L S LLVF S V F L A IL C S F C SS F P L QQ M P S Q S R P M F G K I G TS A Q IV A N VL P E L Q MV E P F P F D Q I S D N S ------------QQ L K S E W


110 120 130 140 150 160
....|....|....|....|....|....|....|....|....|....|....|....|
flp_21_ce E Q -----------------F G P G S D R V Y -Y V A EDD H G S M K R G L G P R P L R F G ---
flp_21_tc Q R -----------------F SS DD L Q D V Y -P Y G -D H G S N K R G L G P R P L R F G ---
flp_21_hc Q R -----------------F SS DD L Q D V Y -P Y G -D HG S N K R G L G P R P L R F G ---
flp_21_na E R -----------------F GG DD L Q D V Y -LV G -D H G S Y K R G L G P R P L R F G ---
flp_21_oo Q R -----------------F SS DD L Q D V Y -P Y G -D H G S N K R G L G P R P L R F G ---
flp_21_aca Q R ------------------F GG DD L Q D V Y -LV G -D H G S I K R G L G P R P L R F G ---
flp_21_rs A E -----------------M G P A Y W L D AA EE P Q K -R AA D K R G S L G P R P L R F G RRR
flp_21_mh E R E ---------------K L G E L F M G MII P N R Y Q -R A L T I K RG S L G P R P L R F G R --
flp_21_ppe Q M ED G Q S G E LIL E P V A DE AA M A P S AA G S R Y S A F VL P Q R Q G Q K R G S L G P R P L R F G RR -
flp_21_bm N S -----------------L R S P D V N IL S -S Y A -DE R S W K R A L G P RP L R F G ---
flp_21_ss EE F ----------------V G R P I YY D G Y F I E N A D P R I S F K R A V G P R P L R F G ---
flp_21_ppa N R ------------------Y S Q F L DD L D -L Y N --D S P I K R A S L A G P R P L R F G ---
flp_21_hg D G N G R GG E ------V Q T QR L Q F SS F P H F L F P V E V K R Q M DD T K R G S L G P R P L S F G RR -

flp-22 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_22_ce ---------------------------M N R S MI A L C VVLMV S LV S A Q V F D L D G QQ L A G L E Q N D A R L -------------------M E QQ V K R S P S -A
flp_22_ppa -----------------------------------LV T L S F S L E S -----S D F S I D N -------------------------------W K R S P S -S
flp_22_tc --------------------------D R M Q R LL A LVMV C LLV A V C TS Q D VDD L P V E R ------------------------------A L K R T P S -A
flp_22_oo -------------------------------------C G R L H IL P ----I DD L P VV R V E R ---------------------------A L K R T P S -A
flp_22_aca ----------------------------------LVLV C LL A VV C T A Q D VV ED L P VV R A N A P --------------------------S F K R T P S -A
flp_22_ace ---------------------------------------------------------------------------------------------------
flp_22_ss ----------I F K M T -------K F L N L KII F I F T -I T L N I F S H LI T P F D F SS E F E N DD S M D -----------------------L A R VV R A P N -V
flp_22_pt -GGG G G E F Y I -RR G K N P LL G -K T N T LIV Y C F L F ST I A S F D L S F R F D A D NN L D -----------------------L T R VVR A P N -V
flp_22_rs -R P S A I A T HH Q N ---MV P S V QQ H LI S V R P LLLL A L A TT MLV T L SSS A L G V D A I P Y N P QQQ L Q R Y -A P I Q S L F D --A EE A LL DE P F A R AA R E P G -V
flp_22_ppe N LL Q S I S F T P KK S P I SS MAA M K L S A P M S I H S LIV A T M A LM A L SS FF C N S N G V Q A L P Q F SS Q L RR Y A Q A P I Q S LL D G S I DD F D M AA D P F Y R AA R E N G -V
flp_22_gr ------F V Q S MM AA S A VL P N T L S P Q F S F R W H F L P LLLL ALL T I S --D F A S C H S V P T LI A F D P AA Y A K L RR Y T P I Q S F L S DDE AA F E R AA R Q P A GG V
flp_22_hg -SS I A S P I K MVVVL S A L SS G T F PP H F S L R W H S L S F LLV F ILL T V C F S A D F T N C H S V P S LM A FD P --T K L RR LM P I R E F L S DD Q L A F E R A I R Q P A GG V


110 120 130 140 150
....|....|....|....|....|....|....|....|....|....|
flp_22_ce K W M R F G K R S P S A K W M R F G K R -S P S A K W M R F G K R S G A E A V S E Q D Y ---
flp_22_ppa K W M R F G K R S P N A K W M R F G K R -A P S D K W M R F G K R AA LM EDD V E Y ---
flp_22_tc K W MR F G K R S P N A K W M R F G K R -T P D A K W M R F G K R G E W N L ---------
flp_22_oo K W M R F G K R S P N A K W M R F G K R -T P D A K W M R F G K R G E Y E F D G Y ED L E --
flp_22_aca K W M R F G K R S P N AK W M R F G K R -S P E A K W M R F G K R S D Y E F E G DDD L Y --
flp_22_ace ---------------------S P E A K W M R F G K R S D Y E F E G DDD L Y --
flp_22_ss K W M R F G K R G S -YY NN Y D K R -S P Q V K W M R F G K R Y D TS N E IQ N Y ---
flp_22_pt K W M R F G K R G L -Y D P S I D K R -S G Q V K W M R F G K R S E P L G E I G NN Y ---
flp_22_rs K W M R F G K R A P Q G K W M R F G K R -A P N A G K W M R F G K R T E A N G V E R M E M E--
flp_22_ppe K W M R F G K R T P Q G K W M R F G K R -A P S A G K W M R F G K R S DE A G I Q A D Y T E Q I
flp_22_gr K W M R F G K R T P Q G K W M R F G K R T M A T E GG K W V R F G K R A EE M Q N D Q ------
flp_22_hg KW M R F G K R T P Q G K W M R F G K R K M A I E GG K W V R F G K R A EE M Q G EE ------

flp-23 Alignment
10 20 30 40 50 60 70 80 90 100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
flp_23_tc Q MM R T L F VLIVLM A I S H A V ----------------F G T R G A L F R S G R S L S P I D -D G Q -------------------------------D F L R F G R T
flp_23a_ce -MLL P K I S ILL Y ILVVL Q E T ---------------AA V R G A L F R S G R A V P F E R VV G QQ ------------------------------D F L R F G R A
flp_23_cb -MMN R Q F LLV F V A I C VL S Q N --------------A S A L R G A L F R S G R S LL H R Q N L N Q E A V A H Q T ---H LI Q S --------E K R S P IV E ------

110 120 130 140
....|....|....|....|....|....|....|....|....|...
flp_23_tc A P L P S V S -PP W T W R S -------------LM D A Y LL K ---------
flp_23a_ce G M A S G V GGG S E GG P DD-------------V K N S Y I R V N G E P E IV Y Q
flp_23_cb ----------------L Y P VV D S G N V E P E A F P S Y F R F ---------

PAGE 172

158flp-24 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_24_ce -----------------------ML SS R TSS IILIL A ILV A IM A V A Q C R N I Q Y D V EE M T P -E AA F R Y A Q W G ----E I P H K R V P S A G D MMV R F G K R S I flp_24_cb -------------------------M S R TS IILVL A I F V A I AA I A Q C R N I Q Y D V DE I S P -E AA F R Y A QW G ----E I P H K R V P S A G D MMV R F G K R S V flp_24_ace -----------------------R G F S L R S IVL A VL S I A F LI C VI D A R IV D Q Y DD H M A I P V G AA D Y R L R G W DD Y --IV P H K R V P S A G D MMV R F G K R S V flp_24_na ------------------------Y V D L R S IVL AVL S I A F LI C VI D A R IV D Q Y ED H M A V P Y V A G D Y R L R A F DD F --N F P H K R V P S A G D MMV R F G K R S V flp_24_oo -----------------------------------L F AA F LI C A V D A K VLL P Y D H E F Y P --G D Y R L E S F G D F --L A T H K R V PS A G D MMV R F G K R S V flp_24_aca --------------------------------L AA L S I A F LI C VI D A R IV D Q Y DD H M A I P V G AA D Y R L R G W ED Y --IV P H K R V P S A G D MMV R F G K R S V flp_24_as ------------------------M F S L K A IVMI A LVVI C T F C I S E S RR F H DDD F S R Q FL F R G I DE P L K N Y M R L R E A R IL S K R V P S AA D MMI R F G K R S flp_24_ov ---------------LLI D Q S Y F A M S A S K ILII A VLLII N H F C F N D S K R L Q D N D F A R Q F L F R GG F E P M K YY M S P Y DD V Y T V K R V P S AA D MMIR F G K R S R flp_24_bm -------------------------------Q F C LLL T I S G N -T D S K Q L Q D S D ----L F R GG F E P M R YY F N P S D S Y I G K R P N P A D MMI R F G K R S A 110 ....|....|... 
flp_24_ce ------------flp_24_cb ------------flp_24_ace ------------flp_24_na ------------flp_24_oo ------------flp_24_aca ------------flp_24_as --F IE Q D M E --flp_24_ov T Y D V P I G D L N DD flp_24_bm T F D A P T G D L N DE flp-25 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_25_ce ---------------M S H N S MI Y LLV A F LVLL C A TT E A ---KK E C S -I D C Q ED G S AA V D L G LVL PP E L Y E ST R ----------------------flp_25_cb ---------------M S N S MI Y LLLVV A VL T T VV D S ---KK D C SS I R G C DED -V S P V D L G LVL PP E L Y E ST R ----------------------flp_25_na ---------------PS H R F L A R P M H STS VLLLLVVIV ---V D M C G ---C Y L Q -----------P C TT Y D C Q P ----------------------flp_25b_gr --C F P E MI A G I R C H RRRRR H L P S LI T A I G AA L C V AAA S Q P S V N A L AA R A Q H R S P A LV D IV E P Y N E Q E G S P Q F R P S A L ALL C A Q S M P S G Q L A K V C A H L T G flp_25_ss --------------K D P N D R F Q R S D K T D P E G F S Y D F V R --F G K R N P ---L YY N ----------KK S Q P Y DD LI G ---------------------flp_25_mc ----G VL D I N E V K C LL P L N G Q I K F L K E I E SSS D F L CP Y N II C E P N C P CC Q L SS C D C K ST C P K A C E C F R D L N F T K N VV Q C N G KK N ---DE L D L Q E L P M Q flp_25_mj F N F K V G V P I F Y S N SS IL I Y L Y IM F L F W MV S L FF I A S V ST F S P V H K S EEE I F MP Y ED SS L N D L Y LI TS L N K R P K H LI N IL S L P N QQ T N K L Q L S P K H L K F L

PAGE 173

159 110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_25_ce ----L S N L ---------------------------------------------------------------------L A R P SS Q F K M K R --D Y D F V R flp_25_cb ----L A N L ---------------------------------------------------------------------L A R P SS Q F K M K R --D Y D F V R flp_25_na ----T I DE -------------------VL R ---------------------------------------------------M E Y K P K R --N Y D F V R flp_25b_gr IVLM P S G N G Q T R P ILI G P E G I G IG M Q R K R H F W S G R I AA V E G R A Q N R A GG A I A G A V ---------------------------------K R --A Y D Y I R flp_25_ss ----------------------------A F N Q VL H F E R GG E Q D -------------------------------------------------------flp_25_mc SS H ILL S N L N I S VL KK S E FF G M G R LI E L H I N SS N I Q II E P S A F N T I NN L KS L H L S G NN L E K I N G DE F TST Q T L Y ML P EE Q E K QQ I K T P E K R --D Y D F V R flp_25_mj C N K F L K NN S K ------------E Y E R LL F L S K N I N L C R K EE K F M T K IL G N E ------------------------------------K R SSSS Y D F V R 210 220 230 240 250 ....|....|....|....|....|....|....|....|....|....|.. 
flp_25_ce F G R AA P I------KK A -----S Y D Y I R F G R K -----------------flp_25_cb F G R S A P M ------KK A -----S Y D Y I R F G R K -----------------flp_25_na F G R S G P A ------KK A -----S Y D Y I R F G K R SS D R M D R D S L R SS A L E Q flp_25b_gr F G RR S AA V H L A QQQ KKK S E H S AGG T Y D Y I R F G -------------------flp_25_ss --------------K R S D L E G T N Y D F V R F G -------------------flp_25_mc F G R S D ---------------------------------------------flp_25_mj F G RR N S F Q M D G K P N KK S N G NN G N T Y D Y I R F G -------------------flp-26 Alignment 10 20 30 40 50 60 70 80 90 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_26_ce ----M K VM F ML A LL F SS LV A TS A F R L P F Q FF G A N ED F N S G L T K R N YY E S K P Y K R E F N A DD L T L R F G K R GG A G E P L A F S P D ML S L R F G K flp_26_cb ----M K A VLLI A ILL G S I AA V S A F R L P F QFF G S Q ED F N S G L A K R N YY E S K P Y K R E F N A DD L T L R F G K R A G A G E P L A F S P D ML S L R F G K flp_26_na VM Y P G R L F LLI S VLI G S -A C R A L Y I P H D V H E V F V S N Y D K R S R M E L E GY R P D K R E F N A DD L T L R F G K R S G D --M A F H P N D L A L R F G R flp_26_ace --A P A R LL F LI S VLI G S -A C R A L Y I P H D M Q D L F L N T Y D K R S R M T L E G Y R P D K R E F N A DD L T L R F G KR GG E --M A F H P N D L A L R F G R flp_26_aca ---------LI S VLI G S -A C R A L Y I P H D M H D L F V N T Y D K R S R M T L E G Y R P D K R E F N A DD L T L R F G K R GG E --I A F H P N D L A L R F G R

PAGE 174

160flp-27 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_27_ce -----------M F S L T Q IL T F LLV A I T LM T F SS A Q P I DEE R P I F M E RR E -A S A F G D II G E L K G K G L GG R M R F G K R ---------SSS P D I S L A E M flp_27_cb -----------M F S F R K F L A F MLIVI A LM A S F SS A Q P I DEE R P I FM E RR E -A S A F G D II G E L K G K G L GG R M R F G K R ---------SSS P S D I S M A E L flp_27_na E T V S Y S R Q R Q S K M Q P Q H T LLI T VL A I F VL AA F I P ST V E A Q E Y G P ILM N RR D -LL P Y G E IV S E L K G K T M GG RM R F G K R ---------S G N A L H F V P A VV flp_27_aca ------Q R Q L R M Q S P N A LLV S ML A VLVL A T LI P C N A E A Q E F R P ILM N RR D -LL P Y G E IV S E L K G K T M GG R M R F G K R ---------S M N P Y Q F I P I E A flp_27_mj -----------------R L S L AF LL F S LILLI N T I N S K P H P N G L P MV RR D G N GG D L DD LL T E F R A K -G S R M R F G K R SS SSSS P R F SSSS A D V D F I E A flp_27_mp -------R K N L K F S I K M R L S L A F LL FF LILLI N T I N S K P H P N G L P MV RR D G N GGD L DD LL T E F R A K -G S R M R F G K R SS SSSS P R F SSSS A D V D F I E A flp_27_mc --------E K YY F L F K M K L Y L A FF L FF II S LI N K I N S K P H P N G L P LV RR D AA GG D L DD LL T E F R A K -G S R M R F G K R --SS F SS LY SSSS A D S D F I E A flp_27_mh --------------------L S FF L FF LI S LI N T I N A K P H P N G L P MV RR D AA GG D L DD LL T E F R A K -G S R M R F G K R --SS H S P R F SSSS A D V D F I E A flp_27_mi --------------I K M R L S L A F LL FF LI S LI N T I N S K PH P N G L P MV RR D G N GG D L DD LL T E F R A K -G S R M R F G K R SS SSS P R F SSSS A D V D F I E A flp_27_rs -P GG Y R L A R Q N A S A K I A D L P F RR A L F ST R F S H Q VV G T R P H G H I A LV RR D --G D F DD LLT E F R S K -G S R M R F G K R -------------A P I P F P K E flp_27_hg -----------------T A I F VL F I C F L F IVL P T I A Q M N Q P Q G I A LV RR D --S D Y DE LL T E F R S K -G S R 
M R F G K R S L G P ST D A I G L P SS D S F S N S A D 110 ....|....|....| flp_27_ce R A I Y GG D Q S N I F N F K flp_27_cb R A I Y GGG P V E Y V Q L flp_27_na L D A Y E H QQ QQQQ L flp_27_aca M E A Y E R QQ Q I ---flp_27_mj P I YY I P D R S M W F Q -flp_27_mp P I YY I P D R F M W F Q -flp_27_mc P I YY IP D R F M W F Q -flp_27_mh P I YY I P D R F M W F Q -flp_27_mi P I YY I P D R F M W F Q -flp_27_rs N L T C N G R C G L C ---flp_27_hg P NN LM S G H F N W N --

PAGE 175

161flp-28 Alignment 10 20 30 40 50 60 70 80 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|... flp_28_ce M F S V R S I F A I F C VLIL A L ST I N AA P N R VLM R F G K R GG N S E G H L G Y R F V P A G A P -A I A E Y I D V DD VI GG DD R F --------flp_28_cb M F S V R S F V A L F C VLIL A F S A V N AA P N R VLM R F G K R GG NS E G N L G Y R Y V P AAA P -A I A E Y I D V DD VL GG Q D R F --------flp_28_aca ---R A V F A LL Y MLLI A V A VV N S A P N R ILM R F G K R --TS D P L H V R P MV P V D Y --Y P L E LL G SS R A V D G D M ---------flp_28_hc SST R A VL A L F YMLL F S I A IV TS V P N R I F M R F G K R --N V D L A E Y R N P L P A D Y --F P V E LI G TS R F T D G D M ---------flp_28_oo ---------F Y VLV F G IVIV TS A P N R I F M R F G K R --N I D S A G Y R F P L P G E Y --F P I E LL G TS P SN D G D M ---------flp_28_pt --------IMII F M SSS Q TT V AA P N R VMM R F G K R F SS F E N E H P Y N L H P LL Q F E G S P E N I Y R L TS Y F N R N Q L Y P MI P E I D M flp_28_as --LL T LL A L P LV T L K S N T VV D AA P N K ILM R F G K RT L P L D R N L DE W I T E K D L -D N L H N L YY LL R E S G E Q ---------flp_28b_ppa ----S V H C S R P C I A L T V T L A S AA P S R VLM R F G K R -A V AA S R F D F H E L P V A S Y -G F L P Y G A V N P Q F E G F EE SSV D Q ----flp-30 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_30_mj ----------Y I H STTTSTT M T I K H I N L F ILL A LLM TT A Y C F E G I G N K Q Y K E L Y P M W K R Q M R E P L R F G K R Q ED K T N Y F K N S N E N A D N ---G F P DE A Y N flp_30_mc I K I F Q V Q Y I T Y F H S T K S MT M A R K Y A I F IIL A L F M T I Y S F E ML D N Q Y K E V Y P M W K R Q M R E P L R F G K R L A N E KK Y L K Y DD LVM NN E K I D S F P F E I Y K flp_30_mh -------Y I Y F I QQ E H L H M T M TT N H F IL F MLL A MLI TT V FC F E VL G N K Q Y K E V Y P M W K R Q M R E P L R F G K R L A S D T D Y F K Y EE N A G N ---R L T Y D N Y K flp_30_mi 
-----------F H S T I STT M TT K H I N L F ILL A LLM TT A Y C F E G I G N K Q Y K E L Y P M W K R Q M RE P L R F G K R Q ED K T N Y F K N S N E N A D N ---G F P DE A Y N ....|.... flp_30_mj G I S F Q K -flp_30_mc N -------flp_30_mh S R I S F Q N Y flp_30_mi G I S F Q K -

PAGE 176

162flp-31 Alignment 10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| flp_31_mh Q S HH K T Y K M R P -------------------L N TS F L P F T --K R Q I F T LL -----LL W F L -VL S IL A F D G SS A S Q G A S E M E T D VL EDD Q IVV P W flp_31_mc --------------------------------------------------------------------LL A F D V SS A L Q G A S E V E T D LI EDD Q IVV P W flp_31_mi --HH Y Q RIM Q P -------------------F NNN P H L P F S --Q R Q I F T LL -----F V W F L -VI S ILL T F D G TS VL Q G A S E M E T D VL EDD Q IVV P W flp_31_ppe N L P H R H PP F R S N F G RR L SS L D G R S N L F GG N E W A N R L P V P I H S E A K R H I Q NY L S I Q M P I F N Y F M Q F LLVLLL A IM A M A MI N A E TS A E G T G T L EE Q ILI P W 110 120 130 ....|....|....|....|....|....|....|.... flp_31_mh KK L Y R P R G PP R F G K R G LLLM N R Q R N F P V ----------flp_31_mc KK L Y R P R G PP R F G K R G LLLM N R H S N F Q E ----------flp_31_mi KK L YR P R G PP R F G K R G LLVM N R Q R N F P E ----------flp_31_ppe KK L Y R P R G PP R F G K R A LMLL N ------DE S L P L E A F E

PAGE 177

APPENDIX C
1H NMR ASSIGNMENTS FOR ALL PEPTIDES EXAMINED IN CHAPTER 3

All assignments in this table are for spectra obtained from samples at pH 5.5.

PAGE 178

164 Peptide Amino Acid Resonance Chemical Shift (ppm) DFDGAM-NH2 Asp 1 H 4.17 H 2.65 H 2.75 Phe 2 H 8.82 H 4.62 H 3.07 H 3.12 Asp 3 H 8.43 H 4.54 H 2.62 Gly 4 H 7.99 H 3.87 Ala 5 H 8.25 H 4.28 H 1.40 Met 6 H 8.33 H 4.41 H 2.02 2.11 H 2.51 H 2.62 C-NH2 a 7.51 b 7.19 DFDGAMPGVLRF-NH2 Asp 1 H 4.17 H 2.73 Phe 2 H 8.82 H 4.62 H 3.07 H 3.12 Asp 3 H 8.44 H 4.54 H 2.62 Gly 4 H 7.91 H 3.85 Ala 5 H 8.13 H 4.28 H 1.35 Met 6 H 8.35 H 4.75 H 1.94 2.04 H 2.54 H 2.63 Pro 7 H 4.39 Gly 8 H 8.58 H 3.91 Val 9 H 7.89 H 4.07 H 2.06 H 0.90

PAGE 179

165 Peptide Amino Acid Resonance Chemical Shift (ppm) DFDGAMPGVLRF-NH2 Leu 10 H 8.37 Continued H 4.30 H 1.48 1.61 H 0.86 H 0.92 Arg 11 H 8.28 H 4.24 H 1.44 H 1.65 H 1.65 H 3.10 H 7.16 Phe 12 H 8.26 H 4.60 H 2.99 H 3.16 C-NH2 a 7.59 b 7.14 DFDGEMPGVLRF-NH2 Asp 1 H 4.18 H 2.65 H 2.73 Phe 2 H 8.79 H 4.63 H 3.07 H 3.12 Asp 3 H 8.46 H 4.54 H 2.62 Gly 4 H 7.89 H 3.87 Glu 5 H 8.20 H 4.26 H 1.87 H 2.01 H 2.21 H 2.26 Met 6 H 8.51 H 4.75 H 1.95 2.04 H 2.54 H 2.60 Pro 7 H 4.39 Gly 8 H 8.61 H 3.90 H 3.94 Val 9 H 7.87 H 4.07 H 2.06

PAGE 180

166 Peptide Amino Acid Resonance Chemical Shift (ppm) DFDGEMPGVLRF-NH2 H 0.90 Continued Leu 10 H 8.35 H 4.32 H 1.48 1.61 H 0.86 H 0.92 Arg 11 H 8.29 H 4.24 H 1.40 1.46 1.65 H 3.10 H 7.19 Phe 12 H 8.27 H 4.60 H 2.99 H 3.16 C-NH2 a 7.58 b 7.14 DFDGEMSMPGVLRF-NH2 Asp 1 H 4.18 H 2.66 H 2.75 Phe 2 H 8.80 H 4.61 H 3.08 H 3.12 Asp 3 H 8.47 H 4.54 H 2.63 Gly 4 H 7.98 H 3.90 Glu 5 H 8.28 H 4.24 H 1.91 H 2.04 H 2.28 Met 6 H 8.43 H 4.48 H 2.00 2.09 H 2.50 H 2.60 Ser 7 H 8.31 H 4.41 Hb 3.83 Met 8 H 8.33 H 4.80 H 1.94

PAGE 181

167 Peptide Amino Acid Resonance Chemical Shift (ppm) DFDGEMSMPGVLRF-NH2 H 2.07 Continued H 2.54 H 2.60 Pro 9 H 4.39 H 1.92 H 2.07 Gly 10 H 8.61 H 3.87 H 3.96 Val 11 H 7.88 H 4.07 H 2.07 H 0.90 Leu 12 H 8.37 H 4.33 H 1.48 1.61 1.61 H 0.86 H 0.93 Arg 13 H 8.29 H 4.24 H 1.40 1.46 1.66 H 3.10 H 7.17 Phe 14 H 8.28 H 4.61 H 2.99 H 3.16 C-NH2 a 7.58 b 7.14 GFGDEM-NH2 Gly 1 H 3.76 H 3.81 Phe 2 H 8.80 H 4.54 H 3.07 Gly 3 H 8.69 H 3.72 H 3.96 Asp 4 H 8.04 H 4.58 H 2.69 Glu 5 H 8.65 H 4.24 H 1.97 H 2.06 H 2.30 Met 6 H 8.40

PAGE 182

168 Peptide Amino Acid Resonance Chemical Shift (ppm) GFGDEM-NH2 H 4.43 Continued H 2.02 2.10 H 2.49 H 2.61 CNH2 a 7.56 b 7.18 GFGDEMSMPGVLRF-NH2 Gly 1 H 3.76 H 3.80 Phe 2 H 8.80 H 4.56 H 3.05 H 3.12 Gly 3 H 8.69 H 3.74 H 3.95 Asp 4 H 8.04 H 4.58 H 2.69 Glu 5 H 8.66 H 4.24 H 1.95 H 2.06 H 2.29 Met 6 H 8.38 H 4.47 H 2.02 2.09 H 2.50 H 2.60 Ser 7 H 8.26 H 4.43 Hb 3.85 Met 8 H 8.32 H 4.79 H 1.95 2.06 H 2.53 H 2.62 Pro 9 H 4.39 H 1.91 H 2.06 Gly 10 H 8.60 H 3.89 H 3.96 Val 11 H 7.89 H 4.06 H 2.06 H 0.90 Leu 12 H 8.37 H 4.32

PAGE 183

169 Peptide Amino Acid Resonance Chemical Shift (ppm) GFGDEMSMPGVLRF-NH2 H 1.48 Continued 1.61 H 0.86 H 0.92 Arg 13 H 8.29 H 4.23 H 1.41 1.46 1.65 H 3.09 H 7.17 Phe 14 H 8.27 H 4.60 H 2.99 H 3.16 C-NH2 a 7.58 b 7.14 GFGDAMPGVLRF-NH2 Gly 1 H 3.83 Phe 2 H 8.79 H 4.54 H 3.07 Gly 3 H 8.62 H 3.72 H 3.91 Asp 4 H 7.93 H 4.54 H 2.64 Ala 5 H 8.35 H 4.28 H 1.35 Met 6 H 8.40 H 4.75 H 1.93 2.06 H 2.53 H 2.62 Pro 7 H 4.40 H 1.93 H 2.06 Gly 8 H 8.57 H 3.91 Val 9 H 7.93 H 4.06 H 2.06 H 0.90 Leu 10 H 8.37 H 4.32 H 1.48 1.61 H 0.86 H 0.92

PAGE 184

170 Peptide Amino Acid Resonance Chemical Shift (ppm) GFGDAMPGVLRF-NH2 Arg 11 H 8.28 Continued H 4.24 H 1.35 1.42 1.65 H 3.11 H 7.16 Phe 12 H 8.27 H 4.60 H 2.99 H 3.16 C-NH2 a 7.60 b 7.14 EMPGVLRF-NH2 Glu 1 H 4.04 Met 2 H 8.99 H 4.84 H 1.98 2.11 H 2.60 H 2.66 Pro 3 H 4.43 Gly 4 H 8.57 H 3.93 Val 5 H 7.91 H 4.06 H 2.06 H 0.92 Leu 6 H 8.35 H 4.32 H 1.48 1.61 H 0.86 H 0.92 Arg 7 H 8.28 H 4.24 H 1.42 1.48 1.65 H 3.12 H 7.17 Phe 8 H 8.26 H 4.60 H 2.99 H 3.18 C-NH2 a 7.58 b 7.13 SGSGAMPGVLRF-NH2 Gly 2 H 8.82 H 4.09 Ser 3 H 8.55 H 4.47 H 3.93

PAGE 185

171 Peptide Amino Acid Resonance Chemical Shift (ppm) SGSGAMPGVLRF-NH2 Gly 4 H 8.56 Continued H 3.94 Ala 5 H 8.20 H 4.30 H 1.33 Met 6 H 8.48 H 4.78 H 1.96 2.06 H 2.56 H 2.64 Pro 7 H 4.41 Gly 8 H 8.56 H 3.94 Val 9 H 7.96 H 4.07 H 2.06 H 0.90 Leu 10 H 8.39 H 4.32 H 1.48 1.59 H 0.86 H 0.92 Arg 11 H 8.29 H 4.24 H 1.46 1.65 H 3.12 H 7.16 Phe 12 H 8.28 H 4.60 H 2.99 H 3.16 C-NH2 a 7.60 b 7.15 AAAAAMPGVLRF-NH2 Ala 1 H 4.07 Ala 2 H 8.65 H 4.31 H 1.37 Ala 3 H 8.51 H 4.26 H 1.37 Ala 4 H 8.39 H 4.26 H 1.36 Ala 5 H 8.35 H 4.28 H 1.35 Met 6 H 8.44 H 4.78

PAGE 186

172 Peptide Amino Acid Resonance Chemical Shift (ppm) AAAAAMPGVLRF-NH2 H 1.93 Continued 2.06 H 2.56 H 2.65 Pro 7 H 4.41 H 1.94 H 2.07 Gly 8 H 8.55 H 3.94 Val 9 H 7.96 H 4.07 H 2.06 H 0.90 Leu 10 H 8.38 H 4.32 H 1.48 1.61 H 0.86 H 0.92 Arg 11 H 8.29 H 4.24 H 1.42 1.65 H 3.12 H 7.16 Phe 12 H 8.28 H 4.60 H 2.99 H 3.16 C-NH2 a 7.60 b 7.14 PGVLRF-NH2 Pro 1 H 9.09 H 8.41 H 4.43 H 2.45 2.06 H 3.40 Gly 2 H 8.74 H 3.99 H 4.05 Val 3 H 8.29 H 4.09 H 2.02 H 0.90 Leu 4 H 8.44 H 4.32 H 1.46 1.59 H 0.86 H 0.92 Arg 5 H 8.37

PAGE 187

173 Peptide Amino Acid Resonance Chemical Shift (ppm) PGVLRF-NH2 H 4.26 Continued H 1.44 1.50 1.66 H 3.12 H 7.16 Phe 6 H 8.31 H 4.60 H 2.99 H 3.16 C-NH2 a 7.63 b 7.13 PGVLRFPGVLRF-NH2 Pro 1 H 9.09 H 8.41 H 4.43 H 2.06 2.45 H 3.41 Gly 2 H 8.74 H 3.98 H 4.04 Val 3 H 8.28 H 4.07 H 2.01 H 0.90 Leu 4 H 8.43 H 4.32 H 1.44 1.57 H 0.83 H 0.92 Arg 5 H 8.32 H 4.24 H 1.42 1.50 1.63 H 3.11 H 7.20 Phe 6 H 8.40 H 4.32 H 2.92 H 3.14 Pro 7 H1 3.76 H2 3.59 Gly 8 H1 8.61 H1 3.87 H2 8.11 H2 3.93 Val 9 H1 8.09 H2 8.05 H1 4.06

PAGE 188

174 Peptide Amino Acid Resonance Chemical Shift (ppm) PGVLRFPGVLRF-NH2 H1 4.06 Continued H2 4.08 H1 2.02 H2 2.06 H1 0.88 H2 0.92 Leu 10 H 8.40 H 4.32 H 1.46 1.60 H 0.83 H 0.90 Arg 11 H 8.30 H 4.23 H 1.42 1.46 1.65 H 3.11 H 7.16 Phe 12 H 8.27 H 4.60 H 2.99 H 3.16 C-NH2 a 7.61 b 7.15

PAGE 189

APPENDIX D
pKa VALUES CALCULATED FOR RESONANCES WITH pH-DEPENDENT CHEMICAL SHIFTS

Given in this table are the pKa values calculated for pH-dependent resonances in the given peptides; dominant pKas and c values are highlighted in light gray. The value ci is the fraction of the total chemical shift displacement contributed by the ith titration event (having pKa i). The acidic amino acids in the sequence are color matched to their corresponding pKa's in other resonances they affect. The total chemical shift change during the titration for each resonance is given in the column labeled (ppm). For each peptide, the activity on NPR-1 is given as a percent of the measured activity of EMPGVLRF-NH2. Only sufficiently resolved backbone amide resonances with a total chemical shift change of greater than 0.02 ppm were included in this table. (Table on next page)

* = Chemical shift changed directions during pH titration; ** = 1 being the farthest downfield of two similar protons.
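The role of the ci fractions described above can be illustrated with a short sketch. It assumes a standard multi-site Henderson-Hasselbalch form, in which each titration event contributes its fraction ci of the total chemical shift change; the exact fitting equation used for this table is not reproduced in this appendix, so the functional form, the function names, and the low-pH shift of 8.36 ppm are all assumptions made for illustration. The pKa, c, and total-change values are taken from the E5 HN row of DFDGEMPGVLRF-NH2.

```python
from typing import Iterable, Tuple


def deprotonated_fraction(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch fraction of a site that is deprotonated at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def chemical_shift(
    ph: float,
    delta_acid: float,
    total_change: float,
    sites: Iterable[Tuple[float, float]],
) -> float:
    """Chemical shift (ppm) at a given pH for a resonance sensing several titrations.

    delta_acid   -- shift at the acidic limit (hypothetical input, ppm)
    total_change -- total shift change over the titration (ppm)
    sites        -- (pKa_i, c_i) pairs; c_i is the fraction of total_change
                    contributed by titration event i
    """
    sites = list(sites)
    weight_sum = sum(c for _, c in sites)
    # Tabulated fractions are rounded, so allow a small tolerance.
    if abs(weight_sum - 1.0) > 0.02:
        raise ValueError("c_i fractions should sum to approximately 1")
    titrated = sum(c * deprotonated_fraction(ph, pka) for pka, c in sites)
    return delta_acid + total_change * titrated


if __name__ == "__main__":
    # E5 HN of DFDGEMPGVLRF-NH2: total change -0.16 ppm,
    # pKa1 = 4.59 (c1 = 0.75), pKa2 = 3.08 (c2 = 0.25).
    # The starting shift of 8.36 ppm is an assumed value, not from the table.
    e5_sites = [(4.59, 0.75), (3.08, 0.25)]
    for ph in (2.0, 3.0, 4.0, 5.0, 6.0, 7.0):
        print(f"pH {ph:.1f}: {chemical_shift(ph, 8.36, -0.16, e5_sites):.3f} ppm")
```

In this form a resonance with a single entry (c1 = 1) follows one sigmoidal titration curve, while a resonance with two entries is a weighted sum of two such curves.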

PAGE 190

Peptide Peak (ppm) pKa1 c1 pKa2 c2

DFDGAM-NH2 (Activity: 0%)
D1 H 0.13 3.23 0.02 1
D1 H 0.26 3.19 0.02 1
F2 HN 0.022 4.04 0.04 1
F2 H ** -0.042 3.82 0.02 1
D3 H 0.10 4.09 0.03 1
D3 H 0.26 4.13 0.03 1
D3 HN 0.14 4.11 0.01 1
G4 HN 4.15 0.08 0.61 3.18 0.14 0.39
A5 HN -0.08 4.02 0.02 0.70 2.81 0.07 0.30
M6 HN 0.03 3.93 0.02 1

DFDGAMPGVLRF-NH2 (Activity: 29.1 %)
D1 H 0.15 2.97 0.02 0.84 4.18 0.10 0.16
F2 HN 0.024 3.80 0.24 1
D3 HN 0.14 4.04 0.01 1
G4 HN -0.29 4.10 0.04 0.63 3.05 0.07 0.36
A5 HN -0.085 3.99 -0.70 2.85 -0.30
M6 HN 0.047 4.48 0.11 0.51 3.51 0.10 0.49
G8 HN -0.033 4.34 0.07 0.61 3.02 0.12 0.39
V9 HN 0.031 3.74 0.05 0.65 4.78 -0.35
R11 H -0.015 4.21 0.10 0.54 2.84 0.13 0.46

DFDGEMPGVLRF-NH2 (Activity: 19.0 %)
F2 HN -0.0074 4.68 0.05 0.57 3.71 0.04 0.43
D3 HN 0.15 3.63 0.05 0.52 4.55 0.07 0.48
G4 HN -0.20 2.97 0.03 0.55 3.87 0.03 0.45
E5 HN -0.16 4.59 0.02 0.75 3.08 0.04 0.25
M6 HN -0.0071* 4.74 0.03 0.57 3.09 0.03 0.43
G8 HN -0.059 4.51 0.04 0.61 3.24 0.05 0.39
V9 HN 0.060 3.31 0.04 0.53 4.48 0.05 0.47
R11 H -0.041 4.53 0.03 0.78 3.00 0.09 0.22

DFDGEMSMPGVLRF-NH2 (Activity: 45.0 %)
F2 HN -0.016 4.59 0.09 0.56 4.13 0.10 0.44
D3 HN 0.15 3.72 0.05 0.54 4.62 0.07 0.46
G4 HN -0.20 2.99 0.08 0.50 3.74 0.07 0.50
E5 HN -0.17 4.65 0.02 0.73 3.19 0.05 0.27
M6 HN 0.041 3.40 0.02 1
M8 HN 0.058 3.34 0.05 0.65 4.62 0.12 0.35
G10 HN -0.045 4.62 0.05 0.59 3.24 0.05 0.41
V11 HN 0.052 3.43 0.04 0.53 4.69 0.06 0.47
R13 H -0.018 4.58 0.04 0.73 2.88 0.11 0.27

PAGE 191

Peptide Peak pKa1 c1 pKa2 c2

GFGDEM-NH2 (Activity: 0 %)
F2 HN -0.10 3.51 0.03 0.80 4.43 0.11 0.20
G3 HN -0.13 3.52 0.04 0.51 4.45 0.05 0.49
D4 HN 0.30 3.51 0.00 1
E5 HN -0.17 4.41 0.01 1
M6 HN -0.0030 3.30 0.05 0.52 4.50 0.06 0.48
CNH2 1 0.046 4.24 0.02 1

GFGDEMSMPGVLRF-NH2 (Activity: 49.3 %)
F2 HN -0.10 3.50 0.03 0.88 4.34 0.19 0.12
G3 HN -0.13 3.68 0.05 0.66 4.56 0.13 0.34
D4 HN 0.30 3.47 0.01 1
E5 HN -0.20 4.31 0.01 1
M6 HN 0.013 4.49 0.05 0.71 2.98 0.12 0.29
S7 HN 0.058 3.75 0.07 0.75 5.03 0.44 0.25
M8 HN 0.052 3.63 0.05 0.78 5.21 0.47 0.22
G10 HN -0.037 4.19 0.07 0.78 2.66 0.33 0.22
V11 HN 0.035 3.87 0.07 0.71 5.24 0.43 0.29
R13 H -0.018 4.07 0.02 0.83 2.32 0.24 0.17

GFGDAMPGVLRF-NH2 (Activity: 104.8 %)
F2 HN -0.069 3.46 1
G3 HN -0.060 3.47 1
D4 HN 0.35 3.50 0.01 1
M6 HN -0.020 3.58 0.06 1
G8 HN -0.025 3.51 0.01 1
R11 H -0.0069 3.34 0.03 1

EMPGVLRF-NH2 (Activity: 100 %)
E1 H 0.040 3.74 0.02 1
M2 HN -0.088 3.57 0.02 1
G4 HN -0.049 3.65 0.01 1
V5 HN 0.037 3.80 0.04 1
R7 H -0.028 3.56 0.02 1

PAGE 192

APPENDIX E
1H AND 13C NMR CHEMICAL SHIFT ASSIGNMENTS FOR DOLICHODIAL-LIKE ISOMERS FROM THE WALKING STICK INSECT SPECIES Anisomorpha buprestoides AND Peruphasma schultei

X = Not enough data for assignment. N/A = proton that does not exist for that particular isomer. * indicates protons for which assignments are not stereospecific relative to the other proton on their attached carbon. The numbering scheme is based on Figure 4-2 in Chapter 4.

Columns, left to right: A. buprestoides (1); A. buprestoides (1) (diol); A. buprestoides (2); A. buprestoides (2) (diol); P. schultei; P. schultei (diol).

C1      42.49    X        39.60    X        41.94    42.10
H1      3.33     X        3.35     X        3.24     2.88
C2      63.05    X        62.02    X        67.31    60.59
H2      2.70     X        2.80     X        2.24     1.74
C3      37.62    X        39.98    X        39.48    38.50
H3      2.36     X        2.61     X        2.32     1.99
C4      36.15    X        36.90    X        35.60    36.33
H4a*    1.35     X        1.41     X        1.47     1.36*
H4b*    2.09     X        2.02     X        1.98     1.36*
C5      32.48    X        33.00    X        32.80    34.69
H5a*    1.83     X        1.62     X        1.76     1.50
H5b*    1.95     X        2.10     X        2.03     1.82
C       151.51   X        154.06   X        153.42   156.02
C       141.39   X        139.36   X        139.62   139.68
H       6.37     6.432    6.24     6.545    6.27     6.25
H       6.56     6.272    6.50     6.213    6.55     6.54
C-f1    201.13   X        201.10   X        201.02   201.44
H-f1    9.45     9.44     9.45     9.44     9.45     9.45
C-f2    212.41   X        213.60   X        211.83   96.06
H-f2    9.30     N/A      9.71     N/A      9.50     4.92
C3'     21.95    X        18.44    X        21.21    23.21
H3'     1.05     X        1.00     X        1.04     1.09

PAGE 193

APPENDIX F
HPLC-MASS SPEC IDENTIFICATION OF GLUCOSE FROM DEFENSIVE SECRETIONS OF Anisomorpha buprestoides

HPLC-MS (-) electrospray ionization (ESI) of aqueous fraction supplemented with 13C6-D-glucose and mass spectra of corresponding traces: A, blank; B, total ion current (TIC); C, selected ion monitoring (SIM) of m/z 225; D, m/z 225 MS2; E, SIM m/z 231; F, m/z 231 MS2. A Thermo Separation Products SpectraSYSTEM SCM1000 membrane degasser, P4000 pump, AS3000 autosampler, ThermoFinnigan UV6000LP LDC photodiode array detector (PDA), and Finnigan LCQ DecaXP Max mass spectrometer in ESI mode (+/-) (5 kV spray voltage and 275 °C capillary temperature) were used. Sheath and sweep gas flow rates (arb) were 40 and 20, respectively. The mobile phase (1 mL/min) was split 10:1 between the PDA and MS; eluants were 0.1% formic acid (FA) in ACN (a), 10 mM ammonium formate (b), and 10 mM ammonium formate in 90% ACN (c). Elution through a YMC-NH2 analytical column (L = 250 mm, ID = 4.6 mm, S = 5 µm, 20 nm) was isocratic (4a:24b:72c) for 18 min. This demonstrates that the unknown aqueous component of the secretion is glucose.
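The two monitored ions, m/z 225 and m/z 231, differ by the six mass units expected between natural glucose and the 13C6-labeled standard. The sketch below checks this arithmetic under one stated assumption: that both ions are formate adducts, [M+HCOO]-, which is plausible given the ammonium formate in the mobile phase but is not stated in the text above. The function names are illustrative only.

```python
# Monoisotopic masses in unified atomic mass units.
C12 = 12.0
C13 = 13.0033548
H = 1.00782503
O = 15.99491462
ELECTRON = 0.00054858


def glucose_mass(n_13c: int = 0) -> float:
    """Monoisotopic mass of glucose (C6H12O6) with n_13c carbons replaced by 13C."""
    if not 0 <= n_13c <= 6:
        raise ValueError("glucose has six carbons")
    return (6 - n_13c) * C12 + n_13c * C13 + 12 * H + 6 * O


def formate_adduct_mz(neutral_mass: float) -> float:
    """m/z of a singly charged [M+HCOO]- ion (assumed adduct type)."""
    formate_anion = C12 + H + 2 * O + ELECTRON
    return neutral_mass + formate_anion


if __name__ == "__main__":
    unlabeled = formate_adduct_mz(glucose_mass(0))
    labeled = formate_adduct_mz(glucose_mass(6))
    print(f"glucose      [M+HCOO]-: m/z {unlabeled:.3f}")
    print(f"13C6-glucose [M+HCOO]-: m/z {labeled:.3f}")
    print(f"difference: {labeled - unlabeled:.3f}")
```

Under this assumption the nominal masses come out at 225 and 231, matching the SIM channels listed in the caption.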

PAGE 194


PAGE 195

APPENDIX G 2D NMR SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei

PAGE 196

Anisomorpha buprestoides double quantum filtered COSY (Bruker cosydfphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=8.

PAGE 197

Anisomorpha buprestoides TOCSY with 60 ms DIPSI-2 mixing (Bruker dipsi2phpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=8.

PAGE 198

Anisomorpha buprestoides ROESY with 400 ms cw mixing at a field strength of 1.75 kHz (Bruker roesyphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=32.

PAGE 199

Anisomorpha buprestoides 13C-HMQC with BIRD pulse to suppress 12C (Bruker hmqcbiphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz and 150.9 MHz, sweep width=7184 Hz 1H & 24140 Hz 13C, carrier frequency=4.84 ppm 1H & 82.8 ppm 13C, complex points acquisition=720, complex points indirect (States-TPPI)=128, number of scans=32.

PAGE 200

Anisomorpha buprestoides 13C-HMBC optimized for 10 Hz 13C-1H J couplings (Bruker hmbcndprqf): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz and 150.9 MHz, sweep width=7184 Hz 1H & 36232 Hz 13C, carrier frequency=4.84 ppm 1H & 117.8 ppm 13C, complex points acquisition=1024, magnitude indirect=512, number of scans=64, delay to build up multiple quantum coherence was set to 50 ms.

PAGE 201

Peruphasma schultei double quantum filtered COSY (Bruker cosydfphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=8.

PAGE 202

Peruphasma schultei TOCSY with 60 ms DIPSI-2 mixing (Bruker dipsi2phpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=8.

PAGE 203

Peruphasma schultei ROESY with 400 ms cw mixing at a field strength of 1.75 kHz (Bruker roesyphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz, sweep width=7184 Hz both dimensions, carrier frequency=4.84 ppm both dimensions, complex points acquisition=2048, complex points indirect (States-TPPI)=256, number of scans=32.

PAGE 204

Peruphasma schultei 13C-HMQC with BIRD pulse to suppress 12C (Bruker hmqcbiphpr): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz and 150.9 MHz, sweep width=7184 Hz 1H & 24140 Hz 13C, carrier frequency=4.84 ppm 1H & 82.8 ppm 13C, complex points acquisition=720, complex points indirect (States-TPPI)=128, number of scans=32.

PAGE 205

Peruphasma schultei 13C-HMBC optimized for 10 Hz 13C-1H J couplings (Bruker hmbcndprqf): Probe=HTS, 1-mm cryoprobe, Temp=20 °C, operating frequency=600 MHz and 150.9 MHz, sweep width=7184 Hz 1H & 36232 Hz 13C, carrier frequency=4.84 ppm 1H & 117.8 ppm 13C, complex points acquisition=1024, magnitude indirect=512, number of scans=64, delay to build up multiple quantum coherence was set to 50 ms.
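The acquisition parameters listed for the spectra in this appendix can be related to one another with two simple conversions. The sketch below converts a sweep width from Hz to ppm and estimates the direct-dimension acquisition time. It assumes the listed point counts are complex points sampled at a dwell time of 1/(sweep width), which is a common convention but is not stated explicitly in the captions; the function names are illustrative.

```python
def sweep_width_ppm(sweep_width_hz: float, spectrometer_mhz: float) -> float:
    """Convert a sweep width in Hz to ppm for a nucleus observed at spectrometer_mhz."""
    return sweep_width_hz / spectrometer_mhz


def acquisition_time_s(complex_points: int, sweep_width_hz: float) -> float:
    """Acquisition time, assuming one complex point per dwell of 1/sweep_width."""
    return complex_points / sweep_width_hz


if __name__ == "__main__":
    # 1H dimension shared by all spectra: 7184 Hz at 600 MHz.
    print(f"1H sweep width:        {sweep_width_ppm(7184, 600.0):.2f} ppm")
    # 13C dimensions of the HMQC and HMBC experiments at 150.9 MHz.
    print(f"13C sweep width, HMQC: {sweep_width_ppm(24140, 150.9):.1f} ppm")
    print(f"13C sweep width, HMBC: {sweep_width_ppm(36232, 150.9):.1f} ppm")
    # Direct-dimension acquisition time for the COSY, TOCSY, and ROESY spectra.
    print(f"acquisition time:      {acquisition_time_s(2048, 7184):.3f} s")
```

On these numbers the 1H window is about 12 ppm wide, and the wider 13C window of the HMBC (about 240 ppm) is consistent with the need to cover the aldehyde carbons near 200 ppm.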

PAGE 206

APPENDIX H
GAS CHROMATOGRAPHY TRACES AND MASS SPECTRA OF DEFENSIVE SECRETIONS OF Anisomorpha buprestoides AND Peruphasma schultei AND EXTRACTS OF Teucrium marum

Gas Chromatograph (GC) Legend: A. buprestoides individual ( ), P. schultei composite ( ), P. schultei composite x 5 ( ), T. marum “cat thyme” individual ( ).

Gas Chromatography – Flame Ionization Detector. A Hewlett Packard (Palo Alto, CA) 5890 series II gas chromatograph and a flame-ionization detector (GC-FID) with nitrogen make-up gas (1.5 mL/min) and helium carrier gas (1.3 mL/min) were used. Cool on-column and splitless injections (1 µL) were at 40 °C and 200 °C, respectively; the detector was maintained at 260 °C. The oven program was as follows: isothermal for 5 min, heating from 40 °C to 200 °C at 11 °C/min, isothermal for 10 min, heating from 200 °C to 250 °C at 25 °C/min, and then isothermal for 15 min. GlasSeal connectors (Supelco) fused three silica columns in series: a primary deactivated column (L = 8 cm, ID = 0.53 mm), a HP-1MS retention gap column (L = 2 m, ID = 0.25 mm, df = 0.25 µm), and a J&W DB-5 analytical column (L = 30 m, ID = 0.25 mm, df = 0.25 µm).

Gas Chromatography – Mass Spectrometry. A Varian 3400 gas chromatograph and a Finnigan MAT Magnum ion trap mass spectrometer (GC-ITMS) in electron impact (EI) ionization mode (70 eV) with a filament bias of 11765 mV or chemical ionization (CI) mode (isobutane) were employed to acquire full-scan spectra over the range m/z 40 to 400 at 0.85 s per scan. Holox (Charlotte, NC) high purity helium was used as a carrier gas (1.4 mL/min). Injection and oven conditions were as above. Transfer-line and manifold temperatures were 240 and 220 °C, respectively.
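The oven program described above fixes the length of each GC run. As a rough check, the sketch below adds the three isothermal holds to the two temperature ramps; it assumes the ramps are linear at the stated rates, and the names used are illustrative only.

```python
from typing import Iterable, Tuple

# Isothermal holds (min) and ramps (start °C, end °C, rate °C/min) from the oven program.
HOLDS_MIN = [5.0, 10.0, 15.0]
RAMPS = [(40.0, 200.0, 11.0), (200.0, 250.0, 25.0)]


def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Duration of a linear temperature ramp."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


def total_run_time(
    holds_min: Iterable[float], ramps: Iterable[Tuple[float, float, float]]
) -> float:
    """Total oven program length in minutes: all holds plus all ramps."""
    return sum(holds_min) + sum(ramp_minutes(*ramp) for ramp in ramps)


if __name__ == "__main__":
    print(f"total run time: {total_run_time(HOLDS_MIN, RAMPS):.1f} min")
```

The first ramp accounts for about 14.5 min and the second for 2 min, so a complete run takes roughly 46.5 min.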

PAGE 207


PAGE 208


PAGE 209

LIST OF REFERENCES

1. Anderson, R. C. (2000) Nematode Parasites of Vertebrates: Their Development and Transmission, 2 ed., New York, NY.
2. Malakhov, V. V. e. (1994) Nematodes: Structure, Development, Classification, and Phylogeny, Smithsonian Institute Press, Washington, D. C., USA.
3. Bilgrami, R. G. a. A. L. (2004) Nematode Behaviour, CAB International, Cambridge, MA.
4. Wylie, T., Martin, J. C., Dante, M., Mitreva, M. D., Clifton, S. W., Chinwalla, A., Waterston, R. H., Wilson, R. K., and McCarter, J. P. (2004) Nematode.net: a tool for navigating sequences from parasitic and free-living nematodes, Nucleic Acids Res 32, D423-6.
5. Cobb, N. A. (1914) Nematodes and their relationships, United States Department of Agriculture, Washington DC.
6. Cobb, N. A. (1914) The North American Free-Living Fresh-Water Nematodes, Trans. Am. Microsc. Soc. 33, 69-134.
7. Parwinder S. Grewal, R.-U. E., David I. Shapiro-Ilan. (2005) Nematodes as biocontrol agents, CAB International Publishing, Cambridge, MA, USA.
8. Dunn, R. A., and Grover C. Smart, Jr. (1997) Biological Control Nematodes: Suppliers and Pesticide Compatibility (ENY45), ENY45 ed., University of Florida Institute of Food and Agricultural Sciences (IFAS), Gainesville, FL.
9. Maupas, E. (1900) Modes et formes de reproduction des nematodes., Archives de Zoologie Experimentale et Generale 8, 463-624.
10. Sulston, J. E., Schierenberg, E., White, J. G., and Thomson, J. N. (1983) The embryonic cell lineage of the nematode Caenorhabditis elegans, Dev Biol 100, 64-119.
11. Sulston, J. E., and Horvitz, H. R. (1977) Post-embryonic cell lineages of the nematode, Caenorhabditis elegans, Dev Biol 56, 110-56.
12. Sulston, J. E. (1976) Post-embryonic development in the ventral cord of Caenorhabditis elegans, Philos Trans R Soc Lond B Biol Sci 275, 287-97.

PAGE 210

13. (1998) Genome sequence of the nematode C. elegans: a platform for investigating biology, Science 282, 2012-8.
14. White, J. G., Southgate, E., Thomson, J. N., and Brenner, S. (1986) The structure of the nervous system of the nematode Caenorhabditis elegans., 314, 1-340.
15. Lee, D. L. (2002) The Biology of Nematodes, Taylor & Francis, New York, NY.
16. Li, C. (2005) The ever-expanding neuropeptide gene families in the nematode Caenorhabditis elegans, Parasitology 131 Suppl, S109-27.
17. Li, C., Nelson, L. S., Kim, K., Nathoo, A., and Hart, A. C. (1999) Neuropeptide gene families in the nematode Caenorhabditis elegans, Ann N Y Acad Sci 897, 239-52.
18. Li, C., Kim, K., and Nelson, L. S. (1999) FMRFamide-related neuropeptide gene family in Caenorhabditis elegans, Brain Res 848, 26-34.
19. Price, D. A., Greenberg, M. J. (1977) Structure of a Molluscan Cardioexcitatory Neuropeptide, Science 197, 670-671.
20. Greenberg, M. J., and Price, D. A. (1992) Relationships among the FMRFamide-like peptides, Prog Brain Res 92, 25-37.
21. McVeigh, P., Leech, S., Mair, G. R., Marks, N. J., Geary, T. G., and Maule, A. G. (2005) Analysis of FMRFamide-like peptide (FLP) diversity in phylum Nematoda, Int J Parasitol 35, 1043-60.
22. Vilim, F. S., Aarnisalo, A. A., Nieminen, M. L., Lintunen, M., Karlstedt, K., Kontinen, V. K., Kalso, E., States, B., Panula, P., and Ziff, E. (1999) Gene for pain modulatory neuropeptide NPFF: induction in spinal cord by noxious stimuli, Mol Pharmacol 55, 804-11.
23. Perry, S. J., Yi-Kung Huang, E., Cronk, D., Bagust, J., Sharma, R., Walker, R. J., Wilson, S., and Burke, J. F. (1997) A human gene encoding morphine modulating peptides related to NPFF and FMRFamide, FEBS Lett 409, 426-30.
24. Perry, S. J., Straub, V. A., Schofield, M. G., Burke, J. F., and Benjamin, P. R. (2001) Neuronal expression of an FMRFamide-gated Na+ channel and its modulation by acid pH, J Neurosci 21, 5559-67.
25. Kivipelto, L., and Panula, P. (1991) Comparative Distribution of Neurons Containing FLFQPQRFamide-like (morphine-modulating) Peptide and Related Neuropeptides in the Rat Brain, Eur J Neurosci 3, 175-185.

PAGE 211

197 26. Espinoza, E., Carrigan, M., Thomas, S. G., Shaw, G., and Edison, A. S. (2000) A statistical view of FMRFamide neuropeptide diversity, Mol Neurobiol 21, 35-56. 27. Pascual, N., Castresana, J., Valero, M. L., Andreu, D., and Belles, X. (2004) Orcokinins in insects an d other invertebrates, Insect Biochem Mol Biol 34, 11416. 28. Conlon, J. M. (2001) Evolution of the insulin molecule: insights into structureactivity and phylogenetic relationships, Peptides 22, 1183-93. 29. Cowden, C., Stretton, A. O., and Davis, R. E. (1989) AF1, a sequenced bioactive neuropeptide isolated from the nematode Ascaris suum, Neuron 2, 1465-73. 30. Cowden, C., and Stretton, A. O. (1993) AF2, an Ascaris neuropeptide: isolation, sequence, and bioactivity, Peptides 14, 423-30. 31. Sithigorngul, P., Cowden, C., Guastella, J., and Stretton, A. O. (1989) Generation of monoclonal antibodies ag ainst a nematode peptide extract: another approach for identifying unknown neuropeptides, J Comp Neurol 284, 389-97. 32. Cowden, C., Sithigorngul, P., Brackley, P., Guastella, J., and Stretton, A. O. (1993) Localization and differentia l expression of FMRFamide-like immunoreactivity in the ne matode Ascaris suum, J Comp Neurol 333, 455-68. 33. Cowden, C., and Stretton, A. O. (1995) Eight novel FMRFamide-like neuropeptides isolated from the nematode Ascaris suum, Peptides 16, 491-500. 34. Bowman, J. W., Friedman, A. R., Thom pson, D. P., Ichhpurani, A. K., Kellman, M. F., Marks, N., Maule, A. G., and Geary, T. G. (1996) Structure-activity relationships of KNEFIRFamide (AF1), a nematode FMRFamide-related peptide, on Ascaris suum muscle, Peptides 17, 381-7. 35. Bowman, J. W., Winterrowd, C. A., Frie dman, A. R., Thompson, D. P., Klein, R. D., Davis, J. P., Maule, A. G., Blair, K. L., and Geary, T. G. (1995) Nitric oxide mediates the inhibitory effects of SDPNFLRFamide, a nematode FMRFamiderelated neuropeptide, in Ascaris suum, J Neurophysiol 74, 1880-8. 36. Edison, A. S., Messinger, L. 
A., and Stre tton, A. O. (1997) af p-1: a gene encoding multiple transcripts of a new class of FMRFamide-like neuropeptides in the nematode Ascaris suum, Peptides 18, 929-35. 37. Fisher, J. M., and Scheller, R. H. ( 1988) Prohormone processing and the secretory pathway, J Biol Chem 263, 16515-8.

PAGE 212

198 38. Nathoo, A. N., Moeller, R. A., We stlund, B. A., and Hart, A. C. (2001) Identification of neuropeptide-like protein gene families in Caenorhabditiselegans and other species, Proc Natl Acad Sci U S A 98, 14000-5. 39. Husson, S. J., Clynen, E., Baggerman, G., De Loof, A., and Schoofs, L. (2005) Discovering neuropeptides in Caenorha bditis elegans by two dimensional liquid chromatography and mass spectrometry, Biochem Biophys Res Commun 335, 7686. 40. Rosoff, M. L., Doble, K. E., Price, D. A., and Li, C. (1993) The flp-1 propeptide is processed into multiple, highly similar FMRFamide-like peptides in Caenorhabditis elegans, Peptides 14, 331-8. 41. Marks, N. J., Shaw, C., Maule, A. G., Davis, J. P., Halton, D. W., Verhaert, P., Geary, T. G., and Thompson, D. P. ( 1995) Isolation of AF2 (KHEYLRFamide) from Caenorhabditis elegans: evidence for the presence of more than one FMRFamide-related peptide-encoding gene, Biochem Biophys Res Commun 217, 845-51. 42. Marks, N. J., Maule, A. G., Geary, T. G., Thompson, D. P., Davis, J. P., Halton, D. W., Verhaert, P., and Shaw, C. (1997) APEASPFIRFamide, a novel FMRFamide-related decapeptide from Caenorhabditis elegans: structure and myoactivity, Biochem Biophys Res Commun 231, 591-5. 43. Marks, N. J., Maule, A. G., Geary, T. G., Thompson, D. P., Li, C., Halton, D. W., and Shaw, C. (1998) KSAYMRFamide (PF3/A F8) is present in the free-living nematode, Caenorhabditis elegans, Biochem Biophys Res Commun 248, 422-5. 44. Mercier, A. J., Friedrich, R., and Bo ldt, M. (2003) Physiological functions of FMRFamide-like peptides (FLPs) in crustaceans, Microsc Res Tech 60, 313-24. 45. Rastogi, R. K., D'Aniello, B., Pinelli, C ., Fiorentino, M., Di Fiore, M. M., Di Meglio, M., and Iela, L. (2001) FMRF amide in the amphibian brain: a comprehensive survey, Microsc Res Tech 54, 158-72. 46. Brownlee, D. J., and Fairweather, I. (1999) Exploring the neurotransmitter labyrinth in nematodes, Trends Neurosci 22, 16-24. 47. Dockray, G. J. 
(2004) The expanding family of -RFamide peptides and their effects on feeding behaviour, Exp Physiol 89, 229-35. 48. Marks, N. J., Maule, A. G., Li, C., Nelson, L. S., Thompson, D. P., AlexanderBowman, S., Geary, T. G., Halton, D. W., Verhaert, P., and Shaw, C. (1999) Isolation, pharmacology and gene organization of KPSFVRFamide: a neuropeptide from Caenorhabditis elegans, Biochem Biophys Res Commun 254, 222-30.

PAGE 213

199 49. Marks, N. J., Shaw, C., Halton, D. W., Thompson, D. P., Geary, T. G., Li, C., and Maule, A. G. (2001) Isolation and preliminary biological assessment of AADGAPLIRFamide and SVPGVLRFamide from Caenorhabditis elegans, Biochem Biophys Res Commun 286, 1170-6. 50. Yang, H. Y., Fratta, W., Majane, E. A., and Costa, E. (1985) Isolation, sequencing, synthesis, and pharmacol ogical characteriza tion of two brain neuropeptides that modulate the action of morphine, Proc Natl Acad Sci U S A 82, 7757-61. 51. Waggoner, L. E., Hardaker, L. A., Golik, S., and Schafer, W. R. (2000) Effect of a neuropeptide gene on behavi oral states in Caenorha bditis elegans egg-laying, Genetics 154, 1181-92. 52. Nelson, L. S., Rosoff, M. L., and Li, C. (1998) Disruption of a neuropeptide gene, flp-1, causes multiple behavioral defects in Caenorhabditis elegans, Science 281, 1686-90. 53. Rogers, C., Reale, V., Ki m, K., Chatwin, H., Li, C., Evans, P., and de Bono, M. (2003) Inhibition of Caenorhabditis elegan s social feeding by FMRFamide-related peptide activation of NPR-1, Nat Neurosci 6, 1178-85. 54. Kubiak, T. M., Larsen, M. J., Nulf, S. C., Zantello, M. R., Burton, K. J., Bowman, J. W., Modric, T., and Lowery, D. E. (2003) Differential activati on of "social" and "solitary" variants of th e Caenorhabditis elegans G pr otein-coupled receptor NPR1 by its cognate ligand AF9, J Biol Chem 278, 33724-9. 55. Mertens, I., Vandingenen, A., Meeusen, T., Janssen, T., Luyten, W., Nachman, R. J., De Loof, A., and Schoofs, L. (2004) F unctional characterization of the putative orphan neuropeptide G-protein coupled receptor C26F1.6 in Caenorhabditis elegans, FEBS Lett 573, 55-60. 56. Mertens, I., Meeusen, T., Janssen, T., Nachman, R., and Schoofs, L. (2005) Molecular characterization of two G protein-coupled re ceptor splice variants as FLP2 receptors in Caenorhabditis elegans, Biochem Biophys Res Commun 330, 967-74. 57. Kubiak, T. M., Larsen, M. J., Zantello M. R., Bowman, J. W., Nulf, S. 
C., and Lowery, D. E. (2003) Functional annotati on of the putative orphan Caenorhabditis elegans G-protein-coupled receptor C 10C6.2 as a FLP15 peptide receptor, J Biol Chem 278, 42115-20. 58. Johnson, E. C., Bohn, L. M., Barak, L. S ., Birse, R. T., Nassel, D. R., Caron, M. G., and Taghert, P. H. (2003) Identifica tion of Drosophila neuropeptide receptors

PAGE 214

200 by G protein-coupled receptors-beta-arrestin2 interactions, J Biol Chem 278, 52172-8. 59. Lingueglia, E., Champigny, G., Lazdunski M., and Barbry, P. (1995) Cloning of the amiloride-sensitive FMRFamide peptide-gated sodium channel, Nature 378, 730-3. 60. Cottrell, G. A. (1997) The fi rst peptide-gated ion channel, J Exp Biol 200, 237786. 61. Furukawa, Y., Miyawaki, Y., and Ab e, G. (2005) Molecular cloning and functional characterization of the Aply sia FMRFamide-gated Na(+) channel, Pflugers Arch. 62. Xie, J., Price, M. P., Wemmie, J. A., Askwith, C. C., and Welsh, M. J. (2003) ASIC3 and ASIC1 mediate FMRFamide-re lated peptide enhancement of H+gated currents in cultured dorsal root ganglion neurons, J Neurophysiol 89, 245965. 63. de Bono, M., and Bargmann, C. I. (1 998) Natural variation in a neuropeptide Y receptor homolog modifies social behavi or and food response in C. elegans, Cell 94, 679-89. 64. Watters, M. R. (2005) Tropical ma rine neurotoxins: venoms to drugs, Semin Neurol 25, 278-89. 65. Terlau, H., and Olivera, B. M. (2004) Conus venoms: a rich source of novel ion channel-targeted peptides, Physiol Rev 84, 41-68. 66. Amstutz; Gary Arthur (San Jose, C. B. S. S. M. P., CA); Gohil; Kishorchandra (Richmond, CA); Adriaensse ns; Peter Isadore (Mountain View, CA); Kristipati; Ramasharma (Fremont, CA). (1998), Neur ex Corporation (Menlo Park, CA), USA. 67. Justice; Alan (Sunnyvale, C. S. T. P. A ., CA); Gohil; Kishor C. (Richmond, CA); Valentino; Karen L. (San Carlos, CA). (1994), Neurex Corporation (Menlo Park, CA), USA. 68. FDA, U. F. a. D. A. (2006) Determ ination of Regulator y Review Period for Purposes of Patent Extension; PRIALT, Federal Register 71, 13409-13410. 69. Jain, K. K. (2000) An evaluation of in trathecal ziconotide for the treatment of chronic pain, Expert Opin Investig Drugs 9, 2403-10.

PAGE 215

201 70. Maillo, M., Aguilar, M. B ., Lopez-Vera, E., Craig, A. G., Bulaj, G., Olivera, B. M., and Heimer de la Cotera, E. P. (2002) Conorfamide, a Conus venom peptide belonging to the RFamide fa mily of neuropeptides, Toxicon 40, 401-7. 71. Meinwald, J. a. T. E. (2003) Natura l Products Chemistry: New Opportunities, Uncertain Future, Helvetica Chimica 86, 3633-3637. 72. Ortholand, J. Y., and Ganesan, A. (2004) Natural products and combinatorial chemistry: back to the future, Curr Opin Chem Biol 8, 271-80. 73. Eisner, T. (2003) Chemical ecology: can it survive without natural products chemistry?, Proc Natl Acad Sci U S A 100 Suppl 2, 14517-8. 74. Eisner, T., and Berenbaum, M. (2002) Chemical ecology: missed opportunities?, Science 295, 1973. 75. Trowell, S. (2003) Drugs from bugs: the promise of pharmaceutical entomology., The Futurist 37, 17-19. 76. Meinwald, J. (2003) Understanding th e chemistry of chemical communication: are we there yet?, Proc Natl Acad Sci U S A 100 Suppl 2, 14514-6. 77. Brey, W. W., Edison, A. S., Nast, R. E., Rocca, J. R., Saha, S., and Withers, R. S. (2006) Design, construction, and vali dation of a 1-mm triple-resonance hightemperature-superconducting probe for NMR, J Magn Reson 179, 290-3. 78. Exarchou, V., Krucker, M., van Beek, T. A., Vervoort, J., Gerothanassis, I. P., and Albert, K. (2005) LC-NMR coupling t echnology: recent advancements and applications in natural products analysis, Magn Reson Chem 43, 681-7. 79. Nibbering, N. M. (2006) Four d ecades of joy in mass spectrometry, Mass Spectrom Rev. 80. Meinwald, J., M.S. Chadha, J.J. Hurst, and T. Eisner. (1962) Defense Mechanisms of Arthropods IX: Anis omorphal, The Secretion of a Phasmid Insect, Tretrahedron Letters, 29-33. 81. Li, Y., Webb, A. G., Saha, S., Brey, W. W., Zachariah, C., and Edison, A. S. (2006) Comparison of the performance of round and rectangular wire in small solenoids for high-field NMR, Magn Reson Chem 44, 255-62. 82. Pettit, G. R., Meng, Y., Herald, D. 
L ., Knight, J. C., and Day, J. F. (2005) Antineoplastic agents. 553. The Te xas grasshopper Brachystola magna, J Nat Prod 68, 1256-8.

PAGE 216

202 83. Koch, M. A., Schuffenhauer, A., Scheck, M., Wetzel, S., Casaulta, M., Odermatt, A., Ertl, P., and Waldmann, H. (2005) Ch arting biologically re levant chemical space: a structural classificati on of natural products (SCONP), Proc Natl Acad Sci U S A 102, 17272-7. 84. Zachariah, C., Cameron, A., Lindberg, I., Kao, K. J., Beinfeld, M. C., and Edison, A. S. (2001) Structural st udies of a neuropeptide precu rsor protein with an RGD proteolytic site, Biochemistry 40, 8790-9. 85. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressi ve multiple sequence alignment through sequence weighting, positi on-specific gap penalties a nd weight matrix choice, Nucleic Acids Res 22, 4673-80. 86. Hall, T., Ibis Therapeutics, A division of Isis Pharmaceuticals, 1891 Rutherford Road, Carlsbad, CA 92008. (2005) http://www.mbio.ncsu.edu/BioEdit/bioedit.html thall@isisph.com. 87. Tippmann, H. F. (2004) Analysis fo r free: comparing programs for sequence analysis, Brief Bioinform 5, 82-7. 88. Bendtsen, J. D., Nielse n, H., von Heijne, G., and Brunak, S. (2004) Improved prediction of signal peptides: SignalP 3.0, J Mol Biol 340, 783-95. 89. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal pe ptides and prediction of their cleavage sites, Protein Eng 10, 1-6. 90. Nielsen, H., and Krogh, A. (1998) Prediction of signal peptides and signal anchors by a hidden Markov model, Proc Int Conf Intell Syst Mol Biol 6, 122-30. 91. Wilkins, M. R., Lindskog, I., Gasteiger, E., Bairoch, A., Sanchez, J. C., Hochstrasser, D. F., and Appel, R. D. (1997) Detailed pep tide characterization using PEPTIDEMASS--a Worl d-Wide-Web-accessible tool, Electrophoresis 18, 403-8. 92. Gasteiger E., H. C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D., Bairoch A. (2005) in The Proteomics Protocols Handbook (Walker, J. M., Ed.) pp 571607, Humana Press. 93. 
Dosztanyi, Z., Csizmok, V., Tompa, P., and Simon, I. (2005) IUPred: web server for the prediction of intrinsically unstr uctured regions of proteins based on estimated energy content, Bioinformatics 21, 3433-4.

PAGE 217

203 94. Dosztanyi, Z., Csizmok, V., Tompa, P., and Simon, I. (2005) The pairwise energy content estimated from amino acid compos ition discriminates between folded and intrinsically unstructured proteins, J Mol Biol 347, 827-39. 95. Hummon, A. B., Hummon, N. P., Corbin, R. W., Li, L., Vilim, F. S., Weiss, K. R., and Sweedler, J. V. (2003) From precu rsor to final peptides: a statistical sequence-based approach to pr edicting prohormone processing, J Proteome Res 2, 650-6. 96. Dossey, A. T., Reale, V., Chatwin, H ., Zachariah, C., Debono, M., Evans, P. D., and Edison, A. S. (2006) NMR Analys is of Caenorhabditis elegans FLP-18 Neuropeptides: Implications for NPR-1 Activation, Biochemistry 45, 7586-7597. 97. Vijay-Kumar, S., Bugg, C. E., and C ook, W. J. (1987) Structure of ubiquitin refined at 1.8 A resolution, J Mol Biol 194, 531-44. 98. Weber, P. L., Brown, S. C., and Mueller, L. (1987) Sequential 1H NMR assignments and secondary structure identification of human ubiquitin, Biochemistry 26, 7282-90. 99. Di Stefano, D. L., and Wand, A. J. (1987) Two-dimensi onal 1H NMR study of human ubiquitin: a main chain directed assignment and structure analysis, Biochemistry 26, 7272-81. 100. Aguilar, C. F., Cronin, N. B., Badasso, M., Dreyer, T., Newman, M. P., Cooper, J. B., Hoover, D. J., Wood, S. P., Johnson, M. S., and Blundell, T. L. (1997) The three-dimensional structure at 2.4 A resolu tion of glycosylated proteinase A from the lysosome-like vacuole of Saccharomyces cerevisiae, J Mol Biol 267, 899-915. 101. Li, M., Phylip, L. H., Lees, W. E., Wi nther, J. R., Dunn, B. M., Wlodawer, A., Kay, J., and Gustchina, A. (2000) The as partic proteinase from Saccharomyces cerevisiae folds its own inhibitor into a helix, Nat Struct Biol 7, 113-7. 102. Weinreb, P. H., Zhen, W., Poon, A. W ., Conway, K. A., and Lansbury, P. T., Jr. (1996) NACP, a protein imp licated in Alzheimer's di sease and learning, is natively unfolded, Biochemistry 35, 13709-15. 103. Davidson, W. 
S., Jonas, A., Clayton, D. F., and George, J. M. (1998) Stabilization of alpha-synuclein secondary structur e upon binding to synthetic membranes, J Biol Chem 273, 9443-9. 104. Blake, C. C., Koenig, D. F., Mair, G. A., North, A. C., Phillips, D. C., and Sarma, V. R. (1965) Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Angstrom resolution, Nature 206, 757-61. 105. Ganz, T. (2004) Antimicrobial polypeptides, J Leukoc Biol 75, 34-8.

PAGE 218

204 106. Fleming, A. (1922) On a remarkable b acteriolytic element found in tissues and secretions., Proc. Roy. Soc. Lond. 93, 306-310. 107. Green, T. B., Ganesh, O., Perry, K., Sm ith, L., Phylip, L. H., Logan, T. M., Hagen, S. J., Dunn, B. M., and Edison, A. S. (2004) IA3, an aspartic proteinase inhibitor from Saccharomyces cerevisiae, is intrinsically unstructured in solution, Biochemistry 43, 4071-81. 108. Tapp, H., Al-Naggar, I. M., Yarmola, E. G., Harrison, A., Shaw, G., Edison, A. S., and Bubb, M. R. (2005) MARCKS is a natively unfolded protein with an inaccessible actin-binding site: evidence for long-range intramolecular interactions, J Biol Chem 280, 9946-56. 109. Chandra, S., Chen, X., Rizo, J., Jahn, R., and Sudhof, T. C. (2003) A broken alpha -helix in folded alpha -Synuclein, J Biol Chem 278, 15313-8. 110. Bubb, M. R., Lenox, R. H., and Edison, A. S. (1999) Phos phorylation-dependent conformational changes i nduce a switch in the ac tin-binding function of MARCKS, J Biol Chem 274, 36472-8. 111. Hall, S. L., and Padgett, R. A. (1996) Requirement of U12 snRNA for in vivo splicing of a minor class of euka ryotic nuclear pre-mRNA introns, Science 271, 1716-8. 112. Sharp, P. A., and Burge, C. B. (1997) Classification of in trons: U2-type or U12type, Cell 91, 875-9. 113. Schinkmann, K., and Li, C. (1994) Comparison of two Caenorhabditis genes encoding FMRFamide(Phe-MetArg-Phe-NH2)-like peptides, Brain Res Mol Brain Res 24, 238-46. 114. Kim, K., and Li, C. (2004) Expression and regulation of an FMRFamide-related neuropeptide gene family in Caenorhabditis elegans, J Comp Neurol 475, 540-50. 115. Mertens, I., Clinckspoor, I., Janssen, T., Nachman, R., and Schoofs, L. (2006) FMRFamide related peptide ligands activ ate the Caenorhabditis elegans orphan GPCR Y59H11AL.1, Peptides 27, 1291-6. 116. Darlison, M. G., and Richter, D. (199 9) Multiple genes for neuropeptides and their receptors: co-evolution and physiology, Trends Neurosci 22, 81-8. 117. Bargmann, C. I. 
(1998) Neurobiology of the Cae norhabditis elegans genome, Science 282, 2028-33.

PAGE 219

205 118. Karlin, S., and Bucher, P. (1992) Corre lation analysis of amino acid usage in protein classes, Proc Natl Acad Sci U S A 89, 12165-9. 119. Neher, E. (1994) How frequent are co rrelated changes in fa milies of protein sequences?, Proc Natl Acad Sci U S A 91, 98-102. 120. Maule, A. G., Nikki J. Marks. and David W. Halton. (2001) in Parasitic Nematodes: Molecular Biology Biochemistry, and Immunology (Harnett, M. W. K. a. W., Ed.) pp 415-440, CAB Inte rnational Publishing, New York, NY. 121. Edison, A. S., Espinoza, E., and Zachar iah, C. (1999) Conformational ensembles: the role of neuropeptide structures in receptor binding, J Neurosci 19, 6318-26. 122. Andersen, N. H., Neidigh, J. W., Harris, S. M., Lee, G. M., Liu, Z., and Tong, H. (1997) Extracting Information from the Temperature Gradients of Polypeptide NH Chemical Shifts. 1. The Importa nce of Conformational Averaging, J Am Chem Soc 119, 8547-8561. 123. Tiers, G. V., and Coon, R. I. (1961) Preparation of Sodi um 2,2-Dimethyl-2Silapentane-5-Sulfonate, a Useful Intern al Reference for Nsr Spectroscopy in Aqueous and Ionic Solutions, Journal of Organic Chemistry 26, 2097-&. 124. Silverman, S. K., Lester, H. A ., and Dougherty, D. A. (1996) Subunit stoichiometry of a heteromultimeric G protein-coupled inward-rectifier K+ channel, J Biol Chem 271, 30524-8. 125. Van Renterghem, C., Bilbe, G., Moss, S., Smart, T. G., Constanti, A., Brown, D. A., and Barnard, E. A. (1987) GABA r eceptors induced in Xenopus oocytes by chick brain mRNA: evaluati on of TBPS as a use-dependent channel-blocker, Brain Res 388, 21-31. 126. Brown, N. A., McAllister, G., Weinbe rg, D., Milligan, G., and Seabrook, G. R. (1995) Involvement of G-prot ein alpha il subunits in activ ation of G-protein gated inward rectifying K+ channels (GIRK1) by human NPY1 receptors, Br J Pharmacol 116, 2346-8. 127. Reale, V., Hannan, F., Hall, L. M., and Evans, P. D. 
(1997) Agonist-specific coupling of a cloned Drosophila melanoga ster D1-like dopamine receptor to multiple second messenger pathways by synthetic agonists, J Neurosci 17, 654553. 128. Piotto, M., Saudek, V., and Sklenar, V. (1992) Gradient-tai lored excitation for single-quantum NMR spectrosc opy of aqueous solutions, J Biomol NMR 2, 661-5.

PAGE 220

206 129. Schweiger, A., Braunschweiler, L., Faut h, J., and Ernst, R. R. (1985) Coherent and incoherent echo spectroscopy with extended-time excitation, Physical Review Letters 54, 1241-1244. 130. Bothner-By, A. A., Stephens, R. L., Lee, J., Warren, C. D., and Jeanloz, R. W. (1984) Structure determination of a tetras accharide: transient nuclear Overhauser effects in the rotating frame, J Am Chem Soc 106, 811-813. 131. Delaglio, F., Grzesiek, S., Vuister, G. W ., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes, J Biomol NMR 6, 277-93. 132. Johnson Bruce A., a. B., R. A. (1994) NMR View: A computer program for the visualization and analysis of NMR data, J Biomol NMR 4, 603-614. 133. Wuthrich, K. (1986) NMR of Proteins and Nucleic Acids, New York, NY. 134. Betz, M., Lohr, F., Wienk, H., and Rute rjans, H. (2004) Long-range nature of the interactions between titratable group s in Bacillus agaradhaerens family 11 xylanase: pH titration of B. agaradhaerens xylanase, Biochemistry 43, 5820-31. 135. Carlacci, L., and Edison, A. S. (2000) Computational analysis of two similar neuropeptides yields distinct conformational ensembles, Proteins 40, 367-77. 136. Stretton, A. O., Cowden, C., Sith igorngul, P., and Davis, R. E. (1991) Neuropeptides in the nematode Ascaris suum, Parasitology 102 Suppl, S107-16. 137. Payza, K., Greenberg, M. J., and Price, D. A. (1989) Furthe r characterization of Helix FMRFamide receptors: kinetics, tissue distribution, and interactions with the endogenous heptapeptides, Peptides 10, 657-61. 138. Marqusee, S., Robbins, V. H., and Baldwi n, R. L. (1989) Unusually stable helix formation in short alanine-based peptides, Proc Natl Acad Sci U S A 86, 5286-90. 139. Chakrabartty, A., Kortemme, T., and Bald win, R. L. (1994) Helix propensities of the amino acids measured in alanine-base d peptides without helix-stabilizing sidechain interactions, Protein Sci 3, 843-52. 140. Wishart, D. 
S., Sykes, B. D., and Ri chards, F. M. (1991) Relationship between nuclear magnetic resonance chemical sh ift and protein secondary structure, J Mol Biol 222, 311-33. 141. Schwarzinger, S., Kroon, G. J., Foss, T. R., Wright, P. E., and Dyson, H. J. (2000) Random coil chemical shifts in acidic 8 M urea: implementation of random coil shift data in NMRView, J Biomol NMR 18, 43-8.

PAGE 221

207 142. Schwarzinger, S., Kroon, G. J., Foss, T. R., Chung, J., Wright, P. E., and Dyson, H. J. (2001) Sequence-dependent correcti on of random coil NMR chemical shifts, J Am Chem Soc 123, 2970-8. 143. Wishart, D. S., and Case, D. A. (2001) Use of chemical shifts in macromolecular structure determination, Methods Enzymol 338, 3-34. 144. Lin, J. C., Barua, B., and Andersen, N. H. (2004) The helical alanine controversy: an (Ala)6 insertion drama tically increases helicity, J Am Chem Soc 126, 1367984. 145. Andersen, N. H., Cao, B., and Chen, C. (1992) Peptide/protei n structure analysis using the chemical shift index method: upfield alpha-CH values reveal dynamic helices and alpha L sites, Biochem Biophys Res Commun 184, 1008-14. 146. Dyson, H. J., Rance, M., Houghten, R. A ., Lerner, R. A., and Wright, P. E. (1988) Folding of immunogenic peptide fragments of proteins in water solution. I. Sequence requirements for the formation of a reverse turn, J Mol Biol 201, 161200. 147. Dyson, H. J., Rance, M., Houghten, R. A ., Wright, P. E., and Lerner, R. A. (1988) Folding of immunogenic peptide fragments of proteins in wate r solution. II. The nascent helix, J Mol Biol 201, 201-17. 148. Blanco, F. J., Serrano, L., and Forman -Kay, J. D. (1998) High populations of nonnative structures in the denatured state ar e compatible with the formation of the native folded state, J Mol Biol 284, 1153-64. 149. Osterhout, J. J., Jr., Baldwin, R. L., York E. J., Stewart, J. M., Dyson, H. J., and Wright, P. E. (1989) 1H NMR studies of the solution conformations of an analogue of the C-pept ide of ribonuclease A, Biochemistry 28, 7059-64. 150. Bundi, A., and Wuthrich, K. (1977) 1H NMR titration shifts of amide proton resonances in polypeptide chains, FEBS Lett 77, 11-4. 151. Bundi, A., Wuthrich, K. (1979) Use of 1H-NMR Titration Shif ts for Studies of Polypeptide Conformation, Biopolymers 18, 299-311. 152. Szyperski, T., Antuch, W., Schick, M., Betz, A., Stone, S. R., and Wuthrich, K. 
(1994) Transient hydrogen bonds identified on the surface of the NMR solution structure of Hirudin, Biochemistry 33, 9303-10. 153. Baxter, N. J., and Williamson, M. P. (1997) Temperature dependence of 1H chemical shifts in proteins, J Biomol NMR 9, 359-69.

PAGE 222

208 154. Elipe, M. V. S., Ralph T. Mosley, Ma ria A. Bednarek, Byron H. Arison. (2003) 1H-NMR Studies on a Potent and Select ive Antagonist at Human Melanocortin Receptor 4 (hMC-4R), Biopolymers 68, 512-527. 155. Higashijima, T., Tasumi, M., and Miyazawa, T. (1975) H nuclear magnetic resonance studies of melanostatin: depe ndence of the chemical shifts of NH protons on temperature and concentration, FEBS Lett 57, 175-8. 156. Kenakin, T. (2004) Principles : receptor theory in pharmacology, Trends Pharmacol Sci 25, 186-92. 157. Evans, P. D., Robb, S., Cheek, T. R., Reale, V., Hannan, F. L., Swales, L. S., Hall, L. M., and Midgley, J. M. (1995) Agonist-specific coup ling of G-proteincoupled receptors to s econd-messenger systems, Prog Brain Res 106, 259-68. 158. Robb, S., Cheek, T. R., Hannan, F. L., Ha ll, L. M., Midgley, J. M., and Evans, P. D. (1994) Agonist-specific coupling of a cloned Drosoph ila octopamine/tyramine receptor to multiple second messenger systems, Embo J 13, 1325-30. 159. Spengler, D., Waeber, C., Pantaloni, C., Holsboer, F., Bockaert, J., Seeburg, P. H., and Journot, L. (1993) Di fferential signal transducti on by five splice variants of the PACAP receptor, Nature 365, 170-5. 160. Floyd, P. D., Li, L., Rubakhin, S. S., Sw eedler, J. V., Horn, C. C., Kupfermann, I., Alexeeva, V. Y., Ellis, T. A., Dembrow, N. C., Weiss, K. R., and Vilim, F. S. (1999) Insulin prohormone processing, dist ribution, and relation to metabolism in Aplysia californica, J Neurosci 19, 7732-41. 161. Li, L., Moroz, T. P., Garden, R. W., Fl oyd, P. D., Weiss, K. R., and Sweedler, J. V. (1998) Mass spectrometric survey of in terganglionically transported peptides in Aplysia, Peptides 19, 1425-33. 162. Bedford, G. O. (1978) Biol ogy and Ecology of Phasmatodea, Annual Review of Entomology 23, 125-149. 163. Conle, O. V., and Hennemann, F. H. 
(2005) Studies on neotropical Phasmatodea I: A remarkable new species of Pe ruphasma Conle & Hennemann, 2002 from northern Peru (Phasmatodea : Pse udophasmatidae : Pseudophasmatinae), Zootaxa, 59-68. 164. Eisner, T., Morgan, R. C., Attygalle, A. B., Smedley, S. R., Herath, K. B., and Meinwald, J. (1997) Defensive produc tion of quinoline by a phasmid insect (Oreophoetes peruana), J Exp Biol 200, 2493-500.

PAGE 223

209 165. Ho, H. Y., and Chow, Y. S. (1993) Chemical-Identification of Defensive Secretion of Stick Insect, Megacrania-Tsudai-Shiraki, Journal of Chemical Ecology 19, 39-46. 166. Chow, Y. S., and Lin, Y. M. (1986) Ac tinidine, a Defensive Secretion of Stick Insect, Megacrania-Alp heus Westwood (Orthoptera, Phasmatidae), Journal of Entomological Science 21, 97-101. 167. Carlberg, U. (1986) Chemical Defe nse in SipyloideaSipylus (Westwood) (Insecta, Phasmida), Zoologischer Anzeiger 217, 31-38. 168. Bouchard, P., Hsiung, C. C., and Yaylay an, V. A. (1997) Chemical analysis of defense secretions of Sipyloidea sipylus and their potential use as repellents against rats, Journal of Chemical Ecology 23, 2049-2057. 169. Schneider, C. O. (1934) Las eman aciones del chinchemayo Paradoxomorpha crassa, Rev. Chil. Hist. Nat. 38, 44-46. 170. Scudder, S. H. (1876) Odoriferous glands in Phasmidae, Psyche 1, 137-140. 171. Smith, R. M., Joseph J. Brophy. G.W.K. Cavill, and Noel W. Davies. (1979) Iridodials and Nepetalactone in the Defe nsive Secretion of the Coconut Stick Insects, Graeffea crouani, J. Chem. Ecol. 5, 727-735. 172. Thomas, M. C. (2001) The Twostriped Walkingstick, Anisomorpha buprestoides (Stoll), (Phasmatodea: Pseudophasmatidae), Fla. Dept. Agri. & Consumer Svcs. Division of Plant Industry Entomology Circular 408. 173. Eisner, T. (1965) Defensive Spray of a Phasmid Insect, Science 148, 966-&. 174. Cavill, G. W., and Hinterberger, H. (1 961) Chemistry of Ants .5. Structure and Reactions of Dolichodial, Australian Journal of Chemistry 14, 143-&. 175. Ricci, D., Fraternale, D., Giamperi, L., Bucchini, A., Epifano, F., Burini, G., and Curini, M. (2005) Chemical composition, an timicrobial and antioxidant activity of the essential oil of Teucrium marum (Lamiaceae), Journal of Ethnopharmacology 98, 195-200. 176. Pagnoni, U. M., Pinetti, A., Trave, R ., and Garanti, L. (1976) Monoterpenes of Teucrium-Marum, Australian Journal of Chemistry 29, 1375-1381. 177. 
Eisner, T., Eisner, M., Aneshansley, D. J., Wu, C. L., and Meinwald, J. (2000) Chemical defense of the mint pl ant, Teucrium marum (Labiatae), Chemoecology 10, 211-216.

PAGE 224

210 178. Aue, W. P., Bartholdi, E., and Ernst, R. R. (1976) 2-Dimensional Spectroscopy Application to Nuclear Magnetic-Resonance, Journal of Chemical Physics 64, 2229-2246. 179. States, D. J., Haberkor n, R. A., and Ruben, D. J. (1982) A Two-Dimensional Nuclear Overhauser Experiment with Pu re Absorption Phase in 4 Quadrants, Journal of Magnetic Resonance 48, 286-292. 180. Bax, A., and Davis, D. G. (1985) Practical Aspects of Two-Dimensional Transverse Noe Spectroscopy, Journal of Magnetic Resonance 63, 207-213. 181. Bax, A., Griffey, R. H ., and Hawkins, B. L. (1983) Correlation of Proton and N15 Chemical-Shifts by Multiple Quantum Nmr, Journal of Magnetic Resonance 55, 301-315. 182. Garbow, J. R., Weitekamp, D. P., and Pines, A. (1982) Bilinear Rotation Decoupling of Homonuclear Scalar Interactions, Chemical Physics Letters 93, 504-509. 183. Bax, A., and Summers, M. F. (1986) H-1 and C-13 Assignments from SensitivityEnhanced Detection of Heteronuclear Multiple-Bond Connectivity by 2d Multiple Quantum Nmr, Journal of the American Chemical Society 108, 2093-2094. 184. Hendrix, D. L. (1993) Rapid Extract ion and Analysis of Nonstructural Carbohydrates in Plant-Tissues, Crop Science 33, 1306-1311. 185. Rowellrahier, M., and Pasteels, J. M. (1986) Economics of Chemical Defense in Chrysomelinae, Journal of Chemical Ecology 12, 1189-1203. 186. Blum, M. S., and Woodring, J. P. (1962) Secretion of Benzaldehyde and Hydrogen Cyanide by Millipede Pachydesmus Crassicutis (Wood), Science 138, 512-&. 187. Tumlinson, J. H., Klein, M. G., Doolittle, R. E., Ladd, T. L., and Proveaux, A. T. (1977) Identification of Female Japanese Beetle Sex-Pherom one Inhibition of Male Response by an Enantiomer, Science 197, 789-792. 188. Rudolph, R., and Lilie, H. (1996) In vitro folding of in clusion body proteins, Faseb J 10, 49-56. 189. Invitrogen. Guide to Baculovirus Expr ession Vector Systems (BEVS) and Insect Cell Culture Techniques, www.invotrigen.com.

PAGE 225

211 190. Novagen. (2006) pET System Manual 11th Edition. 191. Kay, L. E. (2005) NMR studies of protein structure and dynamics, J Magn Reson 173, 193-207.

PAGE 226

BIOGRAPHICAL SKETCH

Aaron Todd Dossey was born at St. Deaconess Hospital in Oklahoma City, Oklahoma (USA) on December 15, 1977. He was raised in Midwest City, Oklahoma by his mother, Teresa Marie Dossey, and grandparents, Jerry Courtland and Emma Ailene Dossey. There he attended Eastside Elementary School and Jarman Jr. High School, and graduated from Midwest City High School as a valedictorian in 1996. In Junior High and High School he was active in the band and enrolled in extra math and honors courses. He lived in Midwest City until moving to Stillwater, Oklahoma for his undergraduate college education at Oklahoma State University (OSU). In the fall of his first year at OSU, Aaron was in the OSU marching band. During his tenure at OSU he was fortunate enough to be involved in several research projects, gaining invaluable hands-on experience in biochemical techniques. Aaron worked for a short time under Dr. Steven White on a phospholipase A2 molecular modeling project. Later, he worked with Dr. Ulrich Melcher making DNA constructs of tobacco viruses. From 1999 through spring 2001, he completed work on enzyme kinetics of dehydrogenase enzymes (in the laboratory of Dr. Olin Spivey), which formed the basis of Aaron's undergraduate honors thesis. Aaron was also awarded a Wentz Research award for the work in Dr. Spivey's lab. In 2001 he received his bachelor's degree Cum Laude in Biochemistry and Molecular Biology (Department of Agricultural Sciences and Natural Resources) with minors in Chemistry and Mathematics from OSU.

After applying to and visiting several universities to which he was accepted for graduate school, Aaron chose to attend the Interdisciplinary Program in Biomedical Sciences (IDP) at the University of Florida, College of Medicine, in the fall of 2001. There, after working through the lab rotations required in the first year, he chose a position in the lab of Dr. Arthur Scott Edison using Nuclear Magnetic Resonance (NMR), among other techniques, to analyze finer molecular details of FMRFamide-like neuropeptides from the nematode Caenorhabditis elegans and defensive secretions of the phasmid insect species Anisomorpha buprestoides and Peruphasma schultei. The latter project occurred during the spring of Aaron's last semester as a student and turned out to be a very nice last-minute project which was also incorporated into this dissertation. It stemmed from Aaron's personal passion for insects and was made possible by his having his own cultures of A. buprestoides.

Aaron's tenure as a graduate student at the University of Florida was an unfolding mystery. At his arrival in the fall of 2001 he could not have predicted the many travels and projects he has been involved with as a student under Dr. Edison. The insect chemistry project (Chapter 4 of this dissertation) that he was allowed to pursue was particularly spontaneous. Aaron believes that his experiences at the University of Florida illustrate that everything happens for a reason.