THE EXPLORATION OF STRUCTURAL VARIABILITY IN PROTEINS BY NATURAL SELECTION AND RATIONAL DESIGN

By

JOSE ANTONIO HERNANDEZ PRADA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2006
Copyright 2006 by José Antonio Hernández Prada
To all life forms sacrificed in the name of scientific research for the better quality of humankind, especially mice, rats, and higher mammals
ACKNOWLEDGMENTS

This dissertation would not have been completed without the support of my supervisory committee, coworkers, family, and friends. First I thank my mentor (Dr. David Ostrov), whose counseling extended to research in the lab, manuscript preparation, and public speaking. He taught me the importance and meaning of collaborative efforts in science (along with bits of politics I am still adjusting to). I am also grateful for his patience and understanding, which never gave the slightest sign of running out, and for all the support and constant encouragement that made the last 3.5 years enjoyable, motivating, and incredibly stimulating. Next I express my gratitude to my supervisory committee members: Dr. Brian Cain, Dr. Joanna Long, Dr. Mohan Raizada, and Dr. Thomas Rowe. Their constructive criticism gave proper shape to much of what is included in this dissertation and led me to anticipate our meetings with enthusiasm instead of apprehension. I am also grateful to Dr. Susan Frost, Dr. Harry Nick, Dr. Michael Kilberg, and Dr. James Flanegan for objective criticism during and after journal club presentations. Their constant emphasis on the importance of such practice for beginning scientists was noticed and appreciated. I extend special thanks to our collaborators at the University of South Florida, in Tampa. The laboratory of Dr. Gary Litman (particularly Dr. Robert Haire, Dr. John Cannon, and Dr. Litman himself of course) involved me in a rewarding project and allowed me to partake in a great collaboration. This dissertation would not have been completed without their support and expertise.
Next I thank Dr. Robert McKenna and his wife Dr. Mavis Agbandje-McKenna, for bringing me early to Gainesville by letting me join their lab in May before the program started. I thank their lab members at the time (mainly David Duda, Craig Yoshioka, Dr. Lakshmanan Govindasami, Robbie Reutzel, and Javier Ortiz) for receiving me in their lab. Finally, I thank my classmates, coworkers, and friends. I thank my classmates for their sympathy and endless support. I thank past members of the Ostrov lab (Luke, Christine, and Patrick) for their company and assistance during their time in the lab. I thank Amanda for managing the lab and helping with so many experiments and administrative tasks. I thank my friends from back home: none of them do science, but they always show interest in my involvement with it. Finally I would like to thank my family, especially my parents and my sisters, for supporting me always and being there for me. There is no room in this dissertation to thank them for the 20+ years of teaching, guiding, and encouragement. I thank my sisters for the laughter we have shared in the last few years, and I thank my parents for raising me the way they did.
TABLE OF CONTENTS

                                                                        page

ACKNOWLEDGMENTS ..... iv
LIST OF TABLES ..... ix
LIST OF FIGURES ..... x
ABSTRACT ..... xii

CHAPTER

1 BACKGROUND AND SIGNIFICANCE ..... 1
   Introduction ..... 1
   Structural Variability in Proteins ..... 2
      Categorizing Structural Variability ..... 3
      Obvious Examples of Structural Diversity ..... 4
   Conformational Subpopulations of Protein Structures as Tertiary Structural Variability ..... 5
   Enzyme Dynamics May Limit Catalysis ..... 6
   Population Shift in Protein Structures as a Novel Strategy in Drug Therapy ..... 8
   The Treatment of Hypertension and ACE2 Activation ..... 10
   Molecular Docking and Virtual Screening as Tools to Exploit Conformational Equilibria for Drug Discovery ..... 11
      Molecular Docking and Virtual Screening ..... 12
      How DOCK Works ..... 13
   The Immunoglobulin Fold and Primary Structural Variability ..... 16
   Amphioxus Harbors Primordial V-Type Immunoglobulins ..... 18
   Non-Canonical Interactions Become an Increasingly Recognized Feature in the Structure of Proteins ..... 20
      Nucleic Acids and C-H---O Bonds ..... 20
      Proteins and C-H---O Bonds ..... 21
      Carbohydrates and C-H---O Bonds ..... 22
      Conventional Hydrogen Bonds are Known to Vary in Strength ..... 22
   Known Chitin-Binding Domains Have Antimicrobial Activity ..... 23
      Chitin ..... 23
      Invertebrate and Plant Chitinases ..... 24
      Mammalian Chitinases ..... 26

2 CRYSTALLIZATION AND PRELIMINARY X-RAY ANALYSIS OF VCBP2 AND VCBP3 FROM Branchiostoma floridae ..... 28
   Introduction ..... 28
   Materials and Methods ..... 30
      Constructs and Expression ..... 30
      Refolding and Purification ..... 31
      Crystallization of VCBP3 (V1 and V1V2) ..... 32
      Crystallization of VCBP2 (V1 and V1V2) ..... 33
      Data Collection and Processing for VCBP3V1 and VCBP2V1 ..... 33
      Data Collection and Processing for VCBP3V1V2 ..... 35
   Results ..... 36
   Discussion ..... 38

3 THE 1.15 Å CRYSTAL STRUCTURE OF VCBP3V1 FROM Branchiostoma floridae ..... 44
   Introduction ..... 44
   Materials and Methods ..... 45
      Expression and Purification of VCBP3V1 ..... 45
      Crystallization Conditions ..... 45
      Data Collection and Processing ..... 45
      Crystallographic Refinement ..... 47
   Results ..... 49
   Discussion ..... 56

4 ORIGINS OF IMMUNE-TYPE RECEPTOR DIVERSITY: CRYSTAL STRUCTURE OF VCBP3V1V2 IN AMPHIOXUS ..... 63
   Introduction ..... 63
   Materials and Methods ..... 65
      Protein Expression, Purification, and Crystallization ..... 65
      Data Collection and Processing ..... 65
      Phasing and Refinement ..... 66
      Comparative Homology Modeling of VCBP2 V Domains ..... 67
   Results ..... 68
   Discussion ..... 75

5 IDENTIFICATION OF ACE2 ACTIVATORS BY VIRTUAL SCREENING OF SMALL MOLECULE LIBRARIES ..... 78
   Introduction ..... 78
   Materials and Methods ..... 81
      Virtual Screening ..... 81
      Enzymes, Substrates, and Small Molecule Compounds ..... 82
      Activity Assays ..... 83
   Results ..... 84
   Discussion ..... 90

6 CONCLUSIONS AND FUTURE DIRECTIONS ..... 100
   Crystallization and Preliminary X-ray Analysis of VCBP2 and VCBP3 from Branchiostoma floridae ..... 101
   The 1.15 Å Crystal Structure of VCBP3V1 from Branchiostoma floridae ..... 102
   Origins of Immune-Type Receptor Diversity: Crystal Structure of VCBP3V1V2 in Amphioxus ..... 104
   Identification of ACE2 Activators by Virtual Screening of Small Molecule Libraries ..... 105
   Future Directions ..... 107

LIST OF REFERENCES ..... 112
BIOGRAPHICAL SKETCH ..... 125
LIST OF TABLES

Table                                                                   page

2-1 Data collection and reduction statistics for VCBP crystals ..... 35
3-1 Data collection statistics for VCBP3V1 crystals ..... 47
3-2 Refinement statistics for the crystal structure of VCBP3V1 ..... 48
3-3 Structural comparison of VCBP3V1 with proteins in the Protein Data Bank ..... 52
3-4 Non-canonical interactions of important immunoglobulin-fold residues in VCBP3V1 ..... 59
4-1 Data collection and refinement statistics from VCBP3V1V2 derivative ..... 66
5-1 Top ten scoring compounds for the three different sites docked ..... 86
LIST OF FIGURES

Figure                                                                  page

1-1 Closed and open forms of ACE2 ..... 9
1-2 The renin-angiotensin system ..... 11
1-3 How DOCK works ..... 15
1-4 Plant and invertebrate chitin-binding domain superimposed ..... 25
1-5 NMR solution structure of tachycitin (PDBID: 1DQC) ..... 26
2-1 Formation of hexagonal VCBP3V1 crystals in different precipitant conditions ..... 33
2-2 Early crystals of VCBP2V1 identified by automated high throughput screening ..... 34
2-3 VCBP3V1V2 crystals at different pH ..... 37
2-4 Probability distribution for Matthews coefficient and solvent content values of protein crystals ..... 39
3-1 VCBP3V1 selenomethionine derivative crystals and diffraction ..... 46
3-2 Electron density calculated from three-wavelength MAD experimental phases to 1.3 Å ..... 50
3-3 VCBP3V1 is a V-type immunoglobulin ..... 51
3-4 Amino acid sequence alignment of Branchiostoma floridae VCBP3V1 with structural homologues ..... 54
3-5 Structural homologs of VCBP3V1 use their front sheets (A'GFCC'C'') in ligand interactions ..... 55
3-6 Non-canonical interactions at the core of VCBP3V1 between Trp42 and Cys110 ..... 58
4-1 Comparative homology modeling of VCBP2 based on the 1.85 Å crystal structure of VCBP3 ..... 68
4-2 Crystal structure of VCBP3V1V2 from Branchiostoma floridae solved by SAD and refined to 1.85 Å ..... 70
4-3 Structural comparison of the VCBP3 domain fold and packing interactions with antigen receptors ..... 72
4-4 VCBP3 uses a J-like segment at its interface ..... 74
5-1 Clusters targeting three sites on ACE2 ..... 83
5-2 Activators of ACE2 are structurally similar ..... 87
5-3 Compound 3 from site 1 activates ACE2 ..... 89
5-4 Compound 6 from site 1 activates ACE2 ..... 91
5-5 The ACE2 compounds do not enhance ACE activity ..... 92
5-6 Exhaustive molecular docking of ACE2 enhancers ..... 95
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

THE EXPLORATION OF STRUCTURAL VARIABILITY IN PROTEINS BY NATURAL SELECTION AND RATIONAL DESIGN

By

José Antonio Hernández Prada

May 2006

Chair: David A. Ostrov
Major Department: Biochemistry and Molecular Biology

At least 10^390 different sequences are possible for the average protein of 300 residues, and proteins approaching 1000 residues are not rare. Most of these available sequences probably fail to fold into a potentially useful unit, but those that can perform an advantageous function are selected, adapted, and eventually gain new functions. The immunoglobulin (Ig) fold is probably the most versatile domain known today, with many functions in cell adhesion and immunity. V region-containing chitin-binding proteins (VCBPs) from amphioxus, which lacks adaptive immune receptors, represent the only invertebrate variable (V) regions known to exhibit regionalized germline hypervariability and thought to belong to the V-type immunoglobulins. Together, these characteristics make VCBPs among the invertebrate immune-type receptors most closely related in structure to an ancestral adaptive receptor.

Crystal structures of VCBP3 to 1.15 Å and 1.85 Å were solved and confirm that VCBP3 exhibits V-type regions. The hypervariable regions map to a contiguous surface at the interface of the two Ig domains, similar to antigen receptors. The atomic resolution structure shows novel noncanonical interactions by conserved V-type Ig residues that likely had an important role in the structural evolution of the Ig fold. These structures may reveal features of primordial antigen receptors, bridging gaps in our knowledge of immune system evolution.

Adaptive receptors exemplify the exploitation of primary sequence variability by natural selection, but proteins show structural variability at higher levels of structure as well (i.e., secondary, tertiary, and quaternary). We outlined a novel strategy in drug discovery that exploits different conformations in enzymes (i.e., tertiary structure). ACE2 is a central regulator of the renin-angiotensin system and a target against hypertension. Structural analysis of ACE2 and virtual screening methods allowed us to identify two compounds that enhance enzymatic activity ~2-fold in a dose-response manner, while not affecting ACE activity. These compounds are predicted to interact with an allosteric site and show a high degree of structural similarity. This method is applicable to enzymes showing multiple conformers and offers new opportunities in the development of enzymatic activators. It also offers alternatives for developing novel inhibitors of resistant enzymes.
CHAPTER 1
BACKGROUND AND SIGNIFICANCE

Introduction

The average-size protein is 300 residues long (Dobson 2004). With only the 20 standard amino acids, there are 20^300 (about 10^390) possible sequences for a protein of this length, and proteins approaching 1000 residues are not rare. The pool of possible primary structures is therefore incredibly vast, yet the simplest organisms are predicted to express fewer than 1000 proteins, and our own proteome is estimated to consist of only a few thousand folds. This amazingly small subset of the available pool harbors enough variability to allow the different cells, tissues, and myriad biochemical reactions to function in a way that makes us what we are.

Clearly, complex organisms (or any organisms) do not depend solely on the functions of proteins. Lipids, carbohydrates, proteins, and nucleic acids all form an incredibly complex system in chemical biology. Other factors such as compartmentalization, posttranslational modification, alternative splicing, and chemical gradients are also known to function on different levels of complexity. Complexity should not be understood only spatially, since temporal separation of biochemical phenomena is probably involved in more events than we are currently aware of. This list of examples, by no means exhaustive, is meant to acknowledge the many other factors that may affect protein function. These factors encourage us to question and study the structures of proteins.
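The size of the sequence pool quoted at the start of this Introduction can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the original work; the function name is hypothetical, and it assumes an alphabet of exactly 20 amino acids with no other constraints on sequence. It works with base-10 logarithms so that the very large counts never need to be formed explicitly.

```python
import math


def log10_sequence_count(length: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of distinct sequences of the given length.

    The count itself is alphabet_size ** length; taking the logarithm keeps
    the numbers small enough to print and compare directly.
    """
    return length * math.log10(alphabet_size)


# Exponents for an average-size protein and for a 1000-residue protein.
exp_300 = log10_sequence_count(300)
exp_1000 = log10_sequence_count(1000)

print(f"300 residues:  about 10^{exp_300:.0f} possible sequences")
print(f"1000 residues: about 10^{exp_1000:.0f} possible sequences")
```

Under these assumptions the exponent for 300 residues is about 390, in agreement with the figure in the text, and it grows linearly with chain length.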
We examined two examples of structural variability in proteins and how this variability manifests in each case. These examples support our quest for understanding proteins for different reasons. Our study of the origins of selected variability in the primary structure of proteins (Chapters 3 and 4) allowed us to answer questions related to the evolution of the immune system. Our study of different structures related to protein dynamics (Chapter 5) enabled us to propose a novel strategy in drug development. Both examples, however, share a common template for variability (e.g., a fold or primary sequence). Understanding these differences in proteins will enable us to understand our evolutionary history and what we can do next to take advantage of these differences and push forward the frontiers of biomedicine.

Structural Variability in Proteins

Proteins must have a diverse set of structures to carry out the many chemical and structural roles they play in an organism. Obvious diversity is observed when we compare proteins that have distinct folds, or otherwise very different structural scaffolds. Although we may code for 20,000 to 30,000 proteins, the estimated number of folds in our proteome is much smaller (Orengo and Thornton 2005). This suggests that variability in any particular fold may be less clearly defined. A particular fold may have a different function due to small variations in primary structure such as point mutations and localized hypervariable residues (i.e., as in the case of antibodies). The same fold may exist in different conformations (Towler et al. 2004), or a particular fold may be able to switch structures and form a different fold (Ross et al. 2005). This variability, along with factors such as those mentioned above, is largely responsible for the complex systems that make up an organism. Although an exhaustive survey of the structures of proteins is not possible here, the variability of protein structures can be categorized in parallel to the
levels of structure traditionally taught when discussing the fundamentals of protein structure.

Proteins may have a primary, secondary, tertiary, and quaternary structure. Their primary structure consists of the amino acid sequence. Secondary structure involves the local folding of residues, mostly into one of two structures: helices and strands. These locally ordered segments can arrange to form larger units such as helix bundles, sheets, and many different folds, which belong to tertiary structure. Tertiary structure is classically described as the overall three-dimensional structure of proteins, although it could probably be subdivided to segregate multi-domain proteins from the single-domain proteins that better fit this description. Multi-domain proteins may behave somewhat like multimeric proteins if their domains are attached by a long flexible linker. The more independently the individual domains act, the more they resemble quaternary structure (e.g., proteins that engage in domain swapping). Quaternary structure arises when different subunits (not linked) form complexes behaving as single biological units. All proteins certainly have a primary structure, but many proteins do not have quaternary structure. Most proteins do have a secondary and tertiary structure that is stable, although modern concepts of protein dynamics such as local unfolding and conformer equilibria challenge these traditional views (Frauenfelder et al. 1988).

Categorizing Structural Variability

It may therefore be useful to categorize structural variability according to these levels of structure. Primary structural variability can be easily understood when we examine differences in protein sequence: mutations, deletions, insertions, polymorphisms, and similar variations of protein sequence. Secondary and tertiary structural variations can be grouped together since variation of either will seldom exclude
the other. Different topologies and folds are obvious representations of secondary and tertiary structural variability. Finally, quaternary structural variability is observed when proteins with the same sequence and fold have different functions or activities when complexed with different partners. An example of quaternary structural variability is observed when proteins form homo- and/or different heterodimers that function differently.

These categories are not independent, since variation in sequence may ultimately yield a different fold, different topology, or different multimerization capability. Likewise, some folds can better withstand less-than-conservative mutations and have a better chance of becoming polymorphic. This assumes that these mutations are not detrimental to their function. However, immune receptors are a good example of when sequence diversity may become advantageous (i.e., to recognize a diverse set of antigens). This categorization is evidently arbitrary, but may help in thinking about interestingly subtle differences in the structures of proteins.

Obvious Examples of Structural Diversity

Probably the best examples today of primary structural variability are the antigen receptors of the adaptive immune system. These receptors have regionalized sequence variability that enables them to recognize a wide range of antigens (Zemlin et al. 2002). In the case of antibodies and T cell receptors (adaptive receptors), the same fold and the same topology can be mutagenized by the immune system to recognize an almost infinite number of antigens (Zemlin et al. 2002). Other families of immune receptors throughout phylogeny are known to be highly polymorphic and thus capable of recognizing a wide range of antigens, but none to the extent of adaptive receptor capability (Litman et al. 2005). Antibodies and T cell receptors are also good examples of quaternary structural
variability, since heavy and light chains must pair to function and expand their repertoire by combinatorial diversity (Collins et al. 2003).

Another remarkable example of quaternary variability is G-proteins, which may pair with inhibitory and activator proteins at the cell membrane depending on what pathways are stimulated (Simonds 1999). Many receptors also function as homo- or heterodimers, and their effector functions may be different in each case. For some of these receptors, G-proteins, and other examples, functional roles may depend more strongly on quaternary structure; but in the case of adaptive and many other antigen receptors, both primary and quaternary structure strongly influence activity.

An extreme example of secondary/tertiary structural variability is observed in prion proteins and unstructured proteins. Prion proteins can restructure into infamous β-strand-rich folds that can multimerize and cause disease (Ross et al. 2005). Although the pathological structure of prion proteins multimerizes, it is the change in secondary/tertiary structure that enables these proteins to make new (unfortunately deadly) interactions. In the case of unstructured proteins, there is no stable secondary or tertiary structure throughout a significant portion of the protein. The native conformation of these proteins is the unfolded state, with no average structure (Dyson and Wright 2005). These proteins stabilize into a structure upon ligand binding and, owing to their extensive flexibility, are often capable of binding more than one ligand. Thus they may have variability on all three (secondary, tertiary, and quaternary) levels of structure.

Conformational Subpopulations of Protein Structures as Tertiary Structural Variability

Although any protein can exist as a set of conformers, we are particularly interested in applying this concept to enzymes. An enzyme may exist in different conformations and
some of these may have an increased ability to perform their function. Likewise, other conformations may be inactive. It is also possible that the rates of exchange between these conformations limit turnover rates. NMR relaxation experiments show that a single frequency can describe the amide relaxation rates for most residues in several proteins examined (Eisenmesser et al. 2005). This suggests that the measurements represent collective motions involving most of the enzyme and may correspond to large-scale conformational changes. Furthermore, many enzymatic reactions occur on micro- to millisecond timescales that may correspond to such conformational changes. These observations, together with experiments showing that these rates are almost identical in the presence or absence of substrate, suggest that the rate-limiting factor for many enzymes may be a conformational change. By considering these concepts (Chapter 5) as an example of tertiary structural variability, we were able to outline a novel strategy for drug discovery.

Enzyme Dynamics May Limit Catalysis

The function of a protein depends on both the stability and flexibility of its structure. Specific residues must be positioned appropriately in the active site of an enzyme to provide the required physicochemical environment for catalysis. The enzyme structure must provide a stable scaffold to hold these residues in place. One of the most important features of an active site is its ability to seclude reactants from solvent while at the same time positioning them with an appropriate geometry for the reaction to occur (Eisenmesser et al. 2005). Such strict requirements on the formation of a complex between an enzyme and its substrates are now understood to be rarely achieved by a "lock and key" process of binding (Chen et al. 2002). Instead, an appropriate degree of flexibility must allow substrates and products to enter and exit the active site at a rate that
ideally does not slow down turnover rates. In many cases this flexibility may be manifested as simple side-chain rearrangements and other small "induced-fit" effects (McCammon 2005). But what happens when the catalytic cycle of an enzyme involves large-scale changes in its structure that occur on the micro- to millisecond timescale? Would these more flexible enzymes, which are slower to perform the appropriate rearrangements, limit catalysis? There are many examples of enzymes whose rate-limiting step involves product release (Wolf-Watz et al. 2004). Others may not have the available kinetic evidence to suggest a rate-limiting step but clearly exist in at least two different conformations (open and closed). Such is the case for enzymes involving lid opening and closing (e.g., HIV protease) and hinge-bending enzymes (e.g., ACE2) (Towler et al. 2004).

The analysis of NMR relaxation data by Wolf-Watz and others (2004) showed that the reaction catalyzed by adenylate kinase (Adk) is limited by the dynamics of its lid. Their experiments compared the dynamics of the mesophilic kinase from E. coli and its thermostable homolog from Aquifex aeolicus. Hyperthermophilic proteins have an increased stability that allows them to function at 80 °C and higher temperatures. However, differences in the structure of mesophilic-thermophilic enzyme pairs are usually small. This observation suggests that the activities of these enzymes should be the same at low temperatures (most mesophilic enzymes completely denature between 40 and 60 °C), yet thermophilic enzymes show decreased enzymatic activity compared with their mesophilic homologs at the same temperature. The laboratory of Dorothee Kern (Wolf-Watz et al. 2004) showed that the decreased conformational exchange rates of the thermostable protein coincide with the decreased activity at lower temperatures.
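One way to make the link between lid dynamics and turnover concrete is a two-state (closed/open) exchange model. The Python sketch below is illustrative only and is not part of the original analysis: the function and dictionary names are hypothetical, the two-state treatment is a simplifying assumption, and the rate constants are the adenylate kinase values at 20 °C reported by Wolf-Watz et al. (2004) and quoted later in this section. It computes the equilibrium fraction of open conformer and compares the turnover rate with the lid-opening rate.

```python
def open_fraction(k_open: float, k_close: float) -> float:
    """Equilibrium fraction of the open state for a closed <-> open exchange.

    k_open is the closed-to-open rate constant and k_close the open-to-closed
    rate constant, both in s^-1. At equilibrium the two fluxes balance, so
    p_open = k_open / (k_open + k_close).
    """
    return k_open / (k_open + k_close)


# Rate constants in s^-1 at 20 C (Wolf-Watz et al. 2004); k_close is approximate.
ADK_RATES = {
    "meso Adk (E. coli)": {"k_open": 286.0, "k_close": 1400.0, "k_cat": 263.0},
    "thermo Adk (A. aeolicus)": {"k_open": 44.0, "k_close": 1400.0, "k_cat": 30.0},
}

for name, k in ADK_RATES.items():
    p_open = open_fraction(k["k_open"], k["k_close"])
    ratio = k["k_cat"] / k["k_open"]
    print(f"{name}: open fraction = {p_open:.2f}, k_cat / k_open = {ratio:.2f}")
```

In this simple picture the thermophilic enzyme spends less time open, and for both enzymes the turnover rate stays below the lid-opening rate, which is what would be expected if lid opening caps catalysis.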
As observed for other proteins, a single rate constant describes the motions of all amides with detectable conformational exchange in Adk. Interestingly, they found that the lid-closing rate constants are the same for both enzymes (kclose = ~1,400 s⁻¹) at 20 °C. Lid-opening rate constants, however, were much lower and matched kinetic turnover rates at the same temperature (kopen = 44 s⁻¹ and kcat = 30 s⁻¹ for thermo Adk; kopen = 286 s⁻¹ and kcat = 263 s⁻¹ for meso Adk). An independent experiment showed that ¹H-¹⁵N correlation spectra are almost identical in the presence of either substrate or a nonconvertible ATP analog (AMPPNP). Adk catalyzes the reversible conversion of one AMP and one ATP molecule into 2 ADP molecules. The chemically inactive compound binds the enzyme as the substrate would but does not undergo the phosphotransfer step. The NMR data show that the dynamic processes in Adk reflect the binding and release of substrate but not chemical catalysis.

Many enzymes require large conformational changes to perform catalysis. Advances in NMR relaxation techniques are enabling us to quantitatively characterize the rate constants that describe the interconversion between different structures (open and closed). A variety of other enzymes have been shown to perform domain or lid conformational changes with frequencies that approximate catalytic turnover rates (e.g., triose phosphate isomerase, dihydrofolate reductase, and RNAse A). Hinge-bending enzymes such as angiotensin-converting enzyme 2 have not been analyzed by these techniques yet, but they are likely to undergo similar dynamic processes (Figure 1-1).

Population Shift in Protein Structures as a Novel Strategy in Drug Therapy

Most therapeutic agents are designed to inhibit the activity of an enzyme. In structure-based approaches to drug discovery and development the ligand binding site is almost always targeted for inhibition. If successful, often a synthetic compound
(small molecule, antibody, or peptide) competitively inhibits enzymatic activity by binding at the active site. This requires that we understand the active site of the enzyme and that the affinity of the enzyme for the inhibitor be much greater than its affinity for the substrate. Logically, this approach is limited to the design or discovery of therapeutic agents that function by inhibition. However, by directing the effect of a therapeutic agent toward shifting the conformational populations of an enzyme, it may be possible to design enzymatic activators as well as novel inhibitors. This strategy may open the door to new targets in many diseases (e.g., partial enzyme deficiencies).

Figure 1-1. Closed and open forms of ACE2. If ACE2 exists mainly in these conformations, we may be able to enhance its activity either by increasing the concentration of a particular conformer or by increasing the rate of exchange between conformers. (A) The inhibitor-bound crystal structure (PDB ID: 1R4L); inhibitor not shown. (B) The open form of the enzyme (PDB ID: 1R42). Both crystal structures are colored according to secondary structure (red for helices, gray for loops, and gold for strands).

Novel inhibitors may be designed by developing compounds that stabilize inactive forms of an enzyme while targeting structural sites outside of the active site. We refrain from using the term "allostery" in our study since traditionally this term is strongly related to the cooperative effects observed in many multimeric proteins (Jaffe 2005). Although population-shift concepts are not new to allosteric proteins (Volkman et al. 2001), in our study we focus our efforts on enzymes. Novel inhibitors may be of
particular importance in the case of enzymes that are resistant to current therapeutic agents (e.g., HIV protease and antimicrobial targets).

By the same mechanism we could potentially stabilize a more active form of the enzyme. Alternatively, we could increase the rate of a conformational change that enriches the population of active enzyme. It is proposed here that if this mechanism can be validated for an inhibitor, we can likely design an activator based on these same principles.

The Treatment of Hypertension and ACE2 Activation

Hypertension is one of the most pressing health concerns, as it affects more than 65 million Americans (Suri et al. 2004). The prolonged high blood pressure experienced by individuals suffering from hypertension eventually leads to other ailments if untreated. This growing list of related complications includes strokes, ischemic heart disease, peripheral vascular disease, renal damage, and acute lung failure (Miura et al. 2001; Borghi et al. 2004). It is well established that the renin-angiotensin system (RAS) has an important role in the regulation of blood pressure (Zaman et al. 2002; Ruiz-Ortega et al. 2001) and agents that decrease activity of the RAS effectively lower blood pressure (Unger 2003). Likewise, transgenic and knockout mice for RAS genes develop cardiovascular and blood pressure phenotypes consistent with their function in the regulation of the RAS and hypertension (Bader 2002). Therefore, the control and treatment of hypertension is widely believed to have deep roots in the therapeutic intervention of the RAS.

In the classical understanding of the RAS (Figure 1-2), renin cleaves angiotensinogen to produce the decapeptide angiotensin I (1-10), which is the substrate of angiotensin-converting enzyme (ACE). Catalysis by ACE produces angiotensin II (1-8)
which stimulates angiotensin receptors. Recently, a homolog of ACE (ACE2) was identified (Donoghue et al. 2000; Tipnis et al. 2000). Angiotensin-converting enzyme 2 cleaves angiotensin I to produce angiotensin 1-9 and angiotensin II to produce angiotensin 1-7 (Figure 1-2). Angiotensin 1-7 has vasodilatory properties that counteract the vasoconstrictive effects of angiotensin II. Angiotensin 1-7 also inhibits ACE. Therefore, increased activity from ACE2 should result in desirable effects that lower blood pressure and likely offer therapeutic potential against hypertension. Most current antihypertensive treatments involve the inhibition of ACE or AT1 receptors (Unger 2003), but an agent able to enhance the enzymatic activity of ACE2 represents a novel approach to the treatment of hypertension and related diseases. We describe (Chapter 5) how we identified two small molecule compounds for this purpose.

Figure 1-2. The renin-angiotensin system. Classic understanding of the cascade is in blue. New alternatives that should be explored for therapeutic potential are shown in red.

Molecular Docking and Virtual Screening as Tools to Exploit Conformational Equilibria for Drug Discovery

As described in Chapter 5, the following in silico techniques were used to identify small molecule activators of angiotensin-converting enzyme 2, an important recently validated target for the treatment of hypertension (Katovich et al. 2005). Although more
experiments are needed to validate an equilibrium-shift mechanism of action for these compounds, we note that this is the first report on the identification of enzyme activity enhancers by this structure-based approach.

Molecular Docking and Virtual Screening

Molecular docking is a computer simulation that attempts to predict the binding orientation and conformation of a small molecule on a macromolecular target (Kitchen et al. 2004). Although the field has expanded to protein-protein docking and docking of other macromolecular assemblages (e.g., nucleic acids on proteins) (Kitchen et al. 2004), this latter development is not relevant to our study and will not be discussed further. Also, it is noted that different macromolecules can be targeted with small molecules. Of special interest to our study, however, is the application of molecular docking to proteins (namely enzymes) for the identification and development of small molecules that may modulate enzymatic activity.

The number of protein structures available from databases such as the Protein Data Bank (PDB) (Berman et al. 2000) is increasing rapidly. This large body of data provides the scientific community with a concomitant increase in the number of therapeutic targets available for the development of treatments against many diseases. Structures are particularly amenable to computational techniques, which offer economical and practical strategies and thus make the structure-based drug discovery and development process more effective (Bajorath et al. 2002; Langer et al. 2001; Walters et al. 1998). The in silico docking of small molecules was pioneered in the early 1980s (Kuntz et al. 1982) and is now an active field of research.

Virtual screening is the application of molecular docking to large databases of small molecule structure coordinates. It is estimated that there is a maximum of ~10^60 to 10^100 druglike molecules in "chemical space" (Bohacek et al. 1996). This is at present, and probably will be for decades to come, an impossible number of small molecules to screen by any current method, whether in vitro or in silico. Although current chemical libraries ("real" chemicals) consist of a few million compounds available for in vitro testing, the drug discovery field has benefited from virtual screening strategies by narrowing down the size of these databases while increasing lead identification rates. Current virtual screening techniques can select from these databases those compounds that are most likely to have an affinity for a given site on a macromolecular structure. Studies comparing functional testing of randomly selected database compounds with the same testing preceded by virtual pre-filtering continue to suggest about a 20-fold increase in the identification rate of known compounds (Becker et al. 2004).

How DOCK Works

Like most molecular docking software packages available, DOCK (Ewing et al. 2001) functions in two steps. First a complex between a small molecule and a macromolecular target is generated; then it is scored. A database of small molecule coordinates can be input to rank the compounds according to their scores. Those with better scores are more likely to have an affinity for the target site. It is emphasized that although an intense research effort is in place to develop docking software that can accurately calculate binding energies and complexes, the field is still young and current computational power and software are not well suited to that level of analysis. There are more accurate scoring methods available but they are computationally expensive. For practical reasons, scoring functions should allow the processing of hundreds of thousands of compounds in a short time, usually days, yet they also have to be accurate enough to
rank unknown active compounds high relative to the rest of the database so that they can be identified by in vitro testing. DOCK and similar software should be used to filter large databases of compounds and thereby increase the efficiency of the drug discovery process. Interpretations of high resolution interactions between a small molecule and its target should be carried out cautiously. See the work of Card and others (2004) for an impressive example of molecular docking applications.

DOCK generates orientations of a small molecule by using a geometric approach (Kuntz et al. 1982). A molecular surface map of the target structure is obtained by the Connolly method (Connolly et al. 1983) or a similar approach. Software from the DOCK package can then be used to place spheres where vectors normal to the surface intersect and meet specific parameters. These vectors will intersect on more concave surfaces. Spheres are therefore used to describe and represent a surface pocket or active site. Clusters of spheres form and can be selected to create a negative image of the site intended for drug interaction (Figure 1-3).

DOCK generates the orientations of small molecule compounds by matching the internal distances of the atom centers with those of the sphere centers. In other words, DOCK in effect superimposes the atoms of small molecules on these spheres, and when orientations meeting minimum criteria are identified, DOCK proceeds to score them. The best scoring orientation for each compound can be written as output for comparison with other small molecules once they have been ranked (Figure 1-3). Depending on computing resources, the evaluation of 100-1000 (or more) orientations is typically requested from DOCK.
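The distance-matching idea described above can be illustrated with a minimal Python sketch. It is not DOCK's implementation: the function names, the brute-force enumeration, and the `tolerance` and `n_nodes` parameters are assumptions introduced here for illustration only. The sketch pairs ligand atom centers with sphere centers and accepts a pairing when every atom-atom distance agrees with the corresponding sphere-sphere distance.

```python
import itertools
import math


def internal_distances_match(atoms, spheres, pairing, tolerance):
    """Check one candidate pairing of ligand atoms with sphere centers.

    `pairing` is a list of (atom_index, sphere_index) tuples. The pairing is
    accepted only if every atom-atom distance agrees with the corresponding
    sphere-sphere distance to within `tolerance` (hypothetical parameter).
    """
    for (a1, s1), (a2, s2) in itertools.combinations(pairing, 2):
        d_atoms = math.dist(atoms[a1], atoms[a2])
        d_spheres = math.dist(spheres[s1], spheres[s2])
        if abs(d_atoms - d_spheres) > tolerance:
            return False
    return True


def find_matches(atoms, spheres, n_nodes=3, tolerance=0.5):
    """Brute-force search for pairings of `n_nodes` atoms with `n_nodes` spheres.

    Illustrative only: exhaustive enumeration is feasible for tiny inputs,
    not for real ligands and sphere clusters.
    """
    matches = []
    for atom_idx in itertools.combinations(range(len(atoms)), n_nodes):
        for sphere_idx in itertools.permutations(range(len(spheres)), n_nodes):
            pairing = list(zip(atom_idx, sphere_idx))
            if internal_distances_match(atoms, spheres, pairing, tolerance):
                matches.append(pairing)
    return matches


# Toy example: three ligand atoms and a sphere cluster with the same shape,
# translated elsewhere in space.
toy_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_spheres = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)]
toy_matches = find_matches(toy_atoms, toy_spheres, n_nodes=3, tolerance=0.1)
```

Note that a pure distance comparison cannot distinguish a pairing from its mirror image, which is one reason such matches are only candidates for further filtering and scoring.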
Figure 1-3. How DOCK works. (A) Spheres where compounds are docked fill a target site on ACE2 (site 1, discussed in Chapter 5). Spheres (yellow) are enclosed by the scoring grid (purple). (B) The top-scoring compound after molecular docking. The molecular surface is colored according to secondary structure, shown in the left panel. (A, B) show ACE2 in identical orientations.

Some of the most important parameters involved in this matching step include the distance tolerance, the nodes minimum, and the maximum number of bumps allowed. The distance tolerance indicates how closely the internal distances of the atom centers of a compound have to approximate those of the spheres. The nodes minimum parameter, usually set to 3 or 4, indicates how many distances (within the margin of error set by the distance tolerance) have to match for an orientation to be considered for scoring. Sphere clusters ranging in size from 20 to 60 spheres are typically used. Finally, a bump filter eliminates orientations that clash with macromolecular atoms (i.e., spill outside the sphere cluster extensively). The maximum number of bumps, typically set from 3 to 5, indicates how many clashes are allowed for an orientation to pass. A clash is defined as a percent overlap of the van der Waals radii of the clashing atoms. These filtering criteria eliminate unfavorable orientations before scoring, which saves computational time.
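The bump filter described above can be sketched in a few lines of Python. This is a hedged illustration rather than DOCK's code: the atom representation, the function names, and the `overlap_fraction` default are assumptions, and the sketch counts clashing ligand-receptor atom pairs, which may differ from how a given docking program counts bumps.

```python
import math


def is_bump(lig_atom, rec_atom, overlap_fraction=0.75):
    """Return True if two atoms clash.

    Each atom is a tuple (x, y, z, vdw_radius). A clash is assumed here to
    mean that the centers are closer than `overlap_fraction` times the sum
    of the two van der Waals radii (hypothetical definition and default).
    """
    distance = math.dist(lig_atom[:3], rec_atom[:3])
    return distance < overlap_fraction * (lig_atom[3] + rec_atom[3])


def count_bumps(ligand, receptor, overlap_fraction=0.75):
    """Count clashing ligand-receptor atom pairs for one orientation."""
    return sum(
        1
        for lig_atom in ligand
        for rec_atom in receptor
        if is_bump(lig_atom, rec_atom, overlap_fraction)
    )


def passes_bump_filter(ligand, receptor, max_bumps=3, overlap_fraction=0.75):
    """Keep an orientation only if it has at most `max_bumps` clashes."""
    return count_bumps(ligand, receptor, overlap_fraction) <= max_bumps


# Toy orientation: one ligand atom near one receptor atom and far from another.
toy_ligand = [(0.0, 0.0, 0.0, 1.7)]
toy_receptor = [(1.0, 0.0, 0.0, 1.7), (10.0, 0.0, 0.0, 1.7)]
toy_bumps = count_bumps(toy_ligand, toy_receptor)
```

Because the check needs only coordinates and radii, it is far cheaper than an energy evaluation, which is why it is useful to apply it before scoring.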
Once an acceptable orientation is generated, DOCK can score it with a few different functions. Most calculations in our study were performed using the energy function implemented in DOCK. This energy function is a non-bonded molecular mechanics force field containing attractive and repulsive van der Waals terms and a coulombic term. This function does not account for entropy or explicit hydrogen bonding terms but is suitable for rapid calculation. Some newer functions in DOCK do account for some solvation and desolvation effects. However, these algorithms are in early stages of development and were not used in our study.

The score for any given orientation is the sum of all interactions of every atom in the small molecule with atoms of the macromolecule. However, not every atom from the target is included in this sum, as even small macromolecules may have thousands of atoms. Instead, only atoms close to and surrounding the sphere cluster are included in the calculation. These atoms are selected by generating a scoring grid (Figure 1-3). These grids contain pre-calculated attractive/repulsive van der Waals parameters, dielectric values, and charges at different points across the molecular surface pocket, making the scoring more rapid as these values only need to be retrieved from memory. Finally, once an orientation is scored it can be optimized to minimize the score (Meng et al. 1993). Docking minimization and flexible ligand searching parameters are extensive and will not be discussed here. For further details on other features of the DOCK package please consult the user manual (Ewing et al. 2001).

The Immunoglobulin Fold and Primary Structural Variability

The immunoglobulin fold has been classically defined to consist of a simple Greek key β-barrel, although crystal structures are more accurately described as a β-sandwich (Bork et al. 1994). In this fold, two β-sheets composed of 7-9 strands pack together in a
specific manner. Although the Ig fold is highly conserved structurally, its sequence varies greatly. Accordingly, Ig domains have been implicated in many different functions. Ig-like domains mainly function as cell surface receptors, many of which are involved in immune function, with others involved in cell adhesion. Others have been observed in more disparate examples, such as growth hormone receptor, matrix proteins (e.g., tenascin and fibronectin), intracellular regulatory proteins (e.g., chaperonin PapD), and finally as auxiliary domains in enzymes (Bork et al. 1994 and references therein). Notably, in all cases the Ig domain is involved in ligand recognition and it has never been observed to have intrinsic enzymatic activity. V-type immunoglobulins, seen mainly in antibodies and T cell receptors, are discussed in Chapters 3 and 4 while considering the origins of the structural features that lead to the vast amount of variability used by the adaptive immune system. There are, however, other types of immunoglobulins.

Immunoglobulin domains can be classified mainly into four different classes: V-, C-, H-, and S-type immunoglobulins (Bork et al. 1994). V-type immunoglobulins have a conserved 9-strand topology (back sheet strands: DEBA; front sheet strands: GFCC'C''). The other types show a 7-strand topology but can be distinguished depending on whether the C'/D strand is positioned on the front or back sheet. C-type immunoglobulins have a classical 7-strand topology with DEBA strands comprising the back sheet and GFC strands in the front sheet. S-type immunoglobulins also have a 7-strand topology but the D strand is replaced by a C' strand (back sheet strands: EBA; front sheet strands: GFCC'). The H-type domain is a hybrid between the C- and S-type Ig. In the H-type Ig the N-terminal segment of strand C'/D forms part of the back sheet and the C-terminal end of the strand hydrogen bonds to the front sheet. The review of Bork and others
(1994) contains excellent figures comparing the different types of Ig domains. In Chapter 3 (Figure 3-3) we illustrate and examine closely the topology of the V domain.

Common to all Ig types, however, are the more structurally conserved BCEF strands that are surrounded by the more structurally variable remaining strands and loops. Perhaps not surprisingly, these 4 strands (the 2 center strands of each sheet) constitute the core of the Ig fold. It was also found by Bork and others (1994) that the topology adopted by the Ig was correlated to the length of the intervening sequence between the C and the E strands. Basically, as the length of the amino acid sequence between the C and E strands increases, there is an increasing likelihood for the Ig domain to adopt an S- to H- to C- to V-type structure. This is consistent with the observation that the V-type fold has additional strands C' and C'', 9 strands in total, and so a larger primary structure is necessary.

One last highly conserved structural feature of the Ig domain is a disulfide bond that covalently joins the front and back sheets of the β-sandwich. The classical Ig fold has such a pair of cysteine residues in the B and F strands, but examples of a disulfide bond between the C and F strands and between the A and G in addition to the B and F strands have been observed in CD2. In a small number of cases the disulfide bonds are entirely absent (e.g., CD4). Interestingly, it has been found that families of Ig domains missing a disulfide bond show a greater degree of conservation of hydrophobic residues at the core of the fold.

Amphioxus Harbors Primordial V-Type Immunoglobulins

The adaptive immune system arose abruptly in ancestors of the jawed vertebrates, approximately 500 million years ago (Litman et al. 1999; van den Berg et al. 2004).
Proteins characteristic of adaptive immune responses (e.g., immunoglobulin and T cell antigen receptor [TCR]) have been identified in all species of jawed vertebrates examined
thus far (Rast et al. 1997). However, no definitive homolog of either these or other genes associated with adaptive immune function has been reported in jawless vertebrates or invertebrates. Although the identity of the "primordial" receptor that gave rise to antigen receptors in jawed vertebrates may never be established, several different families of genes that show predicted characteristics of such a receptor have been described in and outside the jawed vertebrates (Rast and Litman 1998).

Among the protochordates, efforts have focused on the amphioxus (Branchiostoma floridae), a cephalochordate that represents the most phylogenetically proximal invertebrate form on a direct line with vertebrates. Efforts to find adaptive immune genes in this species have identified a multigene family that encodes the variable region-containing chitin-binding proteins (VCBPs) (Cannon et al. 2002). These molecules possess two tandem V domains and a chitin-binding domain and can be classified into five major families. Comparisons of pooled mRNAs (cDNAs) and genomic sequences derived from individual animals (Cannon et al. 2004) show several regions of considerable sequence substitution (i.e., VCBPs are diversified at both the inter- and intrafamily levels). To determine the extent of structural similarity among the VCBP proteins (which likely function as innate receptors) and antigen receptors characteristic of adaptive immune responses, N-terminal, immunoglobulin-like variable domains from the VCBP3 and VCBP2 protein families were crystallized. An understanding of VCBP structures will likely provide insights into the evolutionary mechanisms that underlie the dissemination of the antigen binding receptors.
Non-Canonical Interactions Become an Increasingly Recognized Feature in the Structure of Proteins

Most non-canonical interactions consist of some unconventional form of a C-H---Y hydrogen bond, in which Y is the hydrogen bond acceptor. Although the first C-H---O hydrogen bond was suggested by Maurice Huggins soon after the discovery of the hydrogen bond (Huggins 1943), these interactions were not conclusively confirmed until 1982, in a survey of neutron structures (Taylor and Kennard 1982). In macromolecules, however, these interactions were usually and incorrectly regarded as hydrophobic interactions until recently (Wahl and Sundaralingam 1997), at least in part due to the large size of macromolecules and the limited resolution of crystal structures. But with advances in X-ray crystallography, NMR spectroscopy, and other structural biology techniques, it is now widely accepted that these unconventional hydrogen bonds play significant roles in the structures of proteins (Derewenda and Derewenda 1995; Loll et al. 2003). Although weak (1-2 kcal/mole), these interactions seem to be numerous and thus the structural effects they induce are presumed to be far from trivial (Desiraju 1996).

Nucleic Acids and C-H---O Bonds

C-H---Y bonds have been reported and discussed for nucleic acids, proteins, and carbohydrates. The widespread observation of these interactions points to their fundamental chemical nature. In nucleic acids these interactions are thought to expand base-pairing capabilities significantly, implicating a functional role in homologous recombination, where C-H---O bonds might allow the third strand to discriminate between the pairs in the Watson-Crick duplex (Zhurkin et al. 1994). It has also been suggested that the more restricted hydrogen bonding ability of thymine makes this base a better fit for its role in the hereditary material (DNA), whereas uracil is more versatile in its
interactions, making it a better candidate to function in RNA tertiary structure (Wahl et al. 1996).

Proteins and C-H---O Bonds

In proteins C-H---O bonds are more commonly found in β-sheets, where the slightly acidic hydrogen on the Cα satisfies carbonyl oxygens of adjacent parallel or antiparallel strands (Derewenda and Derewenda 1995). In high resolution crystal structures it has been clearly observed that a single carbonyl oxygen forms a bifurcated hydrogen bond with main chain C-H and N-H hydrogen bond donors (Card et al. 2004). In α-helices these interactions are rare and most often consist of carbonyls paired with C-H and C-H donors.

Remarkably, in the case of proteins there is at least one case where C-H---O bonds have been implicated in a specific functional role. In serine hydrolases a bond between the C-H of the catalytic histidine and an aspartate carboxylic group positions the histidine residue for catalysis (Derewenda et al. 1994).

A biologically relevant case of overabundance of C-H---O bonds in proteins has been discussed for photosystem I from S. elongatus (Loll et al. 2003). In the crystal structure of this trimer complex consisting of 34 transmembrane helices, 75 C-H---O bonds were identified following criteria similar to those described in Wahl and Sundaralingam (1997), whereas only 49 conventional hydrogen bonds were identified. Of those 75 non-canonical bonds only 3 occur intermolecularly. This nonrandom distribution suggests that they play a specific role in the structure of these membrane proteins. Membrane spanning regions of proteins depend largely on hydrophobic and van der Waals interactions, but like conventional hydrogen bonds, C-H---O bonds are directional and could potentially guide hydrophobic specificity (Loll et al. 2003). C-H---O bonds could
therefore represent an important cohesive force in the lipophilic environment of membrane spanning protein regions, the hydrophobic cores of folded proteins, and even the active sites of enzymes, where dielectrics are low and favor these interactions.

Carbohydrates and C-H---O Bonds

In the case of carbohydrates it is not surprising to observe an abundance of C-H---O bonds considering their high content of CH and C=O groups. From neutron structures analyzed for these interactions it was determined that 93% of the C-H groups were within 3 Å of a carbonyl oxygen. The geometrical properties of these interactions varied from optimal to weaker "forced" interactions, but at this frequency it is almost certain that their cumulative effect bears an important role in the stabilization of the overall structure.

Conventional Hydrogen Bonds are Known to Vary in Strength

Conventional hydrogen bonds are now understood to vary in strength depending on the length and linearity of the bond as well as the pK difference between donor and acceptor (Cleland 2000). Typically the hydrogen bond is described in the form X-Hδ+---Aδ-, where X-H is the donor and A is the acceptor. In water the distance between oxygens is 2.8 Å and the ΔH of formation corresponds to 5 kcal/mole. The weakness of these bonds in water is mainly due to the large difference in pKs. The pKs of H3O+ and H2O are -1.7 and 15.7, respectively (Cleland 2000). But in hydrogen bonding interactions between other species of matching pKs, the hydrogen bonding ΔH of formation may increase up to 15 kcal/mole in the condensed phase, giving rise to a strong "low barrier" hydrogen bond (LBHB). Important roles for LBHBs have been described for several enzyme reaction mechanisms, as they are usually involved in stabilizing the transition state of a reaction. What is clear is that hydrogen bonds represent a wide spectrum of
interactions largely dependent on the physicochemical characteristics of the donor and acceptor. C-H---Y hydrogen bonds, whether the acceptor involves a carbonyl oxygen or an electron-rich group (i.e., an aromatic side chain), appear to exemplify the weak end of this spectrum.

Known Chitin-Binding Domains Have Antimicrobial Activity

Although V region-containing chitin-binding proteins (VCBPs) have no known function, they contain a chitin-binding domain, which is thought to be used by plants and invertebrates as a defense unit in the recognition of parasites and fungi (Kawabata et al. 1996). Other functions of chitin-binding domains have been observed only in fungi and arthropods (e.g., the peritrophic membrane in the gut of insects) (Richards and Richards 1977). As a marine filter feeder, amphioxus is probably exposed to parasites frequently. Since VCBPs are secreted in its gut (Cannon et al. 2002), we think they may neutralize incoming chitinous parasites before they colonize or harm amphioxus.

Chitin

Chitin is one of the most abundant biopolymers in nature but its function has been largely restricted to structural roles in arthropods and fungi (i.e., crustaceans, insects, mollusks, nematodes, worms) (Tjoelker et al. 2000). In fungi, this β-1,4-linked polymer of N-acetylglucosamine can make up from 1% to more than 40% of the cell wall of some organisms (Tjoelker et al. 2000). There are no known examples of chitin in mammals, other than through the invasion of a parasite or fungus. Chitin provides these organisms with protection against their environment. With this in mind, it comes as no surprise that many plants and invertebrates express chitinases that function in the disruption of chitin deposition.
Invertebrate and Plant Chitinases

Chitinases occur in modular form consisting of a catalytic domain and one or more ancillary non-catalytic domains. They frequently contain a chitin-binding domain (CBD). Although CBDs have no intrinsic catalytic activity, they greatly enhance enzymatic efficiency (Suetake et al. 2002). Interestingly, the CBDs expressed by plants and invertebrates share significant structural homology (Figure 1-4), exemplifying a convergent evolution process which may be suggestive of an important function (e.g., antigen recognition) for the chitin-binding domain (Shen and Jacobs-Lorena 1999).

It is clear that CBDs may function as antimicrobials and antifungals even when fused to other molecules that lack chitinase activity. Tachycitin (Kawabata et al. 1996) is a 73-residue peptide having antimicrobial activity against both bacteria and fungi. Its CBD is composed of the C-terminal 65-residue segment, which was found to be responsible for the observed antifungal activity after truncating the N-terminal peptide. A recombinant form of tachycitin expressing the CBD alone retained the antifungal activity, but this truncated form did not possess antibacterial properties (Suetake et al. 2002). Although it is not known how the antibacterial N-terminal peptide or the CBD mediate these functional roles, it was concluded that the CBD alone had antifungal activity.

The only current representative of the invertebrate CBD in the Protein Data Bank (Berman et al. 2000) is the NMR solution structure of tachycitin from the horseshoe crab, Tachypleus tridentatus (Suetake et al. 2000). Like the plant CBD, which is also known as the hevein domain, the invertebrate domain was found to consist of a two-stranded antiparallel β-sheet that forms a hairpin followed by a short helical turn (Figure 1-5). Also similar to the hevein domain, tachycitin was found to contain a high number of disulfide bridges: 5 in tachycitin and 4 in hevein.
This structurally homologous segment
only encompasses ~22 residues of tachycitin, and an additional 3-strand antiparallel sheet not observed in hevein seems to form part of the invertebrate CBD (β1-3, Figure 1-5). The two sheets form a distorted β-sandwich according to the NMR structure (Suetake et al. 2000).

Figure 1-4. Plant and invertebrate chitin-binding domains superimposed. The structurally homologous segment is highlighted in pink for tachycitin and in blue for hevein. Residues known to be involved in the interaction with chitin in hevein are located on the hairpin between the conserved strands (β4 and β5 in tachycitin, Figure 1-5).

Strikingly, the structurally homologous segment of plant and invertebrate CBDs includes important residues known to be involved in chitin recognition by hevein (Ser19, Trp21, Trp23) (Asensio et al. 1995). In the superimposed structures of hevein and tachycitin, it is clear that the residues at the equivalent positions in tachycitin have similar properties (Asn47, Tyr49, Val52) and are therefore predicted to be involved in chitin recognition by tachycitin (Suetake et al. 2000). This observation has also led to the thought that the remaining structure of tachycitin (β1-3), which does not display the same level of similarity, functions as structural support that stabilizes the conformation of
26 Figure 1-5. NMR solution structur e of tachycitin (P DBID: 1DQC). Strands are in gold, loops in gray and helical turn in red. N and Ctermini are labeled. Disulfide bridges are colored according to secondary structure. the chitin recognition segment. Mammalian Chitinases Mammalian chitinases have been identified more recently and although their physiological role in defense mechanisms need s more investigation, th e current literature is highly supportive of such a notion. Elevated plasma levels of phagocyte-derived chitotriosidase in Gaucher patients initially led to the identification of human chitinases (Renkema et al . 1995). Chitotriosidase is specifically expressed by phagocytes (Boot et al . 2001) and recombinant chitotriosidase is now known to inhib it hyphal growth of chitin-containing fungi (van Eijk et al . 2005). Like plant and invertebrate chitinases, hum an chitotriosidase is a modular protein consisting of a C-teminal chitin-binding domain and an N-terminal region with chitinase
activity (Renkema et al. 1997). There is a naturally occurring truncated form of this enzyme that appears to be stored in the lysosomes of macrophages, where the CBD is removed by proteolytic cleavage (van Eijk et al. 2005). The full-length form of the enzyme is expressed for secretion. The similarities of human chitinases to those expressed by plants and invertebrates further suggest that higher metazoans may have acquired these molecules and used them for immune function. Furthermore, given the modular nature of these proteins, it is not hard to imagine that throughout evolutionary history the chitin-binding domain has experienced domain swapping and gene shuffling events, allowing it to participate in alternative antigen recognition systems and other functions in other organisms.
CHAPTER 2
CRYSTALLIZATION AND PRELIMINARY X-RAY ANALYSIS OF VCBP2 AND VCBP3 FROM Branchiostoma floridae

Introduction

The adaptive immune system arose abruptly in ancestors of the jawed vertebrates approximately 500 million years ago (Litman et al. 1999; van den Berg et al. 2004). Proteins characteristic of adaptive immune responses (e.g., immunoglobulin and T cell antigen receptor [TCR]) have been identified in all species of jawed vertebrates thus far examined (Rast et al. 1997). However, no definitive homolog of either these or other genes associated with adaptive immune function has been reported in jawless vertebrates or invertebrates. Although the identity of the “primordial” receptor that gave rise to antigen receptors in jawed vertebrates may never be established, several different families of genes that have predicted characteristics of such a receptor have been described both within and outside the jawed vertebrates (Rast and Litman 1998).

Among the protochordates, efforts have focused on the amphioxus (Branchiostoma floridae), a cephalochordate that represents the most phylogenetically proximal invertebrate form on a direct line with vertebrates. Efforts to find adaptive immune genes in this species have identified a multigene family that encodes the variable region-containing chitin-binding proteins (VCBPs) (Cannon et al. 2002). These molecules possess two tandem V region domains and a chitin-binding domain (CBD). They can be classified into five major families. Comparisons of pooled mRNAs (cDNAs) and genomic sequences derived from individual animals (Cannon et al. 2004) have shown
several regions of considerable sequence substitution (i.e., VCBPs are diversified at both the inter- and intrafamily levels).

Although VCBPs consist of two tandem N-terminal immunoglobulin domains and a C-terminal chitin-binding domain, no construct coding for the chitin-binding domain has been successfully expressed. A full-length construct would be interesting for crystallography, since a crystal structure would show how all three domains pack, and this may in turn reveal potential ligand binding sites in VCBPs. The observation that VCBPs have a chitin-binding domain suggests more strongly that VCBPs function in a defense mechanism, probably one that involves the recognition of chitinous parasites or fungi. However, questions specific to the structure of the Ig domains can be answered in the absence of the chitin-binding domain. For example, do VCBPs in fact consist of V-type immunoglobulin domains? The VCBPs would represent the first immunoglobulin of this type in an invertebrate. Also, how do the tandem domains interact (e.g., front-to-front or back-to-back)? Do tandem domains interact intra- or intermolecularly? And finally, where does the regionalized sequence hypervariability map on the VCBP structure?

To determine the extent of structural similarity between the VCBPs from amphioxus, which likely function as innate receptors, and the antigen receptors characteristic of adaptive immune responses, immunoglobulin-like variable domains from the VCBP3 and VCBP2 protein families have been purified and crystallized. An understanding of VCBP structures will provide insights into the evolutionary mechanisms that underlie the dissemination of the antigen binding receptors. For each family, two constructs were transformed and expressed in E. coli. The VCBP3 single-domain and two-domain proteins crystallized better and were also more easily obtained than the VCBP2 proteins.
Materials and Methods

Constructs and Expression

Oligonucleotides for VCBP3 constructs were designed based on the VCBP3 cDNA sequence (GenBank accession no. AF520474). For the amino-terminal V domain of VCBP3, the oligonucleotide sequences were: VCBP3-XC-F1, 5′-ATGCAGTCCATCATGACCGTCCGCA (ATG + nt 49-70) and VCBP3-XC-R1, 5′-TTAGGTGTGGCCTGTCACCTTGAGCAC (TTA + antisense nt 427-450). In addition to native VCBP3 cDNA sequence, VCBP3-XC-F1 included an artificial methionine codon at its 5′ end and VCBP3-XC-R1 included an artificial TAA stop codon in antisense at its 5′ end. The final peptide encoded by the PCR amplicon represented the N-terminal V domain of VCBP3 (VCBP3V1) (i.e., amino acids 17-150 of GenBank no. AAN62850), beginning at the first residue after the predicted signal peptide and extending through the end of the amino-terminal immunoglobulin-like domain.

Sequences of cDNA encoding the tandem V domains of VCBP3 (VCBP3V1V2) were amplified by PCR and cloned into a bacterial vector for expression and refolding. The oligonucleotides VCBP3V1-XC-F1 (the same forward primer as for VCBP3V1) and VCBP3V2-XC-R2, 5′-CTATCAGACCTTGAGAATGGTTGAGGAC (antisense stop codons CTA TCA + VCBP3 antisense nt 801-787), were used as primers. The final peptide encoded by this PCR amplicon represents the two tandem V domains of VCBP3, also beginning at the first residue after the predicted signal peptide (amino acids 17-267 of GenBank peptide AAN62850). Similar constructs were used for the cloning of a short (single N-terminal domain) and long (tandem V regions) form of VCBP2. For the short form, primers VCBP2-XC-F1, 5′-ATGGTGTCCATCACGACCGTGACGGT, and VCBP2-XC-R1, 5′-TTACAACTTGTACCAGAAGAGCAGAGAC, were used, and for the long
form the primers were VCBP2-XC-F1 and VCBP2-XC-R2, 5′-CTACAGACAGATGACGTTGTTTG. PCR products were ligated into pETBlue-1 (Novagen) and sequenced for confirmation. Each construct was then transformed into the Escherichia coli Tuner strain (Novagen) for IPTG-induced expression. Cultures (2 L) were grown to OD600 = 0.5-0.9 at 37°C, and 100 mM IPTG was added to a final concentration of 1 mM. Cultures were grown for an additional 5 hours at 37°C. Induced bacterial cultures were centrifuged and stored overnight in 20% sucrose, 10 mM EDTA.

Refolding and Purification

Thawed bacteria were brought to 200 mL in sucrose/EDTA, and egg white lysozyme (Sigma) was added at 1 mg per mL to the bacterial slurry. The slurry was processed in an EmulsiFlex C5 high-pressure homogenizer (Avestin, Ottawa, Ontario, Canada) at 10,000 psi for two cycles. PMSF was added to 0.1 mM final concentration, 60 µL of Lysonase recombinant lysozyme and 60 µL of Benzonase nuclease (Novagen) were added, and the homogenate was incubated at room temperature for 20 min. Centrifugation for 25 minutes at 15,000 x g separated the inclusion bodies from soluble components. Inclusion bodies were washed with 10 mM Tris, 5 mM EDTA, 0.1% Triton X-100, pH 8.0, followed by four to six alternating washes in 10 mM Tris and deionized water. SDS-PAGE confirmed that the inclusion bodies contained inducible protein in a major band of the predicted size. MALDI mass spectrometry (UF ICBR Protein Core) yielded an estimated mass close to the predicted size of the constructs.

Inclusion bodies from 2 L of starting culture were solubilized overnight at 4°C in 10 mL of 7.8 M guanidinium HCl, 50 mM Tris, pH 8.0. Protein
concentration was approximately 5-10 mg/mL during solubilization. Solubilized protein was centrifuged at 100,000 x g for 30 min to remove nucleated aggregates. The solution was added immediately to a TCEP (Pierce) disulfide-reducing agarose gel column, bed volume 2.5 mL. The column eluate was slowly dripped (over the course of 4 hours) into a large volume (~300 mL) of cold, stirred refolding buffer: 0.55 M guanidine, 0.44 M L-arginine, 55 mM Tris pH 8.2, 21 mM NaCl, 0.88 mM KCl, 1 mM EDTA, 1 mM GSH, 1 mM GSSG. The cold refolding buffer was slowly brought to room temperature over 1 h in a refrigerated water bath. The 300 mL solution was dialyzed against 2 L of 10 mM Tris, 50 mM NaCl, pH 8.0, overnight at 4°C. The dialysate was centrifuged at 15,000 x g for 20 min and concentrated 10- to 25-fold in an Amicon ultrafilter with a PM10 membrane (Millipore). Clarified concentrate was then separated by FPLC on a Superdex 75 column (Amersham Pharmacia), and the appropriate size fraction (15 kD) was collected as the final purification step. Final recovery of purified VCBPs was approximately 10 mg of protein per liter of starting culture.

Crystallization of VCBP3 (V1 and V1V2)

Crystals of VCBP3 were grown by the vapor diffusion method in hanging drops (McPherson 1999). Protein solution (2 µL; 10 mg per mL in 10 mM Tris pH 8.0, 50 mM NaCl) and reservoir solution (2 µL) were mixed on siliconized slides and allowed to equilibrate against 1 mL of reservoir solution. Three commercially available sparse matrix crystallization kits (Jancarik and Kim 1991) from Hampton Research were used for screening: Crystal Screen 1, Crystal Screen 2, and Crystal Cryoscreen (144 conditions). Initial screening at 18°C revealed that VCBP3 (V1 and V1V2) crystals formed in conditions containing different precipitants (Figure 2-1), pH values, and salt concentrations.
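The hanging-drop setup described above can be illustrated with a simple mass-balance sketch. It assumes an idealized experiment in which only water is exchanged through the vapor phase, the 1 mL reservoir is effectively unchanged by the drop, the small buffer and salt contribution of the protein solution is ignored, and nothing precipitates. The function name and the 1.3 M precipitant value (borrowed from the ammonium sulfate condition reported in the Results) are illustrative choices, not part of the original protocol.

```python
def hanging_drop(protein_mg_ml, protein_ul, reservoir_ul, precipitant_m):
    """Idealized vapor-diffusion mass balance for one hanging drop.

    Assumption: only water leaves the drop, and equilibrium is reached when
    the precipitant concentration in the drop equals that of the reservoir.
    """
    v_start = protein_ul + reservoir_ul                      # mixed drop volume (uL)
    protein_start = protein_mg_ml * protein_ul / v_start     # diluted on mixing
    precip_start = precipitant_m * reservoir_ul / v_start    # diluted on mixing

    # Solute amounts are conserved while water evaporates, so the drop
    # shrinks until the precipitant is back at the reservoir concentration.
    v_end = v_start * precip_start / precipitant_m
    protein_end = protein_start * v_start / v_end
    return {
        "protein_start": protein_start,
        "precipitant_start": precip_start,
        "drop_volume_end": v_end,
        "protein_end": protein_end,
    }


# 2 uL of 10 mg/mL protein mixed with 2 uL of a hypothetical 1.3 M reservoir
result = hanging_drop(10.0, 2.0, 2.0, 1.3)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a 1:1 drop starts at half the stock protein and precipitant concentrations and slowly returns to the stock protein concentration as the drop volume halves, which is the gradual approach to supersaturation that the method relies on.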
Figure 2-1. Formation of hexagonal VCBP3V1 crystals in different precipitant conditions. Scale bars are 100 microns in length. (A) VCBP3V1 crystals formed in PEG 1500, 20% glycerol. (B) VCBP3V1 crystals formed in 2.0 M sodium acetate, 0.1 M NaCl, Tris-HCl, pH 7.5.

Crystallization of VCBP2 (V1 and V1V2)

Unlike for VCBP3, initial crystallization conditions for VCBP2 were not identified by manual screening with the Hampton Research screens. Samples of VCBP2 were submitted to the Southeast Collaborative for Structural Genomics at the University of Alabama in Birmingham for automated high-throughput crystallization screening. The robotic crystallization module is capable of screening 1000 conditions per hour using less than 300 micrograms of protein (10 mg/mL). Several crystallization conditions were identified by this method (Figure 2-2). Optimization of initial conditions proceeded with the same techniques as for VCBP3. Crystals of VCBP2V1 quickly grew larger in approximately 2.0 M sodium formate.

Figure 2-2. Early crystals of VCBP2V1 identified by automated high-throughput screening. Original conditions were 0.1 M imidazole buffer, pH 8, 2.5 M NaCl, but crystals were later optimized in sodium formate instead of NaCl. Crystals grew to sizes similar to those of VCBP3 but were highly mosaic.

Data Collection and Processing for VCBP3V1 and VCBP2V1

Diffraction data were collected using an R-AXIS IV++ 100 image plate detector (300 x 300 mm). Crystals were mounted in glass capillaries for room temperature data collection, or on a nylon-fiber loop and flash-frozen in a nitrogen gas stream using an Oxford cryosystem. X-rays were generated by a Rigaku rotating copper anode, with Osmic mirrors and a 0.3 mm collimator, running at 40 kV and 100 mA. Data were collected using oscillation angles of 1° per frame. Each frame was exposed for 5 min (VCBP3V1) or 15 min (VCBP2V1). Intensities were indexed and integrated with DENZO and reduced with SCALEPACK (Otwinowski and Minor 1997). The molecular weight of the crystallized polypeptide was based on mass spectrometry analysis, and cell contents were predicted with the Matthews Probability Calculator (Kantardjieff and Rupp 2003). Data-processing statistics for native data sets are shown in Table 2-1.
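As a rough geometric check on the home-source setup, Bragg's law gives the highest resolution that can be recorded at the edge of a flat detector for a given crystal-to-detector distance. The sketch below assumes the direct beam hits the center of the 300 x 300 mm plate, the detector is perpendicular to the beam with no 2θ offset, and only the edge (not the corner) is considered. The function name is hypothetical, and the distances are the home-source values listed in Table 2-1.

```python
import math


def edge_resolution(wavelength_a, distance_mm, half_width_mm=150.0):
    """Resolution (in angstroms) recorded at the detector edge.

    Assumes a flat detector centered on, and perpendicular to, the direct beam.
    """
    two_theta = math.atan(half_width_mm / distance_mm)   # scattering angle at the edge
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))  # Bragg's law, n = 1


CU_K_ALPHA = 1.5418  # angstroms, wavelength listed in Table 2-1

for distance in (150.0, 120.0, 156.0):
    d_min = edge_resolution(CU_K_ALPHA, distance)
    print(f"distance {distance:5.1f} mm -> edge resolution {d_min:.2f} A")
```

With these assumptions the detector edge corresponds to roughly 2.0 Å at 150 mm and roughly 1.8 Å at 120 mm, so the reported 2.4 Å and 2.7 Å limits would reflect crystal diffraction quality rather than detector geometry.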
Table 2-1. Data collection and reduction statistics for VCBP crystals. *Values for the highest resolution shell are in parentheses. **Not determined for VCBP2 crystals due to insufficient data.

                          VCBP3V1                      VCBP3V1                      VCBP3V1V2                    VCBP2V1
Wavelength (Å)            1.5418                       1.5418                       0.9192                       1.5418
Temperature (K)           100                          295                          100                          295
No. of frames             76                           120                          600                          27
No. of crystals           1                            1                            2                            1
Detector distance (mm)    150                          120                          188                          156
Observed reflections      79,003                       56,458                       1,628,269                    -**
Unique reflections        6,574                        4,310                        28,179
Redundancy                4                            6                            10.9
Resolution range (Å)      40-2.40 (2.49-2.40)          40-2.70 (2.80-2.70)          20-1.85 (1.88-1.85)
Space group               P31(2)21                     P31(2)21                     P61(5)
Cell parameters (Å)       a = b = 58.99, c = 79.21     a = b = 60.10, c = 80.018    a = b = 109.60, c = 48.845
Oscillation step (°)      1                            1                            0.4
Mosaicity (°)             1.04                         0.309                        0.592
I/σ(I), average           13.9 (4.0)                   13.5 (4.5)                   35.4 (2.2)
Reflections >3σ (%)       75.6 (48.1)                  81 (72.8)                    76.6 (25.6)
Completeness (%)          99.5 (98.3)                  94.1 (97.1)                  97.8 (84.3)
Rmerge (%)                10.7 (36.4)                  11.7 (34.8)                  7.4 (33.2)

Data Collection and Processing for VCBP3V1V2

Native data reduction statistics for VCBP3V1V2 reflect the quality of two different data sets, each collected from an independent crystal, that were reduced and scaled together before post-refinement iterations and outlier rejection in SCALEPACK. A total of 600 frames of native diffraction data, collected with 15 second exposures and 0.4° oscillations, were merged in this manner. Data reduction was otherwise performed as for the single-domain constructs above. Crystals were cryo-cooled by immersion in liquid nitrogen, and data were collected at beamline X6A of the National Synchrotron Light Source, Brookhaven National Lab, Upton, NY. Data reduction statistics for VCBP3V1V2 are listed in Table 2-1.
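The cell contents predicted with the Matthews Probability Calculator can be cross-checked by hand from the cell parameters in Table 2-1. The sketch below is a minimal illustration, not the calculator itself: it assumes a hexagonal-setting cell (a = b, γ = 120°), six asymmetric units per cell (the multiplicity of the general position in P3(1)21/P3(2)21 and in P6(1)/P6(5)), the molecular weights quoted in the Discussion (14,886 Da and 27,373 Da), and the usual approximation that the solvent fraction is 1 - 1.23/VM. Function names are illustrative.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(volume, mw_da, z, n_mol=1):
    """Return (VM in cubic angstroms per Da, solvent fraction).

    z is the number of asymmetric units per cell; n_mol is the assumed
    number of molecules per asymmetric unit.
    """
    vm = volume / (z * n_mol * mw_da)
    solvent = 1.0 - 1.23 / vm  # standard approximation for proteins
    return vm, solvent


crystals = {
    # name: (a, c, molecular weight in Da)
    "VCBP3V1": (58.99, 79.21, 14886.0),
    "VCBP3V1V2": (109.60, 48.845, 27373.0),
}

for name, (a, c, mw) in crystals.items():
    volume = hexagonal_cell_volume(a, c)
    for n_mol in (1, 2):
        vm, solvent = matthews(volume, mw, z=6, n_mol=n_mol)
        print(f"{name}: n={n_mol} V={volume:.1f} VM={vm:.2f} solvent={100 * solvent:.1f}%")
```

Trying n = 2 alongside n = 1 shows why a single molecule per asymmetric unit is favored: doubling the contents pushes VM well below the range usually observed for protein crystals.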
Results

Crystals of VCBP3V1 formed in 1.2-1.4 M ammonium sulfate, 0.1 M NaCl, 0.1 M HEPES, pH 8.5, 12% glycerol. A set of optimization experiments found distinct conditions that produced crystals of VCBP3 over a wide range of precipitant concentrations (1.6 to 2.6 M sodium acetate, 0.1 M NaCl, Tris-HCl pH 7.5). Smaller crystals also grew in PEG 1500, 20% glycerol (Figure 2-1). Crystallization trials at 4°C yielded similar results. Crystals grew in three days, and nucleation in several drops could be observed immediately after setting up the crystallization experiment. Crystals with a diffraction limit of 2.4 Å and approximate dimensions of 0.15 x 0.15 x 0.05 mm were obtained in 1.3 M ammonium sulfate, 0.1 M NaCl, 0.1 M HEPES, pH 8.5, 12% glycerol. A single cryocooled crystal of this size provided the data reported in Table 2-1. A crystal of similar size grown in 2.0 M sodium acetate, 0.1 M NaCl, Tris-HCl, pH 7.5, was mounted in a glass capillary at room temperature, and diffraction data to 2.7 Å resolution were collected with statistics of comparable quality (Table 2-1). Data from VCBP3V1 crystals grown in sodium acetate are consistent with the same space group as those grown in ammonium sulfate, P31(2)21, with similar unit cell parameters.

For VCBP3V1V2, a set of conditions was selected for optimization based on best physical appearance, with the largest crystals observed to grow in 1.5 to 2.0 M sodium formate, 0.2 M MgCl2, 0.1 M citrate, pH 4.6 or 6.6. Crystal growth conditions were optimized at both pH 4.6 and 6.6, since higher pH crystals offered a better crystal volume for X-ray exposure but lower pH (needle-like) crystals appeared to have better geometry (Figure 2-3).

Crystals of VCBP3V1V2 were successfully cryo-protected by soaking in 20% ethylene glycol, although step soaking was necessary to preserve the integrity of the crystals.
Figure 2-3. VCBP3V1V2 crystals at different pH. (A) Crystals grown at pH 4.6 in the form of needles measuring up to more than 1 mm in length. (B) Thicker but smaller crystals grown at pH 6.6. Panels A and B are not at the same magnification. Crystals were grown within an identical range of conditions, with pH being the only difference.

In this case, fast (20 to 30 second) soaks in 5, 10, and 20% ethylene glycol gave good results. It was later recognized that co-crystallization with 20% ethylene glycol yielded smoother and slightly larger crystals, probably due, at least in part, to a reduced number of nucleation events in the drop. Low pH crystals were observed to tolerate cryoprotection better than high pH crystals. Crystals grew in less than a week and hence were quickly optimized.

Although pH was observed to affect the morphology of VCBP3V1V2 crystals, both high (6.6) and low (4.6) pH diffraction data corresponded to the space group P61/P65, and crystals diffracted to 1.8 to 1.9 Å. Needle-like crystals were indeed found to be much less mosaic, whereas fine slicing of the oscillation frames was necessary during data collection from high pH crystals. Data reduction statistics are listed in Table 2-1.

To date, a full-length construct of a VCBP that includes the chitin-binding domain has not been successfully expressed in E. coli for crystallography. Purified VCBP2 from amphioxus has been reported to have an affinity for chitin (Cannon et al. 2002) but has no known physiological function.
Discussion

The Matthews coefficient was used to estimate the solvent content and number of molecules in the asymmetric unit (Matthews 1968; Kantardjieff and Rupp 2003). In their recent re-evaluation of VM values from more than 10,000 protein crystal forms, Kantardjieff and Rupp obtained improved distribution curves that also account for the diffraction resolution limit. The sample size in their statistical analyses is more than 45 times what was available to Matthews in 1976 (Matthews 1976). Given a cell volume of 238,708.3 Å³ and a molecular weight of 14,886 Da, a single molecule of VCBP3V1 in the asymmetric unit is unambiguously the most probable arrangement (Figure 2-4). The expected solvent content of VCBP3V1 crystals is 53.98%, corresponding to a VM value of 2.67 Å³/Da. Systematic absences are clearly consistent with screw axes of 1/3 or 2/3 unit cell translations.

Crystals of VCBP3V1V2 have a cell volume of 507,762.4 Å³ and a molecular weight of 27,373 Da. As for the single-domain crystals, it is most likely that a single molecule of VCBP3V1V2 is found in the asymmetric unit. The corresponding Matthews coefficient is 3.09 Å³/Da (Matthews 1976), and the calculated solvent content is 60.2%. Systematic absences were consistent with screw axes of 1/6 or 5/6 unit cell translations. Preliminary X-ray analyses of VCBP3 constructs were encouraging, and selenomethionine derivatives were prepared accordingly for anomalous dispersion experiments (Chapters 3 and 4).

Figure 2-4. Probability distribution for Matthews coefficient and solvent content values of protein crystals. (A) Distribution for Matthews coefficients, highlighting the only number of molecules per asymmetric unit that yields a value within the probable range. (B) The same for solvent content. Both suggest that a single molecule is found in the asymmetric unit of VCBP3V1 crystals.

Crystals of VCBP2V1 were grown to sizes comparable to those of VCBP3V1 and had a clear geometry. These crystals consisted mostly of hexagonal plates. However, VCBP2V1 crystals became highly mosaic upon flash freezing, and diffraction limits did not extend beyond 4 to 5 Å. This prompted data collection at room temperature, which was performed at the home source in a glass capillary. Diffraction patterns were observed to be much less mosaic, although crystals still diffracted weakly (3 to 4 Å). Long (15 minute) exposures were necessary and unavoidably led to crystal decay by radiation damage after collection of a few frames.

Room temperature diffraction data were successfully indexed, and the VCBP2V1 crystals were determined to belong to the tetragonal Bravais lattice. Unit cell dimensions refined from a single frame were a = b = 121.3 Å, c = 74.57 Å, α = β = γ = 90.00°. Reflection histograms from DENZO also suggested an average mosaicity of about 0.35° for the room temperature data. More robust crystals are necessary to obtain a complete data set from VCBP2V1, or many crystals may be needed to collect partial data sets that can then be merged and scaled together. The two-domain protein (VCBP2V1V2) was also crystallized, but these crystals have not been successfully grown to an acceptable size. Crystals measuring less than 0.1 mm in their longest dimension do not diffract strongly enough for data collection. These results indicate that VCBP3 (short and long) crystal structures are likely to be successfully solved (Chapters 3 and 4). In the case of VCBP2, a significant amount of effort is likely required to determine better conditions for cryocooling.

A model of the chitin-binding domain in VCBP3 or VCBP2 was not successfully generated. Unfortunately, chitin-binding domains display low sequence similarity, and there is only one invertebrate chitin-binding domain with a known structure. Tachycitin (Suetake et al. 2000, PDBID: 1DQC) is ~16% identical to the chitin-binding domain in the VCBPs. All other known chitin-binding domain structures belong to plants, which are thought to display structural similarities to invertebrate chitin-binding domains due to convergent evolution (Shen and Jacobs-Lorena 1999). The NMR structure of an unrelated antifungal peptide from a beetle, also classified as chitin-binding (PDBID: 1IYC), does have the putative chitin-binding structural motif, comprising a two-stranded antiparallel β-sheet and a β-turn followed by a small helical segment (~1 turn). However, the primary sequence of this peptide is significantly shorter than those of chitin-binding domains, there is no apparent sequence similarity, and most importantly, this peptide contains a single disulfide bridge. Therefore, it appears that the chitin-binding structural motif has evolved by convergence on more than two occasions, as plant CBDs, invertebrate CBDs, and this other peptide all display the main functional unit (involved in chitin recognition) but do not share a common ancestor.

These circumstances prevent us from generating a homology model of a VCBP CBD with significant accuracy. It should be noted, however, that as part of modular proteins, Ig domains can be considered independent structural units. Thus their evolution, although probably influenced by their function in the context of the other domains they pair with (e.g., a chitin-binding domain), can be studied in the absence of these additional domains. It is still possible, however, that the Ig domains and CBD of the VCBPs function independently, whereby the Igs recognize one set of antigens and the CBD another (chitin). This is the case for tachycitin, which has both antifungal and antibacterial activity. Expression of truncated products of only the CBD showed that antifungal activity remained intact, whereas antibacterial activity was completely abolished (Suetake et al. 2002).

The invertebrate chitin-binding domain has been only recently recognized, with the identification of peritrophic matrix proteins from insects in the mid 1990s, but CBDs
seem to form part of almost all chitinases and peritrophic proteins, along with some other proteins of unknown function (Shen and Jacobs-Lorena 1999). Like most chitinases, VCBPs have a single CBD at the C-terminus. In contrast, peritrophic proteins have 2 to 5 tandem CBDs to properly function in the peritrophic matrix. This suggests that VCBPs may have obtained the CBD from a chitinase in amphioxus, which is also consistent with the absence of peritrophic matrix proteins in this cephalochordate. In the case of tandem CBDs of peritrophic proteins in insects, the individual domains were found to be more closely related to each other than to any CBD from another protein, indicating how gene duplication might have led to tandem organization of the CBDs (Shen and Jacobs-Lorena 1999). CBDs are proposed to have been acquired by ancestral chitinases through transposition events; however, whether the CBDs were introduced into chitinases from peritrophic matrix protein genes or from another source (e.g., a cell adhesion or immune-related protein) is not known. The phylogenetic analysis of Shen and Jacobs-Lorena (1999) supports the latter, suggesting that peritrophic proteins and the CBDs of chitinases share a common ancestor from which they acquired the chitin-binding domain.

We think it is still possible, for example, that CBDs in amphioxus and some other organisms are used to trap food instead of parasites. Amphioxus may feed on some crustaceans and other organisms that may contain chitin, and chitin is found abundantly in the gill bars of amphioxus. The CBDs could anchor to the gill bars while food is filtered by the variable domains. We think it is possible that the diversified V domains evolved to recognize a wide set of food sources. Alternatively, the CBDs may recognize chitin in food sources directly while the V domains engage in an unknown function.
It is also possible that, similar to antibody tails, the CBD in VCBPs mediates an effector function. Unknown receptors capable of transducing VCBP antigen recognition may be labeled with chitin. The CBD in VCBPs may bind these receptors once the V domains bind an antigen. This is so far highly speculative but possible. If this type of effector function were demonstrated for VCBPs, their relationship with antibodies (whether it represents divergent or convergent evolution) would be irrefutable.
CHAPTER 3
THE 1.15 Å CRYSTAL STRUCTURE OF VCBP3V1 FROM Branchiostoma floridae

Introduction

Exquisite immune-type specificity, a central feature of adaptive immune responses, is effected through antigen receptors that are encoded in the germline and are further diversified through lymphocyte-restricted somatic variation (Tonegawa 1983; Jung and Alt 2004). Adaptive immunity emerged ~500 million years ago in the immediate ancestors of the contemporary jawed vertebrates and likely resulted from insertion of a transposon between the regions of an ancestral immunoglobulin (Ig)-like gene encoding strands A through F of the variable (V) domain and the G strand (encoded by J gene segments) (Davis and Bjorkman 1988; Agrawal et al. 1998; Hiom et al. 1998). Recombination of antigen binding receptors introduces extensive somatic variation at this junctional interface through non-templated deletion and addition of nucleotides. It is reasonable to assume that the ancestral gene element encoded a primordial receptor that was used in innate immune responses or cell-cell interactions (Eason et al. 2004). Although no means exist to reconstruct the primordial antigen receptor (Bork et al. 1994), it would be expected to bear a low degree of sequence identity to contemporary antigen receptors but to use the same type of structural fold in its V domains.

A large family of germline-diversified molecules, each consisting of two Ig-type V domains and a chitin-binding domain (VCBP), has been described in amphioxus, a cephalochordate that occupies a basal position in chordate phylogeny (Cannon et al. 2002). VCBPs share core V-determining residues found in Ig and TCR; however, the V
domains of VCBPs lack a J region and are related only distantly at the level of primary structure to the V regions of immunoglobulins and TCRs. Reported here are the structural relationships between the N-terminal V domain of a VCBP and the corresponding regions of antigen receptors and related structures, at a level of refinement that permits new insight into the atomic basis for Ig domain organization and stability.

Materials and Methods

Expression and Purification of VCBP3V1

The N-terminal V domain of VCBP3 (GenBank gi 24528458) from amphioxus, hereafter referred to as VCBP3V1, was expressed in E. coli and purified from inclusion bodies as described in Chapter 2. VCBP3V1 (7 mg/mL) was stored for crystallization trials at 4°C in 10 mM Tris, 50 mM NaCl, pH 8.0. Selenomethionyl protein was produced as described (Hendrickson et al. 1990). Approximately 50% incorporation of one selenomethionine residue per VCBP3V1 monomer was established using mass spectrometry (University of Florida ICBR Protein Core).

Crystallization Conditions

Hanging-drop vapor-diffusion crystallization experiments were performed at 18°C with drops consisting of 2 µL of protein at 4 mg per mL and 2 µL of reservoir solution. Drops were equilibrated on siliconized slides against 1 mL of reservoir solution (1.2 to 1.4 M ammonium sulfate, 0.1 M NaCl, 0.1 M HEPES pH 8.5, 12% glycerol). Both the selenomethionine derivative and the native protein crystallized well (Figure 3-1).

Figure 3-1. VCBP3V1 selenomethionine derivative crystals and diffraction. (A) These crystals grew to sizes similar to those of native VCBP3V1. The figure shows not only smooth edges and faces but also the birefringent property of the crystal. (B) A representative diffraction pattern from derivative crystals, collected at Brookhaven, NY, shows low mosaicity.

Data Collection and Processing

Crystals of VCBP3V1 (Figure 3-1) were cryoprotected by soaking for 30 to 60 seconds in 30% glycerol before flash cooling in a stream of gaseous nitrogen. Three separate multiwavelength anomalous dispersion (MAD) (Hendrickson et al. 1990) experiments were performed, in which data sets were collected at three wavelengths (Se inflection point, peak, and remote) from three single crystals (diffracting to 1.7 Å, 1.4 Å, and 1.17 Å) of the VCBP3V1 selenomethionyl derivative at beamline X6A of the National Synchrotron Light Source, Brookhaven National Lab, Upton, NY. Short exposures (5 seconds) were collected at each wavelength to reduce the effects of radiation damage, yielding three data sets from a single crystal complete from 30 to 1.3 Å. Longer
exposures (30 to 60 seconds) were collected subsequently for higher resolution reflections (to 1.17 Å). A complete native data set to 1.15 Å resolution was collected at X6A. All data sets were integrated and scaled with DENZO and SCALEPACK (Otwinowski and Minor 1997; Otwinowski et al. 2003). Data collection and reduction statistics for the native crystal and the highest resolution MAD experiment are listed in Table 3-1.

Table 3-1. Data collection statistics for VCBP3V1 crystals. Values for the highest resolution shell are in parentheses. NSLS, National Synchrotron Light Source.

                            Derivative (inflection)    Derivative (peak)          Derivative (remote)        Native
Beamline                    NSLS X6A                   NSLS X6A                   NSLS X6A                   NSLS X6A
Wavelength (Å)              0.97960                    0.97908                    0.95007                    0.80000
No. of frames               360                        360                        360                        360
Detector distance (mm)      100                        100                        100                        110
Oscillation step (°)        1                          1                          1                          1
Observed reflections        1,732,169                  1,788,754                  1,767,260                  2,010,078
Unique reflections          103,915                    103,983                    89,107                     57,420
Redundancy                  8.7                        8.7                        9.3                        14
Resolution range (Å)        30-1.17 (1.18-1.17)        30-1.17 (1.18-1.17)        30-1.25 (1.26-1.25)        30-1.15 (1.16-1.15)
Space group                 P3121                      P3121                      P3121                      P3121
Cell parameters (Å)         a = b = 59.16, c = 79.35   a = b = 59.13, c = 79.30   a = b = 59.86, c = 80.28   a = b = 59.19, c = 79.27
Mosaicity (°)               0.54                       0.54                       0.54                       0.70
I/σ(I)                      41.9 (2.6)                 49.2 (3.2)                 42.3 (4.4)                 59.3 (6.0)
Reflections >3σ (%)         77.8 (27.6)                79.8 (35.5)                81.6 (50.5)                81.8 (56.3)
Completeness (%)            99.1 (85.5)                99.3 (89.9)                100 (100)                  99.8 (99.7)
Rsym (%)                    5.2 (34.1)                 4.6 (31.1)                 5.9 (37.3)                 4.9 (37.0)

Crystallographic Refinement

The SOLVE/RESOLVE software package (Terwilliger and Berendzen 1999) was used to identify a single selenium atom in the asymmetric unit, calculate MAD phases and electron density maps, apply density modification, and automatically trace the chain.
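As a quick sanity check on the MAD wavelengths in Table 3-1, each wavelength can be converted to photon energy with E = hc/λ and compared with the selenium K absorption edge, which lies at approximately 12.66 keV. The snippet below is only an illustrative conversion: the function name is hypothetical, and the exact edge position for a given sample is normally located experimentally with a fluorescence scan rather than taken from a tabulated value.

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * angstrom


def wavelength_to_kev(wavelength_a):
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_a


# Wavelengths (angstroms) as listed in Table 3-1
data_sets = {
    "inflection": 0.97960,
    "peak": 0.97908,
    "remote": 0.95007,
    "native": 0.80000,
}

for label, wavelength in data_sets.items():
    print(f"{label:>10}: {wavelength:.5f} A = {wavelength_to_kev(wavelength):.4f} keV")
```

The inflection and peak energies fall within a few eV of each other on either side of the nominal edge, while the remote data set sits several hundred eV above it, which is the usual arrangement for a three-wavelength experiment.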
Experimental phases were calculated to 1.3 Å resolution and model building of 127 (out of 135) residues proceeded unambiguously; every amino acid side chain was visible in experimental electron density maps. Building was performed with the program O 9.0 (Jones et al. 1991). Native data were refined initially to 1.5 Å resolution using CNS (Brunger et al. 1998) and the model was validated with PROCHECK (Laskowski et al. 1993). Higher resolution refinement to 1.15 Å was carried out with SHELXL (Weeks et al. 2003) using anisotropic thermal displacement refinement. Refinement statistics are listed in Table 3-2. Xtalview (McRee 1999) was used to monitor refinement of solvent structure and split side chain conformations at atomic resolution. Identical test sets of reflections, comprising 11.3% of the data, were used to calculate Rfree statistics. The atomic coordinates and structure factors for VCBP3V1 have been submitted to the Protein Data Bank as 1XT5.

Table 3-2. Refinement statistics for the crystal structure of VCBP3V1. rmsd, root mean squared deviation.

Protein atoms                          1057
Solvent molecules                      223
Sulfate ions                           1
R(cryst)                               12.61
R(free)                                14.43
Test set (%)                           11.3
Average Biso, protein atoms            10.11
Average Biso, solvent molecules        27.14
Average Biso, sulfate ion atoms        13.49
Protein anisotropy                     0.401
Solvent anisotropy                     0.375
rmsd bonds                             0.015
rmsd angles                            2.5°
Ramachandran statistics
  Most favored                         91.1%
  Additionally allowed                 8.9%
  Generously allowed                   0
  Disallowed                           0
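The R(cryst) and R(free) values in Table 3-2 are both instances of the conventional crystallographic residual, R = Σ| |Fo| − |Fc| | / Σ|Fo|, evaluated over the working reflections and over the held-out test set, respectively. The following is a minimal sketch of that arithmetic only. It does not reproduce the internals of CNS or SHELXL; the function names, the random selection of the test set, and the amplitudes are hypothetical and serve purely as illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional residual R = sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_test_set(n_reflections, test_fraction, seed=0):
    """Return (work_indices, test_indices).

    The test reflections are set aside once and excluded from refinement,
    so that R(free) is computed on data the model was never fitted to.
    Random selection is an assumption of this sketch.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = round(n_reflections * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Hypothetical structure factor amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 84.0, 63.0, 38.0]
print(f"R = {r_factor(f_obs, f_calc):.4f}")

# An 11.3% test set, matching the fraction reported in Table 3-2.
work, test = split_test_set(1000, 0.113)
print(len(work), len(test))
```

Using the same test set throughout, as described above, is what keeps R(free) an unbiased check when refinement moves from one program to another.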
Results

VCBP3V1 was crystallized and its structure was solved to determine the relationship between the predicted N-terminal V-type Ig domains of VCBPs and various reference Ig-type V structures (Hernández Prada et al. 2004). Attempts to solve the structure by molecular replacement failed using the VL domain from the phosphocholine-binding Fab McPC603 (PDBID: 1MCP, GenBank gi 230159), which is 26% identical to VCBP3V1 and represents the most homologous structure in the protein database. Selenomethionine protein crystals were therefore generated and the structure was solved by the MAD technique (Hendrickson et al. 1990) to generate phases unbiased by molecular models. Native and selenomethionine crystals were isomorphous, with unit cell dimensions a = b = 59.2 Å, c = 79.3 Å, belonged to space group P3₁21, and diffracted to atomic resolution (1.2 Å or higher) (Dauter et al. 1997) (Table 3-1). Phases derived from the position of the single selenium atom in the asymmetric unit were used to calculate electron density maps in which 81% of the residues were fit automatically (Figure 3-2). The VCBP3V1 structure was refined to a resolution of 1.15 Å with SHELXL (Schneider and Sheldrick 2002) against native data and consists of 135 residues, 227 ordered water molecules and one sulfate ion.

Ig domains are structurally classified as V-type if they display a conserved nine-strand secondary structure topology that forms a tertiary structure consisting of two β-sheets packed tightly in the aligned mode: the front sheet (strands A′GFCC′C″) and the back sheet (strands ABED) (Bork et al. 1994). Secondary structure and H bonding patterns between strands were assessed with the program DSSP (Carter et al. 2003) and demonstrate that VCBP3V1 adopts the V-type Ig fold (Holm and Sander 1999). As shown in Figure 3-3 the C′ and C″ strands of VCBP3V1 are part of a main chain H bond
Figure 3-2. Electron density calculated from three-wavelength MAD experimental phases to 1.3 Å. Contoured at the 2σ level (blue), at the 3σ level (red) and superimposed onto the final model (white for carbon, blue for nitrogen, red for oxygen atoms). Trp42, invariant in V domains, is shown packed against the intrachain disulfide bond (Cys27-Cys110) that links the B and F strands.

network stabilizing the front sheet, in contrast to s-type, h-type and c-type Ig domains, which do not contain C″ strands (Bork et al. 1994). The strand arrangement of VCBP3V1 (front sheet A′GFCC′C″) shows that it is more similar to Vβ, Vγ, Vδ, VH and VL than to Vα (front sheet A′GFCC′), since the C″ strand of Vα is H bonded to the back sheet (ABEDC″) (Ostrov et al. 2000; Garcia et al. 1999).
Figure 3-3. VCBP3V1 is a V-type immunoglobulin. (A) A ribbon representation of the VCBP3V1 monomer. β-strands are indicated in gold, and random coils in gray. Boundaries of the strands were assessed with DSSP (Carter et al. 2003) (A, 5-9; A′, 14-18; B, 23-32; C, 39-45; C′, 51-55; C″, 73-77; D, 85-87; E, 93-95; F, 106-113; G, 123-133). (B) An overlay of VCBP3V1 with a structurally similar TCR V domain. VCBP3V1 (in gold) is superimposed on a TCR Vδ domain (PDBID: 1TVD) (in blue). The most significant structural deviation occurs in the C′C″ loop (corresponding to CDR2), which is eight residues longer than CDR2 of the most structurally related Vδ. Figure generated with SETOR (Evans 1993).

Comparison of related tertiary structures could lead to evolutionary and functional insights not inferred directly from primary structure analysis. VCBP3V1 belongs to the Ig protein superfamily, which is one of the most abundant and ubiquitous throughout the animal kingdom. The functions of members of the Ig superfamily are disparate and provide a striking example of selection-driven adaptation of an apparently multifunctional domain (the Ig fold). VCBP3V1 was aligned structurally with solved structures in the Protein Data Bank using the program DALI (Holm and Sander 1999). As shown in Table 3-3, approximately 30% of the 15 most similar structures are V regions from TCRs. The most similar TCR V region is from a human Vδ (Li et al. 1998) (PDBID: 1TVD).
Table 3-3. Structural comparison of VCBP3V1 with proteins in the Protein Data Bank. rmsd, root mean squared deviation.

Rank  PDB ID  Z score  Cα rmsd (Å)  % identity  Protein
1     1f5w    14.1     2.2          24          Coxsackievirus and adenovirus receptor
2     1tvd    14.0     2.1          25          Vδ3
3     1qfo    13.5     2.3          19          Sialoadhesin D1
4     1neu    13.4     2.1          20          Myelin P0 protein
5     1tcr    13.2     2.0          19          Vβ8.2
6     1f97    13.2     1.9          20          JAM/reovirus receptor
7     1pko    13.0     1.7          23          Myelin oligodendrocyte glycoprotein
8     1cdy    12.4     1.8          23          CD4 D1
9     1hxm    12.3     2.4          18          Vδ2
10    1ccz    12.3     2.2          18          CD58 D1
11    1hxm    12.2     2.2          15          Vγ9
12    8fab    12.0     2.2          21          VL
13    1i85    12.0     2.3          20          CD86 (B7-2) D1
14    1qa9    11.7     2.8          15          CD2 D1
15    1bec    11.6     2.3          15          V 6

Superposition of VCBP3V1 on Vδ (Figure 3-3) demonstrates that the most significant structural difference occurs in the C′C″ loop, which corresponds to CDR2. The C′C″ loop of VCBP3V1 is nine residues longer than CDR2 of this Vδ domain (1TVD) but is the same length as CDR2 in solved crystal structures of human Ig VH (Monod et al. 2004).

The structural fold of VCBP3V1 is overall the same as that present in antigen receptors. The strands of the VCBP3V1 and Vδ framework regions, and the lengths and dispositions of the BC loops (corresponding to CDR1) and the FG loops (corresponding to CDR3), are structurally similar. The front sheets of TCR and Ig V regions are curled and twisted in a characteristic manner that promotes front sheet-front sheet dimerization. A greater degree of curling of the front sheet is seen in VCBP3V1 relative to Vδ (Figure 3-3), and represents a potential difference between the VCBPs and rearranging antigen
binding receptors in terms of the location and composition of their respective ligand-binding sites (see below) (Colman 1988; Chothia et al. 1985).

Notwithstanding these differences, it is possible to predict protein-ligand interaction sites based on structural features at the molecular surface (e.g., pockets, clefts, and grooves) and the interactions of similar structures. Comparison of VCBP3V1 with the three most structurally similar proteins, coxsackievirus and adenovirus receptor domain 1 (CAR D1; van Raaij et al. 2000) (24% identical), TCR Vδ (Li et al. 1998) (25%) and sialoadhesin (Zaccai et al. 2003) (19%), shows that despite a relatively low degree of sequence similarity (Figure 3-4), CAR D1, TCR Vδ, sialoadhesin and VCBP3V1 are predicted to share the capacity to interact with ligands utilizing residues located on the front sheets (A′GFCC′C″). Specifically, CAR D1 interacts with the knob domain of the adenovirus fibre head using the front sheet (van Raaij et al. 2000) (Figure 3-5A), TCR V domain dimerization is mediated by the front sheets (Garcia et al. 1999) (Figure 3-5B) and sialoadhesin interacts with sialic acid using residues on the front sheet (Zaccai et al. 2003) (Figure 3-5C). The crystal structure of VCBP3V1 shows an interaction with a molecule related by crystallographic symmetry at an analogous site on the front sheet (Figure 3-5D). However, antigen receptors such as TCRs use CDR loops to mediate antigen specificity and MHC restriction (Garcia et al. 1999). Furthermore, VCBP3V1 is monomeric in solution and in the crystal (one molecule per asymmetric unit), and so may form ligand binding interactions using loops that are analogous to those seen in the VH CDRs of two cases of highly derived antigen receptors, the single chain camelid antibodies (De Genst et al. 2004) and shark IgNAR (Streltsov et
Figure 3-4. Amino acid sequence alignment of Branchiostoma floridae VCBP3V1 with structural homologues. Human coxsackievirus and adenovirus receptor domain 1 (CAR D1, PDBID: 1F5W), human TCR Vδ (1TVD), and mouse sialoadhesin (Siglec-1, 1QFO) were identified by the DALI server. VCBP3V1 and proteins used in immune recognition show a high degree of structural similarity (rmsd < 2.4 Å) despite a relatively low degree of sequence identity between VCBP3V1 and the reference structures (24%, 25%, and 19%, respectively). Residues were aligned by sequence similarity using the BLOSUM62 (Myers and Durbin 2003) matrix and color-coded ranging from low similarity to high (white
Figure 3-5. Structural homologs of VCBP3V1 use their front sheets (A′GFCC′C″) in ligand interactions. (A) CAR D1 (strands in gold, loops in gray) interacts with the adenovirus knob fibre head (blue) (1KAC). (B) Vδ domain (1TVD, strands in gold, loops in gray) interacts with Vγ (blue) as oriented in a TCR heterodimer (1HXM). (C) Sialoadhesin (strands in gold, loops in gray) binds to 3'-sialyllactose (yellow = carbon, blue = nitrogen, red = oxygen atoms) (1QFO). (D) Residues on the front sheet of VCBP3V1 form a contact with a symmetry-related molecule at residues 46-50 (colors as in C), corresponding to the CC′ loop. Figure generated with SETOR (Evans 1993).
Since many proteins, including many that diverged in evolution before antigen receptors, use β-sheets in protein-protein interactions, VCBP3V1 may be more likely to bind ligands via the front sheet than through the induced-fit binding mechanism used by antigen receptor CDR loops (Rini et al. 1992; Rudolph and Wilson 2002).

Discussion

VCBP3V1 is the first crystal structure of a V-type Ig domain solved to a resolution limit of 1.15 Å. At this level of resolution it is possible to begin to observe ordered H atoms, assess protonation states, and carry out structural analyses that would not be possible at lower resolution (Dauter et al. 1997). We observe strong electron density in peptide bonds, indicating their π character, throughout the VCBP3V1 structure. Strikingly, we also observe Cα and Cβ hydrogens for many residues, mainly at the core of this V-type Ig. The most conserved portion of V-type Ig domains lies at the central hydrophobic core region (Chothia et al. 1998) and consists of a highly conserved Trp residue, Trp42 in VCBP3V1 (corresponding to IMGT Trp41), packed against the intrachain disulfide bond linking the B and F strands (Cys27 and Cys110) (Figure 3-2).

The level of refinement that was achieved permitted us to infer heretofore unrecognized structural features that likely stabilize the core of VCBP3V1. Hydrogen atoms apparent in Fo-Fc difference maps suggest a clear role for non-canonical interactions between key conserved residues at the core of the V-type Ig domain (Figure 3-6). Analysis of the structural coordinates of VCBP3V1 by the Non-Canonical Interaction server (NCI) (Babu 2003) identified 64 residues that are involved in non-canonical interactions (e.g., weak C-H···O bonds and π interactions, Table 3-4). The interactions were identified based on minimum geometrical criteria of distances and angles (see Babu 2003) between donors and acceptors.
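A geometric screen of the kind described above can be sketched as follows: a C-H···O contact is accepted only if the H···O distance is short enough and the C-H···O angle is sufficiently close to linear. The cutoffs used here (2.8 Å and 120°) and the function names are hypothetical placeholders chosen for illustration; they are not the criteria applied by the NCI server, for which Babu (2003) should be consulted.

```python
import math


def angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cosine))


def is_ch_o_contact(carbon, hydrogen, oxygen,
                    max_h_o_distance=2.8,   # Å, assumed cutoff
                    min_c_h_o_angle=120.0):  # degrees, assumed cutoff
    """Return True if the C-H...O geometry passes both assumed criteria."""
    h_o_distance = math.dist(hydrogen, oxygen)
    c_h_o_angle = angle_deg(carbon, hydrogen, oxygen)
    return h_o_distance <= max_h_o_distance and c_h_o_angle >= min_c_h_o_angle


# Idealized coordinates (Å), for illustration only.
carbon = (0.0, 0.0, 0.0)
hydrogen = (1.09, 0.0, 0.0)
print(is_ch_o_contact(carbon, hydrogen, (3.5, 0.0, 0.0)))    # linear approach
print(is_ch_o_contact(carbon, hydrogen, (1.09, 2.4, 0.0)))   # 90 degree approach
```

Because such a test depends on hydrogen positions, it is most meaningful at atomic resolution, where H atoms are observed in difference maps rather than placed purely by geometry.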
Probably the most critical non-canonical interactions occur at, or in close proximity to, the invariant Trp42 in the central hydrophobic core region of VCBP3V1. Trp42 is involved in at least four non-canonical interactions (Figure 3-6): one main chain-main chain interaction (CH-OC), two main chain-side chain interactions (CH-π), and one side chain-side chain (π-π) stacking interaction. Cys27 and Cys110 also show critical non-canonical interactions; for example, there is a main chain-side chain interaction (CH-π) between Cys110 and Trp42. Other conserved residues at the core of the fold show non-canonical interactions; specifically, the invariant Tyr108 forms multiple non-covalent bonds as both an H-bond donor and acceptor. The expanded definitions provided here of canonical and non-canonical interactions between invariant core residues provide a new basis for understanding the patterns of strong evolutionary conservation of V region structure.

It is proposed that these observed weak interactions stabilize the scaffold on which the complementarity determining regions are displayed (in antigen receptors) and that they play an important role in its structure, making them sensitive to the selective pressures that favor the Ig fold. It is reasonable to conclude that more residues are involved in such interactions and that, although each is weak, the combination of many of these interactions may afford the structure a significant level of stability. Non-canonical interactions have gained attention recently in structural biology (Manikandan 2004; Loll et al. 2003), and roles involving both structure (Derewenda and Derewenda 1995) and catalysis (Ash et al. 2000) have been discussed.

Alternatively, these weak interactions may play an important role during the folding of the Ig fold. Although we understand that weak interactions can become
Figure 3-6. Non-canonical interactions at the core of VCBP3V1 between Trp42 and Cys110. 2Fo-Fc electron density calculated from phases to 1.15 Å was contoured at the 2σ level (colored in blue). Non-canonical interactions mediated by π electrons and H atoms apparent in the Fo-Fc electron density are shown at the 3σ level (red) and 2σ level (white), superimposed onto the final model (yellow for carbon, green for sulfur, blue for nitrogen, red for oxygen atoms). Figure made with Raster3D (Merritt 1994).

significant when many are combined, we suspect that the disulfide bond that links the front and the back sheet of the Ig fold, along with the hydrogen bonding networks of its sheets, may confer on the domain an amount of stability that renders any contribution from the weak non-canonical interactions negligible. It is still possible, however, that these interactions play an important function in guiding the hydrophobic collapse (or a similar stage in the folding pathway of the Ig fold) of the unfolded protein. Similar to how these weak interactions may have a pronounced effect on the structure of membrane proteins that are situated in a low dielectric environment, these weak
interactions would be expected to strengthen as the core of the Ig domain forms and the local dielectric constant decreases. Such stabilizing effects before the settlement of a primary sequence into a native conformation may affect the fraction of properly folded protein or the rates of transition between folding intermediates.

Table 3-4. Non-canonical interactions of important immunoglobulin-fold residues in VCBP3V1. 1: (Chothia et al. 1998); 2: (Babu 2003).

VCBP3V1   Chothia (1)  IMGT Ig  NCI (2)                   Donor  Acceptor
Gly 20    16/15        16       0
Cys 27    23/22        23       CH···OC                   0      1
Trp 42    35/36        41       CH···π, CH···OC, π···π    2      3
Arg 84    61/66        75       CH···OC                   0      1
Leu 95    73/80        89       CH···OC                   0      1
Asp 104   82/86        98       CH···OC                   0      2
Tyr 108   86/90        102      CH···π, CH···OC, π···π    1      3
Cys 110   88/92        104      CH···OC, CH···π           1      1

VCBPs constitute an extensively diversified multigene family of putative antigen binding receptors that are expressed in cells lining the luminal surface of the gut in the Floridian species of amphioxus, a filter-feeding marine cephalochordate (Cannon et al. 2002). On the basis of predicted primary structure, VCBPs consist of two N-terminal tandem V regions and a chitin-binding domain. The V regions of VCBPs possess many of the same core residues as the V regions found in Ig, TCR, and other IgSF members (Chothia et al. 1985). Like Ig and TCR, the VCBPs can be organized into at least five families that differ from one another by at least 30% at the predicted protein sequence level. This pattern of sequence distribution and variation resembles that seen in the large
V region families of Ig, TCR, and novel immune-type receptors (NITR) (Strong et al. 1999; Yoder et al. 2004) and is consistent with immune-related diversification.

Extensive analyses of the genomic complexity of the different VCBPs amplified from individual animals indicate that certain sequences are shared by nearly all animals but that most polymorphic sequences derived from a single animal DNA source are unique to that individual (Carter et al. 2003). Notably, the differences involve both single positional replacements and overall length differences and bear a superficial resemblance to somatically derived N-region variation seen at Ig/TCR-joining junctions. Such complex polymorphisms are consistent with an immune function for VCBPs in which these genes are undergoing strong diversifying selection and adaptive evolution.

In our studies, we conclude that VCBP3V1 has key structural features in common with antigen receptors, including the conformation of conserved residues that constitute the domain core, strand topology, and loop conformations. VCBP3V1 shares five of the eight invariant residue sites (present in > 98% of sequences) that are associated with the common core of V domains [Chothia/IMGT designation]: Cys23/23, Trp35/41, Asp82/98, Tyr86/102, Cys88/104. The other invariant V domain residues (encoded by the J gene segment that forms the G strand in TCR and Igs), namely Gly99/119, Gly101/121, and Thr102/122, are not present in VCBP3V1. Despite the difference in these G strand residues, conserved main chain intramolecular H-bond interactions at this site (Chothia et al. 1998) are observed in VCBP3V1 between Leu128 (in an analogous position to Thr102/122 in antigen receptors) and Tyr108 (corresponding to the invariant Tyr86/102) (comparison of Ig residue numbers in Table 3-4). The BC, C′C″, and FG loops, corresponding to CDR1, 2, and 3, respectively, are similar to conformations observed in Ig and TCR V domains. The
front sheet of VCBP3V1 is curled and twisted in a manner similar to antigen receptors, which use this face of V-type Ig domains in conserved heterodimeric subunit interactions between the front sheets of the paired V domains (Potapov et al. 2004).

The VCBP3V1 crystal structure described here both provides a basis for high-resolution analysis of a unique V domain present in an immune-type molecule and offers unequivocal atomic level information about the basic structural features that this VCBP shares with known antigen binding receptors. In addition, the structural relatedness of VCBP3V1 to other proteins offers novel evolutionary insights. Germline-encoded VCBP3V1 is most similar to highly polymorphic, somatically diversified molecules, such as TCR V domains, and to non-polymorphic mammalian leukocyte (e.g., CD4, CD2, CD86) and non-leukocyte cell surface receptors (CAR D1, junction adhesion molecule). These observations can be interpreted to suggest either a common evolutionary origin or convergence. The observation that certain mammalian structural homologs of VCBP3V1 serve as viral receptors [e.g., junction adhesion molecule, which is a receptor for reovirus (Eason et al. 2004)], whereas others undergo somatic rearrangement, underscores an evolutionary continuum of innate receptors, germline diversified innate receptors, and somatically varied innate and adaptive receptors (Atkinson and Leiter 1999).

Localized germline hypervariability in V region-containing putative innate immune receptors such as VCBPs and NITRs is consistent with a potential to bind diverse ligands such as that achieved in both Igs and TCRs through CDR1-CDR2-CDR3 interactions. V domain-associated function appears to be achieved at different contact interfaces and likely is influenced through a high level of germline genetic change (e.g., CDR1 and CDR2), through additional somatic variation (CDR3), and through heterodimerization
in both Ig and TCR. This high degree of conservation of intramolecular contacts and structural conformations of domain core residues in VCBP3V1 is consistent with an evolutionary paradigm in which an innate antigen receptor has been refined to progressively adopt unique intermolecular associations in parallel with the acquisition of combinatorial rearrangement and regionally limited somatic modification as major diversifying influences. The full potential of these effects is realized in the genes encoding the rearranging antigen binding receptors used in jawed vertebrates.
CHAPTER 4
ORIGINS OF IMMUNE-TYPE RECEPTOR DIVERSITY: CRYSTAL STRUCTURE OF VCBP3V1V2 IN AMPHIOXUS

Introduction

The immunoglobulin superfamily (IgSF) is one of the most ubiquitous families of protein domains, representatives of which are found throughout the animal kingdom (Orengo and Thornton 2005; Bork et al. 1994; Barclay 2003; Hunkapiller and Hood 1989). The functions of the various members of this superfamily are disparate and provide a striking example of selection acting on a modular structure, the immunoglobulin domain fold, to form molecules with specialized functions such as cell-cell adhesion and immune recognition (Eason et al. 2004). Immunoglobulin (Ig) domains are composed of a β-sandwich fold, a prominent feature of which is a disulfide bridge connecting two β-sheets, with one invariant tryptophan packing against the disulfide bond (Chothia et al. 1998).

Ig domains can be classified into the V, C, S and H sets based on strand topology (Bork et al. 1994). The V-set domains range from being non- or minimally polymorphic to having high levels of genetic complexity due to germline and somatic diversification. Immunoglobulins (Igs) and T cell antigen receptors (TCRs), which constitute the rearranging antigen binding receptors, are the most extensively characterized V-set molecules, but CD4 (Garrett et al. 1993), CD8 (Leahy et al. 1992), and certain NK receptors (Cantoni et al. 2003) also possess this structure. It is likely that the ancestral element that diversified into antigen receptors was an Ig-like sequence with
characteristics of the V-set, but its characteristics and patterns of evolutionary diversification remain unknown (Laird et al. 2000; Litman et al. 2005).

Although certain exceptions exist (Desmyter et al. 1996; Stanfield et al. 2004), most antigen receptors used in adaptive immune responses consist of heterodimers in which V-set Ig domains pack to juxtapose the protein segments that have the highest degree of germline and somatically induced variation (Garcia et al. 1999). This feature of tertiary-quaternary structure forms a vast range of binding surfaces that account for highly diverse antigen receptor repertoires. The unusual genetics that account for the hyperdiverse structures of Ig and TCRs are limited to the jawed vertebrates (Laird et al. 2000; Litman et al. 2005).

Alternative forms of receptors are found throughout invertebrate and protochordate phylogeny, but thus far only the variable region-containing chitin-binding proteins (VCBPs) in amphioxus (Branchiostoma floridae) have shown evidence for extensive regionalized hypervariation in predicted V domains (Litman et al. 2005; Cannon et al. 2002; Cannon et al. 2004). The VCBPs consist of two predicted N-terminal V regions and a C-terminal chitin-binding domain; however, inspection of primary structure shows the regionalized hypervariation to be significantly displaced from that seen in Ig and TCR and fails to demonstrate a J region, a ubiquitous feature of antigen binding receptors (Davis and Bjorkman 1988).

VCBPs may reflect an important transition between conventional views of innate pattern recognition domains and adaptive immune receptors (Medzhitov and Janeway 2000). To determine how structural features of the predicted V structure found in this putative receptor may relate to the more modern (recently derived) rearranging antigen
receptors and hypothetical ancestral V-set molecules, we solved the structure of the N-terminal V regions of VCBP3 from amphioxus by x-ray crystallography.

Materials and Methods

Protein Expression, Purification, and Crystallization

The two-domain protein (VCBP3V1V2) was purified and crystallized with the same protocols described in Chapter 2, Materials and Methods. To obtain the selenomethionine derivative of VCBP3V1V2, however, cells were grown in methionine-deficient medium supplemented with selenomethionine to express the protein for anomalous dispersion experiments. This was carried out according to published protocols (Barclay 2003; Hunkapiller and Hood 1989; Eason et al. 2004; Chothia et al. 1998; Garrett et al. 1993; Leahy et al. 1992; Cantoni et al. 2003; Laird et al. 2000). The native data set described in Chapter 2 was used for the refinement of the structure in this chapter.

Data Collection and Processing

Selenomethionine derivative crystals of VCBP3V1V2 were cryo-cooled by immersion in liquid nitrogen and data were collected at beamline X6A of the National Synchrotron Light Source, Brookhaven National Laboratory, Upton, NY. A single-wavelength anomalous dispersion (SAD) experiment was performed by collecting diffraction data from a single selenomethionine derivative crystal at the peak of its absorption edge. Derivative data were collected using oscillation angles of 0.5° and 20 second exposures. Intensities were indexed and integrated with DENZO and reduced with SCALEPACK (Otwinowski and Minor 1997). The calculated molecular weight of the crystallized polypeptide and the post-refined unit-cell dimensions were submitted to the Matthews Probability Calculator (Kantardjieff and Rupp 2003) to estimate the contents of the asymmetric unit. See Table 4-1 for data collection parameters.
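The quantity underlying this estimate is the Matthews coefficient, V_M = V_cell / (Z × n × MW), where Z is the number of symmetry operators of the space group and n the number of molecules per asymmetric unit; the corresponding solvent fraction is approximately 1 − 1.23/V_M. The sketch below illustrates only this arithmetic, using the unit-cell parameters of Table 4-1 and the six operators of space group P6₁. It does not reproduce the resolution-dependent probabilities returned by the Matthews Probability Calculator, and the molecular weight shown is a hypothetical placeholder, since the calculated value is not restated in this section.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (Å^3) of a cell with a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, n_symops, n_molecules, mol_weight_da):
    """V_M in Å^3/Da for n_molecules per asymmetric unit."""
    return cell_volume / (n_symops * n_molecules * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (Matthews 1968)."""
    return 1.0 - 1.23 / v_m


# Unit-cell parameters from Table 4-1; P6(1) has six symmetry operators.
volume = hexagonal_cell_volume(109.37, 48.34)
N_SYMOPS_P61 = 6

# Hypothetical molecular weight (Da), for illustration only.
ASSUMED_MW_DA = 28000.0

for n_molecules in (1, 2, 3):
    v_m = matthews_coefficient(volume, N_SYMOPS_P61, n_molecules, ASSUMED_MW_DA)
    print(n_molecules, round(v_m, 2), round(solvent_fraction(v_m), 2))
```

Candidate values of n that give a V_M far outside the range commonly observed for protein crystals, or a negative solvent fraction, can be discarded before phasing is attempted.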
Table 4-1. Data collection and refinement statistics for the VCBP3V1V2 derivative. *Values for the highest resolution shell are in parentheses.

Data Collection Statistics
Wavelength (Å)                          0.9786
Temperature (K)                         100
No. of frames                           420
No. of crystals                         1
Crystal-to-detector distance (mm)       187.7
Observed reflections                    767,471
Unique reflections                      52,447
Redundancy                              5.9 (2.6)
Resolution range (Å)                    30-1.87 (1.90-1.87)*
Space group                             P6₁
Unit-cell parameters (Å)                a = 109.37, c = 48.34
Oscillation step (°)                    0.5
Mosaicity (°)                           0.305
I/σ(I)                                  36.4 (3.0)
Reflections >3σ (%)                     84.3 (54.0)
Completeness (%)                        98.2 (77.0)
Rmerge (%)                              5.3 (24.8)
FOM (before density modification)       0.39
FOM (after density modification)        0.59

Refinement Statistics
Protein atoms                           1891
Solvent molecules                       277
R(cryst) (%)                            22.82
R(free) (%)                             26.85
Test set (%)                            4.4
rmsd bonds (Å)                          0.005
rmsd angles (°)                         1.25
Ramachandran statistics (%)
  Most favored                          88.2
  Additionally allowed                  11.8
  Generously allowed                    0
  Disallowed                            0

Phasing and Refinement

The selenomethionine derivative diffraction data were successfully phased with the SOLVE software package (Terwilliger and Berendzen 1999) by the single-wavelength anomalous dispersion method. Initial experimental electron density maps were evaluated with Coot (Emsley and Cowtan 2004). Statistical density modification was performed with RESOLVE (Terwilliger 2000). The space group ambiguity was resolved, and the crystals were determined to belong to space group P6₁. SAD phasing with SOLVE also identified two strong peaks corresponding to the single selenomethionine residue in each variable domain, confirming the presence of a single molecule in the asymmetric unit. Statistical density modification improved the overall figure of merit of the original phases from 0.39 to 0.59. Most of the main chain was successfully traced by RESOLVE (Terwilliger 2002) as polyalanine and polyglycine segments.
Model building of the main chain was carried out with Coot during rounds of refinement against derivative diffraction data. The structure of VCBP3 was refined with Shelx97 (Sheldrick and Schneider 1997) and CNS (Brunger et al. 1998). Xtalview (McRee 1999) was also used for molecular graphics. Most of the main chain and side chains were built before refining against native diffraction data, for which the same test set of reflections was used to calculate R-free (Brunger 1992). Several rounds of refinement were performed before water molecules were assigned. Refinement routines consisted of simulated annealing from 2000-3000 K with slow cooling, geometrically restrained atomic coordinate minimization, and isotropic temperature factor refinement. Water molecules were assigned through multiple cycles of refinement until most peaks with signal to noise ratios greater than 2 were occupied. CNS was used to assign the first solvent shell using its automatic water picking routine, but in later stages of refinement SHELXL was also used to assign solvent molecules to sigma-weighted Fo-Fc electron density maps. Refinement continued until the model converged. The crystal structure was submitted to the Protein Data Bank (PDBID: 2FBO). Refinement statistics are summarized in Table 4-1.

Comparative Homology Modeling of VCBP2 V Domains

Most of the sequence data available were obtained from VCBP2 (Cannon et al. 2002). In order to evaluate whether the superimposition of these sequence polymorphism data on VCBP3V1V2 is a reasonable approach, theoretical models of VCBP2 V domains were generated (Figure 4-1) using programs implemented by the Automated Protein Modelling Server (SWISS-MODEL) (Guex and Peitsch 1997; Guex et al. 1999). Sequences representative of VCBP2 V regions were queried using BLAST2P (Zhang and Madden 1997) to identify homologous proteins of known structure deposited in the Protein
Figure 4-1. Comparative homology modeling of VCBP2 based on the 1.85 Å crystal structure of VCBP3. The VCBP3 backbone is shown in magenta (V1) and violet (V2). The homology model of VCBP2 is superposed on VCBP3 in pale yellow (V1) and yellow (V2). (A) shows the back view of the VCBPs and (B) shows the front view, consistent with Figure 4-3C and 4-3D (see Figure 4-3 for labels). A chain break is observed between the V domains of VCBP3 since each was read as a separate chain with Pymol (DeLano 2002). Only 38 residues of domain V2 from VCBP2 were modeled with confidence, whereas the complete topology of the fold is observed in the model of V1.

Data Bank. ProModII was used to build atomic models (Peitsch 1996; Peitsch et al. 2000). Energy minimization was conducted with Gromos96 (van Gunsteren 1996). RMS deviations and distances were calculated with LSQKAB from the CCP4 suite (Collaborative Computational Project 1994). Multiple sequence alignments were carried out with CLUSTAL X (Evans 1993).

Results

Like several other non-vertebrate immune-type receptors (Litman et al. 2005; Zhang et al. 2004; Watson et al. 2005), VCBPs are chimeras of Ig and lectin domains (Cannon et al. 2002). In the case of VCBPs, it is most likely that the variable recognition function is compartmentalized in the two tandem Ig domains that have extensive germline variation. We expressed and crystallized the two tandem V domains of VCBP3
and determined the structure by the single-wavelength anomalous dispersion technique (Wang et al. 2004). Native and selenomethionine-substituted forms were isomorphous to each other and belong to the space group P61 with unit cell dimensions a = b = 109.4, c = 48.3 Å. X-ray data were collected from native and selenomethionyl protein crystals and the structure was refined to 1.85 Å (Table 4-1). The structure factors and refined crystallographic model were deposited in the Protein Data Bank (PDBID: 2FBO).

The two tandem Ig domains of VCBPs had previously been predicted to have the V-set Ig fold based on the presence of conserved residues that are canonical in antigen receptor V domains (Chothia et al. 1998), despite a relatively low degree of sequence similarity to antigen receptors (24%) (Cannon et al. 2002). The crystal structure shows one molecule in the asymmetric unit (Figure 4-2); both VCBP3 domains show strand topologies characteristic of the V-set Ig domains used by rearranging antigen receptors. Ig domains are structurally classified as V-type if they display a conserved 9-strand secondary structure topology forming two β-sheets packed tightly in the aligned mode: the front sheet (strands A′GFCC′C″) and the back sheet (strands ABED) (Bork et al. 1994). Structural analysis by SSM (Kolodny et al. 2005) shows that both V-set Ig domains of VCBP3 adopt the V-type Ig fold and are structurally similar to each other (Figure 4-3A). The degree of similarity between VCBP3 V1 and V2 is higher than to any solved structure in the Protein Data Bank, with an rmsd of 1.4 Å for Cα atoms.

Both V domains of VCBP3 have the conserved intrachain disulfide bonds linking the back and front sheets by the B and F strands, respectively (Cys26-Cys109 in V1; Cys160-Cys232 in V2). V2 also contains a disulfide bond linking the B and C strands (Cys162-Cys165) stabilizing the BC loop, which is analogous to CDR1 in antigen receptors.
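The unit cell volume implied by the reported cell dimensions can be estimated with the general cell-volume formula. The following is a minimal sketch, not part of the original analysis: the cell angles are not stated in the text, so the hexagonal values implied by space group P61 (α = β = 90°, γ = 120°) are assumed, and the function name is illustrative only.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a general (triclinic) unit cell; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)

# Reported dimensions: a = b = 109.4, c = 48.3 angstroms.
# gamma = 120 degrees is an assumption based on the hexagonal space group.
volume = unit_cell_volume(109.4, 109.4, 48.3, gamma=120.0)

# P61 has six symmetry-equivalent positions, so each asymmetric unit
# occupies one sixth of the cell.
asymmetric_unit_volume = volume / 6.0

print(f"cell volume: {volume:.0f} cubic angstroms")
print(f"per asymmetric unit: {asymmetric_unit_volume:.0f} cubic angstroms")
```

Under these assumptions the cell volume comes to roughly 5.0 × 10^5 Å³. Dividing the per-asymmetric-unit volume by the molecular mass of the crystallized construct (not given in this section) would give the Matthews coefficient.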
Figure 4-2. Crystal structure of VCBP3V1V2 from Branchiostoma floridae solved by SAD and refined to 1.85 Å. Secondary structure is shown in (A) and (B) (gold, β-strands; gray, loop regions; red, helices). The G strand and FG loop, encoded by a Joining gene segment-like element, are shown in cyan. The loops corresponding to CDR regions in T cell receptors and immunoglobulins are: BC loop, CDR1; C′C″ loop, CDR2; FG loop, CDR3. (C) and (D) show the molecular surface of VCBP3 V1V2. The molecular surface of V1 is shown in magenta and V2 in violet. Polymorphic residues in VCBP sequences are depicted in gold, forming a contiguous patch of solvent-exposed hypervariable residues outlined by a dashed red rectangle. (B) and (D) show VCBP3 rotated 180° about a vertical axis in the plane of the page with respect to (A) and (C).
Among solved structures in the Protein Data Bank aligned by SSM, V1 is most similar to a TCR Vδ domain, 1TVD:B (Li et al. 1998), shown in Figure 4-3B, followed by the CNS-specific autoantigen myelin oligodendrocyte glycoprotein (MOG), 1PY9:A (Clements et al. 2003), and the human coxsackie and adenovirus receptor D1, 1EAJ:B (van Raaij et al. 2000) (rmsd: 1.7 to 1.8 Å, sequence identity 22-26%). The front sheet of VCBP3 V1 (A′GFCC′C″) is more similar to Vβ, Vγ, Vδ, VH and VL than to Vα (front sheet A′GFCC′), in which the C″ strand of Vα is H-bonded to the back sheet (ABEDC″) (Garcia et al. 1999; Ostrov et al. 2000).

The structures of VCBP3 V1 and V2 are the first examples of highly variable V-set Ig domains identified in invertebrates, and bear strong structural similarities to recently derived antigen receptors in domain folding and packing interactions. For example, antigen receptors use a three layer packing mode in which side chains from residues in the highly twisted edge strands, C′ and G, comprise the inner layer and the two outer layers are formed by main and side chains of the central β-strands, F and C (Colman 1988; Chothia et al. 1985). The highly twisted edge strands fold over the inner strands of the front sheet to mediate critical contacts that form the inner core layer of the three domain packing interface (Figure 4-3C) (Eason et al. 2004).

A three layer interface is formed between V1 and V2 of VCBP3, and similar interface residues in the C′ and G strands are observed, as shown in Figure 4-3D. Main chain to side chain H-bond interactions that comprise the outer layer are observed between V1 and V2 of VCBP3, as shown in Figure 4-4A, in which Arg106 from V1 forms four H-bond interactions across the V1/V2 interface to Glu240, Leu241, and Ala243. The specialized three layer packing mode used by VCBP3 is observed in antigen receptor
Figure 4-3. Structural comparison of the VCBP3 domain fold and packing interactions with antigen receptors. (A) VCBP3 V1, cyan, is shown superimposed on V2, gold. The FG loops and G strands, encoded by J-like elements in VCBPs, are nearly identical, whereas the BC and CC′ loops of V1 are longer than those present in V2. Among V-set Ig domains, VCBP3 V1 is most similar to a TCR Vδ domain, shown in (B) in purple superimposed on V1, shown in cyan; note the C′C″ loop curled over the front sheet (A′GFCC′C″) in V1. (C) shows the three layer packing interaction characteristic of the interface between V domains of antigen receptors. One antigen receptor V domain, shown in gold and gray, interacts with the V domain of the corresponding partner chain, shown in magenta, by a front sheet-to-front sheet interaction in which side chains (shown in cyan) from the highly twisted edge strands, C′ and G, fold over the central strands of the front sheet to form an inner layer at the core of the three layer interface between the domains. (D) A three layer packing interaction is observed at the interface between VCBP3 V1 and V2. Residues form an inner layer and the C′ loop folds over the central strands of the front sheet. In (C) and (D), the first antigen receptor V domain and VCBP3 V1 are shown in gold (β-sheet), gray (loop regions), and red (helical regions); the partner V domain and VCBP3 V2 are shown in magenta. In (C), Pro43, Leu104, and Phe106 are depicted in cyan. In (D), Leu42, Leu122, and Ala124 are shown in cyan.
heterodimers and differs from the frequently observed two layer interface that occurs in most protein-protein interactions in which β-sheets interact.

Heterodimeric antigen receptors in jawed vertebrates pack in a head-to-head fashion, which positions the hypervariable CDR loops in a manner that forms a binding site comprised of two V domains. The V regions in VCBP3 pack in a head-to-tail relationship (Figure 4-2) similar to those observed in crystal structures of a Vδ domain (1TVD) (Li et al. 1998) and cell adhesion molecules such as MOG (1PY9) (Clements et al. 2003). However, the packing of V domains in VCBP3 is significantly similar to antigen receptors in that both use the G strands encoded by J-like sequence elements.

The structural importance of J gene segments in antigen receptor dimerization and combinatorial diversity is due to a conserved sequence motif, FGXGTXLXV (Figure 4-4B), which corresponds to V domain G strand residues that participate in the unique front sheet-to-front sheet dimerization mode used exclusively by antigen receptors (Chothia et al. 1985). Strikingly, the crystal structure of VCBP3 shows that J-related sequences are present in the G strands at the V1V2 front sheet-to-front sheet interface (Figure 4-2). Despite the absence of an independent contiguous J exon in the germline sequence of VCBPs (unpublished), the amino terminal V domain (V1) of one form of VCBP4 encodes FGXG in the site analogous to J gene segments. The carboxy terminal V domains (V2) of VCBP2, VCBP3, VCBP4, and VCBP5 encode the conserved TXLXV motif corresponding to the carboxy terminal end of the G strand. A recombination event between these gene segments (e.g., FGXD and TXLXV in VCBP3) could give rise to a single structural region resembling the complete J region of the rearranging antigen
Figure 4-4. VCBP3 uses a J-like segment at its interface. (A) Hydrogen bond network formed between residues across the interface between V1 and V2 of VCBP3. Interactions between side chains of one V domain and main chain atoms of the opposing V domain form the outer layer of the three layer packing arrangement observed in antigen receptors and VCBP3. The guanidinium group of Arg106 from VCBP3 V1 forms hydrogen bonds with main chain atoms from V2. Carbon atoms are shown in magenta for V1 and in green for V2; nitrogen atoms are depicted in blue and oxygen atoms in red. (B) Structural comparison of the G strands in antigen receptors, shown in cyan, with VCBP3 V1, shown in gold. The sequence motif FGXGTXLXV is highly conserved in J gene segments of antigen receptors and corresponds to G strand residues that are critical in V domain packing. In VCBP3, the G strands of V1 and V2 are in an extended β-strand geometry, whereas antigen receptors have a conserved β-bulge at the position corresponding to FGX/G. The residues representing X of FGXG and the side chains of conserved residues in the sequence motif are displayed, illustrating a similarity between these positions and the corresponding residues in VCBPs, with the exception of the Phe, which is displaced 4 residues amino-terminal of the corresponding position in antigen receptors.
binding receptors (Cannon et al. 2002). Whereas such an association could be fortuitous, it is not unreasonable to speculate that the participation of the J-like sequence in domain packing of V regions is a characteristic consistent with ancestral immune-type receptors that preceded adaptive immunity.

Discussion

At least five sequence-related families of VCBPs (36-46% identity) are found in amphioxus and their structural folds can be modeled at high confidence against the solved structure presented here (Supplemental Materials) (Zhang and Madden 1997; Peitsch 1996; van Gunsteren 1996; Schwede et al. 2003). The V family distributions, high degree of germline polymorphisms, tissue-specific expression and utilization of a chimeric Ig-lectin structure found in VCBPs are consistent features of an immune recognition molecule (Litman et al. 2005; Cannon et al. 2002; Cannon et al. 2004). However, unlike Ig and TCR, the regions of most extensive hypervariation observed in predicted VCBP2 proteins are positioned in the N-terminal portions (A, A′ and B strands and their connecting loops) of both the V1 and V2 domains, as opposed to the characteristic placement of hypervariable residues in Ig and TCR (BC loop-CDR1, C′C″ loop-CDR2, FG loop-CDR3), ostensibly arguing against such an analogy.

We mapped the hypervariable positions of VCBP2s onto the solvent accessible surface of VCBP3 to provide information on potential ligand binding sites. The domains of VCBP3 pack in a manner, similar to antigen receptors, that positions the hypervariable residues on a contiguous patch of solvent accessible surface on both V domains (Figure 4-2). The putative interaction site at a hypervariable region comprised of two separate domains is shown in Figure 4-2C, with a solvent accessible surface of approximately 900-1000 Å², which is consistent with the range of surface area buried between TCR and
pMHC in solved crystal structures (Rudolph et al. 2002; Luz et al. 2002). More recently derived antigen receptors used in adaptive immune mechanisms bring both germline variable and somatically derived variable elements into close, structurally contiguous proximity. In sharp contrast to heterodimeric antigen receptors, in which interchain contacts mediate the interaction of the V domains, both interacting domains of VCBP3 are expressed in a single chain in the case of this non-rearranging immune-type receptor.

One theory of primordial antigen receptors is that they would have characteristics of cell adhesion molecules (Streltsov et al. 2004), some of which share some features with contemporary antigen receptors, such as incorporation of the V-set Ig fold. Unlike non- or minimally polymorphic cell adhesion molecules, the postulated changes from a self to a self-nonself distinguishing mode of function in an immune-type progenitor would be driven by the definition of a highly variable combining site, given the propensity of a wide variety of pathogens, including those that inhabit marine environments (Meibom et al. 2005), to use genetically sophisticated mechanisms to avoid host defense. Molecules such as VCBPs or their forerunners may have acquired advantages in immune recognition due to the clustering of hypervariable residues at the solvent accessible surface comprising both V domains, an effect that is loosely analogous to combining site formation in heterodimeric antigen receptors and might have provided a high degree of germline success before the innovation of a basic mechanism of somatic reorganization.

At this point, VCBPs, which show a number of significant characteristics of immune receptors, are not only the most phylogenetically ancient example of a gene family in which extensive hypervariation occurs in V regions but now provide a likely paradigm for how the variation could create a basic mechanism for increasing receptor
diversity by focusing variability through novel packing at the surface of an interface between V-set domains, somewhat analogous to that seen in conventional antigen binding receptors. Whereas the structure of the actual ancestral form of the rearranging antigen binding receptors may never be known (Litman et al. 2005), the solved structure of VCBP3 provides us with significant clues as to what types of changes may have accompanied the transition from a molecule that provides specificity through complex germline structural diversity to one in which somatic change could be efficiently selected.
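The J-like motif FGXGTXLXV described in the Results, and its two halves FGXG and TXLXV, can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only and is not the procedure used in this work: the function name is hypothetical and the example sequences are invented toy strings, not VCBP or antigen receptor sequences.

```python
import re

# X in the motif stands for any residue; [A-Z] is used as a permissive
# stand-in for the one-letter amino acid alphabet.
PATTERNS = {
    "FGXGTXLXV": re.compile(r"FG[A-Z]GT[A-Z]L[A-Z]V"),
    "FGXG": re.compile(r"FG[A-Z]G"),
    "TXLXV": re.compile(r"T[A-Z]L[A-Z]V"),
}

def j_like_hits(sequence):
    """Return 0-based start positions of each J-like pattern in the sequence."""
    seq = sequence.upper()
    return {name: [m.start() for m in pattern.finditer(seq)]
            for name, pattern in PATTERNS.items()}

# Toy sequences (invented for illustration).
complete_motif = "MKFGAGTRLTVKP"     # carries the full nine-residue motif
c_half_only = "AAFGSDWWTALQVAA"      # carries only the TXLXV half

print(j_like_hits(complete_motif))
print(j_like_hits(c_half_only))
```

Searching for the two halves separately mirrors the observation that FGXG and TXLXV can occur in different V domains rather than as one contiguous J segment.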
CHAPTER 5
IDENTIFICATION OF ACE2 ACTIVATORS BY VIRTUAL SCREENING OF SMALL MOLECULE LIBRARIES

Introduction

Angiotensin-converting enzyme 2 (ACE2) is a type I membrane-anchored peptidyl carboxypeptidase of 805 amino acids (Donoghue et al. 2000; Tipnis et al. 2000). Its catalytic domain consists of approximately 733 residues and is 42% identical to that of its closest homolog, ACE. Unlike the ubiquitously expressed ACE, ACE2 is expressed only in the kidneys, heart (including all cardiovascular tissues), and lungs (Donoghue et al. 2000). Its substrate specificity has also been established to be different from, and likely complementary to, that of ACE (Vickers et al. 2002). While ACE activity mainly results in the production of angiotensin II, involved in vasoconstriction and the biosynthesis of aldosterone (an important regulator of blood pressure), ACE2 product peptides, namely angiotensin 1-7, are involved in vasodilation and hypotension. Furthermore, inhibitors of ACE such as captopril, lisinopril and enalaprilat do not affect the activity of ACE2 (Donoghue et al. 2000; Tipnis et al. 2000).

Specific roles of ACE2 in different diseases and normal physiology are currently a subject of intense study. Nonetheless, its central role in the renin-angiotensin system (Burrel et al. 2004), cardiac contractile function (Crackower et al. 2002), hypertension (Katovich et al. 2005) and therefore cardiovascular disease has been recently established. Crackower and others (2002) also observed an inverse correlation of ACE2 mRNA and blood pressure in experimental hypertension models. Other studies have
begun to demonstrate that ACE2 represents a tractable gene therapy target (Katovich et al. 2005; Huentelman et al. 2004). The approach attempts to overexpress ACE2 to offer protection against cardiac hypertrophy and fibrosis (Katovich et al. 2005). The inhibition of ACE is an established therapeutic approach and presently one of the primary strategies for the treatment of hypertension. However, the studies mentioned above clearly suggest that suppression of ACE and enhancement of ACE2 activity are both highly desirable to prevent and treat hypertension and related cardiovascular diseases.

Although ACE2 is homologous to ACE, the crystal structures of recombinant ACE2 (Towler et al. 2004) and testicular ACE (Natesh et al. 2003) clearly demonstrate structural differences. These differences are observed in the active site, helping rationalize their substrate specificity, and also in their general architecture. It is noted that no large conformational changes were observed between the free and inhibitor-bound forms of ACE, while one of the largest hinge-bending motions was observed for ACE2. This may be a crystallization artifact, allowing ACE to crystallize only in the more compact conformation whether inhibitor is bound or not.

In the case of ACE2, however, the structures available in different conformations can be analyzed to answer an important question: can we enhance ACE2 activity by shifting the conformational equilibrium of the enzyme? This mechanism of action is heavily supported for an ATP analog inhibitor of adenylate kinase (Watz et al. 2004) and for nitrogen regulatory protein C, a bacterial signaling protein which is activated upon phosphorylation (Volkman et al. 2001). Protein dynamics is a universal property of enzymes and different degrees of conformational change may be necessary for the functions of different proteins. Considering the extent of conformational change in hinge-
bending enzymes, ACE2 represents an appropriate experimental model to examine tertiary structural variability. We consider how different conformations of an enzyme can be treated as different structures and therefore expand the variability of proteins at the tertiary structural level.

Increased ACE2 activity represents an alternative strategy for the treatment of hypertension and related cardiovascular diseases. The monovalent anion-dependent enhancement of ACE activity, similarly observed for ACE2 (Vickers et al. 2002), has been suggested to occur by this mechanism and is consistent with kinetic studies on the effect of chloride ions on ACE (Towler et al. 2004). Therefore, the crystal structures of the open and inhibitor-bound forms of ACE2 were analyzed to identify molecular surface features unique to each conformation. Virtual screening methods were applied to identify small molecules capable of enhancing ACE2 activity. Molecular surface sites remote from the active site were targeted and two compounds able to increase enzymatic activity 2-fold were identified. Both compounds are predicted to bind at the same site and share structural similarities. Furthermore, these compounds clearly enhance ACE2 activity while not affecting ACE activity.

To date, this appears to be the first report of an in silico docking, structure-based approach used to identify enzymatic activators. Development of these compounds may not only serve as an additional strategy in the treatment of hypertension; understanding their mechanism of action may also open the door to a novel therapeutic approach applicable to other hinge-bending enzymes and likely to any enzyme whose conformational equilibrium is critical to its activity.
Materials and Methods

Virtual Screening

The DOCKv5.2 software package (Ewing et al. 2001) was used for in silico screening of ~140,000 compounds available from the National Cancer Institute, Developmental Therapeutics Program. The structure coordinates and chemical information for each compound were processed either with accessory software from DOCK or with the ZINC server (Irwin and Shoichet 2005). Each compound was docked as a rigid body in 100 different orientations; before scoring, the orientations were filtered by bump filter parameters, excluding compounds with extreme steric clashes. The grid-based scoring system was used for scoring with the non-bonded force field energy function implemented in DOCK. A standard 6-12 Lennard-Jones potential was used to evaluate van der Waals contacts. Spheres were generated by SPHGEN (Kuntz et al. 1982) and clusters were edited by hand to target specific sites on the molecular surface of ACE2.

Three different molecular surface pockets, remote from the active site of ACE2, were targeted with spheres to rank the compounds of the NCI database (Figure 5-1). Two sites were identified in the inhibitor-bound form of the enzyme (sites 2 and 3), and a single site (site 1) was identified in the open conformation of ACE2. Each site was selected based on its uniqueness to each conformation. Thus, according to the crystal structures of ACE2 available from the Protein Data Bank (PDBID: 1R42 and 1R4L, free and bound enzyme respectively), the structural pockets represented by sites 2 and 3 are not present in the open conformation of the enzyme. Likewise, site 1 seems to fill with amino acid side chains in the closed conformation. Molecular surfaces were visualized with the software GRASP (Nicholls et al. 1991) to show the concavity of surface pockets. Some pockets
were more pronounced in one conformation or the other. Changes in the solvent accessible surface area of each residue between the open and the closed conformations were also analyzed. Solvent accessible surface area changes were not as helpful in this case but may be used in the future to identify pockets by looking at residues that are exposed in one conformation but not the other.

The top ten scoring compounds for each site were selected for functional testing. Active compounds were submitted to a more rigorous calculation with DOCK. Both compounds were docked in at least 3,000 orientations with energy minimization and with flexible bond parameters turned on. Other parameters, such as the number of minimization steps and the number of conformation steps, were also increased to perform a more exhaustive search until the score for each compound converged and did not improve further.

Enzymes, Substrates, and Small Molecule Compounds

Recombinant ACE and ACE2 were obtained in purified form from R&D Systems, Minneapolis, Minn. (catalog ID: 929-ZN-10 and 933-ZN-10, respectively). Substrates for ACE (fluorogenic peptide V, Mca-RPPGFSAFK(Dnp)-OH, catalog ID: ES005) and for ACE2 (fluorogenic peptide VI, Mca-YVADAPK(Dnp)-OH, catalog ID: ES007) were also obtained from R&D Systems. Top scoring molecules were obtained from the National Cancer Institute (NCI) for functional testing. Dry compounds were resuspended in 100% DMSO to prepare 100 mM stock solutions, according to the amount of compound provided by the NCI and its molecular weight. Gentle heating to 60-80 °C was carried out to assist their solubilization. Some compounds were further diluted to 50 mM stocks if clearly difficult to dissolve.
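The volume of DMSO needed for a 100 mM stock follows directly from the mass of compound supplied and its molecular weight. The sketch below illustrates that arithmetic; the function name and the 5 mg example mass are hypothetical and are not taken from the original work.

```python
def dmso_volume_ul(mass_mg, mol_weight, stock_mM=100.0):
    """Volume of DMSO (microliters) that dissolves mass_mg of compound to stock_mM.

    mass_mg    -- mass of dry compound in milligrams
    mol_weight -- molecular weight in g/mol
    stock_mM   -- target stock concentration in millimolar
    """
    moles = (mass_mg / 1000.0) / mol_weight      # grams / (g/mol) = mol
    litres = moles / (stock_mM / 1000.0)         # mol / (mol/L) = L
    return litres * 1e6                          # L -> microliters

# Hypothetical example: 5 mg of a compound with molecular weight 483 g/mol.
print(round(dmso_volume_ul(5.0, 483.0), 1))                  # 100 mM stock
print(round(dmso_volume_ul(5.0, 483.0, stock_mM=50.0), 1))   # 50 mM fallback stock
```

Halving the target concentration to 50 mM, as was done for poorly soluble compounds, simply doubles the DMSO volume.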
Figure 5-1. Clusters targeting three sites on ACE2. Spheres are in yellow and secondary structure is in red for helices, gray for loops, and gold for strands. The figure shows the structure of the inhibitor-bound conformation of ACE2 (PDBID: 1R4L; inhibitor not shown). The cluster for site 1 was generated based on the structure of the open form of the enzyme (1R42) but is shown superposed on the closed form to show its position relative to the other clusters. (A) and (B) differ by a 90° rotation around a horizontal axis.

Activity Assays

Activity of ACE and ACE2 was measured with a Spectra Max Gemini EM Fluorescence Reader (Molecular Devices). The enzyme removes the C-terminal dinitrophenyl moiety that quenches the inherent fluorescence of the substrate's 7-methoxycoumarin group, resulting in an increase in fluorescence in the presence of enzyme activity. Fluorescence was measured with excitation and emission wavelengths of 328 nm and 392 nm, respectively. Reaction mixtures were prepared in 100 µl volumes and different concentrations of compound were tested against 10 µM substrate. Reactions contained 10 nM enzyme in 100 mM NaCl, 75 mM Tris, and 0.5 µM ZnCl2 at pH 7.4. Samples were read every 15-20 seconds for at least 30 minutes immediately after the addition of fluorogenic peptide substrate at 37 °C. Assays, including controls, were performed in the presence of 1% dimethyl sulfoxide (DMSO). Although higher concentrations of NaCl increase the
activity of ACE2 and ACE (Vickers et al. 2002), a low concentration of salt (100 mM NaCl) was used in the assays to allow enhancement of enzymatic activity to be detectable. That is, using 1 M NaCl, which gives a maximal enhancing effect from the chloride ions, might not allow the compounds to further enhance the activity of the enzyme. The lower salt concentration should leave room for the compounds to activate the enzyme.

Controls in the presence and absence of DMSO and without compound were carried out to evaluate the effect of DMSO on the activity of ACE and ACE2. Assays with no DMSO, 1% DMSO, and 2% DMSO were performed in conditions identical (i.e., pH, temperature, salt concentration, reaction mix volume and so on) to those of the experimental assays. DMSO at concentrations up to at least 2% did not significantly affect the activity of ACE or ACE2 with the substrates used in this assay.

Active compounds were observed to absorb and emit background levels of fluorescence. The experimental assays were corrected at each concentration since higher or lower concentrations of compounds affected the background signal in a concentration-dependent manner. The added or subtracted background levels from the active compounds, however, were constant throughout the duration of the assays and did not show increasing or decreasing background signals.

Results

Approximately 140,000 compounds were virtually screened with DOCKv5.2 (Ewing et al. 2001) in 100 different orientations and ranked by energy score. The top ten scoring compounds for each of the three sites are listed in Table 5-1. These compounds were requested from the National Cancer Institute, Developmental Therapeutics Program (NCI/DTP) for functional testing. The top ten scoring compounds of each site share some general characteristics. Site 1 clearly selected for uncharged smaller compounds with
relatively few hydrogen bond donors and acceptors. The average molecular weight of the top ten scoring compounds is 279 Da. The xLogP values range from 0.75 to 3.38 for most compounds of site 1, and a single compound (no. 8) slightly violates the Lipinski "rule of 5" (MW < 500, cLogP < 5, H-bond donors < 5, H-bond acceptors < 10) in this regard (Lipinski et al. 1997). The Lipinski rule of 5 states that compounds are likely to have poor absorption and permeation when two or more parameters are out of range.

In contrast to the compounds selected for site 1 by DOCK, those selected for sites 2 and 3 meet the Lipinski criteria less consistently. Site 2 favored neutral or negatively charged compounds of a slightly larger molecular weight (average MW 351 Da), and cLogP values have a wider range, from -4.35 to 5.33. For both sites 2 and 3, most compounds have a higher number of hydrogen bond donors and acceptors, with many exceeding the cutoff criteria. Both of these sites also selected for compounds with a higher number of rotatable bonds. Follow-up studies to those of Lipinski favor molecules that have fewer than 7 rotatable bonds, as this may be another factor that affects the druglikeness of small molecules. Site 3 seems to have favored positively charged compounds of an even higher molecular weight (average MW 435 Da) compared to site 1. Most compounds in the top ten list for site 3 fail the Lipinski criteria in at least one parameter. The shared characteristics of these compounds likely reflect the properties of the sites selected for virtual screening, and it appears that site 1 is better suited to binding a druglike molecule.

Only 21 of the requested compounds were obtained and tested for enhancement of ACE2 activity. During initial functional screening, compounds 3 and 6, both selected for site 1, were observed to increase ACE2 activity about 2-fold. Both compounds identified
also share some structural similarities. They consist of a rigid ring system scaffold with hydrogen bond donors in similar positions (Figure 5-2). These observations demonstrate consistency in the in silico simulations. They show that DOCK was able to select two different but similar compounds that presumably interact with the same site and have similar activities out of an in silico library of ~140,000 compounds.

Table 5-1. Top ten scoring compounds for the three different sites docked. All requested from the NCI/DTP for in vitro testing. * Not obtained from the NCI. ** Not available. Active compounds are highlighted.

Site 1
Rank  Catalog Number  xlogP  Hdonors  Hacceptors  Net charge  Molecular weight  Rotatable bonds  Score
1     NSC269897       NA**   0        2           0           224               3                -19.26
2     NSC72361        3.41   1        5           0           295               3                -17.35
3     NSC354677       NA     2        8           0           483               6                -16.03
4     NSC72756        NA     0        5           0           228               4                -15.68
5     NSC21221        1.84   0        2           0           159               2                -14.00
6     NSC354317       NA     2        5           0           382               0                -13.38
7     NSC43058        3.03   2        5           0           241               3                -12.15
8     NSC43083        5.06   0        2           0           274               4                -11.52
9     NSC21044        3.38   0        1           0           184               2                -11.09
10    NSC354297       0.75   0        8           0           320               2                -10.95

Site 2
Rank  Catalog Number  xlogP  Hdonors  Hacceptors  Net charge  Molecular weight  Rotatable bonds  Score
1     NSC121146       -0.75  1        1           -2          400               9                -32.43
*2    NSC243619       -4.35  0        10          -4          302               6                -30.57
*3    NSC324063       -1.42  8        14          -1          467               9                -28.99
4     NSC90568        1.46   7        7           0           258               3                -26.01
5     NSC371456       2.49   2        8           0           384               6                -25.94
6     NSC42370        2.62   2        10          -1          337               4                -25.62
7     NSC631816       1.12   5        7           0           312               3                -25.56
8     NSC103522       5.33   5        6           0           442               4                -25.47
*9    NSC624460       2.55   5        6           0           305               3                -25.30
10    NSC371140       -0.06  2        9           0           305               6                -25.26

Site 3
Rank  Catalog Number  xlogP  Hdonors  Hacceptors  Net charge  Molecular weight  Rotatable bonds  Score
1     NSC83458        1.26   6        10          2           493               4                -27.71
*2    NSC138120       8.12   1        3           0           506               3                -26.40
3     NSC658245       -1.12  5        11          2           468               8                -26.31
4     NSC152085       2.55   8        5           2           298               3                -25.86
*5    NSC138115       7.77   1        3           0           417               3                -25.63
*6    NSC82526        0.54   6        10          2           465               4                -25.42
*7    NSC694478       2.41   5        9           0           394               7                -25.35
*8    NSC704636       2.98   2        10          -1          598               6                -25.31
*9    NSC657774       8.01   0        6           2           471               5                -24.91
*10   NSC407491       2.24   1        6           0           243               3                -24.77
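The Lipinski criteria quoted in the Results can be applied mechanically to rows of Table 5-1. The following is a minimal sketch, not the procedure used in this work: it uses the thresholds exactly as stated in the text (MW < 500, logP < 5, donors < 5, acceptors < 10), treats the tabulated xlogP as the logP value (an assumption), and takes the example values as read from the table. The function name is hypothetical.

```python
def lipinski_violations(mw, logp, h_donors, h_acceptors):
    """Return the list of rule-of-5 criteria (as quoted in the text) that fail."""
    checks = {
        "MW < 500": mw < 500,
        "logP < 5": logp < 5,
        "H-bond donors < 5": h_donors < 5,
        "H-bond acceptors < 10": h_acceptors < 10,
    }
    return [name for name, passed in checks.items() if not passed]

# (molecular weight, xlogP, H donors, H acceptors) as read from Table 5-1.
examples = {
    "site 1, rank 2 (NSC72361)": (295, 3.41, 1, 5),
    "site 1, rank 8 (NSC43083)": (274, 5.06, 0, 2),
    "site 3, rank 2 (NSC138120)": (506, 8.12, 1, 3),
}

for label, values in examples.items():
    failed = lipinski_violations(*values)
    # Poor absorption is predicted when two or more criteria are out of range.
    flag = "likely poor absorption" if len(failed) >= 2 else "acceptable"
    print(label, failed, flag)
```

With these inputs, compound 8 of site 1 fails only the logP criterion, in line with the slight violation noted in the text, whereas the rank 2 compound of site 3 fails two criteria. Rows with no xlogP value (NA) cannot be evaluated this way.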
Figure 5-2. Activators of ACE2 are structurally similar. (A) Both compounds show a multicyclic scaffold that was docked in approximately the same orientation. These are the best scoring orientations for each compound. Strikingly, hydrogen bond donors and acceptors, highlighted in light yellow, occur in both compounds at approximately the same positions. (B) Two-dimensional structure diagrams.

Compounds 3 and 6 were assayed again to confirm their effect on ACE2 activity. They were confirmed to enhance enzymatic activity 2-fold, and both compounds have similar activity profiles across a wide concentration range. All assays were performed in 1% DMSO. Control experiments showed that 1 and 2% DMSO did not affect ACE2 activity in the absence of compounds. Compound 3 showed maximum activation at 100 µM, with a clean dose response that almost doubled ACE2 activity at that concentration (Figure 5-3). At concentrations higher than 100 µM, however, compound 3 became inhibitory, with 400 µM returning enzymatic activity to approximately control levels and 800 µM reducing activity slightly below that of the control. This inhibition at such high concentrations may be a consequence of compound aggregation, which is known to promiscuously inhibit enzymes by sequestering the
enzyme from solution. Another artifact that could possibly occur under the conditions of our assays is related to the coordination of zinc by the large number of lone pairs of electrons from the active compounds. Oxidized zinc may be coordinated by these compounds at high concentrations. Although metalloproteases usually have a high affinity for their metals, 0.5 µM zinc may be a low concentration compared to 500 and 800 µM compound. Finally, these high concentrations of compound may force them to bind the enzyme at secondary low-affinity sites that may still modulate the activity of the enzyme (e.g., to inhibit it). However, it should be noted that the rates of enzyme activity obtained from these spectrophotometric assays (RFU/s) across this wide concentration range (0-800 µM) closely approximate a quadratic curve (Figure 5-3) and that this inhibition may still be consistent with a conformational equilibrium shift mechanism. In that case, a high concentration of activator may still prevent the enzyme from shifting into the closed form, if indeed the compound is found to stabilize the open form. Although the overall inhibition observed for compound 3 in Figure 5-3 may not be significant when compared to control activity, the rates of enzyme activity give a clear dose response pattern for the ACE2 modulating effects of compound 3.

Compound 6 did not show the same dose response but activated ACE2 similarly (Figure 5-4). Compound 6 activated ACE2 identically at 20, 50 and 100 µM, but like compound 3 it inhibited ACE2 at higher concentrations. At 500 µM compound 6, ACE2 activity returned to control level. Compound 6 was significantly less soluble than compound 3, and the lesser quality of the data may be a reflection of its poor solubility. One explanation for the equal activating effect of compound 6 on
Figure 5-3. Compound 3 from site 1 activates ACE2. Concentrations ranging from 0-800 µM clearly gave a clean dose response even though the compound did not go completely into solution. Assays done in 10 nM enzyme, 10 µM substrate, 100 mM NaCl, 75 mM Tris pH 7.5 and 0.5 µM ZnCl2 at room temperature. The 30 minute time course yielded linear curves (A) from where rates were calculated (B). All curves in the top panel had a straight line correlation coefficient of > 0.98, except 20 µM compound (c.c. = 0.93).
[Panel A: RFU versus time (min:sec) for control C1 (+) and 20, 50, 100, 200, 400 and 800 µM compound. Panel B: rate (FU/s) for no compound and each concentration, with fitted curve y = -0.3205x^2 + 2.516x + 0.5414, R^2 = 0.9075.]
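The fitted curve reported in panel B of Figure 5-3 can be evaluated directly to see what it implies. The sketch below is an illustration only. It assumes that x in the reported fit is the ordinal position of each bar in panel B (1 for no compound through 7 for 800 µM), which the figure does not state; the names FIT_COEFFS, LABELS, fitted_rate and vertex are introduced here for convenience.

```python
# Coefficients of the quadratic reported in Figure 5-3B:
# y = -0.3205x^2 + 2.516x + 0.5414 (R^2 = 0.9075).
FIT_COEFFS = (-0.3205, 2.516, 0.5414)

# ASSUMPTION: x is the ordinal position of each bar in panel B, not the
# concentration itself. The figure does not state this explicitly.
LABELS = ["no compound", "20 uM", "50 uM", "100 uM", "200 uM", "400 uM", "800 uM"]


def fitted_rate(x, coeffs=FIT_COEFFS):
    """Evaluate the reported quadratic at position x."""
    a, b, c = coeffs
    return a * x * x + b * x + c


def vertex(coeffs=FIT_COEFFS):
    """Return (x, y) at the maximum of the downward-opening parabola."""
    a, b, _ = coeffs
    x_max = -b / (2 * a)
    return x_max, fitted_rate(x_max, coeffs)


if __name__ == "__main__":
    control = fitted_rate(1)
    for position, label in enumerate(LABELS, start=1):
        rate = fitted_rate(position)
        print(f"{label:>12}: fitted rate {rate:5.2f}, {rate / control:4.2f}x control")
    print("fitted maximum at position %.2f, rate %.2f" % vertex())
```

Under that assumption the fitted maximum falls close to position 4 (100 µM) at roughly twice the fitted control rate, and the fitted value at 800 µM lies slightly below control, which agrees with the description of compound 3 in the text.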
ACE2 at different concentrations (20, 50 and 100 µM) would be that compound 6 has already reached its maximum effect at 20 µM, and that raising the concentration further only forms more aggregate. The effective concentration of compound 6 available in solution would then be the same at all concentrations. In this case it is likely that the inhibition observed is due to aggregate and may be nonspecific. Titrations at lower concentrations would be necessary to reveal a clearer dose response, but the effect may be too weak to observe with confidence.

Overall, compound 3 seems to behave more promisingly. Both compounds appear to be relatively non-toxic. According to the National Cancer Institute, both were tested in anticancer screens and more than 95% of rats given 200 mg/kg of compound had survived after 30 days of exposure. Finally, the compounds were tested under similar conditions for ACE activation. As shown in Figure 5-5, compounds 3 and 6 did not activate ACE at either 50 or 100 µM. ACE is 42% identical to ACE2 and is also activated by chloride ions. These experiments support that compounds 3 and 6, selected by virtual screening methods targeting the open form of ACE2, have a specific, measurable enhancing effect on enzymatic activity. Although more experiments are required to explore their mechanism of action and validate their site of interaction, this report begins to outline a structure-based approach to the identification of remote site activators, an approach that could also be applied to the discovery of new inhibitors for other enzymes.

Discussion

It is encouraging that both compounds identified are predicted to interact with the same site since this more strongly suggests that we have discovered a molecular surface
Figure 5-4. Compound 6 from site 1 activates ACE2. (A) shows the activity of ACE2 is significantly increased by about 2-fold. Assay done in 100 µM compound 6. Error bars are standard errors of measurement at a 95% confidence interval. The curves show a 40 minute time course obtained in identical conditions to those described in Figure 5-3. (B) shows rates in RFU/s from control (in triplicate: C+1, C+2, C+3) and compound concentrations ranging from 0-500 µM. 20, 50, and 100 µM gave identical curves and were pooled to obtain the average shown in the top panel.
[Panel A: RFU versus time for control and compound 6. Panel B: rate (RFU/s) for controls C+1, C+2 and C+3 and for 20, 50, 100, 200, 350 and 500 µM compound.]
Figure 5-5. The ACE2 compounds do not enhance ACE activity. Top panel shows activation of ACE2 by compound 3 at 50 µM. Error bars are standard errors of measurement at 95% confidence intervals. The 30 minute time course was obtained in identical conditions to those from Figure 5-3. Bottom panel shows the activity of ACE (red) is not enhanced by either compound 3 (dark blue, 100 µM; dark purple, 50 µM) or compound 6 (bright blue, 100 µM; magenta, 50 µM). All assays were done in triplicate but in panel (B) error bars are omitted for simplicity.
[Panel A: RFU versus time. Panel B, titled "ACE compound 3 and 6 at 50 and 100 uM": RFU versus time for the series 3.100, 6.50, 6.100, cont. and 3.50.]
pocket outside of the active site of an enzyme capable of modulating enzymatic activity upon ligation by a small molecule. This is a striking result considering that we limited ourselves to functionally testing only the top ten scoring compounds for each site (typical drug discovery campaigns test thousands of compounds) and that we screened only three sites on ACE2. Clearly, virtual screening methods as implemented by DOCK serve to increase the efficiency of initial screening assays. The results reported here show that these compounds are selective for ACE2 and do not enhance the activity of ACE, whose catalytic domain is 42% identical to that of ACE2.

Site 1 clearly selected for a group of compounds that meet druglikeness criteria (Lipinski et al. 1997). Compared to sites 2 and 3, the characteristics of these compounds may reflect properties of the molecular surface site on which they were screened. Out of a library of ~140,000 compounds, the top ten compounds for each site shared a group of physicochemical characteristics (Table 5-1). In aiming to identify sites remote from the active site of an enzyme that could potentially be exploited for drug development, it may be desirable for these sites not only to have unique features among different conformers, but also to have characteristics that are likely to favor ligation of a druglike molecule. Similar to the Lipinski rule of five now commonly used to prescreen small molecules, there may be a set of criteria we could follow when selecting a molecular surface pocket to probe. For example, the size of the pocket will limit the size of small molecules since DOCK will eliminate compounds that do not fit into a pocket. Smaller molecules would in turn be less likely to have too many hydrogen bond donors or acceptors. However, selecting a site that is too small may leave no room for lead optimization.
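The druglikeness prescreen mentioned above can be made concrete. The following is a minimal sketch of the rule of five as it is commonly stated (molecular weight at most 500 Da, calculated logP at most 5, at most 5 hydrogen bond donors and at most 10 hydrogen bond acceptors). The Descriptors record, the function names and the example values are hypothetical and are not taken from the compounds in Table 5-1. Tolerating one violation is a common convention and is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of a small molecule (hypothetical record)."""
    mol_weight: float      # Da
    logp: float            # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list:
    """List which of the four rule-of-five limits a compound exceeds."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def is_druglike(d: Descriptors, max_violations: int = 1) -> bool:
    """ASSUMPTION: tolerate up to one violation, a common convention."""
    return len(rule_of_five_violations(d)) <= max_violations


if __name__ == "__main__":
    # Made-up values for illustration only.
    small = Descriptors(mol_weight=342.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=6)
    bulky = Descriptors(mol_weight=812.9, logp=6.3, h_bond_donors=7, h_bond_acceptors=14)
    print(is_druglike(small), rule_of_five_violations(small))
    print(is_druglike(bulky), rule_of_five_violations(bulky))
```

A comparable set of criteria for rating surface pockets, as proposed in the text, would need its own descriptors (for example pocket volume); none are defined here because the source does not specify them.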
Traditionally, the active site of an enzyme is targeted for inhibition and thus the site we desire to screen is already given. In this study, however, we successfully identified small molecules that enhance the activity of ACE2. Such a goal offers poor prospects of achievement if we prevent the enzyme from binding its substrate. Thus a novel approach is needed, and it could be postulated that any given site on the molecular surface of a protein may be rated according to its druglikeness complementarity, that is, its predisposition to bind a druglike molecule. To conclusively determine a set of criteria by which allosteric sites can be evaluated in this regard, future comparative studies with different enzymes will be necessary.

Solving the crystal structure of ACE2 bound to the active compounds will be necessary to fully validate the molecular docking simulations, but at present we can describe the docked models available. Compounds 3 and 6 were docked with minimization while treated as flexible ligands to obtain the most accurate prediction of their complex with ACE2. The three-member ring scaffold is positioned similarly in site 1 for each compound (Figures 5-2 and 5-6). Both compounds are predicted to engage in several hydrogen bonds with residues from ACE2, although the hydrogen bonding interactions do not involve the same residues. According to DOCK, compound 3 hydrogen bonds with residues Lys94, Tyr196, Gly205 and His195 (Figure 5-6, panel A). The NZ nitrogen from the lysine side chain is positioned 3.25 Å from the hydroxyl group oxygen (O3) in compound 3. The carbonyl oxygen (O2) is within 3.16 Å of the hydroxyl group of Tyr196. The distal amine nitrogen (N2) from compound 3 interacts at a distance of 3.31 Å with the main chain carbonyl oxygen of glycine (Gly205) in ACE2. And the ND1 nitrogen from the ACE2 histidine is
Figure 5-6. Exhaustive molecular docking of ACE2 enhancers. Compounds 3 and 6 selected for site 1 were docked more rigorously with the same sphere cluster and scoring grids used previously. Compounds were minimized and treated as flexible ligands. Searching parameters were made increasingly more thorough until the docking scores converged. (A) and (B) show ACE2 in a similar orientation. Surface is color coded light blue for carbon, red for oxygen and blue for nitrogen. Compounds are also colored according to the elements, green for carbon, red for oxygen, orange for sulfur and blue for nitrogen. Likely hydrogen bonding interactions are labeled with dashed yellow lines. (A) shows the predicted binding orientation for compound 3. Hydrogen bonding interactions are labeled according to compound atoms as described in main text. (B) shows the same for compound 6.
within 2.98 and 3.31 Å of the ether-sulfate oxygens (O4 and O6, respectively) in compound 3. All hydrogen bonding angles show good geometry (125-130°), except for the angle C14-O3-NZ, which is wider (160°). Given that lysine side chains are very flexible, however, an experimental structure is likely to show the side chain of Lys94 in a more favorable orientation.

Compound 6 seems to be involved in 3 hydrogen bonds, with residues Gln98, Gln101 and Gly205 (Figure 5-6, bottom panel). Both hydroxyl oxygens in compound 6 interact with main-chain carbonyl oxygens of ACE2; O5 seems to bond to Gly205 (3.18 Å) and O4 to Gln101 (3.33 Å). The ester oxygen (O2) in compound 6 accepts an amide hydrogen from the side chain of Gln98 at a slightly less ideal distance of 3.51 Å, but as mentioned for the model of compound 3, docking simulations do not account for any "induced fit" effects on ACE2 residues. An experimental structure is likely to show better hydrogen bonding distances and geometry for both compounds. At present it is nonetheless noted that 3.5 Å is an acceptable hydrogen bonding distance. As for compound 3, hydrogen bonding angles are as expected (~117°).

Many enzymes are known to exist in multiple conformations in crystal structures. The modern view of folding funnels and energy landscapes strongly suggests that most proteins exist as an ensemble of conformations (Frauenfelder et al. 1988). However, some conformations are more capable of performing their function while other conformations are rendered inactive. NMR relaxation experiments with several model proteins (e.g., triose phosphate isomerase, dihydrofolate reductase, RNAse A and HIV protease) have revealed that frequencies of conformational exchange are nearly identical
in the presence or absence of substrate, suggesting that conformational changes involved in the binding or release of ligands may be rate limiting (Eisenmesser et al. 2005). A study on adenylate kinase explicitly demonstrated that the rate of active site "lid" opening matched the turnover rate of the enzyme and also fully accounted for the decreased activity of a thermophilic homolog (Wolf-Watz et al. 2004). The study concluded that increased temperatures leading to an increased rate of lid opening restored the thermophilic enzyme to optimal activity. According to these concepts, it would be interesting to see if small molecules targeted to stabilize the open conformation of the thermophilic enzyme will enhance its activity at metastable temperatures. For hinge bending enzymes such as ACE2, the large conformational change that opens and closes the active site should be analogous to the dynamics of lid opening in adenylate kinase. Unfortunately, there are no data on ACE2 comparable to those for adenylate kinase, or even data suggesting that the release of products is rate limiting. Therefore, both forms of the enzyme were probed in an attempt to enhance enzymatic activity.

If the compounds identified in this study interact with the open conformation of ACE2 at site 1, they may specifically stabilize this conformation in solution. This effect may enhance ACE2 activity by at least two mechanisms. Logically, closed conformations of the free enzyme do not allow substrate into the active site. In the presence of compound, the population of free enzyme may be shifted toward the open form, effectively increasing the activity of the enzyme. Alternatively, it is also possible that product release is a rate limiting step in ACE2 turnover. This is known for several enzymes (e.g., dihydrofolate reductase, also mentioned above). The activity of ACE2 in the presence of compound may then be enhanced as the enzyme-product
complex empties more quickly and ACE2 becomes available to start another cycle. It is possible that compounds 3 and 6 modulate ACE2 activity by both mechanisms. In both cases, the compounds would be acting by shifting the populations of enzyme into a conformation that is fully active, whether the enzyme is in free or bound form, and helping the enzyme avoid "wasting its time" on nonproductive complexes or conformations.

Here we report not only the identification of ACE2 activators, but also a novel structure-based drug discovery approach that may be applicable to other enzymes. By targeting allosteric sites on the molecular surface of enzymes we may be able to enhance or inhibit their activity. Enzymatic activators are rare and their development from current structure-based knowledge is unprecedented. Here we propose that it may be possible to identify molecular surface sites remote from the active site of the enzyme to be exploited for drug development. This approach will open new doors in drug therapy as the identification and design of activators becomes a tractable route. This will expand the availability of macromolecular targets and also offer hope for the development of novel inhibitors for enzymes resistant to current therapeutics, such as HIV protease.

In summary, while validation of the specific mechanism of action of the compounds described is pending, this study encourages us to ask the following questions: Can virtual screening approaches be used to identify allosteric sites for drug development? Can the analysis of such sites reveal a set of criteria to more easily identify sites that will recognize a druglike molecule and at the same time shift populations of conformers in a desirable direction (to inhibit or activate an enzyme)? If high concentrations of small molecule activators become inhibitory, what is the maximal activating effect that
can be achieved by this method? Will these activating effects have a significant impact in vivo? And finally, can this approach to drug therapy offer alternative inhibitors to treat diseases in which increasingly resistant enzymes have nearly exhausted the usefulness of active site inhibitors? Answering these questions will unlock novel therapeutic strategies for the treatment of many diseases, including hypertension and any disease dependent on the activity of an enzyme whose conformational equilibrium can be modulated.
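The population-shift argument in this discussion can be illustrated with a textbook two-state linkage model. This is a sketch under stated assumptions, not a model fitted to the ACE2 data: the free enzyme interconverts between closed and open forms with equilibrium constant K0 = [open]/[closed], and the compound binds only the open form with dissociation constant Kd. The function name and the values of K0 and Kd below are hypothetical.

```python
def fraction_open(ligand_um, k0, kd_um):
    """Fraction of enzyme in the open form (free or ligand-bound).

    ASSUMPTIONS: two conformations only; the ligand binds the open form
    exclusively; ligand is in large excess, so free ligand ~ total ligand.
    """
    if ligand_um < 0:
        raise ValueError("ligand concentration must be non-negative")
    if k0 <= 0 or kd_um <= 0:
        raise ValueError("k0 and kd_um must be positive")
    # Binding multiplies the statistical weight of the open state.
    open_weight = k0 * (1.0 + ligand_um / kd_um)
    return open_weight / (1.0 + open_weight)


if __name__ == "__main__":
    K0 = 0.25      # hypothetical: 20% of the free enzyme is open
    KD_UM = 50.0   # hypothetical dissociation constant, in uM
    for conc in (0, 20, 50, 100, 200):
        print(f"{conc:>4} uM compound: {fraction_open(conc, K0, KD_UM):.2f} open")
```

In this simple model the open population only rises and saturates with compound concentration. It does not reproduce inhibition at high concentrations, which would require an additional term such as aggregation or a second, inhibitory binding site.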
CHAPTER 6
CONCLUSIONS AND FUTURE DIRECTIONS

In summary, the structures of a V region-containing chitin-binding protein and angiotensin-converting enzyme 2 were examined to identify novel structural features that offer important advantages to their function. In the case of the V region-containing chitin-binding proteins (VCBPs), millions of years of evolution have led to the structure presented in Chapters 3 and 4. These immune-type receptors likely responded to selective pressures that favored those with better antigen recognition capabilities; for example, polymorphism and a stable fold that easily accommodated highly diverse sequences. The immunoglobulin fold is one of the most versatile molecular scaffolds and has been adapted for many functions in cell adhesion and the immune system.

Chapter 3 discusses unconventional interactions observed in VCBP3 that were hitherto unknown in the Ig fold. These interactions are being increasingly recognized in other proteins and it is suggested here that they played an important role in the structural evolution of the Ig fold. Chapter 4 focuses on features observed in VCBP that support its primordial relationship to adaptive immune receptors. The VCBPs are the only immune-like receptors from an organism below the jawed vertebrates known to exhibit both a V-type immunoglobulin fold and germline-encoded regionalized hypervariability. These are characteristics typical of adaptive immune receptors. The structure presented in Chapter 4 shows that hypervariable regions colocalize to a
contiguous surface at the interface of the two Ig domains in VCBP3. The structures presented here suggest a clearer set of stages in adaptive receptor evolution.

The current view of protein structure holds that proteins exist as ensembles of conformations, with some of these conformations being more capable of executing their function. It is logical to presume that if we could rationally skew these populations of conformers we could enhance or suppress their activity to our benefit. In Chapter 5, a structure-based approach to identify enzymatic enhancers was applied based on these concepts. Angiotensin-converting enzyme 2 (ACE2), an important target against hypertension, was used as a model hinge bending enzyme, although this approach can likely be applied to any enzyme exhibiting distinct conformer populations. Adaptive immune receptors, the VCBPs, and ACE2 have been shaped by millions of years of evolution. Even though it is well accepted that proteins have evolved to attain a balance between stability and flexibility, some of this flexibility may reflect an entropy reservoir that can still be exploited by rational approaches. The results presented in Chapter 5 offer new opportunities in drug discovery and development.

Crystallization and Preliminary X-ray Analysis of VCBP2 and VCBP3 from Branchiostoma floridae

Before our work, the classification of the N-terminal domains of the VCBPs as V-type immunoglobulin domains was based on sequence similarity alone. To determine similarities and differences between the structures of modern antigen receptors of the adaptive immune system and these primordial invertebrate proteins, VCBPs were expressed and purified for x-ray crystallography. The chitin-binding domain of VCBP, a cysteine rich domain, was not successfully expressed but a homology model based on tachycitin was examined (the constructs expressed are described in the materials and methods of Chapter
2). VCBP3 proteins crystallized more easily, in many conditions, with small crystals identified by manual screening. Crystallization conditions were quickly optimized as crystals grew in 3 to 4 days. The crystals gave good quality diffraction data, and initial unit cell parameters and space group were successfully determined. Both long and short forms of VCBP3 diffracted to good resolution (~2 Å). The crystallization and preliminary x-ray analysis of VCBP3 established that it would be possible to solve the crystal structures of both short and long VCBP3.

VCBP2 proteins were not crystallized by hand, and automated high throughput screening was performed by the Southeast Collaborative for Structural Genomics, Alabama. Although optimized crystals of the short form grew to sizes similar to those of VCBP3, the crystals diffracted too weakly at room temperature. Long exposure times prevented the collection of enough data for a full preliminary analysis, although successful indexing of the data gave initial cell dimensions and point group symmetry for two different crystals. More diffraction data are needed to determine the space group and refine unit cell parameters. Collection of data from cryocooled crystals was unsuccessful as cryoprotection led to high mosaicity and unacceptably low diffraction limits for VCBP2. Crystals of the long form of VCBP2 were very small (<1 mm in the longest dimension) and would not diffract adequately at a home source or even at most synchrotron sources.

The 1.15 Å Crystal Structure of VCBP3V1 from Branchiostoma floridae

The short form of VCBP3 was found to diffract to atomic resolution at beamlines available at the National Synchrotron Light Source, New York. A selenomethionine derivative was generated to perform multiwavelength anomalous dispersion experiments and obtain an unbiased model of the VCBP structure. This is currently the highest
resolution structure of an Ig in the Protein Data Bank. The good quality and high resolution of the data allowed anisotropic temperature factor refinement and hydrogen atom corrections during crystallographic refinement. This led to the relaxation of restraints and a highly accurate final structure. The structure is clearly of the V-type class, confirming that VCBPs are the only representatives of a V-type Ig below jawed vertebrates that also exhibit regionalized hypervariability. However, much of the hypervariability observed in antigen receptors of the adaptive immune system is induced somatically by highly specialized mechanisms of recombination and mutation. The hypervariability observed in amphioxus is harbored at the germline level only (Cannon et al. 2002). In addition, it had also been noted that the hypervariable regions of VCBPs do not coincide with the hypervariable regions of T cell receptors or antibodies (i.e., the complementarity determining regions). The hypervariable regions of VCBP are shifted toward the N-terminus compared to adaptive receptors.

Searching the PDB for structurally homologous proteins showed that VCBP3V1 is most similar to proteins that interact with a ligand by means of their front sheet; one of these structures is a T cell receptor. The short form of VCBP3 (VCBP3V1) was observed to display a crystal contact at its front sheet in a manner analogous to the ligand recognition of its structural homologs (Figure 3-5). Chapter 4 confirms and discusses a biologically relevant dimerization of VCBP3V1V2 via front sheet residues.

The atomic resolution structure elucidated for this dissertation clearly indicated the position of hydrogen atoms at the core of the Ig fold (Figure 3-6). These data led to the unprecedented analysis of weak non-canonical interactions in Ig domains. These interactions were found to largely involve V-type Ig conserved residues,
suggesting that they may play an important role in the structure. Although more structures at this level of resolution are needed to make generalizations about the roles of these interactions in Ig domains, it is proposed that these interactions helped stabilize this β-sandwich structure, which led to their selection throughout the evolutionary history of immune-type receptors exhibiting V-type Igs. Considering that the residues involved in these interactions form the core of the Ig, it is likely that they appeared early in the structural evolution of the Ig domain. Finally, it is noted that the disulfide bond between the front and back sheets of the Ig may lead us to think that any contributions from these weak interactions are negligible, but it is still possible that weak interactions are important in guiding the folding of the newly translated Ig domain, before cysteine oxidation and while driving specific orientations of residues during hydrophobic collapse.

Origins of Immune-Type Receptor Diversity: Crystal Structure of VCBP3V1V2 in Amphioxus

The long form of VCBP3 was also solved by an anomalous dispersion method (SAD) that provided unbiased phases. Although the resolution of this structure (1.85 Å) was significantly lower than that of VCBP3V1, superposition of the V1 domain of this structure on the atomic resolution structure described in Chapter 3 showed that the backbone conformation is nearly identical, with an rmsd of 0.6 Å. In Chapter 4, the structure shows how the tandem V-type domains interact largely by means of front sheet residues, as suspected from the structure in Chapter 3 (Figure 3-5). Interestingly, the J-like motif encoded by VCBP3 is heavily involved in interactions at the interface of the two domains. This suggests the J-like motif may have had an important function in a primordial V-type Ig, before antigen receptors acquired this structural feature.
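The rmsd quoted in this section for the superposed V1 domains is the root-mean-square distance between equivalent backbone atoms. The sketch below shows only that final calculation, assuming the two coordinate sets have already been superposed and paired atom by atom; it does not perform the superposition itself, and the coordinates are made up for illustration rather than taken from either VCBP3 structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D points, in input units.

    ASSUMPTION: both sets are already superposed and listed in the same
    atom order; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical C-alpha positions (in angstroms), for illustration only.
    model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model_2 = [(0.0, 0.6, 0.0), (3.8, 0.0, 0.6), (7.0, 0.0, 0.0)]
    print(f"rmsd = {rmsd(model_1, model_2):.2f} A")
```

Because the value depends on which atoms are paired (for example all backbone atoms versus C-alpha only), the atom selection should be reported alongside any rmsd.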
Available sequence information was aligned with the structure of VCBP3V1V2. It was determined that the hypervariable segments in the sequences of VCBPs map to a contiguous surface formed by both V1 and V2. This strongly supports the idea that this surface is a site of ligand recognition. As already noted, the hypervariable regions of VCBP do not map to the complementarity determining regions (CDRs) of antigen receptors and therefore suggest that primordial V-type receptors displayed diversification of their primary sequence elsewhere as well. The J-like motifs (FGXGTXLXV) observed in VCBPs, with half the motif in one domain and the other half in the other, suggest these structures predate a presumed transposition event that rearranged the motif fragments in VCBP V1 and V2. It is possible that the same or a similar event led to the shift of the hypervariable regions to the CDRs in antigen binding receptors. Alternatively, hypervariable surface patches might have slowly migrated from the N-terminal end of the sequence to the more derived location at the CDRs. Although more examples demonstrating the gradual migration of hypervariable surface patches remain to be found, it is noted that in single chain antibodies from llamas, camels and relatives, framework residues outside of the CDRs participate in significant interactions with the antigen.

Identification of ACE2 Activators by Virtual Screening of Small Molecule Libraries

Two compounds that enhance the activity of ACE2 in vitro were identified by a structure-based approach. As a central regulator of the renin-angiotensin system, ACE2 is currently an intensely researched therapeutic target against hypertension. Based on conformational equilibrium concepts in enzyme function, it was hypothesized that small molecule activators that stabilized a particular conformation of ACE2 (open or closed) may enhance its activity. Examination of the crystal structures of the different conformers of ACE2 led to the identification of molecular surface features that were unique to either
conformation. Two sites in the closed conformation and one site in the open form were selected for molecular docking. A database of ~140,000 compounds was screened at each site. In vitro testing of the top scoring compounds for each site showed that two compounds predicted to interact with site 1 had similar activity. These compounds were also observed to exhibit a high degree of structural similarity. Both compounds activated ACE2 in the low micromolar range while inhibiting it in the medium to high micromolar range. Although this inhibitory effect may be a consequence of poor compound solubility, it is still consistent with a conformational equilibrium shift mechanism of action. The more soluble compound showed a clear dose response effect on ACE2 activity. Finally, it was determined that these compounds enhance ACE2 activity while not affecting ACE activity significantly. The two proteases (ACE and ACE2) are ~40% identical in sequence. Although more experiments are needed to determine the exact mechanism of action of these compounds, the body of evidence presented at least supports a successful rational approach to identifying enzymatic activators by the exploitation of structural knowledge.

It is still possible that these two lead compounds may not develop into powerful therapeutic agents, but further studies on these compounds may lead to the discovery of novel techniques in drug discovery. Most enzyme targets are modulated therapeutically through the development of active site inhibitors. This study suggests we may be able to target enzymes whose activity is desired by developing enzyme activators. This may lead to a direct drug treatment for some enzyme deficiencies. This approach also offers new strategies in the development of inhibitors for enzymes now heavily resistant to active site inhibitors, since, according to conformational equilibrium concepts, stabilizing a
nonfunctional form of the enzyme will effectively inhibit it. These techniques would be applicable to any enzyme or other protein whose activity is dependent on its conformational state.

Future Directions

There are at least 4 distinct families of VCBPs, and obtaining additional structures of VCBP representatives will be necessary to identify general features shared by all families. For example, it is still possible, although unlikely, that not all VCBPs exhibit a V-type fold. Analyzing representatives of different families will yield more appropriate models of adaptive immune receptor ancestors. Crystals of VCBP2 are reported in this dissertation but data collection from these crystals was not completely successful. Further screening of cryoprotectant conditions will likely result in successful data collection from cooled crystals. Additive screens should also help identify conditions that stabilize VCBP2 crystals since this may keep mosaicity down and improve diffraction limits.

Another interesting approach that may be attempted to improve the diffraction properties of VCBP2 crystals is reductive methylation of lysine residues. Lysine residues are frequently found on the surfaces of proteins in an extended conformation that may not be ideal for crystal packing. By methylating these residues their charge is neutralized and the more hydrophobic character of the modified residues results in better packing of the long side chains, closer to the molecular surface. Model proteins such as lysozyme have been examined for artifacts introduced by this technique and have shown none other than the expected compact conformation for the lysine side chains (Rypniewski et al. 1993). Although chemical modification of lysine residues did not affect the diffraction properties of lysozyme significantly, the structure of myosin subfragment 1 was obtained only after applying this method (Rayment et al. 1993).
Other VCBPs may be amenable to crystallization, but their expression in E. coli has not been possible so far. Alternative expression systems may offer soluble VCBPs that can be screened for crystallization. Further attempts to crystallize VCBPs should also include automated high throughput crystal screening for rapid crystallization since VCBP2 was crystallized only after it was screened at the Southeast Collaborative for Structural Genomics consortium center at Birmingham, Alabama. A structure of a full-length VCBP construct is of particular interest since it would show how the Ig and chitin-binding domains are organized. However, as chitin-binding domains are cysteine rich, these may prove more difficult to express. These additional structures will be an important aim of future studies on VCBPs.

The VCBPs so far are the only immune-type receptors in an invertebrate displaying both V-type Igs and regionalized hypervariability. Although these characteristics are highly suggestive of immune function, the functional roles of VCBPs in amphioxus have not been established. Identifying a functional role for VCBPs should conclusively indicate whether or not they form part of the innate immune system in amphioxus. Purified VCBPs may be tested for inhibition of fungal or bacterial growth. If they are active, mutagenesis experiments may be designed based on the structures described in this dissertation to determine which residues in VCBPs are most important to their function. Amphioxus can be bred in the laboratory. It would be interesting to generate VCBP-deficient mutants that could be tested for a significantly higher susceptibility to fungal or bacterial colonization. If a higher susceptibility is observed, a rescue experiment should be performed to test whether exogenous VCBP prevents the affliction.
More specifically, ligands recognized by VCBPs will eventually need to be identified. Coimmunoprecipitation and yeast two-hybrid experiments would help determine whether any other amphioxus proteins interact with VCBPs. However, VCBPs may be secreted into the gut of amphioxus as a primitive type of neutralizing antibody and may therefore recognize many different antigens. Surface plasmon resonance experiments would be highly efficient in identifying these antigens. With VCBPs fixed to a plate, sea water may be applied to these putative immune receptors until a ligand is identified. A complex structure of a VCBP with its antigen will confirm or refute that the polymorphic surface of VCBPs is the antigen binding site.

In the case of ACE2, experiments are needed to validate the mechanism of action of the compounds identified. Kinetic experiments will be required to determine whether these compounds modulate kcat, Km, or both. There is a potent ACE2 inhibitor available that should enable us to perform active site titrations and quantitate enzymatic activity. It should also be determined whether the inhibition observed at higher concentrations is specific or a consequence of aggregation. Lineweaver-Burk plots could give initial evidence that inhibition occurs by a specific mechanism. Also, assays should be performed with a physiologically relevant substrate of ACE2 (i.e., angiotensin II) to show that the compounds in fact have the potential to elicit a physiological effect in vivo.

Physical interactions of these compounds with ACE2 should be analyzed to validate molecular docking simulations. Crystallization conditions for ACE2 are known. Solving the structure of ACE2 bound to the active compounds will confirm their site of interaction, their orientation, and the specific interactions involved. In conjunction with
crystallization experiments, isothermal titration calorimetry and differential scanning calorimetry should confirm the interaction of ACE2 with the active compounds. Microcalorimetry experiments will also enable us to obtain affinity constants for each compound. Finally, NMR relaxation studies could be performed to obtain information about conformational changes in ACE2. As shown for other proteins (Eisenmesser et al. 2005), it may be that the backbone amide frequencies in the presence and absence of substrate are the same, suggesting that conformational changes are rate limiting for ACE2. More specifically, if it can be shown that the opening rate of the ACE2 clam shell is slower than the closing rate, this would help explain how the active compounds affect ACE2 activity, consistent with our assumptions about its dynamics (i.e., compounds may be stabilizing the open conformation).

Animal studies may show that these compounds are able to rescue or prevent hypertensive symptoms in an established rat model (already available). However, these compounds may not produce a noticeable effect in vivo even though they do modulate enzymatic activity in vitro. The compounds identified as active were observed to obey Lipinski's rule of five, but they may still lack the potency or the ability to reach their target in vivo. Therefore, a family of compounds similar to those identified should be tested along with the original compounds. This approach may also allow structure-activity relationships to be determined, which will be helpful in the development of the lead compounds.

Development of these lead compounds is another important aim for future studies of ACE2. A crystal structure of ACE2 bound to the compounds described in this
dissertation will not only validate their site of interaction but will also provide the high resolution information necessary for structure-based optimization. Although the models obtained from rigorous molecular docking may be used as a starting point to derivatize lead compounds, current molecular docking algorithms are not appropriate for high resolution analysis. A crystal structure will be required for efficient optimization. Finally, additional computational techniques can be used for automated structure-based optimization with software packages such as RACHEL (Tripos, Inc.). RACHEL allows a database of fragments to be screened and evaluated (i.e., scored) as each fragment is considered as an extension of the lead compound. The lead compound can then be grown in silico at user-defined sites and ranked again. This approach should provide us with a "filtered" library of derivatives likely to have an increased affinity for our target, similar to the way virtual screening was used to reduce the original database of small molecules.

This dissertation examined concepts in structural variability in proteins. The first example discussed how immune-like proteins might have begun to be exploited by natural selection for efficient antigen recognition in invertebrate organisms. The second example discussed how the conformational ensemble of enzymes can be considered part of the variability repertoire of proteins and demonstrated that these concepts can be applied to develop novel structure-based drug design methods. The method discussed here should be developed and studied not only for ACE2 as a therapy against hypertension, but also for other enzymes with known conformational ensembles. Of particular interest is the application of these methods to enzymes resistant to active site inhibitors, such as HIV protease, which further emphasizes the importance of this research, since this approach could likely be applied to other proteins.
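As an illustration of the kinetic analysis proposed above for the ACE2 compounds, the following minimal Python sketch fits a Lineweaver-Burk (double-reciprocal) line, 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, to rate data measured with and without a compound. It is only a sketch under stated assumptions: the substrate concentrations, rates, and parameter values are hypothetical and noise-free, generated from the Michaelis-Menten equation, and the function names are illustrative and not part of this work. At a fixed enzyme concentration, a change in Vmax reflects a change in kcat, since Vmax = kcat[E]total.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, velocity):
    """Estimate (Km, Vmax) from the double-reciprocal plot of 1/v against 1/[S]."""
    inv_s = [1.0 / s for s in substrate]
    inv_v = [1.0 / v for v in velocity]
    slope, intercept = fit_line(inv_s, inv_v)
    vmax = 1.0 / intercept   # y-intercept of the plot is 1/Vmax
    km = slope * vmax        # slope of the plot is Km/Vmax
    return km, vmax


def michaelis_menten(s, km, vmax):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


# Hypothetical, noise-free data (arbitrary units), for illustration only.
substrate = [5.0, 10.0, 20.0, 50.0, 100.0]
control = [michaelis_menten(s, km=20.0, vmax=1.0) for s in substrate]
treated = [michaelis_menten(s, km=20.0, vmax=1.5) for s in substrate]

km_c, vmax_c = lineweaver_burk(substrate, control)
km_t, vmax_t = lineweaver_burk(substrate, treated)

print(f"control: Km = {km_c:.2f}, Vmax = {vmax_c:.2f}")
print(f"treated: Km = {km_t:.2f}, Vmax = {vmax_t:.2f}")
```

In this assumed scenario the two fitted lines share the same x-intercept (-1/Km) but differ in y-intercept, which is the pattern expected when a compound changes kcat without changing Km. With real, noisy data the double-reciprocal transform distorts the error structure, so a direct nonlinear fit of the Michaelis-Menten equation would normally be used to confirm the estimates.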
LIST OF REFERENCES

Agrawal A., Eastman Q. M. and Schatz D. G. (1998) Transposition Mediated by RAG1 and RAG2 and its Implications for the Evolution of the Immune System. Nature 394, 744-751.
Asensio J. L., Canada F. J., Bruix M., Rodríguez-Romero A. and Jiménez-Barbero J. (1995) The Interaction of Hevein with N-Acetylglucosamine-Containing Oligosaccharides. Solution Structure of Hevein Complexed to Chitobiose. Eur. J. Biochem. 230, 621-633.
Ash E. L., Sudmeier J. L., Day R. M., Vincent M., Torchilin E. V., Haddad K. C., Bradshaw E. M., Sanford D. G. and Bachovchin W. W. (2000) Unusual 1H NMR chemical shifts support (His) C(epsilon) 1...O==C H-bond: proposal for reaction-driven ring flip mechanism in serine protease catalysis. PNAS 97, 10371-10376.
Atkinson M. A. and Leiter E. H. (1999) The NOD Mouse Model of Type 1 Diabetes: As Good as it Gets? Nat. Medicine 5, 601-604.
Babu M. M. (2003) NCI: A Server to Identify Non-Canonical Interactions in Protein Structures. Nucleic Acids Res. 31, 3345-3348.
Bader M. (2002) Role of the Local Renin-Angiotensin System in Cardiac Damage: A Minireview Focussing on Transgenic Animal Models. J. Mol. Cell Cardiol. 34, 1455-1462.
Bajorath J. (2002) Integration of Virtual and High-Throughput Screening. Nat. Rev. Drug Discov. 1, 882-894.
Barclay A. N. (2003) Membrane Proteins with Immunoglobulin-Like Domains--A Master Superfamily of Interaction Molecules. Semin. Immunol. 15, 215-223.
Barton G. J. (1993) ALSCRIPT: A Tool to Format Multiple Sequence Alignments. Protein Eng. 6, 37-40.
Becker O. M., Marantz Y., Shacham S., Inbal B., Heifetz A., Kalid O., Bar-Haim S., Warshaviak D., Fichman M. and Noiman S. (2004) G Protein-Coupled Receptors: In Silico Drug Discovery in 3D. PNAS 101, 11304-11309.
Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H., Shindyalov I. N. and Bourne P. E. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235-242.
Binkowski T. A., Naghibzadeh S. and Liang J. (2003) CASTp: Computed Atlas of Surface Topography of Proteins. Nucleic Acids Res. 31, 3352-3355.
Bohacek R. S., McMartin C. and Guida W. C. (1996) The Art and Practice of Structure-Based Drug Design: A Molecular Modeling Perspective. Med. Res. Rev. 16, 3-50.
Boot R. G., Blommaart E. F., Swart E., Ghauharali-van der Vlugt K., Bijl N., Moe C., Place A. and Aerts J. M. (2001) Identification of a Novel Acidic Mammalian Chitinase Distinct from Chitotriosidase. J. Biol. Chem. 276, 6770-6778.
Borghi C., Bacchelli S., Degli Esposti D. and Ambrosioni E. (2004) A Review of the Angiotensin-Converting Enzyme Inhibitor, Zofenopril, in the Treatment of Cardiovascular Diseases. Expert Opin. Pharmacother. 5, 1965-1977.
Bork P., Holm L. and Sander C. (1994) The Immunoglobulin Fold. Structural Classification, Sequence Patterns and Common Core. J. Mol. Biol. 242, 309-320.
Brunger A. T. (1992) Free R Value: A Novel Statistical Quantity for Assessing the Accuracy of Crystal Structures. Nature 355, 472-475.
Brunger A. T., Adams P. D., Clore G. M., DeLano W. L., Gros P., Grosse-Kunstleve R. W., Jiang J. S., Kuszewski J., Nilges M., Pannu N. S., Read R. J., Rice L. M., Simonson T. and Warren G. L. (1998) Crystallography & NMR System: A New Software Suite for Macromolecular Structure Determination. Acta Cryst. D54 (Pt 5), 905-921.
Burrell L. M., Johnston C. I., Tikellis C. and Cooper M. E. (2004) ACE2, A New Regulator of the Renin-Angiotensin System. Trends Endocrinol. Metab. 15, 166-169.
Cannon J. P., Haire R. N. and Litman G. W. (2002) Identification of Diversified Genes that Contain Immunoglobulin-Like Variable Regions in a Protochordate. Nat. Immunol. 3, 1200-1207.
Cannon J. P., Haire R. N., Schnitker N., Mueller M. G. and Litman G. W. (2004) Individual Protochordates Have Unique Immune-Type Receptor Repertoires. Curr. Biol. 14, R465-466.
Cantoni C., Ponassi M., Biassoni R., Conte R., Spallarossa A., Moretta A., Moretta L., Bolognesi M. and Bordo D. (2003) The Three-Dimensional Structure of the Human NK Cell Receptor NKp44, a Triggering Partner in Natural Cytotoxicity. Structure 11, 725-734.
Card G. L., Blasdel L., England B. P., Zhang C., Suzuki Y., Gillette S., Fong D., Ibrahim P. N., Artis D. R., Bollag G., Milburn M. V., Kim S. H., Schlessinger J. and Zhang K. Y. (2005) Family of Phosphodiesterase Inhibitors Discovered by Cocrystallography and Scaffold-based Drug Design. Nat. Biotechnol. 23, 201-207.
Carter P., Andersen C. A. and Rost B. (2003) DSSPcont: Continuous Secondary Structure Assignments for Proteins. Nucleic Acids Res. 31, 3293-3295.
Chen B., Piletsky S. and Turner A. P. (2002) High Molecular Recognition: Design of "Keys". Comb. Chem. High Throughput Screen. 5, 409-427.
Chothia C., Gelfand I. and Kister A. (1998) Structural Determinants in the Sequences of Immunoglobulin Variable Domain. J. Mol. Biol. 278, 457-479.
Chothia C., Novotny J., Bruccoleri R. and Karplus M. (1985) Domain Association in Immunoglobulin Molecules. The Packing of Variable Domains. J. Mol. Biol. 186, 651-663.
Cleland W. W. (2000) Low-Barrier Hydrogen Bonds and Enzymatic Catalysis. Arch. Biochem. Biophys. 382, 1-5.
Clements C. S., Reid H. H., Beddoe T., Tynan F. E., Perugini M. A., Johns T. G., Bernard C. C. and Rossjohn J. (2003) The Crystal Structure of Myelin Oligodendrocyte Glycoprotein, a Key Autoantigen in Multiple Sclerosis. Proc. Natl. Acad. Sci. USA 100, 11059-11064.
Collaborative Computational Project N. (1994) The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763.
Collins A. M., Sewell W. A. and Edwards M. R. (2003) Immunoglobulin gene rearrangement, repertoire diversity, and the allergic response. Pharmacol. Ther. 100, 157-170.
Colman P. M. (1988) Structure of Antibody-Antigen Complexes: Implications for Immune Recognition. Adv. Immunol. 43, 99-132.
Connolly M. L. (1983) Solvent-Accessible Surfaces of Proteins and Nucleic Acids. Science 221, 709-713.
Crackower M. A., Sarao R., Oudit G. Y., Yagil C., Kozieradzki I., Scanga S. E., Oliveira-dos-Santos A. J., da Costa J., Zhang L., Pei Y., Scholey J., Ferrario C. M., Manoukian A. S., Chappell M. C., Backx P. H., Yagil Y. and Penninger J. M. (2002) Angiotensin-Converting Enzyme 2 is an Essential Regulator of Heart Function. Nature 417, 822-828.
Dauter Z., Lamzin V. S. and Wilson K. S. (1997) The Benefits of Atomic Resolution. Curr. Opin. Struct. Biol. 7, 681-688.
Davis M. M. and Bjorkman P. J. (1988) T-Cell Antigen Receptor Genes and T-Cell Recognition. Nature 334, 395-402.
De Genst E., Handelberg F., Van Meirhaeghe A., Vynck S., Loris R., Wyns L. and Muyldermans S. (2004) Chemical Basis for the Affinity Maturation of a Camel Single Domain Antibody. J. Biol. Chem. 279, 53593-53601.
DeLano W. L. (2002) The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA.
Derewenda Z. S., Derewenda U. and Kobos P. M. (1994) (His)C Epsilon-H...O=C < Hydrogen Bond in the Active Sites of Serine Hydrolases. J. Mol. Biol. 241, 83-93.
Derewenda Z. S., Lee L. and Derewenda U. (1995) The Occurrence of C-H...O Hydrogen Bonds in Proteins. J. Mol. Biol. 252, 248-262.
Desiraju G. R. (1996) The CH-O Hydrogen Bond: Structural Implications and Supramolecular Design. Acc. Chem. Res. 29, 441-449.
Desmyter A., Transue T. R., Ghahroudi M. A., Thi M. H., Poortmans F., Hamers R., Muyldermans S. and Wyns L. (1996) Crystal Structure of a Camel Single-Domain VH Antibody Fragment in Complex with Lysozyme. Nat. Struct. Biol. 3, 803-811.
Dobson C. M. (2004) Chemical Space and Biology. Nat. Insight. Intro. pp. 2.
Donoghue M., Hsieh F., Baronas E., Godbout K., Gosselin M., Stagliano N., Donovan M., Woolf B., Robison K., Jeyaseelan R., Breitbart R. E. and Acton S. (2000) A Novel Angiotensin-Converting Enzyme-Related Carboxypeptidase (ACE2) Converts Angiotensin I to Angiotensin 1-9. Circ. Res. 87, E1-E9.
Dyson H. J. and Wright P. E. (2005) Intrinsically Unstructured Proteins and Their Functions. Nat. Rev. Mol. Cell Biol. 6, 197-208.
Eason D. D., Cannon J. P., Haire R. N., Rast J. P., Ostrov D. A. and Litman G. W. (2004) Mechanisms of Antigen Receptor Evolution. Semin. Immunol. 16, 215-226.
Eisenmesser E. Z., Millet O., Labeikovsky W., Korzhnev D. M., Wolf-Watz M., Bosco D. A., Skalicky J. J., Kay L. E. and Kern D. (2005) Intrinsic Dynamics of an Enzyme Underlies Catalysis. Nature 438, 117-121.
Emsley P. and Cowtan K. (2004) Coot: Model-Building Tools for Molecular Graphics. Acta Cryst. D60, 2126-2132.
Evans S. V. (1993) SETOR: Hardware-Lighted Three-Dimensional Solid Model Representations of Macromolecules. J. Mol. Graphics 11, 134-138, 127-128.
Ewing T. J., Makino S., Skillman A. G. and Kuntz I. D. (2001) DOCK 4.0: Search Strategies for Automated Molecular Docking of Flexible Molecule Databases. J. Comput. Aided Mol. Des. 15, 411-428.
Frauenfelder H., Parak F. and Young R. D. (1988) Conformational Substates in Proteins. Annu. Rev. Biophys. Biophys. Chem. 17, 451-479.
Garcia K. C., Teyton L. and Wilson I. A. (1999) Structural Basis of T Cell Recognition. Ann. Rev. Immunol. 17, 369-397.
Garrett T. P., Wang J., Yan Y., Liu J. and Harrison S. C. (1993) Refinement and Analysis of the Structure of the First Two Domains of Human CD4. J. Mol. Biol. 234, 763-778.
Guex N., Diemand A. and Peitsch M. C. (1999) Protein Modelling for All. Trends Biochem. Sci. 24, 364-367.
Guex N. and Peitsch M. C. (1997) SWISS-MODEL and the Swiss-PdbViewer: An Environment for Comparative Protein Modeling. Electrophoresis 18, 2714-2723.
Hendrickson W. A., Horton J. R. and LeMaster D. M. (1990) Selenomethionyl Proteins Produced for Analysis by Multiwavelength Anomalous Diffraction (MAD): A Vehicle for Direct Determination of Three-Dimensional Structure. Embo Journal 9, 1665-1672.
Hernandez Prada J. A., Haire R. N., Cannon J. P., Litman G. W. and Ostrov D. A. (2004) Crystallization and Preliminary X-Ray Analysis of VCBP3 from Branchiostoma Floridae. Acta Cryst. D60, 2022-2024.
Hiom K., Melek M. and Gellert M. (1998) DNA Transposition by the RAG1 and RAG2 Proteins: A Possible Source of Oncogenic Translocations. Cell 94, 463-470.
Hogue C. W. (1997) Cn3D: A New Generation of Three-Dimensional Molecular Structure Viewer. Trends Biochem. Sci. 22, 314-316.
Holm L. and Sander C. (1999) Protein Folds and Families: Sequence and Structure Alignments. Nucleic Acids Res. 27, 244-247.
Huentelman M. J., Zubcevic J., Katovich M. J. and Raizada M. K. (2004) Cloning and Characterization of a Secreted Form of Angiotensin-Converting Enzyme 2. Regul. Pept. 122, 61-67.
Huggins M. L. (1943) The Structure of Fibrous Proteins. Chem. Rev. 32, 195-218.
Hunkapiller T. and Hood L. (1989) Diversity of the Immunoglobulin Gene Superfamily. Adv. Immunol. 44, 1-63.
Irwin J. J. and Shoichet B. K. (2005) ZINC--A Free Database of Commercially Available Compounds for Virtual Screening. J. Chem. Inf. Model. 45, 177-182.
Jaffe E. K. (2005) Morpheeins--A New Structural Paradigm for Allosteric Regulation. Trends Biochem. Sci. 30, 490-497.
Jancarik J. and Kim S. H. (1991) Sparse Matrix Sampling: A Screening Method for Crystallization of Proteins. J. Appl. Cryst. 24, 409-411.
Jones T. A., Zou J. Y., Cowan S. W. and Kjeldgaard (1991) Improved Methods for Building Protein Models in Electron Density Maps and the Location of Errors in these Models. Acta Cryst. A47 (Pt 2), 110-119.
Jung D. and Alt F. W. (2004) Unraveling V(D)J Recombination; Insights into Gene Regulation. Cell 116, 299-311.
Kang B. S., Devedjiev Y., Derewenda U. and Derewenda Z. S. (2004) The PDZ2 Domain of Syntenin at Ultra-High Resolution: Bridging the Gap between Macromolecular and Small Molecule Crystallography. J. Mol. Biol. 338, 483-493.
Kantardjieff K. A. and Rupp B. (2003) Matthews Coefficient Probabilities: Improved Estimates for Unit Cell Contents of Proteins, DNA, and Protein-Nucleic Acid Complex Crystals. Protein Sci. 12, 1865-1871.
Katovich M. J., Grobe J. L., Huentelman M. and Raizada M. K. (2005) Angiotensin-Converting Enzyme 2 as a Novel Target for Gene Therapy for Hypertension. Exp. Physiol. 90, 299-305.
Kawabata S., Nagayama R., Hirata M., Shigenaga T., Agarwala K. L., Saito T., Cho J., Nakajima H., Takagi T. and Iwanaga S. (1996) Tachycitin, a Small Granular Component in Horseshoe Crab Hemocytes, is an Antimicrobial Protein with Chitin-Binding Activity. J. Biochem. 120, 1253-1260.
Kitchen D. B., Decornez H., Furr J. R. and Bajorath J. (2004) Docking and Scoring in Virtual Screening for Drug Discovery: Methods and Applications. Nat. Rev. Drug Discov. 3, 935-949.
Kolodny R., Koehl P. and Levitt M. (2005) Comprehensive Evaluation of Protein Structure Alignment Methods: Scoring by Geometric Measures. J. Mol. Biol. 346, 1173-1188.
Kuntz I. D., Blaney J. M., Oatley S. J., Langridge R. and Ferrin T. E. (1982) A Geometric Approach to Macromolecule-Ligand Interactions. J. Mol. Biol. 161, 269-288.
Laird D. J., De Tomaso A. W., Cooper M. D. and Weissman I. L. (2000) 50 Million Years of Chordate Evolution: Seeking the Origins of Adaptive Immunity. Proc. Natl. Acad. Sci. USA 97, 6924-6926.
Langer T. and Hoffmann R. D. (2001) Virtual Screening: An Effective Tool for Lead Structure Discovery? Curr. Pharm. Des. 7, 509-527.
Laskowski R. A., Moss D. S. and Thornton J. M. (1993) Main-Chain Bond Lengths and Bond Angles in Protein Structures. J. Mol. Biol. 231, 1049-1067.
Leahy D. J., Axel R. and Hendrickson W. A. (1992) Crystal Structure of A Soluble Form of the Human T Cell Coreceptor CD8 at 2.6 Å Resolution. Cell 68, 1145-1162.
Lefranc M. P., Giudicelli V., Ginestoux C., Bosc N., Folch G., Guiraudou D., Jabado-Michaloud J., Magris S., Scaviner D., Thouvenin V., Combres K., Girod D., Jeanjean S., Protat C., Yousfi-Monod M., Duprat E., Kaas Q., Pommie C., Chaume D. and Lefranc G. (2004) IMGT-ONTOLOGY for Immunogenetics and Immunoinformatics. In Silico Biol. 4, 17-29.
Li H., Lebedeva M. I., Llera A. S., Fields B. A., Brenner M. B. and Mariuzza R. A. (1998) Structure of the Vdelta Domain of a Human Gammadelta T-Cell Antigen Receptor. Nature 391, 502-506.
Lipinski C. A., Lombardo F., Dominy B. W. and Feeney P. J. (1997) Experimental and Computational Approaches to Estimate Solubility and Permeability in Drug Discovery and Development Settings. Adv. Drug Delivery Rev. 23(1-3), 3-25.
Litman G. W., Anderson M. K. and Rast J. P. (1999) Evolution of Antigen Binding Receptors. Ann. Rev. Immunol. 17, 109-147.
Litman G. W., Cannon J. P. and Dishaw L. J. (2005) Reconstructing Immune Phylogeny: New Perspectives. Nat. Rev. Immunol. 5, 866-879.
Loll B., Raszewski G., Saenger W. and Biesiadka J. (2003) Functional Role of C(Alpha)-H...O Hydrogen Bonds between Transmembrane Alpha-Helices in Photosystem I. J. Mol. Biol. 328, 737-747.
Luz J. G., Huang M., Garcia K. C., Rudolph M. G., Apostolopoulos V., Teyton L. and Wilson I. A. (2002) Structural Comparison of Allogeneic and Syngeneic T Cell Receptor-Peptide-Major Histocompatibility Complex Complexes: A Buried Alloreactive Mutation Subtly Alters Peptide Presentation Substantially Increasing V(Beta) Interactions. J. Exp. Med. 195, 1175-1186.
Marchler-Bauer A., Anderson J. B., Cherukuri P. F., DeWeese-Scott C., Geer L. Y., Gwadz M., He S., Hurwitz D. I., Jackson J. D., Ke Z., Lanczycki C. J., Liebert C. A., Liu C., Lu F., Marchler G. H., Mullokandov M., Shoemaker B. A., Simonyan V., Song J. S., Thiessen P. A., Yamashita R. A., Yin J. J., Zhang D. and Bryant S. H. (2005) CDD: A Conserved Domain Database for Protein Classification. Nucleic Acids Res. D33, 192-196.
Matthews B. W. (1968) Solvent Content of Protein Crystals. J. Mol. Biol. 33, 491-497.
Matthews B. W. (1976) X-Ray Crystallographic Studies of Proteins. Ann. Rev. Phys. Chem. 27, 493-523.
McCammon J. A. (2005) Target Flexibility in Molecular Recognition. Biochim. Biophys. Acta 1754, 221-224.
McDonald I. K. and Thornton J. M. (1994) Satisfying Hydrogen Bonding Potential in Proteins. J. Mol. Biol. 238, 777-793.
McPherson A. (1999) Crystallization of Biological Macromolecules (Cold Spring Harbor Laboratory, 1st edition) pp. 586.
McRee D. E. (1999) XtalView/Xfit--A Versatile Program for Manipulating Atomic Coordinates and Electron Density. J. Struct. Biology 125, 156-165.
Medzhitov R. and Janeway C. Jr. (2000) Innate Immunity. N. Engl. J. Med. 343, 338-344.
Meibom K. L., Blokesch M., Dolganov N. A., Wu C. Y. and Schoolnik G. K. (2005) Chitin Induces Natural Competence in Vibrio Cholerae. Science 310, 1824-1827.
Meng E. C., Gschwend D. A., Blaney J. M. and Kuntz I. D. (1993) Orientational Sampling and Rigid-Body Minimization in Molecular Docking. Proteins 17, 266-278.
Meng E. C., Shoichet K. and Kuntz I. D. (1992) Automated Docking with Grid Based Energy Evaluation. J. Comp. Chem. 13, 505-524.
Merritt E. A. (1994) Raster3D Version 2.0. A Program for Photorealistic Molecular Graphics. Acta Cryst. D50, 869-873.
Merritt E. A. (1999) Comparing Anisotropic Displacement Parameters in Protein Structures. Acta Cryst. D55 (Pt 12), 1997-2004.
Miura K., Daviglus M. L., Greenland P. and Stamler J. (2001) Making Prevention and Management of Hypertension Work. J. Hum. Hypertens. 15, 1-3.
Monod M. Y., Giudicelli V., Chaume D. and Lefranc M. P. (2004) IMGT/JunctionAnalysis: The First Tool for the Analysis of the Immunoglobulin and T Cell Receptor Complex. Bioinformatics 20 Suppl 1, I379-I385.
Myers G. and Durbin R. (2003) A Table-Driven, Full-Sensitivity Similarity Search Algorithm. J. Comput. Biol. 10, 103-117.
Natesh R., Schwager S. L., Sturrock E. D. and Acharya K. R. (2003) Crystal Structure of the Human Angiotensin-Converting Enzyme-Lisinopril Complex. Nature 421, 551-554.
Nicholls A., Sharp K. and Honig B. (1991) GRASP: A Molecular Surface Graphics Program. PROTEINS, Structure, Function and Genetics 11, 281.
Orengo C. A. and Thornton J. M. (2005) Protein Families and their Evolution-A Structural Perspective. Annu. Rev. Biochem. 74, 867-900.
Ostrov D. A., Shi W., Schwartz J. C., Almo S. C. and Nathenson S. G. (2000) Structure of Murine CTLA-4 and its Role in Modulating T Cell Responsiveness. Science 290, 816-819.
Otwinowski Z., Borek D., Majewski W. and Minor W. (2003) Multiparametric Scaling of Diffraction Intensities. Acta Crystallographica. Section A, Crystal Physics, Diffraction, Theoretical and General Crystallography 59, 228-234.
Otwinowski Z. and Minor W. (1997) Processing of X-ray Diffraction Data Collected in Oscillation Mode. Methods Enzymol. 276, 307-326.
Peitsch M. C. (1996) ProMod and Swiss-Model: Internet-Based Tools for Automated Comparative Protein Modelling. Biochem. Soc. Trans. 24, 274-279.
Peitsch M. C., Schwede T. and Guex N. (2000) Automated Protein Modelling--the Proteome in 3D. Pharmacogenomics 1, 257-266.
Potapov V., Sobolev V., Edelman M., Kister A. and Gelfand I. (2004) Protein--Protein Recognition: Juxtaposition of Domain and Interface Cores in Immunoglobulins and Other Sandwich-Like Proteins. J. Mol. Biol. 342, 665-679.
Rast J. P., Anderson M. K., Strong S. J., Litman R. T. and Litman G. W. (1997) α, β, γ, and δ T Cell Antigen Receptor Genes Arose Early in Vertebrate Phylogeny. Immunity 6, 1-11.
Rast J. P. and Litman G. W. (1998) Towards Understanding the Evolutionary Origins and Early Diversification of Rearranging Antigen Receptors. Immunological Reviews 166, 79-86.
Rayment I., Rypniewski W. R., Schmidt-Base K., Smith R., Tomchick D. R., Benning M. M., Winkelmann D. A., Wesenberg G. and Holden H. M. (1993) Three-Dimensional Structure of Myosin Subfragment 1: A Molecular Motor. Science 261, 50-58.
Renkema G. H., Boot R. G., Muijsers A. O., Donker-Koopman W. E. and Aerts J. M. (1995) Purification and Characterization of Human Chitotriosidase, a Novel Member of the Chitinase Family of Proteins. J. Biol. Chem. 270, 2198-2202.
Renkema G. H., Boot R. G., Strijland A., Donker-Koopman W. E., van den Berg W., Muijsers A. O. and Aerts J. M. (1997) Synthesis, Sorting and Processing into Distinct Isoforms of Human Macrophage Chitotriosidase. Eur. J. Biochem. 244, 279-285.
Richards A. G. and Richards P. A. (1977) The Peritrophic Membranes of Insects. Annu. Rev. Entomol. 22, 219-240.
Rini J. M., Schulze-Gahmen U. and Wilson I. A. (1992) Structural Evidence for Induced Fit as a Mechanism for Antibody-Antigen Recognition. Science 255, 959-965.
Ross E. D., Minton A. and Wickner R. B. (2005) Prion domains: sequences, structures and interactions. Nat. Cell Biol. 7, 1039-1044.
Rudolph M. G., Luz J. G. and Wilson I. A. (2002) Structural and Thermodynamic Correlates of T Cell Signaling. Annu. Rev. Biophys. Biomol. Struct. 31, 121-149.
Rudolph M. G. and Wilson I. A. (2002) The Specificity of TCR/pMHC Interaction. Curr. Opin. Immunol. 14, 52-65.
Ruiz-Ortega M., Lorenzo O., Ruperez M., Estevan V., Suzuki Y., Mezzano S., Plaza J. J. and Egido J. (2001) Role of the Renin-Angiotensin System in Vascular Diseases: Expanding the Field. Hypertension 38, 1382-1387.
Rypniewski W. R., Holden H. M. and Rayment I. (1993) Structural Consequences of Reductive Methylation of Lysine Residues in Hen Egg White Lysozyme: an X-ray Analysis at 1.8-A Resolution. Biochemistry 32, 9851-9858.
Schneider T. R. and Sheldrick G. M. (2002) Substructure Solution with SHELXD. Acta Cryst. D58, 1772-1779.
Schwede T., Kopp J., Guex N. and Peitsch M. C. (2003) SWISS-MODEL: An Automated Protein Homology-Modeling Server. Nucleic Acids Res. (Online) 31, 3381-3385.
Sheldrick G. M. and Schneider T. R. (1997) SHELXL: High-Resolution Refinement. Methods Enzymol. 277, 319-343.
Shen Z. and Jacobs-Lorena M. (1999) Evolution of Chitin-Binding Proteins in Invertebrates. J. Mol. Evol. 48, 341-347.
Simonds W. F. (1999) G Protein Regulation of Adenylate Cyclase. Trends Pharmacol. Sci. 20, 66-73.
Stanfield R. L., Dooley H., Flajnik M. F. and Wilson I. A. (2004) Crystal Structure of a Shark Single-Domain Antibody V Region in Complex with Lysozyme. Science 305, 1770-1773.
Stanfield R. L., Takimoto-Kamimura M., Rini J. M., Profy A. T. and Wilson I. A. (1993) Major Antigen-Induced Domain Rearrangements in an Antibody. Structure 1, 83-93.
Streltsov V. A., Varghese J. N., Carmichael J. A., Irving R. A., Hudson P. J. and Nuttall S. D. (2004) Structural Evidence for Evolution of Shark Ig New Antigen Receptor Variable Domain Antibodies from a Cell-Surface Receptor. Proc. Natl. Acad. Sci. USA 101, 12444-12449.
Strong S. J., Mueller M. G., Litman R. T., Hawke N. A., Haire R. N., Miracle A. L., Rast J. P., Amemiya C. T. and Litman G. W. (1999) A Novel Multigene Family Encodes Diversified Variable Regions. Proc. Natl. Acad. Sci. USA 96, 15080-15085.
Suetake T., Aizawa T., Koganesawa N., Osaki T., Kobashigawa Y., Demura M., Kawabata S., Kawano K., Tsuda S. and Nitta K. (2002) Production and Characterization of Recombinant Tachycitin, the Cys-Rich Chitin-Binding Protein. Protein Eng. 15, 763-769.
Suetake T., Tsuda S., Kawabata S., Miura K., Iwanaga S., Hikichi K., Nitta K. and Kawano K. (2000) Chitin-Binding Proteins in Invertebrates and Plants Comprise a Common Chitin-Binding Structural Motif. J. Biol. Chem. 275, 17929-17932.
Suri M. F. K., Kirmani J. F. and Divani A. A. AHA News Release Sept 3 2004, at www.americanheart.org.
Taylor R. and Kennard O. (1982) Crystallographic Evidence for the Existence of CH-O, CH-N, and CH-Cl Hydrogen Bonds. J. Am. Chem. Soc. 104, 5063-5070.
Terwilliger T. C. (2000) Maximum-Likelihood Density Modification. Acta Cryst. D56, 965-972.
Terwilliger T. C. (2002) Automated Main-Chain Model Building by Template Matching and Iterative Fragment Extension. Acta Cryst. D59, 38-44.
Terwilliger T. C. and Berendzen J. (1999) Automated MAD and MIR Structure Solution. Acta Cryst. D55 (Pt 4), 849-861.
Tipnis S. R., Hooper N. M., Hyde R., Karran E., Christie E. and Turner A. J. (2000) A Human Homolog of Angiotensin-Converting Enzyme. Cloning and Functional Expression as a Captopril-Insensitive Carboxypeptidase. J. Biol. Chem. 275, 33238-33243.
Tjoelker L. W., Gosting L., Frey S., Hunter C. L., Trong H. L., Steiner B., Brammer H. and Gray P. W. (2000) Structural and Functional Definition of the Human Chitinase Chitin-Binding Domain. J. Biol. Chem. 275, 514-520.
Tonegawa S. (1983) Somatic Generation of Antibody Diversity. Nature 302, 575-581.
Towler P., Staker B., Prasad S. G., Menon S., Tang J., Parsons T., Ryan D., Fisher M., Williams D., Dales N. A., Patane M. A. and Pantoliano M. W. (2004) ACE2 X-Ray Structures Reveal a Large Hinge-Bending Motion Important for Inhibitor Binding and Catalysis. J. Biol. Chem. 279, 17996-18007.
Unger T. (2003) Blood Pressure Lowering and Renin-Angiotensin System Blockade. J. Hypertens. 21 suppl 6, S3-S7.
van den Berg T. K., Yoder J. A. and Litman G. W. (2004) On the Origins of Adaptive Immunity: Innate Immune Receptors Join the Tale. Trends in Immunology 25, 11-16.
van Eijk M., van Roomen C. P., Renkema G. H., Bussink A. P., Andrews L., Blommaart E. F., Sugar A., Verhoeven A. J., Boot R. G. and Aerts J. M. (2005) Characterization of Human Phagocyte-Derived Chitotriosidase, a Component of Innate Immunity. Int. Immunol. 17, 1505-1512.
van Gunsteren W. F. (1996) Biomolecular Simulation: The GROMOS96 Manual and User Guide. 4-31.
van Raaij M. J., Chouin E., van der Zandt H., Bergelson J. M. and Cusack S. (2000) Dimeric Structure of the Coxsackievirus and Adenovirus Receptor D1 Domain at 1.7 Å Resolution. Structure Fold Des. 8, 1147-1155.
Vickers C., Hales P., Kaushik V., Dick L., Gavin J., Tang J., Godbout K., Parsons T., Baronas E., Hsieh F., Acton S., Patane M., Nichols A. and Tummino P. (2002) Hydrolysis of Biological Peptides by Human Angiotensin-Converting Enzyme-Related Carboxypeptidase. J. Biol. Chem. 277, 14838-14843.
Volkman B. F., Lipson D., Wemmer D. E. and Kern D. (2001) Two-State Allosteric Behavior in a Single-Domain Signaling Protein. Science 291, 2429-2433.
Wahl M. C., Rao S. T. and Sundaralingam M. (1996) The Structure of R(UUCGCG) Has a 5'-UU-Overhang Exhibiting Hoogsteen-Like Trans U.U Base Pairs. Nat. Struct. Biol. 3, 24-31.
Wahl M. C. and Sundaralingam M. (1997) C-H...O Hydrogen Bonding in Biology. Trends Biochem. Sci. 22, 97-102.
Wallace A. C., Laskowski R. A. and Thornton J. M. (1995) LIGPLOT: A Program to Generate Schematic Diagrams of Protein-Ligand Interactions. Protein Eng. 8, 127-134.
Walters W. P., Stahl M. T. and Murcko M. A. (1998) Virtual Screening-An Overview. Drug Discov. Today 3, 160-178.
Wang J. W., Chen J. R., Gu Y. X., Zheng C. D., Jiang F., Fan H. F., Terwilliger T. C. and Hao Q. (2004) SAD Phasing by Combination of Direct Methods with the SOLVE/RESOLVE Procedure. Acta Cryst. D60, 1244-1253.
Watson F. L., Puttmann-Holgado R., Thomas F., Lamar D. L., Hughes M., Kondo M., Rebel V. I. and Schmucker D. (2005) Extensive Diversity of Ig-Superfamily Proteins in the Immune System of Insects. Science 309, 1874-1878.
Weeks C. M., Adams P. D., Berendzen J., Brunger A. T., Dodson E. J., Grosse-Kunstleve R. W., Schneider T. R., Sheldrick G. M., Terwilliger T. C., Turkenburg M. G. and Uson I. (2003) Automatic Solution of Heavy-Atom Substructures. Substructure Solution with SHELXD. Methods Enzymol. 374, 37-83.
Wolf-Watz M., Thai V., Henzler-Wildman K., Hadjipavlou G., Eisenmesser E. Z. and Kern D. (2004) Linkage between dynamics and catalysis in a thermophilic-mesophilic enzyme pair. Nature Struct. & Mol. Biol. 11, 945-949.
Yoder J. A., Litman R. T., Mueller M. G., Desai S., Dobrinski K. P., Montgomery J. S., Buzzeo M. P., Ota T., Amemiya C. T., Trede N. S., Wei S., Djeu J. Y., Humphray S., Jekosch K., Hernandez Prada J. A., Ostrov D. A. and Litman G. W. (2004) Resolution of the Novel Immune-Type Receptor Gene Cluster in Zebrafish. Proc. Natl. Acad. Sci. USA 101, 15706-15711.
Zaccai N. R., Maenaka K., Maenaka T., Crocker P. R., Brossmer R., Kelm S. and Jones E. Y. (2003) Structure-Guided Design of Sialic Acid-Based Siglec Inhibitors and Crystallographic Analysis in Complex with Sialoadhesin. Structure (Camb) 11, 557-567.
Zaman M. A., Oparil S. and Calhoun D. A. (2002) Drugs Targeting the Renin-Angiotensin-Aldosterone System. Nat. Rev. Drug Discov. 1, 621-636.
Zemlin M., Schelonka R. L., Bauer K. and Schroeder H. W. Jr. (2002) Regulation and Chance in the Ontogeny of B and T Cell Antigen Receptor Repertoires. Immunol. Res. 26, 265-278.
Zhang J. and Madden T. L. (1997) PowerBLAST: A New Network BLAST Application for Interactive or Automated Sequence Analysis and Annotation. Genome Res. 7, 649-656.
Zhang S. M., Adema C. M., Kepler T. B. and Loker E. S. (2004) Diversification of Ig Superfamily Genes in an Invertebrate. Science 305, 251-254.
Zhurkin V. B., Raghunathan G., Ulyanov N. B., Camerini-Otero R. D. and Jernigan R. L. (1994) A Parallel DNA Triplex as a Model for the Intermediate in Homologous Recombination. J. Mol. Biol. 239, 181-200.
BIOGRAPHICAL SKETCH
José Antonio Hernández Prada was born in Caracas, Venezuela, on July 25th, 1980. His parents, José Antonio Hernández Davila and Irene Prada de Hernández, are both Venezuelan, of Italian, Spanish, and Native American descent. He is the oldest of three siblings, with two younger sisters, María Helena and Irene Virginia. In Caracas he was raised to the age of 14 while attending Catholic school, and in 1996 he moved to the United States with his family to finish high school at Miami Sunset Senior High. In 1998, José Antonio graduated from high school and began the undergraduate program at the University of Miami, where he majored in biochemistry and chemistry with a minor in psychology. He graduated cum laude from the University of Miami in 2002 and moved to Gainesville, Florida, to start graduate studies in May of that year at the University of Florida. In 2003, he joined the laboratory of Dr. David Ostrov, where he learned x-ray crystallography and structure-based approaches to drug discovery.