COMPUTATIONAL STUDIES OF MELANO CORTIN RECEPTOR SYSTEM AND CHANNEL FUNCTION IN GLUTAMIN E-DEPENDENT AMIDOTRANSFERASES By XIANG SIMON WANG A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLOR IDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2003
ACKNOWLEDGMENTS I am thankful to many individuals for their help and encouragement in my five years of graduate studies. First, I would like to thank Dr. Nigel Richards for teaching me the fundamentals of computational chemistry, scientific advice and sense of humor. I am grateful to the assistance from Dr. Carrie Haskell-Luevano, without whom half of this dissertation may not have been possible. I am also thankful to Dr. Adrian Roitberg for his scientific advice and encouragement to me when I had difficulties. I would like to acknowledge other members of my supervisory committee, Dr. Nicole Horenstein, Dr. Alexander Angerhofer and Dr. Dennis Wright, as well as the Department of Chemistry faculty, for their scientific teaching and advice during my graduate studies. I would like to acknowledge members of Richardsâ€™ group and Haskell-Luevanoâ€™s group for their support and friendship. I would like to thank the Department of Chemistry and the University of Florida for an excellent education and research system. Finally, I am grateful to my wife, Min Lin, and my parents, Huasong Wang and Qiongru Liu, for their love and support. ii
TABLE OF CONTENTS Page ACKNOWLEDGMENTS..................................................................................................ii ABSTRACT.........................................................................................................................v CHAPTER 1 INTRODUCTION OF BIOLOGY OF MELANOCORTIN RECEPTOR SYSTEM AND THEORETICAL METHODOLOGY.................................................................1 Melanocortin Receptor System..................................................................................1 Homology Modeling Methodology............................................................................3 Fragment-based Algorithm.................................................................................4 Structurally conserved regions (SCRs) modeling.......................................4 Structurally variable regions (SVRs) modeling..........................................6 Side chain modeling....................................................................................7 Refinement and evaluation of the model....................................................7 Restraint-based Algorithm..................................................................................9 2 HOMOLOGY MODELING OF THE THREE-DIMENSIONAL STRUCTURE OF MELANOCORTIN RECEPTORS (MCR) AND THE HAGRP(87-132)/MC4R COMPLEX.................................................................................................................13 Methods....................................................................................................................13 Results and Discussion.............................................................................................16 Modeling of the Ligand-free Mouse Melanocortin-4 Receptor........................16 Complex Model and hAGRP(87-132)/mMC4R Interaction.............................20 3 INVESTIGATING THE LIGAND MOTION AND ACTIVITY RELATIONSHIP OF THREE LACTAM DERIVATIVES OF HAGRP(109-118) IN THE DECANE/SPC MEMBRANE MIMETIC MODEL..................................................28 Methods....................................................................................................................28 Construction of the Atomic System..................................................................28 Simulation Protocol...........................................................................................30 Results and Discussion.............................................................................................32 Decane/SPC Simulation....................................................................................32 Average Structural Properties...........................................................................34 Fluctuations of the Peptide Ligands..................................................................35 iii
4 FURTHER MODELING CALCULATIONS USING THE TEMPLATE OF HAGRP(87-132)/MC4R COMPLEX: CASE STUDY OF BICYCLIC PEPTIDES.39 Methods....................................................................................................................40 Results and Discussion.............................................................................................41 5 INTRODUCTION OF BIOLOGY OF CHANNEL FUNCTION IN GLUTAMINE-DEPENDENT AMIDOTRANSFERASES AND THEORETICAL METHODOLOGY.....................................................................................................48 Glutamine-dependent Amidotransferases and Ammonia Channel..........................48 Molecular Dynamic Methodology...........................................................................50 Classical Mechanics..........................................................................................50 Integration Algorithms......................................................................................51 Statistical Mechanics.........................................................................................54 6 EXPLORING THE STRUCTURAL AND THERMODYNAMIC PROPERTIES OF AMMONIA CHANNEL IN GLUTAMINE PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE. PART I: LOCALLY ENHANCED SAMPLING AND CAVITIES CALCULATIONS.........60 Methods....................................................................................................................60 Force Field Parameters Development...............................................................60 Locally Enhanced Sampling (LES) Protocol....................................................62 Cavities Analysis of the Channel......................................................................65 Results and Discussion.............................................................................................67 Force Field Parameters......................................................................................67 Computational Evidence for NH 3 Channel.......................................................68 Cavities and Electrostatic Properties of the Channel........................................72 L415A and F314A Mutant................................................................................74 7 EXPLORING THE STRUCTURAL AND THERMODYNAMIC PROPERTIES OF AMMONIA CHANNEL IN GLUTAMINE PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE. PART II: GIBBS FREE ENERGY CALCULATIONS.............................................................90 Methods....................................................................................................................91 Results and Discussion.............................................................................................92 APPENDIX LIST OF ABBREVIATION.......................................................................99 LIST OF REFERENCES.................................................................................................100 BIOGRAPHICAL SKETCH...........................................................................................112 iv
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy COMPUTATIONAL STUDIES OF MELANOCORTIN RECEPTOR SYSTEM AND CHANNEL FUNCTION IN GLUTAMINE-DEPENDENT AMIDOTRANSFERASES By Xiang Simon Wang August 2003 Chair: Nigel G. J. Richards Major Department: Chemistry The melanocortin receptor family consists of five subtypes (MC1R-MC5R) and belongs to the family of transmembrane G-protein coupled receptors (GPCR). The melanocortin-4 receptor, due to its direct involvement in feeding behavior, is a current drug target for the design of selective therapeutics to treat obesity. To determine the molecular recognition processes of receptor-ligand interactions as well as aid in the design of ligands, the molecular models of melanocortin receptors (MCR) were constructed based on the high-resolution crystal structure of rhodopsin. The 2D 1 H NMR structure of hAGRP (87-132) was docked to the mMC4R structure and refined. The resulted complex structure was proved to be consistent with the recent in vitro mutagenesis data and therefore useful as the template. A new membrane model, the decane/water model, was developed to approximate the membrane environment for the GPCRs. Three decapeptides/MCRs complexes, which have different activities against MCRs, were built based on the template and docked to the membrane model. Dynamic v
simulations were carried out at the nanosecond time scale. The trajectory analysis of the peptides were done and showed different motions corresponding to their activities. These studies will supply insight into the event associated with receptor-ligand recognition and activation. GPATase catalyzes the first step of de novo purine nucleotide synthesis and belongs to a family of Ntn glutamine amidotransferases. The glutaminase domain catalyzes the hydrolysis of glutamine and releases the NH 3 as the intermediate. The NH 3 is channeled to the PRTase domain as the nitrogen source for PRA synthesis. The Locally Enhanced Sampling (LES) approach was utilized and gave the direct computational evidence to the existence of the channel. Cavities and electrostatic potential calculations were carried out on the snapshots of LES trajectories to examine properties about the channel. These studies suggested that NH 3 channel acts simply as a hydrophobic â€œtubeâ€ to connect two active sites. Simulations on the leaking mutant L415A reproduced the experimental results and gave the possible mechanism behind. The PMF profiles for the migration of ammonia/ammonium ions were calculated. Consistent with the LES, the PMF of ammonia is basically level while a prohibitive energy barrier found for ammonium ion. vi
CHAPTER 1 INTRODUCTION OF BIOLOGY OF MELANOCORTIN RECEPTOR SYSTEM AND THEORETICAL METHODOLOGY Melanocortin Receptor System The melanocortin receptor family consists of five subtypes and designed as MC1R-MC5R for the order in which they were isolated (1-8). They belong to the family of seven transmembrane G-protein coupled receptors and activate adenylate cyclase and elevate intracellular 3â€™, 5â€™cyclic adenosine monophosphate (9). The centrally located melanocortin-3 and -4 receptors (MC3R, MC4R) have been identified in knockout mice to be involved in feeding behavior, obesity, metabolism, and energy homeostasis (10-12). The most well-studied melanocortin receptor ligands are for the skin melanocortin-1 receptor (MC1R), which is involved in pigmentation and animal coat coloration (13-15). The MC5R is expressed in muscle, liver, spleen, lung, brain, adipocytes, and a variety of other tissues (5, 16), and has been identified as being involved in exocrine gland function (17). The melanocortin peptides, -, -, and â€“melanocyte stimulating hormone (MSH) and adrenocorticotropic hormone (ACTH) are the endogenous agonist ligands for the melanocortin receptors and are derived by posttranslational processing of the proopiomelanocortin (POMC) gene transcript. All these melanocortin peptide agonists contain a core His-Phe-Arg-Trp tetrapeptide sequence that has been attributed to the ligand selectivity and stimulation of the melanocortin receptors (18-20) (Table 1.1). In addition, agouti protein (ASP) (21) and agouti-related protein (AGRP) (22, 23) have been 1
2 identified as the only two naturally occurring antagonists of G-protein coupled receptor (GPCR) reported to date. The agouti protein was first characterized as an antagonist of the skin MC1R and brain MC4R (1). The AGRP protein was demonstrated pharmacologically to competitively antagonize the MC3R and MC4R brain melanocortin receptors (23, 24), and when ectopically expressed, resulted in an obese phenotype. The cysteine-rich C terminus of both agouti and AGRP has been identified as possessing nanomolar antagonistic properties equipotent to the full-length peptide (25-30) (Figure 1.1). Thus, suggesting the key structural and molecular recognition features are located in this domain. Studies of the AGRP protein identified the Arg-Phe-Phe residues, conserved in both agouti and AGRP, as critical for its antagonistic properties (25, 31). The melanocortin-4 receptor, due to its direct involvement in feeding behavior, is a current drug target for the design of selective agonist therapeutics to treat obesity and the design of selective antagonists for potential treatment of anorexia. Therefore, extensive point mutageneses were performed to elucidate the molecular basis for the interaction of ligands with the MCRs (31, 32, 34-36). The conserved residues that are critical for binding were identified and two binding pockets are hypothesized (Figure 1.2). The first is a predominantly ionic pocket formed by Glu/Asp while the second is a hydrophobic pocket formed by a series of aromatic residues in TM4, -5, -6. However, existing literature is conflicting regarding the contribution of exoloops to the binding of melanocortin agonists to the MCRs (33, 37, 38). But for the natural antagonist AGRP(87-132), important MCRs structural property involved in its binding was identified to reside in extracellular loop 2 (EL2) and extracellular loop 3 (EL3) (33).
3 Unfortunately, because the MC4R is a G protein-coupled receptor (GPCR), it is unlikely that a high-resolution structure for the AGRP/MC4R complex will be obtained by X-ray crystallography in the near future. As a result, we have employed computational methods to construct a homology model of the MC1, 4R based on the high-resolution X-ray crystal structure of rhodopsin. To the best of our knowledge, this is the first study in which the MCRs have been modeled using the rhodopsin crystal structure rather than the helical packing observed in bacteriorhodopsin. These have the indispensable roles in determing the molecular recognition processes of ligand-receptor interactions as well as the receptor activation process, potentially assist antiobesity and antianorexia drug design. Homology Modeling Methodology The primary sequences of more than a half-a-million proteins have been provided by the various genome projects. The gap between sequence determination and structure determination continues to grow despite significant developments in crystallography and NMR. Homology modeling, also referred to as knowledge-based modeling, has been proved as an important approach that is reasonably accurate and forms a rational basis for explaining experimental observations and de novo ligand design. Homology is precisely defined: having a common evolutionary origin (49, 50). The phrase "homology modeling" means deriving the 3D structure of an unknown based on the known coordinates of a protein to which it is homologous. The first knowledge-based modeling studies were carried out in the late 1960s before interactive computer graphics were exploited (59, 60). Since the mid 1980s a large number of protein models have been built using rule-based protocols and reported in the literature. Such modeling techniques fall into two classes:
4 The assembly of rigid fragments from homologous and other known protein structures The use of restraints such as spatial distances to build models that have the best agreement with homologues of known structure Fragment-based Algorithm Jones and Thirup first demonstrated that a protein structure could be built from a combination of segments from other proteins (61). The subsequent studies by Unger et al. (62), Claessens et al. (63), and Levitt (64) supported it and led to the fragment-based modeling software COMPOSER developed by Blundell and co-workers (65, 66). Structurally conserved regions (SCRs) modeling Figure 1.3 shows a typical flow chart for protein homology modeling which is fragment-based. The first step is to identify the homologues of the known three-dimensional structure and several computerized search methods are available to assist this procedure. The next critical step is the alignment of the unknown sequence with the homologues. Many algorithms are available for sequence alignment, including the dynamic algorithm of Needleman and Wunsch (67), FASTA, Smith-Waterman and BLASTP. Sometimes the most perplexing task when performing an alignment is to consider the following factors: Which algorithm to use for alignment. Which scoring method to apply. Whether and how to assign gap penalties. All scoring algorithms typically involve construction of the 210 pairs of scores as a 20x20 matrix of similarities in which identical amino acids and those of similar character (i.e., conservative substitutions) may be scored higher than those of different character. The most commonly used scoring schemes are the following:
5 Identity: the simplest one, considering only identical residues Genetic Code: considers the number of base changes in DNA/RNA to interconvert the codons for the amino acids Chemical Similarity: considers the physical-chemical properties with greater weight given to alignment of similar properties, developed by McLachlan (68) Observed Substitutions: considers substitution frequencies seen in alignments of sequences. Among the above, the substitution schemes are generally considered to be the best methods for scoring alignments. Two widely used substitution matrices are the PAM and BLOSUM matrices. PAM matrices. Dayhoff and co-workers (51-53) developed this method during analysis of the evolution of proteins. The Dayhoff mutation probability matrix shows the probability of one amino acid mutating to a second amino acid within a particular evolutionary time. The scoring schemes are named PAM (percentage of acceptable point mutations) followed by a number. BLOSUM matrices. Henikoff and Henikoff (54) extended Dayhoffâ€™s approach by developing substitution matrices using local multiple alignments of more distantly related sequences. A database was constructed that contained multiple alignments (without gaps) of short regions of related sequences. These sequences were clustered into groups (blocks) based on their similarity at some threshold value of percentage identity. Blocks substitution matrices (BLOSUM) were derived based on substitution frequencies for all pairs of amino acids within a group. Different matrices are obtained by varying the clustering threshold. After the known structures are aligned, the structurally conserved regions (SCRs) are identified and used as the weighted mean structure, or framework, to construct these
6 regions of the proteins. From the studies on other homologues, we know that the SCRs are often secondary structure elements, such as helices and extended strands, and ligand binding sites. Many sophisticated superposition methods have been proposed (69). The coordinates of the main chain atoms of the SCRs of the unknown protein are generated in a straightforward manner, compared to the structurally variable regions (SVRs). Sometime the contributions of each homologue are best weighted, based on the extent of similarity. Structurally variable regions (SVRs) modeling The structurally variable regions usually lie on the surface of the proteins and form the loops where the main chain turns. They accommodate most mutations, deletions and insertions and are particularly difficult to model. However, loop regions are often involved in functional specificities, recognition process and building of the active sites. Hence, the accurate modeling of loops is crucial in homology modeling. Currently there are two major methods to build the loops: database search techniques and ab initio methods. The first method is widely used and applicable to loops shorter than 7 residues. The search through the crystallographic database uses two criteria: geometry and sequence similarity. The query can be defined as the residues of the loop and end regions that are anchored to secondary structure elements. The selection of the correct conformer can be improved by considering the R. M. S. D. difference in the anchor regions and the sequence similarity between the database entries and the loop to be modeled. The candidate loops can be ranked by matching them with structural templates (70, 71). The templates were selected based on the amino acid substitutions that are compatible with the local structural environment for each amino acid.
7 Side chain modeling The correct orientation of side chains is important in the packing of amino acids. Several factors must be considered: preferred torsion angles, steric clash, close packing, disulfide cross-links, hydrogen bonds and electrostatic interactions. Side chain coordinates could be copied if the sequence of the unknown is identical or very similar to that of the homologue. Otherwise, rotamer libraries can be used in a systematic approach to define the possible side chain conformations. Rotamer libraries have been constructed by indicating the preferred side chain torsions for all amino acids (72) or to correlate (, ) values with side chain torsion angle probabilities (73). It has been confirmed that the side chain torsion angles, N-C -C -C ( 1 ), C -C -C -C ( 2 ) etc., of amino acid residues in proteins prefer values that correspond to the three staggered orientations, namely, +60 (gauche minus, g-), -60 (gauche plus, g+) and 180 (trans, t). McGregor et al. also showed the preferred values depended on the secondary structure of the main chain (74). Hence, the side chains could be built by the well-defined rules and subjected to energetic criteria afterward. Refinement and evaluation of the model Energy minimization and molecular dynamics (MD) methods are usually used to refine the model. Monte carlo (MC) and other conformational search techniques are occasionally employed to explore the conformational space. Energy minimization is necessary to relieve steric contacts and correct bad geometry in the model, especially in the anchor regions where a loop is melded to the main chain of the protein. However, energy minimization will not significantly alter the structure. Molecular dynamics is valuable to overcome the potential energy barriers and locate energy minima.
8 An empirical force field is used to represent potential energy of a molecule as a function of geometric variables. It is an approximation of the Born-Oppenheimer type in that nuclei and electrons are lumped into atom-like particles. However, the electronic system is included implicitly by correct choice of parameters. The function and parameters that define the potential energy surface in molecular mechanics are referred to collectively as a force field. There are two major sets of force fields: Class I and Class II. Class I force fields include CHARMM (39), AMBER (40), OPLS (41) and ECEPP (42). Force fields like CFF (43), MM3 (44), MMFF94 (45) and DREIDING (46) are in Class II. Here is an example of Class I potential energy function (CHARMM): Intramolecular: torsionsanglesbondsbnKKbbKcos12020 BradleyUreyUBimpropersrrKK20,3,13,120 Intermolecular: ticselectrostaVDWijijijijijijjirRrRrqq6min,12min,2 One the latest development of force fields is the polarizable force field (47, 48). It is non-additive and includes explicit term in the function to treat polarization of charge distribution caused by the environment. An assessment of likely errors is required for every homology model. The analysis of the structural features of the three-dimensional model is based on the terms known about the protein structure in general. This can be performed at three levels of structural organization, namely, the overall fold, localized regions and stereochemical parameters. The criteria for analysis of correctness can include:
9 main chain conformations in acceptable regions of the Ramachandran map peptide bond planarity, bond lengths, bond angles no bad atom-atom contacts Z-score: sEEZpp Z p = Z-score of an amino acid sequence E p = energy of the amino acid sequence E = average energy of all fragments s = standard deviation side chain conformations that correspond to those in the rotamer library hydrogen-bonding of polar atoms if they are buried proper environments for hydrophobic and hydrophilic residues Restraint-based Algorithm Although homology modeling based on the fragment-based algorithm has proved remarkably successful, restraint-based algorithm remains a useful alternative. It is especially true where the homologues are distantly related. MODELLER (56) (http://salilab.org/modeller/modeller.html) developed by Sali and Blundell arrives at a three-dimensional model by optimally satisfying spatial restraints (57, 58). These restraints, which are expressed as probability density function (pdfs), include structural properties at residue positions and relationships between residues. MODELLER minimizes the objective function, F, with respect to the Cartesian coordinates of the protein atoms iiipfScRF, where R are the Cartesian coordinates of the atoms, c is a restraint dependant on f,p, f is a geometric feature of a molecule and include the distance, angle and dihedral values and p are parameters to help describe some restraints.
10 81 132 Figure 1.1 Primary sequences of the mouse agouti (ASP), human agouti-related protein (AGRP). Figure 1.2 Illustration of the putative ligand DPhe-Arg-Trp amino acids interactions with the mouse MC1R. Table 1.1 Primary sequence of POMC-derived melanocortin peptides with the core "His Phe-Arg-Trp" sequence emphasized. Name Sequence ACTH (1-39) NH 2 -SYSMEHFRWGKPVGKKRRPVKVYPNGAEDESAEAFPLEF-OH -MSH Ac-SYSMEHFRWGKPV-NH 2 -MSH NH 2 -AEKKDEGPYRMEHFRWGSPPKE-OH -MSH NH 2 -YVMGHFRWDRF-OH
11 Figure 1.3 A flow chart for the comparative modeling procedure, COMPOSER. Various steps in COMPOSER and the tools used are indicated (69).
CHAPTER 2 HOMOLOGY MODELING OF THE THREE-DIMENSIONAL STRUCTURE OF MELANOCORTIN RECEPTORS (MCR) AND THE HAGRP(87-132)/MC4R COMPLEX Methods The initial receptor model was derived from the C coordinate template of rhodopsin provided by Baldwin (81). After the publication of the high-resolution 2.8 crystal structure of rhodopsin (82) (PDB code: 1F88), we rebuilt our receptor model based on this crystal structure. The set of sequences of rhodopsin and melanocortin receptor was searched and retrieved from SRS 6.06. All fragments and duplicates among the sequences were discarded. The selected 42 rhodopsin sequences and 34 melanocortin receptor sequences plus the new mouse melanocortin-4 receptor (mMC4R) sequence were aligned using the program ClustalW 1.81 (89) with the default parameters. The resulting multiple sequence alignment was manually edited within SeqLab of GCG Wisconsin Package. Mutations and loop building were performed within Biopolymer module implemented in SYBYL 6.5 (Tripos Inc.). Secondary structure predictions of the loop regions were carried out using GOR 4.0 method (90) as implemented in Biology Work Bench 3.2. The N(1-31) and Cterminal (311) of mMC4R were then deleted and capped with acetylated N-terminus and N-methylamide C-terminus. Hydrogen atoms were added using the HBUILD module within CHARMM 25b2 (83, 84). The detailed structures of transmembrane (TM) helices were checked and repaired wherever necessary using SYBYL and CHARMM. 13
14 The models were subjected to 1000 cycles of steepest descent followed by adopted basis Newton-Raphson algorithms until convergence of 0.01 kcal/mol r.m.s deviation. A 1.0 kcal/mol 2 positional constraint was applied to the receptor backbone atoms. The receptor was then placed in a restrained water droplet model (76) for further annealing refinement on the extracellular loop domain. The loops were fully solvated in a sphere of equilibrated TIP3P water (88) with a radius of 35 and free to move in the simulation. Restraints were applied to the transmembrane domain fixing all atoms except the side chains of the last one extracellular turn of the transmembrane helices. The starting structures for the simulated annealing studies were obtained from minimization, with additional positional restraints imposed on water oxygen atoms. The minimized structure was further refined using the following simulated annealing protocol: the system was gradually heated to 1500 K over a period of 30 ps with T = 5 K every 0.1 ps, followed by 10 ps of equilibration at 1500 K. Ten structures were extracted from the trajectory at 1500 K by sampling every 1 ps. Each structure was cooled to 300 K gradually with three additional 10 ps constant temperature simulation at 700 K, 500 K and 300 K. The restraints applied to the transmembrane domain were reduced gradually from 10 kcal/mol 2 to 2 kcal/mol 2 during the annealing process. The initial positional restraints of 0.1 kcal/mol 2 imposed on water oxygen atoms were decreased gradually to zero from 1500 to 500 K, then spherical harmonic restraints acted on the oxygen atoms of waters in the outer 5 shell of the solvation sphere instead. The force constant of the Miscelaneous Mean Field Potential (MMFP) was K R = 0.002 kcal/mol 2 from 500 to 400 K and K R = 0.0007 kcal/mol 2 from 400 to 300 K. The structures obtained at the end of 300 K were fully minimized and then analyzed based on the energy criteria and the pairwise root
15 mean-square deviation analysis implemented in Insight II 98.0 package (Biosym/MSI, San Diego). One of these structures was equilibrated at 300 K for 100 ps, and the receptor structure averaged over 62 ps to 80 ps was fully minimized until convergence. As control, similar procedures were also performed on the same protein system in vacuum, with a distance-dependent dielectric constant = 4r. The coordinates of hAGRP(87-132) were taken from the 2D 1 H NMR study reported by Bolin et al. (77) (PDB code: 1QU8). The peptide was manually docked to the minimized receptor structure via interactive graphics using the Docking module in Insight II. The critical portion of hAGRP RFF(111-113) (78) was found to fit to the putative binding sites (79, 80) of mMC4R very well. The resulting ligand-receptor complex was then soaked into the previous restrained water droplet model and the MMFP potential applied. The scope of restraints on the transmembrane domain was lowered to start from the third extracellular turn below the putative binding site. The recent in vitro mutagenesis data of mMC4R (80) were converted into distance constraints and employed in the subsequent minimization and dynamics. The whole system was energy minimized and equilibrated for 50 ps with positional restraints applied on the transmembrane domain and ligand backbone atoms. During another 50 ps running, the distance constraints were gradually reduced to zero and position restraints were reduced to 0.5 kcal/mol 2 and zero on the transmembrane domain and ligand respectively. Finally, the system was allowed to freely evolved during 200 ps of MD at 300 K. The molecular simulations were performed with the CHARMM 25b2 package using the param27 parameter set (85-87). Molecular dynamics was carried out using the Verlet leapfrog algorithm with a distance-dependent dielectric constant = r. Long-range
16 electrostatic forces were cut off at 13 by means of a shifting function, and Van der Waals forces were cut off between 9 and 13 by means of a switching function. The nonbonded list was kept to 14 and updated every 10 steps. A step size of 1 fs was used, and the velocities were scaled every 50 steps to keep the temperature within 10 of 300 K during the equilibration. All bonds containing hydrogens were held rigid by the SHAKE algorithm (75). The calculations were performed on a Silicon Graphics Onyx2 supercomputer. Results and Discussion Modeling of the Ligand-free Mouse Melanocortin-4 Receptor A 2-D representation of the mMC4R is given in Figure 2.1. The transmembrane helices boundaries are defined by the multiple sequence alignment of melanocortin receptors and bovine rhodopsin template. The lengths of the seven transmembrane helices and the location of highly conserved amino acids are expected to be nearly the same for most of the Class A family members. The extracellular loop II of Melanocortin receptor is extremely short, having no Cys in this loop and TM3, as also proved by the GPCR sequence analysis result of Baldwin (81, 93). The rhodopsin TM1, TM4, TM5, TM6 and TM7 are bent at Pro residues, although it is not significant in TM1 and causes distortion only around the extracellular end of TM4. The melanocortin receptor family lacks the conserved proline residue at the corresponding kink region of TM1, TM4 and TM5. The kink region of mMC4R TM1 Ile49~Leu52 and TM4 Ser172~Leu175 were fixed by assigning the standard helix Phi and Psi value within Sybyl Biopolymer module, while keeping the TM5 kink region at Met196 unchanged. The helices structures were checked carefully and several i~i+4 backbone broken hydrogen bonds were fixed by energy minimization with distance constraints applied on i~i+4 hydrogen bond distance in
17 CHARMM. These regions include TM7 Tyr279, whose corresponding residues in rhodopsin are conserved Pro291, which alter the normal helix backbone hydrogen bond network but do not cause the kink. The disposition of the polar and ionizable residues in each helix was checked and most of the conserved polar residues were found to be pointing inside the helix bundle. The exception, TM1 Glu41, which is conserved in MC2, 3, 4, 5R, was redirected inside and found close enough for the interaction with conserved residue Thr45. Their interaction was maintained after energy minimization. Another exception is TM2 Glu92, which is reasonable because of its direct contact with the ligand (92). The basic residues at the intracellular end of TM4, Arg156 and Arg157, were directed toward the cytoplasmic boundaries where these Arg/Lys residues are predicted to interact with the phospholipid head groups (92). The interhelical hydrogen bond network features of bovine rhodopsin crystal structure were well maintained in the final structure of mMC4R. Helices TM1, TM2 and TM7 were linked through interaction among Asn54, Asp82 and the peptide carbonyl of Ala287. The crystal water inside mediates a contact among TM2, TM3 and TM7 through Asp82, Ser124 and Asp290. In addition, the Asp138 and Arg139 side chains of the (D/E)R(Y/W) motif at TM3 form a strong salt bridge. Important residues Asn78, Glu122, Met163, His211 and Thr251 in rhodopsin structure are not conserved in melanocortin receptor family. The landmark accomplishment of the crystal structure of bovine rhodopsin provides an opportunity to review and reconsider previous attempts to infer, through indirect means, the structure of GPCRs. It also offers a structural template for other GPCRs , including the assignment of secondary structural elements and the location of highly conserved amino acids. Ballesteros et al. (91) recently examined the conserved
18 and nonconserved motifs of the transmembrane domain of rhodopsin crystal structure, compared to other class A receptors, especially amine GPCRs. Melanocortin receptors belong to peptide GPCRs of class A receptors. Most class A receptors do not have a Pro at position 1.48, while Gly1.49 is highly conserved in amine receptors. Because of the flexibility induced by Gly, amine receptors might have a similar kink structure at TM1 (91). The sequence alignment of melanocortin receptors showed no conserved Gly or Thr around this region. Thus it may lead to divergent folds on this domain. TM4 -helix, the shortest membrane-spaning helix in rhodopsin as well as other GPCRs is disrupted by a Pro-Pro motif at position 4.59 followed by a short helical turn. The lack of conservation pattern in GPCRs around this motif suggests the structure might not be conserved. However, certain amine receptors with only a single Pro at 4.59 show the presence of a consistent helical turn at this domain, suggesting that such a motif might be conserved despite the absence of significant sequence conservation (94). Given that there is no presence of Pro-Pro, or Pro-X-Pro motif in Melanocortin receptor sequences, the overall orientation of TM4 may be different from rhodopsin. Despite the lack of conservation of Pro5.50 at TM5, melanocortin receptors are expected to share the same fold as rhodopsin because of the highly conserved profile of Pro5.50 among GPCR family (81, 93). For peptide ligands, there is more evidence that the extracellular loops contribute to the binding and receptor activation (95-98). We extended our receptor model to include the intraand extracellular loops and applied simulated annealing to explore the possible conformations of the ELs. The loop PDB database search result and the further GOR 4.0 secondary structure analysis result are summarized in Table 2.1. The secondary structure analysis was found in good accord with the loop search result. To select the
19 representative structures among the many thermodynamically possible, we conducted simulated annealing on the ELs domain within a restrained water droplet model. By introducing a minimum number of restraints and shell thickness of 5 , this solvation model enables the water sphere model to reproduce structural and dynamic properties of water in the temperatures of interest and to prevent the water molecules from evaporating in simulations at high temperatures. Unlike Periodic Boundary Condition (PBC), it yields a single, continuous trajectory for the entire simulation, without the need of continuous adjustment of the box size. The overall architecture of the average-minimized structure after annealing is shown in Figure 2.2. The highly conserved EL1 loop, comprising acidic motif Asp-X-Asp, is arranged around the edge of the cavity formed by TM helices 1, 2, 3 and 7. This arrangement is particularly striking, since Gantz et al. have already identified a putative binding pocket for both MC4R agonists and antagonists, which is located among TMs 1, 2, 3 and 7 (80, 99-101). Viewed from the top, this particular configuration of EL1 is clearly able to control access. The second extracellular loop (EL2) arches between TM helices 4 and 5. The EL3 loop forms a flat span between the ends of TM6 and TM7, without any specific folding pattern. Current concepts regarding the molecular interaction of small peptide ligand and GPCRs suggest that a peptide like hAGRP(87-132) binds to both the transmembrane core and exoloops (102). However, there exists controversy over the contribution of exoloops to the binding of ligands to the MCRs in the literature (103-106). Our current model appears to contradict the result of cassette substitutions experiment by Yang et al. (103, 111). They concluded that EL2 and EL3 of the hMCRs are important for AGRP binding. A comparison of their chimeric receptor model with our current model derived from
20 rhodopsin crystal structure indicates that the definitions of the loop boundaries are different. The loops in their model span 3-11 residues longer than ours, thus extending to several turns of the extracellular TM ends. In this regard, it is worthy to note that the possibility of small, localized conformational changes exists. In addition, Figure 2.3 shows the highly conserved pattern of EL1 sequences within individual subtypes of MCRs. In contrast, the EL2 and EL3 sequences are more conserved for the entire MCRs family. Given the fact of basic property of hAGRP(87-132), it is reasonable to speculate that EL1 contributes to the MCRs subtypes selectivity. The conserved pattern of Asp103, Asp105 in EL1 of MC4Rs switches to conserved Arg107 in MC1Rs, which makes the ion pair interaction between hAGRP(87-132) and EL1 unlikely. Schioth et al. reported that the N-terminal of the human MC1, MC3, MC4 and MC5 receptors do not play an important role for the ligand binding (107). In addition, the N-terminal of the MCRs is not long. Therefore this region was not included in the current model. Complex Model and hAGRP(87-132)/mMC4R Interaction The refined averaged structure of hAGRP(87-132) bound to the mouse receptor is illustrated in Figure 2.2. mMC4R accommodates Arg111 of hAGRP(87-132) in a negatively charged pocket formed by Glu92, Asp114 and Asp118. The acidic side chains form a strong and stabilizing ionic bridge with the guanidinium moiety of Arg111. In addition, these interactions were well maintained during the freely evolved 200 ps simulation. The results are consistent with the experimental data (79). Point mutations of these residues E92K, D114R and D118K resulted in >1000 fold loss in binding affinity and showed no antagonist activity, suggesting that ion bridges among these residues are the major determinant for binding. The electrostatic surface potential was calculated and displayed with the GRASP program (108). A close contact between the negative
21 potential region corresponding to the formal negative charge of Glu92, Asp114 and Asp118 and the crucial residue Arg111 is present. A cluster of Phe residues within the cavity formed by TM4, TM5 and TM6 define an aromatic binding pocket as shown in Figure 2.2. Phe112 and Phe113 of hAGRP(87-132) are found to be accommodated into this binding pocket formed by Phe176, Phe193, Phe253 and Phe254. Single point mutations of Phe176, Phe193 and Phe254 to residue Ser or Lys in mMC4R decreased hAGRP(87-132) affinity 4-7 fold (79). Furthermore, a F253S mutation in TM6 resulted a 1000-fold shift in binding, suggesting that aromaticity at Phe253 as well as Phe176, Phe193 and Phe254 is crucial for binding. Our model provides molecular insight into these mutations. While Tyr179 at EL2 appears not to directly contact Phe112 and Phe113, its proximity to the aromatic pocket may infer indirect participation in stabilization of the aromatic cluster. As expected, Phe259 is located above the aromatic binding pocket and does not directly contribute to the hAGRP(87-132) binding. It provides additional evidence that the agonist activity of the antagonists SHU9119 and SHU9005 at the F254S and F259S mutant receptors is only specific for the melanocortin-based peptide antagonists, not for hAGRP (79). It has been proposed by several laboratories that the conserved Arg-Phe-Phe motif in AGRP is critical for the antagonistic and molecular recognition at the melanocortin receptors (109, 110). The hAGRP-derived decapeptide Yc[CRFFNAFC]Y retained the antagonistic activity at MC3R and MC4R, and interestingly, showed agonist activity at the MC1R. Additional hAGRP(109 118) based decapeptides have been synthesized and a novel MC1R antagonist has been reported (113). Structure-function studies indicate that the agonist or antagonist activities can be switched only by small and local structure
22 changes (79, 113), which lead to different conformation families. Importantly, our current receptor model would provide molecular basis for such changes. A different conformational space is obtainable for the Yc[CRFFNAFC]Y decapeptide compared to when this compound is incorporated into the 46-amino acid C-terminal domain of AGRP. We might speculate that it is more likely to happen when the decapeptide ligand binds to receptor because of the more complex environment within receptors. Efforts along these lines are currently underway in our laboratory. The links between the structure features of the receptor-bound conformation and the flexibility of ligand conformations will supply a great deal of insight into the event associated with ligand recognition and activation.
23 (A) (B) Figure 2.1 Panel A: Schematic diagram of the mouse Melanocortin 4-receptor. The horizontal black lines represent the approximate membrane boundaries. Black circles with white text indicate the amino acids involved in the hAGRP(83-132) binding site. The arabic numbers indicate the position of the residues inside the TM domain. EL = extracellular loop, IL = intracellular loop. Panel B: mMC4R receptor model (TM helical bundle and extracellular domain) viewed along the helical axes from the extracellular end. TM helices are colored green and extracellular loops are colored yellow. Side chains of acidic residues in EL1 are highlighted.
24 (A) (B) (C) Figure 2.2 Model of the AGRP/MC4R complex. Panel A: Structural model of the AGRP/MC4R structure, visualized using CAChe V5.0, showing that the protein interacts with both the extracellular loops and an intramembrane binding site. AGRP is colored by atom type: C, black; O, red; N, blue; H, white; S, yellow. TM helical domains: I, red; II, yellow; III, green; IV, purple; V, cyan; VI, orange; VII, black. Loops are colored blue. Panel B: Illustration of the hydrophilic binding pocket for AGRP(87-132) in mMC4R. The receptor and ligand are depicted as ribbons in green and red, respectively. The side chains of the important residues involved in the ionic bridge are highlighted and labeled. Panel C: Illustration of the hydrophobic binding pocket for AGRP(87-132) in mMC4R. The receptor and ligand are depicted as ribbons in green and red, respectively. The side chains of the important residues involved in the aromatic interaction are highlighted and labeled.
OPSD_BOVIN .....YFV.........VGWSRYIPEGMQCSCGIDYYTPHEETN......Q.GSDF.... MC1R_HUMAN ...AGALVARAA.....A........................YY.....PEHPTCGC... MC1R_MOUSE ...VGILVARVA.....T........................YY.....PQHPTCSC... MC1R_ALCAA ...AGALAARAA.....T........................YY.....PQHPTCGC... MC1R_BOVIN ...AGVLATQAA.....T........................YY.....PQHPTCGC... MC1R_CANIS ...AGALAAQAA.....A........................YY.....PQHPICGC... MC1R_CAPCA ...AGALAARAA.....T........................YY.....PQHPTCGC... MC1R_CAPHI ...AGVLATRAA.....T........................YY.....PQHPTCGC... MC1R_CEREL ...AGALAARAA.....T........................YY.....PQHPTCGC... MC1R_CHICK ...HGVLVIRAS.....T........................YY.....PTNPFCTC... MC1R_DAMDA ...AGALAAQAA.....T........................YY.....PQHPTCGC... MC1R_HORSE ...AGVLATQAS.....A........................YY.....PQHPTCGC... MC1R_OVIMO ...AGVLATQAA.....T........................YY.....PQHPTCGC... MC1R_RANTA ...AGALAARAA.....T........................YY.....PQHPTCGC... MC1R_SHEEP ...AGVLATRAA.....T........................YY.....PQHPTCGC... MC1R_VULVU ...AGALAAQAA.....A........................YY.....PQHPICGC... MC4R_HUMAN ....STDTDAQS.....I........................YS.....PQNPYCVC... MC4R_MOUSE ....STDTDAQS.....I........................YS.....PQNPYCVC... MC4R_RAT ....STDTDAQS.....I........................YS.....PQNPYCVC... 25 MC4R_CHICK ....NTDTDAQS.....I........................YS.....PYNPYCVC... EL1 EL2 EL3 Figure 2.3 Multiple sequence alignment of extracellular loop domains of melanocortin 1and 4-receptor family with bovine rhodopsin. Sequences are given in amino acid one-letter code. Conserved residues within MC1R family are shown in red. Conserved residues within MC4R family are shown in blue and italic. Residues conserved within both MC1R and MC4R families are shown in bold.
Table 2.1 Secondary structure prediction and sequence homology analyses of the extracellular and intracellular loop region of the mouse melanocortin-4 receptor. 26 mMC4R loop Protein template b Residue range Sequence a and prediction c Fit_RMS d Homology EL1 mMC4R 1ALO 101 108 323 330 STDTDAQS CCCCCEEC EIHPGTPN CCCCCEEC 0.5847 69.23 EL2 mMC4R 2HPD 178 180 196 198 IYS CEE PAY CEE 0.3348 16.67 EL3 mMC4R 1RHD 264 271 171 178 PQNPYCVC CCCCEEEC LESKRFQL CCCCEEEC 0.2784 29.81 IL1 mMC4R 1ACC 65 68 190 193 NKNLH CEEEC EGYTV CCEEC 0.4479 34.88 IL2 mMC4R 1TOF 145 153 53 61 YALQYHNIM CCCCCCCEE YAGKVIFLK CCCEEEEEC 0.2255 49.47 IL3 mMC4R 1DOI 208 229 93 114 FLMARLHIKR IAVLPGTGTI RQ CCCCCCCEEE EEECCCCCEE EC VEDKNVRLTC IGSPDADEVK IV CCCCCEEEEE ECCCCCCEEE EC 0.5186 46.93 a Sequences are shown using the amino acid one-letter code. b Protein templates are identified by their PDB entry names. c Secondary structure prediction obtained by using the method of GOR 4.0 (90). Alpha Helix = H, Beta Sheet = E, Random Coil = C. d r. m.s. d. of the anchor region between receptor loop and template.
27 Table 2.2 Putative hAGRP(87-132) amino acid intramolecular and intermolecular interactions with the mMC4 receptor. hAGRP(87-132) amino acid mMC4R/hAGRP residue Arg89 Backbone of Cys271 (EL3 adjacent to TM7 initiation) hAGRPGlu92 His91 Tyr33 (TM1 initiation adjacent to N-termini) Backbone C=O of Asp103 (EL1) Thr104 (EL1) Glu92 hAGRPArg89 Ser93 Backbone of Cys271 (EL3 adjacent to TM7 initiation) Gln97 Asp181 (TM5 initiation adjacent to EL2 end) hAGRPAsn114 Gln98 Asn266 (EL3) Tyr109 Asp114 (TM3) Arg111 Glu92 (TM2) Asp114 (TM3) Asp118 (TM3) Phe112 Phe176 (end of TM4 adjacent to EL2 initiation) Phe253 (TM6) Phe113 Phe176 (end of TM4 adjacent to EL2 initiation) Phe193 (TM5) Phe253 (TM6) Phe254 (TM6) Asn114 Asp181 (TM5 initiation adjacent to EL2 end) hAGRPGln97 Phe116 hAGRPTyr118 Tyr118 Asp105 (EL1) Gln107 (EL1) hAGRPPhe116 Arg120 Asp105 (EL1) Asp114 (TM3) Lys121 Asp103 (EL1) hAGRPAsp103 Thr124 Thr110 (end of EL1 adjacent to TM3 initiation) Arg131 Backbone of Ala106 (EL1) Thr132 Thr104 (EL1) Unless otherwise noted, the putative ligand-receptor interactions are with the mMC4 receptor side chain to side chain.
CHAPTER 3 INVESTIGATING THE LIGAND MOTION AND ACTIVITY RELATIONSHIP OF THREE LACTAM DERIVATIVES OF HAGRP(109-118) IN THE DECANE/SPC MEMBRANE MIMETIC MODEL The interactions between protein and membranes are of fundamental importance for a variety of physiological process. Current Molecular Dynamic methodology has reached the phase where one can generate trajectories of realistic atomic models of complex biological systems. In recent years, the approach has supplied a great wealth of information about ion channels, GPCRs and other transmembrane proteins. In this chapter we have obtained dynamics trajectories of detailed atomic models of the decapeptides/ MC1, 4R complex embedded in a fully solvated decane membrane model. Our aim with these calculations is to explore the receptor-bound dynamic behaviors of the three peptides. The microscopic fluctuations of the transmembrane GPCRs as well as other factors cannot easily be accessed experimentally. Long time-scale dynamics simulation will complement the information that is currently not available, and ultimately help to refine our understanding of GPCRs. Methods Construction of the Atomic System The simulation system consists of an atomic model of decapeptides/ MC1, 4R complex embedded in Decane/SPC membrane mimetic structure. The three-dimensional atomic coordinates of decapeptides/ MC1, 4R complex were taken from the previous hAGRP(87-132)/MC4R template. The MC1R complex model were constructed by superimposing the coordinates of MC1R to the template of hAGRP(87-132)/MC4R. The 28
29 residues that comprise the binding pockets in MC1R and MC4R are similar and the conserved interactions within were maintained. The structural modifications on the decapeptide ligands were done in Sybyl 6.5 (Tripos, Inc.). The amino acid sequences outside the core Yc[CRFFNAFC]Y were deleted and the -NH 3 + protonated ending group were assumed for residue Tyr1. The Cys2-Cys9 disulfide bonds were manually edited to the lactam bridges based on the various modes of connection, which lead to different sizes of ring structure. The differences of peptide 1 (26-membered ring), peptide 2 (27-membered ring), and peptide 3 (25-membered ring) are illustrated by the location of the Asp or Glu side chain (circled) in relationship to the lactam bridge (Figure 3.1). The localized lactam regions were subjected to annealing protocols to relax any steric tensions. GROMACS residue topologies for three decapeptides were written by using the PRODRG server (118). The whole structure of the decapeptide was treated as a special residue by the PRODRG server and its topology was output according to coordinates of decapeptides. In this way the difficulties in defining the lactam bridge in peptide and introducing of unnatural amino acid, diaminopropionic acid (DPR), to GROMACS topologies were avoided. However, the GROMACS topologies provided by the PRODRG server used old version of GROMACS force field. Some parameters were found to be different from the one currently used by GROMACS 3.0.5. Therefore, the parameters portion in the topologies was discarded and the upper portions were converted to individual RTF entries. Atom types in the topologies were checked carefully to be consistent with the current version of force field. The lactam bridge was treated as the regular peptidyl bond and missing parameters were added based on the ones with similar chemical properties.
30 A general protocol was used to build the Decane/SPC membrane mimetic model (115-117). The decane lipids were taken from a library of pre-equilibrated decane molecules (114) and replicated to a box of 4nm * 9nm * 9nm. Three layers of decane lipids in the box are to simulate the effective packing of various lipids molecules, such as DPPC, POPC and DMPC. The inter-layers distance is set to 0.3nm based on the Van der Waal radii of carbon atoms. The SPC water layers were created by GROMACS and stacked a pre-equilibrated water system below and above the decane lipid layer. This resulted in a three-layer simulation cell at the size of 10nm * 9nm * 9nm, with 867 decanes and 15627 waters. Simulation Protocol The Decane/SPC system was first minimized with positional harmonic restraints applied on alkanes carbons and SPC oxygen atoms. The steepest descent algorithm was used for this stage. It was followed by a conjugated gradient algorithm and an equilibration period of 1 ns. The harmonic restraints were retained only on alkanes carbons at the second phase of minimization and gradually reduced to zero during the first 200 ps of equilibration. The purpose of the final stage of equilibration is to relax the membrane lipids and the water layers around. Progressively decreasing harmonic restraints were applied to ensure a smooth equilibration of the system. A snapshot taken near the end of the 1 ns trajectory was used as the representative configuration of Decane/SPC tri-layer system. Six decapeptides/MCRs complex were docked to the Decane/SPC boxes and the overlapped solvent molecules were removed based on the Van der Waals radii database read by the program. The entire system is required to be electrically neutral. Different numbers of Cl anions were inserted into the positions with the most favorable electrostatic potential calculated by the normal
31 GROMACS particle based methods. The overlapped solvent molecule was replaced and the potential was recalculated after every ion insertion. For decapeptides/MC4R systems, 2 Cl anions were required to make the entire system neutral and 11 Cl anions are needed for decapeptides/MC1R systems. All the complexes were minimized prior to dockings and the entire complex/membrane systems were subjected to the similar protocols as in the equilibration phase. The positional harmonic restraints were applied to heavy atoms of proteins instead and gradually reduced to zero at the first 200 ps. The entire systems were allowed to evolve freely for 1 ns. Different starting configurations of the systems and assignments of initial velocity were considered. Several control MD trajectories were generated. In GROMACS, velocities were generated according to a Maxwell distribution at certain temperature with random seed number specified. The GROMACS force field was selected as the potential function. Compared to GROMOS96 force field, it behaves better for long alkanes and lipids. Ryckaert-Bellemans dihedral potential was used to deal with long alkanes in GROMACS force field with the exclusion of 1-4 interactions in alkane atoms (114). In addition, the GROMACS force field has more selection of residual topologies. The SPC potential was used as water model. The electrostatic interactions were computed using the particle mesh ewald (PME) algorithm. The interpolation order is 4 and the maximum grid spacing for the FFT is 0.12 nm. The Coulomb cut-off, Lennard-Jones cut-off and cut-off for the short-range neighbor list were all set to 0.9 nm. All the molecular dynamics simulations use the leap-frog algorithm and were at constant pressure and temperature (NPT). The Berendsen thermostat was used for temperature coupling with the reference temperature
32 of 300K and time constant of 0.1 ps. Decane lipid, water and protein were separately coupled. Exponential relaxation pressure coupling was used with time constant of 1.0 ps. The pressure was isotropic in x, y and z directions and kept at 1 bar. The compressibility at the reference pressure was set as 4.5e-5 bar -1 . Periodic boundary conditions were applied in all three spatial dimensions. The SHAKE algorithm was used to fix the length of all bonds involving hydrogen atoms and the time step was 2 fs. Results and Discussion Decane/SPC Simulation During the dynamic trajectories of decane/SPC and decane/SPC/protein simulations, the structure of decane/SPC membrane remained very stable (Figure 3.2). The definitions of the SPC/decane/SPC tri-layer were clear and the energies for decane, SPC and the solvated protein were steady. These three groups were coupled to thermostats separately and no deviation of temperature was observed. Ryckaert-Bellemans potential was used for the dihedrals CH 2 CH 2 CH 2 CH 2 and CH 2 CH 2 CH 2 CH 3 instead of the regular periodic function: 50cosiiiCV This function is based on expansion in powers of cos and gives better statistics on trans/gauche behavior. The CH n atoms (n=1, 2, or 3) were treated as united atoms and no special hydrogen-bond potential had been included. The Lennard-Jones parameters and between the united CH 2 , CH 3 atoms of the decane molecules and the O atom of the SPC molecules had been optimized in GROMACS force field to obtain the ideal interfacial properties, including the solubility of water in decane, the surface tension and the orientational preference. The Van der
33 Waal parameters are vital for the simulation of the interface between two immiscible liquids. If the repulsion is not strong enough, the mixing of the two liquids at the interface will result. Whereas the repulsion is too strong the surface tension will become too large. In all simulations the sharp interface between decane and SPC was observed and the surface area that is perpendicular to the z-axis of membrane was unchanged. Decane and water ordering were also observed at the interfacial area. The water molecules tend to orient themselves to create more possibilities for hydrogen bonding as the number of neighboring water molecules available is small at the interface. It has been reported to be true for the benzene/water interface and the hexane/water interface (119, 120). Similarly the decane molecules are expected to be more laterally oriented near the interface to minimize the contact of the total surface with water. Particle Mesh Ewald was used to treat long-range electrostatic interactions during simulations instead of the twin range cutoff method (129, 130). By using the Fourier transformation with a 3D FFT algorithm on the following reciprocal energy term: NjimmmjijirecxyzmrrimmqqvfV,222/exp2 the scales had been improved from N 2 to NlogN and is substantially faster than the ordinary Ewald summation on medium or large systems (128). Membrane is such a large system that includes the multiple phases and ion-ion interactions. Truncation of electrostatic interactions, even at a very large cutoff distance, leaves out vital ion-ion interactions that lead to simulation artifacts. Tieleman et al. evaluated three methods, a twin range cutoff, particle mesh ewald and a twin range cutoff combined with a reaction field correction, over an alamethicin channel system (125). They found that water orientation inside the channels was sensitive to the algorithms used. Several other studies
34 showed that particle mesh ewald was advantageous over truncation method in extended dynamics simulations (126, 127). Several advantages to use decane as membrane mimetic are: It includes all structural and dynamic details of the membrane protein. It provides a decent representation of the biphasic, hydrophobic/hydrophilic characters and the molecular motions of the long acyl chains found in membranes. The simulation of decane/SPC model is orders of magnitude faster than that of phospholipids. However, this model lacks a number of features of real membrane: The charge natures of phospholipid headgroups. The non-uniform density and pressure profiles. Although the simplified system sacrifices the above features, it is computationally economical and has been proved to be satisfactory in reproducing experimentally determined structural features (115, 116, 121, 122). Another simplified membrane model is the continuum mean field (123, 124). It consists of three distinct dielectric regions that correspond to the low dielectric membrane interior and two high dielectric regions outside. The representation of the membrane is largely simplified and its computational cost was so low that free energies could be obtained. Average Structural Properties Overall the helical secondary structure of melanocortin receptors and interhelical orientations were preserved in all simulations with decane/SPC system. Loss of helical structure was not found in all six trajectories and i~i+4 hydrogen bonds that hold the helical structures were maintained. Water molecules occasionally penetrated into the decane phase but did not disturb the local structure. The TMs -helices conserve their
35 initial orientation with respect to each otherâ€™s. The tilt angles appear to be the most stable, with fluctuations on the order of only 3-5. The orientations of the TM5, TM6 and TM7 kink regions were unchanged. However, the upper portion of TM1 was observed to bend slightly in the average structures of 500-600 ps and 800-900 ps in peptide1/MC1R trajectories but not in other 5 simulations. The bend occurred at the regions where the TM1 kink in rhodopsin was fixed in the early phase of homology modeling (see chapter 2). Its implication was unclear since the bend is not significant and inconsistent with other simulations. In all six simulations, the patterns of peptide/MCRs interactions in the binding pockets were kept at the time frame of 1 ns. Fluctuations of the Peptide Ligands The trajectory analysis of six nanosecond simulations provided exciting results. The root mean square deviation (R. M. S. D.) of the decapeptide ligands depicted the motion difference in six complexes, in consistence with their individual activities, more specifically, agonistic or antagonistic (Figure 3.3). Much of the time the R. M. S. D. with respect to peptide 1-3/MC4R was as low as 0.05~0.15 nm. All three peptides showed antagonistic activities against MC4R. Similar R. M. S. D. was observed at peptide 1,3/MC1R but not at peptide 2/MC1R. The bound peptide 2 showed more fluctuation and a higher R. M. S. D. of 0.07~0.23 nm at MC1R than other two ligands. Interestingly, the activities of peptide 2 switched to be antagonistic while peptides 1-3 remain agonism over MC1R. Thus, molecular motions of receptor-bound ligands in the extended simulation were consistent with their activities. The receptor-bound conformational cluster analysis supported the above findings as well (Figure 3.4). The number of conformational families for peptide 2/MC1R is 19, larger than the ones for peptide
36 1,3/MC1R, 6 and 12 respectively. Consistently, there is no difference to the number of conformational families for three MC4R complexes. It indicated once again the structural flexibility of peptide 2/MC1R complex compared to five others. It is intriguing that the modification on the connecting mode of the lactam bridge leads to the switch of activities. Our studies suggested that the conformational flexibility, or molecular motion, is linked between the structural features and activities. Further analysis on the trajectories is set to shed light on the current agonist activation models, giving more detailed information about the sequence of conformational changes induced by ligand binding.
37 Figure 3.1 Primary structures of peptide 1 (26-membered ring), peptide 2 (27-membered ring), and peptide 3 (25-membered ring) illustrating the location of the Asp or Glu side chain (circled) in relationship to the lactam bridge. Figure 3.2 Left: The Decane/SPC membrane mimetic model structure viewed in Rasmol. Note the three layers from top to bottom are SPC, decane, SPC. Right: The decapeptide/MCRs complex structures solvated in the Decane/SPC membrane mimetic model. The receptor atoms are shown as ribbons in green color while the ligand in a space-filling representation of yellow color.
38 Figure 3.3 The R. M. S. D. of decapeptides in the complex structure during the dynamic simulations. A. Peptide 1/mMC1R. B. Peptide 2/mMC1R. C. Peptide 3/mMC1R. D. Peptide 1/mMC4R. E. Peptide 2/mMC4R. F. Peptide 3/mMC4R. Figure 3.4 The conformational cluster analysis on the multiple 1 nano-second trajectories of dynamic simulations of six decapeptides/MCRs complex.
CHAPTER 4 FURTHER MODELING CALCULATIONS USING THE TEMPLATE OF HAGRP(87-132)/MC4R COMPLEX: CASE STUDY OF BICYCLIC PEPTIDES The cysteine-rich C-terminus of AGRP has been identified as possessing nM antagonistic activities equipotent to the full-length peptide (25). It suggests that the key structural and functional features are located at this domain. Further studies showed that RFF motif is critical for the binding and antagonist activity at MC3R and MC4R, even though the single cyclic analogs containing this active loop have lower affinity for these receptors (78). The recently determined NMR structure of this domain showed that RFF loop is present in a well-defined -hairpin. In addition, the truncated, bicyclic AGRP fragments of nanomolar antagonist activity at MC3R and MC4R have been identified (211). Thus, the -hairpin structure of RFF may be responsible for the antagonist activity of AGRP. The design, synthesis and pharmacological assay of a novel truncated, bicyclic analog of C-terminal AGRP have been done in Dr. Haskell-Luevanoâ€™s lab (Figure 4.1). The binding and antagonistic potencies are remarkable, with IC 50 = 24 nM, K i = 31 nM at MC4R respectively (Table 4.1). The TOCSY and NOESY 1 H NMR studies showed the conformational similarity exists at the active loop region (R. M. S. D. = 1.7). Six conformational families were classified and representative structures of each family were obtained. The molecular modeling techniques were undertaken in attempts to identify the mechanisms behind the ligandsâ€™ binding. It was also interesting to explore whether the superpotent activities of the analog were related to any conformational families. The 39
40 model of hAGRP(87-132)/Melanocortin-4 receptor complex (see chapter 2.) was used as the template in current studies. Methods The homology model of the hAGRP(87-132)/Melanocortin-4 receptor complex (131) and the representative structure of individual bicyclic peptide families were imported into the Workspace interface of BioMedCAChe 5.05 software package (CAChe Group, Fujitsu, Portland OR). The primary structures were generated within the text-based Sequence View Window and automatic sequence alignments were done according to the Needleham-Wunsch alignment algorithm (132) in which the BLOSUM50 substitution matrix was used. Conserved residues were then identified using the automatic sequence matching function. Individual representative structures of peptide were docked to the Melanocortin-4 receptor within the 3D Structure Window by superimposing the alpha carbons of RFF(111-113) residues onto the cognate atoms in the AGRP loop with automated deletion of the original protein ligand. The angles of RFF residue were checked and adjusted to favorable orientations as well as to fit to the corresponding binding pockets. Limited conformation search of bound bicyclic peptides were carried out within the CONFLEX (133) module to locate the global minimum. Complex structures of family 135 and 32 were excluded from further analysis because of the steric conflicts found between bicyclic peptides and Melanocortin-4 receptor. CONFLEX generated a sequence of low-energy conformers of any shape systematically and exhaustively, incorporating down-stream, reservoir-filling, corner flap, edge flip, stepwise rotation, and pre-check. All conformations with Boltzmann populations larger than 1% were subjected to local
41 perturbations and optimized by augmented MM3. All the final structures were analyzed by automatic labeling of hydrogen bonds and bumps in the 3D Structure Window. The calculations were performed on a Dell desktop PC with 1.70GHz Pentium 4 processor. Results and Discussion Consistent with the observations in NMR, the R. M. S. D. of RFF(111-113) alpha carbon superposition were small, within the range of 0.01~0.1 . It indicated that conformations of the hairspin region among all five families were well conserved to the original hAGRP(87-132) structure. The torsional angles of RFF(111-113) side chains were mostly in trans and gauche (+,-) conformation after the adjustment. All five complex structures were checked carefully before the conformational search. Two steric forbidden structures were found and excluded from further studies. The conserved disulfide loop (S101-S119) of bicyclic ligand from family 135 embraced the EL2 backbone of the Melanocortin-4 receptor, which was an irrational configuration. Similarly, S101-S119 disulfide loop in family 32 structure crossed the upper TM7 backbone of the Melanocortin-4 receptor. The final structures obtained after limited conformational search by CONFLEX were energy minimized and the intra-, intermolecular interactions within the complex were listed in Table 4.2-4.4. Among all three families, family 2 structure retained more conserved intermolecular interactions of hAGRP(87-132) compared to family 106 and 131. Beside interactions between crucial residues RFF(111-113) and their corresponding binding pockets, the quaternary ammonium of Lys121 in family 2 structure form a strong ionic bridge with the acidic side chain of Asp103, making an additional contribution to the binding affinity. The similar stabilizing factor can be found in hAGRP(87-132) complex, where the corresponding Lys121 hydrogen bonded with Asp103 of EL1. The
42 Tyr118 of family 2 structure was in proximity to EL1 of the Melanocortin-4 receptor and hydrogen bonded with Thr104, which was closed to the interaction pattern of hAGRP(87-132) (Figure 4.2). The remaining interactions in the family 2 complex were less specific as the backbone atoms were involved. This was especially true for structures from family 106 and 131, except for Tyr118 of family 106 that formed hydrogen bond with Asp103 at EL1 of the Melanocortin-4 receptor. These structural differences may lead to the binding affinity variation, as ligands with more specific hydrogen bond interactions will bind better than those with fewer interactions. Family 2 structure, the largest family of all the conformational clusters, tends to bind the Melanocortin-4 receptor most favorably (Figure 4.3). Family 131 structure binds the worst among all three because of the least number of intermolecular interactions. The energy criteria supported this binding affinity trend among three complexes as well. The family 2 complex had the lowest energy at -1731.33 kcal/mol. Family 106 was higher at -1644.35 kcal/mol and family 131 was the highest at -1599.59 kcal/mol, 7.61% over the lowest. It suggested that the binding of family 2 bicyclic peptide to Melanocortin-4 receptor was thermodynamically stable, compared to other two families. From the initial five families to the final â€˜singled outâ€™ family 2 structure, the docking results suggest that not all the free peptide conformers will bind to a particular receptor. This supports the model given by Edison et al. (134), in which only a single species from a conformational ensemble of an unbound peptide will bind to a receptor. Thus, the conformational ensemble reduces the effective concentration of a particular peptide with respect to a particular receptor.
43 87 132 Figure 4.1 Primary sequence comparisons of agouti-related protein (AGRP) (Top) and its bicyclic analog (Bottom). (U: -aminobutyric acid) 101 122 Figure 4.2 Structural model of the representative structure of family 2 binding to the mouse MC4R. The receptor and ligand are depicted as ribbons in green and red, respectively. The side chains of the critical residues involved in the interactions are highlighted in ball-and-stick mode.
44 Figure 4.3 Stereo views of the binding model of the representative structure of family 2 superimposed by the original hAGRP(87-132) structure. The bicyclic analog and hAGRP(87-132) are depicted as ribbons in red and gray, respectively. The regions at the hAGRP(87-132) which are corresponding to the bicyclic analog are colored in black. The side chains of the critical residues involved in the interactions are highlighted in ball-and-stick mode. Table 4.1 Functional and binding activity of the bicyclic analog of AGRP at the mouse melanocortin receptors 3 and 4. MC3R MC4R Binding IC 50 (nM) 101 24 Antagonist K i (nM) 176 31
45 Table 4.2 Putative bicyclic representive structure of family 106 intramolecular and intermolecular interactions with the mMC4 receptor Bicyclic 106 amino acid mMC4R/ Bicyclic 106 residue Asp103 Bicyclic 106 Tyr109 Bicyclic 106 Arg120 Thr107 Asp103 (EL1) Tyr109 Bicyclic 106 Asp103 Arg111 Asp114 (TM3) * Asn115 (TM3) Phe112 Phe176* (end of TM4 adjacent to EL2 initiation) Phe253 (TM6) * Phe113 Phe176* (end of TM4 adjacent to EL2 initiation) Phe193 (TM5) * Phe253 (TM6) * Asn114 Bicyclic 106 Arg120 Tyr118 Asp103 (EL1) Arg120 Bicyclic 106 Asp103 Bicyclic 106 Asn114 Lys121 Backbone of Cys269 (EL3) Backbone of Cys271 (EL3) Backbone of Leu122 Backbone of Cys269 (EL3) * Intermolecular interaction which is conserved to the original hAGRP(87-132)/mMC4 receptor complex model. Unless otherwise noted, the putative ligand-receptor interactions are with the mMC4 receptor side chain to side chain.
46 Table 4.3 Putative bicyclic representive structure of family 131 intramolecular and intermolecular interactions with the mMC4 receptor Bicyclic 131 amino acid mMC4R/ Bicyclic 131 residue Asp103 Bicyclic 131 Arg120 Backbone of Tyr109 Asp105 (EL1) Backbone of Cys110 Asp105 (EL1) Arg111 Asp114 (TM3) * Asn115 (TM3) Asp118 (TM3) * Phe112 Phe176* (end of TM4 adjacent to EL2 initiation) Phe113 Phe176* (end of TM4 adjacent to EL2 initiation) Asn114 Backbone of Leu257 (TM6) Backbone of Cys117 Asn277 (TM7) Tyr118 Backbone of Cys32 (TM1 initiation adjacent to N-termini) Arg120 Backbone of Phe272 (TM7 initiation adjacent to EL3 end) Bicyclic 131 Asp103 * Intermolecular interaction which is conserved to the original hAGRP(87-132)/mMC4 receptor complex model. Unless otherwise noted, the putative ligand-receptor interactions are with the mMC4 receptor side chain to side chain.
47 Table 4.4 Putative bicyclic representive structure of family 2 intramolecular and intermolecular interactions with the mMC4 receptor Bicyclic 2 amino acid mMC4R/ Bicyclic 2 residue Asp103 Bicyclic 2 Arg120 Backbone of Ala106 Asp105 (EL1) Backbone of Thr107 Asp105 (EL1) Backbone of U108 Asp105 (EL1) Cys110 Gln107 (EL1) Arg111 Asp114 (TM3) * Asn115 (TM3) Asp118 (TM3) * Phe112 Phe176* (end of TM4 adjacent to EL2 initiation) Phe253 (TM6) * Phe113 Phe176* (end of TM4 adjacent to EL2 initiation) Phe193 (TM5) * Phe253 (TM6) * Asn114 Backbone of Tyr260 (end of TM4 adjacent to EL2 initiation) Backbone of Tyr118 Thr104 (EL1) Arg120 Bicyclic 2 Asp103 Lys121 Asp103 (EL1)* Thr104 (EL1) Backbone of Leu122 Gln35 (TM1 initiation adjacent to N-termini) Asn277 (TM7) * Intermolecular interaction which is conserved to the original hAGRP(87-132)/mMC4 receptor complex model. Unless otherwise noted, the putative ligand-receptor interactions are with the mMC4 receptor side chain to side chain.
CHAPTER 5 INTRODUCTION OF BIOLOGY OF CHANNEL FUNCTION IN GLUTAMINE-DEPENDENT AMIDOTRANSFERASES AND THEORETICAL METHODOLOGY Glutamine-dependent Amidotransferases and Ammonia Channel Glutamine phosphoribosylpyrophosphate (PRPP) amidotransferase (GPATase) catalyzes the first step of de novo purine nucleotide synthesis and is the key regulatory enzyme in the pathway (181). The enzyme belongs to a family of Ntn, N-terminal nucleophile family of glutamine amidotransferases (181, 182). The N-terminal glutaminase domain catalyzes the first half-reaction, the hydrolysis of glutamine. NH 3 derived from glutamine hydrolysis react with PRPP at a site in the C-terminal PRTase domain, a type I PRTase (154, 183). Like most glutamine amidotransferases, the enzyme can also utilize external NH 3 as a nitrogen source for PRA synthesis, both in vitro and in vivo. X-ray structures of ligand-free glutamine PRPP amidotransferase from Bacillus subtilis and Escherichia coli show structural features that are incompatible with catalysis (183-186). In brief, i) the glutamine site is closed and inaccessible to glutamine; ii) the PRPP site is open to solvent; iii) two domains are separated by a 16 solvent-exposed space. However, these barriers to catalysis are not present in the ternary complex structure of an E. coli enzyme with substrate analogues bound (154). In this structure cPRPP is bound to the PRPP site and 6-diazo-5-oxo-L-nor-leucine (DON) is covalently attached to Cys1 to mimic the glutamyl thioester intermediate. A PRTase â€œflexible loopâ€ has closed over the bound PRPP and forms an NH 3 channel connecting glutamine and PRPP site. Thus, the overall reaction indeed proceeds in two steps at physically separated 48
49 catalytic sites whose activities are tightly coupled. According to the structure-based mechanism (187), PRPP binding results in a reorganization of a PRTase â€œflexible loopâ€, consequently repositions the glutamine loop and moves Arg73 for an optimal salt bridge with the carboxyl of glutamine, which is required for high affinity glutamine binding. Ordering of the flexible loop is a prerequisite for channel formation. Recent mutational analysis of conserved amino acid residues in NH 3 channel as well as key residues in flexible loop indicates that channel function and interdomain signaling are coupled (155). Substrate channeling is the process of direct transfer of an intermediate between the active sites of enzymes that catalyze sequential reactions in a biosynthetic pathway (188, 189). The active sites can be located either on separated domains in a multifunctional enzyme or separate subunits in a multienzyme complex. X-ray crystallographic studies on tryptophan synthase revealed the first molecular mechanism for channeling (190). This was a unique example for a long time until recent structure determinations found plausible evidence for intermediate channeling in carbamoyl phosphate synthase (CPS), asparagine synthetase B and GPATase (154, 191, 192). In all the three enzymes, glutamine serves as the preferred source of nitrogen whereby an acceptor molecule is subsequently aminated. Carbamoyl phosphate synthase belongs to the Triad, or class I family of glutamine amidotransferases, which employ a histidine-cysteine couple for catalytic activity (193-195). Another family, the Ntn (class II) family of glutamine amidotransferases, of which asparagines synthetase and GPATase are members, has the catalytic cysteine invariably located at the N-terminus. Examinations of the X-ray crystallographic structures reveal channel with a length of at least 96 for CPS and around 20 for asparagines synthetase and GPATase (154, 191, 192, 196).
50 Molecular Dynamic Methodology Molecular dynamics is the science of computing the molecular motions of particles in a system. Itâ€™s become the principal tool in the theoretical study of biological systems, providing detailed information such as structure, dynamics and thermodynamics. Complex biological processes that can be explored by this mean include: protein folding, molecular recognition, conformational changes, ion transport in membrane system and protein stability. Molecular dynamics techniques are also widely used in structural determinations from X-ray crystallography and NMR experiments. The first molecular dynamics study was introduced by Alder and Wainwright in 1957 using a hard sphere model (135). Many important insights concerning the behavior of simple liquids emerged from their studies. McCammon et al. first applied molecular dynamics simulations on a protein structure, the bovine pancreatic trypsin inhibitor (BPTI) in 1977 (136). Since then the number of simulation techniques has greatly expanded and a variety of perplexing biological issues have been addressed. Classical Mechanics The molecular dynamics technique is based on Newtonâ€™s second law of motion: 22dtrdmdtdvmamFiiiiiii where F i is the force exerted on the particle i, m i is the mass of particle i and a i is the acceleration. v and r are the velocity and position of particle i at the time t. Also the force can be calculated as the gradient of the potential energy: VFii where V is the potential energy of the system. Combining the two equations we have:
51 22dtrdmdrdViii This equation describes the motion of a particle i along with the derivative of the potential energy. If the initial positions and the velocities are known, then the positions and velocities at all other times can be determined. Integration of this equation yields a trajectory in which the positions, velocities and accelerations of the particles change with time. The initial velocities, v i , can be selected randomly from a Maxwell-Boltzmann or Gaussian distribution at a given temperature. From the following equation, the probability of particle i having a velocity v x in the x direction at temperature T can be computed. TkvmTkmvpBixiBiix2exp222/1 Meanwhile the distribution of velocities are corrected as to make sure that the overall momentum equals to zero, 01NiiivmP The temperature, T, can be calculated from the momentums of individual particles: NiiimPNT1231 where N is the number of particles of the system. Integration Algorithms Since the continuous potential of all the particle positions (3N) coupled together has become a many-body problem, it is too complicated to be solved analytically. The solution is numerical, more specifically, a finite difference method.
52 There are many numerical algorithms for integrating the equation of motion. Several factors need to be considered in choosing the algorithm: It should permit a long time step for integration. The energy and momentum should be conserved. It should be computationally efficient. All integration algorithms assume the positions, velocities and accelerations can be approximated by a Taylor series expansion: tcttbttatttvtrttr4241361221 tcttbtttatvttv361221 tctttbtatta221 where r is the position, v is the velocity (the first derivative of the positions with respect to time), a is the acceleration (the second derivative), b is the third derivative, and so on. Verlet algorithm. To derive the Verlet algorithm (137) we can write down the following relationships at time t: tatttvtrttr221 tatttvtrttr221 Adding these two equations, we obtains: tatttrtrttr22 It uses positions and accelerations at time t and the positions from previous time tt to calculate the new positions at time t t . The Verlet algorithm uses no explicit
53 velocities. One simple approach to calculate the v(t) is to divide the difference in positions at times tt and t t by t 2 : tr v t2t tt t t r v tr tttrtt 2 The advantages of the Verlet algorithm are the following: It is straightforward. The computing storage requirements are modest. However, the small term ta was added to the difference of the two much larger terms, and tr2 tr , may cause a loss of precision. Another disadvantage is that it is not a self-starting algorithm. The positions at t t , ttr , are always needed to obtain the new position r . At 0 there is obviously no t tr available and special mathematic treatment is then needed. Leap frog algorithm. Many variations on the Verlet algorithm have been developed. The leap frog method was implemented by Hockney (138), using the following equation: tttvtrt 21 ttattvtt 2121 In this algorithm, the velocities at time t t 21 are first calculated and then used to deduce the position t together with the previous position tr and the accelerations at time t. In this way the velocities leap over the position by t 21 and the positions leap over the velocities then and so on. The velocities at time t can be approximated by the relationship: ttvttvtv 212121
54 Compared to the Verlet algorithm, the leap frog algorithm explicitly includes the velocities and does not involve adding the small term to the difference of large numbers. On the other hand, the velocities and positions are not calculated at the synchronized manner. It is disadvantageous because the kinetic energy cannot be calculated at the same time as the positions are defined. Velocity algorithm. This algorithm yields positions, velocities and accelerations at the same time (139). There is no compromise on precision. tatttvtrttr221 ttatattvttv 21 Statistical Mechanics Statistical mechanics is fundamental to molecular dynamics technique in the study of biological systems. It provides a bridge between the macroscopic realm of classical thermodynamics and the microscopic realm of particles (atoms and molecules). Molecular dynamics method yields microscopic information such as atomic positions and velocities. To convert these microscopic observables to macroscopic observables, including pressure, enthalpy, heat capacities and free energies, statistical mechanics gives the rigorous mathematical expressions. With the molecular dynamics method, one can solve both thermodynamic and time-dependent kinetic properties. As a branch of physical sciences, statistical mechanics studies the macroscopic phenomena from individual molecules that make up the system. To start the discussion, several definitions need to be introduced. Thermodynamic state: A system is usually defined by a small set of parameters, e.g., the temperature, T, the pressure, P, and the number of particles, N. Microscopic state: A system is defined by the atomic positions, r, and momenta, p.
55 Ensemble: An ensemble is a collection of all possible systems that have different microscopic states but have an identical thermodynamic state. Microcanonical ensemble (NVE): The thermodynamic state is characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed energy, E. Canonical ensemble (NVT): The ensemble is characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed temperature, T. Isobaric-Isothermal ensemble (NPT): The ensemble is characterized by a fixed number of atoms, N, a fixed pressure, P, and a fixed temperature, T. Grand canonical ensemble (VT): The thermodynamic state is characterized by a fixed chemical potential, , a fixed volume, V, and a fixed temperature, T. In statistical mechanics, averages that correspond to experimental observables are defined in terms of ensemble averages. The average is taken over a large number of replicas of the system considered simultaneously. The ensemble average is given by: NNNNNNensemblerprpAdrdpA,, where NNrpA, is the observable of interest and expressed as a function of the momenta, p, and the position, r, of the system. The integration is over all possible variables of r and p. The probability density of the ensemble is: TkrpHQrpBNNNN,exp1, where H is the Hamiltonian, T is the temperature, k is Boltzmannâ€™s constant and Q is the partition function. B TkrpHdrdpQBNNNN,exp To calculate the integral one must include all possible states of the system. In molecular dynamics simulation, the points that correspond to finite time steps can be calculated sequentially. Thus, the time average of A is expressed as:
56 NNMttNNtimerpAMdttrtpAA,1,1lim10 where is the simulation time, M is the total number of time steps and NNrpA, is the instantaneous value of A. Then there is dilemma as to whether the time average calculated by molecular dynamics simulation could be used as ensemble averages, which count for experimental observables. This leads to the fundamental axiom of statistical mechanics, the ergodic hypothesis: timeensembleAA It states that the time average equals the ensemble average. The assumptions is that the system is allowed to evolve in infinite time thus the system will eventually pass through all possible states. Therefore, if a molecular dynamics simulation is able to generate enough representative conformations, experimental observables about structural, dynamics and thermodynamics properties may then be calculated. Here we state some common time averages: NNNNNNensemblerprpAdrdpA,, where NNrpA, is the observable of interest and expressed as a function of the momenta, p, and the position, r, of the system. The integration is over all possible variables of r and p. The probability density of the ensemble is: TkrpHQrpBNNNN,exp1,
57 where H is the Hamiltonian, T is the temperature, k is Boltzmannâ€™s constant and Q is the partition function. B TkrpHdrdpQBNNNN,exp To calculate the integral one must include all possible states of the system. In molecular dynamics simulation, the points that correspond to finite time steps can be calculated sequentially. Thus, the time average of A is expressed as: NNMttNNtimerpAMdttrtpAA,1,1lim10 where is the simulation time, M is the total number of time steps and NNrpA, is the instantaneous value of A. Then there is dilemma as to whether the time average calculated by molecular dynamics simulation could be used as ensemble averages, which count for experimental observables. This leads to the fundamental axiom of statistical mechanics, the ergodic hypothesis: timeensembleAA It states that the time average equals the ensemble average. The assumptions is that the system is allowed to evolve in infinite time thus the system will eventually pass through all possible states. Therefore, if a molecular dynamics simulation is able to generate enough representative conformations, experimental observables about structural, dynamics and thermodynamics properties may then be calculated. Here we state some common time averages: Average potential energy: MiiVMVV11
58 where M is the number of snapshots in the molecular dynamics trajectory and V i is the potential energy of configuration in each snapshot. Average kinetic energy: MjjiiNiivvmMKK1121 where N is the number of atoms in the system, m i is the mass of the particle i and v i is the velocity of particle i. Other thermodynamics properties that can be derived from the partition function: Internal energy: VBVBTQTkTQQTkUln22 Enthalpy: TBVBVQTVkTQTkHlnln2 Helmholtz free energy: QTkABln Gibbs free energy: TBBVQTVkQTkGlnln
CHAPTER 6 EXPLORING THE STRUCTURAL AND THERMODYNAMIC PROPERTIES OF AMMONIA CHANNEL IN GLUTAMINE PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE. PART I: LOCALLY ENHANCED SAMPLING AND CAVITIES CALCULATIONS In an effort to both complement and extend our current understanding of intermediate channeling during enzyme catalysis, computational studies on the NH 3 channeling in GPATase from E. coli was initiated. The crystallographic structure of analog-bound E. coli GPATase (PDB code: 1ECC) was selected as the model system because the structure is completed and the channel functions were well studied (154, 155, 187). So far there is no direct computational evidence to address whether intramolecular channeling actually occurs and several key issues concerning its properties. For example, How does the NH 3 channel formed? Is the channeling a transient process or a lasting process? Which one plays a pivotal role in the intermediate channeling, the thermodynamic factor or kinetic factor? Is there ever a driving force for intermediate channeling? Is there any water molecule exists in the channel and if so whatâ€™s their affection on the channeling? The computational studies described here begin to address these fundamental structural and functional questions. Methods Force Field Parameters Development The CHARMM parameters for the GPATase glutaminase domain thioester covalent intermediate structure, PRTase domain substrate PRPP (alpha-D-5'-phosphoribosyl pyrophosphate), ammonia (NH 3 ) and ammonium (NH 4 + ) were chosen as 60
61 follows (140-144). The model compounds selected in these studies are shown in Figure 6.1. They were designed to include functional groups required to properly model the missing parameters, while being small enough to remain computationally tractable. Target data collected on those compounds, including both experimental structure and IR spectra data (145, 146) and ab initio data, solely acted as the basis for the parameter optimization. Model compounds were built in CAChe 4.4 (Fujitsu, Inc.) and optimized by combined MM3/PM3 module. Ab initio calculations were then carried out using the Gaussian 98 (147) program with the 6-31G(d) basis set on model compound S-methyl thioacetate (SMTA) and 6-31+G(d) basis set on model compound Tetrahydrofuran phosphoric acid (THFP). The results of ab initio calculations were introduced where necessary to supplement the experimental data, especially in the assignment of experimental vibrational frequencies to internal coordinates. The geometry minimization and vibrational calculation were done at the second-order Mller-Plesset (MP2) level of theory. However, the ab initio values of the frequencies need to readjusted by a scaling factor, determined by comparison of known experimental frequencies with the ab initio results. The derived factor was then applied to the unobserved frequencies. Molecular mechanics calculations were carried out using the program CHARMM 25b2 (148, 149). Frequencies were determined using the VIBRAN module in CHARMM, and analysis of the internal coordinates distributions to the vibrational spectra were made with the MOLVIB facility. Force constants were adjusted to fit the vibrational data in an iterative fashion, by the addition of Urey-Bradley and improper terms in cases where the agreement was unsatisfactory. The dihedral energy surface of C1-C-S-C2 and O=C-S-C2 in SMTA was obtained by constraining the target dihedral in 30 increments from
62 to 180, while the remainder of the molecule was allowed to relax. The dihedral (O4â€™-C1â€™-O1â€™-P) in THFP was also sampled in the same way. All dihedral energy surfaces were investigated at both the restricted Hartree-Fock (RHF) level of theory and at the MP2 level of theory. The internal parameters were manually optimized in an iterative approach to maximize the agreement with the internal target data, shown as iterative loops (140). Initial partial charges for model compounds were obtained from a Mulliken population analysis of the MP2/6-31G(d) wave function. The final topology structures of the thioester intermediate (NTNG), PRPP, NH 3 and NH 4 + were built on the related known structures in Top_all27_prot_na.inp in CHARMM27. The atom type selection for NH 3 and NH 4 + residues are based on residue type MAMI (methylamine) and MAMM (methylammonium). The structure data for the model compounds and NH 3 and NH 4 + were taken from MP2/6-31G(d) calculation result. There was no intermolecular parameter optimization involved in these studies since no new atom types were created in the parameters determination. Locally Enhanced Sampling (LES) Protocol In the crystal structure of E. coli GPATase (PDB code: 1ECC), the glutamine analog DON was covalently linked to the catalytic nucleophile Cys 1 . This substrate analog, with one more methylene between Cys 1 sulfur atom and glutamine carbonyl carbon, was essential to crystallization of the active conformer of GPATase. The E. coli asparagine synthetase B (AS-B, PDB code 1CT9) belongs to the same enzyme family as GPATase, a second class of glutaminase nucleophile type amidotransferases (Ntn). The glutaminase domain of AS-B has an unlinked glutamine in its crystal structure (158). Superimpose result of the glutamine structure in AS-B and the DON structure in
63 GPATase showed high similarity in the binding mode of two, with the heavy atom R. M. S. D. of 0.209 . (MacroModel 6.5, Schrdinger, Inc.) (Figure 6.2) Structure modification was done on the DON to restore the actual thioester intermediate structure in glutaminase domain of GPATase, while keeping most of the conserved interactions within the binding site (Insight II 98.0, Biosym/MSI). A NH 3 fragment was added to the carbonyl carbon in the thioester structure to form a tetrahedral structure, as what is believed to happen in the real catalysis. Then the C-N bond was deleted and the NH 3 molecule became unengaged while keep the d C-N =1.48 as the starting point of the LES simulation. The stable carbocyclic analog of the substrate PRPP (cPRPP) in the PRTase active site of GPATase was restored to the original PRPP structure during the simulation. The binding modes of two were believed unchanged because of the similarity between the two structures. To test the thermodynamic properties of the LES simulation, alternative starting positions were made. One started from the S N 2 attacking position of PRTase domain substrate PRPP. Another was from the geometry center of F259:CE1, I407:CG2 and F334:CZ, the approximate middle point of the NH 3 channel. The mutants of GPATase F334A and L415A (155) were made in InsightII BIOPOLYMER module. The simulations with NH 4 + instead of NH 3 were also carried out. The LES calculations were carried out in both gas phase and stochastic boundary conditions (162-165). Hydrogen positions for the protein residues and crystallographic waters were built with the HBUILD module in CHARMM. The starting structures for both simulation studies were obtained from minimization, with 1.0 kcal/mol 2 positional restraints imposed on enzyme backbone atoms and substrate. The minimization includes 500 cycles of steepest descent followed by 2000 cycles of adopted basis Newton
64 Raphson algorithms. Fifty copies of NH 3 molecules were then made by REPLICA module in CHARMM, and they initially occupied the same point in coordinate space. The partial charges and Lennard-Jones well depth parameters for each atom are scaled by 1/50 for 50 copies within BLOCK module, while covalent terms are not involved. The minimized system was gradually heated to 300 K over a period of 6 ps with T = 5 K every 0.1 ps, while keeping the same restraints as in minimization. During the heating, the atom velocities were assigned instead of scaling, based on the gaussian distribution method. The system was then subjected to a 20 ps NVT MD equilibration at 300 K while the above-mentioned harmonic constraints were reduced gradually from 1 kcal/mol 2 to zero. Production MD simulations were then performed for 200 ps in the same NVT ensemble at 300 K. Multiple NH 3 copies moved apart after initial velocity assignment. For the gas phase calculation the NVT ensemble utilizes the Nose-Hoover method for temperature scaling with the magnitude of coupling, q ref , set to 100.0 (159, 160). The minimized enzyme-substrate system was immersed into a 24 sphere of previously equilibrated water centered at the approximate middle position of the channel. Note that this sphere is not large enough to encompass the entire protein in all directions. Water molecules at a distance within 2.5 from any nonhydrogen protein-substrate atoms or crystallographic water were removed. The system is spatially subdivided into three concentric regions: reaction, buffer and reservoir regions. The reaction region of specific interest is a 20 radius sphere treated by Hamiltonian dynamics. A shell from 20 to 24 forms a buffer zone, in which motions of protein and solvent atoms were governed by Langevin equations of motion. An outer region beyond the 24 radius is called the reservoir boundary region. All protein atoms in this region were kept rigidly
65 fixed but were retained as a static reaction field for the dynamic region. A deformable stochastic boundary (162, 163) maintains water molecules within the simulation sphere through the use of a constraining force that rises steeply beyond 24 . In addition, protein atoms in the buffer region are constrained by a harmonic restoring force based on their X-ray reference position, with the force constant of 1.0 kcal/mol 2 . The Langevin friction coefficient applied on the protein atoms and water oxygen atoms in the buffer region are 80.0 and 62.0 ps -1 respectively (164, 165). The solvated LES simulation follows the same heating, equilibration and data collection sequences as in gas phase. The system temperatures were maintained near the specified values by the thermal bath of the langevin buffer regions and the velocities were re-assigned every 50 steps to keep the temperature within 5 of 300 K during the equilibration. The simulations were performed with the CHARMM 25b2 package using the param27 parameter set (140, 141). Molecular dynamics was carried out using the Verlet leapfrog algorithm, an integration time step of 0.002 ps, and with a distance-dependent dielectric constant = r. All bonds containing hydrogens were held rigid by the SHAKE algorithm (161). Long-range electrostatic forces were cut off at 13 by means of a shifting function, and Van der Waals forces were cut off between 9 and 13 by means of a switching function. The nonbonded list was kept to 14 and updated every 25 steps. The calculations were performed on a Silicon Graphics Onyx2 supercomputer at the McKnight Brain Institute of the University of Florida. Cavities Analysis of the Channel The program VOIDOO (166, 167) was employed in the cavities calculation inside the E. coli GPATase crystal structure and the snapshots of the LES simulation. The
66 residues that form the cavities can be delineated after the cavities have been detected and the cavity volume is measured. Since the cavities are more of interest to accommodate the NH 3 ligands, the solvent accessible volume is used instead of Van der Waals volume. The probe radius used was 1.4 and the atomic fattening factor was 1.100; Grid-shrink factor 0.900 was used during the cavity refinement. The calculations were repeated using several random orientations of the molecules. The output plot files of VOIDOO were converted by the program MAPMAN (168) and then displayed by O (169, 170). The electrostatic surface potential of the NH 3 channel was calculated with the GRASP program (172). The residues lining out the channel, Tyr74, Thr76, Arg188, Phe254, Glu255, Tyr258, Phe259, Phe334, Ile335, Gly406 and Ile407, found by the cavities calculation, were used to build the Van der Waal molecular surface. The electrostatic potential map was calculated for CHARMM charge, which was extracted from the CHARMM27 top_all27_prot_na.inp file. To further explore the local properties of the NH 3 channel, a series of simulations were carried out to answer the question: Are there any water molecules inside the channel? If so, whatâ€™s the influence of water molecules on the channel functioning? First, FLOOD (166, 167), VOIDOO's sister program, was used to calculate how many water molecules can fill the cavities inside the channel. Using loop search of cavity grid points with the probe radius of 1.4 for a water molecule, FLOOD attempts at packing solvent molecules as tightly as possible inside a given cavity. At the 25 ps snapshot of previous gas phase simulation trajectory, FLOOD reported that at maximum 23-26 water molecules could be packed into the channel in the total of 5 runs starting from different orientations (Figure 6.11). Therefore 25 TIP3P copies were used in the LES simulation
67 with the same conditions as NH 3 copies. TIP3P copies behaved like NH 3 copies and distributed over the channel area in 30~40 ps. At the structure of 20 ps snapshot, one TIP3P at N-, Cend of the channel was replaced into NH 3 separately and then LES simulations of 50 copies NH 3 were carried out in the presence of 24 TIP3P molecules inside the channel. LES simulations of NH 3 in the presence 3 and 10 TIP3P molecules were also made by keeping only 3 and 10 TIP3P molecules in the local minimum areas. These areas were where TIP3P replicas were likely clustered at as observed in the trajectory analysis of LES simulation of TIP3P. Three local minimum areas are the following: NTNG, Y74, G406 at the N-end; I407, F259 at the middle; V370, F254, Y258 at the C-end. Results and Discussion Force Field Parameters It is necessary to add new bond type S-CC since there is no thioester analog in the par_all27_prot_na.inp of CHARMM. The equilibrium bond length of 1.78 was obtained from the ab initio MP2/6-31G(d) structure. The vibrational data and the empirical parameters are shown in the Table 6.1-6.5. The force constant 1.0 Kcal/mol for n=1 and 2.5 Kcal/mol for n=2 did a good job of reproducing the ab initio potential energies surface for CT3-CC-S-CT3 and O-CC-S-CT3, as presented in Figure 6.3. Most needed parameters for model compound THFP and final substrate PRPP were taken from the already parameterized compounds, such as ADP and ribose (RIBO, ARMO) and methylphosphate (MP_2). The performance of new parameters and the structure modification site of NTNG and PRPP were checked carefully after the minimization. The structures of interest showed reasonable geometries during minimization and the later dynamic simulations.
68 The minimized structure was checked by PROCHECK (173, 174) and reported as equal to or even better than the original crystal structure in various stereochemical qualities. 91.3% of the residues in minimized structure are in core area in Ramachandrans plot, compared to the 88.3% in the crystal structure. Computational Evidence for NH 3 Channel The LES simulation clearly showed the movement of NH 3 copies from the thioester intermediate in glutaminase active site down to the PRPP in PRTase active site through the â€œchannelâ€ like structure connecting two subdomains of E. coli GPATase. At the beginning of the production phase of simulation, 50 NH 3 copies distributed like a cloud around the minimized position, which is 3.04 away from the carbonyl carbon (CDâ€™) of the thioester. Then the copies began to move down and distribute along the channel region exactly as proposed by the experimental and crystallographic data (154-156). In the gas phase, the NH 3 copies reached at the end of channel and distribute all over the channel area at 15-20 ps; the center-of-mass of all copies moved down to the approximate middle point at 25 ps. (Figure 6.4, Panel A and Figure 6.7) The distribution of NH 3 copies was not even in the whole area, with no specific regions to occupy. The copies were more likely to form small clusters, composed of 3-10 copies each, and then moved apart quickly. The formation of the clusters was in a random manner. Since the simulations were done in the empirical force field, the NH 3 copies couldnâ€™t be consumed to attack the C1â€™ of the PRPP down at the PRTase domain to form a C-N bond. So the copies began to leak off the channel at the late phase of the simulation. Some copies were trapped in certain cavities of the protein while several leaked out of the protein through certain loose structures at the interface of two domains. In the solvation model, the NH 3 copies moved down the channel in a similar manner but with more time steps. The center-of-mass of all
69 copies took 40-50 ps to the middle point (Figure 6.4, Panel B). The slightly different behavior of the NH 3 copies with the solvation model was: NH 3 copies distributed through the middle in 15 ps and more snapshots were observed to distribute along this upper area until 110 ps. After 110 ps, the lower part was occupied with the equal probabilities. Leaking was also seen with the solvation model but with fewer copies. In the LES simulations with different starting point, similar results were obtained. In the gas phase simulation starting from PRTase end of the channel, the NH 3 copies moved up through the channel and distributed all over in 5-10 ps. The clustering and distribution behavior of the copies were the same as the previous simulations. With the solvation model the center-of-mass took 40-50 ps to move up to the middle (Figure 6.4, Panel C). Similarly, the copies were more likely to stay at the lower part of channel from 10 to 120 ps and distributed all over after that. As expected, the center-of-mass of NH 3 copies in the simulation starting from the middle point of the channel stayed at the same level throughout the simulation (Figure 6.4, Panel D). In the gas phase, the copies distribute the whole area in 10-15 ps. With the solvent model, the copies however clustered at the middle area for most snapshots. The LES approach is designed to enhance the sampling of a small subsystem of interest (150-153). In present studies, this particular region was the presumed NH 3 channel domain in the E. coli GPATase crystal structure (154). The â€œenhancedâ€ sampling is accomplished through the use of multiple, but discrete, copies of the region of interest. The copies, which do not interact during the simulation, are treated by a mean-field approximation. In this way, an atom of outer protein sees the mean force from all of the copies, not a force from an average conformation of the copies. The NH 3 copies are free
70 to move apart in dynamics and explore different regions of phase space, thereby increasing the statistical sampling. LES provides multiple trajectories for the replicated portion at a much lower computational expense than repeated simulation of the entire original system. Additionally, the barriers to conformational transition are reduced, leading to a smoother potential energy surface, with faster transitions and efficient sampling of the various accessible minima. The present LES studies revealed the existence of the interdomain channeling in E. coli GPATase, giving the direct computational evidence for this debated enzyme mechanism. The NH 3 copies, in both gas phase and solvent model, did move down from the detaching position in the glutaminase domain to PRPP in the PRTase domain. The alternative simulations starting from Cterminal and middle point of the channel suggested that the NH 3 channel acts simply as a hydrophobic â€œtubeâ€ to connect two active sites. The product of the hydrolysis half reaction in the Nterminal glutaminase domain, NH 3 , diffused to the PRTase domain through this 20 passageway. The clustering and distribution of the NH 3 copies during the gas phase LES simulation indicates that there are no specific energy minima on the channel pathway. However, in stochastic boundary conditions there seems an accessible minima located at the middle point of the channel, close to residues Ile407 and Phe334, preventing the NH 3 copies to distribute all over the channel at the early phase of the simulation. The direct influence of the solvent on the internal diffusion of the NH 3 molecules is expected to be small or even none, because the hydrophobic channel may function to exclude water and ensures that NH 3 produced is the only nucleophile to enter the channel. It is possible that the solvent may affect the average structure and the movement of protein, thus altering the potential phase inside the protein compared to the
71 gas phase simulation. Another notable difference between the gas phase and solvation model was that the latter generally took longer to reach the whole channel. One possible explanation is that the different potential energy surface inside the solvated protein makes the conformational transition slow. However, it could also be caused by different coupling methods in NVT ensembles. As an extended system approach, the Nose-Hoover method includes the potential and kinetic energy of the piston in the system Hamiltonian. The coupling of the heat bath to the system is so efficient that the equilibration phase becomes short. In stochastic boundary condition, the Langevin oscillators in the buffer region act as a heat bath for the reaction region. Generally, the energy flow between two regions occurs quickly because of the coupled nature of the atomic interactions. In present studies, however, the equilibration of stochastic boundary condition took longer than the Nose-Hoover method. Hence, the kinetic energy transfer to NH 3 copies may be less efficient in stochastic boundary condition and the movement of NH 3 copies becomes slow. The simulations with 10 copies of NH 3 gave us the distinct results compared to the regular 50 copies run. In 5-50 ps copies started to distribute from glutaminase end to the middle of the channel in gas phase. Then the copies kept clustered around the starting position for the rest 450 ps, however. In solvent the center-of-mass of copies had the slightly notable movement during the whole 200 ps run (Figure 6.4, Panel F). The copies also formed clusters around the Nterminal of the channel and no extensive distribution was observed. These examined the sensitivity of the LES results to the number of copies used. The multiple LES trajectories of the NH 3 copies provide significantly enhanced statistics by reducing the potential barriers. The barriers to conformational transition with
72 10 copies are expected to be smaller than when 50 copies are employed. Thus the transitions in 10 copies run take place on a longer time scale. In both our studies the NH 3 copies showed limited conformational sampling in 500 ps and 200 ps period. It demonstrates, on the other hand, the regular LES simulations are more than an order of magnitude more efficient than single copy methods. To verify the protonation state of the NH 3 during the interdomain channeling, simulations with NH 4 + were done (Figure 6.4, Panel E). In gas phase, the NH 4 + copies started to stream out from a â€œholeâ€ on the wall of the channel and off the protein at 4 ps. This leak process lasted until 82 ps when all the copies are gone. The hole, formed by side chains of Thr76, Asp408, Arg482 and backbones of Ile407 and Asp408, was connected to a special â€œpassageâ€ around the interface between domains. This passage was linked to the outer space of protein and so efficient that the copies never got trapped once moved into. The result in solvent model repeated the leak as well. The leak occurred at the same position and the process took the whole 200 ps. In both studies, no copies ever moved down through the channel. Cavities and Electrostatic Properties of the Channel As Figure 6.5 shown, the cavities calculations on the snapshots of LES trajectories clearly showed the existence and expansion of the cavities located at the channel area, which correlated with the simulations well. At the 25 ps snapshot, the cavities of upper part of channel extended to the PRTase domain. The residues reported by VOIDOO to line out the channel are: Tyr74, Thr76, Arg188, Phe254, Glu255, Tyr258, Phe259, Phe334, Ile335, Gly406 and Ile407. These were consistent with the originally proposed pathway by Krahn et al. (154). Interestingly, the crystal structure of E. coli GPATase did not show cavities at the channel pathway (Figure 6.5, Upper Left). In total three major
73 cavities (>2 3 ) were reported: one of 10.967.033 3 at DON binding pocket, one of 5.358.271 3 around Ile407 and one of 2.104.078 3 around Ile335. The latter two were close to the channel pathway but not exactly on it. One possible reason to it is that the channel structure was collapsed after model building and structure refinement. It is also possible that the channel structure was transient and formed only when the NH 3 ligand moves through and pushes the side chains in the front away. The cavities calculations on different orientations of the protein were consistent, especially for the volume of major cavities. The electrostatic potential surface of the E. coli GPATase NH 3 channel is mostly neutral inside (Figure 6.6). The negatively charged surface area around the Cterminal PRTase end is closed to the pyrophosphate group of PRPP. However, the region around the entrance of the channel glutaminase end is clearly distinguishable as intensely positively charged. Five basic residues, Arg26, Arg73, Arg188, Arg482 and Lys46 were found to be within 8 of the thioester NTNG and capped the entrance of the channel. This unusual folding mode of basic residues clustering might have certain biological function on NH 3 channeling. In the simulation with NH 4 + , the electrostatic repulsion between ligand and the basic residues capped the entrance may be the major driving force for NH 4 + leaking. The Asp408 at the outlet may act to guide the NH 4 + copies at the very beginning of the simulation. The unique electrostatic properties of the NH 3 channel ensure the essential role of NH 3 in the enzyme mechanism. Unlike the bifunctional enzyme dihydrofolate reductase-thymidylate synthase (DHFR-TS), the electrostatical factor is less involved in the intermediate channeling in GPATase. The X-ray structure of DHFR-TS suggests that the product of TS,
74 dihydrofolate, moves along a positively charged electrostatic â€œhighwayâ€ across the surface of the protein (177, 178). This electrostatic channel, exposed to the bulk solvent, is supported by Brownian dynamics simulation (179, 180). However, the reverse directional simulation also turns out very high transfer efficiency, suggesting that simple diffusion is ultimately responsible for intermediate channeling. The role of electrostatics is more one of restricting the intermediate motions than one of guiding it from the TS to the DHFR active site. Our LES simulations on the intramolecular NH 3 channel agree well on the role of channel: it is simply as of a hydrophobic â€œtubeâ€ to connect two active sites. There is no thermodynamic advantage over the â€œchannelingâ€ process but clearly as the result of kinetic process. However, the unique electrostatic properties at the two ends of the NH 3 channel inside GPATase may have an additional role to assure the protonation state of the intermediate before channeling. The LES simulations of NH 3 in the presence of TIP3P molecules inside the channel showed that TIP3P changed the local potential of the channel to the large extent that it blocked the NH 3 channeling. The NH 3 copies couldnâ€™t move down and begin to leak out of the channel continuously at the late phase of the 24 and 10 TIP3P simulations. In the simulation with 3 TIP3P, only the upper 2/3 region of the channel was accessible to the NH 3 copies and the rest part was never reached. The simulations starting from the C-end showed the worse: NH 3 copies couldnâ€™t even move into the channel and leaked out in the early phase of the simulation. The above results suggested the importance of hydrophobic property to the channel functioning. L415A and F314A Mutant To analyze the perturbation of the mutations upon the NH 3 channel (155), mutant structures of L415A and F314A were prepared, minimized and further simulated with the
75 same LES protocol. The results of wild type and mutants, shown in Table 6.6, agree well with the experimental data. L415A, the leaky mutant identified by enzyme assays, showed the leak in gas phase LES simulation (Figure 6.7). At the 50 ps, 16 copies was found to leak off the protein compared to 4 and 6 copies in wild type and F334A mutant. Trajectory analysis showed the leak occurred at the same â€œholeâ€ as in the NH 4 + simulation. Superimposition of the average structure of 50-100 ps on wild type and L415A trajectory showed the formation of the hole on the wall of the channel (Figure 6.8). The space between Thr76 side chain and Asp408 whole residue, Ile407 backbone was enlarged and the Thr76, Asp408 CA distance increased by 1.8 in L415A mutant. These make a hole on the wall large enough to let the NH 3 leak through. It is surprise that the site of mutation, Leu415 is located down at the PRPP binding pocket, which is far away from the leaky hole. Coupled motions in protein dynamics may play a role in this mechanism (175, 176). Residues far from the active site can affect the enzyme catalysis through correlated or anti-correlated motions. The average R. M. S. D fluctuation of 50-100 ps of L415A mutant trajectory deviates from the wild type in residues 110-130, 326-350 and 440-500 region (Figure 6.9). Among them, residues 326-350 belong to the flexible loop that closes over the PRPP binding pocket and forms the NH 3 channel. Other types of interactions also exist in this area. First, the side chain of Leu415 is close to the side chain of Met409, which is next to Asp408 of the hole. Second, Leu415 backbone HN hydrogen bonded to backbone O of Val370, which is next to Arg371. The latter joins Arg342 and Arg482 to form stable hydrogen bonds with Asp408. The extensive hydrogen bond network found in this interdomain area may modulate the structural constraints and signaling during the catalysis. The later control simulation of L415A
76 mutant without LES conditions reproduced the â€œleaking holeâ€ formation (Figure 6.10). This indicated that the potential surface we obtained in LES simulation was the same as the one from the regular MD simulation. Thus we can exclude the possibility that the leaking occurred in LES simulation was an artifact caused by the sampling techniques. The L415A mutant simulation with stochastic boundary conditions, however, did not show leak as in the gas phase. But 10-20 NH 3 copies clustered at a special pocket formed by Phe254, Ile407, Met409 and Val460, which is accessible by few copies in wild type simulation. The inconsistence of the result with the solvation model is likely due to the approximation brought by the solvation model. The stochastic boundary conditions model was of great use to study the localized region of the protein interior that of particular interest. The dynamic properties of the atoms in the reaction region, such as equilibrium fluctuation and correlation, can be reproduced by treating the atoms in the buffer region as interacting Langevin oscillators. However, this model has one limitation: it does not include the low-frequency collective motions into the response functions for the buffer region atoms (163). For most active-site studies it is not necessary since relatively small fluctuations are likely to play the dominant role. In our case, when large-scale motion is to be considered in the mechanism, the results were likely to be affected. The F334A mutant is expected for a blocked or disrupted NH 3 channel (155). It has a lower limit channeling efficiency but not leaky channel phenotype. In both gas phase and solvation model, no leak was observed. In stochastic boundary conditions, however, the energy minima in the channel were lowered down to a position around Val370 and Arg371. The NH 3 copies clustered around it for more than 160 ps, much longer than the wild type. The average R. M. S. D. fluctuation for F334A deviates from the wild type in
77 residues 270-290 and 440-460 region (Figure 6.9). Combining the above two points, it is reasonable to speculate that the mutation affects the dynamic behaviors of the enzyme and alters the potential surface inside the channel, thus blocking the NH 3 channeling.
78 Figure 6.1 Diagram of the GPATase Nterminal active site covalent intermediate structure and Cterminal active site substrate PRPP and the individual model compounds used for the parameterization. A: Glutamine analog 6-diazo-5-oxo-L-nor-leucine (DON). B: Glutamine thioester covalent intermediate. C: S-methyl thioacetate (SMTA). D: Carbocyclic analog of PRPP (cPRPP). E: Alpha-D-5'-phosphoribosyl pyrophosphate (PRPP). F: Tetrahydrofuran phosphoric acid (THFP). Figure 6.2 Superimposition of the glutamine analog 6-diazo-5-oxo-L-nor-leucine (DON) in the crystal structure of E. coli GPATase and the glutaminase domain substrate glutamine in the crystal structure of E. coli asparagine synthetase B.
79 Figure 6.3 Relative potential energy as a function of the dihedral C1-C-S-C2 (A) and O=C-S-C2 (B) in model compound S-methyl thioacetate (SMTA). Symbols are as follows: MP2/6-31G(d) ab initio energies, bold line (); RHF/6-31G(d) ab initio energies (); CHARMM empirical potential energies, dashed line ().
80 Figure 6.4 Illustrations of the movements of the center-of-mass (blue sphere) of NH 3 and NH 4 + ligands. Center-of-mass was made on all NH 3 molecules inside the channel at different snapshots of LES simulations. The figures were made by superimposing different snapshots and retained the residue structures of only 25 ps (A) or 50 ps (B-F) for clarity reason. Panel A: regular LES simulation in gas phase starting from Nterminal of the channel. Panel B: regular LES simulation in solvation model starting from Nterminal. Panel C: LES simulation starting from Cterminal with solvation model. Panel D: LES simulation starting from the middle point with solvation model. Panel E: LES simulation of the NH 4 + molecules with solvation model. Panel F: LES simulation with 10 copies of NH 3 molecules with solvation model. The atom color code is green for carbon, red for oxygen, blue for nitrogen, pink for phosphorus, white for hydrogen and yellow for sulfur.
81 Figure 6.5 Representation of the cavities calculated with VOIDOO on the crystal structure of E. coli GPATase (Upper Left) and the snapshot of 5 ps (Upper Right), 10 ps (Lower Left) and 25 ps (Lower Right) of the LES simulation. The cavities are shown in maroon semi-transparent surface and atom coloring is yellow for carbon, red for oxygen, blue for nitrogen and green for sulfur. The glutamine analog 6-diazo-5-oxo-L-nor-leucine (DON) and cPRPP structure in the crystal structure and the corresponding NTNG and PRPP structures in others are shown by stick model while other protein residues are shown in lines.
82 Figure 6.6 Representation of the electrostatic potential on the van der Waal molecular surface of the ammonia channel of E. coli GPATase calculated with GRASP for CHARMM charges. The positive potentials are coded in blue, negative potentials coded in red and neutral potentials coded in white. Left panel, the channel viewing from Cterminal PRPP to Nterminal NTNG. The surface areas closed to PRPP are shown as negative. Right panel, the view in the reverse direction. Note that the entrance of the ammonia channel is clearly distinguishable as a region of intense positive potentials. In both views, residue NTNG and PRPP are shown by line models and atoms are white for carbon, red for oxygen, blue for nitrogen, yellow for phosphorus, green for sulfur and purple for hydrogen.
83 Figure 6.7 Locations of the ammonia copies relative to the GPATase backbone and the substrate NTNG and PRPP. Upper Panel: the snapshot of 28 ps at the LES simulation of E. coli GPATase wild type. Lower Panel: the snapshot of 39 ps at the LES simulation of E. coli GPATase L415A leaking mutant. The protein is depicted by BioMedCAChe (Fujitsu, Inc.) as ribbon drawings and secondary structure coloring is yellow for helices, red for sheet. Ammonia copies and residue NTNG and PRPP are shown by stick models and atoms are grey for carbon, red for oxygen, blue for nitrogen, yellow for sulfur and white for hydrogen.
84 Figure 6.8 Comparison of the leaking site structures in E. coli GPATase mutant (red) shown in LES simulation superimposed with the wild type structure (blue). Both structures are taken from the average structures of 50-100 ps gas phase LES trajectory. Panel A: van der Waal molecular surface (blue dot) of the wild type structure. Panel B: van der Waal molecular surface (red dot) of the mutant structure.
85 Figure 6.9 The atomic R. M. S. D fluctuation over the 50 â€“ 100 ps LES simulation of wild type E. coli GPATase (A), L415A mutant (B) and F334A mutant (C).
86 Figure 6.10 The time dependence of distance between CA atoms of leaking site residues during the LES or regular simulations. A: Distance between CA atoms of Thr76 and Ile407 in the simulations of wild type (solid line) and L415A mutant (dotted line) with LES setup. B: Distance between CA atoms of Thr76 and Ile407 in the simulations of wild type (solid line) and L415A mutant (dotted line) without LES setup. C: Distance between CA atoms of Thr76 and Asp408 in the simulations of wild type (solid line) and L415A mutant (dotted line) with LES setup. D: Distance between CA atoms of Thr76 and Asp408 in the simulations of wild type (solid line) and L415A mutant (dotted line) without LES setup.
87 Figure 6.11 Representation of the water molecules calculated with FLOOD on the 25 ps snapshot of E. coli GPATase LES simulation. The water molecules are shown in red semi-transparent spheres and atom coloring is yellow for carbon, red for oxygen, blue for nitrogen and green for sulfur. The substrates NTNG and PRPP structures are shown by stick model while other protein residues are shown in lines.
88 Table 6.1 Vibrational data for the model compound S-methyl thioacetate (SMTA) Mode Experimental freq. a (cm-1) Ab initio freq. (cm -1 ) CHARMM freq. (cm -1 ) IR intensity b Assignment c 1 425 484 421 1.5117 C=O 2 626 623 678 60.5407 C-S O-S 3 726 742 740 2.5109 S-CH 3 4 940 956 954 32.2228 rCH 3 -S 5 988 997 963 3.9139 rCH 3 -C 6 1106 1022 1023 3.8898 rCH 3 -S 7 1140 1142 1128 167.6671 C-C 8 1312 1378 1391 1.9976 s CH 3 -S 9 1355 1391 1418 24.2186 s CH 3 -C 10 1428 1464 1427 8.4180 as CH 3 -C asCH3-S 11 1690 1696 1632 151.3010 C=O 12 2929 2997 2917 0.5378 s CH 3 -C 13 3003 3016 2976 9.5931 as CH 3 -C asCH3-S a Experimental data from refs 145, 171 as reported. b IR intensities obtained from the cited ab initio calculations. c indicates stretching modes, indicates bends, r indicates rocking, indicates out of plane bending; index s=symmetric, as=asymmetric. Table 6.2 Bond stretching parameters derived from the model compound SMTA Atom types Reference length () Force constant (Kcal/mol 2 ) S-CC 1.78 260.00 Table 6.3 Angle bending parameters derived from the model compound SMTA Atom types Reference angle (Degrees) Force constant (Kcal/moldeg 2 ) CT3-CC-S 114.40 80.00 CC-S-CT3 98.20 55.00 O-CC-S 122.30 50.00
89 Table 6.4 Dihedral parameters derived from the model compounds SMTA and THFP Atom types Multiplicity Phase (Degrees) Force constant (Kcal/moldeg 2 ) CT3-CC-S-CT3 1 0 1.00 CT3-CC-S-CT3 2 180 2.50 O-CC-S-CT3 1 180 1.00 O-CC-S-CT3 2 180 2.50 ON6B-CN7-ON2-P 1 180 7.30 ON6B-CN7-ON2-P 2 0 1.25 ON6B-CN7-ON2-P 3 0 0.80 Table 6.5 Improper dihedral parameters derived from the model compound SMTA Atom types Out-of-plane ref. angle (Degrees) Force constant (Kcal/moldeg 2 ) O-S-CT3-CC 0.00 45.00 Table 6.6 The number of copies of NH 3 molecule leaking out of channel during the different phases of LES simulation Molecules 10 ps 50 ps 100 ps Wild type 0 4 11 F334A 0 6 10 L415A 1 16 26
CHAPTER 7 EXPLORING THE STRUCTURAL AND THERMODYNAMIC PROPERTIES OF AMMONIA CHANNEL IN GLUTAMINE PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE. PART II: GIBBS FREE ENERGY CALCULATIONS A critical question in the evaluation of functions of ammonia channel is whether the substrate is transferred in the ionic form or not. If diffusion of ammonia inside the channel is faster than ammonium ion, the lower energy route for ammonia migration will be preferred. Locally enhanced sampling technique allowed us to study the kinetics of the process by which the ammonia copies distributed along the pathway. Like other dynamics methods, LES can also give the descriptions about the potential energy surface of local structure. But at this section, the free energy profile is plausible to answer the questions related to the thermodynamics properties of the channel. A variety of theoretical simulations have been undertaken to study the possible pathway by which the substrate or product diffuse into and/or out of the active sites (198, 201, 208, 212). Umbrella sampling along the coordinate for substrate channeling, as a module in CHARMM, was selected to calculate free energies for the migration of ammonia/ammonium ion. In details, the free energy barriers were characterized and the residues that may assist the substrate migration were identified by the analysis of trajectories. The respective mechanisms behind the potential of mean force (PMF) profiles are discussed and the possible implications for the substrate channeling are proposed. 90
91 Methods The starting coordinates were derived from the 28 ps snapshot of previous trajectory of locally enhanced sampling simulation. Ammonia copies were distributed along the proposed channel pathway that consists of hydrophobic residues. This defines the possible route by which the ammonia substrate is channeled from the glutaminase active site to PRTase active site. The free energy profiles for ammonia/ammonium diffusion through the route were calculated using the umbrella sampling techniques (197-207). A harmonic biasing potential, 20K U was imposed to hold the substrates within the specified region. The distance between N of ammonia/ammonium and the carbonyl carbon CDâ€™ of the addition structure, NTNG, was used as coordinate () for the channeling of the substrates. K was the force constant and set to 5 kcal/mol 2 . In order to fully sample the different positions of the pathway, the minimum of the biasing potential was shifted in the desired direction along by 0.5 for each window. The starting structures for each window were prepared by the translation of the ammonia/ammonium by 0.5 in the direction of the defined route. The structure for the first window was derived from quasitetrahedral structure of NH 3 addition to the carbonyl carbon of the thioester. The NH 3 is unengaged and its orientation was kept the same as in its quasitetrahedral adduct precuror. All starting structures were minimized first while applying harmonic restraints on the heavy atoms of protein and the substrates. A second minimization was followed with distance restraints for keeping the distance between NH 3 : N and NTNG:CDâ€™ constant. The steepest descent algorithm was employed in both minimization phases.
92 The protocol that was used successfully to the translocations of ligand was adopted in our work for the calculation of the PMF (201, 208, 209). The distribution of values over the trajectory, , was determined by using bins of 0.1 wide. Each window of the umbrella simulation consisted of a 60 ps run with 10 ps of equilibration and 50 ps of sampling. The window began from the end point of the previous simulation. The equilibration is related to the potential of mean force by: iBCUTkW ln where C is a different constant for each window. Since the various windows overlap, the potential of mean force can be joined to form a continuous function. The constant C is then determined from this procedure. The simulations were carried out in both gas phase and stochastic boundary condition. i i Results and Discussion The PMF profile for the migration of ammonia, shown in Figure 7.1, is basically level at = 3.5-14.5 . There are several small barriers, 0.83-2.16 kcal/mol, along the pathway at = 4.25 , 6.15 and 6.65 . It reaches a plateau when > 6.65 . The trend of free energy of ammonia migration suggests no driving force is behind the substrate channeling. It supports the conclusion of chapter 6 that no preferred direction of ammonia migration is shown and the channel acts simply as a â€œpipeâ€ to connect the two active sites. The barriers in the ammonia route are attributable to H-bond interactions between the substrate and residues nearby (Table 7.1). Starting at 3.5 , NH 3 formed H-bond with NTNG:HT1 and Gly406:O of E. coli GPATase and its formation was dynamic along with the movement of the substrate. The H-bond involved NH 3 :N and NTNG:HT1 became weaken and broke at = ~5.0 and another one broke at = ~6.0 . The
93 barriers at = 5.65 , 6.15 and 6.65 were observed and linked to loss of interaction energies. Time series showed that the H-bonding interaction is established between the HN of Asp408 and ammonia after = 6.65 and stabilize the system. Clearly, the pattern of H-bond formation and breakage at the channeling route correlates with the small barriers located at the PMF curve. The formation of H-bond may function as the guide of substrate migration but could not change the PMF profile dramatically. Since the establishment of H-bond is reversible at either direction, it does not contribute to the driving force behind the migration. In chapter 6 we found the motion of side chain atoms that consisted the channel was expected to assist the movement of the substrate and formation of the channel. Karplus et al. concluded that motions of side chain were vital to many events associated with protein function, catalysis and product release (197, 208, 210). Our findings generally agreed with it and shed lights on the catalysis mechanisms of this unique family of amidotransferase. The high-energy barrier, at the height of ~4.40 kcal/mol, among the PMF profile of ammonium ion movement seems prohibitive (Figure 7.1). Like NH 3 , ammonium ion formed H-bonds with Gly406 and had one more H-bonds with NTNG at the initial phase of migration. Along with NH 4 + moving down, these interactions became weaken and two H-bonds broke at = 5.5 . A significant repulsive force for the diffusion of ammonium ion through the channel is an electrostatic repulsion in the direction of three Arg residues, 342, 371 and 482. Together with Arg491, these four Arg residues stay at one side of the channel and form a â€œleverâ€ at the middle of the channel. However, they were not the residues that delineate the channel directly but contributed to the intense positive potentials at the entrance of the channel (Figure 6.6). The involvement of Arg residues to
94 the high barrier is clearly reflected in their distance with the ammonium ion. At = 7.5 , the distance of NH 4 :N/Arg342:NH2 is 6.50 . This distance reached its minimum of 5.94 at = 8.0 while the PMF was at its highest peak. The positively charged guanidinium side chain of arginines engaged in an electrostatic repulsion with the ammonium ion. Then PMF decreased while the distance of NH 4 :N/Arg342:NH2 increased. At = 8.5 the distance went back to 6.13 . The distance of NH 4 to Arg482 and Arg371 followed the similar pattern. These findings were consistent with the LES simulation on the ammonium ion. The copies of ions were forced to escape from the channel at the very beginning of the simulation. It is likely that the ammonium ion encountered the prohibitive thermodynamic barrier and left the protein via an alternate exit channel. The results obtained from the simulations with the stochastic boundary condition were similar to the ones with gas phase except the fact that the values corresponding to the barriers shift to the left by ~2 (Figure 7.2). The heights of the barriers are comparable to their counterparts in gas phase simulations. The shift may be brought by the limitation of the algorithm as discussed in chapter 6. As it may be noted that the pathway of ammonia channeling is not straight. It simply bends at = 5.5 and points toward the PRTase active site. Due to the shape of the entire pathway, it is reasonable to deduce that the reaction coordinate, , represent the route of migration. However, the reaction coordinate in umbrella sampling is only a distance value that has no directionality. It is because the umbrella biasing potential does not impose any directionality. Due to the large energy barriers the protein imposed on the movements of other directions, the substrate will take the very shortest restraint distance
95 defined. In addition, the protein is unable to rearrange completely on the time scale of each window.
96 ( ) ( ) Figure 7.1 PMF profile for ammonia (Top) and ammonium ion (Bottom) through the ammonia channel in E. coli GPATase in gas phase simulation.
97 ( ) ( ) Figure 7.2 PMF profile for ammonia (Top) and ammonium ion (Bottom) through the ammonia channel in E. coli GPATase in simulation with stochastic boundary condition.
98 Table 7.1 Distances () between hydrogen bond donor and acceptors at the channel region of E. coli GPATase during simulations at different values of (). Donor Acceptor = 4.0 = 4.5 = 5.0 = 5.5 = 6.0 = 7.0 = 7.5 = 8.0 NTNG:HT1 NH 3 :N 1.79 2.05 2.43 NH 3 :HN1 Gly406:O 1.88 2.00 1.95 1.61 2.44 Asp408:HN NH 3 :N 2.26 2.04 2.20 NH 3 :HN1 Asp408:N 2.35
APPENDIX LIST OF ABBREVIATION MCR melanocortin receptor hMCR human melanocortin receptor mMC1R mouse melanocortin-1 receptor mMC4R mouse melanocortin-4 receptor GPCR G-protein coupled receptor AGRP agouti-related protein hAGRP human agouti-related protein ASP agouti protein MSH melanocyte stimulating hormone ACTH adrenocorticotropic hormone POMC proopiomelanocortin SCR structurally conserved region SVR structurally variable region PDB Protein Data Bank TM transmembrane EL extracellular loop MD molecular dynamics MC monte carlo PBC periodic boundary condition MMFP miscelaneous mean field potential DPR diaminopropionic acid PME particle mesh ewald R. M. S. D. root mean square deviation PRPP alpha-D-5'-phosphoribosyl pyrophosphate SMTA S-methyl thioacetate THFP tetrahydrofuran phosphoric acid MP2 the second-order Mller-Plesset RHF restricted Hartree-Fock MAMI methylamine MAMM methylammonium GPATase glutamine phosphoribosylpyrophosphate amidotransferase AS-B asparagine synthetase B CPS carbamoyl phosphate synthase BPTI bovine pancreatic trypsin inhibitor DON 6-diazo-5-oxo-L-nor-leucine LES locally enhanced sampling PMF potential of mean force 99
LIST OF REFERENCES 1. Lu, D., Willard, I. R., Patel, S., Kadwell, L., Overton, T., Kost, M., Luther, W., Chen, R. P., Yowchik, W. O., Wilkison, R. D. C. Nature 1994, 371, 799-802. 2. Chhajlani, V., Wikberg, J. E. S. FEBS Lett 1992, 309, 417-420. 3. Gantz, I., Konda, Y., Tashiro, T. J Biol Chem 1993, 268, 8246 8250. 4. Gantz, I., Miwa, H., Konda, Y. J Biol Chem 1993, 268, 15174 15179. 5. Gantz, I., Shimoto, Y., Konda, Y., Miwa, H., Dickinson, C. J., Yamada, T. Biochem Biophys Res Commun 1994, 200, 1214 1220. 6. Mountjoy, K. G., Mortrud, M. T., Low, M. J., Simerly, R. B., Cone, R. D. Mol Endocrinol 1994, 8, 1298 1308. 7. Mountjoy, K. G., Robbins, L. S., Mortrud, M. T., Cone, R. D. Science 1992, 257, 1248 1251. 8. Roselli Rehfuss, L., Mountjoy, K. G., Robbins, L. S. Proc Natl Acad Sci USA 1993, 90, 8856 8860. 9. Cone, R. D., Lu, D., Kopula, S. Recent Prog Horm Res 1996, 51, 287 318. 10. Chen, A. S., Marsh, D. J., Trumbauer, M. E., Frazier, E. G., Guan, X. M., Yu, H., Rosenblum, C. I., Vongs, A., Feng, Y., Cao, L., Metzger, J. M., Strack, A. M., Camacho, R. E., Mellin, T. N., Nunes, C. N., Min, W., Fisher, J., Gopal-Truter, S., MacIntyre, D. E., Chen, H. Y., Van Der Ploeg, L. H. Nat. Genet. 2000, 26, 97-102. 11. Butler, A. A., Kesterson, R. A., Khong, K., Cullen, M. J., Pelleymounter, M. A., Dekoning, J., Baetscher, M., Cone, R. D. A. Endocrinology 2000, 141, 3518-3521. 12. Huszar, D., Lynch, C. A., Fairchild-Huntress, V., Dunmore, J. H., Smith, F. J., Kesterson, R. A., Boston, B. A., Fang, Q., Berkemeir, L. R., Gu, W., Cone, R. D., Campfield, L. A., Lee, F. Cell 1997, 88, 131-141. 13. Hruby, V. J., Wilkes, B. C., Cody, W. L., Sawyer, T. K., Hadley, M. E. Pept. Protein Rev. 1984, 3, 1-64. 14. Mountjoy, K. G., Robbins, L. S., Mortrud, M. T., Cone, R. D. Science 1992, 257, 1248-1251. 100
101 15. Lu, D., Vge, D. I., Cone, R. D. Mol. Endocrinol. 1998, 12, 592-604. 16. Chhajlani, V. Muceniece R. and Wikberg, J.E.S. Biochem Biophys Res Commun 1993, 195, 866 873. 17. Chen, W., Kelly, M. A., Opitz-Araya, X., Thomas, R. E., Low, M. J., Cone, R. D. Cell 1997, 91, 789 798. 18. Hruby, V. J., Wilkes, B. C., Hadley, M. E., Al-Obeidi, F., Sawyer, T. K., Staples, D. J., DeVaux, A., Dym, O., Castrucci, A. M., Hintz, M. F., Riehm, J. P., Rao, K. R. J. Med. Chem. 1987, 30, 2126-2130. 19. Castrucci, A. M. L., Hadley, M. E., Sawyer, T. K., Wilkes, B. C., Al-Obeidi, F., Staples, D. J., DeVaux, A. E., Dym, O., Hintz, M. F., Riehm, J., Rao, K. R., Hruby, V. J. Gen. Comp. Endocrinol. 1989, 73, 157-163. 20. Haskell-Luevano, C., Sawyer, T. K., Hendrata, S., North, C., Panahinia, L., Stum, M., Staples, D. J., Castrucci, A. M., Hadley, M. E., Hruby, V. J. Peptides 1996, 17, 995-1002. 21. Miller, M. W., Duhl, D. M., Vrieling, H. Genes Dev 1993, 7, 454 467. 22. Shutter, J. R., Graham, M., Kinsey, A. C., Scully, S., Lthy, R., Stark, K. L. Genes Dev. 1997, 11, 593-602. 23. Ollmann, M. M., Wilson, B. D., Yang, Y.-K., Kerns, J. A., Chen, Y., Gantz, I., Barsh, G. S. Science 1997, 278, 135-138. 24. Yang, Y. K., Ollmann, M. M., Wilson, B. D. Mol Endocrinol 1997, 11, 274 280. 25. Kiefer, L. L., Ittoop, O. R., Bunce, K. Biochemistry 1997, 36, 2084 2090. 26. Perry, W. L., Hustad, C. M., Swing, D. A., Jenkins, N. A., Copeland, N. G. Genetics 1995, 140, 267 274. 27. Perry, W. L., Nakamura, T., Swing, D. A. Genetics 1996, 144, 255 264. 28. Quillan, J. M., Sadee, W., Wei, E. T., Jimenez, C., Ji, L., Chang, J. K. FEBS Lett 1998, 428, 59 62. 29. Willard, D. H., Bodnar, W., Harris, C. Biochemistry 1995, 34, 12341 12346. 30. Yang, Y. K., Thompson, D. A., Dickinson, C. J. Mol Endocrinol 1999, 13, 148 155. 31. Kiefer, L. L., Veal, J. M., Mountjoy, K. G., Wilkison, W. O. Biochemistry 1998, 37, 991 997.
102 32. Haskell-Luevano, C., Cone, R. D., Monck, E. K., and Wan, Y. -P., Biochemistry 2001, 40, 6164-6179. 33. Yang, Y. K., Dickinson, C. J., Zeng, Q., Li, J. Y., Thompson, D. A., Gantz, I. J. Biol. Chem. 1999, 274, 14100-14106. 34. Yang, Y. K., Fong, T. M., Dickinson, C. J., Mao, C., Li, J. Y., Tota, M. R., Mosley, R., Lex, Van der Ploeg, L. H. T., Gantz, I. Biochemistry 39, 14900-14911. 35. Yang, Y. K., Dickinson, C., Haskell-Luevano, C., Gantz. I. J. Biol. Chem. 1997, 272, 23000-23010. 36. Haskell-Luevano, C., Sawyer, T. K., Trumpp-Kallmeyer, S., Bikker, J. A., Humblet, C., Gantz, I., Hruby, V. J. Drug Design and Discovery, 14, 197-211. 37. Chhajlani, V., Xu, X., Blauw, J., and Sudarshi, S. Biochemical and Biophysical Research Communications. 219, 521-525. 38. Schioth, H. B., Muceniece, R., Szardenings, M., Prusis, P., Wikberg, J. E. S. Biochemical and Biophysical Research Communications. 229, 687-692. 39. Schlenkrich, M., Brickmann, J., MacKerell, A. D., Jr. & Karplus, M. Biological Membranes: A Molecular Perspective from Computation and Experiment (Merz, K. M. & Roux, B., eds.), 1996, 31-81. Birkhaser, Boston. 40. Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, J., K. M., Ferguson, D. M., Spellmeyer, D. C., Fox, T., Caldwell, J. W. & Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179-5197. 41. Jorgensen, W. L., Tirado-Rives, J. J. Am. Chem. Soc. 1988, 110, 1657-1666. 42. Dunfield. J. Phys. Chem. 1978, 82, 2609. 43. Hwang, M. J., Stockfisch, T. P., Hagler, A. T. J. Am. Chem. Soc. 1994, 116, 2515-2525. 44. Lii, J. L., Allinger, N. L. J. Comp. Chem. 1991, 12, 186-199. 45. Halgren, T. A. J. Comp. Chem. 1996, 17, 490-519. 46. Mayo, S. L., Olafson, B. D., Goddard, I., W. A. J. Phys. Chem. 1990, 94, 8897-8909. 47. Caldwell, J. W., Kollman, P. A. J. Phys. Chem. 1995, 99, 6208-6219. 48. Stern, H. A., Kaminski, G. A., Banks, J. L., Zhou, R., Berne, B. J., Friesner, R. A. J. Phys. Chem. B 1999, 103, 4730-4737. 49. Lewin, R. Science 1987, 237, 1570.
103 50. Reeck, G. R. Cell 1987, 50, 667. 51. Dayhoff, M.O., Eck, R.V. Atlas of Protein Sequence and Structure (Dayhoff, M.O., ed.), 1968, 3, 33-41. National Biomedical Research Foundation, Washington, D.C. 52. Dayhoff, M. O., Schwartz, R. M., Orcutt, B. C. Atlas of Protein Sequence and Structure (Dayhoff, M.O., ed.), 1978, 5, suppl. 3, 345-358, National Biomedical Research Foundation, Washington, D.C. 53. Dayhoff, M. O., Barker, W. C., Hunt, L. T. Meth. Enzymol. 1983, 91, 524-545. 54. Henikoff, S., Henikoff, J. G. Proc. Natl. Acad. Sci. USA 1992, 89, 10915-10919. 55. Johnson, M. S., Srinivasan, N., Sowdhamini, R., Blundell, T. L. Crit. Rev. Biochem. Mol. Biol., 1994, 29, 1. 56. Marti-Renom, M. A., Stuart, A., Fiser, A., Snchez, R., Melo, F., Sali, A. Annu. Rev. Biophys. Biomol. Struct. 2000, 29, 291-325. 57. Sali, A., Blundell, T. L. J. Mol. Biol. 1993, 234, 779-815. 58. Fiser, A., Do, R. K., Sali, A. Protein Science 2000, 9, 1753-1773. 59. Browne, W. J., North, A. C. T., Philips, D. C., Drew, K., Vanaman, T. C., Hill, R. L. J. Mol. Biol. 1969, 42, 65. 60. Issacs, N., James, R., Niall, H., Bryant-Greenwood, G., Dodson, G., Evans, A., Nature 1978, 271, 278. 61. Jones, T. H., Thirup, S. EMBO J. 1986, 5, 819. 62. Unger, R., Harel, D., Wherland, S., Sussman, J. L. Proteins, 1989, 5, 335. 63. Claessens, M., Cutsen, E. V., Lasters, I., Wodak, S. Protein Eng., 1989, 4, 335. 64. Levitt, M. J. Mol. Biol. 1992, 226, 507. 65. Blundell, T. L., Sibanda, B. L., Sternberg, M. J. E., Thornton, J. M. Nature, 1987, 326, 347. 66. Blundell, T. L., Carney, D., Gardner, S., Hayes, F., Howlin, B., Hubbard, T. Eur. J. Biochem. 1988, 172, 513. 67. Needleman, S. B., Wunsch, C. D. J. Mol. Biol. 1970, 48, 442-453. 68. McLachlan, A. D. J. Mol. Biol. 1972, 64, 417.
104 69. Johnson, M. S., Srinivasan, N., Sowdhamini, R., Blundell, T. L. Crit. Rev. Biochem. Mol. Biol., 1994, 29, 1. 70. Overington, J. P., Johnson, M. S., Sali, A., Blundell, T. L. Proc. R. Soc. Lond., 1990, B241, 146. 71. Topham, C. M., McLeod, A., Eisenmenger, F., Overington, J. P., Johnson, M. S., Blundell, T. L. J. Mol. Biol., 1993, 229, 194. 72. Ponder, J. W., Richards, F. M. J. Mol. Biol., 1987, 193, 775. 73. Dunbrack, R. L., Karplus, M. J. Mol. Biol., 1993, 230, 543. 74. McGregor, M. J., Islam, S. A., Sternberg, M. J. E. J. Mol. Biol., 1987, 198, 295. 75. van Gunsteren, W. F., Berendsen, H. J. C. J. Mol. Phys. 1977, 34, 1311-1327. 76. Sankararamakrishnan, R., Konvicka, K., Mehler, E. L., Weinstein H. International Journal of Quantum Chemistry 2000, 77, 174-186. 77. Bolin, K. A., Anderson, D. J., Trulson, J. A., Thompson, D. A., Wilken, J., Kent, S. B., Gantz, I., Millhauser, G. L. FEBS Lett. 1999, 451, 125-131. 78. Tota, M. R., Smith, T. S., Mao, C., MacNeil, T., Mosley, R. T., Van der Ploeg, L. H. T., Fong, T. M. Biochemistry 1999, 38, 897-904. 79. Haskell-Luevano, C., Cone, R. D., Monck, E. K., Wan, Y. -P., Biochemistry 2001, 40, 6164-6179. 80. Haskell-Luevano, C., Sawyer, T. K., Trumpp-Kallmeyer, S., Bikker, J. A., Humblet, C., Gantz, I., Hruby, V. J. Drug Design and Discovery 1996, 14, 197-211. 81. Baldwin, J. M., Schertler, G. F. X., Unger, V. M. J. Mol. Biol. 1997, 272, 144-164. 82. Palczewski, K., Kumasaka, T., Hori, T., Behnke, C. A., Motoshima, H., Fox, B. A., Le Trong, I., Teller, D. C., Okada, T., Stenkamp, R. E., Yamamoto, M., Miyano, M. Science 2000, 289, 739-745. 83. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., Karplus, M. J. Comp. Chem. 1983, 4, 187-217. 84. MacKerell, A. D., Brooks, Jr., B., Brooks, C. L., Nilsson, III, L., Roux, B., Won, Y., Karplus, M. The Encyclopedia of Computational Chemistry 1998, 1, 271-277. P. v. R. Schleyer , editors (John Wiley & Sons: Chichester) 85. Foloppe, N., MacKerell, Jr., A.D. Journal of Computational Chemistry 2000, 21, 86-104.
105 86. MacKerell, Jr., A.D., Banavali, N. Journal of Computational Chemistry 2000, 21, 105-120. 87. MacKerell, Jr., A. D., Bashford, D., Bellott, M., Dunbrack Jr., R. L., Evanseck, J. D., Field, M. J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F. T. K., Mattos, C., Michnick, S., Ngo, T., Nguyen, D. T., Prodhom, B., Reiher, III, W. E., Roux, B., Schlenkrich, M., Smith, J. C., Stote, R., Straub, J., Watanabe, M., Wiorkiewicz-Kuczera, J., Yin, D., Karplus, M. Journal of Physical Chemistry B 1998, 102, 3586-3616. 88. Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W., Klein, M. L. J. Chem. Phys. 1983, 79, 926-935. 89. Thompson, J. D., Higgins, D. G., Gibson, T. J. Nucleic Acids Res. 1994, 22, 4673-4680. 90. Garnier, J., Gibrat, J. F., Robson, B. Method Enzymol. 1996, 266, 540-553. 91. Ballesteros, J. A., Shi, L., Javitch, J. A. Mol. Pharmacol. 2001, 60, 1-19. 92. Ballesteros, J. A., Weinstein, H. Methods in Neurosciences 1995, 25, 366-428. 93. Baldwin, J. M. The EMBO Journal 1993, 12, 1693-1703. 94. Javitch, J. A., Shi, L., Simpson, M. M., Chen, J., Chiappa, V., Visiers, I., Weinstein, H., Ballesteros, J. A. Biochemistry 2000, 39, 12190-12199. 95. Gether, U., Johansen, T. E., Snider, R. M., Lowe, J. A., Nankanishi, S., Schwartz, T. W. Nature 1993, 362, 345-348. 96. Turcatti, G., Zoffmann, S., Lowe, J. A., Drozda, S. E., Chassaing, G., Schwartz, T. W., Chollet, A. Journal of Biological Chemistry 1997, 272, 21167-21175. 97. Holst, B., Zoffmann, S., Elling, C. E., Hjorth, S. A., Schwartz, T. W. Molecular Pharmacology 1998, 53, 166-175. 98. Alla, S. A., Quitterer, U., Grigoriev, S., Maidhof, A., Haasemann, M., Jarnagin, K., Mller-Esterl, W. J. Biol. Chem. 1996, 271, 1748-1755. 99. Yang, Y., Fong, T. M., Dickinson, C. J., Mao, C., Li, J. Y., Tota, M. R., Mosley, R., Van der Ploeg, L. H. T., Gantz, I. Biochemistry 2000, 39, 14900-14911. 100. Yang, Y. K., Dickinson, C., Haskell Luevano, C., Gantz, I. Journal of Biogical Chemistry 1997, 272, 23000-23010. 101. Kiefer, L. L., Veal, J. M., Mountjoy, K. G., Wilkison, W. O. Biochemistry 1998, 37, 991-997. 102. Ji, T. H., Grossmann, M., Ji, I. J. Biol. Chem. 1998, 273, 17299-17302.
106 103. Yang, Y. K., Dickinson, C. J., Zeng, Q., Li, J. Y., Thompson, D. A., Gantz, I. J. Biol. Chem. 1999, 274, 14100-14106. 104. Chhajlani, V., Xu, X., Blauw, J., Sudarshi, S. Biochem Biophys Res Commun 1996, 219, 521-525. 105. Schioth, H. B., Fredriksson, A., Carlsson, C., Yook, P., Muceniece, R., Wikberg, J. E. Mol Cell Endocrinol 1998, 139, 109-115. 106. Schioth, H. B., Muceniece, R., Szardenings, M., Prusis, P., Wikberg, J. E. Biochem Biophys Res Commun 1996, 229, 687-692. 107. Schioth, H. B., Petersson, S., Muceniece, R., Szardenings, M., Wikberg, J. E. FEBS Lett. 1997, 410, 223-228. 108. Nicholls, A., Sharp, K. A., Honig, B. Proteins 1991, 11, 281-296. 109. Tota, M. R., Smith, T. S., Mao, C., MacNeil, T., Mosley, R. T., Van der Ploeg, L. H. T., Fong, T. M. Biochemistry 1999, 38, 897-904. 110. Haskell Luevano, C., Monck, E. K., Wan, Y. P., Schentrup, A. M. Peptides 2000, 21, 683-689. 111. Yang, Y., Chen, M., Lai, Y., Gantz, I., Yagmurlu, A., Georgeson, K. E., Harmon, C. M. Mol Pharmacol 2003, 64, 94-103. 112. Haskell Luevano, C., Monck, E. K., Wan, Y. P., Schentrup, A. M. Peptides. 683-689. 113. Thirumoorthy R., Holder J. R., Bauzo R. M., Richards N. G. J., Edison A. S., Haskell-Luevano C. J. Med. Chem. 2001, 44, 4114-4124. 114. van Buuren, A. R., Marrink, S. J., Berendsen, H. J. C. J. Phys. Chem. 1993, 97, 9206-9212. 115. Rolz, C., Mierke, D. F. Biophysical Chemistry 89, 119-128. 116. Christian Rolz, Maria Pellegrini, Dale F. Mierke. Biochemistry 38, 6397-6405. 117. Tieleman, D. P., Berendsen, H. J. C., Sansom, M. S. P. Biophysical Journal 80, 331-346. 118. van Aalten, D. M. F., Bywater, R., Findlay, J. B. C., Hendlich, M., Hooft, R. W. W., Vriend, G. Journal of Computer Aided Molecular Design 1996, 10, 255-262. 119. Linse, P. J. Chem. Phys. 1987, 86, 4177. 120. Locker Carpenter, I., Hehre, W. J. J. Phys. Chem. 1990, 94, 531.
107 121. Pellegrini, M., Mierke, D. F. Biopolymer 1999, 51, 208-220. 122. Giragossian, C., Mierke, D. F. Biochemistry 2001, 40, 3804-3809. 123. Ben-Tal, N., Honig, B. Biophys. J. 1996, 71, 3046-3050. 124. Kessel, A., Cafiso, D. S., Ben-Tal, N. Biophys. J. 2000, 78, 571-583. 125. Tieleman, D. P., Hess, B, Sansom, M. S. P. Biophys. J. 2002, 83 (5): 2393-2407. 126. Law, R. J., Tieleman, D. P., Sansom, M. S. P. Biophys. J. 2003, 84 (1): 14-27. 127. MacCallum, J. L., Tieleman, D. P. J. Am. Chem. Soc. 2002, 124 (50): 15085-15093. 128. Ewald, P. P. Ann. Phys. 1921, 64, 253-287. 129. Darden, T., York, D., Pedersen, L. J. Chem. Phys. 1993, 98, 10089-10092. 130. Essmann, U., Perera, L., Berkowitz, M. L., Darden, T., Lee, H., Pedersen, L. G. J. Chem. Phys. 1995, 103, 8577-8592. 131. Wilczynski, A., Wang, X. S., Xiang, Z. M., Bauzo, R. M., Holder, J. R., Richards, N. G. J., Haskell-Luevano, C. J. Med. Chem. 2003 (submitted) 132. Needleham, S. B., Wunsch, C. D. J. Mol. Biol. 1970, 48, 443-453. 133. Goto, H., Osawa, E. J. Chem. Soc. Perkin Trans. 1993, 2, 187-198. 134. Edison, A. S., Espinoza, E., Zachariah, C. The Journal of Neuroscience 1999, 19, 6318-6326. 135. Alder, B. J., Wainwright, T. E. J. Chem. Phys. 1957, 27, 1208. 136. McCammon, J. A., Gelin, B. R., Karplus, M. Nature 1977, 267, 585. 137. Verlet, L. Physical Review 1967, 159, 98-103. 138. Hockney, R. W. Methods in Computational Physics 1970, 9, 136-211. 139. Swope, W. C., Anderson, H. C., Berens, P. H., Wilson, K. R. Journal of Chemical Physics 1982, 76, 637-649. 140. Foloppe, N., MacKerell, A. D. Jr. Journal of Computational Chemistry 2000, 21, 86-104.
108 141. MacKerell, A. D. Jr., Bashford, D., Bellott, M., Dunbrack, R. L. Jr., Evanseck, J. D., Field, M. J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F. T. K., Mattos, C., Michnick, S., Ngo, T., Nguyen, D. T., Prodhom, B., Reiher, W. E. III, Roux, B., Schlenkrich, M., Smith, J. C., Stote, R., Straub, J., Watanabe, M., Wirkiewicz-Kuczera, J., Yin, D., Karplus M. Journal of Physical Chemistry B 1998, 102, 3586-3616. 142. Foloppe, N., MacKerell, A. D. Jr. Journal of Physical Chemistry B 1998, 102, 6669-6678. 143. Pavelites, J. J., Gao, J., Bash, P. A., Mackerell, A. D. Jr. Journal of Computational Chemistry 1997, 18, 221-239. 144. Nilsson, L., Karplus, M. Journal of Computational Chemistry 1986, 7, 591-616. 145. Smolders, A., Maes, G., Zeegers-Huyskens, Th. Journal of Molecular Structure 1988, 172, 23-40. 146. Lumbroso, H., Bertin, D. M. Journal of Molecular Structure 1989, 212, 113-122. 147. Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E., Robb, M. A., Cheeseman, J. R., Zakrzewski, V. G., Montgomery, J. A., Jr., Stratmann, R. E., Burant, J. C., Dapprich, S., Millam, J. M., Daniels, A. D., Kudin, K. N., Strain, M. C., Farkas, O., Tomasi, J., Barone, V., Cossi, M., Cammi, R., Mennucci, B., Pomelli, C., Adamo, C., Clifford, S., Ochterski, J., Petersson, G. A., Ayala, P. Y., Cui, Q., Morokuma, K., Malick, D. K., Rabuck, A. D., Raghavachari, K., Foresman, J. B., Cioslowski, J., Ortiz, J. V., Stefanov, B. B., Liu, G., Liashenko, A., Piskorz, P., Komaromi, I., Gomperts, R., Martin, R. L., Fox, D. J., Keith, T., Al-Laham, M. A., Peng, C. Y., Nanayakkara, A., Gonzalez, C., Challacombe, M., Gill, P. M. W., Johnson, B., Chen, W., Wong, M. W., Andres, J. L., Gonzalez, C., Head-Gordon, M., Replogle, E. S., Pople, J. A. 1998, Gaussian 98, Revision A.6 ed.: Pittsburgh, PA. 148. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., Karplus, M. J. Comp. Chem. 1983, 4, 187-217. 149. MacKerell, A. D., Brooks, Jr., B., Brooks, C. L., Nilsson, III, L., Roux, B., Won, Y., Karplus, M. The Encyclopedia of Computational Chemistry 1998, 1, 271-277. P. v. R. Schleyer , editors (John Wiley & Sons: Chichester) 150. Elber, R., Karplus, M. J. Am. Chem. Soc. 1990, 112, 9161-9175. 151. Roitberg, A., Elber, R. J. Chem. Phys. 1991, 95, 9277-9287. 152. Stultz, C. M., Karplus, M. J. Chem. Phys. 1998, 109, 8809-8815. 153. Stultz, C. M., Karplus, M. Proteins: Structure, Function, and Genetics 1999, 37, 512-529.
109 154. Krahn, J. M., Kim, J. H., Burns, M. R., Parry, R. J., Zalkin, H., Smith, J. L. Biochemistry 1997, 36, 11061-11068. 155. Bera, A. K., Smith, J. L., Zalkin, H. J. Biol. Chem. 2000, 275, 7975-7979. 156. Bera, A. K., Chen, S., Smith, J. L., Zalkin, H. J. Biol. Chem. 1999, 274, 36498-36504. 157. Chen, S., Burgner, J. W., Krahn, J. M., Smith, J. L., Zalkin, H. Biochemistry 1999, 38, 11659-11669. 158. Larsen, T. M., Boehlein, S. K., Schuster, S. M., Richards, N. G. J., Thoden, J. B., Holden, H. M., Rayment, I. Biochemistry 1999, 38, 16146-16157. 159. Nose, S. J. Chem. Phys. 1984, 81, 511-519. 160. Hoover, W. G. Phys. Rev. 1985, A31, 1695. 161. van Gunsteren, W. F., Berendsen, H. J. C. J. Mol. Phys. 1977, 34, 1311-1327. 162. Brooks, C. L., Karplus, M. J. Chem. Phys. 1983, 79, 6312-6325. 163. Brooks, C. L., Brunger, A., Karplus, M. Biopolymers 1985, 24, 843-865. 164. Brooks, C. L., Karplus, M. J. Mol. Biol. 1989, 208, 159-181. 165. Phelps, D. K., Beth Post C. Protein Science 1999, 8, 2281-2289. 166. Kleywegt, G. J., Zou, J. Y., Kjeldgaard, M., Jones, T. A. International Tables for Crystallography 2001, Volume F. Chapter 17.1, 353-356, 366-367. 167. Kleywegt G. J., Jones T. A. Acta Cryst. 1994, D50, 178-185. 168. Kleywegt G. J., Jones T.A. Acta Cryst. 1996, D52, 826-828. 169. Jones, T. A. J. Appl. Cryst. 1978, 11, 268-272. 170. Jones, T.A., Zou, J., Cowan, S.W., Kjeldgaard, M. Acta Cryst. 1991, A47, 110-119. 171. Girardet, C., Maillard, D., Schriver, A., Perchard, J. P. J. Chem. Phys. 1979, 70, 1511-1521. 172. Nicholls, A., Sharp, K. A., Honig, B. Proteins 1991, 11, 281-296. 173. Laskowski, R. A., MacArthur, M. W., Moss, D. S., Thornton, J. M. J. Appl. Cryst., 1993, 26, 283-291. 174. Morris, A. L., MacArthur, M. W., Hutchinson, E. G., Thornton, J. M. Proteins 1992, 12, 345-364.
110 175. Radkiewicz, J. L., Brooks, C. L. III J. Am. Chem. Soc. 2000, 122, 225-231. 176. Karplus, M. J. Phys. Chem. B 2000, 104, 11-27. 177. Knight, D. R., Kan, C. C., Howland, E., Janson, C. A., Hostomska, Z., Welsh, K. M., Matthews, D. A. Nat. Struct. Biol. 1994, 1, 186-194. 178. Stroud, R. M. Nat. Struct. Biol. 1994, 1, 131-134. 179. Elcock, A. H., Potter, M. J., Matthews, D. A., Knight, D. R., McCammon, J. A. J. Mol. Biol. 1996, 262, 370-374. 180. Elcock, A. H., Huber, G. A., McCammon, J. A. Biochemistry 1997, 36, 16049-16058. 181. Zalkin, H., Smith, J. L. Adv. Enzymol. Relat. Areas Mol. Bio. 1998, 72, 87-144. 182. Zalkin, H. Adv. Enzymol. Relat. Areas Mol. Bio. 1993, 66, 203-309. 183. Muchmore, C. R. A., Krahn, J. M., Kim, J. H., Zalkin, H., Smith, J. L. Protein Sci. 1998, 7, 39-51. 184. Smith, J. L., Zaluzec, E. J., Wery, J. P., Niu, L., Switzer, R. L., Zalkin, H., Satow, Y. Science 1994, 264, 1427-1433. 185. Kim, J. H., Krahn, J. M., Tomchick, D. R., Smith, J. L., Zalkin, H. J. Biol. Chem. 1996, 271, 15549-15557. 186. Chen, S., Tomchick, D. R., Wolle, D., Hu, P., Smith, J. L., Switzer, R. L., Zalkin, H. Biochemistry 1997, 36, 10718-10726. 187. Smith, J. L. Curr. Opin. Struct. Biol. 1999, 8, 686-694. 188. Srere, P. A. Annu. Rev. Biochem. 1987, 56, 89-124. 189. Ovadi, J. J. Theor. Biol. 1991, 152, 1-22. 190. Hyde, C. C., Ahmed, S. A., Padlan, E. A., Miles, E. W., Davies, D. R. J. Biol. Chem. 1988, 263, 17857-17871. 191. Thoden, J. B., Holden, H. M., Wesenberg, G., Raushel, F. M., Rayment, I. Biochemistry 1997, 36, 6305-6316. 192. Larsen, T. M., Boehlein, S. K., Schuster, S. M., Richards, N. G. J., Thoden, J. B., Holden, H. M., Rayment, I. Biochemistry 1999, 38, 16146-16157. 193. Amuro, N., Paluh, J. L., Zalkin, H. J. Biol. Chem. 1985, 260, 14844-14849.
111 194. Tesmer, J. J. G., Klem, T. J., Deras, M. L., Davisson, V. J., Smith, J. L. Nat. Struct. Biol. 1996, 3, 74-86. 195. Thoden, J. B., Miran, S. G., Phillips, J. C., Howard, A. J., Raushel, F. M., Holden, H. M. Biochemistry 1998, 37, 8825-8831. 196. Raushel, F. M., Thoden, J. B., Reinhart, G. D., Holden, H. M. Curr. Opin. Chem. Biol. 1998, 2, 624-632. 197. McCammon, J. A., Karplus, M. Proc. Natl. Acad. Sci. U.S.A. 1979, 76, 3585-3589. 198. Case, D. A., Karplus, M. J. Mol. Biol. 1979, 132, 343-368. 199. Northrup, S. H., Pear, M. R., Lee, C.-Y., McCammon, J. A., Karplus, M. Proc. Natl. Acad. Sci. U.S.A. 1982, 79, 4035-4039. 200. Case, D. A., McCammon, J. A. Ann. N. Y. Acad. Sci. 1986, 482, 222-233. 201. Kottalam, J., Case, D. A. J. Am. Chem. Soc. 1988, 110, 7690-7697. 202. Brooks, C. L., IlI, Karplus, M., Pettitt, B. M. Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics, Wiley & Sons: New York, 1988, 46-49. 203. Beveridge, D. L., DiCapua, F. M. Annul Rev. Biophys. Chem. 1989, 18, 431-492. 204. Young, W. S., Brooks, C. L., III. J. Mol. Biol. 1996, 230, 560-573. 205. Brooks, C. L., III, Case, D. A. Chem. Rev. 1993, 2487-2502. 206. Tobias, D. J., Brooks, C. L., III. Biochemistry 1991, 30, 6059-6070. 207. Torrie, G. M., Valleau, J. P. J. Comp. Phys. 1977, 23, 187-199. 208. Enyedy, I. J., Kovach, I. M., Brooks. B. R. J. Am. Chem. Soc. 1998, 120, 8043-8050. 209. Pomes, R., Roux, B. Biophysical J. 1998, 75, 33-40. 210. Karplus, M., McCammon, J. A. CRC Crit. Rev. Biochem. 1981, 293-349. 211. Jarosinski, M. A., Dodson, W. S., Harding, B. J., Hale, C., McElvain, M., Zamborelli, T. J., Lenz, D. M., Bennett, B. D., Marasco, J., Baumgartner, J., Liu, C-F., Karbon, E. W. 17 th American peptide Symposium 2001 Jun, 322. 212. Tilton, R. F., Jr., Singh, U. C., Weiner, S. J., Connolly, M. L., Kuntz, I. D., Jr., Kollman, P. A., Max, N., Case, D. A. J. Mol. Biol. 1986, 192, 443.
BIOGRAPHICAL SKETCH Xiang Simon Wang was born Jan. 5 th , 1972 in Sichuan, China. He is the son of Huasong Wang and Qiongru Liu. Simon graduated from the School of Pharmaceutical Sciences, Peking University, in July of 1994 where he received a Bachelor of Science degree in pharmacy. He then worked as a research scientist in the Department of Pharmacology, Institute of Medicinal Biotechnology, Chinese Academy of Medical Science, from August of 1994 to September of 1996. Simon entered Peking Union Medical College for graduate studies and received a Master of Science degree in antimicrobial pharmacology in July of 1998. He is currently anticipating graduation in August of 2003 from the University of Florida with a Doctor of Philosophy degree in computational chemistry. 112