Free Energy Simulations of Complex Biological Systems at Constant pH

Material Information

Title:
Free Energy Simulations of Complex Biological Systems at Constant pH
Physical Description:
1 online resource (218 p.)
Language:
English
Creator:
Swails, Jason Matthew
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:
2013

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Chemistry
Committee Chair:
Roitberg, Adrian E
Committee Members:
Deumens, Erik
Horenstein, Nicole Alana
Fanucci, Gail E
Bloom, Linda B

Subjects

Subjects / Keywords:
constant -- exchange -- ph -- pka -- replica
Chemistry -- Dissertations, Academic -- UF
Genre:
Chemistry thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
Solution pH has profound effects on the structure, function, and activity of many complex biomolecules that catalyze the chemical reactions responsible for sustaining life. Even within the human body, the various physiological environments span a wide pH range, from as low as 1 in the stomach to as high as 8.1 in pancreatic secretions. Small changes from the normal pH of a biomolecule's environment can be catastrophically disruptive to its activity. For example, a change in pH of as little as ±0.1 pH units in the human bloodstream is enough to cause life-threatening alkalosis or acidosis. Due to the importance of pH in biology and the profound effect it can have on biomolecules, it is important to incorporate pH effects in computational models designed to treat these biomolecules. The solution pH controls protonation state equilibria of specific functional groups prevalent in biomolecules, such as carboxylates, amines, and imidazoles. These protonation states in turn affect the charge distribution in the biomolecule, which can have a significant impact on both its 3-dimensional structure and its interactions with the surrounding environment. In many cases, pH can also determine whether a proton donor or acceptor will be available for catalysis during the course of the biocatalytic mechanism. The aim of my work is to develop accurate and efficient computational models to probe the pH-dependent behavior of proteins and nucleic acids. The models must be carefully designed to obey the laws of thermodynamics under the constraint of an externally applied pH. Only then can the results be directly compared to experimental measurements. In this dissertation, I present my work on the development of pH-based models for biomolecules and other work performed in the area of molecular modeling. In the first chapter, I introduce the fundamental concepts of computational biomolecular modeling that lay the foundation for the presented work. This is followed by chapters on free energy and sampling, constant pH simulations, replica exchange, and some useful tools I developed to aid in conducting computational research with the Amber simulation package.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Jason Matthew Swails.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Roitberg, Adrian E.

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045741:00001



PAGE 1

FREE ENERGY SIMULATIONS OF COMPLEX BIOLOGICAL SYSTEMS AT CONSTANT PH

By
JASON M. SWAILS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2013

PAGE 2

© 2013 Jason M. Swails

PAGE 3

I dedicate this dissertation to the late Professor Frederick P. Arnold, whose intelligence, excitement, and guidance propelled me into this field.

PAGE 4

ACKNOWLEDGMENTS

I would like to thank my parents, Mark and Mindy Swails, for their love, guidance, and constant encouragement. I thank my siblings, Kerri and Jeffrey Swails, for all the great times and their (vain) attempts to keep me humble. Thank you to my extended family for the incredible support system I've always had growing up. The Roitberg Group provided a great deal of support and camaraderie, and Adrian Roitberg provided guidance and instruction during my graduate studies. I also thank the Quantum Theory Project for all the good times. And finally, I would like to thank my wife, Roxy Lowry Swails, for everything.

PAGE 5

TABLE OF CONTENTS

(entry ..... page)

ACKNOWLEDGMENTS ..... 4
LIST OF TABLES ..... 9
LIST OF FIGURES ..... 10
LIST OF ABBREVIATIONS ..... 13
LIST OF CONSTANTS AND OPERATORS ..... 15
ABSTRACT ..... 16

CHAPTER

1 INTRODUCTION ..... 18
  1.1 Origins of Computational Chemistry ..... 18
    1.1.1 Quantum Mechanics ..... 19
      1.1.1.1 Born-Oppenheimer Approximation ..... 20
      1.1.1.2 Computational Quantum Mechanics ..... 21
    1.1.2 Statistical Mechanics ..... 21
      1.1.2.1 Monte Carlo ..... 24
      1.1.2.2 Molecular Dynamics and the Ergodic Hypothesis ..... 28
  1.2 Molecular Mechanics ..... 30
    1.2.1 Force Fields ..... 31
      1.2.1.1 Bonds ..... 31
      1.2.1.2 Angles ..... 33
      1.2.1.3 Torsions ..... 33
      1.2.1.4 Electrostatic Interactions ..... 35
      1.2.1.5 van der Waals Interactions ..... 36
      1.2.1.6 Other Force Field Terms ..... 37
    1.2.2 The Amber Force Field ..... 39
      1.2.2.1 Functional Form ..... 40
      1.2.2.2 Implementation ..... 41
2 BIOMOLECULAR SIMULATION: SAMPLING AND FREE ENERGY ..... 43
  2.1 Simulations in the Condensed Phase ..... 43
    2.1.1 Implicit Solvent ..... 44
      2.1.1.1 Distance-dependent Dielectric ..... 44
      2.1.1.2 Poisson-Boltzmann ..... 46
      2.1.1.3 Generalized Born ..... 48
      2.1.1.4 Non-polar Solvation ..... 53
    2.1.2 Explicit Solvent ..... 54
      2.1.2.1 Periodic Boundary Conditions ..... 55

PAGE 6

      2.1.2.2 Cutoff Methods ..... 55
      2.1.2.3 Ewald Summation ..... 59
      2.1.2.4 Other Approaches ..... 62
  2.2 Sampling ..... 64
    2.2.1 Umbrella Sampling ..... 65
    2.2.2 Steered Molecular Dynamics ..... 68
    2.2.3 Expanded Ensemble ..... 69
    2.2.4 Replica Exchange Molecular Dynamics ..... 71
  2.3 Free Energy Calculations ..... 72
    2.3.1 Thermodynamic Integration ..... 73
    2.3.2 Free Energy Perturbation ..... 77
    2.3.3 End-state Calculations ..... 78
      2.3.3.1 MM-PBSA ..... 79
      2.3.3.2 LIE ..... 80
3 CONSTANT pH REPLICA EXCHANGE MOLECULAR DYNAMICS ..... 83
  3.1 Constant pH and pKa Calculations ..... 83
  3.2 Theory ..... 86
    3.2.1 The Semi-Grand Ensemble ..... 86
    3.2.2 CpHMD ..... 87
    3.2.3 pH-REMD ..... 89
  3.3 Methods ..... 89
    3.3.1 Starting Structure ..... 89
    3.3.2 Molecular Dynamics ..... 90
    3.3.3 Replica Exchange ..... 91
  3.4 Results and Discussion ..... 91
    3.4.1 Simulation Stability ..... 92
    3.4.2 Accuracy of Predicted pKas ..... 92
    3.4.3 Enhancing Protonation State Sampling with pH-REMD ..... 94
  3.5 Exchange Attempt Frequency and Protonation State Sampling ..... 98
    3.5.1 Enhancing Conformational State Sampling with pH-REMD ..... 101
    3.5.2 Scalability With Increasing Exchange Attempt Frequency ..... 109
  3.6 Conclusion ..... 110
4 CONSTANT pH MOLECULAR DYNAMICS IN EXPLICIT SOLVENT ..... 112
  4.1 Introduction ..... 112
  4.2 Theory and Methods ..... 114
    4.2.1 Conformational and Protonation State Sampling ..... 114
    4.2.2 Explicit Solvent CpHMD Workflow ..... 116
    4.2.3 pH-based Replica Exchange ..... 117
  4.3 Calculation Details ..... 119
    4.3.1 Model Compounds ..... 119
    4.3.2 ACFCA ..... 121
    4.3.3 Proteins: HEWL and RNase A ..... 122

PAGE 7

    4.3.4 Simulation Details ..... 123
  4.4 Results and Discussion ..... 124
    4.4.1 Box Size Effects ..... 124
    4.4.2 τ_rlx Effects ..... 125
    4.4.3 ACFCA: CpHMD vs. pH-REMD ..... 128
    4.4.4 Hen Egg White Lysozyme ..... 131
    4.4.5 Ribonuclease A ..... 134
  4.5 Conclusion ..... 137
5 REMD: GPU ACCELERATION AND EXCHANGES IN MULTIPLE DIMENSIONS ..... 142
  5.1 Temperature REMD ..... 142
  5.2 Hamiltonian REMD ..... 145
  5.3 Multi-Dimensional REMD ..... 147
  5.4 Implementation ..... 148
    5.4.1 Exchange Attempts ..... 149
    5.4.2 Message Passing: Data Exchange in REMD Simulations ..... 152
6 FLEXIBLE TOOLS FOR AMBER SIMULATIONS ..... 155
  6.1 MMPBSA.py ..... 155
    6.1.1 Motivation ..... 155
    6.1.2 Capabilities ..... 156
      6.1.2.1 Stability and Binding Free Energy Calculations ..... 156
      6.1.2.2 Free Energy Decomposition ..... 158
      6.1.2.3 Entropy Calculations ..... 159
    6.1.3 General Workflow ..... 160
    6.1.4 Running in Parallel ..... 162
    6.1.5 Differences to mm_pbsa.pl ..... 162
  6.2 ParmEd ..... 164
    6.2.1 Motivation ..... 164
    6.2.2 Implementation and Capabilities ..... 165
      6.2.2.1 Lennard-Jones Parameter Modifications ..... 167
      6.2.2.2 Changing Atomic Properties ..... 168
      6.2.2.3 Setting up for H-REMD Simulations ..... 169
      6.2.2.4 Changing Parameters ..... 169

APPENDIX

A NUMERICAL INTEGRATION IN CLASSICAL MOLECULAR DYNAMICS ..... 170
  A.1 Lagrangian and Hamiltonian Formulations ..... 170
  A.2 Numerical Integration by Finite Difference Methods ..... 171
    A.2.1 Predictor-corrector ..... 171
    A.2.2 Verlet Integrators ..... 173

PAGE 8

B AMBER PARAMETER-TOPOLOGY FILE FORMAT ..... 176
  B.1 Layout ..... 176
  B.2 List of SECTIONs ..... 178
  B.3 Deprecated Sections ..... 192
  B.4 CHAMBER Topologies ..... 193
C MESSAGE PASSING INTERFACE ..... 199
  C.1 Parallel Computing ..... 199
    C.1.1 Data Models ..... 199
    C.1.2 Memory Layout ..... 199
    C.1.3 Thread Count ..... 201
  C.2 The Mechanics of MPI ..... 202
    C.2.1 Messages ..... 202
    C.2.2 Communicators ..... 202
    C.2.3 Communications ..... 202
      C.2.3.1 Point-to-point ..... 203
      C.2.3.2 All-to-one and One-to-all ..... 203
      C.2.3.3 All-to-all ..... 204
    C.2.4 Blocking vs. Non-blocking Communications ..... 206

REFERENCES ..... 207
BIOGRAPHICAL SKETCH ..... 218

PAGE 9

LIST OF TABLES

(table ..... page)

3-1 Reference pKa values for the acidic residues treated in this study. Values are the same as those used in the original Amber CpHMD implementation. [130] ..... 88
3-2 pKa and Hill coefficients for each residue taken from each set of simulations. The pKas and Hill coefficients (n) are shown for each EAF. ..... 94
3-3 Value of RSS according to Eq. 3 for the 8 residues shown in Figs. 3-2 and 3-3. Larger values represent more deviation from the fitted curve ..... 97
3-4 Standard deviations of pKa (σ_pKa) and Hill coefficient (σ_n) and average Hill coefficient (n) calculated by dividing each simulation into sections of 0.25 ns. ..... 99
3-5 Average timings for CpHMD and pH-REMD simulations. ..... 109
4-1 Model compound pKa values and reference energies. ..... 121
4-2 Calculated pKas for acid-range titratable residues in HEWL using the proposed method for both starting structures (PDBs 1AKI and 4LYT) without ions. ..... 132
4-3 Calculated pKas for acid-range titratable residues in HEWL using the proposed method for both starting structures (PDBs 1AKI and 4LYT) with 21 ions. ..... 135
4-4 Calculated pKas for RNase A using simulations begun from crystal structures 1KF5 and 7RSA. ..... 138
4-5 Calculated pKas for RNase A simulations run with explicit counterions present. ..... 138
B-1 List of all of the perturbed topology file sections. ..... 193
B-2 List of flags that are common between Amber and chamber topology files, but have different FORMAT identifiers. ..... 195

PAGE 10

LIST OF FIGURES

(figure ..... page)

1-1 Two conformations of an ethane molecule. The conformation on the left is the typical `staggered' conformation ..... 26
1-2 The curves represent the trajectory of simple harmonic oscillators with a high frequency (left) and low frequency (right). ..... 30
1-3 The exact potential energy surface for H2 [21] plotted with the best-fitting quadratic and quartic polynomials and the best-fitting Morse potential (Eq. 1). ..... 32
1-4 The Lennard Jones potential between two atoms with a R_min,i,j of 3.816 Å and ε of 0.1094 kcal mol^-1. ..... 38
1-5 Schematics shown for various parameters present in typical force fields. ..... 39
2-1 Distance-dependent dielectric for different values of the free parameter S in Eq. 2 ..... 45
2-2 Region of space between two atoms i and j of radius R_i and R_j that is inaccessible to a spherical solvent molecule of radius R_solv. ..... 52
2-3 Periodic simulation in two dimensions with a rectangular unit cell. The maximum permissible cutoff (r_cut) for the minimum image convention is shown ..... 56
2-4 Effects of various 16 Å cutoff schemes on the electrostatic interaction of two monovalent ions with opposite charges. ..... 58
2-5 Periodic cells added in a spherical shape radially from the central unit cell. The progression from darker to lighter cells shows ..... 59
2-6 A one-dimensional example of particles with a given charge (red) with a neutralizing Gaussian charge distribution (blue) shown. ..... 61
2-7 An example 1-dimensional PMF (shown in black). Two biasing umbrella potentials are shown alongside the resulting, biased PMF. All PMF ..... 66
2-8 Diagrammatic sketch of REMD simulations. Replicas are represented as thick arrows and exchange attempts are shown between adjacent replicas ..... 72
2-9 Hard core of disappearing atom caused by the Lennard Jones terms. The λ = 1 state is the one in which a carbon atom has vanished. ..... 76
2-10 Functional form of soft-core Lennard Jones interactions with different values of λ from Eq. 2 ..... 77
2-11 Thermodynamic cycle for MM/PBSA calculations. ..... 80

PAGE 11

2-12 Schematic showing interactions necessary to compute the LIE free energy of noncovalent binding for a ligand in a protein using white arrows. ..... 82
3-1 RMSD plots for CpHMD simulations (a) and pH-REMD simulations at different exchange attempt frequencies (b-d) as a function of time ..... 93
3-2 Titration curves obtained with (a) EAF = 0 ps^-1 and (b) EAF = 50 ps^-1. The data for these residues show the best fit to Eq. 3 for the CpHMD simulations. ..... 95
3-3 Titration curves obtained with (a) EAF = 0 ps^-1 and (b) EAF = 50 ps^-1. The data for these residues have the poorest fit to Eq. 3 for the CpHMD simulations. ..... 96
3-4 Number of protonation state transitions per ns of simulation time. ..... 100
3-5 Histogrammed RMSD data for pH 2, pH 4.5, and pH 7 taken from simulations run with different EAFs. ..... 102
3-6 Kullback-Leibler divergence for each simulation calculated via Eq. 3 ..... 103
3-7 Average atomic fluctuations for each residue relative to the average structure of the ensemble. ..... 105
3-8 Distributions of the Asp52-Cγ to Glu35-Cδ carboxylate carbons. Asp52 and Glu35 are the catalytic residues of HEWL. ..... 107
3-9 Fraction of simulation with the Glu35-Asp52 distance shorter than 5 Å vs. pH. ..... 108
4-1 Workflow of the proposed discrete protonation CpHMD method in explicit solvent. ..... 118
4-2 Thermodynamic cycle used to evaluate protonation state changes in CpHMD simulations. ..... 120
4-3 Radial distribution functions (RDFs) of solvent oxygen atoms (O) and hydrogen atoms (H) with different unit cell sizes. ..... 125
4-4 The relaxation of the protonated state starting from the protonated trajectory is shown in blue with its autocorrelation function shown in purple. ..... 127
4-5 Computed pKas for the Aspartate model compound using different relaxation times (τ_rlx). ..... 129
4-6 RDFs of water oxygen atoms (O) and hydrogen atoms (H) around the center-of-mass of the carboxylate group ..... 130
4-7 Titration curves of Cys2 and Cys4 in the ACFCA pentapeptide. Results from CpHMD (no replica exchange attempts) and pH-REMD are shown ..... 131

PAGE 12

4-8 RMSD plots over 20 ns of pH-REMD simulation for HEWL at pH 2, 4, 6 and 8 with respect to the starting crystal structure 1AKI. ..... 133
4-9 RMSD plots over 20 ns of pH-REMD simulation with explicit counterions for HEWL at pH 2, 4, 6 and 8 with respect to the starting crystal structure 1AKI. ..... 136
4-10 Distance distribution functions calculated from HEWL simulations begun with crystal structure 1AKI for all snapshots in the ensemble at pH 1.0. ..... 137
5-1 Potential energy distributions of Trp-Cage, a 20-residue peptide, at various temperatures in a T-REMD simulation. ..... 145
5-2 Schematic showing exchange attempts in multi-dimensional REMD simulations. Exchange attempts are indicated by the colored arrows ..... 149
5-3 Communicator arrangement in multi-dimensional REMD simulations at multiple exchange steps following some successful state parameter exchanges. ..... 154
6-1 General workflow for performing end-state calculations with MMPBSA.py. LEaP is a program in Amber used to create topology files for dynamics. ..... 161
6-2 MMPBSA.py scaling comparison for MM-PBSA and MM-GBSA calculations on 200 frames of a 5910-atom complex. ..... 163
6-3 Screenshot of the xparmed.py GUI window, labeled with the available Actions and a message log. ..... 166
C-1 Schematic of different point-to-point communications. Threads and data are shown as ovals and boxes, respectively ..... 204
C-2 Schematic of different all-to-one and one-to-all communications. Threads and data are shown as ovals and boxes, respectively ..... 205
C-3 Schematic of different all-to-all communications. Threads and data are shown as ovals and boxes, respectively ..... 206

PAGE 13

LIST OF ABBREVIATIONS

API      Application Programmer Interface
AMBER    Assisted Model Building with Energy Refinement
BOA      Born-Oppenheimer Approximation
CMAP     Correction Map
CpHMD    Constant pH Molecular Dynamics
DNA      Deoxyribonucleic Acid
EAF      Exchange attempt frequency
ESP      Electrostatic Potential
FEP      Free Energy Perturbation
FFT      Fast Fourier Transform
GBSA     Generalized Born with Surface Area
GPU      Graphical Processing Unit
GRF      Generalized Reaction Field
H-REMD   Hamiltonian Replica Exchange Molecular Dynamics
HEWL     Hen egg-white lysozyme
IPS      Isotropic Periodic Sum
LIE      Linear Interaction Energy
LJ       Lennard-Jones
MC       Monte Carlo
MD       Molecular Dynamics
MM       Molecular Mechanics

PAGE 14

MPI      Message Passing Interface
MO       Molecular Orbital
MTP      Multiple Trajectory Protocol
PBC      Periodic Boundary Conditions
PBSA     Poisson-Boltzmann with Surface Area
pH-REMD  pH Replica Exchange Molecular Dynamics
PMF      Potential of Mean Force
QM       Quantum Mechanics
REFEP    Replica Exchange Free Energy Perturbation
REMD     Replica Exchange Molecular Dynamics
RESP     Restrained Electrostatic Potential
RMSD     Root mean squared deviation
RNA      Ribonucleic Acid
STP      Single Trajectory Protocol
T-REMD   Temperature Replica Exchange Molecular Dynamics
TI       Thermodynamic Integration

PAGE 15

LIST OF CONSTANTS AND OPERATORS

$\mathrm{erf}(x)$    $\frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\,dt$    Error function
$\mathrm{erfc}(x)$    $1 - \mathrm{erf}(x)$    Complementary Error Function
$h$    $6.626068 \times 10^{-34}$ m$^2$ kg / sec    Planck's Constant [1]
$i$    $\sqrt{-1}$    Imaginary unit
$\vec{\nabla}$    $\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$    Gradient Operator
$\nabla^2$    $\vec{\nabla} \cdot \vec{\nabla} = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$    Laplace Operator
$k_B$    $1.380658(12) \times 10^{-23}$ J K$^{-1}$    Boltzmann's constant [1]

PAGE 16

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

FREE ENERGY SIMULATIONS OF COMPLEX BIOLOGICAL SYSTEMS AT CONSTANT PH

By
Jason M. Swails
August 2013

Chair: Adrian E. Roitberg
Major: Chemistry

Solution pH has profound effects on the structure, function, and activity of many complex biomolecules that catalyze the chemical reactions responsible for sustaining life. Even within the human body, the various physiological environments span a wide pH range, from as low as 1 in the stomach to as high as 8.1 in pancreatic secretions. Small changes from the normal pH of a biomolecule's environment can be catastrophically disruptive to its activity. For example, a change in pH of as little as ±0.1 pH units in the human bloodstream is enough to cause life-threatening alkalosis or acidosis. Due to the importance of pH in biology and the profound effect it can have on biomolecules, it is important to incorporate pH effects in computational models designed to treat these biomolecules. The solution pH controls protonation state equilibria of specific functional groups prevalent in biomolecules, such as carboxylates, amines, and imidazoles. These protonation states in turn affect the charge distribution in the biomolecule, which can have a significant impact on both its 3-dimensional structure and its interactions with the surrounding environment. In many cases, pH can also determine whether a proton donor or acceptor will be available for catalysis during the course of the biocatalytic mechanism. The aim of my work is to develop accurate and efficient computational models to probe the pH-dependent behavior of proteins and nucleic acids. The models must

PAGE 17

be carefully designed to obey the laws of thermodynamics under the constraint of an externally applied pH. Only then can the results be directly compared to experimental measurements. In this dissertation, I present my work on the development of pH-based models for biomolecules and other work performed in the area of molecular modeling. In the first chapter, I introduce the fundamental concepts of computational biomolecular modeling that lay the foundation for the presented work. This is followed by chapters on free energy and sampling, constant pH simulations, replica exchange, and some useful tools I developed to aid in conducting computational research with the Amber simulation package.

PAGE 18

CHAPTER 1
INTRODUCTION

1.1 Origins of Computational Chemistry

The seeds of computational chemistry were sown in the latter half of the 1800s with Ludwig Eduard Boltzmann's formulation of statistical mechanics. In an era when the existence of atoms and molecules was hotly disputed within the physics community, Boltzmann fathered a theory in which the behavior and interaction of individual atoms or molecules on the microscopic scale could be used to describe and predict macroscopic phenomena. Using his theorems and equations, it became possible to reduce the problem of simulating $10^{23}$ molecules to simulating 1 molecule. Boltzmann used this to great effect in describing and deriving previously known, phenomenological equations for ideal gases, such as the widely known equation of state, $PV = nRT$. All that remains to provide the foundation for using molecular simulations is the proper description of atoms and molecules on the microscopic scale.

The theories required to accurately model the behavior of individual atoms interacting with each other and their surroundings would not be developed until the first half of the 20th century with the advent of quantum mechanics. The limits of classical mechanics became apparent when considering the Rayleigh-Jeans formula for calculating the spectral emission of a radiating black body. The Rayleigh-Jeans law, given by Eq. 1-1, is derived entirely from the laws of classical mechanics. By looking at Eq. 1-1, we see that the emission spectrum diverges at high frequencies ($\nu$), a clear violation of the well-established law of conservation of energy.

$$B_\nu(T) = \frac{2 \nu^2 k T}{c^2} \qquad (1\text{-}1)$$

To address this apparent disparity, Max Planck suggested that the error in the classical mechanical approach was to assume a continuous emission spectrum. Instead, Planck suggested that the emission spectrum was quantized, leading to an equation that agreed

PAGE 19

much closer with experiment. This idea of quantized energy emissions, while developed to reconcile the mathematics of black-body radiation with experimental measurements, would forever change our understanding of the microscopic world.

As quantum mechanics matured, our ability to explain and predict behavior at the atomic scale dramatically improved. In 1929, Paul Dirac proclaimed, "The fundamental laws necessary for the mathematical treatment of a large part of physics and the whole of chemistry are thus completely known, and the difficulty lies only in the fact that the application of these laws leads to equations that are too complex to be solved." Even approximations designed to simplify the equations of quantum mechanics in molecular systems resulted in computations too complex to apply to all but the simplest systems. With the fundamental theory necessary to describe single molecules and the machinery required to extend that description to experimental measurements at our disposal, computers provided the catalyst that thrust theoretical chemistry into a prominent role in chemical research.

The next sections will describe the theory of quantum mechanics and the approximations typically employed to simplify its equations, followed by a description of statistical mechanics.

1.1.1 Quantum Mechanics

About a quarter century after Planck introduced the idea of quantized oscillators to explain black-body radiation, Erwin Schrödinger introduced a wave equation formulation of quantum mechanics (QM). [2] Schrödinger's equation (Eq. 1-2) bears a strong resemblance to Hamilton's formulation of classical mechanics by employing an analogous Hamiltonian operator composed of a kinetic energy term (related to the momentum operator) and a potential energy term.

$$E\,\Psi(\vec{x}) = \hat{H}\,\Psi(\vec{x}) = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x})\right]\Psi(\vec{x}) \qquad (1\text{-}2)$$
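To make the stationary-state Schrödinger equation of Section 1.1.1 concrete, the following is a minimal sketch (not code from this dissertation) that solves it numerically for a one-dimensional harmonic oscillator. It assumes a central finite-difference approximation for the second derivative and finds the lowest eigenvalues of the resulting tridiagonal matrix by Sturm-sequence bisection. The function names, the grid, and the reduced units (ħ = m = ω = 1) are all illustrative assumptions.

```python
def count_eigenvalues_below(diagonal, off_diagonal, bound):
    """Count eigenvalues of a symmetric tridiagonal matrix that lie below `bound`.

    Uses the Sturm-sequence (LDL^T pivot) recurrence: the number of negative
    pivots equals the number of eigenvalues smaller than `bound`.
    """
    count = 0
    pivot = diagonal[0] - bound
    if pivot < 0.0:
        count += 1
    for i in range(1, len(diagonal)):
        if pivot == 0.0:
            pivot = 1e-30  # nudge an exactly-zero pivot to avoid dividing by zero
        pivot = diagonal[i] - bound - off_diagonal[i - 1] ** 2 / pivot
        if pivot < 0.0:
            count += 1
    return count


def lowest_energies(potential, n_levels=4, x_min=-10.0, x_max=10.0,
                    n_points=1001, hbar=1.0, mass=1.0):
    """Lowest eigenvalues of H = -(hbar^2 / 2m) d^2/dx^2 + V(x) on a uniform grid."""
    dx = (x_max - x_min) / (n_points - 1)
    hop = hbar ** 2 / (2.0 * mass * dx ** 2)  # finite-difference kinetic prefactor
    grid = [x_min + i * dx for i in range(n_points)]
    v = [potential(x) for x in grid]
    diagonal = [2.0 * hop + vi for vi in v]   # kinetic + potential on the diagonal
    off_diagonal = [-hop] * (n_points - 1)    # coupling between neighboring grid points
    lower = min(v)                            # kinetic part is positive definite
    upper = max(diagonal) + 2.0 * hop         # Gershgorin upper bound on the spectrum
    energies = []
    for level in range(n_levels):
        lo, hi = lower, upper
        for _ in range(60):  # bisect until the bracket is far below grid error
            mid = 0.5 * (lo + hi)
            if count_eigenvalues_below(diagonal, off_diagonal, mid) > level:
                hi = mid
            else:
                lo = mid
        energies.append(0.5 * (lo + hi))
    return energies


# Harmonic oscillator V(x) = x^2 / 2 in reduced units (hbar = m = omega = 1)
energies = lowest_energies(lambda x: 0.5 * x * x)
print(energies)
```

In these units the exact levels are (n + 1/2), so the printed values should lie close to 0.5, 1.5, 2.5, and 3.5, with a small discretization error that shrinks as the grid is refined. Even this one-particle, one-dimensional case requires a matrix with one row per grid point, which hints at why the cost grows so quickly with the number of particles.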

PAGE 20

In Eq. 1-2, $E$ is the total energy, $\hat{H}$ is the Hamiltonian operator, and $\Psi(\vec{x}, t)$ is the wavefunction, the central object of Schrödinger's equation, which contains all of the information and properties inherent to the system. Eq. 1-2 is a special form of Schrödinger's equation corresponding to a stationary state (i.e., the potential function is time-independent, so the energy for that state is constant).

In chemistry, when we wish to calculate observable properties of a system composed of atoms, the kinetic energy is the sum of the kinetic energies of the atomic particles in the system, and the potential energy is calculated as the interaction of all charged particles (protons and electrons) in the electric field they create (plus any external field that may be present). The wavefunction contains all of the information about each of the particles in the system. As the number of particles in the system increases, so too does the complexity of the wavefunction and the effort required to solve Eq. 1-2. Therefore, we turn to a number of approximations developed to simplify computing a solution to Schrödinger's equation.

1.1.1.1 Born-Oppenheimer Approximation

The Born-Oppenheimer approximation (BOA) is pervasive in the field of computational chemistry. Electrons can move far more rapidly than nucleons since an electron is roughly 1800 times lighter than a proton or neutron. This implies that electrons can reorganize around moving nuclei so quickly that nuclear protons are always subject to the potential from the average electric field generated by the electrons. Using the BOA, the wavefunction of a molecular system can be separated into two parts: an electronic part where the nuclei are treated as fixed point charges, and a nuclear part where the nuclei move through the average electric field generated by the electrons. [3] So critical is the BOA to computational chemistry that it appears at the heart of nearly every computational molecular model.
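The separation described above can be written out explicitly. The block below is a sketch in commonly used textbook notation rather than notation taken from this dissertation: $\mathbf{r}$ and $\mathbf{R}$ denote the electronic and nuclear coordinates, and the subscripts el and nuc are labels assumed here for illustration.

```latex
% Born-Oppenheimer separation (notation assumed for illustration):
% r = electronic coordinates, R = nuclear coordinates
\begin{align}
\Psi(\mathbf{r}, \mathbf{R}) &\approx \psi_{\mathrm{el}}(\mathbf{r}; \mathbf{R})\,
    \chi_{\mathrm{nuc}}(\mathbf{R}) \\
% electronic problem: nuclei clamped at R, which enters only as a parameter
\left[\hat{T}_{\mathrm{el}} + \hat{V}(\mathbf{r}; \mathbf{R})\right]
    \psi_{\mathrm{el}}(\mathbf{r}; \mathbf{R})
    &= E_{\mathrm{el}}(\mathbf{R})\, \psi_{\mathrm{el}}(\mathbf{r}; \mathbf{R}) \\
% nuclear problem: nuclei move on the potential energy surface E_el(R)
\left[\hat{T}_{\mathrm{nuc}} + E_{\mathrm{el}}(\mathbf{R})\right]
    \chi_{\mathrm{nuc}}(\mathbf{R})
    &= E\, \chi_{\mathrm{nuc}}(\mathbf{R})
\end{align}
```

Here $\hat{V}$ collects all Coulomb interactions among electrons and nuclei. The electronic energy $E_{\mathrm{el}}(\mathbf{R})$, evaluated over many nuclear arrangements, is the potential energy surface on which the nuclei move.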

PAGE 21

1.1.1.2 Computational Quantum Mechanics

The main goal of most QM calculations in chemistry and molecular physics is to determine atomic and molecular properties of the system by estimating the electronic part of the wavefunction from the BOA. These calculations have provided valuable assistance to experimental investigations. QM calculations can provide reliable estimates of molecular geometries, [4] potential and free energy barriers of chemical reactions, [5] ionization energies, [6] proton affinities and gas-phase basicities, [7] and many other chemical and molecular properties. [8] These calculations are becoming routine as more and more experimental studies employ some form of calculation to help interpret results or strengthen conclusions.

Despite all their successes and the rapid increase of computational power over recent years, however, the computational demands of QM methods often remain prohibitively high for systems with more than 100-200 atoms. Furthermore, for researchers interested in these large systems, calculations on a single arrangement of atomic nuclei become increasingly insufficient to quantify the behavior of those systems. For such applications, we turn our attention back to statistical mechanics with the aim of ultimately applying those principles to molecular mechanical simulations of large molecules that often contain thousands, even hundreds of thousands, of atoms.

1.1.2 Statistical Mechanics

Macroscopic chemical systems are composed of a vast number of atoms, on the order of Avogadro's number, or $6.022 \times 10^{23}$. How, then, can our calculations of a single molecule or a small cluster of molecules be used to predict the behavior of a collection of $10^{23}$ molecules? For that we turn to statistical mechanics and the idea of an ensemble.

In a system with $N$ particles (where $N$ is typically on the order of Avogadro's number in magnitude), there are $6N$ total degrees of freedom in the system corresponding to the position and momentum of each particle in all three dimensions. This ultra-high,

PAGE 22

$6N$-dimensional space is referred to as phase space, and the collection of all points that conform to a small set of thermodynamic constraints (e.g., constant volume or energy) represents an ensemble.[9] The connection between this imaginary ensemble of systems and experimental measurements of real systems was provided by Josiah Gibbs. The experimental value of any mechanical observable of a system in the lab is postulated to be equal to the value of that observable averaged over every member of the ensemble.[9] By knowing the probability of finding a member of an ensemble with a given set of properties, this ensemble average can be calculated according to Eq. 1.

$$\langle A \rangle = \frac{\sum_a W(a) A(a)}{\sum_a W(a)} = \sum_a P(a) A(a) \qquad (1)$$

$W(a)$ in Eq. 1 can be thought of as the number of states in the ensemble with the same value of $A$. $P(a)$ is the normalized probability for that state, where $\sum_a W(a)$ is the normalization factor.

Given that there are on the order of $10^{23}$ particles in the typical system, the number of ensemble members from which the average is calculated appears at first glance to be intractable. However, it turns out that the relative root-mean-square fluctuations of measurable properties within the ensemble scale as roughly $1/\sqrt{N}$, where $N$ is the number of particles in the system. Because $N$ is on the order of $10^{23}$, the fluctuations around the most probable value in the ensemble effectively vanish and the ensemble average and most probable value become identical. The problem of calculating the ensemble average of a desired property is reduced to the far more tractable task of calculating its most probable value.

The most commonly used ensemble, called the canonical ensemble, is constrained such that each member has the same number of particles, volume, and temperature (NVT). Other common ensembles include the microcanonical ensemble (NVE), the
grand canonical ensemble ($\mu$VT), and the isobaric-isothermal ensemble (NpT), where $E$, $\mu$, and $p$ stand for constant energy, chemical potential, and pressure, respectively. At typical temperatures and particle densities, the fluctuations of mechanical properties in each of these ensembles become negligible. Therefore, these ensembles are effectively equivalent to one another, allowing us to choose the one that is most convenient to work with mathematically.

The link between these ensembles and thermodynamics is the natural logarithm of the partition function, which happens to be the normalization constant from Eq. 1 for each of the ensembles. The partition functions of the common ensembles are shown in Eqs. 1 to 1. The logarithm of the partition function for each ensemble is directly proportional to the thermodynamic function that has the same set of 'natural' variables. These connections are summarized in Eqs. 1 to 1.[9]

$$\text{Microcanonical:} \quad \Omega(N,V,E) = \omega(E) \qquad (1)$$

$$\text{Canonical:} \quad Q(N,V,T) = \sum_E \Omega(N,V,E) \exp(-\beta E) \qquad (1)$$

$$\text{Grand canonical:} \quad \Xi(\mu,V,T) = \sum_N Q(N,V,T) \exp(\beta \mu N) \qquad (1)$$

$$\text{Isobaric-isothermal:} \quad \Delta(N,p,T) = \sum_V Q(N,V,T) \exp(-\beta p V) \qquad (1)$$

In Eqs. 1 to 1, $\omega$ is the total number of states with a given energy and $\beta$ is $1/k_B T$. According to the principle of equal a priori probabilities, all states in the microcanonical ensemble are considered equally probable simply because there is no reason to assume otherwise.
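
As a concrete illustration of the canonical partition function, the following minimal sketch evaluates $Q$, the resulting level probabilities, and the Helmholtz free energy $A = -kT \ln Q$ for a toy system. The three energy levels and their degeneracies are hypothetical values chosen only for illustration, and the function names are not taken from any library.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def partition_function(levels, temperature):
    """Canonical Q = sum over levels of Omega(E) * exp(-beta * E).

    `levels` is a list of (energy, degeneracy) pairs, energies in kcal/mol.
    """
    beta = 1.0 / (K_B * temperature)
    return sum(omega * math.exp(-beta * energy) for energy, omega in levels)


def level_probabilities(levels, temperature):
    """Normalized probability of finding the system in each energy level."""
    beta = 1.0 / (K_B * temperature)
    q = partition_function(levels, temperature)
    return [omega * math.exp(-beta * energy) / q for energy, omega in levels]


def helmholtz_free_energy(levels, temperature):
    """A = -kT ln Q, in kcal/mol."""
    q = partition_function(levels, temperature)
    return -K_B * temperature * math.log(q)


# Hypothetical toy system: (energy in kcal/mol, degeneracy)
TOY_LEVELS = [(0.0, 1), (0.5, 3), (2.0, 10)]

if __name__ == "__main__":
    for temp in (100.0, 300.0, 1000.0):
        probs = level_probabilities(TOY_LEVELS, temp)
        print(temp, partition_function(TOY_LEVELS, temp), probs,
              helmholtz_free_energy(TOY_LEVELS, temp))
```

At low temperature the ground level dominates the sum, while at high temperature the more degenerate, higher-energy levels gain weight.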

$$\text{Microcanonical:} \quad S = k \ln \Omega(N,V,E) \qquad (1)$$

$$\text{Canonical:} \quad A = -kT \ln Q(N,V,T) \qquad (1)$$

$$\text{Grand canonical:} \quad pV = kT \ln \Xi(\mu,V,T) \qquad (1)$$

$$\text{Isobaric-isothermal:} \quad G = -kT \ln \Delta(N,p,T) \qquad (1)$$

With the link to classical thermodynamics now firmly established through the partition function, statistical mechanics can explain the whole of thermodynamics from the microscopic behavior of individual atoms and molecules. One of the principal challenges of computational chemistry becomes how to efficiently estimate the partition function.

Significant effort in computational chemistry centers on estimating the canonical partition function $Q(N,V,T)$. The naive approach to calculating the sum in Eq. 1 would be to calculate the degeneracy of each energy level ($\Omega(N,V,E)$) and scale it with the exponential weighting factor ($\exp(-\beta E)$), called the Boltzmann factor. Due to the immeasurable size of $\Omega(N,V,E)$, however, this approach is highly inefficient in practice. It turns out that most of the effort put into computing $Q(N,V,T)$ this way is spent on terms that contribute very little to the partition function, since there are more available states at higher energies (which carry little weight with the Boltzmann factor).

Because it is infeasible to calculate the full sums in Eqs. 1 to 1, partition functions are estimated using a representative subsample of the available points to construct the needed distributions. The strategies for generating these subsamples are collectively referred to as sampling. The two most common approaches, Monte Carlo and molecular dynamics, are discussed in the following sections.

1.1.2.1 Monte Carlo

One approach to approximating Eq. 1, called Monte Carlo (MC) sampling, is to select new configurations of atomic positions at random in the molecular system,
evaluate the energy of that structure, and add its contribution weighted by the Boltzmann factor to the sum of the partition function. This is equivalent to reorganizing the sum in Eq. 1 to sum over individual states rather than energy levels. Eq. 1 is then estimated as

$$Q(N,V,T) \approx \sum_{i=1}^{N_{samples}} \exp(-\beta E_i) \qquad (1)$$

Because we assume no prior knowledge of phase space, using random configurations in MC sampling is critical to avoid introducing bias into the subsample. The MC approach to approximating the partition function proves to be highly inefficient, however, as most random atomic configurations in a chemical system correspond to species that are unphysical and contribute essentially nothing to the partition function.

For example, consider characterizing the phase space of an ethane molecule using MC. A random configuration is generated by placing both carbon atoms and all six hydrogen atoms at random points in space, evaluating the energy of that configuration using a QM calculation, and adding that term to the summation in Eq. 1. Figure 1-1 depicts two conformations of ethane with an equal probability of being chosen, only one of which will have an energy low enough to contribute significantly to $Q(N,V,T)$. It should be easy to see that there are far more unphysical arrangements of the atoms in ethane than chemically reasonable ones.

While MC suffers severe limitations, Metropolis et al. proposed a modification to the traditional MC approach that helped alleviate many of the problems described above.[10] This variant, described below, is called Metropolis Monte Carlo after the method's architect.

Metropolis Monte Carlo

Metropolis's breakthrough in MC methods is a subtle change to the standard approach. Instead of generating random structures and adding them all to the ensemble with a weight equal to the Boltzmann factor, random structures are generated and accepted as full members of the ensemble with a probability proportional to the Boltzmann
factor. Therefore, lower-energy structures are more likely to be added to the ensemble since the probability of accepting them is significantly greater.

Figure 1-1. Two conformations of an ethane molecule. The conformation on the left is the typical 'staggered' conformation known to be the lowest-energy structure. The structure on the right is an absurd conformation that is never considered experimentally. While the structure on the right contributes negligibly to the partition function, it is as likely to be proposed by Monte Carlo as the one on the left.

In practice, an ensemble is built using Metropolis MC by constructing a chain of states beginning with some initial structure. The 'next' structure is generated randomly and accepted with a probability that ensures the constructed ensemble reproduces the correct probability distributions for each state. The process of generating a random conformation and evaluating its acceptance into the ensemble is called a trial move.

The resulting chain of states generated by Metropolis MC is called a Markov chain, and it has two important qualities. First, trial moves are selected from a finite set of available, predetermined moves that cannot change as the Markov chain grows. Second, a Markov chain is said to be memoryless; that is, the probability of accepting a proposed structure depends only on the current state and not on any other state that
has come before. Because thermodynamics deals with chemical equilibria, an ensemble built from a Markov chain of states needs an additional property: reversibility. A reversible Markov chain needs to satisfy the additional condition of detailed balance, a relationship shown in Eq. 1

$$P_i \, \pi_{i \to j} = P_j \, \pi_{j \to i} \qquad (1)$$

where $P_i$ is the probability of being in state $i$ and $\pi_{i \to j}$ is the probability of accepting the proposed change of going to state $j$ from state $i$ (called the transition probability). The detailed balance condition in a Markov chain asserts an equilibrium between all states in the chain. Eq. 1 is nothing more than a common equilibrium expression encountered in general chemistry where $P_i$ is the 'concentration' of state $i$ in the Markov chain and the transition probability is the 'rate' of changing from state $i$ to state $j$.

The last remaining detail of Metropolis MC is to define a transition probability equation that satisfies detailed balance. For the canonical ensemble, where the probability of being in state $i$ is proportional to the Boltzmann factor, the transition probability in Eq. 1 satisfies detailed balance.

$$\pi_{i \to j} = \min\left\{1, \frac{\exp(-\beta E_j)}{\exp(-\beta E_i)}\right\} \qquad (1)$$

Eq. 1 can be inserted into the detailed balance condition (Eq. 1) to verify that this choice for the transition probability satisfies detailed balance and therefore results in a reversible Markov chain.

Models using the Metropolis MC approach instead of traditional MC are far more efficient, so much so that the term Monte Carlo often implies Metropolis Monte Carlo,[11 12] and that convention will be adopted for the rest of this dissertation.

One concern that Metropolis MC does not address, however, is the propensity for random choices to result in meaningless structures. This is alleviated by starting from a chemically reasonable structure and limiting the magnitude of the structure differences allowed in each trial move, a technique referred to as importance sampling. The step size becomes a tunable parameter of the method. If it is too small, then it will take a long
time to fill the ensemble with different structures. However, if it is too large, the likelihood of proposing reasonable structures will drop off and the acceptance rate will suffer.

1.1.2.2 Molecular Dynamics and the Ergodic Hypothesis

An alternative method for constructing a statistical ensemble of states, called molecular dynamics (MD), corresponds to generating structures by integrating the equations of motion for molecular systems and building ensembles from the resulting trajectories. The idea that a time average over a trajectory is equal to an ensemble average is called the ergodic hypothesis, and is the cornerstone of MD methods.

The most common equations of motion used in MD simulations are those from classical mechanics. The force on each atomic nucleus is calculated as the negative gradient of the potential energy function $U(\vec{x})$ at the nuclear centers, and the resulting equations of motion are integrated numerically according to Newton's laws. A discussion of computational MD and numerical integration of the classical equations of motion is presented in Appendix A.

Molecular dynamics simulations have several advantages compared to Monte Carlo-based methods. First, MD can be used to calculate temporal properties, such as diffusion. Second, every structure that is generated during a molecular dynamics trajectory is a full member of the resulting ensemble. In contrast, MC-based techniques discard some fraction of the structures they generate. Finally, trajectories generated by MD simulations can inform about the nature of how a molecule moves within a particular environment, which may provide insight into the behavior of molecular systems. For these reasons, MD techniques have become very popular in the field since the first reported use on proteins in 1977.[13]

Molecular dynamics does have several weaknesses, however, which must be overcome in order to use MD simulations as predictive instruments in chemistry. Standard molecular dynamics (simple integration of Newton's laws) samples strictly from the microcanonical ensemble since energy is conserved. While the various thermodynamic ensembles are equivalent in the thermodynamic (macroscopic)
limit, it is often more convenient to work with other ensembles, like the canonical and isobaric-isothermal ensembles. The desire to simulate systems with different thermodynamic constraints led to the development of numerous ways to control temperature and pressure.[12] These techniques are referred to as thermostats and barostats, respectively.

The naive approach to maintaining a constant temperature is to scale all velocities at each time step such that each point along the trajectory has the same kinetic energy (and therefore temperature).[14] For large systems, however, the resulting perturbation on the system is too large. To address this problem, Berendsen et al. proposed a method in which the factor by which velocities are scaled is reduced so that thermalization occurs on a finite time scale (rather than instantaneously).[15] Similar approaches exist for maintaining constant pressure.[15] Analogous to scaling the velocities to maintain a constant temperature, the system volume is scaled to maintain a constant pressure.

Another major challenge in MD simulations is choosing the integration time step. The time step must be chosen short enough to avoid accumulating integration errors, but long enough that slow structural changes may be sampled in a reasonable amount of simulation time. While the slow, low-frequency motions are often the most interesting since they correspond to global conformational changes in macromolecules, the time step is dictated by the high-frequency motions (see Fig. 1-2 for a graphical illustration of this phenomenon).

As an example, bonds between hydrogen and 'heavy' atoms (e.g., carbon, oxygen, and nitrogen) often give rise to the highest frequency motions in typical macromolecules. These degrees of freedom cut the maximum time step that can be used for MD simulations in half. As a result, constraints are often applied to these high-frequency degrees of freedom to permanently fix them to their equilibrium bond lengths using any of a number of algorithms.[16 20]
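
The dependence of the usable time step on the fastest motion can be sketched with a one-dimensional harmonic oscillator. The example below assumes a velocity Verlet integrator (the integration scheme used in this work is the subject of Appendix A and may differ), unit mass, and arbitrary illustrative frequencies. For this integrator, the trajectory of a harmonic oscillator is stable only when the product of the angular frequency and the time step is below 2.

```python
def energy_ratio(omega, dt, n_steps, x0=1.0, v0=0.0):
    """Integrate a unit-mass harmonic oscillator with velocity Verlet.

    The potential is U(x) = 0.5 * omega**2 * x**2. Returns the ratio of the
    final to the initial total energy; a value near 1 means the energy is
    well conserved, a huge value means the integration has blown up.
    """
    def force(x):
        return -omega ** 2 * x

    def energy(x, v):
        return 0.5 * v * v + 0.5 * omega ** 2 * x * x

    x, v = x0, v0
    e_start = energy(x, v)
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f   # half-step velocity update
        x = x + dt * v_half         # full-step position update
        f = force(x)                # force at the new position
        v = v_half + 0.5 * dt * f   # second half-step velocity update
    return energy(x, v) / e_start


if __name__ == "__main__":
    # Same time step, two different (illustrative) frequencies
    print("slow mode:", energy_ratio(omega=0.2, dt=1.0, n_steps=50))
    print("fast mode:", energy_ratio(omega=2.5, dt=1.0, n_steps=50))
```

With the same time step, the slow oscillator conserves energy closely while the fast one diverges, which is why the fastest motion in the system sets the time step.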

Figure 1-2. The curves represent the trajectories of simple harmonic oscillators with a high frequency (left) and a low frequency (right). The black arrows are the trajectory traced out by integrating Newton's laws numerically using a time step of 1 time unit in the plot. The red line is the analytical trajectory of simple harmonic oscillation.

While atomic forces can be derived from QM calculations on molecular systems and MD can be performed using this potential, the massive computational expense of QM models hinders their utility for large biomolecules. It is necessary, therefore, to develop a model that can accurately describe large molecules while being simple enough to solve with reasonable computational effort. For that, we turn to molecular mechanics.

1.2 Molecular Mechanics

We saw in Sec. 1.1.1.2 that computational chemists use quantum mechanics to solve the electronic Schrödinger equation in order to calculate the energy as a function of nuclear coordinates. For small molecules containing 20-30 atoms, there are typically a small number of conformations that the molecule can reasonably adopt at typical
temperatures, and partition functions can be reasonably approximated using only a handful of different structures.

For larger systems, however, it becomes increasingly difficult to use QM methods for two reasons. First, the computational demand for obtaining the energy of a single structure rapidly increases. Second, phase space becomes so massive that calculating the potential energy of a small number of snapshots is no longer a reasonable approximation to the partition function. For these reasons, we seek to develop a model with which we can efficiently calculate interatomic potentials in molecular systems without solving the electronic Schrödinger equation. This model will fit a simple functional form to the potential of the molecule, describing the interaction between every pair of atoms in the system. These functions typically have analytic derivatives that can be rapidly evaluated to facilitate their use in MD simulations. Because the analytic gradients of these potentials give the forces that act on the atomic centers, these molecular mechanical models are called force fields. I will now discuss how these force fields are designed, with special attention paid to the Amber family of force fields.

1.2.1 Force Fields

In this section, I will discuss the various parameters found in common force fields, including bonds, angles, torsions, and non-bonded interactions.

1.2.1.1 Bonds

I begin with modeling the chemical bond. Using the Born-Oppenheimer approximation, we can calculate the potential energy surface for a chemical bond by calculating the potential energy at different nuclear separations using an appropriate QM methodology. A common choice of function to reproduce the 'correct' potential energy surface is a Taylor series expansion centered around the equilibrium bond length. This series can be truncated at any order to achieve the desired accuracy and precision. An example for the hydrogen molecule is shown in Fig. 1-3, where the 'exact' potential energy surface is taken from Ref. 21. When the deviation from the minimum bond length is small, the
potential behaves like a simple harmonic oscillator obeying the potential

$$U(\vec{x}) = \frac{1}{2} k (\vec{x} - \vec{x}_{eq})^2 \qquad (1)$$

where $\vec{x}_{eq}$ is the equilibrium bond length.

Figure 1-3. The exact potential energy surface for H$_2$ [21] plotted with the best-fitting quadratic and quartic polynomials and the best-fitting Morse potential (Eq. 1).

Another function commonly used to model chemical bonds, called the Morse potential, is shown in Eq. 1. The Morse potential has the benefit that it can model bond dissociation ($D_{\vec{x}}$ in Eq. 1), an effect that cannot be captured with a low-order, truncated Taylor series expansion. It is used less frequently than a second- to fourth-order truncated Taylor series, however, because it is costlier to compute and most simulations employing force fields study conformations in which bonds remain close to their equilibrium values. When bond lengths deviate little from equilibrium, the difference between the Morse potential and a quadratic (or quartic) polynomial is small.[22]
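
To make the comparison between the harmonic and Morse forms concrete, the sketch below evaluates both for a single bond. It assumes the standard Morse form $U = D[1 - \exp(-\alpha(x - x_{eq}))]^2$ and matches the harmonic force constant to the curvature of the Morse well at its minimum, $k = 2D\alpha^2$. The numerical parameters are rough, illustrative values for H$_2$ and are not the fitted values behind Fig. 1-3.

```python
import math

D_E = 109.5    # well depth in kcal/mol (illustrative value)
ALPHA = 1.94   # Morse width parameter in 1/Angstrom (illustrative value)
X_EQ = 0.741   # equilibrium bond length in Angstrom (illustrative value)
K_HARM = 2.0 * D_E * ALPHA ** 2  # curvature of the Morse well at X_EQ


def harmonic(x):
    """Quadratic bond potential, 0.5 * k * (x - x_eq)**2."""
    return 0.5 * K_HARM * (x - X_EQ) ** 2


def morse(x):
    """Morse bond potential, D * (1 - exp(-alpha * (x - x_eq)))**2."""
    return D_E * (1.0 - math.exp(-ALPHA * (x - X_EQ))) ** 2


if __name__ == "__main__":
    for dx in (0.01, 0.05, 0.2, 1.0, 5.0):
        x = X_EQ + dx
        print(f"dx = {dx:4.2f} A  harmonic = {harmonic(x):10.3f}  "
              f"morse = {morse(x):8.3f} kcal/mol")
```

Close to the minimum the two curves nearly coincide; at large separations the Morse energy levels off at the dissociation energy while the harmonic energy keeps growing.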

$$U(\vec{x}) = D_{\vec{x}} \left[1 - \exp\left(-\alpha_{\vec{x}} (\vec{x} - \vec{x}_{eq})\right)\right]^2 \qquad (1)$$

Bond parameters can be derived from either high-level quantum calculations, such as those shown in Fig. 1-3, or from experimental measurements. Vibrational force constants ($k$ in Eq. 1) and dissociation energies ($D_{\vec{x}}$ in Eq. 1) can be determined spectroscopically and subsequently used to define the bond parameters.

1.2.1.2 Angles

A valence angle is defined as the angle ($\theta$) between atoms separated by two consecutive bonds (see Fig. 1-5 B). Like bonds, valence angles behave like simple harmonic oscillators when they are sufficiently close to the equilibrium value. As a result, they are typically treated with the simple quadratic potential function $U(\theta) = \frac{1}{2} k (\theta - \theta_{eq})^2$. Angle parameters, too, can be derived from either high-level QM calculations or from spectroscopic measurements. Infrared spectroscopy is particularly well-suited for deriving these parameters, since vibrational frequencies correspond with harmonic force constants.

1.2.1.3 Torsions

A torsion is defined between four atoms connected by three sequential bonds; for simplicity I will reference the atom numbers from the labels in Fig. 1-5 C. The torsion angle ($\phi$), then, is the angle between the first and third bonds when projected onto a plane whose normal vector is the second (central) bond. This projection is easily visualized with the Newman projection of a torsion, shown in Fig. 1-5 D. It should be apparent that torsion potentials should repeat with a maximum period of 360° since torsion angles separated by 360° are identical.

The functional form used for torsions is different from that used for bonds and angles. While a periodic function can be represented by a Taylor series polynomial of infinite order, a Fourier series is far more suited to fitting torsion potentials than Taylor
series since the basis functions of a Fourier series are, themselves, periodic. A common functional form for torsion potentials is given in Eq. 1.[22]

$$U(\phi) = \sum_i^N k_i \left[1 + \cos(n_i \phi + \psi_i)\right] \qquad (1)$$

where the torsion potential is represented as a sum of $N$ terms with barrier heights $k_i$, periodicities $n_i$, and phase shifts $\psi_i$.

Torsion potentials are easily the most important of all bonded parameters in force fields. Bonds and angles are relatively rigid, since they are often modeled by quadratic potentials with modestly large force constants. Even making the force constants for bonds and angles two times larger than they should be will result in only a small change in conformational sampling. Torsion potentials, on the other hand, typically have much smaller barriers and give rise to far more significant conformational changes.

Consider the ethane molecule in which torsions are defined between H-C-C-H. At room temperature, neither the individual bonds nor the angles will deviate much from their equilibrium values, but the torsion angle will readily sample every value due to the low energy barriers between staggered and eclipsed conformations. In order to accurately calculate the partition function, then, a force field must properly reproduce the energy barriers along the torsion coordinate to provide a reasonable estimate of the thermodynamic properties of ethane.

Unlike bond and angle parameters, there are no spectroscopic techniques that can be used to extract torsion parameters. Furthermore, force field parameters are not orthogonal with one another; for example, different choices for non-bonded potential terms (described in Sections 1.2.1.4 and 1.2.1.5) will impact torsion profiles. Therefore, torsion terms are typically the last values fitted when designing a force field, and are used as correctional terms to 'fix' the deficiency of the other force field parameters in describing conformational equilibria. Force fields are often systematically improved just by changing some torsion terms.[23 25]
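
The Fourier form of the torsion potential is straightforward to evaluate. The sketch below uses a single effective threefold term as a stand-in for the H-C-C-H torsion of ethane; the amplitude of 1.45 kcal/mol (which gives a barrier of about 2.9 kcal/mol for this one-term form) is an illustrative assumption rather than a parameter from any particular force field. It also compares the Boltzmann weights of the staggered and eclipsed geometries at room temperature.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def torsion_energy(phi, terms):
    """U(phi) = sum_i k_i * [1 + cos(n_i * phi + psi_i)], with phi in radians."""
    return sum(k * (1.0 + math.cos(n * phi + psi)) for k, n, psi in terms)


def boltzmann_ratio(phi_a, phi_b, terms, temperature=300.0):
    """Relative Boltzmann weight of geometry a versus geometry b."""
    beta = 1.0 / (K_B * temperature)
    delta = torsion_energy(phi_a, terms) - torsion_energy(phi_b, terms)
    return math.exp(-beta * delta)


# One effective term: (k in kcal/mol, periodicity n, phase shift psi in radians)
ETHANE_LIKE = [(1.45, 3, 0.0)]

if __name__ == "__main__":
    eclipsed, staggered = 0.0, math.pi / 3.0
    print("U(eclipsed)  =", torsion_energy(eclipsed, ETHANE_LIKE))
    print("U(staggered) =", torsion_energy(staggered, ETHANE_LIKE))
    print("weight staggered/eclipsed at 300 K =",
          boltzmann_ratio(staggered, eclipsed, ETHANE_LIKE))
```

Because the barrier is only a few multiples of the thermal energy, eclipsed geometries are disfavored but still reachable, in contrast to the much stiffer bond and angle terms.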

1.2.1.4 Electrostatic Interactions

The three potentials that I just discussed are called bonded interactions since they occur between atoms connected by bonds. Potentials between all atoms are called non-bonded interactions. The first of the non-bonded interactions I will discuss arises due to charge-charge interactions, typically referred to as electrostatic interactions.

Atoms treated in a force field are assigned partial charges that roughly correspond to atom electronegativities, although each force field has a precise recipe for deriving partial atomic charges. A common strategy to assign partial charges is to fit to an electrostatic potential (ESP) calculated using a QM method. It is common practice to apply constraints to the fit to ensure that rotationally degenerate atoms (e.g., the three hydrogen atoms in a freely rotating methyl group) have the same charge, a technique referred to as restrained electrostatic potential (RESP).[26 28]

There are two principal charge-charge interaction models utilized in modern force fields: so-called polarizable and fixed-charge force fields. The polarizable force fields allow the partial atomic charge of each atom to change in response to its surroundings, providing additional flexibility to force field parametrization. Due to the added computational expense of computing polarizable potentials and the difficulty this imposes on deriving other aspects of the force field, fixed-charge force fields (i.e., force fields where partial atomic charges never change) are more commonly used. All future discussion in this dissertation of electrostatic interactions in the MM framework will focus on fixed-charge, monopole-monopole interactions. The electrostatic potential is calculated according to

$$U(r_{i,j}) = k \frac{q_i q_j}{r_{i,j}} \qquad (1)$$

In Eq. 1, $k$ is the electrostatic constant, $q_i$ is the partial charge on atom $i$, and $r_{i,j}$ is the distance between atoms $i$ and $j$. One thing to note about Eq. 1 is the long-ranged nature of the interaction. While the electrostatic energy of two charged particles
falls to 0 as the distance between them becomes infinite, $1/i$ decays so slowly that $\sum_{i=1}^{\infty} 1/i = \infty$. Therefore, electrostatic interactions typically have to be evaluated over a very long distance (or calculated completely).

1.2.1.5 van der Waals Interactions

In addition to electrostatic interactions, force fields also employ another non-bonded potential that accounts for van der Waals interactions. The van der Waals potential is composed of two parts: a strongly repulsive term that models steric clashes and an attractive term accounting for dispersion interactions. The attractive term of the van der Waals potential is derived mostly from the London dispersion forces, shown for an ideal gas dimer in Eq. 1.[3]

$$U(r_{i,j}) = -\frac{3}{2} \frac{I \alpha^2}{r_{i,j}^6} \qquad (1)$$

where $I$ is the first ionization energy and $\alpha$ is the polarizability. This attractive interaction arises even in noble gases due to instantaneous atomic polarization caused by correlated movements of the electrons.

The most common functional form used to model van der Waals interactions is called the Lennard-Jones (LJ) potential, shown in Eq. 1.

$$U_{LJ}(r_{i,j}) = 4\varepsilon_{i,j}\left[\left(\frac{\sigma_{i,j}}{r_{i,j}}\right)^{12} - \left(\frac{\sigma_{i,j}}{r_{i,j}}\right)^{6}\right] = 4\varepsilon_{i,j}\left[\frac{1}{4}\left(\frac{R_{min,i,j}}{r_{i,j}}\right)^{12} - \frac{1}{2}\left(\frac{R_{min,i,j}}{r_{i,j}}\right)^{6}\right] = \frac{a_{i,j}}{r_{i,j}^{12}} - \frac{b_{i,j}}{r_{i,j}^{6}} \qquad (1)$$

where $r_{i,j}$ is the distance between atoms $i$ and $j$, and the remaining terms are labeled in a schematic diagram showing the nature of the LJ potential in Fig. 1-4. The three forms
of Eq. 1 are equivalent if

$$R_{min,i,j} = 2^{1/6} \sigma_{i,j}, \qquad a_{i,j} = \varepsilon_{i,j} R_{min,i,j}^{12}, \qquad b_{i,j} = 2 \varepsilon_{i,j} R_{min,i,j}^{6}$$

Due to its computational efficiency, the third form of Eq. 1 is typically used in molecular simulations. To use Eq. 1 in these simulations, the $a_{i,j}$ and $b_{i,j}$ values must be computed for every pair of atoms in the system. For transferable force fields (i.e., force fields whose parameters can be used for many different, but related, systems), each type of atom defined in the force field is typically assigned an individual $\varepsilon$ and $\sigma$ parameter, which must be combined with every other atom type to yield $a_{i,j}$ and $b_{i,j}$. The way in which these individual atomic parameters are mixed is referred to as the combining rules.

1.2.1.6 Other Force Field Terms

The parameters presented in the previous sections make up the bulk of all parameters found in typical force fields. In this section I will describe some of the less commonly used types of parameters.

Improper Torsions. Improper torsions are typically used as correction terms to control out-of-plane motion. There are numerous instances where four or more atoms should be predominantly coplanar, such as aromatic five- and six-membered rings. The existing parameters I have already mentioned do not necessarily ensure that the proper planarity of these systems will be maintained. As a result, improper torsion terms are added to the force field in key locations to suppress unwanted out-of-plane motion. A diagrammatic depiction of an improper torsion is shown in Fig. 1-5 E.

Correction Map. Torsion potentials are so important to ensuring that MM simulations generate a sensible conformational ensemble that some force fields parametrize coupled torsion parameters to improve the accuracy. The most common implementation
of these coupled-torsion corrections is done in the form of a correction map, or CMAP term.[29] The CMAP is generated by mapping the potential energy surface of two torsions in a small sample system without the CMAP correction and subtracting that from the 'true' potential energy surface calculated with some high-level QM method. The CMAP is then laid out on a grid, using some type of interpolating spline (e.g., bicubic splines) to calculate potential energies and forces during MD simulations. A schematic of the coupled torsions commonly parametrized via CMAPs is shown in Fig. 1-5 F.

Figure 1-4. The Lennard-Jones potential between two atoms with an $R_{min,i,j}$ of 3.816 Å and $\varepsilon$ of 0.1094 kcal mol$^{-1}$. The various parameters are indicated on the graph, and the full LJ potential is shown alongside its repulsive and attractive terms.

Urey-Bradley. Another parameter commonly used in CHARMM force fields[30] is called the Urey-Bradley potential. The functional form of the Urey-Bradley term is identical to the bond term in Sec. 1.2.1.1 (Eq. 1), but exists between atoms
separated by two bonds (i.e., forming a valence angle). The Urey-Bradley term is shown in Fig. 1-5 B alongside the valence angle.

Figure 1-5. Schematics of various parameters present in typical force fields. A) is a bond parameter, B) shows the valence angle parameter and the Urey-Bradley parameter, where $\vec{u}$ is the shown distance, C) depicts a torsion, D) depicts the same torsion using a Newman projection, E) depicts an improper torsion, and F) depicts two coupled torsions alongside a typical free energy map of two torsions that CMAP parameters attempt to fit.

1.2.2 The Amber Force Field

The Amber force field is a popular family of force fields designed to treat large biomolecules such as proteins, DNA, and RNA. This section will focus on the functional form and implementation of the Amber force fields[23 31 32] in the supporting Amber programs.[33]

1.2.2.1 Functional Form

The functional form of the Amber force fields is typically presented as in Eq. 1,[31] although this is an incomplete specification. A more rigorous definition is presented in Eq. 1, taking into account the proper exclusion of non-bonded terms between bonded atoms.

$$\begin{aligned} U(q) = {} & \sum_{bonds} K_r (r - r_{eq})^2 + \sum_{angles} K_\theta (\theta - \theta_{eq})^2 + \sum_{torsions} \frac{V_n}{2} \left[1 + \cos(n\phi - \gamma)\right] \\ & + \frac{1}{2} \sum_i \sum_j \left[\frac{A_{i,j}}{R_{i,j}^{12}} - \frac{B_{i,j}}{R_{i,j}^{6}} + \frac{k_{elec} q_i q_j}{R_{i,j}}\right] \end{aligned} \qquad (1)$$

$$\begin{aligned} U(q) = {} & \sum_{bonds} K_r (r - r_{eq})^2 + \sum_{angles} K_\theta (\theta - \theta_{eq})^2 + \sum_{torsions} \frac{V_n}{2} \left[1 + \cos(n\phi - \gamma)\right] \\ & + \frac{1}{2} \sum_i \sum_{j \in l_{1-4,i}} \left[\frac{A_{i,j}}{2.0\, R_{i,j}^{12}} - \frac{B_{i,j}}{2.0\, R_{i,j}^{6}} + \frac{k_{elec} q_i q_j}{1.2\, R_{i,j}}\right] \\ & + \frac{1}{2} \sum_i \sum_{j \notin l_{excl,i}} \left[\frac{A_{i,j}}{R_{i,j}^{12}} - \frac{B_{i,j}}{R_{i,j}^{6}} + \frac{k_{elec} q_i q_j}{R_{i,j}}\right] \end{aligned} \qquad (1)$$

Amber employs simple harmonic potentials for bonds and angles that completely describe the interactions between atoms separated by one and two bonds; i.e., no electrostatic or Lennard-Jones potentials are calculated between pairs of atoms connected by a bond or angle. Torsions are treated with a truncated Fourier series expansion, typically using integral values for the periodicity ($n$ in Eqs. 1 and 1). Therefore, the sum over torsions in Eqs. 1 and 1 is a sum over all individual torsion terms for each distinct torsion. Improper torsions are modeled the same way as 'proper' torsions with only a single term designed to maintain their planar geometry.

The non-bonded interactions are composed of a Lennard-Jones term (the third form of Eq. 1) and an electrostatic term calculated between all atom pairs that are not excluded from the computation. The non-bonded exclusion list $l_{excl,i}$ for atom $i$ in Eq.
1 is composed of all atoms separated by one, two, and three bonds (i.e., that form bonds, angles, or torsions with atom $i$). Finally, the LJ interactions between all atoms separated by three bonds are scaled by 1/2 and the electrostatic interactions between those atom pairs are scaled by 1/1.2. In Eq. 1, $l_{1-4,i}$ represents the list of atoms related to atom $i$ like atoms 1 and 4 in Fig. 1-5 C.

1.2.2.2 Implementation

This section will describe how the Amber family of force fields is implemented in the Amber program suite. It is important to note that the Amber force field is not unique; there are many variants, each with a different name.[23 31 32 34 36]

All information necessary to fully describe a molecular system with the Amber force field is contained in two files: the parameter-topology file (prmtop) and the coordinate file. The prmtop file, fully described in Appendix B, contains all of the information regarding the bonded network and the necessary parameters for evaluating Eq. 1. The coordinate file contains the Cartesian coordinates and velocities for each atom in the system described by the prmtop file.

The prmtop file is generated by the tleap program by matching the parameters from a database to the assigned 'atom types' of the input structure. Atom types are descriptors of individual atoms that specify the properties and typical chemical structure of bonds involving that atom. Each atom type, $i$, has a predetermined set of atom parameters: an atomic mass, a LJ radius $r_i$, and a LJ well depth $\varepsilon_i$. The pairwise $R_{min,i,j}$ in Eq. 1 of atom types $i$ and $j$ is the sum $r_i + r_j$. The combined well depth $\varepsilon_{i,j}$ is the geometric mean of the individual well depths ($\sqrt{\varepsilon_i \varepsilon_j}$). These are the so-called combining rules employed by tleap when parametrizing a molecule with an Amber force field.

The Amber parameter databases store a list of all recognized atom types as well as the bond, angle, and torsion parameters between the various bonded arrangements of the available atom types. For instance, each pair of atom types that could form a bond
(e.g., two aromatic carbons or an aromatic carbon and an aromatic hydrogen) has an equilibrium bond length and bond force constant associated with it. Also stored in these parameter databases are the equilibrium angle displacements with corresponding force constants and torsion parameters (periodicities, barrier heights, and phase shifts for each term of every torsion).
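
The combining rules described above can be sketched in a few lines. The function names below are hypothetical and are not part of tleap or any Amber program; the per-atom radius and well depth are chosen so that two identical atoms reproduce the pair values quoted in Fig. 1-4 (an $R_{min}$ of 3.816 Å and a well depth of 0.1094 kcal/mol).

```python
import math


def combine(r_i, eps_i, r_j, eps_j):
    """Return (R_min_ij, eps_ij): radii add, well depths use the geometric mean."""
    return r_i + r_j, math.sqrt(eps_i * eps_j)


def lj_coefficients(r_min, eps):
    """Coefficients (a, b) of the a/r^12 - b/r^6 form of the LJ potential."""
    a = eps * r_min ** 12
    b = 2.0 * eps * r_min ** 6
    return a, b


def lj_energy(r, a, b):
    """Lennard-Jones energy at separation r (same length unit as r_min)."""
    return a / r ** 12 - b / r ** 6


if __name__ == "__main__":
    # Per-atom LJ radius (Angstrom) and well depth (kcal/mol), illustrative
    r_min, eps = combine(1.908, 0.1094, 1.908, 0.1094)
    a, b = lj_coefficients(r_min, eps)
    print("R_min =", r_min, " eps =", eps)
    print("U(R_min) =", lj_energy(r_min, a, b))  # equals -eps at the minimum
```

Precomputing the pair coefficients once per pair of atom types means the energy evaluation itself needs only two powers of the distance.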

CHAPTER 2
BIOMOLECULAR SIMULATION: SAMPLING AND FREE ENERGY

All of the simulation models discussed in Chapter 1 use an electrostatic equation dealing with charges interacting in a vacuum. However, biological chemistry occurs almost exclusively in an aqueous environment, necessitating the development of models capable of simulating these systems in solution. In this chapter, I will describe the various methods by which solvent effects are introduced into simulation, followed by the ensemble sampling techniques that will be used for the principal studies in this dissertation.

2.1 Simulations in the Condensed Phase

The techniques by which solvation effects can be incorporated into various computational models can be separated into two groups. The most obvious way is to include the solvent atoms and molecules directly in the simulation alongside the system of interest; these approaches are referred to collectively as explicit solvent methods. While explicit solvent models are the most accurate approach in principle, they drastically increase the size of the system, simultaneously increasing the cost of the simulation and the amount of sampling required to obtain converged results.

An alternative way to include solvent effects is by modifying the electrostatic interactions in a system to account for the natural screening that a particular solvent provides. These approaches are called implicit solvent methods because solvent effects are included in an average way without including the actual solvent atoms or molecules in the simulation. Simulations employing implicit solvent models result in smaller systems in which conformational sampling converges more rapidly because the solvent degrees of freedom are already included in a mean-field way. However, individual solvent molecules often play a critical role in the structure and function of biological molecules and behave very differently from molecules in bulk solvent, an effect implicit solvent models are ill-equipped to handle.

The following sections describe the various implicit and explicit solvent models commonly used in biomolecular simulations.

2.1.1 Implicit Solvent

One of the most important qualities of a solvent, especially an aqueous solvent, is its ability to polarize in response to an electric field, thereby reducing the magnitude of electrostatic interactions across a given distance. While the naive approach of simply applying the solvent dielectric everywhere is attractive in its simplicity, solvent-excluded regions should obviously not be subject to the screening effects of the solvent. For large biomolecules, the solvent-excluded regions can be quite large, so it becomes important to deal with these regions effectively.

2.1.1.1 Distance-dependent Dielectric

Among the earliest approaches to account for the different dielectric environments of the interior of a biomolecule and bulk solvent was the introduction of a dielectric constant that changes as a function of the distance between two charged particles. As the separation between two particles increases, so too does the likelihood that they are separated by solvent and are therefore subject to dielectric screening effects. This approach is attractive in its simplicity: it adds little to the computational cost of the model while retaining the simple, pairwise-decomposable nature of the electrostatic potential term. A common equation modeling the dielectric constant is given below in Eq. 2.[12]

$$\varepsilon_e(r) = \varepsilon_{bulk} - \frac{\varepsilon_{bulk} - 1}{2} \left[(rS)^2 + 2rS + 2\right] \exp(-rS) \qquad (2)$$

where $r$ is the distance between the two particles, $\varepsilon_{bulk}$ is the dielectric constant of the bulk, $\varepsilon_e$ is the effective dielectric constant at a given particle separation, and $S$ is a free parameter. Fig. 2-1 plots the resulting curve for $\varepsilon_e$ from Eq. 2 for different values of the free parameter.
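
A short numerical sketch makes the limiting behavior of this kind of distance-dependent dielectric explicit. It assumes the sigmoidal form $\varepsilon_e(r) = \varepsilon_{bulk} - \frac{\varepsilon_{bulk} - 1}{2}[(rS)^2 + 2rS + 2]\exp(-rS)$, a bulk dielectric constant of 78.5 for water, and arbitrary illustrative values of $S$; none of these numbers are taken from the simulations in this work.

```python
import math

EPS_BULK = 78.5  # assumed bulk dielectric constant of water


def effective_dielectric(r, s, eps_bulk=EPS_BULK):
    """Sigmoidal distance-dependent dielectric.

    r is the particle separation (Angstrom) and s the free parameter
    (1/Angstrom). The value is 1 at r = 0 and tends to eps_bulk at large r.
    """
    rs = r * s
    damping = (rs * rs + 2.0 * rs + 2.0) * math.exp(-rs)
    return eps_bulk - 0.5 * (eps_bulk - 1.0) * damping


if __name__ == "__main__":
    for s in (0.15, 0.30):  # illustrative values of the free parameter
        values = [round(effective_dielectric(r, s), 2) for r in (0, 5, 10, 20, 40)]
        print("S =", s, "->", values)
```

Larger values of the free parameter make the effective dielectric rise to its bulk value over a shorter distance.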

Figure 2-1. Distance-dependent dielectric for different values of the free parameter $S$ in Eq. 2.

This effective dielectric constant is then incorporated as $\varepsilon$ in Eq. 1, and influences the calculated forces due to its dependence on $r_{i,j}$. One of the biggest weaknesses of distance-dependent dielectrics is that they treat every atom in the biomolecule as though it were in the same environment, whereas the shapes of biomolecules and their solvent-excluded volumes are often highly irregular. That is, two atoms buried inside the solvent-excluded volume separated by $d$ Å are treated exactly the same way as two different atoms $d$ Å apart whose interstitial region is solvent-accessible. Furthermore, because the shapes of biomolecules can vary greatly from system to system, the 'optimal' value for $S$ in Eq. 2 is highly system-dependent. Finally, while the true dielectric takes either the value of the bulk solvent or that of the molecule interior, a distance-dependent dielectric has a large region corresponding to unphysical, intermediate values of the dielectric.

For these reasons, the distance-dependent dielectric model is rarely used in modern simulations, having given way to more accurate methods like the Poisson-Boltzmann and Generalized Born equations.

2.1.1.2 Poisson-Boltzmann

At the heart of most modern implicit solvent models lies the Poisson equation

$$\nabla \cdot \varepsilon(\vec{r}) \nabla \phi(\vec{r}) = -4\pi\rho(\vec{r})$$

where $\phi$ is the electrostatic potential distribution function, $\rho$ is the charge distribution function, and $\varepsilon$ is the dielectric constant at a given point in space. The dielectric constant is often divided into two regions: a region of low dielectric in the solvent-excluded volume and that of the bulk solvent 'outside' the system of interest. [22]

The Poisson equation is only valid, however, at zero ionic strength. When mobile ions are present, as is the case in vivo with all biomolecules, the Poisson equation must be augmented with an appropriate distribution of counterions. The probability of finding an ion in a particular region of space is related to its Boltzmann factor $\exp(-\beta q \phi(\vec{r}))$, where $q\phi(\vec{r})$ is the energy of a point charge in a given electrostatic potential. Because ions come in pairs with both positive and negative charges, the Boltzmann probability of finding both types of ions must be included. The equation for calculating the electrostatic potential in a biomolecular system with a given solution ionic strength, termed the Poisson-Boltzmann (PB) equation, is shown in Eq. 2. [22]

$$
\begin{aligned}
\nabla \cdot \varepsilon(\vec{r}) \nabla \phi(\vec{r}) - \varepsilon(\vec{r})\lambda(\vec{r})\kappa^2 \frac{k_B T}{2q}\exp(\beta q \phi(\vec{r})) + \varepsilon(\vec{r})\lambda(\vec{r})\kappa^2 \frac{k_B T}{2q}\exp(-\beta q \phi(\vec{r})) &= -4\pi\rho(\vec{r}) \\
\nabla \cdot \varepsilon(\vec{r}) \nabla \phi(\vec{r}) - \varepsilon(\vec{r})\lambda(\vec{r})\kappa^2 \frac{k_B T}{q}\sinh\left(\frac{q\phi(\vec{r})}{k_B T}\right) &= -4\pi\rho(\vec{r})
\end{aligned}
\tag{2}
$$

In Eq. 2, $q$ is the charge of the ions (both positive and negative ions are present), $\lambda(\vec{r})$ is a simple switching function that is 0 in solvent-excluded regions and 1 in solvent-accessible regions, and $\kappa^2$ is related to the ionic strength $I$ as

$$\kappa^2 = \frac{8\pi q^2 I}{\varepsilon k_B T}$$

Eq. 2 is a non-linear, second-order differential equation in the electrostatic potential that must be solved iteratively until the desired level of self-consistency in the electrostatic potential is achieved. The sinh term in Eq. 2 may be expanded using its Taylor series expansion. If the ionic strength is low and the solute is not highly charged (so $\phi(\vec{r})$ is relatively small), the Taylor series expansion for sinh can be truncated after the first term to yield the much simpler Eq. 2 with little loss of accuracy. Eq. 2 is called the linearized Poisson-Boltzmann equation because the Taylor series expansion for sinh is truncated after its linear term.

$$\nabla \cdot \varepsilon(\vec{r}) \nabla \phi(\vec{r}) - \varepsilon(\vec{r})\lambda(\vec{r})\kappa^2 \phi(\vec{r}) = -4\pi\rho(\vec{r}) \tag{2}$$

The Poisson equation can be solved exactly for only the simplest systems, like solvating a point charge or a conducting sphere with a uniform charge distribution on its surface. The full or linearized PB equation (Eq. 2 or 2) must be solved numerically for complex biomolecules with irregular shapes. A common approach is to set up a three-dimensional grid surrounding the solute and calculate the charge distribution on the grid from the partial charges of each solute atom. The dielectric boundary can be calculated from the solvent accessible surface [37], so each grid point has an associated charge and dielectric value. The differential equations can then be solved via finite differences within the defined grid. [38]

After the electrostatic potential is calculated via Eq. 2, the free energy is calculated by integrating the product of the charge distribution and the calculated electrostatic
potential according to

$$G = \frac{1}{2}\int \rho(\vec{r})\,\phi(\vec{r})\,d\vec{r}$$

where the 1/2 factor corrects for double-counting the interactions. The free energy of solvation due to solvent polarization is calculated from the difference in the electrostatic potentials in vacuum and solvent ($\phi_{solv} - \phi_{vac}$), a quantity referred to as the reaction field. [12] The charge-dependent portion of the solvation free energy then becomes

$$\Delta G_{pol} = \frac{1}{2}\int \rho(\vec{r})\left(\phi_{solv}(\vec{r}) - \phi_{vac}(\vec{r})\right)d\vec{r} \tag{2}$$

Models employing implicit solvent via the PB equation have proven effective in many cases. [38 41] However, due to the requirements of a fairly dense grid and the iterative, self-consistent nature of solving the PB equation, the computational cost of this model is too high for many applications. Furthermore, the dielectric function is discontinuous at the boundaries of the solvent-excluded and solvent-accessible regions, making stable gradients (and therefore forces) difficult to calculate. [42] Therefore, I will now consider a common approximation to the PB equation called the Generalized Born model that seeks to provide an efficient, analytical alternative to solving the PB equation.

2.1.1.3 Generalized Born

While the electrostatic potential generated by most charged species cannot be solved analytically using the Poisson equation, I will consider two simple, ideal systems that can. The first is a perfect conducting sphere of radius $r$ with a uniform charge distribution. Given a total charge $q$ and using the Poisson equation to calculate the electrostatic potential induced by the charged sphere, the polar contribution to the free energy of solvation can be calculated from Eq. 2, giving the familiar Born equation, shown below. [22]

$$\Delta G_{pol} = -\frac{1}{2}\left(\frac{1}{\varepsilon_{vac}} - \frac{1}{\varepsilon_{bulk}}\right)\frac{q^2}{r} \tag{2}$$

In Eq. 2, $\varepsilon_{vac}$ is the dielectric constant of a vacuum, which is unity. It is shown explicitly here to demonstrate that the dielectric constant of the solvent-excluded volume in the Poisson equation does not appear in the Born equation. If, instead of being a perfect conducting sphere with a uniform charge distribution, the sphere had a perfect dipolar charge distribution, the free energy of solvation using the Poisson equation would result in the Kirkwood-Onsager equation, shown below. [22]

$$\Delta G_{pol} = -\frac{1}{2}\,\frac{2(\varepsilon - 1)}{2\varepsilon + 1}\,\frac{\mu^2}{r^3} \tag{2}$$

where $\mu$ is the dipole moment, $\varepsilon$ is the dielectric constant of the bulk solvent, and the dielectric constant of vacuum has simply been replaced by 1.

The Generalized Born (GB) formalism for calculating the polar contribution to the solvation free energy is, as its name would suggest, an extension of the Born solution shown in Eq. 2 to complex molecules with an arbitrary size and shape. [43 47] Still et al. were the first to propose the method, adjusting the Born equation (Eq. 2) as shown below. [43]

$$\Delta G_{pol} = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon}\right)\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{q_i q_j}{f_{GB}} \tag{2}$$

where $q_i$ is the charge of atom $i$, $\varepsilon$ is the dielectric constant of the solvent, and $f_{GB}$ is an arbitrary, analytic function of atom positions designed to calibrate Eq. 2 to experiment. The most common form of $f_{GB}$ devised by Still et al., and still used predominantly today, is shown in Eq. 2

$$f_{GB} = \sqrt{r_{i,j}^2 + \alpha_i \alpha_j \exp\left(-\frac{r_{i,j}^2}{4\alpha_i\alpha_j}\right)} \tag{2}$$

where $r_{i,j}$ is the distance between atoms $i$ and $j$ and $\alpha_i$ is called the effective Born radius of atom $i$ for reasons that will soon be apparent.

Eq. 2 does not represent a theoretically 'correct' choice for $f_{GB}$, nor has it been shown to be the best choice; in fact, it probably is not. [48] However, it is a good choice
for several reasons. First, Eq. 2 is a simple formula with analytical gradients, assuming $\alpha_i$ is an analytic function of the nuclear positions, and can be computed rapidly. More importantly, however, Eq. 2 has the appropriate limiting behavior. [43] For a single particle, or two identical point charges separated by a distance of 0, $f_{GB}$ reduces to $\alpha_i$ and Eq. 2 reduces to the Born equation (Eq. 2) in which the radius of the 'sphere' is $\alpha_i$. It is for this reason that the $\alpha_i$ values can be thought of as an 'effective' radius. Furthermore, for two point charges separated by a small distance (i.e., smaller than the effective radii of the two particles) the result agrees with the Kirkwood-Onsager solution (Eq. 2) to within 10% of the true value. [43]

The next major challenge in solving Eq. 2 is calculating the effective Born radii, $\alpha_i$, for each atom. The effective radius of an atom reflects the spherically averaged distance of that atom from the solvent excluded surface. Calculating the effective radii is particularly challenging because it must be done rapidly, accurately, and in a way that allows gradients to be computed easily. Because GB was developed as an efficient alternative to solving the PB equation, computationally intensive approaches to calculating the effective radii offer little advantage over using the more precise PB equation. Furthermore, Onufriev et al. have demonstrated the importance of computing effective Born radii accurately, [47] showing that so-called 'perfect radii' reproduce PB results very closely. Finally, gradients are necessary to perform either geometry optimization or molecular dynamics, and an expression that lends itself to rapid computation of an accurate gradient is an attractive feature.

The most common approach to computing the effective radius is called the coulomb field approximation, shown below in Eq. 2. [22]

$$I_i = \int_{i} \frac{d^3r}{4\pi r^4} \tag{2}$$

$I_i$ in Eq. 2 is the coulomb field integral for atom $i$, and the subscript $i$ on the integral signifies that the integral is over all space centered on atom $i$. The effective radius is then computed from this integral using Eq. 2

$$\alpha_i = \left(\rho_i^{-1} - I_i\right)^{-1} \tag{2}$$

where $\rho_i$ is the intrinsic van der Waals radius of atom $i$ and $I_i$ is the integral from Eq. 2.

As computational power increased and simulations reached longer time scales and larger systems, however, deficiencies in these equations began to surface, leading to efforts to improve the calculation of the effective Born radii. [47 49 51] The two approaches that have been implemented in the Amber suite of programs are briefly described below.

Onufriev et al. noticed that Eqs. 2 and 2 tended to underestimate the effective radii of buried atoms because they assumed that interstitial regions of space between atoms were solvent-filled, despite the fact that they were too small to contain a full water molecule. [49] As a result, they modified Eq. 2 into the following form:

$$\alpha_i = \left[\rho_i^{-1} - \rho_i^{-1}\tanh\left(a\Psi - \beta\Psi^2 + \gamma\Psi^3\right)\right]^{-1} \tag{2}$$

where $\Psi = \rho_i I_i$ ($I_i$ is taken from Eq. 2), and $a$, $\beta$, and $\gamma$ are fitting parameters. The tanh function was chosen because it is infinitely differentiable (analytically) and increases the effective radii of more deeply-buried atoms while leaving the effective radii of atoms closer to the surface unchanged. In this way, Eq. 2 maintains the success Eq. 2 displayed for small compounds while improving the behavior of deeply-buried residues. [49] This GB variant is referred to as $\mathrm{GB}^{\mathrm{OBC}}$ (where OBC comes from the authors Onufriev, Bashford, and Case).

Mongan et al. took a different approach. While Eq. 2 provided uniform scaling for all atoms with a given degree of burial (as measured by the value of $I_i$ in Eq. 2),
Mongan et al. adopted an approach based on geometry. By treating each atom and each solvent molecule as a sphere (a good approximation for a water molecule), the interstitial space between two solute atoms that is inaccessible to solvent can be quantified. Because this interstitial region resembles a neck, as seen in Fig. 2-2, this model is referred to as GBneck. [50]

Figure 2-2. Region of space between two atoms $i$ and $j$ of radius $R_i$ and $R_j$ that is inaccessible to a spherical solvent molecule of radius $R_{solv}$. This inaccessible region is called the neck and is shaded gray.

The most recent approach by Nguyen et al. involves a re-parametrization of the intrinsic atomic radii ($\rho_i$ in Eq. 2) for atoms commonly involved in salt bridges and a combination of the ideas presented in the $\mathrm{GB}^{\mathrm{OBC}}$ and GBneck models described above. [51]
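
To make the Generalized Born pairwise sum concrete, the sketch below evaluates the polar solvation energy with the Still et al. form of the interpolation function, taking the effective Born radii as given inputs (computing them is the hard part discussed above and is not attempted here). The function names, the solvent dielectric of 78.5, and the conversion constant 332.06 kcal Å/(mol e^2) are assumptions made for illustration.

```python
import math

COULOMB = 332.06  # kcal*Angstrom/(mol*e^2); assumed unit conversion factor


def f_gb(r, alpha_i, alpha_j):
    """Still et al. interpolation function for separation r and effective radii alpha_i, alpha_j."""
    aa = alpha_i * alpha_j
    return math.sqrt(r * r + aa * math.exp(-r * r / (4.0 * aa)))


def gb_polar_energy(coords, charges, born_radii, eps_solvent=78.5):
    """Polar solvation free energy (kcal/mol) from the double sum over all atom pairs (including i == j)."""
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return -0.5 * (1.0 - 1.0 / eps_solvent) * COULOMB * total


def born_energy(q, radius, eps_solvent=78.5):
    """Born equation for a single charged sphere, with the vacuum dielectric set to 1."""
    return -0.5 * (1.0 - 1.0 / eps_solvent) * COULOMB * q * q / radius


if __name__ == "__main__":
    # A single ion reproduces the Born result, as the limiting behavior requires.
    print(gb_polar_energy([(0.0, 0.0, 0.0)], [1.0], [2.0]), born_energy(1.0, 2.0))
```

For a single atom, or for two coincident charges with equal radii, the double sum collapses to the Born expression, which is the limiting behavior that motivates calling these radii 'effective'.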

2.1.1.4 Non-polar Solvation

The process of solvation can be broken down into two fictitious steps: a cavitation step in which the solvent is excluded from a region of space equal to the solute's solvent excluded volume, and a charging step where the solvent-polarized charge distribution of the solute is inserted into that cavity. Because the free energy is a state function, this gedanken decomposition will yield an identical free energy to the true experimental free energy, assuming of course that each step can be calculated exactly. The free energies of these two steps are referred to as the non-polar and polar solvation free energy, respectively. The Poisson-Boltzmann and Generalized Born equations shown in Eqs. 2 and 2 are used to compute the polar solvation free energy (i.e., the portion of the free energy derived from the reaction field). There are several methods for calculating the non-polar solvation free energy.

Methods for calculating the non-polar contribution to solvation are often parametrized by assuming that the solvation free energy for extended and branched alkanes is non-polar in nature. The most common way to calculate non-polar solvation is to fit a surface tension value to the experimental solvation free energies of the alkanes. [22] This approach can be rationalized using the idea that the presence of a non-polar solute immersed in solvent disrupts the solvent-solvent interactions, thereby restricting solvent structure in the solvation shell surrounding the solute. This effect imposes an entropic penalty to solvation (that is offset by the polar solvation term for soluble compounds). If this were the only source of the 'non-polar' solvation free energy, then its magnitude would vary with the size of the molecule, which is directly related to its surface area.

Combining this surface-area non-polar solvation term with either the Poisson-Boltzmann or Generalized Born equations for the polar solvation term results in the so-called PBSA and GBSA methods, respectively. One of the most common methods for calculating the surface area in GBSA molecular dynamics simulations is called the linear
combination of pairwise overlaps (LCPO) method, so-called because it is parametrized by fitting five parameters to the spherical overlaps of individual atoms. [52] The chief advantage of LCPO is that it provides an efficient way to calculate surface areas using an analytical formula whose derivatives can be easily calculated for use in molecular dynamics.

2.1.2 Explicit Solvent

While the implicit solvent methods described above are useful ways of incorporating solvation effects in molecular simulations, all solvent effects are accounted for in an average way. Therefore, individual water molecules that play structurally important roles in biomolecules are not treated well by either PB or GB methodologies. In such cases, it is advantageous to include the solvent molecules explicitly in the simulation. Explicit solvent significantly increases the cost of the simulation by adding a large number of atoms to the system, but should improve the accuracy by creating a model closer to reality.

A large drawback when adding explicit solvent, however, is the fact that modern simulations at an atomic resolution (i.e., where all atoms are treated explicitly) are limited to, at most, $10^8$ atoms, [53] although simulations between $10^5$ and $10^6$ atoms are more reasonable. Macroscopic systems, on the other hand, contain on the order of $10^{23}$ atoms, an intractable number for modern hardware, so our simulations must be scaled down to a microscopic size.

As a demonstration, the hairpin ribozyme is a biomolecule that contains roughly 2100 atoms. Adding only 22,000 water molecules, enough to create a 20 Å spherical solvent buffer around the ribozyme, increases the simulation size to roughly 90,000 atoms. It is quite clear, therefore, that even on the largest supercomputers, we can only model a microscopic droplet in explicit solvent. At such small sizes, the ratio of surface area to volume for these minuscule droplets is astronomically large, and water
molecules at the solvent-air interface behave quite differently from those molecules in bulk solvent. While early approaches of applying a 'cap' potential (an artificial biasing potential penalizing solvent that diffuses too far away from the solute) helped overcome some surface effects like evaporation, they made direct comparison to experiment dubious. A major breakthrough in explicit solvent calculations came with the introduction of periodic boundary conditions. [54]

2.1.2.1 Periodic Boundary Conditions

To emulate bulk solution behavior using a system composed of a tractable number of atoms, we impose periodic boundary conditions (PBC) on the system, replicating it infinitely in every dimension. In such a system, each atom interacts with all other atoms in all other simulation cells, including its own periodic images. [54] A two-dimensional illustration of PBC is shown in Fig. 2-3 for a rectangular unit cell illustrating these ideas.

A practice commonly adopted in PBC simulations in which each atom interacts directly with only a single image of every other atom, specifically the nearest image, is called the minimum image convention. The minimum image convention is employed to simplify the problem, but also imposes a limit to the range of calculated interactions. Specifically, the non-bonded interactions do not extend beyond half the length of the shortest side of the unit cell.

Employing the minimum image convention, the energy calculated for a system with PBC is the energy of a single unit cell in the field generated by every periodic cell. The challenge is how to calculate the non-bonded interactions for every atom in the system. The common approaches employed in biomolecular simulation are explored in the next three sections.

2.1.2.2 Cutoff Methods

The simplest approach to calculating non-bonded interactions is to employ a simple cutoff that is smaller than half the length of the shortest side of the unit cell, i.e., all non-bonded interactions between atoms closer than the cutoff are included and all
those between atoms farther apart than the cutoff are neglected.

Figure 2-3. Periodic simulation in two dimensions with a rectangular unit cell. The maximum permissible cutoff ($r_{cut}$) for the minimum image convention is shown with a blue dotted circle centered on particle 1 in the first box.

Because the common forms of the non-bonded potential decay as the distance between atoms increases (see Eqs. 1 and 1), interactions between distant atoms are significantly smaller than interactions between nearby atoms. The non-bonded interactions are modeled as the simple piecewise function shown in Eq. 2

$$U'(\vec{x}_{i,j}) = \begin{cases} U(\vec{x}_{i,j}) & : |\vec{x}_{i,j}| < x_{cut} \\ 0 & : |\vec{x}_{i,j}| \ge x_{cut} \end{cases} \tag{2}$$

While conceptually simple and computationally efficient, simple cutoffs suffer from a severe limitation. The potential, and therefore the force, encounters a discontinuity at the cutoff distance, shown in Fig. 2-4A. This discontinuity results in simulations that do not conserve energy and leads to numerous non-physical artifacts. [55 61] This effect is particularly pronounced for electrostatic interactions, a very long-range potential of the form $1/r$. As mentioned before, this function decays so slowly that $\sum_{i=1}^{\infty} 1/i = \infty$. Two monovalent ions must be separated by 332 Å before their interaction energy drops to 1 kcal/mol. Such a cutoff would require a unit cell at least 664 Å on each edge, containing $10^7$ water molecules.

Given the need to improve the behavior of the non-bonded potentials near the cutoff distance, two popular modifications to the simple cutoff approach were introduced: a smooth switching function and a shifting function. The switching function approach applies a smooth function at a given distance that satisfies the following criteria: a) the potential and its gradient are continuous everywhere, b) the short-ranged form of the potential is unchanged, and c) the potential approaches 0 at the cutoff. Eq. 2 is an example of a very simple switching function in which the original potential is multiplied by 1 when the interparticle distance is less than the cutoff and 0 otherwise. Of course, this switching function does not obey either the a) or c) conditions listed above. An example of a smooth switching function is shown in Fig. 2-4B. [62]

The second family of methods commonly employed are so-called shifting functions, since the potential is modified by 'shifting' the potential up such that the value of the potential becomes zero at the cutoff distance. [54 62] Simply shifting the potential, though, is not enough for MD simulations, since the force will remain unchanged and still faces a discontinuity at the cutoff. Therefore, the shifting function often contains a
force-shifting component, as shown in Eq. 2. [54] The effect of the shifting function is shown in Fig. 2-4C.

$$U_s(\vec{r}_{i,j}) = \begin{cases} U(\vec{r}_{i,j}) - U(\vec{r}_{cut}) - \left.\dfrac{dU(\vec{r}_{i,j})}{d\vec{r}_{i,j}}\right|_{\vec{r}_{i,j} = \vec{r}_{cut}} (\vec{r}_{i,j} - \vec{r}_{cut}) & \vec{r}_{i,j} < \vec{r}_{cut} \\ 0 & \vec{r}_{i,j} \ge \vec{r}_{cut} \end{cases} \tag{2}$$

Figure 2-4. Effects of various 16 Å cutoff schemes on the electrostatic interaction of two monovalent ions with opposite charges. A) shows the effect of imposing a hard cutoff. B) shows a typical switching function starting at 8 Å. C) shows a typical shifting function for the electrostatic potential. The energies as a function of distance are shown in the top 3 plots and the forces as a function of distance are the bottom 3 plots.
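
The difference between a hard cutoff and a force-shifted potential can be checked numerically. The sketch below applies both schemes to the Coulomb interaction of two oppositely charged monovalent ions with a 16 Å cutoff, mirroring the setup of Fig. 2-4. The function names and the conversion constant 332.06 kcal Å/(mol e^2) are assumptions for illustration, and the shifted form is only one of several in common use.

```python
COULOMB = 332.06  # kcal*Angstrom/(mol*e^2); assumed unit conversion factor


def coulomb(r, qi=1.0, qj=-1.0):
    """Bare Coulomb energy (kcal/mol) at separation r (Angstrom)."""
    return COULOMB * qi * qj / r


def coulomb_deriv(r, qi=1.0, qj=-1.0):
    """dU/dr of the bare Coulomb energy."""
    return -COULOMB * qi * qj / (r * r)


def hard_cutoff(r, r_cut, qi=1.0, qj=-1.0):
    """Simple truncation: unchanged inside the cutoff, zero outside."""
    return coulomb(r, qi, qj) if r < r_cut else 0.0


def force_shifted(r, r_cut, qi=1.0, qj=-1.0):
    """Shift the energy and its slope so both vanish at the cutoff."""
    if r >= r_cut:
        return 0.0
    return (coulomb(r, qi, qj) - coulomb(r_cut, qi, qj)
            - coulomb_deriv(r_cut, qi, qj) * (r - r_cut))


def force_shifted_deriv(r, r_cut, qi=1.0, qj=-1.0):
    """dU_s/dr of the force-shifted energy; the force is the negative of this."""
    if r >= r_cut:
        return 0.0
    return coulomb_deriv(r, qi, qj) - coulomb_deriv(r_cut, qi, qj)


if __name__ == "__main__":
    r_cut = 16.0
    for r in (4.0, 8.0, 12.0, 15.999, 16.0):
        print(r, hard_cutoff(r, r_cut), force_shifted(r, r_cut))
```

Just inside the cutoff the truncated energy is still roughly -20.8 kcal/mol and then jumps to zero, whereas the force-shifted energy and its derivative both go smoothly to zero, at the price of altering the short-range form of the potential.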

Figure 2-5. Periodic cells added in a spherical shape radially from the central unit cell. The progression from darker to lighter cells shows the order in which interactions are accumulated in the sum of the electrostatic interactions (with the darker cells being added before the lighter ones). The example, adapted from Allen and Tildesley, is shown in two dimensions, but can be trivially extended to three dimensions. [54]

2.1.2.3 Ewald Summation

Ideally, simulations in the condensed phases would be performed without truncating electrostatic interactions at all. The full electrostatic interaction for a net neutral unit cell takes the functional form $\sum_{i=1}^{N} (-1)^i / i$, since there are an equal number of charges of both signs. This sum is conditionally convergent, meaning that, while it converges to a finite value, that value depends on the order in which the terms are summed. [54] A natural choice for ordering the summation of the infinite number of electrostatic interactions with particle $i$ is by summing all of the electrostatic interactions with each particle $j$ in every unit cell extending radially from the unit cell containing $i$. This approach is shown diagrammatically in Fig. 2-5. [54]
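
The order dependence of a conditionally convergent sum is easy to demonstrate numerically. The sketch below sums the alternating series with terms (-1)^i / i in two different orders: the natural order, and a rearrangement that takes two negative (odd i) terms for every positive (even i) term. The function names and the particular rearrangement are illustrative choices, not part of the Ewald method itself.

```python
import math


def natural_order_sum(n_terms):
    """Sum (-1)^i / i for i = 1..n_terms in the natural order."""
    return sum((-1) ** i / i for i in range(1, n_terms + 1))


def rearranged_sum(n_blocks):
    """Same terms, reordered: two negative (odd i) terms, then one positive (even i) term."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(n_blocks):
        total -= 1.0 / odd
        odd += 2
        total -= 1.0 / odd
        odd += 2
        total += 1.0 / even
        even += 2
    return total


if __name__ == "__main__":
    print(natural_order_sum(200000), -math.log(2.0))
    print(rearranged_sum(100000), -1.5 * math.log(2.0))
```

The natural order approaches -ln 2 while the rearrangement approaches -(3/2) ln 2, even though both use exactly the same terms. This is why the summation order for the periodic electrostatic sum must be fixed by a physically motivated convention such as the radial ordering of Fig. 2-5.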

In 1921, Ewald devised a method whereby the electrostatic interactions between an ion and all of its periodic images in a crystal lattice could be computed according to the technique presented in Fig. 2-5. [63] The same approach can be used for simulations in the condensed phase when PBC are used. The technique, called the Ewald sum, utilizes a trick to cause the electrostatic interactions between particles to decay arbitrarily rapidly, allowing the interactions to be truncated at a distance where the interactions themselves are negligible. To do this, a Gaussian charge distribution is centered at each point charge with the opposite sign of the point charge, as shown in Fig. 2-6. Given a width of the Gaussian distribution, the functional form of the neutralizing charge distribution is shown in Eq. 2

$$\rho_i(r) = \frac{q_i \kappa^3}{\pi^{3/2}}\exp\left(-\kappa^2 r^2\right) \tag{2}$$

where $\rho_i$ is the charge distribution due to particle $i$ and its neutralizing Gaussian and $\kappa$ is the tunable parameter controlling how diffuse the Gaussian is. The electrostatic interaction of two charged particles $i$ and $j$ with their neutralizing charge distribution is

$$E_{i,j} = \frac{q_i q_j\,\mathrm{erfc}(\kappa r_{i,j})}{r_{i,j}}$$

where erfc is the complementary error function. The complementary error function decays rapidly, and more rapidly for narrower neutralizing distributions. The narrower the neutralizing distributions are, the smaller the cutoff that may be used without compromising accuracy. In fact, at the limit where the Gaussian width is zero, the neutralizing charge distribution becomes a delta function that exactly cancels the original point charge, allowing a cutoff of zero!

However, while adding the neutralizing charge distributions has allowed us to compute the direct electrostatic energies between particles rapidly by imposing a relatively short cutoff, we have changed our system. The effect of the neutralizing charge distributions must be canceled by inverting all of the neutralizing charge distributions and
adding their interaction back to the original sum. By adding these so-called canceling charge distributions back to the electrostatic sum, the original interaction of just the point charges is recovered. The interactions between these neutralizing charge distributions represent a number of convolution integrals which may be computed very rapidly by taking the Fourier transform of the distributions and summing the contributions in reciprocal space. The result is then reverse-Fourier transformed to obtain the electrostatic potential at each of the particles. [54]

Figure 2-6. A one-dimensional example of particles with a given charge (red) with a neutralizing Gaussian charge distribution (blue) shown.

Particle-Mesh Ewald. A weakness of Ewald's summation is that the Fourier transform is a slow operation, on the order of $O(N^2)$ where $N$ is the number of particles. To address this shortcoming, the charge density due to the canceling charge distributions can be discretized on a 3-dimensional mesh with a given grid spacing. This allows us to use the fast Fourier transform algorithm (FFT) to perform both the Fourier transform
and reverse Fourier transform to calculate the electrostatic potential at each of the mesh points. Unlike the standard Fourier transform, the FFT scales as $O(N \log(N))$, resulting in a substantial increase in computational efficiency. The potential at each of the particles and its gradient can then be interpolated from the adjacent grid points on the mesh using cardinal B-splines. [64] This approach is termed Particle-Mesh Ewald (PME) due to the way in which the particles interact with the mesh to determine the long-range electrostatic interactions.

2.1.2.4 Other Approaches

Ewald-based methods employing the discrete fast Fourier transform have been very popular over the past two decades. As the rapid increase in computational power allowed simulations to run increasingly longer, the deficiency of typical cutoff methods for simulating highly charged systems such as DNA or RNA became readily apparent. [65 66] Properly accounting for long-range electrostatic effects using PME resulted in stable simulations of not only proteins, but also highly charged systems like DNA and RNA. [59] Furthermore, by employing the FFT, PME allowed calculations to be done more rapidly by reducing the computational cost of the non-bonded interactions.

However, there are two principal drawbacks of Ewald-based methods. First, the use of periodic boundaries may introduce artifacts into the system caused by the correlated motions of each periodic image. [67] For instance, if periodic boundary conditions were imposed on a gas of monovalent ions such that each cell had a single particle, the particle distribution would necessarily be uniform since periodic symmetry reduces the dimensionality of the system to a single degree of freedom. While this effect does not seem to induce measurable artifacts for most simulations, [67] a more serious limitation of Ewald-based methods has to do with the changing architecture of modern computers.

For many years, the efficiency of the central processing unit (CPU), typically measured in the speed with which it executes each operation (i.e., clock speed), improved as engineers were able to shrink the size of the transistors and place increasingly more
of these transistors onto each CPU die. Recently, however, the power requirements to increase the clock speed caused chips to melt since the heat generated could not be dissipated quickly enough. This drove chip manufacturers to increase the computational power of these CPUs by adding additional cores. To take advantage of this form of improved CPU efficiency, computational algorithms must be designed to run in parallel. It turns out that due to the non-local nature of the FFT and the algorithmic details of its efficient implementation, calculations employing such methods are limited in their ability to take advantage of the increasing parallelism of modern processors.

To alleviate the limited scalability of standard PME, Cerutti and Case devised an approach, termed Multi-level Ewald, to divide the system into smaller charge grids so that the reciprocal-space sum can be performed in parallel in multiple, independent 'chunks.' [68] These independent grids can then be 'stitched' together using a much coarser global grid that can be computed far more rapidly.

To combat both shortcomings mentioned for Ewald-based methods, many researchers have investigated alternatives to the PME treatment of long-range electrostatic interactions in biomolecular simulations. One such method, the isotropic periodic sum (IPS), assumes an isotropic distribution of particles by replicating the surrounding region around each particle within a cutoff infinitely in all directions. [69] While this method necessitates using a larger cutoff to more fully characterize each particle's surroundings, it avoids needing a charge grid populated from every atom in the system, thereby reducing the communication overhead. As a result, IPS can be implemented in a way that is more scalable on modern hardware than PME.

The generalized reaction field (GRF) method employs yet another approach to treating long-range electrostatics based on the PB equation. A sphere is constructed around each particle whose radius is equal to the non-bonded cutoff distance, inside which all interactions are computed directly. The surroundings are modeled as a bulk dielectric environment, and the reaction field potential is calculated on the sphere
analytically according to the linearized PB equation. The force exerted by this electric field can be calculated on the atom at the center of the constructed sphere. [70] This approach has the same cost as typical cutoff methods, but models the region outside the cutoff as though it were bulk solvent. Such treatment necessitates the use of a larger cutoff value than that required by Ewald methods.

While the list of methods here is not comprehensive, the general aim of PME replacements is either to lessen the likelihood of observing periodicity artifacts in simulations or to present an algorithm that is more amenable to parallelization, or both. Despite the challenges in computational scaling and efficiency associated with parallelizing the reciprocal-space Ewald sum, Ewald-based methods are still widely used today, even on highly tuned, specialized hardware designed specifically to accelerate MD simulations. [71]

2.2 Sampling

Sampling is the principal problem in most condensed phase simulations, especially those involving biomolecules. For such large systems, the size of phase space (a 6N-dimensional hyperspace composed of positions and momenta for N particles in all 3 spatial dimensions) is unconscionably large. Although any chemical system can be characterized completely if the density of states is known at an arbitrary energy ($\Omega(E)$), this number is so vast that it cannot be directly computed. Luckily, the partition functions of most thermodynamic ensembles, in particular that for the canonical ensemble (Eq. 1), are almost entirely comprised of low-energy structures due to the exponential weighting in the Boltzmann factor.

Despite this fortuitous simplification, no simulation is capable of truly exhaustive sampling for typical biomolecular simulations, and it is unlikely that exhaustive sampling will ever be attainable. The most naive approach to sampling, running pure molecular dynamics or Monte Carlo simulations, is frequently insufficient to characterize rare events that happen on
the millisecond or even second time scale. Even with highly specialized (and expensive) hardware, pure MD simulations are currently confined to the millisecond time scale. [72] In this section, I will discuss three approaches to enhance sampling compared to traditional MD simulations: umbrella sampling, steered molecular dynamics (SMD), and expanded ensemble techniques (and the special case of replica exchange).

2.2.1 Umbrella Sampling

Umbrella sampling is a biased sampling technique that acts on a specific reaction coordinate. In complex systems, there are often free energy barriers separating different states that are far larger than the average available thermal energy, $k_B T$. An example is shown as a black line in Fig. 2-7 in which the 6N-dimensional free energy surface (reduced to 3N-dimensional when the momentum integral is separated from the canonical partition function) is projected onto a 1-dimensional reaction coordinate. This reduced-dimension free energy surface, called a potential of mean force (PMF), shows a free energy barrier of roughly $6 k_B T$ in Fig. 2-7. In standard dynamics simulations, it would take a very long unbiased simulation to cross that barrier.

The trick involved in umbrella sampling is to modify the underlying potential with a harmonic biasing potential to encourage the simulation to sample higher energy structures more often. Fig. 2-7 shows how a quadratic umbrella potential changes the shape of the underlying PMF such that higher-energy structures are sampled more frequently. Clearly, the two biasing potentials shown in Fig. 2-7 tend to favor sampling near the two transition states separating different minima, since that portion of the reaction coordinate is lowest in energy on the biased surface.

The resulting ensemble of the modified potential, shown in Eq. 2, contains more snapshots around the areas that are traditionally sampled poorly by MD simulations of finite duration. However, all properties calculated based on these statistics refer to a fictitious system, and will not translate into experimental observables. In other words, the statistics collected from an umbrella sampling simulation correspond to the
$H_{bias}$ Hamiltonian in Eq. 2, whereas the physical system actually obeys the $H_{orig}$ Hamiltonian. [12]

$$H_{bias}(\vec{x}) = H_{orig}(\vec{x}) + \frac{1}{2}k_{umb}\left(f(\vec{x}) - s\right)^2 \tag{2}$$

$H_{orig}$ is the original, unbiased Hamiltonian in Eq. 2, $k_{umb}$ is the force constant on the harmonic umbrella potential, $f(\vec{x})$ is the reaction coordinate, and $s$ is the center of the umbrella potential along that reaction coordinate.

Figure 2-7. An example 1-dimensional PMF (shown in black). Two biasing umbrella potentials are shown alongside the resulting, biased PMF. All PMF curves have been translated so that the 'minimum' free energy is 0. Because only energy differences are significant, vertical translations of the PMF have no effect on calculated properties.

Since the exact shape of the biasing potential is known, and the sampling provides information about the shape of the total biased potential, we can use that information to deduce the underlying shape of the original Hamiltonian along the chunk of the PMF
that our simulation has effectively characterized through sampling. However, because the umbrella potential is monotonically increasing on either side of the umbrella center, configurations far away from that center will be sampled very poorly, leading to poor convergence in those regions. To alleviate this issue, a series of umbrella sampling simulations are performed in intervals along the reaction coordinate, called windows, which are used to construct 'pieces' of the PMF near the center of the respective umbrellas. These pieces are then stitched together to approximate the total, unbiased PMF.

The free energy of the biased potential along the PMF is related to the probability density function at that point according to

$$\rho_{bias}(\vec{x}_0) = \int \exp\left(-\beta H_{bias}(\vec{x})\right)\delta(\vec{x} - \vec{x}_0)\,d\vec{x} \propto \exp\left(-\beta A_{bias}(\vec{x}_0)\right)$$

where $\rho$ is the probability distribution function, $\delta$ is the Dirac delta function that serves to extract only those ensemble members that correspond to the specific point $\vec{x}_0$ on the PMF, and $A$ is the free energy along the PMF at that value. The unbiased probability distribution, which is directly related to the unbiased free energy up to an arbitrary constant, can be estimated according to

$$\rho_{unbias}(\vec{x}) = \exp\left(-\beta\left(A_{bias} - A_{unbias}\right)\right)\exp\left(\beta\,\frac{1}{2}k_{umb}\left(f(\vec{x}) - s\right)^2\right)\rho_{bias}(\vec{x})$$

where $A$ is the Helmholtz free energy along the PMF. The unbiased probability distribution function is estimated for each window, and must be recombined to calculate the full PMF. [11]

While the weighted histogram analysis method has been arguably the most popular method for determining the additive constants necessary at each window to construct the 'best' complete PMF, [73] more recent methods have been shown to be better estimators of the unbiased PMF. Such examples include the multistate Bennett acceptance ratio (MBAR) [74] and variational free energy profile, [75] which have demonstrated
PAGE 68

superior performance in computing not only the PMF more efficiently with less data,[75] but also reasonable estimates of the statistical errors.[74]

2.2.2 Steered Molecular Dynamics

The idea of steered molecular dynamics (SMD) is very similar to that of umbrella sampling. A harmonic biasing potential is added to the underlying potential along a reaction coordinate to drive the sampling along that coordinate. Unlike umbrella sampling, in which the harmonic potentials are fixed at a given position along the reaction coordinate, the potential is moved along the reaction coordinate at some speed in SMD simulations.

While SMD appears similar to umbrella sampling, the fact that the umbrella potential moves marks a significant fundamental difference between the two techniques. Umbrella sampling performs equilibrium sampling with the biased Hamiltonian, whereas the finite speed of the moving umbrella in SMD simulations is inherently non-equilibrium.[11] The non-equilibrium work done by the moving umbrella is tabulated, and effectively represents an upper-bound estimate on the free energy according to Eq. 2.[11]

\[ \langle W_{1,2}(\vec{x}) \rangle \geq \Delta A_{1,2} \tag{2} \]

where $W$ is the work along the path given by $\vec{x}$ between states 1 and 2 and $\Delta A$ is the free energy change between those two states. Clearly, the utility of the work profile calculated using SMD simulations is severely limited since Eq. 2 is simply an inequality.

The link between equilibrium free energies and computed work profiles from SMD simulations was supplied by Jarzynski in 1997.[76] The so-called Jarzynski equality, shown in Eq. 2, states that equilibrium free energies can be calculated from a complete ensemble of work profiles along the reaction coordinate between an ensemble of starting points at state $\vec{x}_1$ and driving the center of the umbrella to state $\vec{x}_2$.
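The Jarzynski equality described above relates the free energy change to an exponential average of the work, $\Delta A = -k_B T \ln \langle \exp(-W / k_B T) \rangle$. The following is a minimal sketch of how a finite set of SMD work values might be turned into such an estimate. The function name, the work values, and the unit choice (kcal/mol at 300 K) are illustrative assumptions, not taken from any SMD package.

```python
import math

# Illustrative constant: Boltzmann constant in kcal/(mol K) times 300 K.
KT_300K = 0.001987204 * 300.0


def jarzynski_free_energy(work_values, kT):
    """Finite-sample Jarzynski estimate: -kT * ln( mean( exp(-W/kT) ) )."""
    if not work_values:
        raise ValueError("need at least one work value")
    # Factor out the smallest work value so the exponentials cannot overflow.
    w_min = min(work_values)
    total = sum(math.exp(-(w - w_min) / kT) for w in work_values)
    return w_min - kT * math.log(total / len(work_values))


# Hypothetical total work (kcal/mol) from five pulling simulations.
works = [5.2, 6.1, 4.8, 7.5, 5.0]
estimate = jarzynski_free_energy(works, KT_300K)
mean_work = sum(works) / len(works)
print(f"Jarzynski estimate: {estimate:.3f}  mean work: {mean_work:.3f}")
```

Because the exponential average is dominated by the lowest work values, the estimate always lies between the smallest observed work and the mean work, which is the inequality stated earlier in this section.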

PAGE 69

\[ \exp\left( -\beta \Delta A_{1,2} \right) = \left\langle \exp\left( -\beta W_{1,2}(\vec{x}_0) \right) \right\rangle \tag{2} \]

A caveat to Eq. 2 is that an infinite number of work profiles between states 1 and 2 are necessary for the equality to hold. Because simulating an infinite number of trajectories is impossible, we must be content to estimate the total free energy using a finite number of simulations. Fortunately, the exponential average converges very rapidly with a small number of 'good' work profiles (i.e., low-energy work profiles that follow the true PMF closely), since high-energy profiles contribute little to the average.

Optimizing the computational performance of SMD simulations is a balancing act. Pull too quickly and all computed work profiles will most likely be much higher than the true PMF, giving you a poor estimate of the actual free energy. Pull too slowly and the simulations will take too long to traverse the full reaction coordinate. The optimal pulling speed will generate a wide distribution of work profiles that gives a good estimate of the total PMF.

2.2.3 Expanded Ensemble

A common class of techniques used to enhance sampling compared to standard molecular dynamics is the class of so-called expanded ensemble techniques. The canonical ensemble, for instance, is limited by the thermodynamic constraints imposed by requiring all members of the ensemble to have the same number of particles, volume, and temperature (NVT). These variables are referred to as state parameters, since they define each state present in the ensemble.

The way expanded ensemble techniques enhance sampling is to generate a larger ensemble in which many smaller thermodynamic ensembles are brought into equilibrium. True to its name, enhanced sampling is obtained by sampling from an expanded ensemble of numerous standard thermodynamic ensembles. The first example examined in the literature involved expanding the canonical ensemble to multiple temperatures.[77] This new ensemble is a combination of multiple canonical ensembles each

PAGE 70

at a different temperature. By allowing a simulation to migrate through temperature-space as it is sampling new conformations, expanded ensemble simulations can take advantage of the flatter free energy surfaces present at higher temperatures to enhance conformational sampling while still collecting statistics at the target temperature of interest. The total partition function for this new, expanded ensemble is shown in Eq. 2.[77]

\[ Q = \sum_{m=0}^{M} Q_m \exp(\eta_m) \tag{2} \]

where $Q_m$ is the canonical partition function at a given temperature $m$ and $\eta_m$ is a carefully chosen set of tuning parameters designed to bias the simulation toward spending more time near the temperatures of interest.[77]

Either periodically or at random intervals throughout the MD or MC simulation, a Monte Carlo attempt to change the temperature of the 'current' conformation is performed. Attempted moves between temperatures $k$ and $m$ are accepted according to the Monte Carlo criterion shown in Eq. 2

\[ P_{k \to m} = \min\left\{ 1, \exp\left[ (\beta_k - \beta_m) H(\vec{x}) + \eta_m - \eta_k \right] \right\} \tag{2} \]

where $P_{k \to m}$ is the probability of changing from temperature $k$ to temperature $m$ and $\eta$ is the constant tuned to control the residence time of the simulation at each temperature (see Eq. 2). If statistics are desired for a specific temperature, an ensemble can be generated from all snapshots with the target temperature.

By allowing the simulation to visit higher temperatures, new pathways around and over barriers are opened up by traversing temperature-space and configuration (conformation) space simultaneously. The available kinetic energy at higher temperatures makes it more likely that high barriers will be crossed than at lower temperatures, while the samples taken at lower temperatures provide the resolution necessary to characterize the thermodynamic properties

PAGE 71

at biologically relevant temperatures. However, since the simulation is allowed to visit multiple temperatures, a significant portion of the simulation is 'wasted' sampling higher temperatures that contribute little to the low-temperature ensemble.

The amount of time that the simulation is permitted to spend at each temperature must be carefully balanced to enhance sampling with enough simulation done at higher temperatures and maintain a desired level of resolution of the low temperature ensemble. This distribution is controlled by the parameter $\eta$ at each temperature (Eq. 2), whose optimal values are determined by running short simulations at each temperature to estimate the shape of phase space.[77]

2.2.4 Replica Exchange Molecular Dynamics

Replica exchange molecular dynamics (REMD) simulations are a special case of expanded ensemble simulations that are designed to be scalable to modern, parallel computers. In these simulations, a finite number of independent simulations, or replicas, are run, each with a different state parameter (e.g., different temperatures). These replicas periodically attempt to exchange information with each other (either configurations or state parameters) in such a way that maintains the validity of the 'subensemble' of each replica. A diagrammatic representation of REMD simulations is shown in Fig. 2-8.[78]

To ensure that each replica is in a state of equilibrium with all other replicas in the REMD simulation, a reversible Markov chain of moves along the state parameter dimension is necessary (see Fig. 2-8). Trial moves are typically done between a single pair of replicas to simplify the expression for calculating the exchange probability. As we saw in Section 1.1.2.1, applying the Metropolis criterion to a randomly proposed MC move satisfies the requirement of detailed balance. Therefore, Metropolis MC is used to enable replicas to sample along the state space coordinate in REMD calculations.

There are several different choices one can make for the state space parameter when setting up a REMD calculation. Common choices include temperature,[78]

PAGE 72

Figure 2-8. Diagrammatic sketch of REMD simulations. Replicas are represented as thick arrows and exchange attempts are shown between adjacent replicas connected by thin black arrows. The question mark indicates that a MC move is accepted with the probability calculated according to the Metropolis criterion. Successful and unsuccessful exchange attempts are shown with a green or red question mark, respectively.

umbrella potentials (for umbrella sampling simulations),[79 80] Hamiltonians,[81 85] and solution pH,[86 89] among others.[90 91] These methods are discussed in detail in later chapters.

2.3 Free Energy Calculations

Calculating the 'free energy' is the Holy Grail of computational chemistry, as it furnishes the ultimate comparison with experimental observables. As a result, significant effort has been spent searching for computationally efficient ways to accurately calculate

PAGE 73

free energy changes of various processes, including conformational rearrangement,[92 93] protein folding,[94 96] solvation,[97 98] protein-ligand binding,[99 100] and protein-protein binding,[101 102] among others.

Because the free energy is a state function, the free energy differences between two distinct states are independent of the path taken from the starting state to the other. In fact, this principle holds even if that pathway is completely fictitious! This gives simulation a significant advantage in computing free energies, since the easiest path along which to compute this value may be used even if that pathway is chemically nonsensical.

Despite this advantage, however, free energies remain exceedingly difficult to compute directly.[103] In this section, I will briefly outline several methods commonly used to compute free energy differences between two states: Thermodynamic Integration, Free Energy Perturbation, and end-state free energy methods.

2.3.1 Thermodynamic Integration

Thermodynamic Integration (TI) is a so-called alchemical free energy calculation method, since it contains an interpolating parameter that 'morphs' one system into another.[11 12] Assuming the two states 0 and 1 obey the potential energy functions, or Hamiltonians, $H_0$ and $H_1$, respectively, the Hamiltonian of the perturbed system is shown in Eq. 2

\[ H(q, p) = f(\lambda) H_0(q, p) + g(\lambda) H_1(q, p) \tag{2} \]

where $\lambda$ is a switching parameter with the continuous domain between 0 and 1 and the functions $f(\lambda)$ and $g(\lambda)$ obey the relationships

\[ f(0) = 1, \quad f(1) = 0 \]
\[ g(0) = 0, \quad g(1) = 1 \]

PAGE 74

such that the Hamiltonian at either end point is a pure function of one of the two states. A linear switching function, with $f(\lambda) = 1 - \lambda$ and $g(\lambda) = \lambda$, is commonly used due to its simplicity.

Because the scaling parameter $\lambda$ is continuous and can be made to vary infinitely slowly, sampling done at the intermediate states (i.e., $0 < \lambda < 1$) is always at equilibrium. The total free energy, then, can be calculated via the integral shown in Eq. 2.[12]

\[ \Delta G_{0 \to 1} = \int_0^1 \left\langle \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda} d\lambda \tag{2} \]

where the average is taken over the ensemble generated at each $\lambda$. Because doing true TI would require an infinite number of simulations for $\lambda$ equal to all real numbers between 0 and 1, Eq. 2 is approximated using a Riemann sum, shown below.

\[ \Delta G_{0 \to 1} \approx \sum_{\lambda = 0}^{1} \left\langle \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda} \Delta\lambda \]

Therefore, TI calculations require the selection of a set of windows (i.e., $\lambda$ selections between 0 and 1) at which an ensemble must be generated to evaluate the gradient of the coupled Hamiltonian with respect to the coupling parameter. A sufficient number of $\lambda$ values must be chosen to obtain an accurate and converged free energy; the number of required windows varies from system to system.

For the simple linear switching function described previously, the gradients required by Eq. 2 can be computed analytically based on the functional form of the underlying Hamiltonians. The only terms that contribute to $\partial G / \partial \lambda$ are those terms that include interactions with one of the atoms that differ in some way between the two end states. Therefore, as one would expect, the TI calculations converge more rapidly when the perturbation between states 0 and 1 is small. Indeed, TI calculations have been

PAGE 75

successfully employed to calculate many free-energy based properties, such as protein pKas[104] and solvation free energies.[105]

Traditional TI calculations suffer from a severe limitation when applied to typical MM force fields, however. Using the functional form of the Amber force field (Eq. 1) as an example, there is a significant problem with converging TI calculations at windows where $\lambda$ approaches either 0 or 1 when atoms are 'appearing' or 'disappearing' (i.e., when those atoms exist only in one end point). The problem arises in the Lennard-Jones term, which has a very strong repulsive force at close intermolecular distances with a singularity at the origin. This singularity exists as long as the Hamiltonian containing this atom has non-zero weight according to the chosen $\lambda$ value, which is true for all values except 0 or 1.

Therefore, even when $\lambda$ is arbitrarily close to either 0 or 1, there is a region of space around the center of all disappearing atoms in which no atom can enter due to the repulsive $r^{-12}$ term of the almost-vanished atom. This phenomenon, referred to as a hard core, hurts convergence of TI calculations by preventing configurations in which molecules enter the space partially occupied by a disappearing atom.[106] Figure 2-9 demonstrates this effect by plotting the Lennard-Jones potential between two carbon atoms when one of them vanishes at $\lambda = 1$.

To address this limitation, an additional $\lambda$-dependent term is added to disappearing atoms to soften the core near the end points and eliminate the singularity that prevents particles from entering the space occupied by a partially-vanished atom. This approach, described below, is referred to as soft-core thermodynamic integration.

Soft-core TI. To avoid the singularity in the Lennard-Jones potential term of a vanishing atom in TI calculations, the functional form of this potential is adjusted by Eq. 2. A good choice for the functional form of the soft-core potential should satisfy several conditions. First, the potential should be either 0 for a vanished atom or the original Lennard-Jones potential for an atom that is 'fully' present. Second, the potential must not diverge between a partially vanished atom and an unperturbed atom when

PAGE 76

Figure 2-9. Hard core of disappearing atom caused by the Lennard-Jones terms. The $\lambda = 1$ state is the one in which a carbon atom has vanished.

their separation approaches zero. Finally, the force must remain conservative (i.e., the energy difference between any two points must be independent of the path taken between them). Eq. 2 satisfies all of these requirements, making it a good candidate to replace the standard Lennard-Jones potential in vanishing atoms.

\[ U^{LJ}_{i,j}(r_{i,j}, \lambda) = \lambda^n \, 4 \varepsilon_{i,j} \left( \frac{1}{\left[ \alpha_{LJ} (1 - \lambda)^2 + \left( r_{i,j} / \sigma_{i,j} \right)^6 \right]^2} - \frac{1}{\alpha_{LJ} (1 - \lambda)^2 + \left( r_{i,j} / \sigma_{i,j} \right)^6} \right) \]

\[ U^{LJ}_{i,j}(r_{i,j}, \lambda) = (1 - \lambda)^n \, 4 \varepsilon_{i,j} \left( \frac{1}{\left[ \alpha_{LJ} \lambda^2 + \left( r_{i,j} / \sigma_{i,j} \right)^6 \right]^2} - \frac{1}{\alpha_{LJ} \lambda^2 + \left( r_{i,j} / \sigma_{i,j} \right)^6} \right) \tag{2} \]
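To make the behavior of Eq. 2 concrete, the sketch below evaluates the soft-core form next to the standard Lennard-Jones potential. It is only an illustration of the functional form: the function names, the default exponent `n=1`, and the parameter values (`alpha=0.5` and the carbon-like `epsilon` and `sigma`) are assumptions chosen for the example, not values taken from this work.

```python
def lennard_jones(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones potential."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def soft_core_lj(r, lam, epsilon, sigma, alpha, n=1, vanishes_at=1):
    """Soft-core Lennard-Jones potential for an atom vanishing at lambda = 0 or 1."""
    if vanishes_at == 1:
        scale = (1.0 - lam) ** n
        softening = alpha * lam ** 2
    elif vanishes_at == 0:
        scale = lam ** n
        softening = alpha * (1.0 - lam) ** 2
    else:
        raise ValueError("vanishes_at must be 0 or 1")
    # The softening term keeps the denominator non-zero at r = 0
    # whenever the atom is partially vanished.
    denom = softening + (r / sigma) ** 6
    return scale * 4.0 * epsilon * (1.0 / denom ** 2 - 1.0 / denom)


# Hypothetical parameters (kcal/mol and Angstrom).
EPS, SIG, ALPHA = 0.1, 3.4, 0.5
for lam in (0.0, 0.5, 0.9, 1.0):
    # r = 1.0 is well inside the hard core of the unperturbed potential.
    print(lam, soft_core_lj(1.0, lam, EPS, SIG, ALPHA))
```

With the atom fully present the expression reduces to the ordinary Lennard-Jones potential, with the atom fully vanished it is zero, and for intermediate values it stays finite even at zero separation.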

PAGE 77

Figure 2-10. Functional form of soft-core Lennard-Jones interactions with different values of $\alpha_{LJ}$ from Eq. 2. The $\lambda = 1$ state is the one in which an atom has vanished. See Fig. 2-9 to see how soft cores enable sampling close to the center of the vanishing atom when $\lambda$ is close to 1.

The top equation of Eq. 2 corresponds to the functional form when one of the atoms vanishes when $\lambda = 0$ and the bottom equation corresponds to the case where one of the atoms vanishes when $\lambda = 1$. The denominators in Eq. 2 do not contain a singularity when $r_{i,j} = 0$ when one of the atoms has vanished. The parameter $\alpha_{LJ}$ controls how 'soft' the core of the vanishing atom is, as shown in Fig. 2-10.[107] TI simulations using soft-core potentials for the Lennard-Jones terms of vanishing atoms show significantly better convergence of free energies.[105 107]

2.3.2 Free Energy Perturbation

An alternative approach to calculate the free energy difference between two states is the free energy perturbation (FEP) method proposed by Zwanzig.[108] The free energy between two states A and B can be calculated according to Eq. 2

PAGE 78

\[ \Delta G_{A \to B} = -k_B T \ln \left\langle \exp\left( -\beta (E_B - E_A) \right) \right\rangle_A \tag{2} \]

where the averages are taken over the ensemble generated in state A and $E_B$ are the energies of the structures in ensemble A evaluated with the Hamiltonian governing the behavior of ensemble B. This is called forward sampling, since the ensemble we used to estimate the free energy came from the original state.[12] The reverse, or backward sampling, represents the reverse process (i.e., by swapping the indices A and B in Eq. 2). Because free energy is a state function, the forward and backward free energies should sum exactly to 0.

The forward and reverse free energies rarely cancel completely for complex transformations, however, indicating a shortcoming in the naive FEP approach. If the ensembles generated by the two states are significantly different, the forward and reverse free energies will be systematically different. For instance, if we are simulating the free energy change of transforming benzene into phenol to calculate their difference in solvation free energies, the solvent arrangement around the two systems will be significantly different due to the added bulk of the hydroxyl group in phenol as well as the difference in the dipole moment caused by that hydroxyl.

To address this shortcoming, two end-states are often interpolated using a coupling parameter similar in spirit to TI. By perturbing the system slowly from state A to B, the differences between adjacent states are reduced, leading to similar ensembles that generate more consistent forward and reverse free energies when used in Eq. 2.[12]

2.3.3 End-state Calculations

The final family of free energy methods I will discuss here are so-called end-state calculations, since they involve calculating the free energy change $\Delta G_{A \to B}$ from simulations performed only on the two physical end states A and B. These methods are

PAGE 79

often used to estimate binding free energies of a non-covalently bound protein-ligand or protein-protein complex.[99 101 102 109 110] I will discuss two methods within this family that are routinely used in binding free energy calculations: the so-called Molecular Mechanics Poisson-Boltzmann surface area (MM-PBSA) method[111 112] and the linear interaction energy (LIE) method.[113 114]

2.3.3.1 MM-PBSA

MM-PBSA, and its closely-related counterparts MM-GBSA (GB implicit solvent), MM-3DRISM (3D-RISM implicit solvent), and QM/MM-GBSA, are commonly used to calculate binding free energies of noncovalently bound complexes. These methods compute binding free energies via the thermodynamic cycle shown in Fig. 2-11, where the solvation free energy terms are computed using an implicit solvent model (e.g., Poisson-Boltzmann, Generalized Born, or 3D-RISM). The total free energy computed along the cycle, shown in Eq. 2, is taken from ensemble averages over a simulated trajectory. The ensembles are typically generated by running either a MD or MC simulation for each of the three states: the bound complex, unbound receptor, and unbound ligand.[110]

\[ \Delta G_{binding} = \langle H_{solv,bound} \rangle + \langle H_{binding,gas} \rangle - \langle H_{solv,unbound} \rangle \tag{2} \]

The averages in Eq. 2 are taken from the ensembles of each system. The principal computational cost of an MM-PBSA calculation is due to the initial simulations required to construct each ensemble. To reduce the cost of computing binding free energies with MM-PBSA, all three ensembles mentioned above can be extracted from a single simulation of the bound complex, a technique referred to as the single trajectory protocol.[110] This approach will always underestimate the binding free energy (predicting overly-stable binding), since the receptor and ligand will always be less stable in the bound conformation than they are when

PAGE 80

Figure 2-11. Thermodynamic cycle for MM/PBSA calculations. Implicit solvent is represented with a blue background, while a white background represents systems in the gas phase.

free in solution. However, when making the same approximation for a family of related receptors and ligands, the systematic errors in each end-state calculation will be similar. Therefore, MM-PBSA methods can be useful for tasks like rank-ordering a handful of proposed inhibitors for a specific enzyme by calculating accurate relative binding free energies.[109]

2.3.3.2 LIE

The linear interaction energy method (LIE) is another end-state method widely used to calculate noncovalent binding free energies of small ligands to proteins. LIE is based on assuming that the binding free energy is a result of the energetic differences of the

PAGE 81

ligand in the two environments (bound in the active site of the protein versus hydrated in solution) and obeys linear response theory.[113]

The electrostatic contribution to LIE can be simplified to the concept of charging the atoms of the ligand inside a cavity that has the same shape as the ligand. Following the ideas of Marcus theory,[115] the reorganization energy of the surrounding solvent can be expressed as[113]

\[ \lambda = \langle V_B - V_A \rangle_A - \Delta G_{A \to B} = \langle V_A - V_B \rangle_B + \Delta G_{A \to B} \tag{2} \]

Solving for $\Delta G_{A \to B}$ in Eq. 2 yields the following expression, with $\Delta V = V_B - V_A$, for the free energy of the change from the ligand bound in environment A to the ligand bound in environment B:

\[ \Delta G_{A \to B} = \frac{1}{2} \left( \langle \Delta V \rangle_A + \langle \Delta V \rangle_B \right) \tag{2} \]

Eq. 2 can be readily applied to the electrostatic contribution of the binding free energy, but the non-polar non-bonded interactions (namely the van der Waals interactions) are not known to obey Marcus theory accurately. As a result, the non-polar interaction energy is scaled in Eq. 2 by a parameter that is adjusted to fit a database of known binding affinities.[113]

\[ \Delta G_{bind} = \frac{1}{2} \left\langle \Delta V^{elec}_{w \to p} \right\rangle + \alpha \left\langle \Delta V^{vdW}_{w \to p} \right\rangle \tag{2} \]

where the 'w' subscript indicates the ligand free in water and 'p' indicates the ligand bound in the protein. The van der Waals interactions are scaled by the parameter $\alpha$, which was adjusted to give good agreement with experimental binding affinities.

LIE calculations require two ensembles to be generated: one of the ligand free in explicit solvent and the other of the ligand bound in the protein. A diagrammatic representation of the LIE calculation is shown in Fig. 2-12.
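As a small worked example of the LIE expression above, the sketch below combines ensemble-averaged ligand-surroundings interaction energies from the two simulations (ligand in water, ligand in protein) into a binding free energy estimate. The function name and every numerical value, including the value passed for the van der Waals scaling parameter, are placeholders chosen for illustration rather than results or fitted parameters from this work.

```python
def lie_binding_free_energy(v_elec_water, v_elec_protein,
                            v_vdw_water, v_vdw_protein, alpha):
    """LIE estimate: 0.5 * <dV_elec>(w->p) + alpha * <dV_vdW>(w->p).

    Every input is an ensemble-averaged ligand-surroundings interaction
    energy; alpha is the empirical van der Waals scaling parameter.
    """
    d_elec = v_elec_protein - v_elec_water
    d_vdw = v_vdw_protein - v_vdw_water
    return 0.5 * d_elec + alpha * d_vdw


# Hypothetical averages in kcal/mol (illustration only).
dg_bind = lie_binding_free_energy(
    v_elec_water=-40.0, v_elec_protein=-46.0,
    v_vdw_water=-10.0, v_vdw_protein=-20.0,
    alpha=0.16,  # placeholder value for the fitted parameter
)
print(f"LIE binding free energy estimate: {dg_bind:.2f} kcal/mol")
```

The electrostatic term is halved because of the linear response assumption, whereas the van der Waals term carries the empirically fitted weight.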

PAGE 82

Figure 2-12. Schematic showing, with white arrows, the interactions necessary to compute the LIE free energy of noncovalent binding for a ligand in a protein.

PAGE 83

CHAPTER 3
CONSTANT PH REPLICA EXCHANGE MOLECULAR DYNAMICS

In this chapter, I will discuss my work with REMD simulations in which the state parameter (the property exchanged between replicas) is the solution pH. This work is reprinted with permission from Swails and Roitberg, J. Chem. Theory Comput. 2012, 8, 4393. Copyright 2012 American Chemical Society.[88]

3.1 Constant pH and pKa Calculations

Solution pH is often critical to the proper functioning of biological catalysts.[116 117] The pH environment of biological systems influences the ionization equilibria present in the system, thereby affecting the protonation state of various titratable residues in the system. A titratable residue is any residue that has a pKa value within 1 or 2 units of the biological pH range (which is roughly 1 to 9). The protonation states of these residues can have a profound effect on the stability of the system, the system's interactions with its surroundings, and any catalytic mechanism that relies on a specific set of protonation states to carry out general acid-base catalysis or nucleophilic attack.[118]

Simulations aimed at modeling proteins or nucleic acids must have some method for assigning protonation states for each titratable residue. Because bond breaking and bond formation are impossible in classical force fields, each residue is typically assigned one protonation state and the entire simulation is run using this set of states.

This approach has two drawbacks. First, the choice of protonation state is often based on the behavior of each titratable residue when free in solution. This may not be a valid assumption, however, because the protein or nucleic acid environment can modulate a residue's protonation state equilibrium. Second, a single protonation state may not accurately represent the true ensemble of states at the desired pH. If the pH is close to the pKa of a given residue, or if the system populates conformations in which

PAGE 84

the dominant protonation state changes, then the true ensemble is represented by conformations with different protonation states.

The first drawback can be addressed by using tools such as PROPKA[119] and H++,[120] which provide a means to assign protonation states to titratable residues by calculating the pKa of the starting structure. However, this does not address the possibility that multiple protonation states may be necessary to build the desired ensemble.

While it may seem that both drawbacks can be addressed by simply running simulations with every possible set of protonation states, this approach quickly becomes unwieldy. Given $N$ titratable residues, there are at least $2^N$ distinct protonation states assuming each residue is either protonated or deprotonated. With only 10 titratable residues this amounts to a minimum of 1024 distinct simulations! While most of these states may not be found in the given ensemble, there is no way to know which ones to exclude a priori. It is important, then, to develop a method capable of directly probing protonation state equilibria in biological molecules.

In order to probe protonation state equilibria in a thermodynamically meaningful way, simulations must be run at constant pH. The first approaches for constant pH simulations used continuum electrostatics methods to calculate the perturbing effect of the system environment on protonation state equilibria using an implicit solvent model (e.g., the Poisson-Boltzmann equation) on a single structure.[121 123] These methods, while sometimes useful for calculating pKa values in biological systems, assume that the full protonation state equilibria can be characterized with a single structure. In particular, using a single structure neglects the response of the system relaxing to accommodate the new protonation state. While the effects of system relaxation have been addressed to some degree by treating the protein interior with a large dielectric constant,[123] this approach assumes an unphysical homogeneity in the system's dielectric response to protonation state changes.
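The combinatorial growth mentioned above is easy to verify. The short sketch below counts the protonation-state combinations for a set of residues and enumerates them for a small case. The function names are illustrative, and the two-states-per-residue assumption is the same lower bound used in the text.

```python
from itertools import product


def count_protonation_states(states_per_residue):
    """Number of distinct protonation-state combinations for a set of residues."""
    total = 1
    for n_states in states_per_residue:
        total *= n_states
    return total


def enumerate_two_state(n_residues):
    """All combinations for residues that are protonated (1) or deprotonated (0)."""
    return list(product((0, 1), repeat=n_residues))


# Ten residues with two states each, the lower bound quoted in the text.
print(count_protonation_states([2] * 10))
# Explicit enumeration quickly becomes impractical; three residues give 8 states.
print(len(enumerate_two_state(3)))
```

Residues with more than two states only make the count grow faster, which is why enumerating fixed-state simulations is not a practical alternative to sampling protonation states directly.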

PAGE 85

A more sophisticated approach to incorporating the system response involves simultaneous sampling of both protonation states and side-chain rotamers.[124] This approach dramatically improves pKa prediction with respect to experiment, but may be insufficient for systems with large scale conformational changes that cannot be attributed only to side chain mobility.

To capture the coupled nature of conformational flexibility with protonation state sampling, several constant pH molecular dynamics (CpHMD) methods have been proposed.[125 131] These methods have proven to be powerful tools for pKa calculation and prediction, although there is still room for improvement.[132] For systems in which some titratable residues experience large pKa shifts, predicted pKa values are often in error by more than 1 pH unit even in the studies that reproduce experimental values the closest.[132] This is usually a direct result of insufficient sampling of protonation and conformational states or a limitation of the underlying model.

Machuqueiro and Baptista have shown that correcting some of the limitations of the underlying model, such as improving the definition of the reference compound (whose role is described below in the Theory section) and improving the underlying force field, improves results.[133] Other work has coupled enhanced sampling techniques, such as accelerated molecular dynamics,[134] with CpHMD to show that improved conformational sampling also improves predicted pKas with respect to experiment.[135] Webb et al. recently published a systematic study showing that the errors inherent to experimental measurements are often larger than those reported, which has important implications for assessing the accuracy of theoretical predictions.[136]

Replica exchange molecular dynamics (REMD) is a family of extended ensemble techniques that have been shown to dramatically improve sampling.[78 85 137 140] In REMD simulations, a series of independent replicas (single MD trajectories of a system) periodically attempt to exchange information, such as temperature[78 137] and, more recently, pH,[86 87] to sample from an expanded ensemble covering multiple states.

PAGE 86

In this study, I implemented the pH-REMD method described by Itoh et al.[87] in the sander module of the Amber[141] software package. I show how this method significantly improves sampling compared to CpHMD in hen egg-white lysozyme (HEWL), a system commonly used as a benchmark for pKa calculations. Titration curves generated using pH-REMD contain significantly less noise and converge more rapidly than CpHMD, suggesting pH-REMD is a powerful tool for carrying out pKa predictions.

Our group has previously shown that temperature REMD simulations converge significantly faster with increasing exchange attempt frequency (EAF).[142 143] Here, I show that increasing the EAF in pH-REMD simulations causes pH-dependent observable properties to converge faster as well.

In the next sections, I will describe the foundation of the constant pH method developed by Mongan et al.[130] and the corresponding pH-REMD method.[86 87] I will then describe the details of my study on HEWL followed by the results and conclusions drawn from that study.

3.2 Theory

Here I will describe the theory behind constant pH simulations, beginning with a description of the statistical ensemble corresponding to this family of simulations and following up with an overview of the methods used in this study.

3.2.1 The Semi-Grand Ensemble

At conditions of constant pH, systems no longer obey the constraints of the typical canonical ensemble presented in the opening chapter. Instead, the chemical potential of hydronium (related directly to the solution pH by Eq. 3) is held constant, thereby allowing the H+ count to fluctuate.

PAGE 87

\[ \mu_{H^+} = \frac{\partial G}{\partial N_{H^+}} = -kT \ln [H^+] = -kT \ln(10) \log [H^+] = kT \ln(10) \, pH \tag{3} \]

where $\mu_{H^+}$ is the chemical potential of hydronium and the activity of the hydronium ion has been replaced by the concentration due to the very low concentrations in which it is typically present in biological systems. Looking back at Eq. 1, we can calculate the partition function of the semi-grand canonical ensemble using Eq. 3, introducing a pH-dependence in our sampling scheme.

\[ \Xi(\mu_{H^+}, V, T) = \sum_{N_{H^+}} Q(N, V, T) \exp\left( \beta R T N_{H^+} \ln(10) \, pH \right) = \sum_{N_{H^+}} Q(N, V, T) \exp\left( N_{H^+} \ln(10) \, pH \right) \tag{3} \]

3.2.2 CpHMD

I used the constant pH molecular dynamics (CpHMD) method developed by Mongan et al.[130] that employs Monte Carlo transitions between discrete protonation states at periodic intervals during a MD simulation to probe protonation state equilibria. In this CpHMD implementation, both the dynamics and the MC protonation state sampling are performed in Generalized Born implicit solvent. After a predetermined number of steps, the MD is halted and a protonation state change is attempted by evaluating the energetic cost of that proposed change, calculated according to Eq. 3.[130]

\[ \Delta G = k_B T (pH - pK_{a,ref}) \ln 10 + \Delta G_{elec} - \Delta G_{elec,ref} \tag{3} \]
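The Monte Carlo step described above can be sketched as follows. This is a minimal illustration rather than the Amber implementation: it assumes that Eq. 3 as written describes a protonation move (a deprotonation move would flip the sign of the pH term), that the two electrostatic terms are supplied by the caller, and that acceptance follows the standard Metropolis criterion. All function names and numerical values are hypothetical.

```python
import math
import random

LN10 = math.log(10.0)
KT_300K = 0.001987204 * 300.0  # kcal/mol at 300 K (illustrative)


def protonation_free_energy(pH, pKa_ref, dG_elec, dG_elec_ref, kT):
    """Transition free energy of Eq. 3 for a proposed protonation move."""
    return kT * (pH - pKa_ref) * LN10 + dG_elec - dG_elec_ref


def acceptance_probability(dG, kT):
    """Metropolis acceptance probability, min(1, exp(-dG/kT))."""
    if dG <= 0.0:
        return 1.0
    return math.exp(-dG / kT)


def metropolis_accept(dG, kT, uniform=random.random):
    """Accept the move if a uniform random number falls below the probability."""
    return uniform() < acceptance_probability(dG, kT)


# A residue behaving exactly like its reference compound, one pH unit above
# the reference pKa: protonation is accepted about 10% of the time.
dG = protonation_free_energy(pH=5.0, pKa_ref=4.0, dG_elec=-50.0,
                             dG_elec_ref=-50.0, kT=KT_300K)
print(f"dG = {dG:.3f} kcal/mol, P(accept) = {acceptance_probability(dG, KT_300K):.3f}")
```

When the electrostatic terms cancel, the acceptance probability depends only on the difference between the pH and the reference pKa, which recovers the titration behavior of the isolated reference compound.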

PAGE 88

Table 3-1. Reference pKa values for the acidic residues treated in this study. Values are the same as those used in the original Amber CpHMD implementation.[130]

Residue                      Reference pKa
Aspartate                    4.0
Glutamate                    4.4
Histidine (H$\delta$)        7.1
Histidine (H$\epsilon$)      6.5

Eq. 3 represents a free energy change of protonating or deprotonating a titratable residue embedded in a biological system with respect to a predefined reference compound. The reference compound is a monomer of the titratable residue capped with small, neutral functional groups. In Eq. 3, $\Delta G_{elec}$ is calculated by taking the difference of the electrostatic energy between the proposed and existing protonation states.[130]

Directly calculating the free energy change associated with protonation or deprotonation is difficult because evaluating the energetic cost of desolvating a free proton and making and breaking chemical bonds is impossible in a classical mechanical framework. Therefore, we calculate the free energy cost of this protonation state change by comparing the free energy of the protonation state change to $\Delta G_{elec,ref}$ in Eq. 3, a precomputed free energy for the reference compound that is adjusted to reproduce experimental pKa values. Eq. 3, then, represents a shift in the pKa of a titratable residue in a biological system from its value free in solution. The reference compound pKa values used in the Amber CpHMD implementation[130] are shown in Table 3-1.

Running a CpHMD simulation, we obtain an ensemble consisting of multiple protonation states properly weighted for the semi-grand canonical ensemble, the thermodynamic ensemble corresponding to constant temperature, volume (or pressure) and chemical potential of hydronium (i.e., constant pH).[126] Because the simulation is assumed to be ergodic, the deprotonation fraction can be calculated by simply counting the fraction of ensemble members in which the residue is deprotonated. Multiple CpHMD simulations must be run with a range of pHs to calculate pKa values for titratable residues in biological systems by fitting a titration curve to the data.
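The counting procedure described above can be sketched in a few lines. The example assumes that the protonation state of one residue has already been extracted from each trajectory as a sequence of booleans (True when protonated); the function name and the toy data are hypothetical, and real trajectories would contain many thousands of frames.

```python
def deprotonated_fraction(is_protonated):
    """Fraction of frames in which the residue is deprotonated."""
    frames = list(is_protonated)
    if not frames:
        raise ValueError("trajectory contains no frames")
    return sum(1 for protonated in frames if not protonated) / len(frames)


# Toy protonation-state records for one residue at three pH values.
trajectories = {
    3.0: [True, True, True, False],
    4.0: [True, False, True, False],
    5.0: [False, False, True, False],
}

# One titration point per pH; these points are what a titration curve is fit to.
titration = {pH: deprotonated_fraction(states)
             for pH, states in sorted(trajectories.items())}
print(titration)
```

Each entry of `titration` is a single point on the titration curve, so the precision of the fitted pKa depends on how well each simulation has sampled both protonation states.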

PAGE 89

Running a simulation with an expanded ensemble so each CpHMD simulation is in equilibrium with simulations at different pHs can further enhance sampling from the desired semi-grand canonical ensemble. For this, we turn to the pH-REMD method.

3.2.3 pH-REMD

Replica exchange at constant pH (pH-REMD) is a variant of replica exchange in which each replica is simulated at a separate pH. The full pH-REMD simulation represents an expanded ensemble in which each replica samples conformations with a fixed pH and samples different pH values at a fixed conformation. In this study, I implemented the pH-REMD method introduced by Itoh et al.[87] in the sander module of Amber.[141]

In pH-REMD, adjacent replicas in the pH ladder swap pH with the Monte Carlo exchange probability

\[ P_{i \to j} = \min\left\{ 1, \exp\left[ \ln 10 \, (N_i - N_j)(pH_i - pH_j) \right] \right\} \tag{3} \]

for replicas $i$ and $j$, where $N_i$ is the number of titratable protons present in replica $i$ and $pH_i$ is the pH of replica $i$ prior to the exchange attempt.

Our group recently developed a different pH-REMD method in which replica exchanges are attempted via Hamiltonian exchange where only atomic coordinates are swapped.[89] In contrast, the currently proposed method only swaps the solution pH between replicas. For large systems with more than 3 to 5 titratable residues, the proposed method of swapping solution pH between replicas achieves more efficient replica exchanges than the variant employing Hamiltonian exchange. For HEWL, specifically, the Hamiltonian REMD variant experienced an exchange success rate of <0.01%, which is effectively indistinguishable from CpHMD simulations.

3.3 Methods

3.3.1 Starting Structure

I chose to study hen egg white lysozyme because it is well-characterized both experimentally[136 144 145] and computationally.[86 130 146] I chose the structure

PAGE 90

from the protein data bank (PDB) with the code 1AKI[147] because it was the focus of Mongan's original study.[130]

The topology file was prepared in the tleap module of AmberTools 12 using the Amber ff10 force field, which is equivalent to ff99SB[23] for proteins. Crystallographic water molecules were removed from the starting structures, and tleap added all hydrogen atoms. Finally, the mbondi2 intrinsic radii for implicit solvent calculations were selected in tleap to be consistent with the initial implementation of CpHMD.[130]

3.3.2 Molecular Dynamics

To be consistent with the original implementation, the Generalized Born model described by Onufriev et al.[49] (corresponding to the input parameter igb=2 for Amber programs) was used with the salt concentration, modeled as a Debye screening parameter, set to 0.1 M in every simulation.[130] Due to the long-range nature of the electrostatic forces, I always used an infinite cutoff for non-bonded interactions.

Each starting structure was minimized using 50 steps of steepest descent followed by 950 steps of conjugate gradient with 10 kcal mol$^{-1}$ Å$^{-2}$ restraints on the backbone atoms to relieve bad contacts. Then, the minimized structure was heated by varying the target temperature linearly from 10 K to 300 K for 667 ps, keeping weak restraints (1 kcal mol$^{-1}$ Å$^{-2}$) on the backbone. I used the Langevin thermostat with a collision frequency of 5 ps$^{-1}$ to control the temperature. These simulations were performed using the pmemd module of the Amber 12 program suite.[141]

After heating, each structure was further run at 300 K for 1 ns with 0.1 kcal mol$^{-1}$ Å$^{-2}$ restraints on the backbone. Each titratable carboxylate was deprotonated and the histidine was protonated, and no protonation state changes were attempted during the simulation.

Next, the resulting structure was used to start 16 ns of CpHMD at pH values spanning 2 to 7 with an interval of 0.5. Only the 10 acidic residues (the aspartates, glutamates, and histidines) were titrated because HEWL is catalytically active at low pH[148] and 1AKI was solved in these conditions. I used a 2 fs time step and attempted


protonation state changes every 5 steps for all simulations in which protonation state changes were attempted. The Langevin thermostat with a collision frequency of 10 ps⁻¹ was used to control the temperature, and each simulation was begun with a different random seed to avoid synchronization artifacts. [149] I used the sander module of Amber 12 for each of these simulations.

3.3.3 Replica Exchange

All pH-REMD simulations were run with 12 equally spaced replicas at pH values spanning 2 to 7.5, identical to the pH values used for the CpHMD simulations with the addition of a replica at pH 7.5. The additional replica is necessary because the REMD implementation in sander requires an even number of replicas so that each replica has a partner for each exchange attempt. The structures obtained after 1 ns of CpHMD simulation for each pH were used as the starting structures for the replica exchange simulations (the structure from the CpHMD run at pH 7 was used for the replica run at pH 7.5 as well). I ran pH-REMD simulations with exchange attempt frequencies (EAFs) (i.e., the frequency with which replicas attempt to swap pH values) equal to 50 ps⁻¹, 10 ps⁻¹, 5 ps⁻¹, and 0.5 ps⁻¹ to assess the effect of EAF on the convergence of observable properties. This corresponds to attempting exchanges every 10, 50, 100, and 1000 steps, respectively. All pH-REMD simulations were run for 15 ns. I note here that the CpHMD simulations are equivalent to a REMD simulation with an EAF equal to 0. I used an in-house, modified version of sander in which I implemented pH-REMD for these simulations and wrote in-house scripts to extract pH-based titration data from the ensemble of replica-based files. The replica at pH 7.5 was ignored in all pH-REMD analyses so the simulations could be compared fairly.

3.4 Results and Discussion

CpHMD methods must sample both protonation states and conformation states to build a thermodynamically meaningful ensemble. Here I will discuss how well CpHMD,


as implemented in Amber, [130] samples from the desired ensemble. I then analyze how pH-REMD affects protonation and conformational state sampling compared to CpHMD, and how effective these tools are for pKa prediction.

3.4.1 Simulation Stability

CpHMD involves instantaneous changes in the charge distribution of the protein as the protonation states are changed. Therefore, it is important to verify that trajectories generated from CpHMD and pH-REMD remain stable with respect to secondary and tertiary structure during the course of the simulation. Mongan et al. [130] showed that temperature and energy fluctuations in excess of those obtained with standard MD (i.e., MD simulations with static protonation states) were minimal during the course of a 1 ns simulation. Most of the energy fluctuations arise from the intrinsic response of the force field to the new charge state. To analyze structural stability, I plotted the root mean squared deviation (RMSD) of every α-carbon in the protein with respect to the minimized crystal structure vs. time. The results from CpHMD and pH-REMD simulations are shown in Fig. 3-1 for the lowest pH, 2; the highest pH, 7; and an intermediate pH, 4.5. The RMSDs are bounded below 4 Å, suggesting that the trajectories remain stable for both CpHMD and pH-REMD during the entire simulation.

3.4.2 Accuracy of Predicted pKas

One of the goals of any constant pH simulation method is to accurately predict pKa values of titratable residues. The pKa of each residue was calculated by using the Levenberg-Marquardt non-linear optimization method to fit the titration data at each pH to the standard Hill equation shown below:

$$f_d = \frac{1}{10^{\,n(\mathrm{p}K_a - \mathrm{pH})} + 1} \qquad (3)$$

where f_d is the fraction of the total simulation that the titratable residue spent in a deprotonated state and n is the Hill coefficient.
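As a rough illustration of how the Hill equation turns titration data into a pKa and a Hill coefficient, the sketch below fits the linearized form log10[f_d / (1 - f_d)] = n (pH - pKa) by ordinary least squares. This is a simplification chosen to keep the example dependency-free; it is not the Levenberg-Marquardt fit used in this study, and the function names and synthetic data are assumptions made for the example.

```python
import math


def hill_fraction(ph, pka, n=1.0):
    """Deprotonated fraction predicted by the Hill equation."""
    return 1.0 / (10.0 ** (n * (pka - ph)) + 1.0)


def fit_hill_linearized(ph_values, fractions):
    """Estimate (pKa, n) from log10(fd / (1 - fd)) = n * (pH - pKa).

    Points with fd equal to 0 or 1 are skipped because the logarithm
    is undefined there (a limitation of this linearized approach).
    """
    pts = [(ph, math.log10(fd / (1.0 - fd)))
           for ph, fd in zip(ph_values, fractions) if 0.0 < fd < 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two points with 0 < fd < 1")
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    n = sxy / sxx                    # slope is the Hill coefficient
    intercept = mean_y - n * mean_x  # intercept equals -n * pKa
    return -intercept / n, n


# Synthetic, noise-free titration data on the pH grid used in this study.
ph_grid = [2.0 + 0.5 * k for k in range(11)]
synthetic = [hill_fraction(ph, 4.5, 0.9) for ph in ph_grid]
fitted_pka, fitted_n = fit_hill_linearized(ph_grid, synthetic)
```

With noisy simulation data the linearized fit weights points near f_d = 0 or 1 heavily, which is one reason a non-linear fit of the untransformed equation is generally preferable.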


Figure 3-1. RMSD plots for CpHMD simulations (a) and pH-REMD simulations at different exchange attempt frequencies (b-d) as a function of time throughout the simulation. The first nanosecond is excluded, as described in Methods. The RMSD is plotted with respect to the minimized crystal structure 1AKI and is shown for one low, one medium, and one high pH simulation (2, 4.5, and 7, respectively).

Table 3-2 shows the pKa values for each titratable residue calculated from Eq. 3 for select exchange attempt frequencies. Hill coefficients that differ significantly from 1 imply either that the residue displays significant non-Henderson-Hasselbalch (non-HH) behavior, or that protonation space is poorly sampled at some pH values, depending on how well Eq. 3 fits the data. If Eq. 3 fits the data poorly, then poor protonation state sampling is at least partially responsible for the deviation of the Hill coefficient from 1. Because only the CpHMD simulations show several residues whose Hill coefficient deviates substantially from 1, we conclude that pH-REMD improves sampling from the desired ensemble.


Table 3-2. pKa and Hill coefficients for each residue taken from each set of simulations. The pKas and Hill coefficients (n) are shown for each EAF. pKa root mean square errors (RMSEs) from the 13C-NMR experimental values published by Webb et al. [136] are shown in the last row. Asp66 was problematic because it is positioned between several arginine residues, causing it to resist protonation. When it significantly impacts the RMSE, the RMSE for all residues except Asp66 is shown in parentheses next to the total RMSE.

             CpHMD            EAF = 0.5 ps⁻¹    EAF = 50.0 ps⁻¹
Residue      pKa      n       pKa      n        pKa      n        Expt. pKa
GLU 7        3.62     1.16    3.60     0.88     3.84     0.96     2.6 ± 0.2
HIS 15       5.96     1.05    5.74     1.09     5.90     0.97     5.5 ± 0.2
ASP 18       2.26     1.11    2.01     0.94     2.01     0.89     2.8 ± 0.3
GLU 35       5.67     3.82    5.44     1.16     4.98     0.98     6.1 ± 0.4
ASP 48       1.22     0.43    1.11     0.71     1.99     0.83     1.4 ± 0.2
ASP 52       2.69     1.16    2.37     0.96     2.30     0.75     3.6 ± 0.3
ASP 66       -18.13   0.11    -3.17    0.49     1.53     1.19     1.2 ± 0.2
ASP 87       2.52     0.81    2.61     0.81     2.66     0.91     2.2 ± 0.1
ASP 101      3.69     2.10    3.66     1.00     3.57     0.87     4.5 ± 0.1
ASP 119      2.26     0.98    2.73     1.04     2.43     0.96     3.5 ± 0.3
RMSE         6.15 (0.74)      1.56 (0.76)       0.89

3.4.3 Enhancing Protonation State Sampling with pH-REMD

Replica exchange methodologies are well-known to improve sampling in the desired ensemble [78, 138] as long as replicas traverse the state-space ladder regularly. If replica exchange attempts always fail, the simulation does not benefit from those attempts. In our pH-REMD simulations, exchange attempts between replicas with neighboring pH values (i.e., replicas with solution pHs separated by 0.5 pH units) succeeded between 40% and 98% of the time, displaying very efficient traversal of the pH-space replica ladder. Here I will discuss the extent to which pH-REMD, with different exchange attempt frequencies (EAFs), improves protonation state sampling compared to CpHMD. I show sample titration curves for simulations with no exchange attempts and simulations in which replica exchanges were attempted with a frequency of 50 ps⁻¹. Residues for which 'good' titration curves are obtained with CpHMD are shown in Fig.


3-2. I characterize 'good' titration curves by small deviations of each point from the fitted titration curve and Hill coefficients between 0.5 and 1.5. Residues that show poor titration curves for CpHMD (characterized by large deviations of points from the fitted titration curve and/or Hill coefficients significantly shifted from 1) are shown in Fig. 3-3. Figure 3-2 shows that even when CpHMD generates data that closely fit Eq. 3, using pH-REMD still improves the fit. More drastic improvement is shown in Fig. 3-3, where CpHMD performs poorly because some residues become conformationally trapped at several pHs, impacting protonation state sampling and skewing the points away from the titration curve.

Figure 3-2. Titration curves obtained with (a) EAF = 0 ps⁻¹ and (b) EAF = 50 ps⁻¹. The data for these residues show the best fit to Eq. 3 for the CpHMD simulations.


Figure 3-3. Titration curves obtained with (a) EAF = 0 ps⁻¹ and (b) EAF = 50 ps⁻¹. The data for these residues have the poorest fit to Eq. 3 for the CpHMD simulations.

The quality of the fits can be quantified by measuring the deviation of each point from the fitted equation according to Eq. 3

$$\mathrm{RSS} = \sum_{\mathrm{points}} \left( O(x) - E(x) \right)^2 \qquad (3)$$

where RSS is the residual sum of squares, O(x) is the actual data point, and E(x) is the value of the fitted equation at that value of x. Eq. 3 provides an easy way to quantitatively evaluate how well the titration data from the pH-REMD simulations fit Eq. 3 compared to the CpHMD simulations. The results for the 8 residues plotted in Figs. 3-2 and 3-3 are shown in Table 3-3. The improvement by using pH-REMD over conventional CpHMD, already apparent by viewing Figs. 3-2 and 3-3, is striking as measured in Table 3-3. Even when CpHMD performs well, pH-REMD results in an improvement of 2 to 3 orders of magnitude in the


Table 3-3. Value of RSS according to Eq. 3 for the 8 residues shown in Figs. 3-2 and 3-3. Larger values represent more deviation from the fitted curve, whereas a value of 0 represents a perfect fit. The 'good' titratable residues (Fig. 3-2) are the first 4 entries and the 'bad' titratable residues (Fig. 3-3) are the last 4 entries.

Residue      RSS (CpHMD)      RSS (EAF = 50 ps⁻¹)
GLU 7        2.7 × 10⁻²       7.9 × 10⁻⁵
HIS 15       3.8 × 10⁻²       2.7 × 10⁻⁵
ASP 18       6.3 × 10⁻³       3.3 × 10⁻⁵
ASP 52       2.2 × 10⁻²       1.3 × 10⁻³
GLU 35       3.6 × 10⁻¹       4.2 × 10⁻⁴
ASP 48       1.7 × 10⁻¹       2.4 × 10⁻⁵
ASP 66       8.2 × 10⁻⁴       2.9 × 10⁻⁴
ASP 101      3.4 × 10⁻²       2.7 × 10⁻⁴

RSS metric, and results in at least another order of magnitude of improvement in the cases where CpHMD performs poorly (except for Asp66, for which CpHMD has already proven to perform poorly in this study).

By exchanging structures with other replicas, ensembles generated at each pH in pH-REMD simulations are able to escape from local minima that prevent titratable residues from accurately sampling protonation states. Because each replica is run independently, snapshots from different replicas are not correlated with one another, so ensembles generated at each pH contain more uncorrelated members in simulations with more rapid EAFs (as long as replica exchange attempts succeed regularly). Therefore, pH-REMD's ability to cross free energy barriers more efficiently reduces to an entropy argument: ensembles at each pH are given more opportunities to sample different conformations.

Another way of thinking about pH-REMD simulations is to consider the entire expanded ensemble, in which the simulations sample in both conformational-space and pH-space. CpHMD simulations, on the other hand, do not sample in pH-space, as the pH remains constant throughout the entire simulation. The protonation state of each titratable residue strongly depends on both the solution pH and the protein conformation, and is coupled to other titratable residues in complicated ways. Therefore, pH-REMD


simulations can move through a much larger free energy space, extended to another dimension (pH-space), relative to CpHMD simulations. CpHMD simulations are unable to take advantage of lower free energy barriers in this expanded ensemble, causing them to become more easily trapped in conformations that skew predicted pKas compared to pH-REMD simulations. In an extreme case, Asp66, the CpHMD simulations at low pH never visited conformations favorable to protonating. By allowing exchanges between pH replicas, ensembles at lower pH crossed into regions of phase space favorable to Asp66 protonation. The case of Asp66 further demonstrates that increasing the EAF improves protonation state sampling. The pKa prediction systematically improves as EAF is increased, and the Hill coefficient improves from 0.11 to 1.19.

3.5 Exchange Attempt Frequency and Protonation State Sampling

To analyze the effect EAF has on pKa convergence, I divided each simulation into sections of 0.25 ns and calculated the standard deviations of the pKa and Hill coefficient, as well as the mean Hill coefficient, by fitting Eq. 3 to the data obtained from each pH. The results are summarized in Table 3-4. The average fluctuation in pKa systematically decreases as EAF increases, in large part due to the improvement of residues that titrate poorly, namely Asp48 and Asp66. The large standard deviation of these two residues is evidence that the protonation state sampling does not converge on the 0.25 ns intervals that were used to generate the statistics. However, increasing the EAF leads to a systematic decrease in the fluctuations of the calculated pKa for Asp48 and Asp66, because a higher EAF decreases the simulation time required to achieve pKa convergence.

The trend of the Hill coefficient shown in Table 3-4 also shows a radical improvement in protonation state sampling with pH-REMD. While a Hill coefficient that deviates


Table 3-4. Standard deviations of pKa, σ(pKa), and Hill coefficient, σ(n), and average Hill coefficient (mean n) calculated by dividing each simulation into sections of 0.25 ns. The pKa and Hill coefficients are calculated for each section of the simulation by fitting data from all pH replicas to Eq. 3 and calculating the statistics from the 60 resulting data points.

             CpHMD                     EAF = 0.5 ps⁻¹            EAF = 50.0 ps⁻¹
Residue      σ(pKa)  mean n  σ(n)      σ(pKa)  mean n  σ(n)      σ(pKa)  mean n  σ(n)
GLU 7        0.18    3.1     3.3       0.29    1.0     0.2       0.15    1.0     0.1
HIS 15       0.24    2.6     1.7       0.21    1.2     0.4       0.20    1.0     0.1
ASP 18       0.27    1.4     0.7       0.29    1.0     0.3       0.28    1.0     0.3
GLU 35       0.42    8.1     6.7       0.30    1.6     0.6       0.41    1.2     0.3
ASP 48       6.61    1.8     3.5       5.1     1.0     1.2       3.3     1.1     0.8
ASP 52       1.04    1.2     0.8       0.40    1.2     0.5       0.64    1.0     0.7
ASP 66       15      0.8     0.9       15      1.0     0.3       4.9     1.4     0.9
ASP 87       0.42    2.0     1.8       0.37    1.0     0.4       0.43    1.1     0.3
ASP 101      0.20    3.8     2.1       0.19    1.1     0.2       0.30    0.7     0.2
ASP 119      0.33    1.4     0.8       0.22    1.2     0.3       0.24    1.1     0.3
Average      2.4     3.0     2.7       2.2     1.1     0.4       1.1     1.1     0.4

significantly from 1 may indicate cooperativity between titrating residues, previous evidence suggests these residues mostly titrate independently. [130] Furthermore, because CpHMD and pH-REMD simulations converge to the same limiting ensemble, Hill coefficients that diverge significantly from 1 in CpHMD simulations but remain close to 1 in the pH-REMD simulations most likely indicate poor protonation state sampling in the CpHMD simulations. The CpHMD simulations show an average Hill coefficient of at least 2 for half of the titratable residues, and the standard deviations are nearly as large as the Hill coefficients themselves. In this case, even low EAFs result in Hill coefficients closer to 1, and their average relative standard deviation drops from 100% to 33%. Therefore, the Hill coefficients from the CpHMD simulations reflect poor protonation state sampling rather than strong cooperativity between titrating residues.

A final metric for analyzing protonation state sampling of a particular residue is to count the number of times the protonation state changes over a specified period of time. I call these protonation state changes transitions, and I only consider a transition


to have occurred if the number of protons on the titratable side-chain changed from one snapshot to the next in a given ensemble. In particular, a tautomeric change, such as a proton changing from one oxygen in a carboxylate to the other oxygen, is not counted. Fig. 3-4 shows how the number of protonation state transitions per ns of simulation, summed over every replica from pH 2 to 7, changes with EAF. In every simulation, protonation state changes are attempted every 5 steps. Simulations that result in more transitions demonstrate enhanced protonation state sampling, since more protonation state changes occur in the same amount of time. All simulations were carried out with the same set of parameters so every ensemble generated at a given pH will converge to the same result given enough simulation time. Therefore, simulations with more transitions will obtain converged pKa values faster.

Figure 3-4. Number of protonation state transitions per ns of simulation time. A transition is counted if consecutive snapshots in the ensemble have a different number of protons for that residue. The CpHMD results are labeled with EAF = 0.1 ps⁻¹ to fit on the log-scale.
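The transition count described above can be sketched in a few lines. The input format here is hypothetical: a list holding the number of protons on one residue at each saved snapshot of one pH ensemble. Because only the proton count is compared, a tautomeric change leaves the value unchanged and is not counted, matching the definition in the text.

```python
def count_transitions(proton_counts):
    """Count snapshot-to-snapshot changes in the number of protons on a residue."""
    return sum(1 for prev, cur in zip(proton_counts, proton_counts[1:])
               if prev != cur)


def transitions_per_ns(proton_counts, snapshot_interval_ps):
    """Normalize the transition count by the time spanned by the snapshots."""
    if len(proton_counts) < 2:
        raise ValueError("need at least two snapshots")
    elapsed_ns = (len(proton_counts) - 1) * snapshot_interval_ps / 1000.0
    return count_transitions(proton_counts) / elapsed_ns


# Hypothetical carboxylate: 1 = protonated (either oxygen), 0 = deprotonated.
example_counts = [1, 1, 0, 0, 1, 1]
example_total = count_transitions(example_counts)
```

Summing the per-residue totals over every replica from pH 2 to 7 would give the quantity plotted against EAF.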


Fig. 3-4 shows that increasing EAF dramatically increases the number of transitions despite the fact that the frequency of attempted protonation state changes is constant. This is due to the nature of the probability of accepting a replica exchange attempt, which is governed by Eq. 3. Because the success of an exchange attempt depends only on the net difference of titrating protons between the two replicas, it is possible for this net difference to be small, and therefore the probability of accepting the exchange attempt large, even when several residues have different protonation states. Therefore, numerous protonation state changes for individual residues often accompany a successful exchange of replicas.

3.5.1 Enhancing Conformational State Sampling with pH-REMD

Because conformations and protonation states are coupled, enhanced conformational sampling from pH-REMD naturally accompanies enhanced protonation state sampling. In well-designed pH-REMD simulations (i.e., pH-REMD simulations in which efficient mixing occurs in pH-space), each replica contributes structures to the ensemble at each pH, which serves to increase the number of conformations visited at each pH. RMSD is a metric that reflects how different the sampled conformations are from a reference, in this case the original, minimized crystal structure. The histogrammed RMSD data from Fig. 3-1 are shown in Fig. 3-5 to allow easier comparison between the different simulations. Simulations with higher EAFs traverse the replica ladder more rapidly, allowing trajectories to break out of local minima that are tied to a particular protonation state. The widening distributions in Fig. 3-5 show that RMSD-space is explored more thoroughly within the 15 ns time scale sampled in each simulation as EAF increases to 10 ps⁻¹ (there is no noticeable difference between the 10 ps⁻¹ and 50 ps⁻¹ EAF). Because each simulation is subjected to the same set of external constraints (e.g., temperature, pH, solvation model, etc.), each generated ensemble should represent a subset of the theoretically complete ensemble under these external constraints.


Figure 3-5. Histogrammed RMSD data for pH 2, pH 4.5, and pH 7 taken from simulations run with different EAFs.

Because these wider RMSD distributions reflect sampling of more conformations further from the starting structure, it is almost certain that these wider distributions at high EAF are thermodynamically 'better' (i.e., the ensembles approximate the theoretically complete ensembles better) than their narrower counterparts in the CpHMD and 5 ps⁻¹ EAF simulations. The original RMSD data, plotted in Fig. 3-1, also suggest that pH-REMD simulations converge more rapidly because those simulations display many transitions between conformations with different RMSDs.

In addition to sampling more RMSD space than CpHMD simulations, the pH-REMD simulations also converge to their final RMSD distributions much more rapidly. To quantify this measure, I used the Kullback-Leibler divergence [150, 151] (D_KL). The Kullback-Leibler divergence, calculated via Eq. 3, quantifies the similarity between two distinct probability distributions P(i) and Q(i).
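For histogrammed data, the Kullback-Leibler divergence is the sum over bins of P(i) ln[P(i)/Q(i)]. The sketch below is a minimal, dependency-free illustration; the binning helper, the bin range, and the handling of empty bins (bins with P(i) = 0 contribute nothing, while Q(i) = 0 with P(i) > 0 gives an infinite divergence) are assumptions of this example rather than details taken from the analysis scripts used in this work.

```python
import math


def normalized_histogram(values, lo, hi, nbins):
    """Bin values on [lo, hi] and return bin probabilities that sum to 1."""
    counts = [0] * nbins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / (hi - lo) * nbins), nbins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the histogram range")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Discrete D_KL(P || Q) = sum_i P(i) * ln(P(i) / Q(i))."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the limit of x * ln(x) as x -> 0 is 0
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total
```

To trace convergence, P would be built from the RMSD values collected up to time t and Q from the RMSD values of the full simulation, using the same bins for both.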


Figure 3-6. Kullback-Leibler divergence for each simulation calculated via Eq. 3. P(i) is the RMSD histogram of the indicated simulation at time t and Q(i) is the RMSD histogram of the entire indicated simulation. Values closer to zero indicate distributions of higher similarity.

$$D_{KL} = \sum_{i=1}^{N} P(i) \ln \frac{P(i)}{Q(i)} \qquad (3)$$

where D_KL is the Kullback-Leibler divergence metric, i is the property of interest (RMSD in this case), and P(i) and Q(i) are two probability distribution functions on i-space. For discrete spaces (such as those obtained by histogramming data), Eq. 3 is represented as a sum (as shown), but becomes an integral over all of i-space for continuous probability distribution functions P(i) and Q(i). Fig. 3-6 plots D_KL calculated from Eq. 3 where P(i) is the RMSD distribution of each simulation at time t and Q(i) is the final RMSD distribution of each simulation. As P(i) and Q(i) become more similar, D_KL tends toward zero. Therefore, the curves that approach zero more rapidly approach their final RMSD distribution in a shorter amount of time.

The pH-REMD simulations not only explore more RMSD space than the corresponding CpHMD simulations, but they characterize this larger space more rapidly as


well, since they converge to their final distribution faster than CpHMD. Furthermore, the simulations with an EAF of 10 ps⁻¹ and 50 ps⁻¹ typically converge faster than the one with an EAF of 5 ps⁻¹. The only exception in this case, including the pHs not shown in Fig. 3-6, is the simulation at pH 2. In general, simulations with EAF 10 ps⁻¹ and 50 ps⁻¹ are indistinguishable with respect to RMSD. At pH 2, the RMSD distribution of the EAF = 5 ps⁻¹ simulation is much narrower than the corresponding distributions at EAF 10 ps⁻¹ and 50 ps⁻¹. Therefore, it is not surprising that the D_KL of the 5 ps⁻¹ EAF simulation converges more rapidly than that of the higher EAFs. In general, however, pH-REMD simulations with high EAFs sample more RMSD space more efficiently than CpHMD and simulations with low EAFs.

To probe the nature of the conformational flexibility of HEWL at different EAFs, I calculated the average atomic fluctuations for each residue from the average structure. These fluctuations provide insight into the flexible regions of the protein, giving a more fine-grained, structural analysis than RMSD does. The results in Fig. 3-7 show that the same parts of HEWL are generally flexible for each simulation, but the pH-REMD simulations tend to display enhanced flexibility compared to the CpHMD simulations. Again, because each simulation samples from the same ensemble subject to the same thermodynamic constraints, this increased flexibility suggests that the simulations at high EAF converge to the true ensemble more rapidly than the CpHMD simulations and the pH-REMD simulations with a low EAF.

While the overall flexibility in pH-REMD simulations is increased with respect to the CpHMD simulations, the dynamics still reveal different behavior at different pH. In particular, the region between residues 100 and 120 shows drastically increased flexibility at pH values higher than 4 for the pH-REMD compared to the CpHMD simulations. Furthermore, the region around residue 70, which shows heightened flexibility at pH 2 (Fig. 3-7), contains the problematic titratable residue Asp66. This increased flexibility


Figure 3-7. Average atomic fluctuations for each residue relative to the average structure of the ensemble. Data are shown for CpHMD, low EAF (0.5 ps⁻¹), and high EAF (50 ps⁻¹).


at high EAF is the likely explanation for why Asp66 protonates at low pH in the pH-REMD simulations but not in the CpHMD simulation, where this loop is substantially less flexible.

To probe the pH-dependence of HEWL dynamics further, I plot the distributions of the distance between the carboxylate carbons of the catalytic residues Asp52 and Glu35 at each pH. Because Asp52 and Glu35 are the catalytic residues in HEWL, this behavior may have important implications in the HEWL catalytic activity profile as a function of pH. Fig. 3-8 shows that only the simulation run at pH 5 samples conformations in which Asp52 and Glu35 are closely interacting for the CpHMD simulations. Furthermore, the simulation at pH 5 spends roughly 75% of its time 'stuck' in this close interaction. It is highly unlikely that this interaction is so strong at pH 5, yet is almost non-existent at pH 4.5 and 5.5. More likely, Fig. 3-8 suggests that the CpHMD simulation run at pH 5 became trapped while trajectories at other pH values were unable to enter this conformational bin within the 16 ns of simulation. pH-REMD simulations with a high EAF easily overcome this barrier within the simulation time scale.

The distributions from pH-REMD simulations with a 50 ps⁻¹ EAF display more expected behavior, given that the calculated pKas of Asp52 and Glu35 are 2.30 and 4.98, respectively (Table 3-2). This interaction is likely strongest when one of the carboxylates is protonated and the other is deprotonated, and it is likely weakest when both are deprotonated. Therefore, this interaction should be strongest at a pH between 2.30 and 4.98. The Asp52-Glu35 interaction is the strongest at pH 2.5 and decays as the pH either increases or decreases. At pH 2.5, Asp52 is most likely deprotonated while Glu35 is most likely protonated. At pH 2.0, both residues are likely to be protonated, resulting in a slightly weaker interaction. However, this is still more favorable than when both residues are deprotonated, so the interaction becomes significantly weaker as the pH increases.


Figure 3-8. Distributions of the distance between the Asp52-Cγ and Glu35-Cδ carboxylate carbons. Asp52 and Glu35 are the catalytic residues of HEWL.


Figure 3-9. Fraction of simulation with the Glu35-Asp52 distance shorter than 5 Å vs. pH.

Furthermore, tight coupling between Asp52 and Glu35 likely induces non-HH behavior as these residues no longer titrate independently. This explains why the Hill coefficient for Asp52, reported in Table 3-2, is more significantly shifted away from 1, to 0.75, for the EAF of 50 ps⁻¹. Over the pH range that contains the Asp52 inflection point, Fig. 3-8 shows that the interaction between the two active site residues is strong. Over the pH range that contains the Glu35 inflection point, however, the interaction is weak, causing the Glu35 titration to display nearly ideal Henderson-Hasselbalch behavior. To better illustrate the pH-dependence of the Glu35-Asp52 interaction depicted in Fig. 3-8 (at 50 ps⁻¹ EAF), I integrated each of the distributions from 0 Å to 5 Å and plotted the result against pH, shown in Fig. 3-9.
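Integrating a normalized distance distribution from 0 Å to 5 Å is equivalent to counting the fraction of snapshots whose distance falls below the cutoff. A minimal sketch follows, assuming the distances have already been extracted into a plain list (the function name and the sample values are hypothetical):

```python
def fraction_within(distances, cutoff=5.0):
    """Fraction of snapshots whose distance is shorter than the cutoff (same units)."""
    if not distances:
        raise ValueError("no distances supplied")
    return sum(1 for d in distances if d < cutoff) / len(distances)


# Hypothetical carboxylate carbon distances in angstroms for one pH ensemble.
sample_distances = [3.0, 4.9, 5.0, 7.2]
sample_fraction = fraction_within(sample_distances)
```

Repeating the calculation for the ensemble at each pH gives one point per pH on a curve like the one described in the text.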


Table 3-5. Average timings for CpHMD and pH-REMD simulations. CpHMD simulations used 24 processors, whereas pH-REMD simulations used 288 processors (24 per replica). All simulations were performed on NICS Keeneland. [152]

EAF (ps⁻¹)    Efficiency (ns/day)
0.0           7.56
0.5           6.81
5.0           6.55
10.0          6.72
50.0          6.70

3.5.2 Scalability With Increasing Exchange Attempt Frequency

It is important when selecting a simulation protocol to consider the performance implications of each of the choices, since there is often a trade-off between computational expense and theoretical rigor. As Mongan et al. demonstrated in their work, the CpHMD method implemented in Amber is only marginally more expensive than traditional MD with constant protonation states. [130] Here, I will discuss the performance implications of increasing the EAF of pH-REMD simulations.

The computational cost of the pH-REMD simulations is the sum of the cost of the underlying CpHMD method [130] and the cost of the replica exchange attempts. The exchange success probability in pH-REMD simulations is governed by Eq. 3 and can be implemented so that the computational cost of each exchange attempt is negligible. Each replica of every simulation was run on 24 processors on NICS Keeneland [152] so the simulation efficiency, measured in terms of ns of simulation per day, can be directly compared. The average results obtained for each EAF are summarized in Table 3-5. The decreased performance from the CpHMD simulation (EAF = 0.0 in Table 3-5) to the EAF = 5.0 ps⁻¹ simulations arises from the fact that REMD simulations in Amber currently require each replica to perform the same number of MD steps between exchange attempts. This synchronization causes each replica to run only as fast as the slowest replica.
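The EAF values in Table 3-5 correspond to exchange intervals in MD steps through the integration time step. The helper below is a small sketch of that conversion, assuming the 2 fs time step stated in Methods; the function name is made up for this example.

```python
def steps_between_exchanges(eaf_per_ps, timestep_fs=2.0):
    """Number of MD steps between exchange attempts for a given EAF in ps^-1."""
    if eaf_per_ps <= 0.0:
        raise ValueError("an EAF of 0 means exchanges are never attempted")
    interval_fs = 1000.0 / eaf_per_ps  # time between attempts in fs
    return round(interval_fs / timestep_fs)


# The four non-zero EAFs used in this study.
exchange_intervals = {eaf: steps_between_exchanges(eaf)
                      for eaf in (50.0, 10.0, 5.0, 0.5)}
```

With a 2 fs time step this reproduces the intervals quoted in Methods: 10, 50, 100, and 1000 steps.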


The pH-REMD simulations are run with 288 processors (12 replicas with 24 processors each), whereas the CpHMD simulations run each replica independently with only 24 processors. Therefore, the synchronization of the replicas in REMD is responsible for the 10% performance reduction between the CpHMD simulation and the pH-REMD simulation with an EAF of 0.5 ps⁻¹ (attempting exchanges every 1000 integration steps). Increasing the EAF of pH-REMD simulations from 0.5 ps⁻¹ to 50.0 ps⁻¹ (attempting exchanges every 10 steps) results in a 1% reduction in average simulation efficiency, a value that falls well within the fluctuations between two different simulations with the same EAF run on the same machine. Given the lack of performance degradation as EAF increases and the improved sampling as EAF increases, the best EAF to use with the presented pH-REMD implementation is one where exchanges are attempted every 10 to 50 integration steps (50.0 ps⁻¹ and 10.0 ps⁻¹ in this study, respectively).

3.6 Conclusion

In this study, I have shown that pH-REMD effectively enhances sampling from the semi-grand canonical ensemble compared to CpHMD in the case of hen egg-white lysozyme. The titration curves generated from pH-REMD simulations are considerably less noisy than the analogous titration curves generated from CpHMD simulations, and they fit the Hill equation much better. Furthermore, pKas calculated from pH-REMD simulations converge faster and achieve better precision than those from CpHMD.

In some cases, pH-REMD can effectively cross potential energy barriers that trap residues in CpHMD simulations. In the case of the Asp66 residue in HEWL, CpHMD simulations were unable to obtain noticeable protonation fractions even at a pH as low as 2 when started from the 1AKI crystal structure. Utilizing pH-REMD simulations with a rapid EAF, I obtained a titration curve with a Hill coefficient close to 1 and a calculated pKa that compares well to experiment.


I have demonstrated that increasing the EAF improves sampling and convergence of several observables in this study. Asp66 titrates more efficiently with a high EAF due to the enhanced mobility of flexible regions of the protein. Furthermore, analysis of the distance between the catalytic residues Asp52 and Glu35 shows that increasing the EAF can provide valuable chemical insight into biologically significant pH-dependent behavior of proteins.

Similarly to past work with temperature REMD, [142, 143] high EAFs give rise to more rapid convergence. Replica exchange methodologies can be implemented efficiently to reduce the cost of each exchange attempt. In pH-REMD, the exchange success probability, governed by Eq. 3, involves only trivial mathematics so the cost of evaluating Eq. 3 is negligible. For efficient REMD implementations, like the one presented in this work, I recommend setting the EAF to at least 10 ps⁻¹, although some improvement is still seen with higher EAFs.

Chodera and Shirts [138] provide an explanation for the improved efficiency of high EAFs by relating it to Gibbs sampling and the effect high EAF has on 'state space' sampling (pH-space in this study). In their paper, Chodera and Shirts propose enhancements to the exchange process in REMD simulations, such as exchanges between non-adjacent neighbors in 'state space' as well as attempting multiple exchanges before resuming dynamics. [138] pH-REMD is likely to benefit by attempting exchanges between non-adjacent neighbors, since the difference in pH between replicas is small. Given the simplicity of the exchange probability equation (Eq. 3), the exchange success rate can be calculated between any two replicas over the course of the pH-REMD simulation. The calculated success rates show a non-negligible probability of accepting exchange attempts between replicas up to 2 pH units away from each other (i.e., separated by 3 replicas).
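The exchange criterion is simple enough to evaluate for any pair of replicas, adjacent or not. The sketch below computes min{1, exp[ln 10 (N_i - N_j)(pH_i - pH_j)]} from the proton counts and pH values of two replicas; the function name and the example numbers are illustrative assumptions, not output from the simulations in this work.

```python
import math


def exchange_probability(n_i, n_j, ph_i, ph_j):
    """Metropolis acceptance probability for swapping the pH of replicas i and j.

    n_i and n_j are the numbers of titratable protons currently present in
    each replica; ph_i and ph_j are their pH values before the attempt.
    """
    exponent = math.log(10.0) * (n_i - n_j) * (ph_i - ph_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


# Adjacent replicas (0.5 pH units apart) differing by one proton.
p_adjacent = exchange_probability(6, 5, 2.0, 2.5)
# Replicas 2 pH units apart differing by one proton.
p_distant = exchange_probability(6, 5, 2.0, 4.0)
```

When both replicas hold the same number of titratable protons the exponent is zero and the swap is always accepted, regardless of how far apart the two pH values are.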


CHAPTER 4
CONSTANT PH MOLECULAR DYNAMICS IN EXPLICIT SOLVENT

In this chapter, I present a new method for performing CpHMD simulations in explicit solvent, building on the discrete protonation state model developed and explained in Ch. 3.

4.1 Introduction

Recently, Machuqueiro and Baptista raised concerns about pKa predictions inheriting problems related to the model compound definition and inaccuracies in the underlying force field. [133] In particular, force field deficiencies have been shown to result in incorrect, even unphysical, global minima. [23, 36, 153, 155] Machuqueiro and Baptista's work suggests that pKa predictions are improved when the sampled conformations more closely resemble the true, experimental ensemble. Therefore, the role of GB in evaluating the dynamics of biomolecules may be problematic in certain situations where GB is known to fail, such as the over-stabilization of salt bridges. [156, 157] Indeed, for highly charged systems, like nucleic acids, a more accurate treatment of electrostatic interactions is required to build a sensible ensemble.

While most of the physics-based methods designed to describe a biomolecular system at constant pH use an implicit representation of the solvent, several CpHMD methods have been extended to sample at least conformations in explicit solvent with both the discrete [126] and continuous [86, 158, 159] protonation models. The methods proposed by Baptista et al. [126] and Wallace and Shen [86] use an implicit solvent potential to sample protonation states while the methods developed by Goh et al. [159] and Donnini et al. [158] perform λ-dynamics on the titration coordinate directly in explicit solvent. A more recent approach by Wallace and Shen uses a λ-dynamics approach in pure explicit solvent, but adds a counter-ion whose charge is changed simultaneously with a titratable residue in order to maintain charge neutrality in the unit cell. [160]


Discrete protonation methods use molecular dynamics to propagate the spatial coordinates, while occasionally interrupting the dynamics to attempt change(s) to the protonation states of the titratable residues using a Metropolis Monte Carlo criterion. The CpHMD method implemented in Amber [130] (and later implemented in CHARMM [87]) performs MD in GB solvent, periodically attempting to change the protonation state of one or two interacting residues roughly every 10 fs. [130] In the stochastic titration method described by Baptista et al., dynamics is run in explicit solvent for 2 ps, [161] after which a cycle of protonation state change attempts is evaluated using the Poisson-Boltzmann (PB) equation to treat solvation effects for every titratable residue and interacting titratable residue pair. About 40,000 full cycles are attempted each time protonation state changes are attempted. [162] Afterwards, the solute is held fixed while MD is propagated on the solvent to reorganize the solvent distribution to the new set of protonation states.

Implicit solvent models (in this case GB and PB) average over all solvent degrees of freedom, allowing such approaches to instantly incorporate the solvent relaxation to discrete protonation state changes. Therefore, MC moves in which a protonation state change is attempted have a reasonable probability of succeeding when the solution pH is set close to the intrinsic pKa of the titratable group. When explicit solvent molecules are present, however, the solvent orientation around any solvent-exposed, titratable residue will oppose every proposed protonation state change. On average, the solvent distribution tends to resist protonation state changes by imposing a barrier on the order of 100 kcal/mol as estimated by measurements in our lab and in others', [86] making titration with discrete protonation states difficult directly in explicit solvent.

In this study, I present a new method of performing CpHMD simulations in explicit solvent using discrete protonation states. This method is similar in some respects to that proposed by Baptista et al., [126] and I evaluate its performance on the model compounds, a pentapeptide, and two proteins: Ribonuclease A (RNase A) and the hen


egg white lysozyme (HEWL). To enhance the sampling capabilities of this new CpHMD method, I used replica exchange in the pH-dimension (pH-REMD), whose theory and performance were discussed previously in the context of implicit solvent calculations. [87, 88]

This chapter is organized as follows: I will first describe the method and its implementation in the Theory and Methods section, followed by a description of the calculations I performed in the Calculation Details section. Afterwards, I will evaluate its performance as well as sensitivity to the method's tunable parameters in the Results and Discussion section.

4.2 Theory and Methods

In this section, I will discuss the details of our proposed method and highlight how it differs from the approach used by Baptista et al. [126] The theoretical foundation of our CpHMD method is described in detail, as well as the pH-REMD method I used in our simulations.

4.2.1 Conformational and Protonation State Sampling

In CpHMD, structures are sampled from the semi-grand canonical ensemble, whose probability distribution function is given by

$$\rho(q, p, \mathbf{n}) = \frac{\exp\left[\beta \mu_{H^+} n - \beta \hat{H}(q, p, \mathbf{n})\right]}{\sum_{\mathbf{n}'} \int dp' \, dq' \, \exp\left[\beta \mu_{H^+} n' - \beta \hat{H}(p', q', \mathbf{n}')\right]} \qquad (4)$$

where β = 1/k_B T, μ_H+ is the chemical potential of hydronium (directly related to the solution pH), q is the generalized coordinates of the system particles, p is the conjugate momenta, and n is the total number of titratable protons present in that state. When bold, n refers to the protonation state vector, specifying not only the total number of protons present, but on which titratable sites those protons are located. The denominator in Eq. 4 is the partition function of the semi-grand canonical ensemble.

To sample from the probability function in Eq. 4, discrete protonation state methods use MD with a fixed set of protonation states to sample coordinates and momenta coupled with a MC-based protonation state sampling at fixed conformations


throughout the trajectory. This is equivalent to separating $\rho$ into conditional probabilities. In Eq. 4-2, $\rho(\mathbf{q},\mathbf{p}\mid\mathbf{n})$ is sampled via MD and $\rho(\mathbf{n}\mid\mathbf{q},\mathbf{p})$ is sampled via the MC protonation state changes.

$$\iiint \rho(\mathbf{q},\mathbf{p},\mathbf{n})\,d\mathbf{q}\,d\mathbf{p}\,d\mathbf{n} = \iint \rho(\mathbf{q},\mathbf{p}\mid\mathbf{n})\,d\mathbf{q}\,d\mathbf{p} \int \rho(\mathbf{n}\mid\mathbf{q},\mathbf{p})\,d\mathbf{n} \tag{4-2}$$

In explicit solvent, $\rho(\mathbf{n}\mid\mathbf{q},\mathbf{p})$ is difficult to sample directly, since the solvent orientation is set according to the current protonation state vector. Following the arguments of Baptista et al., the system coordinates (and momenta) can be separated into solute and solvent degrees of freedom. [126] The protonation state sampling is then performed according to the conditional probability

$$\rho' = \rho(\mathbf{p}_{\mathrm{solvent}},\mathbf{q}_{\mathrm{solvent}},\mathbf{n}\mid\mathbf{p}_{\mathrm{solute}},\mathbf{q}_{\mathrm{solute}}) \tag{4-3}$$

where $\mathbf{q}_{\mathrm{solvent}}$ and $\mathbf{p}_{\mathrm{solvent}}$ are relaxed solvent distributions of positions and momenta around the protonation state vector, $\mathbf{n}$. [126] Implicit solvent models account for solvent reorganization instantly, so the distribution function $\rho'$ in Eq. 4-3 can be approximated using continuum models, such as the PB or GB equations, thereby avoiding the otherwise costly solvent relaxation calculation associated with each attempted protonation state change.

In contrast to the stochastic titration method, which calculated solvation free energies using the PB equation to evaluate protonation state changes, [126] I chose to use the GB implicit solvent model for three main reasons. First, sander has numerous GB models readily available, [49, 50, 163, 165] allowing us to use the existing code to evaluate protonation state change attempts. Second, results from the original GB-based CpHMD implementation by Mongan et al., and from a number of previous studies using the method, have been promising. [88, 130, 135, 166] Third, GB was shown to be effective when used in a hybrid solvent method with continuous protonation states [86] and is less computationally expensive than PB, and its calculation is more easily


split up among many processors, allowing longer simulations to be performed in the same amount of time.

4.2.2 Explicit Solvent CpHMD Workflow

The process of the CpHMD method presented here can be divided into three repeating steps, summarized in the workflow diagram in Fig. 4-1. This workflow is very similar to the one presented in Baptista et al. (Fig. 2), [126] although the nature of the MC protonation state move is different. In the proposed method, standard MD in explicit solvent is carried out using a constant set of protonation states (an initial set must be provided at the start of the simulation). At some point the MD is stopped, the solvent (including any non-structural ions) is stripped, the potential is switched to an available GB model, and a set of N protonation state changes is attempted, where N is the number of titratable residues.

While in principle the MD can be stopped randomly with a predetermined probability at any step, in this iteration of our proposed method MD is run for a set time interval, $\tau_{\mathrm{MD}}$, similar to the stochastic titration method. [126] After the MD is halted and the solvent stripped, protonation state changes are proposed for each titratable residue once, in random order, choosing from the available protonation states of that residue excluding the currently occupied state. The electrostatic energy difference between the proposed and current protonation states, as well as the MC decision regarding whether or not to accept the proposed state, are calculated the same way as in the original GB implementation. [130] If the protonation state change is accepted, the 'current' state is appropriately updated, and the next residue, chosen at random without replacement, is titrated with this new state.

For each residue that is titrated, there is a 25% chance that a so-called multi-site titration will occur with a neighboring residue; that is, the proposed change will involve changes to the protonation state of both neighbors. Two titratable residues are considered 'neighbors' if their two titrating hydrogen atoms are within 2 Å from


each other. If either residue has more than one titrating proton, the two residues are neighbors if the minimum distance between any pair of titrating hydrogens meets the cutoff. Like the single-residue change attempts, if this protonation state change attempt fails, the system remains in its original protonation states for both residues.

Including multi-site protonation state jumps is important for systems that have closely-interacting titratable residues. Without these multi-site moves, proton transfers between adjacent titratable residues involved in a hydrogen bond would never occur due to the high penalty of disrupting the interaction by adding another proton or removing the proton involved in the hydrogen bond. This feature was actually present in the initial implementation, and while no mention of it was made in the original paper, a small note was made in the Users' manual. [130]

If any of the protonation state change attempts were accepted, the solute is frozen while MD is performed on the solvent (and any ions) to relax the solvent distribution around the new protonation states. The length of this relaxation is a tunable parameter of the method, which I will call $\tau_{\mathrm{rlx}}$. When the relaxation time is infinitely long, this process becomes exact. After the relaxation is complete, the velocities of the solute atoms are restored to their values prior to the relaxation and the standard dynamics is continued.

4.2.3 pH-based Replica Exchange

The underlying theory behind replica exchange in pH-space with MD run in explicit solvent is unchanged from the version I implemented in implicit solvent, as described in Ch. 3. [87, 88] Replicas are ordered by their solution pH parameter, and adjacent replicas attempt to exchange their pH periodically throughout the MD simulations. The probability of accepting these replica exchange attempts, given in Ch. 3, depends only on the difference in the number of titrating protons present in each replica and their respective difference in pH. [88] As a result, the number of replicas necessary to obtain efficient mixing in pH-space does not increase as explicit solvent is added.
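To illustrate why the exchange test is independent of the number of solvent molecules, the sketch below assumes the acceptance rule that follows from weighting each replica by 10^(-N pH): P = min{1, exp[ln(10) (N_i - N_j)(pH_i - pH_j)]}. This form is an assumption made here for illustration (the expression actually used is the one given in Ch. 3), and the function names are hypothetical rather than part of Amber.

```python
import math
import random

LN10 = math.log(10.0)


def exchange_probability(n_i, ph_i, n_j, ph_j):
    """Acceptance probability for swapping the pH of two replicas.

    n_i, n_j   : number of titrating protons currently present in each replica
    ph_i, ph_j : solution pH currently assigned to each replica
    Only proton counts and pH values enter; no solvent energy is needed.
    """
    exponent = LN10 * (n_i - n_j) * (ph_i - ph_j)
    if exponent >= 0.0:
        return 1.0
    return math.exp(exponent)


def attempt_exchange(n_i, ph_i, n_j, ph_j, rng=None):
    """Return True if the swap is accepted by the Metropolis test."""
    draw = (rng or random).random()
    return draw < exchange_probability(n_i, ph_i, n_j, ph_j)
```

Under this assumed rule, a swap is always accepted when the replica at the higher pH currently holds more protons than the replica at the lower pH, because the exchange moves both replicas toward their preferred protonation levels.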


Figure 4-1. Workflow of the proposed discrete protonation CpHMD method in explicit solvent. Following the standard MD, the solvent, including all non-structural ions (as determined by user input), is stripped and the protonation state changes are evaluated in a GB potential. After that, the solvent and the original settings are restored for the remaining steps.
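The MC stage of the workflow in Fig. 4-1 can be sketched as follows. This is a minimal illustration, not the sander implementation: the `delta_g` callback is a hypothetical stand-in for the GB-based transition free energy in kcal/mol, `kT` is the thermal energy in the same units, and multi-site moves are omitted for brevity.

```python
import math
import random


def titration_sweep(states, n_states, delta_g, kT, rng):
    """One round of protonation state MC at fixed solute coordinates.

    states   : list of current state indices, one per titratable residue
               (modified in place when a move is accepted)
    n_states : number of available protonation states for each residue
    delta_g  : hypothetical callback delta_g(res, old, new, states) -> float
    rng      : a random.Random instance
    Returns the number of accepted protonation state changes.
    """
    order = list(range(len(states)))
    rng.shuffle(order)  # each residue visited once, in random order
    accepted = 0
    for res in order:
        old = states[res]
        # Propose any available state except the currently occupied one.
        choices = [s for s in range(n_states[res]) if s != old]
        if not choices:
            continue
        new = rng.choice(choices)
        dg = delta_g(res, old, new, states)
        # Metropolis criterion.
        if dg <= 0.0 or rng.random() < math.exp(-dg / kT):
            states[res] = new  # later residues see the updated state
            accepted += 1
    return accepted
```

In the workflow described above, a nonzero return value would trigger the solvent relaxation stage before standard MD resumes; a return value of zero would skip it.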


This, coupled with the improved sampling found with implicit solvent simulations, [88] makes pH-REMD an effective tool for explicit solvent CpHMD.

4.3 Calculation Details

To evaluate the performance of the proposed method, I applied it to the amino acid model compounds, a small pentapeptide (ACFCA), and two proteins commonly used in pKa calculation studies: ribonuclease A (RNase A) and the hen egg-white lysozyme (HEWL).

4.3.1 Model Compounds

Absolute pKas are very difficult to calculate in solution; they are impossible using classical force fields. As a result, every physics-based CpHMD method uses the idea of a model compound whose experimental pKa is easy to measure with a high level of accuracy. An empirical parameter, the reference energy, is then added so that CpHMD reproduces the experimental pKas of these model compounds. In this way, CpHMD computes the pKa shift of a titratable residue in a biomolecule with respect to the isolated model compound in solution via the thermodynamic cycle shown in Fig. 4-2.

The model compounds have the sequence ACE-X-NME, where ACE is a neutral acetyl capping residue, X is a titratable residue, and NME is a neutral methyl amine capping residue. [130] The available titratable residues in Amber are aspartate (AS4), glutamate (GL4), histidine (HIP), lysine (LYS), tyrosine (TYR), and cysteine (CYS), which are all defined as described by Mongan et al. [130] A 10 Å TIP3P [167] solvent buffer was added in a truncated octahedron around the model compound. The aspartate model compound was also simulated with larger box sizes (15 Å and 20 Å buffers) to determine if the box size had any effect on the calculated pKa.

After the system topologies were generated, each system was minimized using 100 steps of steepest-descent minimization followed by 900 steps of conjugate gradient


minimization. They were then heated at constant pressure, varying the target temperature linearly from 50 K to 300 K over 200 ps. The solvated model compounds were then simulated, free of restraints, for 2 ns at constant temperature and pressure.

Each model compound system was simulated at constant pH and volume for 2 ns, setting $\tau_{\mathrm{rlx}}$ = 200 fs. Each system was simulated with pH-REMD using six replicas with the solution pH set to pKa ± 0.1, pKa ± 0.2, and pKa ± 1.2. To evaluate the effect of the solvent relaxation time, the cysteine and aspartate model compounds were run with $\tau_{\mathrm{rlx}}$ set to 10 fs, 40 fs, 100 fs, 200 fs, and 2 ps.

Figure 4-2. Thermodynamic cycle used to evaluate protonation state changes in CpHMD simulations.
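The size of the reference energy corrections listed in Table 4-1 can be checked with a short calculation. The sketch below assumes T = 300 K and converts a pKa error into a free energy through k_B T ln(10) per pK unit; the function name and the value of the Boltzmann constant in kcal/(mol K) are supplied here for illustration and are not taken from the text. Only magnitudes are compared, because the sign of the correction depends on which protonation state carries the arbitrary zero of reference energy.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K), assumed value


def pka_shift_to_energy(delta_pka, temperature=300.0):
    """Free energy (kcal/mol) equivalent to a pKa difference of delta_pka."""
    return K_B * temperature * math.log(10.0) * delta_pka


# Table 4-1 entries: (reference pKa, GB reference energy,
#                     pKa computed with the GB reference energy,
#                     adjusted reference energy)
table_4_1 = {
    "Aspartate": (4.0, 32.38803, 4.20, 32.11310),
    "Glutamate": (4.4, 14.45421, 4.75, 13.97287),
    "Cysteine": (8.5, 89.15114, 8.40, 89.28861),
}

for name, (pka_ref, dg_gb, pka_calc, dg_adj) in table_4_1.items():
    predicted = abs(pka_shift_to_energy(pka_calc - pka_ref))
    tabulated = abs(dg_adj - dg_gb)
    print(f"{name}: predicted {predicted:.3f}, tabulated {tabulated:.3f} kcal/mol")
```

For these three single-site residues the predicted and tabulated magnitudes agree to within the rounding of the reported pKas. Histidine is left out of the sketch because its two coupled tautomeric states do not reduce to this one-line estimate.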


Table 4-1. Model compound pKa values and reference energies. Because only differences in reference energies are used in the MC calculation, one state is arbitrarily assigned a reference energy of 0 (the deprotonated states of AS4 and GL4, the protonated state of CYS, and the double-protonated state of HIP). The listed reference energies are calculated with respect to the arbitrary zero value. pKa,calc are pKas calculated with the proposed method using the GB reference energy ($\Delta G_{\mathrm{ref}}^{\mathrm{GB}}$). Adjusted reference energies for explicit solvent CpHMD are labeled as $\Delta G_{\mathrm{ref}}$. All energies are in kcal/mol.

Residue        Reference pKa   $\Delta G_{\mathrm{ref}}^{\mathrm{GB}}$   pKa,calc   $\Delta G_{\mathrm{ref}}$
Aspartate      4.0             32.38803      4.20       32.11310
Glutamate      4.4             14.45421      4.75       13.97287
Histidine-δ    6.5             -16.34790     6.55       -16.47559
Histidine-ε    7.1             -11.77701     7.19       -11.71159
Cysteine       8.5             89.15114      8.40       89.28861

The results from these simulations were used to adjust the original reference energies to reproduce the correct model compound pKas in explicit solvent. This adjustment can be calculated directly from the pKa shift relative to experiment when using the original reference energy. To calculate the required adjustment, the reference energy is broken into two components: a TI-based component, which is equal to the free energy difference between the two states, and a pKa-based component that offsets the energies of the protonation states to the value required to obtain the correct pKa for the model compound. This is shown in Eq. 4-4. A summary of the required changes is given in Table 4-1.

$$\Delta G_{\mathrm{ref}} = \Delta G_{\mathrm{TI}} + kT \ln(10)\, N\, \mathrm{p}K_{\mathrm{a,model}} \tag{4-4}$$

4.3.2 ACFCA

A pentapeptide with the sequence Ala-Cys-Phe-Cys-Ala (ACFCA) was solvated with a 15 Å buffer of TIP3P molecules around the solute in a truncated octahedron. The system was minimized using 100 steps of steepest descent minimization followed by 900 steps of conjugate gradient. The minimized structure was heated by varying the target temperature linearly from 50 K to 300 K over 200 ps at constant


pressure. The resulting structure was then simulated at 300 K at constant temperature and pressure to stabilize the system density and equilibrate the solvent distribution around the small peptide. Simulations at six different pH values (7.1, 8.1, 8.3, 8.7, 8.9, and 9.9) were performed beginning from the resulting 'equilibrated' structure. These pH values were chosen because the pKa of the cysteine model compound is 8.5, so the two cysteines of ACFCA were expected to titrate in this pH range. To demonstrate the effect that pH-REMD had on the titration of ACFCA, two sets of simulations were run, CpHMD with no exchanges and pH-REMD, with each replica being run for 2 ns. The relaxation dynamics following successful exchange attempts were run for 100 fs.

4.3.3 Proteins: HEWL and RNase A

Two different starting structures were selected from the PDB for both HEWL and RNase A. The structures solved in PDB codes 1AKI [147] and 4LYT [168] were used as starting structures for the HEWL calculations, while those from PDB codes 1KF5 [169] and 7RSA [170] were used for the RNase A calculations. All PDB files were prepared by removing all solvent and keeping only the first conformation present for each residue if more than one was present. All aspartate residues were renamed AS4, all glutamate residues were renamed GL4, and all histidine residues were renamed HIP in preparation for CpHMD and pH-REMD simulations. By default, aspartate and glutamate are in their deprotonated state while histidine is in its double-protonated state. All disulfide bonds were added manually in tleap and each structure was solvated with a 10 Å TIP3P water buffer surrounding the protein in a truncated octahedron.

To test the effect of ions in the explicit solvent titrations, a second set of systems was set up for each starting structure by adding several ions randomly distributed around the unit cell. I added 14 chloride ions and 6 sodium ions to the RNase A starting structures, and 15 chloride ions and 6 sodium ions to the HEWL starting structures, which neutralized


all systems in their initial protonation states. The added counter-ions resulted in salt concentrations ranging from 0.17 M to 0.18 M for all four simulations following the constant pressure equilibration.

All structures were minimized using 1000 steps of steepest descent minimization followed by 4000 steps of conjugate gradient, with 10 kcal/mol positional restraints applied to the backbone. The structures were then heated at constant volume, varying the target temperature linearly from 10 K to 300 K over 400 ps. The heated structures were then equilibrated for 2 ns at constant temperature and pressure. Following the setup stages of the simulations, each structure was simulated using pH-REMD for 20 ns, with 8 replicas spanning integer pH values from 1 to 8 to characterize the acidic-range titration behavior of the systems.

4.3.4 Simulation Details

All systems were parametrized using the Amber ff10 force field, which is equivalent to the Amber ff99SB force field for proteins. [23] The tleap program of the AmberTools 12 program suite was used to build the model compound and ACFCA molecules, to add hydrogen atoms to RNase A and HEWL, and to solvate each system. All simulations were performed using the sander module of a development version of Amber 12. [141]

Langevin dynamics was used in every simulation to maintain constant temperature with collision frequencies varying from 1 ps^-1 to 5 ps^-1, and the random seed was set from the computer clock to avoid synchronization artifacts. [149, 171] The Berendsen barostat was used to maintain constant pressure for the equilibration dynamics with a coupling constant of 1 ps^-1. All molecular dynamics, including the solvent relaxation dynamics, were run with a 2 fs time step, constraining bonds containing hydrogen using SHAKE. [16, 18] Replica exchange attempts between adjacent replicas were made every 200 fs for all pH-REMD simulations. Protonation state changes were attempted every 200 fs for all constant pH simulations.


Long-range electrostatic interactions were treated with the particle-mesh Ewald method [172, 173] using a direct-space and van der Waals cutoff of 8 Å. Defaults were used for the remaining Ewald parameters. The GB model proposed by Onufriev et al., specified by the parameter igb=2 in sander, [49] was used to evaluate the protonation state change attempts to be consistent with the original implementation in implicit solvent. [130]

4.4 Results and Discussion

Here I will analyze the performance of our proposed CpHMD and pH-REMD methods as well as ways to optimize their overall performance. I will start by discussing the behavior of the model compounds when the size of the unit cell and the length of the relaxation dynamics ($\tau_{\mathrm{rlx}}$) are varied. I will follow this discussion with a similar analysis on a slightly larger system, ACFCA, before discussing the application of our proposed method to real proteins.

4.4.1 Box Size Effects

To study the effect that the unit cell size has on titrations in our proposed method, I prepared three systems of the aspartate model compound with 10 Å, 15 Å, and 20 Å TIP3P solvent buffers surrounding it. Because protonation state sampling takes place in GB solvent without periodic boundary conditions, any effect of the box size on calculated pKas will arise from alterations of the structural ensembles induced by artifacts from the box size.

The calculated pKas of the three systems were 4.02 ± 0.07, 4.05 ± 0.08, and 4.12 ± 0.07 for the 10 Å, 15 Å, and 20 Å solvent buffer systems, respectively. To estimate the uncertainties I divided each simulation into 100 ps chunks and took the standard deviation of the set of 20 pKas calculated from those segments.
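The chunk-based uncertainty estimate described above can be sketched as follows. How the pKa of each 100 ps segment was obtained is not stated here, so this illustration makes a simplifying assumption: each segment yields a pKa from the protonated fraction at a single pH through the Henderson-Hasselbalch equation. The function names are hypothetical, and the sample standard deviation is used.

```python
import math
import statistics


def pka_from_fraction(ph, frac_protonated):
    """Henderson-Hasselbalch estimate: pKa = pH + log10([HA]/[A-])."""
    return ph + math.log10(frac_protonated / (1.0 - frac_protonated))


def block_pkas(protonated, ph, n_blocks=20):
    """Split a series of 0/1 protonation flags into equal blocks and
    return one pKa estimate per block.

    Assumes every block samples both states; a block that is entirely
    protonated or deprotonated has no finite estimate under this formula.
    """
    size = len(protonated) // n_blocks
    pkas = []
    for b in range(n_blocks):
        block = protonated[b * size:(b + 1) * size]
        frac = sum(block) / len(block)
        pkas.append(pka_from_fraction(ph, frac))
    return pkas


def pka_uncertainty(protonated, ph, n_blocks=20):
    """Standard deviation of the per-block pKa estimates."""
    return statistics.stdev(block_pkas(protonated, ph, n_blocks))
```

With a 2 ns trajectory and 100 ps blocks, `n_blocks=20` matches the 20 segments mentioned in the text.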


Figure 4-3. Radial distribution functions (RDFs) of solvent oxygen atoms (O) and hydrogen atoms (H) with different unit cell sizes. The shown measurements (10, 15, and 20 Å) represent the size of the solvent buffer surrounding the solute. RDF plots for three different pHs are shown, highlighting the pH dependence of the solvent structure around the carboxylate of the aspartate model compound and its invariance to box size.

To further demonstrate the insensitivity of pH-REMD titrations to box size, I plotted the solvent radial distribution functions (RDFs) around the center of mass of the carboxylate functional group in three different solution pH environments, shown in Fig. 4-3. The insensitivity of the pKa and solvent structure of the model compound to the box size provides strong evidence that no undue care is necessary when choosing the size of the solvent buffer for these types of simulations.

4.4.2 $\tau_{\mathrm{rlx}}$ Effects

An important approximation in the proposed method is that the protonation state sampling $\rho'$ from Eq. 4-3 can be replaced using an implicit solvent model followed by


relaxation MD to generate the relaxed solvent positions and momenta. The question then becomes how long this relaxation dynamics should be run. To address this, 2 ns of constant protonation molecular dynamics simulations were run on the model cysteine compound in both protonation states (protonated and deprotonated) after the same minimization and heating protocols were used as for the other model compound simulations. The protonation state was then swapped for the final structures of both simulations, and MD was performed while constraining the solute position for 20 ns, equivalent to the relaxation dynamics protocol in our proposed method.

The optimum value for $\tau_{\mathrm{rlx}}$ is the time after which the energy of the relaxation trajectory stabilizes and the simulation loses all memory of its initial configuration. To be truly equivalent to having been chosen at random, the final, relaxed solvent distribution must be completely uncorrelated from the initial distribution at the time the protonation state was changed. To probe the necessary time scales for these relaxation dynamics, the energy of each snapshot in the relaxation trajectory is plotted alongside the autocorrelation function of that energy in Fig. 4-4 to clearly demonstrate the 'appropriate' value of $\tau_{\mathrm{rlx}}$ for this model system.

I chose the cysteine model compound for this test for two reasons. First, the model compounds are fully solvent-exposed due to their small size, which results in a worst-case scenario in terms of the number of water molecules that must be reorganized during the relaxation dynamics. The optimum $\tau_{\mathrm{rlx}}$ value for model compounds is expected to be an upper bound on the values required for larger systems. Second, cysteine is the smallest and simplest of the titratable amino acids, eliminating potential complications from tautomeric states compared to aspartate, glutamate, and histidine.

The relaxation energies plotted in Fig. 4-4 begin to stabilize after 4 to 6 ps of relaxation dynamics, and the autocorrelation function indicates that the relaxation


energies are uncorrelated from the point of the protonation state change. However, because 4 ps of MD (corresponding to 2000 steps of dynamics with a 2 fs time step) adds dramatically to the cost of CpHMD simulations in explicit solvent, I explored the approximation of using a significantly smaller value for $\tau_{\mathrm{rlx}}$. Both the relaxation energies and autocorrelations drop very sharply at the start of the relaxation dynamics, so the majority of the benefit gained by relaxing the solvent is realized within the first few steps.

Figure 4-4. The relaxation of the protonated state starting from the deprotonated trajectory is shown in blue with its autocorrelation function shown in purple. The relaxation of the deprotonated state from an equilibrated snapshot from the protonated ensemble is shown in red with its autocorrelation function shown in green. Here, PR and DR stand for Protonated-Relaxation and Deprotonated-Relaxation, respectively.
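A normalized autocorrelation function of an energy time series, of the kind plotted in Fig. 4-4, can be computed as in the sketch below. The exact estimator used for the figure is not stated, so this is one common definition (mean-subtracted, normalized so the value at zero lag is 1); the function name is hypothetical.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation of a time series for lags 0..max_lag.

    The series mean is subtracted first, and each lag is normalized by
    the variance, so the result at lag 0 is 1. The series must not be
    constant, since a zero variance cannot be normalized.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag + 1):
        # Average the product of deviations separated by `lag` frames.
        cov = sum(dev[t] * dev[t + lag] for t in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf
```

Applied to the relaxation energies, the lag at which this function decays to near zero gives a rough estimate of how long the solvent retains memory of its initial arrangement.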


For example, the energies from the relaxation of the deprotonated structure in the protonated state drop from -7197 kcal/mol to -7377 kcal/mol during the first 200 fs. The average energy of the final 10 ns of that trajectory is -7490 kcal/mol. Likewise, the energies from the other relaxation dynamics drop from -7267 kcal/mol to -7465 kcal/mol over the first 200 fs, finally settling into an average of -7552 kcal/mol over the final 10 ns. In both cases, roughly 60% to 70% of the total relaxation energy (180 of 293 kcal/mol and 198 of 285 kcal/mol, respectively) was realized during the first 200 fs of relaxation dynamics. The autocorrelation function of the relaxation energy decays similarly, so the assumption that the relaxed solvent distribution is uncorrelated from its starting point is a reasonable approximation.

To validate the use of a shorter $\tau_{\mathrm{rlx}}$, I titrated the aspartate model compound using pH-REMD with five different values for $\tau_{\mathrm{rlx}}$: 10 fs, 40 fs, 100 fs, 200 fs, and 2 ps. The calculated pKas were 4.08 ± 0.02, shown in Fig. 4-5. Furthermore, comparing the solvent radial distribution functions of the different solvent relaxation times (Fig. 4-6) shows little dependence of the solvent distribution on the value of $\tau_{\mathrm{rlx}}$.

4.4.3 ACFCA: CpHMD vs. pH-REMD

The small peptide chain ACFCA, described in Sec. 4.3.2, was chosen as a test due to its small size and predictable titration behavior. The simplicity of the system makes it an ideal test: its small size mitigates the conformational sampling problem, and the simple titrating behavior of cysteine further simplifies protonation state sampling. Unlike aspartate and glutamate, which have four tautomeric states defined by anti- and syn-protonation on each of two carboxylate oxygens, and histidine, which has two tautomeric states on the imidazole, cysteine has only one protonated and one deprotonated state, presenting fewer degrees of freedom that must be exhaustively sampled.

Each cysteine is in a slightly different micro-environment due to the different charges of the N- and C-termini. Because Cys2 is typically closer to the N-terminus, it is expected to experience a negative pKa shift with respect to the model compound due to


the electrostatic influence of the positively-charged terminus. Cys4, on the other hand, is expected to experience a pKa shift in the opposite direction due to the electrostatic pressure of the negatively-charged C-terminus.

Figure 4-5. Computed pKas for the aspartate model compound using different relaxation times ($\tau_{\mathrm{rlx}}$).

I ran simulations at pH 7.1, 8.1, 8.3, 8.7, 8.9, and 9.9 to sufficiently characterize the titration behavior of both cysteine residues around their pKas. One set of replicas was run with pH-REMD while the other set was run using CpHMD (i.e., without attempting exchanges between the replicas). The titration curves for both sets of simulations, shown in Fig. 4-7, demonstrate the importance of using pH-REMD in constant pH simulations in explicit solvent. The pKas of Cys2 and Cys4 were 8.2 and 9.4, respectively. As expected, these pKas represent shifts of -0.2 pK units for Cys2 and +0.9 pK units


for Cys4 with respect to the model Cys compound. As a test, this compound was run using the original CpHMD implementation in implicit solvent [130] to ensure that we obtained the same results. Because the available phase space in simulations of ACFCA is so small due to the small size of the molecule, the sampled ensembles in implicit and explicit solvent are expected to be very similar. When run in implicit solvent, the two cysteine residues have almost identical pKas to those obtained by the simulations in explicit solvent: 8.1 and 9.4 for Cys2 and Cys4, respectively.

Figure 4-6. RDFs of water oxygen atoms (O) and hydrogen atoms (H) around the center of mass of the carboxylate group of the model aspartate molecule at different solution pHs.

Even for a simple system such as ACFCA, using pH-REMD on top of standard CpHMD simulations results in a drastic improvement in titration curve fit, a result of improved protonation state sampling. The residual sum of squares (RSS), a quantity that measures how well an equation fits a data set and whose equation is shown in Ch. 3, shows drastic improvement using pH-REMD. The RSS for Cys2 and Cys4 using


CpHMD was 9 × 10^-2 and 7 × 10^-3, respectively. For the pH-REMD simulations, on the other hand, the RSS was reduced by several orders of magnitude to 7 × 10^-5 and 9 × 10^-6 for Cys2 and Cys4, respectively.

Figure 4-7. Titration curves of Cys2 and Cys4 in the ACFCA pentapeptide. Results from CpHMD (no replica exchange attempts) and pH-REMD are shown in the plots on the left and right, respectively.

4.4.4 Hen Egg White Lysozyme

HEWL is a common benchmark for pKa calculations because it has been studied extensively both experimentally [136, 144, 145] and theoretically, [86, 88, 130, 135, 146, 161] and it has a large number of titratable residues, some with a marked pKa shift compared to the isolated model compound. The 1AKI and 4LYT crystal structures were prepared initially without any ions, resulting in unit cells with a net charge of +9 electrons when the carboxylate residues are negatively charged and the histidine residue is positively charged. The calculated pKas for all 10 residues that titrate in the acidic range are summarized in Table 4-2 for both starting structures.
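Calculated pKas and RSS values of the kind reported in this chapter come from fitting titration data to a titration curve. The sketch below assumes the Hill form f_d = 1/(1 + 10^(n (pKa - pH))) for the deprotonated fraction and, to stay dependency-free, fits it by linear regression on the Hill plot, log10(f_d/(1 - f_d)) = n (pH - pKa). Both the functional form and the linearized fit are assumptions made here for illustration; the fitting procedure actually used is described in Ch. 3 and may differ.

```python
import math


def hill_fraction(ph, pka, n=1.0):
    """Deprotonated fraction predicted by the assumed Hill equation."""
    return 1.0 / (1.0 + 10.0 ** (n * (pka - ph)))


def fit_hill(phs, fracs):
    """Estimate (pKa, n) by least squares on the linearized Hill plot.

    Every fraction must lie strictly between 0 and 1.
    """
    ys = [math.log10(f / (1.0 - f)) for f in fracs]
    m = len(phs)
    xbar = sum(phs) / m
    ybar = sum(ys) / m
    sxx = sum((x - xbar) ** 2 for x in phs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(phs, ys))
    n = sxy / sxx           # slope of the Hill plot
    pka = xbar - ybar / n   # x-intercept of the fitted line
    return pka, n


def residual_sum_of_squares(phs, fracs, pka, n):
    """Sum of squared differences between data and the fitted curve."""
    return sum((f - hill_fraction(ph, pka, n)) ** 2
               for ph, f in zip(phs, fracs))
```

A poorly sampled titration produces fractions that scatter around the fitted curve and therefore a larger RSS, which is the behavior contrasted between CpHMD and pH-REMD for ACFCA in Sec. 4.4.3.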


Table 4-2. Calculated pKas for acid-range titratable residues in HEWL using the proposed method for both starting structures (PDBs 1AKI and 4LYT) without ions. The root mean square error (RMSE) and mean unsigned error (MUE) with respect to the experimental values are shown in the last two rows. Experimental values are taken from Webb et al. [136]

Residue    PDB 1AKI    PDB 4LYT    Experiment [136]
Glu7       1.6         1.8         2.6
His15      7.1         6.5         5.5
Asp18      1.8         1.8         2.8
Glu35      5.0         4.9         6.1
Asp48      -0.2        -0.3        1.4
Asp52      -0.3        -1.2        3.6
Asp66      -1.8        -1.0        1.2
Asp87      0.5         0.5         2.2
Asp101     3.7         3.8         4.5
Asp119     0.0         0.3         3.5
RMSE       2.19        2.20
MUE        1.91        1.83

The pKas predicted here agree less well with experiment than our results presented in Ref. 88, due mainly to the poor treatment of aspartate residues 48, 52, 66, and 119. The disparity between the implicit and explicit solvent results probably stems from the enhanced conformational sampling attainable with implicit solvent simulations. Dynamical events occur much more slowly in explicit solvent simulations due to the friction and viscosity of the solvent. However, the conformations sampled in implicit solvent are frequently artifacts of the inaccuracies in the underlying solvent model [156, 157] that may hinder the performance of the CpHMD simulations. [133]

The difference in the conformational sampling ability of the proposed method and the original, implicit solvent-based method is pronounced enough that a simple comparison of the root mean squared deviation (RMSD) is a sufficient illustration. The RMSD of the trajectories in the current study (Fig. 4-8) is 2 to 3 times smaller than the RMSDs shown in Fig. 3-5 from Ch. 3, despite the extra 4 ns of production MD performed for each replica in explicit solvent. Furthermore, the dynamics displays none


of the regions of flexibility noted in implicit solvent that were correlated with the improved titration of aspartate 66. [88]

Figure 4-8. RMSD plots over 20 ns of pH-REMD simulation for HEWL at pH 2, 4, 6 and 8 with respect to the starting crystal structure 1AKI. The distributions are very similar for the starting structure 4LYT as well.

Ions

When all aspartate and glutamate residues are deprotonated and the histidine is protonated at both the δ and ε positions, HEWL has a net charge of +9 electrons. Even though the PME implementation in Amber applies a net neutralizing plasma for such systems, the lack of counterions in the unit cell may lead to unusual behavior by any of the 30 charged residues in the system.


Therefore, I added 21 ions (15 chloride and 6 sodium) to add ionic strength and to provide the ions necessary to neutralize the initial unit cell. Like the effects of the unit cell size and $\tau_{\mathrm{rlx}}$ value, ions will only affect the calculated pKas by modifying the sampled conformations, since the protonation state changes are performed using implicit solvent. The predicted pKas, shown in Table 4-3, show a marked improvement for several residues whose calculated pKa was too low without ions.

The simulations with explicit ions did not exhibit heightened sampling of large-scale conformational changes (the RMSD plots of these simulations are shown in Fig. 4-9); the improvement is instead a result of changes to the microenvironment around the relevant titratable residues significant enough to effect a noticeable change in the titrating behavior. The residue that experienced the largest pKa shift as a result of the added ions was Asp66. As I observed in our previous work, Asp66 is surrounded by proton donors, such as Arg68 and the hydroxyl groups of Thr69 and Ser60. [88] The arginine residue, carrying a net positive charge, is the strongest driving force favoring the deprotonated state. To compare how Arg68 may affect Asp66 differently when ions are present, I show in Fig. 4-10 that the distribution of Asp66-Arg68 distances is shifted to larger values when ions are present. Because Arg68 can occasionally interact with chloride ions in the bulk solvent instead of Asp66 when they are present, Asp66 is more likely to accept proposed protonation moves when these ions are present.

It is surprising that the presence of ions can have such a large impact on predicted pKas without inducing global conformation changes. That the ions are not included in the GB-based protonation state changes only increases the peculiarity of this result. Explicit ions can modify the local environment around titratable residues enough to induce large pKa shifts, making them important to include in the proposed method.

4.4.5 Ribonuclease A

Like HEWL, RNase A is a common benchmark for constant pH studies due to its large number of titrating residues. Furthermore, the currently proposed mechanism


requires one catalytic histidine to be a proton donor (general acid) and another to be a proton acceptor (general base): His119 and His12, respectively. Because a specific combination of protonation states is necessary for catalysis, the proposed method is a useful tool for probing the pH-dependence of RNase A.

Table 4-3. Calculated pKas for acid-range titratable residues in HEWL using the proposed method for both starting structures (PDBs 1AKI and 4LYT) with 21 ions. The root mean square error (RMSE) and mean unsigned error (MUE) with respect to the experimental values are shown in the last two rows. Experimental values are taken from Webb et al. [136]

Residue    PDB 1AKI    PDB 4LYT    Experiment [136]
Glu7       1.9         1.8         2.6
His15      6.4         6.3         5.5
Asp18      1.9         1.7         2.8
Glu35      4.6         4.9         6.1
Asp48      -1.4        0.8         1.4
Asp52      0.5         0.1         3.6
Asp66      0.2         -0.4        1.2
Asp87      0.4         0.5         2.2
Asp101     3.5         3.5         4.5
Asp119     0.6         -0.3        3.5
RMSE       1.89        1.91
MUE        1.68        1.56

The predicted pKas for RNase A in an acidic-range titration, summarized in Table 4-4, are in better agreement with experiment than those from HEWL. This is expected, however, since the average magnitude of the pKa shifts with respect to the model compounds is smaller in RNase A. While most of the residues have a calculated pKa close to the experimental value, several are trapped in environments that resist changing their protonation state across the entire range of simulated pHs. Glu2 is adjacent to the N-terminal lysine residue with a +2 charge and interacts closely with the positively-charged lysine 7 and arginine 10 residues much of the time, pushing the predicted pKa very low. If the timescale of the simulation is insufficient to escape this local conformational basin, the predicted pKa of Glu2 will be unphysically low, as seen in Table 4-4 for the 1KF5 structure.


Figure 4-9. RMSD plots over 20 ns of pH-REMD simulation with explicit counterions for HEWL at pH 2, 4, 6 and 8 with respect to the starting crystal structure 1AKI.

Similar traps are seen around Asp14, which is surrounded by hydrogen bond donors. The His48 residue, on the other hand, interacts closely with numerous backbone carbonyl atoms in a configuration that resists deprotonating either Nδ or Nε in the 1KF5 starting structure.

As with the HEWL simulations, I ran a second set of 20 ns simulations in which I added 14 chloride and 6 sodium ions to generate a net ion concentration around 0.18 M. More chloride was added because the net charge of the initial protonation states was +8 electrons. The predicted pKas for the simulations with explicit counterions showed marked improvement in most titratable residues that proved problematic in


the simulations without the ions, following the trend seen in the HEWL calculations. The full summary of calculated pKas is shown in Table 4-5.

Figure 4-10. Distance distribution functions calculated from HEWL simulations begun with crystal structure 1AKI for all snapshots in the ensemble at pH 1.0. The probability distributions were calculated using 10,000 snapshots and smoothed using a Gaussian kernel density estimate with a bandwidth of 0.1.

4.5 Conclusion

I have extended the constant pH molecular dynamics method developed by Mongan et al. [130] so that the dynamics can be run in explicit solvent. I tested a wide range of parameters in our proposed method for their effect on the conformational and protonation state sampling of small test systems. Because these test systems are small and their titratable sites are completely solvent-exposed, they likely represent the highest level of sensitivity to these various parameters.


Table 4-4. Calculated pKas for RNase A using simulations begun from crystal structures 1KF5 and 7RSA. Experimental values shown are taken from Ref. 174. Root mean squared error (RMSE) and mean unsigned error (MUE) do not include His48 for the 1KF5 structure or Asp14 for the 7RSA structure.

Residue    PDB 1KF5    PDB 7RSA    Experiment [174]
Glu2       -5.7        -0.2        2.5
Glu9       3.6         3.6         3.9
His12      6.1         5.8         6.0
Asp14      -1.2        -8.8        1.8
Asp38      2.1         2.3         2.1
His48      15.8                    6.1
Glu49      3.4         3.3         4.3
Asp53      3.6         3.7         3.7
Asp83      1.6         1.8         3.3
Glu86      4.3         4.3         4.0
His105     7.3         7.8         6.5
Glu111     3.6         3.4         3.8
His119     6.0         6.0         6.5
Asp121     0.6         -0.8        3.0
RMSE*      1.83        2.20
MUE*       1.18        1.26

Table 4-5. Calculated pKas for RNase A simulations run with explicit counterions present. All calculated pKas are included in the calculated root mean squared error (RMSE) and mean unsigned error (MUE).

Residue    PDB 1KF5    PDB 7RSA    Experiment [174]
Glu2       0.6         -1.8        2.5
Glu9       3.7         3.6         3.9
His12      6.2         6.9         6.0
Asp14      -1.4        -0.3        1.8
Asp38      1.8         2.3         2.1
His48      7.9         6.7         6.1
Glu49      4.2         5.2         4.3
Asp53      3.3         2.5         3.7
Asp83      1.5         1.3         3.3
Glu86      3.8         3.8         4.0
His105     7.1         6.9         6.5
Glu111     3.7         3.6         3.8
His119     5.7         5.7         6.5
Asp121     -0.2        0.0         3.0
RMSE       1.53        1.70
MUE        1.07        1.20
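The summary statistics in these tables follow the usual definitions of RMSE and MUE. As a check, the sketch below applies them to the PDB 1KF5 column of Table 4-5, for which all residues are included; the function and variable names are chosen here for illustration.

```python
import math


def rmse(calc, expt):
    """Root mean squared error between calculated and experimental pKas."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))


def mue(calc, expt):
    """Mean unsigned error between calculated and experimental pKas."""
    return sum(abs(c - e) for c, e in zip(calc, expt)) / len(calc)


# Table 4-5, in row order from Glu2 to Asp121.
expt = [2.5, 3.9, 6.0, 1.8, 2.1, 6.1, 4.3, 3.7, 3.3, 4.0, 6.5, 3.8, 6.5, 3.0]
calc_1kf5 = [0.6, 3.7, 6.2, -1.4, 1.8, 7.9, 4.2, 3.3, 1.5, 3.8, 7.1, 3.7, 5.7, -0.2]

print(f"RMSE = {rmse(calc_1kf5, expt):.2f}, MUE = {mue(calc_1kf5, expt):.2f}")
```

The results land close to the tabulated 1.53 and 1.07. Small differences are expected because the table entries are rounded to one decimal place, whereas the reported statistics were presumably computed from unrounded pKas.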


In particular, I found that the box size of the unit cell had no discernible effect on the titration behavior of the aspartate model compound, given cell sizes that ranged from 20 Å in diameter (one of the smallest sizes permissible when using the minimum image convention with an 8 Å cutoff) to 40 Å in diameter.

Another key aspect of the current method is the necessity to relax the solvent around any new protonation state selected by the MC moves carried out in GB. By analyzing the decay of the potential energy in the solvent relaxation dynamics, I determined that 4 ps of MD was sufficient to stabilize the energy of the solvent distributions and generate relaxed solvent conformations that are uncorrelated from the initial arrangements. However, given the expense of such a long relaxation period, I investigated using fewer relaxation steps to increase the simulation efficiency and found that shorter times down to 0.2 ps had no measurable effect on the calculated pKa and very little effect on the solvent distribution around the model cysteine compound.

Further tests on a small pentapeptide test system with two titratable sites (ACFCA) showed the importance of using pH-REMD over conventional CpHMD with the proposed method. While I showed in Ch. 3 that the enhanced protonation state sampling of pH-REMD results in smoother titration curves for complex proteins in implicit solvent, [88] even the simplest systems in explicit solvent require pH-REMD to obtain a smooth titration curve.

I tested the proposed method on two protein systems, hen egg white lysozyme (HEWL) and ribonuclease A (RNase A). While calculated pKas were in good agreement with experiment for numerous titratable residues, others appeared stuck in conformational traps resistant to changing their protonation states for the duration of the 20 ns simulation. Many of the residues whose calculated pKas differed by more than 1 to 2 pKa units from experiment were surrounded by charged residues that strongly favored a specific protonation state. Unlike our previous study on HEWL, [88] the large conformational

changes seen in implicit solvent occur on a much longer timescale in explicit solvent due to the friction and viscosity of the water molecules.

The larger the pKa shift a titratable residue experiences inside the protein environment compared to the model compound, the less that environment resembles bulk solution. It is for these residues, therefore, that accurate, extensive conformational sampling is required to reproduce experimental pKa measurements. Given the limited mobility of HEWL and RNase A in our simulations, it is therefore not surprising that the most problematic residues were those whose experimental pKas were several pK units lower than their model compounds.

When I added explicit ions to the simulation cell, the calculated pKas of the most problematic residues shifted toward their experimental values (in some cases by more than 1 full pKa unit) despite the limited sampling of global conformational changes on the 20 ns timescale. Therefore, while it is important to include explicit ions to provide a more accurate microenvironment around the titratable residues, the method would probably benefit strongly from attempts to improve conformational sampling, either by longer simulations or some type of enhanced sampling technique. For example, accelerated MD was used in conjunction with the original CpHMD implementation in Amber with promising results.[135]

To summarize, our proposed extension to Amber's CpHMD method allows dynamics to be carried out at constant pH even for systems that cannot yield sensible results when treated with an implicit solvent model, such as DNA and ribozymes. As an example, I tested GB simulations of the hepatitis delta virus (HDV) ribozyme (where nucleobases are thought to act as general acids and bases), and even after careful preparation, the secondary and tertiary structures began breaking down almost immediately and had completely fallen apart within 5 ns. While it may seem that such poor behavior in the MD simulations would preclude GB from being effective for sampling protonation states, the protonation state sampling benefits from better cancellation of errors. The

MC protonation state move is evaluated based on a difference of energy differences: the energy difference between the two charge states from the model compound is subtracted from the energy difference between the two charge states in the biomolecule. Many of the errors inherent to GB should cancel after the second difference is taken so that sensible results may be extracted from these simulations.

In future work I will explore the use of enhanced sampling techniques in conjunction with pH-REMD in an attempt to improve the efficiency of the conformational sampling in explicit solvent, as well as apply our method to systems requiring an explicit solvent representation, such as HDV.
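The "difference of energy differences" used to evaluate an MC protonation state move can be sketched in a few lines of Python. This is only an illustration and not Amber's implementation: it assumes the commonly used constant-pH form in which the transition free energy is k_B T ln(10) (pH - pKa,ref) plus the biomolecule energy difference minus the model compound energy difference, it treats the proposed move as a protonation, and all function names are hypothetical. Energies are taken to be in kcal/mol.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def protonation_move_dg(ph, pka_ref, de_biomolecule, de_model, temperature=300.0):
    """Transition free energy (kcal/mol) of a proposed protonation move.

    de_biomolecule: energy difference between the two charge states
                    evaluated in the biomolecule.
    de_model:       the same energy difference evaluated for the model
                    compound (assumed to be precomputed).
    """
    ph_term = KB * temperature * math.log(10.0) * (ph - pka_ref)
    # Second difference: errors common to both energy differences cancel here.
    return ph_term + (de_biomolecule - de_model)


def accept_move(delta_g, temperature=300.0, rng=random.random):
    """Metropolis test for the proposed protonation state change."""
    if delta_g <= 0.0:
        return True
    return rng() < math.exp(-delta_g / (KB * temperature))
```

When the residue's environment matches that of the model compound and the pH equals the reference pKa, the transition free energy is zero and the move is always accepted, which is the behavior expected for a model compound at its pKa.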

CHAPTER 5
REMD: GPU ACCELERATION AND EXCHANGES IN MULTIPLE DIMENSIONS

This chapter contains a description of my work implementing replica exchange molecular dynamics (REMD) in the pmemd program of the Amber program suite.[141] The first sections describe the general theory of the state exchanges supported by Amber, followed by details of their implementation. I'll then finish with a description of my design of multiple-dimension REMD in Amber that I implemented in both the sander and pmemd programs.

5.1 Temperature REMD

The most common variant of REMD simulations involves assigning replicas with different temperatures (T-REMD)[78] between which the Monte Carlo-based replica exchange attempts occur. The exchange success probability (calculated in a way that satisfies detailed balance to preserve valid thermodynamics) is solved for the proposed change of two replicas swapping temperatures, as shown in Eq. 5-1. When 2N replicas are present, N independent exchange attempts can be made simultaneously between different pairs of replicas. If no replica is involved in multiple exchange attempts, these moves can be evaluated independently. While this may not be the most efficient way to perform replica exchange attempts, it is the most common approach due to its simplicity and efficiency.

To calculate the exchange probability in T-REMD exchange attempts, we start with the detailed balance equation (Eq. 1) in which replicas m and n have temperatures $T_m$ and $T_n$, respectively, in our initial state i. The temperatures swap in our proposed state such that replicas m and n have temperatures $T_n$ and $T_m$, respectively. Because the potential energy function of each replica is the same (only the temperature differs between replicas), the probability of a replica having a specific temperature is directly proportional to the Boltzmann factor (in the canonical ensemble). The derivation of the exchange probability equation in T-REMD simulations is shown in Eq. 5-1.

\[
\begin{aligned}
P_i \, \pi_{i \to j} &= P_j \, \pi_{j \to i} \\
\frac{\exp[-\beta_m E_m] \exp[-\beta_n E_n]}{Q_m Q_n} \, \pi_{i \to j} &= \frac{\exp[-\beta_n E_m] \exp[-\beta_m E_n]}{Q_n Q_m} \, \pi_{j \to i} \\
\frac{\pi_{i \to j}}{\pi_{j \to i}} &= \min\left\{1, \exp[(\beta_n - \beta_m)(E_n - E_m)]\right\}
\end{aligned}
\tag{5-1}
\]

where $\beta_m$ is $1/k_B T_m$ for replica m and $E_m$ is the potential energy of the structure in replica m. Because the temperature of the system uniquely defines its kinetic energy, the potential energy can be used in lieu of the total energy in Eq. 5-1 as long as the total temperature remains consistent after the exchange attempt completes. Therefore, the momenta of replica m are typically scaled by $\sqrt{T_n/T_m}$ after successfully exchanging with replica n.[78] By scaling the velocities in this way, snapshots following a successful exchange attempt are immediately 'equilibrated' members of the new temperature's ensemble, thereby eliminating the need to relax the structure to its 'new' temperature. This allows REMD simulations to be carried out more efficiently by permitting exchange attempts very frequently.[142 143]

An important consideration for T-REMD simulations is how many temperature replicas you should use as well as what temperatures those replicas should have. As the temperature of a system increases, the number of low-energy structures that are sampled during the simulation decreases. In fact, at infinite temperatures, MD is effectively equivalent to random sampling, whose consequences were illustrated in Fig. 1-1. The temperature ladder (i.e., the selection of temperatures at which to run each replica) should be chosen so as to optimize the simulation efficiency. If the temperature difference between adjacent replicas is too great, then the average potential energy difference between adjacent replicas will be large and the exchange probability in Eq. 5-1 will be very small. As a result, the low temperature ensembles will not benefit from the enhanced sampling achievable at the higher temperatures. On the other hand, if the

temperature difference between adjacent replicas is too small, then computational effort will be wasted by simulating unnecessary replicas that do not enhance sampling from the generalized ensemble.

By analyzing Eq. 5-1, it is clear that in order to have a high exchange acceptance probability, either the temperature difference or the potential energy difference between exchanging replicas must be small; in the extreme case, if a higher temperature replica has a conformation whose potential energy is less than or equal to that of the lower-temperature replica, that exchange attempt is always accepted. By plotting the potential energy distributions obtained from a short simulation at each temperature, the exchange rate between any two replicas can be estimated based on the degree by which their potential energy distributions overlap, shown in Fig. 5-1. A good choice of temperatures for each replica can be made a priori based simply on the number of degrees of freedom present in the system.[175]

One challenge with T-REMD is its scalability for large systems. It is well-known that relative thermodynamic fluctuations scale as $1/\sqrt{N}$ in statistical ensembles where N is the total particle count. Therefore, the larger a system gets, the narrower its potential energy distribution becomes. Consequently, as the potential energy distributions narrow, replicas must be spaced closer and closer together to achieve sufficient mixing along the temperature-space parameter. For this reason, T-REMD simulations on systems that are explicitly solvated are rare. While some approaches, like the one proposed by Okur et al., use a hybrid solvation scheme whereby exchange attempts are carried out in implicit solvent, the first two solvation layers are often represented poorly by implicit solvent, requiring their inclusion even in the hybrid approach.[176]

Furthermore, the snapshots generated at higher temperatures in the generalized ensemble are typically discarded from analyses for two reasons. First, we are typically interested in the thermodynamics of room temperature, so the high-temperature dynamics are not of general interest. Second, our force fields are parametrized for

use at temperatures near 300 K, and higher temperatures may break the applicability of harmonic functions for several bonded potentials. While the high-temperature data may be reweighted for inclusion in low-temperature ensembles,[177] higher temperature replicas contribute increasingly little information to the temperatures of interest.

Figure 5-1. Potential energy distributions of TrpCage (a 20-residue peptide) at various temperatures in a T-REMD simulation.

5.2 Hamiltonian REMD

Another common variant of REMD simulations involves swapping Hamiltonians between replicas (H-REMD). Because the nature of the exchange in H-REMD simulations is fundamentally different from those in T-REMD, Eq. 5-1 cannot be used to calculate the exchange probability for H-REMD simulations. The proper exchange probability for H-REMD simulations, generalized for running replicas at different temperatures, is derived in Eq. 5-2. Eq. 5-3 is the special case of Eq. 5-2 when the temperatures of exchanging replicas are the same. The easiest and most general way of implementing

H-REMD is to swap coordinates between exchanging replicas. This approach, as implemented in Amber, can be used for umbrella sampling REMD,[79 80] accelerated REMD with different boost parameters,[82 83] and alchemical changes between two end states.[85] As a result, Eq. 5-2 is derived subject to exchanging only coordinates.

\[
\begin{aligned}
P_i \, \pi_{i \to j} &= P_j \, \pi_{j \to i} \\
\frac{\exp[-\beta_m H_m(\vec{x}_m)] \exp[-\beta_n H_n(\vec{x}_n)]}{Q_m Q_n} \, \pi_{i \to j} &= \frac{\exp[-\beta_m H_m(\vec{x}_n)] \exp[-\beta_n H_n(\vec{x}_m)]}{Q_n Q_m} \, \pi_{j \to i} \\
\frac{\pi_{i \to j}}{\pi_{j \to i}} &= \min\left\{1, \exp\left[-\beta_m \left(H_m(\vec{x}_n) - H_m(\vec{x}_m)\right) - \beta_n \left(H_n(\vec{x}_m) - H_n(\vec{x}_n)\right)\right]\right\}
\end{aligned}
\tag{5-2}
\]

\[
\frac{\pi_{i \to j}}{\pi_{j \to i}} = \min\left\{1, \exp\left[-\beta \left(H_m(\vec{x}_n) - H_m(\vec{x}_m) + H_n(\vec{x}_m) - H_n(\vec{x}_n)\right)\right]\right\}
\tag{5-3}
\]

Looking at Eqs. 5-2 and 5-3, it is readily apparent that exchange attempts in H-REMD simulations are far more expensive than exchange attempts in T-REMD (Eq. 5-1) or pH-REMD (Eq. 3) simulations. To calculate the probability of accepting an exchange in H-REMD simulations, each replica must calculate the potential energy of the coordinates of its exchange partner. T-REMD and pH-REMD exchange probabilities, on the other hand, are calculated via a single exponential of quantities known before the exchange attempt occurs.

When performing replica exchange on an umbrella coordinate, however, the exchange attempt can be modified to significantly reduce its cost. Since the underlying Hamiltonian is the same for each replica, the energy differences $H_m(\vec{x}_n) - H_m(\vec{x}_m)$ are equal to the difference in their umbrella potentials (Eq. 2), which can be calculated very rapidly. This approach reduces the cost of the exchange attempts in two ways. First, the umbrella potentials can be swapped between adjacent replicas rather than the coordinates and momenta, thereby significantly reducing the communication overhead and eliminating the need to reconstruct a new pairlist immediately. Second, computing

the potential due to an umbrella restraint requires a small number of geometric measurements, which is negligible compared to evaluating the energy of the entire system (including the restraint potential).

Despite the apparently high cost of evaluating Eq. 5, attempting exchanges every 100 MD steps incurs, at most, a 1% performance hit due to performing one extra energy evaluation every 100 steps (each of which requires a full force evaluation for standard dynamics). Therefore, there has not been enough incentive for writing an optimized exchange routine specifically for umbrella sampling simulations in Amber. Such an exchange routine would be useful in future studies if Gibbs' sampling exchange attempts were implemented,[138] or in situations where swapping only an umbrella potential simplifies calculating Eq. 5.

Replica Exchange Free Energy Perturbation. Here I will refocus on free energy perturbation (Eq. 2) and its relationship with the H-REMD exchange probability shown in Eq. 5. By comparing these two equations, we see that the energy differences required in Eq. 2 are calculated every time the exchange probability is calculated in Eq. 5! Therefore, the term $\langle \exp(-\beta (E_B - E_A)) \rangle$ can be accumulated in both the forward and reverse directions during the course of the H-REMD simulation. This approach of computing FEP-based energy differences between two states during a H-REMD simulation is referred to as Replica Exchange Free Energy Perturbation (REFEP).[84 85]

5.3 Multi-Dimensional REMD

As the architecture of modern computers continues its push into massive parallelization, highly scalable techniques such as REMD become increasingly cost-efficient methods in the field of computational chemistry. While we have seen that REMD simulations, in general, enhance sampling by expanding our original ensemble through state space (e.g., temperature space, Hamiltonian space, etc.), different variants of REMD

bestow different advantages on the simulation. For instance, T-REMD enhances conformational sampling by flattening the free energy surface, pH-REMD enhances sampling by allowing simulations to dodge free energy barriers through pH-space, and H-REMD enhances conformational sampling by coupling different energy functions.

As access to large numbers of processing cores increases, it becomes feasible to combine multiple types of replica exchanges into a single, super-expanded ensemble. In this new, larger ensemble, replicas are defined by a series of state parameters, such as a specific temperature, Hamiltonian, umbrella potential, or solution pH. Exchange attempts between replicas must now take into account changes in multiple state parameters, which may lead to complex equations for the exchange probability. To simplify the exchange process, the replicas can be separated into different dimensions in which only a single state variable changes along that dimension. By adopting this approach, the exchange routines described in previous chapters and sections can be reused in this new, multi-dimensional REMD method.

To visualize which exchanges are performed, consider a 2-dimensional square matrix in which the rows and columns represent two different state parameters. In single rows or columns, only a single state parameter changes, so the exchange probability equations that have already been derived apply to these exchange attempts. Fig. 5-2 displays the arrangement of replicas and the allowed exchange attempts in a simple diagram. While these ideas can be trivially extended to an arbitrary number of dimensions, the number of replicas required increases exponentially with each additional dimension.

5.4 Implementation

In this section, I will describe how REMD is implemented in Amber, with focus paid to how exchange attempts are carried out as well as the programmatic details of how information is traded between exchanging replicas.
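Before turning to those details, the two ideas introduced so far (the T-REMD acceptance criterion and the restriction of exchanges to a single dimension of the replica grid) can be illustrated with a short Python sketch. It is not taken from the Amber source: the function names, the (row, column) labelling of replicas, and the `offset` argument used to choose which neighbors are paired are all assumptions made for illustration, and energies are assumed to be in kcal/mol.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def tremd_acceptance(temp_m, temp_n, energy_m, energy_n):
    """Acceptance probability for replicas m and n swapping temperatures.

    Implements min{1, exp[(beta_n - beta_m)(E_n - E_m)]}.
    """
    beta_m = 1.0 / (KB * temp_m)
    beta_n = 1.0 / (KB * temp_n)
    exponent = (beta_n - beta_m) * (energy_n - energy_m)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def neighbor_pairs(n_rows, n_cols, dimension, offset=0):
    """List nearest-neighbor exchange pairs along one dimension of a 2-D grid.

    Replicas are labelled (row, col). With dimension=0 only the row index
    differs within a pair; with dimension=1 only the column index differs.
    offset=0 pairs indices (0,1), (2,3), ...; offset=1 pairs (1,2), (3,4), ...
    """
    pairs = []
    if dimension == 0:
        for col in range(n_cols):
            for row in range(offset, n_rows - 1, 2):
                pairs.append(((row, col), (row + 1, col)))
    else:
        for row in range(n_rows):
            for col in range(offset, n_cols - 1, 2):
                pairs.append(((row, col), (row, col + 1)))
    return pairs
```

Because every pair returned by `neighbor_pairs` differs in exactly one index, a single-parameter criterion such as `tremd_acceptance` can be applied unchanged to each pair in the dimension that represents that parameter.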

Figure 5-2. Schematic showing exchange attempts in multi-dimensional REMD simulations. Exchange attempts are indicated by the colored arrows, where red arrows indicate exchange attempts between the j state parameters and blue arrows indicate exchange attempts between the i state parameters.

5.4.1 Exchange Attempts

Specific details of how and when exchanges are attempted between replicas are very important not only to the efficiency of our overall simulations, but also to their theoretical rigor. As I have already mentioned, exchange attempts are restricted to a single pair of replicas in which only a single state parameter differs between them. The question of which replicas attempt to exchange information also has a strong impact on how quickly observable properties converge. The easiest and most naïve approach is to choose a single partner and attempt an exchange. To maximize the likelihood that the

exchange attempt is successful, exchanges are attempted between nearest-neighbors in the state parameter that is being swapped. Due to its simplicity, this is the approach that was implemented in Amber by Cheng et al.[178] Recent evidence suggests, however, that such an approach limits sampling in the state space coordinate.[138] Sampling along the state space coordinate can be enhanced by employing ideas from Gibbs' sampling,[138] or simply increasing the frequency of exchange attempts.[142 143]

Another important consideration in REMD simulations is when to suspend the MD in each dimension and attempt to exchange information. Strict adherence to the condition of detailed balance and the principle of reversibility in the resulting chain of states requires these exchange attempts be done stochastically.[138] However, while deterministic exchange attempts violate detailed balance, they satisfy the less restrictive condition of general balance, so the thermodynamic rigor of such an approach is preserved.[138] Amber employs a deterministic, synchronous approach to deciding when exchange attempts should be performed by attempting exchanges between adjacent replicas every n steps, where n is a tunable input parameter.

Also important is the nature of the exchange itself. The two approaches currently used in Amber (exchanging state parameters or exchanging coordinates) are described below.

Exchanging State Parameters. The most efficient way to carry out replica exchanges is to swap state parameters, an approach used in Amber for both T-REMD and pH-REMD. In this case, replicas typically differ by a term that modifies a potential energy function that is otherwise the same for each replica. In this instance, simulations are subject to a different thermodynamic constraint after exchanges are successful. Following successful exchanges, the position of each replica in the ordered list of state parameters changes. As a result, the nearest neighbors between whom exchanges are attempted change after each exchange attempt. Prior to each exchange attempt, each

replica must figure out where every replica resides in state space so they know how to carry out exchanges.

When exchanging state parameters, replicas typically need to exchange a minimal amount of information: their state parameter and a related conjugate property. In the case of T-REMD, replicas exchange temperatures and potential energies, and individual replicas adopt different temperatures as a function of time. For pH-REMD, the solution pH and total number of 'active' titratable protons are swapped between adjacent replicas.

When an exchange attempt can be completed simply by swapping states, the resulting output files from the simulations follow the course of a single trajectory as it passes through both phase space and state space. As a result, the trajectory file must be modified so that the state parameter of each frame can be identified. This is necessary for reconstructing the sub-ensemble of interest (e.g., the ensemble at 300 K, or pH 7). While this approach adds the complexity of the bookkeeping required to post-process the data, the communication required scales as O(1) with system size, improving the scalability of these REMD simulations.[88]

Because the cost of exchange attempts in this family of REMD methods is negligible, there is no practical limit to the frequency with which replicas attempt to exchange, allowing us to take advantage of the faster convergence accessible via rapid exchange attempts[142 143] or Gibbs' sampling.[138]

Exchanging Coordinates and Momenta. The alternative to swapping state parameters between replicas is to swap coordinates and their conjugate momenta, which is logically equivalent to swapping full potential energy functions. It is significantly simpler and requires far less communication between exchanging replicas to be fully general than swapping the full potential energy function (which potentially includes particle charges, masses, pairwise Lennard-Jones parameters, restraints, etc.). It is for the added simplicity and reduced communication overhead that H-REMD is

implemented in Amber by swapping coordinates and velocities (scaling the velocities if exchanging pairs have different temperatures).[85] Adding to the computational expense, however, is the need to either recompute or exchange the full pairlist of each replica. Either choice is quite expensive since the pairlist is a very large array that requires evaluating all pairwise distances in the system to build. Unlike approaches that exchange state parameters, the cost of exchange attempts that require coordinate exchanges and extra energy evaluations (and multiple additional pairlist builds) imposes a very real upper limit on the practical efficiency of applying rapid exchange attempts or Gibbs' sampling ideas to these simulations. The most efficient way of performing REMD using umbrella potentials would be to swap the umbrella potentials (essentially a state parameter) and track a replica's trajectory through umbrella space.

5.4.2 Message Passing: Data Exchange in REMD Simulations

While REMD simulations can be carried out 'in serial' by simulating chunks of each replica sequentially by a single process, such an approach defeats the purpose of proposing REMD simulations as a scalable protocol for enhanced sampling. REMD is most efficient when each replica can be simulated simultaneously using different processes, or threads. The main simulation engines in Amber use the Message Passing Interface (MPI) to enable distributed memory parallelization (i.e., each working thread contains its own memory that is inaccessible by other threads). MPI (described in more detail in Appendix C) is ideally suited for large-scale parallelization since it allows workers to be spread across multiple processing cores that do not share a common memory bank. The most powerful supercomputers in the world that we typically use to carry out our simulations are so-called distributed supercomputers since they are constructed from many individual computers with dedicated memory that are networked together.

MPI enables parallelism by allowing groups of threads to exchange information by sending and receiving data through a series of predefined functions and subroutines (typically called an application programming interface, or API). Data, or messages, can be sent and received between two threads in an MPI program that are grouped together in the same communicator. Because communicators provide a simple and efficient way of programmatically separating threads into different groups, we take advantage of this feature when organizing the workload in MPI programs.

Intra-replica communication (which allows a single replica to be run using multiple processors) is handled by a dedicated replica communicator. An arbitrarily designated master thread of each replica is assigned to separate communicators for communicating all data pertinent to carrying out replica exchange attempts. In typical REMD simulations involving only a single state parameter, the REMD communicator is simply a communicator that links all replica masters.

In multi-dimensional REMD, however, exchanges are only permitted between replicas that differ in only one state parameter. Therefore, communicators are defined between only those replicas between which exchanges are permitted. Using Fig. 5-2 as a guide, communicators are defined between the masters of the replicas in a single row or column. These communicators have to be set up and destroyed after each exchange attempt because successful exchange attempts in a dimension that implements state parameter swaps will change the REMD communicator that the replica belongs to in the other dimensions. This is illustrated in Fig. 5-3.
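One way to picture the communicator bookkeeping described above is the following Python sketch. It is an assumption-laden illustration rather than the Amber implementation (which is not written in Python): the function names are hypothetical, each replica master is assumed to know its current index in every state-parameter table, and `master_comm` is assumed to be an mpi4py communicator linking all replica masters. The only MPI call used is the communicator's `Split(color, key)` method.

```python
def dimension_color_and_key(state_indices, table_sizes, dimension):
    """Return (color, key) for grouping replica masters along one dimension.

    state_indices: this replica's current index in each state-parameter table.
    table_sizes:   number of entries in each state-parameter table.
    Replicas sharing every index except the one for `dimension` get the same
    color (same communicator); the key orders them by that remaining index.
    """
    color = 0
    for d, (index, size) in enumerate(zip(state_indices, table_sizes)):
        if d != dimension:
            color = color * size + index
    return color, state_indices[dimension]


def build_dimension_comm(master_comm, state_indices, table_sizes, dimension):
    """Create the REMD communicator for one dimension.

    master_comm is assumed to expose Split(color, key), as mpi4py
    communicators do. The result must be rebuilt whenever state-parameter
    swaps in another dimension change state_indices.
    """
    color, key = dimension_color_and_key(state_indices, table_sizes, dimension)
    return master_comm.Split(color, key)
```

In a sketch like this, the communicator returned for one exchange attempt would be released (for example with its `Free()` method in mpi4py) and rebuilt from the updated indices before the next attempt in that dimension.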

Figure 5-3. Communicator arrangement in multi-dimensional REMD simulations at multiple exchange steps following some successful state parameter exchanges. The large numbers in the background are the (unchanging) thread numbers in the communicator linking the 'master' threads of each replica. The blue and red numbers are the indexes in the first and second state parameter tables, respectively. Every cell with the same background color is a member of the same REMD communicator.

CHAPTER 6
FLEXIBLE TOOLS FOR AMBER SIMULATIONS

In this chapter, I will describe the motivation behind creating two tools to aid users in carrying out biomolecular simulations with the Amber programming package as well as some details regarding their functionality and implementation. During the course of my graduate studies, I wrote several scripts and programs to aid in my work, several of which I polished and released with the Amber suite of programs. The two I will describe in this chapter are MMPBSA.py[110] and ParmEd.

6.1 MMPBSA.py

Portions of this section are reprinted with permission from Miller III, McGee Jr., Swails, Homeyer, Gohlke, and Roitberg, MMPBSA.py: An Efficient Program for End-State Free Energy Calculations, J. Chem. Theory Comput., 2012, 8 (9), pp 3314.[110]

MMPBSA.py is a script designed to automate the procedure of performing end-state free energy calculations, as described in Sec. 2.3.3.1.

6.1.1 Motivation

End-state free energy methods (briefly described in Sec. 2.3.3) are popular methods for computing binding free energies for protein-ligand binding,[179 183] protein-protein binding,[101 102 183 184] nucleic acid binding,[183 185] and relative conformational stabilities.[186 187] There has been significant effort applied to improving the approximations used in end-state methods, and in some cases they have even approached predictive accuracy.[182 188]

By 2008, there was a set of perl scripts capable of automating MM-PBSA and MM-GBSA calculations that were written in 2002 for release with Amber 7 and had not been changed since 2003. These scripts will be collectively referred to as mm_pbsa.pl from now on. Due to its age, mm_pbsa.pl was compatible only with the low-precision, inefficient ASCII trajectory format, and offered only a limited set of the available implicit

solvent models and input parameters that had been developed over the decade that followed its initial release. Furthermore, the input for mm_pbsa.pl was very different from the typical input that most other Amber programs expected. Finally, mm_pbsa.pl was only capable of running in serial, despite the fact that end-state analyses themselves could be trivially parallelized by computing binding free energies for individual frames simultaneously.

The goal of my project was to revitalize this set of helpful scripts that had fallen out of support and had grown outdated. We wanted to bring the input style in line with the rest of the Amber programs; for example, atom selections should be input via the Amber mask syntax, and groups of related variables should be specified in Fortran-style namelists. Furthermore, we wanted to provide the user with access to new input variables and solvent models. Given the magnitude of the changes required, the recent emergence of Python in the field of computational chemistry,[110 189 191] and the undocumented, monolithic state of mm_pbsa.pl, we decided to build a new script to perform end-state free energy calculations in Python: MMPBSA.py.[110]

6.1.2 Capabilities

In this section, I will briefly outline some of the various types of calculations that MMPBSA.py is capable of performing.

6.1.2.1 Stability and Binding Free Energy Calculations

End-state calculations are frequently used for two types of analyses: calculating the relative stability of multiple conformations of a system and calculating the binding free energy in a non-covalently bound, receptor-ligand complex,[109] whose thermodynamic cycles were shown in Ch. 2, Fig. 2-11. Stability calculations compare the free energies of multiple conformations to determine their relative stability. If we consider the process of a biomolecule changing conformations from state A to state B, then the free energy associated with that conformational change is simply the difference in the free energies of states A and B. Similarly, the non-covalent binding free energies can

be computed as the difference in free energies of the bound and free states of the two species in solution.

The free energy changes in solution can be decomposed according to

\[
\Delta G_{solvated} = E_{gas} + \Delta G_{solvation} - T S_{solute}
\tag{6-1}
\]

where $\Delta G_{solvation}$ represents a true free energy, since the solvent degrees of freedom have been averaged by using an implicit solvent model. The free energy of solvation in Eq. 6-1 can be further decomposed into a sum of polar and non-polar contributions in most implicit solvent models. Among the solvent models available for end-state calculations in MMPBSA.py are the previously mentioned PB and GB implicit solvent models as well as the 3-dimensional reference interaction site model (3D-RISM).[192]

The energies described in Eq. 6-1 are single point energies of the system. However, in practice, end-state calculations estimate these energies according to ensemble averages taken from a simulation. Expressing Eq. 6-1 in terms of an average over a simulated ensemble yields Eq. 6-2.

\[
\Delta G_{solvated} \approx \langle E_{gas} \rangle + \langle \Delta G_{solvation} \rangle - T \langle S_{solute} \rangle
= \frac{1}{N} \left\{ \sum_{i=1}^{N} \left[ E_{i,gas} + \Delta G_{i,solvation} \right] - T \sum_{i=1}^{N} S_{i,solute} \right\}
\tag{6-2}
\]

where i is the index of a particular frame and N is the total number of analyzed frames.

There are two approaches to generating the necessary ensembles for the bound and unbound state of binding energy calculations: all ensembles can be extracted from a single MD or MC trajectory of the bound complex, or trajectories can be generated for each state using separate simulations.[193] These approaches are called the single trajectory protocol (STP) and the multiple trajectory protocol (MTP), respectively, and each approach has distinct advantages and disadvantages.
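The ensemble-average estimate and the resulting binding free energy can be sketched in Python as follows. This is a minimal illustration with hypothetical function names, not the MMPBSA.py source: each per-frame value is assumed to be the gas-phase energy plus the solvation free energy (with an optional entropy term), and the binding free energy is taken as the average for the complex minus the averages for the receptor and the ligand.

```python
def frame_free_energy(e_gas, dg_solvation, t_times_s=0.0):
    """Per-frame free energy: E_gas + dG_solvation - T*S_solute."""
    return e_gas + dg_solvation - t_times_s


def ensemble_average(values):
    """Arithmetic mean over the analyzed frames."""
    values = list(values)
    if not values:
        raise ValueError("at least one frame is required")
    return sum(values) / len(values)


def binding_free_energy(complex_frames, receptor_frames, ligand_frames):
    """Estimate dG_bind as <G_complex> - <G_receptor> - <G_ligand>.

    Under the single trajectory protocol the three lists come from the same
    snapshots; under the multiple trajectory protocol they come from
    separate simulations and may have different lengths.
    """
    return (ensemble_average(complex_frames)
            - ensemble_average(receptor_frames)
            - ensemble_average(ligand_frames))
```

For example, `binding_free_energy([-110.0, -112.0], [-60.0, -61.0], [-30.0, -31.0])` evaluates to -20.0 in whatever energy units the frame values carry; the numbers here are made up purely to show the arithmetic.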

STP is less computationally expensive than MTP, because only a single trajectory is required to generate all three ensembles. Furthermore, the internal potential terms (e.g., bonds, angles, and torsions) cancel exactly in the STP, because the conformations in the bound and unbound ensembles are the same, leading to lower fluctuations and easier convergence in the binding free energy. The STP is appropriate if the receptor and ligand ensembles are comparable in the bound and unbound states. However, the conformations populating the unbound ensembles typically adopt strained configurations when extracted from the bound state ensemble, thereby over-stabilizing the binding, compared to the MTP.

6.1.2.2 Free Energy Decomposition

Amber[141] provides several schemes to decompose calculated free energies into specific residue contributions using either the GB or PB implicit solvent models,[194] following the work of Gohlke et al.[101] Interactions can be decomposed for each residue by including only those interactions in which one of the residue's atoms is involved, a scheme called per-residue decomposition. Alternatively, interactions can be decomposed by specific residue pairs by including only those interactions in which one atom from each of the analyzed residues is participating, a scheme called pairwise decomposition. These decomposition schemes can provide useful insights into important interactions in free energy calculations.[101]

However, it is important to note that solvation free energies using GB and PB are not strictly pairwise decomposable, since the dielectric boundary defined between the protein and the bulk solvent is inherently nonlocal and depends on the arrangement of all atoms in space. Thus, care must be taken when interpreting free energy decomposition results.

An alternative way of decomposing free energies is to introduce specific mutations in the protein sequence and analyze how binding free energies or stabilities are affected.[112] Alanine scanning, which is a technique in which an amino acid in the system is

mutated to alanine, can highlight the importance of the electrostatic and steric nature of the original side chain.[99] Assuming that the mutation will have a negligible effect on protein conformation, we can incorporate the mutation directly into each member of the original ensemble. This avoids the need to perform an additional MD or MC simulation to generate an ensemble for the mutant.

6.1.2.3 Entropy Calculations

The implicit solvent models used to calculate relative stability and binding free energies in end-state calculations often neglect some contributions to the solute entropy. If we assume that biological systems obey a rigid rotor model, we can calculate the translational and rotational entropies using standard statistical mechanical formulae,[9] and we can approximate the vibrational entropy contribution using one of two methods. First, the vibrational frequencies of normal modes can be calculated at various local minima of the potential energy surface.[9] Alternatively, the eigenvalues of the mass-weighted covariance matrix constructed from every member of the ensemble can be approximated as frequencies of global, orthogonal motions, a technique called the quasi-harmonic approximation.[195] Using either the normal mode or quasi-harmonic approximations, we can sum the vibrational entropies of each mode calculated from standard formulae.[9]

Typically, normal mode calculations are computationally demanding for large systems, because they require minimizing every frame, building the Hessian matrix, and diagonalizing it to obtain the vibrational frequencies (eigenvalues). Because of the Hessian diagonalization, normal-mode calculations scale as roughly $(3N)^3$, where N is the number of atoms in the system. While the quasi-harmonic approach is less computationally expensive, a large number of snapshots are typically needed to extrapolate the asymptotic limit of the total entropy for each ensemble, which increases the computational cost of the original simulation.[179]
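The final step of either approach, summing the entropy of each vibrational mode, can be sketched with the standard quantum harmonic oscillator expression, S = R [x / (e^x - 1) - ln(1 - e^{-x})] with x = h c ν / (k_B T) for a mode of wavenumber ν. The Python below is an illustrative sketch only: the function names are hypothetical, frequencies are assumed to be supplied in cm^-1, and non-positive frequencies are simply skipped, which is a simplifying assumption of this example.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e10    # speed of light, cm/s
KB = 1.380649e-23    # Boltzmann constant, J/K
R = 8.314462618      # gas constant, J/(mol K)


def mode_entropy(wavenumber, temperature=298.15):
    """Entropy in J/(mol K) of one harmonic mode (wavenumber in cm^-1)."""
    x = H * C * wavenumber / (KB * temperature)
    return R * (x / math.expm1(x) - math.log1p(-math.exp(-x)))


def vibrational_entropy(wavenumbers, temperature=298.15):
    """Sum the harmonic entropies of all modes with positive frequency."""
    return sum(mode_entropy(w, temperature) for w in wavenumbers if w > 0.0)
```

Low-frequency modes dominate the sum: in this sketch a 100 cm^-1 mode contributes far more entropy at room temperature than a 1000 cm^-1 mode, which is one reason the global, low-frequency motions captured by normal mode or quasi-harmonic analysis matter most for the vibrational entropy.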

6.1.3 General Workflow

MMPBSA.py is a program written in Python and nab[196] that streamlines the procedure of preparing and calculating free energies for an ensemble generated by MD or MC simulations, whose general workflow is shown in Fig. 6-1. The process of calculating binding free energies can be a tedious procedure that MMPBSA.py aims to shorten and simplify.

Python is a useful programming language for performing tasks that are not numerically intensive, and because it is available on virtually every platform, Python programs are highly portable. Nucleic Acid Builder (nab),[196] which is a molecule-based programming language included with AmberTools, contains functionality pertinent to building, manipulating and performing energy calculations on biological systems, such as proteins and nucleic acids.

End-state calculations often require multiple topology files (described later) that contain the parameters corresponding to the force field. Simulations are typically run using explicit solvent with any of the electrostatics methods described in Ch. 2, which would require both solvated and unsolvated topology files to use with MMPBSA.py. It is necessary that all topology files have a consistent set of parameters, especially for binding free energy calculations. Therefore, MMPBSA.py checks the input topology files prior to binding free energy calculations to prevent erroneous results due to inconsistencies that may not be immediately obvious (e.g., different particle counts, partial charges for the same atoms, etc.). I wrote the Python utility ante-MMPBSA.py (also released alongside MMPBSA.py), which allows a user to easily create topology files with a consistent set of parameters, including changing the intrinsic implicit solvent radius set to fit the desired solvent model.

The use of MMPBSA.py is similar to that of Amber's MD engines sander and pmemd. The command-line flags common to both MMPBSA.py and the MD engines

Figure 6-1. General workflow for performing end-state calculations with MMPBSA.py. LEaP is a program in Amber used to create topology files for dynamics. The workflow shown in step 3 is the series of steps that MMPBSA.py automates. 'Dry' topologies and ensembles are systems without explicit solvent that are subsequently treated using an implicit solvent model. 'External programs' refers to the executables that perform the energy calculations (e.g., sander).


are identical, and input files are separated with similar, Fortran-style namelists, indicated with an ampersand (&) prefix. The MMPBSA.py input file contains a general namelist for variables that control general behavior. For example, variables that control the subset of frames analyzed (startframe, endframe, and interval) and the amount of information printed in the output file (verbose) are specified here. An example of this section is shown below:

    General MMPBSA.py input file
    &general
       startframe=1, endframe=100, interval=2,
       keep_files=0, verbose=1, strip_mask=:WAT:Cl-:Na+,
    /

6.1.4 Running in Parallel

MMPBSA.py is implemented in parallel, so users with access to multiple processors can speed up their calculations. MMPBSA.py.MPI is the parallel implementation of MMPBSA.py that uses MPI (described in Appendix C) for Python (mpi4py). Since energy calculations for each frame are independent, the calculation can be trivially parallelized, given enough available processors. MMPBSA.py.MPI divides frames evenly across all processors, which allows calculations using many frames to scale better than if MMPBSA.py invoked parallel executables to calculate free energies. However, perfect scaling is not attained, because certain setup tasks and file input/output can only be done with a single processor. Fig. 6-2 demonstrates scaling for a sample MM-PBSA and MM-GBSA calculation.

6.1.5 Differences to mm_pbsa.pl

Both MMPBSA.py and mm_pbsa.pl allow users to perform free energy calculations using the STP and MTP, although MMPBSA.py offers more flexibility when using the MTP. Both programs have the ability to use different PB and GB models contained within Amber and estimate entropic contributions. Finally, MMPBSA.py and mm_pbsa.pl


Figure 6-2. MMPBSA.py scaling comparison for MM-PBSA and MM-GBSA calculations on 200 frames of a 5910-atom complex. Times shown are the times required for the calculation to finish. Note that MM-GBSA calculations are 5 times faster than MM-PBSA calculations. All calculations were performed on NICS Keeneland (2 Intel Westmere 6-core CPUs per node, QDR InfiniBand interconnect).

can run free energy calculations in parallel, although only MMPBSA.py can run on distributed memory systems (i.e., on multiple nodes connected over a network).

Despite their obvious similarities, there are many differences in their accessibility, implementation, and capabilities. MMPBSA.py is available free of charge alongside AmberTools, while an Amber license is necessary to obtain mm_pbsa.pl. The usage of MMPBSA.py is intended to resemble Amber's MD engines for ease of use, while mm_pbsa.pl's input file and usage have their own syntax. Only MMPBSA.py has an intuitive mechanism for guessing the ligand and receptor masks of a complex based on the topology files provided, and only MMPBSA.py analyzes topology files for parameter consistency. Furthermore, only MMPBSA.py can calculate entropic contributions to the free energy


using the quasi-harmonic approximation. An interface to external PB solvers such as Delphi, MEAD, and UHBD is available with mm_pbsa.pl only, although both can use the apbs program to solve the PB equation. MMPBSA.py allows users to provide their own input files for external programs, which gives users the ability to adjust all parameters, not just the variables described in the MMPBSA.py manual; in comparison, mm_pbsa.pl has no similar functionality without directly altering the source code. Finally, QM/MM-GBSA and MM/3D-RISM calculations are only available through the MMPBSA.py implementation.

6.2 ParmEd

ParmEd, short for Parmtop Editor, is a program that allows researchers to easily manipulate and extract information from Amber parameter-topology (prmtop) files. The prmtop is a compact ASCII (i.e., pure text) file whose format was optimized for extensibility and Fortran-style parsing. The data structures stored in this file are similar to the data structures used inside the Amber codes that perform MM simulations, making it overly tedious to extract information by simply reading the file's contents. The full structure and specification of the prmtop are presented in Appendix B.

6.2.1 Motivation

The prmtop files are very complex objects, and there is very little `locality' in these files. That is, determining which bonds exist and how strong their force constants are is not as simple as looking for the sections labeled with BOND in the prmtop. Prior to writing ParmEd, there were no programs released with Amber or AmberTools capable of modifying the topology file in a general way. Changing simple atomic properties such as the partial charge or the set of intrinsic radii used for implicit solvent models required the user to modify their original input files to tleap and recreate a topology file from their original structure, or in some cases even modify the tleap source code directly! Because many input files for tleap are shared among all users and original input files help document one's protocol, modifying these files frequently is dangerous.


The tedious and error-prone nature of this process is a deterrent for testing some new hypotheses and methods that require small changes to the topology file. For instance, parameterizing a new GB model by using different intrinsic radii to define the dielectric boundary requires either modifying the topology file by hand (a dangerous and tedious process) or learning and modifying the tleap source code and rebuilding the program, all in the process of refining a set of parameters. With ParmEd, users and method developers can rapidly prototype a new method in a reliable way. A primary goal of ParmEd is to enable safe, rapid prototyping of new methods that require straightforward changes to the prmtop file.

A second motivator for creating ParmEd was to provide a unified platform for disseminating prmtop modifications that may be required for a particular method. The traditional approach when a method required a prmtop modification was for the developer who released the new code to develop a stand-alone tool in their programming language of choice to be released alongside Amber. These tools often parsed and modified topology files in a minimalistic fashion, and are not used or tested frequently. Such an approach quickly becomes unsustainable as the authors of these tools leave the developer community (e.g., through graduation or retirement). With ParmEd, I sought to create a simple platform to unify prmtop-modifying programs within Amber in an attempt to ease the burden of support and simplify the user experience. Therefore, ParmEd should be intuitive to use for experienced Amber users, and written in a way that the code can be easily understood by other developers.

6.2.2 Implementation and Capabilities

I wrote ParmEd as a set of two Python scripts built on top of a common library of functionality. The first, parmed.py, is a command-line tool that strongly resembles the popular trajectory analysis programs ptraj and cpptraj in its use. The second, xparmed.py, is a graphical user interface built on the Tcl/Tk toolkit through the Tkinter


Figure 6-3. Screenshot of the xparmed.py GUI window, labeled with the available Actions and a message log.

Python bindings. The GUI, shown in Fig. 6-3, is meant to be a very simple, point-and-click interface for prmtop modification, while parmed.py is ideal for scripting purposes. To further simplify the use of ParmEd for those familiar with other Amber programs, the ubiquitous Amber mask syntax is used to specify all necessary atom selections.

The individual capabilities of ParmEd, called Actions, are all subclassed from a common Action base class. Each Action interprets its own list of arguments and implements its own, unique functionality. To expand the utility of the ParmEd code,


users can incorporate individual ParmEd Actions into their own Python script through an Application Programmer Interface (API) documented in the AmberTools manual. This allows users to avoid the need to learn the inner workings of the prmtop file and re-implement existing code in the cases where ParmEd does not handle all of the users' needs. In the following sections, I will outline some of the Actions and functionality I consider to be particularly helpful or particularly challenging to implement through other programs.

6.2.2.1 Lennard-Jones Parameter Modifications

I will briefly describe here how the radius ($r_i$) and well depth ($\varepsilon_i$) assigned in the parameter databases for each atom type $i$ are translated into a set of parameters used to compute the LJ potential in the Amber force field. Specifically, between pairs $i$ and $j$, the well depth $\varepsilon_{i,j}$ is the geometric average and the radius $r_{i,j}$ (written $R_{min,i,j}$ below) is the arithmetic average

$$\varepsilon_{i,j} = \sqrt{\varepsilon_i \varepsilon_j} \qquad R_{min,i,j} = R_{min,i} + R_{min,j} \tag{6}$$

These combined radii and depths are then combined into A-coefficients and B-coefficients using the equations

$$ACOEF_{i,j} = \varepsilon_{i,j}\, r_{i,j}^{12} \qquad BCOEF_{i,j} = 2\,\varepsilon_{i,j}\, r_{i,j}^{6} \tag{6}$$

Eqs. 6 and 6 are evaluated in tleap, and $\varepsilon_i$ and $r_i$ are provided as input in the parameter files. Because there are more ACOEF and BCOEF parameters than there are input parameters, the way tleap handles LJ parameters restricts some flexibility in the force field. The A and B coefficients can be thought of as a matrix of pairwise combined terms defined in Eq. 6 in which only the diagonal terms are specified.
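As a rough illustration of the combining rules above, the following Python sketch builds the full pairwise coefficient matrices from per-type radii and well depths. The function names and the sample parameter values are hypothetical and chosen only for illustration; this is not code from tleap or ParmEd.

```python
import math


def lj_coefficients(radii, depths):
    """Build pairwise A- and B-coefficient matrices from per-type parameters.

    radii[i] and depths[i] play the roles of the per-type radius and well
    depth described in the text (hypothetical inputs for this sketch).
    """
    n = len(radii)
    acoef = [[0.0] * n for _ in range(n)]
    bcoef = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            eps_ij = math.sqrt(depths[i] * depths[j])  # geometric mean of depths
            r_ij = radii[i] + radii[j]                 # combined radius of the pair
            acoef[i][j] = eps_ij * r_ij ** 12
            bcoef[i][j] = 2.0 * eps_ij * r_ij ** 6
    return acoef, bcoef


def lj_energy(a, b, r):
    """Evaluate a/r^12 - b/r^6 for one pair at separation r."""
    return a / r ** 12 - b / r ** 6


# Illustrative values only (two atom types).
radii = [1.908, 1.6612]
depths = [0.1094, 0.2100]
acoef, bcoef = lj_coefficients(radii, depths)
```

With these definitions, the pair energy evaluated at the combined radius equals minus the combined well depth, which is a convenient sanity check on any modified coefficient.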


The interactions between each pair of atom types cannot be set independently like they can in the CHARMM program via the NBFIX keyword, for instance.

I will make a detour here to discuss how tleap compresses the number of LJ parameters written to the topology file. Since the LJ potential is composed of pairwise terms, there must be a term for every pair of atoms in the system, a number that becomes astronomically large for large numbers of particles. To avoid printing out on the order of $N^2$ terms in both coefficient matrices (where $N$ is the total number of atoms), tleap assigns each atom to a particular atom type index that it shares with every other atom in the system that has the same set of starting LJ parameters $\varepsilon_i$ and $r_i$. Therefore, each A- and B-coefficient printed in the topology file may be used for numerous other atom pairs in the force and energy evaluations.

I implemented a number of Actions in ParmEd that allow users to query and adjust LJ parameters in a way that is currently impossible with any other program. The printLJTypes Action in ParmEd takes an atom selection and prints out every other atom that has been assigned to the same LJ atom type. The changeLJPair Action allows users to adjust individual, off-diagonal elements of the A- and B-coefficient matrices for any pair of atoms. The addLJType command provides further flexibility by allowing the user to treat a subset of atoms as a different LJ atom type so any off-diagonal changes affect only the desired atoms.

6.2.2.2 Changing Atomic Properties

Another Action implemented in ParmEd, the change Action, allows users to change one of the following atomic properties: the partial charge, atomic mass, implicit solvent radius, implicit solvent screening factor, atom name, atom type name, atom type index, or the atomic number. Changing any of these properties without using ParmEd requires the user to modify a number of files, including standard residue libraries, force field databases, and the original starting structure before running those files through


tleap. Even then, care must be taken to ensure that the prmtop was changed in the desired way. This functionality allows rapid prototyping for tasks such as parameterizing new charge or implicit solvent radius sets. Alternatives are currently tedious and error-prone.

6.2.2.3 Setting up for H-REMD Simulations

The H-REMD implementation in Amber described in Ch. 5 is capable of performing alchemical REFEP calculations provided that the alchemical pathway can be characterized by different topology files with the same atoms. When an atom disappears (as in a pKa calculation, when a proton vanishes), a dummy atom is required in the end state in which that atom is `missing.' The interpolate Action is provided to create a series of prmtops whose charge and LJ parameters are linearly interpolated between two prmtops. Alternative approaches are, again, time-consuming and error-prone.

6.2.2.4 Changing Parameters

Perhaps one of the strongest features of ParmEd is its ability to change individual bonded parameters, i.e., bonds, angles, and torsions. The setBond and setAngle commands can be used to either add or modify a bond or angle parameter, respectively. The addDihedral and deleteDihedral commands can be used to create, remove, and even change individual torsion parameters. This control over the torsion parameters is particularly useful when attempting to fit new torsion parameters to improve force fields.[23 25]
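The linear interpolation used when setting up H-REMD simulations can be sketched in a few lines of Python. The sketch below is not ParmEd's interpolate Action or its API: the function name, its arguments, and the example charges are hypothetical, and it only shows the arithmetic applied to one per-atom array (e.g., partial charges) shared by two end states that contain the same atoms. Whether the end states themselves are included in the series is an assumption of this sketch.

```python
def interpolate_parameters(start, end, n_states):
    """Return n_states parameter lists, linearly spaced from start to end.

    start and end are per-atom values (e.g., partial charges) for the two
    end states; both lists must describe the same atoms in the same order.
    """
    if len(start) != len(end):
        raise ValueError("end states must contain the same atoms")
    if n_states < 2:
        raise ValueError("need at least the two end states")
    states = []
    for k in range(n_states):
        lam = k / (n_states - 1)  # coupling parameter running from 0 to 1
        states.append([(1.0 - lam) * a + lam * b for a, b in zip(start, end)])
    return states


# Hypothetical charges: a proton whose charge is removed in the end state.
protonated = [0.40, -0.80]
deprotonated = [0.00, -0.40]
charge_states = interpolate_parameters(protonated, deprotonated, 5)
```

Each entry of charge_states would correspond to one replica's topology in this picture; a dummy atom keeps its place in every list, so all states share the same atom ordering.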


APPENDIX A
NUMERICAL INTEGRATION IN CLASSICAL MOLECULAR DYNAMICS

A.1 Lagrangian and Hamiltonian Formulations

The Lagrangian and Hamiltonian formulations of classical mechanics, shown in Eqs. A and A, respectively, offer a more convenient formalism than the more popularly known equations derived by Newton.[197] While Newton's equations apply in three-dimensional Cartesian space, they are not generally applicable to other coordinate systems (e.g., polar and spherical-polar coordinates) that may be a more natural way to express certain problems. For instance, polar coordinates more naturally describe the mechanics of orbiting bodies than standard Euclidean space.

Lagrangian Equation. The Lagrangian function, $L = K - V$, where $K$ is the kinetic energy and $V$ is the potential energy, satisfies the Lagrangian equation (Eq. A) for $m$ generalized coordinates ($q_m$). The advantage of Eq. A is that it is derived without any assumption of a specific coordinate system for $q_m$. Generalized velocities are the first time-derivative of the generalized coordinates, $\dot{q}_m$. These generalized velocities are used to define the kinetic energy in the familiar form $K = 1/2\,\dot{q}_m^2$. Another advantage of the Lagrangian formulation of classical mechanics is that the equations are still valid when subject to constraints on the dynamics of the system (as long as there are fewer constraints than particles).[197] This property is crucial for carrying out constrained dynamics, such as those simulations employing the commonly used SHAKE,[16] RATTLE,[17] or SETTLE[18] algorithms, to name a few.

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_m} - \frac{\partial L}{\partial q_m} = 0 \tag{A}$$

$L$ in Eq. A is the Lagrangian function mentioned above, $q_m$ are the generalized coordinates of every particle in the system, and $\dot{q}_m$ are the set of corresponding generalized velocities. When applying Eq. A to a system in the standard Cartesian coordinates without constraints, the familiar form of Newton's equations is recovered.[197]
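To make the last statement concrete, consider a single particle of mass $m$ moving along one Cartesian coordinate $x$ in a potential $V(x)$. The short derivation below is a sketch under that assumption; writing the mass explicitly is a notational choice made here for illustration.

```latex
\begin{align*}
L &= \tfrac{1}{2} m \dot{x}^2 - V(x) \\
\frac{\partial L}{\partial \dot{x}} &= m \dot{x}, \qquad
\frac{\partial L}{\partial x} = -\frac{dV}{dx} \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x}
  &= m \ddot{x} + \frac{dV}{dx} = 0
  \quad\Longrightarrow\quad m \ddot{x} = -\frac{dV}{dx} = F
\end{align*}
```

The final line is Newton's second law for a conservative force, which is the form the integrators in Sec. A.2 discretize.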


Hamiltonian Equation. The Hamiltonian formulation of classical mechanics builds on the strengths of the Lagrangian formulation and provides a deeper insight into the physical behavior of classical systems. Unlike the Lagrangian, the Hamiltonian is defined as the total energy of the system: $H = K + V$. The Lagrangian of the system, $L = K - V$, plays an important part in Hamilton's formulation. The degrees of freedom in Hamilton's equation (Eq. A) are the generalized coordinates $q_m$ as defined in the Lagrangian, and their conjugate momenta, $p_m$. The generalized coordinates and momenta are said to be canonically conjugate because they obey the relationship given in Eq. A.[197]

$$\dot{q}_m = \frac{\partial H}{\partial p_m} \qquad \dot{p}_m = -\frac{\partial H}{\partial q_m} \tag{A}$$

Now that a convenient formulation of the laws of classical dynamics is known, I will shift the discussion toward techniques by which these second-order differential equations are integrated in typical molecular dynamics simulations.

A.2 Numerical Integration by Finite Difference Methods

The equations of motion are second-order differential equations with respect to the particle coordinates, since the force is proportional to the second time-derivative (i.e., the acceleration) of the particle positions. Due to the typical size and complexity of the systems and their potentials studied in computational chemistry, MD simulations require numerical integration of the second-order differential equations of motion. In this section, I will describe two common approaches to iteratively integrating Eqs. A and A: so-called predictor-corrector methods and the Verlet family of integrators.

A.2.1 Predictor-corrector

The predictor-corrector integrators are based on a simple Taylor-series expansion of the coordinates. Knowing that the velocity and acceleration are the first and second time derivatives of the particle positions, respectively, the Taylor expansions of each of


these quantities are given below.

$$\begin{aligned}
\vec{r}_p(t_0 + \delta t) &= \vec{r}(t_0) + \delta t\,\vec{v}(t_0) + \frac{1}{2}\delta t^2\,\vec{a}(t_0) + \frac{1}{6}\delta t^3\,\frac{d^3\vec{r}(t)}{dt^3} + \ldots \\
\vec{v}_p(t_0 + \delta t) &= \vec{v}(t_0) + \delta t\,\vec{a}(t_0) + \frac{1}{2}\delta t^2\,\frac{d^3\vec{r}(t)}{dt^3} + \ldots \\
\vec{a}_p(t_0 + \delta t) &= \vec{a}(t_0) + \delta t\,\frac{d^3\vec{r}(t)}{dt^3} + \ldots
\end{aligned} \tag{A}$$

The subscript $p$ in these equations emphasizes that these are the predicted quantities of the positions, velocities, and accelerations at time $t_0 + \delta t$ based on the known values at time $t_0$. It is convenient to truncate the Taylor series in Eqs. A after the acceleration term since the acceleration at time $t_0$ can be easily calculated from the gradient of the potential energy function. Higher-order terms are difficult to compute, and contribute a significantly smaller amount as the time step, $\delta t$, decreases. However, truncating the Taylor expansion is an approximation that introduces systematic error in the predicted values calculated by Eqs. A compared to their true values.

There is a way of approximating the magnitude of the deviation of the predicted values from Eqs. A, however, that will allow a correction to be applied to the integrated values. As a reminder, the gradient of the potential was used to calculate the forces, and therefore the acceleration, on each particle when making the initial integration step from $t_0$. The acceleration may be calculated again using the gradient of the potential at the predicted conformations:

$$-\nabla V[\vec{r}_p(t_0 + \delta t)] = m\,\vec{a}' \tag{A}$$

Since systematic error has been introduced by truncating the expansion in Eqs. A, $\vec{a}'$ from Eq. A and $\vec{a}_p(t_0 + \delta t)$ from Eq. A will differ. The magnitude of this difference can be used to correct the predicted values according to Eqs. A


$$\begin{aligned}
\vec{r}_c(t + \delta t) &= \vec{r}_p(t + \delta t) + c_0\,\Delta\vec{a}(t + \delta t) \\
\vec{v}_c(t + \delta t) &= \vec{v}_p(t + \delta t) + c_1\,\Delta\vec{a}(t + \delta t) \\
\vec{a}_c(t + \delta t) &= \vec{a}_p(t + \delta t) + c_2\,\Delta\vec{a}(t + \delta t)
\end{aligned} \tag{A}$$

where $\Delta\vec{a}(t + \delta t)$ is the difference between the recalculated and predicted accelerations, the subscripts indicate the relationship between the corrected and predicted quantities, and the coefficients $c_0$, $c_1$, and $c_2$ are parametrized to maximize performance,[198 199] and have the appropriate units to satisfy each equation.[54] The corrector process can be iterated until the desired level of agreement between the predicted and corrected values is reached.

While the predictor-corrector algorithm allows long time steps to be taken by fixing the resulting systematic error, the corrector step requires a full force evaluation of the system at a set of coordinates, which is the most time-consuming portion of the calculation. As a result, the corrector step is computationally demanding, and predictor-corrector methods have been replaced by other integration schemes in standard practice.

A.2.2 Verlet Integrators

Among the most popular integrators in common use today are those based on the Verlet algorithms. The Verlet algorithm, developed in 1967 by Verlet, utilizes a Taylor series expansion of the particle coordinates about time $t_0$. The key to the Verlet approach is to use both the forward and reverse time steps, as shown in Eqs. A.[54]

$$\begin{aligned}
\vec{r}(t_0 + \delta t) &= \vec{r}(t_0) + \delta t\,\vec{v}(t_0) + \frac{1}{2}\delta t^2\,\vec{a}(t_0) + \ldots \\
\vec{r}(t_0 - \delta t) &= \vec{r}(t_0) - \delta t\,\vec{v}(t_0) + \frac{1}{2}\delta t^2\,\vec{a}(t_0) - \ldots
\end{aligned} \tag{A}$$


Combining Eqs. A gives

$$\vec{r}(t_0 + \delta t) = 2\vec{r}(t_0) + \delta t^2\,\vec{a}(t_0) - \vec{r}(t_0 - \delta t) \tag{A}$$

where the velocities have been eliminated from the expression and are therefore unnecessary when integrating the equations of motion. Furthermore, like the velocities, the $\delta t^3$ term also cancels, so the Verlet algorithm is not only time-reversible, given its symmetry around $t_0$, but also accurate to fourth order in the time step (i.e., the leading error in the positions is of order $\delta t^4$). The velocities are still useful, however, to compute the total kinetic energy and related properties, such as the instantaneous temperature. When necessary, velocities can be approximated as the average velocity over the time period from $t_0 - \delta t$ to $t_0 + \delta t$.

Performing MD using the Verlet algorithm requires storing the current positions, `old' positions at time $t_0 - \delta t$, and the accelerations at time $t_0$, a modest cost given the accuracy of the integration scheme. However, the use of Eq. A introduces an issue of numerical precision, since $\vec{r}(t_0)$ and $\vec{r}(t_0 - \delta t)$ are potentially large values, while $\delta t^2\,\vec{a}(t_0)$ is typically quite small since the time step is small. Since real numbers can be stored only to a limited precision, accuracy is potentially lost when a small number is added to a difference of large numbers.[54] To address this issue and improve the way in which velocities are handled, the leap-frog and velocity Verlet methods are discussed below.

Velocity Verlet. In 1982 Swope et al. developed a variant of the Verlet algorithm that sidesteps the potential roundoff errors and naturally stores positions, velocities, and accelerations at the same time. A Taylor series expansion is again used to propagate the positions, but only the $t_0 + \delta t$ step is used, resulting in Eq. A

$$\vec{r}(t_0 + \delta t) = \vec{r}(t_0) + \delta t\,\vec{v}(t_0) + \frac{1}{2}\delta t^2\,\vec{a}(t_0) \tag{A}$$

The accelerations of the particles are computed from their positions at time $t_0 + \delta t$, and are used to compute the velocities. To increase the accuracy of the computed velocities, the velocity integration is divided into two half-time steps, shown in Eqs. A. In this


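case, the accuracy to $\delta t^4$ in the positions obtained by the Verlet algorithm is sacrificed for improved numerical precision on finite-precision computers and a more accurate treatment of system velocities.

$$\begin{aligned}
\vec{v}\left(t_0 + \tfrac{1}{2}\delta t\right) &= \vec{v}(t_0) + \frac{1}{2}\delta t\,\vec{a}(t_0) \\
\vec{v}(t_0 + \delta t) &= \vec{v}\left(t_0 + \tfrac{1}{2}\delta t\right) + \frac{1}{2}\delta t\,\vec{a}(t_0 + \delta t) \\
\vec{v}(t_0 + \delta t) &= \vec{v}(t_0) + \frac{1}{2}\delta t\left[\vec{a}(t_0) + \vec{a}(t_0 + \delta t)\right]
\end{aligned} \tag{A}$$

The NAB and mdgx programs of the AmberTools 12 program suite (and earlier versions, where available) utilize the velocity Verlet algorithm for dynamics.

Leap-frog. A common integrator used in MD simulations is the leap-frog method, so-called because the computed velocities `leap' over the computed coordinates in a manner that will be explained shortly. The main dynamics engines in the Amber 12 program suite, pmemd and sander, use the leap-frog integrator. While similar to the velocity Verlet approach, the leap-frog algorithm computes positions and accelerations of particles at integral time steps, but computes velocities at half-integral time steps according to Eqs. A.

$$\begin{aligned}
\vec{r}(t_0 + \delta t) &= \vec{r}(t_0) + \delta t\,\vec{v}\left(t_0 + \tfrac{1}{2}\delta t\right) \\
\vec{v}\left(t_0 + \tfrac{1}{2}\delta t\right) &= \vec{v}\left(t_0 - \tfrac{1}{2}\delta t\right) + \delta t\,\vec{a}(t_0)
\end{aligned} \tag{A}$$

If the velocities are required at time $t_0$, they can be estimated as the average of the velocities at times $t_0 - 1/2\,\delta t$ and $t_0 + 1/2\,\delta t$, which is significantly more accurate than the approximation in Verlet's original algorithm. Like the velocity Verlet algorithm, leap-frog integration sacrifices the 4th-order accuracy in integrated positions to alleviate the aforementioned precision and velocity issues.[54]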
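The two update schemes described above can be compared on a toy problem. The following Python sketch integrates a one-dimensional harmonic oscillator with unit mass and unit force constant; the function names, the test system, and the step size are assumptions made for illustration and are not taken from the Amber, NAB, or mdgx sources.

```python
import math


def velocity_verlet(x, v, accel, dt, n_steps):
    """Propagate position and velocity together on whole time steps."""
    a = accel(x)
    for _ in range(n_steps):
        x = x + dt * v + 0.5 * dt * dt * a   # full-step position update
        a_new = accel(x)                     # acceleration at the new position
        v = v + 0.5 * dt * (a + a_new)       # two half-step velocity updates combined
        a = a_new
    return x, v


def leap_frog(x, v_half, accel, dt, n_steps):
    """Propagate positions on whole steps and velocities on half steps.

    v_half is the velocity half a step before the starting time; the
    returned velocity is half a step before the returned position.
    """
    for _ in range(n_steps):
        v_half = v_half + dt * accel(x)      # velocity leaps over the position
        x = x + dt * v_half                  # position leaps over the velocity
    return x, v_half


def harmonic(x):
    """Acceleration for a unit-mass, unit-force-constant oscillator."""
    return -x


dt, n_steps = 0.01, 628
x_vv, v_vv = velocity_verlet(1.0, 0.0, harmonic, dt, n_steps)
# Back-step the initial velocity by half a step to start the leap-frog scheme.
v_start = 0.0 - 0.5 * dt * harmonic(1.0)
x_lf, v_lf = leap_frog(1.0, v_start, harmonic, dt, n_steps)
```

With this choice of starting half-step velocity, the two schemes generate the same positions up to round-off; they differ in when the velocities are available, which is the distinction drawn in the text.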


APPENDIX B
AMBER PARAMETER-TOPOLOGY FILE FORMAT

This appendix details the Parameter-Topology file format used extensively by the AMBER software suite for biomolecular simulation and analysis, referred to as the prmtop file for short.

The format specification of the AMBER topology file was written initially over a decade ago and posted on http://ambermd.org/formats.html. I have recently expanded that document to account for the drastic change to the file format that occurred with the 2002 release of Amber 7. The pre-Amber 7 format (old format) is described more briefly afterwards, although each section provided in the original format contains exactly the same information as the newer version.

This appendix also details the format changes and additions introduced by chamber, the program that translates a CHARMM parameter file (PSF) into a topology file that can be used with the sander and pmemd programs in AMBER.

This appendix draws from the information on http://ambermd.org/formats.html that was added by both me and others, as well as the experience I gleaned while writing the ParmEd program and working with the various codes in AMBER.

As a warning, the prmtop file is a result of bookkeeping that becomes increasingly complex as the system size increases. Therefore, hand-editing the topology file for all but the smallest systems is discouraged; a program or script should be written to automate the procedure.

B.1 Layout

The first line of the Amber topology file is the version string. An example is shown below in which XX is replaced by the actual date and time.

    %VERSION  VERSION_STAMP = V0001.000  DATE = XX/XX/XX  XX:XX:XX

The topology format is divided into several sections in a way that is designed to be parsed easily using simple Fortran code. A consequence of this is that it is difficult for parsers written in other languages (e.g., C, C++, Python, etc.) to strictly adhere to the


standard. These parsers should try, however, to support as much of the standard as possible.

    %FLAG SECTION
    %COMMENT an arbitrary number of optional comments may be put here
    %FORMAT(<FORTRAN FORMAT>)
    ... data formatted according to <FORTRAN FORMAT>

All names (e.g., atom names, atom type names, and residue names) are limited to 4 characters and are printed in fields exactly 4 characters wide, left-justified. This means that names might not be space-delimited if any of the names have 4 characters.

Requirements for prmtop parsers. Parsers, regardless of the language they are written in, should conform to a list of attributes to maximize the likelihood that prmtop files are parsed correctly.

- Parsers should expect that some 4-character fields (e.g., atom or residue names) may have some names that have 4 characters and therefore might not be whitespace-delimited.
- Parsers should not expect SECTIONs in the prmtop to be in any particular order.
- Parsers should not expect or require %COMMENT lines to exist, but should properly parse the file if any number of %COMMENT lines appear as indicated above.
- The topology file may be assumed to have been generated `correctly' by tleap or some other credible source. No graceful error checking is required.

Requirements for modifying SECTIONs. To minimize the impact of prmtop changes to existing, third-party parsers, the following conventions should be followed.

- Any new SECTION should be added to the end of the topology file to avoid conflicts with order-dependent parsers.
- The <FORTRAN FORMAT> should be as simple as possible (and avoid adding new formats) to maintain simplicity for non-Fortran parsers.


- Avoid modifying if possible. Consider if this new section or change is truly necessary and belongs in the prmtop.

B.2 List of SECTIONs

TITLE
This section contains the title of the topology file on one line (up to 80 characters). While the title serves a primarily cosmetic purpose, this section must be present.
%FORMAT(20a4)

POINTERS
This section contains the information about how many parameters are present in all of the sections. There are 31 or 32 integer pointers (NCOPY might not be present). The format and names of all of the pointers are listed below, followed by a description of each pointer.

    %FLAG POINTERS
    %FORMAT(10I8)
    NATOM    NTYPES NBONH  MBONA  NTHETH MTHETA NPHIH  MPHIA  NHPARM NPARM
    NNB      NRES   NBONA  NTHETA NPHIA  NUMBND NUMANG NPTRA  NATYP  NPHB
    IFPERT   NBPER  NGPER  NDPER  MBPER  MGPER  MDPER  IFBOX  NMXRS  IFCAP
    NUMEXTRA NCOPY

    NATOM     Number of atoms
    NTYPES    Number of distinct Lennard-Jones atom types
    NBONH     Number of bonds containing Hydrogen
    MBONA     Number of bonds not containing Hydrogen
    NTHETH    Number of angles containing Hydrogen
    MTHETA    Number of angles not containing Hydrogen
    NPHIH     Number of torsions containing Hydrogen
    MPHIA     Number of torsions not containing Hydrogen
    NHPARM    Not currently used for anything


    NPARM     Used to determine if this is a LES-compatible prmtop
    NNB       Number of excluded atoms (length of total exclusion list)
    NRES      Number of residues
    NBONA     MBONA + number of constraint bonds ^1
    NTHETA    MTHETA + number of constraint angles ^1
    NPHIA     MPHIA + number of constraint torsions ^1
    NUMBND    Number of unique bond types
    NUMANG    Number of unique angle types
    NPTRA     Number of unique torsion types
    NATYP     Number of SOLTY terms. Currently unused.
    NPHB      Number of distinct 10-12 hydrogen bond pair types ^2
    IFPERT    Set to 1 if topology contains residue perturbation information. ^3
    NBPER     Number of perturbed bonds ^3
    NGPER     Number of perturbed angles ^3
    NDPER     Number of perturbed torsions ^3
    MBPER     Number of bonds in which both atoms are being perturbed
    MGPER     Number of angles in which all 3 atoms are being perturbed
    MDPER     Number of torsions in which all 4 atoms are being perturbed ^1
    IFBOX     Flag indicating whether a periodic box is present. Values can be 0 (no box), 1 (orthorhombic box) or 2 (truncated octahedron)
    NMXRS     Number of atoms in the largest residue
    IFCAP     Set to 1 if a solvent CAP is being used
    NUMEXTRA  Number of extra points in the topology file

    ^1 AMBER codes no longer support constraints in the topology file.
    ^2 Modern AMBER force fields do not use a 10-12 potential.
    ^3 No AMBER codes support perturbed topologies anymore.


    NCOPY     Number of PIMD slices or number of beads

ATOM_NAME
This section contains the atom name for every atom in the prmtop.
%FORMAT(20a4)
There are NATOM 4-character strings in this section.

CHARGE
This section contains the charge for every atom in the prmtop. Charges are multiplied by 18.2223 ($\sqrt{k_{ele}}$, where $k_{ele}$ is the electrostatic constant in kcal Å mol$^{-1}$ q$^{-2}$, where q is the charge of an electron).
%FORMAT(5E16.8)
There are NATOM floating point numbers in this section.

ATOMIC_NUMBER
This section contains the atomic number of every atom in the prmtop. This section was first introduced in AmberTools 12.[141]
%FORMAT(10I8)
There are NATOM integers in this section.

MASS
This section contains the atomic mass of every atom in g mol$^{-1}$.
%FORMAT(5E16.8)
There are NATOM floating point numbers in this section.

ATOM_TYPE_INDEX
This section contains the Lennard-Jones atom type index. The Lennard-Jones potential contains parameters for every pair of atoms in the system. To minimize the memory requirements of storing NATOM × NATOM Lennard-Jones A-coefficients and B-coefficients (only half this number would be required, since $a_{i,j} \equiv a_{j,i}$), all atoms with the same $\sigma$ and $\varepsilon$ parameters are assigned to the same type


(regardless of whether they have the same AMBER_ATOM_TYPE). This significantly reduces the number of LJ coefficients which must be stored, but introduces the requirement for bookkeeping sections of the topology file to keep track of what the LJ type index was for each atom.
This section is used to compute a pointer into the NONBONDED_PARM_INDEX section, which itself is a pointer into the LENNARD_JONES_ACOEF and LENNARD_JONES_BCOEF sections (see below).
%FORMAT(10I8)
There are NATOM integers in this section.

NUMBER_EXCLUDED_ATOMS
This section contains the number of atoms that need to be excluded from the non-bonded calculation loop for atom i because i is involved in a bond, angle, or torsion with those atoms. Each atom in the prmtop has a list of excluded atoms that is a subset of the list in EXCLUDED_ATOMS_LIST (see below). The ith value in this section indicates how many elements of EXCLUDED_ATOMS_LIST belong to atom i.
For instance, if the first two elements of this array are 5 and 3, then elements 1 to 5 in EXCLUDED_ATOMS_LIST are the exclusions for atom 1 and elements 6 to 8 in EXCLUDED_ATOMS_LIST are the exclusions for atom 2. Each exclusion is listed only once in the topology file, and is given to the atom with the smaller index. That is, if atoms 1 and 2 are bonded, then atom 2 is in the exclusion list for atom 1, but atom 1 is not in the exclusion list for atom 2. If an atom has no excluded atoms (either because it is a monoatomic ion or all atoms it forms a bonded interaction with have a smaller index), then it is given a value of 1 in this list which corresponds to an exclusion with (a non-existent) atom 0 in EXCLUDED_ATOMS_LIST.
The exclusion rules for extra points are more complicated. When determining exclusions, an extra point is considered an `extension' of the atom it is connected (bonded) to.


Therefore, extra points are excluded not only from the atom they are connected to, but also from every atom that their parent atom is excluded from.
NOTE: The non-bonded interaction code in sander and pmemd currently (as of Amber 12) recalculates the exclusion lists for simulations of systems with periodic boundary conditions, so this section is effectively ignored. The GB code uses the exclusion list in the topology file.
%FORMAT(10I8)
There are NATOM integers in this section.

NONBONDED_PARM_INDEX
This section contains the pointers for each pair of LJ atom types into the LENNARD_JONES_ACOEF and LENNARD_JONES_BCOEF arrays (see below). The pointer for an atom pair in this array is calculated from the LJ atom type index of the two atoms (see ATOM_TYPE_INDEX above).
The index for two atoms i and j into the LENNARD_JONES_ACOEF and LENNARD_JONES_BCOEF arrays is calculated as

$$\text{index} = \text{NONBONDED\_PARM\_INDEX}\left[\text{NTYPES} \times (\text{ATOM\_TYPE\_INDEX}(i) - 1) + \text{ATOM\_TYPE\_INDEX}(j)\right] \tag{B}$$

Note, each atom pair can interact with either the standard 12-6 LJ potential or via a 12-10 hydrogen bond potential. If index in Eq. B is negative, then it is an index into HBOND_ACOEF and HBOND_BCOEF instead (see below).
%FORMAT(10I8)
There are NTYPES × NTYPES integers in this section.

RESIDUE_LABEL
This section contains the residue name for every residue in the prmtop. Residue names are limited to 4 letters, and might not be whitespace-delimited if any residues have 4-letter names.
%FORMAT(20a4)


There are NRES 4-character strings in this section.

RESIDUE_POINTER
This section lists the first atom in each residue.
%FORMAT(10i8)
There are NRES integers in this section.

BOND_FORCE_CONSTANT
Bond energies are calculated according to the equation

$$E_{bond} = \frac{1}{2} k \left(\vec{r} - \vec{r}_{eq}\right)^2 \tag{B}$$

This section lists all of the bond force constants ($k$ in Eq. B) in units of kcal mol$^{-1}$ Å$^{-2}$ for each unique bond type. Each bond in BONDS_INC_HYDROGEN and BONDS_WITHOUT_HYDROGEN (see below) contains an index into this array.
%FORMAT(5E16.8)
There are NUMBND floating point numbers in this section.

BOND_EQUIL_VALUE
This section lists all of the bond equilibrium distances ($\vec{r}_{eq}$ in Eq. B) in units of Å for each unique bond type. This list is indexed the same way as BOND_FORCE_CONSTANT.
%FORMAT(5E16.8)
There are NUMBND floating point numbers in this section.

ANGLE_FORCE_CONSTANT
Angle energies are calculated according to the equation

$$E_{angle} = \frac{1}{2} k \left(\theta - \theta_{eq}\right)^2 \tag{B}$$

This section lists all of the angle force constants ($k$ in Eq. B) in units of kcal mol$^{-1}$ rad$^{-2}$ for each unique angle type. Each angle in ANGLES_INC_HYDROGEN and ANGLES_WITHOUT_HYDROGEN contains an index into this (and the next) array.
%FORMAT(5E16.8)


There are NUMANG floating point numbers in this section.

ANGLE_EQUIL_VALUE
This section contains all of the equilibrium angles ($\theta_{eq}$ in Eq. B) in radians. NOTE: the AMBER parameter files list equilibrium angles in degrees, and these are converted to radians in tleap. This list is indexed the same way as ANGLE_FORCE_CONSTANT.
%FORMAT(5E16.8)
There are NUMANG floating point numbers in this section.

DIHEDRAL_FORCE_CONSTANT
Torsion energies are calculated for each term according to the equation

$$E_{torsion} = k_{tor} \cos(n\phi + \psi) \tag{B}$$

This section lists the torsion force constants ($k_{tor}$ in Eq. B) in units of kcal mol$^{-1}$ for each unique torsion type. Each torsion in DIHEDRALS_INC_HYDROGEN and DIHEDRALS_WITHOUT_HYDROGEN has an index into this array.
Amber parameter files contain a dividing factor and barrier height for each dihedral. The barrier heights in the parameter files are divided by the provided factor inside tleap, and the factor is then discarded. As a result, the torsion barriers in this section might not match those in the original parameter files.
%FORMAT(5E16.8)
There are NPTRA floating point numbers in this section.

DIHEDRAL_PERIODICITY
This section lists the periodicity ($n$ in Eq. B) for each unique torsion type. It is indexed the same way as DIHEDRAL_FORCE_CONSTANT. NOTE: only integers are read by tleap, although the AMBER codes support non-integer periodicities.
%FORMAT(5E16.8)
There are NPTRA floating point numbers in this section.
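The layout rules from Sec. B.1 (a %FLAG line, optional %COMMENT lines, a %FORMAT line, then fixed-width data) can be sketched as a small Python parser. This is a minimal sketch under stated assumptions, not the parser used by ParmEd or any Amber program: the function name, the regular expression, and the example text (including its date stamp and values) are hypothetical, and only the a, I, and E format types that appear in this appendix are handled.

```python
import re

# Matches formats such as (20a4), (10I8), and (5E16.8):
# repeat count, type letter, field width, optional decimals.
FORMAT_RE = re.compile(r"\((\d+)([aAiIeE])(\d+)(?:\.\d+)?\)")
CONVERTERS = {"a": str, "i": int, "e": float}


def parse_prmtop(text):
    """Return {flag: list of values} for every %FLAG section in text."""
    sections = {}
    flag = width = convert = None
    for line in text.splitlines():
        if line.startswith("%VERSION") or line.startswith("%COMMENT"):
            continue  # comments are optional and may appear any number of times
        if line.startswith("%FLAG"):
            flag = line[5:].strip()
            sections[flag] = []
            width = convert = None
        elif line.startswith("%FORMAT"):
            match = FORMAT_RE.search(line)
            if match is None:
                raise ValueError("unsupported format: " + line)
            convert = CONVERTERS[match.group(2).lower()]
            width = int(match.group(3))
        elif flag is not None and width is not None:
            # Slice fixed-width fields instead of splitting on whitespace,
            # because 4-character names are not whitespace-delimited.
            for start in range(0, len(line), width):
                field = line[start:start + width].strip()
                if field:
                    sections[flag].append(convert(field))
    return sections


# Made-up two-section example; sections may come in any order.
example = """%VERSION  VERSION_STAMP = V0001.000  DATE = 01/01/13  12:00:00
%FLAG ATOM_NAME
%FORMAT(20a4)
N   H1  H2  HD11CA
%FLAG MASS
%COMMENT masses in g/mol
%FORMAT(5E16.8)
  1.40100000E+01  1.00800000E+00  1.00800000E+00  1.00800000E+00  1.20100000E+01
"""
parsed = parse_prmtop(example)
```

Slicing by field width is the design choice that matters here: splitting the ATOM_NAME line on whitespace would fuse the adjacent names HD11 and CA into one token.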


DIHEDRAL_PHASE
This section lists the phase shift ($\psi$ in Eq. B) for each unique torsion type in radians. It is indexed the same way as DIHEDRAL_FORCE_CONSTANT.
%FORMAT(5E16.8)
There are NPTRA floating point numbers in this section.

SCEE_SCALE_FACTOR
This section was introduced in Amber 11. In previous versions, this variable was part of the input file and set a single scaling factor for every torsion.
This section lists the factor by which 1-4 electrostatic interactions are divided (i.e., the two atoms on either end of a torsion). For torsion types in which 1-4 non-bonded interactions are not calculated (e.g., improper torsions, multi-term torsions, and those involved in ring systems of 6 or fewer atoms), a value of 0 is assigned by tleap. This section is indexed the same way as DIHEDRAL_FORCE_CONSTANT.
%FORMAT(5E16.8)
There are NPTRA floating point numbers in this section.

SCNB_SCALE_FACTOR
This section was introduced in Amber 11. In previous versions, this variable was part of the input file and set a single scaling factor for every torsion.
This section lists the factor by which 1-4 van der Waals interactions are divided (i.e., the two atoms on either end of a torsion). This section is analogous to SCEE_SCALE_FACTOR described above.
%FORMAT(5E16.8)
There are NPTRA floating point numbers in this section.

SOLTY
This section is currently unused, and while `future use' is planned, this assertion has lain dormant for some time.
%FORMAT(5E16.8)


There are NATYP floating point numbers in this section.

LENNARD_JONES_ACOEF

LJ non-bonded interactions are calculated according to the equation

$E_{LJ} = \frac{a_{i,j}}{r^{12}} - \frac{b_{i,j}}{r^{6}}$

This section contains the LJ A-coefficients ($a_{i,j}$ in the LJ equation above) for all pairs of distinct LJ types (see sections ATOM_TYPE_INDEX and NONBONDED_PARM_INDEX above).

%FORMAT(5E16.8)

There are [NTYPES × (NTYPES + 1)] / 2 floating point numbers in this section.

LENNARD_JONES_BCOEF

This section contains the LJ B-coefficients ($b_{i,j}$ in the LJ equation above) for all pairs of distinct LJ types (see sections ATOM_TYPE_INDEX and NONBONDED_PARM_INDEX above).

%FORMAT(5E16.8)

There are [NTYPES × (NTYPES + 1)] / 2 floating point numbers in this section.

BONDS_INC_HYDROGEN

This section contains a list of every bond in the system in which at least one atom is hydrogen. Each bond is identified by 3 integers: the two atoms involved in the bond and the index into the BOND_FORCE_CONSTANT and BOND_EQUIL_VALUE arrays. For run-time efficiency, the atom indexes are actually indexes into a coordinate array, so the actual atom index A is calculated from the coordinate array index N by A = N/3 + 1. (N is the value in the topology file.)

%FORMAT(10I8)

There are 3 × NBONH integers in this section.

BONDS_WITHOUT_HYDROGEN

This section contains a list of every bond in the system in which neither atom is hydrogen. It has the same structure as BONDS_INC_HYDROGEN described above.

%FORMAT(10I8)
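As an illustration of the bond list layout and the A = N/3 + 1 conversion described for BONDS_INC_HYDROGEN, the sketch below decodes a flat integer array into bond triples and evaluates the size of the LJ coefficient tables. The function names and the sample array are hypothetical; this is not code from tleap, sander, or pmemd.

```python
def decode_bonds(raw):
    """Split a flat BONDS_* array into (atom_i, atom_j, type_index) triples.

    The stored atom values are coordinate-array indexes N; the atom
    number A (starting from 1) is recovered with A = N/3 + 1.
    """
    if len(raw) % 3 != 0:
        raise ValueError("bond arrays hold 3 integers per bond")
    bonds = []
    for k in range(0, len(raw), 3):
        n_i, n_j, type_index = raw[k:k + 3]
        bonds.append((n_i // 3 + 1, n_j // 3 + 1, type_index))
    return bonds


def lj_table_size(ntypes):
    """Number of LJ coefficients stored: [NTYPES * (NTYPES + 1)] / 2."""
    return ntypes * (ntypes + 1) // 2


# Hypothetical array describing two bonds: atoms 1-2 (type 1) and 2-3 (type 2).
example_bonds = decode_bonds([0, 3, 1, 3, 6, 2])
```

The type index is left exactly as stored, since it is a pointer into the parameter arrays rather than a coordinate index.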


There are 3 × NBONA integers in this section.

ANGLES_INC_HYDROGEN

This section contains a list of every angle in the system in which at least one atom is hydrogen. Each angle is identified by 4 integers: the three atoms involved in the angle and the index into the ANGLE_FORCE_CONSTANT and ANGLE_EQUIL_VALUE arrays. For run-time efficiency, the atom indexes are actually indexes into a coordinate array, so the actual atom index A is calculated from the coordinate array index N by A = N/3 + 1. (N is the value in the topology file.)

%FORMAT(10I8)

There are 4 × NTHETH integers in this section.

ANGLES_WITHOUT_HYDROGEN

This section contains a list of every angle in the system in which no atom is hydrogen. It has the same structure as ANGLES_INC_HYDROGEN described above.

%FORMAT(10I8)

There are 4 × NTHETA integers in this section.

DIHEDRALS_INC_HYDROGEN

This section contains a list of every torsion in the system in which at least one atom is hydrogen. Each torsion is identified by 5 integers: the four atoms involved in the torsion and the index into the DIHEDRAL_FORCE_CONSTANT, DIHEDRAL_PERIODICITY, DIHEDRAL_PHASE, SCEE_SCALE_FACTOR and SCNB_SCALE_FACTOR arrays. For run-time efficiency, the atom indexes are actually indexes into a coordinate array, so the actual atom index A is calculated from the coordinate array index N by A = N/3 + 1. (N is the value in the topology file.)

If the third atom is negative, then the 1-4 non-bonded interactions for this torsion are not calculated. This is required to avoid double-counting these non-bonded interactions in some ring systems and in multi-term torsions. If the fourth atom is negative, then the torsion is improper.


NOTE: The first atom has an index of zero. Since 0 cannot be negative and the 3rd and 4th atom indexes are tested for their sign to determine if 1-4 terms are calculated, the first atom in the topology file must be listed as either the first or second atom in whatever torsions it is defined in. The atom ordering in a torsion can be reversed to accommodate this requirement if necessary.

%FORMAT(10I8)

There are 5 × NPHIH integers in this section.

DIHEDRALS_WITHOUT_HYDROGEN

This section contains a list of every torsion in the system in which no atom is hydrogen. It has the same structure as DIHEDRALS_INC_HYDROGEN described above.

%FORMAT(10I8)

There are 5 × NPHIA integers in this section.

EXCLUDED_ATOMS_LIST

This section contains a list for each atom of excluded partners in the non-bonded calculation routines. The subset of this list that belongs to each atom is determined from the pointers in NUMBER_EXCLUDED_ATOMS (see that section for more information).

NOTE: The periodic boundary code in sander and pmemd currently recalculates this section of the topology file. The GB code, however, uses the exclusion list defined in the topology file.

%FORMAT(10I8)

There are NNB integers in this section.

HBOND_ACOEF

This section is analogous to the LENNARD_JONES_ACOEF array described above, but refers to the A-coefficient in a 12-10 potential instead of the familiar 12-6 potential. This term has been dropped from most modern force fields.

%FORMAT(5E16.8)

There are NPHB floating point numbers in this section.
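The sign conventions described for DIHEDRALS_INC_HYDROGEN and DIHEDRALS_WITHOUT_HYDROGEN can be sketched as follows. This is only an illustrative decoder with hypothetical names and sample entries; it assumes the 5-integer layout and the A = N/3 + 1 conversion given in those sections.

```python
def decode_dihedral(entry):
    """Interpret one 5-integer torsion entry from a DIHEDRALS_* section."""
    n_i, n_j, n_k, n_l, type_index = entry
    return {
        # The sign is only a flag, so strip it before converting N to A.
        "atoms": tuple(abs(n) // 3 + 1 for n in (n_i, n_j, n_k, n_l)),
        "type_index": type_index,
        "skip_14": n_k < 0,   # negative third atom: no 1-4 non-bonded term
        "improper": n_l < 0,  # negative fourth atom: improper torsion
    }


# Hypothetical entries: a proper torsion with its 1-4 term skipped,
# and an improper torsion.
proper = decode_dihedral((0, 3, -6, 9, 2))
improper = decode_dihedral((12, 15, 18, -21, 1))
```

Because the flags are carried by the sign, a coordinate index of 0 can never be flagged, which is why the first atom in the file cannot appear in the third or fourth position.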


HBOND_BCOEF

This section is analogous to the LENNARD_JONES_BCOEF array described above, but refers to the B-coefficient in a 12-10 potential instead of the familiar 12-6 potential. This term has been dropped from most modern force fields.

%FORMAT(5E16.8)

There are NPHB floating point numbers in this section.

HBCUT

This section used to hold a cutoff parameter for the 12-10 potential, but is no longer used for anything.

%FORMAT(5E16.8)

There are NPHB floating point numbers in this section.

AMBER_ATOM_TYPE

This section contains the atom type name for every atom in the prmtop.

%FORMAT(20a4)

There are NATOM 4-character strings in this section.

TREE_CHAIN_CLASSIFICATION

This section contains information about the tree structure (borrowing concepts from graph theory) of each atom. Each atom can have one of the following character indicators:

M    This atom is part of the main chain
S    This atom is part of the side chain
E    This atom is a chain-terminating atom (i.e., an end atom)
3    The structure branches into 3 chains at this point
BLA  If none of the above is true

%FORMAT(20a4)

There are NATOM 4-character strings in this section.


JOIN_ARRAY

This section is no longer used and is currently just filled with zeros.

%FORMAT(10I8)

There are NATOM integers in this section.

IROTAT

This section is not used and is currently just filled with zeros.

%FORMAT(10I8)

There are NATOM integers in this section.

SOLVENT_POINTERS

This section is only present if IFBOX is greater than 0 (i.e., if the system was set up for use with periodic boundary conditions). There are 3 integers present in this section: the final residue that is part of the solute (IPTRES), the total number of `molecules' (NSPM), and the first solvent `molecule' (NSPSOL). A `molecule' is defined as a closed graph; that is, there is a pathway from every atom in a molecule to every other atom in the molecule by traversing bonds, and there are no pathways to `other' molecules.

    %FLAG SOLVENT_POINTERS
    %FORMAT(3I8)
    IPTRES NSPM NSPSOL

ATOMS_PER_MOLECULE

This section is only present if IFBOX is greater than 0 (i.e., if the system was set up for use with periodic boundary conditions). This section lists how many atoms are present in each `molecule' as defined in the SOLVENT_POINTERS section above.

%FORMAT(10I8)

There are NSPM integers in this section (see the SOLVENT_POINTERS section above).
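The closed-graph definition of a `molecule' given under SOLVENT_POINTERS can be illustrated with a small union-find sketch that groups atoms connected by bonds and counts the atoms in each group. This is not how any AMBER program is known to compute ATOMS_PER_MOLECULE; the function name, the 0-based atom numbering, and the sample system are assumptions made for the example.

```python
def atoms_per_molecule(natom, bonds):
    """Count atoms in each bonded group ('molecule'), ordered by first atom.

    natom -- number of atoms, numbered 0 .. natom-1 (assumed convention)
    bonds -- iterable of (i, j) atom-index pairs
    """
    parent = list(range(natom))

    def find(a):
        # Follow parent links to the group representative, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j in bonds:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Keep the lowest atom index as the representative of the group.
            parent[max(root_i, root_j)] = min(root_i, root_j)

    counts = {}
    for atom in range(natom):
        root = find(atom)
        counts[root] = counts.get(root, 0) + 1
    return [counts[root] for root in sorted(counts)]


# Hypothetical system: two 3-atom waters followed by one unbonded ion.
sizes = atoms_per_molecule(7, [(0, 1), (0, 2), (3, 4), (3, 5)])
```

In this picture the length of the returned list plays the role of NSPM.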


BOX_DIMENSIONS

This section is only present if IFBOX is greater than 0 (i.e., if the system was set up for use with periodic boundary conditions). This section lists the box angle (OLDBETA) and dimensions (BOX(1) × BOX(2) × BOX(3)). The values in this section are now deprecated, since newer and more accurate information about the box size and shape is stored in the coordinate file. Since constant pressure simulations can change the box dimensions, the values in the coordinate file should be trusted over those in the topology file.

    %FLAG BOX_DIMENSIONS
    %FORMAT(5E16.8)
    OLDBETA BOX(1) BOX(2) BOX(3)

CAP_INFO

This section is present only if IFCAP is not 0. If present, it contains a single integer, which is the last atom before the water cap begins (NATCAP).

%FORMAT(10I8)

CAP_INFO2

This section is present only if IFCAP is not 0. If present, it contains four numbers: the distance from the center of the cap to outside the cap (CUTCAP), and the Cartesian coordinates of the cap center.

    %FLAG CAP_INFO2
    %FORMAT(5E16.8)
    CUTCAP XCAP YCAP ZCAP

RADIUS_SET

This section contains a one-line string (up to 80 characters) describing the intrinsic implicit solvent radii set that is defined in the topology file. The available radii sets with their 1-line descriptions are:

bondi    Bondi radii (bondi)
amber6   amber6 modified Bondi radii (amber6)


mbondi   modified Bondi radii (mbondi)
mbondi2  H(N)-modified Bondi radii (mbondi2)
mbondi3  ArgH and AspGluO modified Bondi2 radii (mbondi3)

%FORMAT(1a80)

There is a single line description in this section.

RADII

This section contains the intrinsic radii of every atom used for implicit solvent calculations (typically Generalized Born).

%FORMAT(5E16.8)

There are NATOM floating point numbers in this section.

IPOL

This section was introduced in Amber 12. In previous versions of Amber, this was a variable in the input file. This section contains a single integer that is 0 for fixed-charge force fields and 1 for force fields that contain polarization.

%FORMAT(1I8)

POLARIZABILITY

This section is only present if IPOL is not 0. It contains the atomic polarizabilities for every atom in the prmtop.

%FORMAT(5E16.8)

There are NATOM floating point numbers in this section.

B.3 Deprecated Sections

All of the sections of the topology file listed here are only present if IFPERT is 1. However, no modern programs support such prmtops, so these sections are rarely (if ever) used. They are included in Table B-1 for completeness only. More information can be found online at http://ambermd.org/formats.html


Table B-1. List of all of the perturbed topology file sections.

FLAG name               %FORMAT   # of values   Description
PERT_BOND_ATOMS         10I8      2 × NBPER     perturbed bond list
PERT_BOND_PARAMS        10I8      2 × NBPER     perturbed bond pointers
PERT_ANGLE_ATOMS        10I8      3 × NGPER     perturbed angle list
PERT_ANGLE_PARAMS       10I8      2 × NGPER     perturbed angle pointers
PERT_DIHEDRAL_ATOMS     10I8      4 × NDPER     perturbed torsion list
PERT_DIHEDRAL_PARAMS    10I8      2 × NDPER     perturbed torsion pointers
PERT_RESIDUE_NAME       20a4      NRES          end state residue names
PERT_ATOM_NAME          20a4      NATOM         end state atom names
PERT_ATOM_SYMBOL        20a4      NATOM         end state atom types
ALMPER                  5E16.8    NATOM         Unused
IAPER                   10I8      NATOM         Is Atom PERturbed?
PERT_ATOM_TYPE_INDEX    10I8      NATOM         Perturbed LJ Type
PERT_CHARGE             5E16.8    NATOM         Perturbed charge

B.4 CHAMBER Topologies

Here we will describe the general format of topology files generated by the chamber program. The chamber program was developed to translate CHARMM topology (PSF) files into Amber topology files for use with the AMBER program suite. Due to differences in the CHARMM force field (e.g., the extra CMAP and Urey-Bradley terms and the different way that improper dihedrals are treated), chamber topologies contain more sections than Amber topologies. Furthermore, to ensure rigorous reproduction of CHARMM energies inside the AMBER program suite, some of the sections that are common between AMBER and CHARMM topology files have a different format for their data to support a different level of input data precision.

Due to the differences in the chamber topology files, a mechanism to differentiate between chamber topologies and AMBER topologies was introduced. If the topology file has a %FLAG TITLE, then it is an AMBER topology. If it has a %FLAG CTITLE instead, then it is a chamber topology.

The following sections of the chamber topology are exactly the same as those from the AMBER topology files:

POINTERS


ATOM_NAME
MASS
ATOM_TYPE_INDEX
NUMBER_EXCLUDED_ATOMS
EXCLUDED_ATOMS_LIST
NONBONDED_PARM_INDEX
RESIDUE_LABEL
BOND_FORCE_CONSTANT
BOND_EQUIL_VALUE
ANGLE_FORCE_CONSTANT
DIHEDRAL_FORCE_CONSTANT
DIHEDRAL_PERIODICITY
DIHEDRAL_PHASE
SCEE_SCALE_FACTOR
SCNB_SCALE_FACTOR
SOLTY
BONDS_INC_HYDROGEN
BONDS_WITHOUT_HYDROGEN
ANGLES_INC_HYDROGEN
ANGLES_WITHOUT_HYDROGEN
DIHEDRALS_INC_HYDROGEN
DIHEDRALS_WITHOUT_HYDROGEN
HBOND_ACOEF
HBOND_BCOEF
HBCUT
AMBER_ATOM_TYPE


TREE_CHAIN_CLASSIFICATION (not really supported; every entry is BLA)
JOIN_ARRAY
IROTAT
RADIUS_SET
RADII
SCREEN
SOLVENT_POINTERS
ATOMS_PER_MOLECULE

Table B-2 lists the sections that have the same name and the same data, but a different Fortran format identifier.

Table B-2. List of flags that are common between Amber and chamber topology files, but have different FORMAT identifiers.

FLAG name              AMBER Format   chamber Format
CHARGE                 5E16.8         3E24.16
ANGLE_EQUIL_VALUE      5E16.8         3E25.17
LENNARD_JONES_ACOEF    5E16.8         3E24.16
LENNARD_JONES_BCOEF    5E16.8         3E24.16

FORCE_FIELD_TYPE

This section is a description of the CHARMM force field that is parametrized in the topology file. It is a single line (it can be read as a single string of length 80 characters). It does not affect any numerical results.

%FORMAT(i2,a78)

CHARMM_UREY_BRADLEY_COUNT

This section contains the number of Urey-Bradley parameters printed in the topology file. It contains two integers: the total number of Urey-Bradley terms (NUB) and the number of unique Urey-Bradley types (NUBTYPES).


    %FLAG CHARMM_UREY_BRADLEY_COUNT
    %FORMAT(2i8)
    NUB NUBTYPES

CHARMM_UREY_BRADLEY

This section contains all of the Urey-Bradley terms. It is formatted exactly like BONDS_INC_HYDROGEN and BONDS_WITHOUT_HYDROGEN.

%FORMAT(10i8)

There are 3 × NUB integers in this section.

CHARMM_UREY_BRADLEY_FORCE_CONSTANT

This section contains all of the force constants for each unique Urey-Bradley term in kcal mol$^{-1}$ Å$^{-2}$. It is formatted exactly the same as BOND_FORCE_CONSTANT.

%FORMAT(5E16.8)

There are NUBTYPES floating point numbers in this section.

CHARMM_UREY_BRADLEY_EQUIL_VALUE

This section contains all of the equilibrium distances for each unique Urey-Bradley term in Å. It is formatted exactly the same as BOND_EQUIL_VALUE.

%FORMAT(5E16.8)

There are NUBTYPES floating point numbers in this section.

CHARMM_NUM_IMPROPERS

This section contains the number of improper torsions in the topology file. It contains one integer, the total number of improper torsions.

    %FLAG CHARMM_NUM_IMPROPERS
    %FORMAT(i8)
    NIMPHI

CHARMM_IMPROPERS

This section contains all of the improper torsion terms. It is formatted exactly like DIHEDRALS_INC_HYDROGEN and DIHEDRALS_WITHOUT_HYDROGEN.


%FORMAT(10i8)

There are 5 × NIMPHI integers in this section.

CHARMM_NUM_IMPR_TYPES

This section contains the number of unique improper torsion types in the topology file. It contains one integer, the total number of improper torsion types.

    %FLAG CHARMM_NUM_IMPR_TYPES
    %FORMAT(i8)
    NIMPRTYPES

CHARMM_IMPROPER_FORCE_CONSTANT

This section contains the force constant for each unique improper torsion type. It is formatted exactly like DIHEDRAL_FORCE_CONSTANT.

%FORMAT(5E16.8)

There are NIMPRTYPES floating point numbers in this section.

CHARMM_IMPROPER_PHASE

This section contains the phase shift for each unique improper torsion type. It is formatted exactly like DIHEDRAL_PHASE.

%FORMAT(5E16.8)

There are NIMPRTYPES floating point numbers in this section.

LENNARD_JONES_14_ACOEF

Instead of scaling the 1-4 van der Waals interactions, the CHARMM force field assigns an entirely different set of LJ parameters to each atom type for these interactions. Therefore, chamber topologies have two extra sections that hold the LJ parameters for 1-4 interactions. These tables are set up in the same way as LENNARD_JONES_ACOEF and LENNARD_JONES_BCOEF in chamber topologies.

%FORMAT(5E16.8)

There are [NTYPES × (NTYPES + 1)] / 2 floating point numbers in this section.


LENNARD_JONES_14_BCOEF

This section contains the LJ B-coefficients for 1-4 interactions. See LENNARD_JONES_14_ACOEF above.

%FORMAT(5E16.8)

There are [NTYPES × (NTYPES + 1)] / 2 floating point numbers in this section.

CHARMM_CMAP_COUNT

This section contains two integers: the total number of correction map (CMAP) terms and the number of unique CMAP `types.'

    %FLAG CHARMM_CMAP_COUNT
    %FORMAT(2i8)
    CMAP_TERM_COUNT CMAP_TYPE_COUNT

CHARMM_CMAP_RESOLUTION

This section stores the resolution (i.e., the number of steps along each phi/psi CMAP axis) for each CMAP grid.

%FORMAT(20I4)

There are CMAP_TERM_COUNT integers in this section.

CHARMM_CMAP_PARAMETER

There are CMAP_TYPE_COUNT of these sections; each flag name carries a 2-digit integer suffix beginning from 01. Each section holds a 2-dimensional Fortran array whose elements are stored as a 1-D sequence in column-major order.

%FORMAT(8(F9.5))

There are CHARMM_CMAP_RESOLUTION(i)$^2$ floating point numbers in this section, where i is the integer suffix in the FLAG title.
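The column-major storage described for the CMAP parameter sections can be illustrated with the short sketch below, which rebuilds a square grid from its 1-D sequence. The function name and the sample values are hypothetical, and which grid axis corresponds to phi and which to psi is not specified here.

```python
def cmap_grid(values, resolution):
    """Rebuild a resolution x resolution grid from a column-major 1-D sequence.

    In Fortran (column-major) order the first index varies fastest, so
    element (row, col) is stored at position row + col * resolution.
    """
    if len(values) != resolution * resolution:
        raise ValueError("expected resolution**2 values")
    return [
        [values[row + col * resolution] for col in range(resolution)]
        for row in range(resolution)
    ]


# Hypothetical 2 x 2 grid stored as the sequence 1, 2, 3, 4.
grid = cmap_grid([1.0, 2.0, 3.0, 4.0], 2)
```

Reading the same sequence in row-major order would transpose the grid, so the storage order matters whenever the correction map is not symmetric.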


APPENDIX C
MESSAGE PASSING INTERFACE

In this appendix, I will briefly describe the Message Passing Interface (MPI) model, which is used extensively in the field of computational chemistry to enable large-scale parallelism on modern supercomputer architectures. Pacheco authored a particularly useful text for learning MPI programming.[202]

C.1 Parallel Computing

C.1.1 Data Models

Generally speaking, programs fall into one of two categories with regard to how data handling and processing are parallelized. The first approach uses multiple threads to run the same program or executable, each of which works on a different set of data, an approach called Single Program Multiple Data (SPMD). The second approach uses multiple threads that each run a different program on a different set of data, an approach called Multiple Program Multiple Data (MPMD).

MPI supports both SPMD and MPMD data models, with support for MPMD being introduced with the adoption of the MPI-2 standard. With the exception of some specialized QM/MM functionality in sander, MPI-enabled programs in Amber use the SPMD approach to parallelization, including all codes I contributed.

C.1.2 Memory Layout

In addition to the various approaches for parallelizing data processing, parallel programs fall into one of two broad families with respect to memory layout and access. An approach in which all processors share a common memory bank is called shared memory parallelization (SMP). This is the approach used by the OpenMP API that is implemented by most standard C and Fortran compilers. The other approach, called distributed memory parallelization, defines separate memory buffers for each process,


and each process can only modify its own memory buffer. The MPI implements the latter form of parallelism.

Shared memory and distributed memory parallelism each offer different advantages and disadvantages with respect to each other. In SMP, one thread can access data that has previously been manipulated by a different process without requiring that the result be copied and passed between processes. In distributed parallelism, however, the lack of required shared memory means that not all processes need access to the same memory bank, allowing tasks to be distributed across different physical computers.

The difference between distributed and shared memory parallelization can be visualized by considering a number of talented craftsmen constructing a complex machine in a workshop. SMP is analogous to crowding multiple workers around a single workbench with a single set of tools or instruments. Each worker can perform a separate task toward completing the project at the same time other workers are performing their tasks. Furthermore, as soon as one worker finishes their task and returns the result to the workbench, the result is immediately accessible to every other worker at the table. Of course, the number of workers that can work at the table and the physical size of the total project are limited by the number of tools present at the workbench and the size of that table, respectively. By analogy, the number of tools can be thought of as the number of processing cores available, while the size of the table is analogous to the amount of available shared memory.

Distributed memory parallelization schemes like MPI, on the other hand, are akin to providing each worker with their own workbench where they perform whatever tasks are assigned to them. When one worker's task requires the result of another's work, the required materials must be transported, or `communicated,' between the two workers. This inter-workbench communication introduces a latency that is not present in SMP. However, the size of the project is no longer limited by the size of the workbench, but rather by whether or not the individual pieces can fit on any of the available workbenches. In


this case, the room is a cluster of computers, and each table is a separate processing core available in that cluster. Since most modern supercomputers are composed of large numbers of smaller, interconnected computers, distributed memory programs must be used for large, scalable applications.

Unsurprisingly, peak parallel performance leverages the capabilities of both distributed and shared memory parallelization to optimize load balancing across the available resources and to minimize communication requirements. On a typical computer cluster or supercomputer, there are a small number of cores on each individual machine (between 8 and 48 are currently commonplace), and these machines are placed in a network connecting hundreds, thousands, or even tens of thousands of them. Using SMP within a single node as part of a larger, distributed application allows programs to take advantage of the strengths of both programming models.[203] Using the analogy above, this approach is equivalent to placing multiple workers around each of multiple workbenches, such that SMP takes place within a single workbench, and data and materials have to be `communicated' between different ones.

C.1.3 Thread Count

A process, or thread, is an instance of an instruction set by which a processing unit operates on data. Drawing again on our analogy, a thread is equivalent to a single worker at a single workbench. Some parallel programming APIs use a dynamic thread count, so that new threads are launched when they are needed and ended when they are not. The OpenMP API operates this way. This is akin to more workers being called to work on the complex, labor-intensive parts of the manufacturing process and having them leave after that part of the task is finished. This way, a parallelization strategy is only necessary for particularly time-consuming parts of the computational process.

The MPI approach, on the other hand, employs a static thread count set before the program is initially launched, and this number never changes. In this case, the workers are brought into the workroom and the room is then locked. Each worker is assigned a


workbench and a set of instructions to follow based on the ID card they received when they entered the room.

C.2 The Mechanics of MPI

At its most basic level, MPI consists of a series of API calls that allow threads to communicate the contents of their memory to each other so that they may coordinate efforts on a single task. While any parallel program may be constructed by simply allowing any two threads to send and receive data, MPI provides an extensive set of communication options to simplify creating efficient parallel programs.

C.2.1 Messages

In MPI, data that is sent and received between threads is referred to as a message, and the act of passing data between threads is called communication. The following sections will describe how messages are passed via communication within MPI.

C.2.2 Communicators

A communicator is a grouping of threads within an MPI universe between which messages may be passed. All messages sent and received in an MPI program are passed through a particular communicator. Each thread within a communicator is given a unique identity within that communicator, called its rank, that is an integer value between 0 and N-1, where N is the size of the communicator (i.e., the number of threads that define it). The ranks within a communicator can be used to assign different processors to different portions of the total work.

Communicators can be created and destroyed as desired within an MPI program, and are very useful tools for assigning a subset of the available threads to a particular task. There is one communicator, MPI_COMM_WORLD, that is created when an MPI program is launched and that links every thread.

C.2.3 Communications

Communicating data between threads is the heart of parallelizing a program using MPI. As mentioned above, a program may be fully parallelized using MPI by only


defining simple send and receive calls between two threads. However, the optimal set of sends and receives depends strongly on where the threads are placed, the bandwidth and latency of the connection between them, and the sheer number of such calls that are required for a particular task. To facilitate the creation of portable, efficient parallel programs, MPI provides an expansive set of communication functions that abstract away the complexity of optimizing communications. The following sections will briefly describe the three main families of communications as well as some representative examples within those families.

C.2.3.1 Point-to-point

The simplest set of communications involves exchanging data between two threads. These communications are the cheapest individual MPI communications to use, since they require communication between the minimum number of threads: two. Example functions in this family include MPI_Send, MPI_Recv, and MPI_Sendrecv. MPI_Send sends data from one process to another, and MPI_Recv explicitly receives the sent data. Every send must have a corresponding receive call on the destination thread to complete the communication. The last function, MPI_Sendrecv, combines a send and a receive in the same function. The effect of these functions is shown in Fig. C-1.

C.2.3.2 All-to-one and One-to-all

The next family of communications occurs between a specified root thread and every other thread within a communicator. These functions involve more costly communication than the point-to-point communications described above, since they require at least as many messages to be sent as there are threads in the communicator. However, specific MPI implementations can optimize these functions with respect to the naive implementation, typically making them more efficient than alternatives implemented via a series of point-to-point communications.
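As noted in Section C.2.2, the rank of each thread can be used to assign it a portion of the total work. The sketch below shows one common way of doing so, a contiguous block decomposition, in plain Python without any MPI library. The function name is hypothetical, and this is not necessarily the decomposition used by any Amber program.

```python
def block_range(n_items, size, rank):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split as evenly as possible among `size` ranks; when the
    division is not exact, the lowest-numbered ranks take one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Ten work items divided over a communicator of size 4.
ranges = [block_range(10, 4, rank) for rank in range(4)]
```

Every rank evaluates the same function with its own rank, so the decomposition is consistent across threads without any communication, which fits the SPMD approach described in Section C.1.1.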


Figure C-1. Schematic of different point-to-point communications. Threads and data are shown as ovals and boxes, respectively, with arrows indicating the lines of communication.

Examples in this family include MPI_Bcast, MPI_Gather, MPI_Scatter, and MPI_Reduce. MPI_Bcast is a broadcast that sends data from the root thread to every other thread in a communicator. MPI_Gather collects data from all threads into an array on the root thread. MPI_Scatter operates similarly to MPI_Bcast, except that it divides the data sent by the root into equal-sized chunks that are sent out to every thread in the communicator (this is effectively the inverse of an MPI_Gather call). Finally, MPI_Reduce takes an array of data on each thread and combines them via some mathematical operation (e.g., a sum, product, or maximum) into the final result on the root thread. These functions are demonstrated diagrammatically in Fig. C-2.

C.2.3.3 All-to-all

The last family of communications involves transferring data from every thread in a communicator to every other thread. Examples include MPI_Allgather, MPI_Allreduce, and MPI_Alltoall. These are the most expensive of all MPI communications since


they involve the greatest amount of communication. As a result, they should be avoided whenever possible. However, due to the complexity of the required communication, these functions are the best candidates for performance optimization and tuning within an MPI implementation. As a result, when such communication is required, programs should not attempt to implement their own, equivalent alternatives.

Figure C-2. Schematic of different all-to-one and one-to-all communications. Threads and data are shown as ovals and boxes, respectively, with arrows indicating the lines of communication. Communicators are shown as dotted lines enclosing all the threads in the communicator. The `root' thread in all communications is the top oval.

MPI_Allgather and MPI_Allreduce are logically equivalent to invoking an MPI_Bcast call from the root thread following either an MPI_Gather or MPI_Reduce call to that root. The MPI_Alltoall function behaves like an MPI_Gather to a root process followed by an MPI_Scatter from that root. The MPI_Allgather is the most expensive of the all-to-all communications given the increased amount of data that must be transmitted between threads. Fig. C-3 illustrates how these all-to-all communications work.
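The logical relationships between the collective communications described above can be sketched with a toy model in plain Python, in which a list indexed by rank stands in for the data held by each thread. This is not real MPI code and none of these functions belong to an MPI library; the names and data are hypothetical and only the data movement is modeled.

```python
import operator
from functools import reduce as fold


def gather(per_rank, root=0):
    """The root ends up with one item from every rank; other ranks get None."""
    return [list(per_rank) if r == root else None for r in range(len(per_rank))]


def bcast(value, size):
    """Every rank ends up with the value held by the root."""
    return [value for _ in range(size)]


def reduce_to_root(per_rank, op, root=0):
    """The root ends up with all contributions combined by `op`."""
    total = fold(op, per_rank)
    return [total if r == root else None for r in range(len(per_rank))]


def allgather(per_rank):
    """Modeled as a gather to rank 0 followed by a broadcast from rank 0."""
    return bcast(gather(per_rank)[0], len(per_rank))


def allreduce(per_rank, op):
    """Modeled as a reduce to rank 0 followed by a broadcast from rank 0."""
    return bcast(reduce_to_root(per_rank, op)[0], len(per_rank))


def alltoall(per_rank):
    """per_rank[s][d] is the item rank s sends to rank d."""
    size = len(per_rank)
    return [[per_rank[s][d] for s in range(size)] for d in range(size)]


gathered_everywhere = allgather([1, 2, 3])
summed_everywhere = allreduce([1, 2, 3], operator.add)
exchanged = alltoall([[1, 2], [3, 4]])
```

A real MPI implementation is free to use a more efficient communication pattern than the gather-then-broadcast composition modeled here, which is one reason to prefer the library's own all-to-all functions.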


Figure C-3. Schematic of different all-to-all communications. Threads and data are shown as ovals and boxes, respectively, with arrows indicating where data is transferred to and from.

C.2.4 Blocking vs. Non-blocking Communications

In general, communications within MPI fall into one of two categories: so-called blocking and non-blocking communications. Blocking communications require that the communication complete before the program can continue. Non-blocking communications, on the other hand, return immediately and allow the program to continue executing code while waiting for the communication to complete. All communications involving more than two threads (i.e., one-to-all and all-to-all) are blocking.

There is a special MPI function, MPI_Barrier, whose sole purpose is to block all threads within a communicator from advancing past the barrier until each thread has reached it. Similarly, the MPI_Wait functions prevent a thread from continuing its computations until after the specified non-blocking communications complete.


REFERENCES

[1] Lide, D. R., Frederikse, H. P. R., Brewer, L., Koetzle, T. F., Craig, N. C., Lineberger, W. C., Donnelly, R. J., Smith, A. L., Goldberg, R. N., Westbrook, J. H., Eds. CRC Handbook of Chemistry and Physics; CRC Press, Inc.: New York, 1997.
[2] Schrodinger, E. Phys. Rev. 1926, 28, 1049.
[3] McQuarrie, D. A.; Simon, J. D. Physical Chemistry: A Molecular Approach; University Science Books: Sausalito, CA, 1997.
[4] Jeletic, M. S.; Lowry, R. J.; Swails, J. M.; Ghiviriga, I.; Veige, A. S. J. Organomet. Chem. 2011, 696, 3127.
[5] Chandrasekhar, J.; Smith, S. F.; Jorgensen, W. L. J. Am. Chem. Soc. 1985, 107, 154.
[6] Watson, T. J.; Bartlett, R. J. Chem. Phys. Lett. 2013, 555, 235.
[7] Range, K.; Riccardi, D.; Cui, Q.; Elstner, M.; York, D. M. Phys. Chem. Chem. Phys. 2005, 7, 3070.
[8] Hehre, W.; Radom, L.; von Schleyer, P.; Pople, J. Ab Initio Molecular Orbital Theory; John Wiley and Sons: New York, 1986.
[9] McQuarrie, D. A. Statistical Mechanics; University Science Books: Mill Valley, CA, 1973.
[10] Metropolis, N.; Rosenbluth, A. W.; Rosenbluth, M. N.; Teller, A. H. J. Chem. Phys. 1953, 21, 1087.
[11] Tuckerman, M. E. Statistical Mechanics: Theory and Molecular Simulation; Oxford University Press, 2010.
[12] Leach, A. R. Molecular Modelling: Principles and Applications, 2nd ed.; Prentice Hall, 2001.
[13] McCammon, J. A.; Gelin, B. R.; Karplus, M. Nature 1977, 267, 585.
[14] Woodcock, L. V. Chem. Phys. Lett. 1971, 10, 257.
[15] Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; Dinola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684.
[16] Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J. Comput. Phys. 1977, 23, 327.
[17] Andersen, H. C. J. Comput. Phys. 1983, 52, 24.
[18] Miyamoto, S.; Kollman, P. A. J. Comput. Chem. 1992, 13, 952.


[19] Forester, T. R.; Smith, W. J. Comput. Chem. 1998, 19, 102.
[20] Lee, S.-H.; Palmo, K.; Krimm, S. J. Comput. Phys. 2005, 210, 171.
[21] Kolos, W.; Wolniewicz, L. J. Chem. Phys. 1964, 41, 3663.
[22] Cramer, C. J. Essentials of Computational Chemistry: Theories and Models, 2nd ed.; John Wiley & Sons, Ltd.: 111 River St., Hoboken, NJ 07030, USA, 2004.
[23] Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Proteins 2006, 65, 712.
[24] Perez, A.; Marchan, I.; Svozil, D.; Sponer, J.; Cheatham III, T. E.; Laughton, C. A.; Orozco, M. Biophys. J. 2007, 92, 3817.
[25] Lindorff-Larsen, K.; Stefano, P.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Proteins 2010, 78, 1950.
[26] Bayly, C. I.; Cieplak, P.; Cornell, W. D.; Kollman, P. A. J. Phys. Chem. 1993, 97, 10269.
[27] Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Kollman, P. A. J. Am. Chem. Soc. 1993, 115, 9620.
[28] Cieplak, P.; Cornell, W. D.; Bayly, C.; Kollman, P. A. J. Comput. Chem. 1995, 16, 1357.
[29] Mackerell, Jr., A. D.; Feig, M.; Brooks, III, C. L. J. Comput. Chem. 2004, 25, 1400.
[30] MacKerell, Jr., A. D. et al. J. Phys. Chem. B 1998, 102, 3586.
[31] Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179.
[32] Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T. J. Comput. Chem. 2003, 24, 1999.
[33] Case, D. A.; Cheatham III, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J. J. Comput. Chem. 2005, 26, 1668.
[34] Wang, J.; Cieplak, P.; Kollman, P. A. J. Comput. Chem. 2000, 21, 1049.
[35] Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. J. Comput. Chem. 2004, 25, 1157.
[36] Zgarbova, M.; Otyepka, M.; Sponer, J.; Mladek, A.; Banas, P.; Cheatham III, T. E.; Jurecka, P. J. Chem. Theory Comput. 2011, 7, 2886.


[37] Sitkoff, D.; Sharp, K. A.; Honig, B. J. Phys. Chem. 1994, 98, 1978.
[38] Klapper, I.; Hagstrom, R.; Fine, R.; Sharp, K.; Honig, B. Proteins 1986, 1, 47.
[39] Gilson, M. K.; Sharp, K. A.; Honig, B. H. J. Comput. Chem. 1988, 9, 327.
[40] Baker, N. A.; Sept, D.; Joseph, S.; Holst, M. J.; McCammon, J. A. Proc. Natl. Acad. Sci. USA 2001, 98, 10037.
[41] Nielsen, J. E.; Vriend, G. Proteins 2001, 43, 403.
[42] Wang, J.; Qin, C.; Li, Z.-L.; Zhao, H.-K.; Luo, R. Chem. Phys. Lett. 2009, 468, 112.
[43] Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. J. Am. Chem. Soc. 1990, 112, 6127.
[44] Qiu, D.; Shenkin, P. S.; Hollinger, F. P.; Still, W. C. J. Phys. Chem. A 1997, 101, 3005.
[45] Onufriev, A.; Bashford, D.; Case, D. A. J. Phys. Chem. B 2000, 104, 3712.
[46] Bashford, D.; Case, D. A. Annu. Rev. Phys. Chem. 2000, 51, 129.
[47] Onufriev, A.; Case, D. A.; Bashford, D. J. Comput. Chem. 2002, 23, 1297.
[48] Onufriev, A. V.; Sigalov, G. J. Chem. Phys. 2011, 134, 164104.
[49] Onufriev, A.; Bashford, D.; Case, D. A. Proteins 2004, 55, 383.
[50] Mongan, J.; Simmerling, C.; McCammon, J. A.; Case, D. A.; Onufriev, A. J. Chem. Theory Comput. 2007, 3, 156.
[51] Nguyen, H.; Roe, D. R.; Simmerling, C. J. Chem. Theory Comput. 2013,
[52] Weiser, J.; Shenkin, P. S.; Still, W. C. J. Comput. Chem. 1999, 20, 217.
[53] Mei, C.; Sun, Y.; Zheng, G.; Bohm, E. J.; Kale, L. V.; Phillips, J. C.; Harrison, C. Enabling and scaling biomolecular simulations of 100 million atoms on petascale machines with a multicore-optimized message-driven runtime. 2011; http://doi.acm.org/10.1145/2063384.2063466
[54] Allen, M. P.; Tildesley, D. J. Computer Simulation of Liquids; Oxford science publications; Oxford University Press, USA, 1989.
[55] Schreiber, H.; Steinhauser, O. J. Mol. Biol. 1992, 228, 909.
[56] Schreiber, H.; Steinhauser, O. Biochemistry 1992, 31, 5856.
[57] Saito, M. J. Chem. Phys. 1994, 101, 4055.
[58] Auffinger, P.; Beveridge, D. L. Chem. Phys. Lett. 1995, 234, 413.


[59] Cheatham III, T. E.; Miller, J. L.; Fox, T.; Darden, T. A.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 4193.
[60] Feller, S. E.; Pastor, R. W.; Rojnuckarin, A.; Bogusz, S.; Brooks, B. R. J. Phys. Chem. 1996, 100, 17011.
[61] Patra, M.; Karttunen, M.; Hyvonen, M. T.; Falck, E.; Lindqvist, P.; Vattulainen, I. Biophys. J. 2003, 84, 3636.
[62] Steinbach, P. J.; Brooks, B. R. J. Comput. Chem. 1994, 15, 667.
[63] Ewald, P. P. Ann. Phys. 1921, 64, 253.
[64] Darden, T.; Perera, L.; Li, L.; Pedersen, L. Structure 1999, 7, R55-R60.
[65] Miaskiewicz, K.; Osman, R.; Weinstein, H. J. Am. Chem. Soc. 1993, 115, 1526-1537.
[66] McConnell, K. J.; Nirmala, R.; Young, M. A.; Ravishanker, G.; Beveridge, D. L. J. Am. Chem. Soc. 1994, 116, 4461.
[67] Hünenberger, P. H.; McCammon, J. A. J. Chem. Phys. 1999, 110, 1856.
[68] Cerutti, D. S.; Case, D. A. J. Chem. Theory Comput. 2010, 6, 443.
[69] Wu, X.; Brooks, B. R. J. Chem. Phys. 2005, 122, 044107.
[70] Tironi, I. G.; Sperb, R.; Smith, P. E.; van Gunsteren, W. F. J. Chem. Phys. 1995, 102, 5451.
[71] Shaw, D. E. et al. SIGARCH Comput. Archit. News 2007, 35.
[72] Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y.; Wriggers, W. Science 2010, 330, 341.
[73] Grossfield, A. WHAM: the weighted histogram analysis method, version 2.0.4. 2005; http://membrane.urmc.rochester.edu/content/wham
[74] Shirts, M. R.; Chodera, J. D. J. Chem. Phys. 2008, 129, 124105.
[75] Lee, T.; Radak, B.; Pabis, A.; York, D. M. J. Chem. Theory Comput. 2013, 9, 153.
[76] Jarzynski, C. Phys. Rev. Lett. 1997, 78, 2690.
[77] Lyubartsev, A. P.; Martsinovski, A. A.; Shevkunov, S. V.; Vorontsov-Velyaminov, P. N. J. Chem. Phys. 1992, 96, 1776.
[78] Sugita, Y.; Okamoto, Y. Chem. Phys. Lett. 1999, 314, 141.


[79] Babin, V.; Roland, C.; Sagui, C. J. Chem. Phys. 2008, 128, 134101.
[80] Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000, 113, 6042.
[81] Fukunishi, H.; Watanabe, O.; Takada, S. J. Chem. Phys. 2002, 116, 9058.
[82] Fajer, M.; Swift, R. V.; McCammon, J. A. J. Comput. Chem. 2009, 30, 1719.
[83] Arrar, M.; de Oliveira, C. A. F.; Fajer, M.; Sinko, W.; McCammon, J. A. J. Chem. Theory Comput. 2013, 9, 18.
[84] Jiang, W.; Roux, B. J. Chem. Theory Comput. 2010, 6, 2559.
[85] Meng, Y.; Dashti, D.; Roitberg, A. E. J. Chem. Theory Comput. 2011, 7, 2721-2727.
[86] Wallace, J. A.; Shen, J. K. J. Chem. Theory Comput. 2011, 7, 2617.
[87] Itoh, S. G.; Damjanovic, A.; Brooks, B. R. Proteins 2011, 79, 3420.
[88] Swails, J. M.; Roitberg, A. E. J. Chem. Theory Comput. 2012, 8, 4393.
[89] Dashti, D.; Roitberg, A. J. Phys. Chem. B 2012, 116, 8805.
[90] Wu, X.; Hodoscek, M.; Brooks, B. R. J. Chem. Phys. 2012, 137, 044106.
[91] Bolhuis, P. G. J. Chem. Phys. 2008, 129, 114108.
[92] Vorobjev, Y. N.; Almagro, J. C. Proteins 1998, 32, 399.
[93] Head, M. S.; Given, J. A.; Gilson, M. K. J. Phys. Chem. A 1997, 101, 1609.
[94] Yang, A. S.; Honig, B. J. Mol. Biol. 1995, 252, 351.
[95] Yang, A. S.; Honig, B. J. Mol. Biol. 1995, 252, 366.
[96] Portman, J. J.; Takada, S.; Wolynes, P. G. Phys. Rev. Lett. 1998, 81, 5237.
[97] Eisenberg, D.; McLachlan, A. D. Nature 1986, 319, 199.
[98] Jean-Charles, A.; Anthony, N.; Sharp, K.; Honig, B.; Tempczyk, A.; Hendrickson, T. F.; Still, W. C. J. Am. Chem. Soc. 1991, 113, 1454.
[99] Massova, I.; Kollman, P. A. J. Am. Chem. Soc. 1999, 121, 8133.
[100] Woo, H.-J.; Roux, B. Proc. Natl. Acad. Sci. 2005, 102, 6825.
[101] Gohlke, H.; Kiel, C.; Case, D. A. J. Mol. Biol. 2003, 330, 891.
[102] Gohlke, H.; Case, D. A. J. Comput. Chem. 2004, 25, 238.
[103] Meirovitch, H. Curr. Opin. Struct. Biol. 2007, 17, 181.

[104] Davies, J.; Doltsinis, N.; Kirby, A.; Roussev, C.; Sprik, M. J. Am. Chem. Soc. 2002, 124, 6594.
[105] Steinbrecher, T.; Joung, I.; Case, D. A. J. Comp. Chem. 2011, 32, 3253.
[106] Beutler, T. C.; Mark, A. E.; van Schaik, R. C.; Gerber, P. R.; van Gunsteren, W. F. Chem. Phys. Lett. 1994, 222, 529.
[107] Steinbrecher, T.; Mobley, D. L.; Case, D. A. J. Chem. Phys. 2007, 127, 214108.
[108] Zwanzig, R. W. J. Chem. Phys. 1954, 22, 1420.
[109] Homeyer, N.; Gohlke, H. Mol. Inf. 2012, 31, 114.
[110] Miller, B. R., III; McGee, T. D., Jr.; Swails, J. M.; Homeyer, N.; Gohlke, H.; Roitberg, A. E. J. Chem. Theory Comput. 2012, 8, 3314.
[111] Srinivasan, J.; Cheatham, T. E., III; Cieplak, P.; Kollman, P. A.; Case, D. A. J. Am. Chem. Soc. 1998, 120, 9401.
[112] Massova, I.; Kollman, P. A. Perspect. Drug Discov. 2000, 18, 113.
[113] Aqvist, J.; Medina, C.; Samuelsson, J.-E. Protein Eng. 1994, 7, 385.
[114] Hansson, T.; Marelius, J.; Aqvist, J. J. Comput. Aid. Mol. Des. 1998, 12, 27.
[115] Marcus, R. A. J. Chem. Phys. 1956, 24, 966.
[116] Cornish-Bowden, A. J.; Knowles, J. R. Biochem. J. 1969, 113, 353.
[117] White, F. H., Jr.; Anfinsen, C. B. Ann. NY Acad. Sci. 1959, 81, 515.
[118] Tanford, C.; Kirkwood, J. G. J. Am. Chem. Soc. 1957, 79, 5333.
[119] Olsson, M. H. M.; Sondergaard, C. R.; Rostkowski, M.; Jensen, J. H. J. Chem. Theory Comput. 2011, 7, 525.
[120] Myers, J.; Grothaus, G.; Narayanan, S.; Onufriev, A. Proteins 2006, 63, 928.
[121] Bashford, D.; Karplus, M. Biochemistry 1990, 29, 10219.
[122] Bashford, D.; Gerwert, K. J. Mol. Biol. 1992, 224, 473.
[123] Antosiewicz, J.; McCammon, J. A.; Gilson, M. K. J. Mol. Biol. 1994, 238, 415.
[124] Song, Y.; Mao, J.; Gunner, M. R. J. Comput. Chem. 2009, 30, 2231.
[125] Baptista, A. M.; Martel, P. J.; Petersen, S. B. Proteins 1997, 27, 523.
[126] Baptista, A. M.; Teixeira, V. H.; Soares, C. M. J. Chem. Phys. 2002, 117, 4184-4200.

[127] Burgi, R.; Kollman, P. A.; van Gunsteren, W. F. Proteins 2002, 47, 469.
[128] Lee, M. S.; Salsbury, F. R., Jr.; Brooks, C. L., III. Proteins 2004, 56, 738.
[129] Borjesson, U.; Hunenberger, P. H. J. Phys. Chem. B 2004, 108, 13551.
[130] Mongan, J.; Case, D. A.; McCammon, J. A. J. Comput. Chem. 2004, 25, 2038-2048.
[131] Khandogin, J.; Brooks, C. L., III. Biophys. J. 2005, 89, 141.
[132] Alexov, E.; Mehler, E. L.; Baker, N.; Huang, Y.; Milletti, F.; Nielsen, J. E.; Farrell, D.; Carstensen, T.; Olsson, M. H. M.; Shen, J. K.; Warwicker, J.; Williams, S.; Word, J. M. Proteins 2011, 79, 3260.
[133] Machuqueiro, M.; Baptista, A. M. Proteins 2011, 79, 3437.
[134] Hamelberg, D.; Mongan, J.; McCammon, J. A. J. Chem. Phys. 2004, 120, 11919-11929.
[135] Williams, S. L.; de Oliveira, C. A. F.; McCammon, J. A. J. Chem. Theory Comput. 2010, 6, 560.
[136] Webb, H.; Tynan-Connolly, B. M.; Lee, G. M.; Farrell, D.; O'Meara, F.; Sondergaard, C. R.; Teilum, K.; Hewage, C.; McIntosh, L. P.; Nielsen, J. E. Proteins 2011, 79, 685.
[137] Pitera, J. W.; Swope, W. Proc. Natl. Acad. Sci. USA 2003, 100, 7587.
[138] Chodera, J. D.; Shirts, M. R. J. Chem. Phys. 2011, 135, 194110.
[139] Nadler, W.; Meinke, J. H.; Hansmann, U. H. E. Phys. Rev. E 2008, 78, 061905.
[140] Meng, Y.; Roitberg, A. E. J. Chem. Theory Comput. 2010, 6, 1401.
[141] Case, D. A.; Darden, T. A.; Cheatham, T. E., III; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Walker, R. C.; Zhang, W.; Merz, K. M.; Roberts, B.; Hayik, S.; Roitberg, A.; Seabra, G.; Swails, J.; Gotz, A. W.; Kolossvary, I.; Wong, K. F.; Paesani, F.; Vanicek, J.; Wolf, R. M.; Liu, J.; Wu, X.; Brozell, S. R.; Steinbrecher, T.; Gohlke, H.; Cai, Q.; Ye, X.; Wang, J.; Hsieh, M.-J.; Cui, G.; Roe, D. R.; Mathews, D. H.; Seetin, M. G.; Salomon-Ferrer, R.; Sagui, C.; Babin, V.; Luchko, T.; Gusarov, S.; Kovalenko, A.; Kollman, P. A. AMBER 12. University of California, San Francisco: San Francisco, CA, 2012.
[142] Sindhikara, D.; Meng, Y.; Roitberg, A. E. J. Chem. Phys. 2008, 128, 024103.
[143] Sindhikara, D. J.; Emerson, D. J.; Roitberg, A. E. J. Chem. Theory Comput. 2010, 6, 2804.

[144] Takahashi, T.; Nakamura, H.; Wada, A. Biopolymers 1992, 32, 897.
[145] Bartik, K.; Redfield, C.; Dobson, C. M. Biophys. J. 1994, 66, 1180.
[146] Demchuk, E.; Wade, R. C. J. Phys. Chem. 1996, 100, 17373.
[147] Artymiuk, P. J.; Blake, C. C. F.; Rice, D. W.; Wilson, K. S. Acta Cryst. B 1982, 38, 778-783.
[148] Vocadlo, D. J.; Davies, G. J.; Laine, R.; Withers, S. G. Nature 2001, 412, 835.
[149] Sindhikara, D. J.; Kim, S.; Voter, A. F.; Roitberg, A. E. J. Chem. Theory Comput. 2009, 5, 1624.
[150] Hamacher, K. J. Comp. Chem. 2007, 28, 2576.
[151] McClendon, C. L.; Hua, L.; Barreiro, G.; Jacobson, M. P. J. Chem. Theory Comput. 2012, 8, 2115.
[152] Vetter, J. S.; Glassbrook, R.; Dongarra, J.; Schwan, K.; Loftis, B.; McNally, S.; Meredith, J.; Rogers, J.; Roth, P.; Spafford, K.; Yalamanchili, S. Computing in Science and Engg. 2011, 13, 90.
[153] Cheatham, T. E., III; Young, M. A. Biopolymers 2001, 56, 232.
[154] Varnai, P.; Djuranovic, D.; Lavery, R.; Hartmann, B. Nucleic Acids Res. 2002, 30, 5398.
[155] Klepeis, J. L.; Lindorff-Larsen, K.; Dror, R. O.; Shaw, D. E. Curr. Opin. Struct. Biol. 2009, 19, 120.
[156] Zhou, R. Proteins 2003, 53, 148.
[157] Geney, R.; Layten, M.; Gomperts, R.; Hornak, V.; Simmerling, C. J. Chem. Theory Comput. 2006, 2, 115.
[158] Donnini, S.; Tegeler, F.; Groenhof, G.; Grubmuller, H. J. Chem. Theory Comput. 2011, 7, 1962.
[159] Goh, G. B.; Knight, J. L.; Brooks, C. L. J. Chem. Theory Comput. 2012, 8, 36.
[160] Wallace, J. A.; Shen, J. K. J. Chem. Phys. 2012, 137, 184105.
[161] Machuqueiro, M.; Baptista, A. M. Proteins 2008, 72, 289.
[162] Baptista, A. M.; Soares, C. M. J. Phys. Chem. B 2001, 105, 293.
[163] Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Chem. Phys. Lett. 1995, 246, 122.
[164] Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. J. Phys. Chem. 1996, 100, 19824-19839.

[165] Shang, Y.; Nguyen, H.; Wickstrom, L.; Okur, A.; Simmerling, C. J. Mol. Graphics 2011, 29, 676.
[166] Frantz, C.; Barreiro, G.; Dominguez, L.; Xiaoming, C.; Eddy, R.; Condeelis, J.; Kelly, M. J. S.; Jacobson, M. P.; Barber, D. L. J. Cell Biol. 2008, 183, 865.
[167] Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926.
[168] Young, A. C. M.; Dewan, J. C. J. Appl. Cryst. 1993, 26, 309.
[169] Berisio, R.; Sica, F.; Lamzin, V. S.; Wilson, K. S.; Zagari, A.; Mazzarella, L. Acta Crystallogr. D 2002, 58, 441.
[170] Wlodawer, A.; Svensson, L. A.; Sjolin, L.; Gilliland, G. L. Biochemistry 1988, 27, 2705.
[171] Uberuaga, B. P.; Anghel, M.; Voter, A. F. J. Chem. Phys. 2004, 120, 6363.
[172] Darden, T.; York, D.; Pedersen, L. J. Chem. Phys. 1993, 98, 10089.
[173] Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Hsing, L.; Pedersen, L. G. J. Chem. Phys. 1995, 103, 8577.
[174] Baker, W. R.; Kintanar, A. Arch. Biochem. Biophys. 1996, 327, 189.
[175] Patriksson, A.; van der Spoel, D. Phys. Chem. Chem. Phys. 2008, 10, 2073.
[176] Okur, A.; Wickstrom, L.; Layten, M.; Geney, R.; Song, K.; Hornak, V.; Simmerling, C. J. Chem. Theory Comput. 2006, 2, 420.
[177] Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Seok, C.; Dill, K. A. J. Chem. Theory Comput. 2007, 3, 26.
[178] Cheng, X.; Cui, G.; Hornak, V.; Simmerling, C. J. Phys. Chem. B 2005, 109, 8220.
[179] Wang, J.; Morin, P.; Wang, W.; Kollman, P. A. J. Am. Chem. Soc. 2001, 123, 5221.
[180] Kuhn, B.; Gerber, P.; Schulz-Gasch, T.; Stahl, M. J. Med. Chem. 2005, 48, 4040-4048.
[181] Weis, A.; Katebzadeh, K.; Soderhjelm, P.; Nilsson, I.; Ryde, U. J. Med. Chem. 2006, 49, 6596.
[182] Genheden, S.; Ryde, U. J. Comp. Chem. 2009, 31, 837.
[183] Wang, W.; Donini, O.; Reyes, C. M.; Kollman, P. A. Annu. Rev. Biophys. Biomol. Struct. 2001, 30, 211.

[184] Bradshaw, R. T.; Patel, B. H.; Tate, E. W.; Leatherbarrow, R. J.; Gould, I. R. Bioinformatics 2010, 24, 197.
[185] Gouda, H.; Kuntz, I. D.; Case, D. A.; Kollman, P. A. Biopolymers 2003, 68, 16.
[186] Combelles, C.; Gracy, J.; Heitz, A.; Craik, D. J.; Chiche, L. Proteins 2008, 73, 87.
[187] Brice, A. R.; Dominy, B. N. J. Comp. Chem. 2011, 32, 1431.
[188] Mikulskis, P.; Genheden, S.; Rydberg, P.; Sandberg, L.; Olsen, L.; Ryde, U. J. Comput. Aided Mol. Des. 2012, 26, 527.
[189] Sanner, M. F. J. Mol. Graph Model. 1999, 17, 57.
[190] Cock, P. J. A.; Antao, T.; Chang, J. T.; Chapman, B. A.; Cox, C. J.; Dalke, A.; Friedberg, I.; Hamelryck, T.; Kauff, F.; Wilczynski, B.; de Hoon, M. J. L. Bioinformatics 2009, 25, 1422.
[191] Michaud-Agrawal, N.; Denning, E. J.; Woolf, T. B.; Beckstein, O. J. Comp. Chem. 2011, 32, 2319.
[192] Genheden, S.; Luchko, T.; Gusarov, S.; Kovalenko, A.; Ryde, U. J. Phys. Chem. B 2010, 114, 8505.
[193] Wang, J.; Hou, T.; Xu, X. Curr. Comput.-Aid. Drug 2006, 3.
[194] Metz, A.; Pfleger, C.; Kopitz, H.; Pfeiffer-Marek, S.; Baringhaus, K.-H.; Gohlke, H. J. Chem. Inf. Model. 2012, 52, 120.
[195] Brooks, B. R.; Janezic, D.; Karplus, M. J. Comput. Chem. 1995, 16, 1522.
[196] Macke, T. J.; Case, D. A. In Molecular Modeling of Nucleic Acids; Leontis, N. B., SantaLucia, J., Eds.; American Chemical Society: Washington, DC, 1997; Chapter 25, pp 379.
[197] Corben, H. C.; Stehle, P. Classical Mechanics, 2nd ed.; Dover Publications, Inc.: New York, 1950.
[198] Gear, C. W. The numerical integration of ordinary differential equations of various orders; 1966.
[199] Gear, C. W. Numerical Initial Value Problems in Ordinary Differential Equations, 1st ed.; Prentice Hall, 1971.
[200] Verlet, L. Phys. Rev. 1967, 159, 98.
[201] Swope, W. C.; Andersen, H. C.; Berens, P. H.; Wilson, K. R. J. Chem. Phys. 1982, 76, 637.

[202] Pacheco, P. Parallel Programming with MPI; Morgan Kaufmann Publishers, Inc.: San Francisco, CA, 1997.
[203] Lusk, E.; Chan, A. Lec. Notes in Comp. Sci. 2008, 5004, 36.

BIOGRAPHICAL SKETCH

Jason M. Swails was born in Binghamton, NY and grew up in Vestal, NY. He attended Binghamton University for his undergraduate studies, where he majored in chemistry. In the summer of 2007, after his junior year at Binghamton, he went to the University of Florida and worked in Professor Adrian Roitberg's research lab under the NSF REU program. The next summer, following his senior year at Binghamton University, Jason studied at the University of Buenos Aires in Argentina under an international NSF REU program funded through the University of Florida. That fall he began graduate studies at the University of Florida and was awarded the NSF GRFP fellowship. He received his Ph.D. from the University of Florida in the summer of 2013. On July 9, 2011, Jason married Roxy J. Lowry, a graduate from the University of Florida, near Boise, Idaho.