Developing and Optimizing the Replica Exchange-Based Free Energy Methods

MISSING IMAGE

Material Information

Title:
Developing and Optimizing the Replica Exchange-Based Free Energy Methods
Physical Description:
1 online resource (1 p.)
Language:
english
Creator:
Sabri Dashti, Danial
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Physics
Committee Chair:
Roitberg, Adrian E
Committee Members:
Trickey, Samuel B
Deumens, Erik
Dufty, James W
Bowers, Clifford Russell

Subjects

Subjects / Keywords:
carlo -- dynamics -- exchange -- hamiltonian -- metropolis -- molecular -- monte -- optimization -- replica
Physics -- Dissertations, Academic -- UF
Genre:
Physics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, terriorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
In recent decades, by increasing the power of computers’ hardware and developing efficient algorithms, the study of large biomolecular systems has been facilitated. During my PhD period, I attempted to improve the efficiency of sampling by developing new algorithms and optimizing existing ones. My first project was about developing the Replica Exchange Free Energy Perturbation (REFEP) method, which is a combination of the Free Energy Perturbation (FEP) and the Hamiltonian Replica Exchange Molecular Dynamics (HREMD) methods. We showed that the HREMD method not only improves convergence in free energy calculations, but also can be used to estimate free energy differences directly via the FEP algorithm. My next project was a demonstration of the capabilities of REFEP in estimating the pKa of complicated proteins. According to experimental measurements, the pKa value of Glutamate 66 (GLU66) in a hyperstable mutant of staphylococcal nuclease displays a large shift, roughly 4.6 pH units, relative to its normal value in water. In my third research project I developed and validated a pH-Replica Exchange Molecular Dynamics (pH-REMD) method, which improves the coupling between conformational and protonation sampling. Finally, in my last project, I have focused on optimizing HREMD methods. The goal is to find the best position for replicas in order to maximize the round trip between extremum positions on a replica ladder.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Danial Sabri Dashti.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Roitberg, Adrian E.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2014-02-28

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045878:00001


This item is only available as the following downloads:


Full Text

PAGE 1

1 Rightslink¨ by Copyright Clearance Center May 9, 2013 4:47:07 PM https://s100.copyright.com/AppDispatchServlet Title : pH Replica Exchange Molecular Dynamics in Proteins Using a Discrete Protonation Method Author : Danial Sabri Dashti Yilin Meng and Adrian E Roitberg Publication : The Journal of Physical Chemistry B Publisher : American Chemical Society Date : Aug 1 2012 Copyright 2012 American Chemical Society User ID Password Enable Auto Login Forgot Password / User ID ? If you re a copyright com user you can login to RightsLink using your copyright com credentials Already a RightsLink user or want to learn more ? PERMISSION / LICENSE IS GRANTED FOR YOUR ORDER AT NO CHARGE This type of permission / license instead of the standard Terms & Conditions is sent to you because no fee is being charged for your order Please note the following : Permission is granted for your request in both print and electronic formats and translations If figures and / or tables were requested they may be adapted or used in part Please print this page for your records and send a copy of it to your publisher / graduate school Appropriate credit for the requested material should be given as follows : Reprinted ( adapted ) with permission from ( COMPLETE REFERENCE CITATION ) Copyright ( YEAR ) American Chemical Society Insert appropriate information in place of the capitalized words One time permission is granted only for the use specified in your request No additional uses are granted ( such as derivative works or other editions ) For any other uses please submit a new request Copyright 2013 Copyright Clearance Center Inc All Rights Reserved Privacy statement Comments ? We would like to hear from you E mail us at customercare @ copyright com



PAGE 1

DEVELOPINGANDOPTIMIZINGTHEREPLICAEXCHANGE-BASEDFREEENERGYMETHODSByDANIALSABRIDASHTIADISSERTATIONPRESENTEDTOTHEGRADUATESCHOOLOFTHEUNIVERSITYOFFLORIDAINPARTIALFULFILLMENTOFTHEREQUIREMENTSFORTHEDEGREEOFDOCTOROFPHILOSOPHYUNIVERSITYOFFLORIDA2013

PAGE 2

c2013DanialSabriDashti 2

PAGE 3

Idedicatethisdissertationtomywifeandmyfamily. 3

PAGE 4

ACKNOWLEDGMENTS Iprimarilythankmyadvisor,ProfessorAdrianRoitbergforourscienticconversationsanddiscussions.Iamalsogratefulforsignicantassistanceofmylovelywife,Sahar,duringmyPh.D.period.Iwouldneverhavebeenabletonishmydissertationwithoutthesupportofmyfamily.IalsoliketothankDr.YilinMengforhiseffectivecollaboration.Moreover,thanksgoouttoallwhosupportedmeovermyPh.DstudyattheUniversityofFlorida. 4

PAGE 5

TABLEOFCONTENTS page ACKNOWLEDGMENTS .................................. 4 LISTOFTABLES ...................................... 8 LISTOFFIGURES ..................................... 9 ABSTRACT ......................................... 11 CHAPTER 1INTRODUCTION .................................. 12 1.1Prologue .................................... 12 1.2FreeEnergiesandEnsembles ........................ 12 1.2.1PhaseSpace .............................. 12 1.2.2PartitionFunctionsandEnsembles .................. 13 1.2.3FreeEnergyandPotentialofMeanForce .............. 14 1.2.3.1HelmholtzandGibbsfreeenergies ............. 14 1.2.3.2Potentialofmeanforce ................... 15 1.3SamplinginBiomolecular ........................... 15 1.3.1Sampling ................................ 15 1.3.2StandardSamplingTechniquesinBiomolecularSimulations .... 16 1.3.2.1Moleculardynamics ..................... 16 1.3.2.2MonteCarlo ......................... 16 1.4AdvancedSamplingMethods ......................... 18 1.4.1GeneralizedEnsembleMethods .................... 18 1.4.1.1Simulatedtempering ..................... 18 1.4.1.2Multicanonicalalgorithm ................... 19 1.4.1.3Wang-Landausampling ................... 19 1.4.1.4Replicaexchange ...................... 20 1.4.1.5Umbrellasampling ...................... 22 1.4.2Slow-GrowthMethods ......................... 22 1.4.2.1Thermodynamicintegration ................. 23 1.4.2.2Freeenergyperturbation .................. 24 1.5Convergence,ErrorEstimation,andSamplingQualityinBiomolecularSimulations ................................... 24 1.5.1ErrorinComputationalMethods .................... 24 1.5.2ErgodicityandConvergence ...................... 25 1.5.3SamplingQualityChecks ....................... 26 1.5.3.1Rootmeansquaredeviationanalysis ........... 26 1.5.3.2Rootmeansquaredeviationclustering .......... 26 1.5.3.3Blockaveraging ....................... 27 1.5.3.4Principalcomponentanalysis ................ 27 1.5.3.5Kullback-Leiblerdivergence ................. 28 5

PAGE 6

1.6OutlineofMyResearch ............................ 28 2COMPUTINGALCHEMICALFREEENERGYDIFFERENCESWITHHAMILTONIANREPLICAEXCHANGEMOLECULARDYNAMICS(HREMD)Simulations ... 31 2.1LiteratureReview ................................ 31 2.2TheoryandMethod .............................. 33 2.2.1FreeEnergyPerturbation(FEP) .................... 33 2.2.2ThermodynamicIntegration ...................... 35 2.2.3HamiltonianReplicaExchangeMolecularDynamics(HREMD) .. 36 2.2.4SimulationDetails ........................... 38 2.3ResultsandDiscussions ............................ 41 2.3.1AcceptanceRatioofHREMDSimulations .............. 41 2.3.2AsparticAcidModelCompoundStudy ................ 41 2.3.3StudyonAsp26inThioredoxin .................... 43 2.3.4pKaPredictionforAsp26inThioredoxin ............... 44 2.4ConcludingRemarks .............................. 46 3LARGEPKashiftsFORBURIEDPROTONABLERESIDUES.RATIONALIZINGTHECASEOFGLUTAMATE66INSTAPHYLOCOCCALNUCLEASE ..... 47 3.1LiteratureReview ................................ 47 3.2SimulationDetails ............................... 48 3.3Discussion ................................... 50 3.4ConcludingRemarks .............................. 60 4PH-REPLICAEXCHANGEMOLECULARDYNAMICSINPROTEINSUSINGADISCRETEPROTONATIONMETHOD ..................... 62 4.1LiteratureReview ................................ 62 4.2TheoreticalMethodandSimulationDetails ................. 65 4.2.1ConstantpHMolecularDynamicsandpH-ReplicaExchangeMolecularDynamics(pH-REMD) ......................... 65 4.2.2TitrationCurve ............................. 66 4.2.3pH-REMD ................................ 67 4.3SimulationDetails ............................... 68 4.4ResultsandDiscussion ............................ 69 4.4.1TitratableModelCompounds ..................... 69 4.4.2ADFDAModelCompounds ...................... 69 4.4.3HeptapeptideDerivedfromOMTKY3 ................. 71 4.5ConcludingRemarks .............................. 74 5OPTIMIZATIONOFUMBRELLASAMPLINGREPLICAEXCHANGEMOLECULARDYNAMICSBYREPLICAPOSITIONING ..................... 78 5.1LiteratureReview ................................ 78 5.2Theory ...................................... 81 6

PAGE 7

5.2.1UmbrellaSamplingReplicaExchange ................ 81 5.2.2CalculatingExchangeAcceptanceRatioinUmbrellaSamplingReplicaExchange ........................... 82 5.2.3UmbrellaSamplingReplicaExchangeOptimization ......... 83 5.2.4UmbrellaSamplingReplicaExchangeOptimizationWorkows .. 84 5.2.5SimulationDetails ........................... 86 5.3ResultsandDiscussion ............................ 87 5.3.1PotentialofMeanForceAlongtheButaneDihedralAngle ..... 88 5.3.2PotentialofMeanForceforNH+4+OH)]TJ /F1 11.955 Tf 10.4 -4.34 Td[(SaltBridgeinExplicitSolvent ................................. 90 5.4ConcludingRemarks .............................. 91 6CONCLUSIONS ................................... 93 APPENDIX ACALCULATINGTHEPDFOFINUSRE .................... 95 BESTIMATINGMEANSANDVARIANCESOFTHEWINDOWSUSINGNEARESTNEIGHBORWEIGHTEDAVERAGINGANDREWEIGTHING .......... 96 B.1ReweightingFittedGaussianDistributiononWindows ........... 98 B.2NearestNeighborWeightedAveraging(NNWA) ............... 99 CBARZILAIANDBORWEINOPTIMIZATION(BBMETHOD) ........... 100 REFERENCES ....................................... 101 BIOGRAPHICALSKETCH ................................ 111 7

PAGE 8

LISTOFTABLES Table page 2-1Freeenergydifference(inKcal=mol)betweenprotonatedanddeprotonatedAsparticacidsobtainedfromTI,REFEP,andFEPalchemicalfreeenergysimulations 43 4-1pKasofthereferencecompoundscomputedbydifferentmethods ........ 69 4-2pKapredictionandHillcoefcientofttedfromtheHHequation. ........ 70 4-3pKavaluesofthetitratableresiduesintheheptapeptidederivedfromOMKTY3. 72 5-1PositionandEARforfourdifferentsettingsofUSREcalculationsonthebutanedihedralsimulationinimplicitwater. ........................ 88 8

PAGE 9

LISTOFFIGURES Figure page 2-1DiagramsdisplayingtheHREMDexchangealgorithmandfreeenergycalculation. 37 2-2ThermodynamiccycleusedtocomputethepKashift. .............. 39 2-3Cumulativeaveragefreeenergydifferencesbetweenprotonatedanddeprotonatedasparticacidinthemodelcompound(G(AH!A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[()) .............. 42 2-4PredictedpKavalueofAsp26inthioredoxinasafunctionoftime ........ 45 3-1oneturnofan)]TJ /F1 11.955 Tf 9.29 0 Td[(helixexposesthesidechainofGLU66. ............ 50 3-2FreeenergyconvergenceinREFEPforunrestrainedV66E+PHS(red)andGlumodelcompound(green). ........................... 51 3-3ThermodynamiccycleforconformationprotonationmodelofGLU66. ...... 52 3-4CumulativeFreeenergyVSSimulationtimeforGlu-66restrainedinside(Red)andrestrainedoutside(Green). ........................... 53 3-5Concentrationofspeciesvs.pHfortwodifferentvalueofKe=b,Di.e.,Ke=b,D;1=2andKe=b,D;2=108. ................................. 55 3-6CSS,asdenedinEquation 3 ,asafunctionofpHforKe=b,D=10 ....... 56 3-7ApparentpKavs.Ke=b,D. .............................. 57 3-8CarbonRMSDdistributionsofresidues62to69andthesidechainofGLU66inunrestrainedREFEPsimulations ......................... 58 3-9Averagesecondarystructureforresidue57-69vs.reactioncoordinate. .... 59 3-10Ellipticityat222nmvs.reactioncoordinate .................... 60 4-1ComparisonofHillplotsbetweenpH-REMDandCpHmethodsforLysreferencemodel. ........................................ 70 4-2ThetitrationcurvesofAspsidechainsinADFDAcomputedbybothconstantpHMD(blueandpurple)andpH-REMD(redandgreen)methods. ....... 71 4-3CumulativeaverageprotonationfractionsofAspssidechainsinADFDAVSMCtitrationstepsatpH=4.0. ............................ 72 4-4ThetitrationcurvesofAsp3intheheptapeptidederivedfromOMTKY3. .... 73 4-5ThetitrationcurvesofLys5andTyr7intheheptapeptidederivedfromOMTKY3. 74 4-6CumulativeaverageprotonationfractionforTYR7versusMCtitrationstepsatpH=8.0,9.0,10.0,11.0,12.0and13.0. ........................ 75 9

PAGE 10

4-7TheKullback-LeiblerdivergencemeasureofRMSDdistributionsofCpH(Green)andpH-REMD(Red)respecttothenalRMSDdistributioninCpH. ...... 76 4-8RMSDAutocorrelationforCpH(Green)andpH-REMD(Red)atallpHs. .... 77 5-1FBSFworkow .................................... 85 5-2SBSFworkow ................................... 86 5-3ARTprobabilitydensity.TheinsetshowsthePMFvs.reactioncoordinate(dihedralangle) ......................................... 89 5-4Windowpositionsvs.EARsintheequi-distance(non-optimized)(redcircles)andoptimized(set2)(greencircles)simulations. ................. 90 5-5ThemeanandRMSEofKullback-Leiblerdivergenceover10simulationsforeachset. ....................................... 91 10

PAGE 11

AbstractofDissertationPresentedtotheGraduateSchooloftheUniversityofFloridainPartialFulllmentoftheRequirementsfortheDegreeofDoctorofPhilosophyDEVELOPINGANDOPTIMIZINGTHEREPLICAEXCHANGE-BASEDFREEENERGYMETHODSByDanialSabriDashtiAugust2013Chair:AdrianRoitbergMajor:PhysicsInrecentdecades,byincreasingthepowerofcomputers'hardwareanddevelopingefcientalgorithms,thestudyoflargebiomolecularsystemshasbeenfacilitated.DuringmyPhDperiod,Iattemptedtoimprovetheefciencyofsamplingbydevelopingnewalgorithmsandoptimizingexistingones.MyrstprojectwasaboutdevelopingtheReplicaExchangeFreeEnergyPerturbation(REFEP)method,whichisacombinationoftheFreeEnergyPerturbation(FEP)andtheHamiltonianReplicaExchangeMolecularDynamics(HREMD)methods.WeshowedthattheHREMDmethodnotonlyimprovesconvergenceinfreeenergycalculations,butalsocanbeusedtoestimatefreeenergydifferencesdirectlyviatheFEPalgorithm.MynextprojectwasademonstrationofthecapabilitiesofREFEPinestimatingthepKaofcomplicatedproteins.Accordingtoexperimentalmeasurements,thepKavalueofGlutamate66(GLU66)inahyperstablemutantofstaphylococcalnucleasedisplaysalargeshift,roughly4.6pHunits,relativetoitsnormalvalueinwater.InmythirdresearchprojectIdevelopedandvalidatedapH-ReplicaExchangeMolecularDynamics(pH-REMD)method,whichimprovesthecouplingbetweenconformationalandprotonationsampling.Finally,inmylastproject,IhavefocusedonoptimizingHREMDmethods.Thegoalistondthebestpositionforreplicasinordertomaximizetheroundtripbetweenextremumpositionsonareplicaladder. 11

PAGE 12

CHAPTER1INTRODUCTION 1.1PrologueMolecularsimulationisabranchofstatisticalphysics.Sincebiophysicalsystemsaretoocomplexandinhomogeneoustobetreatedtheoretically,theroleofnumericalsimulationshasbeenrecognized.Withrecentadvancementsincomputerhardwareandsoftware,thesimulationsofmorecomplexmolecularsystemsbecomemorefeasible.Therearetwobasicproblemsinstudyingmolecularsystemsusingcomputersimulations: Becauseofthelargenumberofparticlesinthosesystems,thesizeofthephasespaceisenormous. Theaccuracyofthemolecularmodelsislimited.Manymethodshavebeenproposedtoaddressthoseproblems.Thisdissertationisabouttherstproblem,i.e.improvingtheefciencyofphasespacesampling.Morespecically,IconcentrateontheReplicaExchangesamplingmethod[ 1 4 ],whichisoneofthemostfrequentlyusedtechniquesforefcientsamplingofthephasespace.Thischapterisdedicatedtothefundamentalsofstatisticalmolecularmechanics,samplingmethods,anderroranalysis.Istartwiththeconceptofphasespaceandpartitionfunctionsandnishwithintroducingsomeofthefrequentlyusedsamplinganalysistechniques.Also,IprovideanoverviewonmyresearchprojectsduringmyPh.Dperiod,andinthenextchapters,describetheseprojectsindetail. 1.2FreeEnergiesandEnsemblesInthissectionIoverviewthekeyconceptsinstudyingofamolecularsystemfromclassical-statisticalphysics'pointofview.[ 5 9 ] 1.2.1PhaseSpaceAccordingtoclassicalstatisticalmechanics,thestateofasystemcanfullybedescribedbyknowingthepositionsandmomentaofitsparticles.Therearesixcoordinatesassociatedwiththecongurationandthemomentumofeveryparticle 12

PAGE 13

in3-dimensionalspace.Consequently,asystemwithNparticlescanbefullydescribedina6N-dimensionalspace,whichiscalledphasespace. 1.2.2PartitionFunctionsandEnsemblesAnensembleisacollectionofallpossiblecongurationsofasysteminphasespace.Eachensembleisconcomitantwithapartitionfunction,whichdescribesallpossiblemicrostatesofasysteminequilibrium.Allcollectivepropertiesofthesystem(e.g.,averageenergy,entropy,andfreeenergy)canbecomputedusingthepartitionfunctionanditsderivatives.Differenttypesofensemblescanbedenedbasedonthetypesofcommunicationbetweenthesystemandtherestofuniverse.Inthecontextofstatisticalmechanics,thefollowingarethemostfrequentlyusedensembles: Microcanonicalensemble(N,V,E):InaMicrocanonicalensemble,volume(V),numberofparticles(N),andtotalenergy(E)ofasystemarexed,andthesystemhasnoenergycommunicationwiththeuniverse.itcanberepresentedas(N,V,E)=RpRq(H(q,p))]TJ /F4 11.955 Tf 11.95 0 Td[(E)dpdq.Here,H,q,andparetheHamiltonian,theconguration,andthemomentumofthesystem,respectively. Canonicalensemble(N,V,T):Inthisensemble,V,N,andtemperature(T)areconstantandthesystemisinthermalequilibriumwithitsenvironment.ThecanonicalpartitionfunctionisQ(N,V,T)=RpRqexp()]TJ /F7 11.955 Tf 9.3 0 Td[(H(q,p))dpdq,where=1=kBTandkBistheBoltzmannconstant.Themajorityofmolecularsimulationsareperformedinthisensemble.InthecaseofconservativeHamiltonian,onecanseparatethecontributionsofkineticandpotentialenergies.Thekineticenergycontributioncancelsoutinmostoftherelativefreeenergycalculations.Consequentlyitisappropriatetousecongurationintegral(orsum)insteadofthefullpartitionfunction,i.e.,Z(N,V,T)=Rqexp()]TJ /F7 11.955 Tf 9.3 0 Td[(U(q))insteadofQ(N,V,T),whereUisthepotentialenergyofthesystem.Fromnowon,Iwillusethepartitionfunctionandthecongurationintegralinterchangeablyinthismanuscript. Isothermal-isobaricensemble(N,P,T):Inthisensemblepressure(P),N,andTareconstant.Thepartitionfunctioncanbewrittenas(N,P,T)=R10dVQ(N,V,T)exp()]TJ /F7 11.955 Tf 9.3 0 Td[(PV),whichistheLaplacetransformofthecanonicalpartitionfunction. Grandcanonicalensemble(,V,T):IntheGrandcanonicalensemble,thechemicalpotential(),V,andTarexed.Thegrandcanonicalpartitionfunctioncanbewrittenas=P1N=0Z(N,V,T)exp()]TJ /F7 11.955 Tf 9.3 0 Td[(N). 13

PAGE 14

Theappropriateensembletousedependsuponthephysicalcircumstances,althoughinthelimitoflargesystems(i.e.,thermodynamiclimit),wheretheuctuationsareinsignicant,alldescriptionsofensemblesbecomeindistinguishable.Thexedvaluesineachensemblearecalledstatevariables.InthenextsectionIdescribetherelationbetweenfreeenergies,statevariables,andpartitionfunctions. 1.2.3FreeEnergyandPotentialofMeanForceFreeenergyisthemaximumamountofworkthatathermodynamicsystemcanperform.Freeenergyisastatefunction,i.e.,ineachensembleitonlydependsuponthestatevariablesofthatensemble.Fromthestatisticalmechanicspointofview,inallensembles(excepttheMicrocanonicalensemble),freeenergiesarerelatedtothelogarithmofthepartitionfunctions.HereIexplainGibbsandHelmholtzfreeenergies,whicharefrequentlyusedinbiomolecularsimulations. 1.2.3.1HelmholtzandGibbsfreeenergiesHelmholtzfreeenergyisthefreeenergyassociatedwiththecanonicalensembleandisequaltomaximumextractableworkfromaclosedthermodynamicsystematconstantvolumeandtemperature.Itisrelatedtothecanonicalpartitionfunctionby:A(N,V,T)=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 ln(Q(N,V,T)). (1)Ontheotherhand,theGibbsfreeenergy(alsoknownasfreeenthalpy)measurestheusefulworkintheisothermal-isobaricensembleandgivenby:G(N,V,T)=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 ln((N,P,T)). (1)Sincethefreeenergyisastatefunction,ittakesadenite,non-uctuatingvalueatequilibrium.Itdoesnotgiveusanyinformationabouttheenergyofthesystemasafunctionofsomespecicreactioncoordinates.Inbiomolecularsimulations,insteadoffreeenergy,itisconventionaltousepotentialofmeanforce,whichisaprojectionoffreeenergyalongsomechemical/alchemical/physicalcoordinates. 14

PAGE 15

1.2.3.2PotentialofmeanforceThePotentialofMeanForce(PMF)isadecompositionofthefreeenergyalongsomereactioncoordinateparameterswhichmeasureshowtheenergyofasystemchangesalongthosecoordinates.InordertocalculatethePMFalongsomecoordinatesinanensemble,oneneedstointegratetheprojectionofthepartitionfunctionoverallotherdegreesoffreedomincongurationspace.ForexampleinthecaseoftheCanonicalensemble,thePMFalongXcanbecalculatedas:PMF(X)=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 lnZ(exp()]TJ /F7 11.955 Tf 9.3 0 Td[(U(rN)(X)]TJ /F4 11.955 Tf 11.96 0 Td[(rN)drN, (1)whereU(rN)isthepotentialenergy. 1.3SamplinginBiomolecularDuetothecomplexityofpotentialenergyandthelargenumberofmicrostates,analyticalcalculationofthepartitionfunctionisnotfeasibleexceptforverysimplesystemswithafewparticles.Instead,inbiomolecularsimulations,weestimatethepartitionfunctioneitherbyrandomlysamplingofthephasespaceorbypropagatingatrajectoryinthephasephase.InthenextpartIexplaintheconceptofsamplingasitispresentedinthestatisticsandstatisticalphysics. 1.3.1SamplingInstatistics,samplingistheprocessofselectingasubsetofanoriginalpopulation.Agoodsampleisastatisticalrepresentationofthepopulationproperties,howeverasthesizeofpopulationincreasesappropriatesamplingbecomesmoredemanding.Duetothesizeofphasespaceinbiomolecularsimulations,theimportanceofsamplinghasbeenrecognized.Agoodsamplehastwocharacteristics:itisunbiased,anditislargeenoughtobeprecise.Later,Iwillexplainmoreonthosecharacteristicsintheerrorestimationsectionofthischapter. 15

PAGE 16

1.3.2StandardSamplingTechniquesinBiomolecularSimulationsMolecularDynamicsandMonteCarloarethetwofundamentalsamplingmethodsintheeldofmolecularmodeling.Almostallothersamplingmethodsareeitherinheritedorbranchedfromoneorbothofthesemethods. 1.3.2.1MoleculardynamicsMolecularDynamics(MD)[ 7 10 11 ]simulationpropagatesasysteminthephasespacebynumericallyintegratingNewton'sequations.AccordingtoMD,thetrajectoryofeachparticle(inthecaseoftheMicrocanonicalensemble)canbeobtainedbyiterativelysolvingEquations 1 :8><>:F(X)=rU(X)=m_V(t)V(t)=_X(t), (1)whereXandVarethevectorsofpositionandvelocityoftheparticlerespectively.Then,basedontheergodichypothesis(seesection 1.5.2 ),theaverageofanobservableAcancomputedby:hAi=limt!11 MMXi=1A(ti). (1)whereA(ti)isthevalueofAattimeti,alsoMisthetotalnumberoftis.ThecomputationalproblemwithregularMDisthatitmaybetrappedinlocalminimaofthepotentialenergysurface.Suchatrappingpreventsthesimulationfromsamplingthewholephasespaceefciently. 1.3.2.2MonteCarloTheMonteCarlo(MC)[ 7 12 13 ]techniquewasusedtogeneratetherstcomputersimulationofamolecularsystem.InaMonteCarlosimulation,anewcongurationisintroducedbyapplyingarandomchangetothepositionsofthecurrentcongurationofthesystem.UnliketheMDsimulationsthereisnomomentumcontributioninMCsimulationsandconsequentlythereisnodynamicsinvolvesinthesimulation. 16

PAGE 17

EstimatingthepartitionfunctionforasystemofNatomsusingregularMonteCarlocomprisesthefollowingsteps: 1. Creatingacongurationbyrandomlygenerating3NCartesiancoordinates. 2. Calculatingthepotentialenergy,U(rN),andBoltzmannfactor,exp()]TJ /F7 11.955 Tf 9.3 0 Td[(U(rN)),forthatconguration. 3. Repeatingsteps1and2,Mtimes(M>>1)untiladdingnewcongurationdoesnotchangetheaverageofobservableAinthesystem,hAi=limM!1MPi=1Aiexp()]TJ /F7 11.955 Tf 9.3 0 Td[(U(rN)) MPi=1exp()]TJ /F7 11.955 Tf 9.3 0 Td[(U(rN)). (1)Sincemanyofthegeneratedcongurationsmayhavenosignicantcontributiontothepartitionfunction,thisisnotapracticalapproachinsystemswithalargenumberofparticles.Onewaytoovercomethisbarrierisuseimportancesamplingtechniques,i.e.toaddcongurationstothesamplebasedontheirassociatedprobabilityinthepartition.ThisideawasintroducedbyMetropolisetal[ 14 ].UnlikeinregularMonteCarlo,inMetropolisMonteCarlo(MMC),onechoosesthecongurationsbasedontheirprobabilitiesandweightsthemevenlyinsteadofchoosingcongurationsrandomlyandweightingthemwithproperprobabilityafterwards.AccordingtotheMMCtechnique,oneneedstoaddanewstepbetweensteps2and3ofregularMCinordertoacceptorrejectanewconguration.TheprobabilityofacceptanceisPacc=min(1,exp()]TJ /F7 11.955 Tf 9.3 0 Td[(U)), (1)whereUisthedifferencebetweenthepotentialenergyofoldandnewcongurations.ConsideringthecomputationalcostofMMC,althoughmanyoftherandomlyproducedcongurationsmayberejected,stillonehastocalculatetheassociatedenergyofallthecongurationstodecideaboutacceptanceorrejection. 17

PAGE 18

1.4AdvancedSamplingMethodsInbiomolecularsystemswithmanydegreesoffreedom,therearelargenumbersoflocalminimainthepotentialenergysurface.Conventionalsamplingmethods,suchasMCandMD,tendtobetrappedinthoseminima,andconsequentlygiveinaccuratethermodynamicaverages.Advancesamplingmethodshelptoovercomethislimitation.Therearemanytypesofadvancedsamplingtechniques.HereIexplaintwoofthemoreprominentcategoriesofthose:generalizedensemblemethodsandslow-growthmethods. 1.4.1GeneralizedEnsembleMethodsGeneralizedEnsembleMethods[ 3 15 17 ]overcometheproblemoftrappinginlocalminimabychangingtheBoltzmannprobabilityweightstonon-Boltzmannweightssuchthatarandomwalkinpotentialenergyspaceisrealized. 1.4.1.1SimulatedtemperingInsimulatedtempering[ 15 ],temperaturedynamicallychangeswithtimesuchthatthesystemaccomplishesarandomwalkintemperaturespace.Theprobabilityofvisitingamicrostateisproportionaltothesimulatedtemperingweight:WST(E,T)=exp()]TJ /F7 11.955 Tf 9.3 0 Td[(E+a(T)), (1)wherea(T)ischosensuchthattheprobabilitydistributionofmicrostatesintemperaturespacebecomesat,i.e.:PST(T)=ZdEn(E)WST(E,T)=const, (1)wheren(E)isthedensityofenergystates.InnumericalsimulationstemperatureaxesislimitedtoalistofMdiscretevaluesandthesimulationcanhopbetweenthosevalues.Theoptimalvaluesofa(T)shouldbedeterminedbyiterationoftrialsimulations.ThefollowingisasummaryoftheSTalgorithm: 1. RunningacanonicalMDorMCsimulationattemperatureTiforacertainsteps. 18

PAGE 19

2. ChangingTitooneofitsearliestneighborsinthetemperaturelistwithtransitionprobabilityof:(Ti!Ti1)=min1,WST(E,Ti1) WST(E,Ti),, (1)whichisbasedonMMCcriteia. 1.4.1.2MulticanonicalalgorithmIntheMulticanonicalAlgorithm(MUCA)[ 17 18 ],aatpotentialenergydistributionisachievedbyweightingeachmicrostateusinganon-Boltzmannweightfactor,WMUCA,whichimplies:p(E)/n(E)WMUCA(E)=const. (1)Thisinducesarandomwalkintheenergyspaceofthesystem,whichpreventsasimulationfromtrappinginlocalminimaofthepotentialenergysurfaceofthesystem.SimilartotheSTmethod,thedensityofstatesisnotaprioriknownandtheMUCAweightfactorshouldbedeterminedbyiterativelyrunningshorttrialsimulations[ 17 19 ].OnecanperformMUCAusingbothMDandMCmethods.MoreovertheaverageofphysicalquantityAattemperatureTcanbecalculatedusingsinglereweightingtechniques:hAiT=PiA(qi)W)]TJ /F5 7.97 Tf 6.59 0 Td[(1MUCA(E(qi))exp()]TJ /F7 11.955 Tf 9.3 0 Td[(E(qi)) PiW)]TJ /F5 7.97 Tf 6.59 0 Td[(1MUCA(E(qi))exp()]TJ /F7 11.955 Tf 9.29 0 Td[(E(qi)) (1)whereqiistheithconguration. 1.4.1.3Wang-LandausamplingTheWang-Landaualgorithm[ 16 ]isaMMCtechniqueforestimatingthedensityofstates.ItcanbeusedasacomplementarytoolfordeterminingtheweightfactorinMUCA.Inthismethod,weinitiallysetalln(E)=1andtheprobabilityofjumpingfroman 19

PAGE 20

energystateEitoEjisp(Ei!Ej)=min1,n(Ej) n(Ei). (1)whichinducedarandomwalkineneryspaceofthesystem.Eachtimethecurrentenergydensityofstateupdatesasfollowing:n(E) n(E)f, (1)wherefisthemodicationfactor.Typicallyasimulationbeginswithf=e'2.71828.Duringtherandomwalk,weaccumulateahistogramH(E),whichisthenumberofvisitsateachenergylevelE.Whenthehistogrambecomesat,weresetthedensityofstates,updatethemodicationfactorf p f,andrestartthewholeprocessfromthebeginning.Wecancontinuethisschemeuntilwereachdesiredaccuracyinestimatingofthedensityofstates. 1.4.1.4ReplicaexchangeTheReplicaExchange(RE)method,whichoriginallywasproposedbySwendsen[ 1 20 ],isoneofthemostsuccessfulmethodsintheeldofmolecularsimulations.Thismethodhastwoadvantages:First,unlikeSTandMUCA,inRE,theweightfactorisknownapriori.Second,REcanintuitivelybecombinedwithmanyotherenhancedsamplingmethods[ 21 24 ].Inthistechnique,Mindependentcopies(replicas)ofthesystemaretreatedbyMD(i.e.,REMD)orMC(i.e.,REMC)simultaneously,andtheyasktoexchangetheircongurationsaftereverycertainnumbersofstepsofsimulation.Byexchangingcongurationsbetweenreplicas,arandomwalkinthereplicas'ladderspaceisrealized.BasedonthetypeofRE,replicascanspantemperaturespace(TemperatureRE),Hamiltonianspace(HamiltonianRE),orboth(MultiDimensionalRE).Thecoreoftheexchangecriteriaisbasedonimposingadetailedbalanceequationto 20

PAGE 21

thegeneralizedensemble:w(X!X)}(X)=w(X!X)}(X), (1)where}(X)andw(X!X)aretheprobabilityofbeingatstateXandtheprobabilityoftransitionfromXtoX,respectively.XandXaregeneralizedensemblesbeforeandaftertheexchange,i.e.,X=0BBBB@T1...TiTj...TMH1...HiHj...HMq1...qiqj...qM1CCCCA,X=0BBBB@T1...TjTi...TMH1...HjHi...HMq1...qiqj...qM1CCCCA (1)HereHi,Ti,andqirepresenttheHamiltonian,temperature,andcongurationofreplicaijustbeforeexchange.Becauseallreplicasareindependent,theprobabilityofthesystembeingatthegeneralizedstateofXis}(X)=QMi=1pi(Ti,Hi,qi),wherepi(Ti,Hi,qi)istheprobabilitythatreplicai,withHamiltonianHiandtemperatureofTiisfoundwiththeconformationstateofqi.Inthecaseofcanonicalensemblepi(Ti,Hi,qi)/exp()]TJ /F7 11.955 Tf 9.3 0 Td[(iHi(qi)), (1)wherei=1 kBTi.Thenthedetailedbalanceequationcanbewrittenas:w(X!X) w(X!X)=pi(Tj,Hj,qi)pj(Ti,Hi,qj) pi(Ti,Hi,qi)pj(Tj,Hj,qj)=exp()]TJ /F7 11.955 Tf 9.3 0 Td[(jHj(qi))exp()]TJ /F7 11.955 Tf 9.3 0 Td[(iHi(qj)) exp()]TJ /F7 11.955 Tf 9.3 0 Td[(iHi(qi))exp()]TJ /F7 11.955 Tf 9.3 0 Td[(jHj(qj))=exp()]TJ /F4 11.955 Tf 9.3 0 Td[(), (1)where=[(jHj(qi)+iHi(qj)))]TJ /F4 11.955 Tf 11.95 0 Td[((iHi(qi)+jHj(qj))]. (1)ThenusingtheMMCcrieria,theprobabilityoftransitionfromXtoXcanbecalculatedas:w(X!X)=min(1,exp()]TJ /F4 11.955 Tf 9.3 0 Td[()). (1) 21

PAGE 22

InthecaseofTRE,whereallHamiltoniansarethesame,Equation 1 canbesimpliedto:=[(j)]TJ /F7 11.955 Tf 11.96 0 Td[(i)(H(qj))]TJ /F4 11.955 Tf 11.96 0 Td[(H(qi))], (1)andinthecaseofHRE,wherealltemperaturesareequal,itcanbewrittenas:=[Hj(qi)+Hi(qj))]TJ /F4 11.955 Tf 11.96 0 Td[(Hi(qi))]TJ /F4 11.955 Tf 11.95 0 Td[(Hj(qj)]. (1) 1.4.1.5UmbrellasamplingTheUmbrellaSamplingmethod,whichwasintroducedbyTorrieandValleau[ 25 ]in1977,isoneofthemajortechniquesforcalculatingthePMFalongpresetreactioncoordinates.Inthismethod,werestrainthesystematdifferentpart/partsofthereactioncoordinatebyaddingsingle/multiplebiaspotential/potentials.Thebiasisanadditionalpotentialenergyterm,whichrestrainsthesystemtothereactioncoordinate.Usually,thebiasisaquadraticfunctionofthereactioncoordinate:B()=k()]TJ /F7 11.955 Tf 11.95 0 Td[(0)2, (1)wherek,,and0arethebiasstrength,theorderparameterandthebiascenteronthereactioncoordinateandcongurationrespectively.InordertoextractanestimationofthePMFasafunctionofareactioncoordinate,itisnecessarytoremovetheeffectofbiasusingpost-processingmethods.Manymethodshavebeenproposedintheliterature;amongthose,theWeightedHistogramAnalysisMethod(WHAM)[ 26 ]andUmbrellaIntegration(UI)[ 27 28 ]arethemostpromising. 1.4.2Slow-GrowthMethodsSlow-growthmethodsareusedtocomputethePMFbetweentwogivenstates(e.g.,AandB)ofasystembysummingthefreeenergydifferencesbetweentheintermediate 22

PAGE 23

steps.Herethepotentialenergyforintermediatestagesisdenedas:U()=UA+(UB)]TJ /F4 11.955 Tf 11.95 0 Td[(UA), (1)hereiscalledthecouplingparameter,U(0)=UA,andU(1)=UB.Inslow-growthtechniques,intermediatestatesi.e.0<<1,maynothaveanyphysicalmeaning.Consequently,forthecaseofthecanonicalensemble,theHelmholtzfreeenergyatthestateofcanbewrittenas:A(N,V,T,)=)]TJ /F4 11.955 Tf 9.29 0 Td[(1 ln(Z(N,V,T,)), (1)whereZ(N,V,T,)=Piexp()]TJ /F7 11.955 Tf 9.3 0 Td[(Ui())isthepartitionfunctionforthestateof.HereIexplainThermodynamicsIntegration(TI)andFreeEnergyPerturbation(FEP),whichareamongtheprevalentslow-growthmethods.OneoftheapplicationsofthistechniqueisthecalculationofpKasofdifferentspeciesinproteins[ 29 ].Inthistypeofcalculation,oneofthetwoendstates(i.e.,=0or1)characterizestheprotonatedstate,andtheotherrepresentsthedeprotonatedstateorviceversa.Theintermediatescorrespondtohybridprotonatedanddeprotonatedstates. 1.4.2.1ThermodynamicintegrationAccordingtotheTItechnique,thefreeenergydifferencebetweenstatesAandBis:A(A!B)=Z10d@A @. (1)UsingEquation 1 thiscanberewrittenas:A(A!B)=Z10d)]TJ /F4 11.955 Tf 9.3 0 Td[(1 Z@Z @=Z10d1 ZXiexp()]TJ /F7 11.955 Tf 9.3 0 Td[(Ui())@Ui() @=Z10dh@U() @i, (1)wherethebracketaveragerepresentanensembleaveragegeneratedat.Inpractice,onehastobreaktheintegrationtothesumofinnitesimallydifferentintermediate 23

PAGE 24

states.Moreover,theintegrationpathshouldbereversible,i.e.,changesinshouldbesmallenough,andthesystemshouldberelaxedateachvalue. 1.4.2.2FreeenergyperturbationTheFreeenergyperturbationmethod,whichwasintroducedbyZwanzigin1954[ 30 ],isanotherfrequentlyusedfreeenergydifferencemethod.Consideringthecanonicalensemble,theHelmholtzfreeenergydifferencebetweenstatesAandBcanbeexpressedas:A(A!B)=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 lnZB ZA=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 lnXexp()]TJ /F7 11.955 Tf 9.3 0 Td[(UB) ZA=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 lnXexp()]TJ /F7 11.955 Tf 9.3 0 Td[(UA) ZAexp()]TJ /F7 11.955 Tf 9.3 0 Td[((UB)]TJ /F4 11.955 Tf 11.96 0 Td[(UA))=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 lnhexp()]TJ /F7 11.955 Tf 9.3 0 Td[((UB)]TJ /F4 11.955 Tf 11.96 0 Td[(UA))iA. (1)DuringthederivationofEquation 1 ,weimplicitlyassumedthenumberofenergymicrostatesinstatesAandBtobeequal,butthisisonlytruewhentwostatesareinnitesimallyclose.OnecanusetheEquation 1 toadapttheEquation 1 forcomputingthefreeenergydifferencebetweendistinctstatesinthesystem:A(A!B)=)]TJ /F4 11.955 Tf 9.3 0 Td[(1 M)]TJ /F5 7.97 Tf 6.58 0 Td[(1Xi=1lnhexp()]TJ /F7 11.955 Tf 9.3 0 Td[((Ui+1)]TJ /F4 11.955 Tf 11.96 0 Td[(Ui))ii, (1)whereMisthenumberofdiscretestepswhichhasbeenusedformovingfromstateAtostateB. 1.5Convergence,ErrorEstimation,andSamplingQualityinBiomolecularSimulations 1.5.1ErrorinComputationalMethodsTherearetwotypesoferrorsinestimatingavariableusingcomputationalmethods,systematicandrandom.Randomorstatisticalerrors,whichareassociatedwithprecisionofmeasurements,areuctuationsinthemeasureddataduetoimperfectsampling.Inordertoestimatetherandomerror,oneneedstorepeatthesimulation 24

PAGE 25

manytimesandcalculatethevarianceofthemeasuredvariable.Incontrast,systematicerrorsorbiasesarereproducibleinaccuracies,whichusuallycomefromthedeciencyofenergyandforcecomputationmethods.Thebiascanbedenedasadifferencebetweenthecorrectvalueandthemeasuredvalue.Theestimationofbiasiscomplicated,sincethecorrectvalueofavariableisnotaprioriknown.Toovercomethisproblem,typicallythebiasismeasuredwithrespecttoamoresophisticatedmodel(e.g.,biasofaforceeldmethodwithrespecttoaquantummethod).Ontheotherhand,randomerrorisassociatedwiththequalityofsampling.InthenextsectionIdescribetheergodictheory,whichisthebaseofmanysamplingqualitymeasurementmethods. 1.5.2ErgodicityandConvergenceErgodictheory,whichoriginatedfromBoltzmann'sworkinstatisticalphysics,describesthebehaviorofadynamicalsystemwhenitrunsforalongtime.Accordingtothishypothesis,asystemeventuallyvisitsallpointsinitsphasespace,ifitisallowedtorunforenoughtime.Oneoftheconsequencesofergodictheoryassertsthat,undercertainconditions,thetimeaverageofafunctionoveradynamicstrajectoryisequaltothespaceaverageofthatfunction,i.e.:hAi=ZpZqp(q,p))dpdq=limt!11 tZt0+tt0A()d, (1)wherep(q,p)istheprobabilityofthesystemhavingatcongurationqandthemomentumofp,moreovert0isanarbitraryoriginonthetimecoordinate.Theergodictheoryiscloselyrelatedtotheconceptofconvergence.Inthecontextofmolecularsimulations,asimulationisconsideredconvergedwhenthecollectedset(sampledset)isapreciserepresentationoftheoriginalensemble.Moreover,asystemiscalledsemi-ergodic,whensomestatesarenotaccessibleduringareasonablesimulationtime,sothesystemwillnotconvergeeasily.Sincetheoriginalensembleisnotfeasible,fullconvergenceisnotpracticableeither.Inotherwords,onecanapproachtheidealensemble,butcannotachieveit.Inthenextparts,Idiscusssometeststhat 25

PAGE 26

quantifythedegreeofcondenceaboutasimulationconvergence.Wehavetonotethatthoseassessmentsonlydescribenecessaryassurances,butnotthesufcientones. 1.5.3SamplingQualityChecksOneofthemostimportantquestionsinsamplingmethodsisthis:whenhavewecollectedenoughsamples?Toanswerthisquestion,weneedtoquantifytheconvergencerateandthesamplingqualityinasimulation.HereIdescribethemostemployedconvergenceanalysistoolsincomputationalbiophysicsandbiochemistry. 1.5.3.1RootmeansquaredeviationanalysisRootMeanSquareDeviation(RMSD)ofthecongurationsmeasurestheaveragemovementsofparticlesinasnapshotfrom(usually)theinitialsnapshot.ForasystemwithNparticles,Itisdenedas:RMSDt=s PNi=1jXi(t))]TJ /F4 11.955 Tf 11.96 0 Td[(X0ij2 N (1)wherejXi(t))]TJ /F4 11.955 Tf 11.95 0 Td[(X0ijisthespatialdisplacementofparticleithbetweentimetandtime0.OnewaytotracktheconvergencerateistocalculatethecumulativeRMSDaverageasfunctionoftime:hRMSDit=PtRMSDt n. (1)wherenisthenumberofsnapshotsfrombeginningtotimet.Thismethodisusefulfortrackingthefoldingstateofprotein,sinceinafoldedstatetheproteinuctuatesaroundtheaverageconguration.ThedenitionaboveisnotuniqueandthereareothervariationsofRMSDdenitions[ 31 35 ]. 1.5.3.2RootmeansquaredeviationclusteringClusteringistheprocessoforganizingasetofobjectsintogroupswhosemembersarerelatedinsomeway.Clusteringisapartofmanystatisticalanalyses.However,consideringthelonglistofthosemethodsisbeyondthescopeofthisdissertation.Amongthem,Rootmeansquaredeviationclusteringisfrequentlyusedinthecontext 26

PAGE 27

ofmolecularsimulations.Thismethodcanbeusedfordividinganensembleintosetsofself-similarstructures.Dauraetal[ 36 ]adoptedthismethodforassessingtheconvergencerateinthesimulation.Accordingtothismethod,asignicantdecreaseintherateofdiscoveryofnewclustersduringthesimulationisasignofconvergence.Howeveracloserinvestigationrevealsthatthisconditionisnecessarybutnotsufcient.Forexample,trappinginenergyminimacansignicantlydecreasetherateofclusterdiscoveryinasimulation,althoughthesimulationisnotconverged. 1.5.3.3BlockaveragingBlockAveraging(BA),asproposedbyFlyvbjergandPetersen[ 37 ],isamethodforexaminingthequalityofsamplinginacorrelatedsimulation.Thismethodcanbedescribedasfollowing:Atrajectorywithtimelengthof=M.nsnapshotsissplitintoMdifferentblockofsizen,startingwithsmalln,e.g.,n=1.ThemeanofobservableAiscalculatedforeachblock.ThenonecancalculatetheBlockedStandardError(BSE)as:BSE(A,n)=n=p M, (1)wherenisthevarianceamongMblockedaverages.Forsmallvaluesofn(i.e.,smallblocks)blocksaremorecorrelated,SoBSEunderestimatesthestandarderror.Asthevalueofnincreases(orMdecreases)thecorrelationdecreasesandBSEconvergestoavalue.PlottingtheBSE(A,n)vs.ncomprisesasignalofbothstatisticalerrorconvergenceanddecorrelationofA. 1.5.3.4PrincipalcomponentanalysisPrincipalComponentAnalysis(PCA)[?]isatechniqueforcalculatingthelarge-scalecharacteristicmotionsfromasimulationtrajectory.InordertoperformPCA,oneneedstocreatethe3N3N(whereNisthenumberofatoms)covariancematrixas:Cij=h(xi)]TJ ET q .478 w 234.01 -616.04 m 243.04 -616.04 l S Q BT /F4 11.955 Tf 234.01 -622.86 Td[(xi)(xj)]TJ ET q .478 w 278.01 -616.04 m 287.27 -616.04 l S Q BT /F4 11.955 Tf 278.01 -622.86 Td[(xj)i (1) 27

PAGE 28

andcomputetheeigenvaluesandeigenvectorsofthematrix.Wherexirepresentsithdegreeoffreedominthecongurationspace.Eacheigenvaluerepresentsthemeansquaredeviationalongthecorrespondingeigenvector.Themodeswithhighereigenvaluessignifythemaindirectionsofmotioninthesystem.AnotherapplicationofPCAisformeasuringthedegreeofsimilarityinthevariationsoftwotrajectories[ 38 ].Also,thiscanbeusedasatestofconvergenceinasinglelongtrajectorybydividingthelongtrajectoryintotrajectorieswithsmallerlengths. 1.5.3.5Kullback-LeiblerdivergenceKullback-Leiblerdivergence[ 39 40 ]orrelativeentropyisdenedas:DK)]TJ /F5 7.97 Tf 6.58 0 Td[(L(PjjQ)=XiP(i)lnP(i) Q(i), (1)wherePandQaretheprobabilitydistributionsofarandomvariable.Itistheaverage,overthedistributionP,ofthelogarithmicdifferencebetweentheprobabilitiesPandQ.ItmeasuresthedegreetowhichPisdistinguishablefromQ.DK)]TJ /F5 7.97 Tf 6.59 0 Td[(Lisnon-negativequantityandasmallvalueofitindicatesthatPandQarehighlyoverlapped.Thismetriccanbeusedasameasureofconvergencerate,whereQisatarget(orreference)distributionandPischangingwithtime. 1.6OutlineofMyResearchDuringmyPh.Dstudies,IhavebeenmostlyfocusedondevelopingandtestingenhancedsamplingmethodsandmainlyHamiltonianReplicaExchangeMolecularDynamics(HREMD).Myrstproject,whichisdescribedinthenextchapter,wasaboutdevelopingReplicaExchangeFreeEnergyPerturbation(REFEP)method,whichisacombinationoftheFreeEnergyPerturbation(FEP)andHREMD[ 22 ].WedemonstratedthatHREMDmethodnotonlyimprovesconvergenceinalchemicalfreeenergycalculations,butalsocanbeusedtocomputefreeenergydifferencesdirectlyviatheFEPalgorithm.WeshowedadirectmappingbetweentheHREMDandtheusualFEPequations,whichare 28

PAGE 29

thenuseddirectlytocomputefreeenergies.WetestedREFEPonpredictingthepKavalueoftheburiedAsp26inthioredoxin.WecomparedtheresultsofREFEPwithTIandregularFEPsimulations.REFEPcalculationsconvergedfasterthanthosefromTIandregularFEPsimulations.ThenalpKavaluepredictedfromHREMDsimulationwasonly0.4pKaunitsabovetheexperimentalvalue.Briey,weshowedREFEPalgorithmnotonlyimprovesconformationalsampling,butalsoimprovestheconvergencerateoffreeenergysimulations.Mynextproject,i.e.,thethirdchapter,wasademonstrationofthecapabilitiesofREFEPinestimatingthepKaofcomplicatedproteins[ 41 ].Accordingtotheexperimentalmeasurements,thepKavalueofGlutamate66(GLU66)inahyperstablemutantofstaphylococcalnucleasedisplaysalargeshift,roughly4.6pKaunits,relativetoitsnormalvalueinwater,asmeasuredinthelabofMorenoetal[ 83 84 92 98 ].Inordertoreproducethelargeexperimentalshiftusingasinglestructure,continuumsolventandcomputationalmethods,aninternaldielectricconstantaround10isnecessary.Thephysicalreasonforthisisnotyetunderstood,buthypotheseshavebeenproducedbyMorenoetal[ 91 97 99 104 ]regardingsolventpenetration,proteinreorganization,etc.Weaimedtoresolvethisinconsistencybetweenexperimentalandcontinuummethodsbyintroducingafour-statethermodynamiccyclethatcouplesconformationalstateswithprotonationstatesofGLU66.Weproposedthattheexperimentalmethods(whicharemostlysensitivetocongurationalchanges)measuretheequilibriumconstantbetweenthetwocongurationalstatesinsteadofthetwoprotonationstates.WeusedREFEPmethodinimplicitsolventtocalculatethepKavalueofGLU66foreachofthecongurationalstatesaswellasthemixedconguration.TheresultsareinalmostperfectagreementwiththeexperimentsofMorenoetal.InmythirdresearchprojectIdevelopedandvalidatedapH-ReplicaExchangeMolecularDynamics(pH-REMD)method[ 24 ].Thismethodimprovesthecouplingbetweenconformationalandprotonationsampling.UnderaHREMDsetup,conformations 29

PAGE 30

areswappedbetweentwoneighboringreplicas,eachhavingdifferentpHs.WeappliedpH-REMDtoaseriesofmodelcompounds,aterminallychargedADFDApentapeptide,andaheptapeptidederivedfromtheovomucoidthirddomain(OMTKY3).Inallofthosesystems,thepredictedpKavaluesbypH-REMDwereveryclosetotheexperimentalvaluesandalmostidenticaltotheonesobtainedbyconstantpHmoleculardynamics(CpHMD).InthelastyearofmyPHD,IhavefocusedonoptimizingHREMDmethods.Thegoalisndingthebestpositionforreplicasinordertomaximizetheroundtripbetweenextremumpositionsonreplicaladder.Wedeveloped,validated,andtestedamethodforestimatingtheprobabilityofexchangebetweenneighboringreplicasinUmbrellaSamplingReplicaExchangeMD(USRE)[ 42 ].Weuseinformationfromveryshortumbrellaruns,needingonlyahandfulofwindows.Wedesignedamultidimensionalscoringfunctiontooptimizethesetofreplicas(windows).Bymaximizingthescoringfunction,weenforcethesameexchangeacceptanceforallneighborreplicapairs.Wefoundhavingequalexchangeacceptancebetweenpairsincreasesthenumberofroundtripsandimprovestheefciencyofsampling.Adescriptionofthismethodcanbefoundinchapter5. 30

PAGE 31

CHAPTER2COMPUTINGALCHEMICALFREEENERGYDIFFERENCESWITHHAMILTONIANREPLICAEXCHANGEMOLECULARDYNAMICS(HREMD)SIMULATIONS 2.1LiteratureReviewFreeenergy,especiallythefreeenergydifferencebetweentwostates,isacrucialquantityinthestudyofchemicalandbiologicalsystems.[ 43 ]Knowledgeofthefreeenergydifferencescanhelpusunderstandthebehaviorsofsuchsystems.Forexample,thefreeenergyofbindingisoneofthecriteriausedtoevaluatetheperformanceofdrugs.[ 44 ]Therefore,oneimportantaspectofmolecularmodelingistoyieldaccuratefreeenergydifferencesefciently.Manyfreeenergycalculationmethodologies(suchasfreeenergyperturbation,[ 30 ]thermodynamicintegration,[ 45 ]umbrellasampling,[ 25 46 47 ]andJarzynski'sequality[ 48 ]aswellasanalysistechniques(suchastheweightedhistogramanalysismethod[ 49 ]andBennettacceptanceratiomethod[ 50 51 ]havebeendevelopedtoachievethisgoal.Ingeneral,freeenergycalculationscouldbedividedintoalchemicalfreeenergyandconformationalfreeenergycalculations.Thealchemicalfreeenergycalculationsareoftenemployedwhenstudyingthefreeenergydifferencesofprocessesthatinvolvechangesinnoncovalentinteractions.Inanalchemicalfreeenergysimulation,anonphysicalreactioncoordinateisgenerallyadoptedintoconnecttheinitialandnalstates.Thisreactioncoordinateisusuallyexpressedasaninterpolationoftheinitialandnalstates.Thus,analchemicalprocessisachievedthroughaseriesofintermediatestateshavingnodirectphysicalmeaning.Sincethefreeenergydifferencebetweentwostatesisastatefunction,theactualchoiceofcoordinatecannot,inthelimitofinnitesampling,affecttheresults.Freeenergyperturbation(FEP) ReprintedwithpermissionfromMeng,Y.;Dashti,D.S.;Roitberg,A.E.J.Chem.TheoryComput.2011,7,2721. 31

PAGE 32

andthermodynamicintegration(TI)aretwocommonmethodologiesthatareutilizedinalchemicalfreeenergycomputations.Oneimportantissueinalchemicalfreeenergycalculationsistheconvergenceofthefreeenergydifferenceversuscomputationalcost.Suchconvergenceisparticularlydifcultinsystemsinvolvingslowstructuraltransitionorlargeenvironmentalreorganizationaschanges.[ 21 52 53 ]Therefore,conformationalsamplingiscrucialinalchemicalfreeenergycalculations.Enhancedsamplingmethods,suchasreplicaexchangemoleculardynamics(REMD),[ 2 ]orthogonalspacerandomwalk(OSRW),[ 53 ]andacceleratedmoleculardynamics(AMD)[ 54 ]havebeenappliedtofreeenergysimulationsinordertoaccelerateconformationalsamplingand,inturn,toyieldaccurateandconvergedfreeenergydifferences.Amongtheenhancedsamplingmethodologies,theREMDmethodisofparticularinterestbecausetheweightofeachstateisaprioriknown(Boltzmannfactor).Bothtemperature-basedandHamiltonian-basedREMDhavebeenappliedtoalchemicalfreeenergycalculations.Woodsetal.[ 21 ]andRick[ 60 ]havecombinedthetemperature-basedREMDwithTIcalculation.Atemperature-basedREMDsimulationisconductedateachstatealongthereactioncoordinate.Woodsetal.[ 21 ]havealsoappliedtheHREMDmethodologytoFEPandTIcalculations.EachreplicaintheHREMDsimulationrepresentsastatealongthereactioncoordinate,andaperiodicswapinisattempted.Relativesolvationfreeenergyofwaterandmethaneaswellastherelativebindingfreeenergiesofhalidestocalispyrrolehavebeencalculatedinthisway[ 21 ].TheYanggrouphasdevelopedadual-topologyalchemicalHREMD(DTA-HREM)method[ 61 ].Theirmethodwastestedonthefreeenergyofmutatinganasparagineaminoacid(withtwoendsblocked)toleucine.Morerecently,theRouxgroupcoupledtheFEPmethodologywiththedistributedreplicatechnique(REPDSTR)[ 62 63 ].Anadditionalaccelerationinthesamplingoftheside-chaindihedralanglewasalsoincorporatedwhenJiangandRouxutilizedtheFEP/HREMD 32

PAGE 33

methodtostudytheabsolutebindingfreeenergyofp-xylenetotheT4lysozymeL99Amutant[ 63 ].Inallofthosestudies,theconformationalsamplingandconvergenceoffreeenergycomputationsshowedsignicantimprovementwhentheREMDmethodwasapplied.Theprotocolpresentedhereacceleratesconvergencebut,ofcourse,doesnotsolveknownproblemsintheeldrelatedtoenhancedsamplingofcoordinatesorthogonaltospace,whichwouldhampermanyofthecurrentmethods.Inthischapter,wewilldemonstratethatFEPisactuallyalreadyincorporatedintheHREMDmethodinanelegantandformalway.TheREFEPmethodisshowntobenotonlyanenhancedsamplingmethodbutalsoafreeenergycalculationalgorithm.WewillapplytheREFEPmethodtothepKapredictionofthioredoxinAsp26.TheexperimentalpKavalueof7.5hasbeenshowntobeoneofthelargestshiftedfromtheintrinsicpKavalue[ 64 65 ]and,hence,makesitaninterestingcasetobestudiedtheoretically.TIandFEP(regularmoleculardynamicsforconformationalsampling)alchemicalfreeenergysimulationshavebeenconductedinordertocomparewithREFEPsimulations.AveryaccuratetheoreticalpKavalueisobtainedfromREFEPsimulations.TheconvergenceofthefreeenergydifferenceandpKavalueisachievedinREFEPsimulationsmuchfasterthanthatintheFEPandTIsimulations.TheadvantageandsimplicityofusingtheHREMDsimulationtocomputethealchemicalfreeenergydifferenceisclearlyshown. 2.2TheoryandMethod 2.2.1FreeEnergyPerturbation(FEP)TheFEPmethod,whichwasinitiallyintroducedbyZwanzigin1954[ 30 ],isawellestablishedmethodandisconsideredthemostfrequentlyemployedmethodologyinalchemicalfreeenergycalculations[ 52 ].ThedetailsoftheFEP,aswellastheTI,methodologyanditsapplicationshavebeenextensivelyreviewed[ 52 66 69 ].Therefore,onlyaverybriefdescriptionoftheFEPandTImethodswillbegivenhere.Considertwostates(1and2)ofasysteminthecanonical(NVT)ensemble,andtheircorrespondingHelmholtzfreeenergiesA1andA2.TheHelmholtzfreeenergydifferencebetweentwo 33

PAGE 34

statescanbeexpressedasA1!2=)]TJ /F4 11.955 Tf 9.3 0 Td[(kBTlnhexp[U2(q))]TJ /F4 11.955 Tf 11.95 0 Td[(U1(q)] kBTi1. (2)Here,kBistheBoltzmannconstant,Tisthetemperature,andqisthemolecularstructure.U1andU2arethepotentialenergiesofstates1and2,respectively.Thebracketwithsubscript1standsfortheaveragecalculatedoverthestructuralensemblegeneratedbystate1.InordertocomputeA1!2,onesimulationofstate1isperformed.Onceacongurationqistaken,thepotentialenergydifferenceatcongurationqiscomputed.Theensembleaverage,whichishexp[U2(q))]TJ /F5 7.97 Tf 6.58 0 Td[(U1(q)] kBTi1,canbecalculatedeasily,andhence,A1!2isobtained.AlthoughtheHelmholtzfreeenergiesareutilizedhere,Equation 2 canbeextendedtoanisothermal-isobaric(NPT)ensembleandtotheGibbsfreeenergyinthesamemanner.WhentheuctuationsinUinEquation 2 aretoolarge,FEPcalculationsarenotoriouslyhardtoconverge.TheconvergenceoftheFEPcalculationwillbepooriftheoverlapinphasespacebetweenthetwostatesissmall.Inordertocomputethefreeenergydifferencebetweentwostatesthatareverydifferent,intermediatestatesmixingthetwoendpointsareadoptedinsuchawaythatthedifferencesbetweenneighborscanbetreatedasperturbations.Afrequentlyemployedmethodtogenerateintermediatestatesistointerpolatepotentialenergyfunctionslinearly,asshowninEquation 2 .InEquation 2 ,U1andU2arethepotentialenergyfunctionsofstates1and2,respectively.Freeenergydifferencesbetweenneighboringstatesarethencomputed.Thesumofindividualfreeenergydifferenceswillbethetargetedfreeenergydifferencebetweenstates1and2(Equation 2 ).TherearemanywaysofexecutingFEPcalculationsinvolvingintermediatestates.Thedouble-ended,double-wide[ 67 70 ],andoverlapsamplingalgorithms[ 71 ]areamongthemostpopularones.Athoroughdescriptionofdifferentalgorithmsandtheirperformancecanbefoundinarecentreview 34

PAGE 35

byJorgensenandThomas[ 67 ].U()=U1+(U2)]TJ /F4 11.955 Tf 11.95 0 Td[(U1), (2)A1!2=)]TJ /F4 11.955 Tf 9.3 0 Td[(kBTXilnhexp[U(i+1))]TJ /F4 11.955 Tf 11.96 0 Td[(U(i)] kBTii. (2)Inpractice,computingA1!2(forwardfreeenergydifference)isequallyeasy(orhard)ascomputingA2!1(backwardfreeenergydifference),andoneisexactlytheoppositeoftheotherinprinciple.Evaluationofforwardandbackwardfreeenergydifferencesprovidesanindicationofconvergence.Furthermore,thepotentialenergydifferencesgeneratedfrombothdirectionscanbeutilizedtoreducestatisticalerror.TheBennettacceptanceratio(BAR)methodisafrequentlyemployedschemetoimprovetheprecisionofafreeenergyestimator[ 50 52 ]. 2.2.2ThermodynamicIntegrationAnotherwayofwritingthefreeenergydifferencebetweentwostates1and2isA1!2=)]TJ /F4 11.955 Tf 9.3 0 Td[(kBTXilnhexp[U(i+1))]TJ /F4 11.955 Tf 11.96 0 Td[(U(i)] kBTii. (2)Here,isareactioncoordinateconnectingstates1and2,andUisthepotentialenergyofastatealongthereactioncoordinate.Thebracketrepresentsanensembleaveragegeneratedatavalueof.TheintegrationisoftenevaluatednumericallyviatrapezoidalruleorGaussianquadrature.IfU()isconstructedasinEquation 2 ,thederivativeofU()withrespecttois@U() @=U2)]TJ /F4 11.955 Tf 11.96 0 Td[(U1. (2)Andthefreeenergydifferencebetweenstates1and2canbeexpressedasA1!2=1Z0hU2)]TJ /F4 11.955 Tf 11.95 0 Td[(U1id (2) 35

PAGE 36

Hence,theensembleaverageofthepotentialenergygapbetweenstates1and2ateachvalueisneededinaTIcalculation.Inthischapter,weusethetermTItorefertoconstrainedTI,inwhichthevalueofisnotallowedtochangeateachwindow. 2.2.3HamiltonianReplicaExchangeMolecularDynamics(HREMD)TheoriginalREMDmethodutilizesreplicashavingdifferenttemperatures(TREMD).Replicasathightemperaturesovercomepotentialenergybarriersmoreeasilythanthoseatlowtemperatures.Anotherwaytoovercomepotentialenergybarriersissimplytochangethepotentialenergysurfacetoreducepotentialenergybarriers.IntheHREMDalgorithm,replicasdifferintheirHamiltoniansbuthavethesametemperature.RegularMDisperformed,andanexchangeofcongurationsbetweentwoneighboringreplicasisattemptedperiodically.Figure 2-1 demonstratestheHREMDalgorithmandthefreeenergycomputationinanHREMDsimulation.Letusconsidertworeplicas1and2withcorrespondingpotentialenergiesU1andU2.ByemployingthedetailedbalanceconditionandBoltzmannweightofeachmolecularstructure,thetransitionprobabilitycanbewrittenasw(q1!q2)=minf1,exp[)]TJ /F4 11.955 Tf 9.3 0 Td[((U1(q2)+U2(q1))]TJ /F4 11.955 Tf 11.96 0 Td[(U1(q1))]TJ /F4 11.955 Tf 11.95 0 Td[(U2(q2))=kBT]g (2)whereq1andq2arethemolecularstructuresofreplicas1and2beforeanexchangeattempt,respectively.AMonteCarloMetropoliscriterion[ 14 ]isusedtoevaluatewhethertheattemptedswapofstructuresbetweentworeplicasshouldbeacceptedornot.Equation 2 canberegroupedasw(q1!q2)=minf1,exp)]TJ /F4 11.955 Tf 11.29 0 Td[([(U2(q1))]TJ /F4 11.955 Tf 11.95 0 Td[(U1(q1)+U1(q2))]TJ /F4 11.955 Tf 11.96 0 Td[(U2(q2))=kBTg (2)WhencomparingtheexponentialtermsinEquation 2 and 2 ,itisclearthatEquation 2 incorporatesallinformationnecessaryforaFEPcalculation.U2(q1))]TJ /F4 11.955 Tf -433.22 -23.91 Td[(U1(q1)isthepotentialenergydifferencecomputedonthebasisofthestructuralensemblegeneratedbyU1,whileU1(q2))]TJ /F4 11.955 Tf 12.82 0 Td[(U2(q2)isthepotentialenergydifference 36

PAGE 37

Figure2-1. DiagramsdisplayingtheHREMDexchangealgorithmandfreeenergycalculation.(A)Exchangeattemptorders.Replicasconnectedbyacurveareneighbors,andattemptsaremadetoexchangemolecularcongurations(q).(B)FreeenergycalculationsintheHREMDmethod.Eachreplicahastwofreeenergydifferences:AupandAdownfromitsattemptingneighborformapairandarecomputedsimultaneously,whileAdownandAdownfromitsattemptingneighborformtheotherpair.Inexchangeattempts(regardlessiftheattemptsareacceptedorrejected),twopairsoffreeenergydifferencesarecomputedinanalternatingfashionutilizingEquation 2 computedonthebasisofthestructuralensemblesgeneratedbyU2.Everytimethetransitionprobabilityiscomputed,thosepotentialenergydifferencescanbeutilizedtocomputetheensembleaverageshowninEquation 2 .Therefore,A1!2andA2!1canbecomputedon-the-yutilizingthedouble-endedscheme.TheensembleaverageinEquation 2 iscomputedregardlessofwhetheranexchangeattemptisacceptedorrejected.WhenemployingtheHREMDmethodtoimproveconformationalsamplinginthestudyofalchemicalchanges,HREMDsimulationsareablenotonlytoenhanceconformationalsamplingbutalsotoyieldthefreeenergydifferencedirectly.Infact, 37

PAGE 38

aregularFEPcalculationcanbethoughtofasanHREMDcalculationinwhichnoexchangesareallowedbetweenreplicas.Inpractice,asshowninFigure 2-1 ,therearetwofreeenergydifferencecalculations(AupandAdown)continuouslyassociatedwitheachreplica.Takereplica1asanexample:Aup=A1!2whileAdown=A2!1.Inprinciple,whenconverged,A1,upshouldbeequaltothenegativeofA2,down:A1,up=)]TJ /F4 11.955 Tf 9.3 0 Td[(kBTlnhexp[U2)]TJ /F4 11.955 Tf 11.96 0 Td[(U1] kBTi1=)]TJ /F4 11.955 Tf 9.3 0 Td[(A2,down=kBTlnhexp[U1)]TJ /F4 11.955 Tf 11.95 0 Td[(U2] kBTi2 (2)Anydifference(exceptforthesign)betweenthetwoisanindicationoferrororlackofconvergence.Convergencealsowasgaugedbythetimedependenceofthepredictedfreeenergydifferences,computingGversussimulationlength.ThisprovidesanasymptoticallyunbiasedestimatorforG,thusallmethodspresentedheremusteventuallyreachthesamenalvalue(withinerrorbars).REFEPispresentedinthischapterasshowingfasterconvergencetowardthenalvalue. 2.2.4SimulationDetailsAccuratelydeterminingthepKavaluesofionizableresidues,especiallythosewithlargeshiftsfromintrinsicpKavalues,isofgreatinterestbothexperimentallyandcomputationally[ 64 65 72 ].Here,thepKacalculationofAsp26inthioredoxinhasbeenselectedasatestcaseinordertocomparetheperformanceofalchemicalfreeenergysimulations.Asp26hasbeenfounddeeplyburiedinthioredoxinandpossessesoneofthelargestpKashiftsamongproteincarboxylicgroups[ 64 65 ].FollowingtheprotocolemployedinthepaperofSimonsonetal[ 72 ],thethermodynamiccycleutilizedtocomputethepKavalueofanionizableresidueisgiveninFigure 2-2 .Asitcanbeseenthere,theuseofamodelcompoundasanauxiliaryleginthethermodynamiccyclemakesG3(protontoproton)equaltozero.Essentially,thepKashiftrelativetothe 38

PAGE 39

intrinsicvalue(pKa,model)iscomputedaspKa(protein)=pKa(model)+1 2.303kBTG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.08 -4.94 Td[())]TJ /F4 11.955 Tf 11.96 0 Td[(G(AH!A)]TJ /F4 11.955 Tf 7.09 -4.94 Td[() (2)whereG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.09 -4.34 Td[()andG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()arethefreeenergydifferencesbetweenprotonatedanddeprotonatedasparticacidintheproteinenvironmentandinaqueoussolution,respectively.Alchemicalfreeenergysimulationswereperformedinordertoyieldthosetwoterms.InEquation 2 ,theGibbsfreeenergydifferencesareusedbecauseexperimentsdeterminingpKavaluesgenerallyareconductedunderanisobaric-isothermalcondition. Figure2-2. ThermodynamiccycleusedtocomputethepKashift.Bothaciddissociationreactionsoccurinaqueoussolution.Theprotein-AHrepresentstheionizableresidueinaproteinenvironment.TheAHrepresentsthemodelcompoundwhichisusuallythesameionizableresiduewithcappedterminii.Inpractice,aprotondoesnotdisappearbutinsteadbecomesadummyatom.Theprotonstillhasitspositionandvelocity.Thebondedinteractionsinvolvingtheprotonarestilleffective.However,therearenononbondedinteractionsforthatproton.Thechangeintheionizationstateisreectedbychangesofpartialchargesintheionizableresidue. AsparticaciddipeptideinimplicitwatersolventwastakenasthemodelcompoundwithapKavaluetakenas4.0[ 73 ].Theoxidizedformofthioredoxin(PDBcode2TRX)[ 74 ]inimplicitwaterwasusedinoursimulation.Changesinionizationwererepresentedbychangesinthepartialchargesoftheasparticacidsidechain(ASH!ASP 39

PAGE 40

intheAMBERterminology).SincethevanderWaalsradiusoftheprotoninasparticacidiszeroforbothprotonatedanddeprotonatedspecies,thefreeenergydifferenceonlycontainstheelectrostaticinteractions.Threetypesoffreeenergysimulationswereperformedforboththemodelcompoundandtheprotein:TI(forwardandbackward),HREMD-FEP(REFEP),andregularFEPsimulations.OurregularFEPsimulationswerecarriedoutviaHREMDsimulationsbutwithallexchangeattemptsrejected.ComparingthepKapredictionandfreeenergyconvergencefromFEPandREFEPsimulationswilldirectlyindicatetheeffectoftheenhancedconformationalsamplingduetotheexchanges.Linearinterpolationofpointchargeswascarriedoutinordertoassignsidechainchargesforintermediatestates.Aseven-pointGaussianquadraturewasselectedtocomputetotalfreeenergydifferenceforTIcalculations.Therefore,eightvalues(oneendpointisneededineitherdirection)wereutilizedintheTIsimulation.DuetotheimplementationoftheTIalgorithminAMBER[ 75 ],16replicaswereutilizedtoensurethesameamountofsimulationtimeforallfreeenergysimulations.Asimulationtimeof5nswasusedforeachvalueandforeachreplicainthestudyofthemodelcompound,whileforthioredoxin,weused4nsruns.Structuralswapsbetweenneighboringreplicaswereattemptedevery2ps(1000MDsteps).Noparticularattemptwasmadeinthisworktooptimizethenumberorlocationofthereplicas,northeexchangeattemptfrequency.AllsimulationsweredoneusingtheAMBER10molecularsimulationsuite[ 75 ],locallymodiedtoaddHREMD/REFEPcapabilities.TheAMBERff99SBforceeld[ 76 ]wasutilizedinallofthesimulations.TheSHAKEalgorithm[ 77 ]wasusedtoconstrainthebondsconnectinghydrogenatomswithheavyatomsinallofthesimulations,whichallowedtheuseofa2fstimestep.TheOBC(Onufriev,Bashford,andCase)generalizedBornimplicitsolventmodel(igb=5intheAMBERterminology)[ 78 ]wasusedtomodelthewaterenvironmentinallofourcalculations.ThecutofffornonbondedinteractionandtheBornradiiwassetto99A.Thisvalueislargerthanthedimensionof 40

PAGE 41

bothsystems.Langevindynamicswasemployedinordertomaintainthetemperatureat300K,usingafrictioncoefcientof3.0ps)]TJ /F5 7.97 Tf 6.59 0 Td[(1. 2.3ResultsandDiscussions 2.3.1AcceptanceRatioofHREMDSimulationsTheaccuracyofFEPdependsontheoverlapsbetweenphasespaces,whichcanbemeasuredasoverlapsbetweenpotentialenergydifferencedistributions[ 52 ].TheacceptanceratioinanHREMDsimulationisanindicationoftheoverlapbetweentwopotentialenergydifferencedistributions[ 61 ].Therefore,itcouldbeutilizedtomonitortheconvergenceoffreeenergycalculationqualitatively.Inourstudy,largeacceptanceratioswereobservedinboththemodelcompoundandproteinHREMDsimulation.Theacceptanceratiobetweentwoneighborsrangedfrom0.7to0.9inallHREMDsimulations.Thoselargeacceptanceratiosindicatethattheoverlapinphasespaceislarge. 2.3.2AsparticAcidModelCompoundStudyThefreeenergydifferencesontheright-handsideofEquation 2 werecalculatedasdescribedintheTheoryandMethodsection.Thecumulativeaveragefreeenergydifferenceasafunctionoftimeisreportedhere.Figure 2-3 AshowstheG(AH!A)]TJ /F4 11.955 Tf 7.09 -4.33 Td[()fromTI,HREMD,andFEPsimulations(asmentionedbefore,aFEPsimulationhasbeenperformedbyrejectingallexchangeattemptsinanHREMDsimulation).ThedifferencesbetweenforwardandbackwardG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()areshowninFigure 2-3 B.Aconvergedalchemicalfreeenergysimulationshouldgeneratethesameforwardandbackwardfreeenergynumerically(exceptforanoppositesign).Anynonzerovalueisanindicationoffreeenergynotconverged.Forasimplesystemsuchasasparticacidinimplicitwater,5nsofsimulationtimewaslongenoughforG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()tostabilizeinallthreealchemicalfreeenergysimulations,asshowninFigure 2-3 A.TheforwardandbackwardG(AH!A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[()attheendofeachfreeenergycalculationandthecorrespondingerrorbarsarelisted 41

PAGE 42

Figure2-3. (A)Cumulativeaveragefreeenergydifferencesbetweenprotonatedanddeprotonatedasparticacidinthemodelcompound(G(AH!A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[()).(B)ThedifferencesbetweenforwardandbackwardG(AH!A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[().(C)CumulativeaveragefreeenergydifferencesbetweenprotonatedanddeprotonatedAsp26inthioredoxin(G(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()).(D)Thedifferencesbetweenforwardandbackward(G(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.09 -4.33 Td[()). 42

PAGE 43

Table2-1. Freeenergydifference(inKcal=mol)betweenprotonatedanddeprotonatedAsparticacidsobtainedfromTI,REFEP,andFEPalchemicalfreeenergysimulations TIREFEPFEP ASPmodelforward-59.430.06-59.690.05-59.840.06backward-59.560.06-59.660.05-59.720.06average-59.500.08-59.680.08-59.780.08 Asp26inthioredoxinforward-54.350.61-54.290.17-54.230.56backward-55.820.39-54.240.14-53.840.56average-55.090.72-54.270.22-54.040.79 Gdifferenceforward5.080.615.400.185.610.56backward3.740.395.420.155.880.56average4.410.725.410.235.740.79 predictedpKa,proteinforward7.70.47.90.18.10.4backward6.70.37.90.18.30.4average7.20.57.90.28.20.6 inTable 2-1 .Theforwardandbackwardfreeenergydifferencesarethesame(withinerrorbars)forbothREFEPandFEPsimulations.However,theTIsimulationsfailedtodothat,althoughthedifferencewasverysmall(thedifferencebetweenforwardandbackwardG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()wasonly0.13Kcal=mol).TheaverageofforwardandbackwardG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()wastakenasthenalvalueofG(AH!A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()forthemodelcompoundandisalsoreportedinTable 2-1 .Clearly,asshowninFigure 2-3 B,theREFEPsimulationshaveconvergedmuchfasterthantheFEPcalculationsdid. 2.3.3StudyonAsp26inThioredoxinThefreeenergydifferencebetweenprotonatedanddeprotonatedAsp26isshowninFigure 2-3 CandD.Byanalogywiththemodelcompoundplots,thecumulativeaverageasafunctionoftimeisreported.ThecumulativeaveragewasclearlynotconvergedduringtheTIsimulation,andneitherwasthedifferencebetweenforwardandbackwardG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.08 -4.34 Td[().AccordingtoTable 2-1 ,after4nsofTIsimulation,thedifferencebetweenforwardandbackwardfreeenergywas1.4Kcal=mol,whiletheuncertaintyoftheforwardandbackwardfreeenergydifferenceswas0.61and0.39Kcal=mol,respectively.DatanotpresentedhereshowthatTIrequiresroughly 43

PAGE 44

40nsofdynamicsbeforeconvergingtoresultscomparablewithFEP/REFP.ItisworthnotingthatthiscomparisonisslightlyunfairtoTIanddeservesfurtherexplanation.First,weusedeightintermediatestatesforTIversus16forFEP/REFEP.Thissetup,whenexecutedwithinAmber,usesthesameCPUtimesincetheTIimplementationisdonewithdual-topologymethods.Infact,reusingtheensemblegeneratedwiththeFEPHamiltoniansandcomputingTIvaluesonthatensembleproducesveryrapidly-convergingresults.ForregularFEPfreeenergycalculations,thecumulativeaveragesstabilizedafterroughly2.2nsofsimulation,whilethecumulativeaveragesfortheREFEPsimulationstabilizedmuchmorerapidly(showninFigure 2-3 C).Furthermore,Figure 2-3 DillustratesthatthedifferencebetweenforwardandbackwardG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.09 -4.34 Td[()intheREFEPreachedavalueveryclosetozero(-0.05Kcal=mol)veryquickly.Asdescribedpreviously,thenalvalueofG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.08 -4.34 Td[()wascalculatedastheaverageofforwardandbackwardfreeenergydifferences.Althoughthenalfreeenergydifferencescomputedfrom4nsofsimulationwerethesameforREFEPandregularFEP,thecalculationsconvergedmuchfasterinREFEPthaninFEPsimulation.SincetheHREMDandFEPcalculationsonlydifferedinwhetherstructureswereallowedtobeexchangedornot,theimprovementinalchemicalfreeenergyconvergenceresultedfromemployingenhancedconformationalsamplingtechniqueissignicant.DatanotpresentedhereshowthatthehistogramsofP1(U)exp()]TJ /F7 11.955 Tf 9.29 0 Td[((U))forthecalculationofthefreeenergydifferencebetweenreplicas1and2fordifferentsamplingtimesareslightlydifferentforFEPandREFEP.TheREFEPdistributionsconvergefasterwithtimeandsampletheleftsideofthedistributionbetter.Thishelpsrationalizethefasterconvergenceofourtechnique. 2.3.4pKaPredictionforAsp26inThioredoxinThepKavalueofAsp26inthioredoxincanbecomputedfromEquation 2 .ThenalvalueofG(proteinAH!proteinA)]TJ /F4 11.955 Tf 7.08 -4.33 Td[()fromtheREFEPsimulationwas 44

PAGE 45

Figure2-4. PredictedpKavalueofAsp26inthioredoxinasafunctionoftime.The(G(AH!A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[())valuesutilizedinEquation 2 were-59.68and-59.78Kcal/molforREFEPandFEP,respectively.Theexperimentalvalueis7.5. -54.3Kcal=mol,withapredictedpKavalueof7.9,whichisonly0.4pKaunitsabovetheexperimentalvalue.ThepredictedpKavaluewithrespecttotimefromREFEPsimulationswasplottedinFigure 2-4 inordertodemonstratetheconvergenceofthepKaprediction.Figure 2-4 showsthatREFEPsimulationsnotonlyyieldedanaccuratepredictedpKavaluebutalsoachievedconvergenceveryfast.TheregularFEPsimulationpredictedapKavalueof8.2,whichis0.7pKaunitsabovetheexperimentalvalue.TheconvergenceintheregularFEPsimulationwasalsoworsethanthatintheREFEPsimulation. 45

PAGE 46

2.4ConcludingRemarksConformationalsamplingiscrucialinfreeenergycalculations.Inthecaseofalchemicalfreeenergycalculations,HREMDisausefulandpopularmethodtoenhancetheaccuracyandconvergenceoffreeenergysimulations.Inthischapter,wehavedemonstratedthatREFEPnotonlyimprovesconformationalsamplinginfreeenergycalculationsbutalsoyieldsafreeenergydifferencedirectlyviatheFEPalgorithm.TheimplementationofREFEPistrival,onceaHREMDcodeisinplace.TheREFEPalchemicalfreeenergycalculationwastestedonpredictingthepKavalueofAsp26inthioredoxinandcomparedwithTIandregularFEPsimulations.FreeenergydifferencesfromtheREFEPsimulationconvergedfasterthanthosefromTIandregularFEPsimulations.ThenalpredictedpKavaluefromtheREFEPsimulationwasveryaccurate,only0.4pKaunitabovetheexperimentalvalue.UtilizingtheREFEPalgorithmsignicantlyimprovesconformationalsampling,andthisinturnimprovestheconvergenceofalchemicalfreeenergysimulations. 46

PAGE 47

CHAPTER3LARGEPKaSHIFTSFORBURIEDPROTONABLERESIDUES.RATIONALIZINGTHECASEOFGLUTAMATE66INSTAPHYLOCOCCALNUCLEASE 3.1LiteratureReviewThepKaofanionizablespeciesisrelatedtotheequilibriumconstantbetweenitsprotonatedanddeprotonatedstates.Becausetheelectrostaticpotentialandtheprotonationequilibriumarecoupled,proteinelectrostaticscanbeprobedindirectlybypKameasurements.Ionizablegroupsareusuallysolvent-exposedandsituatedneartheproteinsurface,wherethepKashift(asmeasuredversusthesameresiduebyitself,inthesamesolution,usuallycalledamodelcompound)ofthesegroupsisrelativelysmallandascribedtothecoulombicinteractionswithotherionizableresiduesandsaltbridges.Onrareoccasions,proteinshaveburiedionizableresiduesthatbehaveverydifferentlyfromthoselocatednearthesurface.Theseexceptionalcasescanbeveryusefultounderstandproteinelectrostatics.Buriedtitratablegroupscanplaycriticalrolesinavarietyofsituations[ 79 82 ].Thechargedformsofionizablegroupsaremorefavorableinstronglypolarenvironments,whiletheneutralspeciesofionizablegroupsaredominantinhydrophobicenvironments.ThispreferencefortheneutralspeciesinthehydrophobicinteriorofproteinscausesalargeshiftinthepKaofburiedtitratablegroups,acidicgroupshavepKavaluesshiftedsignicantlyhigherandbasicgroupshavepKavaluesshiftedsignicantlylower[ 83 84 ].MeasuringandrationalizingthepKaofinternalionizablegroupsiscrucialtounderstandingbiochemicalprocessessuchaschargetransfer[ 82 85 87 ],molecularrecognition[ 88 ],andiontransport[ 89 90 ].ElectrostaticinteractionsbetweentitratablesidechainsgovernallpH-dependentpropertiesofproteins[ 91 ].Recentstudiesshowthatthehydrophobicinteriorofsomeproteinscantoleratechargedresidueswithoutanysignicantstructuraladaptation.GarciaMorenoetal.havestudiedthestabilityandpKashiftofE.coli'sstaphylococcalnuclease(SNase)mutantsusinguorescence,circulardichroismspectroscopy(CD),andother 47

PAGE 48

experimentalmethods[ 83 84 92 98 ].Inoneofthosestudies,avaline-to-glutamatemutationatposition66(V66E)hasbeenintroducedto+PHS,ahyper-stablemutantofSNase.ThissubstitutionresultsinahighlyshiftedpKavalueforglutamate(GLU)66comparedtothatoftheGLUmodelcompound.Differentexperimentalmethodshavereportedthisshiftedvalueas9.00.2(i.e.,pKa4.6)[ 92 94 99 100 ],makingitoneofthelargestpKashiftseverreportedforatitratableresidue.Ontheotherhand,continuumsolventcomputationalmethods(e.g.,theGeneralizedBornmodel)predictasmallerpKashiftforGLU66[ 97 99 ].AccordingtotheBornformulation[ 101 102 ],thisshiftcorrespondstothetransferofachargedgroupfromwatertoamediumofdielectricconstantof10-12.Thisvalue,requiredforimprovedagreementbetweencontinuumsolventmethodsandexperimentalresultsinthiscase,ismuchlargerthantheusualvalueof2to4[ 101 103 ].Thereisnoobviousphysicalrationalebehindthishighvalueofdielectricconstant[ 91 97 99 100 ].Thereforedifferentreasonshavebeenproposedtorationalizethisdiscrepancy.Someofthemostsignicantonesarechangesinthestateofionizationofotherresidues,interactionwithburiedwater,proteinrelaxation[ 83 84 ],andinteractionswithotherionizableresidues[ 91 97 99 104 ].Evenafteraccountingfortheseeffects,adielectricconstantaround6isstillnecessarytoreproduceexperimentaldata[ 91 97 99 105 ].Inthischapterweshowthattounderstandthesesystemsproperly,oneneedstoexplicitlytreattheconformationalandprotonationequilibriaascoupled,andthereisnosinglestructurethatcanbeseenasresponsibleforthepKaswitch. 3.2SimulationDetailsWebuilttheV66Emutantofthehyperstableformofstaphylococcalnuclease(SNase)knownas+PHS(pdbcode3BDC17).Asbefore,allsimulationswereperformedusingtheAmberFF99SBforceeld[ 76 ]asimplementedintheAmber10molecularsimulationpackage[ 75 ].TheSHAKEalgorithm[ 77 ]wasusedtokeepbondsinvolvinghydrogenatomsattheirequilibriumlength.Newton'sequationswere 48

PAGE 49

integratedwithatimestepof2fs.Weminimizedthesystemusing500stepsofthesteepestdescentalgorithm,followedby500stepsoftheconjugategradientalgorithm.Next,weheatedthesystemto300Kinthreesteps,followedbyanalequilibrationinthecanonicalensemble.AllcalculationswereperformedusingtheLangevinThermostat[ 106 ]anddistinctrandomseeds(ig=-1)inordertogenerateindependentsimulations[ 107 ].ThegeneralizedBornimplicitsolventmodel[ 102 108 ]wasappliedtomodelthewaterenvironmentinallcalculations.Thecutoffusedfornon-bondedinteractionsandcalculatingBornradiiwas999A.WeusedtherecentlydevelopedReplicaExchangeFreeEnergyPerturbationmethod(REFEP)[ 22 ](Detailsinchapter 2 )tocalculatethefreeenergydifferencebetweentheprotonatedanddeprotonatedstatesineachconformationalstate(i.e.,eitherburiedorexposedGLU66)andinamixofconformationalstates.ThreesetsofREFEPsimulationswereperformedonthissystem:(1)conformationallyunrestrained,(2)a0.4kcalmol)]TJ /F5 7.97 Tf 6.58 0 Td[(1A)]TJ /F5 7.97 Tf 6.59 0 Td[(2harmonicrestraintonallheavyatomsintheconformationwithGLU66buriedand(3)thesamerestraintsintheconformationwhereGLUisexposedtothesolvent.Inallthreesetsofsimulations,16replicaswereusedtoperturbtheprotonatedGLU66(=0)tothedeprotonatedstate(=0).Thelengthofthesimulationwas5nsforsimulation(1)and4nsforsimulations(2)and(3).Alsoa5nsREFEPsimulationwasperformedonaGlureferencecompound(i.e.,Ace)]TJ /F4 11.955 Tf 12.03 0 Td[(Glu)]TJ /F4 11.955 Tf 12.03 0 Td[(Nme)using16replicastoperturbtheprotonationstateofGlutoestablishthedeprotonationfreeenergyofafreeGlutamicacidinsolution.Hereafter,wewillrefertothepKaofGLU66asthepKaofGLU66inV66E+PHSandtothepKashiftasthedifferencebetweenthisvalueandthatofthereferencecompoundinwater(4.4).Inthiswork,weusedtheCDProsoftwarepackage[ 109 ]tocalculatetheCircularDichroism(CD)spectraastheprotonationofthesystemismodied.CDProcomputes 49

PAGE 50

theCDspectraasfunctionofconformation,usingparameterizeddipole-dipoleinteractions[ 110 ]. 3.3DiscussionAccordingtoexperimentalmethodssuchasX-ray,TRPFlorescence,andCDSpectroscopy,ionizingtheGLU66sidechaininV66E+PHStriggerstheunwindingofoneturnofan)]TJ /F1 11.955 Tf 9.3 0 Td[(helix,exposingthepreviouslyburiedcarboxylicgroupofGLU66tothebulksolvent,showninFigure 3-1 Figure3-1. oneturnofan)]TJ /F1 11.955 Tf 9.3 0 Td[(helixexposesthesidechainofGLU66. Asexplainedintheintroduction,thereisinconsistencybetweenpKashiftcalculatedbyexperimentalmethodsandcontinuumsolventcomputationalmethods.Inthispartwetriedtorationalizethisinconsistency.WeusedREFEPtocalculatethefreeenergydifferencebetweenthetwo-protonationstatesforboththeGLUreferencecompoundandGLU66inV66E+PHSandturnedthatdataintoapKashiftduetotheprotonation.TheREFEPdataisshowninFigure 50

PAGE 51

3-2 .Comparingthenalfreeenergiesinbothdirections(Figure 3-2 )showsthatboth Figure3-2. FreeenergyconvergenceinREFEPforunrestrainedV66E+PHS(red)andGlumodelcompound(green).Thecumulativefreeenergydifferencesarebetweenthehighest(fullydeprotonated)andthelowest(fullyprotonated)replicaForA)BackwardandB)Forwardfreeenergydifferences. REFEPcalculationsofunrestrainedV66E+PHSandGlucompoundareconverged.WefoundthattheGLU66pKainunrestrainedisshiftedabout3.2pKaunitshigherrelativetothemodelcompound,roughly1.5pKaunitslessthantheexperimentalvalue[ 92 104 ].Analyzingthisdata,itbecomesobviousthatchangingtheprotonationcoordinateduringREFEP,alsoinducedalocalconformationalchangearoundtheloopcontainingresidue66(SeeFigure 3-1 ).Accordingly,translatingthecomputedfreeenergydifferencefortheproteinintoapKaisnotcorrect. 51

PAGE 52

Atthispoint,wepresentthemostimportantresultofthischapter:sofarmanymacroscopicandmicroscopicmodelshavebeenproposedforestimationandcalculationofpKainproteins[ 22 111 115 ].Themicroscopicapproachinthischapterisgroundedoncalculationofprotonation/deprotonationenergiesatagivenconformation.Basedonthat,thepropertreatmentofthesesystemsrequiresexplicitinclusionofcoupledprotonationandconformationalequilibria[ 116 ].Basically,atlowpH,GLU66isbothburiedandprotonated,whileathighpHitissolventexposedanddeprotonated.Tounderstandthissystemweintroduceaconformation-protonationequilibriummodel,showninFigure 3-3 ,tocalculatethepKaofGLU66intheSNasemutant.Usingthismodel,wedistinguishedbetweenconformationalstatesoftheGLU66sidechain(i.e.,e=exposed/b=buried)anditsprotonationstates(i.e.,P=protonated/D=deprotonated). Figure3-3. ThermodynamiccycleforconformationprotonationmodelofGLU66.DbandPbstandfordeprotonated-buriedandprotonated-buried,respectively;andDeandPestandfordeprotonated-exposedandprotonated-exposed,respectively. Thehorizontallineatthetopcorrespondstotheprotonationchangeattheexposedconformation,thehorizontallineatthebottomtotheprotonationchangeattheburiedconformation,whilethetwoverticallinesarethefreeenergiesrequiredtochange 52

PAGE 53

conformationsatxedprotonationstates.Thisseparationallowsustostudytheconformationalandprotonationchangesseparatelyandcoupled,tobetterunderstandtheresults.TocalculatethepKasintheburiedandexposedcasesindividually,weperformedtwosetsofrestrainedREFEPsimulations.Intheburiedcase,theGLU66sidechainwasrestrainedtotheinterioroftheprotein,andintheexposedcase,itwasrestrainedtostayexposedtothesolventbyapplyingaharmonicrestraintonallheavyatoms.RestrainedREFEPcalculations(showninFigure 3-4 )computeapKaof10.6fortheburiedsituationandof4.6fortheexposedGLU66sidechain,respectively. Figure3-4. CumulativeFreeenergyVSSimulationtimeforGlu-66restrainedinside(Red)andrestrainedoutside(Green). 53

PAGE 54

ThethermodynamiccycleinFigure 3-3 iswritteninsuchawaythatknowledgeofthetwomicroscopicpKasandoneoftheequilibriumconstant(orfreeenergy)fortheconformationalchange,fullyxedthevalueofthesecondequilibriumconstant.Usingmassbalanceequations,wehaveasystemthathasasinglefreeparameter,correspondingtooneofthetwoverticallinesinFigure 3-3 .Sincemostexperimentalprobesaresensitivetoconformations,wewillcomputeanequilibriumconstantthatdependsoneitherburiedofexposedGLU66,disregardingprotonationstates.TheequilibriumconstantbetweentheexposedandburiedconformationsoftheGLU66sidechaincanbewrittenas:Ke=b=[De]+[Pe] [Db]+[Pb]=[De] [Db]1+10(pKa,e)]TJ /F5 7.97 Tf 6.58 0 Td[(pH) 1+10(pKa,b)]TJ /F5 7.97 Tf 6.58 0 Td[(pH)=Ke=b,D1+10(pKa,e)]TJ /F5 7.97 Tf 6.58 0 Td[(pH) 1+10(pKa,b)]TJ /F5 7.97 Tf 6.59 0 Td[(pH) (3)Here,Ke=b,DistheKe=bathighpH(i.e.,whenGLU66isdeprotonated).UsingthecyclepresentedinFigure 3-3 ,analyticalexpressionsoftheconcentrationofeachspeciesasafunctionoftheequilibriumconstantsandpHcanbederived.8>>>>>>><>>>>>>>:[De]=Ke=b 1+Ke=b1 1+10(pKa,e)]TJ /F11 5.978 Tf 5.76 0 Td[(pH)[Pe]=Ke=b 1+Ke=b10(pKa,e)]TJ /F11 5.978 Tf 5.75 0 Td[(pH) 1+10(pKa,e)]TJ /F11 5.978 Tf 5.76 0 Td[(pH)[Db]=1 1+Ke=b1 1+10(pKa,b)]TJ /F11 5.978 Tf 5.76 0 Td[(pH)[Pb]=1 1+Ke=b10(pKa,b)]TJ /F11 5.978 Tf 5.76 0 Td[(pH) 1+10(pKa,b)]TJ /F11 5.978 Tf 5.76 0 Td[(pH). (3)TheseexpressionscanbeusedtocalculatetheconcentrationofthesespeciesfordifferentvaluesofKe=b,D(Figure 3-5 ).ForhighvaluesofKe=b,D,theexposed-deprotonatedspecie(De)isdominantatHighpHvalue.TheoverallpKacanbedenedintermsofaprotonationequilibriumconstant,i.e.,KD=P=[De]+[Db] [Pe]+[Pb] (3)However,mostexperimentalmethodsaresensitivetoconformationsanddonotdirectlyreportontheprotonationstateofaspecicresidue.Basedonthis,wecandenea 54

PAGE 55

Figure3-5. Concentrationofspeciesvs.pHfortwodifferentvalueofKe=b,Di.e.,Ke=b,D;1=2andKe=b,D;2=108. Conformation-SensitiveSignal(CSS)forthecaseofGLU66inV66E+PHSas:CSS=[De]+[Pe])]TJ /F4 11.955 Tf 11.95 0 Td[([Db])]TJ /F4 11.955 Tf 11.96 0 Td[([Pb] (3)AtaveryhighpH,GLU66ispredominantlydeprotonatedandonlyspecieswhereGLU66isdeprotonatedwillcontributesignicantlytotheobservedCSS.Ontheotherhand,ataverylowpH,onlytheprotonatedspecieswillbesignicant,i.e.,8><>:CSSpHhigh=[De])]TJ /F4 11.955 Tf 11.96 0 Td[([Db]CSSpHlow=[Pe])]TJ /F4 11.955 Tf 11.96 0 Td[([Pb]. (3) 55

PAGE 56

Figure 3-6 showstheCSS,asdenedinEquation 3 ,asafunctionofpH.ThepHatwhichtheinectionpointisobservedcorrespondstotheexperimentally-measured,apparentpKa.IftheGLU66isfullyexposedathighpHandfullyburiedatlowpH,thentheobservedinectionpointwillcorrespondtoazerovalueoftheCSS.However,ifthereisamixtureofburiedandexposedstates,thentheinectionpointwillbeshiftedtoanegativesignalvalue.BasedonEquation 3 andEquation 3 ,itispossibletocalculatetheexperimentalinectionpointasafunctionofpKa,e,pKa,b,andKe=b,D. Figure3-6. CSS,asdenedinEquation 3 ,asafunctionofpHforKe=b,D=10;theorangearrowspointtotheinectionpoint(i.e.,halfwaybetweenthehighpHandthelowpHvalues). Figure 3-7 showstheapparentpKaplottedasafunctionofKe=b,D.Figure 3-7 demonstratesthatthepKa,whichismeasuredusingexperimentalmethodsthatare 56

PAGE 57

sensitivetoconformations,dependsontheratiooftwodistinctconformationalstatesathighpH.ThismodelpredictsthattheapparentpKarangesbetween4.6(Ke=b,Do1,fullysolventexposed)and10.6(fullyburied,Ke=b,Dn1).Ke=b,DmodulatestheapparentvalueofpKa.AsKe/b,Dincreases,theapparentpKadecreases.AsKe=b,Dapproaches Figure3-7. ApparentpKavs.Ke=b,D. innity,theapparentpKatendsto4.6.WeestimatedKe=b,DbasedonRootMeanSquareDeviation(RMSD)distributionsoftheGLU66sidechainandpartofitsbelonged)]TJ /F1 11.955 Tf 9.3 0 Td[(helix(residue62-69),forallthereplicasintheunrestrainedREFEPsimulations(Figure 3-8 ).InFigure 3-8 ,for=0(GLU66isprotonated)wecanidentifytwopeaksat0.4and1.1A,withtheformerbeingthehighest.Asincreases,anewpopulationaround1.6Awillappearandtheotherswilldisappear.At=1.0thepeakat0.4Ahasdisappeared,butthereisstillasmallpeakat1.1A.BecauseofoverlapbetweenprotonatedanddeprotonatedRMSDdistributions 57

PAGE 58

Figure3-8. CarbonRMSDdistributionsofresidues62to69andthesidechainofGLU66inunrestrainedREFEPsimulationsforallsixteenreplicaswithrespecttotheaveragestructure. (=0and1.0),wecanconcludethatbothprotonationstatesareensemblesofbothburiedandexposedcongurations.Thisshowsthatthechangesinprotonationstatesarenotfullysynchronizedwiththechangesinconformationstates.Formorerigorousinvestigationofthedecoupling,wecalculatedtheCDspectraandaveragesecondarystructureforallvaluesintheunrestrainedsimulations.Averagesecondarystructureshowsthatasincreases,the)]TJ /F1 11.955 Tf 9.3 0 Td[(helixcontentdecreasesand310helixcontentincreases(Figure 3-9 ).Figure 3-10 showstheellipticityat222nm, 58

PAGE 59

Figure3-9. Averagesecondarystructureforresidue57-69vs.reactioncoordinate. whichisthewavelengthatwhichthengerprintof)]TJ /F1 11.955 Tf 9.3 0 Td[(helixappearsinCD,forthewholeproteinandtheregioncontainingGlu66(i.e.,residues57-69).The)]TJ /F1 11.955 Tf 9.3 0 Td[(helixcontentdecreasesasthesystemshiftstowardthedeprotonatedGlu66.ThetrendobservedforthewholeproteinandthelocalregionisthesameandisconsistentwiththefactthatthedeprotonationofGLU66locallydenaturatespartof)]TJ /F1 11.955 Tf 9.3 0 Td[(helixcontentofthatregion.TheresultsofFigure 3-10 canberationalizedbylookingatFigure 3-8 .Inthelatter,RMSDhistogramsof0.2-0.4showthemaximumfrequencyontheleftmostpartoftheRMSDaxis(lessthan1.3A).ThesearethesthathavethemaximumabsolutevalueofellipticityinFigure 3-10 .EstimatingRMSD1.0-1.3AastheboundarybetweenanexposedandafullyburiedGLU66sidechain,when=1.0lessthan1.8-6.7%of 59

PAGE 60

Figure3-10. Ellipticityat222nmvs.reactioncoordinate(=0,i.e.,GLU66protonatedand=1,i.e.,GLU66deprotonated)forwholeprotein(red,leftyaxis)andfortheregioncontainingGlu66(green,rightyaxis) thepopulationisinafullyburiedconformation.Basedontheseresults,Ke=b,Dcanbeestimatedtobe15-57,whichyieldsanapparentpKaof8.7-9.2(Figure 3-7 ).Thisagreeswellwiththeresultsofstructuresensitiveexperimentalmethods,whichestimatethepKaoftheGLU66sidechaintobe9.0-9.1[ 92 ]. 3.4ConcludingRemarksTherearemanychallengesindeterminingpKashiftsofburiedionizablegroups.Themutationofposition66instaphylococcalnucleaseisaninterestingexamplefortworeasons.First,accordingtoexperimentalmethodstheshiftinpKaofionizablegroupmutantsrangefrom3.5-5.0,dependingontheionizablegroupandtheirrelativenormal 60

PAGE 61

pKa.Second,thereisaninconsistencybetweenpKashiftscalculatedbyimplicitsolventmethodsandthoseofexperimentalmethods.Tosolvethischallengingproblem,weusedREFEPtocalculatethepKashiftofstaphylococcalnucleasemutant(V66E+PHS).Notonlytheconformationalresponseofaproteintoachangeinprotonationstateofanionizablegroupmaybebeyondthesensitivityofexperimentalmethods,butalsoconformationalchangesmaynotbefullycorrelatedtothechangesinprotonationstate.Thefactthatprotonationchangesarenotnecessarilyaccompaniedwithcongurationalchangesshowstherobustthresholdofaproteinagainstlocalelectrostaticsvariations.Basedonthisfact,theinconsistencybetweenimplicitsolventsimulationsandconformationalsensitiveexperimentalmethodshasbeenexplained.Afour-statethermodynamiccyclehasbeenintroducedtodecipherthisinconsistency.Throughthethermodynamiccycle,fourspecieshavebeendened,whichcorrespondtoallpossiblecombinationsofconformational-protonationoftheGLU66sidechain.WehaveshownthattheinectionpointofexperimentaltitrationcurvesisdependentontheratioofconformationalstatesathighpH;bycalculatingtheratioofconformationalstatesathighpH,theexperimentalinectionpointthatwehavedeterminedisingoodagreementwiththeoneobtainedfromtheexperimentalresults. 61

PAGE 62

CHAPTER4PH-REPLICAEXCHANGEMOLECULARDYNAMICSINPROTEINSUSINGADISCRETEPROTONATIONMETHOD 4.1LiteratureReviewSolutionpHplaysaveryimportantroleinproteinfunction,dynamicsandstructure.ManyimportantbiologicalphenomenafunctiononlyatacertainrangeofpHs,includingproteinfolding/misfolding[ 117 119 ],enzymecatalysis[ 120 122 ]andligand-substratedocking[ 123 124 ].Inthosecases,apHchangeleadstoachangeintheratioofprotonationstatesofdifferenttitratableresidues,whichisusuallycoupledtoachangeintheconformationanddynamicsoftheproteinitself.ThepKaofatitratableresidueisthepHvalueatwhichtheratioofdeprotonatedtoprotonatedconcentrationsofthatresidueisequaltoone[ 72 112 125 126 ].ThepKaofanionizableresidueinaproteinishighlydependentonitselectrostaticenvironment,whichiscoupledtotheprotonationstatesofothertitratablegroupsandtheconformationstateoftheprotein.MostcurrentsimulationsofpHdependentpropertiesdonot,infact,changeprotonationstatesduringmoleculardynamics,butratherpickacertainsetofprotonationstatesusingeducatedguesses,orguidedbysimpliedalgorithms,andthenkeepitconstantfortheremainderofthesimulation.TheseconstantprotonationMDmethodssufferfromtwobigdisadvantages[ 127 ].First,atapHnearthepKaofanyofthetitratableresidues,anypossiblechoiceofconstantprotonationwouldclearlybewrong.Second,thechoiceofprotonationstateiscoupledforneighboringresidues,whichitselfcouplestotheconformationalspaceofthesystem.Likeconcentrationandtemperature,thesolutionpHisaveryusefulexternalandcontrollablevariableinexperimentalmethods.Hence,theimportanceofconstant ReprintedwithpermissionfromSabriDashti,D.;Meng,Y.;Roitberg,A.E.J.Phys.Chem.B2012,116,8805. 62

PAGE 63

pHMD(CpHMD)hasbeenrecognized.Inthelasttwodecades,manyconstantpHMDmethodshavebeendeveloped[ 128 129 ].ThegoalofCpHMDistodescribecorrectlytheprotonationequilibriumcoupledbyconformationequilibriumatacertainpH.ThemajorityofCpHMDmethodscanbedividedintocontinuous[ 128 136 ]anddiscrete[ 127 137 146 ]protonationstatemethods.Continuousprotonationmethodsuseacontinuousprotonationparametertoperturbionizableresiduebetweenprotonatedanddeprotonatedstates.In1994,MertzandPettitt[ 135 ]developedagrandcanonicalmethodforsimulatingasimplechemicalreaction.Theyappliedthemethodforexchangingaprotonbetweenwatermoleculesandanionizablesidechain.In1997,Baptistaetal.[ 128 ]introducedacontinuousconstantpHmethodinimplicitsolventbasedonamean-eldapproximation.In2001,Borjessonetal.[ 129 ]usedaweaklycoupledprotonbathtocontinuouslyadjusttheprotonationfractionofeachtitratablegrouptowardsequilibrium.Morerecently,theBrooksgrouphasdevelopedthecontinuousprotonationstatemethodfurther[ 130 134 147 ].Inthecaseofhighlycoupledtitrationgroups,forwhichcooperativityeffectsarenon-negligible,thismodelleadstoinappropriateestimationofphysicalvariables.Toalleviatethisproblem,Leeetal.[ 147 ]addedabiasingpotential,centeredatequal0.5,tohelpdrivetheprotonationcoordinatevaluetofullyprotonated/deprotonatedstates,andawayfromthemid-unphysicalstates.Incontrasttothecontinuousprotonationstatemethods,discreteprotonationmodelsdenetheprotonationstateoftheionizablegroupaseitherzerooroneduringthesimulation,correspondingtoprotonatedanddeprotonatedstatesonly.ThesemodelsuseahybridMD-MCscheme;theMDisusedtosampleconformationalspaceforanumberofsteps,afterwhichaMetropolisMC[ 14 ]attemptisdoneforchangingtheprotonationstate/states.AnewsetofMDstepsisthendonewiththeprotonationstatechosenbytheMCstep,andtheprocessisrepeated.ManyversionsoftheconstantpH/discreteprotonationMDmethodhavebeendeveloped[ 115 127 138 63

PAGE 64

143 145 146 148 ].TheBaptistagroup[ 137 140 148 ]usedexplicitsolventforthepropagationofcoordinatesandthePoisson-Boltzmann(PB)methodforcalculatingtheenergyintheMCsectionofthealgorithm.Walczaketal.[ 146 ]employedLangevinDynamicsforMDandthePBmethodforbothoftheMCandMDsteps.Burgietal.[ 141 ]appliedThermodynamicIntegration(TI)tocalculatethetransitionenergybetweentheprotonatedanddeprotonatedspeciesinexplicitsolventMD.ThedrawbackforthisapproachisthelargecomputationalcostofTIcalculations,whichlimitstheamountofsamplinginprotonationspace.In2004,Monganetal.developedadiscreteprotonationMDmethodbyusingtheGBimplicitsolventmethodtoboththeMD(structure)andMC(protonationstate)samplingsections[ 127 ].Amoredetailedexplanationofthemethodcanbefoundinthemethodssection.ThismethodisimplementedintheAMBERMolecularDynamicsPackage[ 75 ].Ithasbecomeclearinrecentyearsthataccuratemodelingofprotonationspacealsorequiresenhancedsamplingofconformationalspace[ 149 151 ].Accuratesamplingofconformationalspaceofproteinsremainsachallengingarea[ 2 3 18 152 153 155 157 ].Manytheoreticalmethodshavebeenproposedtoovercomethefreeenergybarriersinconformationalspace(seechapter1).Toaccountforthecouplingofprotonationandconformationalsampling,Wallaceetal.[ 158 ]recentlyhavecombinedthecontinuousprotonationConstantpHMDwiththeREMDmethod(REX-CPHMD)[ 131 ].TheyappliedittotheproblemsofpKapredictioninproteinfoldingandpHdependentconformation,amongothers[ 158 159 ].Recently,MengandRoitberg[ 160 ]utilizedahybridmethodbycombiningtheTemperatureREMD(TREMD)anddiscreteprotonationConstantpHMD.Inthischapter,weintroduceamethodforpH-ExchangeMDbycombiningthediscreteprotonationConstantpHMD(proposedbyMonganetal.[ 127 ])andHamiltonianREMD(HREMD)[ 22 ].Wetestedourmethodbyapplyingittovemodeldipeptides,toanuncappedpentapeptidewithsequence+H3N)]TJ /F4 11.955 Tf 11.9 0 Td[(Ala)]TJ /F4 11.955 Tf 11.89 0 Td[(Asp)]TJ /F4 11.955 Tf 11.9 0 Td[(Phe)]TJ /F4 11.955 Tf 11.9 0 Td[(Asp)]TJ /F4 11.955 Tf 11.9 0 Td[(Ala)]TJ /F4 11.955 Tf 11.9 0 Td[(COO)]TJ /F1 11.955 Tf 10.41 -4.34 Td[((ADFDA),andtoanheptapeptidederived 64

PAGE 65

fromtheovomucoidthirddomainOMTKY3protein.BecausethetwoendsofADFDAarenotcapped,itstwoAspresidueshaveslightlydifferentelectrostaticenvironmentsandtheirpKasdeviateindifferentdirectionsfromthepKaofunperturbedAsp.WewillshowthatthepHREMDMDimprovesthesamplingefciencyinbothprotonationandconformationspaces.Intherestofthischapter,ConstantpHMD(CpH)referstoMongan'sCpHMDapproachunlessmentionedotherwise. 4.2TheoreticalMethodandSimulationDetails 4.2.1ConstantpHMolecularDynamicsandpH-ReplicaExchangeMolecularDynamics(pH-REMD)ThegoalofCpHMDistosampletheequilibriumbetweenprotonatedanddeprotonatedstateoftitratablesitesatagivenpH.Thefreeenergydifferencebetweenprotonatedanddeprotonatedstatesdeterminestheratiooftheirconcentrations.ThisfreeenergydifferencecannotbecalculatedbyMolecularMechanics(MM),becausethechangeinprotonationstateinvolvesabondbreaking/formingphenomenon,whichrequiresaseriesofhighlyaccuratequantumcalculations.Toaddressthisissue,amethod[ 115 127 161 ],whichusesapre-calculatedpKaofreferencecompounds,hasbeendeveloped.Reference/modelcompounds,inAMBERterminology,arerepresentedbyacappeddipeptideforeachtitratableresidue(i.e.,ACE-titratableresidue-NME).ThefreeenergyfortheprotonationchangetobeusedintheMCcriteriaisdescribedbyMonganetal.[ 161 ]as:Gprotein=Gprotein,MM+kBT(pH)]TJ /F4 11.955 Tf 11.95 0 Td[(pKa,ref)ln10)]TJ /F4 11.955 Tf 11.95 0 Td[(Gref,MM. (4)Here,asbefore,Tisthetemperature,andkBistheBoltzmannconstant.Gprotein,MMisthemolecularmechanicspartofthefreeenergyofthetitratablesiteintheprotein,andGref,MMisthepre-computeddeprotonationfreeenergyforthereferencecompound,describedabove.UsingEquation 4 ,thereisnoneedtocalculating 65

PAGE 66

theQMcontributiontothefreeenergyintheprotein.ThismethodisimplementedintheAMBERMDsuite[ 75 ],usingtheGBimplicitsolventmodel.EveryfewMDstepsaMetropolisMC[ 14 ]attemptisdonetochangetheprotonationstateofthetitratableresidueandGproteinisusedtomakeadecisionaboutacceptingorrejectingtheproposedMCmove.Inotherwords,theMCmovessampleprotonationspace,andtheMDstepssamplecongurationspace.DuringtheMDsteps,theprotonationstateiskeptconstant. 4.2.2TitrationCurveThepKaofatitratableresidueisrelatedtothepHenvironmentthroughtheHenderson-Hasselbalch(HH)equation,pKa=pH)]TJ /F4 11.955 Tf 11.96 0 Td[(nlog[A)]TJ /F4 11.955 Tf 7.09 -4.34 Td[(] [HA], (4)with[A)]TJ /F4 11.955 Tf 7.08 -4.34 Td[(]and[HA]beingthedeprotonatedandprotonatedconcentrationsrespectively;nistheHillcoefcient,whichshouldapproach1fornon-interactingionizableresidues,butdeviatefromoneinthecaseofinteractingtitratableresiduesbecauseofcooperativity[ 162 ].BecauseoftheergodicityassumptionunderlyingMD,theratioofthetimethatatitratableresiduespendatdeprotonatedstatetothetimespendinprotonatedstatecanbeconsideredasaratioofconcentrationofdeprotonatedstatetothatofprotonatedstate.Ontheotherhand,pH-REMDisacombinationofCpH[ 161 ]andHREMD[ 22 ].Incontrasttotemperaturereplicaexchange[ 2 ],inwhicheachreplicarunsatadifferenttemperature,inHREMD,eachreplicarunsinadistinctHamiltonianbutatthesametemperature.InpH-REMDeachreplicarunsaconstantpHMDatauniquepH,andperiodicallyanexchangeofconformationbetweentwoadjacentreplicasisattempted(section 4.2.3 ). 66

PAGE 67

Wenotethatarecentpublication[ 23 ]presentedapHreplicaexchangemethodusingadifferentexchangecriterion,thatdoesnotrequirerecomputingenergiesfordifferentreplicas.Inthefuture,wewillcomparethetwoformulations. 4.2.3pH-REMDIntheHREMDalgorithm,eachreplicarunsadistinctHamiltonian.InpH-REMDeachreplicarunsaCpHMDsimulationatauniquepH,withaswapofconformationsbetweentwoadjacentreplicasisattemptedregularly.Usingthedetailedbalancecondition,wecanwriteanequilibriumpropositionfortheensemblebeforeandafteranexchangeisattempted,w(X!X)}(X)=w(X!X)}(X), (4)where}(X)istheprobabilityofbeingatstateXandw(X!X)istheprobabilityoftransitionfromstateXtothatofX.Attheexchangemoment,ifweonlyswaptheconformationsbetweentwoadjacentreplicasandkeeptheprotonationstateunchanged,thenthegeneralizedstatesofXandXcanbewrittenasX=0B@n1...ninj...nMq1...qiqj...qM1CA,X=0B@n1...njni...nMq1...qiqj...qM1CA, (4)Hereqrepresentsaconformationandnrepresentsprotonationstates.Theithcolumnisrelatedtotheithreplica.BecauseallMreplicasareindependent,theprobabilityofthesystembeinginthegeneralizedstateXcanbewrittenas}(X)=QMi=1Pi(qi,ni),wherePi(qi,ni)istheprobabilityofreplicaiatconformationqiandprotonationstateofni.SubstitutingthoseprobabilitiesinEquation 4 willresultwX!X wX!X=P(qj,ni)P(qi,nj) P(qi,ni)P(qj,nj)=exp()]TJ /F7 11.955 Tf 9.3 0 Td[(), (4) 67

PAGE 68

Inthecanonicalensembleregime(NVT),andusingtheBoltzmanndistributionasalimitingensemble,thiscanbewrittenaswX!X wX!X=exp()]TJ /F7 11.955 Tf 9.3 0 Td[(H(qi,nj))exp()]TJ /F7 11.955 Tf 9.3 0 Td[(H(qj,ni)) exp()]TJ /F7 11.955 Tf 9.3 0 Td[(H(qi,ni))exp()]TJ /F7 11.955 Tf 9.3 0 Td[(H(qj,nj))=exp()]TJ /F4 11.955 Tf 9.3 0 Td[(), (4)where=1 kBTand=[H(qi,nj+H(qj,ni))]TJ /F4 11.955 Tf 11.96 0 Td[(H(qi,ni))]TJ /F4 11.955 Tf 11.96 0 Td[(H(qj,nj)].ThissetupcanberealizedbysettingtheexchangeprobabilityaccordingtotheMetropolisMCcriteriaaswX!X=w((qi,ni);(qj,nj)!(qj,ni);(qi,nj))=min(1,exp()]TJ /F4 11.955 Tf 9.3 0 Td[()). (4)Forcomputingonlythepotentialenergiesarerequired.Sincethetwoexchangingreplicasarerunningatthesametemperatureandhavethesamenumberandmassofparticles,theirkineticenergytermswillcancel. 4.3SimulationDetailsTovalidateandtestthemethodpresentedhere,wechoseandransimulationsonthreecategoriesofsystems.First,westudiedthecappedreferencecompounds(describedinthemethodssection),consistingoftheACE)]TJ /F12 11.955 Tf 12.36 0 Td[(titratableresidue)]TJ /F4 11.955 Tf 12.36 0 Td[(NME.Simulationtimeswere3nsforbothCpHMDandpH-REMD(foreachreplica)methods.WeusedeightpH-replicasforallthemodelcompoundsinpH-REMD.Wealsotestedoursimulationmethodonaterminallychargedpentapeptidemodel,+H3N)]TJ /F4 11.955 Tf 13.08 0 Td[(Ala-)]TJ /F4 11.955 Tf 12.72 0 Td[(Asp)]TJ /F4 11.955 Tf 12.72 0 Td[(Phe)]TJ /F4 11.955 Tf 12.72 0 Td[(Asp)]TJ /F4 11.955 Tf 12.72 0 Td[(Ala)]TJ /F4 11.955 Tf 12.72 0 Td[(COO)]TJ /F1 11.955 Tf 10.41 -4.34 Td[((ADFDA).BecausethetwoendsofADFDAareoppositelycharged,thetwoAspexperiencedifferentelectrostaticenvironmentsandhavedifferentpKas.ThesimulationtimesofADFDAwere90nsforCpHand10nsforeachreplicainpH-REMD.Eightreplicas,atpH=2.5-6.0withincrementof0.5wereusedinpH-REMDmethod.ThethirdsystemwasaheptapeptidederivedfromOMTKY3(ACE)]TJ /F4 11.955 Tf 12.46 0 Td[(Ser)]TJ /F4 11.955 Tf 12.45 0 Td[(Asp)]TJ /F4 11.955 Tf 12.46 0 Td[(Asn)]TJ /F4 11.955 Tf 12.45 0 Td[(Lys)]TJ /F4 11.955 Tf 12.46 0 Td[(Thr)]TJ /F4 11.955 Tf 12.45 0 Td[(Tyr)]TJ /F4 11.955 Tf 12.45 0 Td[(Gly)]TJ /F4 11.955 Tf 12.45 0 Td[(NME).Wesimulated100nsforCpHand10nsforpH-REMD.Twelvereplicas,atpH=2-13withincrementof1pHunit,wereusedinpH-REMDmethod.Allcalculationsinthischapterweredoneusingthe 68

PAGE 69

AMBER10molecularsimulationsuite.TheAMBERff99SBforceeld[ 76 ]andOBCGeneralizedBornimplicitsolvent[ 77 ](igb=2)wereusedinallsimulations.TheShakealgorithm[ 101 ]wasappliedtoconstrainthebondsbetweenheavyandhydrogenatoms,whichallowedtheuse2fsMDsteps.Acutoffof30Afornon-bondedinteractionswaschoseninallcalculations.InallpH-REMDsimulations,excepttheheptapeptidederivedfromOMTKY3,replicasattemptedapHexchangeevery1000MDsteps.Inthecaseoftheheptapeptide,replicasattemptedapHexchangeevery500MDsteptoacceleratesampling[ 163 ].Theexchangeacceptanceratiosbetweenreplicasforallsystemwerebetween0.2and1.0.ForcalculatingtheerrorbarsandtheuncertaintyofpKas,every4000MCsteps(inprotonationspace),thedeprotonationfractionhasbeencalculated. 4.4ResultsandDiscussion 4.4.1TitratableModelCompoundsWhenweapplypH-REMDtothemodelcompoundsthatwereinitiallyusedtoparametrizethemethod,itisnotsurprisingtondthatitproducesthecorrectpKas.Itis,however,animportantcalculationtoperformtocheckthemethodandtogaugeitsefciencyversusconstantpHruns.TheresultsareshowninTable 4-1 ,whereallpKavalueshavebeencalculatedbyattothelinearizedversionoftheHenderson-HasselbalchEquation. Table4-1. pKasofthereferencecompoundscomputedbydifferentmethods ASPGLUHISTYRLYS Experimentalvalue4.04.46.39.610.5CpHMD3.90.14.40.16.30.19.60.0310.40.03pH-REMD3.90.14.40.16.40.19.70.0210.40.04 InFigure 4-1 ,theHillplotforcappedLysineisshownforbothpH-REMDandCpHMDmethods,withagreementbetweenthemoveralargepHrange. 4.4.2ADFDAModelCompoundsWenowapplythemethodtothemodelpeptide,+H3N)]TJ /F4 11.955 Tf 12.41 0 Td[(Ala)]TJ /F4 11.955 Tf 12.41 0 Td[(Asp)]TJ /F4 11.955 Tf 12.41 0 Td[(Phe)]TJ /F4 11.955 Tf 12.41 0 Td[(Asp-)]TJ /F4 11.955 Tf 12.07 0 Td[(Ala)]TJ /F4 11.955 Tf 12.07 0 Td[(COO)]TJ /F1 11.955 Tf 10.41 -4.34 Td[((ADFDA),thathaschargedends.Thissystemhasanintrinsicbutsubtle 69

PAGE 70

Figure4-1. ComparisonofHillplotsbetweenpH-REMDandCpHmethodsforLysreferencemodel. asymmetryintheelectrostaticenvironmentforthetwoAspionizablesidechains.Asp2isclosetotheNH+3,whichcausesitspKatobeshiftedslightlybelow4.0.Asp4isclosetotheCOO)]TJ /F1 11.955 Tf 10.41 -4.34 Td[(terminal,whichcausesitspKatobeshiftedabove4.0.TheplottedtitrationcurvesofAsp2andAsp4inFigure 4-2 haveAsp2shiftedtotheleftofthemodelcompound,andAsp4shiftedtotheright.ThereisagoodagreementbetweenthepH-REMDandCpHcurves.ThepredictedpKasandHillcoefcientsforbothmethodshavebeencalculatedbylinearttingonaHillplot. Table4-2. pKapredictionandHillcoefcientofttedfromtheHHequation. ASP2ASP4 pHHillcoefcientpHHillcoefcient pH-REMD3.80.10.894.50.10.88CpHMD3.80.10.914.50.10.86 70

PAGE 71

Figure4-2. ThetitrationcurvesofAspsidechainsinADFDAcomputedbybothconstantpHMD(blueandpurple)andpH-REMD(redandgreen)methods;ThetitrationcurveofAspforthereferencecompound(lightblue)isalsoshown. Table 4-2 showsthattheresultsofbothmethodsareidentical.ItisworthnotingthatthefreeenergydifferenceassociatedwiththetwodifferentpKasis0.96kcal=mol,whichhighlightsthesensitivityofthemethodtoverysmallenvironmentalchanges.TogaugethecomparativeefciencyofCpHandpH-REMD,inFigure 4-3 ,thecumulativeaverageprotonationfractionsofAsp2andAsp4forbothmethodsatpH=4.0havebeenplotted.Thedataclearlyshowsthatforbothtitratablegroups,pH-REMDconvergesmorerapidlyinprotonationspacethantheregularCpHmethod. 4.4.3HeptapeptideDerivedfromOMTKY3WeappliedthepH-REMDmethodtoacappedheptapeptidederivedfromOMTKY3(ACE)]TJ /F4 11.955 Tf 9.84 0 Td[(Ser)]TJ /F4 11.955 Tf 9.84 0 Td[(Asp)]TJ /F4 11.955 Tf 9.84 0 Td[(Asn)]TJ /F4 11.955 Tf 9.84 0 Td[(Lys)]TJ /F4 11.955 Tf 9.84 0 Td[(Thr)]TJ /F4 11.955 Tf 9.85 0 Td[(Tyr)]TJ /F4 11.955 Tf 9.85 0 Td[(Gly)]TJ /F4 11.955 Tf 9.85 0 Td[(NME).DlugoszandAntosiewicz[ 142 143 ]studiedthisheptapeptideandpredictedthepKaof4.24forAsp3usingtheirCpHMD 71

PAGE 72

Figure4-3. CumulativeaverageprotonationfractionsofAspssidechainsinADFDAVSMCtitrationstepsatpH=4.0. method.AccordingtoNMRexperiments[ 144 145 ],thepKaofAspinthisheptapeptideisabout3.6. Table4-3. pKavaluesofthetitratableresiduesintheheptapeptidederivedfromOMKTY3. ASP3Lys5Tyr7 pH-REMD3.60.210.60.110.10.1CpHMD3.70.210.60.19.90.1 FromaHillplot,thepKaofthethreetitratablegroupshasbeencalculatedwiththeresultspresentedintable 4-3 .OurcomputedforvaluestheAsp3pKaof3.7(CpH)and3.6(pH-REMD)areinexcellentagreementwiththeexperimentalvalue.ThetitrationcurvesareshowninFigure 4-4 .Therearetwomoretitratablegroupsintheheptapeptide,i.e.,Lys5andTyr7,whichwealsotitrated.Figure??presentsthetitrationcurvesofLys5andTyr7,withthe 72

PAGE 73

Figure4-4. ThetitrationcurvesofAsp3intheheptapeptidederivedfromOMTKY3. computedpKaslistedinTable 4-3 .TocomparetheconvergencespeedbetweenCpHandpH-REMD,westudiedthecumulativeaverageprotonation/deprotonationfractionasfunctionofMCtitrationstepsforallionizableresidues.Figure 4-5 showsthedataforTyr7.ItisevidentthatpH-REMDconvergesfaster(andsmoother)tothenalprotonationfraction.WhiletheconvergenceofprotonationequilibriumiscrucialforthepropercomputationofapKa,theconvergenceofstructuralpropertiesisalsoimportant.Toconsiderthisissue,wecalculatedtheRMSDof-CarbonsforallpHs(forbothCpHandpH--REMDsimulations)withrespecttotheaveragestructurefromaCpHsimulationatpH10.ForbothCpHandpH-REMDmethods,theconformationalconvergencewasstudiedbycalculationoftheKullback-Leiblerdivergence(seesection 1.5.3.5 )ofRMSDcumulativedistributionsvs.time.Thisisameasureoftherateofconvergencetothenalconformationalensemble.WepresentedresultsforbothCpHandpH-REMDatpH10forevery100ps(Figure 4-7 ).WeusedthenalRSMDdistributionofCpHsimulationatpH10asareferencefortheplot.AstheinsetofFigure 4-7 shows,bothsimulations 73

PAGE 74

Figure4-5. ThetitrationcurvesofLys5andTyr7intheheptapeptidederivedfromOMTKY3. convergetothesameRMDSdistribution.ItisclearthatpH-REMDconvergesmorerapidlyandsmoothlythanCpH.WealsocomparedtherateofvisitingdistinctstructuresbycalculatingtheRMSDautocorrelationatallpHforbothmethods.AccordingtoFigure 4-8 ,thecorrelationtimeofRMSDsinpH-REMDsimulationsaresignicantlyshorterthanthatofCpHsimulations,whichimpliesthatpH-REMDvisitsdistinctconformationsmoreoftenthanCpH. 4.5ConcludingRemarksInthepresentchapter,wehavecombinedHamiltonianREMDwithConstantpHMDtocreatewhatwecallpH-REMD.ThepredictedpKaforanumberofsystemsusingpH-REMDisinexcellentagreementwithexperimentaldata.ComparedtoCpHMD,pH-REMDconvergesfasterinbothconformationalandprotonationspaces.IncontrasttotheTemperatureReplicaExchangeMolecularDynamicsmethods(TREMD),inpH-REMDthereplicaladder(pH)isverylimited,sotheoverlapofenergyand 74

PAGE 75

Figure4-6. CumulativeaverageprotonationfractionforTYR7versusMCtitrationstepsatpH=8.0,9.0,10.0,11.0,12.0and13.0,comparisonbetweentheCpH(A)andthepH-REMD(B)methods. 75

PAGE 76

Figure4-7. TheKullback-LeiblerdivergencemeasureofRMSDdistributionsofCpH(Green)andpH-REMD(Red)withrespecttothenalRMSDdistributioninCpH.TheinsetshowstheRMSDdistributionsofCpH(Green)andpH-REMD(Red)respectivelyafter100and10ns. consequentlytheexchangeratiobetweenneighboringreplicasarealwayshigh.Thisnewmethodisexpectedtoperformverywellforbiosystemswithhighlycoupledconformationalandprotonationstates,likeproteinswithpH-dependentstructureanddynamics. 76

PAGE 77

Figure4-8. RMSDAutocorrelationforCpH(Green)andpH-REMD(Red)atallpHs. 77

PAGE 78

CHAPTER5OPTIMIZATIONOFUMBRELLASAMPLINGREPLICAEXCHANGEMOLECULARDYNAMICSBYREPLICAPOSITIONING 5.1LiteratureReviewAmongtheadvancedsamplingmethods(seesection 1.4 ),theUmbrellaSamplingmethod(US)[ 25 49 170 ]iswellknownforimprovingthesamplingofrareeventsalongthereactioncoordinate.USattemptstosolvethebadsamplingproblembyapplyingabiasingpotentialtothesystem,toguaranteetheeffectivesamplingalongthereactioncoordinates.Thiscanbeachievedeitherinoneorseveralsimulations(windows).WhiletheefciencyofUSisclear,itneedscarefultuning.Forexample,thepositionsofwindowsandthestrengthofthebiaspotentialshavetobesuchthattheyhaveenoughoverlap.REMDtechniques[ 1 3 22 24 ]havebecomeincreasinglypopularschemes.REMDisnotonlyapromisingmethodfortacklingthequasi-ergodicity[ 171 ]problem,butitisalsoabletocombineintuitivelywithotherenhancedsamplingtechniques[ 2 4 21 23 24 55 168 ].InREMDmethods,Nnon-interactingcopies(replicas)ofthesystemwithdifferenttemperatures(TemperatureREMD[ 2 4 ])ordifferentpotentialenergyparameters(HamiltonianREMD[ 3 22 24 56 172 ])runconcurrentlyandeachreplicaattemptstoexchangewithitsneighborseveryfewsteps.Inotherwords,thesystemperformsarandomwalkinthereplicas'ladderspace.ThreetypesofeffortshavebeenmadetooptimizeREMDwithmostofthosebeingonlyapplicabletoTREMD.First,sometechniquestrytokeepthenumberofreplicaslimited.Second,therearemethodsthattrytoincreasetheefciencyofREMDbyincreasingtheprobabilityofacceptinganexchange.Third,aremethodsthatincreasetheefciencybyoptimalselectionofthepositionsofreplicasonaTemperature/Hamiltonianladder.InTREMD,thenumberofneededreplicasinasystemincreaseswiththenumberdegreesoffreedom[ 56 58 173 174 ].ThisscalingconnesTREMDtosmallsystems.Aclearwaytoavoidthisproblemistodecreasethenumber 78

PAGE 79

ofdegreesoffreedom[ 175 176 ].Approachescomprisetheuseofhybridexplicit/implicitsolventmodelsfortheMDandexchangepartsrespectively[ 173 ],theuseofseparatebathsforsolventandsolute[ 174 ],theuseofacoarsegrainedmodelforasubsetofthesystem[ 58 ]etc.Also,somestudieshavetriedtodecreasethenumberofreplicasbycouplingthesystemtoapre-sampledreservoir[ 177 178 ].Afeweffortshavebeenmadetoincreasetheprobabilityofexchange.Rick[ 179 ]useddynamicallyscaledreplicasbetweenconventionalreplicasatbroadtemperatureranges.Lietal.[ 180 ]devisedanewapproachcalledTemperatureIntervalswithGlobalEnergyReassignment(TIGER).Intheirmethod,shortrunsofheating-sampling--quenchingsequencesaredoneandattheendofeachsequencethepotentialenergiesofallreplicasarecomparedusingMetropolisMCsamplingandnallyreassignedtothetemperaturelevels.Kamberajetal.[ 181 ]addedappropriatebiaspotentialstothesystemtoattenthePotentialofMeanForceandtoachievearandomwalkalongthetemperatureladder.Also,Ballardetal.[ 168 ]usednon-equilibriumworkssimulationstogeneratetheattemptedcongurationswaps.Fortheoptimalchoiceoftheladderofreplicas,ithasbeenarguedthatthehighestefciencyisachievedwhentheExchangeAcceptanceRatio(EAR)betweenneighborreplicasisabout20%.[ 182 185 ].In2002,Kofke[ 182 183 ]showedthatwhenthespecicheatofthesystemisindependentoftemperature,assigningtemperaturesinageometricprogressionleadstoequalEARforTREMD.But,ifthespecicheatisafunctionoftemperature,thenthechoiceoftemperaturesisnotsystem-independent.Althoughsettingtheequi-EARbetweentheneighborsisthemostacceptedcriterionforoptimizingREMD(especiallyinTREMD);ithasbeenshown[ 186 190 ]thatforsystemswithphasetransitionsalongreplicaladders,thisisnottheoptimalarrangement.Afewtechniqueshavebeenproposedforselectingthesetoftemperatureswhenthespecicheatchangeswithtemperature.Forexample,Rathoreetal.[ 187 ]usedasmallsetofpre-simulatedreplicasatdifferenttemperaturesto 79

PAGE 80

determinethemeanandvarianceofenergyatthosetemperaturesandthenappliedaniterativeschemetondtherightpositionsofreplicas(temperatures).Katzgraberetal.[ 188 ]alsodevisedaniterativefeedbackalgorithmforoptimizingTREMD,buttheirmethodiscomputationallydemanding.In2007,HirtzandOostenbrink[ 189 ]proposedatechniqueforoptimizingTREMDandHREMDinsystemswithknownstableconformations.Intheirmethod,onerstgeneratesmanyshortMDtrajectoriesusingdifferentHamiltonians/temperaturesandstartingfromdifferentregionsofconformationspace.ThenoneclusterstheframesbasedontheirconformationalstatesandHamiltonians/temperaturesatwhichthoseweresimulatedandnallyappliesthetransitionprobabilitybetweenthepairsindifferentgroups.Shenfeldetal.[ 191 ]minimizedthethermodynamiclength[ 192 ]inordertondtheoptimalpositionsofreplicas.ThemajorityofproposedoptimizationmethodsforREMDareapplicableonlytoTREMD,withthosethatareapplicabletoHREMDmethodsbeingcomputationallydemanding.InthischapterweproposeamethodforoptimizingUmbrellaSamplingReplicaExchange.WeusethreeuniquepropertiesofUSREforoptimizingthepositionofreplicas.First,inthecaseofUSRE,thereisnophasetransitionalongthereactioncoordinate,becausethebiaspotentialsconstrainthesampledcoordinatelocally.Asdescribedabove,theoptimalreplicasetsintheabsenceofphasetransitionsarethosethathaveequi-EARbetweenallneighborreplicas.Second,theexchangeprobabilityinUSREisonlyafunctionofthedifferencebetweenorderparametersatthemomentofexchange.ThisExchangeAcceptanceRatio(EAR)isveryeasytocompute.Third,inUSREwecansafelyapproximatetheorderparameterdistributionineachwindowasaGaussiandistribution[ 27 28 ].Inthischapter,wedonotdiscussorproposeanyoptimalvalueofEARbetweentheneighbors,butweprovideawaytooptimizestheefciencyofUSREgiveneitherthenumberofreplicasorthedesiredEARbetweenallneighbors. 80

PAGE 81

TotestourmethodweappliedittooptimizeUSREsimulationstosamplingbutanedihedralrotationinimplicitsolventandtocomputethefreeenergyofformationofaNH+4+OH)]TJ /F1 11.955 Tf 10.41 -4.34 Td[(saltbridgeinexplicitsolvent.Theproposedpositionsofreplicas(windows)byourmethodnotonlyresultedinbettermixingandconvergencewithrespecttothatofequaldistanceonthereactioncoordinatebutalsothebestpossiblearrangementwithgivenrangeandnumberofreplicas.Weshowthatanydeviationfromthisarrangementproducesalowerrateofmixing.Intherestofthechapterweusereplicaandwindowsinterchangeably. 5.2Theory 5.2.1UmbrellaSamplingReplicaExchangeUSRE[ 4 ]isacombinationofUmbrellaSamplingandHamiltonianReplicaExchangeMD(seechapter1),i.e.,inUSRENnon-interactingreplicasrunsimultaneouslywithNdifferentbiaspotentials.Thebiaspotentialsareharmonicfunctionswithdistinctstrengthsandcoordinatereferences.ThebiasoftheithwindowhasaquadraticformofBi()=ki((q))]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2, (5)whereki,(q),0i,andqarethebiasstrength,theorderparameter,thebiascenteronthereactioncoordinate,andcongurationrespectively.Wewillpresentresultsonlyforthecaseforwhichallourreplicashavethesameforceconstantbutdifferentcoordinatecenters.Howeverthemethodiseasilyexpandabletothecaseofdissimilarbiasstrengths.TheHamiltonianfortheithwindowcanbewrittenas:Hi=K+UUB+Bi, (5)whereUUBandKarerespectivelytheunbiasedpotentialenergyandthekineticenergy.Fortwoexchangingreplicas,iandj,withki=kj=k,inEquation 1 canbe 81

PAGE 82

simpliedto:=k(i)]TJ /F7 11.955 Tf 11.95 0 Td[(0j)2+k(j)]TJ /F7 11.955 Tf 11.96 0 Td[(0i)2)]TJ /F10 11.955 Tf 11.96 9.68 Td[(k(i)]TJ /F7 11.955 Tf 11.96 0 Td[(0i)2+k(j)]TJ /F7 11.955 Tf 11.95 0 Td[(0j)2=2k(0i)]TJ /F7 11.955 Tf 11.96 0 Td[(0j)(i)]TJ /F7 11.955 Tf 11.96 0 Td[(j)=2k0ijij. (5)Foragivenchoiceofbiascentersforeachwindow,0ijisaconstantandthenbecomesalinearfunctionofijandisnotdependentonthepotentialenergiesofexchangingreplicas. 5.2.2CalculatingExchangeAcceptanceRatioinUmbrellaSamplingReplicaExchangeByinsertingEquation 5 intoEquation 1 ,wecanwritetheprobabilityofacceptinganexchangeas:Pacc=min)]TJ /F4 11.955 Tf 5.48 -9.68 Td[(1,exp()]TJ /F4 11.955 Tf 9.3 0 Td[(2k0ijij). (5)UsingtheProbabilityDensityFunction(PDF)ofij(i.e.,p(ij),theEARij(ortheaverageprobabilityofacceptancebetweenreplicasiandj)canbecalculatedas:EARij=hPaccip(ij)=ZPacc(ij)p(ij)dij. (5)Wecanestimatethep(ij)asanormalDistribution(seeappendix A fordetails)p(ij)=1 r 22i+2jexp24)]TJ /F4 11.955 Tf 10.5 8.09 Td[((ij)]TJ /F4 11.955 Tf 11.96 0 Td[((hii)-222(hji))2 22i+2j35. (5)here,hiiand2iarethemeanandthevarianceofidistribution.Notethathii6=0i.BysubstitutingtheEquation 5 andEquation 5 intheEquation 5 ,theaverageprobabilityofacceptancebecomesEARij=Z1min)]TJ /F4 11.955 Tf 5.48 -9.69 Td[(1,exp()]TJ /F4 11.955 Tf 9.3 0 Td[(2k0ijij)1 r 22i+2jexp24)]TJ /F4 11.955 Tf 10.5 8.09 Td[((ij)]TJ /F4 11.955 Tf 11.96 0 Td[((hii)-222(hji))2 22i+2j35dij. (5) 82

PAGE 83

Withoutlossofgenerality,wecansimplifytheminbyassuming0i<0j,andnallyEARij=1 2erfc0BB@)]TJ /F4 11.955 Tf 19.08 8.08 Td[((hii)-223(hji) r 22i+2j1CCA+1 2erfc0BB@(hii)-222(hji))]TJ /F4 11.955 Tf 11.95 0 Td[(2k(0ij)2i+2j r 22i+2j1CCAexph)]TJ /F4 11.955 Tf 9.29 0 Td[(2k(0ij)(hii)-222(hji)+22k22i+2j(0ij)2i. (5)InEquation 5 ,allotherparametersexceptthehijjiand2ijjareknown.Inappendix B wepresentamethodforestimatingthemeanandvarianceofatanypointofthereactioncoordinateusingveryfewshortpre-simulatedwindows. 5.2.3UmbrellaSamplingReplicaExchangeOptimizationAswediscussedabove,thereisnophasetransitionalongthereplicasladderinthecaseofUSRE.Ifthereisnophasetransition,thebestsolutionforpositionsofreplicasisfortheEARbetweenallneighborpairsisequal.Wepresentheretwodifferentapproachesformakingasetofequi-EARsamongneighborpairs.Wediscusstnessfunctionsforeachcase,that,whenmaximized,leadtoequalEARamongneighborreplicas.Intherstmethod,thenumberofreplicasandthecenteroftherstandlastreplicasarexed(therangeforthesamplingcoordinate).Inordertondthesetofequi-EARreplicas,wedeneaFullBatchScoringFunction(FBSF)as:FBSF)]TJ /F7 11.955 Tf 5.48 -9.69 Td[(02...0N)]TJ /F5 7.97 Tf 6.59 0 Td[(1=)]TJ /F5 7.97 Tf 16.75 14.95 Td[(NXi=2)]TJ /F4 11.955 Tf 5.48 -9.69 Td[(EARi)]TJ /F5 7.97 Tf 6.58 0 Td[(1i(0i)]TJ /F5 7.97 Tf 6.58 0 Td[(1)]TJ /F7 11.955 Tf 11.96 0 Td[(0i))-222(hEARi2 (5)andmaximizeitwithrespecttothecenterofbiasesinallreplicasexceptthelastandrstones(i.e.,02...N)]TJ /F5 7.97 Tf 6.59 0 Td[(12).HereNisthenumberofreplicasandhEARi=NPi=2EARi)]TJ /F11 5.978 Tf 5.75 0 Td[(1i(0i)]TJ /F11 5.978 Tf 5.75 0 Td[(1)]TJ /F8 7.97 Tf 6.59 0 Td[(0i) N)]TJ /F5 7.97 Tf 6.59 0 Td[(1. 83

PAGE 84

InFBSFwesetthepositionoftherstandthelastreplicas.ConsequentlythetotalnumberofdegreesoffreedomisequaltoN)]TJ /F4 11.955 Tf 12.33 0 Td[(2.SinceEARi)]TJ /F5 7.97 Tf 6.59 0 Td[(1iisastrictlydecreasingfunctionof0i)]TJ /F5 7.97 Tf 6.59 0 Td[(1)]TJ /F7 11.955 Tf 11.95 0 Td[(0i,thenitcanbeconcludedthatthescoringfunction(whichisanN)]TJ /F4 11.955 Tf 11.96 0 Td[(2dimensionalparabola)hasauniquemaximum.MaximizingtheFBSFleadstoequalEARbetweenadjacentreplicas.Inthesecondmethod,wechooseadesiredvaluefortheEARbetweenreplicas,andwexthecenterofbiasfortherstreplica(lowestvalueonreactioncoordinate)andtherangeofthereactioncoordinate.Inthisscheme,startingfromthelowestvalueonthereactioncoordinate,weaddanewreplicaatatimeandmaximizetheSingleBatchScoringFunction(SBSF)respecttothatcenterofbiasinthatreplica.Wecontinuethisschemeuntilthelastreplicapassesthemaximumrangeofthereactioncoordinate.Inordertoadjustthepositionithreplica,wetrytomaximizethefollowingSBSFateveryreplicaaccumulation:SBSF)]TJ /F7 11.955 Tf 5.48 -9.68 Td[(0i=)]TJ /F10 11.955 Tf 11.29 9.68 Td[()]TJ /F4 11.955 Tf 5.48 -9.68 Td[(EARi)]TJ /F5 7.97 Tf 6.59 0 Td[(1i(0i)]TJ /F5 7.97 Tf 6.58 0 Td[(1)]TJ /F7 11.955 Tf 11.96 0 Td[(0i))]TJ /F4 11.955 Tf 11.96 0 Td[(EARc2 (5)whereEARcisthedesiredvalueofEAR,whichisboundedbetween0and1.AttheendofthisprocesstheEARbetweenallneighborpairsequalsEARc.WeusedtheBarzilaiandBorwein[ 194 195 ]optimizationmethodformaximizingbothSBSFandFBSF(seetheAppendix C formoredetails). 5.2.4UmbrellaSamplingReplicaExchangeOptimizationWorkowsBasedonthechoiceofoptimization,twotypesofworkowsexist.Inbothmethodswepre-simulateMwindows.IntheFBSFmethod,thenumberofreplicasisset,butthenalaverageEARisnotknownapriori(Figure 5-1 ).IntheSBSFmethodtheEARcisachosenparameterandthenumberofreplicasinnotknowninadvance(Figure 5-2 ).Wehavewrittenscripts(forbothFBSFandSBSF)thatimplementtheideaspresentedhere.Bothscriptsareavailableat http://www.clas.ufl.edu/users/roitberg/software.html 84

PAGE 85

Figure5-1. FBSFworkow 85

PAGE 86

Figure5-2. SBSFworkow 5.2.5SimulationDetailsInordertovalidateourmethod,weperformedUSREsimulationsontwosystems.First,westudiedthePMFalongthebutanetorsionalangleintherangefrom-65to67degreesusing12replicas.Weperformedfoursetsofsimulationsfordifferentarrangementsofreplicasonthereactioncoordinate.WeappliedtheOBCGeneralizedBorn[ 101 ]implicitsolventmodel(igb=5)withacutoffof12Afornonbondedinteractions.Aspringconstantofk=30kcalmol)]TJ /F5 7.97 Tf 6.59 0 Td[(1radian)]TJ /F5 7.97 Tf 6.59 0 Td[(2hasbeenappliedtoallreplicas.Thelengthofeachsimulationwasabout50nsinordertohaveenoughsampling 86

PAGE 87

forestimatingtheAverageRoundtripTimebetweenthehighestandthelowestreplicasineachset.Second,weperformedtwosetsofsimulationsusing24replicasonaNH+4+OH)]TJ /F1 11.955 Tf 10.41 -4.34 Td[(saltbridgeimmersedinaboxofTIP3Psolventmodel[ 196 ]of21.421.421.4A3.WecalculatedthelongrangeelectrostaticinteractionswiththeparticlemeshEwaldmethod,andusedan8Acutofffortheshortrangenonbondedinteractions.Aspringconstantofk=30kcalmol)]TJ /F5 7.97 Tf 6.59 0 Td[(1A)]TJ /F5 7.97 Tf 6.58 0 Td[(2hasbeenappliedtoallreplicas.Thelengthsofallsimulationswere40ps,howeverweuseda5nsUSREsimulationasareferenceincalculationoftheKullback-Leiblerdivergence[ 39 ].Inallcalculationsinthischapter,weusedtheAMBER12molecularsimulationsuite[ 197 ]andthegeneralAMBERforceeld(GAFF)[ 198 ].TheSHAKE[ 77 ]algorithmwasemployedtoconstrainthedistancebetweenhydrogensandheavyatoms,whichpermittedtheuseof2fsMDsteps.Replicawereattemptedtoexchangeevery50MDsteps[ 199 ]andLangevindynamicswithfrictioncoefcientof2.0ps)]TJ /F5 7.97 Tf 6.58 0 Td[(1wasemployedtosustainthesystemsat300K.Inordertoproduceindependentsimulations[ 107 ],weuseddistinctrandomseeds(ig=-1inAMBER)fortheLangevinthermostatsinallsimulations.Wesavedtheorderparameterevery10MDsteps(i.e.,20fs)andweusedGrosselds[ 26 ]implementationofWeightedHistogramAnalysismethod(WHAM)forcalculatingthePotentialofMeanForcesversusreactioncoordinateinallsimulations. 5.3ResultsandDiscussionTostudytheeffectofreplicaoptimizationinUSRE,wecomputedandcomparedtheAverageRoundtripTimefordifferentpositionsofthereplicas,includingouroptimizedone.Furthermorewemeasuredtheconvergencespeedforoptimizedandnon--optimizedUSREarrangementofNH+4+OH)]TJ /F1 11.955 Tf 10.41 -4.34 Td[(saltbridgeinexplicitsolvent.Inbothsimulations,wechoosethetotalnumberofreplicastobeequaltothatofpre-simulatedreplicas(i.e.,M=N). 87

PAGE 88

5.3.1PotentialofMeanForceAlongtheButaneDihedralAngleWeperformedfoursetsofUSREcalculationsonbutanedihedralinimplicitsolvent.Intherstone,thecentersofbiaseswereequallydistributed(every12)onthereactioncoordinate.Inthesecondsetofcalculationswemovedtwoneighboringreplicasinthemiddleofthereactioncoordinateandwekeptallotherreplicasinplace.ForthethirdsetweoptimizedthereplicapositionsbymaximizingtheFBSF.Fortheoptimization,weusedtherst50psoftherstsetaspre-simulateddata.Finally,inthefourthsetofsimulationsweperturbedthepositionofonereplicainthethird(optimized)set.ThepositionsofreplicasandcorrespondingEARineachsethasbeenshowninTable 5-1 Table5-1. PositionandEARforfourdifferentsettingsofUSREcalculationsonthebutanedihedralsimulationinimplicitwater.Positionofeachreplicacorrespondstothecenterofbiasforthatreplica.Also,theEARofeachpositioncorrespondstotheEARbetweenthatreplicapositionanditshigherneighboronthereactioncoordinate. Set1Set2Set3Set4 PositionEARPositionEARPositionEARPositionEAR-65.000.18-65.000.19-65.000.14-65.000.14-53.000.18-53.000.18-51.720.14-51.720.14-41.000.16-41.000.16-38.580.14-38.580.14-29.000.13-29.000.13-26.010.14-26.010.14-17.000.10-17.000.13-14.440.15-14.440.15-5.000.07-6.000.03-4.160.14-4.160.107.000.109.000.185.730.157.000.2119.000.1419.000.1416.100.1416.100.1331.000.1631.000.1727.980.1427.980.1443.000.1843.000.1840.600.1440.600.1455.000.1955.000.1953.710.1453.710.1467.000.0067.000.0067.000.0067.000.00 Foreachset,theAverageRoundtripTime(ART)betweenextremumreplicaswascomputed.Togathersufcientstatistics,eachsimulationwasrepeated10times.Sincetherewere12replicasineachsimulation,thetotalnumberofmeasuredARTsineachsetwas120.WepresentbelowtheprobabilitydensityoftheARTs(Figure 5-3 ).Figure 5-3 canbeanalyzedbasedontheCentralLimitTheorem.TheCLTdescribesthecharacteristicsofthepopulationofthemeansofaninnitenumberofrandomsamples 88

PAGE 89

Figure5-3. ARTprobabilitydensity.TheinsetshowsthePMFvs.reactioncoordinate(dihedralangle) ofsizeN,drawnfromaninniteparentpopulation.AccordingtotheCLT,thedistributionofthemeanconvergestoaGaussiandistribution,andthemeanofthemeans(i.e.,centeroftheGaussian)isequaltothemeanoftheparentdistribution.BasedontheCLTthecentersofdistributionsareequaltothemeanoftheparentpools.AccordingtoFigure 5-3 ,theOptimizedset(set3)hasthelowestARTamongtheallsets,moreoverasthereplicapositionsdeviatemorefromoptimizedarrangement(i.e.,sets4,1and2respectively)theARTincreases.Thisshowstheequi-EARsetcorrespondstotheminimumARTandthemixingisthebestinthisarrangement.Sincethenumberofexchangeattemptsisequalinallsets,weexpectthesetswithhigherARTstohavethesmallersamplesizes(i.e.,thenumberofroundtripsineachsimulation),whichaccordingtotheCLTleadstowiderARTdistributions.Moreover,Figure 5-3 alsoshowsthenaivearrangementofequaldistancebetweenthecentersofbiases(set1)isnottheoptimizedarrangementofreplicas. 89

PAGE 90

5.3.2PotentialofMeanForceforNH+4+OH)]TJ /F9 11.955 Tf 10.4 -4.34 Td[(SaltBridgeinExplicitSolventInordertoshowtheeffectofUSREoptimizationontherateoftheconvergenceforfreeenergycalculations,weappliedittoamorecomputationallydemandingsystem,i.e.,NH+4+OH)]TJ /F1 11.955 Tf 10.41 -4.33 Td[(saltbridgeinexplicitsolvent.WehavecomputedthesaltbridgePMFvs.N)]TJ /F4 11.955 Tf 11.96 0 Td[(Odistance(insetofFigure 5-5 ).Weperformed2setsofsimulationsonthesaltbridgeandforeachsetwerepeatedthecalculations10times.Inset1weusedthenaivesettingofequaldistancebetweentheadjacentbiascentersalongthereactioncoordinateandintheset2weoptimizedthepositionsofreplicas.Fortheoptimizationleadingtoset2,weuseddumpedvaluesoftherst20psofoneofthesimulationsinset1tomaximizetheFBSF. Figure5-4. Windowpositionsvs.EARsintheequi-distance(non-optimized)(redcircles)andoptimized(set2)(greencircles)simulations.TheEARofeachpositioncorrespondstoEARbetweenthatreplicapositionanditshigherneighboronthereactioncoordinate. 90

PAGE 91

Figure5-5. ThemeanandRMSEofKullback-Leiblerdivergenceover10simulationsforeachset.Set1(red)isthenaiveequi-distancearrangementandset2(green)isouroptimizedarrangement.WeusedthePMFof5nsUSREsimulationasareferenceforbothsimulationsets(theinset).TheinsetshowsPMFvs.reactioncoordinate(N)]TJ /F4 11.955 Tf 11.96 0 Td[(Odistance) 5.4ConcludingRemarksInthepresentwork,weusedstatisticalmechanicstechniquestoestimatetheExchangeAcceptanceRatiosbetweenanytwowindowsonthereactioncoordinateforUSRE.UsingtheproposedschemeswesetthepositionsofreplicassuchthatallreplicashavethesameEARwiththeirimmediateneighbors.SinceinUSREthereisnophasetransitionalongthereactioncoordinatethenthissetupmaximizesthenumberofroundtripsbetweenthelowestandhighestreplicasandconsequentlytheefciencyofUSRE.WeappliedouroptimizationmethodtobutaneinimplicitsolventandNH+4+OH)]TJ /F1 11.955 Tf -460.07 -28.25 Td[(saltbridgeinexplicitsolvent.Fromtheresults,itwasevidentthatthisoptimizationsubstantiallyincreasedthemixingandconvergencerateforbothsystems.Weexpect 91

PAGE 92

theoptimizationofUSREtosignicantlyincreasethespeedofconvergenceinlargersystems. 92

PAGE 93

CHAPTER6CONCLUSIONSInspiteoftheexcessivegrowthinthepowerofcomputerhardwareinrecentyears,stilltheroleofefcientsamplinginComputationalbiophysicsandbiochemistryiscrucial.MyPhDresearchwasmostlydedicatedtodevelopingandoptimizingenhancedsamplingmethodsandpredominantlyHamiltonianReplicaExchangeMolecularDynamics(HREMD).Inmyrstproject,wedemonstratedthattheHREMDmethodimprovestheconvergencerateinalchemicalfreeenergycalculations.Moreover,weshowedthatthereisadirectmappingbetweentheHREMDandFreeEnergyPerturbation(FEP)methods,whichcanbeusedforbothexchangeacceptanceandfreeenergycalculations.Inthesecondproject,IusedtheReplicaExchangeFreeEnergyMethodtoestimatethepKashiftinGlutamate66inahyperstablemutantofstaphylococcalnuclease.Inaddition,weaimedtoresolveaninconsistencybetweenpKaexperimentalandcontinuummethodsinestimationofpKavalueofposition66inthatprotein.Weproposedthattheexperimentalmethods,whicharemostlysensitivetocongurationalchanges,measuretheequilibriumconstantbetweentwocongurationalstatesinsteadoftwoprotonationstates.TheresultsareinalmostperfectagreementwiththeMorenoeslabexperiments.Inthethirdproject,wedevelopedandvalidatedapH-ReplicaExchangeMolecularDynamics(pH-REMD)method.Weappliedthismethodtoaseriesofmodelcompounds,aterminallychargedADFDApentapeptide,andaheptapeptidederivedfromtheOvomucoidthirddomain(OMTKY3).WeshowedthatnotonlythepredictedpKasinpH-REMDareverysimilartothatofConstantpHMD(CpHMD),butalsothesamplingismoreeffectivethanthatofCpHMDmethods.Finallyinthelastproject,wefocusedonoptimizingtheUmbrellaSamplingReplicaExchange(USRE)method.Weinvented,validated,andtestedamethodforestimating 93

PAGE 94

theprobabilityofexchangebetweenneighboringreplicas.Usinginformationfromshortumbrellaruns,weoptimizedthepositionofreplicas(windows)alongthereactioncoordinate,andweshowedthattheequalexchangeacceptancebetweenreplicapairsistheoptimumsettingforUSRE. 94

PAGE 95

APPENDIXACALCULATINGTHEPDFOFINUSREInUS(andsubsequentlyUSRE),approximatingtheProbabilityDensityFunction(PDF)ofbyanormaldistributionisvalidassumption[ 169 ].Soe.g.,forreplicaiwecanwrite:pi()=1 q 22iexp")]TJ /F4 11.955 Tf 10.49 8.09 Td[(()-222(hii)2 22i#. (A)Sincethewindowsareindependent,thejointPDFofiandj(i.e.,p(i,j))isequaltomultiplicationoftheirPDF;thenwecancomputethePDFofij=i)]TJ /F7 11.955 Tf 12.11 0 Td[(jbyintegratingthejointPDFofiandj,i.e.,p(ij)=Z1dip(i,j)=Z1dip(i)p(j)=Z1dip(i)p()]TJ /F4 11.955 Tf 9.3 0 Td[(ij+i)=Z11 2ijexpf)]TJ /F4 11.955 Tf 16.47 8.09 Td[(1 2"(i)-222(hii)2 2i+()]TJ /F4 11.955 Tf 9.3 0 Td[(ij+i)-221(hji)2 2j#gdi=1 r 22i+2jexp24)]TJ /F4 11.955 Tf 10.49 8.08 Td[((ij)]TJ /F4 11.955 Tf 11.95 0 Td[((hii)-223(hji))2 22i+2j35. (A)whichitselfisanormaldistribution. 95

PAGE 96

APPENDIXBESTIMATINGMEANSANDVARIANCESOFTHEWINDOWSUSINGNEARESTNEIGHBORWEIGHTEDAVERAGINGANDREWEIGTHINGInordertondtheoptimalpositionsofreplicasonthereactioncoordinate,weneedtoestimatethemeanandvarianceatthereplicas'newpositions.Themostprimitivewaytoestimatethemeanandvarianceof(i.e.hiand2)foreachwindowistorunashortUSsimulationatthatwindow.Sinceoptimizingthepositionsofreplicasrequiresmovingthepositions(i.e.,centerofbiases)ofreplicasateachstepofoptimization,itisnotveryefcienttousethismethodforestimatingthemeanandvarianceof.Instead,wedevelopamethodbelow,whichapproximateshiand2foranyarbitraryreplicaonthereactioncoordinateusingthehiand2ofafew,veryshorttimepre-simulatedwindows.Supposeweknowthecenterofbias,varianceandmeanforNpre-simulatedwindows,i.e.,8>>>><>>>>:01...0Nh1i...hNih21i...h2Ni. (B)Ifreplicaiisapre-simulatedreplica,thentheoreticallywecanestimatethersttwonon-centralmomentsofthedistribution(i.e.,hiand2)onanynewwindowmonthereactioncoordinateusingthereweightingtechnique(section B.1 )as:him;i=R11 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[(()]TJ /F6 7.97 Tf 6.25 1.08 Td[(hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.96 0 Td[(()]TJ /F7 11.955 Tf 11.95 0 Td[(0m)2)]d R11 q 22iexp)]TJ /F5 7.97 Tf 10.5 5.7 Td[(()]TJ /F6 7.97 Tf 6.26 1.07 Td[(hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.96 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.95 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2)]d (B)andh2im;i=R121 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[(()]TJ /F6 7.97 Tf 6.25 1.08 Td[(hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.95 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2)]d R11 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[(()]TJ /F6 7.97 Tf 6.25 1.07 Td[(hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.95 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2)]d (B) 96

PAGE 97

wherehf()im;iistheaverageoff()overdistributionofmusingthedistributionofi.Wecansolvetheaboveintegrationsbywritingthepowerofexponentialinasinglesquareform:)]TJ /F4 11.955 Tf 10.49 8.09 Td[(()]TJ /F2 11.955 Tf 7.08 1.8 Td[(hii)2 22i+k)]TJ /F4 11.955 Tf 5.48 -9.68 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.96 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2=)]TJ /F4 11.955 Tf 9.3 0 Td[(2hiik(0i)]TJ /F7 11.955 Tf 11.96 0 Td[(0m)+22i(k(0i)]TJ /F7 11.955 Tf 11.95 0 Td[(0m))2)]TJ /F4 11.955 Tf -241.85 -20.63 Td[(()]TJ /F4 11.955 Tf 11.96 0 Td[((hii)]TJ /F4 11.955 Tf 19.26 0 Td[(22ik(0i)]TJ /F7 11.955 Tf 11.96 0 Td[(0m)) 22i+k)]TJ /F4 11.955 Tf 5.48 -9.68 Td[((0i)2)]TJ /F4 11.955 Tf 11.96 0 Td[((0m)2 (B)andusing8>><>>:1R1 p Cexph)]TJ /F5 7.97 Tf 10.5 5.7 Td[(()]TJ /F5 7.97 Tf 6.59 0 Td[(B)2 Ci=B1R1 p C2exph)]TJ /F5 7.97 Tf 10.49 5.7 Td[(()]TJ /F5 7.97 Tf 6.58 0 Td[(B)2 Ci=1 2C+B2, (B)Equations( B )and B simpliesto:8><>:him;i=hii)]TJ /F4 11.955 Tf 19.26 0 Td[(22ik(0i)]TJ /F7 11.955 Tf 11.96 0 Td[(0m)h2im;i=(hii)]TJ /F4 11.955 Tf 19.26 0 Td[(22ik(0i)]TJ /F7 11.955 Tf 11.95 0 Td[(0m))2+2i, (B)Ifreplicasiandi+1arethenearestpre-simulatedneighborsofourimaginaryreplicam,i.e.,0i<0m<0i+1,thenusingthenearestneighborweightedaveragingmethod(section B.2 )wecanestimatethemeanandvarianceofonwindowsmasfollowing:8>>>><>>>>:him=wi(hii)]TJ /F4 11.955 Tf 19.26 0 Td[(22ik(0i)]TJ /F7 11.955 Tf 11.96 0 Td[(0m))+wi+1)]TJ /F2 11.955 Tf 5.48 -9.69 Td[(hxii+1)]TJ /F4 11.955 Tf 11.96 0 Td[(22i+1k(0i+1)]TJ /F7 11.955 Tf 11.95 0 Td[(0m)h2im=wih(hii)]TJ /F4 11.955 Tf 19.26 0 Td[(22ik(0i)]TJ /F7 11.955 Tf 11.95 0 Td[(0m))2+2ii+wi+1h)]TJ /F2 11.955 Tf 5.48 -9.68 Td[(hi+1i)]TJ /F4 11.955 Tf 19.26 0 Td[(22i+1k(0i+1)]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2+2i+1i2m=h2im)-221(hi2m, (B) 97

PAGE 98

wherewechoosethepowerparameterequalto1.2andnumberofinvolvingnearestneighborsequalsto2,soEquation B inthesection B.2 implies:wi=8>>>><>>>>:(j0i)]TJ /F8 7.97 Tf 6.58 0 Td[(0mj)0.6 (j0i)]TJ /F8 7.97 Tf 6.59 0 Td[(0mj)0.6+(j0i+1)]TJ /F8 7.97 Tf 6.59 0 Td[(0mj)0.60m6=0iand0m6=0i+110m=0i00m=0i+1. (B)Usingtheabovesetup,weareabletoestimatethemeanandvarianceofanyarbitrarywindowonthereactioncoordinate.SubsequentlyitispossibletoapproximatetheEARbetweenanytworeplicasusingEquation 5 B.1ReweightingFittedGaussianDistributiononWindowsHerewedescribethereweightingmethodthatweusedforestimationofthemeanandthevarianceofarbitrarywindowsonreactioncoordinate.Consideracontinuousfunctionf().Wecancalculatetheaverageoff()overadistributionofp()asfollowing:hf()ip=Rf()p()d Rp()d. (B)Ifwewanttocalculatetheaverageoff()overanotherdistributionq(),wherethedomainofq()isasubsetofp(),thenwemaywrite:hf()iq;p=Rf()q()d Rq()d=Rf()p()q() p()d Rp()q() p()d=hq() p()f()ip hq() p()ip, (B)wherehf()iq;pistheaverageoff()overdistributionofqusingthedistributionofp.InthecaseofUmbrellaSamplingandbyassumingGaussiandistributionsforthecollectivevariablesforeverywindow(i.e.,usingEquation A ),wecanestimatetheaverageoff()overwindowi:hf()ipi=Rf()1 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[((hii)2 22id R1 q 22iexp)]TJ /F5 7.97 Tf 10.5 5.7 Td[((hii)2 22id. (B) 98

PAGE 99

UsingtheEquation 5 andEquation 5 ,wecancomputetheratioofprobabilitydistributionofreplicasiandmaspm() pi()=exp[)]TJ /F7 11.955 Tf 9.3 0 Td[((UUB+Bm)] exp[)]TJ /F7 11.955 Tf 9.3 0 Td[((UUB+Bi)]=expk)]TJ /F4 11.955 Tf 5.48 -9.68 Td[(()]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.95 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2 (B)FinallyusingEquation B andEquation B ,wecanexploithf()ipm;pi=hf()im;i=Rf()1 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[((hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.96 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.96 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2)]d R1 q 22iexp)]TJ /F5 7.97 Tf 10.49 5.7 Td[((hii)2 22iexp[k(()]TJ /F7 11.955 Tf 11.95 0 Td[(0i)2)]TJ /F4 11.955 Tf 11.95 0 Td[(()]TJ /F7 11.955 Tf 11.96 0 Td[(0m)2)]d. (B)Whichistheaverageoff()overthewindowmusingthewindowi.UsingEquation B wecalculatethevaluesofhiandh2iforanyarbitrarywindows,usingthepre--simulatedwindows(i.e.,Equation B ). B.2NearestNeighborWeightedAveraging(NNWA)WeusedNNWAtomakeabetterestimationofmeanandvarianceofforanyarbitrarywindowbypre-simulatedwindows.NNWAisamemberofLocallyWeightedLearningmethods[ 200 ].Considerafunctionf()withafewknownpoints(i.e.,i);furthermoresupposethevalueoffatanyarbitrarypositioncanbeestimatedseparatelybyeachofthoseknownpoints;thenusingkknownnearestneighborof,wecanapproximatethevalueoff,as:f()=kXi=1wi()fi() kPi=1wi(). (B)wherefi()istheestimationoff()usingiandwi()=1 d(,i).disso-calledmetricoperator,whichinitssimplestformisd(,i)=[()]TJ /F7 11.955 Tf 11.95 0 Td[(i)2].pisapositiverealnumberknownasapowerparameter.Dependingontheproblem,kcanbechangedfromtherstnearestneighborstothetotalnumberofknownpoints.Inthispaperwesetp=1.2andk=2,whichhelpedtheoptimizertosmoothlyconverge. 99

PAGE 100

APPENDIXCBARZILAIANDBORWEINOPTIMIZATION(BBMETHOD)TheSteepestDescentorGradientDescentmethodisamongthemostimportantrstorderoptimizationalgorithms.Itcanbeformulizedasfollows:xk+1=xk+kgk, (C)wheregk=rf(xk)andkisthestepsizeatstepk.Therearemanymethodsforestimatingtheoptimalstepsize[ 195 201 ].BBmethod[ 194 ]isoneofthemostefcientroutinesamongthose,inwhichthestepsizeisdeterminedbyminimizingeitherkx)]TJ /F7 11.955 Tf 10.66 0 Td[(gk2oritscorrespondentk)]TJ /F5 7.97 Tf 6.58 0 Td[(1x)]TJ /F4 11.955 Tf 10.67 0 Td[(gk2withrespectto,wherex=xk)]TJ /F4 11.955 Tf 10.66 0 Td[(xk+1andg=gk)]TJ /F4 11.955 Tf 11.95 0 Td[(gk+1.Basedonthose,theoptimalstepsizecanbederivedas:k=x|g g|g. (C) 100

PAGE 101

REFERENCES [1] Swendsen,R.H.;Wang,J.-S.Phys.Rev.Lett.1986,57,2607. [2] Sugita,Y.;Okamoto,Y.Chem.Phys.Lett.1999,314,141. [3] Mitsutake,A.;Sugita,Y.;Okamoto,Y.Biopolymers.2001,60,96. [4] Sugita,Y.;Kitao,A.;Okamoto,Y.arXivpreprintcond-mat2000,113,6042. [5] McQuarrie,D.StatisticalMechanics;UniversityScienceBooks,2000. [6] Chandler,D.Introductiontomodernstatisticalmechanics;Oxforduniversitypress:NewYork,Oxford,1987. [7] Leach,A.MolecularModelling:PrinciplesandApplications(2ndEdition),2nded.;PrenticeHall,2001. [8] Landau,R.H.;PaezMeja,M.J.;Bordeianu,C.ASurveyofcomputationalphysics:introductorycomputationalscience,har/cdred.;PrincetonUniversityPress,1997. [9] Cramer,C.J.EssentialsofComputationalChemistry:TheoriesandModels,2nded.;JohnWileySons:WestSussex,England,2005. [10] vanGunsteren,W.F.;Berendsen,H.J.Angew.Chem.1990,29,992. [11] Levitt,M.;Warshel,A.Nature1975,253,694. [12] Li,Z.;Scheraga,H.A.Proc.Natl.Acad.Sci.U.S.A.1987,84,6611. [13] Landau,D.P.;Binder,K.AguidetoMonteCarlosimulationsinstatisticalphysics;Cambridgeuniversitypress,2009. [14] Metropolis,N.;Rosenbluth,A.W.;Rosenbluth,M.N.;Teller,A.H.;Teller,E.J.Chem.Phys.1953,21,1087. [15] Marinari,E.;Parisi,G.Europhys.Lett.1992,19,451. [16] Wang,F.;Landau,D.P.Phys.Rev.Lett.2001,86,2050. [17] Okamoto,Y.;Hansmann,U.H.E.J.Phys.Chem.1995,99,11276. [18] Berg,B.A.;Neuhaus,T.Phys.Rev.Lett.1992,68,9. [19] Berg,B.A.;Celik,T.Phys.Rev.Lett.1992,69,2292. [20] Murata,K.;Sugita,Y.;Okamoto,Y.Chem.Phys.Lett.2004,385,1. [21] Woods,C.J.;Essex,J.W.;King,M.A.J.Phys.Chem.B2003,107,13703. 101

PAGE 102

[22] Meng,Y.;Dashti,D.S.;Roitberg,A.E.J.Chem.TheoryComput.2011,7,2721. [23] Itoh,S.G.;Damjanovic,A.;Brooks,B.R.Proteins:Struct.,Funct.,Bioinf.2011,79,3420. [24] SabriDashti,D.;Meng,Y.;Roitberg,A.E.J.Phys.Chem.B2012,116,8805. [25] Torrie,G.;Valleau,J.J.Comput.Phys.1977,23,187. [26] Grosseld,A.WHAM:animplementationoftheweightedhistogramanalysismethod.2010; http://membrane.urmc.rochester.edu/content/wham [27] Kastner,J.;Thiel,W.J.Chem.Phys.2005,123,144104. [28] Kastner,J.J.Chem.Phys.2012,136,234102. [29] Riccardi,D.;Schaefer,P.;Cui,Q.J.Phys.Chem.B2005,109,17715,PMID:16853267. [30] Zwanzig,R.W.J.Chem.Phys.1954,22,1420. [31] Carugo,O.;Pongor,S.ProteinSci.2001,10,1470. [32] Matthews,B.W.;Rossmann,M.G.InDiffractionMethodsforBiologicalMacro-moleculesPartB;HaroldW.Wyckoff,S.N.T.,C.H.W.Hirs,Ed.;MethodsinEnzymology;AcademicPress,1985;Vol.115;pp397420. [33] Orengo,C.Curr.Opin.Struct.Biol.1994,4,429440. [34] Holm,L.;Sander,C.J.Mol.Biol.1993,233,123138. [35] Garc,A.E.Phys.Rev.Lett.1992,68,2696. [36] Smith,L.J.;Daura,X.;vanGunsteren,W.F.Proteins:Struct.,Funct.,Bioinf.2002,48,487. [37] Flyvbjerg,H.;Petersen,H.G.J.Chem.Phys.1989,91,461. [38] Hess,B.Phys.Rev.E2002,65,031910. [39] Kullback,S.;Leibler,R.A.Ann.Math.Statist.1951,22,79. [40] McClendon,C.L.;Hua,L.;Barreiro,A.;Jacobson,M.P.J.Chem.TheoryComput.2012,8,2115. [41] SabriDashti,D.;Roitberg,A.submittedtoJ.Phys.Chem.B [42] SabriDashti,D.;Roitberg,A.submittedtoJ.Chem.TheoryComput. 102

PAGE 103

[43] Chipot,C.;Pohorille,A.FreeEnergyCalculations:TheoryandApplicationsinChemistryandBiology;Springerseriesinchemicalphysics;Springer-VerlagBerlinHeidelberg,2007. [44] Bash,P.;Singh,U.;Langridge,R.;Kollman,P.;etal.,Science1987,236,564. [45] Kirkwood,J.G.J.Chem.Phys.1935,3,300. [46] Mezei,M.J.Comput.Phys.1987,68,237. [47] Roux,B.Comput.Phys.Commun.1995,91,275. [48] Jarzynski,C.Phys.Rev.Lett.1997,78,2690. [49] Kumar,S.;Rosenberg,J.M.;Bouzida,D.;Swendsen,R.H.;Kollman,P.A.J.Comput.Chem.1992,13,1011. [50] Bennett,C.H.J.Comput.Phys.1976,22,245. [51] Shirts,M.R.;Chodera,J.D.J.Chem.Phys.2008,129,124105. [52] Pohorille,A.;Jarzynski,C.;Chipot,C.J.Phys.Chem.B2010,114,10235. [53] Zheng,L.;Chen,M.;Yang,W.Proc.Natl.Acad.Sci.U.S.A.2008,105,20227. [54] Hamelberg,D.;Mongan,J.;McCammon,J.A.J.Chem.Phys.2004,120,11919. [55] Fukunishi,H.;Watanabe,O.;Takada,S.J.Chem.Phys.2002,116,9058. [56] Liu,P.;Kim,B.;Friesner,R.A.;Berne,B.J.Proc.Natl.Acad.Sci.2005,102,13749. [57] Liu,P.;Voth,G.A.J.Chem.Phys.2007,126,045106. [58] Lyman,E.;Ytreberg,F.;Zuckerman,D.Phys.Rev.Lett.2006,96,028105. [59] Lwin,T.Z.;Luo,R.J.Chem.Phys.2005,123,194904. [60] Rick,S.W.J.Chem.TheoryComput.2006,2,939. [61] Min,D.;Li,H.;Li,G.;Bitetti-Putzer,R.;Yang,W.J.Chem.Phys.2007,126,144109. [62] Jiang,W.;Hodoscek,M.J.Chem.TheoryComput.2009,5,2583. [63] Jiang,W.;Roux,B.J.Chem.TheoryComput.2010,6,2559. [64] Dyson,H.J.;Tennant,L.L.;Holmgren,A.Biochemistry1991,30,4262. 103

PAGE 104

[65] Langsetmo,K.;Fuchs,J.A.;Woodward,C.Biochemistry1991,30,7603. [66] Christ,C.D.;Mark,A.E.;vanGunsteren,W.F.J.Comput.Chem.2010,31,1569. [67] Jorgensen,W.L.;Thomas,L.L.J.Chem.TheoryComput.2008,4,869. [68] Kollman,P.Chem.Rev.1993,93,2395. [69] Straatsma,T.;McCammon,J.Annu.Rev.Phys.Chem.1992,43,407. [70] Jorgensen,W.L.;Ravimohan,C.J.Chem.Phys.1985,83,3050. [71] Lu,N.;Kofke,D.A.;Woolf,T.B.J.Comput.Chem.2004,25,28. [72] Simonson,T.;Carlsson,J.;Case,D.A.J.Am.Chem.Soc.2004,126,4167. [73] Bashford,D.;Case,D.A.;Dalvit,C.;Tennant,L.;Wright,P.E.Biochemistry1993,32,8045. [74] Katti,S.K.;LeMaster,D.M.;Eklund,H.J.Mol.Biol.1990,212,167. [75] Case,D.A.;Darden,T.A.;Cheatham,I.;Simmerling,C.L.;Wang,J.;Duke,R.E.;Luo,R.;Crowley,M.;Walker,R.C.;Zhang,W.;etal.,AMBER10.2008. [76] Hornak,V.;Abel,R.;Okur,A.;Strockbine,B.;Roitberg,A.;Simmerling,C.Proteins:Struct.,Funct.,Bioinf.2006,65,712. [77] Ryckaert,J.P.;Ciccotti,G.;Berendsen,H.J.C.J.Comput.Phys.1977,23,327. [78] Onufriev,A.;Case,D.A.;Bashford,D.J.Comput.Chem.2002,23,1297. [79] Perutz,M.F.FaradayDiscuss.1992,1,PMID:1290926. [80] Varadarajan,R.;Zewert,T.;Gray,H.;Boxer,S.Science1989,243,69. [81] Churg,A.K.;Warshel,A.Biochemistry1986,25,1675,PMID:3011070. [82] Gennis,R.B.Proc.Natl.Acad.Sci.U.S.A.1998,95,12747. [83] Isom,D.G.;Cannon,B.R.;Castaneda,C.A.;Robinson,A.;etal.,Proc.Natl.Acad.Sci.U.S.A.2008,105,17784. [84] Isom,D.G.;Castaneda,C.A.;Cannon,B.R.;E,B.G.-m.Proc.Natl.Acad.Sci.U.S.A.2011,108,5260. [85] Luecke,H.;Lanyi,J.K.;Rees,D.C.MembraneProteins;AcademicPress,2003;Vol.63;pp111. [86] Parson,W.W.;Chu,Z.-T.;Warshel,A.BiochimBiophysActaBioenerg1990,1017,251. 104

PAGE 105

[87] Li,Y.K.;Kuliopulos,A.;Mildvan,A.S.;Talalay,P.Biochemistry1993,32,1816,PMID:8439542. [88] Bogan,A.A.;Thorn,K.S.J.Mol.Biol.1998,280,1. [89] Adelroth,P.;Ek,M.S.;Mitchell,D.M.;Gennis,R.B.;Brzezinski,P.Biochemistry1997,36,13824,PMID:9374859. [90] Luecke,H.;Richter,H.-T.;Lanyi,J.K.Science1998,280,1934. [91] Denisov,V.P.;Schlessman,J.L.;Garca-MorenoE,B.;Halle,B.Biophys.J2004,87,3982,PMID:15377517. [92] Karp,D.A.;Stahley,M.R.;Garca-Moreno,B.Biochemistry2010,49,4138,PMID:20329780. [93] Mehler,E.L.;Fuxreiter,M.;Simon,I.;GarciaMorenoE,B.Proteins:Struct,Funct,Bioinf2002,48,283. [94] Stites,W.E.;Gittis,A.G.;Lattman,E.E.;Shortle,D.J.Mol.Biol1991,221,7,PMID:1920420. [95] Castaneda,C.A.;Fitch,C.A.;Majumdar,A.;Khangulov,V.;Schlessman,J.L.;Garca-Moreno,B.E.Proteins.2009,77,570,PMID:19533744. [96] Isom,D.G.;Castaneda,C.A.;Cannon,B.R.;Velu,P.D.;Garca-MorenoE.,B.Proc.Natl.Acad.Sci.U.S.A2010,107,16096,PMID:20798341PMCID:2941338. [97] Karp,D.A.;Gittis,A.G.;Stahley,M.R.;Fitch,C.a.;Stites,W.E.;Garca-MorenoE,B.;Garca-MorenoE.,B.Biophys.J.2007,92,2041. [98] Chimenti,M.S.;Khangulov,V.S.;Robinson,A.C.;Heroux,A.;Majumdar,A.;Schlessman,J.L.;Garca-Moreno,B.Structure2012,20,1071,PMID:22632835. [99] Dwyer,J.Biophys.J.2000,79,1610. [100] Garca-Moreno,B.;Dwyer,J.J.;Gittis,A.G.;Lattman,E.E.;Spencer,D.S.;Stites,W.E.Biophys.Chem1997,64,211,PMID:9127946. [101] Onufriev,A.;Bashford,D.;Case,D.A.J.Phys.Chem.B2000,104,3712. [102] Tsui,V.;Case,D.A.Biopolymers2000,56,275,PMID:11754341. [103] Onufriev,A.;Bashford,D.;Case,D.A.Proteins2004,55,383,PMID:15048829. [104] Damjanovic,A.;Garca-Moreno,B.;Lattman,E.E.;Garca,A.E.Proteins2005,60,433,PMID:15971206. 105

PAGE 106

[105] Fitch,C.Biophys.J.2002,82,3289. [106] Wu,X.;Brooks,B.R.Chem.Phys.Lett.2003,381,512. [107] Sindhikara,D.J.;Kim,S.;Voter,A.F.;Roitberg,A.E.J.Chem.TheoryComput.2009,5,1624. [108] Sosa,C.P.;Hewitt,T.;Lee,M.R.;Case,D.A.J.Mol.Struct.Theochem2001,549,193. [109] Sreerama,N.;Woody,R.W.Anal.Biochem.2000,287,252. [110] Sreerama,N.;Woody,R.W.Anal.Biochem.2004,383,318. [111] Alexov,E.;Mehler,E.L.;Baker,N.;M.Baptista,A.;Huang,Y.;Milletti,F.;ErikNielsen,J.;Farrell,D.;Carstensen,T.;Olsson,M.H.M.;Shen,J.K.;Warwicker,J.;Williams,S.;Word,J.M.Proteins:Struct.,Funct.,Bioinf.2011,79,3260. [112] Tanford,C.;Kirkwood,J.G.J.Am.Chem.Soc.1957,79,5333. [113] Bashford,D.;Karplus,M.Biochemistry1990,29,10219. [114] Warshel,A.;Russell,S.T.Q.Rev.Biophys.1984,17,283. [115] Baptista,A.M.J.Chem.Phys.2002,116,7766. [116] DiRusso,N.V.;Estrin,D.A.;Mart,M.A.;Roitberg,A.E.PLoSComputBiol2012,8,e1002761. [117] Bierzynski,A.;Kim,P.S.;Baldwin,R.L.Proc.Natl.Acad.Sci.U.S.A.1982,79,2470. [118] Shoemaker,K.R.;Kim,P.S.;Brems,D.N.;Marqusee,S.;York,E.J.;Chaiken,I.M.;Stewart,J.M.;Baldwin,R.L.Proc.Natl.Acad.Sci.U.S.A.1985,82,2349. [119] Schaefer,M.;VanVlijmen,H.W.;Karplus,M.Adv.ProteinChem.1998,51,1. [120] Demchuk,E.;Genick,U.K.;Woo,T.T.;Getzoff,E.D.;Bashford,D.Biochemistry2000,39,1100. [121] Dillet,V.;Dyson,H.J.;Bashford,D.Biochemistry1998,37,10298. [122] Harris,T.K.;Turner,G.J.IUBMBlife2002,53,85. [123] Antosiewicz,J.;Briggs,J.M.;McCammon,J.A.Eur.Biophys.J.1996,24,137. [124] Hunenberger,P.H.;Helms,V.;Narayana,N.;Taylor,S.S.;McCammon,J.A.Biochemistry1999,38,2358. 106

PAGE 107

[125] Hill,T.L.J.Am.Chem.Soc.1956,78,1577. [126] Warshel,A.Nature1987,330,15. [127] Mongan,J.;Case,D.A.;McCammon,J.A.J.Comput.Chem.2004,25,2038. [128] Baptista,A.M.;Martel,P.J.;Petersen,S.B.Proteins1997,27,523. [129] Borjesson,U.;Hunenberger,P.H.J.Chem.Phys.2001,114,9706. [130] Khandogin,J.;Brooks,C.L.Biophys.J.2005,89,141. [131] Khandogin,J.;Brooks,C.L.Biochemistry2006,45,9363. [132] Khandogin,J.;Brooks,C.L.Proc.Natl.Acad.Sci.U.S.A.2007,104,16880. [133] Khandogin,J.;Chen,J.H.;Brooks,C.L.Proc.Natl.Acad.Sci.U.S.A.2006,103,18546. [134] Khandogin,J.;Raleigh,D.P.;Brooks,C.L.J.Am.Chem.Soc.2007,129,3056. [135] Mertz,J.E.;Pettitt,B.M.Int.J.Supercomput.Ap.1994,8,47. [136] Borjesson,U.;Hunenberger,P.H.J.Phys.Chem.B2004,108,13551. [137] Baptista,A.M.;Teixeira,V.H.;Soares,C.M.J.Chem.Phys.2002,117,4184. [138] Machuqueiro,M.;Baptista,A.M.Biophys.J.2006,92,1836. [139] Machuqueiro,M.;Baptista,A.M.Proteins:Struct.,Funct.,Bioinf.2008,72,289. [140] Machuqueiro,M.;Baptista,A.M.J.Am.Chem.Soc.2009,131,12586. [141] Burgi,R.;Kollman,P.a.;VanGunsteren,W.F.Proteins2002,47,469. [142] Dlugosz,M.;Antosiewicz,J.M.Chem.Phys.2004,302,161. [143] Dlugosz,M.;Antosiewicz,J.M.J.Phys.Chem.B2005,109,13777. [144] Dlugosz,M.;Antosiewicz,J.M.J.Phys.:Condens.Matter2005,17,S1607. [145] Dlugosz,M.;Antosiewicz,J.M.;Robertson,A.D.Phys.Rev.E2004,69,21915. [146] Walczak,A.M.;Antosiewicz,J.M.Phys.Rev.E2002,66,051911. [147] Lee,M.S.;Salsbury,F.R.;Brooks,C.L.Proteins:Struct.,Funct.,Bioinf.2004,56,738. 107

PAGE 108

[148] Machuqueiro,M.;Baptista,A.M.J.Phys.Chem.B2006,110,2927. [149] Johnson,M.L.,Brand,L.,Eds.MethodsinEnzimology;AcademicPress:SanDiego,Burlington,London,2009;Vol.467. [150] Warwicker,J.ProteinSci.2004,13,2793. [151] Barth,P.;Alber,T.;Harbury,P.B.Proc.Natl.Acad.Sci.U.S.A.2007,104,4898. [152] Li,H.;Fajer,M.;Yang,W.J.Chem.Phys.2007,126,024106. [153] Zheng,L.Q.;Chen,M.G.;Yang,W.J.Chem.Phys.2009,130,234105. [154] Lyubartsev,A.P.;Martsinovski,A.A.;Shevkunov,S.V.;VorontsovVelyaminov,P.N.J.Chem.Phys.1992,96,1776. [155] Berg,B.A.;Neuhaus,T.Phys.Lett.B1991,267,249. [156] Laio,A.;Parrinello,M.Proc.Natl.Acad.Sci.U.S.A.2002,99,12562,PMID:12271136PMCID:130499. [157] Zheng,L.;Chen,M.;Yang,W.Proc.Natl.Acad.Sci.U.S.A2008,105,20227. [158] Wallace,J.A.;Shen,J.K.J.Chem.TheoryComput.2011,7,2617. [159] Wallace,J.A.;Wang,Y.;Shi,C.;Pastoor,K.J.;Nguyen,B.;Xia,K.;Shen,J.K.Proteins:Struct.,Funct.,Bioinf.2011,79,3364. [160] Meng,Y.;Roitberg,A.E.J.Chem.TheoryComput.2010,6,1401. [161] Mongan,J.;Case,D.A.Curr.Opin.Struct.Biol.2005,15,157. [162] Williams,D.H.;Stephens,E.;O'Brien,D.P.;Zhou,M.J.Chem.Phys.2004,43,6596. [163] Sindhikara,D.;Meng,Y.;Roitberg,A.E.J.Chem.Phys.2008,128,24103. [164] Earl,D.J.;Deem,M.W.Phys.Chem.Chem.Phys.2005,7,3910. [165] Faller,R.;Yan,Q.;dePablo,J.J.Biopolymers.2002,116,5419. [166] Fenwick,M.K.;Escobedo,F.a.J.Chem.Phys.2003,119,11998. [167] Chodera,J.D.;Shirts,M.R.J.Chem.Phys.2011,135,194110. [168] Ballard,A.J.;Jarzynski,C.Proc.Natl.Acad.Sci.U.S.A.2009,106,12224. [169] Kastner,J.WIREs.Comput.Mol.Sci.2011,1,932. 108

PAGE 109

[170] Bartels,C.Chem.Phys.Lett.2000,331,446. [171] Neumann,J.V.Proc.Natl.Acad.Sci.U.S.A.1932,18,70. [172] Hritz,J.;Oostenbrink,C.J.Chem.Phys.2008,128,144121. [173] Okur,A.;Wickstrom,L.;Layten,M.;Geney,R.;Song,K.;Hornak,V.;Simmerling,C.J.Chem.TheoryComput.2006,2,420. [174] Cheng,X.;Cui,G.;Hornak,V.;Simmerling,C.J.Phys.Chem.B2005,109,8220. [175] Zhou,R.;Berne,B.J.Proc.Natl.Acad.Sci.U.S.A2002,99,12777. [176] Zhou,R.Proteins.2003,53,148. [177] Roitberg,A.E.;Okur,A.;Simmerling,C.J.Chem.TheoryComput.2007,111,2415. [178] Okur,A.;Roe,D.R.;Cui,G.L.;Hornak,V.;Simmerling,C.J.Chem.TheoryComput.2007,3,557. [179] Rick,S.W.J.Chem.Phys.2007,126,054102. [180] Li,X.;O'Brien,C.P.;Collier,G.;Vellore,N.A.;Wang,F.;Latour,R.A.;Bruce,D.A.;Stuart,S.J.J.Chem.Phys.2007,127,164116. [181] Kamberaj,H.;vanderVaart,A.J.Chem.Phys.2009,130,074906. [182] Kone,A.;Kofke,D.A.J.Chem.Phys.2005,122,206101. [183] Kofke,D.J.Chem.Phys.2002,117,6911. [184] Kofke,D.A.J.Chem.Phys.2004,121,1167. [185] Predescu,C.;Predescu,M.;Ciobanu,C.V.J.Chem.Phys.2004,120,4119. [186] Sanbonmatsu,K.;Garcia,A.Proteins.2002,46,225. [187] Rathore,N.;Chopra,M.;dePablo,J.J.J.Chem.Phys.2005,122,024111. [188] Katzgraber,H.G.;Trebst,S.;Huse,D.a.;Troyer,M.J.Stat.Mech.Theor.Exp.2006,2006,03018. [189] Hritz,J.;Oostenbrink,C.J.Chem.Phys.2007,127,204104. [190] Trebst,S.;Troyer,M.;Hansmann,U.H.E.J.Chem.Phys.2006,124,174903. [191] Shenfeld,D.;Xu,H.;Eastwood,M.;Dror,R.;Shaw,D.Phys.Rev.E2009,80,046705. 109

PAGE 110

[192] Crooks,G.Phys.Rev.Lett.2007,99,100602. [193] Murata,K.;Sugita,Y.;Okamoto,Y.J.Theor.Comput.Chem2005,04,411. [194] Barzilai,J.;Borwein,J.M.J.Inst.Math.ItsApp.1988,8,141. [195] Yuan,Y.-x.AMS/IPStud.Adv.Math.2008,42,785. [196] Jorgensen,W.L.;Chandrasekhar,J.;Madura,J.D.;Impey,R.W.;Klein,M.L.J.Chem.Phys.1983,79,926. [197] Case,D.;Darden,T.;Cheatham,T.;Simmerling,C.;Wang,J.;etal.,AMBER12.2012. [198] Wang,J.;Wolf,R.M.;Caldwell,J.W.;Kollman,P.a.;Case,D.a.J.Comput.Chem.2004,25,1157. [199] Sindhikara,D.;Emerson,D.;Roitberg,A.J.Chem.TheoryComput.2010,6,2804. [200] Atkeson,C.;Moore,A.;Schaal,S.Artif.Intell.Rev.1997,11. [201] Burachik,R.;GranaDrummond,L.M.;Iusem,A.;Svaiter,B.F.Optimization1995,32,137. 110

PAGE 111

BIOGRAPHICALSKETCH DanialSabriDashtiwasbornin,Tehran,Iran.HewenttotheUniversityofKashanmajoringinphysicsandgraduatedwithabachelor'sdegreein2003.Then,hewenttoSharifUniversityofTechnologywhere,hereceivedhismaster'sdegreeinphysicsin2006.Afternishinghismaster,hemovedtoGainesville,Floridain2007.HeenteredtheDepartmentofPhysicsoftheUniversityofFloridatopursueaPh.Ddegreeincomputationalbiophysics.HereceivedhisPh.D.fromtheUniversityofFloridainthesummerof2013. 111



PAGE 1

1 Rightslink¨ by Copyright Clearance Center May 9, 2013 4:50:34 PM https://s100.copyright.com/AppDispatchServlet Title : Computing Alchemical Free Energy Differences with Hamiltonian Replica Exchange Molecular Dynamics ( H REMD ) Simulations Author : Yilin Meng Danial Sabri Dashti and Adrian E Roitberg Publication : Journal of Chemical Theory and Computation Publisher : American Chemical Society Date : Sep 1 2011 Copyright 2011 American Chemical Society User ID Password Enable Auto Login Forgot Password / User ID ? If you re a copyright com user you can login to RightsLink using your copyright com credentials Already a RightsLink user or want to learn more ? PERMISSION / LICENSE IS GRANTED FOR YOUR ORDER AT NO CHARGE This type of permission / license instead of the standard Terms & Conditions is sent to you because no fee is being charged for your order Please note the following : Permission is granted for your request in both print and electronic formats and translations If figures and / or tables were requested they may be adapted or used in part Please print this page for your records and send a copy of it to your publisher / graduate school Appropriate credit for the requested material should be given as follows : Reprinted ( adapted ) with permission from ( COMPLETE REFERENCE CITATION ) Copyright ( YEAR ) American Chemical Society Insert appropriate information in place of the capitalized words One time permission is granted only for the use specified in your request No additional uses are granted ( such as derivative works or other editions ) For any other uses please submit a new request Copyright 2013 Copyright Clearance Center Inc All Rights Reserved Privacy statement Comments ? We would like to hear from you E mail us at customercare @ copyright com