TRANSFERABILITY IN AB INITIO QUANTUM CHEMISTRY: CORRELATED ELECTRONIC STRUCTURE THEORY FOR LARGE MOLECULES

By THOMAS FRANK HUGHES

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 2008

© 2008 Thomas Frank Hughes

To my wife, Kelly C. Hughes, whom I love for being an amazing human being with an infectious personality. And to my parents, Thomas James and Rose Ann Hughes; and my two older sisters, Erica Christine and Kelly Marie Hughes, for their inspiration.

ACKNOWLEDGMENTS

Special recognition needs to be given to my advisor, Rodney J. Bartlett, for his great mentorship, as well as to my committee for their supervision and assistance. Special thanks are given to the Bartlett group for their help with the ACES suite of programs and for very many rich and stimulating discussions. Additionally, I owe great thanks to many students, postdocs, staff, and faculty at QTP for their friendship and to my alma mater, the University of North Florida, for instilling intellectual curiosity. Lastly, on behalf of the Bartlett group I would also like to thank the National Science Foundation, Air Force Office of Scientific Research, and the Army Research Office for funding.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 QUANTUM CHEMISTRY
  1.1 Ab Initio Electronic Structure Methods
    1.1.1 Hartree-Fock
    1.1.2 Coupled-Cluster
  1.2 Approximate Correlated Independent Particle Methods

2 LOCAL APPROXIMATIONS FOR THE GROUND STATE
  2.1 Natural Localized Molecular Orbitals
  2.2 Fragment Based Approaches
    2.2.1 Fragment Molecular Orbitals
    2.2.2 Hartree-Fock with Fragment Molecular Orbitals
    2.2.3 Electron Correlation with Fragment Molecular Orbitals
  2.3 Parallelization in the ACES Programs
    2.3.1 Super Instruction Assembly Language
    2.3.2 Super Instruction Architecture
    2.3.3 Integral Direct Computation
    2.3.4 Automated Dual-Layer Parallelization Strategy
  2.4 Localized Coupled-Cluster
    2.4.1 Localized T3 Contributions
    2.4.2 Avoiding T3 Storage
    2.4.3 Non-Local Interactions in Hybrid Methods
  2.5 Implementation
  2.6 Applications
    2.6.1 Motivation
    2.6.2 Poly-Glycine
    2.6.3 Met-Enkephalin

3 LOCAL APPROXIMATIONS FOR IONIZED STATES
  3.1 Methods for Calculating Excited States
  3.2 Approximate Methods for Calculating Excited States
    3.2.1 Localized Methods
    3.2.2 Fragmented Methods
  3.3 Chromophores
    3.3.1 Compact Representations
    3.3.2 Locating Chromophores with Lower Level Wavefunctions
    3.3.3 Interacting Chromophores and Charge-Transfer
  3.4 Methods for Calculating Ionized States
    3.4.1 IP-EOM-CC Wavefunction
    3.4.2 Eigenvalue Equation in H̄
    3.4.3 Left-Hand Eigenstate of H̄
  3.5 Localized Methods for Calculating Ionized States
    3.5.1 Locating Chromophores: Bond Ionization Energies
    3.5.2 Locating Chromophores: E Bond Ionization Energies
    3.5.3 Locating Chromophores: IP-EOM-MP2 Density Matrices
    3.5.4 Electrostatic Potentials
  3.6 Implementation
  3.7 Applications
    3.7.1 Motivation
    3.7.2 Example Using Bond Ionization Energies
    3.7.3 Example Using IP-EOM-MP2 Density Matrix
    3.7.4 (H2O)13
    3.7.5 Gly5 Helix
    3.7.6 (H2O)28

4 LOCAL APPROXIMATIONS FOR MOLECULAR PROPERTIES
  4.1 Effective Hamiltonian Dynamic Polarizabilities
    4.1.1 Size-Extensivity
    4.1.2 Linear Response
  4.2 Localized Effective Hamiltonian Dynamic Polarizabilities
    4.2.1 Localized De-Excitation Operators
    4.2.2 Effective Bond Dynamic Polarizabilities
    4.2.3 Effective Bond Response Matrices via Löwdin Partitioning
  4.3 Dispersion Interactions
    4.3.1 Casimir-Polder Equation
    4.3.2 Corrective Dispersion Potential
    4.3.3 Anisotropic Contributions
  4.4 Implementation
  4.5 Applications
    4.5.1 Motivation
    4.5.2 Tryptophan Dynamic Polarizabilities
    4.5.3 Poly-Glycine Dynamic Polarizabilities
    4.5.4 Alkane Dispersion Coefficients
    4.5.5 Di-Glycine Dispersion Coefficients

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

2-1 Transferability of bond energies from different molecules.
2-2 NLSCC correlation energies for polyglycine.
2-3 CCSD bond correlation energies for non-glycine residues.
2-4 Summary of NLSCCSD calculations on met-enkephalin.
3-1 Partial atomic charges for electronic states of H2O.
3-2 Partial atomic charges for electronic states of (H2O)2.
3-3 IP-EOM-CCSD valence bond ionization energies.
3-4 The largest bond ionization energies for (H2O)8.
3-5 IP-EOM-CCSD valence ionization energies of (H2O)28.
4-1 Static bond polarizabilities for tryptophan.
4-2 Frequency dependent NLSCCSD polarizabilities for polyglycine.
4-3 Dispersion coefficients per methylene unit cell.
4-4 Bond dispersion coefficients for diglycine.

LIST OF FIGURES

2-1 Increasing sparsity of the CCSDT-3 amplitudes with system size.
2-2 Sparsity of two-electron integrals in a localized virtual space.
2-3 Transferability of local amplitudes from size-(in)extensive methods.
2-4 Transferability of |T2| and |T3| from CCSDT-3 in NLMOs.
2-5 Transferability of CCSD and CCSDT-3 bond energies in NLMOs.
2-6 Effect of triple excitations on calculated bond energies.
2-7 Definition of the regions for the NLSCC calculation of polyglycine.
2-8 NLSCC correlation energy per unit cell of polyglycine.
2-9 Model of the three-dimensional pentapeptide met-enkephalin.
2-10 Model showing the side-chains of non-glycine residues.
3-1 Ball and stick models of water dimer geometries.
3-2 Transferability of the ionized state projection of H̄.
3-3 Transferability of a valence ionized state wavefunction.
3-4 Transferability of IP-EOM-CCSD bond ionization energies.
3-5 Ball and stick model of quasi-linear (H2O)8.
3-6 Total core ionization energies of growing water clusters.
3-7 Bond ionization energy contributions of growing water clusters.
3-8 Bond ionization energies for (H2O)8.
3-9 IP-EOM-CCSD ionization energies from lower level density matrices.
3-10 Ionization chromophores in (H2O)13.
3-11 Bond ionization energies for the (H2O)5 chromophore.
3-12 Ionization chromophores in gly5 helix.
3-13 Bond ionization energies for the gly1 and gly2 chromophores.
3-14 Ball and stick models of (H2O)28.
4-1 Transferability of the effective Hamiltonian.
4-2 Transferability of CCSD bond polarizabilities.
4-3 Frequency dependent bond polarizabilities and associated poles.
4-4 Frequency dependent bond polarizabilities for ethanol.
4-5 Frequency dependent bond polarizabilities for ethylamine.
4-6 Ball and stick models of tryptophan and diglycine.
4-7 Frequency dependent bond polarizabilities for tryptophan.
4-8 Transferability of dispersion coefficients for alkanes.
4-9 Convergence of diglycine dispersion coefficients with gridpoints.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

TRANSFERABILITY IN AB INITIO QUANTUM CHEMISTRY: CORRELATED ELECTRONIC STRUCTURE THEORY FOR LARGE MOLECULES

By Thomas Frank Hughes

December 2008

Chair: Rodney J. Bartlett
Major: Chemistry

The natural linear-scaled coupled-cluster [N. Flocke and R. J. Bartlett, J. Chem. Phys. 121, 10935 (2004)] methodology is extended to more advanced electronic structure problems. This method exploits the extensivity of the coupled-cluster wavefunction to represent it in terms of transferable electronic structure regions, typically incorporating natural localized molecular orbitals, thereby providing a systematically improvable method for large molecules. Correlated accuracy is built in by allowing approximately fixed electronic structure regions to be perturbed through classifying interactions among regions in a molecule in terms of important interregion excitations and deexcitations.
Subject to its representation in terms of these localized orbitals, the correlated effective Hamiltonian from equation-of-motion coupled-cluster theory, which in principle offers an exact solution of a chemical system, is transferable, meaning that for a large target system the effective Hamiltonian is calculable from smaller effective Hamiltonians. It is then possible to extract bond dependent quantities, for example ground, excited, ionized, etc. energies, as well as response properties like dynamic polarizabilities and dispersion coefficients, and also transferable wavefunctions and density matrices. The extent to which bonds or functional groups, as opposed to atoms, which are one order of interaction removed from covalent bonds, can be combined to provide the properties of a large target molecule is investigated. Given that the noncanonical localized orbitals are not energy eigenstates, the theory lacks a Koopmans' theorem, thereby complicating the treatment of frequency-dependent theories requiring chromophoric regions to localize excited or ionized state wavefunctions. Less transferable long-range weak or charge-transfer interactions are represented using electrostatic potentials or other corrective long-range potentials, some built from a number of local regions as opposed to a single region. The natural linear-scaled methods reproduce conventional results for amenable examples. More realistic applications include water clusters, tryptophan, polyglycine, and met-enkephalin.

CHAPTER 1
QUANTUM CHEMISTRY

Modeling chemical reactions poses a great challenge to theoretical methods because of the accuracy and time required in treating a large number of collective quantum mechanical (QM) and statistical mechanical degrees of freedom.
Development of approximate electronic structure methods aims to take advantage of underlying simplicities via cancellations which are unfortunately difficult to track without resorting to many-body ab initio methods. These latter methods are based on first principles and thus have the rigor needed to fully describe equilibrium and nonequilibrium regions of the potential energy surfaces (PESs) of ground, excited, ionized, or attached electronic states and their spectra. With these advances it is possible to move beyond electronic structure theory and predict chemistry, for example reaction dynamics. Although gas phase results often approximate those of other phases, in general theories which bridge the gap between the microscopic properties of particles and the macroscopic properties of a large system are needed. Using density functional theory (DFT) with the B3LYP exchange-correlation (XC) functional remains the quantum chemical method of choice for treating large molecules because of its efficiency and its approximate inclusion of dynamic electron correlation. However, there are still many advances that need to be made in developing a density based theory which includes a priori the many-body cancellations needed to make an exact linear-scaled electronic structure method.

1.1 Ab Initio Electronic Structure Methods

1.1.1 Hartree-Fock

The focus of quantum chemistry is to solve the Schrödinger equation,

$$H|\Psi\rangle = E|\Psi\rangle, \tag{1-1}$$

for the molecular Hamiltonian,

$$H = -\sum_{\alpha} \frac{1}{2 m_\alpha} \nabla_\alpha^2 - \sum_{i} \frac{1}{2} \nabla_i^2 - \sum_{i\alpha} \frac{Z_\alpha}{r_{i\alpha}} + \sum_{i<j} \frac{1}{r_{ij}} + \sum_{\alpha<\beta} \frac{Z_\alpha Z_\beta}{r_{\alpha\beta}}.$$

[...]

The Hartree-Fock (HF) approximation, which is central to the treatment of many-electron wavefunctions, assumes a trial $|\Psi_{elec}\rangle$ in the form of a single $N$-electron Slater determinant,

$$|\Phi\rangle = \mathcal{A}|\phi_1 \phi_2 \cdots \phi_N\rangle = \frac{1}{(N!)^{1/2}} \begin{vmatrix} \phi_1(1) & \phi_2(1) & \cdots & \phi_N(1) \\ \phi_1(2) & \phi_2(2) & \cdots & \phi_N(2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_1(N) & \phi_2(N) & \cdots & \phi_N(N) \end{vmatrix}, \tag{1-6}$$

with $\mathcal{A}$ an antisymmetrization operator necessary to introduce the proper antisymmetry for fermions and where the electronic subscript has been dropped for simplicity. The $\{\phi_i\}$ are a set of single-particle molecular orbitals which are variationally optimized to the HF molecular orbitals by minimizing the single-determinant energy expression,

$$E[\Phi_{HF}] = \langle \Phi_{HF}|H|\Phi_{HF}\rangle = \min_{\Phi} \langle \Phi|H|\Phi\rangle, \tag{1-7}$$

under the orthonormality constraint $\langle \phi_i|\phi_j\rangle = \delta_{ij}$. The variation, $\delta E$, in the single-determinant energy expression,

$$E[\Phi] = \langle \Phi|H|\Phi\rangle = \sum_{i} \langle \phi_i|h|\phi_i\rangle + \frac{1}{2} \sum_{ij} \langle \phi_i \phi_j||\phi_i \phi_j\rangle, \tag{1-8}$$

is then minimized. Note that $h$ denotes the one-particle core Hamiltonian, which contains the first two terms of $H_{elec}$, and that the antisymmetrized two-electron integral in Dirac notation is $\langle \phi_i \phi_j||\phi_i \phi_j\rangle = \langle \phi_i \phi_j|\phi_i \phi_j\rangle - \langle \phi_i \phi_j|\phi_j \phi_i\rangle$. These integrals are defined as the classical Coulomb,

$$\langle \phi_i \phi_j|\phi_i \phi_j\rangle = \langle \phi_i|J_j|\phi_i\rangle = \langle \phi_i| \int d\vec{r}_2 \, \frac{\phi_j^*(\vec{r}_2)\,\phi_j(\vec{r}_2)}{|\vec{r}_1 - \vec{r}_2|} |\phi_i\rangle, \tag{1-9}$$

and a purely quantum, nonlocal exact exchange,

$$\langle \phi_i \phi_j|\phi_j \phi_i\rangle = \langle \phi_i|K_j|\phi_i\rangle = \langle \phi_i| \int d\vec{r}_2 \, \frac{\phi_j^*(\vec{r}_2)\,P_{12}\,\phi_j(\vec{r}_2)}{|\vec{r}_1 - \vec{r}_2|} |\phi_i\rangle, \tag{1-10}$$

contribution, where $P_{12}$ exchanges particles one and two. Note that these effective one-particle operators depend upon the orbitals themselves and thus must be solved iteratively, thereby giving the self-consistent field method. Variational optimization gives the following equation for the effective one-particle Fock operator, $f$,

$$f|\phi_i\rangle = \Big[ h + \sum_{j} (J_j - K_j) \Big] |\phi_i\rangle = \sum_{j} \epsilon_{ij} |\phi_j\rangle, \tag{1-11}$$

where $\epsilon_{ij}$ is a matrix of Lagrange multipliers in the noncanonical HF basis.
The unitary basis which diagonalizes the Hermitian matrix $\epsilon_{ij}$ is called the canonical HF basis and gives a pseudo-eigenvalue equation for the Fock operator,

$$f|\phi_i\rangle = \epsilon_i |\phi_i\rangle. \tag{1-12}$$

By expanding the single-particle molecular orbitals in an orthonormal basis of atom-centered Gaussians, $\{|\chi_\mu\rangle\}$,

$$|\phi_i\rangle = \sum_{\mu} C_{\mu i} |\chi_\mu\rangle, \tag{1-13}$$

the following pseudo-eigenvalue equation for the Fock matrix is obtained,

$$\mathbf{F}\mathbf{C} = \mathbf{C}\boldsymbol{\epsilon}, \tag{1-14}$$

where,

$$F_{\mu\nu} = H_{\mu\nu} + \sum_{\lambda\sigma} P_{\lambda\sigma} \Big[ (\mu\nu|\lambda\sigma) - \frac{1}{2} (\mu\lambda|\nu\sigma) \Big]. \tag{1-15}$$

The molecular orbital coefficients, $\mathbf{C}$, provide the density (bond-order) matrix, $\mathbf{P}$,

$$\mathbf{P} = \mathbf{C}\mathbf{n}\mathbf{C}^{\dagger}, \tag{1-16}$$

where $\mathbf{n}$ is the occupation matrix. The self-consistent field energy is then found by tracing with the total effective one-particle operator,

$$E = \frac{1}{2} \mathrm{Tr}[\mathbf{P}(\mathbf{H} + \mathbf{F})]. \tag{1-17}$$

The matrix of molecular orbital coefficients, $\mathbf{C}$, has dimensions equivalent to the total number of basis functions. When this number is larger than the number of electrons, arbitrary virtual or unoccupied molecular orbitals, $|\phi_a\rangle$, are obtained in addition to the occupied molecular orbitals, $|\phi_i\rangle$.

1.1.2 Coupled-Cluster

The correlation energy of a system is defined as the difference between the exact and HF energies,

$$E_{corr} = E_{exact} - E_{HF}, \tag{1-18}$$

where $E_{exact}$ is determined using full configuration interaction (FCI). The CI method introduces excited determinants into the wavefunction by means of an excitation operator, $C$, which allows the electrons to escape each other's influence via one-, two-, etc. body electron interactions,

$$|\Psi_{CI}\rangle = C|\Phi_{HF}\rangle = (1 + C_1 + C_2 + C_3 + \cdots)|\Phi_{HF}\rangle, \tag{1-19}$$

with,

$$C_n = \frac{1}{(n!)^2} \sum_{\substack{ij\cdots \\ ab\cdots}} c_{ij\cdots}^{ab\cdots} \, a^{\dagger} b^{\dagger} \cdots j\, i, \tag{1-20}$$

where $\{c\}$ are the CI amplitudes.
The second-quantized creation, $a^{\dagger}$, and annihilation, $i$, operators excite electrons,

$$|\Phi_i^a\rangle = a^{\dagger} i |\Phi_{HF}\rangle = a^{\dagger} i \, \mathcal{A}|\phi_1 \phi_2 \cdots \phi_i \cdots\rangle = \mathcal{A}|\phi_1 \phi_2 \cdots \phi_a \cdots\rangle. \tag{1-21}$$

Because of the extremely large number of permuted determinants in FCI, in practice it is only computationally feasible for systems with a small number of electrons, spurring the need for truncated CI methods which include only up through single, double, triple, etc. determinants. The CI amplitudes and energy are variationally determined as eigenvectors and eigenvalues by diagonalizing the Hamiltonian in the basis of excited determinants. In practice the diagonalization is done using iterative methods.

The coupled-cluster (CC) method offers a cumulant decomposition of the CI excitation operator by means of an exponential wave-operator ansatz,

$$|\Psi_{CC}\rangle = e^{T}|\Phi_{HF}\rangle = \Big(1 + T + \frac{1}{2!} T^2 + \frac{1}{3!} T^3 + \cdots\Big)|\Phi_{HF}\rangle, \tag{1-22}$$

where $T = T_1 + T_2 + T_3 + \cdots + T_N$ with $N$ the number of electrons,

$$T_n = \frac{1}{(n!)^2} \sum_{\substack{ij\cdots \\ ab\cdots}} t_{ij\cdots}^{ab\cdots} \, a^{\dagger} b^{\dagger} \cdots j\, i. \tag{1-23}$$

The relationship between the CI and CC expansion coefficients,

$$C_1 = T_1, \qquad C_2 = T_2 + \frac{1}{2} T_1^2, \qquad C_3 = T_3 + T_1 T_2 + \frac{1}{6} T_1^3, \tag{1-24}$$

shows that CC is equivalent to FCI in the limit of $N$-particle excitations. By virtue of the exponential ansatz, truncated CC theory (for example CCSD) approaches this limit more effectively than truncated CI theory (for example CISD) because CCSD theory contains not only all of the single and double excitation manifold but also some of the triple (disconnected $T_1 T_2$), quadruple (disconnected $T_2 T_2$), etc. manifolds. Because the CC amplitude equations,

$$\langle Q|\bar{H}|\Phi_{HF}\rangle = 0, \tag{1-25}$$

are coupled and nonlinear, and thus without an eigenvalue equation, they cannot be solved using matrix diagonalization; instead they are solved iteratively. Here the $Q$ space can be singly, doubly, triply, etc. excited determinants and $\bar{H} = e^{-T} H e^{T} = (H e^{T})_c$, where $c$ means connectedness in the usual context of CC theory. The initial guesses for $T_1$ and $T_2$ are often determined from second-order perturbation theory (MP2),

$$t_i^{a(1)} = \frac{f_{ia}}{\epsilon_i - \epsilon_a}, \tag{1-26}$$

and

$$t_{ij}^{ab(1)} = \frac{\langle ij||ab\rangle}{\epsilon_i + \epsilon_j - \epsilon_a - \epsilon_b}. \tag{1-27}$$

The energy is similarly given by

$$\langle \Phi_{HF}|\bar{H}|\Phi_{HF}\rangle = E = \sum_{ia} f_{ia} t_i^a + \frac{1}{4} \sum_{ijab} t_{ij}^{ab} \langle ij||ab\rangle + \frac{1}{2} \sum_{ijab} t_i^a t_j^b \langle ij||ab\rangle. \tag{1-28}$$

Obviously other reference functions besides $|\Phi_{HF}\rangle$ can be used.

The highly accurate CC and CI methods are troubled by the task of including all possible permutations of one, two, three, etc. electrons among a large number of single-particle basis functions. As a result, in practice the excitation operator, $T$, is often truncated after the two-particle piece, $T = T_1 + T_2 + T_3 + \cdots \approx T_1 + T_2$, leading to singly (S) and doubly (D) excited determinants. Both the CCSD and CISD methods scale as $O(o^2 v^4)$, where $o$ and $v$ are respectively the number of occupied and virtual orbitals, but as previously mentioned only the exponential ansatz, unique to CC theory, additionally includes disconnected parts of triple and quadruple excitations. These disconnected triple and quadruple excitations, like $T_1 T_2$ and $T_2 T_2$ respectively, make truncated CC converge to the full CI (FCI) much more quickly than truncated CI, thus making CC the method of choice for high-level applications to medium-sized molecules. The remaining pieces of the triple excitation manifold are said to be connected and may be required for 1 kcal mol$^{-1}$ accuracy in the energy and similar accuracy in other properties. The exponential ansatz used in CC theory makes it a size-extensive method, which is a necessary condition for treating a system with a large number of electrons.
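The size-extensivity argument can be illustrated with a textbook model that is not part of this work: n non-interacting two-level units (minimal-basis H2-like molecules), each with a doubly excited determinant at energy 2Δ above the reference and a coupling K to it. For this model the CI-doubles correlation energy is Δ − sqrt(Δ² + nK²), whereas a size-extensive method such as CC with doubles gives n times the single-unit value. The sketch below is a minimal illustration under those assumptions; the function names and the numerical values of Δ and K are hypothetical.

```python
import math


def cid_correlation_energy(n_units: int, delta: float, k: float) -> float:
    """Lowest root of the CI-doubles matrix for n non-interacting two-level units.

    The reference couples (strength k) to n doubly excited determinants that
    lie 2*delta above it, so E solves E*(E - 2*delta) = n*k**2.
    """
    return delta - math.sqrt(delta ** 2 + n_units * k ** 2)


def extensive_correlation_energy(n_units: int, delta: float, k: float) -> float:
    """Size-extensive result: n times the exact single-unit correlation energy."""
    return n_units * cid_correlation_energy(1, delta, k)


if __name__ == "__main__":
    delta, k = 0.5, 0.1  # hypothetical model parameters (arbitrary units)
    for n in (1, 10, 100, 1000):
        cid = cid_correlation_energy(n, delta, k) / n
        ext = extensive_correlation_energy(n, delta, k) / n
        # The CID energy per unit shrinks with n; the extensive one stays fixed.
        print(f"n={n:5d}  CID per unit={cid: .6f}  extensive per unit={ext: .6f}")
```

Because the truncated CI energy grows only as the square root of n for large n, its correlation energy per unit tends to zero, while the extensive value stays constant.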
For large molecules, a size-inextensive linear ansatz, for example as in truncated CI, would give a vanishing correlation energy per electron [1, 2].

1.2 Approximate Correlated Independent Particle Methods

Many approximate independent particle methods containing model correlation potentials have been proposed to circumvent the cost of many-body correlation methods, as well as the underlying HF reference, in the treatment of large molecules. The semiempirical methods add a single-particle correlation potential to Eqn. 1-15,

$$\mathbf{F} = \mathbf{H} + \mathbf{J} - \mathbf{K} + \mathbf{V}_c, \tag{1-29}$$

but represent it only effectively,

$$\mathbf{F} = \mathbf{H}' + \mathbf{J}' - \mathbf{K}', \tag{1-30}$$

by means of empirical model forms for one- and two-electron integrals involving parameters fit to reproduce experiment or high-level theory [3]. Semiempirical theory is routinely used for large molecule calculations and has been shown to give qualitative descriptions. Semiempirical theory, as it follows from a parameterization of HF theory, is more of a means to an end than it is derivable from first principles. The latter marks the difference between semiempirical methods and DFT, which in principle targets Eqn. 1-29 from a density, as opposed to wavefunction, dependent perspective as given by the Hohenberg-Kohn theorems. Here $\mathbf{K}$ is replaced by an often local exchange potential, $V_x$, in the Kohn-Sham equation and many techniques have been studied to find $V_{xc} = V_x + V_c$. Despite being the most reliable electronic structure method for large molecules, DFT methods are not systematic with basis set in the same sense as wavefunction based methods. The often inaccurate forces, and representation of charge-transfer and weak interactions, obtained with these approximate Hamiltonians lead to incorrect conclusions about chemical reactivity. Additionally, DFT is lacking in its description of the bond breaking regime.
To remedy this, a systematic isolation of the failures of DFT is needed, perhaps by comparison with high-level, potentially multireference, CC methods. This provides the stimulus to develop faster and more accurate approximations which can systematically approach many-body methods.

CHAPTER 2
LOCAL APPROXIMATIONS FOR THE GROUND STATE

"By 1935 ... I felt that I had an essentially complete understanding of the nature of the chemical bond" Linus C. Pauling, Daedalus 99, 988 (1970).

"Attempts to regard a molecule as consisting of specific atomic or ionic units held together by discrete number of bonding electrons or electron-pairs are considered as more or less meaningless ..." Robert S. Mulliken, Physical Review 40, 55 (1932).

Despite the advances in supercomputing technologies and highly efficient or parallel quantum chemistry programs which utilize them, there is great need for methods which scale favorably with molecule size but are sufficiently accurate to allow for critical assessment of molecular events. These approximate methods are a means to push the limits in applying theoretical chemistry to large molecules as well as to guide the development of both conventional quantum chemistry programs and theories outside of electronic structure programs. One particularly attractive route toward accomplishing this is based on performing rigorous ab initio calculations on small regions of a molecule and then processing these regions to provide the properties of large target molecules in the least ambiguous way possible.

Chemistry is built upon the transferability of functional groups. To a reasonable approximation such functional groups retain some essential characteristics in any molecule. Yet the changes that occur when they are combined provide the enormous variations in chemistry and, while often modest in a relative sense, are critical in others.
Ab initio quantum chemistry, however, while providing quantitative results for the structure and spectra of molecules, does not offer a theory of the chemical bond or functional groups. Bonds are not among the observables of quantum mechanics. Hence, that kind of information can only be extracted a posteriori from a computed wavefunction or density and is then subject to a multitude of sometimes conflicting criteria. This failing of quantum mechanics prohibits the natural, a priori separation of a large molecule into smaller computational and conceptual groups that would fit together to provide the whole. If this quantum chemistry dilemma were removed, then just as synthetic chemists build molecules from adding and removing functional groups, quantum chemists could do much the same for electronic structure. In the limit of perfect transferability a given property would be completely characterized in terms of a simple set of universal functional groups. In practice, this is not possible because chemistry arises from the change in the charge distributions, and subsequent molecular geometries, as the functional groups are bonded to each other. The next step is to extract the functional group from an environment that allows for such changes in the charge distribution. Representing the properties of molecules in terms of constituent atoms has been the focus of some work [4, 5, 6, 7, 8, 9, 10, 11, 12, 13] but for much of chemistry functional groups would appear to be a superior choice because they include the effects of strong perturbations like covalent bonds. One focus of this work is to maximally incorporate fixed electronic structure regions, usually via localized orbitals, in the description of correlated wavefunctions.
The relaxation of these regions is accomplished via local correlation methods, which are characterized by allowing these fixed electronic structure regions to interact via interregion excitations. The reference function can be treated similarly; however, given the long-ranged electrostatics, a different strategy is employed.

2.1 Natural Localized Molecular Orbitals

Natural localized molecular orbitals (NLMOs) [14, 15] are used to introduce a scale which offers certain advantages to local correlation methods. One benefit of using NLMOs is that the virtual space is composed of localized orthogonal orbitals which are obtained with few convergence difficulties even for large, diffuse basis sets. In the NLMO search for $n$-centered orbitals, where $n$ is typically small, $n < N$ with $N$ the number of centers in the molecule, only small diagonalizations of the single-particle density matrix are required, ensuring that the NLMO determination is quite fast [15]. In contrast to other localization methods, which use an initial set of orbitals, the NLMO procedure is based on a single-particle density matrix, $\boldsymbol{\gamma} = \mathbf{C}^{\dagger}\mathbf{S}^{\dagger}\mathbf{P}\mathbf{S}\mathbf{C}$, with $\mathbf{C}$, $\mathbf{S}$, and $\mathbf{P}$ being the orbitals, overlap, and single-particle density matrix in atomic orbitals, which can be correlated or uncorrelated. For example, localized correlated orbitals may have certain properties which are more desirable in treating quantum regions that would otherwise have been quite large. These localized correlated orbitals carry information about correlations only effectively, due to the fact that they are still single-particle functions. This is to be contrasted with explicitly correlated orbitals, more appropriately called geminals, which are two-particle functions. Restricted as well as unrestricted NLMOs can be obtained from the corresponding restricted HF (RHF) or unrestricted HF (UHF) density matrices.
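As a small check of the basis transformation γ = C†S†PSC quoted above, the following sketch uses a two-function, H2-like minimal basis with real matrices. It assumes a hypothetical overlap value and symmetry-determined bonding and antibonding orbitals; it shows only the transformation of the atomic-orbital density matrix into an orthonormal orbital basis and is not the NLMO search itself, whose details are not given here.

```python
import math


def matmul(a, b):
    """Plain-Python product of two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a dense matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def orbital_basis_density(c, s, p):
    """gamma = C^T S P S C for real matrices (S is symmetric, so S^T = S)."""
    return matmul(transpose(c), matmul(s, matmul(p, matmul(s, c))))


overlap = 0.6  # hypothetical overlap between the two atomic orbitals
S = [[1.0, overlap], [overlap, 1.0]]

# Normalized bonding and antibonding combinations (columns of C).
nb = 1.0 / math.sqrt(2.0 * (1.0 + overlap))
na = 1.0 / math.sqrt(2.0 * (1.0 - overlap))
C = [[nb, na], [nb, -na]]

# Closed-shell atomic-orbital density: two electrons in the bonding orbital.
P = [[2.0 * nb * nb, 2.0 * nb * nb], [2.0 * nb * nb, 2.0 * nb * nb]]

gamma = orbital_basis_density(C, S, P)
for row in gamma:
    print([round(x, 10) for x in row])
```

In this toy case the transformed matrix is diagonal with the orbital occupations on the diagonal, which is what an uncorrelated closed-shell density should give; a correlated density would instead produce fractional occupations and off-diagonal elements.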
The NLMOs derive from non-HF natural bond orbitals (NBOs), resulting in the formation of bond/antibond pairs. As previously mentioned, in the ACES II [16] NLMO program the bonds can be $n$-centered with $1 \le n \le N$, where $N$ is the number of centers in the molecule, and are classified as occupied cores ($n = 1$), lone-pairs ($n = 1$), and bonds ($n \ge 2$), and virtual antibonds ($n \ge 2$) and Rydbergs ($n = 1$). The $n > 2$-centered bonds are useful for describing delocalized moieties, for example aromatic molecules, without resorting to more artificial Kekulé structures.

The remarkable transferability of localized chemical bonds and/or functional groups, and their associated properties, among the ground states of molecules is one of the most fundamentally important and useful concepts in chemistry. Exceptions to this rule include certain pathological cases, for example the $D_{3d}$ cyclohexane cation, where the unpaired hole remains delocalized despite the fact that all the other electrons in the molecule are localized. For example, the concept of transferable groups in conjugated systems was recently investigated for substituted polyenes, where it was found that effects due to substitution propagate to three or four methylene group distances [17]. Not surprisingly, the majority of applications of local methods have been on saturated molecules, for which there is a large bandgap and therefore a more local electronic structure.

2.2 Fragment Based Approaches

2.2.1 Fragment Molecular Orbitals

Localized orbitals can also be determined by fragmenting a molecule into smaller regions and simply finding the associated canonical orbitals, which by virtue of the fragmentation are localized to the specific fragments. In this way the electrons can be localized to particular fragmented regions of a molecule, with higher-order terms given by fragment dimers, trimers, etc.
These sets of orbitals are orthonormal within each fragment, but the orbitals between fragments would have to be orthonormalized to determine the reference wavefunction of the whole molecule. If the fragments are chosen to be functional groups then these orbitals and the NLMOs should be similar. The accuracy of this fragment molecular orbital (FMO) method [18, 19, 20] depends on how the molecule is fragmented and as such depends on how covalent bonds at the boundary are treated. The fragmentation is most commonly done by using the chemical intuition of the user; however, more robust means of fragmenting a molecule are needed and under development, hopefully making use of singular value decomposition (SVD). In the conventional implementation of the FMO method the bonds are fragmented electrostatically, meaning that the bond is kept intact and incorporated into one of the adjacent fragments with no bond for the other fragment. In the current implementation it is assumed that the fragments are closed shells; however, changes can be incorporated as needed to allow for open shells as in certain charged molecules or radicals. Likewise, slight changes are needed to account for diffuse or delocalized character.

The FMOs can be used to determine reference, for example HF, wavefunctions, more specifically densities and energies, of large molecules. The Hamiltonian for a fragment is given by

$$H_I = -\sum_{i}^{n_I} \frac{1}{2} \nabla_i^2 - \sum_{i}^{n_I} \sum_{\alpha} \frac{Z_\alpha}{r_{i\alpha}} + \sum_{i}^{n_I} \sum_{J \neq I}^{N} \int dr_j \, \frac{\rho_J(r_j)}{r_{ij}} + \sum_{i<j}^{n_I} \frac{1}{r_{ij}},$$

where $I, J, \ldots$ denote fragments, of which there are $N$, and where $n_I$ specifies the number of electrons in fragment $I$. Note that $\alpha$ runs over all nuclei in the molecule. The electronic distribution of a given fragment, $\rho_I(r)$, is determined self-consistently in the electronic and nuclear electrostatic potential (ESP) due to the other $(N-1)$ fragments.
The same is true for all other fragments, resulting in self-consistently determined ESPs which allow for respective polarization. Note that this ESP accounts for some of the $N$-body interactions in the molecule, with the remainder to be recovered using higher-order fragment dimer, trimer, etc. [21] corrections. The computation of this ESP is the most time-consuming step in the FMO calculation. The Hamiltonian for a pair of fragments is given by

$$ H_{IJ} = \sum_i^{n_I+n_J} \left( -\frac{1}{2}\nabla_i^2 \right) - \sum_i^{n_I+n_J} \sum_\alpha \frac{Z_\alpha}{r_{i\alpha}} + \sum_i^{n_I+n_J} \sum_{K \neq I,J}^{N} \int d\mathbf{r}_k \, \frac{\rho_K(\mathbf{r}_k)}{r_{ik}} + \sum_i^{n_I+n_J} \cdots $$

… where $\mu$ and $\nu$ as well as $\lambda$ and $\sigma$ belong to fragment $I$ and where $V^I$ is the ESP due to the other $(N - 1)$ fragments,

$$ V^I_{\mu\nu} = \sum_{J \neq I}^{N} \sum_{\alpha \in J} \langle \mu | -\frac{Z_\alpha}{r_\alpha} | \nu \rangle + \sum_{J \neq I}^{N} \sum_{\lambda\sigma \in J} P^J_{\lambda\sigma} (\mu\nu|\lambda\sigma), \tag{2-6} $$

where $\mu$ and $\nu$ belong to fragment $I$ while $\lambda$ and $\sigma$ belong to fragment $J$. This ESP shows the absence of exchange for the intermonomer calculations. Exchange is handled similarly for the dimer calculations,

$$ F^{IJ}_{\mu\nu} = H^{IJ}_{\mu\nu} + \sum_{\lambda\sigma} P^{IJ}_{\lambda\sigma} \left[ (\mu\nu|\lambda\sigma) - \frac{1}{2} (\mu\lambda|\nu\sigma) \right] + V^{IJ}_{\mu\nu}, \tag{2-7} $$

where $\mu$ and $\nu$ as well as $\lambda$ and $\sigma$ belong to dimer $I \cup J$ and where $V^{IJ}$ is the ESP due to the other $(N - 2)$ fragments,

$$ V^{IJ}_{\mu\nu} = \sum_{K \neq I,J}^{N} \sum_{\alpha \in K} \langle \mu | -\frac{Z_\alpha}{r_\alpha} | \nu \rangle + \sum_{K \neq I,J}^{N} \sum_{\lambda\sigma \in K} P^K_{\lambda\sigma} (\mu\nu|\lambda\sigma), \tag{2-8} $$

where $\mu$ and $\nu$ belong to $I \cup J$ while $\lambda$ and $\sigma$ belong to fragment $K$. These equations give the following pseudoeigenvalue equations for the Fock matrices in terms of FMOs and orbital energies,

$$ F^I C^I = C^I \epsilon^I \tag{2-9} $$

$$ F^{IJ} C^{IJ} = C^{IJ} \epsilon^{IJ}. \tag{2-10} $$

The electronic energy of the molecule is given by the following telescoping series in monomer, dimer, etc. energies,

$$ E = \sum_I^N E_I + \sum_I^N \cdots $$

The total electron density of the molecule is computed as,

$$ \rho(\vec{r}) = \sum_I^N \rho_I(\vec{r}) + \sum_I^N \cdots $$

… methods are applied to large molecules. The FMO method is trivially parallelizable [22] because once the ESP is determined and the monomer calculations are complete, the subsequent independent dimer, trimer, etc.
calculations can be done simultaneously. Note that each of the monomer, dimer, trimer, etc. calculations can be further parallelized, meaning that the final method can use a dual-layer strategy for the overall parallelization. The FMO approach has great potential [23] for routine, practical, accurate, and systematically improvable applications to large molecules, for example in protein structure prediction, assessing conformer stability, protein-ligand binding affinity, explicit solvent effects, molecular dynamics simulations, etc. Although large basis sets have been used, most applications have been limited to small, for example minimal, basis sets. Despite this, the results are usually in very good agreement with the conventional methods, including cases for which there are multiple thousands of basis functions.

2.2.3 Electron Correlation with Fragment Molecular Orbitals

Correlation energies, for example MP2 or CC, can be computed using a telescoping series in the correlation energy [24, 25]. Unlike many local correlation methods, which rely on large QM regions to obtain weak interactions, the FMO procedure accounts for them provided the parent method has sufficient physics to represent them. When treating the correlation problem it is important to note that the conventional FMO approach is not based on using the FMO reference density matrix, but rather focuses on a telescoping series in the correlation energy where monomers, dimers, trimers, etc. are calculated at the correlated level of theory. This gives an FMO correlation energy to be added to the FMO reference energy. Significantly fewer dimers need to be calculated in the correlation problem than in the electrostatic one because the correlation contribution is much more local in space.
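The telescoping series can be illustrated with a short Python sketch. It assumes the common two-body truncation, in which each dimer contributes only what it adds beyond its two monomers, and it assumes the monomer and dimer energies are already available as plain numbers. The function name, the dictionary layout, and the numerical values are hypothetical; they are not taken from ACES III or from the scripts described in this chapter.

```python
from itertools import combinations


def two_body_energy(monomer, dimer):
    """Two-body telescoping sum: monomer terms plus pair corrections.

    monomer: dict mapping fragment label -> energy E_I
    dimer:   dict mapping frozenset({I, J}) -> energy E_IJ; pairs that are
             absent are treated as non-interacting (zero pair correction)
    """
    total = sum(monomer.values())
    for frag_i, frag_j in combinations(sorted(monomer), 2):
        pair = frozenset((frag_i, frag_j))
        if pair in dimer:
            # Pair correction: what the dimer adds beyond its two monomers.
            total += dimer[pair] - monomer[frag_i] - monomer[frag_j]
    return total


# Illustrative numbers only (arbitrary units), not results from the text.
monomer = {"F1": -1.0, "F2": -2.0, "F3": -3.0}
dimer = {
    frozenset(("F1", "F2")): -3.1,  # neighbouring fragments interact
    frozenset(("F2", "F3")): -5.2,
    # F1-F3 is omitted: a distant pair whose correction is assumed negligible
}
energy = two_body_energy(monomer, dimer)
```

Leaving distant pairs out of `dimer` mirrors the observation above that the correlation problem needs far fewer dimers than the electrostatic one. The same function applies unchanged to reference energies or to correlation energies, since only the numbers passed in differ.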
Also, in FMO the Fock pseudoeigenvalue equations are represented in canonical form, as opposed to noncanonical or localized form, and therefore equations based on them are greatly simplified; for example, noniterative rather than iterative MP2 can be used. This is in contrast to using noncanonical or localized orbitals in high-level methods. Within the FMO procedure the reference energy can be computed up to three-body terms, while the more expensive correlation energy can be computed up to two-body terms. It is important when adding correlation corrections to ensure that the error in the reference function is sufficiently small to make adding correlation worthwhile.

2.3 Parallelization in the ACES Programs

2.3.1 Super Instruction Assembly Language

An option to calculate the FMOs was recently built into the ACES III program [26], which is an extension of the ACES II program [16] that allows for highly efficient parallel computation. The ACES III program is portable and designed to attain excellent performance and scalability on up to 1000 or more processors. The design strategy used for ACES III exploits the efficiency of modern microprocessors to perform floating point operations on a large number of numbers, a super number or block of numbers, as opposed to a single number. It is the microprocessor's deep hierarchy of processing speeds versus data sizes which makes this possible. Thus the parallel extension of ACES II requires completely redesigning the way in which it is written.
This effort was facilitated by introducing a new language called super instruction assembly language (SIAL), which allows the rather complex algorithms of quantum chemistry to be easily written in a compact, object-oriented form, which is then processed by a super instruction processor (SIP) responsible for interpreting in terms of super instructions on blocks of numbers, typically 10,000 of them, much like the message passing interface (MPI). Routines from the MPI library are used as needed, especially in sending and receiving data. Array indices can then be grouped into segments, imposing a breakdown of the data into blocks which need to be sufficiently large so that computation, as opposed to communication, dominates at run time. For example, consider the parallel do construct in SIAL, which is used to run a piece of code locally on a block of data from a distributed array. In addition to using distributed arrays for large data structures, like two-electron integrals, in which case the SIP organizes all required communication, there are local arrays for smaller data structures, like one-electron integrals, in which each node has a local copy obtained using MPI send. The SIAL language is a means to perform the large array operations of quantum chemistry in parallel in such a way that all details pertaining to the parallelization are hidden from the programmer. All declarations, including a number of block stacks of fixed sizes, are made at the beginning of the SIAL code, and upon compilation a dry-run phase is executed in which the number of block stacks needed for the calculation is determined. These block stacks are allocated from random access memory (RAM) as needed at run time. Programs such as HF, MP2, CC, etc.
are written in SIAL calling super instructions or objects which are written in C and Fortran 77; one such example is the integral computation [27] in ACES III. The ACES III executable runs one or more of these SIAL programs sequentially to achieve its desired computational task, passing data to the next SIAL code using scratch files created at the end of the current SIAL code.

2.3.2 Super Instruction Architecture

Execution is optimized by means of a super instruction architecture (SIA) where processing is performed asynchronously in an attempt to hide latent operations behind other operations, for example in accessing memory or in communication. The SIA divides the requested processors into a master process, responsible for initialization and cleanup of SIAL programs, and a number of worker and server processes, responsible for computations, like tensor contraction, matrix diagonalization, etc., and data storage in scratch files, respectively. Efficiency is maximized when accessing memory by carefully scheduling the read from memory to start long before the data is needed. Similarly, other operations are scheduled to start while the results from a previous operation are being written to memory. Some tuning of the problem to the specific architecture available is automated or can be provided by the user as SIP parameters in a ZMAT. The ACES III program is flexible enough to scale with a small amount of RAM for each worker node; however, increasing the RAM per node can increase the overall speed of the calculation.

2.3.3 Integral Direct Computation

Generation of the FMOs in the ACES III program is done efficiently by means of direct computation of one- and two-electron integrals using a new object-oriented parallel integral package [27].
The integral package uses state-of-the-art algorithms where great attention has been paid to modern microprocessors, for example avoiding cache misses, using block data structures, etc. Objects have the advantage of being completely independent of other programs from which they can be referenced. Only fragment super instructions, which allow for the calculation of fragment-specific integrals, along with the appropriate SIAL code, from which these instructions and some others are referenced, needed to be written. Since the generation of FMOs follows from simply dropping certain atomic orbitals in the integral evaluations, a drop atomic orbital (DROPAO) namelist was created in the ZMAT specifying the mapping between atomic centers and fragments.

2.3.4 Automated Dual-Layer Parallelization Strategy

Automated calculation of HF reference wavefunctions, more specifically densities and energies, for large molecules using these FMOs was accomplished using a series of Perl scripts responsible for the programs' initialization, management, and collection. The geometry section of a standard ACES III ZMAT is fragmented according to the user's specification by inserting lines containing Frag1, Frag2, ..., FragN, where N is the total number of fragments used, each followed by lines containing the usual atomic centers and Cartesian coordinates belonging to the given fragment. The initialization script creates all monomer, dimer, trimer, etc. ZMATs from the original ZMAT, each with the appropriate DROPAO namelist. Given that the dimer, trimer, etc. ACES III calculations are all independent, they can be simultaneously computed in batches containing as few as one dimer, trimer, etc. or as many as all of them. A dual-layer parallelization strategy is used, where a user specifies the number of simultaneous dimer, trimer, etc.
calculations as well as the number of nodes in the conventional ACES III parallelization. The first layer of parallelization is trivially accomplished using process forks from the Perl management script. Fork creates an additional process, another thread of execution, by cloning the current process, called the parent process, and creating a child process which shares the same executable code. To distinguish between parent and child processes the call returns nonzero and zero, respectively, where the nonzero value returned to the parent is the process identification (PID) of the child process. Each child process is responsible for managing an entire ACES III calculation. Batch cycles are managed using the waitpid command in Perl, where the program waits on the specified number of child PIDs. Fork provides only coarse-grained parallelization; for example, in its current implementation it assumes that for the chosen batch sizes the calculations will take approximately the same amount of time and possibly compete for resources, thus it is not rigorously load-balanced. This dual-layer parallelization strategy is much like how serial ACES II parallelizes finite difference derivatives; however, unlike that approach, where each child is responsible for an intense serial ACES II calculation, the implementation in ACES III allows each child to further parallelize. Parsing of monomer, dimer, trimer, etc. output files to produce the final HF results for the large target molecule is performed by the collection script. This script also provides a summary of all of the interaction energies in the large target molecule. Similar parsing of results from correlated calculations provides the conventional way of doing correlated calculations in the FMO methods.
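The fork and waitpid batching described above can be sketched in a few lines. The management script in this work is written in Perl; the sketch below uses Python's `os.fork` and `os.waitpid`, which wrap the same POSIX calls, purely for illustration and therefore runs only on POSIX systems. The function `run_in_batches` and the placeholder jobs are hypothetical stand-ins for launching complete ACES III calculations.

```python
import os


def run_in_batches(jobs, batch_size):
    """Run each job in its own forked child, batch_size children at a time.

    jobs: list of zero-argument callables (placeholders for full calculations)
    Returns a dict mapping job index -> child exit code (0 means success).
    """
    exit_codes = {}
    for start in range(0, len(jobs), batch_size):
        children = {}
        for index in range(start, min(start + batch_size, len(jobs))):
            pid = os.fork()
            if pid == 0:
                # Child: fork returned zero. Do the work, then exit at once
                # so the child never continues into the parent's loop.
                try:
                    jobs[index]()
                    code = 0
                except Exception:
                    code = 1
                os._exit(code)
            # Parent: fork returned the child's PID (nonzero).
            children[pid] = index
        # The batch ends only when every child in it has been waited on, so a
        # slow child delays the next batch (coarse-grained, not load-balanced).
        for pid, index in children.items():
            _, status = os.waitpid(pid, 0)
            exit_codes[index] = os.WEXITSTATUS(status)
    return exit_codes


# Three placeholder jobs; the last one fails on purpose.
codes = run_in_batches([lambda: None, lambda: None, lambda: 1 / 0], batch_size=2)
```

Because the parent blocks until the whole batch has finished, the sketch reproduces the limitation noted above: batches are efficient only when their members take roughly the same time.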
The management script is sufficiently general that it could be used for computing finite difference derivatives in ACES III using the dual-layer parallelization strategy with either FMO-based HF or even localized CC.

2.4 Localized Coupled-Cluster

Since pair correlation is well described by MP2, which is cheaper than the well-known CC methods, it is the computational chemist's method of choice for treating correlation from first principles. The localized correlation methods were first implemented in MP2 methods with great success and later extended to CC methods. The CC methods expand upon MP2 by allowing electron pairs to influence one another via a cumulant expansion which results in higher, for example threefold, fourfold, etc., electron excitations. It is possible to avoid the increase in cost for CC methods by using the NLMOs to simplify the electronic structure by imposing local excitations. This natural linear-scaled CC (NLSCC) method [28, 29] has been implemented, using the Fortran and Perl programming languages, in the serial ACES II [16] quantum chemistry software package. The Molden [30] modeling package was used regularly for preprocessing input files. More recent work has focused on maximizing the performance of NLSCC by using the parallel ACES III [26] package, which is written in SIAL, a new programming language based on MPI. Ultimately the goal is to provide a black-box, systematically improvable, affordable, and accurate CC method for large molecules. For size-extensive methods, like MP2 and CC in contrast to truncated CI, it is possible to obtain very accurate correlation energies with NLS methods. Obtaining total energies is more cumbersome due to long-ranged electrostatic terms.
Local correlation methods aim to circumvent the nonlinear scaling by exploiting an underlying simplicity in both the electronic structure and computational procedure, as in the NLSCC method [28]. The NLSCC method has been shown to give very accurate correlation energies for large molecules, in which the rate-limiting step is the size of the QM regions. This region can be determined self-consistently, making these methods systematically improvable. For this region other reduced-scaling methods, which are outside of the local approximation, can offer advantages [26, 31, 32, 33, 34, 35]. These methods become especially useful for three-dimensional systems where an accurate local correlation treatment can require large QM regions to represent weak interactions. In a localized basis the correlation space for a given occupied orbital does not grow with system size, thereby reducing the number of free parameters and allowing unimportant interactions to be eliminated from the amplitude equations [36]. Furthermore, the method insists upon transferability of electronic structure units and as such has the potential to provide transferable electronic structure, much the way geometric structure depends upon transferable geometric units. Therefore, high-level local correlation methods have the potential to become a standard, systematically improvable tool for calculating energies of very large molecules. For systems with a bandgap it can be definitively shown that the computation of many-electron states and their associated density matrices has to scale linearly with respect to particle number [37], lending justification to a variety of linear-scaled approaches and in particular those which account for electron correlation [28, 36, 38, 25, 39, 40, 41, 42].
The ground state NLSCCSD [28] method follows by rewriting the wavefunction, without loss of generality, as,

$$ |\Psi\rangle = e^{T_1 + T_2} |0\rangle = \prod_i e^{T_1^i + T_2^i} |0\rangle, \tag{2-14} $$

where $|0\rangle$ is the single-determinant reference function, currently assumed to encompass the whole molecule, but which could also be the FMO reference function or possibly some other reference function. In the noninteracting limit, by using a size-extensive reference function, our reference becomes $|0\rangle = \mathcal{A}(|i\rangle |j\rangle |k\rangle)$ with $i$, $j$, and $k$ being occupied NLMOs or functional groups thereof and $\mathcal{A}$ an antisymmetrizer. Additional terms like $|ij\rangle$ are needed as the groups are allowed to interact. The operator $T_2^i$ for an excitation specific to orbital $i$ is given as,

$$ T_2 = \sum_i T_2^i = \sum_i \frac{1}{4} \sum_{jab} t^{ab}_{ij} \, a^\dagger b^\dagger j i, \tag{2-15} $$

where $i, j \in \{\mathrm{occ}\}$, $a, b \in \{\mathrm{vir}\}$, and $p, q \in \{\mathrm{occ} \cup \mathrm{vir}\}$. The assumption is that the operators $T^i$ in the NLMO basis only need to be locally correlated due to the at worst $R^{-3}$ decay of the "dipole-dipole" cluster amplitudes. $T_1^i$ and $T_2^i$ are approximated by only summing over a small region, QM2, composed of orbitals that are spatially close to $i$. This is shown below for double excitations,

$$ \sum_{jab} = \sum_{jab \in \mathrm{QM2}} + \sum_{P(jab) \notin \mathrm{QM2}} \approx \sum_{jab \in \mathrm{QM2}}, \tag{2-16} $$

with $P(jab)$ a permutation operator which introduces local orbitals outside of QM2. Orbital $i$ is defined to be in the first QM region, QM1, with $\mathrm{QM1} \subset \mathrm{QM2}$ giving $T_2^i$ as,

$$ T_2^i \approx \frac{1}{4} \sum_{jab \in \mathrm{QM2}} t^{ab}_{ij} \, a^\dagger b^\dagger j i, \qquad i \in \mathrm{QM1}; \; j, a, b \in \mathrm{QM2}. \tag{2-17} $$

Note that the QM1 regions are disjoint from each other, as opposed to the QM2 regions. This reduces the number of singles and doubles parameters in the amplitude equations, leading to linear or even sublinear scaling with respect to the number of basis functions. The method is exact in the limit of QM2 being the whole molecule.
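The relationship between disjoint QM1 regions and overlapping QM2 regions can be illustrated with a toy selection rule. In this work the regions are chosen by hand, so the distance cutoff on orbital centroids used below is only an assumed stand-in for that choice; the function name, the coordinates, and the cutoff value are all hypothetical.

```python
from math import dist


def build_qm2_regions(centroids, qm1_groups, radius):
    """Return, for each QM1 group, the orbitals forming its QM2 region.

    centroids:  list of (x, y, z) positions, one per localized orbital
    qm1_groups: dict mapping group name -> list of orbital indices (disjoint)
    radius:     an orbital joins QM2 if it lies within this distance of any
                orbital of the QM1 group (assumed selection rule)
    """
    regions = {}
    for name, members in qm1_groups.items():
        qm2 = {
            j
            for j, centre in enumerate(centroids)
            if any(dist(centre, centroids[i]) <= radius for i in members)
        }
        regions[name] = sorted(qm2)
    return regions


# Five orbitals evenly spaced along a line, split into three QM1 groups.
centroids = [(1.5 * n, 0.0, 0.0) for n in range(5)]
qm1_groups = {"A": [0, 1], "B": [2], "C": [3, 4]}
regions = build_qm2_regions(centroids, qm1_groups, radius=2.0)
```

Every QM1 group is contained in its own QM2 region, and neighbouring QM2 regions share orbitals even though the QM1 groups never do. Increasing `radius` until each region covers all orbitals corresponds to the exact limit stated above.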
Unrestricted references can also be used provided that the corresponding $\alpha$ and $\beta$ density matrices are localizable. The total correlation energy is then a sum of orbital correlation energies for each QM1 region,

$$ E \approx \sum_i E_i, \tag{2-18} $$

which becomes exact in the limit of QM2 being the whole molecule.

2.4.1 Localized $T_3$ Contributions

Correlated methods such as MP2 and CC are a minimal requirement for predicting effects such as dispersion interactions, which can play important collective roles in large, albeit conventionally unattainable, systems such as peptides. There are local CC methods which have been designed for single and double excitations [43, 44, 36, 42, 28, 45, 46, 47, 48, 49, 50, 41, 51, 40, 52, 53]; however, less work has been done on extending the local CC methods to higher excitations, for example triple excitations [39, 54, 25, 38, 55, 56]. Disconnected $T_1 T_2$ or even $T_1^3$ obtained with CCSD accounts for some of the triples contribution to the correlation energy, but typically the connected $T_3$ contributions are significant. It is expected that the triples amplitudes are longer-ranged than the doubles amplitudes because only two of the three occupied-virtual pairs need to be spatially close to make a significant contribution. This could result in significant modification of local correlation methods which were intended for CCSD. Additionally, as the excitation level increases the nonlinearity of the CC equations leads to complicated terms in which there are one, two, three, etc. excitation operators, making it difficult to determine if the overall contribution is even local.
The necessity of connected triples is understood from an order-by-order comparison with perturbation theory, in which it is found that the triply excited manifold is dominated by connected $T_3$ as opposed to $T_1 T_2$ or $T_1^3$. This is in contrast to the dominant term in the quadruply excited manifold, which is disconnected $T_2^2$ as opposed to connected $T_4$. These connected contributions are not obtained until explicit inclusion of $T_3$ as in CCSD(T), CCSDT-x (x = 1a, 1b, 2, 3), or CCSDT. These methods scale as noniterative $O(o^3 v^4)$, iterative $O(o^3 v^4)$, and iterative $O(o^3 v^5)$, respectively, making these CC methods, without modification, prohibitive for large molecules. In order to expedite the evaluation of triples within NLSCCSDT-3, it is useful to first consider the dominant linear term in the $T_3^i$ equation,

$$ \langle {}^{abc}_{ijk} | (W_N T_2)_c | 0 \rangle, \tag{2-19} $$

with,

$$ H_N = \sum_{pq} F^N_{pq} \{ p^\dagger q \} + \frac{1}{4} \sum_{pqrs} W^N_{pqrs} \{ p^\dagger q^\dagger s r \}, \tag{2-20} $$

and,

$$ H_N = H - \langle 0 | H | 0 \rangle, \tag{2-21} $$

which is the cornerstone of CCSD(T) and CCSDT-1x (x = a, b) methods. Within the context of NLSCCSDT-3, nonlinear contributions to $T_3$, for example $\langle {}^{abc}_{ijk} | (W_N T_2^2)_c | 0 \rangle$, allow the localized regions to indirectly influence one another, for example by certain strong interregional excitations, without destroying the locality or transferability of the individual regions. These types of terms are incomplete in linear versions such as CCSDT-1x (x = a, b) or CCSD(T). Nonlinear terms involving $T_3$ are limited to the doubles equation, $\langle {}^{ab}_{ij} | (W_N T_1 T_3)_c | 0 \rangle$, within the NLSCCSDT-3 approximation, meaning that all other terms involving $T_3$ are linear. Eqns. 2-22, 2-23, and 2-24 show the coupled nonlinear $T_1$, $T_2$, and $T_3$ CCSDT equations, where in the latter those beneficial terms that were previously mentioned, namely those of CCSDT-3, are underbraced. The CCSDT-3 $T_1$ and $T_2$ equations are the same as they are for CCSDT.
Prefactors have been dropped for simplicity.

$$ \langle {}^{a}_{i} | (H_N (1 + T_2 + T_1 + T_1 T_2 + T_1^2 + T_1^3 + T_3))_c | 0 \rangle = 0 \tag{2-22} $$

$$ \langle {}^{ab}_{ij} | (H_N (1 + T_2 + T_2^2 + T_1 + T_1 T_2 + T_1^2 + T_1^2 T_2 + T_1^3 + T_1^4 + T_3 + T_1 T_3))_c | 0 \rangle = 0 \tag{2-23} $$

$$ \langle {}^{abc}_{ijk} | (H_N (\underbrace{T_2} + \underbrace{T_3} + \underbrace{T_2^2} + \underbrace{T_1 T_2} + T_2 T_3 + T_1 T_3 + \underbrace{T_1^2 T_2} + \underbrace{T_1 T_2^2} + T_1^2 T_3 + \underbrace{T_1^3 T_2}))_c | 0 \rangle = 0 \tag{2-24} $$

It can be seen that CCSDT-3 is as close to CCSDT as possible, but still $O(N^7)$, and does not require storing the $T_3$ amplitudes in a canonical or semicanonical basis because in those bases the $T_3$-into-$T_3$ contributions vanish. For example, the only $T_3$-into-$T_3$ contribution in CCSDT-3 is,

$$ \langle {}^{abc}_{ijk} | (F_N T_3)_c | 0 \rangle, \tag{2-25} $$

which for canonical HF, or semicanonical orbitals, $F_{pq} = \epsilon_p \delta_{pq}$, becomes diagonal, establishing that the set of triples need not be stored. The CCSDT-3 method is correct through fourth order in the energy and second order in the wavefunction.

2.4.2 Avoiding $T_3$ Storage

As an example of locally correlated methods for connected triples consider both noniterative CCSD(T) [38, 56] and iterative CCSDT-1b [55]. The easiest way to include connected triples contributions is via perturbation theory, giving the well-known CCSD(T) method [57] and its non-HF generalization [58]. Perturbative triples corrections like CCSD(T) in a localized basis require both an iterative solution and storage of the triples amplitudes [58], which can destroy performance due to large memory requirements. It is important to have approximations that eliminate the need to store the triples amplitudes.
Note that for local perturbative triples approximations, like local CCSD(T), an approximate triples energy contribution is added to an already approximate singles and doubles energy. This is in contrast with approximate iterative triples methods in which the triples are coupled to the singles and doubles equations, and vice versa. Due to the fact that CCSDT-3 contains many more nonlinear triples contributions than CCSD(T) or CCSDT-1x (x = a, b), the triples excitation regions may need to be larger than the double excitation regions. As previously mentioned, the CCSD(T) method provides the easiest connected triples contribution; however, as in the context of second order with noncanonical HF orbitals, the Fock matrix is block-diagonal, meaning that perturbative methods like CCSD(T) must be solved iteratively, thereby requiring storage of the corresponding triples amplitudes [58]. One disadvantage of localized orbitals is that they are not eigenfunctions of any common energy operators,

$$ h | p \rangle = \sum_q \epsilon_{pq} | q \rangle, \tag{2-26} $$

where $h$ is some single-particle energy operator, $p, q$ are localized occupied or virtual orbitals, and $\epsilon$ is an energy matrix. The orbitals might diagonalize some other energy operator,

$$ h^{\mathrm{eff}} | p \rangle = \epsilon^{\mathrm{eff}}_p | p \rangle, \tag{2-27} $$

or they might diagonalize some other operator; for example, the NLMOs block-diagonalize the density matrix. This results in perturbative methods needing to be solved iteratively. This iterative solution negates one of the most useful properties of perturbative methods, namely that they are noniterative for canonical orbitals. If the equations have to be iterated it is worthwhile to include more terms per iteration, amounting to a CC calculation. Since the overhead is in the storage of triples as opposed to the inclusion of more terms, CCSDT-3 is used, which is as close to CCSDT as possible while still being $O(N^7)$.
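The link between off-diagonal Fock elements and amplitude storage can be made explicit with a schematic form of the triples equation. This is a sketch under stated assumptions: $R^{abc}_{ijk}$ is shorthand introduced here for every term that does not contract $T_3$ with the Fock operator, the permutations over equivalent indices are only indicated, and the sign convention is one common choice rather than necessarily the one used in the implementation.

```latex
% Canonical or semicanonical orbitals, F_{pq} = \epsilon_p \delta_{pq}:
% each amplitude follows from a single division and can be formed on the fly.
\left( \epsilon_i + \epsilon_j + \epsilon_k
     - \epsilon_a - \epsilon_b - \epsilon_c \right) t^{abc}_{ijk}
  = R^{abc}_{ijk}

% Localized orbitals, with D^{abc}_{ijk} = f_{ii} + f_{jj} + f_{kk}
%                                        - f_{aa} - f_{bb} - f_{cc}:
% off-diagonal Fock elements couple each amplitude to other triples.
D^{abc}_{ijk} \, t^{abc}_{ijk}
  = R^{abc}_{ijk}
  + \sum_{d \neq c} f_{cd} \, t^{abd}_{ijk}
  - \sum_{l \neq k} f_{lk} \, t^{abc}_{ijl}
  + \text{(permutations)}
```

In the second form the right-hand side needs triples amplitudes from the previous iteration, which is why a localized basis forces both iteration and storage.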
Additionally, for CCSDT-3 note that with canonical or semicanonical orbitals the triples amplitudes need not be stored, in contrast to CCSDT. In the case of NLMOs there are off-diagonal occupied-occupied and virtual-virtual terms in $F_N$ which require that the amplitudes be stored in the conventional implementation. The semicanonical transformation that eliminates these terms in the case of a non-HF reference function [58], thereby avoiding storage, cannot be used here because it would simply return the delocalized canonical HF orbitals. In NLSCCSDT-3 storage of the triples amplitudes in the localized basis is avoided by first solving the CC equations for our small QM regions in the canonical basis and then subsequently unitarily transforming amplitudes and integrals into NLMOs, shown below using the summation convention.

$$ t^{a'b'}_{i'j'} = U_{aa'} U_{bb'} U_{ii'} U_{jj'} \, t^{ab}_{ij} \tag{2-28} $$

$$ w_{p'q'r's'} = U_{rr'} U_{ss'} U_{pp'} U_{qq'} \, w_{pqrs} \tag{2-29} $$

That is, the localization matrix is applied to the canonical CC wavefunction from a QM region instead of to the reference function. This a posteriori transformation does not affect our choice of QM regions because the transferability of the NLMOs necessitates that they can be determined from either a global or fragmented reference function. In Eqns. 2-28 and 2-29, $U$ is the localization transformation obtained in the NLMO procedure from $C' = CU$, where $C'$ are the NLMOs and $C$ are the canonical HF orbitals. The distinction here in comparison to many other local CC methods is that, by virtue of the fact that CC is an infinite-order method, in comparison to noniterative perturbative methods, there is freedom to apply the localization to the final correlated wavefunction instead of to the reference function. Therefore, within NLSCCSDT-3 the triples amplitudes do not need to be stored.
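The amplitude transformation of Eqn. 2-28 can be written out directly for a toy case. The sketch below is pure Python for clarity rather than speed, and it assumes that the occupied and virtual blocks of $U$ are supplied as two separate orthogonal matrices. The function names, the 2x2 rotation matrices, and the amplitude values are hypothetical and serve only to exercise the index pattern.

```python
from itertools import product
from math import cos, sin, sqrt


def transform_doubles(t2, u_vir, u_occ):
    """Apply t'[a'][b'][i'][j'] = U_aa' U_bb' U_ii' U_jj' t[a][b][i][j].

    t2 is a nested list indexed [a][b][i][j]; u_vir and u_occ are the
    virtual and occupied blocks of the transformation, indexed [old][new].
    """
    nv, no = len(u_vir), len(u_occ)
    indices = list(product(range(nv), range(nv), range(no), range(no)))
    new = [[[[0.0] * no for _ in range(no)] for _ in range(nv)] for _ in range(nv)]
    for a2, b2, i2, j2 in indices:
        # Sum over all old indices (the summation convention made explicit).
        new[a2][b2][i2][j2] = sum(
            u_vir[a][a2] * u_vir[b][b2] * u_occ[i][i2] * u_occ[j][j2] * t2[a][b][i][j]
            for a, b, i, j in indices
        )
    return new


def norm(t2):
    """Frobenius norm of the amplitude tensor."""
    return sqrt(sum(x * x for blk in t2 for mat in blk for row in mat for x in row))


def rotation(theta):
    """A 2x2 orthogonal matrix standing in for a block of U."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]


# Arbitrary toy amplitudes with two virtual and two occupied orbitals.
t2 = [[[[0.01 * (1 + a + 2 * b + 3 * i + 5 * j) for j in range(2)]
        for i in range(2)] for b in range(2)] for a in range(2)]
t2_local = transform_doubles(t2, rotation(0.3), rotation(1.1))
```

Because the transformation is orthogonal, the norm of the amplitudes is unchanged, which gives a cheap consistency check. A production code would rotate one index at a time instead of using this single eight-index sum, which is written this way only to mirror the equation.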
2.4.3 Non-Local Interactions in Hybrid Methods

The NLSCCSDT-3 method follows straightforwardly from the NLSCCSD method [28]. The ground state NLSCCSDT-3 wavefunction is rewritten from the CCSDT-3 wavefunction, without loss of generality, as,

$$ |\Psi\rangle = e^{T_1 + T_2 + T_3} |0\rangle = \prod_i e^{T_1^i + T_2^i + T_3^i} |0\rangle, \tag{2-30} $$

where $|0\rangle$ is the single-determinant reference function. The operator $T_3^i$ for an excitation specific to orbital $i$ is given as,

$$ T_3 = \sum_i T_3^i = \sum_i \frac{1}{36} \sum_{jkabc} t^{abc}_{ijk} \, a^\dagger b^\dagger c^\dagger k j i. \tag{2-31} $$

The $T_1^i$ and $T_2^i$ are approximated by only summing over a third QM region, QM3, and likewise for $T_3^i$ with respect to a second QM region, QM2, both composed of orbitals that are spatially close and closer to $i$, respectively. This is shown below for triple excitations,

$$ \sum_{jkabc} = \sum_{jkabc \in \mathrm{QM2}} + \sum_{P(jkabc) \notin \mathrm{QM2}} \approx \sum_{jkabc \in \mathrm{QM2}}, \tag{2-32} $$

with $P(jkabc)$ a permutation operator which introduces local orbitals outside of QM2. Orbital $i$ is defined to be in the first QM region, QM1, with $\mathrm{QM1} \subset \mathrm{QM2} \subset \mathrm{QM3}$ giving $T_3^i$ as,

$$ T_3^i \approx \frac{1}{36} \sum_{jkabc \in \mathrm{QM2}} t^{abc}_{ijk} \, a^\dagger b^\dagger c^\dagger k j i, \qquad i \in \mathrm{QM1}; \; j, k, a, b, c \in \mathrm{QM2}, \tag{2-33} $$

and $T_2^i$ as,

$$ T_2^i \approx \frac{1}{4} \sum_{jab \in \mathrm{QM3}} t^{ab}_{ij} \, a^\dagger b^\dagger j i, \qquad i \in \mathrm{QM1}; \; j, a, b \in \mathrm{QM3}. \tag{2-34} $$

The single excitation operator, $T_1^i$, is given as,

$$ T_1^i \approx \sum_{a \in \mathrm{QM3}} t^{a}_{i} \, a^\dagger i, \qquad i \in \mathrm{QM1}; \; a \in \mathrm{QM3}. \tag{2-35} $$

Note that the QM1 regions are disjoint from one another, as opposed to the QM2 and QM3 regions. This reduces the number of singles and doubles parameters and significantly reduces the number of triples parameters, leading to linear or even sublinear scaling with respect to the number of basis functions.
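The effect of the region restriction on the number of triples parameters can be checked with simple counting. The sketch below ignores permutational symmetry and assumes every orbital has a QM2 region of the same fixed size; the function names and the region sizes are hypothetical and chosen only to show the trend.

```python
def full_triples_count(n_occ, n_vir):
    """Unrestricted count of t(abc, ijk): any three occupied, any three virtual."""
    return n_occ**3 * n_vir**3


def local_triples_count(n_occ, qm2_occ, qm2_vir):
    """Restricted count: i runs over every occupied orbital once (the QM1
    regions are disjoint), while j, k and a, b, c are confined to the QM2
    region of i, whose size does not depend on the size of the molecule."""
    return n_occ * qm2_occ**2 * qm2_vir**3


# Double the molecule (four virtuals per occupied orbital assumed) while
# keeping an assumed QM2 region of 6 occupied and 24 virtual orbitals.
small_full = full_triples_count(10, 40)
large_full = full_triples_count(20, 80)
small_local = local_triples_count(10, 6, 24)
large_local = local_triples_count(20, 6, 24)
```

Doubling the molecule multiplies the unrestricted count by 64 but only doubles the restricted count, which is the linear growth in parameters referred to above.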
By adopting a hybrid approach to NLSCC where NLSCCSDT-3 is performed in a small region, QM2, while NLSCCSD is performed in a larger region, QM3, it is possible to take advantage of simplifications which manifest in other hybrid methods that mix small CCSD regions with large MP2 regions. These latter hybrid schemes are based on the assumption that at large enough distances the localized electron pairs are only weakly interacting and are therefore well approximated by MP2, which does not allow these pairs to influence one another to the extent that they do within CC theory. A hybrid approach to treating triples within the NLSCC framework, as opposed to a full NLSCCSDT-3 treatment, should benefit from similar simplifications where the focus is on a small triples region in which the triples influence one another more strongly than those which are scattered over a larger region. For those systems which are well described by a single reference, as opposed to a multireference system, it is expected that the hybrid NLSCC methods for triples would compare well with a full NLSCCSDT-3 method. Amplitudes in QM2 − QM1 have an indirect influence on the amplitudes $T^i$. As an example, consider a QM2 region with three occupied orbitals, $i$, $j$, and $k$, and an arbitrary number of virtuals, and let $i \in \mathrm{QM1}$ and $j, k \in \mathrm{QM2} - \mathrm{QM1}$. The $T_2$ equation will depend on $t_{ij}$, $t_{ik}$, and $t_{jk}$, where the virtual orbital labels have been dropped for convenience. It is tempting to think about dropping the $t_{jk}$ term and representing the NLSCC wavefunction entirely in terms of QM1 → QM1 and QM1 → QM2 excitations; however, the QM2 → QM2 excitations have an important, albeit indirect, effect on $t_{ij}$ and $t_{ik}$. Obviously those outside of the largest QM region defined do not make a contribution and thus are neglected within NLSCC.
This choice of QM regions should be done self-consistently to ensure that any important longer-range physics is included, and an automated QM determination is currently underway. The regions are picked to minimize tail contributions. In our current implementation these regions are picked using Molden [30] based on results from calculations performed on substituted alkanes of varying lengths. In the limit of perfect localization, where $|0\rangle = |i\rangle |j\rangle |k\rangle$, the NLSCCSDT-3 wavefunction becomes a product, by virtue of its size-extensive exponential ansatz,

$$ |\Psi\rangle = |\Psi_i\rangle |\Psi_j\rangle |\Psi_k\rangle. \tag{2-36} $$

Due to the orthogonality tails in the orthonormal NLMO basis there will be higher-order terms, shown in the following many-body expansion,

$$ |\Psi\rangle = |\Psi_i\rangle |\Psi_j\rangle |\Psi_k\rangle + |\Psi_{ij}\rangle |\Psi_k\rangle + |\Psi_{ik}\rangle |\Psi_j\rangle + |\Psi_{jk}\rangle |\Psi_i\rangle + |\Psi_{ijk}\rangle. \tag{2-37} $$

The total correlation energy,

$$ E = \langle 0 | (H_N e^T)_c | 0 \rangle = \langle 0 | \bar{H} | 0 \rangle $$

$$ E = \sum_{ia} f_{ia} t^{a}_{i} + \frac{1}{4} \sum_{ijab} w_{ijab} \left( t^{ab}_{ij} + t^{a}_{i} t^{b}_{j} - t^{b}_{i} t^{a}_{j} \right), \tag{2-38} $$

is then a sum of orbital correlation energies,

$$ E = \sum_i E_i. \tag{2-39} $$

Using our truncation of excitation operators gives $E_i = E_i(\mathrm{QM1}) + E_i(\notin \mathrm{QM1})$, and so $i$ is picked to minimize $E_i(\notin \mathrm{QM1})$, obtaining $E_i \approx E_i(\mathrm{QM1})$ to a good approximation. This obviously becomes exact in the limit of QM2 being the entire molecule. If $E^T$ is designated as the total CCSDT-3 correlation energy then Eqn. 2-39 gives,

$$ E^T = \sum_i E^T_i, \tag{2-40} $$

in terms of triply correlated NLMO and/or functional group energies. The following hierarchy in terms of the QM regions is considered within the language of NLSCC, $\mathrm{QM1} \subset \mathrm{QM2} \subset \mathrm{QM3}$, where ideally the triples contributions to QM1 would be from as large a region as possible, given by QM3.
If $E^T_i$ is written in terms of QM2 and QM3 contributions as

$$E^T_i = E^T_i(\mathrm{QM3}_i) + E^T_i(\mathrm{QM2}_i), \qquad (2\text{-}41)$$

an interesting hybrid approach can be developed by assuming that $E^T_i(\mathrm{QM3}_i) \approx E^D_i(\mathrm{QM3}_i)$, thereby avoiding longer-range contributions to the triples equation and resulting in

$$E^T_i \approx E^D_i(\mathrm{QM3}_i) + E^T_i(\mathrm{QM2}_i), \qquad (2\text{-}42)$$

where $E^D_i$ is a doubly correlated bond energy. Hybrid approaches are labeled as NLS(CCSDT-3/CCSD). This approximation is justified in the case of NLS(CCSD/MP2) by considering that the effects of coupled excitations that CC offers in comparison to MP2 would be less important at large distances. At large distances the localized electron pairs are only weakly interacting, and so a theory in which the pairs are not allowed to influence one another, such as MP2 in contrast to CC, is appropriate.

Although this provides the energy, calculating gradients of the PES has proven more difficult within the context of local correlation methods. One question is simply whether the quality of the one- and two-particle density matrices is sufficient. Although it is possible to define the gradients, there is a very expensive rectangular back transformation that becomes rate limiting, thereby significantly degrading the performance. This back transformation is necessary to avoid the derivative integral transformation that would be needed for each new geometric perturbation. In the event that this transformation is performed, the subsequent forces are discontinuous because the correlation regions are geometry dependent. To remedy this it is necessary to ensure that the same orbitals are dropped in the same way for each force evaluation.

2.5 Implementation

All calculations were performed using a modified version of the serial ACES II [16] quantum chemistry software package.
With the exception of those for met-enkephalin, results were obtained using a 375 MHz Power3 processor using a maximum of 4 of the available 8 GB of shared memory over four processors. The calculations on met-enkephalin were performed on an 8 dual-core ia64 processor SGI Altix machine with 256 GB of shared memory using multithreaded libraries. Such calculations made use of some recent advances in the ACES II [16] code made to take advantage of large shared memory machines. These advances included updating some old memory limitations to allow for large reference functions (739 basis functions for the case of cc-pVDZ met-enkephalin) and in-core contractions in the context of many-body methods. Geometries for polyglycine were taken from reference [36]. The geometry for met-enkephalin was taken from the Protein Data Bank (1PLW.pdb, model 1) [59]. Geometries of test systems are available upon request. Given that the NLSCC method is implemented as a pilot code in ACES II [16], a manual determination of the QM regions using Molden [30] is used.

We first consider a quasilinear translationally periodic system because for these types of systems the arguments in favor of locality are most simply understood. Polyglycine is meant to represent the simplest peptide while still containing a variety of functional groups. For example, it has a 3-center peptide bond. Met-enkephalin is meant to represent a more realistic application because it is three-dimensional and has nontrivial side chains. It is a conformationally flexible neuropeptide with a high affinity for the opiate receptor [59] for which there has been recent interest from the theoretical chemistry community [19, 60, 61, 62]. For polyglycine, 2-center bonds are used, although 3-center bonds certainly could have been used, while for met-enkephalin up to 6-centered bonds are used.
A cc-pVDZ basis is used unless specified otherwise; however, for more practical applications a triple-zeta or better basis should be used. There are a number of figures which establish transferability by plotting a quantity for a functional group, for example a methyl group or a bond, belonging to one side of a quasilinear molecule as a function of its distance from a perturbation on the opposite side of the molecule. This distance is taken to be the number of methylene groups between the two. Perturbations are taken as simple functional groups, R = CH3, OH, NH2, F, H.

2.6 Applications

2.6.1 Motivation

It is demonstrated in Fig. 2-1 for $C_1$ water clusters that the NLMOs can localize the triples contribution to the CCSDT-3 wavefunction. These water clusters were chosen because they are examples of small systems which have $C_1$ symmetry; however, other $C_1$ molecules such as distorted alkanes could be used. It can be shown that the number of negligible amplitudes resulting from spatial locality is greater than the number due to symmetry. A symmetric molecule must be sufficiently large to pass the crossover point at which spatial locality, as opposed to symmetry, dominates the number of negligible amplitudes. In the canonical basis 90% of the $|T_2|$ amplitudes on average are above the $5.0 \times 10^{-8}$ threshold, while for the $|T_3|$ amplitudes this number reduces to 50%. These averages ignore the contribution from a single water molecule, which because of its $C_{2v}$ symmetry gives a crossover in the number of amplitudes above the threshold for canonical versus local bases. It is not surprising that the size of $|T_3|$ is smaller than $|T_2|$, because these clusters have relatively simple, nondegenerate ground states, thereby lacking the nondynamical correlation effects usually associated with higher excitation operators.
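The percentages quoted here are simple counts of amplitudes whose magnitude exceeds the $5.0 \times 10^{-8}$ threshold. A minimal sketch of that bookkeeping follows; the function name `percent_above` and the toy amplitude list are illustrative assumptions, not data or code from the calculations.

```python
def percent_above(amplitudes, threshold=5.0e-8):
    """Percentage of amplitudes whose absolute value exceeds the threshold."""
    amplitudes = list(amplitudes)
    if not amplitudes:
        return 0.0
    kept = sum(1 for t in amplitudes if abs(t) > threshold)
    return 100.0 * kept / len(amplitudes)


# Toy amplitudes (hypothetical values): four of the eight exceed 5.0e-8 in magnitude.
toy_amplitudes = [1.2e-2, -3.0e-5, 4.0e-9, -6.0e-8, 0.0, 2.0e-10, 7.5e-4, -1.0e-8]
print(percent_above(toy_amplitudes))
```

Applied separately to the canonical and localized amplitude sets of each cluster, a count of this kind yields the curves of Fig. 2-1.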
In the localized NLMO basis both $|T_2|$ and $|T_3|$ are sparse, giving for example only 35% and 10% of amplitudes above threshold, respectively, for the water heptamer. The decay of localized amplitudes is monotonic for $|T_3|$ and nearly monotonic for $|T_2|$; however, due to the lack of symmetry in the clusters it is not necessarily expected that the behavior be monotonic. By virtue of the localization scale it is possible to get a handle on the dominant amplitudes.

Fig. 2-2 examines the locality of the virtual NLMOs, specifically the absolute value of diagonal two-electron integrals, $\langle ii | aa \rangle$, for a $(\mathrm{H_2O})_5$ calculation. The profile of the integrals with respect to different virtual bases is shown, where the key is understood as follows, with the number giving the degree of localization: local 0 (canonical occupied and canonical virtual), local 1 (localized occupied and canonical virtual), and local 2 (localized occupied and localized virtual). For the localized occupied calculations $i = \sigma_{\mathrm{OH}}$ and for the canonical occupied calculations $i$ is delocalized. The local 0 curve shows that all the virtual orbitals give significant interactions with the chosen occupied orbital. The local 1 curve shows that localizing the occupied orbitals does little to improve on local 0; for example, a given $\sigma_{\mathrm{OH}}$ has significant interactions with all of the canonical virtual space, supporting the well-known conclusion that a localized virtual space is imperative. The local 2 curve shows significant improvements on local 0 and local 1 because both the occupied and virtual spaces are localized by virtue of our NLMO implementation, where the peaks represent intramolecular interactions.
The first set of peaks corresponds to $\sigma_{\mathrm{OH}}\sigma_{\mathrm{OH}}$ pair interactions, while the second and third correspond to interactions with the Rydberg spaces of oxygen and hydrogen, respectively.

The transferability of the CC wavefunction in terms of NLMOs is shown in Fig. 2-3 and Fig. 2-4 for a series of alkanes and substituted alkanes, respectively. Fig. 2-3 shows diagonal $|T_2|$ (CCSD) and $|C_2|$ (CISD) amplitudes for a $\sigma_{\mathrm{CH}}\sigma_{\mathrm{CH}}$ pair confined to a methyl group as a function of alkane size. It is seen that because the CC wavefunction is size-extensive the CC amplitudes are transferable, while the CI amplitudes are not. In Fig. 2-4 the transferability of $|T_2|$ and $|T_3|$ is shown for the CCSDT-3 level of theory. The amplitudes are confined to the methyl cap on the opposite side of the substituent, for example $t_{\mathrm{CH}\,\mathrm{CH}}^{\mathrm{CH}\,\mathrm{CH}}$ and $t_{\mathrm{CH}\,\mathrm{CH}'\,\mathrm{CH}}^{\mathrm{CH}\,\mathrm{CH}'\,\mathrm{CH}}$, where the prime represents another bond, different from the first, also on the methyl cap. The first point in Fig. 2-4 represents the amplitude and substituent as nearest neighbors, and thus the amplitudes are very different among substituents. $T_2$ requires only one region of screening and therefore rapidly approaches a constant transferable value, while $T_3$ from a CCSDT-3 calculation requires two or more regions of screening before becoming transferable. Note that these results are similar over all functional groups studied, including the very electronegative fluorine atom, because of the robust transferability of the CC amplitude equations as represented in terms of NLMOs.

The CCSD (left) and CCSDT-3 (right) C-H bond energies computed from these transferable amplitudes using Eqn. 2-18 are themselves transferable, as shown in Fig. 2-5 for the same series of substituted alkanes. The difference between CCSD and CCSDT-3 bond energies is, not surprisingly, small, 1 mH, given the simplicity of the electronic structure.
Both CCSD and CCSDT-3 bond energies require two or more regions of screening before reaching a constant value, and the rates at which they become transferable are nearly identical.

2.6.2 PolyGlycine

Differences between CCSD and CCSDT-3 bond energies are shown in Fig. 2-6 for two molecules used in the NLSCC calculation of polyglycine. The two molecules are designated by the fact that they are either the N-terminus or C-terminus and are shown in Fig. 2-7 on the lower left and lower right, respectively. There are negligible triples contributions from the core orbitals and small contributions, 1.5 mH on average, for the $\sigma$ bonds. The lone pairs in these two molecules have larger triples contributions, 2 mH on average, and they span a wider range than the other bonds, 1 mH to 3 mH. This is probably because they are more dependent on the local environment due to their greater diffusivity. The occupied indices within each of the core, lone-pair, $\sigma$ bond, and $\pi$ bond sections are not in a specific order. The triples contribution from the $\pi$ bonds is largest, 4 mH, suggesting an interesting active space triples method based on NLSCC in which different bonds are correlated at different levels of theory.

Fig. 2-7 shows the QM regions defining the hybrid NLSCC calculation of polyglycine. The functional groups of interest are represented by the QM1 region. The QM label specifies that this region is treated quantum mechanically, as opposed to by more classical interactions, for example an ESP, which could be incorporated into regions outside of QM3 but are not in the current formulation. There is one region for the methylene group, one for the peptide group, one for the N-terminus, and one for the C-terminus.
This region is embedded in the QM2 region, which is treated at the CCSDT-3 level of theory and which in turn is embedded in the QM3 region, treated at the CCSD level of theory. The reason for two embedding regions is to capture with CCSDT-3 any strong correlations between functional groups which are close to one another, while CCSD can be used for weaker correlations between more distant functional groups. The peptide bond in polyglycine is treated as a Kekulé structure composed of 2-center bonds.

The CCSD and CCSDT-3 functional group energies of the molecules used in the NLSCC calculations on polyglycine are shown in Table 2-1. The diagonal elements are simply the energies taken from the corresponding QM1 regions of Fig. 2-7, while the upper-diagonal elements for peptide and methylene are the energies of those groups extracted from the calculations for which the target was the N-terminus or the C-terminus. The difference between CCSD and CCSDT-3 functional group energies is largest for the C-terminus, 35 mH, followed by the N-terminus and the average peptide group, which are 24 mH and 14 mH respectively. The smallest difference is for the average methylene group, which is 6 mH. This ordering follows from the total number of lone pairs and $\pi$ bonds in each of the groups, which are shown in Fig. 2-6 to give large triples corrections to the energy. The transferability of the peptide group among three different molecules is seen by considering the upper-diagonal elements, which are within 1 mH of one another for both CCSD and CCSDT-3. The same is true for the methylene group from the CCSD calculation; however, for the CCSDT-3 calculation the differences are slightly larger.

The total and unit cell NLSCCSD and NLS(CCSDT-3/CCSD) correlation energies for polyglycine are shown in Table 2-2 along with some literature results obtained using local CC methods [43, 55]. The unit cell is glycine.
The triples contribution acts to lower the correlation energy contribution per unit cell by 20 mH for all polyglycine examples including the infinite limit. The saturation of the correlation energy per unit cell is clearer in Fig. 2-8. The energy difference for the infinite versus small molecule limit for the glycine unit cell is 30 mH for both NLSCCSD and NLS(CCSDT-3/CCSD). For NLSCCSD over 99% of the correlation energy is obtained when compared to conventional results for the glycine monomer, dimer, and tetramer, while for NLS(CCSDT-3/CCSD) the comparison with conventional methods was only possible for the glycine monomer, for which 99.4% was recovered. In order to validate the NLSCC triples methods, further comparison with conventional methods would be necessary. Despite the fact that a direct comparison between the NLSCC and local CC methods is not possible, because of the frozen core approximation in addition to the fact that they are hybridized differently, the NLSCC method provides the literature results for extended systems.

As an example of time savings consider $(\mathrm{gly})_1$ with 95 basis functions and $(\mathrm{gly})_{16}$ with 1160 basis functions. The time required for a CCSDT-3 calculation on $(\mathrm{gly})_1$ is 0.95 days, which can be crudely extrapolated to $3.4 \times 10^{7}$ days for $(\mathrm{gly})_{16}$. The NLS(CCSDT-3/CCSD) serial calculation on $(\mathrm{gly})_{16}$ took 11.9 days, to be compared to a parallel FMO CCSD(T) calculation on four 3.2 GHz Pentium 4 processors, which took 9.9 days [25]. By virtue of the fact that the NLSCC methods are built upon local QM regions, the rate determining step of the calculations becomes that of the largest QM region. For a simple translationally periodic example, once the energies of the QM1 regions shown in Fig. 2-7 are determined it is trivial to reuse these regions in symmetrically equivalent cases to determine the energies of larger and larger molecules.
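The order of magnitude of the crude extrapolation can be checked with a one-line power law. The sketch below assumes a nominal seventh-power dependence on the number of basis functions, which is the formal scaling usually quoted for iterative triples methods; the text does not state the formula actually used, so the exponent and the function name `extrapolate_time` are assumptions, and only rough agreement with the quoted value should be expected.

```python
def extrapolate_time(t_small, n_small, n_large, power=7):
    """Scale a measured time by (n_large / n_small) ** power (assumed power law)."""
    return t_small * (n_large / n_small) ** power


# Values quoted in the text: 0.95 days for (gly)_1 with 95 basis functions,
# and 1160 basis functions for (gly)_16.
estimate_days = extrapolate_time(0.95, 95, 1160)

# Ratio of the extrapolated conventional time to the 11.9 days quoted for the
# serial NLS(CCSDT-3/CCSD) calculation.
speedup = estimate_days / 11.9
print(f"{estimate_days:.2e} days, ratio {speedup:.1e}")
```

Under this assumption the estimate comes out in the $10^{7}$ day range, the same order as the figure quoted above, and several million times longer than the 11.9 day NLS(CCSDT-3/CCSD) run.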
More timings are discussed in the section on met-enkephalin.

2.6.3 MetEnkephalin

The three-dimensional pentapeptide (met-phe-gly-gly-tyr) met-enkephalin is shown in Fig. 2-9 along with a crude representation of the QM1 regions chosen for the NLSCC calculations. Side chains of the nonglycine residues methionine ($\mathrm{QM1}_1$), phenylalanine ($\mathrm{QM1}_4$), and tyrosine ($\mathrm{QM1}_9$) are shown in Fig. 2-10 along with atom indices for use in Table 2-3.

Table 2-3 shows a decomposition of the CCSD correlation energy into functional group and NLMO contributions for the nonglycine QM1 regions shown in Fig. 2-10. The NLMOs are ordered per column as core, lone-pair (lp), and $\sigma$ orbitals as well as the 6-center phenolic $\pi$ bonds given by $\pi_6^{1}$ to $\pi_6^{3}$. All NLMO core contributions are 1-2 mH, while the average correlation energy for an oxygen lp (50 mH) is 5 mH greater than for a sulfur lp (45 mH). The average correlation energy for a C-H bond in methionine, 49 mH, is much greater than for phenylalanine and tyrosine, which are both around 39 mH because of the similarity in their side chains. The average correlation energy for a C-C bond in methionine is 56 mH, compared to 44 mH for a ring C-C and 47 mH for a nonring C-C in phenylalanine and tyrosine. The average C-S bond in methionine has 54 mH in comparison to the average C-C bond, which has 56 mH. The O-H bond in tyrosine has 53 mH. The delocalized 6-centered bonds contain the most correlation, amounting to an average of 64 mH for both phenylalanine and tyrosine. The similarity in the bond correlation energies for phenylalanine and tyrosine is attributed to the transferability of their electronic structure within NLSCC as represented in terms of NLMOs.

Table 2-4 summarizes minimal-basis and cc-pVDZ NLSCCSD calculations for met-enkephalin. On average each QM2 region contains 29 atoms and 53 occupied orbitals, with 32 virtuals for STO-3G and 218 virtuals for cc-pVDZ.
The largest QM calculations, with 65 occupied and 272 virtual orbitals, are for $\mathrm{QM2}_4$ and $\mathrm{QM2}_9$, for which the phenylalanine-tyrosine interaction is calculated. The NLSCCSD result can be constructed out of effective functional groups as determined from the QM1 regions, giving a correlation energy of 2.4905 H and 5.7742 H for the STO-3G and cc-pVDZ bases respectively. For STO-3G, conventional CCSD gives 2.4987 H and thus 99.7% of the correlation energy is recovered. Although this is the simple minimal basis, it is possible to get the result for a calculation with 87 virtual orbitals from calculations with no more than 42 virtuals, thus saving a factor of 18 in the evaluation of $\langle ab | cd \rangle$ integrals. A similar comparison using the cc-pVDZ basis, 587 virtuals, is currently under development using a new parallel version of ACES, ACES III [26], where similar savings should be expected given that the largest QM2 has only 272 virtuals.

As a simple illustration of the time saved in the NLSCC method, consider the STO-3G conventional CCSD calculation, which took a total of 53.56 hours, in comparison to the NLSCCSD calculation, which took 0.98 hours for the rate limiting step. The other steps are shorter and can be done simultaneously. The cc-pVDZ NLSCCSD calculation on met-enkephalin had a rate limiting step of 137.64 hours, in which the NLMO generation took only 69 seconds. Conventional HF provides reference energies, although other reference functions, for example DFT or FMO-HF [19], could be used. Collective three-body weak interactions from methionine, phenylalanine, and tyrosine were not necessary in the STO-3G basis but may be needed in larger and more diffuse basis sets. Accurate triples contributions for met-enkephalin are too intensive to calculate because the QM2 regions should contain at least amino acid dimers, and these are already too expensive for a proper correlated treatment.
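The STO-3G numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below takes the nine QM1 group correlation energies from the STO-3G column of Table 2-4 (magnitudes, in Hartree) and assumes that the cost of the $\langle ab | cd \rangle$ integrals grows as the fourth power of the number of virtual orbitals; the variable names are illustrative.

```python
# QM1 group correlation energies, STO-3G column of Table 2-4 (Hartree, magnitudes).
qm1_sto3g = [0.2486, 0.2387, 0.2244, 0.4927, 0.1758, 0.2408, 0.0644, 0.2736, 0.5315]

# NLSCCSD correlation energy as a sum over QM1 regions.
e_nlsccsd = sum(qm1_sto3g)

# Conventional CCSD correlation energy quoted in the text.
e_ccsd = 2.4987
recovery_percent = 100.0 * e_nlsccsd / e_ccsd

# <ab|cd> has four virtual indices, so its size is assumed to scale as n_vir ** 4:
# 87 virtuals in the full molecule versus at most 42 in any QM2 region.
integral_saving = (87 / 42) ** 4

print(f"{e_nlsccsd:.4f} H, {recovery_percent:.1f}%, factor {integral_saving:.1f}")
```

The same ratio for the cc-pVDZ basis, $(587/272)^4$, is of a similar size, which is consistent with the expectation of similar savings stated in the text.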
Active space triples methods for aromatic bonds built upon conventional NLSCC would be quite useful in this situation.

[Figure 2-1 plot: % of amplitudes versus $n$ in $(\mathrm{H_2O})_n$, $n$ = 1 to 7; curves: canon $t_{ij}^{ab}$, local $t_{ij}^{ab}$, canon $t_{ijk}^{abc}$, local $t_{ijk}^{abc}$.]

Figure 2-1: Increasing sparsity of the CCSDT-3 amplitudes with system size. The percentage of canonical (canon) and NLMO (local) based CCSDT-3 $|T_2|$ and $|T_3|$ amplitudes which are above the $5.0 \times 10^{-8}$ threshold is shown as a function of $C_1$ water clusters of varying sizes.

[Figure 2-2 plot: $\langle ii | aa \rangle$ versus virtual orbital index $a$; curves: local 0, local 1, local 2.]

Figure 2-2: Sparsity of two-electron integrals in a localized virtual space. A profile of the absolute value of diagonal two-electron integrals, $\langle ii | aa \rangle$, is shown with respect to different virtual bases from a $C_1$ $(\mathrm{H_2O})_5$ calculation. The key is understood as follows, with the number giving the degree of localization: local 0 (canonical occupied and canonical virtual), local 1 (localized occupied and canonical virtual), and local 2 (localized occupied and localized virtual).

[Figure 2-3 plot: $O_{\sigma_{\mathrm{CH}}\sigma_{\mathrm{CH}}}$ / a.u. versus $n$ in $\mathrm{C}_n\mathrm{H}_{2n+2}$, $n$ = 1 to 8; curves: $T_2$, $\Lambda_2$, $C_2$.]

Figure 2-3: Transferability of local amplitudes from size-(in)extensive methods. Diagonal $T_2$ (ground state CC), $\Lambda_2$ (ground state CC), and $C_2$ (ground state CI) amplitudes for a C-H bond belonging to a methyl group as a function of alkane size. The plot demonstrates that for the size-extensive CC wavefunction the amplitudes are transferable, while for the CI wavefunction, which is size-inextensive, they are not. Despite the fact that $\Lambda_2$ is a CI-like deexcitation operator, it is approximately size-extensive within the current context.
[Figure 2-4 plot: two panels, $t_{ij}^{ab} \times 10^{3}$ and $t_{ijk}^{abc} \times 10^{3}$ versus $n$ in $\mathrm{H_3C(CH_2)}_{n-1}\mathrm{R}$, $n$ = 1 to 5; curves: R = CH3, OH, NH2, F, H.]

Figure 2-4: Transferability of $|T_2|$ and $|T_3|$ from CCSDT-3 in NLMOs. The amplitudes are confined to the methyl cap on the opposite side of the substituent, and the x-axis represents increasing distance between the two in terms of the number of methylene groups belonging to the substituted alkane.

[Figure 2-5 plot: two panels, $E_{\sigma_{\mathrm{CH}}}$ (mH) versus $n$ in $\mathrm{H_3C(CH_2)}_{n-1}\mathrm{R}$, $n$ = 1 to 5; curves: R = CH3, OH, NH2, F, H.]

Figure 2-5: Transferability of CCSD and CCSDT-3 bond energies in NLMOs. The C-H bond is on the opposite side of the substituent, and the x-axis represents increasing distance between the two in terms of the number of methylene groups belonging to the substituted alkane. Absolute valued bond energies are shown.

[Figure 2-6 plot: $E_i(\mathrm{CCSDT\text{-}3}) - E_i(\mathrm{CCSD})$ (mH) versus occupied index $i$, $i$ = 1 to 27; series: cores, lone pairs, $\sigma$ bonds, and $\pi$ bonds for the N-terminus and for the C-terminus.]

Figure 2-6: Effect of triple excitations on calculated bond energies. Difference between CCSD and CCSDT-3 bond energies for all of the occupied orbitals in two molecules used in the NLSCC calculation of polyglycine. The two molecules are designated by the fact that they are either the N-terminus or C-terminus and are shown in Fig. 2-7 on the lower left and lower right, respectively.

Figure 2-7: Definition of the regions for the NLSCC calculation of polyglycine.
The region QM1 simply represents the functional groups of interest, QM2 represents the CCSDT-3 region, and QM3, the whole fragment, represents the CCSD region. Due to the more complicated structure there are four calculations (methylene group, peptide group, N-terminus group, and C-terminus group) needed.

Table 2-1: Transferability of bond energies from different molecules. The diagonal elements are simply the energies taken from corresponding QM1 regions of Fig. 2-7, while the upper-diagonal elements for peptide and methylene are the energies of those groups extracted from both the N-terminus and C-terminus calculations. Functional group energies are reported in units of Hartree.

CCSD | C-terminus | N-terminus | peptide | methylene
C-terminus | 1.1039 | | 0.4184 | 0.2129
N-terminus | | 0.7914 | 0.4171 | 0.2123
peptide | | | 0.4173 |
methylene | | | | 0.2146

CCSDT-3 | C-terminus | N-terminus | peptide | methylene
C-terminus | 1.1388 | | 0.4331 | 0.2193
N-terminus | | 0.8150 | 0.4308 | 0.2185
peptide | | | 0.4311 |
methylene | | | | 0.2212

Table 2-2: NLSCC correlation energies for polyglycine. The unit cell is a glycine. Other values are taken from the local CC literature [43, 55]. The symbol H means that the energies are reported in units of Hartree.
n_glycine (n_cell) | NLSCCSD E_corr (H) | NLSCCSD E_corr/n_cell | LCCSD/LMP2 FC E_corr (H) | NLS(CCSDT-3/CCSD) E_corr (H) | NLS(CCSDT-3/CCSD) E_corr/n_cell | LCCSDT-1b/LMP2 FC E_corr (H) | LCCSD(T)/LMP2 FC E_corr (H)
1 | 99.4% | 0.8466 | 99.1% | 99.4% | 0.8703 | 98.9% | 99.0%
2 | 99.9% | 0.7387 | 99.0% | 1.5212 | 0.7606 | |
4 | 99.3% | 0.6854 | 2.6783 | 2.8262 | 0.7066 | 2.7552 | 2.7570
7 | 4.6376 | 0.6625 | | 4.7838 | 0.6834 | |
8 | 5.2696 | 0.6587 | 5.1499 | 5.4363 | 0.6795 | 5.3001 | 5.3039
9 | 5.9017 | 0.6557 | | 6.0888 | 0.6765 | |
10 | 6.5338 | 0.6534 | 6.3857 | 6.7413 | 0.6741 | |
11 | 7.1658 | 0.6514 | | 7.3938 | 0.6721 | |
14 | 9.0620 | 0.6473 | 8.8574 | 9.3513 | 0.6680 | |
15 | 9.6941 | 0.6462 | | 10.0038 | 0.6669 | |
25 | 16.0147 | 0.6405 | | 16.5289 | 0.6611 | |
41 | 26.1277 | 0.6372 | | 26.9691 | 0.6577 | |
61 | 38.7690 | 0.6355 | | 40.0193 | 0.6560 | |
163 | 103.2395 | 0.6333 | | 106.5753 | 0.6538 | |
265 | 167.7100 | 0.6328 | | 173.1313 | 0.6533 | |
500 | 316.2450 | 0.6324 | | 326.4711 | 0.6529 | |
1000 | 632.2769 | 0.6322 | | 652.7261 | 0.6527 | |
∞ | | 0.6321 | | | 0.6525 | |

[Figure 2-8 plot: E/n_cell (H/cell) versus $n_{\mathrm{cell}}$ for $\mathrm{HO(COCH_2NH)}_{n_{\mathrm{cell}}}\mathrm{H}$; curves: NLSCCSD, NLS(CCSDT-3/CCSD).]

Figure 2-8: NLSCC correlation energy per unit cell of polyglycine. The unit cell is glycine and the energies are shown absolute valued.

Figure 2-9: Model of the three-dimensional pentapeptide met-enkephalin. The coordinates (1PLW.pdb) were taken from reference [59]. Also shown is a crude representation of the QM1 regions chosen for the NLSCC calculations.

Figure 2-10: Model showing the side chains of nonglycine residues. Atom indices are included for use in Table 2-3.

Table 2-3: CCSD bond correlation energies for nonglycine residues. These residues are shown in Fig. 2-10. NLMOs are ordered per column as core, lone-pair (lp), and $\sigma$ orbitals as well as the 6-center phenolic $\pi$ bonds given by $\pi_6^{1}$ to $\pi_6^{3}$. For each type of NLMO within each column the energy contributions are ordered by increasing magnitude. Energies are reported in units of Hartree.
methionine | E | phenylalanine | E | tyrosine | E
S5 | 0.0000 | C5 | 0.0017 | O2 | 0.0014
S5 | 0.0013 | C8 | 0.0017 | C3 | 0.0016
S5 | 0.0013 | C10 | 0.0017 | C6 | 0.0017
S5 | 0.0016 | C12 | 0.0017 | C12 | 0.0017
S5 | 0.0017 | C14 | 0.0018 | C4 | 0.0018
C7 | 0.0018 | C6 | 0.0018 | C8 | 0.0018
C2 | 0.0019 | C2 | 0.0019 | C10 | 0.0018
C9 | 0.0020 | C10-H11 | 0.0385 | C13 | 0.0019
S5 | 0.0409 | C12-H13 | 0.0386 | O2 | 0.0450
S5 | 0.0480 | C8-H9 | 0.0389 | O2 | 0.0548
C2-H1 | 0.0473 | C14-H15 | 0.0393 | C6-H7 | 0.0391
C2-H4 | 0.0474 | C6-H7 | 0.0397 | C4-H5 | 0.0391
C2-H3 | 0.0479 | C2-H3 | 0.0401 | C8-H9 | 0.0392
C7-H6 | 0.0492 | C2-H4 | 0.0404 | C10-H11 | 0.0393
C9-H10 | 0.0493 | C10-C12 | 0.0431 | C13-H14 | 0.0404
C9-H11 | 0.0495 | C8-C10 | 0.0431 | C13-H15 | 0.0404
C7-H8 | 0.0502 | C12-C14 | 0.0434 | C3-C4 | 0.0431
C2-S5 | 0.0539 | C6-C8 | 0.0437 | C3-C8 | 0.0431
S5-C7 | 0.0548 | C5-C14 | 0.0441 | C4-C6 | 0.0435
C9-C12 | 0.0553 | C5-C6 | 0.0443 | C8-C10 | 0.0437
C7-C9 | 0.0567 | C2-C5 | 0.0473 | C6-C12 | 0.0439
 | | C1-C2 | 0.0477 | C10-C12 | 0.0442
 | | $\pi_6^{1}$ | 0.0536 | C12-C13 | 0.0471
 | | $\pi_6^{2}$ | 0.0699 | C13-C16 | 0.0476
 | | $\pi_6^{3}$ | 0.0699 | O2-H1 | 0.0505
 | | | | O2-C3 | 0.0547
 | | | | $\pi_6^{1}$ | 0.0537
 | | | | $\pi_6^{2}$ | 0.0675
 | | | | $\pi_6^{3}$ | 0.0714
Total | 0.6619 | Total | 0.8380 | Total | 1.0049

Table 2-4: Summary of NLSCCSD calculations on met-enkephalin. The number of atoms, occupied orbitals, virtual orbitals, and group correlation energies for each QM1 and accompanying QM2 region according to Fig. 2-9 are shown. The NLSCCSD results are shown to be a sum of the QM1 results and are compared to conventional CCSD for the STO-3G basis. Energies are reported in units of Hartree. In this case conventional HF provides reference energies, although other reference functions could be used. $E_{\mathrm{NLSCCSD}}$ is the total energy HF + NLSCCSD within the NLS framework, while $E_{\mathrm{CCSD}}$ is the total energy HF + CCSD using conventional CC.
i | QM2 n_atom | QM2 n_occ | QM2 n_vir (STO-3G) | QM2 n_vir (cc-pVDZ) | QM1 n_atom | QM1 n_occ | QM1 n_vir (STO-3G) | QM1 n_vir (cc-pVDZ) | E_i (STO-3G) | E_i (cc-pVDZ)
1 | 27 | 46 | 29 | 192 | 11 | 21 | | | 0.2486 | 0.6619
2 | 30 | 55 | 31 | 221 | 6 | 15 | | | 0.2387 | 0.6519
3 | 29 | 55 | 34 | 225 | 6 | 14 | | | 0.2244 | 0.5931
4 | 35 | 65 | 42 | 272 | 14 | 25 | | | 0.4927 | 0.8380
5 | 23 | 46 | 25 | 177 | 4 | 11 | | | 0.1758 | 0.4786
6 | 29 | 56 | 33 | 224 | 7 | 15 | | | 0.2408 | 0.6348
7 | 22 | 41 | 25 | 168 | 3 | 4 | | | 0.0644 | 0.1561
8 | 28 | 52 | 32 | 214 | 9 | 18 | | | 0.2736 | 0.7549
9 | 35 | 65 | 42 | 272 | 15 | 29 | | | 0.5315 | 1.0049
NLSCCSD | | | | | 75 | 152 | | | 2.4905 | 5.7742
CCSD | | | | | 75 | 152 | 87 | 587 | 2.4987 |
HF | | | | | 75 | 152 | 87 | 587 | 2208.9502 | 2236.7860
E_NLSCCSD | | | | | 75 | 152 | | | 2211.4407 | 2242.5602
E_CCSD | | | | | 75 | 152 | 87 | 587 | 2211.4489 |

CHAPTER 3
LOCAL APPROXIMATIONS FOR IONIZED STATES

Years of research in ab initio quantum chemistry have shown the importance of both nondynamic and dynamic correlation in chemical systems. Although state-of-the-art computing facilities and parallel computing offer an advantage in the calculation time necessary to go to the high excitation levels or multiple determinants needed in treating nondynamical correlation effects, the steep scaling of these methods has given rise to a variety of approximate correlation methods. Of these alternatives the local correlation approximation [63, 64, 45, 65, 36, 66, 44, 48, 67, 28, 68, 69, 29] just discussed has been shown to be quite accurate for the ground states of large molecules. Given the success of local correlation methods in treating ground states, it is interesting to consider the accuracy of such approximations for electronically excited, detached, or attached states of large molecules.

3.1 Methods for Calculating Excited States

In this work the focus is on applying ionization potential equation-of-motion CCSD (IP-EOM-CCSD) to large molecules; however, because developments in calculating ionization energies are less common than developments for excitation energies, the excitation energies are reviewed first.
The theoretical framework of ab initio methods for excited states is well established by methods such as CIS [70], CC2 [71], CIS(D) [72], P-EOM-CCSD [73], SAC-CI [74], EOM-CCSD [75], etc. Although CIS is oftentimes qualitatively correct for single excitations, the absence of the doubles excitation component prevents the quantitative description of single excitations. Methods which treat the doubles block perturbatively, like CC2 [71], CIS(D) [72], and P-EOM-CCSD [73], in a general sense offer successively better treatments than CIS. In cases where the doubles block of the effective Hamiltonian is diagonally dominant, a Löwdin partitioning offers a way to incorporate the effect of doubles into the single excitation manifold. Large scale CI calculations can be performed by reducing the dimension of the CI matrix by freezing CI coefficients [76, 77]. The EOM-CC method [75] is the most accurate and most expensive method, oftentimes comparable to FCI for single excitations. While ground state CC is built upon a single reference, EOM-CC is built upon a multireference wavefunction, i.e. the CC wavefunction, and thus has the potential to describe nondynamical correlation effects. As is known, the rate limiting step in the EOM methods is the contraction of the CC effective Hamiltonian with the trial wavefunction needed in the iterative Davidson diagonalization procedure. Many of the methods for increasing the accuracy and/or applicability of excited state calculations can also be used for ionized state calculations.
3.2 Approximate Methods for Calculating Excited States

3.2.1 Localized Methods

One approach to locally correlated excited states is simply to extend local CC methods developed for the ground state to EOM-CC excited states [78, 79, 80], where the most basic, but rather strict, assumption might be that the excited and ground state orbital domains are identical. Using a pilot version of a local EE-EOM-CCSD program with virtual projected atomic orbitals, it was demonstrated for small organic molecules in diffuse basis sets that part of the excited state spectrum could be recovered using local approximations [78]. To complete the spectrum the excited state orbital domains must be relaxed so as to allow for nonlocal or charge-transfer events. A different pilot code shows that by using larger and more state-specific excitation domains, for example determined using CIS, part of the high energy nonlocal excitation spectrum can be recovered [79]. Although pilot versions limited applications to small molecules, a production level version ought to give meaningful timings for larger molecules and provide reasonable accuracy. A more complete version of a local EOM was recently used with the cheaper and more approximate EOM-CC2 method together with density fitting of two-electron integrals [80, 81, 82].

3.2.2 Fragmented Methods

Other approaches aim to reproduce the excitation spectrum of a large molecule by a suitable fragmentation. As an example consider the multilayered FMO methods for CIS [83] and CIS(D) [84] as well as the fast electron correlation methods for both CIS and EE-EOM-CCSD [85], all of which are based upon FMOs. In these methods a system is considered to be composed of a chromophore in the presence of an environment region responsible for inducing shifts in the chromophoric excitation spectrum.
The fast EE-EOM-CCSD method [85], complete with ESPs in the usual context of FMO methods, has been shown to be quite accurate for weakly interacting systems. One other fragment-based approach to calculating excited states of molecules is the ab initio fragment orbital theory (AFOT) [86].

3.3 Chromophores

3.3.1 Compact Representations

Excited, ionized, or attached wavefunctions which are represented in the canonical basis of energy eigenstates offer little in the way of systematically reducing their representation to a small set of only critical elements. Recently it was shown that an SVD of the CIS wavefunction offers a compact representation in which the number of excited state wavefunction amplitudes is very small [87]. Although this approach is attractive because it offers a less ambiguous prescription, being built upon an SVD and the associated excited state natural orbitals [88], it offers less in terms of interpretation, and possibly performance, than methods based on localized orbitals. Analyzing the excited, as well as ionized or attached, state wavefunctions in terms of chromophores defined using localized orbitals is appealing and could be used in methods more advanced than CIS. The excited, ionized, or attached state wavefunctions can then be written in terms of a small number of determinants which are localized to a given chromophore. In HF theory the atoms which a given canonical orbital most largely spans represent a chromophore. In IP-EOM-CCSD all canonical orbitals contribute, thereby potentially delocalizing the ionization and requiring a larger chromophore. In HF theory localized ionizations depend on the existence of a localized Koopmans' theorem, which in turn depends on the locality of the canonical orbitals. In contrast, in some cases the chromophore is everywhere.
As a simple example consider the cyclohexane molecule, where the local symmetry requires all orbitals of a certain type, for example core orbitals, axial C-H, equatorial C-H, and C-C orbitals, to make contributions to a given ionization. Given that the ionization energy is intensive, in contrast to the total energy, this means that the magnitude of the bond ionization energies must decrease to accommodate the increasing number of bonds when building larger cycloalkanes. This prevents the transferability needed to build the ionization energy of a large molecule in terms of smaller molecules. In other language, the orbitals of the $D_{3d}$ cyclohexane cation cannot be localized into 2-center bonds. Aside from this latter pathological case, a linear scaling method for IP-EOM-CCSD core and valence states can be developed using chromophores determined from a lower level of theory [89].

3.3.2 Locating Chromophores with Lower Level Wavefunctions

The location of the chromophores in large systems can be determined by means of a lower level wavefunction. As a simple example, consider which localized orbital on which water molecule dominates the $k$th ionized state wavefunction of a water cluster. The extent to which neighboring hydrogen-bonding waters shift the ionization energy must be taken into account because the chromophore is often larger than a single water molecule.
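The question of which water molecule dominates a given ionized state can be illustrated with a minimal sketch. It assumes that the one-hole amplitudes of the state are available in a localized orbital basis and that each localized orbital has been assigned to one molecule; the function name `dominant_fragment`, the amplitudes, and the orbital-to-molecule map are all hypothetical and are not taken from the calculations reported here.

```python
import numpy as np

def dominant_fragment(r1, orbital_to_fragment):
    """Return the fragment carrying the largest share of the squared
    one-hole amplitudes, together with the normalized share per fragment.
    Using |r_i|^2 as the weight is an assumption of this sketch."""
    weights = np.abs(np.asarray(r1, dtype=float)) ** 2
    weights = weights / weights.sum()
    fragments = np.asarray(orbital_to_fragment)
    n_frag = int(fragments.max()) + 1
    per_fragment = np.bincount(fragments, weights=weights, minlength=n_frag)
    return int(np.argmax(per_fragment)), per_fragment

# Hypothetical three-water cluster, four valence localized orbitals per water.
orbital_to_fragment = np.repeat([0, 1, 2], 4)
r1 = [0.05, 0.02, 0.01, 0.01,   # water 0
      0.95, 0.10, 0.20, 0.15,   # water 1 (made-up dominant lone pair)
      0.08, 0.03, 0.02, 0.01]   # water 2

which, shares = dominant_fragment(r1, orbital_to_fragment)
print(which, shares)
```

The shares on the neighboring waters give a rough indication of whether the chromophore should be enlarged beyond a single molecule.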
Ionized states which involve interacting yet distant chromophores, as opposed to a single chromophore, can be treated similarly by using all of the chromophores with the rest of the molecule being vacuum or an ESP. The majority of results reported herein use bond ionization energies calculated at low levels of theory to determine the chromophoric regions, which are then treated at more advanced levels of theory.

Another approach, based on isolating the change in electron density when a molecule is ionized, is also reported herein. As is known, there are a number of tools which can be used to do this [90]. The method used here involves comparing localized orbitals for the ground state with localized orbitals from an ionized state. This approach is similar to those based on the transition density; however, here the final result is a transformation matrix which gives localized orbitals for the ionized state from localized orbitals for the ground state. Unlike other methods, the transformation matrix here contains correlation as well as a change in particle number. Comparing these two sets of orbitals reveals new bonding motifs which are picked up in the ionized state and also shows bonds which are consistent with the ground state. Ideally the calculation will pick the correct chromophoric region for the IP-EOM-CCSD calculation by comparing the ground state wavefunction with the ionized state wavefunction calculated at a lower level of theory, for example IP-EOM-MP2.

Bond ionization energies from a number of different chromophores combine to provide the final ionization energies for new, more complex chromophores. How they combine is a question in developing methods for calculating the ionized, and additionally excited or attached, states of large molecules. The bond ionization energies must be appropriately weighted to ensure proper representation of the environment.
3.3.3 Interacting Chromophores and Charge-Transfer

To motivate the discussion of charge-transfer effects consider the CIS Hamiltonian for a system AB with subsystems A and B far apart, given by

\[ \langle ai | H | bj \rangle = \delta_{ij} f_{ab} - \delta_{ab} f_{ji} + w_{jabi}. \tag{3-1} \]

Taking a diagonal element of this matrix,

\[ \langle ai | H | ai \rangle = f_{aa} - f_{ii} + w_{iaai}, \tag{3-2} \]

it can be seen that it is charge-transfer separable, despite the fact that CIS is size-inextensive by virtue of its linear ansatz for the excited state wavefunction. Charge-transfer separability means that the excitation energy $\omega_k$ goes according to the Mulliken definition as $EA - IP + J$, where EA and IP are the electron affinity and ionization potential, respectively, and $J$ is the Coulomb interaction. In the case of CIS, of course, the Koopmans' approximation to the EA and IP could lead to large errors in the excitation energy, but the structure for charge-transfer separability is present nonetheless. Note that as AB is separating, if $a \in A$ and $i \in B$ then the CIS Hamiltonian as in Eqn. 3-2 is nonvanishing even though the Coulomb piece vanishes. This means that when the CIS Hamiltonian is diagonalized for a system with a charge-transfer state, the excited state wavefunction will also contain this nonlocality. Another manifestation of this is

\[ (HC)_{ai} = \omega c_{ai} = \sum_{jb} H^{jb}_{ia} c_{bj} = \sum_{b} f_{ab} c_{bi} - \sum_{j} f_{ji} c_{aj}, \tag{3-3} \]

where $(HC)_{ai} = \langle ai | H C | 0 \rangle$. Rewriting this equation as

\[ \omega c_{ai} = \sum_{b \in A} f_{ab} c_{bi} + \sum_{b \in B} f_{ab} c_{bi} - \sum_{j \in A} f_{ji} c_{aj} - \sum_{j \in B} f_{ji} c_{aj} \tag{3-4} \]

and considering $a \in A$ and $i \in B$, the second and third terms vanish because of the locality of the Fock matrix. This means that excited state wavefunction amplitudes which are long-ranged are nonzero, thereby complicating evaluation via local approximations.
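The structure of Eqns. 3-1 and 3-2 at large separation can be checked numerically with a toy model. The sketch below assumes two subsystems with one occupied and one virtual orbital each, a Fock matrix with no A-B coupling, and a two-electron piece $w$ set to zero to mimic infinite separation; the orbital energies are made-up numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical orbital energies (hartree) for two far-apart subsystems.
# Orbital order: occupied = [i_A, i_B], virtual = [a_A, a_B].
f_occ = np.diag([-0.50, -0.40])   # occupied-occupied Fock block, no A-B coupling
f_vir = np.diag([0.10, 0.20])     # virtual-virtual Fock block, no A-B coupling

n_occ, n_vir = f_occ.shape[0], f_vir.shape[0]
delta_occ, delta_vir = np.eye(n_occ), np.eye(n_vir)

# H[a, i, b, j] = delta_ij f_ab - delta_ab f_ji, with w assumed to vanish.
H = (np.einsum("ij,ab->aibj", delta_occ, f_vir)
     - np.einsum("ab,ji->aibj", delta_vir, f_occ))
H_mat = H.reshape(n_vir * n_occ, n_vir * n_occ)

# Charge-transfer configuration: virtual on A (a = 0), occupied on B (i = 1).
a, i = 0, 1
ct_diagonal = H[a, i, a, i]       # f_aa - f_ii
ct_row = H_mat[a * n_occ + i]     # coupling of this configuration to all others

print(ct_diagonal)
print(ct_row)
```

In this model the charge-transfer configuration has a nonvanishing diagonal element and does not couple to the local configurations, so it survives diagonalization as a long-ranged amplitude, which is the behavior described above.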
Using the same analysis for the EE-EOM methods gives

\[ \langle ai | \bar{H} | ai \rangle = \bar{H}_{aa} - \bar{H}_{ii} + \bar{H}_{iaai}, \tag{3-5} \]

which does not provide the same charge-transfer separability because $\bar{H}_{aa}$ and $\bar{H}_{ii}$ are poor approximations to the correlated electron attachments and ionization potentials. To ensure charge-transfer separability, Fock space methods such as STEOM [91], which introduce a size-extensive excitation operator, need to be used. In this case the matrix elements go as

\[ \langle ai | \bar{H} | ai \rangle = EA + IP + J. \tag{3-6} \]

To the extent that charge-transfer states are accessible using EE-EOM, they can be obtained within a local approximation by surrounding each occupied orbital with its region of excitation, as is done in the conventional approximations, but now adding a deexcitation region for the virtual orbital and filling the rest of the molecule with vacuum. The charge-transfer excited state wavefunction amplitudes obtained in this reduced space will be transferable to the target system by virtue of the transferability of the corresponding pieces of the effective Hamiltonian [92]. For excited states in which it might be expected that the charge-transfer determinants make negligible contributions, the pair contribution, $\bar{H}_{ia}$, might be approximated as

\[ \sum_{ia} \bar{H}_{ia} \approx \sum_{i} \sum_{a \in \mathrm{QM2}_i} \bar{H}_{ia}. \tag{3-7} \]

This form assumes that $\mathrm{QM2}_i = \mathrm{QM2}_a$ and is particularly interesting because it systematically allows for more nonlocal excited states by gradually relaxing the $\mathrm{QM2}_i = \mathrm{QM2}_a$ constraint. This form simultaneously avoids reducing to a conventional calculation by defining an appropriate $\mathrm{QM2}_a$ which follows $a$ regardless of its distance to $\mathrm{QM2}_i$.

3.4 Methods for Calculating Ionized States

In many respects the theoretical development of electron excited, ionized, and attached states is quite similar; however, there are distinct differences as a result of the change in the total number of electrons in the final wavefunction.
One obvious connection is that all ionized states are preceded by a number of Rydberg excited states. The excited state wavefunctions and energies from EE-EOM-CCSD approach those obtained from IP-EOM-CCSD as the Rydberg states approach the ionization continuum. In this work true ionizations are studied, as opposed to modifying the basis set of an EE-EOM calculation to represent a continuum excitation. Additionally, ionized states, as well as attached states, are used in the Fock space STEOM [91] method, which is an EOM-CCSD method for excited states which, unlike conventional EE-EOM-CCSD, has the correct charge-transfer separability, meaning that the excitation energy goes as the difference of electron attachments (EA-EOM) and detachments (IP-EOM). The IP-EOM-CCSD wavefunction should be sufficiently robust to account for the orbital relaxation effects accompanying the loss of an electron which are present in energy difference calculations. These orbital relaxation effects in IP-EOM-CCSD should approximate those obtained with a correlated calculation on the cation. Also, since the final state can be a free radical, necessary static correlation effects must also be included by building the EOM wavefunction on top of a CC wavefunction.

3.4.1 IP-EOM-CC Wavefunction

The $k$th ionized state wavefunction in IP-EOM-CCSD, $|\Psi_k\rangle$, is written as [93]

\[ |\Psi_k\rangle = R_k |\Psi\rangle, \tag{3-8} \]

where $R_k$ is the ionization operator given by

\[ R = R_1 + R_2 = \sum_{i} r_i \, i + \frac{1}{4} \sum_{ijb} r^{b}_{ij} \, b^{\dagger} j i, \tag{3-9} \]

where the $k$ dependence has been dropped for simplicity. $|\Psi\rangle$ is the ground state CCSD wavefunction given by

\[ |\Psi\rangle = e^{T} |0\rangle, \tag{3-10} \]

where $|0\rangle$ is the reference HF determinant and $T$ is the excitation operator written as

\[ T = T_1 + T_2 = \sum_{ia} t^{a}_{i} \, a^{\dagger} i + \frac{1}{4} \sum_{ijab} t^{ab}_{ij} \, a^{\dagger} b^{\dagger} j i. \tag{3-11} \]

In the case of using an MP2 reference for the ionized state wavefunction, as in IP-EOM-MP2, $R_1$ and $R_2$ are built upon the first-order $T_2$, i.e. $T_2^{[1]}$, as opposed to both converged $T_1$ and $T_2$ as in the case of the CC wavefunction. As is known [94], if the EOM linear operator is not truncated then regardless of the reference function the FCI result is obtained.

3.4.2 Eigenvalue Equation in $\bar{H}$

The Schrödinger equation for the ionized state wavefunction,

\[ H |\Psi_k\rangle = E_k |\Psi_k\rangle, \tag{3-12} \]

can be rewritten using the CC effective Hamiltonian, $\bar{H} = e^{-T} H e^{T} = (H e^{T})_c$, where $c$ means connectedness in the usual context of many-body methods, as

\[ \bar{H} R_k |0\rangle = \omega_k R_k |0\rangle, \tag{3-13} \]

where $\omega_k$ is the $k$th ionization energy and

\[ \bar{H} = \sum_{pq} \bar{H}_{pq} \{ p^{\dagger} q \} + \frac{1}{4} \sum_{pqrs} \bar{H}_{pqrs} \{ p^{\dagger} q^{\dagger} s r \} + \cdots, \tag{3-14} \]

where $\bar{H}_{pq}$, $\bar{H}_{pqrs}$, etc. are the one-, two-, and higher-body intermediates. The IP-EOM-CCSD amplitude equations are found by contracting the effective Hamiltonian with the trial ionized state wavefunction as in

\[ (\bar{H} R)_{i} = \langle i | \bar{H} R | 0 \rangle = \omega \langle i | R | 0 \rangle = \omega r_i \tag{3-15} \]

for the one-hole ionization and

\[ (\bar{H} R)^{b}_{ij} = \langle {}^{b}_{ij} | \bar{H} R | 0 \rangle = \omega \langle {}^{b}_{ij} | R | 0 \rangle = \omega r^{b}_{ij} \tag{3-16} \]

for the two-hole-one-particle ionization. Because in IP-EOM-CC the ionized state wavefunction follows from an eigenvalue equation in the CC effective Hamiltonian projected into the ionized sector, it is like a CI method built upon a CC reference function.

3.4.3 Left-Hand Eigenstate of $\bar{H}$

As is well known, the effective Hamiltonian is non-Hermitian, having both left-hand, $L_k$, and right-hand, $R_k$, eigenvectors with identical eigenvalues, $\omega_k$, due to the similarity transformation of $H$ as in $e^{-T} H e^{T}$. In conventional IP-EOM-CCSD the $k$th right-hand eigenvector is solved for using a non-Hermitian generalization of Davidson's iterative diagonalization procedure with Koopmans' guesses. This eigenvector gives the $k$th ionized wavefunction and its eigenvalue is the ionization energy, which is determined directly as opposed to via energy differences.
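The consequences of the similarity transformation can be illustrated on a small matrix. This is a minimal sketch only: a random symmetric matrix stands in for $H$, an invertible non-unitary matrix `X` stands in for $e^{T}$, and dense diagonalization replaces the Davidson procedure; none of these choices reflects the actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Symmetric stand-in for H and an invertible, non-unitary stand-in for e^T.
A = rng.standard_normal((n, n))
H = 0.5 * (A + A.T)
X = np.eye(n) + 0.1 * np.triu(rng.standard_normal((n, n)), k=1)

# Similarity-transformed matrix X^{-1} H X: non-Hermitian, same spectrum as H.
H_bar = np.linalg.solve(X, H @ X)

omega, R = np.linalg.eig(H_bar)   # columns of R are right-hand eigenvectors
L = np.linalg.inv(R)              # rows of L are left-hand eigenvectors

# Left and right eigenvectors share eigenvalues: L H_bar = diag(omega) L.
left_residual = L @ H_bar - np.diag(omega) @ L

# Generalized expectation value L_k H_bar R_k for every state k.
expectation = np.einsum("ki,ij,jk->k", L, H_bar, R)

print(np.sort(omega.real))
print(np.linalg.eigvalsh(H))
```

Choosing the left-hand vectors as the rows of the inverse of the right-hand eigenvector matrix normalizes them so that the product of a left and a right vector is one for the same state and zero otherwise.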
The left-hand deexcitation eigenvector of $\bar{H}$ is needed to compute the ionized state density matrices [95] and subsequent properties, for example gradients, polarizabilities, etc., and is given by

\[ L = \sum_{i} l^{i} \, i^{\dagger} + \sum_{ijb} l^{ij}_{b} \, i^{\dagger} j^{\dagger} b, \tag{3-17} \]

where the $k$ dependence has been dropped for simplicity. As is well known, the left-hand ground state eigenvector, $L_0$, of $\bar{H}$ is simply the conventional $(1 + \Lambda)$ deexcitation operator used in ground state CC gradients and density matrices. In IP-EOM-CCSD the left- and right-hand eigenvectors form a biorthogonal set,

\[ L_k^{\dagger} R_{k'} = \delta_{kk'}. \tag{3-18} \]

The ionization energy can then be written as a generalized expectation value,

\[ \langle 0 | L_k \bar{H} R_k | 0 \rangle = \omega_k, \tag{3-19} \]

which in expanded form is

\[ \sum_{ij} L^{i} \bar{H}_{i,j} R_{j} + \sum_{ijka} L^{i} \bar{H}_{i,jka} R^{a}_{jk} + \sum_{ijka} L^{ij}_{a} \bar{H}_{ija,k} R_{k} + \sum_{ijklab} L^{ij}_{a} \bar{H}_{ija,klb} R^{b}_{kl} = \omega, \tag{3-20} \]

again with the $k$ dependence removed. Note that for canonical HF only the first term appears and the effective Hamiltonian is the Fock matrix, which is Hermitian with $L = R = 1$, i.e. the Koopmans' guess. The ionization energy is then simply the orbital energy given by Koopmans' theorem,

\[ \omega_k = F_{kk} = \epsilon_k. \tag{3-21} \]

The IP-EOM-CCSD method scales as $O(N^6)$, with $N$ being the number of basis functions, the rate-determining step of which is the $(\bar{H} R_k)_{i}$ and $(\bar{H} R_k)^{b}_{ij}$ contractions. For singly ionized states IP-EOM-CCSD is comparable to FCI.

3.5 Localized Methods for Calculating Ionized States

The number of free parameters in the IP-EOM-CCSD equations can be greatly reduced by using localized approximations similar to those methods developed for EE-EOM-CCSD.
It is important to note that using localized orbitals in IP-EOM-CCSD without any approximations reproduces exactly the same ionization energies as with the canonical HF orbitals; however, the states are obtained in a different order unless the Koopmans' guess for the Davidson diagonalization procedure is changed to reflect the absence of a Koopmans' theorem in a localized basis. Often, many ionized states obtained with IP-EOM-CCSD are localized to a chromophoric region of the molecule, with the rest of the molecule acting as an electrostatic [96] and, to a lesser extent, correlation perturbation. The perturbation due to correlation is less of an effect given the transferability of the effective Hamiltonian. This marks the difference between two approaches: one in which only the IP-EOM-CCSD part of the calculation is local to the chromophore, with the HF and CCSD parts being global, and one in which the CCSD part of the calculation is also local to the chromophore.

3.5.1 Locating Chromophores: Bond Ionization Energies

The NLS method for ionized wavefunctions and energies is motivated by rewriting the ionization energy as a sum of contributions from localized chemical bonds,

\[ \omega_k = \sum_{i} \omega_k^{i}. \tag{3-22} \]

These bond ionization energies are used to locate the chromophoric regions of the molecule in terms of transferable chemical bonds, which are then calculated at the IP-EOM-CCSD level. For the HF bond ionization energies, the canonical energy eigenstates, $C$, must be rewritten in terms of the unitary transformation matrix, $U$, which rotates canonical orbitals into noncanonical localized orbitals, $C'$,

\[ C' = C U, \tag{3-23} \]

giving the ionization energy as a sum of bond contributions,

\[ \omega_k = \epsilon_k = F_{kk} = \sum_{ij} U_{ki} F'_{ij} U^{\dagger}_{jk} = \sum_{i} \omega_k^{i}, \tag{3-24} \]

where $F$ is the canonical Fock matrix and $F'$ is the noncanonical Fock matrix.
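The decomposition in Eqn. 3-24 can be verified on a toy Fock matrix. In this minimal sketch the orbital energies are invented and a random orthogonal matrix stands in for the localization transformation $U$; the localized Fock matrix is formed as $F' = U^{\dagger} F U$, which follows from $C' = CU$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical canonical orbital energies (hartree); F is diagonal in this basis.
eps = np.array([-20.5, -1.3, -0.7, -0.55, -0.5])
F = np.diag(eps)

# Random orthogonal matrix as a stand-in for the localization rotation U.
U, _ = np.linalg.qr(rng.standard_normal((eps.size, eps.size)))

# Noncanonical (localized) Fock matrix, F' = U^dagger F U.
F_loc = U.T @ F @ U

# bond[k, i] = sum_j U_ki F'_ij (U^dagger)_jk: contribution of localized
# orbital i to the Koopmans' ionization from canonical orbital k.
bond = np.einsum("ki,ij,jk->ki", U, F_loc, U.T)

print(bond.sum(axis=1))   # recovers the canonical orbital energies
print(eps)
```

Each row of `bond` lists the contributions of the localized orbitals to one canonical ionization; the largest entries indicate where the chromophore lies.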
The Koopmans' theorem is lost in a localized basis [97], and therefore an ionization from a single canonical molecular orbital, as it is in HF theory, becomes an ionization from a linear combination of localized molecular orbitals. In the case of IP-EOM-MP2 or IP-EOM-CCSD the bond ionization energies follow from Eqn. 3-20 represented in terms of localized orbitals and using either the MP2 or CCSD effective Hamiltonian. In all calculations except those on the density matrices, $R^{\dagger}$ is used instead of $L$. In the correlated treatment of ionization energies, as in IP-EOM-CCSD, the chromophores needed could be larger than those for HF because the problem is potentially more delocalized: many canonical orbitals contribute to a given principal ionization, and thus so do all of the localized orbitals associated with those canonical orbitals.

3.5.2 Locating Chromophores: $\Delta E$ Bond Ionization Energies

To compare the bond ionization energies calculated with HF and IP-EOM-CCSD with the corresponding total energy difference calculations, the following decomposition of the total energy into bond energies for the case of UHF is used,

\[ E^{UHF}_{corr} = \frac{1}{2} \sum_{i_\alpha j_\alpha a_\alpha b_\alpha} \langle ij \| ab \rangle \left[ \tfrac{1}{2} t^{ab}_{ij} + t^{a}_{i} t^{b}_{j} \right] + \frac{1}{2} \sum_{i_\beta j_\beta a_\beta b_\beta} \langle ij \| ab \rangle \left[ \tfrac{1}{2} t^{ab}_{ij} + t^{a}_{i} t^{b}_{j} \right] + \sum_{i_\alpha j_\beta a_\alpha b_\beta} \langle ij | ab \rangle \left[ t^{ab}_{ij} + t^{a}_{i} t^{b}_{j} \right]. \tag{3-25} \]

Because of the mixed spin component it is not possible to extract pure spin bond energies, and so this term is simply divided in two to give the following decomposition,

\[ E^{UHF}_{corr} = \sum_{i_\alpha} E_{i_\alpha} + \sum_{i_\beta} E_{i_\beta}. \tag{3-26} \]

For CCSD energy difference ionization calculations the ionization energy is given by

\[ \omega = E^{(0)}_{CCSD} - E^{(+1)}_{CCSD} = E^{(0)}_{HF} + E^{(0)}_{corr} - E^{(+1)}_{HF} - E^{(+1)}_{corr}. \tag{3-27} \]

The first and third terms give $\omega_{HF}$ and the second and fourth terms give $\omega_{corr}$. By decomposing all terms into localized bond energies it is possible to form energy difference bond ionization energies, $\omega^{i} = \omega^{i}_{HF} + \omega^{i}_{corr}$, with

\[ \omega_{HF} = \sum_{i} \omega^{i}_{HF} = \sum_{i} \left[ E^{(0)}_{i,HF} - E^{(+1)}_{i_\alpha,HF} - E^{(+1)}_{i_\beta,HF} \right] \tag{3-28} \]

and

\[ \omega_{corr} = \sum_{i} \omega^{i}_{corr} = \sum_{i} \left[ E^{(0)}_{i,corr} - E^{(+1)}_{i_\alpha,corr} - E^{(+1)}_{i_\beta,corr} \right]. \tag{3-29} \]

3.5.3 Locating Chromophores: IP-EOM-MP2 Density Matrices

Another approach to locating the chromophores is based on the density matrix from a lower level of theory and the localized orbitals which come from this ionized state density matrix. The $k$th IP-EOM-MP2 density matrix [98, 92] for the whole system,

\[ r^{k}_{qp} = \langle \Psi_k | p^{\dagger} q | \Psi_k \rangle = \langle 0 | L_k \left[ p^{\dagger} q \, T_2 \right]_c R_k | 0 \rangle, \tag{3-30} \]

is back-transformed to the atomic orbital basis,

\[ P^{k} = S C r^{k} C^{\dagger} S, \tag{3-31} \]

for use in the localized orbital procedure. The $k$th ionized state localized NBOs, $C'^{\,k}_{N-1}$, where $N$ is the total number of electrons in the original state, are then compared to the HF ground state localized NBOs, $C'_{N}$, using the unitary matrix $U^{k}$, as in

\[ C'^{\,k}_{N-1} = C'_{N} U^{k}, \tag{3-32} \]

where $U^{k}$ is unitary in the whole occupied-virtual (ov) space. Given the structure of $U^{k}$ it is possible to order the ionized state orbitals according to their similarity with the corresponding ground state orbitals. This provides a procedure for dropping those orbitals which the two wavefunctions, ground and ionized, have in common, because they do not play a significant role in the $k$th state ionization. The remainder of the ionized state orbitals defines a chromophoric region in the molecule for use in an IP-EOM-CCSD calculation. This region represents the location in the molecule where the density changes the most during an ionization. These local ionized state orbitals sometimes have occupations which are far from integer, resulting in convergence difficulties when used in the high-level IP-EOM-CCSD calculations.
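The ordering and dropping of orbitals based on $U^{k}$ can be sketched as follows. The similarity measure used here, the largest absolute element in each column of $U^{k}$, is an assumption made for illustration and is not necessarily the criterion used in this work; the function name `orbitals_to_keep` and the toy orbitals are likewise hypothetical, and the sketch treats all orbitals together rather than handling the occupied and virtual spaces separately.

```python
import numpy as np

def orbitals_to_keep(C_ground, C_ionized, S, drop_fraction):
    """Rank ionized-state orbitals by similarity to the ground-state set
    and return the indices retained after dropping the most similar ones."""
    # U[p, q] = overlap of ground-state orbital p with ionized-state orbital q,
    # so that C_ionized = C_ground @ U for S-orthonormal orbital sets.
    U = C_ground.T @ S @ C_ionized
    similarity = np.max(np.abs(U), axis=0)   # assumed similarity measure
    order = np.argsort(-similarity)          # most ground-state-like first
    n_drop = int(round(drop_fraction * U.shape[1]))
    return np.sort(order[n_drop:]), U

# Toy example: orthonormal basis (S = 1), four orbitals; the ionized state
# leaves orbitals 0 and 1 untouched and mixes orbitals 2 and 3 by 45 degrees.
S = np.eye(4)
C_ground = np.eye(4)
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
C_ionized = np.eye(4)
C_ionized[2:, 2:] = np.array([[c, -s], [s, c]])

kept, U_k = orbitals_to_keep(C_ground, C_ionized, S, drop_fraction=0.5)
print(kept)
```

In this toy case the two unchanged orbitals are dropped and the two mixed orbitals are retained as the chromophoric space.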
This approach needs to be studied using more advanced convergence accelerators.

3.5.4 Electrostatic Potentials

The local IP-EOM-CCSD problem is divided into two categories with respect to the treatment of the localized chromophore: one in which both the ground and ionized states are treated locally, and one in which the ground and ionized states come from different sized regions. Given that the ground state calculation determines the effective Hamiltonian, which is transferable, the former, less computationally expensive approach is used herein. The ground state CC equations are solved using the localized orbitals from a chromophore which is determined in the ESP due to the rest of the molecule. The correlation energy for this region, as well as the correlated wavefunction for this region, are then well defined. Subsequently the IP-EOM-CCSD equations are solved using this regionally correlated wavefunction, giving ionized state energies and wavefunctions for this region. These ionized state energies and wavefunctions are in reasonable agreement with what would have been obtained by performing the CCSD and IP-EOM-CCSD calculations on the whole molecule and then finding that particular localized ionized state in the spectrum obtained. Due to the dependence of the iterative Davidson diagonalization method on the initial guess, it is sometimes difficult to extract this particular localized ionized state because of the large number of ionized states determined in a large molecule calculation. Tailoring the initial guess wavefunction certainly helps to find the state, but it is not a guarantee.

3.6 Implementation

All calculations were performed using a modified version of the serial ACES II [16] quantum chemistry software package. Considering the size of the basis sets used, these results are not meant to be definitive. For benchmark results, larger basis sets with higher angular momentum functions should be used.
Calculations were performed using a 375 MHz Power3 machine with a total of 8 GB of shared memory as well as an 8 dual-core ia64 SGI Altix machine with a total of 256 GB of shared memory. Geometries are available upon request.

3.7 Applications

3.7.1 Motivation

Consider the localized orbitals for a single water molecule obtained using the HF, CCSD, EE-EOM-CCSD, IP-EOM-CCSD, and EA-EOM-CCSD density matrices. Although the focus is on the IP-EOM-CCSD methods, results for EE- and EA-EOM-CCSD are included for completeness. The ground state uncorrelated and correlated density matrices give the usual $c_O$, $lp_O$, and O-H NBOs with slightly different occupations; for example, HF gives an occupation of 1.99 for an O-H bond while CCSD gives an occupation of 1.95. A calculation on the cation shows that the electron is removed from an $lp_O$, a result which can be reproduced by the IP-EOM-CCSD calculation. The IP-EOM-CCSD calculation can alternatively break the O-H bonds, thereby giving the oxygen another lone pair, albeit with a lower occupation, potentially accompanied by empty-pair NBOs. A calculation on the anion shows that the electron is added to an H-H bond, a new bonding moiety, a result which can be reproduced by the EA-EOM-CCSD calculation. Other times the EA-EOM-CCSD calculation results in combinations of lone and empty pairs as well. Similar results are obtained for EE-EOM-CCSD. Empty pairs have occupations less than bonds but greater than antibonds. These results are consistent with water dimers.

Tables 3-1 and 3-2 show the partial atomic charges obtained from NAO population analysis of the water monomer and ten different water dimers, shown in Fig. 3-1, using the HF and CCSD ground states as well as the first valence state from EE-, IP-, and EA-EOM-CCSD, in that order.
For each water monomer or dimer, the partial charges for the HF and CCSD ground states as well as the EE-EOM-CCSD excited states sum to zero, while the partial charges for the IP- and EA-EOM-CCSD calculations sum to one and minus one, respectively. Note that if the molecule has symmetry then the charges reflect that symmetry. The difference in charges between the HF and CCSD ground states is small in all cases except for the first three water dimer geometries, which happen to be those dimers with single hydrogen bonds. It can be seen that the hydrogen being donated undergoes a significant change in charge as the correlation contributions are added.

For the water monomer the first excited state has only small charges assigned to the atoms; it is almost as if the atoms retain their density as atoms in the molecule, because charge reorganization is quite small. Relative to the ground state, the first excited state involves charge-transfer from the oxygen to the hydrogens, essentially reversing the formation of the bonds needed for the ground state in the first place. For dimers with local symmetry the electron is excited from both oxygens and is distributed appropriately among the hydrogens. If the dimers lack these symmetries then one of the waters dominates the excitation, showing charge-transfer from oxygen to hydrogen, while the other water undergoes small changes.

The first ionized state of the water monomer shows that the electron is removed from the oxygen while the hydrogen charges change little from their ground state values. Water dimers with local symmetry have the electron removed equally from each of the two oxygens, while dimers with no local symmetry have the electron removed from one of the oxygens, the hydrogen-bonding oxygen if one is present. The other water stays the same. For the water monomer the electron attaches itself equally among the two hydrogens with little change in the oxygen.
The results for attachments of water dimers are consistent with the other studies. For example, the first three geometries involve adding the electron to the pair of hydrogens which face away from the hydrogen bond. Of the locally excited dimers, the singly hydrogen-bonded dimers #1, #2, and #3 and the bifurcated hydrogen-bonded dimers #7, #9, and #10, the average excitation energies are 7.5 eV and 7.7 eV, respectively. Similar results are seen for the difference in ionization energies, 11.7 eV and 11.9 eV, while the attachment energies are both around 0.3 eV. The tail effects due to the other water molecule are seen by comparing with the water monomer excitation, ionization, and attachment energies of 7.6 eV, 12.6 eV, and 0.6 eV, and so it can be seen that the ionization and attachment energies are shifted by more than the excitation energy.

Transferability of the one-particle, $\bar{H}_{ij}$ and $\bar{H}_{ia}$, and two-particle, $\bar{H}^{a}_{ijk}$, components of the CC effective Hamiltonian which are needed in IP-EOM-CCSD is shown in Fig. 3-2 for a series of symmetric alcohols. The $\bar{H}_{ij}$ contribution, shown on the top left, is slowly convergent because it contains pieces of the Fock matrix which, by virtue of its incorporation of electrostatics, is much longer-ranged than typical electron correlation quantities. Despite the long-ranged pieces introduced by the Fock matrix, the most troublesome example, $\bar{H}_{lp_O\,lp_O}$, varies only a little as one moves from methanol to butanol; the variation is less than 0.004 a.u. If the Fock matrix contribution is removed (see the top right of Fig. 3-2 and note the change of ordering in the key), the effective Hamiltonian has greater transferability. The $\bar{H}_{CH,CH}$ element, shown on the bottom left, is slowly convergent, requiring larger alcohols to establish its rate of transferability. The two-particle piece approaches a constant in every combination of localized orbitals studied.
A valence ionized state wavefunction from IP-EOM-CCSD for these same alcohols is shown in Fig. 3-3. Both the one-hole amplitudes (left) and the two-hole-one-particle amplitudes (right) are transferable. The dominant singles contribution to the ionized state wavefunction is from the $n_O$, while the dominant doubles contribution is from the $n_O$ and the O-C/O-C pair.

Bond ionization energies from IP-EOM-CCSD for this valence ionized state wavefunction are shown in Table 3-3 for the same series of alcohols. The largest bond ionization energy is for an $n_O$, being 0.4003, 0.3960, 0.3935, and 0.3921 for methanol, ethanol, propanol, and butanol, respectively. Note the convergence of this value in Fig. 3-4, as well as of the C-H and O-H contributions. The next largest bond ionization contribution is from the adjacent C-H bond, followed by the O-H bond. The large contribution of the C-H bonds is due to their symmetry with respect to the $n_O$. All core contributions are small. The total ionization energy is given by the sum of the bond ionization energies and is also transferable, being 0.3719, 0.3665, 0.3640, and 0.3624 for methanol, ethanol, propanol, and butanol, respectively. Note that assuming that the ionization is completely local in space, i.e. from the $n_O$ alone, results in an error of only 0.8 eV (0.03 H) in all cases.

3.7.2 Example Using Bond Ionization Energies

For a larger example consider the radius of a water cluster, a quasilinear $(H_2O)_8$ cluster, shown in Fig. 3-5. Changes in the HF and IP-EOM-CCSD total core ionization energies upon building this larger system are shown in Fig. 3-6. The calculations are performed by adding additional waters and performing a complete CCSD and IP-EOM-CCSD calculation using all orbitals. Fig. 3-7 shows the transferability of the $n_O$ and O-H bond ionization contributions to this core ionization for the leftmost water molecule as a function of water cluster size.
It is seen that the lone-pair contribution converges much more quickly than the bond contribution. All HF and IP-EOM-CCSD bond ionization energies for the quasilinear $(H_2O)_8$ cluster are shown in Table 3-4 and Fig. 3-8. Table 3-4 contains the largest values, which are for the leftmost water in the eight-water cluster. For HF the localized $c_O$ orbital makes the largest contribution, 20.3624 H, with the $n_O$ next with a value of 0.0841 H, while for IP-EOM-CCSD these numbers are reduced to 17.1150 H and increased to 1.1219 H, respectively, by virtue of adding electron correlation contributions. This is a manifestation of the fact that for HF the Koopmans' theorem establishes that the ionization energy is simply the energy of one occupied canonical molecular orbital, which in this case is local, while for IP-EOM-CCSD the ionization energy includes effects from all occupied and unoccupied orbitals, in this case building in larger contributions from other orbitals and thereby slightly delocalizing the ionization. Note that the $n_O$ and O-H bond ionization energies for both HF and IP-EOM-CCSD, with values 0.0841 H, 0.0343 H, 1.1219 H, and 0.7139 H, are also shown in Fig. 3-7 for the full $(H_2O)_8$.

Fig. 3-8 contains the bond ionization energies for the second through eighth water molecules in the eight-water cluster. The point made previously regarding the delocalized nature of the IP-EOM-CCSD wavefunction with respect to the HF one is much more apparent here, where the HF result is delocalized only to the second water while the IP-EOM-CCSD result spreads to the third water. The final result is that an accurate core ionization for this $(H_2O)_8$ cluster can be obtained with only two waters in the case of HF and three in the case of IP-EOM-CCSD. That is, the uncorrelated chromophore is the first two waters while the correlated chromophore extends to the third water as well.
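The way a chromophore size follows from per-molecule bond ionization contributions can be sketched as below. The per-water numbers are invented for illustration and are not the entries of Table 3-4; the function name `chromophore_size`, the ordering of waters by distance from the ionized site, and the 0.001 H threshold are all assumptions of this sketch.

```python
import numpy as np

def chromophore_size(per_water, tol):
    """Smallest number of leading waters whose summed bond ionization
    contributions recover the full sum to within tol (same units)."""
    per_water = np.asarray(per_water, dtype=float)
    total = per_water.sum()
    running = np.cumsum(per_water)
    within = tol >= np.abs(total - running)
    return int(np.argmax(within)) + 1   # index of the first True, plus one

# Made-up contributions (hartree) for waters 1..8, ordered by distance
# from the ionized site; the correlated set decays more slowly.
uncorrelated_like = [20.4800, 0.0150, 0.0004, 0.0001, 0.0, 0.0, 0.0, 0.0]
correlated_like = [18.9500, 0.0400, 0.0060, 0.0005, 0.0001, 0.0, 0.0, 0.0]

n_uncorrelated = chromophore_size(uncorrelated_like, tol=0.001)
n_correlated = chromophore_size(correlated_like, tol=0.001)
print(n_uncorrelated, n_correlated)
```

With these illustrative numbers the slower decay of the correlated contributions translates into a chromophore that is one water larger, mirroring the qualitative trend described above.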
3.7.3 Example Using IP-EOM-MP2 Density Matrix

Chromophores which correspond to certain ionized states can be located by using the bond ionization energies determined at some level of theory. There is a certain amount of arbitrariness in the definition of these bond ionization energies because the ground state orbitals are still being used. This raises the question of whether a more robust means of locating chromophores is possible via the ionized state density matrices. If these matrices are localizable, then comparison of the ionized state localized orbitals with the corresponding ground state orbitals would provide a rigorous way to define the active space as those orbitals which are present in the ionized state but not in the ground state. Again, this is under the assumption that the density matrices from lower levels of theory sufficiently represent the target high-level calculation. In this way it is possible to determine the location of a chromophore in more complicated systems, for example bulk water clusters. Once the chromophore and the appropriate set of orbitals are determined, the high-level calculation can be performed in a small region with little loss in accuracy.

Fig. 3-9 shows the difference between conventional and reduced space valence IP-EOM-CCSD using NBOs for water dimer at three different geometries given in Fig. 3-1. The result using NBOs determined from the HF density matrix is on the left, while the result using NBOs determined from the IP-EOM-MP2 density matrix is on the right. The x-axis represents the percentage of NBOs which were dropped for the IP-EOM-CCSD calculation, zero being the conventional result in the NBO basis, which also serves as a reference.
Ionized state NBOs are dropped according to their similarity with the NBOs determined from the ground state; for example, 25%, 50%, and 75% mean that the given percentage of orbitals shared by the ground and ionized state are dropped. Similarity of the orbitals is determined from the structure of the unitary matrix $\mathbf{U}_k$, defined by $\mathbf{C}'^{\,k}_{N-1} = \mathbf{C}'_{N}\,\mathbf{U}_k$. Note that $\mathbf{U}_k$ is unitary in the full space. All orbitals are included in constructing $\mathbf{U}_k$, and so the given percentage of orbitals dropped applies to both the occupied space and the virtual space separately. Note that 20% is the conventional dropping of core orbitals only. The HF results were produced by simply dropping the same list of orbitals as was dropped in the IP-EOM-MP2 calculations, which were determined using the above procedure.

As can be seen from Fig. 3-9, using HF NBOs in the IP-EOM-CCSD calculation works very well for 20% and 25%, giving significantly better results than those from the IP-EOM-MP2 density matrix; however, when the amount of NBOs dropped increases to 50% and 75%, the error with respect to the conventional calculation increases to approximately 14 eV and 8 eV, respectively. Using the IP-EOM-MP2 density matrix, as shown on the right, keeps this error below 3.5 eV for all percentages. The three curves are closer to one another for HF than they are for IP-EOM-MP2 because the NBOs determined from the HF density for the three systems are more alike than the NBOs from the IP-EOM-MP2 density. The error with respect to the conventional IP-EOM-CCSD is not monotonic. For larger, more complicated examples the IP-EOM-MP2 calculation can be carried out and a corresponding set of localized ionized state NBOs can be determined; however, there are problems in converging the CC equations using this set of orbitals. Advanced convergence algorithms can be employed in CC to help convergence.
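The orbital-dropping procedure of Section 3.7.3 can be illustrated with a small sketch. It assumes orbitals expanded in an orthonormal basis, so that the transformation between the ground state and ionized state sets is simply the matrix of their overlaps. The similarity score used here (the largest overlap magnitude in each column) and all function names are hypothetical choices made for illustration; the text only states that similarity is read off from the structure of the unitary matrix.

```python
import math


def overlap_matrix(c_ground, c_ionized):
    """U[i][j] = overlap of ground orbital i with ionized-state orbital j.

    Orbitals are lists of coefficients in an orthonormal basis
    (an assumption of this sketch).
    """
    return [[sum(g * x for g, x in zip(gvec, xvec)) for xvec in c_ionized]
            for gvec in c_ground]


def similarity_scores(u):
    """Largest |U[i][j]| in each column: 1 means the orbital is unchanged."""
    n_cols = len(u[0])
    return [max(abs(row[j]) for row in u) for j in range(n_cols)]


def orbitals_to_drop(u, fraction):
    """Indices of the most ground-state-like ionized orbitals to drop."""
    scores = similarity_scores(u)
    n_drop = int(round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_drop])


# Toy data: four orbitals in a four-function orthonormal basis.
theta = math.pi / 4
ground = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ionized = [
    [1, 0, 0, 0],                               # unchanged on ionization
    [0, 1, 0, 0],                               # unchanged on ionization
    [0, 0, math.cos(theta), math.sin(theta)],   # strongly mixed
    [0, 0, -math.sin(theta), math.cos(theta)],  # strongly mixed
]
u = overlap_matrix(ground, ionized)
dropped = orbitals_to_drop(u, 0.50)
print("orbitals dropped at 50%:", dropped)
```

In this toy case the two orbitals that are identical in both states are dropped and the two that change on ionization are kept as the active space, which is the behaviour the procedure is meant to produce.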
3.7.4 (H2O)13

Core and valence ionization chromophores determined using HF bond ionization energies are shown on the left and right of Fig. 3-10, respectively, for a (H2O)13 cluster. In both cases the IP-EOM-CCSD, as well as HF, calculations are for a (H2O)5 chromophore, which in this case is treated in vacuum. The effects of an ESP due to the rest of the molecule will be considered later. The figures also show the canonical orbitals, a core orbital and the highest occupied molecular orbital (HOMO), respectively, which dominate the ionization. The resultant error in the HF ionization energies is 0.6 eV. Fig. 3-11 shows the bond ionization energies of waters which are increasingly distant from the dominant water shown in mesh. The bond ionization energies for the localized core orbital are the largest contributor and are thus shown in the key for clarity. The HF contributions from all localized orbitals, with the exception of the leading localized core, are small. For IP-EOM-CCSD the other contributions from the first water are much larger. Smaller contributions can be seen from waters two and three as well.

Fig. 3-10 shows another chromophore, in this case for the HOMO ionization. The error in the calculation for HF increases from the error for the core chromophore to 1.1 eV, given that the HOMO is less local. Bond ionization energies for this chromophore are shown in Fig. 3-11, where again the leading bond ionization energy, in this case from a lone-pair orbital on the dominant water which is in the center of the circle in Fig. 3-10, is shown in the key. For HF it can be seen that the first, fourth, and fifth waters have important lone-pair contributions to the HOMO ionization energy despite the fact that the fourth and fifth waters are more distant from the first water than the second and third waters. This is because there are no hydrogens screening the oxygen-oxygen interaction between waters one and five.
The IP-EOM-CCSD results are similar, with additional contributions from the second and third waters as well.

3.7.5 Gly5 Helix

Core and valence ionization chromophores in gly5 helix are shown in Fig. 3-12 to be localized to one and two glycine residues, respectively. This system was chosen as a test of these methods on covalently bonded systems. The error in using the chromophore of one glycine residue for the core ionization is 0.4 eV, while the error in using two glycines as the chromophore for the HOMO ionization is larger at 0.9 eV. As can be seen from Fig. 3-13, the core ionization is local, having the obvious contribution from the localized oxygen core shown in the key, with residual contributions from only those bonds belonging to the hydroxyl group and the OC bond. The IP-EOM-CCSD result is similar except that it includes contributions from the carbonyl group, such that the overall core excitation is localized to the carboxyl group of the glycine molecule.

In Fig. 3-13 it can be seen that the HOMO ionization is dominated by two peptide groups, with larger contributions from the first peptide group, for both HF and IP-EOM-CCSD. More specifically, the leading contributions are the $n_N$ and $n_O$, both from the first peptide group, for HF, and the CO bond from the first peptide group and the $n_N$ from the second peptide group for IP-EOM-CCSD. The first of each of these contributions is the largest and is shown both in the figure and in the key. In this case both the HF and IP-EOM-CCSD HOMO ionizations have small contributions from a number of bonds along the diglycine molecule.

3.7.6 (H2O)28

Table 3-5 shows low energy and high energy valence ionized states of (H2O)28 calculated using IP-EOM-CCSD. Chromophores for both bulk (eight waters) and surface (seven waters) ionizations were determined using HF bond ionization energies and are shown in Fig.
3-14. The first column of the table is simply the ionization energy of the central water of each chromophore, water nine for the bulk chromophore and water four for the surface chromophore. The same data is shown in this case for the bulk and surface because the calculations are simply a water molecule in vacuum. The ionized state amplitudes show that the low energy state is dominated by a lone-pair while the higher energy state is dominated by a sigma-bond. Column two shows the ionization energy of this water now calculated in the ESP of the other twenty-seven water molecules. The bulk low energy state is shifted the most, 1.2 eV up from the vacuum ionization energy. The high energy bulk state is shifted up by 0.5 eV. The low and high energy surface states are shifted down by 0.6 eV and 1.1 eV, respectively. It requires less energy to ionize the surface water than the bulk water for both low and high energy states.

If the entire chromophoric region is considered, as opposed to just the central water, then the high energy bulk state is shifted up by 0.8 eV, while the lower energy surface state is shifted up by 0.4 eV and the higher energy surface state is shifted down by 1.2 eV. In this case both of the states for the surface chromophore are high in energy, one being dominated by a sigma-bond on oxygen twenty-three and the other by the sigma-bond on oxygen five. Note that no low energy state was found for the bulk chromophore and the high energy state is dominated by a sigma-bond involving oxygen twenty-two.

#1  nonplanar open, C_s          #6  cyclic, C_2h
#2  open, C_1                    #7  triply hydrogen-bonded, C_s
#3  planar open, C_s             #8  doubly bifurcated, C_2h
#4  cyclic, C_i                  #9  nonplanar bifurcated, C_2v
#5  cyclic, C_2                  #10 planar bifurcated, C_2v

Figure 3-1: Ball and stick models of water dimer geometries.
Ten different geometries are shown along with point group symmetries and other characteristics [99].

Table 3-1: Partial atomic charges for electronic states of H2O. States include the HF and CCSD ground states as well as the first EE-, IP-, and EA-EOM-CCSD states in an aug-cc-pVTZ basis. For convenience the EE-, IP-, and EA-EOM-CCSD state energies are shown in eV in place of the corresponding labels.

        HF     CCSD    7.613   12.608   0.626
O1     0.929   0.904   0.244   0.034    0.960
H2     0.464   0.452   0.122   0.517    0.019
H3     0.464   0.452   0.122   0.517    0.019

Table 3-2: Partial atomic charges for electronic states of (H2O)2. States include the HF and CCSD ground states as well as the first EE-, IP-, and EA-EOM-CCSD states in an aug-cc-pVTZ basis for the geometries shown in Fig. 3-1. For convenience the EE-, IP-, and EA-EOM-CCSD state energies are shown in eV in place of the corresponding labels.

       #1                                        #6
        HF     CCSD    7.585   11.746   0.400     HF     CCSD    7.719   12.288   0.603
O1     0.963   0.783   0.230   0.070    0.815    0.952   0.927   0.657   0.495    0.939
O2     0.944   0.888   0.949   0.950    0.931    0.952   0.927   0.657   0.495    0.939
H3     0.458   0.457   0.043   0.505    0.436    0.468   0.458   0.417   0.510    0.125
H4     0.495   0.279   0.262   0.533    0.282    0.484   0.469   0.239   0.484    0.313
H5     0.476   0.467   0.480   0.490    0.014    0.468   0.458   0.417   0.510    0.125
H6     0.476   0.467   0.480   0.490    0.014    0.484   0.469   0.239   0.484    0.313

       #2                                        #7
        HF     CCSD    7.530   11.726   0.353     HF     CCSD    7.774   12.038   0.400
O1     0.958   0.782   0.229   0.070    0.814    0.945   0.921   0.959   0.958    0.965
O2     0.945   0.893   0.943   0.958    0.944    0.941   0.914   0.242   0.029    0.935
H3     0.456   0.455   0.064   0.505    0.437    0.471   0.460   0.477   0.498    0.112
H4     0.495   0.285   0.277   0.536    0.287    0.473   0.461   0.468   0.464    0.089
H5     0.477   0.467   0.477   0.488    0.139    0.470   0.457   0.128   0.512    0.462
H6     0.475   0.466   0.482   0.498    0.105    0.470   0.457   0.128   0.512    0.462

       #3                                        #8
        HF     CCSD    7.547   11.708   0.374     HF     CCSD    7.567   12.339   0.462
O1     0.958   0.783   0.243   0.070    0.812    0.933   0.909   0.589   0.470    0.937
O2     0.948   0.898   0.936   0.967    0.945    0.933   0.909   0.589   0.470    0.937
H3     0.457   0.456   0.050   0.506    0.436    0.466   0.454   0.294   0.485    0.218
H4     0.494   0.289   0.272   0.538    0.293    0.466   0.454   0.294   0.485    0.218
H5     0.478   0.468   0.474   0.490    0.152    0.466   0.454   0.294   0.485    0.218
H6     0.477   0.467   0.481   0.502    0.125    0.466   0.454   0.294   0.485    0.218

       #4                                        #9
        HF     CCSD    7.849   12.429   0.584     HF     CCSD    7.722   11.841   0.336
O1     0.952   0.926   0.590   0.487    0.939    0.942   0.915   0.220   0.051    0.942
O2     0.952   0.926   0.590   0.487    0.939    0.944   0.920   0.959   0.977    0.965
H3     0.468   0.456   0.259   0.507    0.136    0.470   0.457   0.116   0.523    0.463
H4     0.484   0.469   0.331   0.480    0.302    0.470   0.457   0.116   0.523    0.463
H5     0.468   0.456   0.259   0.507    0.136    0.472   0.460   0.473   0.490    0.009
H6     0.484   0.469   0.331   0.480    0.302    0.472   0.460   0.473   0.490    0.009

       #5                                        #10
        HF     CCSD    7.727   12.293   0.568     HF     CCSD    7.670   11.932   0.339
O1     0.952   0.926   0.594   0.491    0.939    0.936   0.910   0.397   0.049    0.937
O2     0.952   0.926   0.594   0.491    0.939    0.941   0.917   0.905   0.965    0.963
H3     0.467   0.457   0.266   0.508    0.113    0.468   0.454   0.208   0.521    0.462
H4     0.484   0.469   0.327   0.482    0.325    0.468   0.454   0.208   0.521    0.462
H5     0.467   0.457   0.266   0.508    0.113    0.470   0.459   0.443   0.485    0.011
H6     0.484   0.469   0.327   0.482    0.325    0.470   0.459   0.443   0.485    0.011

[Figure 3-2 plots: four panels, each plotted against n for HO(CH2)_{n-1}CH3, showing $\bar{H}_{ij}$, $\bar{H}_{ij} - F_{ij}$, $\bar{H}^{a}_{i}$, and $\bar{H}^{a}_{ijk}$; the keys list the localized orbitals and scale factors.]

Figure 3-2: Transferability of the ionized state projection of $\bar{H}$. The CC effective Hamiltonian is shown for a series of cc-pVDZ symmetric alcohols.
The one-particle elements, $\bar{H}_{ij}$ and $\bar{H}^{a}_{i}$, as well as the two-particle element, $\bar{H}^{a}_{ijk}$, are shown, where all localized orbitals, $n_O$, $\sigma_{CH}$, $\sigma_{OC}$, $\sigma^{*}_{CH}$ and $r_O$, belong to the left-hand side of the molecule and are shown in the keys. Also shown is $\bar{H}_{ij} - F_{ij}$, where $F$ is the Fock matrix. The elements are scaled by the values shown in the keys so that they are roughly the same for methanol.

[Figure 3-3 plots: two panels, each plotted against n for HO(CH2)_{n-1}CH3, showing the one-hole amplitudes $r_i$ (left) and the two-hole-one-particle amplitudes $r^{b}_{ij}$ (right); the keys list the localized orbitals and scale factors.]

Figure 3-3: Transferability of a valence ionized state wavefunction. Calculations are for IP-EOM-CCSD on a series of cc-pVDZ symmetric alcohols. The one-hole amplitudes, $r_i$, and two-hole-one-particle amplitudes, $r^{b}_{ij}$, are shown on the left and right, respectively, where all localized orbitals, $n_O$, $\sigma_{CH}$, $\sigma_{OC}$, $\sigma^{*}_{CH}$, $\sigma^{*}_{OC}$ and $r_O$, belong to the left-hand side of the molecule and are shown in the keys. The amplitudes are scaled by the values shown in the keys so that they are roughly the same for methanol.

Table 3-3: IP-EOM-CCSD valence bond ionization energies. Also shown are the corresponding total ionization energies for a series of cc-pVDZ symmetric alcohols. Moving down in the table corresponds to moving left to right in the molecule for each type of NLMO. All values are in Hartree. Note that an alternative definition of bond ionization energies is used here, where the $\bar{H}R_k$ elements are divided by $R$ as seen from Eqns. 3-15 and 3-16, hence the possibility of negative values.
NLMO     HO(CH2)0CH3   HO(CH2)1CH3   HO(CH2)2CH3   HO(CH2)3CH3
c_O        0.0001        0.0001        0.0001        0.0001
c_C        0.0000        0.0000        0.0000        0.0000
c_C                      0.0000        0.0000        0.0000
c_C                                    0.0000        0.0000
c_C                                                  0.0000
lp_O       0.0086        0.0085        0.0085        0.0084
lp_O       0.4003        0.3960        0.3935        0.3921
σ_OH       0.0014        0.0014        0.0014        0.0014
σ_CO       0.0001        0.0004        0.0005        0.0005
σ_CH       0.0107        0.0115        0.0113        0.0114
σ_CH       0.0107        0.0115        0.0113        0.0114
σ_CC       0.0002        0.0001        0.0001        0.0001
σ_CH                     0.0002        0.0002        0.0002
σ_CH                     0.0002        0.0002        0.0002
σ_CC                     0.0001        0.0000        0.0000
σ_CH                                   0.0000        0.0000
σ_CH                                   0.0000        0.0000
σ_CC                                   0.0000        0.0000
σ_CH                                                 0.0000
σ_CH                                                 0.0000
σ_CH                                                 0.0000
IP         0.3719        0.3665        0.3640        0.3624

[Figure 3-4 plot: bond ionization energies plotted against n for HO(CH2)_{n-1}CH3; the key lists lp_O and the scaled σ_OH and σ_CH contributions.]

Figure 3-4: Transferability of IP-EOM-CCSD bond ionization energies. A valence ionized state wavefunction from IP-EOM-CCSD calculations was studied for a series of cc-pVDZ symmetric alcohols. All localized orbitals, $n_O$, $\sigma_{CH}$ and $\sigma_{OH}$, belong to the left-hand side of the molecule and are shown in the keys. The energies are scaled by the values shown in the keys so that they are roughly the same for methanol. Hartree units are used throughout.

Figure 3-5: Ball and stick model of quasilinear (H2O)8. The cluster represents the radius of a three-dimensional bulk cluster.

[Figure 3-6 plots: total core ionization energy in Hartree plotted against n for (H2O)n, n = 2 to 8, for HF (left) and IP-EOM-CCSD (right).]

Figure 3-6: Total core ionization energies of growing water clusters. Results for HF (left) and IP-EOM-CCSD (right) are shown for clusters leading up to the quasilinear (H2O)8 radius shown in Fig. 3-5. The calculations are performed by adding additional waters and performing a complete CCSD and IP-EOM-CCSD in the 6-31G basis set with no orbitals dropped.
[Figure 3-7 plots: four panels plotted against n for (H2O)n, n = 2 to 8, showing the lp_O contribution (top) and the σ_OH contribution (bottom) in Hartree for HF (left) and IP-EOM-CCSD (right).]

Figure 3-7: Bond ionization energy contributions of growing water clusters. Results correspond to the core ionization energies for HF (left) and IP-EOM-CCSD (right) leading up to the quasilinear (H2O)8 cluster shown in Fig. 3-5. The calculations are performed by adding additional waters and performing a complete CCSD and IP-EOM-CCSD in the 6-31G basis set with no orbitals dropped. The transferability of the bond ionization energy contributions corresponding to the leftmost water is shown for $n_O$ and $\sigma_{OH}$.

Table 3-4: The largest bond ionization energies for (H2O)8. Values are shown for both HF and IP-EOM-CCSD for the leftmost water contributions to the core ionization energy of the quasilinear (H2O)8 cluster shown in Fig. 3-5. The rest of the values, for the second through eighth waters, are shown in Fig. 3-8. All values are in Hartree units and a 6-31G basis set was used with no orbitals dropped.

NLMO        HF        IP-EOM-CCSD
c_O       20.3624      17.1150
n_O        0.0841       1.1219
n_O        0.0048       0.2697
σ_OH       0.0304       0.6720
σ_OH       0.0343       0.7139

[Figure 3-8 plot: bond ionization energies in Hartree plotted against the local occupied orbital index of (H2O)8 for HF and IP-EOM-CCSD.]

Figure 3-8: Bond ionization energies for (H2O)8. Values are shown for both HF and IP-EOM-CCSD for the second through eighth water contributions to the core ionization energy of the quasilinear (H2O)8 cluster shown in Fig.
3-5. All values are in Hartree units and a 6-31G basis set was used with no orbitals dropped.

[Figure 3-9 plots: error in the ionization energy in eV plotted against the percentage of NBOs dropped (0, 25, 50, 75) for geometries #1, #2, and #3, using HF (left) and IP-EOM-MP2 (right) density matrices.]

Figure 3-9: IP-EOM-CCSD ionization energies from lower level density matrices. (H2O)2 was studied at three different geometries given in Fig. 3-1, calculated using different percentages of the total number of NBOs determined from the HF (left) or IP-EOM-MP2 (right) density matrices. The x-axis indicates that the given percentage of NBOs is dropped from the occupied and separately from the virtual spaces, except for 20%, which is the conventional dropping of only core occupied orbitals. The y-axis shows the difference in the ionization energy without dropping orbitals compared to dropping the given percentage. An aug-cc-pVTZ basis was used.

Figure 3-10: Ionization chromophores in (H2O)13. Core (left) and valence (right) chromophores are determined using 6-31G HF bond ionization energies. The corresponding canonical orbitals are shown in circles. HF and IP-EOM-CCSD calculations on the (H2O)5 chromophore are done in vacuum as opposed to an ESP due to the other waters. The canonical core and HOMO orbitals which dominate the ionization are shown in mesh.

[Figure 3-11 plots: bond ionization energies in Hartree plotted against the local occupied orbital index for the core ionization (key: HF 20.4847, IP-EOM-CCSD 17.1687) and the HOMO ionization (key: HF 0.3377, IP-EOM-CCSD 0.1984).]

Figure 3-11: Bond ionization energies for the (H2O)5 chromophore. HF and IP-EOM-CCSD 6-31G bond ionization energies for core and HOMO ionizations are shown for the chromophore in Fig.
3-10. The x-axis represents occupied orbitals of waters which are an increasing distance from the dominant water shown in the center of the corresponding circles in Fig. 3-10. The dominant localized core and lone-pair ionization energies, corresponding to orbitals one and three, respectively, are shown in the key for the sake of clarity.

Figure 3-12: Ionization chromophores in gly5 helix. Core (left) and valence (right) chromophores are determined using 6-31G HF bond ionization energies. The corresponding canonical orbitals are shown in circles. HF and IP-EOM-CCSD calculations on the gly1 and gly2 chromophores are done in vacuum as opposed to an ESP due to the other residues. The canonical oxygen core and HOMO orbitals which dominate the ionizations are shown in mesh.

[Figure 3-13 plots: bond ionization energies in Hartree plotted against the local occupied orbital index for the core ionization (key: HF 20.3846, IP-EOM-CCSD 16.6172) and the HOMO ionization (key: HF 0.1741, IP-EOM-CCSD 0.0640).]

Figure 3-13: Bond ionization energies for the gly1 and gly2 chromophores. HF and IP-EOM-CCSD 6-31G bond ionization energies for the core and HOMO ionizations are shown for the chromophores in Fig. 3-12. The x-axis represents occupied orbitals which are along the peptide moving from the C-terminus to the N-terminus. The dominant localized oxygen core ionization energies, corresponding to orbital one, are shown in the key on the left for the sake of clarity. The dominant ionization energies, from the nitrogen lone-pair in the case of HF and the CO bond in the case of IP-EOM-CCSD, are shown in both the figure and the key on the right.

Figure 3-14: Ball and stick models of (H2O)28. One cluster with a region containing eight bulk water molecules is on the left, while one with a region containing seven surface water molecules is on the right.
Both regions are shown in yellow. The eight water molecules were chosen to be those within a distance of 4.0 Å from the center molecule given by oxygen 9, while the seven water molecules were chosen to be those within a distance of 4.5 Å from the center molecule given by oxygen 4.

Table 3-5: IP-EOM-CCSD valence ionization energies of (H2O)28. The chromophoric regions shown in Fig. 3-14 were used with a 6-31G* basis. There is one chromophoric region representative of the bulk and one for the surface, both of which were determined from bond ionization energies at the HF level. The first two columns show the ionization energies in eV for the corresponding center water molecule in vacuum as well as in the ESP of the other twenty-seven waters. The last two columns show the ionization energies of the entire chromophoric region, eight water molecules for the bulk region and seven for the surface region, in vacuum as well as in the ESP of the other twenty and twenty-one water molecules, respectively. Also shown for all states are the leading ionized state wavefunction amplitudes. For each of the bulk and surface calculations a lower energy and a higher energy state are shown; note the absence of a low energy state for the bulk (H2O)8 region.
bulk
  columns:             (H2O)1;  (H2O)1(H2O)27;  (H2O)8;  (H2O)8(H2O)20
  ionization energy:   11.83  13.03
  leading amplitudes:  0.9733 r_9 n_O;  0.8708 r_9 n_O
  ionization energy:   18.58  19.08  19.77  20.57
  leading amplitudes:  0.6944 r_9 σ_OH;  0.7003 r_9 σ_OH;  0.4737 r_22 σ_OH;  0.4475 r_22 σ_OH
                       0.4238 r_16 σ_OH;  0.3412 r_10 n_O
                       0.2505 r_10 σ_OH

surface
  columns:             (H2O)1;  (H2O)1(H2O)27;  (H2O)7;  (H2O)7(H2O)21
  ionization energy:   11.83  11.27  18.46  18.84
  leading amplitudes:  0.9733 r_4 n_O;  0.6972 r_4 n_O;  0.5436 r_23 σ_OH;  0.5175 r_23 σ_OH
                       0.4308 r_5 σ_OH;  0.3947 r_24 σ_OH
  ionization energy:   18.58  17.45  37.63  36.40
  leading amplitudes:  0.6944 r_4 σ_OH;  0.6906 r_4 σ_OH;  0.6191 r_5 σ_OH;  0.6029 r_5 σ_OH
                       0.2664 r_5 n_O;  0.2099 r_23 σ_OH
                       0.1548 r_23 σ_OH

CHAPTER 4 LOCAL APPROXIMATIONS FOR MOLECULAR PROPERTIES

When local correlation methods fail, the concern is the proper inclusion of nonlocality in the calculations, preferably via intermolecular force theory. As the intent is to build all of the electronic structure of a large molecule from its largely transferable functional groups, it is necessary to address higher-order properties in addition to simple extensive properties like energies and densities. Primary among these is the dynamic polarizability, which leads to excited states and their associated density matrices [100]. The objective is to begin to assess the degree to which such properties can be constructed from localized, transferable regions, and to actually be able to observe when the locality approximation breaks down. Conceivably then, small interaction matrices involving the component groups could be built to allow for essential further interactions. A first approximation would include London dispersion forces. The extent to which London dispersion coefficients are transferable has been a major theme of chemistry for a number of years; for example, consider the combination formulas due to Slater and Kirkwood [101], which can be used to construct heteronuclear coefficients from homonuclear coefficients.
However, oftentimes these atomic contributions are not very transferable. Some of these models have been shown to be even qualitatively incorrect [6]. By virtue of the n-center NLMOs, some internal cancellations are incorporated a priori by defining functional groups in terms of multicentered bonds. The polarizability and higher-order response terms provide information about the optical properties of molecules via the response of a molecule to an external electric field. The higher-order terms provide the nonlinear optical (NLO) properties [102], which are important in the design of optical materials [103, 104], while the poles of the frequency-dependent polarizability provide the ultraviolet/visible excitation spectrum of a molecule [105, 101].

4.1 Effective Hamiltonian Dynamic Polarizabilities

It is well known that these properties are given as external electric field derivatives of the energy, or of the quasienergy in the time-dependent theories. The polarizability can be calculated using analytical derivative methods or perturbation theory [104, 106, 107, 108]. The latter method, based upon the EOM-CC theory, is used herein because of its accuracy in calculating excited states [75], dynamic polarizabilities [107], dispersion coefficients [109], and NMR coupling constants [110]. The CC method [1, 2] is well known to be the most accurate tool for determining the electronic structure of small to medium-sized molecules. Determining the correlated polarizabilities of molecules follows similarly from time-dependent HF (TDHF) [106, 104], except that the focus is on the correlated CC effective Hamiltonian, $\bar{H} = e^{-T} H e^{T} = (H e^{T})_c$, as opposed to the bare Hamiltonian. Here $T$ is the CC excitation operator and the subscript $c$ means connectedness as in the context of CC methods.
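The structure of the effective Hamiltonian can be made concrete with a two-state toy model, sketched below under stated assumptions. The 2x2 matrix, its numerical values, and all function names are hypothetical and only serve as an illustration. With a single excitation amplitude $t$ the operator $T$ is nilpotent, so $e^{\pm T} = 1 \pm T$ exactly. The sketch shows that the similarity transformation preserves the eigenvalues, that the transformed matrix is no longer symmetric, and that choosing $t$ to zero the projection onto the excited determinant makes the reference expectation value equal to the exact ground state energy of the model.

```python
import math


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eigenvalues_2x2(m):
    """Eigenvalues of a real 2x2 matrix with a real spectrum."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr / 4.0 - det)
    return (tr / 2.0 - root, tr / 2.0 + root)


def similarity_transform(h, t):
    """H_bar = exp(-T) H exp(T) with T = [[0, 0], [t, 0]].

    T*T = 0, so exp(+T) = 1 + T and exp(-T) = 1 - T exactly.
    """
    exp_plus = [[1.0, 0.0], [t, 1.0]]
    exp_minus = [[1.0, 0.0], [-t, 1.0]]
    return matmul(exp_minus, matmul(h, exp_plus))


# Hypothetical model Hamiltonian in the basis {reference, excited determinant}.
a, b, d = 0.0, 0.2, 1.0
h = [[a, b], [b, d]]

# Amplitude equation: H_bar[1][0] = b + (d - a) t - b t^2 = 0 (ground-state root).
t = ((d - a) - math.sqrt((d - a) ** 2 + 4.0 * b * b)) / (2.0 * b)
h_bar = similarity_transform(h, t)
energy = h_bar[0][0]  # reference expectation value of H_bar

print("H_bar =", h_bar)
print("CC energy =", energy, " exact =", eigenvalues_2x2(h)[0])
```

The off-diagonal elements of the transformed matrix differ (one is zero, the other is not), which is the toy-model counterpart of the non-Hermitian character of the effective Hamiltonian and of the need for distinct left-hand and right-hand eigenvectors.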
The effective Hamiltonian approach to CC dynamic polarizabilities has been discussed by Stanton and Bartlett [107] and generalized by Rozyczko, et al. [108] and Sekino and Bartlett [111] to a fully extensive linear approximation used here. There has been great progress in the calculation of correlated optical properties like polarizabilities and dispersion coefficients, some of which have emphasized larger systems [107, 108, 109, 112, 113, 114, 33, 115, 116]. The effective Hamiltonian approach has been shown to reproduce FCI values for atomic polarizabilities and dispersion coefficients [107].

The induced electric dipole of a molecule is proportional to the electric field, with the proportionality constant given by the frequency-dependent polarizability, which at time zero is given by

$$\mu_q = \mu^{0}_{q} + \alpha_{qq'}(-\omega, \omega)\, F_{q'} + \cdots, \qquad (4\text{-}1)$$

with $\mu^{0}_{q} = \langle 0 | (\mu_q e^{T})_c | 0 \rangle$, where $|0\rangle$ is the single-determinant reference function, and with $F_q$ being the field amplitude from the time-dependent perturbation $F_q \cos(\omega t)$. This equation comes from the following Taylor series of the energy, expanded about zero electric field strength and taken at time zero,

$$E = E_0 + \mu^{0}_{q} F_q + \alpha_{qq'}(-\omega, \omega)\, F_q F_{q'} + \cdots, \qquad (4\text{-}2)$$

with $E_0 = \langle 0 | (H e^{T})_c | 0 \rangle$, and where the $q$ component of the electric dipole, $\mu^{0}_{q}$, is given as the first derivative of the energy with respect to the $q$ component of the field. The (un)primed letter $q$ designates Cartesian coordinates. Mixed second partial field derivatives of the energy give the polarizability tensor, $\alpha_{qq'}$, which is also given as the first dipole derivative as in Eqn. 4-1. Higher-order derivatives of the energy give the NLO properties of molecules.

4.1.1 Size-Extensivity

A method to determine dynamic electric dipole polarizabilities using the CC effective Hamiltonian has been previously described [107] and will only briefly be summarized.
This approach is equivalent to a sum-over-states (SOS) approach [105] for the frequency-dependent polarizability as given by

$$\alpha_{qq'}(-\omega, \omega) = \sum_{k} \left[ \frac{\langle 0 | \mu_q | k \rangle \langle k | \mu_{q'} | 0 \rangle}{\omega_k + \omega} + \frac{\langle 0 | \mu_q | k \rangle \langle k | \mu_{q'} | 0 \rangle}{\omega_k - \omega} \right], \qquad (4\text{-}3)$$

in which $\omega$ is the frequency of the external field, while $\omega_k$ and $|k\rangle$ are the excited state energy gaps and wavefunctions, respectively. In practice the effective Hamiltonian approach avoids a diagonal representation of the effective Hamiltonian, which would be necessary in the above SOS expression. Using perturbation theory with a CC reference function [107] gives the following general form for the polarizability tensor,

$$\alpha_{qq'}(-\omega, \omega) = \sum_{l=0}^{1} \langle 0 | (1 + \Lambda) [\bar{\mu}_q - \langle \bar{\mu}_q \rangle] | g \rangle \langle g | \bar{H} - E + (-1)^{l} \omega | g \rangle^{-1} \langle g | [\bar{\mu}_{q'} - \langle \bar{\mu}_{q'} \rangle] | 0 \rangle \qquad (4\text{-}4)$$

$$= \sum_{l=0}^{1} \langle 0 | (1 + \Lambda) \left[ \bar{\mu}_q, X^{l}_{q'} \right] | 0 \rangle, \qquad (4\text{-}5)$$

where $X^{l}_{q}$ are the perturbed amplitudes. The total zero-field ground state CC energy is given by $E$, and $|0\rangle$ is the reference determinant, while $|g\rangle$ are excited determinants which are complementary to the reference determinant. In this general case the zero-field Hamiltonian, $\bar{H}$, is non-Hermitian, having left, $\langle \tilde{k} | = \langle 0 | L_k e^{-T}$, and right, $| k \rangle = e^{T} R_k | 0 \rangle$, eigenvectors, where $L_0 = (1 + \Lambda)$ and $\Lambda$ is the de-excitation operator from the theory of CC gradients and density matrices [2, 75]. The SOS Eqn. 4-3 is recovered if the Hamiltonian and complementary functions are represented in terms of excited state wavefunctions, where the Hamiltonian is diagonal, as opposed to its representation in terms of determinants. Note that upon expansion the $\langle \bar{\mu}_{q'} \rangle \langle g | 0 \rangle$ term vanishes. In the linear approximation the $[\bar{\mu}_q - \langle \bar{\mu}_q \rangle]$ term, with $\langle \bar{\mu}_q \rangle = \langle 0 | (\mu_q e^{T})_c | 0 \rangle$, is a dipole operator which is represented with respect to the ground state CC wavefunction and is necessary because there is no requirement that $\Lambda$ and $\bar{\mu}_q$ be connected.
However, by imposing the condition that $[\bar{\mu}_q - \langle \bar{\mu}_q \rangle] X^{l}_{q'} = \left[ \bar{\mu}_q, X^{l}_{q'} \right]$, the previously proposed linear approximation that is fully extensive [108, 111] is obtained.

4.1.2 Linear Response

In the effective Hamiltonian approach the linear response equations,

$$\langle g | \bar{H} - E + (-1)^{l} \omega | g \rangle \langle g | X^{l}_{q} | 0 \rangle = \langle g | \bar{\mu}_q | 0 \rangle, \qquad (4\text{-}6)$$

are solved for the perturbed wavefunction amplitudes, with $l = 0$ or $1$, using iterative methods. Once the amplitudes are determined they are contracted with the perturbation to give the polarizability

$$\alpha_{qq'} = \sum_{ia} \langle 0 | (1 + \Lambda) \left[ (\mu_q e^{T})_c, X^{0}_{q'} + X^{1}_{q'} \right] | {}^{a}_{i} \rangle + \sum_{ijab} \langle 0 | (1 + \Lambda) \left[ (\mu_q e^{T})_c, X^{0}_{q'} + X^{1}_{q'} \right] | {}^{ab}_{ij} \rangle, \qquad (4\text{-}7)$$

which has been written in a more complete form for clarification. The usual declaration of $i, j, \ldots \in \{occ\}$, $a, b, \ldots \in \{vir\}$ and $p, q, \ldots \in \{occ\} \cup \{vir\}$ is used. Oftentimes it is useful to work with the isotropic polarizability $\langle \alpha \rangle$, given by

$$\langle \alpha \rangle = \frac{1}{3} (\alpha_{xx} + \alpha_{yy} + \alpha_{zz}), \qquad (4\text{-}8)$$

to ensure averaging over molecular orientation.

4.2 Localized Effective Hamiltonian Dynamic Polarizabilities

Calculations on larger systems are often prohibitive with high-level methods due to an exponential scaling wall. To correct these limitations it is necessary to have ab initio methods which scale favorably with system size but which hopefully are still systematically improvable. Considering the success of local correlation methods for large molecules, it is interesting to consider whether the same advantages hold for more complicated properties. Generally these methods can reproduce conventional CC results, emphasizing the correlated energies, and have been applied to optical properties with some success [114, 115]. These methods can be supplemented with an ESP, as has been done for the optical properties of large molecules using the FMO method [117].
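The relationship between the linear response route of Eqn. 4-6 and the sum-over-states expression of Eqn. 4-3, together with the isotropic average of Eqn. 4-8, can be checked on a small model. The sketch below is only an illustration under simplifying assumptions: the excitation-energy matrix is a hypothetical symmetric 2x2 matrix (the actual CC effective Hamiltonian is non-Hermitian and needs distinct left and right vectors), the dipole vector is made up, and the function names are not taken from any program.

```python
import math


def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return [x0, x1]


def alpha_linear_response(a, mu, omega):
    """Sum over l = 0, 1 of mu . X^l, where (A + (-1)^l omega) X^l = mu."""
    total = 0.0
    for l in (0, 1):
        shift = (-1) ** l * omega
        shifted = [[a[0][0] + shift, a[0][1]], [a[1][0], a[1][1] + shift]]
        x = solve_2x2(shifted, mu)
        total += mu[0] * x[0] + mu[1] * x[1]
    return total


def alpha_sum_over_states(a, mu, omega):
    """Same quantity from the eigenpairs of the symmetric model matrix A."""
    mean = 0.5 * (a[0][0] + a[1][1])
    root = math.sqrt((0.5 * (a[0][0] - a[1][1])) ** 2 + a[0][1] ** 2)
    total = 0.0
    for gap in (mean - root, mean + root):
        vec = [a[0][1], gap - a[0][0]]  # unnormalized eigenvector (needs a[0][1] != 0)
        norm = math.hypot(vec[0], vec[1])
        moment = (vec[0] * mu[0] + vec[1] * mu[1]) / norm
        total += moment ** 2 / (gap + omega) + moment ** 2 / (gap - omega)
    return total


def isotropic(alpha_xx, alpha_yy, alpha_zz):
    """Isotropic average of the diagonal tensor components (Eqn. 4-8)."""
    return (alpha_xx + alpha_yy + alpha_zz) / 3.0


a_model = [[0.4, 0.1], [0.1, 0.7]]  # hypothetical excitation-energy matrix (Hartree)
mu_model = [0.8, 0.3]               # hypothetical dipole vector
omega = 0.1                         # field frequency below the lowest excitation

alpha_lr = alpha_linear_response(a_model, mu_model, omega)
alpha_sos = alpha_sum_over_states(a_model, mu_model, omega)
print("linear response:", alpha_lr, " sum over states:", alpha_sos)
```

The two routes agree to numerical precision, but only the sum-over-states route requires diagonalizing the matrix, which is the practical reason the linear equations are preferred for large spaces.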
Given that the optical properties are sensitive to the quality of the unperturbed wavefunction, it is important that the local approximation not destroy the reference function. The local CC methods should allow CC theory to be applied to calculate the optical properties of larger molecules in a systematically improvable fashion. The major objective is to assess the potential transferability of response properties of chromophoric regions in a molecule. The same tools can be used to obtain London dispersion interactions by evaluating such polarizabilities at fictitious imaginary frequencies and using the Casimir-Polder formula [118].

4.2.1 Localized De-Excitation Operators

The NLSCCSD approach to calculating the polarizabilities of large molecules follows from the ground state NLSCCSD wavefunction by considering, in analogy with a local excitation operator, for example $T^{i}_{2}$ in the case of double excitations, a local de-excitation operator, $\Lambda^{i}_{2}$. This linear de-excitation operator is the left-hand eigenvector of $\bar H$ and is a de-excitation specific to orbital $i$, given by

$$\Lambda_2 = \sum_i \Lambda^{i}_{2} = \sum_i \frac{1}{4}\sum_{jab}\lambda^{ij}_{ab}\, i^{\dagger} j^{\dagger} b a. \tag{4-9}$$

Because the left-hand eigenstate of the effective Hamiltonian is parameterized by a linear CI-like operator, much like the excitation operator from EOM-CC [75], some higher-order properties are not always size-extensive in the usual CC terminology. To guarantee size-extensivity for such properties the linear approximation is used herein. Compared to CI, the deviations from size-extensivity in the $\Lambda$ operator by virtue of its disconnectedness are small, although formally still present. While $C$ from CI, see Fig. 2-3, is not transferable, $\Lambda$ from CC is transferable similarly to $T$, the CC excitation operator.
The excitation operators $C_2$ and $T_2$ from CISD and CCSD, respectively, are related by

$$C_2 = T_2 + \frac{1}{2}T_1^{2}, \tag{4-10}$$

while for the de-excitation operators $\Lambda_2$ and $\Sigma_2$ from a linear and exponential left-hand ansatz, respectively,

$$\Lambda_2 = \Sigma_2 + \frac{1}{2}\Sigma_1^{2}, \tag{4-11}$$

is obtained. In the case of CISD the operator $C_2$ is size-inextensive and thus does not scale properly with system size. One manifestation of this scaling is understood by considering the noncumulant piece of the double excitations, $T_1^{2}$, which can be nonvanishing in cases where $t^{a}_{i} \in A$ and $t^{b}_{j} \in B$, with $A$ and $B$ two systems which are distant from one another. Although it is expected that similar behavior would be observed for $\Lambda_2$ by virtue of its linearity, as in the context of extensivity in EOM-CC [75], the size of $\Sigma_1$ is quite small, requiring that $\Lambda_2$ be approximately extensive. For example, the $\Lambda_2$ equation contains two disconnected terms, one of which vanishes because the NLMO basis is only noncanonical HF as opposed to being non-HF. For HF reference functions, where $T_1$ and $\Lambda_1$ are expected to be small, contributions from the other disconnected term,

$$\lambda^{ij}_{ab} \leftarrow \sum_{kc} w^{ik}_{ac}\, t^{c}_{k}\, \lambda^{j}_{b}, \tag{4-12}$$

are present, albeit small, thus leading to the approximate extensivity of $\Lambda_2$. Note that for non-HF reference functions it is possible that $\Lambda_2$ may be less extensive. A qualitative explanation of the extensivity of $\Lambda_2$ is provided by recognizing that the CC effective Hamiltonian is almost Hermitian and thus CC theory, in the present context, is almost variational, making the role of $\Lambda_2$ in restoring an expectation value only minor. This is in contrast to the major role of $C_2$, which must build in all two-electron correlations from a single determinant. The de-excitation operator, $\Lambda$, is thus transferable in analogy with the excitation operator, $T$. Higher excitation levels of $\Lambda$, as in $\Lambda_3 = \Sigma_3 + \Sigma_2\Sigma_1 + \frac{1}{6}\Sigma_1^{3}$, may be somewhat less transferable due to higher-order product contributions.
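The size-inextensivity of the CI doubles operator can be made concrete with the standard textbook model of $N$ noninteracting two-electron, two-level units (for example minimal-basis H2 molecules at infinite separation). The sketch below is not part of the calculations reported here; the parameters `delta` and `k` are hypothetical model values, and the per-unit exact energy is what an extensive method such as CCD reproduces for this model.

```python
import numpy as np


def cid_correlation_energy(n_units, delta=0.5, k=0.1):
    """Lowest eigenvalue of the CI-doubles matrix for n noninteracting units.

    Basis: the reference determinant plus one local double excitation per unit.
    The reference couples to each double through k; every double lies 2*delta
    above the reference; doubles on different units do not couple.
    """
    h = np.zeros((n_units + 1, n_units + 1))
    h[0, 1:] = k
    h[1:, 0] = k
    idx = np.arange(1, n_units + 1)
    h[idx, idx] = 2.0 * delta
    # eigvalsh returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(h)[0]


def exact_correlation_energy(n_units, delta=0.5, k=0.1):
    """Exact (and extensive) result: n times the single-unit energy."""
    return n_units * (delta - np.sqrt(delta**2 + k**2))


def cid_amplitude(n_units, delta=0.5, k=0.1):
    """Intermediate-normalized CI doubles amplitude c = k / (E_corr - 2*delta)."""
    e_corr = cid_correlation_energy(n_units, delta, k)
    return k / (e_corr - 2.0 * delta)


for n in (1, 2, 10, 100):
    per_unit = cid_correlation_energy(n) / n
    print(n, per_unit, exact_correlation_energy(n) / n, cid_amplitude(n))
```

For a single unit the two energies agree. As the number of units grows, the CI energy per unit and the local CI amplitude both shrink in magnitude even though the units never interact, which is the non-transferability of $C_2$ described above.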
The assumption that the operator $\Lambda^{i}$ in the NLMO basis only needs to be locally correlated is built in by summing over a small region, QM2, composed of orbitals that are spatially close to $i$, as in Eqn. 2-16, giving

$$\Lambda^{i}_{2} \approx \frac{1}{4}\sum_{jab \in QM2}\lambda^{ij}_{ab}\, i^{\dagger} j^{\dagger} b a, \qquad i \in QM1, \quad j, a, b \in QM2. \tag{4-13}$$

Note that the same QM regions are used for both the excitation and de-excitation operators. However, this is not necessary and it may be beneficial to relax this condition in certain unforeseen cases. The CCSD equations can thus be solved in linear or even sublinear time.

4.2.2 Effective Bond Dynamic Polarizabilities

Expanding the right-hand side of Eqn. 4-7 shows that for both singles and doubles contributions to the polarizability there will be no terms with more than a given number, three in this case, of occupied summing indices, $i$, $j$, and $k$. This is in contrast to the CC correlation energy, for which this number is two. Breaking down operators in terms of local contributions is useful in determining which functional groups have important excitations and de-excitations into one another. This is similar in spirit to our recent work on triples contributions to the energy [29], where it is shown that triples contributions are most important for delocalized moieties, suggesting an interesting active space method. This decomposition is seen by using the local $T^{i}$ and $\Lambda^{i}$ operators in Eqn. 4-7 along with the appropriate form for a one-particle operator, for example the dipole $q = \sum_p q_p = \sum_{pq} q_{pq}\{p^{\dagger} q\}$. As an example, consider a term from the singles equation,

$$\langle 0|\Lambda^{i}_{2}\, q_{j}\, T^{k}_{2}|{}^{a}_{i}\rangle \leftarrow \sum_{bc}\lambda^{ik}_{ac}\, q_{jb}\, t^{cb}_{kj}, \tag{4-14}$$

where the virtual orbital dependence has been summed out.
When contracted with the perturbation amplitudes,

$$\alpha^{ijk}_{qq'} \leftarrow \sum_{a}\langle 0|\Lambda^{i}_{2}\, q_{j}\, T^{k}_{2}|{}^{a}_{i}\rangle\left(X^{0\,a}_{q'\,i} + X^{1\,a}_{q'\,i}\right), \tag{4-15}$$

the following decomposition of the polarizability in terms of occupied triples is obtained,

$$\alpha_{qq'} \leftarrow \sum_{ijk}\alpha^{ijk}_{qq'}. \tag{4-16}$$

Similarly, consider the term from the doubles equation,

$$\langle 0|\Lambda^{i}_{2}\, q_{j}\, T^{k}_{1}|{}^{ab}_{ij}\rangle \leftarrow \sum_{c}\lambda^{ik}_{ab}\, q_{jc}\, t^{c}_{k}, \tag{4-17}$$

which, when contracted again with the perturbation amplitudes,

$$\alpha^{ijk}_{qq'} \leftarrow \sum_{ab}\langle 0|\Lambda^{i}_{2}\, q_{j}\, T^{k}_{1}|{}^{ab}_{ij}\rangle\left(X^{0\,ab}_{q'\,ij} + X^{1\,ab}_{q'\,ij}\right), \tag{4-18}$$

also results in a sum over occupied triples. This breakdown of operators into local orbital or functional group contributions allows the representation of the polarizability in terms of single, $\langle 0|\Lambda^{i}_{2}\, q_{i}\, T^{i}_{2}|{}^{a}_{i}\rangle$, double, $\langle 0|\Lambda^{i}_{2}\, q_{i}\, T^{j}_{2}|{}^{a}_{i}\rangle$, and triple, $\langle 0|\Lambda^{i}_{2}\, q_{j}\, T^{k}_{2}|{}^{a}_{i}\rangle$, contributions for the single excitation equations, ignoring permutations thereof. Taking the occupied singles piece and representing the polarizability in terms of QM1 regions alone is tempting; however, results indicate that it is necessary to account for relaxation effects of orbital excitations and de-excitations into other groups. These would not be obtained with such a diagonal structure of the doubles excitation operators. Relaxing this constraint recovers occupied doubles terms, but it is not until the dipole operator is allowed to be independent of $T_2$ or $\Lambda_2$, as it is for the occupied triples piece, that the conventional polarizability is recovered. Assuming that $\Lambda$ scales properly with system size, many of the $\alpha^{ijk}_{qq'}$ contributions are negligible, thereby allowing useful linear-scaling algorithms. Rather than transferring these $n^{3}_{occ}$-dependent contributions, effective bond contributions are defined which act to incorporate longer-ranged contributions in a given larger QM region, QM2, into an occupied singles contribution, $\langle 0|\Lambda^{i'}_{2}\, q_{i'}\, T^{i'}_{2}|{}^{a}_{i}\rangle$, in the QM1 region.
Note that for every term it is the de-excitation operator $\Lambda_2$ or $q$ which carries the $i$ dependence.

4.2.3 Effective Bond Response Matrices via Löwdin Partitioning

The perturbation amplitudes, $X^{l}_{q}$, are determined from the linear response equation given by Eqn. 4-6. This equation can be rewritten in simplified notation as

$$\langle g|A|g\rangle\,\langle g|X|0\rangle = \langle g|\bar q|0\rangle, \tag{4-19}$$

where $A = \bar H - E + (-1)^{l}\omega$ and where the sub- and superscripts on $X$ have been dropped. It is useful to consider the following Löwdin partitioning of the response into $P$ and $Q$ spaces, for which $P$ is specific to the occupied NLMO $i$, given by $|{}^{a}_{i}\rangle \in P$, and $Q$ is given by $|{}^{b}_{j}\rangle \in Q$. The singles block is chosen for simplicity. It is well known that the set of linear equations can be represented in the $P$ space by redefining the linear response and perturbation matrices as follows,

$$A'_{PP} X_{P} = q'_{P}, \tag{4-20}$$

where $A'_{PP} = A_{PP} - A_{PQ} A^{-1}_{QQ} A_{QP}$ and $q'_{P} = q_{P} - A_{PQ} A^{-1}_{QQ} q_{Q}$. The single orbital response matrix, $\langle{}^{a}_{i}|A|{}^{b}_{i}\rangle$, alone provides an oversimplification of the response. Although it contains the direct response of $i$ to the field, it lacks the response of $i$ to other orbitals which in turn are responding to the field. This piece is corrected by $A_{PQ} A^{-1}_{QQ} A_{QP}$, which again builds in longer-ranged interactions from the QM2 region. By using the Löwdin partitioning it is possible to define an effective response matrix, $\langle{}^{a}_{i}|A'|{}^{b}_{i}\rangle$, in analogy with the previously mentioned effective operators. The local perturbed amplitude solutions can be written as $X^{l}_{q,2} = \sum_i X^{l,i}_{q,2}$ and used in Eqn. 4-7 to determine effective bond polarizabilities,

$$\alpha^{i}_{qq'} \approx \Lambda^{i}\, q_{i}\, T^{i}\, X^{i}_{q'}. \tag{4-21}$$

4.3 Dispersion Interactions

The PES of a system of molecules is dominated at small internuclear distances by quantum mechanical effects such as Coulomb, exchange, and correlation, which cause large distortions in the electron densities of the molecules.
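As an aside on the Löwdin partitioning of Section 4.2.3 (Eqn. 4-20), the following sketch checks numerically that the effective $P$-space equations return the same $P$-space amplitudes as the full linear system. The matrix is a small, hypothetical, non-symmetric stand-in for the response matrix $A$, and the function name is illustrative rather than taken from any existing code.

```python
import numpy as np


def lowdin_partition(a, rhs, p_idx):
    """Fold the Q space into an effective P-space problem (Eqn. 4-20).

    a     : (n, n) response matrix, which need not be symmetric
    rhs   : (n,) perturbation vector
    p_idx : indices spanning the P space; all other indices form Q
    Returns (A'_PP, q'_P).
    """
    p = np.asarray(p_idx)
    q = np.setdiff1d(np.arange(a.shape[0]), p)
    a_pp, a_pq = a[np.ix_(p, p)], a[np.ix_(p, q)]
    a_qp, a_qq = a[np.ix_(q, p)], a[np.ix_(q, q)]
    # Use linear solves in place of an explicit inverse of A_QQ.
    a_eff = a_pp - a_pq @ np.linalg.solve(a_qq, a_qp)
    rhs_eff = rhs[p] - a_pq @ np.linalg.solve(a_qq, rhs[q])
    return a_eff, rhs_eff


# Hypothetical 6x6 model: dominant diagonal plus weak non-symmetric coupling.
rng = np.random.default_rng(0)
model_a = np.diag(np.arange(2.0, 8.0)) + 0.1 * rng.normal(size=(6, 6))
model_rhs = rng.normal(size=6)
p_space = [0, 1]

x_full = np.linalg.solve(model_a, model_rhs)
a_eff, rhs_eff = lowdin_partition(model_a, model_rhs, p_space)
x_p = np.linalg.solve(a_eff, rhs_eff)
```

The agreement between `x_p` and the leading entries of `x_full` is exact up to round-off; the approximation in the local method comes only from truncating the $Q$ space to the QM2 region.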
In the limit of large internuclear distances, that is, when the distance is large relative to the sizes of the constituents and the overlap is therefore negligible, the intermolecular potential is dominated by attractive quantum mechanical London dispersion, or instantaneous dipole, interactions [101]. The total long-range interaction energy between two chemical species, atoms and/or molecules $A$ and $B$, can be represented as

$$E^{AB} = E^{AB}_{es} + E^{AB}_{ind} + E^{AB}_{disp}, \tag{4-22}$$

with $E^{AB}_{es}$ the usual electrostatic energy, $E^{AB}_{ind}$ the induction energy, and $E^{AB}_{disp}$ the dispersion energy. Consider the case for which the two chemical species are charge neutral and have no permanent dipole moment, requiring that $E^{AB}_{es}$ and $E^{AB}_{ind}$ vanish. The $E^{AB}_{disp}$ can be represented in terms of dipole-dipole, dipole-quadrupole, etc. terms as in the following multipole expansion,

$$E^{AB}_{disp} = -\frac{C^{AB}_{6}}{R^{6}} - \frac{C^{AB}_{7}}{R^{7}} - \frac{C^{AB}_{8}}{R^{8}} + \cdots, \tag{4-23}$$

where the convention of assuming positive dispersion coefficients is used. The lowest-order term, the dipole-dipole term,

$$E^{AB}_{dip\text{-}dip} = \frac{\vec\mu_A\cdot\vec\mu_B - 3\,(\vec\mu_A\cdot\hat R)(\vec\mu_B\cdot\hat R)}{R^{3}}, \tag{4-24}$$

will be the focus herein [101]. Note that $R$ is the $AB$ distance and $\hat R$ the unit vector along it. Using perturbation theory gives the leading order term in the energy $E^{AB}_{dip\text{-}dip}$ as $-C^{AB}_{6}/R^{6}$ with

$$C^{AB}_{6} = 6\sum_{k}\sum_{k'}\frac{|\mu^{A}_{0k}|^{2}\,|\mu^{B}_{0k'}|^{2}}{\omega^{A}_{k} + \omega^{B}_{k'}}, \tag{4-25}$$

where $k, k'$ label the excited states, $\mu_{0k}$ is a Cartesian component of the transition dipole moment, and $\omega_k$ are the excitation energies.

4.3.1 Casimir-Polder Equation

Casimir and Polder [118] transformed this expression for the dispersion coefficient into an integral over imaginary frequencies. In this way Eqn. 4-25 can be rewritten in terms of fictitious polarizabilities evaluated at imaginary frequencies,

$$C^{AB}_{6} = \frac{3}{\pi}\int_{0}^{\infty} d\omega \left(\sum_{k}\frac{2\,|\mu^{A}_{0k}|^{2}\,\omega^{A}_{k}}{(\omega^{A}_{k})^{2} + \omega^{2}}\right)\left(\sum_{k'}\frac{2\,|\mu^{B}_{0k'}|^{2}\,\omega^{B}_{k'}}{(\omega^{B}_{k'})^{2} + \omega^{2}}\right), \tag{4-26}$$

or,

$$C^{AB}_{6} = \frac{3}{\pi}\int_{0}^{\infty} d\omega\; \alpha^{A}(i\omega)\,\alpha^{B}(i\omega). \tag{4-27}$$

Polarizabilities at imaginary frequencies can be understood to be the response of a molecule to an exponentially increasing electric field [101]. Because these polarizabilities have no poles and decrease monotonically in magnitude with increasing imaginary frequency, they can be found quite easily and suffer little from convergence difficulties. This is to be contrasted with determining the shape of the poles around an excitation in real frequency calculations. The effective Hamiltonian approach for polarizabilities at imaginary frequencies has been developed in detail elsewhere [109].

4.3.2 Corrective Dispersion Potential

One question to address is to what extent the dispersion coefficient, $C^{AB}_{6}$, can be decomposed into orbital dispersion coefficient contributions, $C^{AB}_{6,ij}$, corresponding to all possible intermolecular bond pairs,

$$C^{AB}_{6} = \sum_{i}^{n^{A}_{occ}}\sum_{j}^{n^{B}_{occ}} C^{AB}_{6,ij}. \tag{4-28}$$

As such, correlated dispersion coefficients can be determined for long-ranged interactions in large molecules by taking advantage of the transferability of the QM regions in a NLSCC calculation. The polarizability at imaginary frequencies is well behaved, as previously mentioned, thereby requiring only a small number of gridpoints.

4.3.3 Anisotropic Contributions

The Casimir-Polder equation, as written above, uses isotropic polarizabilities and therefore only provides isotropic dispersion coefficients. To assess how the dispersion interaction between otherwise very distant functional groups changes as a function of orientation, the anisotropic polarizabilities need to be incorporated into the Casimir-Polder equation [6 12].
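The Casimir-Polder integral of Eqn. 4-27 can be sketched numerically with a small Gauss-Legendre grid. The example below is an illustration under stated assumptions: each species is modeled by a hypothetical single-pole polarizability, for which the integral has the closed London form, and the half-line is mapped onto $(-1, 1)$ with a scale parameter `omega0`. The default of 0.4 is borrowed from the $\omega_0$ value quoted in Section 4.4; whether the actual implementation uses this particular mapping or quadrature rule is an assumption.

```python
import numpy as np


def alpha_imag(omega, alpha0, omega_eff):
    """Single-pole model polarizability at imaginary frequency i*omega."""
    return alpha0 / (1.0 + (omega / omega_eff) ** 2)


def c6_casimir_polder(alpha_a, alpha_b, n_points=16, omega0=0.4):
    """C6 = (3/pi) * integral_0^inf alpha_A(i w) alpha_B(i w) dw  (Eqn. 4-27).

    The substitution w = omega0 * (1 + t) / (1 - t) maps (0, inf) onto (-1, 1),
    where a Gauss-Legendre rule is applied.
    """
    t, weights = np.polynomial.legendre.leggauss(n_points)
    omega = omega0 * (1.0 + t) / (1.0 - t)
    jacobian = 2.0 * omega0 / (1.0 - t) ** 2
    integrand = alpha_a(omega) * alpha_b(omega)
    return 3.0 / np.pi * np.sum(weights * jacobian * integrand)


def c6_london(alpha0_a, w_a, alpha0_b, w_b):
    """Closed-form C6 for two single-pole model polarizabilities."""
    return 1.5 * alpha0_a * alpha0_b * w_a * w_b / (w_a + w_b)


# Hypothetical model parameters in atomic units (illustration only).
species_a = lambda w: alpha_imag(w, 10.0, 0.5)
species_b = lambda w: alpha_imag(w, 17.0, 0.7)

c6_reference = c6_london(10.0, 0.5, 17.0, 0.7)
c6_eight = c6_casimir_polder(species_a, species_b, n_points=8)
c6_sixteen = c6_casimir_polder(species_a, species_b, n_points=16)
```

Because the integrand is smooth and pole-free on the imaginary axis, a handful of gridpoints already reproduces the closed-form value closely for this model, in line with the observation that only a small grid is required.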
Furthermore, the same techniques used here can be applied to higher $C_n$ coefficients in the event that such terms are necessary.

4.4 Implementation

All calculations are performed using a modified version of the ACES II [16] quantum chemistry software package. These modifications include an interface to Flocke's highly efficient and general NLMO package [15]. Results are obtained on 375 MHz Power3 processors, using a maximum of 4 GB of the available 8 GB of memory shared over four processors, as well as on an SGI Altix machine with eight dual-core ia64 processors and 256 GB of shared memory using multithreaded libraries. Such calculations make use of some recent advances in the ACES II [16] code to take advantage of large shared memory machines. These advances include updating some old memory limitations to allow for large effective Hamiltonians and in-core contractions in the context of many-body methods. In-core contractions are quite useful in the post-CC part of the calculation, in evaluating the CC density matrix and in its subsequent use in calculating the polarizability and dispersion coefficients. All correlated polarizability and dispersion coefficient calculations are performed at the CCSD level using the effective Hamiltonian approach with a noncanonical RHF NLMO reference function. Dispersion coefficients are determined using an $\omega_0$ value of 0.4 [109]. The atomic units of polarizability are those of volume, Bohr cubed, $a_0^{3}$. The results are expected to be accurate for molecules whose excited states are dominated by single excitations, for which EOM-CCSD [75] is known to be quite good. Geometries for polyglycine are taken from reference [36]. Tryptophan and diglycine molecules are created using Molden [30]. Geometries of test systems are available upon request.
Polyglycine is meant to represent the simplest peptide that still contains a variety of functional groups; for example, it has a 3-center peptide bond. Tryptophan has a 9-center indole bond. A cc-pVDZ basis is used unless specified otherwise; however, for more accurate calculations, larger basis sets with high angular momentum and diffuse functions should be used. Localized core occupied orbitals have been dropped unless specified otherwise, along with localized core virtual orbitals. For example, Rydberg functions of angular momentum $s$ have been removed. Other Rydberg functions have also been dropped depending on the type of calculation.

4.5 Applications

4.5.1 Motivation

As mentioned above, a decomposition of the CCSD wavefunction into functional group contributions is possible by virtue of its size-extensivity, unlike CISD. Diagonal $T_2$ from the ground state CCSD wavefunction, $\Lambda_2$ from the ground state CCSD density matrix, and $C_2$ from the ground state CISD wavefunction amplitudes for a C-H bond belonging to a methyl group as a function of alkane size were previously shown in Fig. 2-3. The figure shows that the CISD amplitudes, unlike the CCSD amplitudes, are not transferable. Additionally, it is seen that the left-hand eigenstate of $\bar H$, $\Lambda_2$, is transferable despite the fact that it is parameterized by a linear, size-inextensive, as in the context of EOM-CCSD [75], CI-like operator. This operator provides important relaxation terms to the CC density matrix and thus to correlated properties. The CISD amplitudes decrease in magnitude with increasing system size, which is consistent with the fact that in the infinite limit the CISD correlation energy per particle becomes zero.

Fig. 4-1 shows the transferability of the response matrix from Eqn. 4-6 for a C-H bond/antibond pair among a series of substituted alkanes. The response matrix plotted excludes the constant $-E + (-1)^{l}\omega$ scalar.
The plot demonstrates that the localized methyl group is screened from the perturbation. The left plot contains the single excitation component while the double excitation component is on the right. When there is no screening present the response matrix components are quite different; however, these differences become negligible as the number of screening regions increases. The rate at which this happens obviously depends on the functional group; for example, the methyl substituent is found to change only slightly with screening while the fluorine, not surprisingly, perturbs the density significantly. Note that not only does each functional group independently approach a constant but that the different functional groups approach the same constant. The general conclusions are the same for both the singles and doubles components. Given that the effective Hamiltonian from CC theory contains information about the excited states and can be represented for a large target system in terms of many smaller Hamiltonians, it is possible to obtain the excitation spectrum of a large molecule via locally transferable QM regions.

A plot showing the transferability of the methyl group contribution to the average static polarizability among a series of substituted alkanes is shown in Fig. 4-2. These methyl group contributions are highly transferable, in contrast to the effective Hamiltonian elements, which are found to require more regions of screening to reach the same level of transferability. This is because the polarizability takes advantage of some internal cancellations among the excitation, de-excitation, etc. operators. When there are no screening regions the methyl group polarizability is quite dependent on the substituted group; however, with as little as one screening region it becomes approximately independent of the substituent.
The methyl group contribution to the average polarizability of ethanol and ethylamine versus frequency is shown in Fig. 4-3. For smaller frequencies this contribution is independent of substituent, while as the frequency increases toward an excitation energy the transferability of the methyl group polarizability breaks down. The example here is for the first excitation energy of ethylamine, which is given by excitation energy (EE) EOM-CCSD at the dotted vertical line. The breakdown in transferability is a manifestation of the delocalized nature of the excited state wavefunctions, which occur at different frequencies. Each group polarizability comprising the molecular polarizability will experience a pole in the response at the same excitation energy because each contribution contains a piece of the corresponding denominator. Despite these technical complications among the polarizability contributions, in that they all experience singularities at the excitation energy of the system, Figs. 4-4 and 4-5 show that the rate at which this infinity is approached is quite different among the functional groups, thereby providing useful information about the excited states. Figs. 4-4 and 4-5 show a complete decomposition of the correlated polarizability in terms of functional group contributions as a function of frequency for three excited states of ethanol and ethylamine, respectively. These excited states are verified with EE-EOM-CCSD and are given by the corresponding dotted vertical lines. For ethanol the first two excited states are localized on the hydroxyl group, which has a strong response, with residual tail contributions from the methylene group. One of the C-C bonds, the one neighboring the hydroxyl group, which are included in the methylene groups for convenience, not surprisingly dominates this tail methylene contribution.
These polarizability contributions are opposite in sign, reflecting their different hyperpolarizabilities, and their sum gives the total polarizability, given that the methyl group contributions are negligible. If the C-C bond were moved from the methylene group to the hydroxyl group, it would be seen that the new hydroxyl group dominates the polarizability while the contributions from the methylene and methyl groups are both negligible. This means that the excitation is dominant on that group and, to such an extent, is localized. The third excited state is more delocalized than the first two because, other than the slight dominance of the methylene group, all the functional groups make comparable contributions. The third state is most likely a delocalized Rydberg state. From Fig. 4-5, in analogy with ethanol, the first and second excited states of ethylamine are strongly dominated by the unique functional group, in this case the amino group, with residual contributions from the methyl and methylene groups. The dominance of the methylene group for the third excited state of ethylamine is greater than it is for ethanol.

4.5.2 Tryptophan Dynamic Polarizabilities

Table 4-1 shows the decomposition of the average static polarizability into functional group and NLMO contributions for tryptophan. NLMOs are ordered per column as $lp$, $\sigma$, and $\pi$ orbitals as well as a 9-center indole bond given by $\pi^{1}_{9}$ through $\pi^{5}_{9}$. For each type of NLMO within each column the polarizability contributions are ordered by increasing magnitude. The molecule tryptophan is shown on the left in Fig. 4-6 along with the atomic indices. The molecule is decomposed into functional groups corresponding to carboxyl, amino, indole, and the remaining aliphatic part. Comparing the group polarizabilities shows that the indole group has the largest polarizability, followed in order by the aliphatic, carboxyl, and amino groups.
It is well known that the polarizability of a molecule is roughly proportional to its volume; therefore the following averages are considered: an orbital average, in which the group polarizability is divided by the total number of orbitals in the group, and an atom average, in which it is divided by the total number of atoms in the group. For the averages it is found that the indole group is largest, followed by the aliphatic region. The ordering of the carboxyl versus amino groups depends on the averaging method used. The largest NLMO $lp$ polarizability contribution comes from the carbonyl oxygen in the carboxyl group, while the largest $\sigma$ bond contribution comes from the central C-C bond of the aliphatic region. Somewhat surprisingly, the $\sigma$ and $\pi$ contributions to the carbonyl group of the carboxyl region are the smallest NLMO contributions in the molecule and are even comparable to one another.

Fig. 4-7 shows the decomposition of the polarizability into the functional groups used in Table 4-1 as a function of frequency. Excitation energies for the first two excited states, as given by EE-EOM-CCSD, are shown by the dotted vertical lines. It is easy to see that the first excited state is localized to the indole group due to its very large response in comparison to the other functional groups. The second excited state is localized to the carboxyl and aliphatic groups, with constant contributions from the amino and indole groups. In this case, because the carboxyl polarizability contribution is negative, it indicates that the local dipole is decreasing in magnitude with increasing frequency. This is in contrast with the dipole of the aliphatic group, which is increasing in magnitude, suggesting how the density rearranges in response to the electric field.

4.5.3 Poly-Glycine Dynamic Polarizabilities

Fig. 2-7 shows the QM regions defining the NLSCCSD effective Hamiltonian calculations on polyglycine.
The functional groups of interest are represented by the QM1 regions; the QM label specifies that this region is treated quantum mechanically as opposed to by more classical interactions, like an ESP. Such a potential could be incorporated into regions outside of QM2, but is not in the current formulation. There is one region for the methylene group, one for the peptide group, one for the N-terminus, and one for the C-terminus. This region is embedded in the QM2 region, which in turn could be embedded in an even larger QM region [29]. Table 4-2 has total and unit cell NLSCCSD dynamic polarizabilities for translationally periodic polyglycine, where the unit cell is chosen as one glycine residue. The four frequencies are chosen as zero, one, two, and three times that of the sodium D-line, in analogy with Stanton's work [109]. Approximate dynamic polarizabilities calculated using NLSCC are compared to the conventional effective Hamiltonian approach for tetraglycine, where it is seen that for zero frequency over 99% of the correlated polarizability is recovered. As the frequency approaches an excitation energy of tetraglycine the QM regions become less transferable, giving slightly less agreement with conventional results. In the infinite limit, the polarizability of a unit cell of glycine is 17.1 $a_0^{3}$ at zero frequency and 21.6 $a_0^{3}$ at three times the sodium D-line frequency ($3 \times 0.0773\ E_h$).

4.5.4 Alkane Dispersion Coefficients

Eqn. 4-28 shows that the total dispersion coefficient can be represented in terms of interbond contributions. Fig. 4-8 shows the transferability of three pair contributions to the dispersion coefficients for the interaction of two alkanes. Eight gridpoints are used for the numerical integration.
Dimers of propane through heptane are used, and values for the methyl-methyl, methyl-methylene, and methylene-methylene dispersion coefficients are determined. It is seen that the methyl-methyl contribution to dispersion is approximately constant in this isotropic approximation from one alkane dimer to another, while the methyl-methylene and methylene-methylene contributions require a few units of screening before becoming transferable. The definition of the methylene group used herein includes the adjacent C-C bonds, and so it is not surprising that those contributions are larger than the methyl group contributions, which only contain C-H bonds. In Table 4-3 dispersion coefficients per methylene unit cell are shown for the interaction of two alkanes given on the right side of the table. The column on the left gives the QM2 region used in constructing the target results shown on the right. Note that the diagonal elements are the exact result with the basis set used. The values per unit cell shown are determined by taking the total dispersion coefficient from Eqn. 4-28 and dividing by the total number of carbons. The rate of convergence of the methylene unit cell dispersion coefficient is clearest for heptane, which, when represented in terms of a propane QM2 region, is in error by 7.3 $a_0^{6} E_h$. This error is reduced to 0.4 $a_0^{6} E_h$ when QM2 is hexane. Also shown is the methylene unit cell dispersion coefficient determined in the infinite limit. The converged values for the methylene unit cell are comparable to those from Fig. 4-8.

4.5.5 Di-Glycine Dispersion Coefficients

The convergence with respect to the number of gridpoints for the total and diagonal contributions to the dispersion coefficient for the interaction of two diglycines is shown in Fig. 4-9. The diglycine molecule is shown on the right in Fig.
4-6. The coefficients are normalized to the values corresponding to the largest number of gridpoints, which are shown in the legend. Diglycine is decomposed into carboxyl, peptide, amino, and two types of methylene groups. It can be seen that all contributions rapidly converge with respect to the number of gridpoints. It is seen that the peptide-peptide contribution is largest, while the amino group has the smallest contribution, and that the two methylene groups, not surprisingly, are approximately equivalent. A decomposition of the dispersion coefficient into diagonal NLMO contributions per functional group for the two interacting diglycines is shown in Table 4-4. NLMOs are ordered per column as $lp$, $\sigma$, and $\pi$ orbitals as well as two 3-center carboxyl and peptide bonds given by $\pi^{1}_{3}$ and $\pi^{2}_{3}$. For each type of NLMO within each column the orbital dispersion coefficient contributions are ordered by increasing magnitude. A sixteen point grid is used. The largest $lp$ contribution is from the nitrogen of the amino group, while the largest $\sigma$ bond contributions are from the C-C and C-H bonds. The 3-center peptide bond has by far the largest contribution. The more appropriate way to describe long-range dispersion effects between structural regions is to evaluate anisotropic coefficients, which will be the subject of future work. The current isotropic coefficients illustrate the computational approach to be used.

[Figure 4-1 plots, two panels: vertical axis, effective Hamiltonian matrix element / a.u.; horizontal axis, $n$ in H3C(CH2)$_{n-1}$R; series R = CH3, OH, NH2, F.]

Figure 4-1: Transferability of the effective Hamiltonian. Figure showing the transferability of the response matrix from Eqn. 4-6 for a C-H bond among a series of substituted alkanes.
The C-H bond belongs to a methyl group which is on the opposite side of the substituent, and the x-axis represents increasing distance from the methyl group in terms of the number of methylene groups. The left plot contains the single excitation component, $\langle\sigma_{CH}\sigma^{*}_{CH}|\bar H_N|\sigma_{CH}\sigma^{*}_{CH}\rangle$, while the double excitation component is on the right. $H$ in the figure means the CC effective Hamiltonian, $\bar H_N$, which is normal ordered with respect to the reference function $|0\rangle$.

[Figure 4-2 plot: vertical axis, $\alpha$ / a.u.; horizontal axis, $n$ in H3C(CH2)$_{n-1}$R; series R = CH3, OH, NH2, F.]

Figure 4-2: Transferability of CCSD bond polarizabilities. The figure is for the methyl group contribution to the average static polarizability among a series of substituted alkanes. The methyl group is on the opposite side of the substituent, and the x-axis represents increasing distance from the methyl group in terms of the number of methylene groups.

[Figure 4-3 plot, two panels: vertical axis, $\alpha$ / a.u.; horizontal axis, $\omega$ / a.u.; series H3CCH2OH and H3CCH2NH2.]

Figure 4-3: Frequency dependent bond polarizabilities and associated poles. The methyl group contribution to the average polarizability of ethanol and ethylamine is shown. The plot shows the gradual breakdown of the transferability of the methyl group polarizability as the frequency approaches the excitation energy. The example here is for the first excitation energy of ethylamine, which is given by EE-EOM-CCSD at the dotted vertical line. Note the change in the domain on the left- and right-hand sides of the plot.

[Figure 4-4 plot, three panels: vertical axis, $\alpha$ / a.u.; horizontal axis, $\omega$ / a.u.; series Methyl, Methylene, Hydroxyl, Total.]

Figure 4-4: Frequency dependent bond polarizabilities for ethanol. The average polarizability is used.
The frequencies are chosen symmetrically about three excited states, which are determined using EE-EOM-CCSD and shown as the dotted vertical line in each case.

[Figure 4-5 plot, three panels. x-axis: $\omega$ / a.u.; y-axis: $\alpha$ / a.u.; series: Methyl, Methylene, Amino, Total.]

Figure 4-5: Frequency dependent bond polarizabilities for ethylamine. The average polarizability is used. The frequencies are chosen symmetrically about three excited states, which are determined using EE-EOM-CCSD and shown as the dotted vertical line in each case.

Figure 4-6: Ball and stick models of tryptophan and diglycine. The molecules are shown on the left and right, respectively, along with atomic indices.

Table 4-1: Static bond polarizabilities for tryptophan. The average static polarizability is used. NLMOs are ordered per column as lp, $\sigma$, and $\pi$ orbitals as well as a 9-center indole $\pi$-bond given by $\pi_1^9$ through $\pi_5^9$. For each type of NLMO within each column, the polarizability contributions are ordered by increasing magnitude. Also shown are two types of polarizability densities.

carboxyl              amino                 aliphatic             indole
O13          0.6425   N14          0.7464   C11-N14      1.4988   C8-C9        0.9998
O15          0.6762   N14-H17      1.2272   C10-H19      1.7382   C1-C6        1.1163
O13          1.2822   N14-H16      1.2913   C11-H18      1.7850   C4-C5        1.1168
O15          2.5920                         C10-H20      1.8908   C3-N7        1.1142
C12-O15      0.4309                         C11-C12      2.0785   C5-C6        1.1446
C12-O13      0.7029                         C9-C10       2.2084   C1-C2        1.1787
O13-H27      1.3575                         C10-C11      2.7265   C3-C4        1.1847
C12-O15      0.4887                                               N7-C8        1.1696
                                                                  C2-C3        1.2690
                                                                  N7-H21       1.4704
                                                                  C2-C9        1.6608
                                                                  C1-H26       1.6810
                                                                  C4-H23       1.7444
                                                                  C8-H22       1.7679
                                                                  C6-H25       1.7876
                                                                  C5-H24       1.8183
                                                                  $\pi_1^9$    0.7087
                                                                  $\pi_2^9$    1.2645
                                                                  $\pi_3^9$    6.8956
                                                                  $\pi_5^9$   10.9716
                                                                  $\pi_4^9$   13.7593
Total        8.1732                3.2650               13.9266               55.8248
Orb Avg      1.0217                1.0883                1.9895                2.6583
Atm Avg      2.0433                1.0883                2.7853                3.7217

[Figure 4-7 plot, two panels. x-axis: $\omega$ / a.u.; y-axis: $\alpha$ / a.u.; series: Carboxyl, Amino, Aliphatic, Indole, Total.]

Figure 4-7: Frequency dependent bond polarizabilities for tryptophan. The average polarizability is used. Excitation energies for the first two excited states, as given by EE-EOM-CCSD, are shown by the dotted vertical lines.

Table 4-2: Frequency dependent NLSCCSD polarizabilities for polyglycine. The table shows the total and unit cell values for translationally periodic polyglycine. The unit cell is a glycine residue. Frequencies are chosen as multiples of the sodium D-line frequency. Columns give the total $\langle\alpha\rangle$ and the per-cell value $\langle\alpha\rangle / n_{cell}$ at each frequency $\omega$.

            $\omega$ = 0.0000           $\omega$ = 0.0773           $\omega$ = 0.1547           $\omega$ = 0.2321
$n_{cell}$  total            per cell   total            per cell   total            per cell   total            per cell
3           52.3967          17.4656    53.2957          17.7652    56.3886          18.7962    64.0370          21.3457
4           69.4862 (99.4%)  17.3715    70.7215 (99.4%)  17.6804    74.9951 (99.2%)  18.7488    85.6499 (98.7%)  21.4125
5           86.5756          17.3151    88.1474          17.6295    93.6015          18.7203    107.2627         21.4525
8           137.8439         17.2305    140.4250         17.5531    149.4209         18.6776    172.1013         21.5127
10          172.0228         17.2023    175.2767         17.5277    186.6338         18.6634    215.3271         21.5327
15          257.4701         17.1647    262.4060         17.4937    279.6661         18.6444    323.3914         21.5594
20          342.9173         17.1459    349.5353         17.4768    372.6984         18.6349    431.4557         21.5728
25          428.3645         17.1346    436.6647         17.4666    465.7307         18.6292    539.5201         21.5808
50          855.6007         17.1120    872.3112         17.4462    930.8922         18.6178    1079.8417        21.5968
75          1282.8368        17.1045    1307.9578        17.4394    1396.0538        18.6141    1620.1634        21.6022
100         1710.0730        17.1007    1743.6044        17.4360    1861.2153        18.6122    2160.4851        21.6049
250         4273.4899        17.0940    4357.4839        17.4299    4652.1844        18.6087    5402.4151        21.6097
500         8545.8514        17.0917    8713.9497        17.4279    9303.7996        18.6076    10805.6318       21.6113
$\infty$                     17.0894                     17.4259                     18.6065                     21.6129

[Figure 4-8 plot. x-axis: n in H$_3$C(CH$_2$)$_n$CH$_3$; y-axis: $C_6^{ij}$ / a.u.; series: methyl-methyl, methyl-methylene, methylene-methylene.]

Figure 4-8: Transferability of dispersion coefficients for alkanes. The figure shows the three pair contributions for the interaction of two alkanes.
Eight gridpoints are used for the numerical integration.

Table 4-3: Dispersion coefficients per methylene unit cell. Values shown are for the interaction of two alkanes shown on the right side of the table. The column on the left gives the QM2 region used in constructing the target results shown on the right. Note that the diagonal elements are the exact result within the basis set used. Eight gridpoints are used for the numerical integration.

QM2       C3H8      C4H10     C5H12     C6H14     C7H16     C$_\infty$H$_\infty$
C3H8      20.3592   23.1539   24.9177   26.1297   27.0132   32.6242
C4H10               20.0543   21.0499   21.7272   22.2176   25.2763
C5H12                         19.8873   20.4129   20.7926   23.1456
C6H14                                   19.7885   20.1155   22.1356
C7H16                                             19.7237   21.5582

[Figure 4-9 plot. x-axis: n, number of gridpoints; y-axis: normalized $C_6$ / a.u.; legend (group / value at the largest grid): carboxyl / 28.56, methylene / 29.40, peptide / 36.47, methylene / 28.42, amino / 8.21, total / 624.49.]

Figure 4-9: Convergence of diglycine dispersion coefficients with gridpoints. The total and diagonal contributions for the interaction of two diglycines are shown. The coefficients are normalized to the values corresponding to the largest number of gridpoints, which are shown in the legend.

Table 4-4: Bond dispersion coefficients for diglycine. Diagonal contributions are shown for two interacting diglycines. NLMOs are ordered per column as lp, $\sigma$, and $\pi$ orbitals as well as two 3-center carboxyl and peptide $\pi$-bonds given by $\pi_1^3$ and $\pi_2^3$. For each type of NLMO within each column, the orbital dispersion coefficient contributions are ordered by increasing magnitude. A 16-point grid is used.
carboxyl              methylene             peptide               methylene             amino
O1           0.2399   C5-N8        1.6104   O3           0.2743   C7-N9        1.3419   N9           0.6779
O2           0.2570   C4-C5        1.8782   O3           0.3130   C7-H14       1.9014   N9-H16       1.0548
O2           0.3490   C5-H11       1.9359   O3-C6        0.1149   C7-H15       1.9014   N9-H17       1.0555
O2-C4        0.0932   C5-H12       1.9359   C6-N8        0.7885   C6-C7        2.0032
O1-C4        0.4252                         N8-H13       1.0697
O1-H10       0.7193
$\pi_1^3$    0.7048                         $\pi_1^3$    0.3183
$\pi_2^3$    3.8378                         $\pi_2^3$    2.0970

BIOGRAPHICAL SKETCH

Thomas Frank Hughes was born in 1980 on Long Island, New York to Thomas James and Rose Ann Hughes. He has two older sisters, Erica Christine and Kelly Marie Hughes. He grew up in Spring Hill, Florida and graduated from Central High School in 1998. He received a Bachelor of Science degree in chemistry, along with minors in math and physics, from the University of North Florida in Jacksonville, Florida in 2002. His undergraduate thesis was on kinetic isotope effects. During that time he also did research at Mayo Clinic of Jacksonville, Florida on docking, ligand design, and molecular dynamics studies of cyclic peptides and acetylcholinesterase.

As a result of his high school and undergraduate experiences, he decided to go to graduate school for his doctorate in physical chemistry at the Quantum Theory Project of the University of Florida under the direction of Graduate Research Professor Rodney J. Bartlett. Professor Bartlett's great understanding of theoretical chemistry has led him to be a very influential figure in developing and applying quantum chemical methods, for example in his ACES suite of high-level electronic structure programs. Part of the focus of the author's graduate research has been to develop and implement quantum chemical methods for larger molecules.