Adapted ab Initio Theory

Material Information

Adapted ab Initio Theory A Simplified Kohn-Sham Density Functional Theory
McClellan, Joshua J.
Place of Publication:
[Gainesville, Fla.]
University of Florida
Publication Date:
Physical Description:
1 online resource (127 p.)

Thesis/Dissertation Information

Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Committee Chair:
Bartlett, Rodney J.
Committee Members:
Talham, Daniel R.
Ohrn, Nils Y.
Deumens, Erik
Trickey, Samuel B.
Graduation Date:


Subjects / Keywords:
Approximation ( jstor )
Atoms ( jstor )
Coordinate systems ( jstor )
Electronics ( jstor )
Electrons ( jstor )
Energy ( jstor )
Geometry ( jstor )
Mechanical forces ( jstor )
Molecules ( jstor )
Orbitals ( jstor )
Chemistry -- Dissertations, Academic -- UF
hamiltonian, semiempirical, transfer
Electronic Thesis or Dissertation
bibliography ( marcgt )
theses ( marcgt )
Chemistry thesis, Ph.D.


We present work toward a one-electron Hamiltonian whose solution provides electronic energies, forces, and properties for more than 1000 atoms fast enough to drive large scale molecular dynamics. Ideally, such a method would be as predictive as accurate ab initio quantum chemistry for such systems. To design the Hamiltonian requires that we investigate rigorous one-particle theories including density functional theory and the analogous wavefunction theory construction. These two complementary approaches help identify the essential features required by an exact one-particle theory of electronic structure. The intent is to incorporate these into a simple approximation that can provide the accuracy required but at a speed orders of magnitude faster than today's DFT. We call the framework developed adapted ab initio theory. It retains many of the computationally attractive features of the widely-used neglect of diatomic differential overlap and self-consistent-charge density-functional tight-binding semiempirical methods, but is instead, a simplified method as it allows for precise connections to high-level ab initio methods. Working within this novel formal structure we explore computational aspects that exploit modern computer architectures while maintaining a desired level of accuracy. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis (Ph.D.)--University of Florida, 2008.
Adviser: Bartlett, Rodney J.
Statement of Responsibility:
by Joshua J. McClellan

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright by Joshua J. McClellan. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Embargo Date:
LD1780 2008 ( lcc )


This item has the following downloads:


































































































































Full Text

3.2.2 Mulliken Approximation Connection to SCC-DFTB

To facilitate understanding of the density-dependent term in SCC-DFTB in relation

to concepts in wavefunction theory we must recognize that the term AqA is the charge

fluctuation on atom A from some reference charge for that atom-type (qA), with the

charge on atom A being defined by the Mulliken population analysis. In the expanded

representation in terms of the AO density matrix (P) the additional term to the density

dependent part of the Fock-like matrix is given by Equation 3-13, which has been shown

to be equivalent to Equation 3-15, though the latter has a form that some may recognize

as the Mulliken expansion for two-electron integrals,

(pV Au7) -SSoa(7YAC + 7BC + AD + 7BD) (3-18)

The origin of the Mulliken expansion and the further extension of the basic approximation

to the Riidenberg approximation will be explored later. For now it is sufficient that

we discuss the numerical properties of this approximation and what advantages and

limitations it entails.

It is perhaps easiest to consider the nature of the Mulliken approximation by way of

a simple example. For an NDDO-type two-center two-electron integral with normalized

orbitals it follows directly from Equation 3-18

7AB (PA IABAB) (319)

There have been many studies on the various forms for 7, most notably by Klopman [39],

Ohno [40], and Nishimoto-Mataga [41]. The particular form within SCC-DFTB model

is the Coulomb repulsion between two Is Slater type orbitals with an added condition

that as the separation between the two centers approaches zero YAB -- UA where UA is

the Hubbard parameter, which is related to chemical hardness, which in turn is related

to the difference between the first IP and EA of that atom. The particular form of the

7AB is only as significant as the underlying approximation in which it is used. In this

atoms for each method. Since the AATMO is designed to reproduce the exact B3LYP

result in the separated atom limit, the atomic energies are identical. So, the error between

AATMO and B3LYP is equivalent to the total energy differences. Since we also want to

compare B3LYP to AM1 we must also have atomization energy for B3LYP. Similarly, for

AMI we perform the neutral atom calculations and subtract the sum of atomic energies

from the total energy. The AATMO error from B3LYP is 0.057 Hartree. For AM1 the

corresponding error is 0.331 Hartree. This indicates that the AATMO total energies are in

better agreement with the underlying B3LYP total energies.

Since one goal of our approach is to reproduce a complex PES of large systems we

have applied the AATMO approximation to describe a somewhat unphysical reaction

mechanism for C20 in order to subject the approximation to an extreme case of bond

breaking. The pseudo-reaction-path corresponds to splitting the C20 molecule in half,

simultaneously breaking ten C-C bonds. Shown in Figure 4-10 are the total energies of

B3LYP, AATMO, AM1, and SCC-DFTB. The reaction coordinate is the distance between

the end-caps and the equilibrium value is 3.223 A, the range is from 2.8 A through 6

A. As expected we see that AM1 exhibits unphysical behavior, and some convergence

problems beyond 4.5 A. The SCC-DFTB actually performs quite well for C20, this result

may not be too surprising since the repulsive energy term in SCC-DFTB is parameterized

from B3LYP total energies for C-C bond breaking and the energy scale is very similar to

the sum of C-C bond energies. The AATMO overbinds this system significantly and may

in fact be converging to a different electronic state, as the eigenvalue spectrum deviates

significantly from the B3LYP result. Further work is needed to determine the precise

origin of this discrepancy and to correct it.

The final comparison we make is to the RMSD of the gradients for the G2 set. We

will consider two sets of data, one for the MP2 equilibrium structures and the other

for the B3LYP minimal basis set equilibrium structures. Shown in Figure 4-11 are the

average RMSD of the Cartesian gradients from the MP2 reference structures calculated

pertinent then the first-order energy expression given by Equation 3-29 would be suitable.

By separating terms that are dependent on the density (G) we can write

1 1
EKS -Tr[P(2HKS + GKS)] -Tr[PG S] + Exc[p] (3-30)
2 2


(GKS) c(r) v) (3-31)

and F = H+G. Equation 3-30 is a central theme of our work, as it gives the unambiguous

definition of the exact electronic energy. Tied to it are the one-particle KS equations that

define the orbitals, the density and associated first-order properties. In this way, the

problem for the total energy and gradients is completely specified by

* F that generates the ground state one-particle density matrix P.

* F that can be decomposed into density independent and (H) and dependent (G)

* The exchange-correlation energy functional.

3.3.1 General Energy Expression

We now choose a general electronic energy expression. Including the nuclear-nuclear

repulsion we have the total energy expression,

1 N ZAZ_
E = Tr [P(2H + G)] + Eridal[P] + ZAZ (3-32)
2 A,B>A AB

We will bring HF, NDDO, DFTB, SCC-DFTB and KS-DFT into this framework, to make

it clear what the three distinct components are:

* Density independent H.

* Density dependent G.

* Residual electronic energy Eresidual[P].

If a method is to give the correct electronic density, and we can know the exact

KS-DFT HKS and GKS, then the model components for H and G Jl. l. must





TH(CCSD) UHF ................
B3LYP/6-31 G* UHF ...........

1 1.5 2 2.5 3 3.5
C-N reaction coordinate (Angstrom)

Figure 2-1. The NMT force curve for C-N rupture


0 2 Standard Deviation




Figure 4-12. Force RMSD from B3LYP

1 0 0 ............................. -..............................
100 2-center
90 2-center
7 0 3-center ............
S 760 i4-center ................................

q 20
10 ,,

1 0 1 1 1 1 1 1 1 1
0 100 200 300 400 500 600 700 800 9001000
Number of atoms (5 orbitals per atom)

Figure 4-7. Percentage of multi-center terms versus system size

case the Mulliken approximation ought to be accurate. The Mulliken approximation in

SCC-DFTB essentially converts three- and four-center two-electron integral contributions

to the Coulomb repulsion energy term into overlap-weighted two-center repulsive terms.

If we apply the Mulliken approximation in standard HF and use the ab initio (pp|AA)

two-center integrals, the behavior is quite unphysical. Shown in Figure 3-4 are errors

from the full HF when the Mulliken and Riidenberg approximations are applied to the

two classes of three-center terms, (AAIBC) and (ABIAC), and the four-center terms in

NMT over the carbon-nitrogen reaction coordinate. The error introduced by the Mulliken

approximation is drastic, particularly for the (AAIBC) three-center term, for which

the error is over 1000 kcal/mol. Given that the only explicit density dependence of the

Fock-like matrix in SCC-DFTB is introduced by the Mulliken term, and yet the relative

energetic of SCC-DFTB in practice do not introduce such errors, then some density

independent part of the model must be accounting for the difference. We will avoid such

uncontrolled approximations in the AAT approach.

3.2.3 SCC-DFTB Repulsive Energy

The SCC-DFTB total energy expression relies on the original TB form for the total

energy, in which a sum of repulsive atom-pair potentials is used. An error is introduced

because of this reliance on atom-pair potentials to describe quantities that are inherently

density dependent. Under an ideal parameterization, using the same argument as

used earlier for NDDO, where we assume the effective one-particle model SCC-DFTB

Hamiltonian returns the correct density, the error in the SCC-DFTB energy to the exact

form is

pscc s KS SCC(R) E., [p] + v1t(r)p(r)dr C ZAZB (2Q
E c- EE c(RAB) -v,([p]+ r)p(r)dr- z (3-20)

The numerical behavior of this error is shown in Figure 3-3. Clearly, this repulsive

potential is carrying the burden of a portion of the electronic energy. This is not by

accident; the value of a function (in this case the total energy) that results from the

set {RAB} between atoms A and B. For the one-particle Hamiltonian, the usual LCAO

approximation is made and the Qi's are expanded in a set of X,'s giving the equation,

HeffC = SC (1-37)


(Hff)p = (p heff I v) (1-38)


S,, = (p (1-39)

In practice, these two-center matrix elements are evaluated using the Slater-Koster

parameterization [34]. The TB approximation has been applied successfully to many solid

state problems and works especially well when the density of the chemical system is well

approximated as a sum of spherically symmetric atomic densities. In the case of molecules,

a more advanced approach that I ii the delicate balance of ( i [7] is needed. If we

compare TB to HF, we see that U(IRA RBI) has to account for J K and VNN. In DFT

it would account for J, VNN, and would combine exchange with correlation by adding vc,.

1.9 Density-Functional Tight-Binding

To vest the TB energy expression with more rigor, Foulkes and Haydock [35] derived

it from first-principles DFT considerations. The energy expression in KS-DFT, Equation

1-27, may be written as

EDFT et+ iv7 + (r ) ] [p() + R -R RE (1-40)
i A,B

Equation 1-40 is exact if p -= f where p is the exact density. However, it should

be noted that Qi are eigensolutions of h+ (-t + vt + f + vc[po]) and not

(-v + Vext + f' P('). Following Foulkes and H 1i, .d 1 the density p is replaced by a

reference density po and a small deviation 6p, such that p = po + 6p. Then we can rewrite

yields a potential energy surface and force curve which are incorrect at non-equilibrium


2.3.2 Nitromethane Dimer

In analogy with how Bukowski et al. [63] have studied the interaction energy of

dimers as a function of rigid monomer separation, we calculate the relative energies of

the NMT dimer as a function of hydrogen bonding distance. The minimum chosen in our

NMT dimer calculations from [51] is stable due to the formation of two hydrogen bonds of

equal length and antiparallel arrangement of the molecular dipoles. In Figure 2-3, we plot

the energies relative to the respective methods minima in ROH.

Our data indicate that in the case of frozen monomer geometries, AM1 overestimates

the repulsive wall at small intermolecular distances. It appears as if the H, C, N, and

O parameters in the core-core repulsion function for AM1 were unable to cancel the

effect from which MNDO has traditionally suffered, namely overestimation of energies

in crowded molecular systems [31, 64]. The TH-CCSD reproduces the interaction forces

quite well for intermolecular distances up to ~3 A, while AM1 is shown to slightly

overbind. Thereafter the TH-CCSD underbinds slightly in disagreement with RI-MP2 but

in agreement with the OM1 Hamiltonian. The OM1 method has credence as a method

which models the dipole and thus hydrogen bonding interaction properly by correcting the

overlap matrix [32, 65, 66].

Adiabatic C-N bond dissociation for small NMT clusters was investigated starting

from published local minima [51] using the TH-CCSD and AM1. In studying the dimer

interaction, for which one of the monomers is undergoing unrestricted C-N bond rupture,

the TH-CCSD forces predict that a bimolecular process takes place when the stretched

bond is at a distance of 2.42 A (1.6Rgq). The reaction products are nitrosomethane,

methoxy radical, and nitrogen dioxide radical. Figure 2-4 shows snapshots of this

particular dimer decomposition channel.

Still, there are many aspects that need to be implemented efficiently to have a viable,

stable code. We have previously discussed the use of cubic splines are an efficient way of

balancing the numerical accuracy of the two-center functions and the number of reference

points stored.

Another aspect that needs to be considered is the rotation of the two-center matrix

elements in a local atom-pair coordinate system to the gl..1, l' coordinate system of the

molecule. This rotation is unambiguous once the global coordinate system is specified,

and although orbitals of the molecule can be rotated simultaneously without affecting

the energetic, the corresponding matrix elements will change on rotation. This degree

of flexibility must be incorporated to ensure that the Fock-like matrix contribution that

arises from every atom-pair corresponds to the orbital orientation of the global coordinate

system of the molecule. To achieve this we use a simple procedure that generates the

matrix that rotates a pair of atomic coordinates so that they are aligned to the axis of

the the stored cubic spline, and then perform the reverse rotation on that sub-block of the

Fock-like matrix.

Finally, the third component implemented was the numerical integration that is

performed once the SCF procedure has converged. The ACES III program system

generates a job archive file (JOBARC) that can be read by ACES II executables, assuming

it has been generated on the same computer architecture. To test the efficacy of the AAT

approximations, the quickest route was to modify the ACES II xintgrt executable.

The xintgrt module performs numerical integration on a density to determine the

exchange-correlation energy that is needed for the residual electronic energy contribution

of AAT. Effectively, xintgrt pulls the AO eigenvector matrix from the JOBARC, as well

as other information about the system that would typically be generated by the I; -. UI

executable in ACES II when running a KS-DFT calculation (e.g. occupation numbers).

Since ACES III does not create all of the necessary records in the JOBARC, some small

adjustments to the xintgrt routine led to a new modified executable that performed the

numerical precision of the Riidenberg approximation in an effort to avoid the explicit

calculation of the numerous four-center integrals. By reducing the cost of each four-center

integral we improve the most computationally expensive step of building the Fock matrix

for large systems.

First, consider the effect of the orbital expansion on a two-center product of two

orbitals with respect to the same electron coordinate,
10 0
(t(1)0, (1) 2 [SPAk ",. (1)vB(1) + SkAB ;. ( A() (4-3)

Since a general multi-center two-electron integral is given as

(PAAc -'VBD) (IAB AcD) J B 1 (2)D -dld2 (4-4)
J J r12

substituting Equation 4-3 into Equation 4-4 for both the AB and CD orbital products

yields the following expression, which is the Ridenberg two-electron integral form,

k=1 j=1 (45)

+SkAV SAjD (/AkA jDc9D) + SkAvB SJCD (1AkA Acjc)]

In the limit that the sums over auxiliary orbitals k and j are complete this expansion

is exact. In practice, the expansion is approximated by limiting the orbitals over which

we are expanding to those included in the AO basis functions, though using a separate

auxiliary set of orbitals may also be suitable.

The Ridenberg two-electron integral expression (Equation 4-5), is more general than

the well-known Mulliken two-electron integral approximation. The Mulliken approximation

is equivalent to restricting the sum in Equation 4-1 to be only over orbitals of the same


PA 1VB1 2SPAB (, 1) + B(1)> (1)] (4-6)

All of the terms in this expression that are independent of 6p comprise the standard

TB equation. Hence,

U(|RA RB) = -.f f Pfu[po] (Po) + E.,[po + t Z (1-44)
U(IRA RBI) 2 I2- + [ 0+ IRA R:I
2 Ir r'd 2A,BRA RB

which obviously is correct to second order in p. Note the inclusion of nuclear-nuclear

repulsion in this term.

One of the great advantages of DFT approaches is that they offer a reasonable

zeroth-order approximation to the energy simply from knowing some approximation to the

reference density, po. However, since E,.[p] is not known, getting the exchange-correlation

potential, vc,(r) = 6EW,/6p(r), and its kernel, fx,(r, r') = 62E,1/6p(r)6p(r'), from first

principles, which are required to set up a rigorous DFT, is difficult [1, 36]. However, were

we to accept one of the plethora of approximations to Ec (LDA, LYP, PBE, B3LYP, B88,

etc.) the equations can be solved for a few hundred atoms, but not on a time-scale that

can be tied to MD for thousands of atoms.

Furthermore, to provide a correct description of bond breaking the electronic charge

has to be free to relocate. A common example is LiF, in which at R = Rq we have

virtually an exact Li+F- form, but at separation in a vacuum, we must have LiF0.

Such self-consistency usually is introduced by solving the AO-based DFT equations via

diagonalization, but this approach imposes an iterative step that scales as N3 to the

procedure. Since the model is meant to be approximate anyway, the alternative charge

self-consistency approach, familiar from iterative extended Hiickel theory [37, 38], can be

used more efficiently. This extension leads to the self-consistent-charge density-functional

tight-binding (SCC-DFTB) approach of Elstner et al. [7].

1.10 Self-Consistent-Charge Density-Functional Tight-Binding

To introduce charge self-consistency, the terms that are second-order in 6p are

approximated as a short-range Coulombic interaction based upon Mulliken atomic charge

Another test of a purely electronic property is simply the electronic density. In this

case a direct comparison to existing SE methods is not possible in a straight forward

way because NDDO-based and DFTB-based methods drop the core orbitals since they

are using a valence-only approach. Instead, we have used B3LYP as the reference and

compared AATMO, minimal basis HF with the Riidenberg approximation applied to

four-center two-electron integrals, and full ab initio minimal basis HF. Considering the

one-particle density matrix (AO density matrix P) from B3LYP the average RMSDs

are 0.061, 0.136 and 0.029 for AATMO, HF (with Riidenberg), and full HF, respectively.

These results demonstrate some surprising behavior. On average the full HF has the best

agreement to B3LYP, at least in the case of a minimal basis set. The AATMO agreement

is somewhat worse and HF with the Riidenberg approximation introduces a significant

error. Since AATMO includes the Riidenberg approximation, we would expect its deviation

from B3LYP to be similar to the deviation of HF with the Riidenberg approximation from

the full ab initio HF results, especially since several systems in the test have only two or

three atoms and the four-center approximation does not affect those cases. We see two

probable origins of this unexpected discrepancy; first, there is a cooperative cancellation of

errors between the Riidenberg approximation and the exchange-correlation approximation

(Go,); second, because the hybrid B3LYP functional includes only 211'. nonlocal exchange

the corresponding error introduced by the Riidenberg approximation is not as significant

when compared to the HF treatment in which full nonlocal exchange is used.

In addition to the average RMSD of the density of AATMO and full HF we consider

the standard deviation of the RMSD of the density for each molecule in the test set.

The standard deviations are 0.042 and 0.057 for AATMO and full HF respectively. This

- i-I-. -r that while the overall average RMSD is somewhat higher from AATMO the error

is slightly more systematic.

Turning our attention to energetic properties we consider the atomization energies of

the relevant portion of the G2 set. We take as the atomic reference energies the neutral

[66] W. Weber and W. Thiel, Theor. ('!1. ini Acc. 103, 495 (2000).

[67] J. Stewart, Reviews in Computational C', i,,.-Iry (VCH Publishers, 1990), vol. 1,
p. 45.
[68] W. Thiel, Adv. Chem. Phys. 93, 703 (1996).

[69] D. E. Taylor, K. Runge, and R. J. Bartlett, Mol. Phys. 103, 2019 (2005).

[70] W. Thiel and A. Voityuk, Theor. Chem. Acc. 81, 391 (1992).

[71] A. Ourmazd, D. W. Taylor, J. A. Rentschler, and J. Bevk, Phys. Rev. Lett. 59, 213

[72] D. Muller, T. Sorsch, S. Moccio, F. Baumann, K. Evans-Lutterdodt, and G. Timp,
Nature 399, 758 (1999).

[73] X. Y. Liu, D. Jovanovic, and R. Stumpf, Applied Physics Letters 86, 82104 (2005).

[74] A. A. Demkov, J. Orteg, O. F. Sankey, and M. P. Grumback, Phys. Rev. B 52, 1618

[75] L. Stixrude and M. S. T. Bukowinski, Science 250, 541 (1990).
[76] L. Stixrude and M. S. T. Bukowinski, Phys. Rev. B 44, 2523 (1991).

[77] M. Menon and K. R. Subbaswamy, Phys. Rev. B 55, 9231 (1997).
[78] M. Elstner, J. Phys. ('!,. i, A 111, 5614 (2007).

[79] N. Otte, M. Scholten, and W. Thiel, J. Phys. ('!,. i, A 111, 5751 (2007).
[80] M. P. Repasky, J. C'!I iiioh isekhar, and W. L. Jorgensen, J. Comput. ('!I, ii 23, 1601

[81] P. Pulay, Mol. Phys. 17, 197 (1969).

[82] B. Dunlap, J. Connolly, and J. Sabin, J. C'!. in Phys. 71, 3396 (1979).

[83] C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, C'!. in Phys. Lett.
230, 8 (1994).

[84] C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, C'!. in Phys. Lett.
253, 268 (1996).

[85] M. C. Strain, G. E. Scuseria, and M. J. Frisch, Science 271, 51 (1996).

[86] J. C. Burant, G. E. Scuseria, and M. J. Frisch, J. ('!,. ii Phys. 105, 8969 (1996).

[87] M. C'!I .!! ...,l>e and E. Schwegler, J. ('!I. ii Phys. 106, 5526 (1997).

02008 Joshua J. McClellan

semiempirical counterparts: NDDO and SCC-DFTB. These connections form the basis of

AAT, which is a SM rather than a SE method.

Existing SE methods are widely used and, when applied judiciously, perform quite

well. Regrettably, they are often used in situations for which the results are not reliable,

such as locating transition states. To expand the applicability of SE methods, we combine

their primary features with those from rigorous WFT or DFT, and define the new AAT

framework. Hence, AAT encompasses both NDDO and SCC-DFTB.

Despite their formal deficiencies, both NDDO and SCC-DFTB have been quite

successful in a variety of applications [2]. Nonetheless, there remain several aspects

of these SE methods that fall short of expectation that could be greatly improved. In

addition to obvious formal limitations, existing SE methods were designed to reproduce

the energetic of systems at equilibrium and often yield unphysical results near transition

states and along dissociation pathi--,-. Uniform accuracy of the PES is critical for

molecular dynamics (\!1)) simulations, for determining the lowest energy (global

minimum) structure, and for calculating rate constants. Associated with these failings

are unrealistic electronic densities that, in turn, make combining these methods into

multi-scale modeling schemes difficult. Furthermore, errors in first-order properties such

as the dipole and higher-order moments would be remedied by a method that delivers

adequate electronic densities. Attempts to improve such properties have met with modest

success, typically at the expense of other aspects of the model.

By far, the most common route to incorporate such additional features is to

reparameterize the model. Unfortunately, building in accuracy for non-equilibrium

structures, excited states, and other properties, by simple reparameterization of existing

NDDO-based SE methods [3] is not readily achievable, at least not in a way that

maintains transferability (similar accuracy regardless of bonding characteristics or system

size). This difficulty highlights an important distinction between the current AAT route of

designing a new simplified model Hamiltonian based on rigorous ab initio principles versus

deceptively similar:
HF 1

P/ (pv|AU)JABJCD 2 AlV)JADJBC (3-2)

where p, v, A, and a are on centers A, B, C, and D respectively and the terms (puv Aa)

are subject to the multipole-multipole expansion approximation. It is true that a typical

NDDO-type two-electron integral encountered is often larger in magnitude than a

corresponding non-NDDO-type two-center, three-center, or four-center two-electron

integral. For molecules with more that three heavy atoms in a valence-only minimal basis

set representation, there are significantly more terms involving three- and four-center

two-electron integrals. The combined effect of this large number of multi-center terms

has a significant effect on the density dependent Fock matrix contributions. We consider

a simple example. Shown in Figure 3-1 are the individual contributions to the NMT

density dependent Fock matrix element of the carbon 2s-type orbital and nitrogen 2s-type

orbital in a minimal basis set for the combined three- and four-center, two-center, ab initio

NDDO-type, and full multi-center contributions. Though only a single matrix element

is plotted over this reaction coordinate, the trend is the same for other matrix elements

and other molecules. The three- and four-center contributions are nearly equal and oppo-

site to the two-center contributions. This cancellation is a universal trend and holds at

all energy scales. And, while it could be said that the NDDO-type class of two-electron

integrals approximates the full set of two-center two-electron integrals, it is clear that

the NDDO-type integrals alone bear little resemblance to the full two-electron integral

contribution. Instead, it is the delicate balance among all multi-center two-electron

integrals that is critical to obtaining the proper Fock matrix contribution.

It has been known since studies were first performed on the NDDO approximation

that a parameterized set of NDDO integrals perform better that the ab initio NDDO

integrals. However, what has been less clear is the lack of correspondence between the

NDDO approximation and the quantity it is supposedly approximating. Instead it seems

The electronic energy correct through first order is

&= (o01H 0o)

Tr[PHcore] + (ijij) (3-27)

Note that the first-order electronic energy is exactly the same as the HF energy expression

and furthermore has no explicit dependence on the particular form of the one-particle

operator f.

However, the goal is not the first-order electronic energy but the exact electronic

energy. We can consider the CC route, where we know that for a single determinant the

exact electronic energy is (with any choice of orbitals)

Ec = + Y j||ab)(t + t t tt ) + (if la)t? (3-28)
The CC energy expression provides great insight, but the explicit calculation of ta and ta

amplitudes requires much more computational effort than we are able to expend for a fast

semiempirical model. A more natural environment for a semiempirical methods is given

by KS-DFT. Assuming that we have the exact exchange-correlation energy functional, the

exact KS-DFT energy is (for KS-DFT orbitals)

Eel =R + (ijji) +Exc[p]

= Tr[PHcre] + (ijlij) + E[p] (3-29)

There is another form of the KS-DFT energy expression that is more useful for the

current development, though it requires that we use a slightly more specific form for

our one-particle operator. If the target is a one-particle Hamiltonian that returns the

correct electronic density, and hence the correct first-order electronic properties, then the

ideal one-particle operator is indeed the KS-DFT one-particle operator. And, if only the

total operator and not its individual components, i.e., V2, Vext(r), VJ(r), and vxc(r), is

which captures a large portion of the Riidenberg expansion and when applied to

two-electron integrals yields,

+(VBV IAc AcA) + (VBVB U DU D) -

Note the similarity of this expression to that which appears in SCC-DFTB, Equation

3-18, in which

7AB ~ (PAPA VBVB) (4-8)

though in SCC-DFTB the orbitals p and v are restricted to be s-type functions.

We consider here all eight different approximations that result from the application

of Equation 4-1 to three- and four-center two-electron integrals. The two classes of

three-center terms are of the form (PAVA IABcc) and (PAVB AAUc), which will be referred

to as types A and B respectively. To facilitate our discussion we introduce the numbering

convention in Table 4-1.

The Riidenberg and Mulliken formulas can be applied to either class of three-center or

the four-center integrals (approximations I-VI) using Equations 4-5 and 4-7, respectively.

For the partial approximations to four-center terms (VII and VIII), partial means that the

corresponding approximation has been applied only once. The expansion for these partial

approximations is then in terms of three-center two-electron integrals (instead of only

two-center terms). For the Mulliken partial (VII) we have

(PAVB AcI7D) = -{ SAB [(PAA AAciD) + (VBVBA CUD)1
4 (4-9)
+SACoD[(PAVB AcAc) + (PAVIB U-DD)1}.

and the Riidenberg partial (VIII) is
1 oo
(PAVB Aco-D) = 4 [SPAk (kBB IAcUD) + SkAB (pAkA AcUD)
k= (4-10)

+ SAckD (PA VB| I kDUD) + SkD (A VB I Ackc)]





Figure 2-7. Average deviation of O-Si-O angle

where Xk is a reference point (node), and f(xk) = Ak. The determination of the

parameters Bk, Ck, and Dk is straight-forward and unambiguous. We have used the freely

available Duris routine for discrete cubic spline interpolation and smoothing [91]. The only

degrees of flexibility when using cubic spline interpolation are the number of nodes used

and if they are evenly spaced (discrete) or based on an adaptive grid. In our case, for both

integrals and matrix elements, the functions in all cases go to zero in the separated limit

and at smaller length scales the function requires a dense grid to completely capture the

subtle changes that occur in that region. A dense node spacing over the entire length scale

is unnecessary because the functions change very little at long length scales.

To determine a suitable node mapping function we tried several different functional

forms and concluded that a simple exponential function provides a good balance between

having a dense grid in the more critical bonding region (from around 0.5 A to 5 A) and a

sparse grid at long bond lengths (hundreds of A). The actual node mapping function used


node(RAB) Int(1000 e-6/(RAB+)) 1. (4 36)

All of the functions we have currently implemented, those involving two-center exchange-

correlation approximations, are zero beyond 10 A. Using the node mapping function then

requires approximately 600 reference points. Since there are 5 variables per reference

point we need 3000 parameters that would form a look-up table for a single function. If

double precision numbers are used to store the spline parameters the storage requirement

is about 0.023 MB per function. Further details of the current implementation of the AAT

procedure in the parallel ACES III program system can be found in Appendix A.

4.5 Verification

We have shown that AATMO reproduces well the basic features of a PES for a small

test set of molecules involving the direct fission of a C-N bond. Now we turn our attention

to a more thorough comparison to test the behavior of AATMO for a variety of different

properties. We use the diverse set of molecules in the so-called G2 test of molecules

with universally transferable parameters. The difficulty still remains on how best to

incorporate the intersection of these two functions in a practical way.

We have developed the strategy and framework for eventually including the

three-center terms. For now we are satisfied with reducing the computational cost through

only the four-center Riidenberg approximation because the four-center terms account for

most of the computational effort associated with construction of the Fock matrix for large


4.2 Kohn-Sham Exchange and Correlation

If we envision an approximation to KS-DFT then a primary focus ought to be

avoiding the numerical integration that must be performed for each SCF iteration. The

formal scaling of the numerical integration step is norbitls, as it is performed in real space.

In practice, the scaling is lower because the exchange-correlation potential is local and

the density decays rapidly. Still, we want to avoid any extra computational effort that

contributes little to the overall energetic. To approximate the exchange-correlation

potential contribution consider the expansion of the potential in terms of atom-centered

densities, p = LA PA. Then we have the following telescoping series

Vxc[ PAI](r) = vYxc[pAI(r) + {vc, [pA+ PBI](r) -V,[PAI](r) V.cC[PBl (r)}

+ S {vXc[PA PB+pc](r)

Vc[PA + PBI ) Vc [PA + c](r) Vc [PB + Pc] (r)

+ vxc[PA (r) + Vx[PB (r) + xc[pc] (r)} + ..
1-center 2- center 3-center _18)
Vc 1+ Vc + Vzc + (4 8)

Equation 4-18 is exact. It only requires an arbitrary partition of the density into

atomic parts, though convergence of the series is affected by the choice of partition.

Our target is to include contributions from the exchange-correlation potential to the

KS Fock-like matrix in a way that only requires the storage of a limited set of two-center

such as found in HF or DFT. It is also possible to define others, as in the correlated

orbital potential (COP) method [10, 11].

HF illustrates many of the necessary features of a one-particle theory. One distinct

feature, which sets HF apart from most other one-particle theories, is that HF enforces the

Pauli exclusion principle exactly via the exchange operator. Such exact exchange terms

are missing in Kohn-Sham density functional theory [12] which leads to the so-called self-

interaction error. Another benefit of HF is that there is Koopmans' theorem that gives

meaning to the eigenvalues of the Pock matrix by relating them to ionization potentials

and electron affinities. However, the exact DFT structure can provide a formally correct

one-particle theory that includes electron correlation, which HF cannot do because by

definition it does not include electron correlation. COP is another exact alternative that

can be defined from WFT.

1.4 Coupled-Cluster Theory

The introduction of electron correlation in wavefunction theory can be achieved by

generalizing the mean-field wavefunction to allow electrons to be excited into the orbitals

that are unoccupied (virtual, i.e., do not appear in the Hartree-Fock determinant) orbitals,

a,b,c, .... Those additional orbitals allow the electrons to avoid each other in a more

detailed way (hence lower the energy) than in the mean-field wavefunction. That is, for

single excitations (replacements), double excitations, triple excitations, etc., we define

excitation operators Tj, j = 1,2, 3,...

T o1 = t + (1-12)

T2+=0 > tt (-13)
i Wou= C tb (1-14)
i i

[24] D. J. Tozer, V. E. Ingamells, and N. C. Handy, J. C'!i. in Phys. 105, 9200 (1996).

[25] K. Peirs, D. Van Neck, and M. Waroquier, Phys. Rev. A 67, 012505 (2003).

[26] R. Hoffmann, J. C'!I. ii Phys. 39, 1397 (1963).

[27] M. J. S. Dewar, J. Friedheim, G. Grady, E. F. Healy, and J. Stewart, Organometallics
5, 375 (1986).

[28] J. J. P. Stewart, J. Computat. C'!i. i, 10, 209 (1989).

[29] J. J. P. Stewart, J. Computat. C'!i. i, 10, 221 (1989).

[30] M. J. S. Dewar and W. Thiel, J. Am. C'!., in Soc. 99, 4899 (1977).

[31] M. J. S. Dewar, E. G. Zoebisch, E. F. Healy, and J. J. P. Stewart, J. Am. C'I, in Soc.
107, 3902 (1985).

[32] M. Kolb and W. Thiel, J. Computat. CI'. in 14, 775 (1993).

[33] M. J. S. Dewar and W. Thiel, Theor. C'!., in Ace. 46, 89 (1977).

[34] J. C. Slater and G. F. Koster, Phys. Rev. 94, 1498 (1954).
[35] W. M. C. Foulkes and R. Haydock, Phys. Rev. B 39, 12520 (1989).

[36] R. J. Bartlett, I. Grabowski, S. Hirata, and S. Ivanov, J. ('!i, in Phys. 122, 034104

[37] M. Wolfsberg and L. Helmholz, J. CI. in Phys. 20, 837 (1952).
[38] C. J. Ballhausen and H. B. Gray, Inorg. C'!i. i, 1, 111 (1962).

[39] G. Klopman, J. Am. C'!., in Soc. 86, 4550 (1964).
[40] K. Ohno, Theor. C'!I, i. Ace. 2, 219 (1964).

[41] K. Nishimoto and N. Mataga, Z. Physik. ('!I, in (Frankfurt) 12, 335 (1957).

[42] G. Z1i. w-- S. Irle, M. Elstner, and K. Morokuma, J. Phys. C'!, ii. A 108, 3182 (2004).

[43] C. E. Taylor, M. G. Cory, R. J. Bartlett, and W. Thiel, Comput. Mat. Sci. 27, 204

[44] R. J. Bartlett, J. Phys. C'!. i,, 93, 1697 (1989).

[45] I. Rossi and D. Truhlar, C'!., i, Phys. Lett. 233, 231 (1995).
[46] P. C'!i rbonneau, Astrophys. J. (Suppl.) 101, 309 (1995).

[47] R. H. Byrd, L-BFGS-B version 2.1, University of Colorado. (1997).

correspond to them. There is no flexibility with this interpretation. In addition, if we

have the F that yields the correct density then to get the exact total energy, Eresidual [P]

must be Ersidual [P]. Clearly, we are dealing with approximate models, though instead of

defining the target or reference as being a single number, such as an experimental heat

of formation, we instead have three separate and distinct functional forms, for which the

exact forms are, in principle, known.

3.3.2 Density Independent Fock-Like Matrix Contribution

This component generally accounts for nuclear-electron attraction and the electron

kinetic contribution. The following definitions are based on p E A and v E B.

* HF:
HHF2 A1- 1 ) (333)


H S u, 1- YBA ZB(ppBB) if- (334)
S/I (0/, + 0) if p '

HDFTB H= H (3-35)
Pi/ 2 "P

HSCC-DFTB Ho S, (7A (3-36)
Pi/'2 2

Hv2s 2ZA -
HPV (-v- IV) (337)
2 A r3A

The only density independent contribution to the ideal Fock-like matrix (the Fock-like

matrix which returns the correct one-particle ground state density) is the exact core

Hamiltonian, H0re". The 'li.-- -1 bottleneck for computing contributions for H" r" is the

three-center one-electron integral, which scale as norbitalsNatoms. Still, they represent

a small computational effort for large systems when compared to the orbitals terms

o TH(CCSD) UHF ................
S60 B3LYP/6-31G* -... -




1 1.5 2 2.5 3 3.5
C-N reaction coordinate (Angstrom)

Figure 2-2. The NMT energy curve for C-N rupture

which attempt to fit a known energy surface to a SE form via reparameterization, the Th

fits to the Hamiltonian itself, to provide a form that is simple enough that it can be solved

on a time scale consistent with large scale MD.

We start from the exponential ansatz in which the exact ground state wavefunction is

given by

I ,) = eT| o) (2-1)

where 4o is a reference function, which can be taken as a single determinant and T is the

excitation operator of CC theory. When defined this way, we may write the Schr6dinger

equation in terms of the similarity transformed Hamiltonian, H.

HIo) = e-THeT ) = E o). (2-2)

The solution of this equation has the same eigenvalues as the bare Hamiltonian. The total

energy, E, is exact, thus the forces, i-, are exact as well. In second-quantized form H

may be written as,
H = ,Pq + 2 (p llrs)ptqtsr
p,q p,q,r,s

+ (pqr lstu)ptqtrtuts + (2-3)

where we show the inclusion of three- and higher-body interaction terms. To include

the higher-order terms we renormalize with respect to the one- and two-body terms and

obtain a generalized, correlated Fock-like operator, called the transfer Hamiltonian [10, 43]

Th Y[h,, + Y Pxpq, rs(lth (2-4)
PV A,7

where PA, is the single particle density matrix and both h., and (pqllrs) can be

approximated with a suitable functional of atom-based parameters. We have used the

NDDO form in the current work, but the formalism developed thus far is not limited by

this choice of Fock-like operator.

for the three-center two-electron integrals. Though there is some cancellation, the

possibility of complete cancellation among the three-center one-electron attraction and

two-electron repulsion integrals is limited. All three-center one-electron terms appear only

in off-diagonal blocks of the Fock-like matrix and are small compared to the dominant

three-center two-electron terms which appear primarily in the diagonal blocks.

3.3.3 Density Dependent Fock-Like Matrix Contribution

This component generally accounts for electron-electron contributions.

* HF:
Gcf P,((piv A )- A(IA v)) (3 38)

GC DDO Px.((uA )6AB6D 2- (pA AD6Bc) (3-39)

G TB = 0 (3-40)


GSCC-DFTB PA[ SpS, (AC + BC + .AD + 7BD) (3-41)

G'fS = ( j[P] (r) + v.c[P](r) v) (3-42)

The density dependent term of the ideal Fock-like matrix is GKS. This choice

follows directly from the chosen form for the density independent component and the

requirement that we reproduce the exact electronic density. The density dependence of

the Fock-like matrix accounts for the complex interactions among electrons. This term

is dominated by Coulombic repulsion. The second most important term is generally

called exchange. The third component is correlation, and can be considered to be the

highly complex way in which electrons interact with one another, i.e., their motion is

0 5000 10000 15000 20000 25000

Figure 4-8. Timing of Fock build using traditional NDDO and NDDO with cubic splines
as a function of system size


0 0.06




Figure 2-5. Average deviation of Si-O stretch

linear scaling are equally applicable here. The nonlocal exchange that is not considered in

SCC-DFTB is allowed in AAT, so that B3LYP and other hybrid DFT methods can be a

target of the approach. We have also developed a route toward a fully two-center model

based on considerations of the three-center two-electron repulsion integrals.

The Fock build requires no numerical integration, but the AAT approach does

explicitly include exchange and correlation. In the current approach we have used

the minimal basis set B3LYP exchange-correlation potentials as a reference, further

improvement is possible by evaluating large basis set ab initio DFT exchange-correlation

potentials and subsequently projecting them in a minimal basis set representation.

Extensions of the AATMO that include a more complete expansion of the telescoping series

for the exchange-correlation potential have also been described and their implementation

would be a further improvement on the results presented.

-3 -2 -1 0 1 2 3
Separation (Bohr)

-4 -2 0 2 4
Separation (Bohr)

Figure 4-3.

Orbital products. A) Gaussian orbital product. B) Slater orbital product



t --2
-2.5 5 B3LYP
-3.5 A M 1 **................
SCC-DFTB -................
-4 -----
2.5 3 3.5 4 4.5 5 5.5 6
R (Angstrom)

Figure 4-10. Pseudo-reaction-path splitting C20

We have discussed the basic features of HF, CC, KS-DFT, Hiickel, NDDO, TB,

DFTB, and SCC-DFTB, and have alluded to the role each pl in the development

of an improved semiempirical method. We take two approaches to develop this target

method. First, we demonstrate the transfer Hamiltonian, which provides an exact

one-particle framework in WFT. Second, we develop adapted ab initio theory which

provides rigorously verifiable approximations to reproduce ab initio reference data.

3.2 Self-Consistent C' irige Density-Functional Tight-Binding . ... 62
3.2.1 Self-Consistent C!i ,i'ge Density-Functional Tight-Binding in an AO
Framework ...... .............. ........... 62
3.2.2 Mulliken Approximation Connection to SCC-DFTB . ... 65
3.2.3 SCC-DFTB Repulsive Energy . . . .. ... 66
3.3 Systematic Comparison of HF, NDDO, DFTB, SCC-DFTB, KS-DFT in a
Single Framework .................. ............. .. 67
3.3.1 General Energy Expression . . ........ .. .. 70
3.3.2 Density Independent Fock-Like Matrix Contribution . . 71
3.3.3 Density Dependent Fock-Like Matrix Contribution . ... 72
3.3.4 Residual Electronic Energy ................ ..... 73
3.3.5 General Gradient Expression . . ....... ..... 74
3.3.6 Requirements for an Accurate Simplified Method . . ... 75

4 ADAPTED AB INITIO THEORY .................. ..... .. 80

4.1 Two-Electron Integrals ............ . . ... 80
4.2 Kohn-Sham Exchange and Correlation ........... ... .. 89
4.3 Adapted ab initio Theory Model Zero ............. ... .. 92
4.4 Implementation ............... ........... .. 94
4.5 Verification ............... ............ .. 97
4.6 Conclusion ................ ............. .. 101


REFERENCES ..................... ...... .......... .122

BIOGRAPHICAL SKETCH ................... . ... 127

APPENDIX Parallel Implementation in the ACES III Environment

The ACES III program system environment has been designed for the efficient parallel

implementation of wavefunction theory methods in computational chemistry. The ACES

III program is suited to treating large systems, typically 500-1000 basis functions and up

to 300 electrons with post-HF ab initio methods. The super instruction processor (SIP)

manages the communication and processing of blocks of data, such blocks are intended to

be somewhat large (on the order of 500,000 floating point numbers). To provide a more

direct route for implementing new methods the SIP reads the code written in the high

level symbolic language, the super instruction '--, ,,,11,; ; ,la,;ij,h: (SIAL) (pronounced

- i!"). The SIP hides many of the more complicated aspects of parallel programing

allowing the SIAL programmer to focus on algorithm and method development. Within

SIAL each index of an array is divided into segments, the segments map elements of

the array into blocks. An algorithm then involves operations for which the segment is

the fundamental unit of indexing. Operations that involve the contraction of two arrays

can then be broken into several smaller contractions over two blocks. Each block-pair

contraction can then be distributed over many processors, allowing for the parallel

implementation of the contraction of two large arrays.

A general issue in parallel codes is ensuring that latency does not become a rate

limiting step. This can be done by making sure that the amount of computation done is

on a par with the latency of any operation. Such balancing is machine dependent, so it

requires some amount of testing to determine an optimal solution for a particular machine.

For the implementation of the AAT methodology this balancing is critical and requires a

different treatment than is usually used for post-HF methods.

The structure of the AAT equations are quite different than post-HF equations.

In the former the atomic indices are used as the primary variable in the loops used to

construct the Fock-like matrix. For post-HF methods the orbital indices are the important

variable. Moreover, the number of orbitals per atom in AAT is small relative to the

Table 1-2. The NDDO two-center two-electron integrals

(ss ss)
(ss pp,)

(P7P Iss)
(PaPa ss)

(PwPw Pua,)
(ppa lpp,)
(Pa\Pa aPa)
(spa ss)

(spa p-P,)
(spa PaPa)
(ss spa)
(p-Pw Spa)
(PaPa SPa)
(spw spw)
(spa Spa)
(sP lpWP,)
(p-Pa sp,)
(P-Pa pPa,)
(P'Pw' [PPw')

is to be described. However, this sacrifice means that the new Th parameter set must be

applied judiciously. If used for a class of molecules for which it was not trained, then the

accuracy of calculated properties could be diminished. If the appropriate set of reference

data and a sufficiently complete model are chosen, then the Th will provide an equally

appropriate and complete description of the system.

Among the first to study NMT theoretically were Dewar et al. [56] who used

MINDO/3 and found that the NMT-MNT rearrangement occurred with an energy

barrier of 47.0 kcal/mol and that MNT dissociated to H2CO+HNO with an energy of

32.4 kcal/mol. Hence they proposed that NMT decomposition occurs via rearrangement.

McKee [57] reported the NMT-MNT rearrangement barrier to be 73.5 kcal/mol calculated

at the MP2 level of theory with a 6-31G(d) basis and that MNT dissociated to H2CO+HNO

with an energy of 44.1 kcal/mol. The high barrier height of the rearrangement -I-. -

that the decomposition pathway is simple bond rupture to NO2 and the CH3 radical.

Hu et al. [59] performed calculations with the G2MP2 approximation and found that

the NMT dissociation occurs via direct C-N bond rupture with an energy of 61.9

kcal/mol. The NMT dissociation is lower than both the NMT-MNT rearrangement

and the NMT-aci-nitromethane rearrangement by 2.7 and 2.1 kcal/mol, respectively.

Nguyen et al. [58] reported that the NMT-MNT rearrangement barrier is 69 kcal/mol

and NMT direct dissociation to CH3 and NO2 radicals is 63 kcal/mol calculated at the

CCSD(T)/cc-pVTZ level using CCSD(T)/cc-pVDZ geometries. Most recently Taube and

Bartlett [60] report the NMT-MNT rearrangement at 70 kcal/mol using ACCSD(T).

Hu et al. [59] also performed a detailed study of unimolecular NMT decomposition

and isomerization channels using DFT with G2MP2//B3LYP level of theory in a

6-311++G(2d,2p) basis. Single points were basis set extrapolated along the minimum

energy paths to correct for incompleteness using the G2MP2 method. Their article shows

that from a particular intermediate state (IS2b) one possible reaction path is the rupture

of the N-O bond to give the methoxy radical and nitrous oxide. Hu et al. indicate that on

density matrix P is defined by a set of coefficients C for the real spin-unrestricted case,
P^ z(c a + AC,:c (3 21)

This dependence allows us to use a self-consistent field (SCF) procedure: determine the

Fock-like matrix from a guess density and determine a new set of coefficients and the

corresponding density, repeat until the density output is within some threshold of the

input density. Alternatively, FPS PSF and (a lhfj) 0 would be good measures of
i -Ca

Upon convergence of the SCF equations we have a set of AO coefficients that define

the set of MOs

= ZC^xv, (3 22)

a reference single determinant wavefunction is then generated as an antisymmetrized

product of the MOs,

=o A{p1(1)02(2).../. (n)} (3-23)

It is useful at this point to define the first-order electronic energy given a zeroth-order

wavefunction, IoQ). The exact electronic Hamiltonian is

^ h(i+ (3-24)
i i,j>i +3

where the core Hamiltonian is

h(1) ZA (3-25)
2 A 71A

and in its matrix representation the core Hamiltonian is

H,`"7 I XJ(r1)h(r1)x,(rdr1 (3-26)

Table 4-1. Adapted integrals
(PAVA ABec) I Mulliken
3-center Type A II Riidenberg
3-center Type B IV Riidenberg
(IAVBIAc- D) V Mulliken
4-center VI Riidenberg
VII Mulliken Partial
VIII Riidenberg Partial

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Joshua J. McClellan

May 2008

C'!I ir: Rodney J. Bartlett
Major: Chemistry

We present work toward a one-electron Hamiltonian whose solution provides

electronic energies, forces, and properties for more than 1000 atoms fast enough to drive

large scale molecular dynamics. Ideally, such a method would be as predictive as accurate

ab initio quantum chemistry for such systems. To design the Hamiltonian requires that

we investigate rigorous one-particle theories including density functional theory and

the analogous wavefunction theory construction. These two complementary approaches

help identify the essential features required by an exact one-particle theory of electronic

structure. The intent is to incorporate these into a simple approximation that can provide

the accuracy required but at a speed orders of magnitude faster than tod 'v's DFT.

We call the framework developed adapted ab initio theory. It retains many of the

computationally attractive features of the widely-used neglect of diatomic differential

overlap and self-consistent-charge density-functional tight-binding semiempirical methods,

but is instead, a -.,,1l.:7 1 method as it allows for precise connections to high-level ab

initio methods. Working within this novel formal structure we explore computational

aspects that exploit modern computer architectures while maintaining a desired level of


rapid and accurate MD studies due to the inherent speed of NDDO and the fitting of

the parameters to high-level CC reference forces. In this sense we are only limited by

the reference data and the quality of the model Hamiltonian. We have demonstrated

transferability of monomer parameters to nitromethane dimer and trimer. These forces

predict chemical reactivity in clusters that is not observed in AM1 simulations and, in

the case of the trimer, we see a threefold rearrangement of NMT to MNT giving further

insight into the detonation process.

2.4 Silica and Silicon

The TH depends upon the choice of the form of the Hamiltonian. So far we have

used the NDDO form [67, 68] because of its wide use and prior successes. However, such

methods using standard parameters like AM1 [27], MNDO [30-32] and PM3 [28, 29]

are also known to have significant failings compared to ab initio methods. Our TH, on

the other hand, benefits from the fact that we only require specific parameters [45] for

the systems of interest, for example, SiO2 clusters and H20 in our hydrolytic weakening

studies [69]. These parameters permit all appropriate clustering and bond breaking for

all possible paths for all the components, to allow MD to go where the physics leads.

Furthermore, unlike traditional semiempirical methods, we do not use a minimal basis set,

preferring to use polarization functions on all atoms, and in some cases, diffuse functions.

The form of the Hamiltonian emphasizes two-center interactions (but is not limited to

nearest neighbors as in TB), which allows our parameters to saturate for relatively small

clusters. We check this by doing ab initio calculations on larger clusters involving the

same units to document that the TH is effectively unchanged with respect to system size.

At that point, we have a TH that we can expect to describe extended systems composed

of SiO2, and its solutions properly, including all multi-center effects on the energies and

forces. Only the Hamiltonian is short-range. We also check our TH by applying it to

related systems. We will show that our TH determined from forces for bond rupture in

disiloxane, pyrosilicic acid, and disilane also provide correct structures for several small

The total energy is

E = nici (1-30)

where ni is the occupation number for the ith MO. Note there is no explicit consideration

of the nuclear repulsion in Equation 1-30. The Hiickel approximations are

a ifi j

Hi = 3 if i are nearest neighbors j (131)

0 otherwise.

The extended Hiickel method [26], as its name implies, is merely an extension of the

standard Hiickel method with the same total energy formula. The full matrix equation

HC = SCc is solved, with overlap explicitly included. And the Hamiltonian is subject to a

slightly more sophisticated approximation:

a if i = j
Hij = KSij(Hii + Hjj) if i and j are nearest neighbors (1 32)

0 otherwise

where Hii is based on the ith AO of an isolated atom and K is a dimensionless adjustable

parameter for an atom-pair. Hoffmann -,ii-:. -1. 1 a value for K based on its effect on the

energy of ethane (C2H6). He argued that, since the energy dependence on K is nearly

linear beyond K = 1.5, the MO energies and, in turn, the total energy, will not be too

sensitive to K for values greater than that. He then considered the energy difference

between the 1 i-:-. red and eclipsed conformations of ethane and optimized the value of

K to minimize the error between calculated and experimental values. The final value he

arrived at was K = 1.75, a value is still in common use.

The important aspect to note about the Hiickel method, as it pertains to the

development of improved semiempirical methods, is that a very simple model can

reproduce a few features qualitatively well. The spatial symmetry is correct due to the

inclusion of the overlap matrix. However, despite the ability of such basic approximations

that one can include overlap explicitly and still retain a computationally efficient

framework. Thus, AAT explicitly includes the overlap between atomic orbitals.

The predominant computational aspects of the method have been highlighted; now,

consider the formal development of the AAT approach. There is an exact reference to

the exchange-correlation potential with which we can directly compare. The electronic

density can be corrected iteratively by an inverse procedure, such as that of Zhao,

Morrison, and Parr [8] or that of van Leeuwen and Baerends [9]. These procedures

take a high-quality reference density and generate the numerical exchange-correlation

potential that best reproduces it. In the limit that the reference density is exact, such

procedures produce an exact local exchange-correlation potential. This exact limit

defines a set of one-particle equations, which in turn, define the first-order electronic

energy. When added to the nuclear-nuclear repulsive energy, the electronic energy gives

the total energy correct through first-order in the trial wavefunction. In existing SE

methods, the remaining higher-order energy terms are combined with the nuclear-nuclear

repulsion to give a parameterized atom-pair based total energy correction that would be

predominantly nuclear repulsion. Instead, we choose to retain a separate nuclear-nuclear

repulsion contribution to ensure that every aspect of our model has a direct one-to-one

correspondence to ab initio theory. The connections to the exact limits for both density

and total energy components of the overall parameterization provide the necessary

structure so that every approximation of the model can be rigorously verified.

1.2 Quantum Chemistry

Electronic structure theory describes the distribution of electrons around nuclei in

atoms, molecules, and solids. Many observable features of a molecule are given by its

electronic structure, though computational efficiency requires approximations that can

limit the accuracy with which we can describe some chemical phenomena. There is a series

of standard approximations that permeate quantum chemistry that restrict its universal

applicability, but that are valid for many cases of interest:

* Time-independent Schridinger equation.

* Born-Oppenheimer approximation.

* Nonrelativistic Hamiltonian.

* Linear combination of atomic orbitals.

The time-independent Schridinger equation provides the solutions needed for the

time-dependent Schr6dinger equation, and though some problems require a time-dependent

solution, the vast ii, ii iiy of problems in quantum chemistry can be treated adequately

with the time-independent Schridinger equation. The Born-Oppenheimer approximation

is based on the simple observation that electrons move much faster than nuclei. This

difference in speed effectively implies nuclear motion cannot affect the electronic

structure significantly and we can solve the Schr6dinger equation for a set of fixed nuclear

coordinates. For example in the hydrogen atom the electron is traveling at about two

million meters per second, which is orders of magnitude faster than the motion of the

nucleus. Also, this speed is less than 1 of the speed of light, which also implies that we

need not concern ourselves with relativstic effects that would only become important for

atoms with heavier nuclei (typically those with atomic number larger than 25). For such

heavy atoms the speed of an electron near the nucleus will approach the speed of light and

would require a relativistic Hamiltonian for an accurate description. In such cases effective

core potentials that approximate the passive (Darwin and mass-velocity) relativistic effects

can be used, which still allows one to keep the nonrelativistic Hamiltonian framework.

Finally, the linear combination of atomic orbitals (LCAO) approximation introduces basis

sets into the problem and gives a general framework for systematically expanding the

reference space of the orbitals that the electrons may occupy.

The time-independent Schr6dinger equation within the Born-Oppenheimer (clamped-

nucleus) approximation is

Hf = ET (11)

Because we insist that an approximate Fock-like matrix reproduce the KS Fock-like

matrix, and thereby define the first-order electronic energy, the residual electronic energy

must be Ersidual. The residual electronic energy and its combination with the unrelated

nuclear-nuclear repulsion term in SE methods is notorious. We have attempted to include

this term in a simple and computationally efficient way in our method. If our Fock-like

operator included exact (non-local) exchange then this term would introduce only the

residual correlation energy contribution. Although the correlation energy is a small

component (< 1 .) of the total energy, its accurate inclusion is required to obtain chemical


3.3.5 General Gradient Expression

One of the overarching objectives of our work is to create a model Hamiltonian

to drive large-scale MD. Achieving this goal requires an efficient method for obtaining

gradients. Given the general energy expression provided by Equation 3-30, the gradient is


dE 2S _H ,,H, 1 8 8Eresid9al
da aa aa 2 6a aa
i Ac

which is based on C C, (F/,, cS,,) = 0 and orthonormality. Note that there is no

explicit dependence on !. Equation 3-48 only holds if terms in G are linear in the


G,, = PcAGZ (3-49)
The derivatives of Eresidual, which are not present in HF, are simple for those

methods that only depend on a sum of atom-pair terms, such as NDDO, DFTB, and

SCC-DFTB. In KS-DFT additional numerical integration is required for the perturbed

density. Potential approaches for reducing the computational cost associated with the

derivatives in KS-DFT will be discussed in further detail later.

Furthermore, there is evidence that any model that is only dependent upon

pair-potentials, (i.e., nearest-neighbor distances) such as standard orthogonal TB schemes,

lacks the ability to describe some fundamental properties of silica. It is difficult to

prove such a claim, as we are able to imagine an expansion of the energy about some

parameter, such as nearest-neighbor distances. However, Stixrude et al. [75] have

shown experimentally, based on the two-fold compression of liquid SiO2, that there is

essentially no change in nearest-neighbor distances though there are significant changes

in medium-r r',,' structure. This medium-range structure is best described by ring- and

cluster-statistics and cluster geometry [76], which would be not be incorporated in any

two-body potential.

To avoid the limitations of empirical two-body potentials, one may consider NDDO

or non-orthogonal TB methods discussed above, which include some many-body effects

due to inclusion of the overlap matrix [77]. Especially promising is the SCC-DFTB

approach [7] which includes the largest contributions to long-range Coulomb interactions

while handling charge in a self-consistent way. These methods offer a practical balance of

chemical accuracy and computational efficiency, as evidenced from their recent increase in


Figure 2-8 shows the deviations of the TH-CCSD parameterization, AM1, and

MNDO/d NDDO calculated unit cell densities from reference DFT densities. The

reference DFT calculations for all silica polymorphs are from the VASP code using

the gradient-corrected PW91 GGA exchange-correlation functional with 3D periodicity, an

ultrasoft pseudopotential, and a plane wave basis set (395 eV cutoff). A 2x2x2 k-point

grid is employ, 1 which converges total energies to within 0.05 eV/cell, and the

convergence of self-consistent steps is precise to 10-4 eV/cell. The optimization of cell

parameters and atomic positions is precise to < 10-3 eV A-1.

It is clear that the TH-CCSD parameters trained on small silica clusters are

transferable to several silica polymorphs. It is also interesting to note that not only is

or, better, -.:,,1 fi.:; methods (SM) are not yet sufficiently reliable, although they do

provide an excellent opportunity for improvement. There are two general approaches to

improve SE methods: first, existing reference data can be expanded to include additional

systems and properties or improved fitting procedures can be used to help locate new

minima in the model Hamiltonian's parameter space; second, more rigorous model

Hamiltonians can be derived to satisfy more demanding desiderata, such as uniform

accuracy of the energy and forces across the entire potential energy surfaces (PES) and

improved electronic densities, ionization potentials, and associated first-order properties.

The first route is a mainstay of SE method development, though we will show that the

second SM strategy has the potential to be more valuable in the long run. To this end,

a new SM is discussed that incorporates rigorous formal connections to high-level ab

initio methods while retaining computational features of commonly used SE methods. An

important computational aspect of existing SE methods is that their speed is dictated by

the solution of a set of one-particle eigenvalue equations (typically matrix diagonalization)

and not by the construction of those one-particle equations or any ensuing step. This

computational restriction, in addition to a modicum of accuracy, guides the development

of the method under discussion, adapted ab initio theory (AAT). AAT not only provides

a general theoretical SM framework into which existing SE methods can fit, but also has

the advantage that there are well-defined connections between it and high-level ab initio

theories. This novel framework not only allows for improvements to existing SE methods

that enhance transferability and extendibility, but also has the flexibility needed for more

systematic improvement in future versions of SE methodology.

Before we begin the detailed description of our quantum mechanical method the

need for a fast and accurate method to describe the properties of large molecules will be

demonstrated. Phenomena that involve bond-breaking or bond-forming often necessitate

a detailed knowledge of the electronic structure and the PES. Such quantum chemical

descriptions are important in many complex chemical processes; for example:


We have specified the target framework of our simplified quantum mechanical

method. The task now is to demonstrate simple well-controlled approximations that

significantly reduce computational effort while maintaining a desired level of accuracy. To

this end, a primary focus is on the density dependent component of the Fock-like matrix,

as it is the most computationally intensive part of the SCF procedure. Furthermore, the

residual electronic energy contribution is examined because we have shown that its proper

treatment is essential for a SM to achieve uniform accuracy of electronic structure and


Our strategy is to start with a reference theoretical method that is reasonably

well-behaved, in this case KS-DFT. Though there is much debate about the best DFT

energy functional, we choose a hybrid DFT functional, specifically the widely-used

B3LYP, because it includes a portion of exact exchange which adds a substantial degree

of flexibility to our simplified method. To simplify the computational procedure we

then explore various approximations and verify their accuracy. It may be too much to

demand a simplified method that is systematically improvable (in the CC wavefunction

theory sense), however, every approximation should be systematically verifiable. We

will show that by making a couple of well-controlled approximations we are able to

reproduce the topographical features of a hybrid DFT PES with a substantial reduction in

computational effort.

4.1 Two-Electron Integrals

Explicit calculation of two-electron integrals is a computational bottleneck in DFT

because the number of integrals scales as rbitals. For the Coulomb integral this can

be reduced to nrbitals by fitting the density prior to the evaluation of the integrals [82],

but this cannot be done for the exact exchange, leaving norbitals dependence. Though

most quantum chemistry programs employ screening techniques to reduce the number

of two-electron integrals calculated, they remain a significant cost, especially when exact

attempting to extend existing model Hamiltonians to include features that the model was

not originally intended to capture. To avoid such complications the AAT SM is primarily

designed to yield accurate electronic structure (electronic density) and, secondarily, the

total energy and forces. Excited and ionized states are of interest. Clearly, the electronic

structure and the energy of a molecule are closely connected. Consequently, the distinction

between the two is subtle, nevertheless, their separation is crucial in order to ensure that

the AAT achieves uniform accuracy for energies, forces and first-order properties over an

entire PES. The general computational approach is as follows:

* Approximate four-center terms with a sum of two-center terms using the Riidenberg
approximation [4, 5].

* Determine exchange-correlation potentials to guarantee the exact result at the
separated-atom limit, starting from a sum of atom-based densities.

* Use cubic splines to interpolate these two-center potential terms (a lookup table of
splines need only be generated once and is very fast).

* Solve a set of one-particle equations with explicit inclusion of orbital overlap.

* Include explicit exact (HF-like/ non-local) exchange.

* At convergence of self-consistency, make a final energy correction by inserting the
density into a DFT functional (B3LYP) and performing a single numeric integration
for the exchange-correlation energy.

An important computational consideration is the inclusion of the overlap between

orbitals, which is a feature of AAT that distinguishes it from standard NDDO methods.

The zero-differential overlap approximation, which is the fundamental approximation of

NDDO, implies that there is no overlap between orbitals. Though this approximation

enhances the speed of the method, it neglects an important physical feature of electrons

and the basis sets used to describe them. There have been attempts to reintroduce such

overlap effects in NDDO methods, most notably Thiel's orthogonalization correction

methods (OM1 and OM2) [6]. Still, Elstner's development of SCC-DFTB [7] demonstrates


Semiempirical method development has a long history dating back to the 1930's

starting with Hiickel theory for describing even-alternant hydrocarbons. In the current

study we have been guided by the developments over the last seventy years in regard to

both what should and should not be approximated in our model. We focus here on the

NDDO and SCC-DFTB models. Both models have been widely-used with reasonable

success. The relative numerical performance of the most recent versions of each model has

been reported recently [2], and extensive comparisons of the formal similarities of each

model have been made [78, 79]. It is important to understand the context in which these

models were designed to fully appreciate the difference in the design philosophy of the

current AAT model. NDDO-based methods were originally constructed to reproduce the

experimental heats of formation for small organic molecules. Since the heat of formation

is defined only for molecules in their equilibrium structures, it is not surprising that

NDDO models are parameterized specifically to reproduce equilibrium properties. For

nonequilibrium structures the accuracy of NDDO methods varies greatly, as we will show.

Similarly, SCC-DFTB has been parameterized to reproduce equilibrium structures.

The goal of the AAT approach is to achieve uniform accuracy of electronic properties,

structures, and energetic over the entire PES to the extent possible for a minimal ba-

sis set effective one-particle theory. Uniform accuracy is required to achieve meaningful

results when performing MD simulations, in which transition states and other nonequilibrium

structures p1l v a critical role.

3.1 Neglect of Diatomic Differential Overlap Based Methods

3.1.1 Does NDDO Approximate HF?

A common misconception about NDDO-based methods is that the NDDO model

Hamiltonian approximates the HF Hamiltonian. This assumption is not the case and it

can be shown that the density dependent contribution to the Fock-like matrix in NDDO

has little to do with the density dependent Fock matrix of HF theory, though they are

To my parents.








1 1.5 2 2.5 3 3.5
C-N reaction coordinate (Angstrom)

90 Mulliken
1 1.5 2 2.5 3 3.5
C-N reaction coordinate (Angstrom)


1 1.5 2 2.5 3 3.5
C-N reaction coordinate (Angstrom)

Figure 3-4. Energies from Riidenberg and Mulliken approximations. A) Three-center
(AABC). B) Three-center (ABAC). C) Four-center (ABCD).

exchange is included. There are two v--- to lower the computational cost associated with

two-electron integrals: first, the number of integrals can be lowered by applying screening

methods; second, the cost associated with the evaluation of each individual integral can

be reduced. Several techniques for reducing the number of integrals have been successfully

applied. They generally involve a procedure that recognizes a threshold for significant

contributions to the Fock matrix. These include fast matrix multiple (FMM) [83-85], the

linear exact exchange [86], and quantum chemical tree code (QCTC) [87]. These methods

still rely on the explicit calculation of integrals using standard techniques for Cartesian

Gaussian-type functions. For large systems the most numerous class of integrals are the

four-center integrals. Furthermore, four-center integrals are not only the largest class of

integrals, but also the most computational involved integrals. According to Gill et al. [88]

a four-center integral is typically an order of magnitude more computationally demanding

than a two-center integral. Our focus here is on systematic approximations that can be

made to these ab initio multi-center integrals that will reduce the computational effort

associated with calculating these terms while retaining acceptable accuracy.

The cornerstone of our approach is based on ideas originally presented by Riidenberg

in 1951 [5]. The underlying approximation is to expand an orbital on center A in terms of

a set of auxiliary orbitals located on center B:
A(1) YZSkB, PA M() (4 1)


SkB, PA kB AA) (4-2)

If the auxiliary orbitals on center B are complete and orthogonal then the expansion is

exact. By repeated application of Equation 4-1 the number of centers that are involved in

three- and four-center two-electron integrals can be reduced. The possible combinations

of this type of expansion has been explored and enumerated by Koch et al. [89], though

without comparing the numerical accuracy of such approximations. We will show the

[92]. The G2 set is made up of 148 neutral molecules with singlet, doublet, and triplet

spin multiplicities, though we focus on a subset of these that contain only C, N, 0, and

H because the current implementation of AATMO includes only the spline parameters

generated for those atom atom-pairs. There are 71 molecules in this truncated set ranging

from 2 to 14 atoms.

The separation of electronic and nuclear components in the energy expression

of our AAT approach sets it apart from commonly used SE methods. So it follows

that we expect more consistent results for purely electronic properties. To this end we

have considered the vertical ionization potentials of the relevant molecules in the G2

set. The vertical ionization potential is purely electronic as it does not depend on the

nuclear-nuclear repulsion. We have used the simple definition Elp = E(N) E(N 1).

This requires the total energy of the molecule and, since all molecules in the G2 set

are neutral, the corresponding cation with the properly modified multiplicity. We have

compared AATMO against the underlying B3LYP minimal basis set results and find

a average RMSD from B3LYP for our entire test set of 0.057 Hartree (1.6 eV). The

analogous comparison to AM1 yields an RMSD from B3LYP of 0.331 Hartree (9.0 eV).

This clearly demonstrates the ability of AATMO to reproduce the electronic structure of

the underlying B3LYP. The error of AM1 from B3LYP is substantial, though a more valid

comparison for AM1 would be against experimental IPs, the important issue here is the

ability our AATMO to reproduce the physics of the method it is approximating.

We now consider the dipole moment. The dipole RMSD deviation of AATMO and

AMI from minimal basis set B3LYP is 0.219 a.u. (0.56 Debye) and 0.595 a.u. (1.51

Debye) respectively. There were no sign errors observed for the orientation of the

dipole moment for either AATMO or AM1. Though the error introduced by AATMO

is significantly smaller than that of AMI, it is larger than would be expected given the

performance of other properties generated by the AATMO procedure.

Table 2-1. Equilibrium geometry for NMT, in A and degrees

CCSD/TZP 1.498 1.087 1.219 107 117
TH-CCSD 1.504 1.084 1.225 108 118
AMI 1.500 1.119 1.201 107 119

(Gscc)^ 2 SPS, A( + ) (3-13)

Note that unlike HF or NDDO, in SCC-DFTB it appears that it is not generally true that

GG = Gt, as A E ( and a is unrestricted. If SCC-DFTB were to have this property, that

would be useful both for the calculation of forces and for comparison among all effective

one-particle methods we consider. To enforce this similarity we make the following


2 1
SPA-S AAC 7BC)] C SS(7AC + BC + 7AD + 7BD) (314)
A<7 A<7

where p, v, A, and a are on A, B, C, and D respectively, and our working equation for the

density dependent contribution in SCC-DFTB is

(Gsc) -Sp, SA (AC + 7BC + 7AD + 7BD) (3-15)

Equation 3-15 bears a similarity to the Mulliken approximation to Coulomb repulsion, we

will address this in further detail later.

Now we revisit the total energy expression. Using our new definition of the density

dependent and independent terms in the SCC-DFTB Fock-like matrix.

ESCC t [2Hc + P ,(GSC) + ABqAqB + Es c(RAB)) (316)
Jv Ac A#B

This energy expression closely resembles other one-particle Hamiltonians, yet is equal to

Equation 3-8. Similarly, after significant manipulation the gradient expression can be

shown to be

t1 sCC f[2 rSCCah
2 1 a ( 9ca o

S- cc + + Y ( t 7ABqAq + Escc(RAB)) (3-17)
pv i A#B

which again is equal to the original expression, Equation 3-9, but allows a more

systematic comparison with other methods.

3.3.6 Requirements for an Accurate Simplified Method

We have discussed the features of a general simplified method aimed at large systems

that can be used to reliably obtain electronic properties as well as energetic. It is clear

that the framework is essentially that of KS-DFT, though we have structured the basic

equations in a form amenable to further approximation. The two essential elements for the

determination of electronic structure are the terms in the Fock-like matrix that are either

,.ii ,,*///. l/ independent (H) or dependent (G) on the electronic density. The final element,

which is needed to acquire accurate energetic and structures, is the residual electronic

energy term. Furthermore, we have well-defined exact limits for each of these terms. When

we discuss the AAT approach we have unambiguous connections to DFT and, from ab

initio DFT, an analogous connection to WFT and the transfer Hamiltonian.

0.4 Erep
3 _el ec .....


-0.6 /

1 1.5 2 2.5 3
C-C reaction coordinate (Angstrom)

Figure 3-3. The SCC-DFTB total energy breakdown.

with the excited determinants, cb, meaning the antisymmetrized product,

j = A[( (1) 2](2)..a(i) .. (n)], (1-15)

where unoccupied orbitals a and b replace the occupied ones, i and j. The coefficient

(amplitude) in front of the excitation introduces its proper weight into the wavefunction.

The ultimate version of such a sequence would include all possible excitations up to n-fold

for n electrons and is called the full configuration interaction or full CI. Full CI is:

* The best possible solution in a given basis set.

* Invariant to all unitary transformations (rotations) of the orbitals.

* Size-extensive [13] (meaning that it scales properly with the number of electrons.)

* An upper bound to the exact energy for a molecular electronic state.

The full CI can be written conveniently in the coupled-cluster (CC) form,

-= exp(i + 2+ T3+ ... + T)o. (116)

[13-16]. Approximations to CC theory are made by restricting the Tp operators to singles

and doubles, as in CCSD, or to other approximations that decouple higher-excitations

from lower ones, e.g., CCSD(T). Here the triple excitations are approximated from Ti and

T2, which can be obtained from CCSD, e.g., without allowing the new T3 estimate then to

contribute back to T1 and T2. The details of these methods, which can be found elsewhere

[17], are not important to the present discussion. What is important is that all truncated

CC methods are size-extensive, but do not necessarily give upper bounds to the exact


The full CC equations provide the exact electronic energy,

ECC = (o|H o) EHF + (i||ab)( + ta) -t + | f a)t (117)

number of atoms, the reverse of many post-HF calculations, which are generally used to

treat systems with fewer atoms but with much larger basis sets.

The speed of the AAT approach is achieved by avoiding the explicit calculation of the

numerous and costly four-center integrals. Instead, these terms are introduced by a sum of

contracted two-center terms. Initially, we used small segment block sizes. The block size

ultimately determines the computational efficiency by affecting how the IO, or fetching

of data, is done. In these initial studies each block spanned only one atom. In this case,

when the two-electron integrals were evaluated it was straight forward to simply neglect

the integrals that involved four atoms (four-centers.) This procedure worked for systems

with few atoms (<20), but, as the system size increased, the number of blocks rapidly

increased, and latency became an problem.

An improved approach that was implemented, which is more consistent with the

original ACES III design philosophy, involved removing the atom-spanning block

size restriction. This was achieved by rewriting the original routines that returned all

two-electron integrals spanning four-block units to explicitly exclude integrals involving

four-centers. In addition, the Riidenberg approximation, which relies on contractions

between the overlap matrix and NDDO-type two-center two-electron integrals ((AAIBB)),

required modification to avoid over-counting some terms. Similar to the routine that

excludes four-center integrals, the modified routine to generate the integrals used by the

Riidenberg approximation excluded all integrals that were not of the type (AAIBB).

These two modified routines involve some overhead that could be reduced with further

refinement. These modified routines are specific to the AAT approach and are not

included in the standard release version of ACES III.

Since two-electron integrals possess an eight-fold spatial symmetry the overall number

of integrals that need to be calculated can be reduced by roughly a factor of eight. To

do so requires recognizing the symmetry of every integral type. In ACES III the problem

is slightly different. Instead of having this eight-fold symmetry manifest itself by AO

40 B3LYP


A0 -20


1 1.5 2 2.5 3
C-N reaction coordinate (Angstrom)

40 B3LYP
E 20
o 0

B -20
I -40
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)

50 B3LYP
,- 20
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)




1 -100

1 1.5 2 2.5 3 3.5 4
C-N reaction coordinate (Angstrom)

Figure 4-4. Dissociation curve of C-N with AAT approximation. A) NMT. B) NET. C)

Table page

1-1 The NDDO one-center two-electron integrals ................ 32

1-2 The NDDO two-center two-electron integrals .................. ..33

1-3 Multipole distributions .................. ............. .. 34

2-1 Equilibrium geometry for NMT, in A and degrees . . . 49

4-1 Adapted integrals ............... ............ .. .. 103

4-2 Average percentage of Fock matrix contribution by adapted approximation 104







the possibility of improving the speed over traditional DFT methods by two-orders of

magnitude is well within range. It is sufficient to focus only on timings for the four-center

integrals as they quickly become the dominate contribution for large systems, as shown in

Figure 4-7.

Screening of small two-electron integrals can lead to linear scaling for large systems.

Such screening methods can be applied to the Riidenberg approximation very efficiently

because of the explicit dependence on overlap.

In an effort to minimize computational effort with building the Fock-like matrix we

have made extensive use of storing transferable one-center and two-center terms. In our

current implementation all integrals up through three-centers are calculated exactly. There

are two reasons for explicitly evaluating these integrals: first, they are a small contribution

when compared to the explicit calculation of the four-center terms; second, the application

of the Riidenberg or Mulliken approximations to three-center terms (decomposing them

into two-center terms) introduces significant errors. If an accurate, simple and transferable

approximation for the three-center terms were available then all terms in the model could

be two-center. It is a straight forward matter to interpolate functions of two centers using

cubic splines. We have implemented such techniques for the evaluation of the matrix

elements of the exchange-correlation: (G0),,, (RAB).

Cubic splines are applicable for numerical 2-center functions, where the exact

functional form is not known, and for complicated analytic 2-center functions. In the

first case, the benefit of splines is to provide an interpolation scheme for a set of empirical

reference points. In the second case splines provide a computationally efficient framework

for the evaluation of otherwise prohibitively expensive analytic functions. We are primarily

interested in the latter.

The ability to store a large number of parameters in computer memory (RAM) is

a requirement for the efficient implementation of cubic splines. The memory required

is directly determined by the number of cubic spline parameters. A pertinent example

ci(Hs S) 0 (3-6)

1 atoms
Hsc = (x HoIX,) X + S,, (7A- + A.)Aq
2 t

H0 + / 1 (3-7)

ocC atoms
E2TB |o I 0I) + S 7ABAqAAqB + Erep(RAB) (3 8)
i A,B A#B
and the corresponding gradients are
oCC OH o H S N a ,A B Erep(RAB)
ni ci ( I a' -1 Aq, -Aq a --


we have used p E A and v C B.

To facilitate comparison of the SCC-DFTB model Hamiltonian to other methods

we force it into a structure that is comprised of a density independent part H and a

density dependent part G. Using Mulliken population ,in ,i,~-; to determine the net charge

associated with atom A yields

AqA -qA + P ,, (3-10)

where the occupation numbers ni = 1 and the MO coefficients are real.

S atoms atoms
= H,, S,, (7A + .)o + 2 5 (7A + .)PAuSA
_i ( SC V^ pL /.^S(5\<7A /
(Hs"c),, + PA(Gs ) (3-11)

we have introduced the notation that
(H c),, H Sp, (A + .) (3-12)
~LV 2 ~L

concerted reaction mechanisms, such dramatic errors are much more difficult to recognize,

though it is precisely those regions that are often of primary interest.

3.1.3 Separation of Electronic and Nuclear Energy Contributions

In ab initio theory (without consideration of non-BO effects) inclusion of the

nuclear-nuclear repulsion term (VNN) is a trivial matter when evaluating total energies

and associated forces. While VNN has no part in the determination of purely electronic

properties (such as the electronic density, electronic spectra, ionization potentials,

or electron affinities) it is critical in the determination of molecular structure, the

identification of transition states, and their associated vibration spectra. In simplified

theories the role of VNN is obfuscated because the electronic and nuclear repulsion effects

are not separated.

The core-core repulsion term in NDDO Hamiltonians involves parameters that are

designed for energetic at equilibrium. For such a specific set of properties the difference

of two large numbers, the attractive (negative) electronic energy and the repulsive

(positive) nuclear-nuclear repulsion energy, is easier to model than each separately. In

our opinion, combining such unrelated terms is the origin of the long-standing contention

that electronic properties and total energy properties cannot be described within the same

set of parameters. Also, once the core-core terms are parameterized, the correspondence

between rigorous theory and semiempirical theory is lost. To solve this problem, we need

to search for a better form for our model Hamiltonian that does not require that VNN be

combined with terms that are purely electronic.

3.1.4 Total Energy Expression

Here we illustrate the origin of the combined electronic and nuclear energy contributions

of the total energy expression, the core-core term in NDDO methods, so that we may

avoid this pitfall in the development of AAT. Consider the case in which the exact

KS-DFT exchange-correlation energy functional and corresponding potential are known,

and further are represented in a complete basis set. Then our effective one-particle


The underlying physical laws for the mathematical theory of a large part
of physics and the whole of chemistry are thus completely known, and the
difficulty is only that the exact application of these laws leads to equations
much too complicated to be soluble. -P.A.M. Dirac

1.1 Who Ordered a New Semiempirical Theory?

Available quantum chemical methods are incapable of achieving so-called chemical

accuracy (1 to 2 kcal/mol for bond energies) for a system composed of thousands of

atoms within a computationally efficient framework (approximately one d4i- on a typical

desktop machine). A computational method with this capability would revolutionize the

way that matter is studied in biology, chemistry and physics. In the field of quantum

chemistry, there are many unexploited connections among wavefunction, density functional

and semiempirical theories. Use of such connections will aid in the development of

quantum chemical methods with the desired accuracy and computational efficiency. The

ultimate goal then is to provide scientists with a computational tool that makes materials

simulations predictive, chemically specific, and applicable to a wide variety of mechanical

and optical properties of realistic, complex systems.

Though high-level quantum methods approach the desired accuracy, the computational

scaling of such methods precludes their use for systems larger than a tens of atoms (e.g.,

coupled-cluster with singles and doubles (CCSD) scales O(N6), where N is the number

of basis functions). In contrast, the computational scaling of semiempirical (SE) methods

is acceptable (typically scales 0(N3)), though they can be made to scale linearly with N.

For instance, if the size of a system is increased by an order of magnitude then a CCSD

calculation would incur a factor of 106 more computational time. For the same increase

in system size, the computational time of a normal SE method would only increase by

a factor of a thousand, but with linear scaling only by ten. In the regime of very large

systems (thousands of atoms) it is apparent that a calculation that could be performed in

one di-, using SE methods might take several years for CCSD. Unfortunately, SE methods

ab initio
35000 Rudenberg
2 30000
S 15000
400 600 800 1000 1200 1400 1600

Figure 4-6. Timing of ab initio versus Riidenberg four-center integrals

4.6 A, and 4.8 A and angles are 620, 580, and 600 (Figure 2-4). During the geometry

optimization, because one of the monomers has a stretched C-N bond, the system is

unable to reach its ring-like hydrogen-bonded local minimum, leading to ring strain. The

TH-CCSD predicts that, as a result, the one monomer cleaves at a C-N bond distance of

1.8 A and, subsequently, a second monomer dissociates at the same C-N bond distance.

At this point, all the intermolecular N-N bond distances have decreased to 4.3 A, 4.5

A, and 4.6 A; the angles, however, remain unchanged. Geometry optimization with

AMI shows that the stretched monomer shifts slightly from the remaining two monomers

allowing one of them to rotate to a lower energy structure where it forms a bifurcated

hydrogen bond each of which is 2.44 A.

We have demonstrated that the TH-CCSD parameter set trained for NMT monomer

transfers to the dimer and trimer calculations. The criteria for transferability used

here regards the extent to which these forces yield products that are in accord with

theoretically supported decomposition mechanisms of NMT [58, 59]. Note that the

training of our Hamiltonian was not tailored to favor any particular rearrangement

mechanism, evident by the fact that the dimer decomposition differs markedly from the

trimer mechanism. Along the adiabatic surface AM1 does not predict any bond breaking

to occur. Therefore, the accuracy of modeling bond breaking and forming processes in

bulk simulations using AM1 is questionable, at best.

We have shown a parameterization of the NDDO Hamiltonian that reproduces proper

dissociation of NMT into radicals, both for the energy and forces. This parameter set

reproduces the reference CCSD/TZP geometries and forces for unrestricted C-N bond

breaking. This NDDO parameterization would appear to potentially offer improved

accuracy in MD simulations for clusters of high-energy materials. We also show that

concerted reactions for both the dimer (bimolecular rearrangement) and trimer (trimolecular

rearrangement) are not generated with an AM1 parameterization, yet are required for

the proper description of NMT clusters. This new transfer Hamiltonian will allow for



0 5000 10000 15000 20000 25000

Figure 4-9. Error introduced per atom from cubic splines as a function of system size

Table 1-1. The

NDDO one-center two-electron integrals
Two-electron integral

(sp sp)
(pp pp)
(pp p'p')

ability to verify every approximation systematically puts the current work into sharp

relief from current semiempirical methods that depend intrinsically upon uncontrolled


1.5 Kohn-Sham Density Functional Theory

The Kohn-Sham [12] equations may be written as

fKS Xi) = iXi) (1-22)

[-1V2 + t(r) + vj(r) + Vc(r)]Xi) Ei Xi) (1-23)

ve r) ZA (1-24)
A RA- 7I(

Vj(ri) P(2 dr2, (1-25)
J 712
p (r) I i(r) 2, (1-26)

and vxc(r) is the exchange-correlation potential. Constrained-search minimization of the

noninteracting kinetic energy alone determines orbitals that yield the ground state density

[20]. The other terms do not affect the minimization because they are explicit functionals

of the density.

The Hohenberg-Kohn-Sham theorems guarantee that the density obtained from the

exact ground state wavefunction will be returned by Equation 1-26 when using the exact

exchange-correlation potential. In a pragmatic sense, these arguments give a formal proof

of existence, though they do not guide the construction of the exact one-particle operator,

or, more specifically, the exchange-correlation potential. However, in combination with

so-called inverse procedures, such as that of Zhao, Morrison and Parr (ZMP) [21-23],

which have been explored by Tozer et al. [24], or the procedure by Peirs et al. [25], the

potential may be refined iteratively such that the solutions to Equation 1-23 will generate

a reference density, presumably a high-quality ab initio reference density.



Figure 2-6. Average deviation of Si-Si stretch
Figure 2-6. Average deviation of Si-Si stretch

40 | B3LYfP
0 A M 1 ................

A _-20 .\,"'


1 1.5 2 2.5 3
C-N reaction coordinate (Angstrom)

40 B3LYP
0 A M 1 ................
C 20
B -20
T -60
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)

40 0 AM1 .......
C 1o
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)

50 i AATMO
o AM1 .....
D- 50



1 1.5 2 2.5 3 3.5 4
C-N reaction coordinate (Angstrom)

Figure 4-5. Dissociation curve of C-N with AATMO approximation. A) NMT. B) NET. C)


A '



B '




D 'C

y = 0.0063x
R2 = -0.0065


-10 -5 0 5

Full 2-e- integral (a.u.)

3 y,
2 R


3 y,
2 R



= 0.8988 ,


-5 0 5

Full 2-e- integral (a.u.)

= 0.8



-5 0 5

Full 2-e- integral (a.u.)

y = 0.0216x
R2 0.6841 /

-5 0 5

Full 2-e- integral (a.u.)

Figure 4-1. Scatter plots of agreement between approximate and exact two-electron
integrals. A) One-center. B) Two-center. C) Three-center. D) Four-center

Exact EHF) and because many quantum chemists have developed an intuitive

understanding of it. Also, HF provides a convenient way to introduce notation that will be

used throughout the current work. The one-electron operator of the full Hamiltonian is

h(1)X) ( (-_ ZZA)X) (1-4)
2 A rA

and the two-electron operators are Coulomb J and exchange K,

J(1)jXi) XJ-2 )2dr2 Xi) (1-5)

iX \ X'(rX 2 Xi (ri2)
K(1)j i = dr2 \Xj) (1-6)
j7i tr-12

f(1)|Xi) [h(ri) + (J(1) K(1))]| X) Q|Xi) (1-7)

denoting the molecular spin orbital Xi as a linear combination of atomic spin orbitals Xi

Xi(ri) Cix(ri) (1 8)

Using Lagrange's method of undetermined multipliers and enforcing orthonormality of the
orbitals gives a set of one-particle matrix equations, the Hartree-Fock-Roothaan equations:

FC SCc (1-9)


Sp,= J2x(rl) ,(rl)dr, (1-10)


F, /- (ri)f(ri)x (ri)dri (1 11)

The construction of a Fock-like (or effective one-particle) matrix is a central theme of this
work. A Fock-like matrix is an orbital representation of an effective one-particle operator,

This is only part of the overall problem. In addition to the purely electronic

properties, we also want total energies and forces. The total energy in KS-DFT can

be written,

E K E i- -P 1 drd/ v.c(r)p(r)dr + E.[p] (1-27)

where the exchange-correlation potential is the functional derivative of the exchange-

correlation energy functional

vc (r) -= (1-28)

Unfortunately, currently there is no procedure that allows us to take a numerically

derived exchange-correlation potential, one that is determined such that it reproduces a

high-quality reference density, and generate the exchange-correlation functional, which is

an explicit functional of the density. This inability to determine the exchange-correlation

energy functional from a reference density alone is the first indication of a fundamental

problem in semiempirical methods, which will be discussed in further detail later: how

does one design a model Hamiltonian that will simultaneously achieve the correct result

for both electronic properties and energetic? However, if we are willing to assume the

form of a particular exchange-correlation energy functional then the potential is well

defined, as is the effective one-particle Hamiltonian.

1.6 Hiickel Method

The Hiickel r-electron method is perhaps the simplest quantum chemical approximation.

It is intended to highlight the basic underlying effects of symmetry involved with w

electrons for alternant hydrocarbons, though it can be made more general. In relation

to HF, the same type of one-particle matrix equations are solved, though with the

overlap matrix set to the identity matrix and the Fock-like matrix is composed of only a

one-electron (density independent) contribution, F = H

HC = Cc (1-29)

to capture a large fraction of the total energy, more sophisticated descriptions are needed

to guarantee chemical accuracy.

1.7 Neglect of Diatomic Differential Overlap

Neglect of diatomic differential overlap (NDDO) is the model that most people

associate with semiempirical theory in quantum chemistry todiw. The NDDO approach

is meant to approximate the HF solution, not the exact solution, though its parameters

can be varied to minimize error to reference data [27-32]. It is a minimal basis set,

valence-only approach. NDDO is based on the zero-differential overlap (ZDO) approximation:

products of orbitals with respect the same electron coordinate that are on different centers

are defined to be zero,

(X B() 0. (133)

The orbitals are considered to be L6wdin symmetrically orthogonalized AOs, so that

the overlap matrix is the identity matrix, and the set of one-particle matrix equations

solved is FNDDOC = Cc. The Fock-like matrix in NDDO contains one-electron (H) and

two-electron (G) components. The one-electron part is approximated as

S / U, EB#A ZB(ppI BB) if p = v)
_1, (/3 +/3,) if/P:/ V

where p is on atom A, v is on atom B, and U,, and 3., are adjustable parameters,

(pp|IBB) models the repulsion between a product of orbitals (p) and a point charge
at center B, ((p(1)1 1|1(1))). Note that the two-center one-electron terms include

overlap and are therefore not consistent with the ZDO approximation. The one-center

two-electron integrals are given in Table 1-1. The parameters Gs, Gpp, Hp,, Gpp, and Gp2

are adjustable.

Within a local framework there are twenty-two unique two-center NDDO-type

two-electron integrals for each atom pair (Table 1-2). The NDDO two-center two-electron

integrals are approximated by a multipole-multipole expansion [33]. Each product of

is the evaluation of two-center two-electron integrals terms in MNDO, specifically, the

multipole-multipole expansion of the NDDO-type two-electron integrals. There are

twenty-two unique NDDO-type integrals for every atom-pair involving two heavy atoms

in standard NDDO methods such as AM1, PM3, etc. This means that for CNOH atom

types, for which there are ten unique atom-pairs (CC, CN, CO, CH, etc.), there are fewer

than 220 analytic functions. The evaluation of a cubic spline requires fewer operations

than even the simplest of the twenty-two unique NDDO-type analytic expressions. As a

simple demonstration of the speed of cubic splines we have implemented them to replace

the explicit multipole-multipole evaluation for each of the 22 unique NDDO-type integrals.

The multipole-multipole approximation is already significantly faster than the ab initio

integral evaluation. As shown in Figure 4-8 a modest speed-up is achieved over the already

fast NDDO Fock-like matrix construction. These timings are for 16 protein systems up

to 20000 orbitals. Using cubic splines is up to 15'. faster than the multipole-multipole

expansion. There is an error introduced, though it is small and controllable. For example

the average error per atom is given in Figure 4-9. Furthermore, while going to higher

multipoles in the expansion of integrals is increasingly expensive, cubic splines cost the

same regardless of the underlying function being interpolated. However, the number of

parameters involved in a typical NDDO multipole-multipole expansion is relatively small

versus the number of parameters needed for cubic splines. The premium placed on RAM

(or core) memory when NDDO methods were first developed thirty years ago would have

made cubic splines impractical. Now, the amount of memory typically available on most

desktop machines is more than sufficient to make cubic splines the method of choice when

evaluating nearly any two-center integral in quantum chemistry.

A general cubic spline has the form

f(x) = A + Bk(x Xk) + Ck(x xk)2 + Dk(x -k xk)3 (4 35)

OC- 0- L00-
ow a u

nOi^o oc



CA O r C

1-> o 1-
~~~~ Ln~ L
y odLn~

1- ~00V>
00~0~C> C>

--- TO ^ CM --1 lO O- O

C IA -

3 ~~\ 7 ~"\ T ~ ^^ ~~\ T ~


-- oo t-
t" cd
> ,CIA

t-t- 10"-t
cid d


01 00000

Hamiltonian will yield a set of orbitals which yield the exact density. The exact energy is

given by the KS-DFT total energy expression, Equation 1-27.

Now, if we expect that the NDDO model Hamiltonian will generate the correct

electronic density, which is a requirement if we want any of the first-order electronic

properties of our system, then we can relate the NDDO and KS-DFT energy expressions.

Making the temporary assumption that fNDDO j fKS, and consequently PNDDO PKS,

then the total energy difference is
ENDDO EKS= NDDO AB RAB) Exc[p] + 1 Vc(r)p(r)dr (3-3)

Now it is clear that if the NDDO one-particle Hamiltonian were to yield the correct

electronic density then the two-center core-core term (ENDDO) must account for not only

the simple nuclear-nuclear repulsion term ( ), but also some pair-based contribution,

dependent only upon RAB, of the the complicated multi-center exchange-correlation

contribution (Exc[p]- f' vxc(r)p(r)). The motivation for approximating this last term

is obvious: it drastically simplifies and avoids a complicated and expensive step in the

determination of the total energy. However, there is no reason to expect a priori that this

approximation would be accurate. For our improved model Hamiltonian we make a special

effort to ensure that all approximations are controllable and systematically verifiable.

3.2 Self-Consistent Charge Density-Functional Tight-Binding

3.2.1 Self-Consistent Charge Density-Functional Tight-Binding in an AO

Equations 3-4 through 3-8 are the working equations for SCC-DFTB which come

directly from Elstner's original paper [7]:

AqA qA qA (3 4)

occ atoms
qA i > > cCiS + cj cusS,/) (3-5)
i pEA v

where E is the energy of the system, T is the wavefunction of the electrons, and the

nonrelativistic Hamiltonian in Hartree atomic units is

rAi rij RAB
i A,i i>j A>B (1 2)

Sh(i) + + VNN
i i>j

with i and j referring to electrons and A and B to nuclei. The terms in Equation 1-2

are the operators that correspond to the kinetic energy of the electrons, electron-nuclear

attraction, electron-electron repulsion, and nuclear-nuclear repulsion. The wavefunction

T describes the state of our quantum system. We choose to approximate T within a basis

set of atomic orbitals (AOs). Enforcing antisymmetry and satisfying the Pauli exclusion

principle can be achieved with an antisymmetric product of orthonormalized AOs within a

Slater determinant, the LCAO approximation

)=\ IX1(1r)X2(2) ... Xn(rn)) (1-3)

where n is the number of electrons and the spin orbital x(ri) = ((ri)a, and a, denote

spin and O(r) is a spatial orbital.

1.3 Hartree-Fock

Focusing on the ground electronic state, the most widely known approximate solution

of the electronic Schr6dinger equation is made by a mean-field approximation: HF theory.

That is, we insist that a zeroth-order approximation to the n-electron wavefunction (0o)

be written as an antisymmetrized product of n one-electron molecular spin-orbitals,

)o = A[ i(1)Q2(2)... ~(n)]. Here A is the antisymmetrization operator. Then using the

Raleigh-Ritz variational principle, EHF > EeaOct, we derive the one-particle HF equations

which yield the energetically best possible single determinant wavefunction. Each of the

orbitals in its canonical form is an eigen-function of the effective one-particle operator, f.

It is fitting to start with the most well-known approximation in electronic structure

theory. HF is an excellent touchstone since correlation energy is defined by it (Ecorreltion

Rudenberg (4C)
S40' B3LYP


1) -20


1 1.5 2 2.5 3
C-N reaction coordinate (Angstrom)

Rudenberg (4C) -
6 20

B -20
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)

0 I Rudenberg (4C) -
.-i 20
C > io
S 0
1 1.2 1.4 1.6 1.8 2 2.2 2.4
C-N reaction coordinate (Angstrom)

50 Rudenberg (4C)



I -100

1 1.5 2 2.5 3 3.5 4
C-N reaction coordinate (Angstrom)

Figure 4-2. Dissociation curve of C-N with 4-center Riidenberg approximation. A) NMT.

Figure page

2-1 The NMT force curve for C-N rupture .................. .. 50

2-2 The NMT energy curve for C-N rupture .................. ..... 51

2-3 The NMT dimer energies relative minimum in the intermolecular hydrogen bonding
distance . . . . . . . . .. .. . 52

2-4 Snapshots of the geometry optimizations for NMT dimer and trimer performed
with the TH-CCSD and AM.................. . . .....

2-5 Average deviation of Si-O stretch...... . . .....

2-6 Average deviation of Si-Si stretch.... . . .....

2-7 Average deviation of O-Si-O angle.............. . . .....

2-8 Density deviation from PW91 DFT... . . .....

3-1 Multi-center contribution to the C28N28 matrix element of NMT . .

3-2 The AM1 artificial repulsive bump for NMT direct bond fission. . .

3-3 The SCC-DFTB total energy breakdown........... . . .....

3-4 Energies from Riidenberg and Mulliken approximations .. . .....

4-1 Scatter plots of agreement between approximate and exact two-electron integrals

4-2 Dissociation curve of C-N with 4-center Riidenberg approximation . .

4-3 Orbital products ...................... . . .....

4-4 Dissociation curve of C-N with GZERO(B3LYP) approximation . . .

4-5 Dissociation curve of C-N with AATMO approximation .. . .....

4-6 Timing of ab initio versus Riidenberg four-center integrals . . ..

4-7 Percentage of multi-center terms versus system size .. .............

4-8 Timing of Fock build using traditional NDDO and NDDO with cubic splines as
a function of system size....... . . ......

4-9 Error introduced per atom from cubic splines as a function of system size .

4-10 Pseudo-reaction-path splitting C20 .. ....................

4-11 Force RMSD from MP2........... . . .....

4-12 Force RMSD from B3LYP........ . . .....





















o OM 1 RHF ................
TH(CCSD) RHF ........... -


p -5
1 1.5 2 2.5 3 3.5 4 4.5
O-H reaction coordinate (Angstrom)

Figure 2-3. The NMT dimer energies relative minimum in the intermolecular hydrogen
bonding distance


ACKNOW LEDGMENTS ................................. 4

LIST OF TABLES ....................... ............. 7

LIST OF FIGURES .................................... 8

A BSTR A CT . . . . . . . . .. . 9


1 INTRODUCTION AND BACKGROUND .................... 10

1.1 Who Ordered a New Semiempirical Theory? ........ ......... 10
1.2 Quantum C('! iii-i ry ................... ....... 15
1.3 Hartree-Fock ............................... 17
1.4 Coupled-Cluster Theory ............................ 19
1.5 Kohn-Sham Density Functional Theory .................. 22
1.6 Hiickel Method .. ... .............. .......... 23
1.7 Neglect of Diatomic Differential Overlap .................. 25
1.8 Tight-Binding ............... . . .. ..26
1.9 Density-Functional Tight-Binding . . .... ... 27
1.10 Self-Consistent-C( rge Density-Functional Tight-Binding . ... 29

2 THE TRANSFER HAMILTONIAN .................. ...... .. 35

2.1 Theoretical Method .................. ............ .. 35
2.2 Computational Details .................. .......... .. 37
2.3 Nitrogen-Containing Energetic Materials .................. .. 37
2.3.1 Nitromethane Monomer. .................. .... .. 40
2.3.2 Nitromethane Dimer .................. ..... .. 41
2.3.3 Nitromethane Trimer ............... ..... .. 42
2.4 Silica and Silicon ............... .......... .. 44
2.4.1 Silicon Clusters ............... ......... .. 45
2.4.2 Silica Polymorphs ............... ........ .. 46


3.1 Neglect of Diatomic Differential Overlap Based Methods . ... 58
3.1.1 Does NDDO Approximate HF? ............ .. .. 58
3.1.2 Artificial Repulsive Bump ..... . . .... 60
3.1.3 Separation of Electronic and Nuclear Energy Contributions ..... .61
3.1.4 Total Energy Expression ............... .. .. 61

terms. Such two-center terms can be interpolated using cubic splines once, then the

cubic spline coefficients can be stored in RAM to be efficiently retrieved and computed

on-the-fly. With such a target in mind only terms in Equation 4-18 that can be retained

are v-center and v center. Furthermore, such contributions ought to be independent of

chemical environment and thus completely transferable. Therefore, we insist on densities

that are transferable. The natural choice is spherically symmetric atom-centered densities

of neutral atoms, PA PA), recognizing we can have a and 3 spin components. Based on

these considerations we have an approximate exchange-correlation potential,

xc[ PA](r) vxc[p(r) + [v [p +POB](r) ~Uc[po (r) vc[P](r)] (4-19)

We do not want to map and store the real-space exchange-correlation potential,

because of the obvious high degree of complexity. Instead, all that is ever needed are

the contributions from the potential projected in the basis set. The Fock-like matrix

contribution from this approximate exchange-correlation potential is

(Gxc)PA [p3 (PA xc[P]I B) (4-20)

Restricting this to contributions that only depend on two centers yields, for the diagonal

block (A = B)

(Gxc)PAVA (A vc[ZP1 VA

= (PA I c[AI A) (4-21)

+ (PA VLxc[P + PB xcI] IVA)

and for the off-diagonal block (A / B)

(Gxc) PAVB lA vxc[ P B)
A (4-22)


initio surface, surprisingly without the explicit calculation of four-center two-electron


We have demonstrated that the Mulliken and Riidenberg approximations do not

approximate the three-center one- or two-electron integrals well. These classes of integrals

pose a challenge as they are the only remaining terms that require explicit integration

since all one-center terms can be tabulated and two-center terms can be interpolated by

cubic splines, as will be demonstrated later. However, based on the Gaussian product rule

these terms can, without approximation, be reduced to two-center terms. In principle then

cubic spline interpolation could be used to store the intermediates and completely avoid

the need to do any integrals explicitly. The Gaussian product rule is
-a:(r-RA)2 a(r-RB)2 j (RA-RB)2 (a+a (-IRA + RB)2
./+.v ./+.v (4- i)

We can generate an auxiliary set of orbitals for each Gaussian-type orbital-pair that have

the orbital exponent a,,+a,. Also, because products of I= 1 (p-type) orbitals are involved

this auxiliary set must also now span I = 2 (d-type). The prefactor is not included in the

stored orbitals, instead it is computed on-the-fly to retain the two-center character of the

terms in which it will be used.

There is a significant difficulty with using the Gaussian product in a simple

way. Because we work in contracted Gaussian-type orbitals the number of orbital

pairs increases as the number of contracted functions squared. The centers of these

resulting product Gaussians are not necessarily the same, a A+a .B To use cubic spline

interpolation an auxiliary orbital must have a constant exponent, e.g., a. + ac in the case

of the simple product of uncontracted Gaussians. We envision an approximation to the

exact contracted Gaussian product, which is a sum of several Gaussians with different

centers and different exponents, that is a single contracted Gaussian located at a single

center. The approximate auxiliary Gaussian could be determined by fitting directly to the

2.2 Computational Details

Parameters were optimized by a qu( -i-N. .--ton-Raphson optimization algorithm in

tandem with a genetic algorithm to fit Austin Model 1 (AM1) NDDO parameters [27] to

reproduce the forces determined with CC theory. The parameterization was performed

by a combination of the PIKAIA [46] implementation of a genetic algorithm and the

linearized qu, -i-N. .-ton-Raphson minimization routine of L-BFGS [47].

The reference forces used in this parameterization were obtained with the ACES

II [48] program system. We have used CCSD in a triple zeta basis, with a unrestricted

reference and Hartree-Fock stability following (to ensure that we arrive at the proper

unrestricted spin solution.) All semiempirical AM1 and OM1 calculations were performed

using a modified version of MNDO97 Version 5.0 [30-32, 49].

DFT calculations for the nitromethane (NMT) monomer were performed at the

B3LYP/6-31G(d) level with the HONDOPLUS program [50]. The initial NMT dimer

and trimer geometries were taken from a DFT study using B3LYP/6-31++G** [51]. Due

to the large number of correlated reference calculations needed in modeling our clusters,

we used the Turbomole Version 5 electronic structure package [52, 53] to do all-electron

calculations at the resolution-of-the-identity second order M. .11, r-Plesset (RI-MP2) level

in a TZVP basis. We chose an all-electron model in a large auxiliary basis to ensure an

accurate treatment of correlation energies and geometries that were demonstrated on a

series of compounds representative of atoms in various oxidation states [54].

2.3 Nitrogen-Containing Energetic Materials

The motivation of this study is to develop a Th that can accurately predict the

quantum mechanical behavior of nitrogen-containing energetic materials. We present a

new parameterization for a semiempirical NDDO Hamiltonian which improves upon the

standard AM1 parameter set. This new parameterization reproduces the decomposition

of NMT to the CH3 and NO2 radicals as predicted with CCSD/TZP. We demonstrate

transferability to small NMT clusters. For NMT dimers and trimers this model Th predicts

With all eight of the two-electron integral approximations now defined, the accuracy

of each will be considered. Since one requirement of our approach is that it reproduce the

PES that would arise from a full ab initio treatment, we require that our approximations

be uniformly accurate over the entire PES. This is in contrast with many commonly

used semiempirical methods, in which the focus is typically on accuracy only about an

equilibrium structure. However, uniform accuracy is critical to getting accurate forces

to drive large-scale MD and to aid in locating transition states for rate constant studies.

Unfortunately there is no simple measure for uniform accuracy, due to the complexity of

the 3N 6 dimensional PES, so our analysis will be statistical.

Note that the total electronic energy is the sum of all the one-, two-, three-, and

four-center components. And, each of these components individually bear little similarity

to the total electronic energy, again consider each contribution to any particular matrix

element shown in Figure 3-1, instead the total energy is the result of the delicate

cancellation among these terms. In addition, since only the non-parallelity errors are

of importance in reconstructing the PES, the energetic contribution arising from each

individual class of terms may be shifted by a constant value independently of one another

without affecting the features of a PES. These subtleties, compounded by the 3N 6

degrees of freedom of the PES, make ascribing significance to any single class of terms

difficult. Because we want to assess the accuracy of a particular approximation as it

applies to a single class of terms we will instead focus on the Fock matrix elements, which

contain all the information needed for the zeroth-order electronic energy, to overcome some

of these difficulties.

Given any set of points on the PES (for example, a dissociation curve) and the Fock

matrix for each point, each class of multi-center two-electron integrals will contribute a

particular percentage to the full density dependent Fock matrix. The best approximation

for a particular class of integrals is then given by how well it reproduces the average

percentage of the Fock matrix for the corresponding ab initio set of integrals. Consider

the energy expression to be
occ V2 P '
EDFT (i + +v',t + Po0 + v~[Po1i)
2Ir r -
P(p V[po](po Vp) (1-41)

2 |r r 2 RA B
1 '6p'(po p) + E o + p + A,B

to enable us to introduce a form dependent upon the one-particle Hamiltonian, hs, plus its
corrections. The first new term accounts for the double-counting of the Coulomb potential,
the next accounts for the addition of vxc in the first term, and the third new term is the
remaining part of the Coulomb potential. Next, the exchange-correlation contribution
Exc is expanded about the reference density, po, with a Volterra (Taylor) expansion for

Exc[po + 6p] Ex[po] + vxc[po]p + f pf p 6P6P'

+- / / 6" 6p6p'6p" + "" (1-42)
6 6p6p'5p" PO
Substitution of this expression for Exc into the energy equation yields an expression for the
energy that is formally exact, yet only includes terms that are second-order and higher in

E + VEzX[ + Ir -rI + v [p Ii) [po]

t 6pp + t 62EX 7
1 Poo' + Epo + ZAZB

f f r 6p6p' (1-43)
oP "
2 1 r r' 2J 6p6p' P
1 r "3E
+ i E 6p6pl p" + -
6 6p6p'6p" PO

quantitative improvement achieved but the error from the PW91 DFT calculations is

systematic and parallels similar improvement found when this new parameterization is

applied to molecular systems. Further refinement of the TH-CCSD parameters must be

made to accurately describe the silicon-silica interface, since the current parameterization

fails to converge to the proper topology.

The application of the transfer Hamiltonian methodology has been presented for

nitrogen containing energetic materials and silicon containing compounds. In both cases

we have demonstrated significant improvement over the underlying semiempirical method

by reparameterizing with a training set of high-quality CC forces for representative

molecules. It is apparent that for focused MD studies that use semiempirical model

Hamiltonians there is an advantage to creating a special set of parameters that is tailored

to the chemical system being studied. However, the transferability of the model is affected

by which molecules are selected for the training set and the PES reference points used.

These models will often break down when applied to systems to which they have not been

parameterized. One route to achieve this level of transferability is to design a simplified

method that does not require parameterization, only controlled approximations, which

leads to the AAT approach to defining a superior Th.

The off-diagonal block terms can be further improved by applying the Riidenberg

approximation. Note that the diagonal block terms do not benefit from this approximation

because the orbitals only depend upon one center. The class of neglected terms (pA 1x2 [Pc] IB)


(LA \c PcB] = Y [SPAkB (kB ,[pU CVB) + SkA vB(PA c[p1 kA)I (4-23)
C#A,B k

The leading order terms are

(oPA (PA I c[pOA]I A) if A B
(Gc)/ A (4 24)
(PA Ic[PA +PII VB) if A/ B

and a slight improvement could be achieved with the addition of

X)1 B#A(A A xc[POA + POB vxc[[POA VA) if A B
S ZCA,B [SPAkB (kBv, [POCI]iB) + SkAVB (PA xc [PO] kA)] if A / B
Shown in Figure 4-4 are results for the GZERO(B3LYP) approximation: G6 is used with

the B3LYP energy functional interpolated over all the atom pairs CNOH. The errors
introduced are relatively small and GZERO(B3LYP) appears to be generally transferable.

In terms of computational feasibility for GZERO we have introduced a single

one-center matrix per atom-type and a single two-center matrix per atom-pair. For C,

N, O and H atom types the number of one-center matrices is trivial and does not present

a challenge, as they are the dimension of the number of orbitals per atom squared. The

two-center matrices are stored in the local coordinate framework of the atom pair and are
roughly the dimension of the number of orbitals on atom A times the number of orbitals

on atom B. The actual dimension can be reduced because most of the matrix elements are

zero due to orthogonality. To interpolate the values of these two-center matrix elements

cubic splines are used. Cubic splines require five parameters per number interpolated. We

used adaptive grids (non-uniformly spaced) that are defined for every pair term from 0.5

that the adjustable parameters involved in the multipole-multipole expansion are being fit

to the full density dependent part of the Fock matrix, and not the integrals themselves.

Furthermore, this fit is non-linear because of the two-electron exchange integrals, which

means that improving the NDDO approximation by modifying the integrals or trying to

add three- or four-center contributions becomes increasingly convoluted. For the AAT

model, as we will show, all integral approximations can be systematically improved and

do not involve any adjustable parameters. By avoiding the need to parameterize integral

contributions we can retain uniform accuracy of the PES because no structural bias has

been introduced.

3.1.2 Artificial Repulsive Bump

There are many chemically interesting cases in which existing semiempirical methods

seem to work quite well. For example Jorgensen et al. [80] show that their PDDG/PM3

reparameterization of PM3 yields heats of formation with a mean absolute error of 3.2

kcal/mol for a test set of 622 molecules. However, there is little evidence to sI--l:. -1 that

the same performance should be expected for chemical structures that are significantly

distorted from equilibrium structures. Even for simple activation barriers, the error is

typically 10-15 kcal/mol [80]. For the determination of rate constants and in molecular

dynamics studies the PES should be described within 1-2 kcal/mol. For example, since at

300K an error of 1 kcal/mol in the activation barrier will result in an order of magnitude

change in the rate constant, clearly a 10-15 kcal/mol error will not yield meaningful

rate constants. The situation is especially bad in cases in which direct bond fission

is treated with NDDO-based model Hamiltonians. In such cases there is an artificial

repulsive region near 1.2Req to 1.6Req, as is shown in Figure 3-2 for NMT UHF direct

bond fission. Generally, in cases where it is known that there is barrierless dissociation,

one would recognize such features as an error in the method and subsequent results would

be questioned. For more complicated chemical processes, for example transition states or

[88] P. M. W. Gill, A. T. B. Gilbert, and T. R. Adams, J. Computat. C'!. :ii 21, 1505

[89] W. Koch, B. Frey, J. F. S. Ruiz, and T. Scior, Z. N ,i w f-i ch 58a, 756 (2003).

[90] N. Oliphant and R. J. Bartlett, J. Chem. Phys. 100, 6550 (1994).

[91] C. S. Duris, ACl\ TOMS 6, 92 (1980).

[92] L. A. Curtiss, K. Raghavachari, P. Redfern, and J. A. Pople, J. Chem. Phys. 106,
1063 (1997).

0.8 AMI
MNDO/d -

S 0.6


0.4 -

0.3 -

0.2 -



Figure 2-8. Density deviation from PW91 DFT




I was born in Michigan but have done time in Ohio, Washington, Oregon, and now

Florida. My tumultuous career in science began in sixth grade when my science teacher

called me a dreamer (I know I am not the only one) for proposing the construction of an

elevator to the moon to avoid the costs associated with space flight. In hindsight, I should

not have specified my original design as a rigid structure. That same year I was awarded

a certificate as the classroom's str ,u. -. kid. After making the transition to high school,

I doubled-up on math and science and finished all available classes in those areas by my

Junior year, resulting in a misspent Senior year.

Somehow I managed to get into the only college to which I applied. I arrived a

Freshman filled with youthful optimism and persuaded my academic advisor to enroll me

in a chemistry course 2 years beyond my level. I continued my Sophomore year without

the youthful optimism. I participated in two Research Experience for Undergraduates

programs sponsored by the National Science Foundation at Cornell and the University

of Chicago. In Chicago I worked for Prof. Karl Freed, essentially spending the summer

building overly complicated internal coordinate input files. I must have enjoy, 1 it because

that same summer I tattooed myself with the mark of a quantum chemist, literally.

Eventually, I took an interdisciplinary BA in chemistry and physics from Reed College in

beautiful Portland, Oregon, in 2001.

cancellation of two unrelated large functions (the electronic and nuclear repulsion energies)

is easier to parameterize against than the values of the larger component functions.

Alternatively, if the parameterization is against the larger functions there is a risk that

any error introduced in the parameterization will be magnified when they are subsequently

combined. If only the total energies or heats of formation (as in NDDO), and hence

only some combined function, are of interest then such an average approach would be

suitable. However, for our purposes we are interested in two separate and distinct sets of

information, the electronic properties and the total energy. Because of this added demand

on our semiempirical method we must generate the individual components of the total

energy expression with equal accuracy. We recognize three distinct components: first-order

electronic energy, residual electronic energy, and nuclear-nuclear repulsion. The first-order

electronic component has been the focus of most semiempirical method development, and

yields, in addition to the first-order electronic energy, the first-order electronic properties:

IPs, EAs, dipole, etc. The residual energy term presents a unique challenge, though

we will demonstrate how it can be included in a computationally efficient way. The

nuclear-nuclear repulsion is trivial to calculate and the only complication arises when

screened charges must be used if we use a valence-only Hamiltonian.

3.3 Systematic Comparison of HF, NDDO, DFTB, SCC-DFTB, KS-DFT in a
Single Framework

One of the goals of this research is to bring many seemingly disparate approximate

methods under a single common framework. For now, we will not deal explicitly with

so-called post-HF methods that include correlation based upon a reference single

determinant (because of the required computational effort). Instead, we are trying to push

the limits of what can be done within a one-particle theory. The problem we are solving is

the set of Hartree-Fock-Roothaan matrix equations, FC = SCc, for a general one-particle

operator f. We use the notation that F = H + G, where H is the density independent

contribution and G is the density dependent contribution to F. The spin-free one-particle

Full Text








Ithankthechairandmembersofmysupervisorycommitteefortheirmentoring.IwouldalsoliketogivespecialthankstoMarkPonton,VictorLotrich,andDr.Deumensfortheirextensivehelpinthelastfewmonths.Ioweadebtofgratitudetomyfamily,whosesupporthasinstilledadeepyearningtopursuemyinterests.Severalpeoplemademanyothercontributionstomylife,bothpersonallyandacademically.Larryurgedmetoappreciatethenerthingslifehastooerandkeptmesane.Ithankthetworealchemists,TravisandLani,fortheirunendingquesttoanswerthebasicquestionmanasksoflife:Whatisre?ThepeopleoftheQuantumTheoryProjectoeredaconstantsourceofinspiration,andoftenasourceofmorningheadaches.IoermanythankstoAndrew,Tom,Dan,Christina,Georgios,Seonah,Martin,Joey,Lena,Julio,andofcourseMerve.IalsoacknowledgeProf.MerzfortheuseoftheDIVCONprogramandDr.MartinPetersforhelpingtoimplementandtestsomeinitialideasforcubicsplineinterpolation(Chapter4).SpecialthanksgotoAndrewTaubeandTomHughesforthenumerousextensivediscussionsthatwereinvaluableinmyresearch.Also,IacknowledgeTomHughes'workinimplementingthetransferHamiltonianparametersfornitromethaneclusters(Chapter2).IwouldliketorecognizethehelpfromDr.AjithPereraforhelpthroughoutmygraduatestudies. 4


page ACKNOWLEDGMENTS ................................. 4 LISTOFTABLES ..................................... 7 LISTOFFIGURES .................................... 8 ABSTRACT ........................................ 9 CHAPTER 1INTRODUCTIONANDBACKGROUND ..................... 10 1.1WhoOrderedaNewSemiempiricalTheory? ................. 10 1.2QuantumChemistry .............................. 15 1.3Hartree-Fock .................................. 17 1.4Coupled-ClusterTheory ............................ 19 1.5Kohn-ShamDensityFunctionalTheory .................... 22 1.6HuckelMethod ................................. 23 1.7NeglectofDiatomicDierentialOverlap ................... 25 1.8Tight-Binding .................................. 26 1.9Density-FunctionalTight-Binding ....................... 27 1.10Self-Consistent-ChargeDensity-FunctionalTight-Binding .......... 29 2THETRANSFERHAMILTONIAN ......................... 35 2.1TheoreticalMethod ............................... 35 2.2ComputationalDetails ............................. 37 2.3Nitrogen-ContainingEnergeticMaterials ................... 37 2.3.1NitromethaneMonomer ......................... 40 2.3.2NitromethaneDimer .......................... 41 2.3.3NitromethaneTrimer .......................... 42 2.4SilicaandSilicon ................................ 44 2.4.1SiliconClusters ............................. 45 2.4.2SilicaPolymorphs ............................ 46 3IMPETUSFORANEWSEMIEMPIRICALTHEORY .............. 58 3.1NeglectofDiatomicDierentialOverlapBasedMethods .......... 58 3.1.1DoesNDDOApproximateHF? .................... 58 3.1.2ArticialRepulsiveBump ....................... 60 3.1.3SeparationofElectronicandNuclearEnergyContributions ..... 61 3.1.4TotalEnergyExpression ........................ 61 5


.......... 62 3.2.1Self-ConsistentChargeDensity-FunctionalTight-BindinginanAOFramework ................................ 62 3.2.2MullikenApproximationConnectiontoSCC-DFTB ......... 65 3.2.3SCC-DFTBRepulsiveEnergy ..................... 66 3.3SystematicComparisonofHF,NDDO,DFTB,SCC-DFTB,KS-DFTinaSingleFramework ................................ 67 3.3.1GeneralEnergyExpression ....................... 70 3.3.2DensityIndependentFock-LikeMatrixContribution ......... 71 3.3.3DensityDependentFock-LikeMatrixContribution .......... 72 3.3.4ResidualElectronicEnergy ....................... 73 3.3.5GeneralGradientExpression ...................... 74 3.3.6RequirementsforanAccurateSimpliedMethod ........... 75 4ADAPTEDABINITIOTHEORY ......................... 80 4.1Two-ElectronIntegrals ............................. 80 4.2Kohn-ShamExchangeandCorrelation .................... 89 4.3AdaptedabinitioTheoryModelZero ..................... 92 4.4Implementation ................................. 94 4.5Verication ................................... 97 4.6Conclusion .................................... 101 APPENDIXPARALLELIMPLEMENTATIONINTHEACESIIIENVIRONMENT 117 REFERENCES ....................................... 122 BIOGRAPHICALSKETCH ................................ 127 6


Table page 1-1TheNDDOone-centertwo-electronintegrals .................... 32 1-2TheNDDOtwo-centertwo-electronintegrals .................... 33 1-3Multipoledistributions ................................ 34 2-1EquilibriumgeometryforNMT,inAanddegrees ................. 49 4-1Adaptedintegrals ................................... 103 4-2AveragepercentageofFockmatrixcontributionbyadaptedapproximation ... 104 7


Figure page 2-1TheNMTforcecurveforC-Nrupture ....................... 50 2-2TheNMTenergycurveforC-Nrupture ...................... 51 2-3TheNMTdimerenergiesrelativeminimumintheintermolecularhydrogenbondingdistance ........................................ 52 2-4SnapshotsofthegeometryoptimizationsforNMTdimerandtrimerperformedwiththeTH-CCSDandAM1. ............................ 53 2-5AveragedeviationofSi-Ostretch .......................... 54 2-6AveragedeviationofSi-Sistretch .......................... 55 2-7AveragedeviationofO-Si-Oangle .......................... 56 2-8DensitydeviationfromPW91DFT ......................... 57 3-1Multi-centercontributiontotheC2sN2smatrixelementofNMT. ........ 76 3-2TheAM1articialrepulsivebumpforNMTdirectbondssion. ........ 77 3-3TheSCC-DFTBtotalenergybreakdown. ..................... 78 3-4EnergiesfromRudenbergandMullikenapproximations. ............. 79 4-1Scatterplotsofagreementbetweenapproximateandexacttwo-electronintegrals. 105 4-2DissociationcurveofC-Nwith4-centerRudenbergapproximation. ....... 106 4-3Orbitalproducts. ................................... 107 4-4DissociationcurveofC-NwithGZERO(B3LYP)approximation. ......... 108 4-5DissociationcurveofC-NwithAATM0approximation. .............. 109 4-6TimingofabinitioversusRudenbergfour-centerintegrals ............ 110 4-7Percentageofmulti-centertermsversussystemsize ................ 111 4-8TimingofFockbuildusingtraditionalNDDOandNDDOwithcubicsplinesasafunctionofsystemsize ............................... 112 4-9Errorintroducedperatomfromcubicsplinesasafunctionofsystemsize .... 113 4-10Pseudo-reaction-pathsplittingC20 114 4-11ForceRMSDfromMP2 ............................... 115 4-12ForceRMSDfromB3LYP .............................. 116 8


Wepresentworktowardaone-electronHamiltonianwhosesolutionprovideselectronicenergies,forces,andpropertiesformorethan1000atomsfastenoughtodrivelargescalemoleculardynamics.Ideally,suchamethodwouldbeaspredictiveasaccurateabinitioquantumchemistryforsuchsystems.TodesigntheHamiltonianrequiresthatweinvestigaterigorousone-particletheoriesincludingdensityfunctionaltheoryandtheanalogouswavefunctiontheoryconstruction.Thesetwocomplementaryapproacheshelpidentifytheessentialfeaturesrequiredbyanexactone-particletheoryofelectronicstructure.Theintentistoincorporatetheseintoasimpleapproximationthatcanprovidetheaccuracyrequiredbutataspeedordersofmagnitudefasterthantoday'sDFT. Wecalltheframeworkdevelopedadaptedabinitiotheory.Itretainsmanyofthecomputationallyattractivefeaturesofthewidely-usedneglectofdiatomicdierentialoverlapandself-consistent-chargedensity-functionaltight-bindingsemiempiricalmethods,butisinstead,asimpliedmethodasitallowsforpreciseconnectionstohigh-levelabinitiomethods.Workingwithinthisnovelformalstructureweexplorecomputationalaspectsthatexploitmoderncomputerarchitectureswhilemaintainingadesiredlevelofaccuracy. 9


Theunderlyingphysicallawsforthemathematicaltheoryofalargepartofphysicsandthewholeofchemistryarethuscompletelyknown,andthedicultyisonlythattheexactapplicationoftheselawsleadstoequationsmuchtoocomplicatedtobesoluble.|P.A.M.Dirac Thoughhigh-levelquantummethodsapproachthedesiredaccuracy,thecomputationalscalingofsuchmethodsprecludestheiruseforsystemslargerthanatensofatoms(e.g.,coupled-clusterwithsinglesanddoubles(CCSD)scalesO(N6),whereNisthenumberofbasisfunctions).Incontrast,thecomputationalscalingofsemiempirical(SE)methodsisacceptable(typicallyscalesO(N3)),thoughtheycanbemadetoscalelinearlywithN.Forinstance,ifthesizeofasystemisincreasedbyanorderofmagnitudethenaCCSDcalculationwouldincurafactorof106morecomputationaltime.Forthesameincreaseinsystemsize,thecomputationaltimeofanormalSEmethodwouldonlyincreasebyafactorofathousand,butwithlinearscalingonlybyten.Intheregimeofverylargesystems(thousandsofatoms)itisapparentthatacalculationthatcouldbeperformedinonedayusingSEmethodsmighttakeseveralyearsforCCSD.Unfortunately,SEmethods 10


Beforewebeginthedetaileddescriptionofourquantummechanicalmethodtheneedforafastandaccuratemethodtodescribethepropertiesoflargemoleculeswillbedemonstrated.Phenomenathatinvolvebond-breakingorbond-formingoftennecessitateadetailedknowledgeoftheelectronicstructureandthePES.Suchquantumchemicaldescriptionsareimportantinmanycomplexchemicalprocesses;forexample: 11


2.Condensedmatterstudies. 3.Biologicalstudies. Quantumchemicalmethodsareusedtosomeextentintheanalysisofthesetypesofproblems.Theidealquantumchemicalmethodcouldbeappliedtoanysizesystemandachieveanydesiredaccuracy,whileusingminimalcomputationalresources,but,ofcourse,thiscannotbeachieved.Hence,thereareoftenpracticallimitationsdictatedbythedesiredaccuracyandtheavailablecomputationalresources.Ourgoalistocreateamethodwithchemicalaccuracythatcanbeappliedtosystemswiththousandsofatoms.Amethodwiththesefeatureswouldbeabletoreadilyprovidethequantummechanicalcontributiontothesolutionofthesystemsmentioned. Thetwomainapproachesinquantumchemistryarewavefunctiontheory(WFT)anddensityfunctionaltheory(DFT).Likewise,therearetwocommonlyusedSEtechniques:neglectofdiatomicdierentialoverlap(NDDO)methods,whichoriginatedinWFT,anddensity-functionaltight-binding(DFTB),includingitsself-consistent-chargeextension(SCC-DFTB),whichoriginatedinDFT.BothNDDOandSCC-DFTBcanbeconsideredsemiempiricalsincetheybothapproximatetheformalstructureoftheoriginalmethodsandareparameterizedtoreproduceeitherexperimentalortheoreticalreferencedata.JustastheformalconnectionsbetweenWFTandDFThaverecentlybeenuncoveredandexploited,viaabinitioDFT[ 1 ],therearesimilarformalconnectionsbetweentheir 12


ExistingSEmethodsarewidelyusedand,whenappliedjudiciously,performquitewell.Regrettably,theyareoftenusedinsituationsforwhichtheresultsarenotreliable,suchaslocatingtransitionstates.ToexpandtheapplicabilityofSEmethods,wecombinetheirprimaryfeatureswiththosefromrigorousWFTorDFT,anddenethenewAATframework.Hence,AATencompassesbothNDDOandSCC-DFTB. Despitetheirformaldeciencies,bothNDDOandSCC-DFTBhavebeenquitesuccessfulinavarietyofapplications[ 2 ].Nonetheless,thereremainseveralaspectsoftheseSEmethodsthatfallshortofexpectationthatcouldbegreatlyimproved.Inadditiontoobviousformallimitations,existingSEmethodsweredesignedtoreproducetheenergeticsofsystemsatequilibriumandoftenyieldunphysicalresultsneartransitionstatesandalongdissociationpathways.UniformaccuracyofthePESiscriticalformoleculardynamics(MD)simulations,fordeterminingthelowestenergy(globalminimum)structure,andforcalculatingrateconstants.Associatedwiththesefailingsareunrealisticelectronicdensitiesthat,inturn,makecombiningthesemethodsintomulti-scalemodelingschemesdicult.Furthermore,errorsinrst-orderpropertiessuchasthedipoleandhigher-ordermomentswouldberemediedbyamethodthatdeliversadequateelectronicdensities.Attemptstoimprovesuchpropertieshavemetwithmodestsuccess,typicallyattheexpenseofotheraspectsofthemodel. Byfar,themostcommonroutetoincorporatesuchadditionalfeaturesistoreparameterizethemodel.Unfortunately,buildinginaccuracyfornon-equilibriumstructures,excitedstates,andotherproperties,bysimplereparameterizationofexistingNDDO-basedSEmethods[ 3 ]isnotreadilyachievable,atleastnotinawaythatmaintainstransferability(similaraccuracyregardlessofbondingcharacteristicsorsystemsize).ThisdicultyhighlightsanimportantdistinctionbetweenthecurrentAATrouteofdesigninganewsimpliedmodelHamiltonianbasedonrigorousabinitioprinciplesversus 13


4 5 ]. Animportantcomputationalconsiderationistheinclusionoftheoverlapbetweenorbitals,whichisafeatureofAATthatdistinguishesitfromstandardNDDOmethods.Thezero-dierentialoverlapapproximation,whichisthefundamentalapproximationofNDDO,impliesthatthereisnooverlapbetweenorbitals.Thoughthisapproximationenhancesthespeedofthemethod,itneglectsanimportantphysicalfeatureofelectronsandthebasissetsusedtodescribethem.TherehavebeenattemptstoreintroducesuchoverlapeectsinNDDOmethods,mostnotablyThiel'sorthogonalizationcorrectionmethods(OM1andOM2)[ 6 ].Still,Elstner'sdevelopmentofSCC-DFTB[ 7 ]demonstrates 14


Thepredominantcomputationalaspectsofthemethodhavebeenhighlighted;now,considertheformaldevelopmentoftheAATapproach.Thereisanexactreferencetotheexchange-correlationpotentialwithwhichwecandirectlycompare.Theelectronicdensitycanbecorrectediterativelybyaninverseprocedure,suchasthatofZhao,Morrison,andParr[ 8 ]orthatofvanLeeuwenandBaerends[ 9 ].Theseprocedurestakeahigh-qualityreferencedensityandgeneratethenumericalexchange-correlationpotentialthatbestreproducesit.Inthelimitthatthereferencedensityisexact,suchproceduresproduceanexactlocalexchange-correlationpotential.Thisexactlimitdenesasetofone-particleequations,whichinturn,denetherst-orderelectronicenergy.Whenaddedtothenuclear-nuclearrepulsiveenergy,theelectronicenergygivesthetotalenergycorrectthroughrst-orderinthetrialwavefunction.InexistingSEmethods,theremaininghigher-orderenergytermsarecombinedwiththenuclear-nuclearrepulsiontogiveaparameterizedatom-pairbasedtotalenergycorrectionthatwouldbepredominantlynuclearrepulsion.Instead,wechoosetoretainaseparatenuclear-nuclearrepulsioncontributiontoensurethateveryaspectofourmodelhasadirectone-to-onecorrespondencetoabinitiotheory.Theconnectionstotheexactlimitsforbothdensityandtotalenergycomponentsoftheoverallparameterizationprovidethenecessarystructuresothateveryapproximationofthemodelcanberigorouslyveried. 15


Thetime-independentSchrodingerequationprovidesthesolutionsneededforthetime-dependentSchrodingerequation,andthoughsomeproblemsrequireatime-dependentsolution,thevastmajorityofproblemsinquantumchemistrycanbetreatedadequatelywiththetime-independentSchrodingerequation.TheBorn-Oppenheimerapproximationisbasedonthesimpleobservationthatelectronsmovemuchfasterthannuclei.ThisdierenceinspeedeectivelyimpliesnuclearmotioncannotaecttheelectronicstructuresignicantlyandwecansolvetheSchrodingerequationforasetofxednuclearcoordinates.Forexampleinthehydrogenatomtheelectronistravelingatabouttwomillionmeterspersecond,whichisordersofmagnitudefasterthanthemotionofthenucleus.Also,thisspeedislessthan1%ofthespeedoflight,whichalsoimpliesthatweneednotconcernourselveswithrelativsticeectsthatwouldonlybecomeimportantforatomswithheaviernuclei(typicallythosewithatomicnumberlargerthan25).ForsuchheavyatomsthespeedofanelectronnearthenucleuswillapproachthespeedoflightandwouldrequirearelativisticHamiltonianforanaccuratedescription.Insuchcaseseectivecorepotentialsthatapproximatethepassive(Darwinandmass-velocity)relativisticeectscanbeused,whichstillallowsonetokeepthenonrelativisticHamiltonianframework.Finally,thelinearcombinationofatomicorbitals(LCAO)approximationintroducesbasissetsintotheproblemandgivesageneralframeworkforsystematicallyexpandingthereferencespaceoftheorbitalsthattheelectronsmayoccupy. Thetime-independentSchrodingerequationwithintheBorn-Oppenheimer(clamped-nucleus)approximationis 16


2Xir2iXA;iZA withiandjreferringtoelectronsandAandBtonuclei.ThetermsinEquation 1{2 aretheoperatorsthatcorrespondtothekineticenergyoftheelectrons,electron-nuclearattraction,electron-electronrepulsion,andnuclear-nuclearrepulsion.Thewavefunctiondescribesthestateofourquantumsystem.Wechoosetoapproximatewithinabasissetofatomicorbitals(AOs).EnforcingantisymmetryandsatisfyingthePauliexclusionprinciplecanbeachievedwithanantisymmetricproductoforthonormalizedAOswithinaSlaterdeterminant,theLCAOapproximation wherenisthenumberofelectronsandthespinorbital(r1)=(r1);and;denotespinand(r)isaspatialorbital. Itisttingtostartwiththemostwell-knownapproximationinelectronicstructuretheory.HFisanexcellenttouchstonesincecorrelationenergyisdenedbyit(Ecorrelation= 17


^h(1)jii=(1 2r21XAZA andthetwo-electronoperatorsareCoulomb^Jandexchange^K, ^J(1)jjii=Zjj(r2)j2 ^K(1)jjii=Xj6=iZj(r2)i(r2) and ^f(1)jii=[^h(r1)+(^J(1)^K(1))]jii=ijii(1{7) denotingthemolecularspinorbitaliasalinearcombinationofatomicspinorbitals~i UsingLagrange'smethodofundeterminedmultipliersandenforcingorthonormalityoftheorbitalsgivesasetofone-particlematrixequations,theHartree-Fock-Roothaanequations: where and TheconstructionofaFock-like(oreectiveone-particle)matrixisacentralthemeofthiswork.AFock-likematrixisanorbitalrepresentationofaneectiveone-particleoperator, 18


10 11 ]. HFillustratesmanyofthenecessaryfeaturesofaone-particletheory.Onedistinctfeature,whichsetsHFapartfrommostotherone-particletheories,isthatHFenforcesthePauliexclusionprincipleexactlyviatheexchangeoperator.SuchexactexchangetermsaremissinginKohn-Shamdensityfunctionaltheory[ 12 ]whichleadstotheso-calledself-interactionerror.AnotherbenetofHFisthatthereisKoopmans'theoremthatgivesmeaningtotheeigenvaluesoftheFockmatrixbyrelatingthemtoionizationpotentialsandelectronanities.However,theexactDFTstructurecanprovideaformallycorrectone-particletheorythatincludeselectroncorrelation,whichHFcannotdobecausebydenitionitdoesnotincludeelectroncorrelation.COPisanotherexactalternativethatcanbedenedfromWFT.


abij=A[1(1)2(2)::a(i)::b(j):::n(n)];(1{15) whereunoccupiedorbitalsaandbreplacetheoccupiedones,iandj.Thecoecient(amplitude)infrontoftheexcitationintroducesitsproperweightintothewavefunction.Theultimateversionofsuchasequencewouldincludeallpossibleexcitationsupton-foldfornelectronsandiscalledthefullcongurationinteractionorfullCI.FullCIis: 13 ](meaningthatitscalesproperlywiththenumberofelectrons.) ThefullCIcanbewrittenconvenientlyinthecoupled-cluster(CC)form, =exp(bT1+bT2+bT3+:::+bTn)0:(1{16) [ 13 { 16 ].ApproximationstoCCtheoryaremadebyrestrictingthebTpoperatorstosinglesanddoubles,asinCCSD,ortootherapproximationsthatdecouplehigher-excitationsfromlowerones,e.g.,CCSD(T).HerethetripleexcitationsareapproximatedfrombT1andbT2;whichcanbeobtainedfromCCSD,e.g.,withoutallowingthenewbT3estimatethentocontributebacktobT1andbT2:Thedetailsofthesemethods,whichcanbefoundelsewhere[ 17 ],arenotimportanttothepresentdiscussion.WhatisimportantisthatalltruncatedCCmethodsaresize-extensive,butdonotnecessarilygiveupperboundstotheexactenergy. ThefullCCequationsprovidetheexactelectronicenergy, 20


(1{18) (1{19) andP12istheoperatorthatpermuteselectrons1and2.Theamplitudes,ftai;tabij;gcomefromprojectionofexp(bT)^Hexp(bT)= (1{20) (1{21) 17 { 19 ]. CCtheoryprovidesaroutetoexactreferencedata.Itiswell-knownthatthehierarchyofCCmethodsaresystematicallyimprovable,andleadtothefullCIsolutionforaparticularbasisset.Hence,forsmallmoleculeswehavereadyaccesstoreferenceenergies,forces,densities,andelectronicpropertiestowhichwecancomparedirectly.Thewealthofinformationthatisaordedbyahigh-levelabinitiocalculationissuperiortothetypicalamountofinformationavailablefromexperimentandprovidesuniqueinsightintohowwellanapproximationisworking.Forinstance,besidesgettingenergeticinformationaboutamoleculewesimultaneouslyhaveaccesstotheelectronicdensity.Thisincreasedlevelofcontrol,whencomparedtoexperimentalreferencedata,allowsforsystematicallyveriableapproximationsofoureectiveone-particleHamiltonian.The 21


12 ]equationsmaybewrittenas ^fKSjii="ijii(1{22) or, [1 2r2+ext(r)+J(r)+xc(r)]jii="ijii(1{23) where andxc(r)istheexchange-correlationpotential.Constrained-searchminimizationofthenoninteractingkineticenergyalonedeterminesorbitalsthatyieldthegroundstatedensity[ 20 ].Theothertermsdonotaecttheminimizationbecausetheyareexplicitfunctionalsofthedensity. TheHohenberg-Kohn-ShamtheoremsguaranteethatthedensityobtainedfromtheexactgroundstatewavefunctionwillbereturnedbyEquation 1{26 whenusingtheexactexchange-correlationpotential.Inapragmaticsense,theseargumentsgiveaformalproofofexistence,thoughtheydonotguidetheconstructionoftheexactone-particleoperator,or,morespecically,theexchange-correlationpotential.However,incombinationwithso-calledinverseprocedures,suchasthatofZhao,MorrisonandParr(ZMP)[ 21 { 23 ],whichhavebeenexploredbyTozeretal.[ 24 ],ortheprocedurebyPeirsetal.[ 25 ],thepotentialmayberenediterativelysuchthatthesolutionstoEquation 1{23 willgenerateareferencedensity,presumablyahigh-qualityabinitioreferencedensity. 22


2ZZ(r)(r0) wheretheexchange-correlationpotentialisthefunctionalderivativeoftheexchange-correlationenergyfunctional Unfortunately,currentlythereisnoprocedurethatallowsustotakeanumericallyderivedexchange-correlationpotential,onethatisdeterminedsuchthatitreproducesahigh-qualityreferencedensity,andgeneratetheexchange-correlationfunctional,whichisanexplicitfunctionalofthedensity.Thisinabilitytodeterminetheexchange-correlationenergyfunctionalfromareferencedensityaloneistherstindicationofafundamentalprobleminsemiempiricalmethods,whichwillbediscussedinfurtherdetaillater:howdoesonedesignamodelHamiltonianthatwillsimultaneouslyachievethecorrectresultforbothelectronicpropertiesandenergetics?However,ifwearewillingtoassumetheformofaparticularexchange-correlationenergyfunctionalthenthepotentialiswelldened,asistheeectiveone-particleHamiltonian. HC=C(1{29) 23


whereniistheoccupationnumberfortheithMO.NotethereisnoexplicitconsiderationofthenuclearrepulsioninEquation 1{30 .TheHuckelapproximationsare TheextendedHuckelmethod[ 26 ],asitsnameimplies,ismerelyanextensionofthestandardHuckelmethodwiththesametotalenergyformula.ThefullmatrixequationHC=SCissolved,withoverlapexplicitlyincluded.AndtheHamiltonianissubjecttoaslightlymoresophisticatedapproximation: whereHiiisbasedontheithAOofanisolatedatomandKisadimensionlessadjustableparameterforanatom-pair.HomannsuggestedavalueforKbasedonitseectontheenergyofethane(C2H6).Hearguedthat,sincetheenergydependenceonKisnearlylinearbeyondK=1:5,theMOenergiesand,inturn,thetotalenergy,willnotbetoosensitivetoKforvaluesgreaterthanthat.HethenconsideredtheenergydierencebetweenthestaggeredandeclipsedconformationsofethaneandoptimizedthevalueofKtominimizetheerrorbetweencalculatedandexperimentalvalues.ThenalvaluehearrivedatwasK=1:75,avalueisstillincommonuse. TheimportantaspecttonoteabouttheHuckelmethod,asitpertainstothedevelopmentofimprovedsemiempiricalmethods,isthataverysimplemodelcanreproduceafewfeaturesqualitativelywell.Thespatialsymmetryiscorrectduetotheinclusionoftheoverlapmatrix.However,despitetheabilityofsuchbasicapproximations 24


27 { 32 ].Itisaminimalbasisset,valence-onlyapproach.NDDOisbasedonthezero-dierentialoverlap(ZDO)approximation:productsoforbitalswithrespectthesameelectroncoordinatethatareondierentcentersaredenedtobezero, TheorbitalsareconsideredtobeLowdinsymmetricallyorthogonalizedAOs,sothattheoverlapmatrixistheidentitymatrix,andthesetofone-particlematrixequationssolvedisFNDDOC=C.TheFock-likematrixinNDDOcontainsone-electron(H)andtwo-electron(G)components.Theone-electronpartisapproximatedas 2S(+)if6=(1{34) whereisonatomA,isonatomB,andUandareadjustableparameters,(jBB)modelstherepulsionbetweenaproductoforbitals()andapointchargeatcenterB,(h(1)j1 1-1 .TheparametersGss,Gpp,Hsp,Gpp,andGp2areadjustable. Withinalocalframeworktherearetwenty-twouniquetwo-centerNDDO-typetwo-electronintegralsforeachatompair(Table 1-2 ).TheNDDOtwo-centertwo-electronintegralsareapproximatedbyamultipole-multipoleexpansion[ 33 ].Eachproductof 25


1-3 ).AdjustableparametersarisefromthedistancesbetweenthepointchargesusedinthemultipoleexpansionandbyenforcingthecorrectlimitasR!0wheretheequivalentone-centertermshouldbereproduced. Asmentionedabove,originallytheNDDOFock-likematrixwasdesignedtomodeltheHFone-particleHamiltonian.TheHFsolutionneglectsallelectroncorrelationeectsandonthescaleofchemicalaccuracyoneshouldnotaprioriexpecttheNDDOapproximationtoreproduceexperimentalresults.However,becausesomeparametersareadjustableandttoexperimentaldatathetypicalargumentisthatelectroncorrelationisimplicitlyincluded.Infact,itisclearthatelectroncorrelationisnomoreimplicitintheNDDOmodelthanarerelativisticeects,time-dependence,non-Born-Oppenheimereects,basissetincompleteness,andexperimentalerrorsinthereferencedata.Despitesuchformalgaps,theNDDOmethodhasenjoyedatremendousamountofsuccessandreproducesheatsofformationandotherpropertieswithsurprisingaccuracy.Thedicultyariseswhenbetteraccuracyisrequiredinsituationsforwhichthemodelwasnotoriginallyintendedtodescribe,suchastransitionstatesorothernonequilibriumstructures.Insuchcasesonequicklyapproachesapointofdiminishingreturnsforparameterizingtonewreferencedata.Thepreciseoriginoftheselimitationsisexploredlater. 2XA6=BU(jRARBj):(1{35) wherethei'saretheeigenvaluesof 2r2+vext)jii=ijii;(1{36)vextisthenuclear-electronCoulombattractiontermandU(jRARBj)isconsideredtorepresentalltheremainingterms.Intotal,thelatterconstitutearepulsivefunctionofthe 26


where (Heff)=hjbheffji(1{38) and Inpractice,thesetwo-centermatrixelementsareevaluatedusingtheSlater-Kosterparameterization[ 34 ].TheTBapproximationhasbeenappliedsuccessfullytomanysolidstateproblemsandworksespeciallywellwhenthedensityofthechemicalsystemiswellapproximatedasasumofsphericallysymmetricatomicdensities.Inthecaseofmolecules,amoreadvancedapproachthat\treatsthedelicatebalanceofcharge"[ 7 ]isneeded.IfwecompareTBtoHF,weseethatU(jRARBj)hastoaccountfor^J^KandVNN.InDFTitwouldaccountforJ,VNN,andwouldcombineexchangewithcorrelationbyaddingvxc. 35 ]deriveditfromrst-principlesDFTconsiderations.TheenergyexpressioninKS-DFT,Equation 1{27 ,maybewrittenas 2Z0(r0) 2NXA;BZAZB Equation 1{40 isexactif=Poccij2ij,whereistheexactdensity.However,itshouldbenotedthatiareeigensolutionsof^hs=(r2 2R0(r0) 27


2ZZ000(0+) (1{41) +1 2ZZ00(0+) 2NXA;BZAZB 2ZZ02Exc 6ZZ0Z003Exc SubstitutionofthisexpressionforExcintotheenergyequationyieldsanexpressionfortheenergythatisformallyexact,yetonlyincludestermsthataresecond-orderandhigherin. 2ZZ0000 2NXA;BZAZB 2ZZ00 2ZZ02Exc +1 6ZZ0Z003Exc


2ZZ0000 2NXA;BZAZB whichobviouslyiscorrecttosecondorderin.Notetheinclusionofnuclear-nuclearrepulsioninthisterm. OneofthegreatadvantagesofDFTapproachesisthattheyoerareasonablezeroth-orderapproximationtotheenergysimplyfromknowingsomeapproximationtothereferencedensity,0:However,sinceExc[]isnotknown,gettingtheexchange-correlationpotential,vxc(r)=Exc=(r);anditskernel,fxc(r;r0)=2Exc=(r)(r0),fromrstprinciples,whicharerequiredtosetuparigorousDFT,isdicult[ 1 36 ].However,werewetoacceptoneoftheplethoraofapproximationstoExc(LDA,LYP,PBE,B3LYP,B88,etc.)theequationscanbesolvedforafewhundredatoms,butnotonatime-scalethatcanbetiedtoMDforthousandsofatoms. Furthermore,toprovideacorrectdescriptionofbondbreakingtheelectronicchargehastobefreetorelocate.AcommonexampleisLiF,inwhichatR=ReqwehavevirtuallyanexactLi+Fform,butatseparationinavacuum,wemusthaveLi0F0:Suchself-consistencyusuallyisintroducedbysolvingtheAO-basedDFTequationsviadiagonalization,butthisapproachimposesaniterativestepthatscalesasN3totheprocedure.Sincethemodelismeanttobeapproximateanyway,thealternativechargeself-consistencyapproach,familiarfromiterativeextendedHuckeltheory[ 37 38 ],canbeusedmoreeciently.Thisextensionleadstotheself-consistent-chargedensity-functionaltight-binding(SCC-DFTB)approachofElstneretal.[ 7 ]. 29


2ZZ0(0 2NXA;BqAqBAB:(1{45) TheqAarethedierencesbetweensomereferencecharge,q0A,andthechargeonatomA, 39 ],Ohno[ 40 ],andMataga-Nishimoto[ 41 ].Forexample,thelatterformis wherethesingleindexAismeanttobeapurelyatomictwo-electronrepulsionterm.Thesearetypicallychosenasxedatomicparametersinthecalculation.ForanassessmentofhowtheSCC-DFTBmodelperformscomparedtostandardsemiempiricalquantumchemicalmethods,seeMorokuma,etal.[ 42 ].Forexample,unlikeSCC-DFTB,standardAM1doesnotgivethemostandleaststableisomersofC28whencomparedtoDFTcalculationswiththeB3LYPExcanda6-31G(d)basisset.Incontrast,therelativeenergiesoftheisomersofC28withB3LYP/6-31G(d)//SCC-DFTBareinverygoodagreementwithB3LYP/6-31G(d),havingalinearregressionR2coecientof0.9925.ThoughthegeometriesgeneratedbySCC-DFTBareexcellent,theSCC-DFTBrelativeenergiesarepoor,withalinearregressionR2of0.7571. SCC-DFTBisafairlyrigoroussemiempiricalmethod.TheframeworkprovidedbyFoulkesandHaydockexposedhowhigher-ordertermscouldbeincluded.Thepracticalimplementationofawell-behavedapproximationofhigher-orderwassubsequentlyprovidedbyElstner.Later,wewilldeveloptheSCC-DFTBformalismfromanMOviewpointtofacilitatethecomparisonbetweenitandothermethods. 30




TheNDDOone-centertwo-electronintegrals ParameterTwo-electronintegral 32


TheNDDOtwo-centertwo-electronintegrals TermTerm 1(ssjss)12(spjpp)2(ssjpp)13(spjpp)3(ssjpp)14(ssjsp)4(ppjss)15(ppjsp)5(ppjss)16(ppjsp)6(ppjpp)17(spjsp)7(ppjp0p0)18(spjsp)8(ppjpp)19(spjpp)9(ppjpp)20(ppjsp)10(ppjpp)21(ppjpp)11(spjss)22(pp0jpp0) 33


Multipoledistributions TermChargedistributionPointcharges (ssjMonopole1(spjDipole2(ppjLinearQuadrapole4(pp0jSquareQuadrapole4 34


However,practicallimitationsrestrictthefunctionalformofthewavefunctionandtheHamiltonian.Thechallengeistoincorporatethemany-bodyeectsofexactWFTintoasimple(low-rank)modelHamiltonian.Oneaspectoftheapproachtakenhereistodevelopaformallyexactone-particletheory,thetransferHamiltonian(Th),whichincludesallmany-particleeects.AnotheraspectistoconnectasimpleapproximatemodelHamiltoniantotheexactTh.Ourinitialworkusedthewell-knownNDDO-typemodelHamiltonian.Becausethesemodelshavelimitedapplicability,ithasbecomeevidentthatmorecompletemodelsareneededthataccuratelyincorporatethemany-particleeectsoftheTh.WedeveloptheAATsimpliedapproachtoincludethefeaturesoftheThinasystematicway,whichisdiscussedindetaillater. HereweintroducetheTh[ 43 ],aone-particleoperatorwithacorrelationcontribution,asanextensionofthesimilaritytransformedHamiltonianfromCCtheory[ 44 ].Thisapproachprovidesarigorousformalframeworkforanexactone-particleWFT.Also,unlikeothermethods,suchasthespecicreactioncoordinateapproachofTruhlar[ 45 ], 35


Westartfromtheexponentialansatzinwhichtheexactgroundstatewavefunctionisgivenby where0isareferencefunction,whichcanbetakenasasingledeterminantandTistheexcitationoperatorofCCtheory.Whendenedthisway,wemaywritetheSchrodingerequationintermsofthesimilaritytransformedHamiltonian, ThesolutionofthisequationhasthesameeigenvaluesasthebareHamiltonian.Thetotalenergy,E,isexact,thustheforces,@E @RA,areexactaswell.Insecond-quantizedform 2!Xp;q;r;s 3!Xp;q;r;s;t;u whereweshowtheinclusionofthree-andhigher-bodyinteractionterms.Toincludethehigher-ordertermswerenormalizewithrespecttotheone-andtwo-bodytermsandobtainageneralized,correlatedFock-likeoperator,calledthetransferHamiltonian[ 10 43 ] wherePisthesingleparticledensitymatrixandbothehand^hpqjjrsicanbeapproximatedwithasuitablefunctionalofatom-basedparameters.WehaveusedtheNDDOforminthecurrentwork,buttheformalismdevelopedthusfarisnotlimitedbythischoiceofFock-likeoperator. 36


27 ]toreproducetheforcesdeterminedwithCCtheory.TheparameterizationwasperformedbyacombinationofthePIKAIA[ 46 ]implementationofageneticalgorithmandthelinearizedquasi-Newton-RaphsonminimizationroutineofL-BFGS[ 47 ]. ThereferenceforcesusedinthisparameterizationwereobtainedwiththeACESII[ 48 ]programsystem.WehaveusedCCSDinatriplezetabasis,withaunrestrictedreferenceandHartree-Fockstabilityfollowing(toensurethatwearriveattheproperunrestrictedspinsolution.)AllsemiempiricalAM1andOM1calculationswereperformedusingamodiedversionofMNDO97Version5.0[ 30 { 32 49 ]. DFTcalculationsforthenitromethane(NMT)monomerwereperformedattheB3LYP/6-31G(d)levelwiththeHONDOPLUSprogram[ 50 ].TheinitialNMTdimerandtrimergeometriesweretakenfromaDFTstudyusingB3LYP/6-31++G**[ 51 ].Duetothelargenumberofcorrelatedreferencecalculationsneededinmodelingourclusters,weusedtheTurbomoleVersion5electronicstructurepackage[ 52 53 ]todoall-electroncalculationsattheresolution-of-the-identitysecondorderMller-Plesset(RI-MP2)levelinaTZVPbasis.Wechoseanall-electronmodelinalargeauxiliarybasistoensureanaccuratetreatmentofcorrelationenergiesandgeometriesthatweredemonstratedonaseriesofcompoundsrepresentativeofatomsinvariousoxidationstates[ 54 ]. 37


NMThasbeenthesubjectofmanytheoreticalandexperimentalstudiesduetothesignicantrolethatnitrogen-containingcompoundsplayinthechemistryofexplosives,propellants,andatmosphericpollution[ 55 ].ThedescriptionofthedecompositionofNMTanditsproductsiscrucialtotheunderstandingofthekineticsanddynamicsoflargerenergeticmaterialsforwhichNMTisamodel.NMTissmallenoughtoallowdetailedinvestigationofitscomplexdecomposition[ 56 57 ].Specically,evenforasystemofdeceivingsimplicity,itsdecompositiontoradicalsviasimpleC-NbondruptureversusthecompetingNMT-methylnitrite(MNT)rearrangementhasbeenthesubjectofmuchdebate[ 56 { 59 ]. TheuseofclassicalpotentialsandstandardsemiempiricalNDDOmethodsoftenyieldincorrectbehaviorforenergies,forces,andgeometries,especiallyinnon-equilibriumcases.Therefore,theuseofsuchtechniquestodoMDwilloftenleadtoincorrectresultswhenbondbreakingandformingarestudied.Alternatively,abinitioquantumchemistryisknowntobepredictive,inthesensethatitcanreproducemostquantitiesadequatelycomparedtotheexperimentalvaluesinthoseregionswithoutknowingtheresultpriortocalculation.However,thesemethodsaretooexpensiveand,therefore,currentlyimpracticalforMDbecauseofthetimeanddiskspacerequirements. PreviousNDDOparameterizations,suchasAM1,canreproducenear-equilibriumstructuresforalargenumberofmolecules,yettheyoftenfailfornon-equilibriumgeometries.Thus,forthepurposesofdoingMDsimulations,wearewillingtosacricetheabilitytoaccuratelytreatalargenumberofmoleculesnearequilibrium,fortheabilitytotreatasmallclassofmoleculesaccuratelyatallgeometries.Whenmodelingcombustionordetonationitisparticularlyimportanttohaveamethodthatworksequallywellforequilibriumandnon-equilibriumgeometriesifthechemistryofbondbreakingandforming 38


AmongthersttostudyNMTtheoreticallywereDewaretal.[ 56 ]whousedMINDO/3andfoundthattheNMT-MNTrearrangementoccurredwithanenergybarrierof47.0kcal/molandthatMNTdissociatedtoH2CO+HNOwithanenergyof32.4kcal/mol.HencetheyproposedthatNMTdecompositionoccursviarearrangement.McKee[ 57 ]reportedtheNMT-MNTrearrangementbarriertobe73.5kcal/molcalculatedattheMP2leveloftheorywitha6-31G(d)basisandthatMNTdissociatedtoH2CO+HNOwithanenergyof44.1kcal/mol.ThehighbarrierheightoftherearrangementsuggeststhatthedecompositionpathwayissimplebondrupturetoNO2andtheCH3radical.Huetal.[ 59 ]performedcalculationswiththeG2MP2approximationandfoundthattheNMTdissociationoccursviadirectC-Nbondrupturewithanenergyof61.9kcal/mol.TheNMTdissociationislowerthanboththeNMT-MNTrearrangementandtheNMT-aci-nitromethanerearrangementby2.7and2.1kcal/mol,respectively.Nguyenetal.[ 58 ]reportedthattheNMT-MNTrearrangementbarrieris69kcal/molandNMTdirectdissociationtoCH3andNO2radicalsis63kcal/molcalculatedattheCCSD(T)/cc-pVTZlevelusingCCSD(T)/cc-pVDZgeometries.MostrecentlyTaubeandBartlett[ 60 ]reporttheNMT-MNTrearrangementat70kcal/molusingCCSD(T). Huetal.[ 59 ]alsoperformedadetailedstudyofunimolecularNMTdecompositionandisomerizationchannelsusingDFTwithG2MP2//B3LYPleveloftheoryina6-311++G(2d,2p)basis.SinglepointswerebasissetextrapolatedalongtheminimumenergypathstocorrectforincompletenessusingtheG2MP2method.Theirarticleshowsthatfromaparticularintermediatestate(IS2b)onepossiblereactionpathistheruptureoftheN-Obondtogivethemethoxyradicalandnitrousoxide.Huetal.indicatethaton 39


58 61 62 ]. However,thereliabilityofboththeB3LYPand,toalesserextent,CCSD(T)atnon-equilibriumgeometries[ 59 ]islimitedandthesameaccuracyshouldnotbeexpectedinregionsotherthanequilibrium. TheworkofLietal.[ 51 ]suggeststhatthethree-bodyinteractionenergyinbulkNMTisacriticalcomponentinaccuratelydescribingthepotentialenergysurface.WeinvestigatedtheuseofourThinpredictinghowanaccuratetreatmentofthree-bodyeectsdictateschemicalreactivity. 2-1 theagreementbetweenCCSDandTH-CCSDistowithinadegreefortheanglesandlessthanonehundredthofanAforbondlengths. TH-CCSDisshowninFigure 2-1 toreproducetheoriginalforcecurveforC-Nbondrupturetowithinanaccuracyof0.005Hartree/Bohr.TheAM1parametersweredesignedtoreproduceequilibriumstructures,andthustheyhavethecorrectforceattheequilibriumbondlength.However,at0.1Aawayfromequilibriumtheforcesareinconsiderableerror.ArecurringfeatureforAM1forcecurves,whichisproblematicforMD,istheunphysicalrepulsivebehavioratlargeinternucleardistances.ThetransferHamiltonianremovesthisarticialrepulsionandgivesthequalitativelyandquantitativelycorrectdissociation. Onewayofdeterminingenergydierencesinasinglereferencetheory,otherthanHartree-Fock,isbyintegrationoftheforcecurve.ForAM1theerrorintheforcesisreectedintheenergyasshowninFigure 2-2 .AlsoshowninFigure 2-2 theDFTsolution 40


63 ]havestudiedtheinteractionenergyofdimersasafunctionofrigidmonomerseparation,wecalculatetherelativeenergiesoftheNMTdimerasafunctionofhydrogenbondingdistance.TheminimumchoseninourNMTdimercalculationsfrom[ 51 ]isstableduetotheformationoftwohydrogenbondsofequallengthandantiparallelarrangementofthemoleculardipoles.InFigure 2-3 ,weplottheenergiesrelativetotherespectivemethodsminimainROH. Ourdataindicatethatinthecaseoffrozenmonomergeometries,AM1overestimatestherepulsivewallatsmallintermoleculardistances.ItappearsasiftheH,C,N,andOparametersinthecore-corerepulsionfunctionforAM1wereunabletocanceltheeectfromwhichMNDOhastraditionallysuered,namelyoverestimationofenergiesincrowdedmolecularsystems[ 31 64 ].TheTH-CCSDreproducestheinteractionforcesquitewellforintermoleculardistancesupto3A,whileAM1isshowntoslightlyoverbind.ThereaftertheTH-CCSDunderbindsslightlyindisagreementwithRI-MP2butinagreementwiththeOM1Hamiltonian.TheOM1methodhascredenceasamethodwhichmodelsthedipoleandthushydrogenbondinginteractionproperlybycorrectingtheoverlapmatrix[ 32 65 66 ]. AdiabaticC-NbonddissociationforsmallNMTclusterswasinvestigatedstartingfrompublishedlocalminima[ 51 ]usingtheTH-CCSDandAM1.Instudyingthedimerinteraction,forwhichoneofthemonomersisundergoingunrestrictedC-Nbondrupture,theTH-CCSDforcespredictthatabimolecularprocesstakesplacewhenthestretchedbondisatadistanceof2.42A(1.6Req).Thereactionproductsarenitrosomethane,methoxyradical,andnitrogendioxideradical.Figure 2-4 showssnapshotsofthisparticulardimerdecompositionchannel. 41


2-1 ).ThegeometryoptimizationusingAM1revealsthatthemonomersactuallyseparateby0.45Aandthatnomethylrotationtakesplace. ThebimoleculardimerreactionpredictedbytheTH-CCSDissupportedbytheunimolecularrearrangementofintermediateIS2b[ 59 ]forwhichnotransitionstatewaslocated.ThesimilarityisthatthereactionproductsbothinvolvetheformationofmethoxyradicalviathecleavageofanN-Obond.Unlikethestudyin[ 59 ],ourprocessisbimolecular,wheremethoxyradicalformedisbyNMTundergoingoxygeneliminationorN-Orupture.Contraryto[ 59 ],wherethemethylgrouppicksupanoxygenfromitsownmonomer,weobservethattheoxygenispickedupbyamethylgrouponadierentmonomer.Ourresultsforoxygeneliminationareinagreementwithpreviousstudies[ 61 62 ],whichsuggestthisistheprimaryprocessindetonation. 51 ]hasaringstructureinvolvingthreehydrogenbonds.Performingthesamestudydoneforthedimeronthereferencestructuretrimerfrom[ 51 ]showsthatataC-Nbonddistanceof1.94Aor1.3ReqaconcertedrearrangementofNMTtoMNToccurs.ThedecompositionofNMTintrimerthusoccursataC-Ndistancethatis0.5Alessthanthatinthedimer.ThisindicatesacooperativereactivityinclustersofNMTinitiatedbythevibrationalexcitationofaC-Nbond.ThemonomerrearrangementtoMNTiswidelysupported[ 58 59 ]andisbelievedtobeakeystepinthedetonationofNMT.TheinitialintermolecularN-Ndistancesare4.5A, 42


2-4 ).Duringthegeometryoptimization,becauseoneofthemonomershasastretchedC-Nbond,thesystemisunabletoreachitsring-likehydrogen-bondedlocalminimum,leadingtoringstrain.TheTH-CCSDpredictsthat,asaresult,theonemonomercleavesataC-Nbonddistanceof1.8Aand,subsequently,asecondmonomerdissociatesatthesameC-Nbonddistance.Atthispoint,alltheintermolecularN-Nbonddistanceshavedecreasedto4.3A,4.5A,and4.6A;theangles,however,remainunchanged.GeometryoptimizationwithAM1showsthatthestretchedmonomershiftsslightlyfromtheremainingtwomonomersallowingoneofthemtorotatetoalowerenergystructurewhereitformsabifurcatedhydrogenbondeachofwhichis2.44A. WehavedemonstratedthattheTH-CCSDparametersettrainedforNMTmonomertransferstothedimerandtrimercalculations.ThecriteriafortransferabilityusedhereregardstheextenttowhichtheseforcesyieldproductsthatareinaccordwiththeoreticallysupporteddecompositionmechanismsofNMT[ 58 59 ].NotethatthetrainingofourHamiltonianwasnottailoredtofavoranyparticularrearrangementmechanism,evidentbythefactthatthedimerdecompositiondiersmarkedlyfromthetrimermechanism.AlongtheadiabaticsurfaceAM1doesnotpredictanybondbreakingtooccur.Therefore,theaccuracyofmodelingbondbreakingandformingprocessesinbulksimulationsusingAM1isquestionable,atbest. WehaveshownaparameterizationoftheNDDOHamiltonianthatreproducesproperdissociationofNMTintoradicals,bothfortheenergyandforces.ThisparametersetreproducesthereferenceCCSD/TZPgeometriesandforcesforunrestrictedC-Nbondbreaking.ThisNDDOparameterizationwouldappeartopotentiallyoerimprovedaccuracyinMDsimulationsforclustersofhigh-energymaterials.Wealsoshowthatconcertedreactionsforboththedimer(bimolecularrearrangement)andtrimer(trimolecularrearrangement)arenotgeneratedwithanAM1parameterization,yetarerequiredfortheproperdescriptionofNMTclusters.ThisnewtransferHamiltonianwillallowfor 43


67 68 ]becauseofitswideuseandpriorsuccesses.However,suchmethodsusingstandardparameterslikeAM1[ 27 ],MNDO[ 30 { 32 ]andPM3[ 28 29 ]arealsoknowntohavesignicantfailingscomparedtoabinitiomethods.OurTH,ontheotherhand,benetsfromthefactthatweonlyrequirespecicparameters[ 45 ]forthesystemsofinterest,forexample,SiO2clustersandH2Oinourhydrolyticweakeningstudies[ 69 ].Theseparameterspermitallappropriateclusteringandbondbreakingforallpossiblepathsforallthecomponents,toallowMDtogowherethephysicsleads.Furthermore,unliketraditionalsemiempiricalmethods,wedonotuseaminimalbasisset,preferringtousepolarizationfunctionsonallatoms,andinsomecases,diusefunctions.TheformoftheHamiltonianemphasizestwo-centerinteractions(butisnotlimitedtonearestneighborsasinTB),whichallowsourparameterstosaturateforrelativelysmallclusters.WecheckthisbydoingabinitiocalculationsonlargerclustersinvolvingthesameunitstodocumentthattheTHiseectivelyunchangedwithrespecttosystemsize.Atthatpoint,wehaveaTHthatwecanexpecttodescribeextendedsystemscomposedofSiO2,anditssolutionsproperly,includingallmulti-centereectsontheenergiesandforces.OnlytheHamiltonianisshort-range.WealsocheckourTHbyapplyingittorelatedsystems.WewillshowthatourTHdeterminedfromforcesforbondruptureindisiloxane,pyrosilicicacid,anddisilanealsoprovidecorrectstructuresforseveralsmall 44


NowwewillshowthattheTH-CCSDparametersforanNDDO-typeHamiltoniantrainedonasmallsetofreferencemoleculesaretransferableintherstsense.Wechoseourtrainingsettobereasonablerepresentativesofthelocalbehaviorofextendedsystems:disiloxane(H3SiOSiH3),pyrosilicicacid((OH)3SiOSi(OH)3)anddisilane(H3SiSiH3).TheinitialparametersusedwerefromMNDO/d[ 70 ]becauseofavailability,overallperformanceonthetrainingset,andinclusionofd-orbitalsonSi. ThetwasbasedonforcesfromtheCCSDapproximationusingDZPbasissets.PrimarilytheTH-CCSDtwastotheforcesalongtheintrinsicreactioncoordinateforSi-SibondbreakingindisilaneandSi-Obondbreakinginbothdisiloxaneandpyrosilicicacid.FurtherrenementwasachievedbyttingtotheforcesfortheSi-O-Sibendsofdisiloxaneandpyrosilicicacid.ThemostreliableerrorfunctiontouseinthetypeofttingseemstobethesquareofthedierencebetweenthereferenceCCSDandTH-CCSDforces.Thettingprocedureitself,thoughnonlinear,involvediterativelyperformingalinearminimizationofeachparameter,startingwiththosethatmostreducedtheerror. Toverifythetransferabilityoftheparametersoneshouldevaluatepropertiesformoleculesoutsidethereferenceset.VericationforbondbreakingtestswereperformedonSi(OH)4,Si(SiH3)(OH)3,Si(SiH3)2(OH)2,Si(SiH3)3(OH),Si(SiH3)4,H2Si=SiH2, 45


2-5 and 2-6 showtheresultsrelativetoDFT(withtheB3LYPapproximateexchange-correlationfunctional)values. ItisapparentthattheTH-CCSDparametersoutperformmanyofthecommonlyusedNDDOmethodsforavarietyofdierentbondingenvironments.Furthermore,althoughtheparametersweredesignedtoreproduceSi-O-Si,Si-O,andSi-Siforces,theytransferwelltoequilibriumO-Si-O(Figure 2-7 ). 71 ].UnderstandingsilicaandsiliconbringsusclosertounderstandingdefectsinMOS.Inthem,phenomenasuchascurrentleakage,electricalbreakdownovertime,andthethinningtrendsofthedielectriclayeringateoxidesarestronglyinuencedbydefects[ 72 73 ]. Italsoshouldbenotedthatwefocusontrainingmodelsthatperformwellprimarilyforlocalizedunits(i.e.,clusters)andsecondarilyforbulksystemswithperiodicboundaryconditions.Thisisbecauseitismuchhardertobuildlocalphenomenaintoamodelusedprimarilyforextendedsystemsthanitistoconstructamodelbasedonlocalchemistrythatisapplicableforsmallmolecules,clusters,andbulk.Forexample,periodiccalculationswithplanewavesrequireexceedinglylargenumbersofplanewavestodescribeoxygendefectsinsilica[ 74 ]. 46


75 ]haveshownexperimentally,basedonthetwo-foldcompressionofliquidSiO2,thatthereisessentiallynochangeinnearest-neighbordistancesthoughtherearesignicantchangesinmedium-rangestructure.Thismedium-rangestructureisbestdescribedbyring-andcluster-statisticsandclustergeometry[ 76 ],whichwouldbenotbeincorporatedinanytwo-bodypotential. Toavoidthelimitationsofempiricaltwo-bodypotentials,onemayconsiderNDDOornon-orthogonalTBmethodsdiscussedabove,whichincludesomemany-bodyeectsduetoinclusionoftheoverlapmatrix[ 77 ].EspeciallypromisingistheSCC-DFTBapproach[ 7 ]whichincludesthelargestcontributionstolong-rangeCoulombinteractionswhilehandlingchargeinaself-consistentway.Thesemethodsoerapracticalbalanceofchemicalaccuracyandcomputationaleciency,asevidencedfromtheirrecentincreaseinpopularity. Figure 2-8 showsthedeviationsoftheTH-CCSDparameterization,AM1,andMNDO/dNDDOcalculatedunitcelldensitiesfromreferenceDFTdensities.ThereferenceDFTcalculationsforallsilicapolymorphsarefromtheVASPcodeusingthegradient-correctedPW91GGAexchange-correlationfunctionalwith3Dperiodicity,anultrasoftpseudopotential,andaplanewavebasisset(395eVcuto).A2x2x2k-pointgridisemployed,whichconvergestotalenergiestowithin0.05eV/cell,andtheconvergenceofself-consistentstepsispreciseto104eV/cell.Theoptimizationofcellparametersandatomicpositionsispreciseto<103eVA1. ItisclearthattheTH-CCSDparameterstrainedonsmallsilicaclustersaretransferabletoseveralsilicapolymorphs.Itisalsointerestingtonotethatnotonlyis 47


TheapplicationofthetransferHamiltonianmethodologyhasbeenpresentedfornitrogencontainingenergeticmaterialsandsiliconcontainingcompounds.Inbothcaseswehavedemonstratedsignicantimprovementovertheunderlyingsemiempiricalmethodbyreparameterizingwithatrainingsetofhigh-qualityCCforcesforrepresentativemolecules.ItisapparentthatforfocusedMDstudiesthatusesemiempiricalmodelHamiltoniansthereisanadvantagetocreatingaspecialsetofparametersthatistailoredtothechemicalsystembeingstudied.However,thetransferabilityofthemodelisaectedbywhichmoleculesareselectedforthetrainingsetandthePESreferencepointsused.Thesemodelswilloftenbreakdownwhenappliedtosystemstowhichtheyhavenotbeenparameterized.Oneroutetoachievethisleveloftransferabilityistodesignasimpliedmethodthatdoesnotrequireparameterization,onlycontrolledapproximations,whichleadstotheAATapproachtodeningasuperiorTh. 48


EquilibriumgeometryforNMT,inAanddegrees MethodRCNRCHRNOAHCNAONC 49


TheNMTforcecurveforC-Nrupture 50


TheNMTenergycurveforC-Nrupture 51


TheNMTdimerenergiesrelativeminimumintheintermolecularhydrogenbondingdistance 52


B C D Figure2-4. SnapshotsofthegeometryoptimizationsforNMTdimerandtrimerperformedwiththeTH-CCSDandAM1.A)NMTdimertreatedwithTH-CCSD.B)NMTdimertreatedwithAM1.C)NMTtrimertreatedwithTH-CCSD.D)NMTtrimertreatedwithAM1 53


AveragedeviationofSi-Ostretch 54


AveragedeviationofSi-Sistretch 55


AveragedeviationofO-Si-Oangle 56


DensitydeviationfromPW91DFT 57


Semiempiricalmethoddevelopmenthasalonghistorydatingbacktothe1930'sstartingwithHuckeltheoryfordescribingeven-alternanthydrocarbons.Inthecurrentstudywehavebeenguidedbythedevelopmentsoverthelastseventyyearsinregardtobothwhatshouldandshouldnotbeapproximatedinourmodel.WefocushereontheNDDOandSCC-DFTBmodels.Bothmodelshavebeenwidely-usedwithreasonablesuccess.Therelativenumericalperformanceofthemostrecentversionsofeachmodelhasbeenreportedrecently[ 2 ],andextensivecomparisonsoftheformalsimilaritiesofeachmodelhavebeenmade[ 78 79 ].ItisimportanttounderstandthecontextinwhichthesemodelsweredesignedtofullyappreciatethedierenceinthedesignphilosophyofthecurrentAATmodel.NDDO-basedmethodswereoriginallyconstructedtoreproducetheexperimentalheatsofformationforsmallorganicmolecules.Sincetheheatofformationisdenedonlyformoleculesintheirequilibriumstructures,itisnotsurprisingthatNDDOmodelsareparameterizedspecicallytoreproduceequilibriumproperties.FornonequilibriumstructurestheaccuracyofNDDOmethodsvariesgreatly,aswewillshow.Similarly,SCC-DFTBhasbeenparameterizedtoreproduceequilibriumstructures. ThegoaloftheAATapproachistoachieveuniformaccuracyofelectronicproperties,structures,andenergeticsovertheentirePEStotheextentpossibleforaminimalba-sisseteectiveone-particletheory.UniformaccuracyisrequiredtoachievemeaningfulresultswhenperformingMDsimulations,inwhichtransitionstatesandothernonequilibriumstructuresplayacriticalrole. 3.1.1DoesNDDOApproximateHF? 58


2(j)(3{1) (j)ABCD1 2 (j)ADBC(3{2) where,,,andareoncentersA,B,C,andDrespectivelyandtheterms (j)aresubjecttothemultipole-multipoleexpansionapproximation.ItistruethatatypicalNDDO-typetwo-electronintegralencounteredisoftenlargerinmagnitudethanacorrespondingnon-NDDO-typetwo-center,three-center,orfour-centertwo-electronintegral.Formoleculeswithmorethatthreeheavyatomsinavalence-onlyminimalbasissetrepresentation,therearesignicantlymoretermsinvolvingthree-andfour-centertwo-electronintegrals.Thecombinedeectofthislargenumberofmulti-centertermshasasignicanteectonthedensitydependentFockmatrixcontributions.Weconsiderasimpleexample.ShowninFigure 3-1 aretheindividualcontributionstotheNMTdensitydependentFockmatrixelementofthecarbon2s-typeorbitalandnitrogen2s-typeorbitalinaminimalbasissetforthecombinedthree-andfour-center,two-center,abinitioNDDO-type,andfullmulti-centercontributions.Thoughonlyasinglematrixelementisplottedoverthisreactioncoordinate,thetrendisthesameforothermatrixelementsandothermolecules.Thethree-andfour-centercontributionsarenearlyequalandoppo-sitetothetwo-centercontributions.Thiscancellationisauniversaltrendandholdsatallenergyscales.And,whileitcouldbesaidthattheNDDO-typeclassoftwo-electronintegralsapproximatesthefullsetoftwo-centertwo-electronintegrals,itisclearthattheNDDO-typeintegralsalonebearlittleresemblancetothefulltwo-electronintegralcontribution.Instead,itisthedelicatebalanceamongallmulti-centertwo-electronintegralsthatiscriticaltoobtainingtheproperFockmatrixcontribution. IthasbeenknownsincestudieswererstperformedontheNDDOapproximationthataparameterizedsetofNDDOintegralsperformbetterthattheabinitioNDDOintegrals.However,whathasbeenlessclearisthelackofcorrespondencebetweentheNDDOapproximationandthequantityitissupposedlyapproximating.Insteaditseems 59


80 ]showthattheirPDDG/PM3reparameterizationofPM3yieldsheatsofformationwithameanabsoluteerrorof3.2kcal/molforatestsetof622molecules.However,thereislittleevidencetosuggestthatthesameperformanceshouldbeexpectedforchemicalstructuresthataresignicantlydistortedfromequilibriumstructures.Evenforsimpleactivationbarriers,theerroristypically10-15kcal/mol[ 80 ].ForthedeterminationofrateconstantsandinmoleculardynamicsstudiesthePESshouldbedescribedwithin1-2kcal/mol.Forexample,sinceat300Kanerrorof1kcal/molintheactivationbarrierwillresultinanorderofmagnitudechangeintherateconstant,clearlya10-15kcal/molerrorwillnotyieldmeaningfulrateconstants.ThesituationisespeciallybadincasesinwhichdirectbondssionistreatedwithNDDO-basedmodelHamiltonians.Insuchcasesthereisanarticialrepulsiveregionnear1.2Reqto1.6Req,asisshowninFigure 3-2 forNMTUHFdirectbondssion.Generally,incaseswhereitisknownthatthereisbarrierlessdissociation,onewouldrecognizesuchfeaturesasanerrorinthemethodandsubsequentresultswouldbequestioned.Formorecomplicatedchemicalprocesses,forexampletransitionstatesor 60


Thecore-corerepulsionterminNDDOHamiltoniansinvolvesparametersthataredesignedforenergeticsatequilibrium.Forsuchaspecicsetofpropertiesthedierenceoftwolargenumbers,theattractive(negative)electronicenergyandtherepulsive(positive)nuclear-nuclearrepulsionenergy,iseasiertomodelthaneachseparately.Inouropinion,combiningsuchunrelatedtermsistheoriginofthelong-standingcontentionthatelectronicpropertiesandtotalenergypropertiescannotbedescribedwithinthesamesetofparameters.Also,oncethecore-coretermsareparameterized,thecorrespondencebetweenrigoroustheoryandsemiempiricaltheoryislost.Tosolvethisproblem,weneedtosearchforabetterformforourmodelHamiltonianthatdoesnotrequirethatVNNbecombinedwithtermsthatarepurelyelectronic. 61


1{27 Now,ifweexpectthattheNDDOmodelHamiltonianwillgeneratethecorrectelectronicdensity,whichisarequirementifwewantanyoftherst-orderelectronicpropertiesofoursystem,thenwecanrelatetheNDDOandKS-DFTenergyexpressions.Makingthetemporaryassumptionthat^fNDDO^fKS,andconsequentlyNDDOKS,thenthetotalenergydierenceis 2Zvxc(r)(r)drXA;B>AZAZB NowitisclearthatiftheNDDOone-particleHamiltonianweretoyieldthecorrectelectronicdensitythenthetwo-centercore-coreterm(ENDDOAB)mustaccountfornotonlythesimplenuclear-nuclearrepulsionterm(ZAZB 2Rvxc(r)(r)).Themotivationforapproximatingthislasttermisobvious:itdrasticallysimpliesandavoidsacomplicatedandexpensivestepinthedeterminationofthetotalenergy.However,thereisnoreasontoexpectapriorithatthisapproximationwouldbeaccurate.ForourimprovedmodelHamiltonianwemakeaspecialeorttoensurethatallapproximationsarecontrollableandsystematicallyveriable. 3.2.1Self-ConsistentChargeDensity-FunctionalTight-BindinginanAOFramework 3{4 through 3{8 aretheworkingequationsforSCC-DFTBwhichcomedirectlyfromElstner'soriginalpaper[ 7 ]: qA=qAq0A(3{4) 2occXiniX2AatomsX(ciciS+ciciS)(3{5) 62


~HSCC=hj^H0ji+1 2SatomsX(A+B)q=~H0+~H1 2atomsXA;BABqAqB+XA6=BErep(RAB)(3{8) andthecorrespondinggradientsare wehaveused2Aand2B. TofacilitatecomparisonoftheSCC-DFTBmodelHamiltoniantoothermethodsweforceitintoastructurethatiscomprisedofadensityindependentpartHandadensitydependentpartG.UsingMullikenpopulationanalysistodeterminethenetchargeassociatedwithatomAyields qA=q0A+X2A;PS(3{10) wheretheoccupationnumbersni=1andtheMOcoecientsarereal. 2SatomsX(A+B)q0+1 2SatomsXX2;(A+B)PS=(HSCC)+XP(G]SCC) wehaveintroducedthenotationthat (HSCC)=~H01 2SatomsX(A+B)q0(3{12) 63


2SS(A+B)(3{13) NotethatunlikeHForNDDO,inSCC-DFTBitappearsthatitisnotgenerallytruethatG=G,as2andisunrestricted.IfSCC-DFTBweretohavethisproperty,thatwouldbeusefulbothforthecalculationofforcesandforcomparisonamongalleectiveone-particlemethodsweconsider.Toenforcethissimilaritywemakethefollowingobservation: 2SS(AC+BC)]=XP[1 4SS(AC+BC+AD+BD)](3{14) where,,,andareonA,B,C,andDrespectively,andourworkingequationforthedensitydependentcontributioninSCC-DFTBis (GSCC)=1 4SS(AC+BC+AD+BD)(3{15) Equation 3{15 bearsasimilaritytotheMullikenapproximationtoCoulombrepulsion,wewilladdressthisinfurtherdetaillater. Nowwerevisitthetotalenergyexpression.UsingournewdenitionofthedensitydependentandindependenttermsintheSCC-DFTBFock-likematrix. 2XP[2HSCC+XP(GSCC)]+XA6=B(1 2ABq0Aq0B+ESCCrep(RAB))(3{16) Thisenergyexpressioncloselyresemblesotherone-particleHamiltonians,yetisequaltoEquation 3{8 .Similarly,aftersignicantmanipulationthegradientexpressioncanbeshowntobe 2XP[2@HSCC @XA6=B(1 2ABq0Aq0B+ESCCrep(RAB)) (3{17) whichagainisequaltotheoriginalexpression,Equation 3{9 ,butallowsamoresystematiccomparisonwithothermethods. 64


3{13 ,whichhasbeenshowntobeequivalenttoEquation 3{15 ,thoughthelatterhasaformthatsomemayrecognizeastheMullikenexpansionfortwo-electronintegrals, (j)1 4SS(AC+BC+AD+BD)(3{18) TheoriginoftheMullikenexpansionandthefurtherextensionofthebasicapproximationtotheRudenbergapproximationwillbeexploredlater.Fornowitissucientthatwediscussthenumericalpropertiesofthisapproximationandwhatadvantagesandlimitationsitentails. ItisperhapseasiesttoconsiderthenatureoftheMullikenapproximationbywayofasimpleexample.ForanNDDO-typetwo-centertwo-electronintegralwithnormalizedorbitalsitfollowsdirectlyfromEquation 3{18 Therehavebeenmanystudiesonthevariousformsfor,mostnotablybyKlopman[ 39 ],Ohno[ 40 ],andNishimoto-Mataga[ 41 ].TheparticularformwithinSCC-DFTBmodelistheCoulombrepulsionbetweentwo1sSlatertypeorbitalswithanaddedconditionthatastheseparationbetweenthetwocentersapproacheszeroAB!UAwhereUAistheHubbardparameter,whichisrelatedtochemicalhardness,whichinturnisrelatedtothedierencebetweentherstIPandEAofthatatom.TheparticularformoftheABisonlyassignicantastheunderlyingapproximationinwhichitisused.Inthis 65


IfweapplytheMullikenapproximationinstandardHFandusetheabinitio(j)two-centerintegrals,thebehaviorisquiteunphysical.ShowninFigure 3-4 areerrorsfromthefullHFwhentheMullikenandRudenbergapproximationsareappliedtothetwoclassesofthree-centerterms,(AAjBC)and(ABjAC),andthefour-centertermsinNMToverthecarbon-nitrogenreactioncoordinate.TheerrorintroducedbytheMullikenapproximationisdrastic,particularlyforthe(AAjBC)three-centerterm,forwhichtheerrorisover1000kcal/mol.GiventhattheonlyexplicitdensitydependenceoftheFock-likematrixinSCC-DFTBisintroducedbytheMullikenterm,andyettherelativeenergeticsofSCC-DFTBinpracticedonotintroducesucherrors,thensomedensityindependentpartofthemodelmustbeaccountingforthedierence.WewillavoidsuchuncontrolledapproximationsintheAATapproach. 2Zvxc(r)(r)drXA;B>AZAZB ThenumericalbehaviorofthiserrorisshowninFigure 3-3 .Clearly,thisrepulsivepotentialiscarryingtheburdenofaportionoftheelectronicenergy.Thisisnotbyaccident;thevalueofafunction(inthiscasethetotalenergy)thatresultsfromthe 66




Thisdependenceallowsustouseaself-consistenteld(SCF)procedure:determinetheFock-likematrixfromaguessdensityanddetermineanewsetofcoecientsandthecorrespondingdensity,repeatuntilthedensityoutputiswithinsomethresholdoftheinputdensity.Alternatively,FPS=PSFandhajheffjii UponconvergenceoftheSCFequationswehaveasetofAOcoecientsthatdenethesetofMOs areferencesingledeterminantwavefunctionisthengeneratedasanantisymmetrizedproductoftheMOs, 0=Af1(1)2(2):::n(n)g(3{23) Itisusefulatthispointtodenetherst-orderelectronicenergygivenazeroth-orderwavefunction,j0i.TheexactelectronicHamiltonianis ^H=Xih(i)+Xi;j>i1 wherethecoreHamiltonianis 2r21XAZA andinitsmatrixrepresentationthecoreHamiltonianis 68


2Xi;jhijjjiji Notethattherst-orderelectronicenergyisexactlythesameastheHFenergyexpressionandfurthermorehasnoexplicitdependenceontheparticularformoftheone-particleoperator^f. However,thegoalisnottherst-orderelectronicenergybuttheexactelectronicenergy.WecanconsidertheCCroute,whereweknowthatforasingledeterminanttheexactelectronicenergyis(withanychoiceoforbitals) TheCCenergyexpressionprovidesgreatinsight,buttheexplicitcalculationoftabijandtaiamplitudesrequiresmuchmorecomputationaleortthanweareabletoexpendforafastsemiempiricalmodel.AmorenaturalenvironmentforasemiempiricalmethodsisgivenbyKS-DFT.Assumingthatwehavetheexactexchange-correlationenergyfunctional,theexactKS-DFTenergyis(forKS-DFTorbitals) 2Xi;jhijjjii+Exc[]=Tr[PHcore]+1 2Xi;jhijjiji+Exc[] (3{29) ThereisanotherformoftheKS-DFTenergyexpressionthatismoreusefulforthecurrentdevelopment,thoughitrequiresthatweuseaslightlymorespecicformforourone-particleoperator.Ifthetargetisaone-particleHamiltonianthatreturnsthecorrectelectronicdensity,andhencethecorrectrst-orderelectronicproperties,thentheidealone-particleoperatorisindeedtheKS-DFTone-particleoperator.And,ifonlythetotaloperatorandnotitsindividualcomponents,i.e.,1 2r2,ext(r),J(r),andxc(r),is 69


3{29 wouldbesuitable.Byseparatingtermsthataredependentonthedensity(G)wecanwrite 2Tr[P(2HKS+GKS)]1 2Tr[PGKSxc]+Exc[](3{30) where (GKSxc)=hjvxc(r)ji(3{31) andF=H+G.Equation 3{30 isacentralthemeofourwork,asitgivestheunambiguousdenitionoftheexactelectronicenergy.Tiedtoitaretheone-particleKSequationsthatdenetheorbitals,thedensityandassociatedrst-orderproperties.Inthisway,theproblemforthetotalenergyandgradientsiscompletelyspeciedby 2Tr[P(2H+G)]+Eresidual[P]+XA;B>AZAZB WewillbringHF,NDDO,DFTB,SCC-DFTBandKS-DFTintothisframework,tomakeitclearwhatthethreedistinctcomponentsare: Ifamethodistogivethecorrectelectronicdensity,andwecanknowtheexactKS-DFTHKSandGKS,thenthemodelcomponentsforHandGabsolutelymust


2r21XAZA 2S(+)if6=(3{34) 2H0(3{35) 2H01 2SNX(A+B)q0(3{36) 2r2XAZA TheonlydensityindependentcontributiontotheidealFock-likematrix(theFock-likematrixwhichreturnsthecorrectone-particlegroundstatedensity)istheexactcoreHamiltonian,Hcore.ThebiggestbottleneckforcomputingcontributionsforHcoreisthethree-centerone-electronintegrals,whichscaleasn2orbitalsNatoms.Still,theyrepresentasmallcomputationaleortforlargesystemswhencomparedtothen3orbitalsterms 71


2(j))(3{38) (j)ABCD1 2 (j)ADBC)(3{39) 4SS(AC+BC+AD+BD)](3{41) ThedensitydependenttermoftheidealFock-likematrixisGKS.Thischoicefollowsdirectlyfromthechosenformforthedensityindependentcomponentandtherequirementthatwereproducetheexactelectronicdensity.ThedensitydependenceoftheFock-likematrixaccountsforthecomplexinteractionsamongelectrons.ThistermisdominatedbyCoulombicrepulsion.Thesecondmostimportanttermisgenerallycalledexchange.Thethirdcomponentiscorrelation,andcanbeconsideredtobethehighlycomplexwayinwhichelectronsinteractwithoneanother,i.e.,theirmotionis 72


2Tr[PGKSxc]+Exc[](3{47) 73


3{30 ,thegradientis[ 81 ] d=2@S 2XP@G whichisbasedonPiCi(FiS)=0andorthonormality.Notethatthereisnoexplicitdependenceon@P 3{48 onlyholdsiftermsinGarelinearinthedensity ThederivativesofEresidual,whicharenotpresentinHF,aresimpleforthosemethodsthatonlydependonasumofatom-pairterms,suchasNDDO,DFTB,andSCC-DFTB.InKS-DFTadditionalnumericalintegrationisrequiredfortheperturbeddensity.PotentialapproachesforreducingthecomputationalcostassociatedwiththederivativesinKS-DFTwillbediscussedinfurtherdetaillater. 74




Multi-centercontributiontotheC2sN2smatrixelementofNMT. 76


TheAM1articialrepulsivebumpforNMTdirectbondssion. 77


TheSCC-DFTBtotalenergybreakdown. 78


B C Figure3-4. EnergiesfromRudenbergandMullikenapproximations.A)Three-center(AABC).B)Three-center(ABAC).C)Four-center(ABCD). 79


Wehavespeciedthetargetframeworkofoursimpliedquantummechanicalmethod.Thetasknowistodemonstratesimplewell-controlledapproximationsthatsignicantlyreducecomputationaleortwhilemaintainingadesiredlevelofaccuracy.Tothisend,aprimaryfocusisonthedensitydependentcomponentoftheFock-likematrix,asitisthemostcomputationallyintensivepartoftheSCFprocedure.Furthermore,theresidualelectronicenergycontributionisexaminedbecausewehaveshownthatitspropertreatmentisessentialforaSMtoachieveuniformaccuracyofelectronicstructureandenergetics. Ourstrategyistostartwithareferencetheoreticalmethodthatisreasonablywell-behaved,inthiscaseKS-DFT.ThoughthereismuchdebateaboutthebestDFTenergyfunctional,wechooseahybridDFTfunctional,specicallythewidely-usedB3LYP,becauseitincludesaportionofexactexchangewhichaddsasubstantialdegreeofexibilitytooursimpliedmethod.Tosimplifythecomputationalprocedurewethenexplorevariousapproximationsandverifytheiraccuracy.Itmaybetoomuchtodemandasimpliedmethodthatissystematicallyimprovable(intheCCwavefunctiontheorysense),however,everyapproximationshouldbesystematicallyveriable.Wewillshowthatbymakingacoupleofwell-controlledapproximationsweareabletoreproducethetopographicalfeaturesofahybridDFTPESwithasubstantialreductionincomputationaleort. 82 ],butthiscannotbedonefortheexactexchange,leavingn4orbitalsdependence.Thoughmostquantumchemistryprogramsemployscreeningtechniquestoreducethenumberoftwo-electronintegralscalculated,theyremainasignicantcost,especiallywhenexact 80


83 { 85 ],thelinearexactexchange[ 86 ],andquantumchemicaltreecode(QCTC)[ 87 ].ThesemethodsstillrelyontheexplicitcalculationofintegralsusingstandardtechniquesforCartesianGaussian-typefunctions.Forlargesystemsthemostnumerousclassofintegralsarethefour-centerintegrals.Furthermore,four-centerintegralsarenotonlythelargestclassofintegrals,butalsothemostcomputationalinvolvedintegrals.AccordingtoGilletal.[ 88 ]afour-centerintegralistypicallyanorderofmagnitudemorecomputationallydemandingthanatwo-centerintegral.Ourfocushereisonsystematicapproximationsthatcanbemadetotheseabinitiomulti-centerintegralsthatwillreducethecomputationaleortassociatedwithcalculatingthesetermswhileretainingacceptableaccuracy. ThecornerstoneofourapproachisbasedonideasoriginallypresentedbyRudenbergin1951[ 5 ].TheunderlyingapproximationistoexpandanorbitaloncenterAintermsofasetofauxiliaryorbitalslocatedoncenterB: where IftheauxiliaryorbitalsoncenterBarecompleteandorthogonalthentheexpansionisexact.ByrepeatedapplicationofEquation 4{1 thenumberofcentersthatareinvolvedinthree-andfour-centertwo-electronintegralscanbereduced.ThepossiblecombinationsofthistypeofexpansionhasbeenexploredandenumeratedbyKochetal.[ 89 ],thoughwithoutcomparingthenumericalaccuracyofsuchapproximations.Wewillshowthe 81


First,considertheeectoftheorbitalexpansiononatwo-centerproductoftwoorbitalswithrespecttothesameelectroncoordinate, 21Xk=1[SAkBkB(1)B(1)+SkABA(1)kA(1)](4{3) Sinceageneralmulti-centertwo-electronintegralisgivenas substitutingEquation 4{3 intoEquation 4{4 forboththeABandCDorbitalproductsyieldsthefollowingexpression,whichistheRudenbergtwo-electronintegralform, (ABjCD)=1 41Xk=11Xj=1[SAkBSCjD(kBBjjDD)+SAkBSjCD(kBBjCjC)+SkABSCjD(AkAjjDD)+SkABSjCD(AkAjCjC)](4{5) Inthelimitthatthesumsoverauxiliaryorbitalskandjarecompletethisexpansionisexact.Inpractice,theexpansionisapproximatedbylimitingtheorbitalsoverwhichweareexpandingtothoseincludedintheAObasisfunctions,thoughusingaseparateauxiliarysetoforbitalsmayalsobesuitable. TheRudenbergtwo-electronintegralexpression(Equation 4{5 ),ismoregeneralthanthewell-knownMullikentwo-electronintegralapproximation.TheMullikenapproximationisequivalenttorestrictingthesuminEquation 4{1 tobeonlyoverorbitalsofthesametype, 2SAB[A(1)A(1)+B(1)B(1)](4{6) 82


(ABjCD)=1 4SABSCD[(AAjCC)+(AAjDD)+(BBjCC)+(BBjDD)]:(4{7) NotethesimilarityofthisexpressiontothatwhichappearsinSCC-DFTB,Equation 3{18 ,inwhich thoughinSCC-DFTBtheorbitalsandarerestrictedtobes-typefunctions. WeconsiderherealleightdierentapproximationsthatresultfromtheapplicationofEquation 4{1 tothree-andfour-centertwo-electronintegrals.Thetwoclassesofthree-centertermsareoftheform(AAjBC)and(ABjAC),whichwillbereferredtoastypesAandBrespectively.TofacilitateourdiscussionweintroducethenumberingconventioninTable 4-1 TheRudenbergandMullikenformulascanbeappliedtoeitherclassofthree-centerorthefour-centerintegrals(approximationsI-VI)usingEquations 4{5 and 4{7 ,respectively.Forthepartialapproximationstofour-centerterms(VIIandVIII),partialmeansthatthecorrespondingapproximationhasbeenappliedonlyonce.Theexpansionforthesepartialapproximationsisthenintermsofthree-centertwo-electronintegrals(insteadofonlytwo-centerterms).FortheMullikenpartial(VII)wehave (ABjCD)=1 4fSAB[(AAjCD)+(BBjCD)]+SCD[(ABjCC)+(ABjDD)]g:(4{9) andtheRudenbergpartial(VIII)is (ABjCD)=1 41Xk=1[SAkB(kBBjCD)+SkAB(AkAjCD)+SCkD(ABjkDD)+SkCD(ABjCkC)](4{10) 83


Notethatthetotalelectronicenergyisthesumofalltheone-,two-,three-,andfour-centercomponents.And,eachofthesecomponentsindividuallybearlittlesimilaritytothetotalelectronicenergy,againconsidereachcontributiontoanyparticularmatrixelementshowninFigure 3-1 ,insteadthetotalenergyistheresultofthedelicatecancellationamongtheseterms.Inaddition,sinceonlythenon-parallelityerrorsareofimportanceinreconstructingthePES,theenergeticcontributionarisingfromeachindividualclassoftermsmaybeshiftedbyaconstantvalueindependentlyofoneanotherwithoutaectingthefeaturesofaPES.Thesesubtleties,compoundedbythe3N6degreesoffreedomofthePES,makeascribingsignicancetoanysingleclassoftermsdicult.BecausewewanttoassesstheaccuracyofaparticularapproximationasitappliestoasingleclassoftermswewillinsteadfocusontheFockmatrixelements,whichcontainalltheinformationneededforthezeroth-orderelectronicenergy,toovercomesomeofthesediculties. GivenanysetofpointsonthePES(forexample,adissociationcurve)andtheFockmatrixforeachpoint,eachclassofmulti-centertwo-electronintegralswillcontributeaparticularpercentagetothefulldensitydependentFockmatrix.ThebestapproximationforaparticularclassofintegralsisthengivenbyhowwellitreproducestheaveragepercentageoftheFockmatrixforthecorrespondingabinitiosetofintegrals.Consider 84


4-1 arethescatterplotsforeachmulti-centercontributiontotheCoulombandexchangecontributionsoftheFockmatrixfromHFtheoryinaminimalbasis.TheC-Nreactioncoordinaterangesfrom1.0Ato5.0Ainstepsof0.2A.Theslopesofthecorrespondingbest-tlinesoftheone-,two-,three-andfour-centercontributionsare0.0063,0.5199,0.4522,and0.0216respectively,notethattheseslopesrepresentthefractionofthefullcontribution,andthattheirsumisone. Table 4-2 showsthestatisticalaveragesofthedensity-dependentcontributionstotheFockmatrixforseveralmoleculesalongthespeciedreactioncoordinate(Rxn)attheHFleveloftheoryinaminimalbasisset.Thedissociationinallcasesisfrom1.0Ato5.0Ainstepsof0.2A.Onaverage,thetwo-centertwo-electronintegralcomponentaccountsforroughlyhalfofthefullmatrixelementsandthethree-centertypeAintegralsrecoveranotherthird.Again,thecorrespondingenergeticcontributionswillnotexhibitthesameratios.Thepurposehereistovalidateindividualapproximationsmadetovariousmulti-centertwo-electronintegrals.Tothisendweconsidertheaveragecontribution:the 85


Theaprioribestapproximationtothetwotypesofthree-centercontributionsisthatofRudenberg.ShowninFigure 3-4 aretheerrorsintroducedbymakingapproximationsI-VI,forNMTdirectbondssionalongthecarbon-nitrogenreactioncoordinate.FortypeAandBthree-centerintegrals(3-CAand3-CB)theaveragematrixcontributionsare35.67%and4.98%respectively.TheRudenbergapproximationsIIandIVdierbyabouthalfapercentat35.12%and5.50%.ThoughthepercenterrorforbothapproximationsIIandIVareroughlythesame,thecorrespondingerrorintroducedenergeticallyistwoordersofmagnitudelargerforthe(AAjBC)three-centertermsthanforthe(ABjAC)three-centerterms.Incontrast,theRudenbergapproximationappliedtofour-centerterms(VI)is2.47%whichdiersfromthefour-centerabinitio(4-C)valueof2.39%byonly0.08%.TheerrorintroducedbythefullRudenbergfour-centerapproximation(VI)issurprisinglysmall.OnewouldexpectthatthepartialRudenbergapproximationwouldhavethebestagreementwiththeexactvalue,instead,atleastempirically,thefullRudenbergapproximationperformsbest,andhasthesmallesterrorrelativetotheaveragecontributiontotheFock-likematrix.Indeed,theeectthis0.08%dierencehasontheenergeticsofadissociationcurveiswithinanacceptablerangeandreproducestheenergeticswellforavarietyofdierentdissociationcurves.WewillfocusonthemoleculesinTable 4-2 thatinvolvethedissociationofaC-Nbond.PlottedinFigure 4-2 arethedissociationcurvesforNMT,nitroethane(NET),COHNO2,andCH3NH2attheB3LYPleveloftheoryinaminimalbasisset.Thereisacleartrendforthefour-centerRudenbergapproximationtooverestimatetheaveragecontributiontotheFock-likematrix.Theenergetics,ontheotherhand,donotshowthisclear-cuttrendsincetheenergyisloweredforNET,raisedforCH3NH2,andessentiallyunchangedforNMT.However,eventhoughthePESarenotpreciselyreproducedtheystillexhibitallthebasicfeaturesoftheab


WehavedemonstratedthattheMullikenandRudenbergapproximationsdonotapproximatethethree-centerone-ortwo-electronintegralswell.Theseclassesofintegralsposeachallengeastheyaretheonlyremainingtermsthatrequireexplicitintegrationsinceallone-centertermscanbetabulatedandtwo-centertermscanbeinterpolatedbycubicsplines,aswillbedemonstratedlater.However,basedontheGaussianproductrulethesetermscan,withoutapproximation,bereducedtotwo-centerterms.Inprinciplethencubicsplineinterpolationcouldbeusedtostoretheintermediatesandcompletelyavoidtheneedtodoanyintegralsexplicitly.TheGaussianproductruleis +(RARB)2e(+)(rRA+RB +)2(4{11) WecangenerateanauxiliarysetoforbitalsforeachGaussian-typeorbital-pairthathavetheorbitalexponent+.Also,becauseproductsofl=1(p-type)orbitalsareinvolvedthisauxiliarysetmustalsonowspanl=2(d-type).Theprefactorisnotincludedinthestoredorbitals,insteaditiscomputedon-the-ytoretainthetwo-centercharacterofthetermsinwhichitwillbeused. ThereisasignicantdicultywithusingtheGaussianproductinasimpleway.BecauseweworkincontractedGaussian-typeorbitalsthenumberoforbitalpairsincreasesasthenumberofcontractedfunctionssquared.ThecentersoftheseresultingproductGaussiansarenotnecessarilythesame,RA+RB 87


IfweconsiderthebasissetSTO-nG,asumofcontractedGaussiansapproximateaSlater-typeorbital.ThereisnoSlaterproductrulethatreducesaproductoftwoSlaterstoasingleSlater.Insteadwehave ConsiderasimpleexampleoftheproductoftwoGaussianfunctionsversustwoSlaterfunctionsseparatedbyoneBohr,whichisrepresentedbythebluecurveinFigure 4-3 .Thechallengeisnowclearer,theunderlyingbasisfunctionsweusearettoSlaterfunctions,andtheproductoftwoSlaterfunctionsdoesnotresembleasimpleGaussian,orevenasumofcontractedGaussians.TheSlaterproductisdenedexactlybytheintersectionoftwofunctions: where,for

Wehavedevelopedthestrategyandframeworkforeventuallyincludingthethree-centerterms.Fornowwearesatisedwithreducingthecomputationalcostthroughonlythefour-centerRudenbergapproximationbecausethefour-centertermsaccountformostofthecomputationaleortassociatedwithconstructionoftheFockmatrixforlargesystems. Equation 4{18 isexact.Itonlyrequiresanarbitrarypartitionofthedensityintoatomicparts,thoughconvergenceoftheseriesisaectedbythechoiceofpartition. Ourtargetistoincludecontributionsfromtheexchange-correlationpotentialtotheKSFock-likematrixinawaythatonlyrequiresthestorageofalimitedsetoftwo-center 89


4{18 thatcanberetainedarev1centerxcandv2centerxc.Furthermore,suchcontributionsoughttobeindependentofchemicalenvironmentandthuscompletelytransferable.Therefore,weinsistondensitiesthataretransferable.Thenaturalchoiceissphericallysymmetricatom-centereddensitiesofneutralatoms,A0A,recognizingwecanhaveandspincomponents.Basedontheseconsiderationswehaveanapproximateexchange-correlationpotential, ~vxc[XAA](r)XAvxc[0A](r)+XA>B[vxc[0A+0B](r)vxc[0A](r)vxc[0B](r)](4{19) Wedonotwanttomapandstorethereal-spaceexchange-correlationpotential,becauseoftheobvioushighdegreeofcomplexity.Instead,allthatiseverneededarethecontributionsfromthepotentialprojectedinthebasisset.TheFock-likematrixcontributionfromthisapproximateexchange-correlationpotentialis (eGxc)AB[]=hAj~vxc[]jBi(4{20) Restrictingthistocontributionsthatonlydependontwocentersyields,forthediagonalblock(A=B) (eGxc)AA=hAj~vxc[XA0A]jAi=hAj~vxc[0A]jAi+1 2XB6=AhAj~vxc[0A+0B]~vxc[0A]jAi(4{21) andfortheo-diagonalblock(A6=B) (eGxc)AB=hAj~vxc[XA0A]jBi=hAj~vxc[0A+0B]jBi(4{22) 90


2XC6=A;B1Xk[SAkBhkBj~vxc[0C]jBi+SkABhAj~vxc[0C]jkAi](4{23) Theleadingordertermsare (eG0xc)AB=8><>:hAj~vxc[0A]jAiifA=BhAj~vxc[0A+0B]jBiifA6=B(4{24) andaslightimprovementcouldbeachievedwiththeadditionof (eG1xc)AB=8><>:1 2PB6=AhAj~vxc[0A+0B]~vxc[0A]jAiifA=B1 2PC6=A;BP1k[SAkBhkBj~vxc[0C]jBi+SkABhAjvxc[0C]jkAi]ifA6=B(4{25) ShowninFigure 4-4 areresultsfortheGZERO(B3LYP)approximation:eG0xcisusedwiththeB3LYPenergyfunctionalinterpolatedoveralltheatompairsCNOH.TheerrorsintroducedarerelativelysmallandGZERO(B3LYP)appearstobegenerallytransferable. IntermsofcomputationalfeasibilityforGZEROwehaveintroducedasingleone-centermatrixperatom-typeandasingletwo-centermatrixperatom-pair.ForC,N,OandHatomtypesthenumberofone-centermatricesistrivialanddoesnotpresentachallenge,astheyarethedimensionofthenumberoforbitalsperatomsquared.Thetwo-centermatricesarestoredinthelocalcoordinateframeworkoftheatompairandareroughlythedimensionofthenumberoforbitalsonatomAtimesthenumberoforbitalsonatomB.Theactualdimensioncanbereducedbecausemostofthematrixelementsarezeroduetoorthogonality.Tointerpolatethevaluesofthesetwo-centermatrixelementscubicsplinesareused.Cubicsplinesrequireveparameterspernumberinterpolated.Weusedadaptivegrids(non-uniformlyspaced)thataredenedforeverypairtermfrom0.5 91


Inadditiontoapproximationsforintegralswealsodemonstratedapproximationsthatcanbeappliedtotheexchange-correlationpotentialcontributionoftheFock-likematrix,andtherebyavoidtheneedtoperformnumericalintegrationforeverySCFiteration.Extensionstoincludethree-centercontributionsfromthetelescopingseriesapproximationtothepotentialwerealsopresented. Thelastcomponentwewillconsiderinvolvestheresidualtotalenergy.Thoughitisworthwhiletoavoidnumericalintegrationtogeneratethepotentialandsubsequentmatrixcontribution,analone-timeevaluationofExc[]uponSCFconvergenceshouldnotbeabottleneck.Furthermore,asdemonstratedbyBartlettandOliphant[ 90 ]theexchange-correlationenergyfunctionalisrelativelyinsensitivetotheinputdensity.Indeed,aswehaveseenalreadyinFigures 4-2 and 4-4 theB3LYPenergyevaluationwasusedandintroducednosignicanterror. 92


2r21+XAZA allone-electrontermsarecalculatedexactlyastheyrepresentasmallportionofthetotalcomputationalcostforlargesystems.GiventhatweareusingahybridKS-DFTfunctionalwehaveafractionofexactexchangeinthedensitydependentFock-likematrix,forB3LYP,=0:2. (GJK1;2;3center)=XP((j)(j))(4{27) (GJK4center)=XP(^(j)^(j))(4{28) where 4Xk=1Xj=1[SAkBSCjD(kBBjjDD)+SAkBSjCD(kBBjCjC)+SkABSCjD(AkAjjDD)+SkABSjCD(AkAjCjC)](4{29) (G0xc)=8><>:hAjvxc[0A]jAiifA=BhAjvxc[0A+0B]jBiifA6=B(4{30) andtheresidualelectronicenergyis 2Tr[PG0xc]+Exc[P](4{32) Finally,showninFigure 4-5 areUHFdissociationcurvesforAATModelZero(AATM0),whereboththeRudenbergandGZERO(B3LYP)approximationshavebeenappliedsimultaneously.ThismodelreproducestheunderlyingB3LYPdissociationcurveswellespeciallywhencomparedtoAM1. ThecorrespondinggradientexpressionforAATM0isgivenbyEquation 4{34 .TheonlydierencebetweenstandardKS-typederivativesarisefromthemodiedfour-center 93


4Xk=1Xj=1[@SAkB d=2@S 2XP@f(G)1;2;3center+(G)4center]g 4-6 aretherelativetimingsfortheevaluationoftheabinitiofour-centerintegralsversustheRudenbergapproximation.Thesetimingsarepreliminaryandmoreworkcouldbedonetooptimizebothmethodsforevaluatingthisfour-centercontributions.However,itisclearthatthereisasignicantadvantagetousingtheRudenbergapproximationand 94


4-7 Screeningofsmalltwo-electronintegralscanleadtolinearscalingforlargesystems.SuchscreeningmethodscanbeappliedtotheRudenbergapproximationveryecientlybecauseoftheexplicitdependenceonoverlap. InaneorttominimizecomputationaleortwithbuildingtheFock-likematrixwehavemadeextensiveuseofstoringtransferableone-centerandtwo-centerterms.Inourcurrentimplementationallintegralsupthroughthree-centersarecalculatedexactly.Therearetworeasonsforexplicitlyevaluatingtheseintegrals:rst,theyareasmallcontributionwhencomparedtotheexplicitcalculationofthefour-centerterms;second,theapplicationoftheRudenbergorMullikenapproximationstothree-centerterms(decomposingthemintotwo-centerterms)introducessignicanterrors.Ifanaccurate,simpleandtransferableapproximationforthethree-centertermswereavailablethenalltermsinthemodelcouldbetwo-center.Itisastraightforwardmattertointerpolatefunctionsoftwocentersusingcubicsplines.Wehaveimplementedsuchtechniquesfortheevaluationofthematrixelementsoftheexchange-correlation:(G0xc)AB(RAB). Cubicsplinesareapplicablefornumerical2-centerfunctions,wheretheexactfunctionalformisnotknown,andforcomplicatedanalytic2-centerfunctions.Intherstcase,thebenetofsplinesistoprovideaninterpolationschemeforasetofempiricalreferencepoints.Inthesecondcasesplinesprovideacomputationallyecientframeworkfortheevaluationofotherwiseprohibitivelyexpensiveanalyticfunctions.Weareprimarilyinterestedinthelatter. Theabilitytostorealargenumberofparametersincomputermemory(RAM)isarequirementfortheecientimplementationofcubicsplines.Thememoryrequiredisdirectlydeterminedbythenumberofcubicsplineparameters.Apertinentexample 95


4-8 amodestspeed-upisachievedoverthealreadyfastNDDOFock-likematrixconstruction.Thesetimingsarefor16proteinsystemsupto20000orbitals.Usingcubicsplinesisupto15%fasterthanthemultipole-multipoleexpansion.Thereisanerrorintroduced,thoughitissmallandcontrollable.ForexampletheaverageerrorperatomisgiveninFigure 4-9 .Furthermore,whilegoingtohighermultipolesintheexpansionofintegralsisincreasinglyexpensive,cubicsplinescostthesameregardlessoftheunderlyingfunctionbeinginterpolated.However,thenumberofparametersinvolvedinatypicalNDDOmultipole-multipoleexpansionisrelativelysmallversusthenumberofparametersneededforcubicsplines.ThepremiumplacedonRAM(orcore)memorywhenNDDOmethodswererstdevelopedthirtyyearsagowouldhavemadecubicsplinesimpractical.Now,theamountofmemorytypicallyavailableonmostdesktopmachinesismorethansucienttomakecubicsplinesthemethodofchoicewhenevaluatingnearlyanytwo-centerintegralinquantumchemistry. Ageneralcubicsplinehastheform 96


91 ].Theonlydegreesofexibilitywhenusingcubicsplineinterpolationarethenumberofnodesusedandiftheyareevenlyspaced(discrete)orbasedonanadaptivegrid.Inourcase,forbothintegralsandmatrixelements,thefunctionsinallcasesgotozerointheseparatedlimitandatsmallerlengthscalesthefunctionrequiresadensegridtocompletelycapturethesubtlechangesthatoccurinthatregion.Adensenodespacingovertheentirelengthscaleisunnecessarybecausethefunctionschangeverylittleatlonglengthscales. Todetermineasuitablenodemappingfunctionwetriedseveraldierentfunctionalformsandconcludedthatasimpleexponentialfunctionprovidesagoodbalancebetweenhavingadensegridinthemorecriticalbondingregion(fromaround0.5Ato5A)andasparsegridatlongbondlengths(hundredsofA).Theactualnodemappingfunctionusedis Allofthefunctionswehavecurrentlyimplemented,thoseinvolvingtwo-centerexchange-correlationapproximations,arezerobeyond10A.Usingthenodemappingfunctionthenrequiresapproximately600referencepoints.Sincethereare5variablesperreferencepointweneed3000parametersthatwouldformalook-uptableforasinglefunction.Ifdoubleprecisionnumbersareusedtostorethesplineparametersthestoragerequirementisabout0.023MBperfunction.FurtherdetailsofthecurrentimplementationoftheAATprocedureintheparallelACESIIIprogramsystemcanbefoundinAppendixA. 97


92 ].TheG2setismadeupof148neutralmoleculeswithsinglet,doublet,andtripletspinmultiplicities,thoughwefocusonasubsetofthesethatcontainonlyC,N,O,andHbecausethecurrentimplementationofAATM0includesonlythesplineparametersgeneratedforthoseatomatom-pairs.Thereare71moleculesinthistruncatedsetrangingfrom2to14atoms. TheseparationofelectronicandnuclearcomponentsintheenergyexpressionofourAATapproachsetsitapartfromcommonlyusedSEmethods.Soitfollowsthatweexpectmoreconsistentresultsforpurelyelectronicproperties.TothisendwehaveconsideredtheverticalionizationpotentialsoftherelevantmoleculesintheG2set.Theverticalionizationpotentialispurelyelectronicasitdoesnotdependonthenuclear-nuclearrepulsion.WehaveusedthesimpledenitionEIP=E(N)E(N1).Thisrequiresthetotalenergyofthemoleculeand,sinceallmoleculesintheG2setareneutral,thecorrespondingcationwiththeproperlymodiedmultiplicity.WehavecomparedAATM0againsttheunderlyingB3LYPminimalbasissetresultsandndaaverageRMSDfromB3LYPforourentiretestsetof0.057Hartree(1.6eV).TheanalogouscomparisontoAM1yieldsanRMSDfromB3LYPof0.331Hartree(9.0eV).ThisclearlydemonstratestheabilityofAATM0toreproducetheelectronicstructureoftheunderlyingB3LYP.TheerrorofAM1fromB3LYPissubstantial,thoughamorevalidcomparisonforAM1wouldbeagainstexperimentalIPs,theimportantissuehereistheabilityourAATM0toreproducethephysicsofthemethoditisapproximating. Wenowconsiderthedipolemoment.ThedipoleRMSDdeviationofAATM0andAM1fromminimalbasissetB3LYPis0.219a.u.(0.56Debye)and0.595a.u.(1.51Debye)respectively.TherewerenosignerrorsobservedfortheorientationofthedipolemomentforeitherAATM0orAM1.ThoughtheerrorintroducedbyAATM0issignicantlysmallerthanthatofAM1,itislargerthanwouldbeexpectedgiventheperformanceofotherpropertiesgeneratedbytheAATM0procedure. 98


InadditiontotheaverageRMSDofthedensityofAATM0andfullHFweconsiderthestandarddeviationoftheRMSDofthedensityforeachmoleculeinthetestset.Thestandarddeviationsare0.042and0.057forAATM0andfullHFrespectively.ThissuggeststhatwhiletheoverallaverageRMSDissomewhathigherfromAATM0theerrorisslightlymoresystematic. TurningourattentiontoenergeticpropertiesweconsidertheatomizationenergiesoftherelevantportionoftheG2set.Wetakeastheatomicreferenceenergiestheneutral 99

PAGE 100

SinceonegoalofourapproachistoreproduceacomplexPESoflargesystemswehaveappliedtheAATM0approximationtodescribeasomewhatunphysicalreactionmechanismforC20inordertosubjecttheapproximationtoanextremecaseofbondbreaking.Thepseudo-reaction-pathcorrespondstosplittingtheC20moleculeinhalf,simultaneouslybreakingtenC-Cbonds.ShowninFigure 4-10 arethetotalenergiesofB3LYP,AATM0,AM1,andSCC-DFTB.Thereactioncoordinateisthedistancebetweentheend-capsandtheequilibriumvalueis3.223A,therangeisfrom2.8Athrough6A.AsexpectedweseethatAM1exhibitsunphysicalbehavior,andsomeconvergenceproblemsbeyond4.5A.TheSCC-DFTBactuallyperformsquitewellforC20,thisresultmaynotbetoosurprisingsincetherepulsiveenergyterminSCC-DFTBisparameterizedfromB3LYPtotalenergiesforC-CbondbreakingandtheenergyscaleisverysimilartothesumofC-Cbondenergies.TheAATM0overbindsthissystemsignicantlyandmayinfactbeconvergingtoadierentelectronicstate,astheeigenvaluespectrumdeviatessignicantlyfromtheB3LYPresult.Furtherworkisneededtodeterminethepreciseoriginofthisdiscrepancyandtocorrectit. ThenalcomparisonwemakeistotheRMSDofthegradientsfortheG2set.Wewillconsidertwosetsofdata,onefortheMP2equilibriumstructuresandtheotherfortheB3LYPminimalbasissetequilibriumstructures.ShowninFigure 4-11 aretheaverageRMSDoftheCartesiangradientsfromtheMP2referencestructurescalculated 100

PAGE 101

AbettercomparisonforhowwelltheAATM0approximatestheunderlyingB3LYPstructureisshowninFigure 4-12 .AttheB3LYPgeometriestheRMSDis0.05Hartree/Bohrand0.09Hartree/BohrforAM1andSCC-DFTB,respectively.WeseetheAATM0hasanaverageRMSDof0.006Hartree/Bohrwithastandarddeviationofonly0.003Hartree/Bohr.ThisclearlyshowsthatAATM0isreproducingtheB3LYPgradientswithexcellentaccuracy. Fastandaccurateapproximationstofour-centertwo-electronintegralsviaRudenberg(forCoulomborexactexchange)enablesubstantialsavingsforverylargesystems.SincetheRudenbergapproximationmerelyreducesthecostassociatedwithcalculatingeachfour-centertwo-electronintegral,methodsforscreeningtwo-electronintegralstoachieve 101

PAGE 102

TheFockbuildrequiresnonumericalintegration,buttheAATapproachdoesexplicitlyincludeexchangeandcorrelation.InthecurrentapproachwehaveusedtheminimalbasissetB3LYPexchange-correlationpotentialsasareference,furtherimprovementispossiblebyevaluatinglargebasissetabinitioDFTexchange-correlationpotentialsandsubsequentlyprojectingtheminaminimalbasissetrepresentation.ExtensionsoftheAATM0thatincludeamorecompleteexpansionofthetelescopingseriesfortheexchange-correlationpotentialhavealsobeendescribedandtheirimplementationwouldbeafurtherimprovementontheresultspresented. 102

PAGE 103

Adaptedintegrals (AAjBC)IMulliken3-centerTypeAIIRudenberg (ABjAC)IIIMulliken3-centerTypeBIVRudenberg (ABjCD)VMulliken4-centerVIRudenbergVIIMullikenPartialVIIIRudenbergPartial 103

PAGE 104

AveragepercentageofFockmatrixcontributionbyadaptedapproximation Rxn1-C2-C3-CAIII3-CBIIIIV4-CVVIVIIVIII COHNO2CN0.6155.1140.5941.8540.722.282.522.361.401.411.401.381.39NMTCN0.6351.9941.0742.3040.854.154.874.472.

PAGE 105

Scatterplotsofagreementbetweenapproximateandexacttwo-electronintegrals.A)One-center.B)Two-center.C)Three-center.D)Four-center 105

PAGE 106


PAGE 107

Orbitalproducts.A)Gaussianorbitalproduct.B)Slaterorbitalproduct 107

PAGE 108


PAGE 109


PAGE 110

TimingofabinitioversusRudenbergfour-centerintegrals 110

PAGE 111

Percentageofmulti-centertermsversussystemsize 111

PAGE 112

TimingofFockbuildusingtraditionalNDDOandNDDOwithcubicsplinesasafunctionofsystemsize 112

PAGE 113

Errorintroducedperatomfromcubicsplinesasafunctionofsystemsize 113

PAGE 114


PAGE 115

ForceRMSDfromMP2 115

PAGE 116

ForceRMSDfromB3LYP 116

PAGE 117

TheACESIIIprogramsystemenvironmenthasbeendesignedfortheecientparallelimplementationofwavefunctiontheorymethodsincomputationalchemistry.TheACESIIIprogramissuitedtotreatinglargesystems,typically500-1000basisfunctionsandupto300electronswithpost-HFabinitiomethods.Thesuperinstructionprocessor(SIP)managesthecommunicationandprocessingofblocksofdata,suchblocksareintendedtobesomewhatlarge(ontheorderof500,000oatingpointnumbers).ToprovideamoredirectrouteforimplementingnewmethodstheSIPreadsthecodewritteninthehighlevelsymboliclanguage,thesuperinstructionassemblylanguage(SIAL)(pronounced\sail").TheSIPhidesmanyofthemorecomplicatedaspectsofparallelprogramingallowingtheSIALprogrammertofocusonalgorithmandmethoddevelopment.WithinSIALeachindexofanarrayisdividedintosegments,thesegmentsmapelementsofthearrayintoblocks.Analgorithmtheninvolvesoperationsforwhichthesegmentisthefundamentalunitofindexing.Operationsthatinvolvethecontractionoftwoarrayscanthenbebrokenintoseveralsmallercontractionsovertwoblocks.Eachblock-paircontractioncanthenbedistributedovermanyprocessors,allowingfortheparallelimplementationofthecontractionoftwolargearrays. Ageneralissueinparallelcodesisensuringthatlatencydoesnotbecomearatelimitingstep.Thiscanbedonebymakingsurethattheamountofcomputationdoneisonaparwiththelatencyofanyoperation.Suchbalancingismachinedependent,soitrequiressomeamountoftestingtodetermineanoptimalsolutionforaparticularmachine.FortheimplementationoftheAATmethodologythisbalancingiscriticalandrequiresadierenttreatmentthanisusuallyusedforpost-HFmethods. ThestructureoftheAATequationsarequitedierentthanpost-HFequations.IntheformertheatomicindicesareusedastheprimaryvariableintheloopsusedtoconstructtheFock-likematrix.Forpost-HFmethodstheorbitalindicesaretheimportantvariable.Moreover,thenumberoforbitalsperatominAATissmallrelativetothe 117

PAGE 118

ThespeedoftheAATapproachisachievedbyavoidingtheexplicitcalculationofthenumerousandcostlyfour-centerintegrals.Instead,thesetermsareintroducedbyasumofcontractedtwo-centerterms.Initially,weusedsmallsegmentblocksizes.TheblocksizeultimatelydeterminesthecomputationaleciencybyaectinghowtheIO,orfetchingofdata,isdone.Intheseinitialstudieseachblockspannedonlyoneatom.Inthiscase,whenthetwo-electronintegralswereevaluateditwasstraightforwardtosimplyneglecttheintegralsthatinvolvedfouratoms(four-centers.)Thisprocedureworkedforsystemswithfewatoms(<20),but,asthesystemsizeincreased,thenumberofblocksrapidlyincreased,andlatencybecameanproblem. Animprovedapproachthatwasimplemented,whichismoreconsistentwiththeoriginalACESIIIdesignphilosophy,involvedremovingtheatom-spanningblocksizerestriction.Thiswasachievedbyrewritingtheoriginalroutinesthatreturnedalltwo-electronintegralsspanningfour-blockunitstoexplicitlyexcludeintegralsinvolvingfour-centers.Inaddition,theRudenbergapproximation,whichreliesoncontractionsbetweentheoverlapmatrixandNDDO-typetwo-centertwo-electronintegrals((AAjBB)),requiredmodicationtoavoidover-countingsometerms.Similartotheroutinethatexcludesfour-centerintegrals,themodiedroutinetogeneratetheintegralsusedbytheRudenbergapproximationexcludedallintegralsthatwerenotofthetype(AAjBB).Thesetwomodiedroutinesinvolvesomeoverheadthatcouldbereducedwithfurtherrenement.ThesemodiedroutinesarespecictotheAATapproachandarenotincludedinthestandardreleaseversionofACESIII. Sincetwo-electronintegralspossessaneight-foldspatialsymmetrytheoverallnumberofintegralsthatneedtobecalculatedcanbereducedbyroughlyafactorofeight.Todosorequiresrecognizingthesymmetryofeveryintegraltype.InACESIIItheproblemisslightlydierent.Insteadofhavingthiseight-foldsymmetrymanifestitselfbyAO 118

PAGE 119

Do EndIf(=) If6= Do (j)!(j);(j);(j);(j);(j);(j);(j) EndIf(6=and>) EndDo() EndIf(6=) If> [ApplyRudenbergapproximationtotheterms(j)andSj;k](j)!(j);(j);(j) Do EndIf(>and6=and6=) EndDo() EndIf(>and6=) EndDo() EndIf(>) EndDo() EndDo() InadditiontotheRudenbergapproximation,wehavealsoimplementedaninterfacetothestoredcubicsplineparametersneededtoconstructtheapproximationtotheexchange-correlationcontributiontotheFock-likematrix.Thiscomponentismuchfasterthanthetwo-electronintegralcomponent,thereforeitisnotaratelimitingstep. 119

PAGE 120

Anotheraspectthatneedstobeconsideredistherotationofthetwo-centermatrixelementsinalocalatom-paircoordinatesystemtotheglobalcoordinatesystemofthemolecule.Thisrotationisunambiguousoncetheglobalcoordinatesystemisspecied,andalthoughorbitalsofthemoleculecanberotatedsimultaneouslywithoutaectingtheenergetics,thecorrespondingmatrixelementswillchangeonrotation.ThisdegreeofexibilitymustbeincorporatedtoensurethattheFock-likematrixcontributionthatarisesfromeveryatom-paircorrespondstotheorbitalorientationoftheglobalcoordinatesystemofthemolecule.Toachievethisweuseasimpleprocedurethatgeneratesthematrixthatrotatesapairofatomiccoordinatessothattheyarealignedtotheaxisofthethestoredcubicspline,andthenperformthereverserotationonthatsub-blockoftheFock-likematrix. Finally,thethirdcomponentimplementedwasthenumericalintegrationthatisperformedoncetheSCFprocedurehasconverged.TheACESIIIprogramsystemgeneratesajobarchivele(JOBARC)thatcanbereadbyACESIIexecutables,assumingithasbeengeneratedonthesamecomputerarchitecture.TotesttheecacyoftheAATapproximations,thequickestroutewastomodifytheACESIIxintgrtexecutable.Thexintgrtmoduleperformsnumericalintegrationonadensitytodeterminetheexchange-correlationenergythatisneededfortheresidualelectronicenergycontributionofAAT.Eectively,xintgrtpullstheAOeigenvectormatrixfromtheJOBARC,aswellasotherinformationaboutthesystemthatwouldtypicallybegeneratedbythexvscf ksexecutableinACESIIwhenrunningaKS-DFTcalculation(e.g.occupationnumbers).SinceACESIIIdoesnotcreateallofthenecessaryrecordsintheJOBARC,somesmalladjustmentstothexintgrtroutineledtoanewmodiedexecutablethatperformedthe 120

PAGE 121


PAGE 122

[1] R.J.Bartlett,V.F.Lotrich,andI.V.Schweigert,J.Chem.Phys.123,062205(2005). [2] K.W.Sattelmeyer,J.Tirado-Rives,andW.L.Jorgensen,J.Phys.Chem.110,13551(2006). [3] J.McClellan,T.Hughes,andR.J.Bartlett,Int.J.QuantumChem.105,914(2005). [4] R.S.Mulliken,J.Chim.Phys.46,497(1949). [5] K.Rudenberg,J.Chem.Phys.34,1433(1951). [6] W.WeberandW.Thiel,Theor.Chem.Acc.103,495(2000). [7] M.Elstner,D.Porezag,G.Jungnickel,J.Elsner,M.Haugk,T.Frauenheim,S.Suhai,andG.Seifert,Phys.Rev.B58,7260(1998). [8] Q.Zhao,R.C.Morrison,andR.G.Parr,Phys.Rev.A50,2138(1994). [9] R.vanLeeuwenandE.J.Baerends,Phys.Rev.A49,2421(1994). [10] A.BesteandR.J.Bartlett,J.Chem.Phys.120,8395(2004). [11] A.BesteandR.J.Bartlett,J.Chem.Phys.123,154103(2005). [12] W.KohnandL.J.Sham,Phys.Rev.140,A1133(1965). [13] R.J.BartlettandG.D.PurvisIII,Int.J.QuantumChem.XIV,561(1978). [14] J.^C^zekandJ.Paldus,J.Chem.Phys.47,3976(1960). [15] J.PaldusandJ.^C^zek,J.Chem.Phys.54,2293(1971). [16] R.J.Bartlett,Annu.Rev.Phys.Chem.32,359(1981). [17] R.J.BartlettandM.Musial,Rev.Mod.Phys.79,291(2007). [18] T.Helgaker,P.Jrgensen,andJ.Olsen,MolecularElectronic-StructureTheory(Wiley,NewYork,2000). [19] R.J.Bartlett,TheoryandApplicationsofComputationalChemistry(Elsevier,2005),p.1191. [20] M.Levy,Proc.oftheNat.Acad.ofSci.76,6062(1979). [21] Q.Zhao,R.C.Morrison,andR.G.Parr,Phys.Rev.A50,2138(1994). [22] Q.ZhaoandR.G.Parr,Phys.Rev.A46,2337(1992). [23] Q.ZhaoandR.Parr,J.Chem.Phys.98,543(1993). 122

PAGE 123

D.J.Tozer,V.E.Ingamells,andN.C.Handy,J.Chem.Phys.105,9200(1996). [25] K.Peirs,D.VanNeck,andM.Waroquier,Phys.Rev.A67,012505(2003). [26] R.Homann,J.Chem.Phys.39,1397(1963). [27] M.J.S.Dewar,J.Friedheim,G.Grady,E.F.Healy,andJ.Stewart,Organometallics5,375(1986). [28] J.J.P.Stewart,J.Computat.Chem.10,209(1989). [29] J.J.P.Stewart,J.Computat.Chem.10,221(1989). [30] M.J.S.DewarandW.Thiel,J.Am.Chem.Soc.99,4899(1977). [31] M.J.S.Dewar,E.G.Zoebisch,E.F.Healy,andJ.J.P.Stewart,J.Am.Chem.Soc.107,3902(1985). [32] M.KolbandW.Thiel,J.Computat.Chem.14,775(1993). [33] M.J.S.DewarandW.Thiel,Theor.Chem.Acc.46,89(1977). [34] J.C.SlaterandG.F.Koster,Phys.Rev.94,1498(1954). [35] W.M.C.FoulkesandR.Haydock,Phys.Rev.B39,12520(1989). [36] R.J.Bartlett,I.Grabowski,S.Hirata,andS.Ivanov,J.Chem.Phys.122,034104(2005). [37] M.WolfsbergandL.Helmholz,J.Chem.Phys.20,837(1952). [38] C.J.BallhausenandH.B.Gray,Inorg.Chem.1,111(1962). [39] G.Klopman,J.Am.Chem.Soc.86,4550(1964). [40] K.Ohno,Theor.Chem.Acc.2,219(1964). [41] K.NishimotoandN.Mataga,Z.Physik.Chem(Frankfurt)12,335(1957). [42] G.Zheng,S.Irle,M.Elstner,andK.Morokuma,J.Phys.Chem.A108,3182(2004). [43] C.E.Taylor,M.G.Cory,R.J.Bartlett,andW.Thiel,Comput.Mat.Sci.27,204(2003). [44] R.J.Bartlett,J.Phys.Chem.93,1697(1989). [45] I.RossiandD.Truhlar,Chem.Phys.Lett.233,231(1995). [46] P.Charbonneau,Astrophys.J.(Suppl.)101,309(1995). [47] R.H.Byrd,L-BFGS-Bversion2.1,UniversityofColorado.(1997). 123

PAGE 124

ACESIIisaprogramproductoftheQuantumTheoryProject,UniversityofFlorida.J.F.Stanton,J.Gauss,J.D.Watts,M.Nooijen,N.Oliphant,S.A.Perera,P.G.Szalay,W.J.Lauderdale,S.R.Gwaltney,S.Beck,A.BalkovaD.E.Bernholdt,K.-KBaeck,P.Rozyczko,H.Sekino,C.Hober,andR.J.Bartlett.IntegralpackagesincludedareVMOL(J.AlmlofandP.R.Taylor);VPROPS(P.Taylor)ABACUS;(T.Helgaker,H.J.Aa.Jensen,P.Jrgensen,J.Olsen,andP.R.Taylor)(2006). [49] W.Thiel,MNDO97,Organisch-chemischesInstitut,UniversitatZurich,Winterhurerstrasse190,CH-8057Zurich,Switzerland(1997). [50] M.Dupuis,A.Marquez,andE.Davidson,HONDO99.6,basedonHONDO95.3,QuantumChemistryProgramExchange(QCPE),IndianaUniversity,Bloomington,IN47405(1999). [51] J.Li,F.Zhao,andF.Jing,J.Comput.Chem.24,345(2003). [52] R.Ahlrichs,M.Bar,M.Haser,H.Horn,M.Hser,andC.Klemel,Chem.Phys.Lett.162,165(1989). [53] M.HaserandR.Ahlrichs,J.Comput.Chem.10,104(1989). [54] F.Weigend,M.Haser,H.Patzelt,andR.Ahlrichs,Chem.Phys.Lett.294,143(1998). [55] P.PolitzerandJ.S.M.Eds.,TheoreticalandComputationalChemistryEnergeticMaterialsPart1.Decomposition,CrystalandMolecularProperties,vol.12(Elsevier,Amsterdam,TheNetherlands,2003). [56] M.J.S.Dewar,J.P.Ritchie,andJ.Alster,J.Org.Chem.50,1031(1985). [57] M.L.J.McKee,J.Am.Chem.Soc.108,5784(1986). [58] M.T.Nguyen,H.T.Le,B.Hajgato,T.Veszpremi,andM.C.Lin,J.Phys.Chem.A107,4286(2003). [59] W.F.Hu,T.J.He,D.M.Chen,andF.C.Liu,J.Phys.Chem.A106,7294(2002). [60] A.G.TaubeandR.J.Bartlett,J.Chem.Phys.accepted(2007). [61] W.D.Taylor,T.D.Allston,M.J.Moscato,G.B.Fazekas,R.Kozlowski,andG.A.Takacs,Int.J.Chem.Kin.12,231(1980). [62] H.A.TaylorandV.V.Vesselovsky,J.Phys.Chem.39,1095(1935). [63] R.Bukowski,K.Szalewicz,andC.F.Chabalowski,J.Phys.Chem.A103,7322(1999). [64] M.J.S.DewarandW.Thiel,J.Am.Chem.Soc.99,4907(1977). [65] W.Thiel,J.Mol.Struc.(Theochem)1-6,398(1997). 124

PAGE 125

W.WeberandW.Thiel,Theor.Chem.Acc.103,495(2000). [67] J.Stewart,ReviewsinComputationalChemistry(VCHPublishers,1990),vol.1,p.45. [68] W.Thiel,Adv.Chem.Phys.93,703(1996). [69] D.E.Taylor,K.Runge,andR.J.Bartlett,Mol.Phys.103,2019(2005). [70] W.ThielandA.Voityuk,Theor.Chem.Acc.81,391(1992). [71] A.Ourmazd,D.W.Taylor,J.A.Rentschler,andJ.Bevk,Phys.Rev.Lett.59,213(1987). [72] D.Muller,T.Sorsch,S.Moccio,F.Baumann,K.Evans-Lutterdodt,andG.Timp,Nature399,758(1999). [73] X.Y.Liu,D.Jovanovic,andR.Stumpf,AppliedPhysicsLetters86,82104(2005). [74] A.A.Demkov,J.Orteg,O.F.Sankey,andM.P.Grumback,Phys.Rev.B52,1618(1995). [75] L.StixrudeandM.S.T.Bukowinski,Science250,541(1990). [76] L.StixrudeandM.S.T.Bukowinski,Phys.Rev.B44,2523(1991). [77] M.MenonandK.R.Subbaswamy,Phys.Rev.B55,9231(1997). [78] M.Elstner,J.Phys.Chem.A111,5614(2007). [79] N.Otte,M.Scholten,andW.Thiel,J.Phys.Chem.A111,5751(2007). [80] M.P.Repasky,J.Chandrasekhar,andW.L.Jorgensen,J.Comput.Chem.23,1601(2002). [81] P.Pulay,Mol.Phys.17,197(1969). [82] B.Dunlap,J.Connolly,andJ.Sabin,J.Chem.Phys.71,3396(1979). [83] C.A.White,B.G.Johnson,P.M.W.Gill,andM.Head-Gordon,Chem.Phys.Lett.230,8(1994). [84] C.A.White,B.G.Johnson,P.M.W.Gill,andM.Head-Gordon,Chem.Phys.Lett.253,268(1996). [85] M.C.Strain,G.E.Scuseria,andM.J.Frisch,Science271,51(1996). [86] J.C.Burant,G.E.Scuseria,andM.J.Frisch,J.Chem.Phys.105,8969(1996). [87] M.ChallacombeandE.Schwegler,J.Chem.Phys.106,5526(1997). 125

PAGE 126

P.M.W.Gill,A.T.B.Gilbert,andT.R.Adams,J.Computat.Chem.21,1505(2000). [89] W.Koch,B.Frey,J.F.S.Ruiz,andT.Scior,Z.Naturforsch58a,756(2003). [90] N.OliphantandR.J.Bartlett,J.Chem.Phys.100,6550(1994). [91] C.S.Duris,ACMTOMS6,92(1980). [92] L.A.Curtiss,K.Raghavachari,P.Redfern,andJ.A.Pople,J.Chem.Phys.106,1063(1997). 126

PAGE 127

IwasborninMichiganbuthavedonetimeinOhio,Washington,Oregon,andnowFlorida.Mytumultuouscareerinsciencebeganinsixthgradewhenmyscienceteachercalledmeadreamer(IknowIamnottheonlyone)forproposingtheconstructionofanelevatortothemoontoavoidthecostsassociatedwithspaceight.Inhindsight,Ishouldnothavespeciedmyoriginaldesignasarigidstructure.ThatsameyearIwasawardedacerticateastheclassroom'sstrangestkid.Aftermakingthetransitiontohighschool,Idoubled-uponmathandscienceandnishedallavailableclassesinthoseareasbymyJunioryear,resultinginamisspentSenioryear.SomehowImanagedtogetintotheonlycollegetowhichIapplied.IarrivedaFreshmanlledwithyouthfuloptimismandpersuadedmyacademicadvisortoenrollmeinachemistrycourse2yearsbeyondmylevel.IcontinuedmySophomoreyearwithouttheyouthfuloptimism.IparticipatedintwoResearchExperienceforUndergraduatesprogramssponsoredbytheNationalScienceFoundationatCornellandtheUniversityofChicago.InChicagoIworkedforProf.KarlFreed,essentiallyspendingthesummerbuildingoverlycomplicatedinternalcoordinateinputles.ImusthaveenjoyeditbecausethatsamesummerItattooedmyselfwiththemarkofaquantumchemist,literally.Eventually,ItookaninterdisciplinaryBAinchemistryandphysicsfromReedCollegeinbeautifulPortland,Oregon,in2001. 127