Scaling effects on metal-oxide-semiconductor device characteristics


Material Information

Scaling effects on metal-oxide-semiconductor device characteristics
Physical Description:
vi, 142 leaves : ill. ; 29 cm.
Walstra, Steven V., 1970-


Subjects / Keywords:
Metal oxide semiconductors   ( lcsh )
Electrical and Computer Engineering thesis, Ph.D   ( lcsh )
Dissertations, Academic -- Electrical and Computer Engineering -- UF   ( lcsh )
bibliography   ( marcgt )
non-fiction   ( marcgt )


Thesis (Ph.D.)--University of Florida, 1997.
Includes bibliographical references (leaves 132-141).
Statement of Responsibility:
by Steven V. Walstra.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 028638920
oclc - 38746309







I would like to thank Prof. Chih-Tang Sah for his time and guidance as

chairman of my supervisory committee, and Dr. Arnost Neugroschel, Dr. Toshikazu

Nishida, Dr. Sheng Li, and Dr. Randy Chow for serving on my supervisory committee.

Additional thanks go to K. Michael Han for many insightful discussions and debates

concerning all aspects of device physics. I would also like to thank Dr. Changhong Dai,

Dr. Shiuh-Wuu Lee, Mary Wesela, and Jerry Leon for providing the devices,

measurement equipment, and technical expertise during my internship at Intel

Corporation where the intrinsic capacitance data were taken. Financial support from a

Semiconductor Research Corporation Fellowship is also gratefully acknowledged.



ACKNOWLEDGMENTS

ABSTRACT


1 INTRODUCTION


Introduction
Background
Long-Channel Theory
Pao-Sah Model
Bulk-Charge Model
Charge-Sheet Model
Comparison of Long-Channel Models
Two-Section Models
Beyond Two-Section Models
Examples Using Pao-Sah
Field-Matching Method
Saturation-Voltage Method
Surface-Potential Self-Saturation Method
In Search of the Match Point
Summary


Introduction
Metal-Gate CV
Polysilicon-Gate CV
Polysilicon-Gate Effects
Parameter Extraction Using the LFCV Model
3-Point Extraction Methodology
3-Region Extraction Methodology
Methodology Comparison
Convergence Speed-up Details

ON CIRCUIT PERFORMANCE

Introduction
Background
Measurement of Intrinsic Capacitances
Measurement Configurations
Sample Measurements
Channel Hot-Carrier Stress Effects on Cgd and Cgs
Intrinsic Capacitance Degradation Model
Degraded Circuit Simulation
Conclusion

5 SUMMARY AND CONCLUSIONS


REFERENCES

BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


Steven V. Walstra
December 1997

Chairman: Chih-Tang Sah
Major Department: Electrical and Computer Engineering

As metal-oxide-semiconductor (MOS) transistor dimensions are decreased,

channel-length modulation, polysilicon-gate depletion, and intrinsic-capacitance

degradation have increasingly larger impacts on transistor performance. It is

demonstrated that the Pao-Sah 1-D current model can be extended to include the channel-

length modulation effect by use of a two-section model. This two-section model employs

the normal long-channel Pao-Sah model in one region and adds a variable length

depletion region in the other. Three methods for matching the boundary between the two

regions are presented, with the best results coming from the most complex method of

matching the longitudinal fields at the boundary point.

The effect of polysilicon-gate depletion on the MOS low-frequency

capacitance-voltage (LFCV) characteristics is demonstrated using a Fermi-Dirac-based

model. It is shown that, as the oxide thickness decreases, the effect of polysilicon

depletion becomes increasingly pronounced. This depletion, in conjunction with the

Fermi-Dirac carrier distribution, offsets the current gain expected from thinning the MOS

gate oxide. With this polysilicon-gate LFCV model, it is shown that the oxide thickness,

flatband voltage, and gate and substrate doping concentrations can be extracted from

experimental capacitance data. Two extraction methods, the 3-point and 3-region, are

developed and are shown to work well with gate oxide thicknesses of 130 Å (2.7% RMS fit)

and sub-30 Å (10% RMS fit).

Voltage-accelerated stress is performed on state-of-the-art 0.24 μm effective-

channel-length nMOS and pMOS devices to assess the impact on the most important

intrinsic capacitances: Cgd and Cgs. The nMOS devices exhibit a Cgd reduction and Cgs

enhancement with stress time, whereas the pMOS devices show negligible change.

Because of Miller feedback, the nMOS Cgd reduction dominates the Cgs increase,

resulting in an overall CMOS capacitive load reduction. Pre-stress and post-stress ID,

Cgd, and Cgs data were fit using the BSIM3 device model. With the resulting parameter

sets, a 31-stage ring oscillator was simulated for three situations: unstressed devices,

stressed devices only including ID degradation, and stressed devices including ID, Cgd,

and Cgs degradation. It is shown that the inclusion of the intrinsic capacitance

degradation results in improved simulated circuit performance because the capacitive

load reduction offsets the drain current reduction. This improved degradation

methodology will result in looser guardbands and less reliability redesign.


The last three decades of production integrated circuits (IC) have seen two

orders of magnitude decrease in device dimensions, from 25 μm in 1962 to 0.25 μm in

1997 [1-3]. This continual reduction, fueled by requirements for higher switching speeds,

lower cost, and decreased power, has been sustained by improvements in lithography and

has resulted in increased areal and chip densities (transistors/cm² and transistors/chip).

Compared to the ~500 transistors/chip in the first experimental 64-bit static random-

access memory (SRAM) in 1965, the ~64M transistors/chip 64 Mbit dynamic random-

access memory (DRAM) of 1997 and the ~4G transistors/chip 4 Gbit DRAMs due from

NEC in 2000 typify the strong push toward increased density.

Increased areal density implies decreased dimensions. As transistor and

capacitor dimensions decrease, previously negligible effects have become or are

becoming increasingly important. Many of these effects were assumed avoidable through

constant-field scaling [4]. These scaling rules have been debated, amended, and

improved [5-7] to account for noise-margin, hot-electron, and extrinsic-capacitance

considerations, but present and future smaller dimensions have necessitated these effects

be included in the design process. Several of these effects are discussed below.

As channel lengths decrease, the thickness of the space-charge region at the

drain of a metal-oxide-semiconductor (MOS) transistor becomes a significant fraction of


the total channel length. As the drain voltage is changed, the space-charge-region

thickness also changes, resulting in an effective channel length which is drain-voltage

dependent, an effect known as channel-length modulation (CLM). This problem can be

tolerated for complementary MOS (CMOS) logic circuits, but needs to be properly

modeled in order to predict the drive current of the MOS transistors in the circuit in order

to estimate the speed of the resulting circuit.

As the density of transistors increases, so does the power density. This

requires a reduction in the operating voltage, since the active (switching) output power is

proportional to the square of the operating voltage ($P_{\text{active}} \propto f_{\text{clock}} C_o V^2$). To obtain the

same performance at lower voltages, the oxide thickness must be reduced. Simple MOS

theory predicts the drain current is inversely proportional to the gate oxide thickness.

However, for thin oxides (< 50 Å), depletion of the polysilicon gate offsets the effects of

thinner oxides, resulting in lower current and diminishing returns on oxide scaling.

Additionally, the gate voltage cannot be reduced indefinitely, because a large enough

margin is needed between the signal voltage and ground-plane noise to ensure that noise

does not change the state of the device.
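
As a rough worked illustration of the quadratic voltage dependence (the two supply voltages are assumed for the example and are not taken from the text), halving the supply at fixed clock frequency and load capacitance reduces the active power to one quarter:

$$\frac{P_{\text{active},2}}{P_{\text{active},1}} = \frac{f_{\text{clock}} C_o V_2^2}{f_{\text{clock}} C_o V_1^2} = \left(\frac{2.5\ \text{V}}{5\ \text{V}}\right)^2 = \frac{1}{4}$$
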

The increased density of transistors also requires more closely-spaced

interconnections between the transistors. Interconnect scaling has made delays due to the

interconnection a limiter in process speed [8], and major efforts are currently underway to

reduce the interconnect resistance and capacitance. Copper has recently been introduced

into 1998 production by IBM to reduce the interconnect resistance. Additional efforts

have been underway to lower the dielectric constant of the intermetal dielectrics in order

to reduce the interconnect capacitance. When the interconnect capacitance is reduced, the

only remaining capacitance left to slow the CMOS circuitry is the intrinsic capacitance of

the transistors, which cannot be easily reduced and will become the predominant speed


There are many other issues concerning the perpetual reduction in transistor

dimensions, not the least of which is the brick wall of atomic dimensions. Clearly transistors

cannot be scaled to less than ten or twenty atoms and still work in the traditional sense of

transistors, yet this dissertation includes data from a transistor pushing the atomic limit

with a gate insulator thickness of less than 30 Å, or under six atomic layers of silicon and

oxygen. The goal of this dissertation is to investigate the issues described in the previous


Chapter 2 discusses the history of 1-dimensional drain current models and

some of the methods which have been implemented to extend these models to include the

CLM effect. The Pao-Sah model, the most accurate long-channel current model, will be

extended to include the CLM effect using three different approaches. The CLM effect (as

demonstrated in the new models) will be discussed, as well as the pros and cons of the


Chapter 3 tackles the polysilicon depletion problem by deriving the Fermi-

Dirac-statistics-based polysilicon-gate MOS low-frequency capacitance model, including

the effect of dopant impurity deionization. By comparing this with the traditional metal-

gate model, the effect of polysilicon gate depletion will be shown to increase significantly

as the oxide thins. With this model, a parameter extraction methodology is presented

which allows the extraction of substrate and gate doping concentrations as well as the

oxide thickness and flatband voltage from experimental LFCV data. Two methodologies

will be presented and compared, and data from thick (130 Å) and thin (< 30 Å) gate-oxide

devices will be used. Additional oxide thickness issues, such as quantum effects, are also


Chapter 4 considers the intrinsic capacitances, in particular, those most

important in modern complementary MOS (CMOS) circuits: Cgd and Cgs. Compared to

the drain current, which is also an intrinsic property of a MOS transistor, intrinsic

capacitances have been relatively ignored because of measurement difficulty and

relatively small impact compared to extrinsic capacitances. However, as processing and

dielectric technology advances, the primary remaining capacitive load in CMOS circuits

will be the intrinsic capacitances. The chapter presents an experimental investigation

of how these capacitances change with hot-carrier stress and, after modeling the stress-

induced changes in the intrinsic capacitances, shows that part of the drain current

degradation is offset by the intrinsic capacitance reduction, resulting in a slower

degradation of overall circuit performance.



The simplest 1-D model is of crucial importance for applications in

semiconductor physics. Although 3-D models will best match experimental data because

of both inclusion of real effects and simply additional variables, they may be intractable

as compact device models, where computational efficiency is critical. Conversely, these

3-D models are often validated by demonstrating their reduction to the rigorous 1-D

forms for non-critical (wide and long channels with thick oxides) geometries. For back-

of-the-envelope calculations, knowledge of the basic physics embodied in a good 1-D

model is exceedingly useful.

The required accuracy of a model is largely determined by the application.

For predicting the drive current, such as might be required for a discrete-transistor

specification sheet, a model need not worry about the linear or subthreshold regions of

operation. Similarly, if modeling only the operating range (0 to power supply voltage),

then the accumulation region of applied gate voltages can be ignored in the model. There

are cases, particularly when attempting to predict the performance of new technology,

where 3-D full-range MOSFET models are necessary, but they are a relative minority

compared to the wide array of applications for 1-D models.

This chapter contains a brief history of one-dimensional (1-D) approaches to

drain current models, including calculations and comparisons, followed by a new two-

section model using the 1-D Pao-Sah long-channel IV model in conjunction with a

variable-length depletion region. The goal is to extend the 1-D long-channel model to

short-channel use.


In 1926 Lilienfeld [9] submitted the patent for the first MOSFET device, an

Al/Al2O3/Cu2S transistor. More than three decades later, in 1960, Kahng and Atalla [10]

fabricated the first silicon MOS transistor. A year later, in 1961, the first MOST current-voltage

(IV) papers were published internally at AT&T Bell Labs by Kahng [11] and

later at Stanford by Ihantola [12]. These were followed in 1964 by more complete (and

widely released) 1-D theories by Sah [13] and Ihantola and Moll [14]. A comprehensive

history of MOS developments is given by Sah [1]. In the years since

the first MOST model, hundreds of papers and theses have been written about the

modeling of various aspects of MOS transistors. This chapter will discuss the prevailing

1-D models including Pao-Sah, bulk-charge, charge-sheet, and the many two-section


Long-Channel Theory

"Long channel" is a term used to specify that short-channel effects can be

neglected when modeling MOSTs, and the predominant short-channel effect is

encroachment of the drain depletion region into the channel. The depletion region exists

due to the reverse-biased p/n junction between the substrate and the drain, and has

nothing to do with the actual channel length. For long-channel devices, however, the

amount of encroachment relative to the channel length is small, so the effective channel

length is essentially constant (equal to the drawn gate length). For short channels,

however, the effective channel length can be significantly reduced by the encroachment.

Another short-channel effect neglected in long-channel theory is drain-induced barrier

lowering [15], where the source barrier is lowered by the applied drain voltage.
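
To give a feel for why the same encroachment matters only for short channels, the sketch below estimates the drain depletion width with the textbook one-sided abrupt-junction formula, W = sqrt(2 εs (Vbi + VDS) / (q N)). This formula, the built-in potential of 0.9 V, the substrate doping of 1e17 cm^-3, and the function names are all assumptions introduced for illustration; they are not taken from the dissertation, and the estimate ignores the gate's influence on the depletion region.

```python
import math

Q = 1.602e-19                  # electron charge (C)
EPS_SI = 11.7 * 8.854e-14      # permittivity of silicon (F/cm)


def drain_depletion_width_um(v_ds, n_sub_cm3, v_bi=0.9):
    """One-sided abrupt-junction depletion width in micrometers.

    v_ds      reverse bias on the drain/substrate junction (V)
    n_sub_cm3 uniform substrate doping (cm^-3), assumed
    v_bi      built-in potential (V), assumed typical value
    """
    w_cm = math.sqrt(2.0 * EPS_SI * (v_bi + v_ds) / (Q * n_sub_cm3))
    return w_cm * 1.0e4


def encroachment_fraction(v_ds, length_um, n_sub_cm3=1.0e17):
    """Growth of the depletion width between 0 V and v_ds, relative to L."""
    delta_um = (drain_depletion_width_um(v_ds, n_sub_cm3)
                - drain_depletion_width_um(0.0, n_sub_cm3))
    return delta_um / length_um


if __name__ == "__main__":
    for length_um in (10.0, 0.25):
        frac = encroachment_fraction(2.5, length_um)
        print(f"L = {length_um} um: delta-W / L = {frac:.3f}")
```

With these assumed values the depletion width grows by roughly 0.1 μm between 0 V and 2.5 V, which is negligible against a 10 μm channel but a large fraction of a 0.25 μm channel.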

Pao-Sah Model

The most accurate long-channel theory was published by Pao and Sah (PS)

[16]. The PS model is the only one which correctly accounted for drift and diffusion. The

PS theory, to be discussed below, contains a double integral, but can be reduced to a more

efficient form containing only single integrals [17, 18]. Although cumbersome to

calculate, the PS double integral is extremely didactic and is a useful starting point for

showing the approximations used to derive other long-channel IV models. The total

current flowing in the channel is given by the integral

$$I_D = \int J(x,y)\,Z\,dx, \tag{2.1}$$

where

$$J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y + qD_n\nabla N = qD_n N\nabla\xi. \tag{2.2}$$

$J_N$ and $J_P$ are the electron and hole current densities, respectively, and (2.2) assumes that

the current in an n-channel device is dominated by electrons. The electron charge

is $q$, $\mu_n$ and $D_n$ are the electron mobility and diffusivity, respectively, and $\nabla N$ is the

gradient of the electron concentration. The electron quasi-Fermi level, $\xi$, is measured

relative to the bulk Fermi level and normalized to kT/q.

If $d\xi/dx$ is assumed negligible (which is a fundamental assumption in the

long-channel approximation and should be valid to a depth on the order of the drain

junction depth), then ID can be found from summing up all the current from the surface

down to some depth $x_i$ below which the additional contribution is negligible:

$$I_D = qD_nZ\,\frac{d\xi}{dy}\int_0^{x_i} N(x)\,dx$$

This can be transformed from physical space in the y direction to potential space as

$$\int_0^L I_D\,dy = qD_nZ\int_0^{U_D} d\xi \int_0^{x_i} N(x)\,dx \tag{2.3}$$

where $U_D = qV_{DS}/(kT)$ is the normalized drain voltage at y=L and the lower limit 0 is the

grounded source voltage at y=0. A similar transform in the x direction yields:

$$I_D = qD_n\,\frac{Z}{L}\int_0^{U_D} d\xi \int_{U_F}^{U_S} \frac{N(U)}{(-dU/dx)}\,dU \tag{2.4}$$

where (dU/dx) is the x-component of the electric field, which can easily be derived by

integrating Poisson's equation by quadrature and is given below. The Boltzmann

approximation to the carrier concentration is being used and the impurities are assumed

completely ionized, but the Fermi-Dirac and deionized forms can also be used. $U_S$ is the

normalized surface potential (where surface is at x=0), the total amount of surface band

bending relative to the intrinsic Fermi level. It is a function of both the gate voltage and

the drain voltage. $U_F$ is the normalized bulk Fermi level, below which the current

contribution is assumed negligible, and is analogous to the physical point $x=x_i$ in

(2.3). The derivative (dU/dx) is found from

$$(-dU/dx) = F(U,\xi,U_F)/L_D, \tag{2.5}$$

where

$$F(U,\xi,U_F) = \left[\exp(U-\xi-U_F) + \exp(U_F-U) + (U-1)\exp(U_F) - \left(U+\exp(-\xi)\right)\exp(-U_F)\right]^{1/2} \tag{2.6}$$

After applying Einstein's relationship, $D_n/\mu_n = kT/q$, (2.4) becomes

$$I_D = \left(\frac{kT}{q}\right)^2 \frac{Z\mu_n\varepsilon_s}{2LL_D}\int_0^{U_D}\int_{U_F}^{U_S} \frac{\exp(U-\xi-U_F)}{F(U,\xi,U_F)}\,dU\,d\xi \tag{2.7}$$

The surface potential, $U_S(\xi)$, is needed in (2.7). The relationship between the surface

potential and the gate voltage can be found by applying Gauss's Law at the

semiconductor/insulator interface. The resulting equation, given below, can be solved

iteratively for $U_S$ for a given $\xi$.

$$U_G = U_S + \mathrm{sign}(U_S)\,\gamma F(U_S,\xi,U_F) \tag{2.8}$$

where $U_G$ is the normalized gate voltage, $q(V_{GS}-V_{FB})/kT$; $\gamma$ is $\varepsilon_s/(L_DC_o)$; $L_D$ is the Debye

length, $\sqrt{\varepsilon_s kT/(2n_i)}/q$; and $F(U_S,\xi,U_F)$ is given by (2.6).

Equation 2.7 is the traditional form of the PS integral, often called the Pao-Sah

double integral. A more computationally friendly and accurate single-integral form [17]

was used for the calculations in this dissertation. The mobility in (2.7) need not be taken

out of the integrals. Instead, it can be a function of the vertical and lateral fields and moved

inside of the integrals. In this chapter the mobility will be assumed independent of field.

A good way to understand Eq. 2.7 is to consider the three-dimensional band

structure of a MOST under gate and drain bias, as shown in Figures 2.1-2.4, based on the

original Pao-Sah paper [16]. Figure 2.1 shows an idealized n-channel MOST. Figure 2.2

shows the corresponding energy band diagram with no applied terminal voltages except

VGS=VFB. From the position of the Fermi level it is easily verified that the source and drain

are n-type and the substrate is p-type (n-channel device). Electrons in the source and drain

see a potential barrier toward the channel.

Application of a positive voltage to the gate lowers the barrier near the surface,

as shown in Figure 2.3. The applied gate voltage pulls electrons toward the surface (and

pushes holes away from the surface), as can be seen from the position of the Fermi-level

relative to the band edges. Farther into the substrate (away from the gate/substrate

interface) there is no bending from the gate potential, so the region is identical to the

unbiased case (Figure 2.2) and considered quasi-neutral.

Applying a voltage to the drain (VDS < VDSsat) splits the Fermi level into quasi-

Fermi levels (FN for electrons and FP for holes), as shown in Figure 2.4. One can imagine

an electron in the conduction band surmounting the source barrier and then falling down the

potential 'cliff' until reaching the drain. This 'free fall' is where the electron gains energy

while moving across the channel. If the electron is not scattered while moving across the

channel (losing energy to the lattice via phonons), it becomes increasingly energetic as it

approaches the drain and may become 'hot' enough to produce an e-h pair via impact ionization. The

resulting hole may generate interface traps via dehydrogenation of Si-H bonds near the

Si/SiO2 interface [19]. This is only one of several mechanisms for interface trap generation.



Fig. 2.1 Simplified view of two-dimensional MOS device.




Fig. 2.2 Schematic 2-D energy band diagram of simple MOS device with source
and drain grounded and VGS=VFB. Adapted from Pao and Sah [16].





Fig. 2.3 Schematic 2-D energy band diagram of simple MOS device with VGS >
VFB, drain and source grounded. Adapted from Pao and Sah [16].

Fig. 2.4 Schematic 2-D energy band diagram of simple MOS device with VGS >
VFB, 0 < VDS < VDSsat, and source grounded. Adapted from Pao and Sah
[16].

Figure 2.5 shows the result of applying a drain voltage in excess of VDSsat. As

will be discussed in the two-section model section later, the drain depletion region becomes

increasingly longer as the reverse-biased drain voltage increases. For this long-channel

section of the dissertation, however, the change in length, ΔL, is assumed much less than the

channel length L. The voltage drop across this thin depletion region often results in large

fields which can greatly accelerate carriers, causing the interface damage mentioned above.

Now that the effect of applied biases on the 2-D structure of the band has been

discussed, it is easy to see the basis of the integral limits in Equation 2.7. The inner integral

is integrating from the surface into the bulk (from Us to UF), which is a cross section of the

channel as shown in Figure 2.6. The outer integral is integrating from drain to the source

(UD to 0, source is grounded) along the channel. Thus, the double integral is summing up

all the current contribution in the channel, exactly as would be expected. Since Us is a

function of the drain voltage (or the channel potential), the order of the double integration is

not trivially reversible.
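
The structure of the double integral can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the method used in the dissertation (which relies on the single-integral form [17]): it evaluates Eq. 2.7 by brute force, solving Eq. 2.8 for the surface potential by bisection at each value of the quasi-Fermi level and integrating with Simpson's rule. The intrinsic density, the mobility value, and all function names are assumed for the example; the oxide thickness, temperature, doping, and W/L follow the 500 Å device discussed later in the chapter.

```python
import math

# Constants and illustrative device parameters (n_i and mu_n are assumptions).
Q = 1.602e-19                  # electron charge (C)
KT_Q = 8.617e-5 * 296.0        # thermal voltage at 296 K (V)
EPS_SI = 11.7 * 8.854e-14      # silicon permittivity (F/cm)
EPS_OX = 3.9 * 8.854e-14       # oxide permittivity (F/cm)
N_I = 1.0e10                   # intrinsic carrier density (cm^-3), assumed
P_XX = 1.0e15                  # substrate doping (cm^-3)
X_OX = 500e-8                  # oxide thickness (cm)
MU_N = 600.0                   # field-independent mobility (cm^2/V-s), assumed
Z_OVER_L = 10.0

U_F = math.log(P_XX / N_I)                         # normalized bulk Fermi level
C_O = EPS_OX / X_OX                                # oxide capacitance (F/cm^2)
L_D = math.sqrt(EPS_SI * KT_Q / (2.0 * Q * N_I))   # Debye length (cm)
GAMMA = EPS_SI / (L_D * C_O)


def f_field(u, xi):
    """Normalized field function F(U, xi, U_F) of Eq. 2.6."""
    radicand = (math.exp(u - xi - U_F) + math.exp(U_F - u)
                + (u - 1.0) * math.exp(U_F)
                - (u + math.exp(-xi)) * math.exp(-U_F))
    return math.sqrt(max(radicand, 0.0))   # guard against round-off near U = 0


def surface_potential(u_g, xi):
    """Solve Eq. 2.8 for U_S by bisection (valid for U_G > 0)."""
    lo, hi = 0.0, u_g
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid + GAMMA * f_field(mid, xi) < u_g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def inner_integral(u_g, xi):
    """Inner integral of Eq. 2.7: from U_F to the surface value U_S(xi)."""
    u_s = surface_potential(u_g, xi)
    if u_s <= U_F:
        return 0.0
    return simpson(lambda u: math.exp(u - xi - U_F) / f_field(u, xi),
                   U_F, u_s, 400)


def drain_current(v_g, v_ds):
    """Pao-Sah drain current (A); v_g is V_GS - V_FB, source grounded."""
    if v_g <= 0.0:
        raise ValueError("sketch only covers V_GS - V_FB > 0")
    u_g, u_d = v_g / KT_Q, v_ds / KT_Q
    n_outer = max(40, 2 * int(u_d))        # even, step of about kT/2q or finer
    double_integral = simpson(lambda xi: inner_integral(u_g, xi),
                              0.0, u_d, n_outer)
    prefactor = MU_N * KT_Q ** 2 * Z_OVER_L * EPS_SI / (2.0 * L_D)
    return prefactor * double_integral
```

Because the inner limit U_S is recomputed for every value of the outer variable, the sketch makes explicit why the two integrations cannot simply be swapped.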

Bulk-Charge Model

The first group of ID models, in order of complexity, were by Sah [13], Ihantola

and Moll [14], and Sah and Pao [20]. These are all bulk charge models, taking increasingly

more into account. As the name suggests, the bulk charge model takes the depleted region

under the channel (in the bulk) into account. It assumes drift is the major component and so

neglects the diffusion component. This greatly simplifies the problem and reduces (2.2) to

$$J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y = q\mu_n N(x)\,(dV/dy) \tag{2.9}$$


Fig. 2.5 Schematic 2-D energy band diagram of simple MOS device with VGS > 0,
VDS > VDSsat, and source grounded. Adapted from Pao and Sah [16].

[Figure panels: E versus X near source; E versus X near drain.]

Fig. 2.6 Schematic 2-D energy band diagram of simple MOS device with VGS > 0
and VDS < VDSsat. Cross-sections show the 1-D energy-band diagrams
near the source and drain. G-Gate electrode, X-Substrate electrode.


$$I_D = \mu_n Z\,(dV/dy)\int qN(x)\,dx \tag{2.10}$$

or

$$I_D = -\mu_n Z\,(dV/dy)\,Q_N \tag{2.11}$$

where

$$Q_N = -C_o(V_G - V - V_{S0}) + (2qP_{XX}\varepsilon_s)^{1/2}\,[V_{S0} + V]^{1/2} \tag{2.12}$$

$C_o$ is the oxide capacitance per unit area, $V_G$ is $V_{GS} - V_{FB}$, $V_{S0}$ is the surface potential at

the source, and $V$ is the channel potential ($=V_{DS}$ at the unsaturated drain). $P_{XX}$ is the

substrate impurity concentration and $\varepsilon_s$ is the permittivity of silicon. The first term is

the charge accumulated in the channel and the second term is the uncompensated charge in

the depletion region beneath the channel (i.e. bulk charge). Substituting (2.12) into (2.11) and integrating along the

channel gives:

$$I_D = \mu_n (Z/L)\,C_o\left\{(V_G - V_{S0})V_{DS} - V_{DS}^2/2 - (1/C_o)(2/3)(2qP_{XX}\varepsilon_s)^{1/2}\left[(V_{S0} + V_{DS})^{3/2} - (V_{S0})^{3/2}\right]\right\} \tag{2.13}$$

This form is slightly different from the Sah-Pao and Ihantola-Moll forms because it is not

assumed that $V_{S0}=2V_F$, where $V_F$ is the Fermi voltage. A more exact form [17] is:

$$I_D = \mu_n (Z/L)\,C_o\left\{V_G(V_{SL} - V_{S0}) - (1/2)(V_{SL}^2 - V_{S0}^2) - (2/3)(1/C_o)(2qP_{XX}\varepsilon_s)^{1/2}\left[(V_{SL})^{3/2} - (V_{S0})^{3/2}\right]\right\} \tag{2.14}$$

where $V_{SL}$ is the surface potential at the drain. This differs from (2.13) in that the surface

potential at the drain is calculated instead of assumed to be $V_{SL}=V_{S0} + V_{DS}$. When the

drain current approaches or exceeds saturation ($V_{DS} > V_{DSsat}$), $V_{SL} \neq V_{S0} + V_{DS}$.

Additionally, in subthreshold, $V_{SL}$ is typically closer to $V_{S0}$ than $V_{S0} + V_{DS}$ [21]. As will

be shown later, the bulk charge formula should never be used for subthreshold calculations

since it neglects diffusion, which is the primary subthreshold current contribution.

The bulk charge form, compared to PS, is considerably easier to calculate,

particularly when using (2.13) with Vso = 2VF, but is invalid in subthreshold. Equation

2.13 is also invalid in saturation as written, but that can be fixed somewhat by calculating

the saturation voltage VDSsat and fixing the current for all drain voltages greater than VDSsat.

This will make the first derivative (drain conductance) non-continuous at VDS=VDSsat. All

saturation problems are solved in (2.14), where the calculation of VSL negates these

problems. Iterative calculation of VSL is time consuming, particularly compared to

assuming a constant, or pinned, surface potential value.
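
The pinned-surface-potential version of the bulk-charge model is simple enough to sketch in a few lines. The code below evaluates Eq. 2.13 and clamps the current above the saturation voltage, as described above. The closed form used for VDSsat comes from setting the inversion charge of Eq. 2.12 to zero at the drain; that derivation step, the mobility value, the intrinsic density used in the usage example, and the function name are assumptions made for this illustration rather than details given in the text.

```python
import math

Q = 1.602e-19                  # electron charge (C)
EPS_SI = 11.7 * 8.854e-14      # silicon permittivity (F/cm)
EPS_OX = 3.9 * 8.854e-14       # oxide permittivity (F/cm)


def bulk_charge_current(v_g, v_ds, v_s0, p_xx=1.0e15, x_ox_cm=500e-8,
                        mu_n=600.0, z_over_l=10.0):
    """Drain current (A) from Eq. 2.13 with a pinned surface potential v_s0.

    v_g is V_GS - V_FB. The current is held constant for V_DS > V_DSsat.
    mu_n is an assumed, field-independent mobility.
    """
    if v_g <= 0.0:
        return 0.0
    c_o = EPS_OX / x_ox_cm
    k = math.sqrt(2.0 * Q * p_xx * EPS_SI) / c_o      # body factor (V^0.5)

    # Saturation: Q_N of Eq. 2.12 vanishes at the drain, i.e.
    # x + k*sqrt(x) - v_g = 0 with x = v_s0 + V_DSsat.
    root = 0.5 * (-k + math.sqrt(k * k + 4.0 * v_g))
    v_dsat = root * root - v_s0
    if v_dsat <= 0.0:
        return 0.0      # below threshold: the model predicts no current

    v = min(v_ds, v_dsat)
    brace = ((v_g - v_s0) * v - 0.5 * v * v
             - (2.0 / 3.0) * k * ((v_s0 + v) ** 1.5 - v_s0 ** 1.5))
    return mu_n * z_over_l * c_o * brace


if __name__ == "__main__":
    # Pinned surface potential V_S0 = 2*V_F, with n_i = 1e10 cm^-3 assumed.
    v_s0 = 2.0 * (8.617e-5 * 296.0) * math.log(1.0e15 / 1.0e10)
    for v_ds in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(v_ds, bulk_charge_current(5.0, v_ds, v_s0))
```

The hard clamp at VDSsat is what produces the discontinuous drain conductance mentioned above, and the early return for a non-positive VDSsat reflects the model's inability to describe subthreshold current.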

Charge-Sheet Model

While most of the interest centered on super-threshold operation of the MOST,

some people became concerned with the lack of accurate modeling for subthreshold

operation. Barron [21] and Van Overstraeten et al. [22] developed subthreshold formulae

based on simplifications of the Pao-Sah integral, with results applicable only to the

subthreshold region.

Six years later, Brews [23] made a critical approximation which would allow both

drift and diffusion components to be introduced simultaneously without the need for a

double (or single) integral. When he proposed his "charge-sheet model," he introduced the

following simplification:

$$I_D = qZ\mu_n N(y)\,(d\xi/dy)$$

$$d\xi/dy = d\xi_s/dy - (1/\beta)\,d\ln(N)/dy \tag{2.15}$$

This approximation for $d\xi/dy$ was justified "based upon its success in producing 'correct' I-

V curves," although he added a footnote relating the formula to electrochemical potential.

This wide-open statement resulted in several subsequent 'proofs' which derived the same

formula [17, 24, 25]. Essentially, though, he decoupled the drift and diffusion components

from the tight interdependency seen in the Pao-Sah form to the simple form of (2.15).

Through a similar derivation to bulk-charge, ID is given by

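$$I_D = \mu_n (Z/y)\Big\{C_o(1/\beta + V_G)\left(V_S(y) - V_{S0}\right) - (1/2)\,C_o\left(V_S^2(y) - V_{S0}^2\right)$$

$$\qquad - (2/3)\left(2qP_{XX}\varepsilon_s/\beta^3\right)^{1/2}\left[(\beta V_S(y) - 1)^{3/2} - (\beta V_{S0} - 1)^{3/2}\right]$$

$$\qquad + \left(2qP_{XX}\varepsilon_s/\beta^3\right)^{1/2}\left[(\beta V_S(y) - 1)^{1/2} - (\beta V_{S0} - 1)^{1/2}\right]\Big\} \tag{2.16}$$

Eq. 2.16 reduces to bulk-charge form of Eq. 2.14 if $V_G$, $V_S(y)$, $V_{S0} \gg 1/\beta$ (with $\beta = q/kT$) and the square

root terms are negligible. Unlike bulk-charge, this formula is valid in subthreshold and does

not require a calculation of VDSsat (assuming VSL and VS0 are calculated iteratively). Like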

bulk-charge, this is much easier to calculate than a double, or even single, integral.

Brews, and many subsequent authors, validated the charge-sheet model by

comparing it to the results of the Pao-Sah formula. It has been shown to be an excellent

approximation, as will be discussed in the next section.
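
The relationship between the charge-sheet and bulk-charge expressions can be checked numerically. The sketch below writes the charge-sheet current in its commonly quoted form, using this chapter's symbols with β = q/kT, and compares it with Eq. 2.14 evaluated at the same surface potentials. The two surface potentials passed in are placeholder values standing in for the iteratively computed VS0 and VSL; they, the mobility, and the function names are assumptions for illustration only.

```python
import math

Q = 1.602e-19                  # electron charge (C)
EPS_SI = 11.7 * 8.854e-14      # silicon permittivity (F/cm)
EPS_OX = 3.9 * 8.854e-14       # oxide permittivity (F/cm)


def _oxide_and_body(p_xx, x_ox_cm):
    """Return (C_o, a) with a = sqrt(2 q P_XX eps_s)."""
    return EPS_OX / x_ox_cm, math.sqrt(2.0 * Q * p_xx * EPS_SI)


def bulk_charge_exact(v_g, v_s0, v_sl, p_xx=1.0e15, x_ox_cm=500e-8,
                      mu_n=600.0, z_over_l=10.0):
    """Eq. 2.14: bulk-charge current with computed surface potentials."""
    c_o, a = _oxide_and_body(p_xx, x_ox_cm)
    brace = (c_o * v_g * (v_sl - v_s0)
             - 0.5 * c_o * (v_sl ** 2 - v_s0 ** 2)
             - (2.0 / 3.0) * a * (v_sl ** 1.5 - v_s0 ** 1.5))
    return mu_n * z_over_l * brace


def charge_sheet(v_g, v_s0, v_sl, beta, p_xx=1.0e15, x_ox_cm=500e-8,
                 mu_n=600.0, z_over_l=10.0):
    """Charge-sheet current over the full channel (y = L); beta = q/kT."""
    c_o, a = _oxide_and_body(p_xx, x_ox_cm)
    vt = 1.0 / beta
    brace = (c_o * (vt + v_g) * (v_sl - v_s0)
             - 0.5 * c_o * (v_sl ** 2 - v_s0 ** 2)
             - (2.0 / 3.0) * a * ((v_sl - vt) ** 1.5 - (v_s0 - vt) ** 1.5)
             + vt * a * ((v_sl - vt) ** 0.5 - (v_s0 - vt) ** 0.5))
    return mu_n * z_over_l * brace


if __name__ == "__main__":
    beta = 1.0 / (8.617e-5 * 296.0)
    v_s0, v_sl = 0.82, 1.81      # placeholder surface potentials (V)
    print("charge sheet:", charge_sheet(5.0, v_s0, v_sl, beta))
    print("bulk charge :", bulk_charge_exact(5.0, v_s0, v_sl))
```

Letting β grow very large removes every 1/β term, so the charge-sheet value collapses onto the bulk-charge value, which mirrors the reduction described above.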

Comparison of Long-Channel Models

The Pao-Sah double-integral model has been heralded as the best long-channel

model. Brews [23] went so far as to say that "Comparison of the charge-sheet model with the

Pao-Sah model has the force of comparison with experiment, since the Pao-Sah model is

known to work well for long channel devices." Schrimpf et al. [26] agreed, saying Pao and

Sah "produced a quantitative model so accurate that it is the standard by which other models

are judged." Since bulk-charge and charge-sheet are both approximations to Pao-Sah, it

makes sense to compare them with Pao-Sah to see how accurate they are, taking into

account that all the models are only valid for long-channel devices.

Figure 2.7 shows all three methods simulated for Tox=500 Å, T=296 K, $P_{XX}=10^{15}$

cm$^{-3}$, W/L=10. These are typical parameters for LSI devices of the 1970s, and were chosen

to match the data used in Pierret and Shields [17]. As can be seen, the bulk-charge and

charge-sheet models underestimate the current. Figure 2.8 shows the percentage error for

each model at the gate voltages shown in Fig. 2.7, demonstrating that the charge-sheet

model maintains an error of less than 2.6% for all gate voltages, while the bulk charge

model ranges from 2.5% for VGS=5.0V to 8.4% for VGS=2.0 V. This suggests that the

much simpler charge sheet can be used in place of Pao-Sah incurring only about 2.5% error

at low voltages.

Figure 2.9 shows the subthreshold region for the same device with VDS=0.1 V.

Clearly demonstrated in this figure is both the glaring inadequacy of the bulk-charge model

for subthreshold modeling and the remarkable accuracy of the simple charge-sheet model.

However, recall that this is charge-sheet with iteratively calculated surface potentials, so the

numerical solution is not entirely trivial.

Two-Section Models

Up until now, only long-channel ID equations have been considered. For short-

channel devices (< 1 μm), the most prominent non-modeled effect on the drain current is

finite drain conductance beyond saturation. The primary cause of this non-zero drain-

conductance (gD) is channel shortening from the drain space-charge region (SCR)

Fig. 2.7 ID versus VDS for different VGS values for the three 1-D ID models
(Pao-Sah, charge sheet, and bulk charge). Parameters are Tox=500 Å, T=296 K,
$P_{XX}=10^{15}$ cm$^{-3}$, W/L=10, which were used to match data in Pierret and Shields [17].

I -Bulk Charge
Charge Sheet

-- -

0 1 2 3 4 5 6

7 8 9

VD (V)

Percentage error in ID for charge sheet and bulk charge relative to Pao-
Sah versus VDS, from Fig. 2.7. Plots are VGS = 5, 4, 3, and 2 V, with
higher errors for lower voltages.

Fig. 2.8

104 Pao-Sah
10-5 -- Charge Sheet
10-6 Bulk Charge

< 10- NA=51015 cm- -
10-9 T=296K
S10-1 Xox=500A
10-1 / VD=0.1v

10 -16 1 kI I I I I I I -
0.0 0.5 1.0 1.5 2.0

vG (V)

Fig. 2.9 ID versus VGS for Pao-Sah, charge sheet, and bulk charge using same
data as Fig. 2.7 with VDS=0.1 V. Clearly bulk charge is not useful in
subthreshold, whereas charge-sheet is almost coincident with Pao-Sah.

encroaching into the channel. This effect is often called channel-length modulation since

the drain voltage modulates the effective channel length.

The most logical approach is to divide the region between the source and drain into

two sections: a 'source side' and a 'drain side'. The 'source side' may contain any

appropriate long-channel IV model, such as Pao-Sah, charge sheet, or bulk charge. The

'drain side' is the depletion region, and can be modeled with or without mobile charge,

2-D effects, mobility differences, etc. The location of the boundary between these regions,

and the voltages and fields at this boundary, are what make this a challenging problem.

Figure 2.10 shows a diagram of a MOS transistor divided into two sections.

There are essentially three things which differ among approaches to two-section

theory: the source-side IV model, the drain-side space-charge region (SCR) model, and

the boundary conditions.

Fig. 2.10 Schematic diagram of two-section MOST for 1-D modeling. The source
side (long-channel approximation) extends from y=0 to y=yM and has length
Leff = L - ΔL; the drain side (SCR) extends from y=yM to y=L and has length
ΔL. SCR means 'space-charge region' and Leff refers to the effective channel
length.
The IV model can be one of the many already discussed. The SCR model can be

assumed fully depleted, take mobile charge into account, or be a complete 2- or 3-D model.

The boundary conditions are the most difficult and varied among approaches. Essentially,

the potentials, fields, and charge at the boundary between the two regions need to be matched.


The simplest two-section MOST model was introduced in 1965 by Reddi and Sah

[27]. They used a source-side bulk-charge model for the current and a fully-depleted drain-

side depletion model. From the first derivative of the bulk-charge model (Eq. 2.13 with

Vs=2VF), Reddi-Sah (and others) calculated the drain voltage where, for a constant gate

voltage, the drain conductance drops to zero (VDSsat). They then assumed all voltage in

excess of VDSsat falls across the SCR to form the drain region of the two-section model.

By assuming complete depletion (no mobile charge) and no y-field at the boundary,

the length can be calculated from simple p/n junction theory as:

$$\Delta L = \left[\frac{2\varepsilon_s (V_{DS} - V_{DSsat} + V_{bi})}{q P_{XX}}\right]^{1/2} \qquad (2.17)$$

where $V_{bi} = (kT/q)\ln(N_{drain} N_{substrate}/n_i^2)$ from standard abrupt-junction p/n theory. Replacing

L by Leff=L-ΔL and Vso with 2VF in (2.13) yields the Reddi-Sah two-section current.
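
As a rough numerical illustration of (2.17), the following Python sketch evaluates ΔL for an assumed device. The function names, the drain doping, the intrinsic concentration, and the bias values are illustrative assumptions and are not taken from the Reddi-Sah paper.

```python
import math

Q = 1.602176634e-19        # electron charge (C)
K_B = 1.380649e-23         # Boltzmann constant (J/K)
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity (F/cm)


def built_in_voltage(n_drain, n_substrate, n_i, temp_k):
    """Abrupt-junction built-in voltage (V): (kT/q) ln(Ndrain*Nsubstrate/ni^2)."""
    return (K_B * temp_k / Q) * math.log(n_drain * n_substrate / n_i**2)


def delta_l(v_ds, v_dssat, v_bi, p_xx):
    """Drain depletion length (cm) from Eq. (2.17); intended for v_ds >= v_dssat."""
    return math.sqrt(2.0 * EPS_SI * (v_ds - v_dssat + v_bi) / (Q * p_xx))


# Assumed example values (hypothetical): Pxx = 5e17 cm^-3, Ndrain = 1e20 cm^-3,
# ni = 1e10 cm^-3, T = 300 K, VDSsat = 3 V, VDS = 5 V.
v_bi = built_in_voltage(1e20, 5e17, 1e10, 300.0)
dl_cm = delta_l(5.0, 3.0, v_bi, 5e17)
print(f"Vbi = {v_bi:.3f} V, delta L = {dl_cm * 1e4:.4f} um")
```

With these assumed values ΔL comes out slightly below 0.1 μm, which is negligible for a long channel but an appreciable fraction of a quarter-micron channel.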

The simplicity of this formula is extremely attractive, but the solution is dependent

on the ID model. Specifically, it assumes that a VDSsat voltage can be found. If using Pao-

Sah or charge-sheet, the surface potential is not constant and a VDSsat point does not actually

exist. Even if VDSsat is found from extrapolation, the first derivatives of the drain current

will be non-smooth at the point where the drain current switches from one model (Pao-Sah,

charge-sheet, bulk-charge) to another (constant ID), although this can be fixed with various

smoothing transitional functions.
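
One possible smoothing function can be sketched as follows. The form below is a generic smooth minimum of VDS and VDSsat of the kind used in compact models; the function name and the smoothing parameter `delta` are assumptions for illustration and are not taken from the references cited here.

```python
import math


def smooth_vds_eff(v_ds, v_dssat, delta=0.01):
    """Smooth approximation to min(v_ds, v_dssat); delta (V) sets the transition width."""
    a = v_dssat - v_ds - delta
    return v_dssat - 0.5 * (a + math.sqrt(a * a + 4.0 * delta * v_dssat))


# The effective drain voltage follows v_ds at low bias and levels off at v_dssat.
for v_ds in (0.0, 1.0, 2.0, 3.0, 10.0):
    print(v_ds, round(smooth_vds_eff(v_ds, 2.0), 4))
```

Evaluating the long-channel ID expression at this effective drain voltage, instead of switching abruptly to a constant ID at VDSsat, keeps the first derivative continuous at the cost of one extra fitting parameter.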

Three years after Reddi and Sah's paper, Chiu and Sah [28] came out with a two-section

model which solved Laplace's equation in the oxide layer and matched values in four

regions (source, drain, oxide, and bulk). The drain region was solved as a 2-D, fully-depleted

region, and the solution required seven matching parameters. The complexity of

the solution relegated this model to being cited almost invariably as "too complex."

The following year (1969) Frohman-Bentchkowsky and Grove [29] developed a

two-section model using the bulk-charge model in the source region and an empirical model for

the drain section. This simple model essentially added two additional fringe field

contributions to the Reddi-Sah model and added two empirical variables to fit the data.

Merckel, Borel, and Cupcea [30] added mobile charge to the drain region

empirically by writing Poisson's equation in the drain region as

$$\frac{d^2V}{dy^2} = \frac{q}{\varepsilon_s}\left(P_{XX} + \frac{I_{DS}}{qZa}\right) \qquad (2.18)$$

where a is essentially a fitting parameter related to the junction depth. This mobile charge

is akin to the Kirk effect in bipolar devices, just as the drain-depletion encroachment is

analogous to the Early effect. Using an iteratively determined VDSsat, they were able to

calculate the drain depletion width. Popa [31] devised a similar model and extended the

drain depletion region to be of three types depending on the injected current. In both

mobile-charge cases, fitting parameters were introduced either through (2.18) or mobility.

Both used variations of the simple bulk charge model for the source side.

After Brews developed the charge-sheet model, all subsequent two-section models

employed the charge-sheet model. Guebels and Van de Wiele [32] developed a three-

section model to account for the x-field reversal near the drain. They employ the same trick

as the previous papers by fitting the a in (2.18), using VDSsat (or IDsat) and adding some

empiricism to their field calculations.

Beyond Two-Section Models

The charge-sheet model (and Pao-Sah, as will be shown) does not lend itself well

to analytical two-section models due to the greater complexity of the drain current model

relative to bulk charge. As noted above, fitting parameters and empirical formulae were

required to be introduced to satisfy some of the boundary conditions.

The newer compact models, such as BSIM [33,34] and Siemens' [35-37] model,

are based loosely on one-section bulk-charge and charge-sheet models, respectively,

sometimes dividing the model into different sections based on operation (separate

subthreshold and superthreshold formulae). They both model short-channel effects by

adding semi-empirical additions to the threshold voltage, which makes for a considerably

faster calculation speed at the expense of a less-physical model.

Examples Using Pao-Sah

The goal was to develop a two-section model which employs the Pao-Sah integral

as the source-side current formula. The following is a description of the methodology and

results of the exercise.

Field-Matching Method

The Pao-Sah current has already been discussed, as have been models for the

depletion region. Let us consider the matching boundary of the two-section model to occur

at the point y=yM where the channel voltage is VM with a lateral field EM and electric field

gradient d2Us/dy2=dEM/dy.

A simple way to look at this problem is from Poisson's equation in the drain

region while considering the boundary conditions. Within the drain region, which extends

from y=yM to y=L, the boundary conditions are (see Fig. 2.10):

$$V(y_M) = V_M \quad \text{(voltage at the match point)}$$

$$V(L) = V_{DS} \quad \text{(voltage at the drain)}$$

$$\frac{dV(y_M)}{dy} = E_M \quad \text{(field at the match point)}$$

$$\frac{d^2V(y_M)}{dy^2} = \frac{1}{\varepsilon_s}\left[qP_{XX} + (\text{mobile charge terms})\right] = C$$

It is possible from Pao-Sah to calculate dV(yM)/dy=EMps [38]. This gives us the following

equation after integrating Poisson's equation twice with the above boundary conditions:

$$V_{DS} - V_M = \frac{C}{2}(L - y_M)^2 + E_M (L - y_M) \qquad (2.20)$$

This reduces all the boundary conditions to one equation with two unknowns (yM and VM).

The ideal additional equation would be d2V(yM)/dy2 on the Pao-Sah side, but this quantity

is incalculable from the Pao-Sah integral.

If it is assumed that EM=0 (as was done in Reddi-Sah), the depletion length

into the channel can be easily found. It is reasonable to assume that the lateral field at the

matching point (EM) is much less than the field right at the drain (ED), so ED >> EM, making

the difference in yM small. This gives (from 2.20, also 2.17)

$$y_M = L - \left[\frac{2(V_{DS} - V_M + V_{bi})}{C}\right]^{1/2} \qquad (2.21)$$

where Vbi accounts for the pre-existing depletion region originating from the abrupt p/n

junction. Since the yM approximation has already been made, it will be assumed that the

field throughout the drain region is constant, so that the field at the boundary is given by

$$E_{Mdep} = \frac{V_{DS} - V_M}{L - y_M} \qquad (2.22)$$

Clearly there are conflicting assumptions (EM = 0, and now EM ≠ 0). One might wonder

why EM is not (VDS - VM + Vbi)/(L - yM) to be consistent with 2.21. This comes from the

subtlety of the boundary conditions. Looking back to Figure 2.2, note that the integration is

actually from VS + Vbi to VDS + Vbi, which excludes the p/n depletion layers. The Vbi's

cancel out for symmetrical devices, so this is no problem. At VDS=0 (and source grounded),

no current or field is expected, which would make VM correctly equal to 0 in (2.22).

However, if Vbi were added to (2.22), then VM would have to equal Vbi, which would

incorrectly cause a field (and possibly current flow depending on VGS). Essentially, (2.22)

gives the excess field. However, Vbi does contribute to the depletion width, so it is included

in (2.21).
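
The drain-side relations (2.20) through (2.22) can be checked numerically. The sketch below solves (2.20) as a quadratic in (L - yM) for a given EM and evaluates (2.21) and (2.22). The function names and the sample bias values are assumptions for illustration, and C is taken as qPxx/εs (mobile charge neglected).

```python
import math

Q = 1.602176634e-19        # electron charge (C)
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity (F/cm)


def drain_length_from_2_20(v_ds, v_m, c, e_m):
    """Positive root d = L - yM of (C/2) d^2 + EM d - (VDS - VM) = 0, in cm."""
    return (-e_m + math.sqrt(e_m**2 + 2.0 * c * (v_ds - v_m))) / c


def match_point_2_21(length, v_ds, v_m, v_bi, c):
    """yM from Eq. (2.21): EM = 0, with Vbi included in the depletion width."""
    return length - math.sqrt(2.0 * (v_ds - v_m + v_bi) / c)


def field_2_22(length, y_m, v_ds, v_m):
    """Excess field EMdep from Eq. (2.22), in V/cm."""
    return (v_ds - v_m) / (length - y_m)


c = Q * 5e17 / EPS_SI            # Pxx = 5e17 cm^-3, giving C in V/cm^2
length = 0.5e-4                  # 0.5 um channel, in cm
v_ds, v_m, v_bi = 5.0, 3.0, 1.0  # assumed bias point and built-in voltage (V)
y_m = match_point_2_21(length, v_ds, v_m, v_bi, c)
e_dep = field_2_22(length, y_m, v_ds, v_m)
print(f"yM = {y_m * 1e4:.4f} um, EMdep = {e_dep:.3e} V/cm")
```

Setting `e_m` to zero in `drain_length_from_2_20` recovers the square-root form of (2.21) without the Vbi term, while a non-zero EM shortens the drain region for the same voltage drop.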

The normalized field on the Pao-Sah side at the boundary is given by [32]

[exp (U) -1] exp (U-U-U )
EMps =
2 r
F(Us ,UM,UF) + [exp(Us-UM-UF) -exp(UF-Us)+exp(U) -exp(-UF)

UMIUS exp(U-4-UF)
------- dUdt
U0 U F(U, ,Up)

rUs exp(U-UM-UF)
S --s- (2.23)

where UM is the normalized matching voltage, qVM/(kT).

Figure 2.11 shows the results of this approach, with mobile charge terms neglected

(C=qPxx/εs) for Pxx=5×10^17 cm^-3, T=300 K, and Tox=50 Å. The data cover a wide range

of channel lengths from 1/4 μm to ∞, and for all cases the width is equal to the length (square

devices). The saturation current predicted by long-channel theory for these square devices

would be the same for all channel lengths, so the deviation from this is due to channel-length

modulation, which clearly becomes more important as the channel length decreases.

Figure 2.12 shows that the drain conductance (gD=dID/dVDS) is smooth, which is important

for circuit simulator applications. Although not shown, the derivative of the drain

conductance is also smooth. Thus, this field-matching model successfully extends the 1-D

Pao-Sah model to short channels, at least with regard to including the effective

channel-shortening effect.

Saturation-Voltage Method

Reddi and Sah [27] assumed VM=VDSsat, which simplified things considerably.

VDSsat is easy to calculate when using the bulk-charge formula assuming Vso=2VF since the

derivative of the surface potential with respect to the drain voltage is zero. The Pao-Sah

current, however, does not technically saturate (numerically there will be a point where the

current does not increase, but it will be at a drain voltage well in excess of the normal VDSsat

point). This problem is solved by extrapolating VDSsat from dID/dVDS versus VDS without

channel shortening. Figures 2.13 and 2.14 show the results of employing this method with

the same device as used in the previous section (Pxx=5×10^17 cm^-3, T=300 K, Tox=50 Å),

using Eq. 2.21 for yM with VM=VDSsat. Clearly the channel-length modulation is being

accounted for, but the transition is slightly abrupt. A look at the resulting drain conductance

(Fig. 2.14) shows a drastic discontinuity near the calculated VDSsat point. Use of a fitting

function could rectify this derivative problem, and is a common practice for compact

models.


Surface-Potential Self-Saturation Method

Another possible way to circumvent finding the VM point was posed by Katto and

Itoh [39]. Instead of finding VDSsat, they used the fact that the surface potential itself will

saturate when solved iteratively from (2.8). Thus replacing the matching voltage, VM, with

the surface potential at the drain, VSL (solved iteratively) gives another decoupled way to

solve for yM. Using the surface potential to find the depletion thickness was also used by

Sah [2]. This is better than the VDSsat method since there will not be an immediate point

where saturation occurs. However, as shown in Figs. 2.15 and 2.16, the current still has a

slight 'jump' resulting in discontinuities in gD.

In Search of the Match Point

Sah [2] showed pictorially that in saturation, the energy band near the drain edge

will actually be bent upward, or in other words, the surface will be accumulated rather than

inverted (actually, the surface will still be depleted, but now accumulation refers only to the

shape of the band bending). This must be the case since the potential along the channel is

actually higher than VGS - VGT = VDSsat. This means that there must be a point along the

channel at which the band bending is zero at the surface, and this point would be an

excellent candidate for the yM point. Like the methods above, however, this point has some

logical flaws. For instance, the field in the x-direction is zero by definition, which means

that using the channel potential at this point to calculate the current from the long-channel

model will clearly invalidate the gradual channel approximation (Ex >> Ey), a basis for the

Pao-Sah ID derivation.

Fig. 2.11 ID versus VDS plots for different channel lengths (square devices; L = 0.25,
0.50, 1.00, 5.00 μm, and ∞) using field matching at the match point.
VGS = 5 V, Pxx=5×10^17 cm^-3, T=300 K, Tox=50 Å.

Fig. 2.12 gD versus VDS plots for different channel lengths (square devices) using
field matching at the match point. Same parameters as Fig. 2.11.

Fig. 2.13 ID versus VDS plots for different channel lengths (square devices) using
VM=iterative surface potential at drain. VGS = 5 V, Pxx=5×10^17 cm^-3,
T=300 K, Tox=50 Å.

Fig. 2.14 gD versus VDS plots for different channel lengths (square devices) using
VM=iterative surface potential at drain. Same parameters as Fig. 2.13.

Fig. 2.15 ID versus VDS plots for different channel lengths (square devices) using
VM=VDSsat. VGS = 5 V, Pxx=5×10^17 cm^-3, T=300 K, Tox=50 Å.

Fig. 2.16 gD versus VDS plots for different channel lengths (square devices) using
VM=VDSsat. Same parameters as Fig. 2.15.

A simple approximation for this point would be to use VM = VGS - VFB when VDS

> VGS - VFB, which is akin to setting VDSsat = VGS - VFB. This ends up resulting in the

same sort of problem seen in the VDSsat method.

It is interesting to verify the existence of this turn-around region in the channel near

the drain, however. This was done recently using the MINIMOS device simulator [40].

MINIMOS was modified to use a constant mobility model so as to be comparable to the 1-D

model cases above. Figure 2.17 shows the resulting electrostatic potentials into the

substrate at different points near the drain edge of a 50 Å, 100×100 μm (corrected for

subdiffusion) nMOST with Pxx=5×10^17 cm^-3 at VGS=1.5 V and VDS=3.0 V. VFB was fixed

at zero for this case. What is clear is that the band moves from inversion (top) through

flatband into accumulation (bottom) at the surface (x=0.0). The flatband point occurs when

the channel potential is equal to VGS - VFB = 1.5 V, as expected.


This chapter reviewed the history of 1-D long-channel drain-current models and

discussed the pros and cons of their derivation and applications. From this, the importance

of a non-pinned surface potential was shown, as demonstrated by the excellent

approximation of the simple charge-sheet model to the Pao-Sah double integral--the best of

the 1-D long-channel models.


Next, methods to extend the 1-D model into two 1-D sections, to create the best full-range

1-D model, were examined. It was discovered that, no matter what, the depletion region is strictly 2-D, and

obtaining a 1-D approximation requires rather substantial assumptions. One model, the

field-matching approach, was seen to give reasonably good characteristics, while all the

other approximations (VDSsat and surface-potential self-saturation) resulted in

discontinuities in the first (and higher) derivatives.

A 2-D simulation was used to verify that there is a point in the saturated channel

where the (x-directed) field reverses and the surface band bending is, thus, zero. This point

has been suggested many times before in our group, but never verified two-dimensionally.

Attempting to use this point to demark the boundary of the source region and drain region of

the two-section model results in the same poor results as the VDSsat method.

Fig. 2.17 Electrostatic potentials into the substrate (versus depth X in μm from the
SiO2/Si interface) at different points near the drain edge of a 50 Å,
100×100 μm (corrected for subdiffusion) nMOST with Pxx=5×10^17 cm^-3
at VGS=1.5 V and VDS=3.0 V. The Y=1.321 μm (near source) and
Y=50.000 μm (middle of the channel) curves are indistinguishable. The
band is flat at the SiO2/Si surface when the channel electrostatic
potential equals VGS - VFB = 1.5 V (VFB = 0 for this data).



For modern ULSI technology, polysilicon gates are universally used on MOS

devices. With respect to MOS device characteristics, there is no advantage to substituting

metal gates with heavily-doped polysilicon (poly) gates. In fact, poly gates, as will be

shown in this chapter, greatly reduce the effectiveness of thinning the oxide layer to

increase the drain current. The use of poly gates is a question of cost as well as

performance, however, and poly gates have some tremendous processing and density

benefits over metal gates. Polysilicon gates can withstand high temperature steps that

would cause most deposited metal gates to evaporate, particularly the source/drain drive-

in step. Polysilicon gates also allow for self-alignment of the gate over the oxide between

the source and drain, removing what would be the most difficult (and costly) alignment

step in the process flow [41-42].

This chapter covers the derivation of a Fermi-Dirac-based polysilicon-gate

MOS low-frequency capacitance-voltage model. This model will be used to illustrate the

effects of polysilicon gates on MOS low-frequency (LF) capacitance-voltage (CV)

characteristics compared to metal-gate LFCV characteristics. A useful application for the

model is physical parameter extraction, which is demonstrated in this chapter using two

different methodologies: 3-point fit and 3-region fit. Sample parameter extractions for

thick (130 Å) and thin (20 Å) gate oxides are shown, and a discussion of the limitations of

the model is presented. Quantum effects are purposely ignored, and the reasoning

behind this decision is discussed. Important details related to fast convergence of the

parameter-extraction routines are also given.

Metal-Gate CV

Ideal metal-gate CV theory using Boltzmann statistics has been extensively

discussed [3,43], as well as the extension to include Fermi-Dirac carrier distribution and

deionization effects [44-46]. The appendix contains the full metal-gate LFCV model

derivation, taking Fermi-Dirac statistics and deionization into account. The relevant

solutions are given below.

Figure 3.1 shows a schematic diagram of an ideal metal-gate MOS device and

the corresponding band diagram. From Figure 3.1 (b), as explained in the appendix,

Kirchhoff's voltage law around the loop gives:

$$\phi_M + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G \qquad (3.1)$$

where φM is the work function of the metal, VO is the potential drop across the oxide, χS

is the electron affinity of the substrate, EC and EI are the conduction-band edge and

intrinsic energies, respectively, in the substrate, and VF is the Fermi voltage, which is

equivalent to (EI - FP)/q for p-type material, where FP is the quasi-Fermi level for holes

and q is the electron charge. Collecting these terms in cleaner form gives

$$V_G = V_O + V_{IX} + \phi_{MS} \qquad (3.2)$$

Fig. 3.1 MOS capacitor schematic and corresponding energy-band diagram. (A)
Schematic diagram of a MOS capacitor with metal (Al) gate, oxide (SiO2),
and semiconductor (Si), and (B) corresponding energy-band diagram
depicting the potential drops. Shown is a positive voltage VG applied
at the gate.

where $\phi_{MS} = \phi_M - \phi_S = \phi_M - (\chi_S + (E_C - E_I)/q + V_F)$ is the work-function

difference between the metal and the semiconductor. As will be shown in the

next section, the work-function difference for a polysilicon-gate MOS device is much

simpler than for metal-gate MOS since the substrate and gate materials are the same. The

drop across the oxide can be found from Gauss's Law requirements as

$$V_O = \varepsilon_s E_{IX}/C_o - (Q_{OT} + Q_{IT})/C_o, \qquad (3.3)$$

where εs is the permittivity of the semiconductor (≈ 11.7 × 8.85 × 10^-14 F/cm for Si), EIX

is the field at the substrate surface, Co is the oxide capacitance, and QOT and QIT represent fixed

and interface-trapped oxide charge, respectively. With this relation, (3.2) can be rewritten as

$$V_G = V_{FB} + V_{IX} + \varepsilon_s E_{IX}/C_o, \qquad (3.4)$$

where VFB, the flat-band voltage, is given by

$$V_{FB} = \phi_{MS} - (Q_{OT} + Q_{IT})/C_o. \qquad (3.5)$$

For metal-gates, there is no capacitive contribution from the metal, so the gate

capacitance is simply the series equivalent of the fixed oxide capacitance, Co, and the

variable substrate capacitance, Cix.

$$C_g = \frac{C_{IX} C_o}{C_{IX} + C_o}. \qquad (3.6)$$

The field going into the substrate, EIX, and the substrate capacitance, CIX, are given by

$$
\begin{aligned}
E_{IX} = \mathrm{sgn}(U_{IX})\left(\frac{2kT}{\varepsilon_s}\right)^{1/2}
\Big\{ & N_V\left[F_{3/2}(-U_{IX}-U_V+U_F) - F_{3/2}(-U_V+U_F)\right] \\
& + N_C\left[F_{3/2}(U_{IX}+U_C-U_F) - F_{3/2}(U_C-U_F)\right] \\
& + P_{XX}\left[U_{IX} + \ln\frac{1+g_A\exp(U_F-U_A-U_{IX})}{1+g_A\exp(U_F-U_A)}\right] \\
& + N_{XX}\left[-U_{IX} + \ln\frac{1+g_D\exp(U_D-U_F+U_{IX})}{1+g_D\exp(U_D-U_F)}\right] \Big\}^{1/2}
\end{aligned}
\qquad (3.7)
$$

$$
\begin{aligned}
C_{IX} = \frac{q}{E_{IX}} \Big\{ & -N_V F_{1/2}(-U_{IX}-U_V+U_F) + N_C F_{1/2}(U_{IX}+U_C-U_F) \\
& + \frac{P_{XX}}{1+g_A\exp(U_F-U_A-U_{IX})} - \frac{N_{XX}}{1+g_D\exp(U_D-U_F+U_{IX})} \Big\}
\end{aligned}
\qquad (3.8)
$$

where all 'U' values are potentials normalized to kT/q and referenced to the intrinsic

Fermi level. For example, UF is the normalized Fermi level, qVF/(kT). Pxx is the acceptor

substrate doping concentration and Nxx is the donor substrate doping concentration, and

gA and gD are the corresponding degeneracy factors for the trap levels UA (acceptor

energy level) and UD (donor energy level, not to be confused with the normalized drain

voltage of an MOS transistor). NV is the valence-band density of states and NC is the

conduction-band density of states.

Those familiar with MOS capacitance equations might find these far more

complex than they recall; a perusal of the appendix should clear up any questions about

this form. However, it is instructive to show how this reduces to a more familiar

Boltzmann form. First, all of the Fermi-Dirac integrals [F1/2(η) and F3/2(η) terms] reduce

to exponentials in the Boltzmann range of applied gate voltages (η < -4). Second, there

is typically only one dominant dopant, so one of the last two terms in (3.7) and (3.8) can

be neglected (the first can be neglected for n-type substrate, and the second for p-type

substrate). Furthermore, if deionization is neglected (UF - UA - UIX < -3 for p-type or

UD - UF + UIX < -3 for n-type), then the last two terms of (3.7) reduce to

PxxUIX - NxxUIX. Likewise, the last two terms of (3.8) reduce to Pxx - Nxx when

deionization is neglected.
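
The reduction of the Fermi-Dirac integrals to exponentials can be illustrated numerically. The sketch below evaluates F1/2(η) by simple quadrature and compares it with exp(η). It assumes the normalization in which F1/2(η) approaches exp(η) in the non-degenerate limit; the function name, the substitution, and the integration limits are choices made for this illustration only.

```python
import math


def fermi_dirac_half(eta, n_steps=4000):
    """F_1/2(eta) = (2/sqrt(pi)) * int_0^inf sqrt(x) / (1 + exp(x - eta)) dx.

    Uses the substitution x = t^2 and composite Simpson's rule on a finite range.
    """
    t_max = math.sqrt(max(eta, 0.0) + 40.0)  # integrand is negligible beyond this
    h = t_max / n_steps

    def integrand(t):
        return 2.0 * t * t / (1.0 + math.exp(t * t - eta))

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (2.0 / math.sqrt(math.pi)) * total * h / 3.0


# Ratio of the Fermi-Dirac integral to its Boltzmann (exponential) limit.
for eta in (-6.0, -4.0, -2.0, 0.0, 2.0):
    ratio = fermi_dirac_half(eta) / math.exp(eta)
    print(f"eta = {eta:+.1f}  F_1/2 / exp(eta) = {ratio:.4f}")
```

The ratio stays within about 1% of unity for η at or below -4 and falls well below unity once η becomes positive, which is where the Boltzmann form overestimates the carrier density.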

As an example of the simplified form, let us consider a p-type substrate in

strong accumulation. In this case, it can be assumed that only the accumulated surface

carrier term is dominant (UIX is large and negative). Noting also that, in the Boltzmann

case, UF - UV = ln(Pxx/NV), (3.7) and (3.8) would reduce to

$$E_{IX} = -\left(\frac{2kTP_{XX}}{\varepsilon_s}\right)^{1/2}\exp(-U_{IX}/2)$$

$$C_{IX} = q\left[\frac{\varepsilon_s P_{XX}}{2kT}\right]^{1/2}\exp(-U_{IX}/2)$$

These are the more tractable strong-accumulation forms found in undergraduate

textbooks [3, 43] and which form the basis for one well-known oxide-thickness

extrapolation algorithm [47].
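
A short numerical sketch of these strong-accumulation forms, combined with the series relation (3.6), shows the gate capacitance approaching Co as accumulation deepens. The function names, the SiO2 permittivity of 3.9ε0, and the sample doping and oxide thickness are assumptions for illustration.

```python
import math

Q = 1.602176634e-19    # electron charge (C)
K_B = 1.380649e-23     # Boltzmann constant (J/K)
EPS_0 = 8.854e-14      # vacuum permittivity (F/cm)
EPS_SI = 11.7 * EPS_0  # silicon permittivity
EPS_OX = 3.9 * EPS_0   # SiO2 permittivity (assumed value)


def accumulation_field_and_cap(u_ix, p_xx, temp_k=300.0):
    """Boltzmann strong-accumulation field (V/cm) and capacitance (F/cm^2), p-type."""
    kt = K_B * temp_k
    e_ix = -math.sqrt(2.0 * kt * p_xx / EPS_SI) * math.exp(-u_ix / 2.0)
    c_ix = Q * math.sqrt(EPS_SI * p_xx / (2.0 * kt)) * math.exp(-u_ix / 2.0)
    return e_ix, c_ix


def series_gate_capacitance(c_ix, c_o):
    """Eq. (3.6): oxide and substrate capacitances in series."""
    return c_ix * c_o / (c_ix + c_o)


c_o = EPS_OX / 500e-8  # 500 Angstrom oxide, thickness in cm
for u_ix in (-4.0, -8.0, -12.0):
    e_ix, c_ix = accumulation_field_and_cap(u_ix, 1e15)
    ratio = series_gate_capacitance(c_ix, c_o) / c_o
    print(f"UIX = {u_ix:+.0f}  EIX = {e_ix:.3e} V/cm  Cg/Co = {ratio:.4f}")
```

Because CIX grows exponentially with the magnitude of UIX, the measured Cg saturates toward Co, which is the behavior an oxide-thickness extrapolation relies on.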

Polysilicon-Gate CV

Implicit in the derivation of the metal-gate CV theory above was that the

capacitance of the gate is infinite and that the voltage drop across the gate is zero. With

metal gates, this is a reasonable assumption for the ideal isolated device. However, with

polysilicon gates, there is a finite polysilicon gate capacitance as well as a voltage drop

[3, 49]. Indeed, the capacitor is now a semiconductor-oxide-semiconductor device, so it

will have a corresponding surface potential for the gate, as well as an associated gate

capacitance with a form exactly like the substrate capacitance.

This requires only minor additional derivation to arrive at the poly-gate MOS

capacitor (MOSC) ideal device characteristics. Figure 3.2 shows the band diagram for an

n+-polysilicon gate MOS capacitor with a p-type substrate (a schematic of the device

would be identical to Fig. 3.1(a), with the metal gate replaced by a polysilicon gate). From this

figure it is clear that the potential drop across the device can be given similarly to (3.1) as

$$-V_{F\text{-}poly} + (E_C - E_I)/q + \chi_S + V_{IG} + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G. \qquad (3.9)$$

Assuming the energy gap has not narrowed due to the higher gate doping, the (EC - EI)

terms are identical and cancel because the materials are both silicon. The electron affinity

is the same for both the gate and substrate for the same reason. This reduces (3.9) to

$$V_G = V_O + V_{IX} + V_{IG} - V_F - V_{F\text{-}poly}. \qquad (3.10)$$

Thus, for the poly-gate case, φMS (more aptly called φGS, where 'G' represents the gate,

but still traditionally referred to as 'M' for metal) is simply given by

$$\phi_{MS} = -(V_F + V_{F\text{-}poly}). \qquad (3.11)$$

For Figure 3.2, φMS is given by -(kT/q)ln(PxxNGG/ni^2), where Pxx is the substrate doping ('P'

implying p-type) and NGG is the gate doping ('N' implying n-type). This simple formula

assumes a Boltzmann carrier distribution in the substrate and gate, which is invalid in the

gate due to the high doping and likely invalid in the substrate for modern ULSI devices. A

more appropriate formula using inverse Fermi-Dirac integrals can be obtained using the

examples in the appendix.

Fig. 3.2 Band diagram of n+ polysilicon-gate MOS capacitor with all the
potential drops labeled. The band diagram shown depicts a positive
voltage VG applied at the gate, with the SiO2/Si surface entering
inversion and the poly-Si/SiO2 surface depleting.

The extra potential drop from the poly gate is easily taken into account via

Kirchhoff's law with VIG:

$$V_G = V_{FB} + V_{IX} + V_{IG} + \varepsilon_s E_{IX}/C_o, \qquad (3.12)$$

where VFB from (3.5) still holds assuming negligible contribution from the

polysilicon/oxide interface, using (3.11) for φMS. Finally, the gate capacitance formula

needs to be extended for three capacitors in series. This changes (3.6) to

$$C_g = \frac{C_{IX} C_o C_{IG}}{C_{IX} C_o + C_{IX} C_{IG} + C_{IG} C_o}, \qquad (3.13)$$

where CIG, the capacitance from the polysilicon gate, is given by

$$
\begin{aligned}
C_{IG} = \frac{q}{E_{IG}} \Big\{ & -N_V F_{1/2}(-U_{IG}-U_V+U_F) + N_C F_{1/2}(U_{IG}+U_C-U_F) \\
& + \frac{P_{GG}}{1+g_A\exp(U_F-U_A-U_{IG})} - \frac{N_{GG}}{1+g_D\exp(U_D-U_F+U_{IG})} \Big\}
\end{aligned}
\qquad (3.14)
$$

This is simply (3.8) re-written with the band notation for the gate. Thus, UIG is the

normalized surface potential in the gate, EIG is the field in the gate (defined below), and

PGG, NGG, UV, UC, NC, NV, gA, gD, UD, UA are precisely as defined before, except that

they apply now to the gate rather than the substrate. UF above was called UF-poly

elsewhere; it is left as UF in (3.14) to maintain the symmetry of the equation. The gate

field is given by

$$
\begin{aligned}
E_{IG} = \mathrm{sgn}(U_{IG})\left(\frac{2kT}{\varepsilon_s}\right)^{1/2}
\Big\{ & N_V\left[F_{3/2}(-U_{IG}-U_V+U_F) - F_{3/2}(-U_V+U_F)\right] \\
& + N_C\left[F_{3/2}(U_{IG}+U_C-U_F) - F_{3/2}(U_C-U_F)\right] \\
& + P_{GG}\left[U_{IG} + \ln\frac{1+g_A\exp(U_F-U_A-U_{IG})}{1+g_A\exp(U_F-U_A)}\right] \\
& + N_{GG}\left[-U_{IG} + \ln\frac{1+g_D\exp(U_D-U_F+U_{IG})}{1+g_D\exp(U_D-U_F)}\right] \Big\}^{1/2},
\end{aligned}
\qquad (3.15)
$$

which is identical to (3.7) with the surface potentials changed. Again, the same caveat

applies to (3.15)--all the terms refer to the gate now, not the substrate. Things like trap

levels and band edges are nearly, if not exactly, the same in the substrate and polysilicon

gate. However, UF is clearly quite different (assuming the gate and substrate are not

doped identically, which would make a poor capacitor or transistor).
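
The series combination in (3.13) can be checked against the reciprocal-sum rule with a few lines of Python. The function names and the normalized sample capacitances below are assumptions for illustration.

```python
def metal_gate_capacitance(c_ix, c_o):
    """Eq. (3.6): oxide and substrate capacitances in series."""
    return c_ix * c_o / (c_ix + c_o)


def poly_gate_capacitance(c_ix, c_o, c_ig):
    """Eq. (3.13): substrate, oxide, and polysilicon-gate capacitances in series."""
    return c_ix * c_o * c_ig / (c_ix * c_o + c_ix * c_ig + c_ig * c_o)


# Normalized sample values in units of Co (assumed, for illustration only).
c_ix, c_o, c_ig = 5.0, 1.0, 8.0
c_poly = poly_gate_capacitance(c_ix, c_o, c_ig)
c_recip = 1.0 / (1.0 / c_ix + 1.0 / c_o + 1.0 / c_ig)
print(c_poly, c_recip)

# A very large gate capacitance recovers the metal-gate result of (3.6).
print(poly_gate_capacitance(c_ix, c_o, 1e12), metal_gate_capacitance(c_ix, c_o))
```

The second comparison reflects the metal-gate assumption of an effectively infinite gate capacitance, for which the poly-gate expression collapses to the two-capacitor form.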

An additional equation, which was not needed in the metal-gate case, is required

to relate the gate and substrate. This equation equates the charge density at the gate/oxide

interface with the charge density at the oxide/substrate interface:

$$\varepsilon_s E_{IX} + Q_{ITX} + \varepsilon_s E_{IG} + Q_{ITG} = 0.$$

QITX is the interface charge at the substrate/insulator interface and QITG is the charge at

the gate/insulator interface. It is assumed that these values are negligible, and that the

dielectric constant for the silicon substrate and the silicon gate are identical (already

implicitly assumed in the equation). This gives the following

$$E_{IX} = -E_{IG},$$

which allows the surface potential in the gate to be related to the surface potential in the

substrate. The iterative solution of the above equation requires many calculations of (3.7)

and (3.15), and is the most time-consuming part of the LFCV solver as well as any

software using the routine (such as a parameter extractor which works by comparing the

data to the theoretical curve, as discussed later in the application section).
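
The inner iteration can be sketched as follows for a p-type substrate and an n-type gate. This is a simplified illustration only: it replaces (3.7) and (3.15) with their Boltzmann, fully ionized limits, uses bisection as the root finder, and assumes an intrinsic concentration of 10^10 cm^-3. The function names are hypothetical and the solver used for the results in this chapter may differ.

```python
import math

Q = 1.602176634e-19        # electron charge (C)
K_B = 1.380649e-23         # Boltzmann constant (J/K)
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity (F/cm)
N_I = 1.0e10               # assumed intrinsic concentration (cm^-3)


def boltzmann_field(u, doping, majority, temp_k=300.0):
    """Surface field (V/cm) versus normalized surface potential u.

    Boltzmann statistics with complete ionization; majority is 'p' or 'n'.
    """
    s = 1.0 if majority == "p" else -1.0  # mirror the potential for n-type
    minority_ratio = (N_I / doping) ** 2
    g = (math.expm1(-s * u) + s * u) + minority_ratio * (math.expm1(s * u) - s * u)
    g = max(g, 0.0)                       # guard against round-off near u = 0
    magnitude = math.sqrt(2.0 * K_B * temp_k * doping / EPS_SI * g)
    return math.copysign(magnitude, u)


def gate_potential_for_field_match(u_ix, p_xx, n_gg, tol=1e-10):
    """Bisect for u_ig such that E_IG(u_ig) = -E_IX(u_ix)."""
    target = -boltzmann_field(u_ix, p_xx, "p")
    lo, hi = -60.0, 60.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if boltzmann_field(mid, n_gg, "n") < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed example: Pxx = 2e17 cm^-3, NGG = 3e19 cm^-3, substrate just past 2UF.
u_ig = gate_potential_for_field_match(35.0, 2e17, 3e19)
print(f"UIG = {u_ig:.4f} (about {u_ig * K_B * 300.0 / Q * 1e3:.1f} mV)")
```

Each bisection step costs one field evaluation, and the full Fermi-Dirac expressions are far more expensive per evaluation than the Boltzmann forms used here, which is why this step dominates the solver's run time.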

Polysilicon-Gate Effects

The effect of polysilicon gates, compared to metal gates, is a reduction of the gate

capacitance, Cg, when the gate is in depletion. This arises when the value of Cig falls

below that of Co and Cix, which only occurs during gate depletion and substrate inversion

or accumulation, and only then to a significant degree for thin oxides. This is easily

visualized from the three series capacitances--the one which dominates is the smallest,

and the capacitance due to the substrate and gate are both minimized during depletion

(and maximized during accumulation, as well as inversion for the LF case). As oxides

thin, the oxide capacitance increases, which causes the effect of the substrate and gate

depletion to have more control over the characteristics of the Cg-VG curve.
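
The dependence on oxide thickness can be illustrated with a two-capacitor estimate. The sketch below assumes the substrate is strongly inverted (Cix much larger than Co and Cig), models the gate as a depletion layer of fixed width, and uses a hypothetical 20 Å gate depletion width and an SiO2 permittivity of 3.9ε0; none of these numbers are taken from the devices in this chapter.

```python
EPS_0 = 8.854e-14      # vacuum permittivity (F/cm)
EPS_SI = 11.7 * EPS_0  # silicon permittivity
EPS_OX = 3.9 * EPS_0   # SiO2 permittivity (assumed value)


def normalized_gate_capacitance(t_ox_cm, w_gate_dep_cm):
    """Cg/Co for Co in series with a gate depletion capacitance (Cix taken very large)."""
    c_o = EPS_OX / t_ox_cm
    c_ig = EPS_SI / w_gate_dep_cm
    return c_ig / (c_ig + c_o)


# Same assumed gate depletion width (20 Angstrom) for a thick and a thin oxide.
for t_ox_angstrom in (1000.0, 50.0):
    ratio = normalized_gate_capacitance(t_ox_angstrom * 1e-8, 20e-8)
    print(f"Tox = {t_ox_angstrom:6.0f} A  Cg/Co = {ratio:.4f}")
```

For the same gate depletion width, the thick oxide loses well under 1% of Co while the thin oxide loses more than 10%, in line with the qualitative argument above.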

Figure 3.3 shows the difference between metal-gate and polysilicon-gate data,

normalized to Co, for two different technologies. The 'higher' pair of curves for a 1000

Å oxide (thick oxide means low oxide capacitance) shows little difference between

polysilicon gates and metal gates. The lower pair of curves for a 50 Å oxide (thin oxide

means large oxide capacitance) shows a large decrease in Cg for all values of VG,

particularly for VG > 1 V, where the gate is still in depletion and the substrate is inverted.


Fig. 3.3 Comparison of metal-gate and n+ poly-gate MOSC curves (Cg/Co versus VG)
for two different technologies. One set has 1000 Å oxide with Pxx=3×10^16
cm^-3 and the second set has 50 Å oxide with Pxx=2×10^17 cm^-3. In each
case, the VFB is adjusted to be -1 V and the gate doping is 3×10^19
cm^-3. Clearly shown is the dramatic difference between poly-gate
(dotted line) and metal-gate (solid line) for the 50 Å case, and the
negligible impact on the 1000 Å case: the polysilicon gate effects
increase as the oxide scales thinner.

This continual decrease in Cg for increasing VG (in this n+ poly-gate on p-Si substrate) is

often referred to as 'poly depletion,' since the polysilicon gate is still depleting.

Eventually the gate itself will invert, and the characteristics will be much improved.

However, the field resulting from the gate voltage required to invert the gate is typically

beyond the reliability limit of 4 MV/cm in properly scaled devices. In fact, the only way

to make the gate invert sooner is to lower the gate doping, which exaggerates the poly

depletion effect even more until the gate inverts.

It might seem, as it did to this author, that the ultimate solution would be to use

undoped gates, as they would invert much sooner and behave just like metal gates at

reasonably low applied gate voltages. This works well in simulation, but the question

then becomes: where is the supply of minority carriers to invert the gate? In particular, for

an n+ gate in a rapidly switching MOST, what would supply the holes? It has been shown

that, for at least one technology, the holes are likely supplied via thermal generation

(rather than ion impact) [50]. Thermal generation, then, could not supply the holes fast

enough for practical use of an undoped gate. However, it might be possible to design in a

minority carrier source nearby to supply minority carriers (similar to how the source and

drain supply minority carriers in the substrate).

The reduction in the gate capacitance due to poly depletion causes a reduction in

the drive current, which degrades circuit performance [51-54], since the amount of

current supplied by the transistor directly relates to the switching speed of the device. In

a complementary-MOS (CMOS) circuit, this current charges up the interconnect and

intrinsic capacitances of the next transistors in the line, as discussed in detail in Chapter

4. Because of this poly-gate ID reduction, there may eventually be a move back to metal

gates (or silicides) once the processing issues of gate alignment are solved.

It is instructive to look at the individual capacitance components to see how the

'complex' poly LFCV curve forms. Figure 3.4 shows such a curve for a theoretical 50.0

Å oxide with an n+ gate doped (rather lightly) to 9x10^18 cm^-3 and a substrate doped to

5x10^17 cm^-3. The gate area is 1x10^-4 cm^2 and the flatband voltage is -1.0 V. The Cg

curve, being the series combination of Co, Cix, and Cig (Eq. 3.17), is always lower than the

component curves. It can be clearly seen how each of these three components influences

the overall structure of the resulting gate capacitance. In fact, this 'regional' effect will

be used to help speed up parameter extraction in the next section.
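
As a minimal sketch of the series combination referred to above (Eq. 3.17), the following Python fragment combines the three components at a single gate voltage. The function name and the Cix and Cig values are hypothetical and chosen only for illustration; the Co value of about 69 pF follows from the parallel-plate formula for a 50 Å oxide of area 1x10^-4 cm^2, assuming a relative permittivity of 3.9 for SiO2.

```python
def series_capacitance(*components):
    """Combine capacitances in series: 1/Cg = sum of 1/Ci."""
    if any(c <= 0 for c in components):
        raise ValueError("capacitances must be positive")
    return 1.0 / sum(1.0 / c for c in components)


# Illustrative values in pF at one gate voltage (not read from Fig. 3.4).
c_o = 69.0     # oxide capacitance
c_ix = 400.0   # hypothetical substrate capacitance
c_ig = 150.0   # hypothetical gate (poly) capacitance

c_g = series_capacitance(c_o, c_ix, c_ig)
print(f"Cg = {c_g:.1f} pF, Cg/Co = {c_g / c_o:.2f}")
```

Whatever values are chosen, the result lies below the smallest component, which is why the smallest of the three capacitances dominates the shape of the curve in each region.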

Also of interest is a breakdown of potentials across the MOSC device as a

function of VG. Figure 3.5 shows the four components of VG, namely VIX, VIG, Vox,

and VFB (see Eq. 3.12) as a function of VG using the same parameters as the example in

the last paragraph. To show how these are related to the resulting gate capacitance, the

Cg-VG curve is also plotted. What is most relevant in this figure is that the primary

'dip' in the CV curve occurs as the surface potential in the substrate, VIX, sweeps from

accumulation to inversion (i.e. moves from a small negative number to about one volt),

and ends sharply as the surface potential approaches its maximum (strong inversion).

Similarly, the secondary polydepletion 'dip' occurs as the gate surface potential, VIG,

moves from accumulation to inversion (again, moves from a small negative voltage to

around a volt). Note that the final surface potential in the gate is higher than that in the

substrate (VIG > VIX when VG > 4 V). This agrees with the common approximation that

[Figure 3.4 plot: individual capacitance components, including Co, versus VG / (1 V) from -6 to 6.]

Fig. 3.4 Individual capacitance values for a theoretical 100x100 µm nMOSC
with a 50 Å gate oxide, Pxx=5.0x10^17 cm^-3, NGG=9x10^18 cm^-3, T=300
K, and VFB=-1.0 V. This figure demonstrates how the three series
capacitances (Cix, Cig, and Co) combine to give the overall gate capacitance.
See Fig. 3.5 for the corresponding potential breakdown.

[Figure 3.5 plot: potential components (left axis, -6 to 6, including VFB) and Cg (right axis, 0 to 70) versus VG / (1 V) from -6 to 6.]

Fig. 3.5 Individual potential breakdown for a theoretical 100x100 µm nMOSC
with a 50 Å gate oxide, Pxx=5.0x10^17 cm^-3, NGG=9x10^18 cm^-3, T=300
K, and VFB=-1.0 V, along with the corresponding LFCV curve. Note
how the surface potential in the substrate, VIX, increases rapidly in the
range VG = -1 to 0 V as Cg increases (substrate inversion) and the
similar increase in VIG in the range VG = 1 to 3 V (gate inversion). See
Fig. 3.4 for the corresponding capacitance breakdown.

the surface potential pins to a little over 2VF, since the Fermi voltage in the gate will be

larger than that of the substrate due to the greater gate doping.
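
The size of this difference can be estimated under the Boltzmann approximation, in which the Fermi voltage is (kT/q) ln(N/ni). The sketch below is an illustration only: the intrinsic density of 1x10^10 cm^-3 is an assumed value, the function name is hypothetical, and the model of this chapter uses Fermi-Dirac statistics, so the numbers are rough guides rather than model outputs.

```python
import math

V_THERMAL = 0.025852   # kT/q at 300 K, in volts
N_INTRINSIC = 1.0e10   # assumed intrinsic carrier density of Si at 300 K, cm^-3


def fermi_voltage(doping_cm3, n_i=N_INTRINSIC, v_t=V_THERMAL):
    """Boltzmann estimate of the Fermi voltage, VF = (kT/q) ln(N/ni)."""
    return v_t * math.log(doping_cm3 / n_i)


# Doping values of the Fig. 3.4 / 3.5 example.
two_vf_substrate = 2.0 * fermi_voltage(5e17)   # substrate, Pxx
two_vf_gate = 2.0 * fermi_voltage(9e18)        # gate, NGG

print(f"2VF substrate = {two_vf_substrate:.2f} V")
print(f"2VF gate      = {two_vf_gate:.2f} V")
```

Under these assumptions the two estimates come out at roughly 0.92 V for the substrate and 1.07 V for the gate, consistent with both surface potentials saturating near one volt and with the gate value being the larger of the two.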

Parameter Extraction Using the LFCV Model

Of the multitude of variables in the LFCV equations, most of them are known to a

reasonable degree of accuracy (such as the dielectric constant, energy gap, conduction-

band density, etc.), can be measured easily (temperature), or need not be known very

accurately (acceptor and donor trap level) due to their small effect. This leaves the gate

and substrate doping, the oxide thickness, and the flatband voltage as the 'unknown' parameters.


These parameters may be extracted by comparing experimental data to the

theoretical model presented in this chapter. This may appear to be an easy task, since the

model equation need only be supplied, along with some data, to a nonlinear least-squares

solver. However, the polysilicon-gate LFCV formula is doubly parametric (that is, Cg and

VG are related through two parameters, the surface potentials UIX and UIG), neither of

which is known from the data. Thus, solving this problem is nontrivial.

The first step toward a solution, then, is to write a program which will calculate

Cg given VG. This requires intensive calculations to find UIX and UIG for each VG, but

can be done since there is only one unique solution. Thus, with a Cg(VG) routine written,

a nonlinear least-squares-fit program can be used. The code written for this dissertation

took advantage of the fact that, as the solution converges to values of the unknown

parameters, the values of the surface potentials at each experimental data point could be

used as initial guesses in each subsequent iteration when finding each Cg (since the

parameters, namely substrate and gate doping, oxide thickness, and flatband voltage, should

not be changing too rapidly). This greatly increased the convergence rate over estimating

UIX and UIG afresh on each call, at the expense of additional code complexity and memory.
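
The benefit of reusing converged values can be sketched with a toy problem. The code below is not the extraction code of this dissertation: the implicit relation u + p(exp(u) - 1) = VG is a hypothetical stand-in for the surface-potential equations, and the function names are invented for illustration. It only shows how caching the solution at each data point reduces the Newton iteration count once a fitting parameter p changes slightly.

```python
import math


def solve_potential(v_g, p, u_start, tol=1e-12, max_iter=200):
    """Newton solve of the toy relation u + p*(exp(u) - 1) = v_g."""
    u = u_start
    for n_iter in range(1, max_iter + 1):
        residual = u + p * math.expm1(u) - v_g
        step = residual / (1.0 + p * math.exp(u))
        u -= step
        if abs(step) < tol:
            return u, n_iter
    raise RuntimeError("Newton iteration did not converge")


def sweep(v_points, p, cache=None):
    """Solve at every data point; start from cached values when available."""
    solutions, total_iterations = [], 0
    for i, v_g in enumerate(v_points):
        start = cache[i] if cache is not None else 0.0
        u, n_iter = solve_potential(v_g, p, start)
        solutions.append(u)
        total_iterations += n_iter
    return solutions, total_iterations


v_points = [0.5, 1.0, 2.0, 3.0]
cache, _ = sweep(v_points, p=1.00)                  # first fitting iteration
_, warm_iters = sweep(v_points, p=1.01, cache=cache)  # parameter moved slightly
_, cold_iters = sweep(v_points, p=1.01)               # same solve, no cache
print(warm_iters, cold_iters)
```

The cache costs one stored value per unknown per data point, which is the memory overhead mentioned above.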


3-Point Extraction Methodology

If the model were perfect, then it would require only three points to match the

experimental data to the model. Why only three data points for four parameters?

Because the additional constraint that one of the points should be the minimum of the

experimental LFCV curve can be used. From this information, the flatband can be found

by comparing the VG of the theoretical minimum with the VG of the experimental

minimum. The other three parameters can be found directly from the Cg values of the

three points. Figure 3.6 shows the three points, labeled Cg-acc, Cg-dep1, and Cg-dep2, as

they relate to the whole LFCV curve. Only Cg-dep1 is unique; the other two points can be

anywhere within their region.

The Cg-acc point is a point from the LFCV gate accumulation region. From this, a

good estimate of the oxide thickness can be found, since the other parameters have very

little influence over this point (see Figs. 3.7 and 3.8). Cg-acc asymptotically approaches

Co, which is inversely proportional to the oxide thickness, Tox, via the parallel plate

formula. There has been much research in obtaining Tox and/or Co from (substrate)

accumulation CV data [48,55-58].
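
The parallel-plate relation between Co and Tox can be sketched as follows. The permittivity values and function names are assumptions made for illustration (a relative permittivity of 3.9 for SiO2 is assumed); the area and thickness are those of the earlier theoretical example.

```python
EPS_0 = 8.854e-14   # vacuum permittivity, F/cm
K_OX = 3.9          # assumed relative permittivity of SiO2


def oxide_capacitance(t_ox_cm, area_cm2):
    """Parallel-plate oxide capacitance, Co = eps_ox * A / Tox, in farads."""
    return K_OX * EPS_0 * area_cm2 / t_ox_cm


def oxide_thickness(c_o_farad, area_cm2):
    """Invert the parallel-plate formula, Tox = eps_ox * A / Co, in cm."""
    return K_OX * EPS_0 * area_cm2 / c_o_farad


area = 1e-4          # gate area, cm^2
t_ox = 50e-8         # 50 angstroms expressed in cm
c_o = oxide_capacitance(t_ox, area)
print(f"Co = {c_o * 1e12:.1f} pF")
print(f"Tox recovered = {oxide_thickness(c_o, area) * 1e8:.1f} angstroms")
```

Because the measured accumulation capacitance only approaches Co from below, inserting a measured Cg-acc into `oxide_thickness` gives an upper estimate of Tox, which is why this value is treated as a starting estimate and not a final result.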

[Figure 3.6 plot: normalized Cg versus VG (arbitrary scaling), with the regions labeled Gate Accum., Gate Depletion 1, Gate Depletion 2, and Gate Inversion, and the minimum point Cg-dep1 marked.]

Fig. 3.6 Example of a general polysilicon-gate LFCV curve (n+ gate, p substrate for
this case) showing how all the important regions can be labeled in terms of the
gate state rather than the typical substrate state. This regional
breakdown is used to improve the speed and accuracy of the parameter
extraction routine.

The Cg-dep1 point is the minimum of the LFCV curve, and allows us to find the

substrate doping, since the substrate depletion region is strongly dependent on the

substrate doping concentration. In fact, depletion CV data can also be used to determine

the substrate doping profile [59-61]. Figure 3.7 shows LFCV data for several different

constant substrate doping concentrations, clearly demonstrating the strong dependence of

the location of Cg-dep1 on the substrate doping. This was also demonstrated in Fig. 3.4, since

the main influence in this depletion region is Cix, which itself is strongly dependent on

UF (see Eq. 3.8), which is directly related to the inverse Fermi-Dirac integral (natural

logarithm if assuming a Boltzmann distribution) of the substrate doping. The position of

the minimum along the VG axis also allows us to estimate the flatband voltage by

comparing the VG of the minimum of the theoretical curve to the VG of the data.

The Cg-dep2 point is from the gate depletion region. Figure 3.8 shows that the gate

doping has the most effect on this part of the curve, whereas Figure 3.7 shows that the

substrate doping has very little effect in this region. For the n+ gate on p-substrate

example in Figure 3.8, the substrate is in inversion. However, even if the substrate were

n-type (and the substrate thus accumulated), Cg depletion would still occur because the

gate would still be in depletion (of course, the entire curve would be shifted due to the

flatband difference). Hence, this point is called Cg-dep2, with the 'dep' in reference to the

depleted state of the gate.

By varying the parameters in the appropriate regions to match these three points, a

unique parameter set will be obtained which will describe a theoretical LFCV curve

passing through the three points.

[Figure 3.7 plot: normalized Cg versus VG / (1 V) from -4 to 4 for Pxx = 2.0x10^18, 4.0x10^17, and 2.0x10^16 cm^-3 (arrow: decreasing Pxx). Poly-n+Si-G/SiO2/pSi MOS LFCV, T = 300 K, VFB = -1.00 V.]

Fig. 3.7 Effect of substrate doping changes (Pxx) on LFCV characteristics. The
'depletion-1' region (see Fig. 3.6) is the region of largest impact.


[Figure 3.8 plot: normalized Cg versus VG / (1 V) from -4 to 4 for NGG values from 4.0x10^20 down to 8.0x10^18 cm^-3 (arrow: decreasing NGG). Poly-n+Si-G/SiO2/pSi MOS LFCV, T = 300 K, VFB = -1.00 V.]

Fig. 3.8 Effect of gate doping changes (NGG) on LFCV characteristics. The
'depletion-2' region (see Fig. 3.6) is the region of largest impact.

3-Region Extraction Methodology

As good as our model is, there are still several effects which are not being

considered. These include retrograde doping in the substrate and quantum effects in the

substrate inversion channel. Retrograde doping is commonly used for sub-half-micron

design to maintain a high sub-surface doping concentration to prevent punchthrough,

while still maintaining a low VT for low-VG operation (to accommodate the thin oxides)

[7]. Figure 3.9 shows an example of a retrograde profile from our internally-modified

MINIMOS. The LFCV model assumes a constant doping profile in both the substrate

and gate, and so deviation from this assumption will cause changes in the experimental

LFCV curve relative to the theoretical model.

Charge-carrier layer push-out due to quantum effects in the inversion and

accumulation layers has been an area of much research [62-65]. Experimental

verification of these quantum effects is invariably done at low temperatures, where phonons

will not broaden the quantum bands into a continuum. Although some amount of

quantum effect is likely present, it is probably impossible to model correctly when one

considers thermal broadening, SiO2/Si interface roughness and transitional regions, non-

random dopant distribution, and other non-idealities. These will all tend to broaden the

electron levels into a more classical continuum.

It has been noted that electrical and optical oxide thicknesses do not often agree,

and the difference has been attributed to quantum effects. As will be discussed later, the

effect is likely overestimated. More important, if there is a difference, it is the electrically

effective oxide thickness (as determined from electrical experiments, such as CV)


Fig. 3.9 Sample of retrograde doping profile, showing low surface concentration
(5x10^16 cm^-3) and higher bulk concentration (1x10^18 cm^-3).

which is most important compared to the optical thickness (which is not what affects

device performance).

Due to these two main non-idealities (non-constant doping and quantum effects),

there could be some variation in the parameters extracted using only three points.

That is, extracted parameters might be dependent on which points we choose for Cg-acc

and Cg-dep2. To overcome this, the entire curve could be fit to the model. This would

result in extremely long convergence times, as a partial derivative must be calculated for

each variable at every point for every iteration. However, Figures 3.7 and 3.8 show that

some parameters have no influence on the LFCV curve in certain gate-voltage regions.

Thus, the information provided from their partial derivatives does not help convergence,

and will actually slow down the convergence, not to mention waste computation time.


Instead of fitting all the data to the model, the data can be broken up into the same

three regions suggested in Fig. 3.6 for the three-point fit. Then the model can be fit using

only the parameter (or parameters) dominant in the specific region, thus greatly

improving the convergence rate (since the data being used is most relevant). This adds

complication to the coding, as the data partitioning into each region (discussed later) must

be automated, and a different fitting routine must be created for each of the three regions

(same model, but separate partial derivative calculations).

Methodology Comparison

Figure 3.10 shows a comparison of the fit using the 3-point and 3-region methods

to experimental data for a 130 Å oxide from an industrial 100x100 µm MOST.

[Figure 3.10 plots: Cg versus VG / (1 V) from -2.5 to 2.5. (A) 3-Point Fit, with the experimental points Cg-acc, Cg-dep1 (= Cg-min), and Cg-dep2 marked. (B) Full-Curve Fit, with the experimental data marked by 'x'.]

Fig. 3.10 Theoretically generated curves compared to original LFCV data using
the (A) three-point and (B) full-curve extractions on an n+ polysilicon
gate, 100x100 µm nMOST. Extracted parameters were: (A) Xox=130 Å,
Pxx=9.0x10^16 cm^-3, NGG=5.8x10^19 cm^-3, and VFB=-1.06 V. (B)
Xox=130 Å, Pxx=8.7x10^16 cm^-3, NGG=3.0x10^19 cm^-3, and

The solid line, of course, is the theoretical curve using the extracted parameters, and the

'x' marks are the data used for the extraction. The top (A) curve uses only the three

points marked to fit the data while the lower (B) curve uses the full set of data. The RMS

error (for the second curve), calculated from the square root of the sum of the squares of

the difference between theoretical and experimental capacitance, divided by the square

root of n - 5 (5 = degrees of freedom = 1 + number of fitting parameters), was 2.7%, an

excellent fit for only four parameters. There is not much in the literature against which to

compare the 'goodness' of these results, as there are not many capacitance models available (most

compact models, such as BSIM3 [34], fit IV characteristics only, and do not consider

capacitance extraction).
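
The RMS error described above can be sketched as follows. The text does not state how the result is normalized to a percentage; dividing each residual by the measured capacitance is an assumption made here, and the function name and sample data are hypothetical.

```python
import math


def rms_error(c_theory, c_exp, n_fit_params=4, relative=True):
    """sqrt(sum of squared residuals) / sqrt(n - (1 + n_fit_params)).

    With relative=True each residual is divided by the measured value
    (an assumed normalization), so the result is a fraction.
    """
    n = len(c_exp)
    dof = 1 + n_fit_params
    if len(c_theory) != n:
        raise ValueError("theory and experiment must have equal length")
    if n <= dof:
        raise ValueError("need more data points than degrees of freedom")
    total = 0.0
    for c_t, c_e in zip(c_theory, c_exp):
        residual = c_t - c_e
        if relative:
            residual /= c_e
        total += residual * residual
    return math.sqrt(total) / math.sqrt(n - dof)


# Hypothetical data: nine points, each modeled 10% too high.
c_measured = [10.0] * 9
c_modeled = [11.0] * 9
error = rms_error(c_modeled, c_measured)
print(f"RMS error = {100.0 * error:.1f}%")
```

Dividing by n - 5 instead of n penalizes fits made with few data points relative to the number of adjustable quantities.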

Aggressively scaled MOSC device data are given in Figure 3.11. This shows two

LFCV curves from a ~20 Å oxide from an industrial MOST, where the 20 Å was

determined from some optical method, most likely ellipsometry. This is a rather complex

figure, and requires some explanation. There are two experimental curves: one p+ poly-

gate and one n+ poly-gate, both on a p-well. Thus, in the VG > 1 V region, the n+ gate is

in depletion, while the p+ gate is in accumulation (clearly the curves were shifted to align

them, as the flatband should differ by about a volt between the two curves, although a

threshold adjustment implant would offset this somewhat). From looking at the data at

1.3 V and assuming Cg = Co, they conclude that the effective oxide thickness is 33 Å for

the depleted n+ gate, and 28.5 Å for the accumulated p+ gate device. This is a poor

approximation, since the oxide thickness value would vary greatly at different points

along the depleted curve. However, they correctly state that the difference between the

two is due to poly depletion, which is a reasonable statement when applied to that

specific gate voltage only. Thus, they attribute a 4.5 Å reduction in effective oxide

thickness to polysilicon depletion at 1.3 V, which is the operating voltage for that

technology. They next assert that the difference between the accumulated-gate curve

(28.5 Å based on assuming Cg = Co) and the optically measured oxide (20 Å) must be

entirely due to quantum effects. This conclusion is wrong, as the quantum model most

likely does not take all the effects mentioned previously into account (such as thermal

broadening, interface roughness and transitional region, not to mention retrograde

doping). More importantly, they are completely ignoring the severe error of using Cg =

Co.


Figure 3.12 shows the fit (using the three-point method) to the data in Figure 3.11.

From this fit, the extracted oxide thickness is 24.4 Å. Compared to the industry-stated

results, the same effective oxide thickness reduction due to poly (4.7 Å here versus 4.5 Å)

is seen. However, by correctly accounting for the distribution (instead of assuming Cg =

Co), an additional 4.1 Å reduction from the Fermi-Dirac distribution is also found (that is,

from the fact that Cg < Co)! This leaves 4.4 Å of difference between the extracted oxide

thickness of 24.4 Å and the optically measured thickness of 20 Å. This 4.4 Å difference

may include some quantum effects, but it may also be due to optical errors, such as not

accounting for the transitional layer properly [66-67] and/or some other effects (e.g.

doping profile). Ellipsometry and other optical methods cannot be used on the actual

device (since the gate electrode is not transparent), so they do not measure the oxide

thickness in the active part of the device, which may differ slightly due to the additional

[Figure 3.11 plot: Cg versus VG / (1 V) from -3 to 3 for P+Poly/p-well and N+Poly/p-well. Annotations: Co calculated from optical thickness; 8.5 Å quantum; Xo = 28.5 Å; 4.5 Å poly; Xo = 33 Å.]

Fig. 3.11 P+ poly/p-well and n+ poly/p-well '20 Å' industrial data. Data are shifted
to align minima, and labelling refers to the industrial interpretation of the
two curves. Please see text for explanation of this breakdown.
Compare to Fig. 3.12, which is the author's interpretation of the same
data after quantumless parameter extraction.

[Figure 3.12 plot: Cg versus VG / (1 V) from -3 to 3. Legend (Xo and gate doping): P+Poly/p-well, 24.4 Å; N+Poly/p-well, 24.8 Å; Pxx = 4.2x10^17 cm^-3. Annotations: Xo = 20.0 Å optical; 4.4 Å error; Xo = 24.4 Å; 4.1 Å Fermi; Xo = 28.5 Å; 4.7 Å poly; Xo = 33.2 Å.]

Fig. 3.12 Theoretically generated LFCV curves from three-point parameter extraction
using the data in Fig. 3.11. Extracted parameters are shown. Instead of
attributing the difference between the optical thickness of 20.0 Å and the
'extrapolated' thickness of 28.5 Å at VG=1.3 V to quantum effects, we
find that 4.1 Å is due to the distribution function used (Fermi-Dirac),
with the remaining 4.4 Å possibly due to error (in optical measurement
and/or other factors).

processing. As far as parameter extraction is concerned, the most important factor should

be agreement with electrical results, not optical.
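
The thickness breakdown discussed for Figs. 3.11 and 3.12 is simple arithmetic, and can be checked with the short sketch below. All values are those quoted in the text and figures; the variable names are illustrative only.

```python
# Thickness values quoted in the text and in Fig. 3.12, in angstroms.
t_optical = 20.0          # optically measured thickness
t_extracted = 24.4        # from the three-point LFCV extraction
t_apparent_accum = 28.5   # from Cg = Co at 1.3 V, accumulated p+ gate
t_apparent_depl = 33.2    # from Cg = Co at 1.3 V, depleted n+ gate

poly_depletion = t_apparent_depl - t_apparent_accum   # 4.7
distribution = t_apparent_accum - t_extracted         # 4.1 (Fermi-Dirac, Cg < Co)
unexplained = t_extracted - t_optical                 # 4.4

print(f"poly depletion : {poly_depletion:.1f} A")
print(f"distribution   : {distribution:.1f} A")
print(f"unexplained    : {unexplained:.1f} A")
print(f"industrial 'quantum' term: {distribution + unexplained:.1f} A")
```

The last line shows that the 8.5 Å attributed entirely to quantum effects in the industrial interpretation is the sum of the 4.1 Å distribution term and the 4.4 Å remainder.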

Although a three-point method was used here (to improve the match at the 1.3 V

point), a full fit to the data in Fig. 3.12 has about a 10% RMS error, which, considering

the thin oxide and the fact that the substrate gate doping is not constant, is extremely

good. An interesting side note, brought up during the proposal for this project, concerned

fitting a Boltzmann model to the data instead of a Fermi-Dirac model. The surprising result

of this was a better fit (6.6% RMS error), but, as would be expected, a larger extracted

oxide thickness of 27.5 Å. This is an interesting result, as it shows that using the wrong

model can appear to give 'better' results (in terms of fit), even though the resulting

parameters actually have greater error (due to the incorrect carrier distribution).

Philosophically, the issue of oxide thickness is an interesting topic. Many people

would argue that TEM is the only way to measure the 'true' thickness. However,

ignoring that this is a destructive and time-consuming technique, it only yields the

thickness of that particular cross section. What is desired is the average oxide thickness,

as it is the average oxide thickness which affects the amount of charge accumulated by an

applied voltage in a MOSC. This is why Fowler-Nordheim (FN) tunnelling is also not a

particularly good method--it will always underestimate the oxide thickness since the

tunnelling will occur in the thinner spots on the gate. Additionally, one would rather not

stress the devices while trying to find the oxide thickness. One recent technique for ultra-

thin oxide thickness determination is using quantum oscillations in the tunnelling gate

current, which are caused by quantum interference of electrons in the oxide conduction

band [68]. This method potentially suffers from the same problems as FN, and

additionally requires knowledge of the effective mass and oxide barrier height, which

add about 2.5 Å of error to the results [69], assuming they are known to

within 5%.

Convergence Speed-up Details

Iteration stops when none of the fitting parameters (i.e. the extracted parameters)

changes by more than 5x10^-4 % between successive serial cycles (i.e. each parameter was

fit in turn, and none changed by more than 5x10^-4 %). Below are several of the methods

employed to speed up this convergence.
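
A minimal sketch of this stopping test is given below. The function name and the sample parameter vector are hypothetical; only the 5x10^-4 % threshold is taken from the text.

```python
def has_converged(previous, current, tol_percent=5e-4):
    """True when no parameter changed by more than tol_percent percent.

    A parameter whose previous value is exactly zero only passes if it
    is still exactly zero, since a relative change is then undefined.
    """
    limit = tol_percent / 100.0
    return all(abs(new - old) <= limit * abs(old)
               for old, new in zip(previous, current))


# Hypothetical parameter vector: Tox (cm), log10 Pxx, log10 NGG, VFB (V).
before = [24.4e-8, 17.0, 19.5, -1.06]
small_move = [value * (1.0 + 1e-6) for value in before]   # 1e-4 % change
large_move = before[:3] + [before[3] * (1.0 + 1e-5)]      # 1e-3 % change in VFB

print(has_converged(before, small_move))
print(has_converged(before, large_move))
```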

The calculations involved in the extraction are extremely complex. Of greatest

importance is the convergence speed of the poly LFCV model, which itself must

converge on the two surface potentials just to give one Cg data point for a given VG. This

one data point is used in the numerical partial derivatives for the non-linear least-squares-

fit, which means the Cg(VG) is called twice for each experimental data point for each

iteration! Because this model is called so frequently, it is important to keep track of all

the converged surface potentials for each experimental data point so that the LFCV model

has a good estimate for subsequent calls.

The delta used for the numerical partial derivatives, as it turns out, has a major

influence on the correctness of the fit. Since two of the parameters vary logarithmically

(the substrate and gate doping), the actual parameters used during the fit are the logs of

these parameters. Thus, the deltas used for the derivatives must be calculated differently.

Another problem is that the delta used for the oxide thickness is dependent upon the

actual thickness of the oxide (that is, it should be different for a 50 Å oxide compared to a

1000 Å oxide). The empirical results for the best deltas, as determined from analysis of the

numerical derivatives, were Δ=10^-7 for ln(Nxx), Δ=0.1 for ln(NGG), Δ=10^-4 for VFB, and

10^-7 x Tox for Tox. For example, the partial derivative of Cg with respect to Tox is

calculated from (Cg(VG)1 - Cg(VG)2)/2Δ, where Cg(VG)1 is the gate capacitance

calculated at some VG with an oxide thickness of Tox + Δ and Cg(VG)2 is the gate

capacitance calculated at the same VG with an oxide thickness of Tox - Δ. The reason

arbitrarily small values cannot be used, of course, is because the LFCV model itself is

only accurate to about eight digits (less near flatband) due to its own internal convergence

criteria [46].
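
The central-difference derivative with a thickness-scaled delta can be sketched as below. Since the full LFCV model is not reproduced here, a parallel-plate capacitance is used as a stand-in for Cg(VG); that substitution, the function names, and the permittivity value are assumptions made purely so that the numerical result can be compared against a known analytic derivative.

```python
EPS_OX = 3.9 * 8.854e-14   # assumed SiO2 permittivity, F/cm
AREA = 1e-4                # gate area, cm^2


def central_difference(model, x, delta):
    """Two-sided numerical derivative: (f(x + d) - f(x - d)) / (2 d)."""
    return (model(x + delta) - model(x - delta)) / (2.0 * delta)


def cg_stand_in(t_ox_cm):
    """Stand-in for the LFCV model: parallel-plate capacitance only."""
    return EPS_OX * AREA / t_ox_cm


t_ox = 50e-8                # 50 angstroms in cm
delta = 1e-7 * t_ox         # delta scaled with the oxide thickness
numeric = central_difference(cg_stand_in, t_ox, delta)
analytic = -EPS_OX * AREA / t_ox ** 2
relative_error = abs(numeric - analytic) / abs(analytic)
print(f"relative error of numerical derivative: {relative_error:.1e}")
```

Each derivative costs two model evaluations per data point. With a model accurate to only about eight digits, a delta much smaller than the one used here would let the model's own convergence noise dominate the difference.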

Finally, to start the extraction, a reasonable initial guess must be made. The initial

guess of the oxide thickness is simply Aεox/Cgmax, where Cgmax is the maximum gate

capacitance in the dataset and A is the gate area. This is the standard first-order

approximation based on Cg = Co. For the substrate doping initial guess, the asymptotic

high-frequency CV formula for Cg [43] is solved iteratively using the minimum and

maximum Cg values from the data set. The gate doping is simply set to 3x10^19 cm^-3.

With these three parameters approximately known, the flatband voltage is estimated from

VGmin,data - VGmin,theory. Since the minimum of the CV data is not necessarily given, but

is needed internally to estimate the flatband, the minimum three data points (in terms of

Cg) are used to estimate the true Cg minimum based on the parabolic minimum formula

[70]. This slightly improves the convergence, but not as much as would be expected,

largely because the minimum of the CV curve is not very parabolic.
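
The parabolic estimate of the minimum can be sketched as follows, using the standard three-point vertex formula. Whether this matches the exact form in [70] is not confirmed here, and the function names and sample data are hypothetical.

```python
def parabolic_minimum(p1, p2, p3):
    """Vertex (x, y) of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if den == 0.0:
        raise ValueError("points are collinear; no parabolic vertex")
    x_min = x2 - 0.5 * num / den
    # Evaluate the interpolating parabola (Lagrange form) at the vertex.
    y_min = (y1 * (x_min - x2) * (x_min - x3) / ((x1 - x2) * (x1 - x3))
             + y2 * (x_min - x1) * (x_min - x3) / ((x2 - x1) * (x2 - x3))
             + y3 * (x_min - x1) * (x_min - x2) / ((x3 - x1) * (x3 - x2)))
    return x_min, y_min


def estimate_cv_minimum(vg, cg):
    """Use the three lowest-Cg data points to estimate the true minimum."""
    lowest = sorted(zip(vg, cg), key=lambda point: point[1])[:3]
    return parabolic_minimum(*sorted(lowest))


# Hypothetical data sampled from an exact parabola with minimum at (1, 2).
vg = [-1.0, 0.0, 0.5, 2.0, 3.0]
cg = [(v - 1.0) ** 2 + 2.0 for v in vg]
v_min, c_min = estimate_cv_minimum(vg, cg)
print(v_min, c_min)
```

For data that really are parabolic the vertex is recovered exactly even though no sample sits at the minimum; for a real CV curve, whose minimum is not very parabolic, the estimate is only approximate.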

Because it was clear that convergence was slower as the results approached the

final values, a 'trick' was developed to improve this end case. Whenever a trend was

visible during a fit, the routine doubled the amount of the parameter increase. A trend, in

this case, is defined as three successive moves of a parameter in the same direction for all

the parameters (possibly different directions for different parameters). This cut down the

number of iterations by about 20% in most cases.
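
One possible reading of this trend rule is sketched below. The text leaves some details open, so the following is an interpretation and not the original routine: a trend is taken to mean that every parameter has moved in its own consistent direction over the last three recorded steps, after which the next proposed step is doubled. The function names are hypothetical.

```python
def trend_detected(step_history, window=3):
    """True if every parameter kept its own direction over the last steps."""
    if len(step_history) < window:
        return False
    recent = step_history[-window:]
    for j in range(len(recent[0])):
        moves = [steps[j] for steps in recent]
        same_direction = all(m > 0 for m in moves) or all(m < 0 for m in moves)
        if not same_direction:
            return False
    return True


def accelerated_step(proposed_step, step_history):
    """Double the proposed parameter changes when a trend is present."""
    factor = 2.0 if trend_detected(step_history) else 1.0
    return [factor * s for s in proposed_step]


# Two parameters: the first keeps increasing, the second keeps decreasing.
trending = [[0.10, -0.20], [0.05, -0.10], [0.02, -0.05]]
# The first parameter reverses direction once, so there is no trend.
wandering = [[0.10, -0.20], [-0.05, -0.10], [0.02, -0.05]]

print(accelerated_step([0.01, -0.02], trending))
print(accelerated_step([0.01, -0.02], wandering))
```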

One thing which would have improved the speed of convergence greatly would have been

to use a simpler model for the Fermi-Dirac integral. The Cody-Thatcher model [71] is

extremely accurate, but requires the quotient of ten exponentials from a Chebyshev

approximation. This approximation was used instead of some other simpler (though less

accurate) approximations [72-76] because it was desired to add as little error as possible

from the Fermi-Dirac integral calculation.



In this chapter, the relatively obscure subject of intrinsic capacitances will be

discussed. The area of MOS intrinsic capacitance has received little attention over the

years due to the difficulty of measurement and small impact relative to extrinsic

capacitances such as interconnect and packaging. However, as the push toward higher

density continues, the extrinsic capacitance is being reduced as much as possible to

improve performance. This will eventually leave the intrinsic capacitance as the primary

load in CMOS circuits, thus making this a topic worth studying now. After a discussion

of the intrinsic capacitances which most affect CMOS circuits (Cgd and Cg.), direct

experimental measurements of the effect of hot-carrier degradation on intrinsic

capacitance will be discussed, and the results modeled. The impact of this degradation on

circuit performance will be evaluated and shown to offset some of the losses due to ID degradation.



The effect of hot-carrier degradation on the drain current, ID, has been studied

intensely since Abbas's initial observation in 1975 [77]. Another intrinsic property of a

MOS transistor, the intrinsic capacitance, has a much shorter history of study with regard

to hot-carrier degradation. The first systematic study of intrinsic capacitances was done

by Sah [13] in 1964, which was used by Meyer in 1971 [78] in his widely referenced

work. In his paper he defines the intrinsic capacitance between terminals as:

Cxy = dQx/dVy

That is, the change in the charge at terminal x due to a change in voltage at terminal y.

This definition applies to any two-or-more terminal device, but from now on will be used

with respect to a 4-terminal MOS transistor. Thus, it is clear that there are 16 possible

intrinsic capacitance terms for a 4-terminal MOS transistor. Please note that in this

small-signal definition, all of the non-y terminals are at virtual ground. Thus, dVy is

referenced to ground (i.e. it is essentially relative to all the other terminals).

At first thought, one might assert that there are only 8 possible capacitances

since Cxy=Cyx. However, this is not true because intrinsic capacitances, as defined here,

do not represent static capacitive values and are not reciprocal. Consider the two

intrinsic capacitances Cgd and Cdg. Neglecting overlap capacitance, when the applied

gate voltage is less than the gate threshold voltage, VGT, both of these capacitances

should be zero (since both Cgd=dQg/dVd and Cdg=dQd/dVg are zero due to no existing

channel). Once VGS > VGT (and VDS < VDSsat), Cgd and Cdg will both have some finite

positive value when the channel forms. The interesting case is when VGS > VGT and VDS

> VDSsat. Now there is a channel, but it is 'pinched off' near the drain end. Cdg

(dQd/dVg) is non-zero since a change in the gate voltage still affects the charge associated

with the drain (Qd); Cgd (dQg/dVd) is zero since a change in the drain voltage has no

effect on the gate charge since the drain is not connected to the channel due to the pinch-

off. As clear as this seems now, both Meyer [78] and others [79] assumed that the

capacitances should be reciprocal. These should not be confused with the small-signal

circuit element terms, which are named the same way but actually are reciprocal by definition.
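
The saturation case can be illustrated numerically. The sketch below uses idealized long-channel saturation charges with the bulk charge neglected and a 40/60 drain/source partition of the channel charge. That charge model, the normalization W*L*Cox = 1, the threshold value, and all function names are assumptions brought in for illustration only; they are not taken from this chapter.

```python
W_L_COX = 1.0   # normalized W * L * Cox (assumed)
V_T = 0.5       # assumed threshold voltage, V


def saturation_charges(bias):
    """Idealized terminal charges in saturation (bulk charge neglected).

    Assumes a 40/60 drain/source partition of the channel charge; none
    of the charges depends on the drain voltage beyond pinch-off.
    """
    v_ov = bias["g"] - bias["s"] - V_T
    return {
        "g": (2.0 / 3.0) * W_L_COX * v_ov,
        "d": -(4.0 / 15.0) * W_L_COX * v_ov,
        "s": -(2.0 / 5.0) * W_L_COX * v_ov,
    }


def intrinsic_capacitance(x, y, bias, dv=1e-6):
    """Cxy = dQx/dVy by central difference, other terminals held fixed."""
    up, down = dict(bias), dict(bias)
    up[y] += dv
    down[y] -= dv
    return (saturation_charges(up)[x] - saturation_charges(down)[x]) / (2.0 * dv)


bias = {"g": 2.0, "d": 3.0, "s": 0.0}   # VDS > VGS - VT, i.e. saturation
c_gd = intrinsic_capacitance("g", "d", bias)
c_dg = intrinsic_capacitance("d", "g", bias)
print(f"Cgd = {c_gd:.3f}, Cdg = {c_dg:.3f}  (units of W*L*Cox)")
```

With Cxy defined as dQx/dVy and no sign convention applied, Cdg comes out non-zero (and negative in this toy model) while Cgd is zero, so the two cannot be equal.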


Ward and Dutton [80] were the first to argue that the intrinsic capacitances

were, in fact, non-reciprocal. The paper also stressed the importance of including all the

capacitances, particularly the gate to bulk (Cgb) capacitance, which had been omitted by

Sah, and hence Meyer. Ward and Dutton's charge-based model was a huge improvement

at the time, as Meyer's model does not guarantee charge-conservation in circuit

simulators (due to omitting Cgb), resulting in erroneous results for the simplest of circuits.

Papers predating Meyer's work largely used discrete devices, and so authors

logically argued that modeling the intrinsic capacitances would be useless since the

capacitance from packaging and external circuitry would be vastly larger [81].

Furthermore, there was no direct method to measure the data to verify the models. With

the advent of integrated circuits, the primary capacitive load between CMOS circuit cells

(i.e. an nMOS and pMOS inverter pair) became dominated by the intrinsic and

interconnect capacitances, rather than the packaging and external circuitry. Thus,

modeling the intrinsic capacitance (as well as interconnect) became important.

Integrated circuits also heralded the need for compact models to simulate large

numbers of transistors. One of the first compact models was CSIM [82] from AT&T Bell

labs. Surprisingly, the authors of this model stayed with the simple Meyer model,

although they argued that including the intrinsic capacitances was critical, particularly Cgd.

Cgd accounts for most of the intrinsic capacitance load in CMOS circuits due to the Miller

feedback effect [82]. Berkeley's BSIM [33], built upon CSIM, also retained the Meyer

model. BSIM2, however, corrected this deficiency by including a non-reciprocal intrinsic

capacitance model. The BSIM3 [34] model moved from a strongly empirical d.c. model

to a more physically-based model, but retained the unaltered a.c. model (including

intrinsic capacitances) from BSIM2, suggesting a lag in a.c. model development.

Current a.c. models are extremely poor. A great deal of additional research is

needed before a.c. models become nearly as sophisticated as d.c. MOS current models.

There are two reasons the a.c. models are so far behind the d.c. models. First, intrinsic

capacitance data have only been available since the early 1980s, over twenty years after

the first MOS transistor ID data. Second, until recently, external capacitances and

interconnect capacitances dominated the total capacitive load, making the intrinsic

capacitance fairly unimportant. However, as the transistor dimensions have decreased

and substantial improvements in drain current density have become difficult due to physical

limitations, major efforts have been made to reduce the interconnect capacitances,

such as the use of low-k dielectrics. This has increased the impact of intrinsic capacitances on

overall circuit performance and, with improved interconnect, they could become the

predominant capacitive load in the circuit. It is interesting to note that publications on

intrinsic capacitance modeling have been increasing year-to-year since Ward and

Dutton's work [83-90].

Measurement of Intrinsic Capacitances

Because direct measurement of the intrinsic capacitance is difficult, many of

the first measurements were done with on-chip circuitry using reference capacitors [92-

93] or op-amp circuits configured as coulombers [94]. Eventually, external circuitry was

used, including a lock-in amplifier connected to an HP 4145 (as a voltage source) [95]

and later an off-the-shelf LCR meter [96], such as the HP 4275A. In this section, the

measurement of a few of the intrinsic capacitances will be described. These can be done

using an HP 4275 or HP 4276 (same equipment with different a.c. frequency ranges), or

the newer HP 4284.

The first discussion of using an LCR meter for the direct measurement of

intrinsic capacitances was written by K. C.-K. Weng and P. Yang in 1985 [96]. In this

letter, many of the important problems with measuring the intrinsic capacitances were

discussed. The main problem is that LCR meters are not designed to measure intrinsic

capacitances. There are two sets of terminals on the LCR meter: High and Low. The

high port applies the d.c. bias as well as the superimposed a.c. test signal. The low port

measures the resulting small-signal current. From the magnitude and phase difference of

the current relative to the applied small-signal test voltage, the capacitance can be found.

Unfortunately, the low port is a virtual a.c. and d.c. ground, so no d.c. bias may be

applied to it. To measure Cgd (dQg/dVd), the high port is attached to the drain (to apply

the dVd) while the low port is attached to the gate (to measure the dQg via the

small-signal current, ig times dt). If Cgd is desired as a function of VGS, the problem becomes

apparent: How can VGS be ramped if the gate is grounded?

The only solution, of course, is to independently bias the three terminals not

connected to the low port, as shown in Figure 4.1 for a Cgd measurement. Thus, two

additional power supplies are required, along with the internal d.c. power supply in the

LCR meter. These power supplies must be well calibrated with one another to ensure that

no potential difference exists between them when the same voltage is programmed. The

burden of negotiating the polarities of the three power supplies, once worked out, can be

easily programmed into an automated station. As an example of the polarity problem,

consider the following: if Cgd at VGS=2 V, VDS=3 V, and VXS=0 V (note: the device is

active, with a current flowing from the drain to the source, unlike standard C-V

measurements, where the source, drain, and substrate are tied together) is desired, the

source and substrate can be biased at -2 V and the drain can be biased at 1 V. Since the

gate is virtual ground (VG=0), it is easy to verify that the above applied voltages give the

desired potential differences (VGS, VDS, and VXS). There is nothing particularly odd

about this configuration except that it differs from the traditional C-V measurements

where the substrate is the ground reference instead of the gate.
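
The polarity bookkeeping in the example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used for the measurements: the function name terminal_biases and its arguments are hypothetical, and it assumes that the terminal attached to the low port sits at exactly 0 V while every other terminal is driven by its own supply.

```python
def terminal_biases(vgs, vds, vxs, low_terminal="gate"):
    """Absolute voltage to apply to each terminal (hypothetical helper).

    vgs, vds, vxs are the desired potential differences relative to the
    source. The terminal on the LCR low port is assumed to be a virtual
    ground, so every potential is shifted until that terminal reads 0 V.
    """
    # Potentials expressed relative to the source terminal.
    relative = {"source": 0.0, "gate": vgs, "drain": vds, "substrate": vxs}
    if low_terminal not in relative:
        raise ValueError(f"unknown terminal: {low_terminal}")
    shift = relative[low_terminal]
    return {name: v - shift for name, v in relative.items()}


# Cgd measurement from the text: gate on the low port, VGS=2 V, VDS=3 V, VXS=0 V.
biases = terminal_biases(vgs=2.0, vds=3.0, vxs=0.0, low_terminal="gate")
print(biases)
```

For the case in the text this gives -2 V on the source and substrate, 0 V on the gate, and 1 V on the drain, which agrees with the values quoted above.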

In the above case of Cgd, the source and substrate may be tied together to

forgo one of the power supplies in Figure 4.1. If a nonzero VXS were required,

however, all terminals would have to be biased independently. Thus, if one is designing a

measurement station where any of the possible intrinsic capacitances can be measured,

three power supplies (including the internal one of the LCR meter) are necessary.

Fig. 4.1 Measurement configuration for Cgd. Requires LCR meter with internal
d.c. power supply, as well as two additional external d.c. power supplies.

Measurement Configurations

Although the standard textbook MOS device is symmetric with respect to

interchanging the source and drain, production devices may be asymmetric. This

asymmetry may be the result of implant shadowing, drain and/or source engineering, or

hot-carrier-induced degradation, among other possibilities. Implant shadowing is an

interesting case, as it may result in the gate/source and gate/drain overlap regions being

different lengths, as shown in Figure 4.2. While the resulting ID characteristics are

symmetric (that is, the ID versus VDS characteristics are the same if the source and drain

leads are swapped), the measured Cgd characteristics (as well as Cgs, Cdg, and Cds) are

asymmetric. This occurs because the measured characteristics include the constant

overlap component, as shown in the following simple equation:

$$C_{gd\_measured} = C_{ov\_drain} + C_{gd}.$$

The Cov_drain term is composed of the constant overlap of the gate with the drain, as well

as an inner and outer fringe component. These fringe components have been calculated

theoretically [97], and assuming they are constant as a function of gate voltage introduces

negligible error [96]. The value of the measured Cgd in subthreshold (where Cgd_measured

= Cov_drain) has been used to estimate the length of the gate-to-drain overlap region [98],

and with the drawn channel length known, these overlap values could be used to extract

the effective channel length.
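
As a rough illustration of the overlap-based extraction described above, the following Python sketch converts a subthreshold capacitance plateau into an overlap length and then into an effective channel length. It assumes a simple parallel-plate overlap (Cov = Cox W Lov), lumps the fringe components into the overlap (so Lov is overestimated), and takes the drawn length as the physical gate length. The function names and all numerical values are hypothetical and are not measured data from this work.

```python
def overlap_length_um(c_ov_fF, width_um, cox_fF_per_um2):
    """Parallel-plate estimate of the overlap length (fringing ignored)."""
    return c_ov_fF / (cox_fF_per_um2 * width_um)


def effective_length_um(l_drawn_um, c_ov_drain_fF, c_ov_source_fF,
                        width_um, cox_fF_per_um2):
    """Drawn length minus the drain-side and source-side overlap lengths."""
    l_ov_drain = overlap_length_um(c_ov_drain_fF, width_um, cox_fF_per_um2)
    l_ov_source = overlap_length_um(c_ov_source_fF, width_um, cox_fF_per_um2)
    return l_drawn_um - l_ov_drain - l_ov_source


# Hypothetical inputs: W = 20 um, Cox = 5 fF/um^2, asymmetric overlaps.
l_eff = effective_length_um(l_drawn_um=0.40, c_ov_drain_fF=10.0,
                            c_ov_source_fF=7.0, width_um=20.0,
                            cox_fF_per_um2=5.0)
print(f"Estimated effective channel length: {l_eff:.2f} um")
```

Because the two overlaps are extracted separately (one from Cgd, one from Cgs in subthreshold), an asymmetric device is handled without any extra assumption.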

When necessary, the 'normal' and 'reverse' configurations of Cgd and Cgs

measurements will be specified. These are shown in Figure 4.3. Cgd_norm refers

to the 'normal' measurement mode, where the high port is applied to the drain for a Cgd




Fig. 4.2 Simplified schematic of asymmetric gate overlap, which results in
Cov_drain ≠ Cov_source.

Fig. 4.3 Measurement configurations for (A) Cgd in normal configuration mode;
(B) Cgs in normal configuration mode; (C) Cgd in reverse configuration
mode; and (D) Cgs in reverse configuration mode.

measurement. Cgd_rev refers to the 'reverse' measurement mode, where the high

port is applied to the source for a Cgd measurement. This is necessary because, for

short-channel devices, the resulting Cov value (where Cov is Cov_drain or Cov_source) can become

a significant fraction of the total effective intrinsic capacitance. Although perhaps not

obvious now, Cgs = Cgd when VDS=0. However, due to the difference in Cov, Cgs_measured

may not equal Cgd_measured. Figure 4.4 shows the Cgs and Cgd measurements in the normal

and reverse modes for a 20 x 20 μm device. Figure 4.5 shows the same measurements on

a 20 x 0.40 μm device (effective channel length is 0.24 μm). Comparing the two figures

clearly shows the negligible impact of Cov on the long-channel device Cgd and Cgs

characteristics and the large impact on the short-channel device. In both cases, the Cgd

and Cgs values are almost identical, as is the overlap-induced difference of about 3 fF.

(This 3 fF offset is not visible on the Ldrawn=20 μm device because it contributes less than

2% to the maximum capacitance, whereas the overlap contributes about 60% of the total

measured maximum capacitance for the Ldrawn=0.40 μm device.)

Later in this chapter, the results of channel hot-carrier stress on Cgd and Cgs

will be shown. Because channel hot-carrier stress is inherently asymmetric (since the

damage occurs near the drain edge), it is necessary to lay down the above notation for

later use.

Sample Measurements

For all capacitance measurements in this chapter, an HP 4284A LCR meter

was used with a small-signal test voltage of 60 mV peak-to-peak at 400 MHz. These

numbers were chosen after testing a wide range of a.c. signal voltages and frequencies

Fig. 4.4 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20 x 20 μm
MOST with VDS=0.0 V. Although it appears that all four curves are the
same, there are actually two sets of curves, Cgd_norm/Cgs_rev and
Cgd_rev/Cgs_norm, separated by 3 fF. Very little difference is seen because
the overlap capacitance shift is much less than the peak Cgd and
Cgs values. Compare this with Fig. 4.5.

Fig. 4.5 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20 x 0.40 μm
MOST with VDS=0.0 V. Roughly 3 fF parallel shift of Cgs_norm/Cgd_rev
and Cgs_rev/Cgd_norm is due to a difference in constant overlap
capacitance between the source and drain.

to obtain the most accurate results. The 60 mV signal may seem a little large to those

familiar with common C-V measurements, where 25 mV is typically used, but is actually

on the low end of the 23 mV to 400 mV found in most intrinsic capacitance papers [95-

96,98-106]. Frequencies below 100 MHz result in extremely poor-resolution (noisy)

intrinsic capacitance data, while frequencies above 500 MHz begin to show a

marked reduction due to series resistance. LCR-specific settings on the HP 4284A were a

medium integration time with 8-cycle averaging.

So far the measurement procedures and naming conventions of intrinsic

capacitance have been discussed. Figures 4.4 and 4.5 showed sample measurements with

VDS=0. Although this is the typical way capacitances are measured, the ability to

measure the capacitance of active devices, where VDS > 0 when VGS > VGT (where VGT

is the threshold voltage at which an inversion channel forms between the source and

drain), is important. Why is this capability important? Because in a real circuit, this will

commonly occur. If a correct model for the behavior of an operating transistor is desired,

then data from an active device is required. Indeed, without this data, it would be like

trying to verify an IDsat model with data only taken in subthreshold!

Examples of Cgd measurements on active devices are shown in Figures 4.6

and 4.7 for 20 x 20 μm and 20 x 0.40 μm devices as a function of VGS for VDS = 0.0, 0.5, and 1.0

V (VSX = 0.0 V). Cgd transitions from Cov_drain to a larger value once VDS < VDSsat, that is,

once the channel is no longer pinched off. From a charge perspective, this means changes in VDS

(dVd) cause changes in Qchannel, which in turn cause changes in Qg (dQg), resulting in a

Fig. 4.6 Cgd versus VGS for a 20 x 20 μm MOST with VDS=0.0, 0.5, and 1.0 V.

Fig. 4.7 Cgd versus VGS for a 20 x 0.40 μm MOST with VDS=0.0, 0.5, and
1.0 V.

Cgd. Thus, as VDS increases, the point at which this transition occurs also increases, as

can be seen in the figures.

As mentioned previously, Cgd is the most important intrinsic capacitance

because, in a common-source configuration (which is the configuration for all CMOS

circuits), the effective load is 2(Cgs + Cgd(1 - Av)), where Av is the gain between the gate

input and drain output (a large negative number).
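
To give a feel for why Cgd dominates, the short Python sketch below evaluates the load expression quoted above, read as 2(Cgs + Cgd(1 - Av)). The function name and the capacitance and gain values are hypothetical and chosen only for illustration; they are not taken from the measurements in this chapter.

```python
def effective_load(c_gs, c_gd, a_v):
    """Miller-multiplied gate load: 2 * (Cgs + Cgd * (1 - Av)).

    a_v is the (negative) voltage gain from the gate input to the drain
    output. Capacitances may be in any consistent unit (fF here).
    """
    return 2.0 * (c_gs + c_gd * (1.0 - a_v))


# Hypothetical values: equal Cgs and Cgd of 15 fF, gain of -10.
c_gs_fF, c_gd_fF, gain = 15.0, 15.0, -10.0
total_fF = effective_load(c_gs_fF, c_gd_fF, gain)
cgd_share = 2.0 * c_gd_fF * (1.0 - gain) / total_fF
print(f"load = {total_fF} fF, Cgd share = {cgd_share:.1%}")
```

Even with equal Cgs and Cgd, the Miller factor (1 - Av) makes the Cgd term account for more than 90% of the load in this hypothetical case.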

The next most important capacitance, based on the above load formula, is Cgs.

Figures 4.8 and 4.9 show both Cgs and Cgd for a 20 x 20 μm and a 20 x 0.40 μm device as a

function of VGS for VDS = 0.0, 0.5, and 1.0 V. Unlike Cgd, Cgs will have a finite value as

long as VGS is greater than VGT, since the channel will always be connected to the source.

At VDS=0, Cgs=Cgd since the channel charge is equally controlled by the source and

drain. However, if VGS > VGT (channel forms) and VDS > VDSsat (drain pinched off),

then the source terminal will actually control more than half of the channel charge,

resulting in a rise in Cgs above the value at VDS=0. However, once VGS increases to a

point that VDS < VDSsat, the drain is no longer pinched off, and the Cgs value begins to

decline with increasing VGS as Cgd increases rapidly. This is clearly demonstrated in

Figure 4.8 (and to a lesser extent in 4.9), where the decline in Cgs corresponds to the

increase in Cgd. The model for Cgd and Cgs will be discussed later. Recalling the

discussion about the overlap-capacitance shifting in the previous section, the capacitances

shown in Figures 4.8 and 4.9 are actually Cgd_norm and Cgs_rev in order to offset the effects of the

overlap capacitance (Fig. 4.5 shows why this was necessary).
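
The qualitative behavior described above (Cgs = Cgd at VDS = 0, with Cgs rising and Cgd vanishing once the drain pinches off) is reproduced by the textbook long-channel Meyer expressions. The Python sketch below is not the model developed in this work. It assumes VDSsat = VGS - VGT, ignores the overlap capacitance, normalizes to the total gate oxide capacitance, and uses hypothetical names and bias values.

```python
def meyer_cgs_cgd(vgs, vds, vgt, cox_total=1.0):
    """Textbook Meyer Cgs and Cgd (long channel, strong inversion only)."""
    vov = vgs - vgt                      # gate overdrive at the source end
    if vov <= 0.0:
        return 0.0, 0.0                  # no channel: intrinsic parts vanish
    if vds >= vov:
        # Saturation: drain pinched off, source controls 2/3 of the charge.
        return (2.0 / 3.0) * cox_total, 0.0
    vov_drain = vov - vds                # gate overdrive at the drain end
    denom = (vov + vov_drain) ** 2
    c_gs = (2.0 / 3.0) * cox_total * (1.0 - vov_drain ** 2 / denom)
    c_gd = (2.0 / 3.0) * cox_total * (1.0 - vov ** 2 / denom)
    return c_gs, c_gd


# Hypothetical threshold of 0.5 V; sweep VGS at VDS = 1.0 V.
for vgs in (1.0, 1.5, 2.0, 2.5):
    c_gs, c_gd = meyer_cgs_cgd(vgs, vds=1.0, vgt=0.5)
    print(f"VGS={vgs:.1f} V  Cgs/Cox={c_gs:.3f}  Cgd/Cox={c_gd:.3f}")
```

In this sketch Cgs sits at two thirds of the oxide capacitance while the drain is pinched off and then declines toward one half as VGS increases, while Cgd rises from zero toward one half, which mirrors the trend described for Figure 4.8.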

Fig. 4.8 Cgd and Cgs versus VGS for a 20 x 20 μm MOST with VDS=0.0, 0.5, and
1.0 V.

Fig. 4.9 Cgd versus VGS for a 20 x 0.40 μm MOST with VDS=0.0, 0.5, and
1.0 V.

Channel Hot-Carrier Stress Effects on Cgd and Cgs

Because the intrinsic capacitances are somewhat difficult to measure, and because

intrinsic capacitance contributed relatively little to circuit performance in past

generations, very little work has been done to investigate the impact of hot-carrier stress

on intrinsic capacitance. Although the first report of hot-carrier degradation on ID was

published in 1975 by Abbas and Dockerty [1], the first investigation of Cgd and Cgs

degradation was not published until 1988 by Yao, Peckerar, Friedman, and Hughes [107].

Since then there have been several papers [102-106] by two research groups showing Cgd

and Cgs degradation for various stress conditions. Only one paper, by Dai, Walstra, and

Lee [108], showed the impact of Cgd and Cgs degradation on circuit performance. This

section will present those data, a model for the degradation [109], and additional

supplementary information not released in that short paper.

Transistors from a 0.35 μm CMOS technology for 2.5 V operation were used,

the same devices shown throughout this chapter. Drawn channel lengths were 0.40 μm

and 0.48 μm, with effective channel lengths of 0.24 μm and 0.32 μm, respectively.

Accelerated stress was performed using the following procedure:

1) Take unstressed ('fresh') ID versus VDS data from 0 to 2.5 V at VGS=2.5, 2.0, 1.5,

and 1.2 V.

2) Take 'fresh' Cgs (normal mode) for reference.

3) Take Cgd (normal mode) versus VGS from 0 to 2.5 V at VDS=0.0, 0.5, and 1.0 V.

4) Without re-probing, stress for exponentially longer times (see next paragraph for

stress conditions), followed by capacitance measurements as in (3).
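
The "exponentially longer times" in step (4) can be pictured with the small Python sketch below, which lists the cumulative stress time after each cycle and the additional stress applied in that cycle. The starting time, growth factor, and number of cycles are hypothetical placeholders, not the values used in this work, and the function name is an assumption made for illustration.

```python
def stress_schedule(first_s, factor, n_cycles):
    """Return (cumulative_time_s, increment_s) pairs for each stress cycle.

    Cumulative stress time grows geometrically, so each increment is the
    extra stress needed to reach the next cumulative target.
    """
    schedule = []
    previous = 0.0
    cumulative = float(first_s)
    for _ in range(n_cycles):
        schedule.append((cumulative, cumulative - previous))
        previous = cumulative
        cumulative *= factor
    return schedule


# Hypothetical schedule: cumulative targets of 10, 100, 1000, and 10000 s.
for total, step in stress_schedule(first_s=10, factor=10, n_cycles=4):
    print(f"stress {step:.0f} s more (cumulative {total:.0f} s), then repeat step (3)")
```

Spacing the measurements evenly on a logarithmic time axis in this way is a common choice when degradation is expected to follow a power law in stress time.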
