SCALING EFFECTS ON METAL-OXIDE-SEMICONDUCTOR DEVICE CHARACTERISTICS

By STEVEN V. WALSTRA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

ACKNOWLEDGMENTS

I would like to thank Prof. Chih-Tang Sah for his time and guidance as chairman of my supervisory committee, and Dr. Arnost Neugroschel, Dr. Toshikazu Nishida, Dr. Sheng Li, and Dr. Randy Chow for serving on my supervisory committee. Additional thanks go to K. Michael Han for many insightful discussions and debates concerning all aspects of device physics. I would also like to thank Dr. Changhong Dai, Dr. Shiuh-Wuu Lee, Mary Wesela, and Jerry Leon for providing the devices, measurement equipment, and technical expertise during my internship at Intel Corporation, where the intrinsic capacitance data were taken. Financial support from a Semiconductor Research Corporation Fellowship is also gratefully acknowledged.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT

CHAPTERS

1 INTRODUCTION

2 EXTENDING THE ONE-DIMENSIONAL CURRENT MODEL
    Introduction
    Background
    Long-Channel Theory
    Pao-Sah Model
    Bulk-Charge Model
    Charge-Sheet Model
    Comparison of Long-Channel Models
    Two-Section Models
    Beyond Two-Section Models
    Examples Using Pao-Sah
    Field-Matching Method
    Saturation-Voltage Method
    Surface-Potential Self-Saturation Method
    In Search of the Match Point
    Summary

3 POLYSILICON-GATE MOS LOW-FREQUENCY CAPACITANCE-VOLTAGE CHARACTERISTICS
    Introduction
    Metal-Gate CV
    Polysilicon-Gate CV
    Polysilicon-Gate Effects
    Parameter Extraction Using the LFCV Model
    3-Point Extraction Methodology
    3-Region Extraction Methodology
    Methodology Comparison
    Convergence Speedup Details

4 THE EFFECT OF INTRINSIC CAPACITANCE DEGRADATION ON CIRCUIT PERFORMANCE
    Introduction
    Background
    Measurement of Intrinsic Capacitances
    Measurement Configurations
    Sample Measurements
    Channel Hot-Carrier Stress Effects on Cgd and Cgs
    Intrinsic Capacitance Degradation Model
    Degraded Circuit Simulation
    Conclusion

5 SUMMARY AND CONCLUSIONS

APPENDIX METAL-GATE LFCV MODEL DERIVATION
REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SCALING EFFECTS ON METAL-OXIDE-SEMICONDUCTOR DEVICE CHARACTERISTICS

By Steven V. Walstra

December 1997

Chairman: Chih-Tang Sah
Major Department: Electrical and Computer Engineering

As metal-oxide-semiconductor (MOS) transistor dimensions are decreased, channel-length modulation, polysilicon-gate depletion, and intrinsic-capacitance degradation have increasingly larger impacts on transistor performance. It is demonstrated that the Pao-Sah 1D current model can be extended to include the channel-length modulation effect by use of a two-section model. This two-section model employs the normal long-channel Pao-Sah model in one region and adds a variable-length depletion region in the other. Three methods for matching the boundary between the two regions are presented, with the best results coming from the most complex method, which matches the longitudinal fields at the boundary point.

The effect of polysilicon-gate depletion on the MOS low-frequency capacitance-voltage (LFCV) characteristics is demonstrated using a Fermi-Dirac-based model. It is shown that, as the oxide thickness decreases, the effect of polysilicon depletion becomes increasingly pronounced. This depletion, in conjunction with the Fermi-Dirac carrier distribution, offsets the current gain expected from thinning the MOS gate oxide.
With this polysilicon-gate LFCV model, it is shown that the oxide thickness, flatband voltage, and gate and substrate doping concentrations can be extracted from experimental capacitance data. Two extraction methods, the 3-point and 3-region methods, are developed and are shown to work well with gate oxide thicknesses of 130 Å (2.7% RMS fit) and sub-30 Å (10% RMS fit).

Voltage-accelerated stress is performed on state-of-the-art 0.24 µm effective channel-length nMOS and pMOS devices to assess the impact on the most important intrinsic capacitances: $C_{gd}$ and $C_{gs}$. The nMOS devices exhibit a $C_{gd}$ reduction and $C_{gs}$ enhancement with stress time, whereas the pMOS devices show negligible change. Because of Miller feedback, the nMOS $C_{gd}$ reduction dominates the $C_{gs}$ increase, resulting in an overall CMOS capacitive load reduction. Prestress and poststress $I_D$, $C_{gd}$, and $C_{gs}$ data were fit using the BSIM3 device model. With the resulting parameter sets, a 31-stage ring oscillator was simulated for three situations: unstressed devices, stressed devices including only $I_D$ degradation, and stressed devices including $I_D$, $C_{gd}$, and $C_{gs}$ degradation. It is shown that the inclusion of the intrinsic capacitance degradation results in improved simulated circuit performance because the capacitive load reduction offsets the drain current reduction. This improved degradation methodology will result in looser guardbands and less reliability redesign.

CHAPTER 1 INTRODUCTION

The last three decades of production integrated circuits (IC) have seen a two-orders-of-magnitude decrease in device dimensions, from 25 µm in 1962 to 0.25 µm in 1997 [1-3]. This continual reduction, fueled by requirements for higher switching speeds, lower cost, and decreased power, has been sustained by improvements in lithography and has resulted in increased areal and chip densities (transistors/cm$^2$ and transistors/chip).
Compared to the 500 transistors/chip in the first experimental 64-bit static random access memory (SRAM) in 1965, the 64M transistors/chip 64 Mbit dynamic random access memory (DRAM) of 1997 and the 4G transistors/chip 4 Gbit DRAMs due from NEC in 2000 typify the strong push toward increased density. Increased areal density implies decreased dimensions. As transistor and capacitor dimensions decrease, previously negligible effects have become or are becoming increasingly important. Many of these effects were assumed avoidable through constant-field scaling [4]. These scaling rules have been debated, amended, and improved [5-7] to account for noise-margin, hot-electron, and extrinsic-capacitance considerations, but present and future smaller dimensions have necessitated that these effects be included in the design process. Several of these effects are discussed below.

As channel lengths decrease, the thickness of the space-charge region at the drain of a metal-oxide-semiconductor (MOS) transistor becomes a significant fraction of the total channel length. As the drain voltage is changed, the space-charge-region thickness also changes, resulting in an effective channel length which is drain-voltage dependent, an effect known as channel-length modulation (CLM). This problem can be tolerated in complementary MOS (CMOS) logic circuits, but it must be properly modeled in order to predict the drive current of the MOS transistors and hence estimate the speed of the resulting circuit.

As the density of transistors increases, so does the power density. This requires a reduction in the operating voltage, since the active (switching) output power is proportional to the square of the operating voltage ($P_{active} \propto f_{clock} C_o V^2$). To obtain the same performance at lower voltages, the oxide thickness must be reduced. Simple MOS theory predicts that the drain current is inversely proportional to the gate oxide thickness.
However, for thin oxides (< 50 Å), depletion of the polysilicon gate offsets the effects of thinner oxides, resulting in lower current and diminishing returns on oxide scaling. Additionally, the gate voltage cannot be reduced indefinitely, because a large enough margin is needed between the signal voltage and ground-plane noise to ensure that noise does not change the state of the device.

The increased density of transistors also requires more closely spaced interconnections between the transistors. Interconnect scaling has made delays due to the interconnection a limiter in process speed [8], and major efforts are currently underway to reduce the interconnect resistance and capacitance. Copper has recently been introduced into 1998 production by IBM to reduce the interconnect resistance. Additional efforts have been underway to lower the dielectric constant of the intermetal dielectrics in order to reduce the interconnect capacitance. When the interconnect capacitance is reduced, the only remaining capacitance left to slow the CMOS circuitry is the intrinsic capacitance of the transistors, which cannot be easily reduced and will become the predominant speed limiter.

There are many other issues concerning the perpetual reduction in transistor dimensions, not the least of which is the brick wall of atomic dimensions. Clearly, transistors cannot be scaled to less than ten or twenty atoms and still work in the traditional sense of transistors, yet this dissertation includes data from a transistor pushing the atomic limit with a gate insulator thickness of less than 30 Å, or under six atomic layers of silicon and oxygen.

The goal of this dissertation is to investigate the issues described in the previous paragraphs. Chapter 2 discusses the history of one-dimensional drain current models and some of the methods which have been implemented to extend these models to include the CLM effect.
The Pao-Sah model, the most accurate long-channel current model, will be extended to include the CLM effect using three different approaches. The CLM effect (as demonstrated in the new models) will be discussed, as well as the pros and cons of the approaches.

Chapter 3 tackles the polysilicon depletion problem by deriving the Fermi-Dirac-statistics-based polysilicon-gate MOS low-frequency capacitance model, including the effect of dopant impurity deionization. By comparing this with the traditional metal-gate model, the effect of polysilicon-gate depletion will be shown to increase significantly as the oxide thins. With this model, a parameter extraction methodology is presented which allows the extraction of substrate and gate doping concentrations as well as the oxide thickness and flatband voltage from experimental LFCV data. Two methodologies will be presented and compared, and data from thick (130 Å) and thin (< 30 Å) gate-oxide devices will be used. Additional oxide thickness issues, such as quantum effects, are also discussed.

Chapter 4 considers the intrinsic capacitances, in particular those most important in modern complementary MOS (CMOS) circuits: $C_{gd}$ and $C_{gs}$. Compared to the drain current, which is also an intrinsic property of a MOS transistor, intrinsic capacitances have been relatively ignored because of measurement difficulty and their relatively small impact compared to extrinsic capacitances. However, as processing and dielectric technology advances, the primary remaining capacitive load in CMOS circuits will be the intrinsic capacitances. The chapter presents an experimental investigation of how these capacitances change with hot-carrier stress and, after modeling the stress-induced changes in the intrinsic capacitances, shows that part of the drain current degradation is offset by the intrinsic capacitance reduction, resulting in a slower degradation of overall circuit performance.
CHAPTER 2 EXTENDING THE ONE-DIMENSIONAL CURRENT MODEL

Introduction

The simplest 1D model is of crucial importance for applications in semiconductor physics. Although 3D models will best match experimental data, both because they include real effects and simply because they have additional variables, they may be intractable as compact device models, where computational efficiency is critical. Conversely, these 3D models are often validated by demonstrating their reduction to the rigorous 1D forms for noncritical geometries (wide and long channels with thick oxides). For back-of-the-envelope calculations, knowledge of the basic physics embodied in a good 1D model is exceedingly useful.

The required accuracy of a model is largely determined by the application. For predicting the drive current, such as might be required for a discrete-transistor specification sheet, a model need not cover the linear or subthreshold regions of operation. Similarly, if only the operating range (0 to the power supply voltage) is modeled, then the accumulation region of applied gate voltages can be ignored. There are cases, particularly when attempting to predict the performance of new technology, where 3D full-range MOSFET models are necessary, but they are a relative minority compared to the wide array of applications for 1D models.

This chapter contains a brief history of one-dimensional (1D) approaches to drain current models, including calculations and comparisons, followed by a new two-section model using the 1D Pao-Sah long-channel I-V model in conjunction with a variable-length depletion region. The goal is to extend the 1D long-channel model to short-channel use.

Background

In 1926 Lilienfeld [9] submitted the patent for the first MOSFET device, an Al/Al$_2$O$_3$/Cu$_2$S transistor. More than three decades later, in 1960, Kahng and Atalla [10] fabricated the first silicon MOS transistor.
A year later, in 1961, the first MOS transistor (MOST) current-voltage (I-V) papers were published internally at AT&T Bell Labs by Kahng [11] and later at Stanford by Ihantola [12]. These were followed in 1964 by more complete (and widely released) 1D theories by Sah [13] and Ihantola and Moll [14]. A comprehensive history of MOS developments was reviewed by Sah [1]. In the years since the first MOST model, hundreds of papers and theses have been written about the modeling of various aspects of MOS transistors. This chapter will discuss the prevailing 1D models, including Pao-Sah, bulk-charge, charge-sheet, and the many two-section models.

Long-Channel Theory

"Long channel" is a term used to specify that short-channel effects can be neglected when modeling MOSTs. The predominant short-channel effect is encroachment of the drain depletion region into the channel. The depletion region exists because of the reverse-biased p/n junction between the substrate and the drain, and its extent has nothing to do with the actual channel length. For long-channel devices, the amount of encroachment relative to the channel length is small, so the effective channel length is essentially constant (equal to the drawn gate length). For short channels, however, the effective channel length can be significantly reduced by the encroachment. Another short-channel effect neglected in long-channel theory is drain-induced barrier lowering [15], where the source barrier is lowered by the applied drain voltage.

Pao-Sah Model

The most accurate long-channel theory was published by Pao and Sah (PS) [16]. The PS model is the only one which correctly accounts for both drift and diffusion. The PS theory, discussed below, contains a double integral, but it can be reduced to a more efficient form containing only single integrals [17, 18]. Although cumbersome to calculate, the PS double integral is extremely didactic and is a useful starting point for showing the approximations used to derive other long-channel I-V models.
The total current flowing in the channel is given by the integral

$$I_D = \int_0^{\infty} J(x,y)\,Z\,dx, \qquad (2.1)$$

where

$$J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y + qD_n \nabla N = -qD_n N \nabla\xi. \qquad (2.2)$$

$J_N$ and $J_P$ are the electron and hole current densities, respectively, and (2.2) assumes that the current is dominated by electrons in an n-channel device. The electron charge is $q$, $\mu_n$ and $D_n$ are the electron mobility and diffusivity, respectively, and $\nabla N$ is the gradient of the electron concentration. The electron quasi-Fermi level, $\xi$, is measured relative to the bulk Fermi level and normalized to $kT/q$. If $d\xi/dx$ is assumed negligible (which is a fundamental assumption in the long-channel approximation and should be valid to a depth on the order of the drain junction depth), then the magnitude of $I_D$ can be found by summing all the current from the surface down to some depth $x_i$ below which the additional contribution is negligible:

$$I_D = qD_n Z\,\frac{d\xi}{dy}\int_0^{x_i} N(x)\,dx.$$

This can be transformed from physical space in the y direction to potential space as follows:

$$\int_0^{L} I_D\,dy = qD_n Z \int_0^{U_D} d\xi \int_0^{x_i} N(x)\,dx, \qquad (2.3)$$

where $U_D = qV_{DS}/(kT)$ is the normalized drain voltage at $y = L$ and the lower limit 0 is the grounded source voltage at $y = 0$. A similar transform in the x direction yields

$$I_D = qD_n \frac{Z}{L} \int_0^{U_D} d\xi \int_{U_F}^{U_S} \frac{N(U)}{(dU/dx)}\,dU, \qquad (2.4)$$

where $(dU/dx)$ is the x-component of the electric field, which can easily be derived by integrating Poisson's equation by quadrature and is given below. The Boltzmann approximation to the carrier concentration is used here and the impurities are assumed completely ionized, but the Fermi-Dirac and deionized forms can be used instead. $U_S$ is the normalized surface potential (where the surface is at $x = 0$), the total amount of surface band bending relative to the intrinsic Fermi level. It is a function of both the gate voltage and the drain voltage. $U_F$ is the normalized bulk Fermi level, below which the current contribution is assumed negligible, and is analogous to the physical point $x = x_i$ in (2.3).
The derivative $(dU/dx)$ is found from

$$\frac{dU}{dx} = \frac{F(U,\xi,U_F)}{L_D}, \qquad (2.5)$$

where

$$F(U,\xi,U_F) = \left[ e^{U-\xi-U_F} + e^{U_F-U} + (U-1)\,e^{U_F} - \left(U + e^{-\xi}\right) e^{-U_F} \right]^{1/2}. \qquad (2.6)$$

After applying Einstein's relationship, $D_n/\mu_n = kT/q$, (2.4) becomes

$$I_D = \mu_n \varepsilon_s \left(\frac{kT}{q}\right)^2 \frac{Z}{2LL_D} \int_0^{U_D}\!\!\int_{U_F}^{U_S} \frac{e^{U-\xi-U_F}}{F(U,\xi,U_F)}\,dU\,d\xi. \qquad (2.7)$$

The surface potential, $U_S(\xi)$, is needed in (2.7). The relationship between the surface potential and the gate voltage can be found by applying Gauss's law at the semiconductor/insulator interface. The resulting equation, given below, can be solved iteratively for $U_S$ for a given $\xi$:

$$U_G = U_S + \mathrm{sign}(U_S)\,\gamma\,F(U_S,\xi,U_F), \qquad (2.8)$$

where $U_G$ is the normalized gate voltage, $q(V_{GS} - V_{FB})/kT$; $\gamma$ is $\varepsilon_s/(L_D C_o)$; $L_D$ is the Debye length, $\sqrt{\varepsilon_s kT/(2n_i)}/q$; and $F(U_S,\xi,U_F)$ is given by (2.6).

Equation 2.7 is the traditional form of the PS integral, often called the Pao-Sah double integral. A more computationally friendly and accurate single-integral form [17] was used for the calculations in this dissertation. The mobility in (2.7) need not be taken out of the integrals. Instead, it can be a function of the vertical and lateral fields and moved inside the integrals. In this chapter the mobility will be assumed independent of field.

A good way to understand Eq. 2.7 is to consider the three-dimensional band structure of a MOST under gate and drain bias, as shown in Figures 2.1-2.4, based on the original Pao-Sah paper [16]. Figure 2.1 shows an idealized n-channel MOST. Figure 2.2 shows the corresponding energy band diagram with no applied terminal voltages except $V_{GS} = V_{FB}$. From the position of the Fermi level it is easily verified that the source and drain are n-type and the substrate is p-type (n-channel device). Electrons in the source and drain see a potential barrier toward the channel. Application of a positive voltage to the gate lowers the barrier near the surface, as shown in Figure 2.3.
The applied gate voltage pulls electrons toward the surface (and pushes holes away from the surface), as can be seen from the position of the Fermi level relative to the band edges. Farther into the substrate (away from the gate/substrate interface) there is no band bending from the gate potential, so the region is identical to the unbiased case (Figure 2.2) and is considered quasi-neutral.

Applying a voltage to the drain ($V_{DS} < V_{DSsat}$) splits the Fermi level into quasi-Fermi levels ($F_N$ for electrons and $F_P$ for holes), as shown in Figure 2.4. One can imagine an electron in the conduction band surmounting the source barrier and then falling down the potential 'cliff' until reaching the drain. This 'free fall' is where the electron gains energy while moving across the channel. If the electron is not scattered while moving across the channel (losing energy to the lattice via phonons), it becomes increasingly energetic as it approaches the drain and may become 'hot' enough to produce an electron-hole pair via impact ionization. The resulting hole may generate interface traps via dehydrogenation of SiH bonds near the Si/SiO$_2$ interface [19]. This is only one of several mechanisms for interface trap generation.

Fig. 2.1 Simplified view of two-dimensional MOS device.

Fig. 2.2 Schematic 2D energy band diagram of simple MOS device with source and drain grounded and $V_{GS} = V_{FB}$. Adapted from Pao and Sah [16].

Fig. 2.3 Schematic 2D energy band diagram of simple MOS device with $V_{GS} > V_{FB}$, drain and source grounded. Adapted from Pao and Sah [16].

Fig. 2.4 Schematic 2D energy band diagram of simple MOS device with $V_{GS} > V_{FB}$, $0 < V_{DS} < V_{DSsat}$, and source grounded. Adapted from Pao and Sah [16].

Figure 2.5 shows the result of applying a drain voltage in excess of $V_{DSsat}$. As will be discussed in the two-section model section later, the drain depletion region becomes increasingly longer as the reverse-biased drain voltage increases.
For this long-channel section of the dissertation, however, the change in length, $\Delta L$, is assumed much less than the channel length $L$. The voltage drop across this thin depletion region often results in large fields which can greatly accelerate carriers, causing the interface damage mentioned above.

Now that the effect of applied biases on the 2D structure of the bands has been discussed, it is easy to see the basis of the integral limits in Equation 2.7. The inner integral runs from the surface into the bulk (from $U_S$ to $U_F$), which is a cross section of the channel as shown in Figure 2.6. The outer integral runs from the drain to the source ($U_D$ to 0, since the source is grounded) along the channel. Thus, the double integral sums all the current contributions in the channel, exactly as would be expected. Since $U_S$ is a function of the drain voltage (or the channel potential), the order of the double integration is not trivially reversible.

Fig. 2.5 Schematic 2D energy band diagram of simple MOS device with $V_{GS} > 0$, $V_{DS} > V_{DSsat}$, and source grounded. Adapted from Pao and Sah [16].

Fig. 2.6 Schematic 2D energy band diagram of simple MOS device with $V_{GS} > 0$ and $V_{DS} < V_{DSsat}$. Cross sections show the 1D energy-band diagrams (E versus x) near the source and near the drain, with $qV(y) = F_P - F_N = \xi(y)kT$. G: gate electrode; X: substrate electrode.

Bulk-Charge Model

The first group of 1D models, in order of complexity, were by Sah [13], Ihantola and Moll [14], and Sah and Pao [20]. These are all bulk-charge models, each taking increasingly more into account. As the name suggests, the bulk-charge model takes the depleted region under the channel (in the bulk) into account. It assumes drift is the major component and so neglects the diffusion component. This greatly simplifies the problem and reduces (2.2) to

$$J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y = -q\mu_n N(x)\,\frac{dV}{dy}, \qquad (2.9)$$

so that the magnitude of the drain current is

$$I_D = \mu_n Z\,\frac{dV}{dy} \int_0^{x_i} qN(x)\,dx, \qquad (2.10)$$

$$I_D = -\mu_n Z\,\frac{dV}{dy}\,Q_N, \qquad (2.11)$$

where the (negative) electron charge per unit area is

$$Q_N = -C_o\,(V_G - V - V_{S0}) + (2qP_{XX}\varepsilon_s)^{1/2}\,[V_{S0} + V]^{1/2}. \qquad (2.12)$$

$C_o$ is the oxide capacitance per unit area, $V_G$ is $V_{GS} - V_{FB}$, $V_{S0}$ is the surface potential at the source, and $V$ is the channel potential ($= V_{DS}$ at the unsaturated drain). $P_{XX}$ is the substrate impurity concentration and $\varepsilon_s$ is the dielectric constant of silicon. The first term is the charge accumulated in the channel and the second term is the uncompensated charge in the depletion region beneath the channel (i.e., bulk charge). Integrating (2.11) with (2.12) along the channel gives

$$I_D = \mu_n \frac{Z}{L}\,C_o \left\{ (V_G - V_{S0})\,V_{DS} - \frac{V_{DS}^2}{2} - \frac{1}{C_o}\,\frac{2}{3}\,(2qP_{XX}\varepsilon_s)^{1/2} \left[ (V_{S0} + V_{DS})^{3/2} - V_{S0}^{3/2} \right] \right\}. \qquad (2.13)$$

This form is slightly different from the Sah-Pao and Ihantola-Moll forms because it is not assumed that $V_{S0} = 2V_F$, where $V_F$ is the Fermi voltage. A more exact form [17] is

$$I_D = \mu_n \frac{Z}{L}\,C_o \left\{ V_G\,(V_{SL} - V_{S0}) - \frac{1}{2}\left(V_{SL}^2 - V_{S0}^2\right) - \frac{2}{3}\,\frac{1}{C_o}\,(2qP_{XX}\varepsilon_s)^{1/2} \left[ V_{SL}^{3/2} - V_{S0}^{3/2} \right] \right\}, \qquad (2.14)$$

where $V_{SL}$ is the surface potential at the drain. This differs from (2.13) in that the surface potential at the drain is calculated instead of assumed to be $V_{SL} = V_{S0} + V_{DS}$. When the drain current approaches or exceeds saturation ($V_{DS} > V_{DSsat}$), $V_{SL} \neq V_{S0} + V_{DS}$. Additionally, in subthreshold, $V_{SL}$ is typically closer to $V_{S0}$ than to $V_{S0} + V_{DS}$ [21]. As will be shown later, the bulk-charge formula should never be used for subthreshold calculations since it neglects diffusion, which is the primary subthreshold current contribution.

The bulk-charge form is considerably easier to calculate than PS, particularly when using (2.13) with $V_{S0} = 2V_F$, but it is invalid in subthreshold. Equation 2.13 is also invalid in saturation as written, but that can be fixed somewhat by calculating the saturation voltage $V_{DSsat}$ and holding the current constant for all drain voltages greater than $V_{DSsat}$. This makes the first derivative (drain conductance) discontinuous at $V_{DS} = V_{DSsat}$.
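As a rough illustration of the pinned-surface-potential recipe just described, the sketch below evaluates Eq. 2.13 with $V_{S0} = 2V_F$ and holds the current constant beyond $V_{DSsat}$, the drain voltage at which the mobile charge of Eq. 2.12 reaches zero. It is a minimal sketch under stated assumptions, not the code used for the calculations in this dissertation: the function names, the material constants (silicon and oxide permittivities, intrinsic density), and the mobility value in the usage note are all hypothetical choices made for illustration.

```python
import math

# Physical constants; lengths are in cm, as is customary in device work.
Q = 1.602176634e-19       # electron charge, C
K_B = 1.380649e-23        # Boltzmann constant, J/K
EPS0 = 8.8541878128e-14   # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS0      # silicon permittivity (assumed value)
EPS_OX = 3.9 * EPS0       # SiO2 permittivity (assumed value)
N_I = 1.0e10              # intrinsic density, cm^-3 (assumed, near 300 K)


def bulk_charge_params(t_ox_cm, p_xx, temp_k):
    """Return (C_o, V_S0, k_b) with the surface potential pinned at 2*V_F."""
    v_t = K_B * temp_k / Q                     # thermal voltage kT/q
    c_o = EPS_OX / t_ox_cm                     # oxide capacitance, F/cm^2
    v_s0 = 2.0 * v_t * math.log(p_xx / N_I)    # assumption: V_S0 = 2*V_F
    k_b = math.sqrt(2.0 * Q * p_xx * EPS_SI)   # (2 q P_XX eps_s)^(1/2)
    return c_o, v_s0, k_b


def v_dssat(v_g, c_o, v_s0, k_b):
    """Drain voltage where Q_N of Eq. 2.12 vanishes (closed-form root)."""
    if v_g <= 0.0:
        return 0.0
    g = k_b / c_o
    # Solve u + g*sqrt(u) - v_g = 0 for u = V_S0 + V_DSsat.
    root_u = 0.5 * (-g + math.sqrt(g * g + 4.0 * v_g))
    return root_u * root_u - v_s0


def bulk_charge_id(v_g, v_ds, *, t_ox_cm, p_xx, z_over_l, mu_n, temp_k=296.0):
    """Eq. 2.13 for v_ds >= 0, clamped at V_DSsat. v_g means V_GS - V_FB."""
    c_o, v_s0, k_b = bulk_charge_params(t_ox_cm, p_xx, temp_k)
    v_sat = v_dssat(v_g, c_o, v_s0, k_b)
    if v_sat <= 0.0:
        return 0.0  # no drift current: this model ignores subthreshold
    v = min(v_ds, v_sat)  # hold the current constant beyond saturation
    bulk = (2.0 / 3.0) * (k_b / c_o) * ((v_s0 + v) ** 1.5 - v_s0 ** 1.5)
    return mu_n * z_over_l * c_o * ((v_g - v_s0) * v - 0.5 * v * v - bulk)
```

For example, `bulk_charge_id(5.0, 2.0, t_ox_cm=500e-8, p_xx=1e15, z_over_l=10.0, mu_n=600.0)` evaluates the model for a 500 Å oxide and $P_{XX} = 10^{15}$ cm$^{-3}$; the mobility of 600 cm$^2$/Vs is an arbitrary placeholder. Returning zero below threshold reflects the limitation noted above that the bulk-charge form neglects diffusion.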
All saturation problems are solved in (2.14), where the calculation of $V_{SL}$ removes these difficulties. Iterative calculation of $V_{SL}$ is time consuming, however, particularly compared to assuming a constant, or pinned, surface potential value.

Charge-Sheet Model

While most of the interest centered on superthreshold operation of the MOST, some people became concerned with the lack of accurate modeling for subthreshold operation. Barron [21] and Van Overstraeten et al. [22] developed subthreshold formulae based on simplifications of the Pao-Sah integral, with results applicable only to the subthreshold region. Six years later, Brews [23] made a critical approximation which allowed both drift and diffusion components to be introduced simultaneously without the need for a double (or single) integral. When he proposed his "charge-sheet model," he introduced the following simplification:

$$I_D = qZ\mu_n N(y)\,\frac{d\zeta}{dy}, \qquad \frac{d\zeta}{dy} = \frac{dV_S}{dy} - \frac{1}{\beta}\,\frac{d\ln N}{dy}, \qquad (2.15)$$

where $\beta = q/(kT)$. This approximation for $d\zeta/dy$ was justified "based upon its success in producing 'correct' I-V curves," although he added a footnote relating the formula to the electrochemical potential. This wide-open statement resulted in several subsequent 'proofs' which derived the same formula [17, 24, 25]. Essentially, though, he decoupled the drift and diffusion components from the tight interdependency seen in the Pao-Sah form to the simple form of (2.15). Through a derivation similar to that of the bulk-charge model, $I_D$ is given by

$$I_D = \mu_n \frac{Z}{y} \left\{ C_o\left(\frac{1}{\beta} + V_G\right)\left(V_S(y) - V_{S0}\right) - \frac{1}{2}\,C_o\left(V_S^2(y) - V_{S0}^2\right) - \frac{2}{3}\,\frac{1}{\beta}\left(\frac{2qP_{XX}\varepsilon_s}{\beta}\right)^{1/2}\left[(\beta V_S(y) - 1)^{3/2} - (\beta V_{S0} - 1)^{3/2}\right] + \frac{1}{\beta}\left(\frac{2qP_{XX}\varepsilon_s}{\beta}\right)^{1/2}\left[(\beta V_S(y) - 1)^{1/2} - (\beta V_{S0} - 1)^{1/2}\right] \right\}. \qquad (2.16)$$

Eq. 2.16 reduces to the bulk-charge form of Eq. 2.14 if $V_G$, $V_S(y)$, $V_{S0} \gg 1/\beta$ and the square-root terms are negligible. Unlike bulk-charge, this formula is valid in subthreshold and does not require a calculation of $V_{DSsat}$ (assuming $V_{SL}$ and $V_{S0}$ are calculated iteratively). Like bulk-charge, it is much easier to calculate than a double, or even single, integral.
Brews, and many subsequent authors, validated the charge-sheet model by comparing it to the results of the Pao-Sah formula. It has been shown to be an excellent approximation, as will be discussed in the next section.

Comparison of Long-Channel Models

The Pao-Sah double-integral model has been heralded as the best long-channel model. Brews [23] went so far as to say that "Comparison of the charge-sheet model with the Pao-Sah model has the force of comparison with experiment, since the Pao-Sah model is known to work well for long channel devices." Schrimpf et al. [26] agreed, saying Pao and Sah "produced a quantitative model so accurate that it is the standard by which other models are judged." Since bulk-charge and charge-sheet are both approximations to Pao-Sah, it makes sense to compare them with Pao-Sah to see how accurate they are, taking into account that all the models are only valid for long-channel devices.

Figure 2.7 shows all three methods simulated for $T_{ox} = 500$ Å, $T = 296$ K, $P_{XX} = 10^{15}$ cm$^{-3}$, and $W/L = 10$. These are typical parameters for LSI devices of the 1970s, and were chosen to match the data used in Pierret and Shields [17]. As can be seen, the bulk-charge and charge-sheet models underestimate the current. Figure 2.8 shows the percentage error for each model at the gate voltages shown in Fig. 2.7, demonstrating that the charge-sheet model maintains an error of less than 2.6% for all gate voltages, while the bulk-charge model error ranges from 2.5% for $V_{GS} = 5.0$ V to 8.4% for $V_{GS} = 2.0$ V. This suggests that the much simpler charge-sheet model can be used in place of Pao-Sah while incurring only about 2.5% error at low voltages.

Figure 2.9 shows the subthreshold region for the same device with $V_{DS} = 0.1$ V. This figure clearly demonstrates both the glaring inadequacy of the bulk-charge model for subthreshold modeling and the remarkable accuracy of the simple charge-sheet model.
However, recall that this is charge-sheet with iteratively calculated surface potentials, so the numerical solution is not entirely trivial.

Fig. 2.7 $I_D$ versus $V_{DS}$ for different $V_{GS}$ values for the three 1D $I_D$ models (Pao-Sah, charge sheet, and bulk charge). Parameters are $T_{ox} = 500$ Å, $T = 296$ K, $P_{XX} = 10^{15}$ cm$^{-3}$, $W/L = 10$, which were used to match data in Pierret and Shields [17].

Fig. 2.8 Percentage error in $I_D$ for charge sheet and bulk charge relative to Pao-Sah versus $V_{DS}$, from Fig. 2.7. Plots are $V_{GS}$ = 5, 4, 3, and 2 V, with higher errors for lower voltages.

Fig. 2.9 $I_D$ versus $V_{GS}$ for Pao-Sah, charge sheet, and bulk charge using the same data as Fig. 2.7 with $V_{DS} = 0.1$ V. Clearly bulk charge is not useful in subthreshold, whereas charge sheet is almost coincident with Pao-Sah.

Two-Section Models

Up until now, only long-channel 1D equations have been considered. For short-channel devices (< 1 µm), the most prominent nonmodeled effect on the drain current is finite drain conductance beyond saturation. The primary cause of this nonzero drain conductance ($g_D$) is channel shortening from the drain space-charge region (SCR) encroaching into the channel. This effect is often called channel-length modulation since the drain voltage modulates the effective channel length.

The most logical approach is to divide the region between the source and drain into two sections: a 'source side' and a 'drain side'. The 'source side' may contain any appropriate long-channel I-V model, such as Pao-Sah, charge sheet, or bulk charge. The 'drain side' is the depletion region, and can be modeled with or without mobile charge, 2D effects, mobility differences, etc.
The location of the boundary between these regions, and the voltages and fields at this boundary, are what make this a challenging problem. Figure 2.10 shows a diagram of a MOS transistor divided into two sections. There are essentially three things which differ among approaches to two-section theory: the source-side I-V model, the drain-side space-charge region (SCR) model, and the boundary conditions.

Fig. 2.10 Schematic diagram of two-section MOST for ID modeling. The source side (long-channel approximation) runs from y = 0 (V = 0) to y = yM (V = VM) and has length Leff = L - ΔL; the drain side (SCR) runs from y = yM to y = L (V = VD) and has length ΔL. SCR means 'Space-Charge Region' and Leff refers to the effective channel length.

The I-V model can be one of the many already discussed. The SCR model can be assumed fully depleted, take mobile charge into account, or be a complete 2-D or 3-D model. The boundary conditions are the most difficult and varied among approaches. Essentially, the potentials, fields, and charge at the boundary between the two regions need to be matched.

The simplest two-section MOST model was introduced in 1965 by Reddi and Sah [27]. They used a source-side bulk-charge model for the current and a fully depleted drain-side model. From the first derivative of the bulk-charge model (Eq. 2.13 with VS = 2VF), Reddi and Sah (and others) calculated the drain voltage where, for a constant gate voltage, the drain conductance drops to zero (VDSsat). They then assumed all voltage in excess of VDSsat falls across the SCR to form the drain region of the two-section model. By assuming complete depletion (no mobile charge) and no y-field at the boundary, the length can be calculated from simple p/n junction theory as

\[ \Delta L = \left[ \frac{2\epsilon_s \,(V_{DS} - V_{DSsat} + V_{bi})}{q P_{XX}} \right]^{1/2} \tag{2.17} \]

where $V_{bi} = (kT/q)\ln(N_{drain} N_{substrate}/n_i^2)$ from standard abrupt-junction p/n theory. Replacing L by Leff = L - ΔL and VS0 with 2VF in (2.13) yields the Reddi-Sah two-section current. The simplicity of this formula is extremely attractive, but the solution is dependent on the ID model.
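As a numerical illustration of Eq. (2.17), the following minimal Python sketch evaluates the built-in voltage and the depleted length ΔL. The drain doping, the intrinsic density n_i, and the VDSsat value are hypothetical inputs chosen only for illustration; they are not taken from the text.

```python
import math

Q = 1.602e-19               # electron charge, C
K_B = 1.381e-23             # Boltzmann constant, J/K
EPS_SI = 11.7 * 8.854e-14   # silicon permittivity, F/cm


def built_in_voltage(n_drain, p_sub, temp_k=300.0, n_i=1.0e10):
    """Abrupt-junction Vbi = (kT/q) ln(Ndrain * Psub / ni^2), in volts.

    n_i = 1e10 cm^-3 is an assumed round value for silicon near 300 K.
    """
    return (K_B * temp_k / Q) * math.log(n_drain * p_sub / n_i ** 2)


def delta_l_cm(vds, vdssat, p_sub, vbi):
    """Eq. (2.17): fully depleted drain SCR length in cm (doping in cm^-3)."""
    return math.sqrt(2.0 * EPS_SI * (vds - vdssat + vbi) / (Q * p_sub))


# Hypothetical device: 1e20 cm^-3 drain, 5e17 cm^-3 substrate, VDSsat = 1.5 V
vbi = built_in_voltage(n_drain=1.0e20, p_sub=5.0e17)
for vds in (2.0, 3.0, 5.0):
    dl_um = delta_l_cm(vds, vdssat=1.5, p_sub=5.0e17, vbi=vbi) * 1.0e4
    print(f"VDS = {vds:.1f} V  ->  delta L = {dl_um:.4f} um")
```

Because ΔL grows only as the square root of the excess drain voltage, the shortening matters mainly when L itself is comparable to a few hundredths of a micron per volt, which is consistent with the short-channel emphasis of this section.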
Specifically, it assumes that a VDSsat voltage can be found. If using Pao-Sah or charge-sheet, the surface potential is not constant and a VDSsat point does not actually exist. Even if VDSsat is found from extrapolation, the first derivative of the drain current will be non-smooth at the point where the drain current switches from one model (Pao-Sah, charge-sheet, bulk-charge) to another (constant ID), although this can be fixed with various smoothing transition functions.

Three years after Reddi and Sah's paper, Chiu and Sah [28] came out with a two-section model which solved Laplace's equation in the oxide layer and matched values in four regions (source, drain, oxide, and bulk). The drain region was solved as a 2-D, fully depleted region, and the solution required seven matching parameters. The complexity of the solution relegated this model to being cited almost constantly as "too complex." The following year (1969), Frohman-Bentchkowsky and Grove [29] developed a two-section model using the bulk-charge model in the source region and an empirical model for the drain section. This simple model essentially added two additional fringe-field contributions to the Reddi-Sah model and added two empirical variables to fit the data.

Merckel, Borel, and Cupcea [30] added mobile charge to the drain region empirically by writing Poisson's equation in the drain region as

\[ \frac{d^2V}{dy^2} = \frac{q}{\epsilon_s}\left( P_{XX} + \frac{I_{DS}}{qZa} \right) \tag{2.18} \]

where a is essentially a fitting parameter related to the junction depth. This mobile charge is akin to the Kirk effect in bipolar devices, just as the drain-depletion encroachment is analogous to the Early effect. Using an iteratively determined VDSsat, they were able to calculate the drain depletion width. Popa [31] devised a similar model and extended the drain depletion region to be of three types depending on the injected current. In both mobile-charge cases, fitting parameters were introduced either through (2.18) or mobility.
Both used variations of the simple bulk-charge model for the source side. After Brews developed the charge-sheet model, all subsequent two-section models employed the charge-sheet model. Guebels and Van de Wiele [32] developed a three-section model to account for the x-field reversal near the drain. They employ the same trick as the previous papers by fitting the a in (2.18), using VDSsat (or IDsat) and adding some empiricism to their field calculations.

Beyond Two-Section Models

The charge-sheet model (and Pao-Sah, as will be shown) does not lend itself well to analytical two-section models due to the greater complexity of the drain current model relative to bulk charge. As noted above, fitting parameters and empirical formulae had to be introduced to satisfy some of the boundary conditions. The newer compact models, such as BSIM [33,34] and the Siemens [35-37] model, are based loosely on one-section bulk-charge and charge-sheet models, respectively, sometimes dividing the model into different sections based on operation (separate subthreshold and superthreshold formulae). They both model short-channel effects by adding semi-empirical terms to the threshold voltage, which makes for considerably faster calculation at the expense of a less physical model.

Examples Using Pao-Sah

The goal was to develop a two-section model which employs the Pao-Sah integral as the source-side current formula. The following is a description of the methodology and results of the exercise.

Field-Matching Method

The Pao-Sah current has already been discussed, as have models for the depletion region. Let us consider the matching boundary of the two-section model to occur at the point y = yM where the channel voltage is VM with a lateral field EM and electric field gradient d2US/dy2 = dEM/dy. A simple way to look at this problem is to start from Poisson's equation in the drain region while considering the boundary conditions.
Within the drain region, which extends from y = yM to y = L, the boundary conditions are (see Fig. 2.10):

\[ V(L) = V_{DS}, \qquad V(y_M) = V_M, \qquad \left.\frac{dV}{dy}\right|_{y_M} = E_M \ \ \text{(field at the match point)}, \qquad \left.\frac{d^2V}{dy^2}\right|_{y_M} = \frac{1}{\epsilon_s}\left[ qP_{XX} + (\text{mobile charge terms}) \right] = C \]

It is possible from Pao-Sah to calculate dV(yM)/dy = EMps [38]. Integrating Poisson's equation twice with the above boundary conditions gives

\[ V_{DS} - V_M = \frac{C}{2}\,(L - y_M)^2 + E_M\,(L - y_M) \tag{2.20} \]

This reduces all the boundary conditions to one equation with two unknowns (yM and VM). The ideal additional equation would be d2V(yM)/dy2 on the Pao-Sah side, but this quantity is incalculable from the Pao-Sah integral. If it is assumed that EM = 0 (as was done in Reddi-Sah), the depletion length into the channel can be easily found. It is reasonable to assume that the lateral field at the matching point (EM) is much less than the field right at the drain (ED), so ED >> EM, making the difference in yM small. This gives (from 2.20, also 2.17)

\[ y_M = L - \left[ \frac{2\,(V_{DS} - V_M + V_{bi})}{C} \right]^{1/2} \tag{2.21} \]

where Vbi accounts for the preexisting depletion region originating from the abrupt p/n junction. Since the yM approximation has already been made, it will be assumed that the field throughout the drain region is constant and that its value at the boundary is given by

\[ E_{Mdep} = \frac{V_{DS} - V_M}{L - y_M} \tag{2.22} \]

Clearly there are conflicting assumptions (EM = 0, and now EM ≠ 0). One might wonder why EM is not (VDS - VM + Vbi)/(L - yM) to be consistent with (2.21). This comes from the subtlety of the boundary conditions. Looking back to Figure 2.2, note that the integration is actually from VS + Vbi to VDS + Vbi, which excludes the p/n depletion layers. The Vbi's cancel out for symmetrical devices, so this is no problem. At VDS = 0 (and source grounded), no current or field is expected, which would make VM correctly equal to 0 in (2.22). However, if Vbi were added to (2.22), then VM would have to equal -Vbi to give zero field, and any other value would incorrectly cause a field (and possibly current flow depending on VGS).
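The relationship among Eqs. (2.20) to (2.22) can be checked numerically with a short sketch. It assumes the form of (2.20) obtained by integrating d2V/dy2 = C twice from yM with dV/dy = EM at yM, and it treats VM as a known input. Adding Vbi to the voltage drop when EM is nonzero is an assumption made here for illustration; the text only includes Vbi in the EM = 0 case of (2.21). All numerical values are hypothetical.

```python
import math

Q = 1.602e-19               # electron charge, C
EPS_SI = 11.7 * 8.854e-14   # silicon permittivity, F/cm


def depletion_length(vds, vm, c, em=0.0, vbi=0.0):
    """Positive root d = L - yM of (C/2) d^2 + EM d = VDS - VM + Vbi.

    With em = 0 this reduces to Eq. (2.21): d = sqrt(2 (VDS - VM + Vbi) / C).
    Units: volts, V/cm^2 for c, V/cm for em; the result is in cm.
    """
    dv = vds - vm + vbi
    return (-em + math.sqrt(em * em + 2.0 * c * dv)) / c


def field_at_match_point(vds, vm, d):
    """Eq. (2.22): constant-field estimate EMdep = (VDS - VM) / (L - yM)."""
    return (vds - vm) / d


P_SUB = 5.0e17                    # substrate doping, cm^-3
C = Q * P_SUB / EPS_SI            # mobile charge neglected, V/cm^2
VDS, VM, VBI = 3.0, 2.0, 1.0      # hypothetical voltages, V

d0 = depletion_length(VDS, VM, C, em=0.0, vbi=VBI)
d1 = depletion_length(VDS, VM, C, em=1.0e4, vbi=VBI)   # assumed nonzero EM
print(f"L - yM (EM = 0)        = {d0 * 1e4:.4f} um")
print(f"L - yM (EM = 1e4 V/cm) = {d1 * 1e4:.4f} um")
print(f"EMdep from (2.22)      = {field_at_match_point(VDS, VM, d0):.3e} V/cm")
```

In this example the nonzero match-point field shortens the depleted length only slightly, which is the sense in which the text calls the EM = 0 approximation reasonable.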
Essentially, (2.22) gives the excess field. However, Vbi does contribute to the depletion width, so it is included in (2.21). The normalized field on the Pao-Sah side at the boundary is given by [32]

[Eq. (2.23): expression for EMps in terms of F(US, UM, UF) and exponentials of US, UM, and UF, including an integral over the normalized potential U from UF to US; the typeset equation is not recoverable from this text.]

where UM is the normalized matching voltage, VM/(kT/q). Figure 2.11 shows the results of this approach, with mobile charge terms neglected (C = qPxx/εs) for Pxx = 5×10^17 cm^-3, T = 300 K, and Tox = 50 Å. The data cover a wide range of channel lengths from 1/4 µm to ∞, and for all cases the width is equal to the length (square devices). The saturation current predicted by long-channel theory for these square devices would be the same for all channel lengths, so the deviation from this is due to channel-length modulation, which clearly becomes more important as the channel length decreases. Figure 2.12 shows that the drain conductance (gD = dID/dVDS) is smooth, which is important for circuit simulator applications. Although not shown, the derivative of the drain conductance is also smooth. Thus, this field-matching model successfully extends the 1-D Pao-Sah model to short channels, at least with regard to including the effective channel shortening effect.

Saturation-Voltage Method

Reddi and Sah [27] assumed VM = VDSsat, which simplified things considerably. VDSsat is easy to calculate when using the bulk-charge formula assuming VS0 = 2VF since the derivative of the surface potential with respect to the drain voltage is zero. The Pao-Sah current, however, does not technically saturate (numerically there will be a point where the current does not increase, but it will be at a drain voltage well in excess of the normal VDSsat point). This problem is solved by extrapolating VDSsat from dID/dVDS versus VDS without channel shortening.
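The text does not spell out how the extrapolation is carried out. One plausible reading, sketched below as an assumption rather than as the author's procedure, is to fit a straight line to the steeply falling part of gD versus VDS and take its zero crossing as VDSsat. The fitting window (20% to 80% of the peak gD) and the synthetic square-law data with a small residual conductance are hypothetical choices for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_vdssat(vds, gd, lo_frac=0.2, hi_frac=0.8):
    """Zero crossing of a line fitted to gD(VDS) inside a window of the peak gD."""
    g_peak = max(gd)
    window = [(v, g) for v, g in zip(vds, gd)
              if lo_frac * g_peak <= g <= hi_frac * g_peak]
    if len(window) < 2:
        raise ValueError("not enough points in the fitting window")
    slope, intercept = fit_line([v for v, _ in window], [g for _, g in window])
    return -intercept / slope


# Synthetic data (hypothetical): square-law gD plus a small tail that never
# reaches zero, mimicking a current that does not strictly saturate.
K, V_OV, G_TAIL = 1.0e-3, 3.0, 1.0e-6
vds = [0.05 * i for i in range(161)]                   # 0 to 8 V
gd = [K * max(V_OV - v, 0.0) + G_TAIL for v in vds]

vdssat = extrapolate_vdssat(vds, gd)
print(f"extrapolated VDSsat = {vdssat:.3f} V")
```

For a real Pao-Sah gD curve the result would depend on the chosen window, which is one reason a match point defined this way tends to produce a kink in gD.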
Figures 2.13 and 2.14 show the results of employing this method with the same device as used in the previous section (Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å), using Eq. 2.21 for yM with VM = VDSsat. Clearly the channel-length modulation is being accounted for, but the transition is slightly abrupt. A look at the resulting drain conductance (Fig. 2.14) shows a drastic discontinuity near the calculated VDSsat point. Use of a fitting function could rectify this derivative problem, and is a common practice for compact models.

Surface-Potential Self-Saturation Method

Another possible way to circumvent finding the VM point was posed by Katto and Itoh [39]. Instead of finding VDSsat, they used the fact that the surface potential itself will saturate when solved iteratively from (2.8). Thus replacing the matching voltage, VM, with the surface potential at the drain, VSL (solved iteratively), gives another decoupled way to solve for yM. The surface potential was also used to find the depletion thickness by Sah [2]. This is better than the VDSsat method since there will not be an immediate point where saturation occurs. However, as shown in Figs. 2.15 and 2.16, the current still has a slight 'jump' resulting in discontinuities in gD.

Fig. 2.11 ID versus VDS plots for different channel lengths (square devices; L = 0.25, 0.50, 1.00, 5.00 µm, and ∞) using field matching at the match point. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.12 gD versus VDS plots for different channel lengths (square devices) using field matching at the match point. Same parameters as Fig. 2.11.

Fig. 2.13 ID versus VDS plots for different channel lengths (square devices) using VM = iterative surface potential at drain. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.14 gD versus VDS plots for different channel lengths (square devices) using VM = iterative surface potential at drain. Same parameters as Fig. 2.13.

Fig. 2.15 ID versus VDS plots for different channel lengths (square devices) using VM = VDSsat. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.16 gD versus VDS plots for different channel lengths (square devices) using VM = VDSsat. Same parameters as Fig. 2.15.

In Search of the Match Point

Sah [2] showed pictorially that in saturation, the energy band near the drain edge will actually be bent upward, or in other words, the surface will be accumulated rather than inverted (actually, the surface will still be depleted, but now accumulation refers only to the shape of the band bending). This must be the case since the potential along the channel is actually higher than VGS - VGT = VDSsat. This means that there must be a point along the channel at which the band bending is zero at the surface, and this point would be an excellent candidate for the yM point. Like the methods above, however, this point has some logical flaws. For instance, the field in the x-direction is zero by definition, which means that using the channel potential at this point to calculate the current from the long-channel model will clearly invalidate the gradual channel approximation (Ex >> Ey), a basis for the Pao-Sah ID derivation. A simple approximation for this point would be to use VM = VGS - VFB when VDS > VGS - VFB, which is akin to setting VDSsat = VGS - VFB.
This ends up resulting in the same sort of problem seen in the VDSsat method. It is interesting to verify the existence of this turnaround region in the channel near the drain, however. This was done recently using the MINIMOS device simulator [40]. MINIMOS was modified to use a constant mobility model so as to be comparable to the 1-D model cases above. Figure 2.17 shows the resulting electrostatic potentials into the substrate at different points near the drain edge of a 50 Å, 100×100 µm (corrected for subdiffusion) nMOST with Pxx = 5×10^17 cm^-3 at VGS = 1.5 V and VDS = 3.0 V. VFB was fixed at zero for this case. What is clear is that the band moves from inversion (top) through flatband into accumulation (bottom) at the surface (x = 0.0). The flatband point occurs when the channel potential is equal to VGS - VFB = 1.5 V, as expected.

Summary

This chapter reviewed the history of 1-D long-channel drain-current models and discussed the pros and cons of their derivation and applications. From this, the importance of a non-pinned surface potential was shown, as demonstrated by the excellent approximation of the simple charge-sheet model to the Pao-Sah double integral, the best of the 1-D long-channel models.

Next, methods were examined to extend the 1-D model into two 1-D sections to create the best full-range 1-D model. It was discovered that, no matter what, the depletion region is strictly 2-D, and obtaining a 1-D approximation requires rather substantial assumptions. One model, the field-matching approach, was seen to give reasonably good characteristics, while all the other approximations (VDSsat and surface-potential self-saturation) resulted in discontinuities in the first (and higher) derivatives. A 2-D simulation was used to verify that there is a point in the saturated channel where the (x-directed) field reverses and the surface band bending is, thus, zero. This point has been suggested many times before in our group, but never verified two-dimensionally.
Attempting to use this point to demarcate the boundary between the source region and drain region of the two-section model gives the same poor results as the VDSsat method.

Fig. 2.17 Electrostatic potentials into the substrate (plotted versus depth X in µm from the SiO2/Si interface) at different points near the drain edge of a 50 Å, 100×100 µm (corrected for subdiffusion) nMOST with Pxx = 5×10^17 cm^-3 at VGS = 1.5 V and VDS = 3.0 V. The Y = 1.321 µm (near source) and Y = 50.000 µm (middle of the channel) curves are indistinguishable. The band is flat at the SiO2/Si surface when the channel electrostatic potential equals VGS - VFB = 1.5 V (VFB = 0 for this data).

CHAPTER 3
POLYSILICON-GATE MOS LOW-FREQUENCY CAPACITANCE-VOLTAGE CHARACTERISTICS

Introduction

For modern ULSI technology, polysilicon gates are universally used on MOS devices. With respect to MOS device characteristics, there is no advantage to replacing metal gates with heavily doped polysilicon (poly) gates. In fact, poly gates, as will be shown in this chapter, greatly reduce the effectiveness of thinning the oxide layer to increase the drain current. The use of poly gates is a question of cost as well as performance, however, and poly gates have some tremendous processing and density benefits over metal gates. Polysilicon gates can withstand high-temperature steps that would cause most deposited metal gates to evaporate, particularly the source/drain drive-in step. Polysilicon gates also allow for self-alignment of the gate over the oxide between the source and drain, removing what would be the most difficult (and costly) alignment step in the process flow [41,42].

This chapter covers the derivation of a Fermi-Dirac-based polysilicon-gate MOS low-frequency capacitance-voltage model. This model will be used to illustrate the effects of polysilicon gates on MOS low-frequency (LF) capacitance-voltage (C-V) characteristics compared to metal-gate LFCV characteristics.
A useful application for the model is physical parameter extraction, which is demonstrated in this chapter using two different methodologies: 3-point fit and 3-region fit. Sample parameter extractions for thick (130 Å) and thin (20 Å) gate oxides are shown, and limitations of the model are discussed. Quantum effects are purposely ignored, and the reasoning behind this decision is discussed. Important details related to fast convergence of the parameter-extraction routines are also given.

Metal-Gate CV

Ideal metal-gate C-V theory using Boltzmann statistics has been extensively discussed [3,43], as well as the extension to include the Fermi-Dirac carrier distribution and deionization effects [44-46]. The appendix contains the full metal-gate LFCV model derivation, taking Fermi-Dirac statistics and deionization into account. The relevant solutions are given below.

Figure 3.1 shows a schematic diagram of an ideal metal-gate MOS device and the corresponding band diagram. From Figure 3.1 (b), as explained in the appendix, Kirchhoff's voltage law around the loop gives

\[ \phi_M + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G \tag{3.1} \]

where φM is the work function of the metal, VO is the potential drop across the oxide, χS is the electron affinity of the substrate, EC and EI are the conduction-band edge and intrinsic energies, respectively, in the substrate, and VF is the Fermi voltage, which is equivalent to (EI - FP)/q for p-type material, where FP is the quasi-Fermi level for holes and q is the electron charge. Collecting these terms in cleaner form gives

\[ V_G = V_O + V_{IX} + \phi_{MS} \tag{3.2} \]

where φMS = φM - φS = φM - (χS + (EC - EI)/q + VF) is the work function difference between the metal and the semiconductor.

Fig. 3.1 MOS capacitor schematic and corresponding energy-band diagram. (A) Schematic diagram of a MOS capacitor and (B) corresponding energy-band diagram depicting the potential drops. Shown is a positive voltage VG applied at the gate, resulting in the SiO2/Si surface entering inversion.

As will be shown in the next section, the work function difference for a polysilicon-gate MOS device is much simpler than for metal-gate MOS since the substrate and gate materials are the same. The drop across the oxide can be found from Gauss's law as

\[ V_O = \epsilon_s E_{IX}/C_O - (Q_{OT} + Q_{IT})/C_O \tag{3.3} \]

where εs is the dielectric constant of the semiconductor (11.7×8.85×10^-14 F/cm for Si), EIX is the field at the substrate surface, CO is the oxide capacitance, and QOT and QIT represent fixed oxide charge and interface trapped charge, respectively. With this relation, (3.2) can be rewritten as

\[ V_G = V_{FB} + V_{IX} + \epsilon_s E_{IX}/C_O \tag{3.4} \]

where VFB, the flatband voltage, is given by

\[ V_{FB} = \phi_{MS} - (Q_{OT} + Q_{IT})/C_O \tag{3.5} \]

For metal gates, there is no capacitive contribution from the metal, so the gate capacitance is simply the series equivalent of the fixed oxide capacitance, CO, and the variable substrate capacitance, CIX:

\[ C_g = \frac{C_{IX} C_O}{C_{IX} + C_O} \tag{3.6} \]

The field going into the substrate, EIX, and the substrate capacitance, CIX, are given by

\[ E_{IX}^2 = \frac{2kT}{\epsilon_s}\Big\{ N_V\big[F_{3/2}(-U_{IX}-U_V+U_F) - F_{3/2}(-U_V+U_F)\big] + N_C\big[F_{3/2}(U_{IX}+U_C-U_F) - F_{3/2}(U_C-U_F)\big] + P_{XX}\Big[U_{IX} + \ln\frac{1+g_A\exp(U_F-U_A-U_{IX})}{1+g_A\exp(U_F-U_A)}\Big] + N_{XX}\Big[-U_{IX} + \ln\frac{1+g_D\exp(U_D-U_F+U_{IX})}{1+g_D\exp(U_D-U_F)}\Big] \Big\} \tag{3.7} \]

\[ C_{IX} = \frac{q}{E_{IX}}\Big[ -N_V F_{1/2}(-U_{IX}-U_V+U_F) + N_C F_{1/2}(U_{IX}+U_C-U_F) + \frac{P_{XX}}{1+g_A\exp(U_F-U_A-U_{IX})} - \frac{N_{XX}}{1+g_D\exp(U_D-U_F+U_{IX})} \Big] \tag{3.8} \]

with EIX taking the sign of UIX, and where all 'U' values are potentials normalized to kT/q and referenced to the intrinsic Fermi level. For example, UF is the normalized Fermi level, qVF/(kT). Pxx is the acceptor substrate doping concentration and Nxx is the donor substrate doping concentration, and gA and gD are the corresponding degeneracy factors for the trap levels UA (acceptor energy level) and UD (donor energy level, not to be confused with the normalized drain voltage of an MOS transistor). NV is the valence-band density of states and NC is the conduction-band density of states.
Those familiar with MOS capacitance equations might find these far more complex than they recall; a perusal of the appendix should clear up any questions about this form. However, it is instructive to show how this reduces to a more familiar Boltzmann form. First, all of the Fermi-Dirac integrals [the F1/2(η) and F3/2(η) terms] reduce to exponentials in the Boltzmann range of applied gate voltages (η < -4). Second, there is typically only one dominant dopant, so one of the last two terms in (3.7) and (3.8) can be neglected (the first can be neglected for n-type substrate, and the second for p-type substrate). Furthermore, if deionization is neglected (UF - UA - UIX < -3 for p-type, or UD - UF + UIX < -3 for n-type), then the last two terms of (3.7) reduce to Pxx·UIX - Nxx·UIX. Likewise, the last two terms of (3.8) reduce to Pxx - Nxx when deionization is neglected.

As an example of the simplified form, let us consider a p-type substrate in strong accumulation. In this case, it can be assumed that only the accumulated surface carrier term is dominant (UIX is large and negative). Noting also that, in the Boltzmann case, UF - UV = ln(Pxx/NV), (3.7) and (3.8) would reduce, in magnitude, to

\[ |E_{IX}| = \left( \frac{2kT\,P_{XX}}{\epsilon_s} \right)^{1/2} \exp(-U_{IX}/2), \qquad C_{IX} = q \left[ \frac{\epsilon_s P_{XX}}{2kT} \right]^{1/2} \exp(-U_{IX}/2) \]

These are the more tractable strong-accumulation forms found in undergraduate textbooks [3, 43] and which form the basis for one well-known oxide-thickness extrapolation algorithm [47].

Polysilicon-Gate CV

Implicit in the derivation of the metal-gate C-V theory above was that the capacitance of the gate is infinite and that the voltage drop across the gate is zero. With metal gates, this is a reasonable assumption for the ideal isolated device. However, with polysilicon gates, there is a finite polysilicon gate capacitance as well as a voltage drop [3, 49].
Indeed, the capacitor is now a semiconductor-oxide-semiconductor device, so it will have a corresponding surface potential for the gate, as well as an associated gate capacitance with a form exactly like the substrate capacitance. This requires only minor additional derivation to arrive at the poly-gate MOS capacitor (MOSC) ideal device characteristics.

Figure 3.2 shows the band diagram for an n+ polysilicon-gate MOS capacitor with a p-type substrate (a schematic of the device would be identical to Fig. 3.1 (a), with the metal gate replaced by a polysilicon gate). From this figure it is clear that the potential drop across the device can be given similarly to (3.1) as

\[ V_{Fpoly} + (E_C - E_I)/q + \chi_S + V_{IG} + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G \tag{3.9} \]

Assuming the energy gap has not narrowed due to the higher gate doping, the (EC - EI) terms are identical and cancel because the materials are both silicon. The electron affinity is the same for both the gate and substrate for the same reason. This reduces (3.9) to

\[ V_G = V_O + V_{IX} + V_{IG} + V_{Fpoly} - V_F \tag{3.10} \]

Thus, for the poly-gate case, φMS (more aptly called φGS, where 'G' represents the gate, but still traditionally referred to as 'M' for metal) is simply given by

\[ \phi_{MS} = V_{Fpoly} - V_F \tag{3.11} \]

where VFpoly, defined like VF as (EI - F)/q, is negative for an n+ gate. For Figure 3.2, φMS is given by -(kT/q)ln(Pxx·NGG/ni^2), where Pxx is the substrate doping ('P' implying p-type) and NGG is the gate doping ('N' implying n-type). This simple formula assumes a Boltzmann carrier distribution in the substrate and gate, which is invalid in the gate due to the high doping and likely invalid in the substrate for modern ULSI devices.

Fig. 3.2 Band diagram of n+ polysilicon-gate MOS capacitor with all the potential drops labeled. The band diagram shown depicts a positive voltage VG applied at the gate, with the SiO2/Si surface entering inversion and the poly-Si/SiO2 surface depleting.
A more appropriate formula using inverse Fermi-Dirac integrals can be obtained following the examples in the appendix. The extra potential drop from the poly gate is easily taken into account via Kirchhoff's law with VIG:

\[ V_G = V_{FB} + V_{IX} + V_{IG} + \epsilon_s E_{IX}/C_O \tag{3.12} \]

where VFB from (3.5) still holds, assuming negligible contribution from the polysilicon/oxide interface, using (3.11) for φMS. Finally, the gate capacitance formula needs to be extended for three capacitors in series. This changes (3.6) to

\[ C_g = \frac{C_{IX} C_O C_{IG}}{C_{IX} C_O + C_{IX} C_{IG} + C_{IG} C_O} \tag{3.13} \]

where CIG, the capacitance from the polysilicon gate, is given by

\[ C_{IG} = \frac{q}{E_{IG}}\Big[ -N_V F_{1/2}(-U_{IG}-U_V+U_F) + N_C F_{1/2}(U_{IG}+U_C-U_F) + \frac{P_{GG}}{1+g_A\exp(U_F-U_A-U_{IG})} - \frac{N_{GG}}{1+g_D\exp(U_D-U_F+U_{IG})} \Big] \tag{3.14} \]

This is simply (3.8) rewritten with the band notation for the gate. Thus, UIG is the normalized surface potential in the gate, EIG is the field in the gate (defined below), and PGG, NGG, UV, UC, NC, NV, gA, gD, UD, and UA are precisely as defined before, except that they apply now to the gate rather than the substrate. UF above was called UFpoly elsewhere; it is left as UF in (3.14) to maintain the symmetry of the equation. The gate field is given by

\[ E_{IG}^2 = \frac{2kT}{\epsilon_s}\Big\{ N_V\big[F_{3/2}(-U_{IG}-U_V+U_F) - F_{3/2}(-U_V+U_F)\big] + N_C\big[F_{3/2}(U_{IG}+U_C-U_F) - F_{3/2}(U_C-U_F)\big] + P_{GG}\Big[U_{IG} + \ln\frac{1+g_A\exp(U_F-U_A-U_{IG})}{1+g_A\exp(U_F-U_A)}\Big] + N_{GG}\Big[-U_{IG} + \ln\frac{1+g_D\exp(U_D-U_F+U_{IG})}{1+g_D\exp(U_D-U_F)}\Big] \Big\} \tag{3.15} \]

which is identical to (3.7) with the surface potentials changed. Again, the same caveat applies to (3.15): all the terms refer to the gate now, not the substrate. Things like trap levels and band edges are nearly, if not exactly, the same in the substrate and polysilicon gate. However, UF is clearly quite different (assuming the gate and substrate are not doped identically, which would make a poor capacitor or transistor). An additional equation, which was not needed in the metal-gate case, is required to relate the gate and substrate.
This equation equates the charge density at the gate/oxide interface with the charge density at the oxide/substrate interface:

\[ \epsilon_s E_{IX} + Q_{ITX} + \epsilon_s E_{IG} + Q_{ITG} = 0 \]

QITX is the interface charge at the substrate/insulator interface and QITG is the charge at the gate/insulator interface. It is assumed that these values are negligible, and that the dielectric constants for the silicon substrate and the silicon gate are identical (already implicitly assumed in the equation). This gives

\[ E_{IX} = -E_{IG} \]

which allows the surface potential in the gate to be related to the surface potential in the substrate. The iterative solution of the above equation requires many calculations of (3.7) and (3.15), and is the most time-consuming part of the LFCV solver as well as of any software using the routine (such as a parameter extractor which works by comparing the data to the theoretical curve, as discussed later in the application section).

Polysilicon-Gate Effects

The effect of polysilicon gates, compared to metal gates, is a reduction of the gate capacitance, Cg, when the gate is in depletion. This arises when the value of CIG falls below that of CO and CIX, which only occurs during gate depletion and substrate inversion or accumulation, and only then to a significant degree for thin oxides. This is easily visualized from the three series capacitances: the one which dominates is the smallest, and the capacitances due to the substrate and gate are both minimized during depletion (and maximized during accumulation, as well as inversion for the LF case). As oxides thin, the oxide capacitance increases, which causes the substrate and gate depletion to have more control over the characteristics of the Cg-VG curve. Figure 3.3 shows the difference between metal-gate and polysilicon-gate data, normalized to CO, for two different technologies.
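The argument that the smallest of the three series capacitances dominates, and that thinning the oxide exposes the gate-depletion term, can be illustrated with a minimal sketch of Eq. (3.13). The substrate and gate capacitance values below are hypothetical round numbers held fixed for both oxides, which is a simplification; in the actual model they follow from (3.8) and (3.14) and depend on doping and bias.

```python
EPS_OX = 3.9 * 8.854e-14    # SiO2 permittivity, F/cm


def series_cg(c_ix, c_o, c_ig):
    """Eq. (3.13): series combination of substrate, oxide, and gate capacitances."""
    return c_ix * c_o * c_ig / (c_ix * c_o + c_ix * c_ig + c_ig * c_o)


def oxide_capacitance(tox_angstrom):
    """Oxide capacitance per unit area in F/cm^2 for a thickness in angstroms."""
    return EPS_OX / (tox_angstrom * 1.0e-8)


C_IX = 1.0e-5   # hypothetical inverted-substrate capacitance, F/cm^2
C_IG = 7.0e-6   # hypothetical depleted-gate capacitance, F/cm^2

ratios = {}
for tox in (1000.0, 50.0):
    c_o = oxide_capacitance(tox)
    ratios[tox] = series_cg(C_IX, c_o, C_IG) / c_o
    print(f"Tox = {tox:6.0f} A  ->  Cg/Co = {ratios[tox]:.3f}")
```

With these assumed values the thick oxide keeps Cg within about a percent of CO, while the thin oxide loses a noticeably larger fraction, which is the qualitative trend the text attributes to poly depletion.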
The 'higher' pair of curves for a 1000 Å oxide (thick oxide means low oxide capacitance) shows little difference between polysilicon gates and metal gates. The lower pair of curves for a 50 Å oxide (thin oxide means large oxide capacitance) shows a large decrease in Cg for all values of VG, particularly for VG > 1 V, where the gate is still in depletion and the substrate is inverted.

Fig. 3.3 Comparison of metal-gate and n+ poly-gate MOSC curves (Cg normalized to CO versus VG) for two different technologies. One set has 1000 Å oxide with Pxx = 3×10^16 cm^-3 and the second set has 50 Å oxide with Pxx = 2×10^17 cm^-3. In each case, VFB is adjusted to be -1 V and the gate doping is 3×10^19 cm^-3. Clearly shown is the dramatic difference between poly-gate (dotted line) and metal-gate (solid line) for the 50 Å case, and the negligible impact on the 1000 Å case: the polysilicon gate effects increase as the oxide scales thinner.

This continual decrease in Cg for increasing VG (in this n+ poly-gate on p-Si substrate) is often referred to as 'poly depletion,' since the polysilicon gate is still depleting. Eventually the gate itself will invert, and the characteristics will be much improved. However, the field resulting from the gate voltage required to invert the gate is typically beyond the reliability limit of 4 MV/cm in properly scaled devices. In fact, the only way to make the gate invert sooner is to lower the gate doping, which exaggerates the poly depletion effect even more until the gate inverts. It might seem, as it did to this author, that the ultimate solution would be to use undoped gates, as they would invert much sooner and behave just like metal gates at reasonably low applied gate voltages. This works well in simulation, but the question then becomes: where is the supply of minority carriers to invert the gate?
In particular, for an n+gate in a rapidly switching MOST, what would supply the holes? It has been shown that, for at least one technology, the holes are likely supplied via thermal generation (rather than ion impact) [50]. Thermal generation, then, could not supply the holes fast enough for practical use of an undoped gate. However, it might be possible to design in a minority carrier source nearby to supply minority carriers (similar to how the source and drain supply minority carriers in the substrate). The reduction in the gate capacitance due to poly depletion causes a reduction in the drive current, which degrades circuit performance [5154], since the amount of current supplied by the transistor directly relates to the switching speed of the device. In a complementaryMOS (CMOS) circuit, the current charged up the interconnect and intrinsic capacitances of the next transistors in the line, as discussed in detail in Chapter 4. Because of this polygate ID reduction, there may eventually be a move back to metal gate (or silicides) once the processing issues of gate alignment are solved. It is instructive to look at the individual capacitance components to see how the 'complex' poly LFCV curve forms. Figure 3.4 shows such a curve for a theoretical 50.0 A oxide with an n+ gate doped (rather lowly) to 9xl018 cm3 and a substrate doped to 5x1017 cm3. The gate area is lx104 cm2 and the flatband voltage is 1.0 V. The Cg curve, being the serial sum of Co, Cix, and Cig (Eq. 3.17), is always lower than the component curves. It can be clearly seen how each of these three components influences the overall structure of the resulting gate capacitance. In fact, this 'regional' effect will be used to help speed up parameter extraction in the next section. Also of interest is a breakdown of potentials across the MOSC device as a function of VG. Figure 3.5 shows the four components of VG, namely VIX, VIG, Vox, and VFB (see Eq. 
3.12) as a function of VG using the same parameters as the example in the last paragraph. To show how these are related to the resulting gate capacitance, the Cg-VG curve is also plotted. What is most relevant in this figure is that the primary 'dip' in the CV curve occurs as the surface potential in the substrate, VIX, sweeps from accumulation to inversion (i.e., moves from a small negative number to about one volt), and ends sharply as the surface potential approaches its maximum (strong inversion). Similarly, the secondary poly-depletion 'dip' occurs as the gate surface potential, VIG, moves from accumulation to inversion (again, from a small negative voltage to around a volt).

[Figure 3.4: component capacitances CIX, CIG, and Co and the total Cg = CIX ∥ CIG ∥ Co (capacitance axis from 0 to 150) versus VG / (1 V) from −6 to 6.]

Fig. 3.4 Individual capacitance values for a theoretical 100×100 µm nMOSC with a 50 Å gate oxide, Pxx = 5.0×10^17 cm^-3, NGG = 9×10^18 cm^-3, T = 300 K, and VFB = −1.0 V. This figure demonstrates how the three capacitances (CIX, CIG, and Co) combine in series to give the overall gate capacitance. See Fig. 3.5 for the corresponding potential breakdown.

[Figure 3.5: potential components VIX, VIG, Vox, and VFB (left axis, −6 to 6) and the LFCV curve (right axis, 0 to 70) versus VG / (1 V) from −6 to 6.]

Fig. 3.5 Individual potential breakdown for a theoretical 100×100 µm nMOSC with a 50 Å gate oxide, Pxx = 5.0×10^17 cm^-3, NGG = 9×10^18 cm^-3, T = 300 K, and VFB = −1.0 V, along with the corresponding LFCV curve. Note how the surface potential in the substrate, VIX, increases rapidly in the range VG = −1 to 0 V as Cg increases (substrate inversion) and the similar increase in VIG in the range VG = 1 to 3 V (gate inversion). See Fig. 3.4 for the corresponding capacitance breakdown.

Note that the final surface potential in the gate is higher than that in the substrate (VIG > VIX when VG > 4 V). This agrees with the common approximation that
the surface potential pins to a little over 2VF, since the Fermi voltage in the gate will be larger than that of the substrate due to the greater gate doping.

Parameter Extraction Using the LFCV Model

Of the multitude of variables in the LFCV equations, most are known to a reasonable degree of accuracy (such as the dielectric constant, energy gap, conduction band density, etc.), can be measured easily (temperature), or need not be known very accurately (acceptor and donor trap levels) due to their small effect. This leaves the gate and substrate doping, the oxide thickness, and the flatband voltage as the 'unknown' parameters. These parameters may be extracted by comparing experimental data to the theoretical model presented in this chapter.

This may appear to be an easy task, since the equation need only be used, along with some data, in conjunction with a nonlinear least-squares solver. However, the polysilicon-gate LFCV formula is doubly parametric (that is, Cg and VG are related through two parameters, the surface potentials UIX and UIG), and neither parameter is known from the data. Thus, solving this problem is nontrivial.

The first step toward a solution, then, is to write a program which will calculate Cg given VG. This requires intensive calculations to find UIX and UIG for each VG, but can be done since there is only one unique solution. With a Cg(VG) routine written, a nonlinear least-squares-fit program can be used. The code written for this dissertation took advantage of the fact that, as the solution converges to values of the unknown parameters, the surface potentials found at each experimental data point could be used as the initial guesses for that point on each subsequent iteration (since the parameters {substrate and gate doping, oxide thickness, and flatband voltage} should not be changing too rapidly).
This greatly increased the convergence rate over estimating UIX and UIG on each call, at the expense of additional code complexity and memory usage.

3-Point Extraction Methodology

If the model were perfect, then only three points would be required to match the experimental data to the model. Why only three data points for four parameters? Because the additional constraint that one of the points should be the minimum of the experimental LFCV curve can be used. From this information, the flatband voltage can be found by comparing the VG of the theoretical minimum with the VG of the experimental minimum. The other three parameters can be found directly from the Cg values of the three points. Figure 3.6 shows the three points, labeled Cg-acc, Cg-dep1, and Cg-dep2, as they relate to the whole LFCV curve. Only Cg-dep1 is unique; the other two points can be anywhere within their regions.

The Cg-acc point is a point from the LFCV gate accumulation region. From this, a good estimate of the oxide thickness can be found, since the other parameters have very little influence over this point (see Figs. 3.7 and 3.8). Cg-acc asymptotically approaches Co, which is inversely proportional to the oxide thickness, Tox, via the parallel-plate formula. There has been much research in obtaining Tox and/or Co from (substrate) accumulation CV data [48,55-58].

[Figure 3.6: LFCV curve versus VG (arbitrary scaling) with the points Cg-acc, Cg-dep1, and Cg-dep2 marked and the regions labeled Gate Accum., Gate Depletion 1, Gate Depletion 2, and Gate Inversion.]

Fig. 3.6 Example of a general polysilicon-gate LFCV curve (n+ gate, p substrate for this case) showing how all the important regions can be labeled in terms of the gate state rather than the typical substrate state. This regional breakdown is used to improve the speed and accuracy of the parameter extraction routine.

The Cg-dep1 point is the minimum of the LFCV curve, and allows us to find the substrate doping, since the substrate depletion region is strongly dependent on the substrate doping concentration.
In fact, depletion CV data can also be used to determine the substrate doping profile [59-61]. Figure 3.7 shows LFCV data for several different constant substrate doping concentrations, clearly demonstrating the strong dependence of the location of Cg-dep1 on the substrate doping. This was also demonstrated in Fig. 3.4, since the main influence in this depletion region is CIX, which itself is strongly dependent on UF (see Eq. 3.8), which is directly related to the inverse Fermi-Dirac integral (natural logarithm if assuming a Boltzmann distribution) of the substrate doping. The position of the minimum along the VG axis also allows us to estimate the flatband voltage by comparing the VG of the minimum of the theoretical curve to the VG of the data.

The Cg-dep2 point is from the gate depletion region. Figure 3.8 shows that the gate doping has the most effect on this part of the curve, whereas Figure 3.7 shows that the substrate doping has very little effect in this region. For the n+ gate on p-substrate example in Figure 3.8, the substrate is in inversion. However, even if the substrate were n-type (and the substrate thus accumulated), Cg depletion would still occur because the gate would still be in depletion (of course, the entire curve would be shifted due to the flatband difference). Hence, this point is called Cg-dep2, with the 'dep' in reference to the depleted state of the gate.

By varying the parameters in the appropriate regions to match these three points, a unique parameter set will be obtained which will describe a theoretical LFCV curve passing through the three points.

[Figure 3.7: MOS LFCV curves (capacitance axis scaled from 0.0 to 1.0) versus VG / (1 V) from −4 to 4 for Poly-n+Si-G/SiO2/p-Si at T = 300 K, VFB = −1.00 V, with Pxx = 2.0×10^18, 4.0×10^17, and 2.0×10^16; an arrow marks decreasing Pxx.]

Fig. 3.7 Effect of substrate doping changes (Pxx) on LFCV characteristics. The 'depletion 1' region (see Fig. 3.6) is the region of largest impact.
[Figure 3.8: MOS LFCV curves (capacitance axis scaled from 0.0 to 1.0) versus VG / (1 V) from −4 to 4 for Poly-n+Si-G/SiO2/p-Si at T = 300 K, VFB = −1.00 V, for several gate dopings including NGG = 4.0×10^20, 2.0×10^19, and 8.0×10^18; an arrow marks decreasing NGG.]

Fig. 3.8 Effect of gate doping changes (NGG) on LFCV characteristics. The 'depletion 2' region (see Fig. 3.6) is the region of largest impact.

3-Region Extraction Methodology

As good as our model is, there are still several effects which are not being considered. These include retrograde doping in the substrate and quantum effects in the substrate inversion channel. Retrograde doping is commonly used for sub-half-micron design to maintain a high subsurface doping concentration to prevent punchthrough, while still maintaining a low VT for low-VG operation (to accommodate the thin oxides) [7]. Figure 3.9 shows an example of a retrograde profile from our internally modified MINIMOS. The LFCV model assumes a constant doping profile in both the substrate and gate, and so deviation from this assumption will cause changes in the experimental LFCV curve relative to the theoretical model.

Charge-carrier layer pushout due to quantum effects in the inversion and accumulation layers has been an area of much research [62-65]. Experimental verification of these quantum effects is invariably done at low temperatures, where phonons will not broaden the quantum bands into a continuum. Although some amount of quantum effect is likely present, it is probably impossible to model correctly when one considers thermal broadening, SiO2/Si interface roughness and transitional regions, nonrandom dopant distribution, and other nonidealities. These will all tend to broaden the electron levels into a more classical continuum. It has been noted that electrical and optical oxide thicknesses do not often agree, and the difference has been attributed to quantum effects. As will be discussed later, the effect is likely overestimated.
More important, if there is a difference, it is the electrically effective oxide thickness (as determined from electrical experiments, such as CV) which is most important compared to the optical thickness (which is not what affects device performance).

Fig. 3.9 Sample of retrograde doping profile, showing low surface concentration (5×10^16 cm^-3) and higher bulk concentration (1×10^18 cm^-3).

Due to these two main nonidealities (nonconstant doping and quantum effects), the parameters extracted using only three points could show some dependence on the points themselves. That is, the extracted parameters might depend on which points we choose for Cg-acc and Cg-dep2. To overcome this, the entire curve could be fit to the model. This would result in extremely long convergence times, as a partial derivative must be calculated for each variable at every point for every iteration. However, Figures 3.7 and 3.8 show that some parameters have no influence on the LFCV curve in certain gate-voltage regions. Thus, the information provided by their partial derivatives does not help convergence, and will actually slow it down, not to mention waste time during the calculation.

Instead of fitting all the data to the model, the data can be broken up into the same three regions suggested in Fig. 3.6 for the three-point fit. Then the model can be fit using only the parameter (or parameters) dominant in the specific region, thus greatly improving the convergence rate (since the data being used is most relevant). This adds complication to the coding, as the data partitioning into each region (discussed later) must be automated, and a different fitting routine must be created for each of the three regions (same model, but separate partial derivative calculations).

Methodology Comparison

Figure 3.10 shows a comparison of the fit using the 3-point and 3-region methods to experimental data for a 130 Å oxide from an industrial 100×100 µm MOST.
[Figure 3.10: two panels of gate capacitance (axis from 0 to 30) versus VG / (1 V) from −2.5 to 2.5, each showing the theoretical curve and the experimental points. Panel (A), 3-Point Fit, marks the points Cg-acc, Cg-dep1 = Cg-min, and Cg-dep2. Panel (B), Full-Curve Fit.]

Fig. 3.10 Theoretically generated curves compared to original LFCV data using the (A) three-point and (B) full-curve extractions on an n+ polysilicon-gate, 100×100 µm nMOST. Extracted parameters were: (A) Xox = 130 Å, Pxx = 9.0×10^16 cm^-3, NGG = 5.8×10^19 cm^-3, and VFB = −1.06 V. (B) Xox = 130 Å, Pxx = 8.7×10^16 cm^-3, NGG = 3.0×10^19 cm^-3, and VFB = −1.01 V.

The solid line, of course, is the theoretical curve using the extracted parameters, and the 'x' marks are the data used for the extraction. The top (A) curve uses only the three points marked to fit the data, while the lower (B) curve uses the full set of data. The RMS error (for the second curve), calculated as the square root of the sum of the squares of the differences between theoretical and experimental capacitance, divided by the square root of n − 5 (5 = degrees of freedom = 1 + number of fitting parameters), was 2.7%, an excellent fit for only four parameters. There is not much in the literature against which to compare the 'goodness' of these results, as there are not many capacitance models available (most compact models, such as BSIM3 [34], fit IV characteristics only, and do not consider capacitance extraction).

Aggressively scaled MOSC device data are given in Figure 3.11. This shows two LFCV curves from a 20 Å oxide from an industrial MOST, where the 20 Å was determined by some optical method, most likely ellipsometry. This is a rather complex figure, and requires some explanation. There are two experimental curves: one p+ poly-gate and one n+ poly-gate, both on a p-well.
Thus, in the VG > 1 V region, the n+ gate is in depletion, while the p+ gate is in accumulation (clearly the curves were shifted to align them, as the flatband should differ by about a volt between the two curves, although a threshold adjustment implant would offset this somewhat). From looking at the data at 1.3 V and assuming Cg = Co, they conclude that the effective oxide thickness is 33 Å for the depleted n+ gate, and 28.5 Å for the accumulated p+ gate device. This is a poor approximation, since the oxide thickness value would vary greatly at different points along the depleted curve. However, they correctly state that the difference between the two is due to poly depletion, which is a reasonable statement when applied to that specific gate voltage only. Thus, they attribute a 4.5 Å reduction in effective oxide thickness to polysilicon depletion at 1.3 V, which is the operating voltage for that technology.

They next assert that the difference between the accumulated-gate curve (28.5 Å based on assuming Cg = Co) and the optically measured oxide (20 Å) must be entirely due to quantum effects. This conclusion is wrong, as the quantum model most likely does not take all the effects mentioned previously into account (such as thermal broadening, interface roughness and transitional region, not to mention retrograde doping). More importantly, they are completely ignoring the severe error of using Cg = Co.

Figure 3.12 shows the fit (using the three-point method) to the data in Figure 3.11. From this fit, the extracted oxide thickness is 24.4 Å. Compared to the industry-stated results, the same effective oxide thickness reduction due to poly (4.7 Å here versus 4.5 Å) is seen. However, by correctly accounting for the distribution (instead of assuming Cg = Co), an additional 4.1 Å reduction from the Fermi-Dirac distribution is also found (that is, from the fact that Cg < Co)!
This leaves 4.4 Å of difference between the extracted oxide thickness of 24.4 Å and the optically measured thickness of 20 Å. This 4.4 Å difference may include some quantum effects, but it may also be due to optical errors, such as not accounting for the transitional layer properly [66-67], and/or some other effects (e.g., doping profile). Ellipsometry and other optical methods cannot be used on the actual device (since the gate electrode is not transparent), so they do not measure the oxide thickness in the active part of the device, which may differ slightly due to the additional processing. As far as parameter extraction is concerned, the most important factor should be agreement with electrical results, not optical.

[Figure 3.11: gate capacitance (axis from 3 to 15) versus VG / (1 V) from −3 to 3 for the P+Poly/p-well and N+Poly/p-well devices. Annotations: C calculated from optical Xo = 20.0 Å; 8.5 Å Quantum; Xo = 28.5 Å; 4.5 Å poly; Xo = 33 Å.]

Fig. 3.11 P+ poly/p-well and n+ poly/p-well '20 Å' industrial data. Data are shifted to align the minima, and labelling refers to the industrial interpretation of the two curves. Please see text for explanation of this breakdown. Compare to Fig. 3.12, which is the author's interpretation of the same data after quantumless parameter extraction.

[Figure 3.12: gate capacitance (axis from 3 to 15) versus VG / (1 V) from −3 to 3. Legend lists Xo and gate doping for each curve: P+Poly/p-well, 24.4 Å; N+Poly/p-well, 24.8 Å; Pxx = 4.2×10^17 cm^-3. Annotations: Xo = 20.0 Å optical; 4.4 Å Error; Xo = 24.4 Å; 4.1 Å Fermi; Xo = 28.5 Å; 4.7 Å poly; Xo = 33.2 Å.]

Fig. 3.12 Theoretically generated LFCV curves from three-point parameter extraction using the data in Fig. 3.11. Extracted parameters are shown. Instead of attributing the difference between the optical thickness of 20.0 Å and the 'extrapolated' thickness of 28.5 Å at VG = 1.3 V to quantum effects, we find that 4.1 Å is due to the distribution function used (Fermi-Dirac), with the remaining 4.4 Å possibly due to error (in optical measurement and/or other factors).
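The thickness bookkeeping discussed for Figs. 3.11 and 3.12 can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from this work: the function names are hypothetical, the SiO2 relative permittivity of 3.9 is an assumed textbook value, and the thickness values are simply those quoted in the text.

```python
# Assumed SiO2 permittivity (relative permittivity 3.9 is an assumption,
# not a value taken from this chapter).
EPS_OX = 3.9 * 8.854e-14  # F/cm

def effective_thickness_angstrom(c_gate_farads, area_cm2):
    """Apparent oxide thickness from the first-order assumption Cg = Co."""
    return EPS_OX * area_cm2 / c_gate_farads * 1e8  # cm -> angstrom

def thickness_budget(x_depleted, x_accumulated, x_extracted, x_optical):
    """Split the apparent-to-optical thickness gap into the three parts
    named in the text (all values in angstroms)."""
    return {
        "poly_depletion": x_depleted - x_accumulated,
        "fermi_dirac": x_accumulated - x_extracted,
        "unexplained": x_extracted - x_optical,
    }

# Values quoted for Fig. 3.12: 33.2, 28.5, 24.4, and 20.0 angstroms.
budget = thickness_budget(33.2, 28.5, 24.4, 20.0)
for name, value in budget.items():
    print(f"{name}: {value:.1f} A")
```

The three parts sum to the full 13.2 Å gap between the depleted-gate estimate and the optical value, which is the point of the breakdown: only the last 4.4 Å is left unexplained.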
Although a three-point method was used here (to improve the match at the 1.3 V point), a full fit to the data in Fig. 3.12 has about a 10% RMS error, which, considering the thin oxide and the fact that the substrate and gate doping are not constant, is extremely good. An interesting side note, brought up during the proposal for this project, concerned fitting a Boltzmann model to the data instead of a Fermi-Dirac model. The surprising result of this was a better fit (6.6% RMS error), but, as would be expected, a thicker extracted oxide thickness of 27.5 Å. This is an interesting result, as it shows that using the wrong model can appear to give 'better' results (in terms of fit), even though the resulting parameters actually have greater error (due to the incorrect carrier distribution).

Philosophically, the issue of oxide thickness is an interesting topic. Many people would argue that TEM is the only way to measure the 'true' thickness. However, even ignoring that this is a destructive and time-consuming technique, it only yields the thickness of that particular cross section. What is desired is the average oxide thickness, as it is the average oxide thickness which affects the amount of charge accumulated by an applied voltage in a MOSC. This is why Fowler-Nordheim (FN) tunnelling is also not a particularly good method: it will always underestimate the oxide thickness, since the tunnelling will occur in the thinner spots on the gate. Additionally, one would rather not stress the devices while trying to find the oxide thickness. One recent technique for ultrathin oxide thickness determination uses quantum oscillations in the tunnelling gate current, which are caused by quantum interference of electrons in the oxide conduction band [68].
This method potentially suffers from the same problems as FN, and additionally requires knowledge of the effective mass and oxide barrier height, which together add about 2.5 Å of error to the results [69], assuming they are known to within 5%.

Convergence Speedup Details

Iteration stops when none of the fitting parameters (i.e., the extracted parameters) changes by more than 5×10^-4 % between successive serial cycles (i.e., each parameter was fit in turn, and none changed by more than 5×10^-4 %). Below are several of the methods employed to speed up this convergence.

The calculations involved in the extraction are extremely complex. Of greatest importance is the convergence speed of the poly LFCV model, which itself must converge on the two surface potentials just to give one Cg data point for a given VG. This one data point is used in the numerical partial derivatives for the nonlinear least-squares fit, which means Cg(VG) is called twice for each experimental data point for each iteration! Because this model is called so frequently, it is important to keep track of all the converged surface potentials for each experimental data point so that the LFCV model has a good estimate for subsequent calls.

The delta used for the numerical partial derivatives, as it turns out, has a major influence on the correctness of the fit. Since two of the parameters vary logarithmically (the substrate and gate doping), the actual parameters used during the fit are the logs of these parameters. Thus, the deltas used for the derivatives must be calculated differently. Another problem is that the delta used for the oxide thickness is dependent upon the actual thickness of the oxide (that is, it should be different for a 50 Å oxide compared to a 1000 Å oxide). The empirical results for the best deltas, as determined from analysis of the numerical derivatives, were Δ = 10^-7 for ln(NXX), Δ = 0.1 for ln(NGG), Δ = 10^-4 for VFB, and Δ = 10^-7 × Tox for Tox.
For example, the partial derivative of Cg with respect to Tox is calculated from [Cg(VG)_1 − Cg(VG)_2]/(2Δ), where Cg(VG)_1 is the gate capacitance calculated at some VG with an oxide thickness of Tox + Δ and Cg(VG)_2 is the gate capacitance calculated at the same VG with an oxide thickness of Tox − Δ. The reason arbitrarily small values cannot be used, of course, is that the LFCV model itself is only accurate to about eight digits (less near flatband) due to its own internal convergence criteria [46].

Finally, to start the extraction, a reasonable initial guess must be made. The initial guess of the oxide thickness is simply A·εox/Cg-max, where Cg-max is the maximum gate capacitance in the data set and A is the gate area. This is the standard first-order approximation based on Cg = Co. For the substrate doping initial guess, the asymptotic high-frequency CV formula for Cg [43] is solved iteratively using the minimum and maximum Cg values from the data set. The gate doping is simply set to 3×10^19 cm^-3. With these three parameters approximately known, the flatband voltage is estimated from the difference VG-min(data) − VG-min(theory).

Since the minimum of the CV data is not necessarily given, but is needed internally to estimate the flatband voltage, the minimum three data points (in terms of Cg) are used to estimate the true Cg minimum based on the parabolic minimum formula [70]. This slightly improves the convergence, but not as much as would be expected, largely because the minimum of the CV curve is not very parabolic.

Because it was clear that convergence was slower as the results approached the final values, a 'trick' was developed to improve this end case. Whenever a trend was visible during a fit, the routine doubled the amount of the parameter increase. A trend, in this case, is defined as three successive moves of a parameter in the same direction for all the parameters (possibly different directions for different parameters). This cut down the number of iterations by about 20% in most cases.
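Three of the ingredients described above (the first-order oxide thickness guess, the central-difference derivative with parameter-specific deltas, and the three-point parabolic minimum) are simple enough to sketch. The code below is illustrative only and is not the dissertation's program: all function names are hypothetical, the SiO2 relative permittivity of 3.9 is an assumption, the delta values assume negative exponents (10^-7, 10^-4), and `toy_cg` is a stand-in ideal-oxide capacitor rather than the poly LFCV model.

```python
import math

EPS_OX = 3.9 * 8.854e-14  # F/cm; assumed relative permittivity of 3.9

def initial_tox_cm(cg_values, area_cm2):
    """First-order guess A * eps_ox / Cg_max, i.e. assuming Cg = Co."""
    return area_cm2 * EPS_OX / max(cg_values)

def step_sizes(tox_cm):
    """Deltas per fit parameter; dopings are fit as natural logs."""
    return {"ln_nxx": 1e-7, "ln_ngg": 0.1, "vfb": 1e-4, "tox": 1e-7 * tox_cm}

def central_difference(cg_model, vg, params, name, delta):
    """[Cg(p + delta) - Cg(p - delta)] / (2 * delta) for one parameter."""
    hi = dict(params)
    lo = dict(params)
    hi[name] += delta
    lo[name] -= delta
    return (cg_model(vg, hi) - cg_model(vg, lo)) / (2.0 * delta)

def parabolic_minimum(p1, p2, p3):
    """Abscissa of the vertex of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    return x2 - 0.5 * num / den

def toy_cg(vg, params, area_cm2=1e-4):
    """Toy stand-in for Cg(VG): an ideal oxide capacitor, independent of VG."""
    return area_cm2 * EPS_OX / params["tox"]

params = {"ln_nxx": math.log(5e17), "ln_ngg": math.log(9e18),
          "vfb": -1.0, "tox": 50e-8}  # tox in cm (50 angstroms)
deltas = step_sizes(params["tox"])
dcg_dtox = central_difference(toy_cg, 0.0, params, "tox", deltas["tox"])
```

Because each derivative costs two model evaluations per data point, restricting a fit to the parameters that matter in a given region directly cuts the number of calls to the (expensive) Cg(VG) routine.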
One thing which would have improved the speed of convergence greatly would be to use a simpler model for the Fermi-Dirac integral. The Cody-Thacher model [71] is extremely accurate, but requires the quotient of ten exponentials from a Chebyshev approximation. This approximation was used instead of some other simpler (though less accurate) approximations [72-76] because it was desired to add as little error as possible from the Fermi-Dirac integral calculation.

CHAPTER 4
THE EFFECT OF INTRINSIC CAPACITANCE DEGRADATION ON CIRCUIT PERFORMANCE

Introduction

In this chapter, the relatively obscure subject of intrinsic capacitances will be discussed. The area of MOS intrinsic capacitance has received little attention over the years due to the difficulty of measurement and its small impact relative to extrinsic capacitances such as interconnect and packaging. However, as the push toward higher density continues, the extrinsic capacitance is being reduced as much as possible to improve performance. This will eventually leave the intrinsic capacitance as the primary load in CMOS circuits, thus making this a topic worth studying now. After a discussion of the intrinsic capacitances which most affect CMOS circuits (Cgd and Cgs), direct experimental measurements of the effect of hot-carrier degradation on intrinsic capacitance will be discussed, and the results modeled. The impact of this degradation on circuit performance will be evaluated and shown to offset some of the losses due to ID degradation.

Background

The effect of hot-carrier degradation on the drain current, ID, has been studied intensely since Abbas's initial observation in 1975 [77]. Another intrinsic property of a MOS transistor, the intrinsic capacitance, has a much shorter history of study with regard to hot-carrier degradation. The first systematic study of intrinsic capacitances was done by Sah [13] in 1964, which was used by Meyer in 1971 [78] in his widely referenced work.
In his paper he defines the intrinsic capacitance between terminals as:

$$C_{xy} = \frac{dQ_x}{dV_y}$$

That is, the change in the charge at terminal x due to a change in voltage at terminal y. This definition applies to any device with two or more terminals, but from now on will be used with respect to a 4-terminal MOS transistor. Thus, it is clear that there are 16 possible intrinsic capacitance terms for a 4-terminal MOS transistor. Please note that in this small-signal definition, all of the terminals other than y are at virtual ground. Thus, dVy is referenced to ground (i.e., it is essentially relative to all the other terminals).

At first thought, one might assert that there are only 8 possible capacitances since Cxy = Cyx. However, this is not true, because the intrinsic capacitances defined here do not represent static capacitive values and are not reciprocal. Consider the two intrinsic capacitances Cgd and Cdg. Neglecting overlap capacitance, when the applied gate voltage is less than the gate threshold voltage, VGT, both of these capacitances should be zero (since both Cgd = dQg/dVd and Cdg = dQd/dVg are zero when no channel exists). Once VGS > VGT (and VDS < VDSsat), the channel forms and Cgd and Cdg will both have some finite positive value. The interesting case is when VGS > VGT and VDS > VDSsat. Now there is a channel, but it is 'pinched off' near the drain end. Cdg (dQd/dVg) is nonzero, since a change in the gate voltage still affects the charge associated with the drain (Qd); Cgd (dQg/dVd) is zero, since a change in the drain voltage has no effect on the gate charge because the pinch-off disconnects the drain from the channel.

As clear as this seems now, both Meyer [78] and others [79] assumed that the capacitances should be reciprocal. The intrinsic capacitances should not be confused with the small-signal circuit element terms, which are named the same way but actually are reciprocal by definition. Ward and Dutton [80] were the first to argue that the intrinsic capacitances were, in fact, nonreciprocal.
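The non-reciprocity argument can be illustrated numerically. The sketch below is a toy and not a model from this work: it assumes an ideal long-channel device held in saturation, with the gate charge taken as (2/3)·Co·(VGS − VT) and the drain assigned a fixed 4/15 share of the channel charge (both fractions are illustrative assumptions), and it evaluates Cxy = dQx/dVy by central differences. Charges are normalized to Co × 1 V, and the function names and the threshold value are hypothetical.

```python
TERMINALS = ("g", "d", "s", "b")
# Every ordered (x, y) pair is a distinct C_xy = dQ_x/dV_y term.
PAIRS = [(x, y) for x in TERMINALS for y in TERMINALS]

def charges(vg, vd, vs=0.0, vt=0.5):
    """Toy normalized (Qg, Qd), meant only for VDS > VDSsat or cutoff.

    Above threshold the channel is pinched off, so neither charge
    depends on the drain voltage vd in this toy.
    """
    vov = vg - vs - vt
    if vov <= 0.0:
        return 0.0, 0.0          # no channel: both charges vanish
    q_gate = (2.0 / 3.0) * vov   # assumed saturation gate charge
    q_drain = -(4.0 / 15.0) * vov  # assumed drain share of channel charge
    return q_gate, q_drain

def capacitance(charge_index, vary, bias, h=1e-6):
    """Central-difference dQ/dV with all other terminals held fixed."""
    up = dict(bias)
    dn = dict(bias)
    up[vary] += h
    dn[vary] -= h
    return (charges(**up)[charge_index] - charges(**dn)[charge_index]) / (2 * h)

bias = {"vg": 2.0, "vd": 3.0}          # saturation: VDS > VGS - VT
c_gd = capacitance(0, "vd", bias)      # dQg/dVd
c_dg = capacitance(1, "vg", bias)      # dQd/dVg
print(len(PAIRS), c_gd, c_dg)
```

In this toy, Cgd vanishes while Cdg does not, which is the asymmetry described in the text. The negative sign of Cdg here simply follows from applying the bare definition to an electron channel charge; sign conventions differ between authors.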
Their paper also stressed the importance of including all the capacitances, particularly the gate-to-bulk (Cgb) capacitance, which had been omitted by Sah, and hence Meyer. Ward and Dutton's charge-based model was a huge improvement at the time, as Meyer's model does not guarantee charge conservation in circuit simulators (due to omitting Cgb), resulting in erroneous results for the simplest of circuits.

Papers predating Meyer's work largely used discrete devices, and so authors logically argued that modeling the intrinsic capacitances would be useless since the capacitance from packaging and external circuitry would be vastly larger [81]. Furthermore, there was no direct method to measure the data to verify the models. With the advent of integrated circuits, the primary capacitive load between CMOS circuit cells (i.e., an nMOS and pMOS inverter pair) became dominated by the intrinsic and interconnect capacitances, rather than the packaging and external circuitry. Thus, modeling the intrinsic capacitance (as well as interconnect) became important.

Integrated circuits also heralded the need for compact models to simulate large numbers of transistors. One of the first compact models was CSIM [82] from AT&T Bell Labs. Surprisingly, the authors of this model stayed with the simple Meyer model, although they argued that including the intrinsic capacitances was critical, particularly Cgd. Cgd accounts for most of the intrinsic capacitance load in CMOS circuits due to the Miller feedback effect [82]. Berkeley's BSIM [33], which built upon CSIM, also retained the Meyer model. BSIM2, however, corrected this deficiency by including a nonreciprocal intrinsic capacitance model. The BSIM3 [34] model moved from a strongly empirical d.c. model to a more physically based model, but retained the unaltered a.c. model (including intrinsic capacitances) from BSIM2, suggesting a lag in a.c. model development. Current a.c. models are extremely poor.
A great deal of additional research is needed before a.c. models become nearly as sophisticated as d.c. MOS current models. There are two reasons the a.c. models are so far behind the d.c. models. First, intrinsic capacitance data have only been available since the early 1980s, over twenty years after the first MOS transistor ID data. Second, until recently, external capacitances and interconnect capacitances dominated the total capacitive load, making the intrinsic capacitance fairly unimportant. However, as the transistor dimensions have decreased and substantial improvements in drain current density have become difficult due to physical limitations, major efforts have been made to reduce the interconnect capacitances, such as the use of low-k dielectrics. This has increased the impact of intrinsic capacitances on overall circuit performance and, with improved interconnect, the intrinsic capacitances could become the predominant capacitive load in the circuit. It is interesting to note that publications on intrinsic capacitance modeling have been increasing year-to-year since Ward and Dutton's work [83-90].

Measurement of Intrinsic Capacitances

Because direct measurement of the intrinsic capacitance is difficult, many of the first measurements were done with on-chip circuitry using reference capacitors [92-93] or op-amp circuits configured as coulombers [94]. Eventually, external circuitry was used, including a lock-in amplifier connected to an HP 4145 (as a voltage source) [95] and later an off-the-rack LCR meter [96], such as the HP 4275A.

In this section, the measurement of a few of the intrinsic capacitances will be described. These can be done using an HP 4275 or HP 4276 (same equipment with different a.c. frequency ranges), or the newer HP 4284. The first discussion of using an LCR meter for the direct measurement of intrinsic capacitances was written by K. C.-K. Weng and P. Yang in 1985 [96]. In this letter, many of the important problems with measuring the intrinsic capacitances were discussed.
The main problem is that LCR meters are not designed to measure intrinsic capacitances. There are two sets of terminals on the LCR meter: High and Low. The high port applies the d.c. bias as well as the superimposed a.c. test signal. The low port measures the resulting small-signal current. From the magnitude and phase difference of the current relative to the applied small-signal test voltage, the capacitance can be found. Unfortunately, the low port is a virtual a.c. and d.c. ground, so no d.c. bias may be applied to it.

To measure Cgd (dQg/dVd), the high port is attached to the drain (to apply the dVd) while the low port is attached to the gate (to measure the dQg via the small-signal current, ig times dt). If Cgd is desired as a function of VGS, the problem becomes apparent: how can VGS be ramped if the gate is grounded? The only solution, of course, is to independently bias the three terminals not connected to the low port, as shown in Figure 4.1 for a Cgd measurement. Thus, two additional power supplies are required, along with the internal d.c. power supply in the LCR meter. These power supplies must be well calibrated with one another to ensure that no potential difference exists between them when the same voltage is programmed. The burden of negotiating the polarities of the three power supplies, once worked out, can be easily programmed into an automated station.

As an example of the polarity problem, consider the following: if Cgd at VGS = 2 V, VDS = 3 V, and VXS = 0 is desired (note: the device is active, with a current flowing from the drain to the source, unlike standard CV measurements, where the source, drain, and substrate are tied together), the source and substrate can be biased at −2 V and the drain can be biased at +1 V. Since the gate is at virtual ground (VG = 0), it is easy to verify that the above applied voltages give the desired potential differences (VGS, VDS, and VXS).
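The polarity bookkeeping that an automated station would carry out can be sketched in a few lines. This is a minimal illustration, assuming the gate sits on the LCR low port and is therefore the 0 V reference; the function name and dictionary keys are hypothetical and are not taken from any instrument driver.

```python
def terminal_biases(vgs, vds, vxs):
    """Voltages to program on source, drain, and substrate (in volts)
    when the gate is held at virtual ground by the LCR low port."""
    v_gate = 0.0                    # low port: virtual a.c. and d.c. ground
    v_source = v_gate - vgs         # so that VG - VS equals the desired VGS
    v_drain = v_source + vds        # so that VD - VS equals the desired VDS
    v_substrate = v_source + vxs    # so that VX - VS equals the desired VXS
    return {"gate": v_gate, "source": v_source,
            "drain": v_drain, "substrate": v_substrate}

# Desired operating point: VGS = 2 V, VDS = 3 V, VXS = 0 V.
biases = terminal_biases(2.0, 3.0, 0.0)
print(biases)
# {'gate': 0.0, 'source': -2.0, 'drain': 1.0, 'substrate': -2.0}
```

With VXS = 0 the source and substrate receive the same voltage, which is why one supply can serve both terminals in that case.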
There is nothing particularly odd about this configuration except that it differs from traditional CV measurements, where the substrate, rather than the gate, is the ground reference. In the above case of Cgd, the source and substrate may be tied together to forego one of the power supplies in Figure 4.1. If a nonzero VXS were required, however, all terminals would have to be biased independently. Thus, if one is designing a measurement station where any of the possible intrinsic capacitances can be measured, three power supplies (including the internal one of the LCR meter) are necessary.

Fig. 4.1 Measurement configuration for Cgd. Requires LCR meter with internal d.c. power supply, as well as two additional external d.c. power supplies.

Measurement Configurations

Although the standard textbook MOS device is symmetric with respect to interchanging the source and drain, production devices may be asymmetric. This asymmetry may be the result of implant shadowing, drain and/or source engineering, or hot-carrier-induced degradation, among other possibilities. Implant shadowing is an interesting case, as it may result in the gate/source and gate/drain overlap regions being different lengths, as shown in Figure 4.2. While the resulting ID characteristics are symmetric (that is, the ID versus VDS characteristics are the same if the source and drain leads are swapped), the measured Cgd characteristics (as well as Cgs, Cdg, and Cds) are asymmetric. This occurs because the measured characteristics include the constant overlap component, as shown in the following simple equation:

Cgd_measured = Cov_drain + Cgd

The Cov_drain term is composed of the constant overlap of the gate with the drain, as well as an inner and outer fringe component. These fringe components have been calculated theoretically [97], and assuming they are constant as a function of gate voltage introduces negligible error [96].
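As a minimal sketch of how the equation above might be applied, the snippet below subtracts a constant overlap term from a list of measured Cgd values. The function name and the numbers are hypothetical and serve only as an illustration; they are not data from this work, and Cov_drain is simply treated as a known constant.

```python
def intrinsic_cgd(cgd_measured_fF, cov_drain_fF):
    """Remove a constant overlap term: Cgd = Cgd_measured - Cov_drain (all in fF)."""
    return [c - cov_drain_fF for c in cgd_measured_fF]


# Hypothetical values for illustration only.
measured = [18.0, 18.0, 24.0, 30.0]  # fF, measured Cgd at increasing VGS
cov_drain = 18.0                     # fF, assumed constant overlap plus fringe term
print(intrinsic_cgd(measured, cov_drain))
```

Treating the overlap term as bias-independent follows the assumption stated in the text; a bias-dependent fringe model would need a different correction.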
The value of the measured Cgd in subthreshold (where Cgd_measured = Cov_drain) has been used to estimate the length of the gate-to-drain overlap region [98], and with the drawn channel length known, these overlap values could be used to extract the effective channel length.

Fig. 4.2 Simplified schematic of asymmetric gate overlap, which results in Cov_drain ≠ Cov_source.

Fig. 4.3 Measurement configurations for (A) Cgd in normal configuration mode; (B) Cgs in normal configuration mode; (C) Cgd in reverse configuration mode; and (D) Cgs in reverse configuration mode.

When necessary, the 'normal' and 'reverse' configurations of Cgd and Cgs measurements will be specified. These are shown in Figure 4.3. Cgd_norm refers to the 'normal' measurement mode, where the high port is applied to the drain for a Cgd measurement. Cgd_rev refers to the 'reverse' measurement mode, where the high port is applied to the source for a Cgd measurement. This distinction is necessary because, for short-channel devices, the resulting Cov value (where Cov is Cov_drain or Cov_source) can become a significant fraction of the total effective intrinsic capacitance. Although perhaps not obvious now, Cgs = Cgd when VDS = 0. However, because of the difference in Cov, Cgs_measured may not equal Cgd_measured. Figure 4.4 shows the Cgs and Cgd measurements in the normal and reverse modes for a 20 × 20 µm device. Figure 4.5 shows the same measurements on a 20 × 0.40 µm device (effective channel length is 0.24 µm).
Comparing the two figures clearly shows the negligible impact of Cov on the long-channel device Cgd and Cgs characteristics and the large impact on the short-channel device. In both cases, the Cgd and Cgs values are almost identical, as is the overlap-induced difference of about 3 fF. (This 3 fF offset is not visible on the Ldrawn = 20 µm device because it contributes less than 2% to the maximum capacitance, whereas the overlap contributes about 60% of the total measured maximum capacitance for the Ldrawn = 0.40 µm device.) Later in this chapter, the results of channel hot-carrier stress on Cgd and Cgs will be shown. Because channel hot-carrier stress is inherently asymmetric (since the damage occurs near the drain edge), it is necessary to lay down the above notation for later use.

Fig. 4.4 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20 × 20 µm MOST with VDS = 0.0 V. Although it appears that all four curves are the same, there are actually two sets of curves, Cgd_norm/Cgs_rev and Cgd_rev/Cgs_norm, separated by 3 fF. Very little difference is seen because the overlap capacitance shift is much less than the peak Cgd and Cgs values. Compare this with Fig. 4.5.

Fig. 4.5 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20 × 0.40 µm MOST with VDS = 0.0 V. The roughly 3 fF parallel shift between Cgs_norm/Cgd_rev and Cgs_rev/Cgd_norm is due to a difference in constant overlap capacitance between the source and drain.

Sample Measurements

For all capacitance measurements in this chapter, an HP 4284A LCR meter was used with a small-signal voltage of 60 mV peak-to-peak at 400 kHz. These values were chosen after testing a wide range of a.c. signal voltages and frequencies to obtain the most accurate results. The 60 mV signal may seem a little large to those familiar with common CV measurements, where 25 mV is typically used, but it is actually on the low end of the 23 mV to 400 mV found in most intrinsic capacitance papers [95-96, 98-106]. Frequencies below 100 kHz result in extremely poor-resolution (noisy) intrinsic capacitance data, while frequencies above 500 kHz begin to show a marked reduction due to series resistance. LCR-specific settings on the HP 4284A were a medium integration time with 8-cycle averaging.

So far, the measurement procedures and naming conventions of intrinsic capacitance have been discussed. Figures 4.4 and 4.5 showed sample measurements with VDS = 0. Although this is the typical way capacitances are measured, the ability to measure the capacitance of active devices, where VDS > 0 when VGS > VGT (where VGT is the threshold voltage at which an inversion channel forms between the source and drain), is important because this condition commonly occurs in a real circuit. If a correct model for the behavior of an operating transistor is desired, then data from an active device are required. Indeed, without these data, it would be like trying to verify an IDsat model with data taken only in subthreshold!

Examples of Cgd measurements on active devices are shown in Figures 4.6 and 4.7 for 20 × 20 µm and 20 × 0.40 µm devices as a function of VGS for VDS = 0.0, 0.5, and 1.0 V (VSX = 0.0 V). Cgd transitions from Cov_drain to a larger value once VDS < VDSsat, that is, once the channel is no longer pinched off. From a charge perspective, this means that changes in VDS (dVd) cause changes in Qchannel, which in turn cause changes in Qg (dQg), which gives rise to an intrinsic Cgd. Thus, as VDS increases, the VGS at which this transition occurs also increases, as can be seen in the figures.

Fig. 4.6 Cgd versus VGS for a 20 × 20 µm MOST with VDS = 0.0, 0.5, and 1.0 V.

Fig. 4.7 Cgd versus VGS for a 20 × 0.40 µm MOST with VDS = 0.0, 0.5, and 1.0 V.

As mentioned previously, Cgd is the most important intrinsic capacitance because, in a common-source configuration (which is the configuration for all CMOS circuits), the effective load is 2(Cgs + Cgd(1 - Av)), where Av is the gain between the gate input and drain output (a large negative number). The next most important capacitance, based on the above load formula, is Cgs. Figures 4.8 and 4.9 show both Cgs and Cgd for 20 × 20 µm and 20 × 0.40 µm devices as a function of VGS for VDS = 0.0, 0.5, and 1.0 V. Unlike Cgd, Cgs will have a finite value as long as VGS is greater than VGT, since the channel will always be connected to the source. At VDS = 0, Cgs = Cgd since the channel charge is equally controlled by the source and drain. However, if VGS > VGT (channel forms) and VDS > VDSsat (drain pinched off), then the source terminal will actually control more than half of the channel charge, resulting in a rise in Cgs above the value at VDS = 0. Once VGS increases to the point that VDS < VDSsat, the drain is no longer pinched off, and the Cgs value begins to decline with increasing VGS as Cgd increases rapidly. This is clearly demonstrated in Figure 4.8 (and to a lesser extent in Figure 4.9), where the decline in Cgs corresponds to the increase in Cgd. The model for Cgd and Cgs will be discussed later. Recalling the discussion of the overlap-capacitance shift in the previous section, the capacitances shown in Figures 4.8 and 4.9 are actually Cgd_norm and Cgs_rev, chosen to offset the effects of the overlap capacitance (Fig. 4.5 shows why this was necessary).
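To make the weighting in the effective-load expression concrete, the short sketch below evaluates it, reading the expression quoted above as 2(Cgs + Cgd(1 - Av)). The function name and the numerical values are hypothetical and chosen only for illustration; they are not measurements from this work.

```python
def effective_load(cgs, cgd, av):
    """Effective capacitive load 2*(Cgs + Cgd*(1 - Av)), as quoted in the text.

    `av` is the gate-to-drain voltage gain (negative for a common-source stage).
    Capacitances may be in any consistent unit.
    """
    return 2.0 * (cgs + cgd * (1.0 - av))


# Hypothetical values: Cgs = 10 fF, Cgd = 5 fF, gain Av = -10.
print(effective_load(10.0, 5.0, -10.0))  # Cgd is weighted by (1 - Av) = 11
```

With a gain of -10, Cgd is multiplied by 11 while Cgs enters with unit weight, which illustrates why Cgd is ranked ahead of Cgs in the discussion above.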
Fig. 4.8 Cgd and Cgs versus VGS for a 20 × 20 µm MOST with VDS = 0.0, 0.5, and 1.0 V.

Fig. 4.9 Cgd versus VGS for a 20 × 0.40 µm MOST with VDS = 0.0, 0.5, and 1.0 V.

Channel Hot-Carrier Stress Effects on Cgd and Cgs

Because the intrinsic capacitances are somewhat difficult to measure, and because intrinsic capacitance made a relatively small contribution to circuit performance in past generations, very little work has been done to investigate the impact of hot-carrier stress on intrinsic capacitance. Although the first report of hot-carrier degradation of ID was published in 1975 by Abbas and Dockerty [1], the first investigation of Cgd and Cgs degradation was not published until 1988 by Yao, Peckerar, Friedman, and Hughes [107]. Since then there have been several papers [102-106] by two research groups showing Cgd and Cgs degradation for various stress conditions. Only one paper, by Dai, Walstra, and Lee [108], showed the impact of Cgd and Cgs degradation on circuit performance. This section will present those data, a model for the degradation [109], and additional supplementary information not released in that short paper.

Transistors from a 0.35 µm CMOS technology for 2.5 V operation were used; these are the same devices shown throughout this chapter. Drawn channel lengths were 0.40 µm and 0.48 µm, with effective channel lengths of 0.24 µm and 0.32 µm, respectively. Accelerated stress was performed using the following procedure:

1) Take unstressed ('fresh') ID versus VDS data from 0 to 2.5 V at VGS = 2.5, 2.0, 1.5, and 1.2 V.
2) Take 'fresh' Cgs (normal mode) for reference.
3) Take Cgd (normal mode) versus VGS from 0 to 2.5 V at VDS = 0.0, 0.5, and 1.0 V.
4) Without reprobing, stress for exponentially longer times (see next paragraph for stress conditions), followed by capacitance measurements as in (3).
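One way to read "exponentially longer times" in step 4 is as a geometric sequence of cumulative stress times. The sketch below generates such a schedule; the starting time, the ratio, the number of steps, and the function name are all hypothetical placeholders, since the actual stress conditions are not given in this passage.

```python
def stress_schedule(first_s, ratio, n_steps):
    """Geometric sequence of cumulative stress times and the per-step increments.

    All arguments are hypothetical; they do not reproduce the conditions of this work.
    """
    cumulative = [first_s * ratio**k for k in range(n_steps)]
    increments = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    return cumulative, increments


# Hypothetical example: 10 s first stress, each cumulative time 10x the previous.
total, step = stress_schedule(10.0, 10.0, 4)
print(total)  # cumulative stress time after each step, in seconds
print(step)   # additional stress applied at each step, in seconds
```

Each entry in the second list is the extra stress applied before the capacitance measurements of step 3 are repeated.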