UFDC Home  UF Theses & Dissertations   Help 
Material Information
Thesis/Dissertation Information
Subjects
Notes
Record Information

Full Text 
PAGE 1 1 PARALLEL SIMULATION OF NICKEL SILICIDE SILICON AND NICKEL SILICIDE GALL IUM ARSENIDE CONTACT RESISTANCE IN A WIDE DOPING RANGE By DUKJIN KIM A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIA L FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE UNIVERSITY OF FLORIDA 2011 PAGE 2 2 2011 D ukjin Kim PAGE 3 3 To my family PAGE 4 4 ACKNOWLEDGMENTS First of all, I need to thank Prof. Jing Guo for his encouragement guidance and patience He taught me how to approach unexpected and complicated problems which I encounter ed while working on this thesis by always guiding me in the right direction His concise explanation s on device physics and computer program s ha ve been of great help. I thank very much Prof essor Gijs Bosman and Ant Ural for serving on my committee teaching device physics and giving priceless advice on this work. I also thank my colleagues, Dr. Yijian Ouyang, Jason Seol Yang Lu, Qun Gao, Wenchao Chen, Jyotsna Chauhan Dr. Ba la Kumar and Leitao Liu for their collaboration. PAGE 5 5 TABLE OF CONTENTS page ACKNOWLEDGMENTS ................................ ................................ ................................ .. 4 LIST OF TABLES ................................ ................................ ................................ ............ 7 LIST OF FIGURES ................................ ................................ ................................ .......... 8 LIST OF ABBREVIATIONS ................................ ................................ ........................... 10 ABSTRACT ................................ ................................ ................................ ................... 11 CHAPTER 1 INTRODUCTION ................................ ................................ ................................ .... 13 Contact ................................ ................................ ................................ ................... 13 Non ................................ ............................. 14 2 SILICIDE SILICON CONTACT ................................ ................................ ............... 18 N type Silicon ................................ ................................ ................................ .......... 18 Physical M odel ................................ ................................ ................................ 18 Simulation Algorithm and Strategy ................................ ................................ ... 23 P type Silicon ................................ ................................ ................................ .......... 25 Physical Model ................................ ................................ ................................ 25 Simulation Algorithm and Strategy ................................ ................................ ... 27 Result ................................ ................................ ................................ ...................... 27 N Type Silicon ................................ ................................ ................................ .. 27 P Type Silicon ................................ ................................ ................................ .. 28 3 SILICIDE GALLIUM ARSENIDE CONTACT ................................ ........................... 44 Physical Model and Simulation Algor ithm ................................ ............................... 44 Result ................................ ................................ ................................ ...................... 44 N type Gallium Arsenide ................................ ................................ ................... 44 P type Gallium Arse nide ................................ ................................ ................... 44 4 NUMERICAL INVESTIGATION ................................ ................................ .............. 48 Simulation Environment ................................ ................................ .......................... 48 P arallel Computing Performance Analysis ................................ .............................. 48 5 NUMERICAL ISSUES ................................ ................................ ............................. 53 Summation ................................ ................................ ................................ .............. 53 Integration ................................ ................................ ................................ ............... 54 PAGE 6 6 Poisson Equation ................................ ................................ ................................ .... 55 6 CONCLUSION ................................ ................................ ................................ ........ 59 APPENDIX THE FORTRAN CODE FOR THE APAPTIVE SIMPSON ALGORITHM ....................... 60 LIST OF REFERENCES ................................ ................................ ............................... 63 BIOGRAPHICAL SKETCH ................................ ................................ ............................ 65 PAGE 7 7 LIST OF TABLES Table page 4 1 Runtime (16 CPUs) for all simulation programs. ................................ ................. 52 PAGE 8 8 LIST OF FIGURES Figure page 1 1 Band profile of silicide n type silicon contact ................................ ...................... 1 7 1 2 Device structure of a channel with source and drain. ................................ ......... 1 7 2 1 The effective mass model and silicide silicon (n type) contact .......................... 2 9 2 2 Image charge ................................ ................................ ................................ ..... 30 2 3 Image potential energy ................................ ................................ ...................... 30 2 4 Flow chart of parallel algorithm for n type silicon ................................ ................ 31 2 5 The band diagra m of the silicide silicon (p type) contact ................................ ... 32 2 6 Conduction band profile for Nd=1.9e20. ................................ ............................. 32 2 7 Conduction band profile for Nd =3.3e18 ................................ .............................. 33 2 8 Transmission for n type silicon ................................ ................................ .......... 34 2 9 Comparison between the transmission and the conduction band. ...................... 34 2 10 The multiplication of the transmission and the thermal broadening function for n type Si. ................................ ................................ ................................ ............ 35 2 11 The resistances calculated from ( 2 2, solid line) and (2 7, diamond line) ......... 36 2 12 The simulation and experimental data for silicide n type silicon contact ............ 37 2 13 The resistances over various barrier heights ................................ ..................... 38 2 14 Valence band profile for Na=3e20. ................................ ................................ ..... 39 2 15 Valence band profile for Na=1.7e18 ................................ ................................ .. 40 2 16 Transmission for p type silicon .. ................................ ................................ ........ 41 2 17 The multiplication of the transmission and the thermal broadening fun ction for p type s i licon ................................ ................................ ................................ ...... 42 2 18 Comparison between the transmission and the valence band. .......................... 42 2 19 The simulation and experimen tal data for silicide p type silicon contact ............. 43 3 1 The resistances from numerical integration and 2D Fermi function ................... 45 PAGE 9 9 3 2 Comp arison between n type GaAs and n type Si simulation resistances ........... 46 3 3 p type GaAs and p type Si simulation resistances ................................ ............ 47 4 1 Parallel algorithm runtime ................................ ................................ .................. 50 4 2 Parallel algorithm speedup ................................ ................................ ................ 51 4 3 Parallel algorithm efficiency ................................ ................................ ............... 52 5 1 P(E,a,b) plot ................................ ................................ ................................ ........ 57 5 2 Three examples of inappropriate tolerance. for the Simpson algorithm ............... 58 5 3 Tunning integral value for the Simpson algorithm ................................ .............. 58 PAGE 10 10 LIST OF ABBREVIATION S NEGF Non equilibrium Green s function MOSFET Metal oxide semiconductor field effect transistor GHz Giga Hertz GB Giga Bityes CPU Centr al processing unit ID Identification 2D Two dimension s PAGE 11 11 Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science PARALLEL SIMULATION O F NICKEL SILICIDE SILICON AND NICKEL SILICIDE GALLIUM ARSENIDE CONTACT RESIST ANCE IN A WIDE DOPING RANGE By Dukjin Kim August 2011 Chair: Jing Guo Major: Electrical and Computer Engineering Scaling of solid state device s has been successful during the past 40 years. As device feature size reduce d the density of transistors in a single chip has increased and this improved cost and performance However, no one is sure if Moor s law would continue for the next decade due to the barriers to extend scalin g such as gate tunneling, subthreshold channel leakage, device parameter variability, parasitic resistance and capacitance. There have been intensive efforts to continue Moore s law. C urrent research efforts are mainly focused on new channel material to i mprove carrier mobility However, it is reported that parasitic effect including contact resistance is also a serious limiting factor. In this thesis, simu lation o f specific resistance s of NiSi / Si contact and NiSi/ GaAs contact in a broad doping range for n type and p type dopants are presented. The simulated Si resistance s are compared to experimental data, but the simulated GaAs resistance s are compared to the simulated Si resistance s due to lack of experimental data. Simulation has been performed by par allel computing and the non equilibrium Green s function (NEGF) formalism is used for modeling the ballistic transport in the PAGE 12 12 contact For parallel programming F ORTRAN with the Message Passing Interface (MPI) is used. PAGE 13 13 CHAPTER 1 INTRODUCTION Contact One of the main purposes of scaling solid state device is to improve the density of transistors and the circuit speed. Increasing density is crucial to reduce manufacturing cost and fast circuit speed is need ed to i mprove the device performance. S caling device size guarantees higher transistor density. Nonetheless, device speed cannot be increased proportionally with physical device size. The metal oxide semiconductor field effect transistor (MOSFET) structure has an intrinsic parasitic resistance which cannot be scaled down proportionally with feature size and as a result, it degrade s speed significantly [ 1 ], [ 2 ] The intrinsic parasitic resistance includes contact resistance and diffusion sheet resistance. Both of the resistances are import ant with respect to speed degradation. However, the contact resistance is considered to be the most serious component [ 3 ], [ 4 ]. The s ilicide diffusion contact has been employed to reduce the contact resistance. The s ilicide diffusion contact is a Schottky contact. The b and profile for silicide n type Si contact is shown in Fig. 1 1. The contact in this thesis is phenomenologically modeled. The energy level of the bottom of the Fermi sea of silicide is set to be low enough to mimic the real contact. Th is va lue is 2eV for n t ype material and 2eV for p type material. The barrier height is fitted using the experimental data [ 5 ] If we assume that a Schottky contact is formed at one end of the semiconductor, the band profile at the o t her end is assumed to be flat to make the tran smission perfect. T he Fermi level is set to be at zero. PAGE 14 14 Non Equilibrium Green s Function (NEGF) The NEGF formalism is widely used for nanoscale device modeling [ 6 ], [ 7 ]. In the NEGF formalism the device shown in Fig. 1 2 is described by a Hamiltonian (H) self energies ( 1 2 ) and overlap matrix (S) for the ballistic transport The Hamiltonian includes band structure information Eigen values of t he Hamiltonian are energy states of the channel. The self energies represent coupling between th e sourc e and the channel and b etween the drain and the channel Specifically, discrete energy levels of the channel are broadened and become continuous due to the formation of contact. This phenomenon is described by self energy terms. The overlap matrix presents the excitation effect of the channel due to electron waves leaking into the channel from the contact. For incoherent transport case, additional term, Scat is needed to capture the scattering mechanism. In this thesis, only ballistic transport is considered. Now, we can write the retarded Green s function G, for the channel at energy E (1 1) where 0 + is a positive infinitesimal and U is self consistent potential in the channel calculated by the Poisson solver. Under equilibrium case, the density of states (DOS) is defined as (1 2) where t he spectral function, A(E), is calculated from G (1 3) T he density matrix, [ e ], is (1 4) PAGE 15 15 where is the equilibrium Fermi level and f(E ) is the Fermi Dirac distribution function. (1 5) where k B is the Boltzmann constant and T is the temperature Under non equilibrium condition, we have two different spectral functions ( A 1 A 2 ) and Fermi functions ( f 1 f 2 ) for the source and drain respectively The density matrix is now defined as (1 6) where (1 7) (1 8) (1 9) 1,2 is called the Broadening matrix and represents the energy level broadening effect The density matrix equation, (1 6), implies that if 1 > 2 the sou r ce pumps electrons into the channel until the channel Fermi level is equal to 1 and the drain tak es electrons out of the channel until the channel Fermi level is equal to 2 However, neither the source nor the drain can be in equilibrium with the channel. This process takes place continuously as long as 1 > 2 This mechanism explains how electron de nsity in the channel is determined and h ow electron s flow through the device Coherent transport is usually described by transmission formalism. Transmission T(E) is calculated as follows (1 10) PAGE 16 16 With T(E) and the Fermi functions for the source and drain, the current is defined as (1 11) This is a one dimensional cur rent. Spin degeneracy is not taken into consideration here. The derivative of the Fermi function in the current equation (1 11) gives conductance C (1 12) where the thermal broadening function, F T (E) is defined as (1 13) The resistance, is calculated from C. (1 14) PAGE 17 17 Figure 1 1. Band profile of silicide n type Si contact. Figure 1 2 Device structure of a channel with source and drain. 1 and 2 are the Fermi levels of the source and the drain respectively Hamiltonian matrix, H, contains the band structure information of the channel. Self energies, and represent the coupling between the source an d the channel and between the drain and the channel, respectively The overlap matrix, S, describes the excitation of the channel due to the formation of the contact. PAGE 18 18 CHAPTER 2 SILICIDE SILICON CON TACT N type Silicon Physical M odel For n type Silicon, one band effective mass model discretized by the finite difference method is employed (see Fig. 2 1 top). The model is valid due to the fact that energy level s in the conduction band are parabolic, isotropic and well separated from each other. The silicide silicon contact is basically a Schottky contact. This contact is phenomenologically modeled (see Fig. 2 1 bottom). The silicide Fermi energy level E m which is equal to the conduction band edge of s i licide and the silicon Fermi energy level, E F are set to be at zero Open boundary con dition is applied at the side of the silicon which is opposite to the Schottky contact The barrier height bn is fitted using the experimental data [ 5 ] The energy level of the bottom of the Fermi sea is assumed to be 2eV. The image charge barrier lowering is also considered. In the previous chapter, conductance equation is derived as, (1 12) This is a one dimensional equation and a two dimensional cross section area is not taken into consideration. Let us assume tha t transport is in z direction and the cross sectional area is in (x, y) space. To cover (x, y ) space, we simply do a summation of (1 12) over a (k x k y ) spa ce with assumption that periodic boundary conditions are applied and the transverse modes are not coupled. There are two methods of this s ummation (2 1 ) PAGE 19 19 and ( 2 2 ) where m c is conduction band effective mass. Transmission T(E), is de pendent on k x k y and k z Therefore, (2 1 ) is correct But because (k x k y ) modes are all decoupled, transmission is just shifted by the (k x k y ) modes. Moreover, we do the summation over the entire (k x k y ) space. This proves that (2 2 ) gives the same re sult as (2 1 ). The unit of conductance from the two equations is [simens /m 2 ] There is another way to take the cross sectional area into consideration for conductance. The following is the current equation derived in the previous chapter. (1 11 ) This is a one dimensional current equation. W e can do summation of this equation over (k x k y ) space. (2 3 ) (2 4 ) (2 3 ) and (2 4 ) are equivalent for the same rea son as in the (2 1) and (2 2) case. We can change (2 4 ) into an analytical expression by introducing a two dimensional Fermi function f 2D (E), [ 8 ] PAGE 20 20 (2 5 ) Replac ing f(E) by f 2D (E) in (2 4 ), (2 6 ) Conducta nce is the derivative of (2 6 ). (2 7 ) Transmission is need ed to calculate conductance As shown in chapter 1, we can find transmission using overlap matrix, Hamiltonian self energy and potential profile. First, overlap matrix is an identity matrix This is because only orth o normal basis functions are used in thi s one band effective mass model. Hamiltonian matrix is shown as ( 2 8 ) PAGE 21 21 H is a sparse matrix because o nly three diagonal elements are non zero. If N atoms are used for modeling, siz e of H is N x N. In Fig. 2 1 top, the leftmost atom represents a silicide atom. The next atom is in the intersection between silicide and silicon. The rest of the atoms are silicon. Likewise, H(1,1) is a s ilicide atom, H(2,2) is an atom in the intersection and H(3,3) ~ H(N,N) are silicon atoms. Self energ y is a n N x N sparse matrix, too. It has only one non zero element. (2 9 ) (2 1 0 ) where k 1 and k 2 are wavenumbers and can be calculated from the dispersion relation. (2 1 1 ) Potential U, is calculated self consist ently from the Poisson equation and the Boltzmann Maxwell approximation equation. ( 2 1 2 ) (2 1 3 ) PAGE 22 22 First, a guess ed initial value of U is used to calculate n(z) value fro m (2 1 2 ). E c is equal to q U. N ext, U is calculated from (2 1 3 ) using the n(z) value. T his U value is compared to the initial U value. If the difference is too large, the process is repeated again. This process is repeated until the error is within an allowed error rang e. T he image charge effect should be included in the potential energy profile Let us assume that there is negative charge ( q) inside silicon and t he distance between the negative charge and the surface of metal is x ( Fig 2 2). An equal amount of p ositive charge will be induced at the surface due to the negative charge. The induced surface charge can be replaced by positive charge (+q) at a distance x from the metal surface. This charge is called the image charge. There is an attractive force between the negative charge and image charge. (2 1 4 ) F is called the image force. The amount of work to bring an electron from infinity to t he point x is (2 1 5 ) By definition, E(x) is electron potential energy at a distance x from the metal surface and is shown in Fig. 2 3. As a cons equence, the Schottcky barrier is lowered by E(x) and PAGE 23 23 this phenomenon is referred to as the image charged induced barrier lowering. E(x) should be added to E c and next, E c is added to the diagonal elements of Hamiltonian as shown in (2 8 ) Simulation Algor ithm and Strategy The thr ee expressions for conductance are, (2 1 ) (2 2 ) (2 7 ) (2 7 ) requires the least amount of computation because it is an analytical equation (2 1 ) computes T(E) as many times as the number of grid points in (k x k y ) summation sp ace and (2 2 ) computes F T (E) over the same number of gird points Less computational effort is required in c omputing F T (E) h ence, ( 2 2 ) is more efficient then (2 1) In this thesis, ( 2 2) and (2 7 ) are adapted to simulate conductance. The results from the two algorithms will be compared. (2 7 ) contains no parallel workload and therefore its algorithm should be performed by seria l program. On the other hand, (2 2 ) has good parallel workload i.e. summation. The parallel algorithm flow chart is given in Fig 2 4. One processor is named master and other processor s are named slave s F irst, the m aster processor calculates the self consistent potential energy which is the conduction band edge, E c The following step is the summation of the thermal broadening func tion over (k x k y ) space. This step requires the largest computation workload and is parallelized. Let us assume PAGE 24 24 that grid points for k x are in a range from 1 to r. For simplicity k y grid points ar e assumed to be from 1 to r, as well Summation over a two dimensional space could be performed by two loops: one for k x and another for k y Par allelization could be achieved by dividing one of the two loop s All of the processors perform summation over the entire k y grid but only a part of k x grid. If the number of the processors is p, r/p of k x grid points is assigned to each of the processors. The range, 1 to r, should be appropriately assigned. Each processor should be aware of which part of the range it should work on. This is done using processor ID s Each o f the processor s has its own ID from 0 to (p 1). Once r is decided, the specific range for each of the processor s is decided through the ID s To initiate the slave processors, the master broadcasts parameters for summation of the thermal broadening functio n and finalizing flag (=0) to the slave processors using MPI_BCAST. MPI_BCAST is a collective communication function and is more efficient than a one to one communication function w here one processor sends the same data to all processors. While the slaves are working on the summation, the master also performs the summation in the assigned range. Next, the master collects the summed data from the slaves using MPI_GATHER. MPI_GATHER is also a collective communication function and is efficient when one process or receives data from all other processors. The m aster sums the collected data. The resulting value is named f_prime. Next, transmission is calculated. f_prime weighted by transmission is input (integrand) to the integration algorithm which uses t he adapti ve Simpson rule This algorithm is recursively repeated until the integral value is within predefined error If the algorithm needs to be repeated with an other input when the integral value is out of the error the master goes back to the broadcasting step Otherwise, the master finalizes the slaves PAGE 25 25 by broadcasting finalizing flag ( =1 ) The resistance value is printed and the master terminates. P type Silicon Physical M odel The e nergy levels in valence band are spaced closely and they are significantly an isotropic and non parabolic. One band effective mass model is not valid for the valence band. Instead, three bands effective mass model is adap ted. The heavy hole, light hole and spin orbit split off band s are used In t he case of n type silicon, the one d imensional transmission is just shifted by additional (k x k y ) modes and the shape of transmission plot remains unchanged It could be easily understood because all the m odes are assumed to be decoupled This assumption is no longer valid fo r the valence b and case. T r ansmission with additional (k x k y ) mode is shifted from the original transmission and its shape also changes. T herefore, mathematical t echniques for (2 2) and (2 7 ) cannot be applied here. Instead, conductan ce equation is given as follows (2 8 ) The dispersion relation is shown in (2 9 ) [ 9 ] The Luttinger parameters are material dependent [ 10 ] (2 9 ) is discretized by the finite difference method and t he resulting matrix is a Hamiltonian matrix. If N atoms are used in this model, the Hamiltonian is a 6N x 6N block tridiagonal matrix. 1 is 6N x 6N sparse matrix and only the first 6 x 6 block contains non zero elements. 2 has non zero values in the last 6 x 6 block. The self energies are calculated by the Sancho Rubio approach [ 11 ]. Overlap matrix is an identity matrix and the potentia l energy is calculated through the same procedure as for the n type case. PAGE 26 26 The band diagram for p type contact is depicted in Fig. 2 5. As in n type contact, the contact at the side opposite to the Schottky contact of silicon is treated by open boundary co ndition. E F and E m are set to be at zero and the Schottky barrier is fitted to experimental data. The energy level of the bottom of the Fermi Sea for silicide is assumed to be 2eV. ( 2 9 ) where PAGE 27 27 Simulation Algorithm and Strategy The main frame of the p type algorithm is the same as that of the n type algorithm Only the s ummation part of (2 8 ) is parallelized. (2 10 ) The difference is that T(E, k x ,k y ) is summed rather th an F T (E, k x k y ). The s ummation of T(E, k x k y ) needs the self consistent potential energy. The potential energy could be calculated by master processor and broadcasted to the slave processors. However, when N atoms are used for the effective mass modeling, do uble precision buffer which has a size of N is needed for the broadcast ing C ommunication time is the main source of overhead in parallel computing Furthermore, calculating potential energy algorithm is quite fast. Therefore, it is time saving that ever y processor calculate s the potential energy a nd save s it in its own memory. Result N Type Silicon The contact resistance is simulated over ten doping concentrations (Nd) in a range from 3.3e18 to 1.9e20. The conduction band profile s of the highest and low est doping case are given in Fig. 2 6 and Fig. 2 7 Fig. 2 6 has thinner barrier width as expected. For Nd=1.9e20 the lowest conductio n band edge is at about 0.1eV and accordingly transmission should start to increase from zero around this energy This i s shown in Fig. 2 8. The t ransmission d o e s not reach its maximum value even at 5eV. This is due to the quantum reflection effect The maximum value is six because there are six valleys at the PAGE 28 28 Si conduction band edge. In Fig. 2 9 the transmission is compare d to the conduction band over an identical energy range. The conductance is obtained from the multiplication of transmission and the thermal broadening function. The resulting function is depicted in Fig. 2 10 The resistances from (2 2) and (2 7 ) are show n in Fig. 2 11 The two values are well matched proving that the numerical integration is accurate. In Fig. 2 12 the simulation and experimental data are plotted The barrier height is fitted using the experimental value for the lowest doping concentrati on case. The lowest concentration experimental value is chosen because in low doping concentration the effect of the barrier height on the resistance is more significant. Fig. 2 13 shows the resistance s simulated over the five different barrier height s Th e resistance varies with the barrier height more significantly in low concentration. P Type Silicon The p type doping (Na) is in a ra nge from 1.7e18 to 3e20 for the resistance simulation The valence band profiles are given in Fig. 2 14 and Fig. 2 15 for the highest and lowest doping case s, respectively The transmission which is flipped compared to n type case is shown in Fig. 2 16. The maximum transmission is also six because three bands including the heavy hole, li ght hole and spin orbit split off band s are used and spin degeneracy is included The resulting function from the multiplication of the transmission and thermal broadening function has the same shape with n type case, as shown in Fig. 2 17 The valence band profile is compared to the transmissi on over an identical energy range in Fig. 2 18 The simulated resistance and experimental data are given in Fig. 2 19 The barrier hei ght is tuned using the ninth point PAGE 29 29 Figure 2 1 The effective mass model and si licide silic on (n type) contact A) t1 and t2 are bonding energies and a is atomic distance. B) The contact is phenomenologically modeled The Fermi level (E m for silicide, E F for silicon) is set to be at zero and the barrier height, bn =0.499eV is fitted using the ex perimental data. The energy level of the bottom of the Fermi sea for silicide is set to be at 2eV. PAGE 30 30 Figure 2 2 Negative charge inside the silicon induces positive surface charge on the metal. This surface charge can be replaced by equal positive char ge located inside the metal. The distance of the positive charge and negative charge from the metal surface is equal. Figure 2 3 The dotted line indicates the image potential energy profile created by the positive image charge inside the metal. The i mage potential energy is added to the self consistent potential energy calculated by the Poisson solver and it lowers the Schottky barrier. PAGE 31 31 Figure 2 4 F low chart of parallel algorithm for n type silicon. Only the summation part is parallelized. PAGE 32 32 Fi g ure 2 5 The band diagram of the silicide silicon (p type) contact. The Fermi level (Em for silicide, EF for silicon) is set to be at zero and the barrier height, bp=0.44eV is fitted using the experimental data. The energy level of the bottom of the Fermi sea for siliside is set to be at 2eV. Figure 2 6 Conduction band profile for Nd=1.9e20. PAGE 33 33 Figure 2 7 Conduction band profile for Nd=3.3e18. PAGE 34 34 Figure 2 8 Transmission is zero below 0.1eV due to the bandgap and it starts to rise around 0.1eV. The transmission does not reach its maximum value immediately because of quantum reflection. Figure 2 9 Comparison between the tran smission and the conduction band. A) Transmission. B) Conduction band profile. Y axis is energy for A) and B). PAGE 35 35 Figure 2 10 The multiplication of the transmission and the thermal broadening function for n type Si. PAGE 36 36 Figure 2 11 The resistances calc ulated from (2 2 solid line ) and (2 7 diamond line ) are compared. The two values are well matched proving that the numerical integration in (2 7) is accurate. PAGE 37 37 Figure 2 12 The simulation and experimental data are plotted. The barrier height fitted us ing the experimental value of the highest concentration is 0.499eV. PAGE 38 38 Figure 2 13 The resistances over various barrier heights. The effect of barrier height is more prominent in low concentration. PAGE 39 39 Figure 2 14 Valence band profile for Na=3e20. PAGE 40 40 Figure 2 15 Valence band profile for Na=1.7e18 PAGE 41 41 Figure 2 16 Transmission for p type Si. The transmission starts to increase from zero at around 0.2eV which is the top of the valence band. It does not reach its maximum value even below the contact ba rrier due to the quantum reflection. PAGE 42 42 Figure 2 17 The multiplication of the transmission and the thermal broadening function for p type Si. Figure 2 18 Comparison between the transmission and the valence band. A) Transmis sion. B) Valence band. Y axis is energy for A) and B). PAGE 43 43 Figure 2 19 The simulation and experimental data. The barrier height fitted using the ex perimental value of the second highest concentration is 0.44eV PAGE 44 44 CHAPTER 3 SILICIDE GALLIUM ARS ENIDE CONTAC T Physical Model and Simulation Algorithm Physical model and simulation algorithm for n type and p type Gallium Arsenide are corresponding to that of Silicon except material property constants such as the effective mass for n type and the Luttinger paramet er for p type. Result N type Gallium Arsenide The simulation data from the 2D Fermi function and numerical integration are well matched as shown in Fig. 3 1 Because the e xperimental data does not exist GaAs resistance is compared to Si resistance in Fig 3 2 The same barrier height and doping concentrations are used for GaAs and Si. GaAs resistance is expected to be lower due to the lighter effective mass over all doping concentrations. However, GaAs has higher resistance for the first two points in Fig 3 2 This can be explained as follows. In low doping case, the dominant transport mechanism is thermionic emission. Therefore, GaAs which has lighter effective mass shows lower resistance. On the other hand, in high doping case, tunneling mechanism is do minant. Si has six modes coming from the six valleys and GaAs has only one mode. When all modes have high transmission due to the thin barrier width Si could have lower resistance. P type Gallium Arsenide P type GaAs resistance is also compared with Si r esistance in Fig 3 3 due to lack of experimental data. The barrier height and doping concentrations are fixed for the two materials. GaAs has always lower resistance and this is consistent w ith the fact that GaAs has light er effective mass, and GaAs and S i have the same number of modes PAGE 45 45 Figure 3 1 The resistances from numerical integration and 2D Fermi function are well matched proving that the numerical integration is accurate. PAGE 46 46 Figure 3 2 n type GaAs and n type Si simulation resistances are co mpared. The barrier height and doping concentrations are fixed for the two materials. In low concentration, GaAs resistance is lower because the effective mass in GaAs is lighter. In high concentration, Si resistance is lower because Si has six transmissio n modes and GaAs has only one transmission mode. PAGE 47 47 Figure 3 3 p type GaAs and p type Si simulation resistances are plotted. GaAs has always lower resistance. This is because the number of transmission modes of p type GaAs is equal to that of p type Si, and p type GaAs has a lighter effective mass. PAGE 48 48 CHAPTER 4 NUMERICAL INVESTIGAT ION Simulation Environment The simulation programs have been written in F ORTRAN with the Math Kernel Library (MKL). For parallel computing, the Message Passing Interface (MPI) is used The programs are run on the University of Florida High Performance Computing (HPC) Center where each computing nodes ha s 4 to 8 cores (2.2 to 2.8 GHz) and 4 to 64 GB memory ( 2.7 GB per one core on average). There are a total of 3854 cores in the HPC center. Parallel Computing Performance Analysis In parallel computing, processors interact with each other by sending and receiving data. The time needed for the data to be transferred from one processor to an other processor (s) is defined as communicat ion time. T he total parallel runtime consists of computation time and communication time. The communication time is the main source of overhead which is defined below ( 4 1 ) where T p is the time taken from the start of parallel program to the moment when the last processing unit finishes working. T s is the runtime when the program is run on a single processor. There are three main performance metrics for parallel systems: runtime, speedup and efficiency [ 12 ] N type Si simulation algorithm is analyzed by the metrics. Fig. 4 1 shows the runtime. The runtime continuously decreases until 16 CPUs, but PAGE 49 49 after that, the runtime actually increases. This is due to the fact that as more processing units are used, the communication time become s more significant. Once the communication time is comparable to computation time, we cannot obtain any benefit fr om parallelization. Speedup is ( 4 2 ) and is shown in Fig. 4 2 S reaches the peak value at 16 CPUs. Therefore, the opt im al number of processing units for this algorithm is 16. Efficiency is always less than or equal to 1 due to the overhead and drops as more CPUs are used (see Fig. 4 3 ). In Table 4 1 runtimes for all simulation algorithms are given. PAGE 50 50 Figu re 4 1. Runtime decreases until 16 CPUs and after that it increases. Beyond 16 CPUs, the overhead (communication time) is dominant and we cannot obtain any benefit from parallelization PAGE 51 51 Figure 4 2 Speedup increases until 16 CPUs and drops after that. The optimal number of CPUs for this parallel algorithm is 16. PAGE 52 52 Figure 4 3 Efficiency is always less than or equal to 1 due to the overhead. Continuously dropping E implies that the overhead continuously increases. Table 4 1. Runtime (16 CPUs) for al l simulation programs Material n type Si p type Si n type GaAs p type GaAs Runtime 2.16 [sec] 8.36 [min] 0.46 [sec] 96.26 [min] PAGE 53 53 CHAPTER 5 NUMERICAL ISSUES Summation (2 2 ) (2 7 ) Two conductance equations are used for n type silicon. ( 2 2) includes summation and (2 7 ) is an analytical equation. The result from (2 7 ) could be considered to be a perfect answer and the result from (2 2) should be within an allowed error range (1%~3%) from the answer. To achieve high accuracy, summation interval and range are critical. k x and k y can be defined as follows. (5 1 ) where L x and L y are device length in x and y direction. a and b are integers. Let us assume L x =L y =L for simplicity. Then summation interval is ( 5 2 ) The summation range for k x and k y is assumed to be the same. (5 3 ) where m is an integer. m and L should be chosen carefully to guarantee high accuracy. Function P(E,a,b) is defined as, (5 4 ) The summation and in tegration part of (2 2) is PAGE 54 54 (5 5 ) P(E,a,b) has a form of hyperbolic secant function. With fixed E, as the absolute value of a or b increases, P(E, a,b) decreases. (5 6 ) An example of (5 6 ) is given in Fig. 5 1 m is chosen such that the maximum value of P(E,m,m) is less than 1% of that of P(E,0,0). L is determined self consistently. S ufficiently large value which is in an order of micrometer s is chosen first. If the simulation data based on the first L is well matched with the data fro m the 2D Fermi function (n type ) or the experimental data (p type), shorter L is chosen for the next simulation. L is reduced in this way until the runtime of the simulation is fast enough and the simulation data is with in an allowed error range. Integration The adaptive Simpson a lgorithm is employed for integration [ 1 3 ] Setting appropriate tolerance for the algorithm is crucial to guarantee not only high accuracy but also optimized runtime. If the tolerance is too tight, the algorithm would run for a long time. If the tolerance i s too loose, error would be out of the allowed range. Moreover, machine precision should be taken into account. For instance, in the case where the machine precision is 4 digits and integral value has one digit after decimal point, the tolerance needs to b e at least 10 1 If the tolerance is less than or equal to 10 2 a PAGE 55 55 computing machine would not be able to handle it On the contr ary the tolerance could be too large and beyond the machine precision. Scale of the integral value should be considered, too. If the first non zero number of the integral value comes 2 digits after the decimal point, the tolerance should be less than or equal to 10 3 Otherwise, error would be significant (see Fig. 5 2 ) The undesired examples mentioned above could take place wh en the Simpson algorithm with fixed tolerance is used for many integrands which have integral values over a large range. In this thesis, for example, p type silicon conductance is simulated on ten different doping concentrations in a range from 1.6 e 18 to 3 e20 and the biggest conductance is 10 6 times larger th an the smallest one. In order to avoid the undesired examples, first, integral value which is conductance is estimated by using the experimental value. Constant A is chosen such that the integral va lue divided b y A is in the order of zero Next, i nput (integrand) to the S impson algorithm is divided by A Tolerance is set to be 10 4 (Tolerance can be fixed because A is different for different integral values). Output is multip lied back by A This trea tment on the input and output guarantees that all of the integral values are within 10 4 error (see Fig. 5 3 ) When the experimental values are not available, we can run the algorithm with random tolerance first. The tolerance can be adjusted after based o n the output of the first run Poisson Equation (2 1 3 ) PAGE 56 56 The Poisson equation has a form of a second order differential equat ion. The left hand side of (2 13 ) can be discretized using the finite difference method and it turns into a matrix form. (5 7 ) where [D ] is a co efficient matrix [U] is potential, c is a constant, and [N] is electron density matrix. [ U] could be found as (5 8 ) Inverting matrix is acceptable when the matrix size is small. But [D ] in this simulation has a size of double precision 1000 x 1000. Inverting [D ] would result in significant error and a waste of runtime and memory space. Basically, (5 7 ) is a linear equation and w e can use L U decomposition method instead This method requires much less computation time and the error is negligible. In addition, [D ] is tridiagonal matrix and three arrays can be used to represent [D ] L U decomposition is performed on the three array s rather than the large matrix. PAGE 57 57 Figure 5 1 P(E,a,b) decreases as the absolute value of a or b increases. PAGE 58 58 Figure 5 2 Three examples of inappropriate tolerance. A) Tolerance is too small and it is out of machine precisi on. M achine precision is assumed to be 4 digits. B) Tolerance is too big and it is out of machine precision. M achine precision is assumed to be 4 digits. C) T olerance is within machine precision but it is so large that error would be significant. Figure 5 3 Constant A is chosen to tune integral into the number in the order of zero. Input (integrand) is divided by A and the output is multiplied back by A. By using A, the fixed tolerance can be used for different integrands. The runtime is optimized and the error is always within 10 4. PAGE 59 59 CHAPTER 6 CONCLUSION The specific NiSi nSi and NiSi pSi contact resistance s are simulated over a wide range of doping concentration s using the ballistic NEGF modeling method and the result is compared to the experimental data. The barrier height is fitted using the experimental data of the lowest or second lowest concentration. For n type case, the numerical integration and 2D Fermi function approaches are compared and they show good agreement proving the numerical integr ation approach in this thesis has high accuracy. The investigation on accuracy of the numerical integration approach is essential for p type case where 2D Fermi function cannot be employed. Due to lack of experimental data, simulation result of GaAs is com pared with that of Si based on the fixed barrier height and concentration For high concentration, n type GaAs has higher resistance than n type Si even though n type GaAs has lighter effective mass. This is because n type GaAs has only one mode and n type Si has six modes. When the barrier width is so thin that all modes have high transmission, n type Si can have lower resistance. p type GaAs has always lower resistance than that of p type Si because they have the same number of modes and the effective mas s for p type GaAs is lighter. P arallel computing is used for the simulation. For n type Si algorithm, speedup has a peak value at 16 CPUs. The efficiency of the parallel algorithm drops quite rapidly This is due to the fact that computa tion time is not la rge for this simulation. If the same algorithm is applied to other simulation s where computation is expensive, the speedup would have a peak value at more than 16 CPUs and the efficiency would drop slowly. PAGE 60 60 APPENDIX THE FORTRAN CODE FOR THE APAPTIVE SIM PSON ALGORITHM !adaptive Simpson algorithm module quad_module implicit none contains subroutine quad(Q,fcnt,f,a,b,tol) implicit none integrand external ::f !the integrand integration range double precision intent (in )::a,b absolute tolerence double precision intent (in)::tol function count integer intent (out)::fcnt return value double precision intent (out)::Q local variable do uble precision ::h,hmin,c,d,e double precision ::x1,x2,x3,x4,x5,x6,x7 double precision ::y1,y2,y3,y4,y5,y6,y7 double precision ::Q1 double precision ::Q2 double precision ::Q3 integer ::warn integer ::warn1,warn2,warn3 fcnt=0 h=0.13579*(b a) x1=a x2=a+h x3=a+2*h x4=(a+b)/2 x5=b 2*h x6=b h x7=b call f(x1,y1) call f(x2,y2) call f(x3,y3) call f(x4,y4) call f(x5,y5) call f(x6,y6) call f(x7,y7) PAGE 61 61 fcnt=fcnt+7 hmin= epsilon (b a)/1024 call quadstep(Q1,fcnt,warn1,f,x1,x3,y1,y2,y3,tol,hmin) call quadstep(Q2,fcnt,warn2,f,x3,x5,y3,y4,y5,tol,hmin) call quadstep(Q3,fcnt,warn3,f,x5,x7,y5,y6,y7,tol,hmin) Q=Q1+Q2+Q3 warn = max (warn1,warn2,warn3) end subroutine quad recursive subroutine quadstep(Q,fcnt,warn,f,a,b,fa,fc,fb,tol,hmin) implicit none subroutine argument list external ::f !the integrand integration range double precision intent (in)::a,b f unction value double precision intent (in)::fa,fc,fb absolute tolerence double precision intent (in)::tol double precision intent (in)::hmin function count integer intent (inout)::fcnt integer intent (out)::warn double precision intent (out)::Q local variable integer ::maxfcnt double precision ::h double precision ::c,d,e double precision ::Q1,Q2 double precision ::Qac,Qcb double precision ::fd,fe i nteger ::warnac,warncb maximun function count maxfcnt=9000 h=b a c=(a+b)/2 d=(a+c)/2 e=(c+b)/2 call f(d,fd) call f(e,fe) fcnt=fcnt+2 Q1 = (h/6)*(fa + 4*fc + fb); PAGE 62 62 Five point double Simpson's rule. Q2 = (h/12)*(fa + 4*fd + 2*fc + 4*fe + fb); One step of Romberg extrapolation. Q = Q2 + (Q2 Q1)/15; termination criterion floating point if ( abs (Q)>= huge (1.0)) then warn=3 print *, 'floating point' return end if reached maximum function count if (fcnt > maxfcnt) then warn=2 print *, 'reached maximum function count' return end if error is less than the tolerence if ( abs (Q2 Q)<=tol) then warn=0 print *, 'error is less than the tolerance' return end if prevent infinite recursion if (( abs (h)< hmin).OR.(c==a).OR.(c==b)) then warn=1 print *, 'infinite recursion' return end if call quadstep(Q ac,fcnt,warnac,f,a,c,fa,fd,fc,tol,hmin) call quadstep(Qcb,fcnt,warncb,f,c,b,fc,fe,fb,tol,hmin) Q=Qac+Qcb warn= max (warnac,warncb) end subroutine quadstep end module quad_module PAGE 63 63 LIST OF REFERENCES [ 1 ] S. Thompson, P. Packan, T Ghani, M. Stettler, M. Alavi, I. Post, S. Tyagi, S. Ahmed, s. Yang, and M. Bohr, Source/drain extension scaling for 0.1 m and below channel length MOSFETs Proc. of VLSI Technology Symposium pp. 132 133, Jun. 1998 [2 ] P. Keys, H. J. Gossmann, K. K. Ng, and C. S. Rafferty, Series resistance limits for 0.05 m MOSFETs, Superlattices and Microstructures vol. 27, no. 2/3, pp. 125 136, Feb. 2000 [3 ] K. K. NG and W. T. Lynch, The impact of intrinsic series resistance on MOSFET scaling, IEEE Trans Electron Devices vol. ED 34, no. 3, pp. 503 511, Mar. 1987 [4 ] S. D. Kim C. M. Park, and J. C. S. Woo, Advanced model and an alysis for series resistance in sub 100nm CMOS including poly depletion and overlap doping gradient effect, IEDM Tech. Dig ., pp. 723 726, Dec. 2000 [ 5 ] N. Stavitski, M. J. H. van Dal, A. Lauwers, C. Vrancken, A. Y. Kovalgin, and R. A. M. Wolters, Sy stematic TLM measurements of NiSi and PtSi specific contact resistance to n and p type Si in a broad doping range, IEEE Electron Device Lett. vol. 29, no. 4, pp. 378 381, April. 2008. [6 ] S. Datta, The non equilibrium Green s Function (NEGF) formalism: an elementary introduction, IEDM pp. 703 706, 2002. [7 ] M. P. Anantram, M. S. Lundstrom, and D. E. Nikonov, Modeling of Nanoscale Devices, Proc. of IEEE vol. 96, no. 9, Sep. 2008 [8 ] S. Datta, Quantum transport atom to transistor, 2nd edition, Camb ridge University Press 2005 [9 ] C. Y. P. Chao and S. L. Chuang, Spin orbit coupling effects on the valence band structure of strained semiconductor quantum wells, Phys. Rev. B vol. 46, no. 7, pp 4110 4122, Aug. 1992. [10 ] P. Lawaetz, Valence band pa rameters in cubic semiconductors, Phys. Rev. B vol. 4, no. 10, pp 3460 3467, Nov. 1971. [11 ] M. P. L. Sancho, J. M. L. Sancho, and J. Rubio, Highly convergent schemes for the calculation of bulk and surface Green functions, J. Phys. F, vol. 15, pp. 85 1 858, 1985. [12 ] A. Grama, A. Gupta, G. Karypis, and V. Kumar, Introduction to parallel computing, 2 nd edition, Pearson 2003 PAGE 64 64 [13 ] W. Gander and W. Gautschi, Adaptive quadrature revisited, BIT vol. 40, no. 1, pp. 84 101, Aug. 1998. PAGE 65 65 BIOGRAPHICAL SKETCH Dukjin Kim was born in Kyungbuk, South Korea. He received his B achelor of S cience degree in electrical and computer engin eering from Hanyang University Seoul, South Korea in 2008 In 2009, h e started his M.S. study in electrical and computer engineering under the guidance of Prof essor Jing Guo at the University of Florida Gainesville, F lorida His research in the graduate school is focused on parallel simulation of silicide semiconductor contact resistance 